Why Does Numpy Silently Convert My Int Array To Strings When Calling Searchsorted?
I found a nasty bug in my code where I forgot to convert an integer from str to int before looking it up in a sorted array of integers. Having fixed it, I am still surprised that t
Solution 1:
This behavior happens because searchsorted requires the needle and haystack to have the same dtype. This is achieved using np.promote_types, which has the (perhaps unfortunate) behavior:
>>> np.promote_types(int, str)
dtype('S11')
This means that to get matching dtypes for an integer haystack and a string needle, the only valid transformation is to convert the haystack to a string type.
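To see where a width like 11 comes from, here is a minimal sketch using explicit dtype codes rather than the Python types. It assumes the 'S11' above came from a 32-bit integer paired with a byte string; the 'i4' and 'S8' codes are chosen for illustration and are not part of the original question.

```python
import numpy as np

# Promote a 32-bit integer dtype with an 8-byte byte-string dtype.
common = np.promote_types('i4', 'S8')

# The common dtype is a byte string, not a numeric type.
print(common.kind)      # S

# Its width is large enough to hold the text of any int32 value.
print(common.itemsize)  # 11

# The longest int32 text is "-2147483648", which is 11 characters.
print(len(str(np.iinfo(np.int32).min)))  # 11
```

The string side never gets parsed into a number; the integer side is the one that gets widened into text.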
Once we have a common dtype, np.can_cast is used to check whether the haystack can be cast to it. This explains why floats aren't turned into strings, but ints are:
In [1]: np.can_cast(float, np.promote_types(float, str))
Out[1]: False
In [2]: np.can_cast(int, np.promote_types(int, str))
Out[2]: True
So to summarize, the strange behavior is a combination of promotion rules where numeric + string => string, and casting rules where int => string is allowable.
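If you would rather get an error than a silent conversion, one option is to check the needle's dtype yourself before searching. The sketch below is only an illustration: checked_searchsorted is a hypothetical helper name, not a NumPy function, and it assumes the haystack is a sorted numeric array.

```python
import numpy as np

def checked_searchsorted(haystack, needle, side="left"):
    """Hypothetical helper: reject non-numeric needles instead of promoting."""
    needle_arr = np.asarray(needle)
    # Strings, bytes, and objects are not sub-dtypes of np.number.
    if not np.issubdtype(needle_arr.dtype, np.number):
        raise TypeError(f"needle has non-numeric dtype {needle_arr.dtype}")
    return np.searchsorted(haystack, needle_arr, side=side)

haystack = np.array([1, 3, 5, 7, 9])
raw = "5"  # e.g. a value read from a text file

# Convert explicitly before the lookup.
idx = checked_searchsorted(haystack, int(raw))
print(idx)  # 2

# Passing the raw string now fails loudly.
try:
    checked_searchsorted(haystack, raw)
except TypeError as exc:
    print("rejected:", exc)
```

The check costs one np.asarray call per lookup, which is usually negligible next to the search itself.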