Why Does Numpy Silently Convert My Int Array To Strings When Calling Searchsorted?

I found a nasty bug in my code where I forgot to convert an integer from str to int before looking it up in a sorted array of integers. Having fixed it, I am still surprised that t

Solution 1:

This behavior happens because searchsorted requires the needle and haystack to have the same dtype. This is achieved using np.promote_types, which has the (perhaps unfortunate) behavior:

>>> np.promote_types(int, str)
dtype('S11')

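This means that to get matching dtypes for an integer haystack and a string needle, the only valid transformation is to convert the haystack to a string type. (The exact dtype depends on the platform and Python version: 'S11' is a byte string wide enough for a 32-bit integer, while Python 3 builds report a unicode dtype such as '<U21' for 64-bit integers.)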
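To see why converting the haystack to strings is risky, note that text compares character by character, so values sorted as integers are generally no longer sorted once they become strings. The following is a small sketch with made-up sample values, not code from the question:

```python
import numpy as np

# Hypothetical sample data: sorted as integers.
haystack = np.array([1, 9, 10, 100])

# The same values after conversion to text.
as_text = haystack.astype(str)

# Strings compare character by character, so "10" sorts before "9".
numeric_order = [str(v) for v in haystack.tolist()]
text_order = sorted(as_text.tolist())

print(numeric_order)  # ['1', '9', '10', '100']
print(text_order)     # ['1', '10', '100', '9']
```

Since searchsorted assumes a sorted haystack, a binary search over the converted array can return an index that means nothing for the original integers.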

Once we have a common dtype, we use np.can_cast to check whether the haystack can be safely cast to it. This explains why floats aren't turned into strings, but ints are:

In [1]: np.can_cast(float, np.promote_types(float, str))
Out[1]: False

In [2]: np.can_cast(int, np.promote_types(int, str))
Out[2]: True
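Promotion and casting rules have shifted between NumPy releases, so it may be worth running the same two checks against whatever version is installed. Here is a minimal sketch; the helper name describe_promotion is made up for illustration, and the TypeError branch is only an assumption that some versions might refuse the promotion outright:

```python
import numpy as np

def describe_promotion(numeric_dtype):
    """Report how a numeric dtype combines with str on this NumPy install.

    Hypothetical helper for illustration, not part of NumPy.
    """
    try:
        common = np.promote_types(numeric_dtype, np.str_)
    except TypeError as exc:
        # Assumption: a version may reject number/string promotion entirely.
        return {"common": None, "safe_cast": False, "error": str(exc)}
    return {
        "common": common,
        "safe_cast": bool(np.can_cast(numeric_dtype, common)),
        "error": None,
    }

for name in ("int64", "float64"):
    print(name, describe_promotion(np.dtype(name)))
```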

So to summarize, the strange behavior is a combination of promotion rules, where numeric + string => string, and casting rules, where int => string is an allowed cast but float => string is not.
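If you would rather get an error than a silent conversion, one option is a thin wrapper that checks the dtype kinds before searching. This is only a sketch: searchsorted_strict is a hypothetical helper, not a NumPy function, and it treats only the byte-string ('S') and unicode ('U') kinds as text.

```python
import numpy as np

def searchsorted_strict(haystack, needle, side="left"):
    """searchsorted that refuses to mix text and non-text inputs.

    Hypothetical helper for illustration, not part of NumPy.
    """
    haystack = np.asarray(haystack)
    needle = np.asarray(needle)
    haystack_is_text = haystack.dtype.kind in "SU"
    needle_is_text = needle.dtype.kind in "SU"
    if haystack_is_text != needle_is_text:
        raise TypeError(
            f"cannot search {haystack.dtype} haystack with {needle.dtype} needle"
        )
    return np.searchsorted(haystack, needle, side=side)

sorted_ids = np.array([1, 5, 10, 100])
print(searchsorted_strict(sorted_ids, 10))         # 2
print(searchsorted_strict(sorted_ids, int("10")))  # 2
```

Calling searchsorted_strict(sorted_ids, "10") raises TypeError, which would have surfaced the forgotten int() conversion immediately.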
