
Ndarray To Structured_array And Float To Int

The problem I encounter is that using ndarray.view(np.dtype) to get a structured array from a classic ndarray seems to miscompute the float to int conversion. Example talks be

Solution 1:

This is the difference between doing somearray.view(new_dtype) and calling astype.

What you're seeing is exactly the expected behavior, and it's very deliberate, but it's surprising the first time you come across it.

A view with a different dtype interprets the underlying memory buffer of the array as the given dtype. No copies are made. It's very powerful, but you have to understand what you're doing.

A key thing to remember is that calling view never alters the underlying memory buffer, just the way that it's viewed by numpy (e.g. dtype, shape, strides). Therefore, view deliberately avoids converting the data to the new type and instead just interprets the "old bits" as the new dtype.

For example:

In [1]: import numpy as np

In [2]: x = np.arange(10)

In [3]: x
Out[3]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [4]: x.dtype
Out[4]: dtype('int64')

In [5]: x.view(np.int32)
Out[5]: array([0, 0, 1, 0, 2, 0, 3, 0, 4, 0, 5, 0, 6, 0, 7, 0, 8, 0, 9, 0],
              dtype=int32)

In [6]: x.view(np.float64)
Out[6]:
array([  0.00000000e+000,   4.94065646e-324,   9.88131292e-324,
         1.48219694e-323,   1.97626258e-323,   2.47032823e-323,
         2.96439388e-323,   3.45845952e-323,   3.95252517e-323,
         4.44659081e-323])

If you want to make a copy of the array with a new dtype, use astype instead:

In [7]: x
Out[7]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [8]: x.astype(np.int32)
Out[8]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=int32)

In [9]: x.astype(float)
Out[9]: array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9.])
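To make the contrast concrete, here is a minimal sketch showing that a view shares memory with the original while astype produces an independent, value-converted copy. The variable names are illustrative, and the dtype is pinned to int64 so the result does not depend on the platform's default integer size.

```python
import numpy as np

x = np.arange(10, dtype=np.int64)

v = x.view(np.float64)    # same buffer, bits reinterpreted as float64
c = x.astype(np.float64)  # new buffer, values converted to float64

print(np.shares_memory(x, v))  # True
print(np.shares_memory(x, c))  # False

# Writing through the view changes the bits that x sees.
v[0] = 1.0
print(x[0])  # 4607182418800017408, the IEEE 754 bit pattern of 1.0 read as int64
print(c[0])  # 0.0, the copy is unaffected
```

This is the same effect that makes a float column look "miscomputed" when it is viewed as an integer field: the bits are reused, not converted.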

However, using astype with structured arrays will probably surprise you. Structured arrays treat each element of the input as a C-like struct. Therefore, if you call astype on a plain 2D array, each scalar is turned into a complete struct, instead of each row becoming one record.
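As a rough illustration of that surprise, the sketch below casts a plain 2D float array to a structured dtype. The dtype and the array B are hypothetical stand-ins, not the ones from the original question.

```python
import numpy as np

# Hypothetical structured dtype and plain 2-D float array.
dt = np.dtype([("a", np.int64), ("b", np.float64), ("c", np.float64)])
B = np.array([[1.0, 2.5, 3.5],
              [4.0, 5.5, 6.5]])

S = B.astype(dt)

# The shape is unchanged: there is one struct per input element,
# not one struct per row.
print(S.shape)

# Each struct has the same scalar copied into every field
# (truncated toward zero for the integer field).
print(S["a"])
print(S["b"])
```

So instead of two records with three fields each, the result has six records, which is rarely what is wanted.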


Basically, you want the columns to have different dtypes. In that case, don't put them in the same array. Numpy arrays are expected to be homogeneous. Structured arrays are handy in certain cases, but they're probably not what you want if you're looking for something to handle separate columns of data. Just use each column as its own array.
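A minimal sketch of the "one array per column" approach, assuming a hypothetical 2D float array B whose first column should hold integers:

```python
import numpy as np

B = np.array([[1.0, 2.5, 3.5],
              [4.0, 5.5, 6.5]])

# Hypothetical column names; each column is its own homogeneous array.
a = B[:, 0].astype(np.int64)  # converted copy holding integer values
b = B[:, 1]                   # float view into B, no conversion needed
c = B[:, 2]

print(a, a.dtype)
print(b, b.dtype)
```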

Better yet, if you're working with tabular data, you'll probably find it's easier to use pandas than to use numpy arrays directly. pandas is oriented towards tabular data (where columns are expected to have different types), while numpy is oriented towards homogeneous arrays.
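For comparison, a small pandas sketch under the same assumptions (hypothetical array B and column names), where each column simply carries its own dtype:

```python
import numpy as np
import pandas as pd

B = np.array([[1.0, 2.5, 3.5],
              [4.0, 5.5, 6.5]])

df = pd.DataFrame(B, columns=["a", "b", "c"])

# Convert only column "a" by value; the other columns stay float64.
df = df.astype({"a": "int64"})

print(df.dtypes)
```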

Solution 2:

Actually, fromarrays works, but that doesn't explain this odd behavior.

Here is the solution I've found:

np.rec.fromarrays(B.T, dtype=A.dtype)
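A self-contained sketch of this approach follows. A and B here are hypothetical stand-ins for the arrays in the question, and the call goes through np.rec.fromarrays, the public name for the function that older code reaches as np.core.records.fromarrays.

```python
import numpy as np

# Hypothetical stand-ins for the A and B of the question.
A = np.zeros(2, dtype=[("a", np.int64), ("b", np.float64), ("c", np.float64)])
B = np.array([[1.0, 2.5, 3.5],
              [4.0, 5.5, 6.5]])

# B.T yields one 1-D array per column; each one is assigned to the
# matching field and converted by value to that field's type.
R = np.rec.fromarrays(B.T, dtype=A.dtype)

print(R.shape)
print(R["a"], R["a"].dtype)
print(R["b"])
```

Because each column is assigned field by field, the floats are converted to integers by value instead of having their bits reinterpreted.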

Solution 3:

The only solution which worked for me in a similar situation:

np.array([tuple(row) for row in B], dtype=A.dtype)
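Here is a runnable sketch of the row-by-row conversion, again with hypothetical stand-ins for A and B. It also shows numpy.lib.recfunctions.unstructured_to_structured as a vectorized alternative; that function is not part of the original answer and is included only for comparison.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# Hypothetical stand-ins for the A and B of the question.
A = np.zeros(2, dtype=[("a", np.int64), ("b", np.float64), ("c", np.float64)])
B = np.array([[1.0, 2.5, 3.5],
              [4.0, 5.5, 6.5]])

# One tuple per row: NumPy fills the fields of each record from the tuple,
# converting each value to the field's type.
S = np.array([tuple(row) for row in B], dtype=A.dtype)

# Alternative (not from the original answer): convert the last axis of B
# into fields without a Python-level loop.
S2 = rfn.unstructured_to_structured(B, dtype=A.dtype)

print(S)
print(S2)
```

The list comprehension is simple but loops in Python, so the vectorized variant may be preferable for large arrays.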
