Drop Duplicate Times In Xarray

I'm reading NetCDF files with open_mfdataset, which contain duplicate times. For each duplicate time I only want to keep the first occurrence, and drop the second (it will never oc

Solution 1:

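I don't think xarray has a dedicated method for this, but the following works. np.unique with return_index=True returns the position of the first occurrence of each unique time, and isel then selects only those positions: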

In [7]: _, index = np.unique(f['time'], return_index=True)

In [8]: index
Out[8]: array([ 0,  1,  2,  3,  4,  5,  7,  8,  9, 10, 11])

In [9]: f.isel(time=index)
Out[9]: 
<xarray.Dataset>
Dimensions:  (time: 11)
Coordinates:
  * time     (time) datetime64[ns] 2001-01-01 2001-01-01T01:00:00 ...
Data variables:
   var      (time) float64 dask.array<shape=(11,), chunksize=(6,)>
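
To see why this keeps the first occurrence, here is a minimal NumPy-only sketch of the same indexing step. The `times` and `values` arrays are made-up stand-ins for the dataset's time coordinate and one of its variables; they are not from the original question.

```python
import numpy as np

# Hypothetical time axis: the 02:00 stamp appears twice (positions 2 and 3)
times = np.array(
    ["2001-01-01T00", "2001-01-01T01", "2001-01-01T02",
     "2001-01-01T02", "2001-01-01T03"],
    dtype="datetime64[ns]",
)
# Hypothetical data; 99.0 belongs to the second (unwanted) 02:00 entry
values = np.array([10.0, 11.0, 12.0, 99.0, 13.0])

# return_index=True gives the position of the first occurrence of each unique value
_, index = np.unique(times, return_index=True)

# Indexing with those positions drops every later repeat
deduped_times = times[index]
deduped_values = values[index]
```

Note that np.unique returns its unique values sorted, so the result comes back in ascending time order. For a time axis that is already monotonic this changes nothing, but an unsorted axis would be reordered.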

Solution 2:

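I can't comment yet, so I'm adding to Keisuke's answer here. You can also use the get_index() method to get a pandas index; its duplicated() method flags every repeat after the first occurrence, so negating that mask keeps only the first of each time.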

f.sel(time=~f.get_index("time").duplicated())
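
As a self-contained sketch of the mask-based approach, the example below builds a small made-up dataset (the name `ds`, the variable `var`, and its values are illustrative assumptions) and applies the mask with isel. That is a swapped-in choice made because the mask is positional; the answer's own line uses sel.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical dataset: the 02:00 stamp appears twice (positions 2 and 3)
times = pd.to_datetime(
    ["2001-01-01 00:00", "2001-01-01 01:00", "2001-01-01 02:00",
     "2001-01-01 02:00", "2001-01-01 03:00"]
)
ds = xr.Dataset(
    {"var": ("time", np.array([10.0, 11.0, 12.0, 99.0, 13.0]))},
    coords={"time": times},
)

# duplicated() marks each repeat after the first occurrence (keep="first" is the default)
is_repeat = ds.get_index("time").duplicated()

# Invert the mask and index positionally to keep the first of each duplicated time
deduped = ds.isel(time=~is_repeat)
```

Unlike the np.unique approach, this keeps the remaining entries in their original order. Newer xarray releases also appear to ship a drop_duplicates method; check the documentation for your installed version before relying on it.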
