How Can I Count A Resampled Multi-indexed Dataframe In Pandas
Solution 1:
When using count, State isn't a nuisance column (count works on strings), so resample applies count to it as well, although the output is not what I would expect. You can tell it to apply count only to value_a:
>>> print df2.groupby(['State']).resample('W',how={'value_a':'count'})
                    value_a
State   Date
Alabama 2012-01-01        2
        2012-01-08        6
Georgia 2012-01-01        2
        2012-01-08        6
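For readers on a current pandas release: the how= argument to resample was deprecated in pandas 0.18 and, as far as I know, is no longer accepted. A minimal sketch of the same weekly count follows. The frame built here is a hypothetical stand-in for df2 (the real data is not shown in this answer), with dates chosen only so that the counts match the output above.

```python
import pandas as pd

# Hypothetical stand-in for df2: two states, eight daily rows each.
dates = pd.date_range("2011-12-31", periods=8, freq="D")
index = pd.MultiIndex.from_product(
    [["Alabama", "Georgia"], dates], names=["State", "Date"]
)
df2 = pd.DataFrame({"value_a": range(16)}, index=index)

# Move State out of the index so Date is a plain DatetimeIndex,
# then count value_a per week within each state.
counts = (
    df2.reset_index("State")
    .groupby("State")["value_a"]
    .resample("W")
    .count()
)
print(counts)
```

Selecting value_a before resampling plays the role of how={'value_a':'count'}: only that column is counted.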
Or, more generally, you can apply a different how to each column:
>>> print df2.groupby(['State']).resample('W',how={'value_a':'count','State':'last'})
                      State  value_a
State   Date
Alabama 2012-01-01  Alabama        2
        2012-01-08  Alabama        6
Georgia 2012-01-01  Georgia        2
        2012-01-08  Georgia        6
So while the above lets you count a resampled multi-index dataframe, it doesn't explain the output of how='count'. The following is closer to the way I would expect it to behave:
>>> print df2.groupby(['State']).resample('W',how={'value_a':'count','State':'count'})
                    State  value_a
State   Date
Alabama 2012-01-01      2        2
        2012-01-08      6        6
Georgia 2012-01-01      2        2
        2012-01-08      6        6
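The per-column how dictionaries above map fairly directly onto .agg with a dictionary in current pandas. The sketch below is one way to write that, not the only one. It assumes a hypothetical second column, value_b, that is not in the original data, and it groups with pd.Grouper on a flattened copy of the frame rather than calling resample.

```python
import pandas as pd

# Hypothetical stand-in for df2; value_b is an invented column for illustration.
dates = pd.date_range("2011-12-31", periods=8, freq="D")
index = pd.MultiIndex.from_product(
    [["Alabama", "Georgia"], dates], names=["State", "Date"]
)
df2 = pd.DataFrame({"value_a": range(16), "value_b": range(16)}, index=index)

# Flatten the index so State and Date are ordinary columns,
# then group by state and by week and aggregate each column differently.
flat = df2.reset_index()
per_column = flat.groupby(["State", pd.Grouper(key="Date", freq="W")]).agg(
    {"value_a": "count", "value_b": "sum"}
)
print(per_column)
```

Each key in the dictionary names a column and each value names the aggregation for it, which is the same idea as how={'value_a':'count','State':'last'}.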
Solution 2:
@Karl D's solution is correct. This will also be possible with a single groupby in 0.14/master (releasing shortly); see the docs.
In [118]: df2.groupby([pd.Grouper(level='Date',freq='W'),'State']).count()
Out[118]:
                    value_a
Date       State
2012-01-01 Alabama        2
           Georgia        2
2012-01-08 Alabama        6
           Georgia        6
Prior to 0.14 it was difficult to groupby / resample with a time-based grouper and another grouper. pd.Grouper allows a very flexible specification to do this.
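A self-contained sketch of the pd.Grouper approach, for anyone who wants to try it without the original data. The frame is a hypothetical stand-in for df2, and State is passed as pd.Grouper(level="State") instead of the bare string, which is just a cautious way to make clear that it refers to an index level.

```python
import pandas as pd

# Hypothetical stand-in for df2, indexed by (State, Date).
dates = pd.date_range("2011-12-31", periods=8, freq="D")
index = pd.MultiIndex.from_product(
    [["Alabama", "Georgia"], dates], names=["State", "Date"]
)
df2 = pd.DataFrame({"value_a": range(16)}, index=index)

# Group on weekly bins of the Date level and on the State level at once.
weekly = df2.groupby(
    [pd.Grouper(level="Date", freq="W"), pd.Grouper(level="State")]
).count()
print(weekly)
```

Because both groupers work on index levels, there is no need to reset the index or resample each state separately.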