Stacking And Shaping Slices Of Dataframe (pandas) Without Looping
I have a DataFrame of the following form:

           var1         var2      var3  day
0  -0.001284819   0.00138089  1.022781    1
1  -0.001310201  0.001377473  1.022626
Solution 1:
One way is to use groupby's cumcount to add a column that records whether each row is the first or second occurrence of its day:
In [11]: df['occurrence'] = df.groupby('day').cumcount()
In [12]: df = df.set_index(['day', 'occurrence'])
Now you can do a bit of stacking and unstacking:
In [13]: df.stack(0).unstack(0)
Out[13]:
day                     1         2         3         4         5
occurrence
0          var1 -0.001285 -0.001331 -0.001362 -0.001394 -0.001467
           var2  0.001381  0.001375  0.001444  0.001437  0.001433
           var3  1.022781  1.022477  1.022280  1.022017  1.021749
1          var1 -0.001310 -0.001360 -0.001372 -0.001431 -0.001491
           var2  0.001377  0.001430  0.001441  0.001434  0.001432
           var3  1.022626  1.022385  1.022161  1.021908  1.021602
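For anyone who wants to run Solution 1 end to end, here is a minimal sketch. The frame is rebuilt from the rounded values in the output above, and it assumes that each day's two rows sit next to each other in the original data, which the question's excerpt suggests but does not fully show. The name `wide` is only for illustration.

```python
import pandas as pd

# Rounded values copied from the output above.
# Assumption: the two rows for each day are adjacent in the original frame.
df = pd.DataFrame({
    "var1": [-0.001285, -0.001310, -0.001331, -0.001360, -0.001362,
             -0.001372, -0.001394, -0.001431, -0.001467, -0.001491],
    "var2": [0.001381, 0.001377, 0.001375, 0.001430, 0.001444,
             0.001441, 0.001437, 0.001434, 0.001433, 0.001432],
    "var3": [1.022781, 1.022626, 1.022477, 1.022385, 1.022280,
             1.022161, 1.022017, 1.021908, 1.021749, 1.021602],
    "day": [1, 1, 2, 2, 3, 3, 4, 4, 5, 5],
})

# 0 for the first row seen for a given day, 1 for the second.
df["occurrence"] = df.groupby("day").cumcount()

wide = (
    df.set_index(["day", "occurrence"])
      .stack()          # variable names become the innermost index level
      .unstack("day")   # days move from the index into the columns
)
print(wide)
```

Unstacking by level name (`"day"`) is equivalent here to the positional `unstack(0)` used above, since day is the outermost index level.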
Solution 2:
It's not necessarily the prettiest, but in the past I've done things like
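import pandas as pd
df = pd.read_csv("vd.csv", sep=r"\s+")  # raw string: the file is whitespace-delimited
d2 = pd.melt(df, id_vars="day")
d2["sample"] = d2.groupby(["variable", "day"])["day"].rank(method="first")  # 1.0, 2.0 within each (variable, day)
d3 = d2.pivot_table(index=["variable", "sample"], columns="day")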
which gives
>>> d3
                    value
day                     1         2         3         4         5
variable sample
var1     1      -0.001285 -0.001331 -0.001362 -0.001394 -0.001467
         2      -0.001310 -0.001360 -0.001372 -0.001431 -0.001491
var2     1       0.001381  0.001375  0.001444  0.001437  0.001433
         2       0.001377  0.001430  0.001441  0.001434  0.001432
var3     1       1.022781  1.022477  1.022280  1.022017  1.021749
         2       1.022626  1.022385  1.022161  1.021908  1.021602
[6 rows x 5 columns]
(Although, to be honest, I think Andy's way is slicker. I'll leave this here, though, because the melt-modify-pivot pattern has proved pretty useful for me in harder cases.)
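As a rough, self-contained sketch of the melt-modify-pivot pattern, the snippet below swaps the `vd.csv` file for a small inline frame (days 1 and 2 only, using the rounded values from the output above). Passing `values="value"` and casting `sample` to int are additions made here to get a flatter result; they are not part of the original answer.

```python
import pandas as pd

# Inline stand-in for vd.csv: a two-day subset of the rounded values shown above.
df = pd.DataFrame({
    "var1": [-0.001285, -0.001310, -0.001331, -0.001360],
    "var2": [0.001381, 0.001377, 0.001375, 0.001430],
    "var3": [1.022781, 1.022626, 1.022477, 1.022385],
    "day": [1, 1, 2, 2],
})

# Melt: one row per (day, variable) observation, in columns day/variable/value.
d2 = pd.melt(df, id_vars="day")

# Modify: number the observations within each (variable, day) group.
# All values of "day" tie inside a group, so method="first" ranks them
# by order of appearance: 1 for the first row, 2 for the second.
d2["sample"] = (
    d2.groupby(["variable", "day"])["day"].rank(method="first").astype(int)
)

# Pivot: days become columns, (variable, sample) becomes the row index.
d3 = d2.pivot_table(index=["variable", "sample"], columns="day", values="value")
print(d3)
```

With `values="value"` given as a string, the columns are just the days, without the extra `value` level that appears in the output above.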