
Modify Value Of Pandas Dataframe Groups

We have the following dataframe (df) with 3 columns. The goal is to make the sum of 'Load' within each ID group equal to 1. pd.DataFrame({'ID':['AEC',

Solution 1:

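You can use drop_duplicates to select the first record in each group, then subtract the group's excess (its Load sum minus 1) from that record so the group's Load sums to 1. Because .values discards the index, the group sums must come out in the same order as the selected rows. groupby sorts the keys by default, so sort=False is passed here to keep them in order of first appearance, matching drop_duplicates.

df.loc[df.ID.drop_duplicates().index, 'Load'] -= df.groupby('ID', sort=False).Load.sum().subtract(1).values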

df
Out[92]: 
   Num   ID      Load
0    1  AEC  0.461591
1    2  AEC  0.538409
2    3  CIZ  0.106869
3    4  CIZ  0.746566
4    5  CIZ  0.146566

df.groupby('ID').Load.sum()
Out[93]: 
ID
AEC    1.0
CIZ    1.0
Name: Load, dtype: float64
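
For reference, here is a minimal self-contained sketch of the same idea. The original frame is not shown in full, so the data below is made up, and the names excess and first are only illustrative. Instead of matching group sums by position, it uses transform and a boolean mask so the adjustment aligns on the index, which should hold even when the IDs are not in sorted order.

```python
import pandas as pd

# Hypothetical data: these Load values are invented for illustration.
df = pd.DataFrame({
    "Num": [1, 2, 3, 4, 5],
    "ID": ["CIZ", "CIZ", "AEC", "AEC", "CIZ"],
    "Load": [0.2, 0.3, 0.4, 0.5, 0.1],
})

# Amount by which each group's sum exceeds 1, broadcast to every row.
excess = df.groupby("ID")["Load"].transform("sum") - 1

# True for the first row of each ID, in order of appearance.
first = ~df["ID"].duplicated()

# Subtract the excess from the first row of each group only.
df.loc[first, "Load"] -= excess[first]

group_sums = df.groupby("ID")["Load"].sum()
print(group_sums)
```

Only the first row of each group changes; all other Load values are left as they were.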

Solution 2:

Here I use sample to randomly pick one row from each group and apply the adjustment to that row.

df['New']=(1-df.groupby('ID').Load.transform('sum'))

df['Load']=df.Load.add(df.groupby('ID').New.apply(lambda x : x.sample(1)).reset_index('ID',drop=True)).fillna(df.Load)

df.drop(columns='New')
Out[163]: 
   Num   ID      Load
0    1  AEC  0.209327
1    2  AEC  0.790673
2    3  CIZ  0.146566
3    4  CIZ  0.746566
4    5  CIZ  0.106869

Check

df.drop(columns='New').groupby('ID').Load.sum()
Out[164]: 
ID
AEC    1.0
CIZ    1.0
Name: Load, dtype: float64
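
The random-pick approach can also be sketched without a helper column. This is a variant and not the code above: it swaps in DataFrameGroupBy.sample (available in pandas 1.1 and later) to choose one row per group. The data is made up, the names shortfall and picked are illustrative, and random_state is fixed only to make the run repeatable.

```python
import pandas as pd

# Hypothetical data: these Load values are invented for illustration.
df = pd.DataFrame({
    "Num": [1, 2, 3, 4, 5],
    "ID": ["AEC", "AEC", "CIZ", "CIZ", "CIZ"],
    "Load": [0.4, 0.5, 0.2, 0.3, 0.1],
})
original = df["Load"].copy()

# How much each group is missing to reach a sum of 1, broadcast to every row.
shortfall = 1 - df.groupby("ID")["Load"].transform("sum")

# Index labels of one randomly chosen row per group.
picked = df.groupby("ID").sample(n=1, random_state=0).index

# Add the shortfall to the chosen rows only.
df.loc[picked, "Load"] += shortfall.loc[picked]

group_sums = df.groupby("ID")["Load"].sum()
print(group_sums)
```

Dropping random_state gives a different row per group on each run, while the group sums should still come out as 1.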
