Taking Two And More Data Frames And Extracting Data On Unique Keys In Python
First, I have two data frames. One holds a person's name and the pages that person has liked, in columns, so the number of columns differs from person to person. Here is an example. 1st
Solution 1:
Thank you for your code. Now it is clearer.
I tried to optimize your loops. I think you can instead use isin with any to build a mask for boolean indexing. I also simplified the code around concat:
import pandas as pd
## df1, df2, df3 and df4 are the data frames from the question
## add a category column to df1, aligned on the index
df1['category'] = df2['categories']
## list of the pages present in the metadata (first column of df3)
meta_list = list(df3.iloc[:, 0])
## True for rows of df1 where any cell matches a page in meta_list
## (axis must be passed by keyword; any(1) fails on pandas 2.0 and later)
mask = df1.isin(meta_list).any(axis=1)
new_df1 = df1[mask]
new_df2 = df1[~mask]
## merge new_df1 on page_name and new_df2 on category
mdf1 = pd.merge(new_df1, df3, how='left', on='page_name')
mdf2 = pd.merge(new_df2, df4, how='left', on='category')
## concatenate the two merged data frames
finaldf = pd.concat([mdf1, mdf2])
## finally group by user and sum the tags for each user
## (numeric_only=True keeps the text columns out of the sum on recent pandas)
finaldf1 = finaldf.groupby('user', as_index=False).sum(numeric_only=True)
print(finaldf1)
          user  tag1  tag2  tag3
0  Roshan ghai   0.0   1.0   1.0
1    mank nion   1.0   1.0   2.0
2   pop rajuel   2.0   0.0   1.0
3   random guy   2.0   1.0   1.0
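Since the data frames from the question are not shown here, the following is a minimal self-contained sketch of the same approach on made-up data. The frames, their values, and the names `known`, `unknown`, and `totals` are all assumptions for illustration, not the asker's real data. It shows how `isin` with `any(axis=1)` splits the rows, and how the two merges are combined and summed per user.

```python
import pandas as pd

# Hypothetical toy data standing in for df1, df3 and df4 from the question.
df1 = pd.DataFrame({
    "user": ["a", "a", "b"],
    "page_name": ["p1", "p9", "p2"],
    "category": ["news", "sport", "music"],
})
# Page-level metadata: tags known for specific pages.
df3 = pd.DataFrame({"page_name": ["p1", "p2"], "tag1": [1, 0], "tag2": [0, 1]})
# Category-level metadata: fallback tags for pages missing from df3.
df4 = pd.DataFrame({"category": ["sport"], "tag1": [1], "tag2": [1]})

# True for rows where any cell equals one of the known page names.
meta_list = df3["page_name"].tolist()
mask = df1.isin(meta_list).any(axis=1)

known = df1[mask]      # rows whose page appears in df3
unknown = df1[~mask]   # rows that fall back to the category lookup

merged_by_page = known.merge(df3, how="left", on="page_name")
merged_by_cat = unknown.merge(df4, how="left", on="category")

# Stack both results, then sum only the tag columns for each user.
combined = pd.concat([merged_by_page, merged_by_cat], ignore_index=True)
totals = combined.groupby("user", as_index=False)[["tag1", "tag2"]].sum()
print(totals)
```

Note that `df1.isin(meta_list)` tests every column of `df1`, not only the page column. If a user name or category could ever coincide with a page name, building the mask from `df1["page_name"].isin(meta_list)` alone would be the safer choice.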