Split Pandas Dataframe String Into Separate Rows

I have a dataframe of text strings which essentially represents one or many journeys per row. I'm trying to split the legs of the journey so I can see them individually. The exampl

Solution 1:

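Split var2 on '/' so each cell holds a list of legs, then use explode to give every leg its own row (df_input is the dataframe from the question):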

df=df_input.assign(var2=df_input.var2.str.split('/')).explode('var2')
  var1 var2  var3
0    A    x  abc1
0    A    y  abc1
0    A    z  abc1
1    B   xx  abc2
1    B   yy  abc2
2    c   zz  abcd
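
To see why the next step can group on the index, it may help to look at the intermediate values. The sketch below rebuilds the question's data under the name df_input (an assumption, matching the snippet above) and inspects what str.split and explode each produce.

```python
import pandas as pd

# Sample data from the question, rebuilt here so the snippet runs on its own.
df_input = pd.DataFrame({
    "var1": ["A", "B", "c"],
    "var2": ["x/y/z", "xx/yy", "zz"],
    "var3": ["abc1", "abc2", "abcd"],
})

# Step 1: str.split turns each string into a list of legs.
legs = df_input["var2"].str.split("/")

# Step 2: explode gives each list element its own row and repeats the
# original index label, so rows from the same journey share an index value.
exploded = df_input.assign(var2=legs).explode("var2")

print(legs.tolist())
print(exploded.index.tolist())
```

The repeated index labels are what lets a later groupby(level=0) treat each journey separately.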

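Then group by the original index and shift var2 down one row within each journey, so that each leg starts (var1) where the previous leg ended. The first leg of a journey has no previous leg, so fillna restores its original var1: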

df['var1'] = df.groupby(level=0)['var2'].shift().fillna(df['var1'])
df
  var1 var2  var3
0    A    x  abc1
0    x    y  abc1
0    y    z  abc1
1    B   xx  abc2
1   xx   yy  abc2
2    c   zz  abcd
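
The two steps can also be wrapped into one reusable function. This is a minimal sketch rather than part of the original answer: the helper name split_legs and its parameters are hypothetical, and it assumes the input index is unique, because groupby(level=0) relies on the repeated index labels to tell journeys apart.

```python
import pandas as pd

def split_legs(df, start="var1", path="var2", sep="/"):
    """Hypothetical helper: return one row per leg of each journey.

    Assumes df has a unique index; `start` and `path` are column names.
    """
    # One row per leg; rows of the same journey keep the same index label.
    out = df.assign(**{path: df[path].str.split(sep)}).explode(path)
    # Previous leg's end within each journey (NaN for the first leg).
    prev_stop = out.groupby(level=0)[path].shift()
    # Later legs start at the previous stop; first legs keep their start.
    out[start] = prev_stop.fillna(out[start])
    # Drop the repeated labels in favour of a clean 0..n-1 index.
    return out.reset_index(drop=True)

df_input = pd.DataFrame({
    "var1": ["A", "B", "c"],
    "var2": ["x/y/z", "xx/yy", "zz"],
    "var3": ["abc1", "abc2", "abcd"],
})
result = split_legs(df_input)
print(result)
```

Resetting the index at the end is a design choice: it discards the journey grouping, so keep the repeated index instead if later steps still need to know which legs belong together.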

Solution 2:

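Concatenate the original frame (with var2 renamed to var2old) and the exploded var2 column, then overwrite var1 wherever it repeats the previous row's value, using the previous row's var2. Note that this check compares var1 with the row above, so it assumes two adjacent journeys never start at the same var1; if they did, the first leg of the second journey would be overwritten too.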

EDIT: Made a change based on the suggestion from @Ben.T.

df = pd.concat([df.rename(columns={'var2': 'var2old'}), 
                df.var2.str.split('/').explode()], 
               axis=1, join='outer')
## CREDIT: @Ben.T
df['var1'] = df['var1'].where(df['var1'].ne(df['var1'].shift()), df['var2'].shift())
print(df)

Output:

  var1 var2old  var3 var2
0    A   x/y/z  abc1    x
0    x   x/y/z  abc1    y
0    y   x/y/z  abc1    z
1    B   xx/yy  abc2   xx
1   xx   xx/yy  abc2   yy
2    c      zz  abcd   zz
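
If adjacent journeys can share the same var1, one possible variant is to detect later legs from the repeated index instead of from var1 itself. The sketch below is an assumption-laden alternative, not the answerer's code: it assumes the original index is unique, and the sample data is made up so that the first two journeys both start at A.

```python
import pandas as pd

# Made-up data: the first two journeys both start at "A".
df = pd.DataFrame({
    "var1": ["A", "A", "c"],
    "var2": ["x/y", "xx/yy", "zz"],
    "var3": ["abc1", "abc2", "abcd"],
})

out = df.assign(var2=df["var2"].str.split("/")).explode("var2")

# True for every row except the first leg of each journey, because explode
# repeats the original index label for each leg.
is_later_leg = out.index.duplicated(keep="first")

# Overwrite var1 only on later legs, using the previous row's var2.
out["var1"] = out["var1"].mask(is_later_leg, out["var2"].shift())
print(out)
```

Here the second journey keeps A as the start of its first leg, which a comparison against the previous row's var1 would not guarantee.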

Dummy Data

The data originally posted in the question:

import pandas as pd

df = pd.DataFrame([{'var1':'A', 'var2':'x/y/z', 'var3':'abc1'}, 
                   {'var1':'B', 'var2':'xx/yy', 'var3':'abc2'}, 
                   {'var1':'c', 'var2':'zz', 'var3':'abcd'}])
