
Order String Sequences Within A Cell

I have the following data in a column of a pandas DataFrame:

col_1
,B91-10,B7A-00,B7B-00,B0A-01,B0A-00,B64-03,B63-00,B7B-01
,B8A-01,B5H-02,B32-02,B57-00
,B83-01,B83-00,B5H-00
,B83-

Solution 1:

Option 1: If you want to sort these lexicographically, split on the comma and then use np.sort:

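import numpy as np
import pandas as pd

# Pad short rows with '' so every value is a string; '' sorts first and is stripped after the join.
v = np.sort(df.col_1.str.split(',', expand=True).fillna(''), axis=1)
df = pd.DataFrame(v).agg(','.join, axis=1).str.strip(',')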

df

0    B0A-00,B0A-01,B63-00,B64-03,B7A-00,B7B-00,B7B-...
1                          B32-02,B57-00,B5H-02,B8A-01
2                                 B5H-00,B83-00,B83-01
3                                        B83-00,B83-01
4                                        B83-00,B83-01
5                          B0N-02,B83-00,B92-00,B92-01
6                                               B91-16
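
For readers who want to run Option 1 end to end, here is a minimal sketch. The sample frame is an assumption: it is rebuilt from the first three rows visible in the question, since the rest of the data is cut off. The names `parts` and `result` are illustrative only, and the result is kept in its own variable rather than overwriting `df`.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the first three rows visible in the question (an assumption).
df = pd.DataFrame({
    "col_1": [
        ",B91-10,B7A-00,B7B-00,B0A-01,B0A-00,B64-03,B63-00,B7B-01",
        ",B8A-01,B5H-02,B32-02,B57-00",
        ",B83-01,B83-00,B5H-00",
    ]
})

# One column per token; shorter rows are padded with '' so every value is a string.
parts = df["col_1"].str.split(",", expand=True).fillna("")

# Sort each row independently, comparing tokens as plain Python strings.
v = np.sort(parts.to_numpy(dtype=object), axis=1)

# Re-join each row; the empty strings sort first, so strip the leading commas.
result = pd.DataFrame(v, index=df.index).agg(",".join, axis=1).str.strip(",")

print(result.tolist())
```

The three resulting strings should line up with rows 0 to 2 of the output shown above.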

Option 2: Split on the comma, then call apply with sorted:

df.col_1.str.split(',').apply(sorted).str.join(',').str.strip(',')

0    B0A-00,B0A-01,B63-00,B64-03,B7A-00,B7B-00,B7B-...
1                          B32-02,B57-00,B5H-02,B8A-01
2                                 B5H-00,B83-00,B83-01
3                                        B83-00,B83-01
4                                        B83-00,B83-01
5                          B0N-02,B83-00,B92-00,B92-01
6                                               B91-16
Name: col_1, dtype: object

Thanks to @Dark for the improvement!
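
The per-cell logic behind Option 2 can also be written as a plain Python function, which makes the lexicographic ordering easy to inspect. This is only a sketch: `sort_cell` is a hypothetical helper name, and the tokens `B9-00` and `B10-00` are made up to illustrate the ordering; they do not come from the question's data. Unlike the one-liner, it drops empty tokens before sorting instead of stripping commas afterwards.

```python
def sort_cell(cell: str, sep: str = ",") -> str:
    """Sort the sep-separated tokens of one cell lexicographically."""
    # Drop empty tokens (e.g. from a leading separator) before sorting.
    tokens = [tok for tok in cell.split(sep) if tok]
    return sep.join(sorted(tokens))


print(sort_cell(",B8A-01,B5H-02,B32-02,B57-00"))  # B32-02,B57-00,B5H-02,B8A-01

# Lexicographic order compares character by character, so '1' < '9' puts B10-00 first.
print(sort_cell("B9-00,B10-00"))  # B10-00,B9-00
```

On a DataFrame, the same helper could be applied with `df['col_1'].map(sort_cell)`. If the numeric parts need to sort by value rather than by character, a custom sort key would be needed, which is beyond what the answer above covers.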
