Order String Sequences Within A Cell
I have the following data in a column of a Pandas dataframe:
col_1
,B91-10,B7A-00,B7B-00,B0A-01,B0A-00,B64-03,B63-00,B7B-01
,B8A-01,B5H-02,B32-02,B57-00
,B83-01,B83-00,B5H-00
,B83-
Solution 1:
Option 1
If you want to sort these lexicographically, split on commas and then use np.sort:
import numpy as np
import pandas as pd
v = np.sort(df.col_1.str.split(',', expand=True).fillna(''), axis=1)
df = pd.DataFrame(v).agg(','.join, axis=1).str.strip(',')
df
0 B0A-00,B0A-01,B63-00,B64-03,B7A-00,B7B-00,B7B-...
1 B32-02,B57-00,B5H-02,B8A-01
2 B5H-00,B83-00,B83-01
3 B83-00,B83-01
4 B83-00,B83-01
5 B0N-02,B83-00,B92-00,B92-01
6 B91-16
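To make Option 1 easier to try out, here is a minimal, self-contained sketch of the same steps. It assumes the sample frame holds only the three rows shown in full in the question; the names tokens, sorted_tokens and option_1 are illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the question (only the rows shown in full).
df = pd.DataFrame({
    "col_1": [
        ",B91-10,B7A-00,B7B-00,B0A-01,B0A-00,B64-03,B63-00,B7B-01",
        ",B8A-01,B5H-02,B32-02,B57-00",
        ",B83-01,B83-00,B5H-00",
    ]
})

# One column per token; shorter rows are padded with '' so that every
# cell is a string and the rows can be compared element by element.
tokens = df["col_1"].str.split(",", expand=True).fillna("")

# Sort each row independently; the empty strings sort to the front.
sorted_tokens = np.sort(tokens.to_numpy(), axis=1)

# Re-join each row and strip the leading commas left by the empty strings.
option_1 = pd.Series(
    [",".join(row) for row in sorted_tokens], index=df.index
).str.strip(",")

print(option_1.tolist())
```

Note that the comparison is character by character, so B57-00 sorts before B5H-02 because '7' precedes 'H'.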
Option 2
Split on commas and call apply + sorted:
df.col_1.str.split(',').apply(sorted).str.join(',').str.strip(',')
0 B0A-00,B0A-01,B63-00,B64-03,B7A-00,B7B-00,B7B-...
1 B32-02,B57-00,B5H-02,B8A-01
2 B5H-00,B83-00,B83-01
3 B83-00,B83-01
4 B83-00,B83-01
5 B0N-02,B83-00,B92-00,B92-01
6 B91-16
Name: col_1, dtype: object
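The apply + sorted approach can also be read as "sort each cell on its own". The sketch below spells that out with a small helper and checks it against the chained string-method version. The helper name sort_cell and the two-row sample are assumptions made for illustration; the helper drops empty tokens up front instead of stripping commas afterwards.

```python
import pandas as pd

# Two sample cells copied from the question.
col = pd.Series(
    [",B8A-01,B5H-02,B32-02,B57-00", ",B83-01,B83-00,B5H-00"],
    name="col_1",
)


def sort_cell(cell: str) -> str:
    """Hypothetical helper: sort the comma-separated tokens of one cell."""
    # Skip the empty token produced by the leading comma.
    return ",".join(sorted(tok for tok in cell.split(",") if tok))


# Per-cell version using the helper.
option_2 = col.apply(sort_cell)

# Chained version: split, sort each list, re-join, strip leading commas.
chained = col.str.split(",").apply(sorted).str.join(",").str.strip(",")

print(option_2.tolist())
print(chained.tolist())
```

Both forms should give the same strings for this sample; the helper is mainly useful if the per-cell logic later needs to grow, for example to pass a custom key to sorted.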
Thanks to @Dark for the improvement!