
Pandas: Set Row Values To Letter Of The Alphabet Corresponding To Index Number?

I have a dataframe:

   a  b   c  country
0  5  7  11  Morocco
1  5  9   9  Nigeria
2  6  2  13    Spain

I'd like to add a column e that is the letter of the alphabe

Solution 1:

One method would be to convert the index to a Series and then call apply and pass a lambda:

In[271]:
df['e'] = df.index.to_series().apply(lambda x: chr(ord('a') + x)).str.upper()
df

Out[271]: 
   a  b   c  country  e
0  5  7  11  Morocco  A
1  5  9   9  Nigeria  B
2  6  2  13    Spain  C

Basically, your error here is that df.index is of type Int64Index, and the chr function expects a single integer, not an index object. Calling apply on a Series applies the function to each element in turn, so each index value is converted individually.
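To make the failure concrete, here is a minimal sketch that rebuilds the example frame (values copied from the question) and shows that chr rejects the whole index but works element by element. The flag name whole_index_failed is made up for this illustration, and only the exception type is checked because the message wording may differ between versions.

```python
import pandas as pd

# Example frame rebuilt from the question
df = pd.DataFrame({
    "a": [5, 5, 6],
    "b": [7, 9, 2],
    "c": [11, 9, 13],
    "country": ["Morocco", "Nigeria", "Spain"],
})

# chr expects one integer, so passing the whole index fails
try:
    chr(ord("a") + df.index)
    whole_index_failed = False
except TypeError:
    whole_index_failed = True

# Converting one element at a time works
letters = [chr(ord("a") + x).upper() for x in df.index]
print(whole_index_failed, letters)
```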

I think performance-wise a list comprehension will be faster:

In[273]:
df['e'] = [chr(ord('a') + x).upper() for x in df.index]
df

Out[273]: 
   a  b   c  country  e
0  5  7  11  Morocco  A
1  5  9   9  Nigeria  B
2  6  2  13    Spain  C

Timings

%timeit df.index.to_series().apply(lambda x: chr(ord('a') + x)).str.upper()
%timeit [chr(ord('a') + x).upper() for x in df.index]
1000 loops, best of 3: 491 µs per loop
100000 loops, best of 3: 19.2 µs per loop

Here the list comprehension is roughly 25 times faster (19.2 µs versus 491 µs per loop).

Solution 2:

Here is an alternative functional solution. It assumes you have at most 26 countries, one per letter.

from string import ascii_uppercase
from operator import itemgetter

df['e'] = itemgetter(*df.index)(ascii_uppercase)

print(df)

   a  b   c  country  e
0  5  7  11  Morocco  A
1  5  9   9  Nigeria  B
2  6  2  13    Spain  C
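The size assumption can be checked directly. The sketch below is only illustrative, and its variable names are made up for this example. It suggests that itemgetter raises IndexError once an index value reaches 26, and that a single index value yields one string rather than a tuple.

```python
from operator import itemgetter
from string import ascii_uppercase

# Index values 0..25 map cleanly onto the 26 uppercase letters
full = itemgetter(*range(26))(ascii_uppercase)

# An index value of 26 has no matching letter
try:
    itemgetter(*range(27))(ascii_uppercase)
    out_of_range = False
except IndexError:
    out_of_range = True

# With a single index value, itemgetter returns one string, not a tuple
single = itemgetter(0)(ascii_uppercase)
print(len(full), out_of_range, single)
```

If the frame may grow past 26 rows, a range check on df.index before the assignment would be one way to fail early with a clearer message.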

Solution 3:

You can also use map on the values of df.index:

# list(...) is needed in Python 3, where map returns a lazy iterator
df['e'] = list(map(chr, ord('A') + df.index.values))

A speed comparison of the approaches:

# Edchum
%timeit df.index.to_series().apply(lambda x: chr(ord('A') + x))
10000 loops, best of 3: 135 µs per loop
%timeit [chr(ord('A') + x) for x in df.index]
100000 loops, best of 3: 7.38 µs per loop
# jpp
%timeit itemgetter(*df.index)(ascii_uppercase)
100000 loops, best of 3: 7.23 µs per loop
# Me
%timeit map(chr,ord('A') + df.index.values)
100000 loops, best of 3: 3.12 µs per loop

So map seems to be the fastest, but that might be an artifact of the small data sample.
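One way to see whether the ranking survives on more data is to repeat the timing on a longer frame. The following is a rough sketch, not a benchmark: the frame big, its 10,000 rows, and the helper names are assumptions made for illustration. Past the first 26 rows the characters are no longer letters, which does not matter for timing. No timings are quoted, since they depend on the machine and library versions.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical larger frame; 10_000 rows is an arbitrary choice
big = pd.DataFrame({"a": np.zeros(10_000)})

def with_comprehension():
    return [chr(ord("A") + x) for x in big.index]

def with_map():
    # list(...) forces evaluation, since map is lazy in Python 3
    return list(map(chr, ord("A") + big.index.values))

for func in (with_comprehension, with_map):
    seconds = timeit.timeit(func, number=20)
    print(f"{func.__name__}: {seconds / 20 * 1e6:.1f} us per call")
```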
