
The Truth Value Of A Series Is Ambiguous In Dataframe

I have the same code. I'm trying to create a new field in a pandas DataFrame with simple conditions: if df_reader['email1_b']=='NaN': df_reader['email1_fin']=df_reader['email1_a'] e

Solution 1:

df_reader['email1_b']=='NaN' is a vector of Boolean values (one per row), but you need one Boolean value for if to work. Use this instead:

import numpy as np

df_reader['email1_fin'] = np.where(df_reader['email1_b']=='NaN',
                                   df_reader['email1_a'],
                                   df_reader['email1_b'])
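
To see both the error and the fix side by side, here is a minimal sketch. The column names follow the question, but the sample values are made up for illustration, and `raised` is just a helper flag.

```python
import numpy as np
import pandas as pd

# Toy data: column names follow the question, the values are invented.
df_reader = pd.DataFrame({
    "email1_a": ["a1@example.com", "a2@example.com", "a3@example.com"],
    "email1_b": ["NaN", "b2@example.com", "NaN"],  # the literal string 'NaN'
})

# Using the comparison directly in `if` raises ValueError,
# because the comparison yields one Boolean per row.
try:
    if df_reader["email1_b"] == "NaN":
        pass
    raised = False
except ValueError:
    raised = True

# np.where chooses element-wise: email1_a where the condition holds,
# email1_b everywhere else.
df_reader["email1_fin"] = np.where(df_reader["email1_b"] == "NaN",
                                   df_reader["email1_a"],
                                   df_reader["email1_b"])
```

After this runs, email1_fin holds the email1_a value for rows where email1_b was the string 'NaN', and the email1_b value otherwise.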

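As a side note, are you sure the column holds the string 'NaN' and not the missing-value marker NaN (np.nan)? If it is the latter, the == comparison will never match, because NaN is not equal to anything, including itself. In that case, your expression should be: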

df_reader['email1_fin'] = np.where(df_reader['email1_b'].isnull(), 
                                   df_reader['email1_a'],
                                   df_reader['email1_b'])
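
The difference between the string 'NaN' and a real missing value can be checked with a small sketch like the one below. The sample values are invented, and the `fillna` line is an alternative I am adding for comparison, not something the original answer proposes.

```python
import numpy as np
import pandas as pd

# Toy data with a real missing value (np.nan), not the string 'NaN'.
df_reader = pd.DataFrame({
    "email1_a": ["a1@example.com", "a2@example.com"],
    "email1_b": [np.nan, "b2@example.com"],
})

# Comparing against the string 'NaN' matches nothing here.
matches_string = (df_reader["email1_b"] == "NaN").any()

# NaN is not even equal to itself, so == cannot be used to find it.
nan_equals_itself = (np.nan == np.nan)

# isnull() flags the missing row.
is_missing = df_reader["email1_b"].isnull()

# The np.where form from the answer above.
via_where = pd.Series(np.where(is_missing,
                               df_reader["email1_a"],
                               df_reader["email1_b"]))

# Alternative: fill missing values from the other column (aligned on the index).
via_fillna = df_reader["email1_b"].fillna(df_reader["email1_a"])
```

Both `via_where` and `via_fillna` should end up with the same values for this data; `fillna` is simply a shorter way to say "use email1_a where email1_b is missing".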

Solution 2:

if expects a single scalar truth value; it does not know what to do with an array of Booleans, which is what your condition returns. Think about it: what should it do if only some of the values in that array are True?
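
If a single yes/no answer for the whole column really is what you want, the Boolean Series has to be reduced explicitly, for example with any() or all(). A minimal sketch with made-up values (`mask` and `message` are illustrative names):

```python
import pandas as pd

# One Boolean per row: True where the value is the string 'NaN'.
mask = pd.Series(["NaN", "b2@example.com", "NaN"]) == "NaN"

any_true = mask.any()  # True if at least one row matches
all_true = mask.all()  # True only if every row matches

# A reduced value is a single Boolean, so `if` accepts it.
if mask.any():
    message = "at least one row holds 'NaN'"
else:
    message = "no row holds 'NaN'"
```

Neither reduction produces a per-row result, though, and a per-row result is what the question is after.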

To do this properly, you can do the following:

df_reader['email1_fin'] = np.where(df_reader['email1_b'] == 'NaN', df_reader['email1_a'], df_reader['email1_b'])

Also, you seem to be comparing against the string 'NaN' rather than the numerical NaN. Is this intended?
