
Alternate Method To Avoid Loop In Pandas Dataframe

I have the following dataframe: table2 = pd.DataFrame({ 'Product Type': ['A', 'B', 'C', 'D'], 'State_1_Value': [10, 11, 12, 13], 'State_2_Value': [20, 21, 22, 2

Solution 1:

I was able to accomplish this with no loops, using the code below.

On my 10k x 200 table it ran in 3 minutes instead of the previous 2 hours.

Unfortunately, I now need to run it on a 10k x 4k table, where I hit a MemoryError, but that may be outside the scope of this question.

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Product Type': ['A', 'B', 'C', 'D'],
    'State_1_Value': [10, 11, 12, 13],
    'State_2_Value': [20, 21, 22, 23],
    'State_3_Value': [30, 31, 32, 33],
    'State_4_Value': [40, 41, 42, 43],
    'State_5_Value': [50, 51, 52, 53],
    'State_6_Value': [60, 61, 62, 63],
    'Lower_Bound': [-1, 1, .5, 5],
    'Upper_Bound': [1, 2, .625, 15],
    'sim_1': [0, 0, .61, 7],
    'sim_2': [1, 1.5, .7, 9],
})


# Fractional bucket of each simulated value: 1 at Lower_Bound, 6 at Upper_Bound
# (.ix was removed in pandas 1.0, so the sim columns are selected by name)
sims = df.filter(regex='^sim_')
span = df['Upper_Bound'] - df['Lower_Bound']
buckets = sims.sub(df['Lower_Bound'], axis=0).div(span, axis=0) * 5 + 1
# Neighbouring state indices, clamped so that low is in 1..5 and high in 2..6
low = buckets.astype(int).clip(1, 5)
high = (buckets.astype(int) + 1).clip(2, 6)
# Column 0 is 'Product Type', so column k holds State_k_Value
values = df.filter(regex='State|Type').values
rows = np.arange(len(df))[:, None]
low_value = pd.DataFrame(values[rows, low.values])
high_value = pd.DataFrame(values[rows, high.values])
# Linear interpolation between the two neighbouring state values
df1 = (high_value - low_value).mul((buckets - low).values) + low_value
df1['Product Type'] = df['Product Type']
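
The same interpolation can also be written against plain NumPy float arrays. The block below is only a sketch of that idea, not the code I originally ran: the function name `interpolate_states` and the `state_cols` list are made up for illustration, and it assumes the state columns are numeric and named `State_1_Value` through `State_6_Value`. It uses zero-based positions, so it leaves out the 'Product Type' column and avoids the object-dtype array that the version above builds.

```python
import numpy as np
import pandas as pd


def interpolate_states(states, lower, upper, sims):
    """Linearly interpolate (or extrapolate) a state value for each simulation.

    states: (n_rows, n_states) numeric array; column k-1 holds State_k_Value
    lower, upper: (n_rows,) bounds mapped to the first and last state
    sims: (n_rows, n_sims) simulated values
    """
    states = np.asarray(states, dtype=float)
    lower = np.asarray(lower, dtype=float)[:, None]
    upper = np.asarray(upper, dtype=float)[:, None]
    sims = np.asarray(sims, dtype=float)
    n_states = states.shape[1]

    # Zero-based fractional position: 0 at lower, n_states - 1 at upper
    pos = (sims - lower) / (upper - lower) * (n_states - 1)
    # Left neighbour, clamped so that left + 1 is still a valid column
    left = np.clip(np.trunc(pos).astype(int), 0, n_states - 2)
    rows = np.arange(states.shape[0])[:, None]
    low_value = states[rows, left]
    high_value = states[rows, left + 1]
    # Values outside the bounds extrapolate along the outermost segment
    return low_value + (high_value - low_value) * (pos - left)


df = pd.DataFrame({
    'Product Type': ['A', 'B', 'C', 'D'],
    'State_1_Value': [10, 11, 12, 13],
    'State_2_Value': [20, 21, 22, 23],
    'State_3_Value': [30, 31, 32, 33],
    'State_4_Value': [40, 41, 42, 43],
    'State_5_Value': [50, 51, 52, 53],
    'State_6_Value': [60, 61, 62, 63],
    'Lower_Bound': [-1, 1, .5, 5],
    'Upper_Bound': [1, 2, .625, 15],
    'sim_1': [0, 0, .61, 7],
    'sim_2': [1, 1.5, .7, 9],
})

# Hypothetical helper list: the six state columns in order
state_cols = [f'State_{k}_Value' for k in range(1, 7)]
result = interpolate_states(
    df[state_cols], df['Lower_Bound'], df['Upper_Bound'], df[['sim_1', 'sim_2']]
)
```

For the four sample rows this should give roughly 35 and 60 for A, -39 and 36 for B, 56 and 92 for C, and 23 and 33 for D, matching the pandas version. Because the function only takes arrays, it could in principle be called on a slice of the sim columns at a time, which may help with peak memory on a very wide table, though I have not measured that.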
