
How To Calculate Volume Weighted Average Price (vwap) Using A Pandas Dataframe With Ask And Bid Price?

How do I create another column called vwap that calculates the VWAP, given a table with the columns shown below?

time  bid_size  bid  ask  ask_size  trade  trade_size

Solution 1:

You can use np.where to give you the price from the correct column (bid or ask) depending on the value in the trade column. Note that this gives you the bid price when no trade occurs, but because this is then multiplied by a NaN trade size it won't matter. I also forward filled the VWAP.

import numpy as np

volume = df['trade_size']
price = np.where(df['trade'].eq('ask'), df['ask'], df['bid'])
df = df.assign(VWAP=((volume * price).cumsum() / volume.cumsum()).ffill())

>>> df
                          time  bid_size     bid     ask  ask_size trade  trade_size phase    VWAP
0   2019-01-07 07:45:01.064515       495  152.52  152.54        19   NaN         NaN  OPEN     NaN
1   2019-01-07 07:45:01.110072        31  152.53  152.54        19   NaN         NaN  OPEN     NaN
2   2019-01-07 07:45:01.116596        32  152.53  152.54        19   NaN         NaN  OPEN     NaN
3   2019-01-07 07:45:01.116860        32  152.53  152.54        21   NaN         NaN  OPEN     NaN
4   2019-01-07 07:45:01.116905        34  152.53  152.54        21   NaN         NaN  OPEN     NaN
5   2019-01-07 07:45:01.116982        34  152.53  152.54        31   NaN         NaN  OPEN     NaN
6   2019-01-07 07:45:01.147901        38  152.53  152.54        31   NaN         NaN  OPEN     NaN
7   2019-01-07 07:45:01.189971        38  152.53  152.54        31   ask        15.0  OPEN  152.54
8   2019-01-07 07:45:01.189971        38  152.53  152.54        16   NaN         NaN  OPEN  152.54
9   2019-01-07 07:45:01.190766        37  152.53  152.54        16   NaN         NaN  OPEN  152.54
10  2019-01-07 07:45:01.190856        37  152.53  152.54        15   NaN         NaN  OPEN  152.54
11  2019-01-07 07:45:01.190856        37  152.53  152.54        16   ask         1.0  OPEN  152.54
12  2019-01-07 07:45:01.193938        37  152.53  152.55       108   NaN         NaN  OPEN  152.54
13  2019-01-07 07:45:01.193938        37  152.53  152.54        15   ask        15.0  OPEN  152.54
14  2019-01-07 07:45:01.194326         2  152.54  152.55       108   NaN         NaN  OPEN  152.54
15  2019-01-07 07:45:01.194453         2  152.54  152.55        97   NaN         NaN  OPEN  152.54
16  2019-01-07 07:45:01.194479         6  152.54  152.55        97   NaN         NaN  OPEN  152.54
17  2019-01-07 07:45:01.194507        19  152.54  152.55        97   NaN         NaN  OPEN  152.54
18  2019-01-07 07:45:01.194532        19  152.54  152.55        77   NaN         NaN  OPEN  152.54
19  2019-01-07 07:45:01.194598        19  152.54  152.55        79   NaN         NaN  OPEN  152.54
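To try the same idea without the original data, here is a minimal sketch on a small made-up frame. The values are hypothetical and chosen only so the result is easy to check by hand; the column names follow the question.

```python
import numpy as np
import pandas as pd

# Hypothetical toy quotes: two of the four rows carry a trade.
df = pd.DataFrame({
    "bid":        [10.0, 10.0, 10.0, 10.0],
    "ask":        [11.0, 11.0, 11.0, 11.0],
    "trade":      [None, "ask", None, "bid"],
    "trade_size": [np.nan, 2.0, np.nan, 6.0],
})

# Ask price for ask-side trades, bid price otherwise (including no-trade rows).
price = np.where(df["trade"].eq("ask"), df["ask"], df["bid"])
volume = df["trade_size"]

# Running traded value divided by running traded volume.
# cumsum skips the NaN rows; ffill then carries the last VWAP forward.
df["VWAP"] = ((volume * price).cumsum() / volume.cumsum()).ffill()

print(df["VWAP"].tolist())  # [nan, 11.0, 11.0, 10.25]
```

The last value is (2 * 11 + 6 * 10) / (2 + 6) = 10.25, and the first row stays NaN because no trade has occurred yet.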

Solution 2:

Here is one possible approach

Append a VMAP column (this answer's name for the VWAP) full of NaNs

df['VMAP'] = np.nan

Calculate the VMAP (based on this equation provided by the OP) and assign values based on ask or bid, as required by the OP

for trade in ['ask', 'bid']:
    # Find the index labels of the rows traded on this side (`ask` or `bid`)
    side_idx = df[df.trade == trade].index

    # Running VWAP for this side only; the column named by `trade` holds its price
    df.loc[side_idx, 'VMAP'] = (
        (df.loc[side_idx, 'trade_size'] * df.loc[side_idx, trade]).cumsum()
        / df.loc[side_idx, 'trade_size'].cumsum()
    )

print(df.iloc[:,1:])
               time  bid_size     bid     ask  ask_size trade  trade_size phase    VMAP
0   07:45:01.064515       495  152.52  152.54        19   NaN         NaN  OPEN     NaN
1   07:45:01.110072        31  152.53  152.54        19   NaN         NaN  OPEN     NaN
2   07:45:01.116596        32  152.53  152.54        19   NaN         NaN  OPEN     NaN
3   07:45:01.116860        32  152.53  152.54        21   NaN         NaN  OPEN     NaN
4   07:45:01.116905        34  152.53  152.54        21   NaN         NaN  OPEN     NaN
5   07:45:01.116982        34  152.53  152.54        31   NaN         NaN  OPEN     NaN
6   07:45:01.147901        38  152.53  152.54        31   NaN         NaN  OPEN     NaN
7   07:45:01.189971        38  152.53  152.54        31   ask        15.0  OPEN  152.54
8   07:45:01.189971        38  152.53  152.54        16   NaN         NaN  OPEN     NaN
9   07:45:01.190766        37  152.53  152.54        16   NaN         NaN  OPEN     NaN
10  07:45:01.190856        37  152.53  152.54        15   NaN         NaN  OPEN     NaN
11  07:45:01.190856        37  152.53  152.54        16   ask         1.0  OPEN  152.54
12  07:45:01.193938        37  152.53  152.55       108   NaN         NaN  OPEN     NaN
13  07:45:01.193938        37  152.53  152.54        15   ask        15.0  OPEN  152.54
14  07:45:01.194326         2  152.54  152.55       108   NaN         NaN  OPEN     NaN
15  07:45:01.194453         2  152.54  152.55        97   NaN         NaN  OPEN     NaN
16  07:45:01.194479         6  152.54  152.55        97   NaN         NaN  OPEN     NaN
17  07:45:01.194507        19  152.54  152.55        97   NaN         NaN  OPEN     NaN
18  07:45:01.194532        19  152.54  152.55        77   NaN         NaN  OPEN     NaN
19  07:45:01.194598        19  152.54  152.55        79   NaN         NaN  OPEN     NaN

EDIT

As @edinho correctly indicated, the VMAP is the same as the trade_price column.
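The loop in this solution keeps a separate running average for each side. As a rough alternative sketch (not the answerer's code), the same per-side result can be obtained with groupby and cumsum. The toy data below is hypothetical; the VMAP column name is kept only to match this answer.

```python
import numpy as np
import pandas as pd

# Hypothetical toy quotes with trades on both sides.
df = pd.DataFrame({
    "bid":        [10.0, 10.0, 10.0, 10.0, 12.0],
    "ask":        [11.0, 11.0, 11.0, 11.0, 13.0],
    "trade":      [None, "ask", "bid", None, "ask"],
    "trade_size": [np.nan, 2.0, 4.0, np.nan, 2.0],
})

traded = df["trade"].notna()

# Price on the traded side, then traded value (price * size) per row.
price = np.where(df["trade"].eq("ask"), df["ask"], df["bid"])
value = df["trade_size"] * price

# Cumulative value and volume, accumulated separately for 'ask' and 'bid'.
side = df.loc[traded, "trade"]
cum_value = value[traded].groupby(side).cumsum()
cum_volume = df.loc[traded, "trade_size"].groupby(side).cumsum()

# Rows without a trade are absent from the result and therefore stay NaN.
df["VMAP"] = cum_value / cum_volume

print(df["VMAP"].tolist())  # [nan, 11.0, 10.0, nan, 12.0]
```

Unlike Solution 1, the ask and bid trades never mix here: the last row is (2 * 11 + 2 * 13) / 4 = 12.0 because only ask-side trades contribute to it.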

Solution 3:

Ok, here it is

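# Price on the traded side; rows without a trade fall through to the ask price
df['trade_price'] = df.apply(lambda x: x['bid'] if x['trade'] == 'bid' else x['ask'], axis=1)
# Cumulative traded value divided by cumulative traded volume
df['vwap'] = (df['trade_price'] * df['trade_size']).cumsum() / df['trade_size'].fillna(0).cumsum()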

First line: it saves the trade price in a new column, so it is easier to retrieve later. If you want, you can delete this line and write a function instead (which may be easier to read), but I prefer to see the intermediate results.

Q: Why does it have values even when there is no trade?
A: Because of the way the lambda is written: the else branch picks up the ask price. It won't make a difference, because of the next step.

Second line: here the real calculation takes place. The first part calculates the cumulative traded value (price times size) up to that moment (as you said, using cumulative sums makes life easier). The second part calculates the total volume traded up to that moment (again, a cumulative sum). If you want, you can break this line up and create more intermediate columns.

Q: Why the fillna(0)?
A: So the cumulative volume doesn't contain NaNs on rows without a trade.

Q: Why so many NaNs in the vwap column?
A: Because of the rows that have no trade. You could fill them with 0s, but it is better to keep the 'no trade' information.
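As a sketch of what breaking the second line into intermediate columns could look like: the names traded_value, cum_value and cum_volume are made up for illustration, the data is hypothetical, and np.where stands in for the lambda (it selects the same prices).

```python
import numpy as np
import pandas as pd

# Hypothetical toy quotes: two of the four rows carry a trade.
df = pd.DataFrame({
    "bid":        [10.0, 10.0, 10.0, 10.0],
    "ask":        [11.0, 11.0, 11.0, 11.0],
    "trade":      [None, "ask", None, "bid"],
    "trade_size": [np.nan, 2.0, np.nan, 6.0],
})

# Bid price for bid-side trades, ask price otherwise (including no-trade rows).
df["trade_price"] = np.where(df["trade"].eq("bid"), df["bid"], df["ask"])

# Value of each trade; NaN where there is no trade because trade_size is NaN.
df["traded_value"] = df["trade_price"] * df["trade_size"]

# Numerator: running traded value (stays NaN on no-trade rows).
df["cum_value"] = df["traded_value"].cumsum()

# Denominator: running traded volume, defined on every row thanks to fillna(0).
df["cum_volume"] = df["trade_size"].fillna(0).cumsum()

df["vwap"] = df["cum_value"] / df["cum_volume"]

print(df[["cum_value", "cum_volume", "vwap"]])
```

With this data cum_volume is 0, 2, 2, 8 and vwap is NaN, 11.0, NaN, 10.25, so the NaNs in vwap mark exactly the rows without a trade.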

P.S.: The result may not be what you expect, because this treats volume and price on both sides as going in the same direction. If you need the volume to be directional, you could flip a sign (for instance, making the ask price negative).

This code outputs:

    trade_price    vwap
1        152.54     NaN
2        152.54     NaN
3        152.54     NaN
4        152.54     NaN
5        152.54     NaN
6        152.54     NaN
7        152.54     NaN
8        152.54  152.54
9        152.54     NaN
10       152.54     NaN
11       152.54     NaN
12       152.54  152.54
13       152.55     NaN
14       152.54  152.54
15       152.55     NaN
16       152.55     NaN
17       152.55     NaN
18       152.55     NaN
19       152.55     NaN
20       152.55     NaN
