
How To Shift Dates In A Pandas Dataframe (add X Months)?

I have a dataframe with columns of dates. I know how to shift dates by a fixed number of months (e.g. add 3 months to all the dates in column x); however, I cannot figure out how to

Solution 1:

IIUC you could use apply with axis=1:

In [23]: df.apply(lambda x: x['mydate'] + pd.DateOffset(months=x['month shift']), axis=1)
Out[23]:
0   2000-03-01
1   2001-04-01
2   2002-05-01
3   2003-06-01
4   2004-07-01
5   2005-08-01
6   2006-09-01
7   2007-10-01
8   2008-11-01
9   2009-12-01
dtype: datetime64[ns]
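For anyone who wants to run this on its own, here is a minimal sketch that builds the example frame first. The column names `mydate` and `month shift` follow the other answers; the `int()` cast is my own precaution so that a NumPy integer never reaches `pd.DateOffset`, and may not be strictly needed on every pandas version.

```python
import numpy as np
import pandas as pd

# Example frame: 1 March of the years 2000..2009, each shifted by 0..9 months.
df = pd.DataFrame({
    "mydate": pd.to_datetime([f"{year}-03-01" for year in range(2000, 2010)]),
    "month shift": np.arange(10),
})

# Row-wise shift: one DateOffset is built per row, which is why this is slow
# on large frames. int() is a precaution against NumPy integer types.
shifted = df.apply(
    lambda row: row["mydate"] + pd.DateOffset(months=int(row["month shift"])),
    axis=1,
)
print(shifted)
```

Because `DateOffset(months=...)` does real calendar arithmetic, the day of the month is preserved (and clipped to the month end when needed).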

Solution 2:

"one"-liner using the underlying numpy functionality:

df['my date shifted'] = (
    df["mydate"].values.astype("datetime64[M]") 
    + df["month shift"].values.astype("timedelta64[M]")
)
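One thing to be aware of with this approach: casting to `datetime64[M]` drops the day of the month, so every result lands on the first of the target month. A small self-contained sketch, with dates I made up to show the truncation (the final cast back to `datetime64[ns]` is my addition so the column dtype does not depend on the pandas version):

```python
import numpy as np
import pandas as pd

# Made-up dates that are not all on the first of the month.
df = pd.DataFrame({
    "mydate": pd.to_datetime(["2000-03-01", "2001-03-15", "2002-03-31"]),
    "month shift": [0, 1, 11],
})

months = df["mydate"].values.astype("datetime64[M]")       # day-of-month is dropped here
shift = df["month shift"].values.astype("timedelta64[M]")  # integers become month counts
df["my date shifted"] = (months + shift).astype("datetime64[ns]")

print(df)
```

If the day of the month has to survive the shift, this is not the right tool; for dates that are already month starts it is exact and fully vectorized.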

Solution 3:

EdChurn's solution is indeed much faster than Anton Protopopov's answer: in my use case it runs in milliseconds, whereas the version with apply takes minutes. The problem is that the solution EdChurn posted in his comment gives slightly incorrect results. Consider this example:

import pandas as pd
import numpy as np
import datetime

df = pd.DataFrame()
df['year'] = np.arange(2000,2010)
df['month'] = 3

df['mydate'] = pd.to_datetime((df.year * 10000 + df.month * 100 + 1).apply(str), format='%Y%m%d')
df['month shift'] = np.arange(0,10)

The answer of:

df['my date shifted'] = df['mydate'] + pd.TimedeltaIndex( df['month shift'], unit='M')

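gives slightly wrong dates (the original post showed the output of EdChurn's solution as an image, which is not reproduced here).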
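The reason, as far as I understand it, is that a timedelta has a fixed length, so `unit='M'` could only ever mean an average month. The sketch below assumes that average is 365.2425 / 12 = 30.436875 days and spells it out in days, since I believe recent pandas versions reject `unit='M'` for timedeltas altogether. The three dates are a shortened version of the example frame.

```python
import pandas as pd

# Assumption: unit='M' stood for the mean Gregorian month, 365.2425 / 12 days.
AVG_MONTH_DAYS = 365.2425 / 12  # 30.436875

mydate = pd.Series(pd.to_datetime(["2000-03-01", "2001-03-01", "2002-03-01"]))
month_shift = pd.Series([0, 1, 2])

# Adding a fixed-length "month" instead of a calendar month.
approx = mydate + pd.to_timedelta(month_shift * AVG_MONTH_DAYS, unit="D")

# Drop the time of day to make the calendar drift easy to see.
print(approx.dt.normalize())
```

March has 31 days, so one average month after 2001-03-01 is still 2001-03-31, and two average months after 2002-03-01 is 2002-04-30 rather than 2002-05-01.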

The correct solution can be obtained with:

def set_to_month_begin(series):
    # Following doesn't work:
    #   res = series.dt.floor("MS")
    # This also doesn't work (it fails in case the date is already the first day of the month):
    #   res = series - pd.offsets.MonthBegin(1)

    res = pd.to_datetime(series).dt.normalize()
    res = res - pd.to_timedelta(res.dt.day - 1, unit='d')
    return res

def add_months(df, date_col, months_num_col):
    """This function adds the number of months specified per each row in `months_num_col` to date in `date_col`.
    This method is *significantly* faster than:
        df.apply(lambda x: x[date_col] + pd.DateOffset(months = x[months_num_col]), axis=1)
    """
    number_of_days_in_avg_month = 365.24 / 12
    time_delta = pd.to_timedelta(df[months_num_col] * number_of_days_in_avg_month + 10, unit='D')
    return set_to_month_begin(df[date_col] + time_delta)

df['my date shifted'] = add_months(df, 'mydate', 'month shift')

This gives the correct result (shown as an image in the original post, which is not reproduced here).
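A quick way to convince yourself is to compare the result against NumPy's calendar-month arithmetic. The sketch below is a compact restatement of the idea, not the exact code above; `add_months_approx` and `add_months_numpy` are hypothetical helper names of my own. It also shows a limitation: the trick of overshooting by 10 days and snapping back to day 1 assumes the input dates are already at the start of a month.

```python
import numpy as np
import pandas as pd

def add_months_approx(dates, months):
    """Average-month shift plus 10 days, then snap back to day 1 (hypothetical helper)."""
    days = months * (365.24 / 12) + 10  # overshoot into the target month
    res = (dates + pd.to_timedelta(days, unit="D")).dt.normalize()
    return res - pd.to_timedelta(res.dt.day - 1, unit="D")

def add_months_numpy(dates, months):
    """Reference: calendar-month arithmetic in NumPy (truncates to month start)."""
    out = dates.values.astype("datetime64[M]") + months.values.astype("timedelta64[M]")
    return pd.Series(out.astype("datetime64[ns]"), index=dates.index)

# The example frame from the answer: 1 March of 2000..2009, shifted by 0..9 months.
first_of_month = pd.Series(pd.to_datetime([f"{y}-03-01" for y in range(2000, 2010)]))
shifts = pd.Series(np.arange(10))

agrees = bool(
    (add_months_approx(first_of_month, shifts) == add_months_numpy(first_of_month, shifts)).all()
)
print("matches calendar arithmetic:", agrees)

# Caveat: a date late in the month is pushed past the month end by the +10 days.
late_result = add_months_approx(pd.Series(pd.to_datetime(["2000-03-25"])), pd.Series([0]))
print(late_result)
```

For month-start inputs the two agree on this example; for 2000-03-25 with a shift of 0, the approximate version lands on 2000-04-01, one month too far.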
