Python Pandas Check That String Is Only "date" Or Only "time" Or "datetime"
I am reading a CSV using pandas:
str,date,float,time,datetime
a,10/11/19,1.1,10:30:00,10/11/19 10:30
b,10/11/19,1.2,10:00:00,10/11/19 10:30
c,10/11/19,1.3,11:10:11,10/11/19 10:30
d
Solution 1:
To test for times: when pandas parses time-only strings it fills in today's date by default, so one possible solution is to compare the parsed dates with today's date using Series.dt.date, Timestamp.date and Series.all, which checks whether all values in the column match.
To test for dates, check whether the values stay the same after the times are removed with Series.dt.floor:
import pandas as pd

df = pd.DataFrame({'a': ['2019-01-01 12:23:10',
                         '2019-01-02 12:23:10'],
                   'b': ['2019-01-01',
                         '2019-01-02'],
                   'c': ['12:23:10',
                         '15:23:10'],
                   'd': ['a', 'b']})
print(df)
                     a           b         c  d
0  2019-01-01 12:23:10  2019-01-01  12:23:10  a
1  2019-01-02 12:23:10  2019-01-02  15:23:10  b
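def check(col):
    try:
        dt = pd.to_datetime(df[col])
        # Pure dates are unchanged when the time part is floored away.
        if (dt.dt.floor('D') == dt).all():
            return ('Its a pure date field')
        # Time-only strings are parsed with today's date filled in.
        elif (dt.dt.date == pd.Timestamp('now').date()).all():
            return ('Its a pure time field')
        else:
            return ('Its a Datetime field')
    except (ValueError, TypeError):
        # Parsing failed, so the column does not hold dates or times.
        return ('its not a datefield')

print(check('a'))
print(check('b'))
print(check('c'))
print(check('d'))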
Its a Datetime field
Its a pure date field
Its a pure time field
its not a datefield
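The two tests used in check can also be tried on their own. The following is only a small sketch of those building blocks; the variable names are made up for illustration, and recent pandas versions may emit a warning about format inference when parsing the time-only strings.

```python
import pandas as pd

# Time-only strings carry no date, so the parser fills in the current date.
times = pd.to_datetime(pd.Series(["12:23:10", "15:23:10"]))
today = pd.Timestamp("now").date()
all_today = bool((times.dt.date == today).all())

# Flooring to the day drops the time part; pure dates are unchanged by it.
dates = pd.to_datetime(pd.Series(["2019-01-01", "2019-01-02"]))
stamps = pd.to_datetime(pd.Series(["2019-01-01 12:23:10", "2019-01-02 12:23:10"]))
dates_unchanged = bool((dates.dt.floor("D") == dates).all())
stamps_unchanged = bool((stamps.dt.floor("D") == stamps).all())

print(all_today, dates_unchanged, stamps_unchanged)
```

Note that the today's-date test cannot tell a time-only column apart from a datetime column whose dates all happen to be today.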
Another idea is to first test whether the column is numeric and, if so, return "not a date field" right away, which prevents numeric values from being cast to datetimes. Also, if a datetime column may contain only today's dates (column f), the today's-date test for times no longer works, so times are tested with Series.str.contains instead, matching the pattern HH:MM:SS or H:MM:SS:
df = pd.DataFrame({'a': ['2019-01-01 12:23:10',
                         '2019-01-02'],
                   'b': ['2019-01-01',
                         '2019-01-02'],
                   'c': ['12:23:10',
                         '15:23:10'],
                   'd': ['a', 'b'],
                   'e': [1, 2],
                   'f': ['2019-11-13 12:23:10',
                         '2019-11-13']})
print(df)
                     a           b         c  d  e                    f
0  2019-01-01 12:23:10  2019-01-01  12:23:10  a  1  2019-11-13 12:23:10
1           2019-01-02  2019-01-02  15:23:10  b  2           2019-11-13
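import numpy as np

def check(col):
    # Numeric columns would otherwise be cast to datetimes, so reject them first.
    if np.issubdtype(df[col].dtype, np.number):
        return ('its not a datefield')
    try:
        # Note: pandas 2.0 and later infer one format from the first value, so
        # columns mixing dates and datetimes (like 'a' and 'f') may need
        # format='mixed' there.
        dt = pd.to_datetime(df[col])
        # Pure dates are unchanged when the time part is floored away.
        if (dt.dt.floor('D') == dt).all():
            return ('Its a pure date field')
        # Match the raw strings against HH:MM:SS or H:MM:SS.
        elif df[col].str.contains(r"^\d{1,2}:\d{2}:\d{2}$").all():
            return ('Its a pure time field')
        else:
            return ('Its a Datetime field')
    except (ValueError, TypeError):
        # Parsing failed, so the column does not hold dates or times.
        return ('its not a datefield')

print(check('a'))
print(check('b'))
print(check('c'))
print(check('d'))
print(check('e'))
print(check('f'))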
Its a Datetime field
Its a pure date field
Its a pure time field
its not a datefield
its not a datefield
Its a Datetime field
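The check function above reads a global df and handles one column name per call. As a rough sketch of how the same idea could be applied to every column at once, the variant below takes a Series as its argument. The names classify_series and TIME_PATTERN, the shortened labels and the sample data are all made up for illustration. It also differs from the answer's code in one respect: the time pattern is tested before parsing, so the result never depends on today's date.

```python
import pandas as pd

# Matches HH:MM:SS or H:MM:SS, the same pattern as in the answer above.
TIME_PATTERN = r"^\d{1,2}:\d{2}:\d{2}$"

def classify_series(s: pd.Series) -> str:
    """Hypothetical helper: label a column as date, time, datetime, or neither."""
    # Numeric columns would otherwise be cast to datetimes, so reject them first.
    if pd.api.types.is_numeric_dtype(s):
        return "not a date field"
    text = s.astype(str)
    # Test the raw strings for the time pattern before any parsing.
    if text.str.contains(TIME_PATTERN).all():
        return "time"
    try:
        dt = pd.to_datetime(text)
    except (ValueError, TypeError):
        return "not a date field"
    # Pure dates are unchanged when the time part is floored away.
    if (dt.dt.floor("D") == dt).all():
        return "date"
    return "datetime"

sample = pd.DataFrame({
    "dt": ["2019-01-01 12:23:10", "2019-01-02 12:23:10"],
    "d": ["2019-01-01", "2019-01-02"],
    "t": ["12:23:10", "5:23:10"],
    "s": ["foo", "bar"],
    "n": [1, 2],
})

# Classify every column in one pass.
result = {col: classify_series(sample[col]) for col in sample.columns}
print(result)
```

Passing the Series in directly keeps the function independent of any particular DataFrame, so it can be reused on columns read with pd.read_csv.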