1. Format datetime
Sample unformatted dates in my dataset:
17-Jan-18: use "%d-%b-%y" to parse, which gives 2018-01-17
Aug 20, 2017: use "%b %d, %Y" to parse (the format has to match the string exactly, including the space after the comma)
For other format codes, see the reference: http://strftime.org/
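All of these formats follow a pattern: first read the string using the format it is already in, then convert it to a datetime. After that you can use <, >, and = to subset a particular date interval. For example, an abbreviated month such as Aug is read with %b, while the full name August is read with %B. Use a lambda with apply to convert the whole column.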
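from datetime import datetime

# For strings like 17-Jan-18
def format_date1(x, col_name):
    s = x[col_name]
    return datetime.strptime(s, '%d-%b-%y')
data['Day'] = data.apply(lambda x: format_date1(x, 'Day'), axis=1)

# For strings like Aug 20, 2017 (use this instead of format_date1, not after it)
def format_date2(x, col_name):
    s = x[col_name]
    return datetime.strptime(s, '%b %d, %Y')
data['Day'] = data.apply(lambda x: format_date2(x, 'Day'), axis=1)

# sort_values returns a new DataFrame, so assign the result back
data = data.sort_values(by="Day")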
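As a possible alternative to the row-wise apply, pandas.to_datetime accepts the same format codes and converts the whole column in one call. This is only a minimal sketch: the three sample values below are made up for illustration and are not from my dataset.

```python
import pandas as pd

# Hypothetical sample data; the real dataset is not shown here.
data = pd.DataFrame({"Day": ["17-Jan-18", "03-Feb-18", "25-Dec-17"]})

# Parse the whole column at once with the same strptime-style codes.
data["Day"] = pd.to_datetime(data["Day"], format="%d-%b-%y")

# Sort chronologically and assign the result back.
data = data.sort_values(by="Day").reset_index(drop=True)
print(data)
```

If the column mixes several formats, each subset would still need its own format string, so the two helper functions stay useful in that case.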
2. Data processing with datetime conditions
For example:
data.loc[(data['Day']>='2017-07-10') & (data['Day']<="2017-07-30"),'Headline 2'] = "3 For Free Sale"
data.loc[data.Campaign == "First - Bath Tampa Bay", 'Headline 1'] = "One Day Bath Remodels"
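To check that a date-interval condition picks the rows you expect, it can help to try it on a tiny frame first. The sketch below uses made-up rows (not my dataset) and Series.between, which is inclusive on both ends by default and so matches the >= and <= comparison. It assumes the Day column has already been converted to datetime.

```python
import pandas as pd

# Hypothetical rows for illustration only.
data = pd.DataFrame({
    "Day": pd.to_datetime(["2017-07-05", "2017-07-10", "2017-07-20", "2017-08-01"]),
    "Headline 2": ["", "", "", ""],
})

# Boolean mask for rows whose date falls inside the interval (both ends included).
in_sale = data["Day"].between("2017-07-10", "2017-07-30")

# Update only the matching rows.
data.loc[in_sale, "Headline 2"] = "3 For Free Sale"
print(data)
```

Only the 2017-07-10 and 2017-07-20 rows should be updated; the rows outside the interval keep their original value.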