Using the to_datetime function in pandas
import datetime
import pandas as pd
import numpy as np
Convert a string to a datetime:
pd.to_datetime('2023-09-06')
Timestamp('2023-09-06 00:00:00')
Convert several strings to datetimes:
pd.to_datetime(['2023-09-06', '2023-09-07', '2023-09-08'])
DatetimeIndex(['2023-09-06', '2023-09-07', '2023-09-08'], dtype='datetime64[ns]', freq=None)
Handle invalid values by converting unparseable dates to NaT (Not a Time):
pd.to_datetime(['2023-09-06', '2023-09-07', 'invalid_date', '2023-09-08'], errors='coerce')
DatetimeIndex(['2023-09-06', '2023-09-07', 'NaT', '2023-09-08'], dtype='datetime64[ns]', freq=None)
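After coercing, it is often useful to see which inputs failed to parse. The following is a minimal sketch, assuming the raw strings are held in a Series; the names raw, parsed, failed, and bad_values are illustrative and not part of the original example.

```python
import pandas as pd

# Same inputs as above, kept in a Series so they can be indexed with a mask
raw = pd.Series(['2023-09-06', '2023-09-07', 'invalid_date', '2023-09-08'])
parsed = pd.to_datetime(raw, errors='coerce')

# True wherever the result is NaT, i.e. the input could not be parsed
failed = parsed.isna()

# Original strings that were turned into NaT
bad_values = raw[failed].tolist()
print(bad_values)  # ['invalid_date']
```

Note that the mask also flags entries that were already missing in the input, since those become NaT as well.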
Specify the datetime format:
pd.to_datetime('06/09/23 12:34:56', format='%d/%m/%y %H:%M:%S')
Timestamp('2023-09-06 12:34:56')
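The format string matters because a value like 06/09/23 is ambiguous. As a small sketch (the variable names are illustrative), the same text parses to two different dates depending on whether the day or the month is read first:

```python
import pandas as pd

text = '06/09/23 12:34:56'

# Day first: 6 September 2023
day_first = pd.to_datetime(text, format='%d/%m/%y %H:%M:%S')

# Month first: 9 June 2023
month_first = pd.to_datetime(text, format='%m/%d/%y %H:%M:%S')

print(day_first)    # 2023-09-06 12:34:56
print(month_first)  # 2023-06-09 12:34:56
```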
Handle Unix timestamps:
pd.to_datetime(1630899296, unit='s')
Timestamp('2021-09-06 03:34:56')
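The unit argument also accepts other resolutions such as 'ms', and utc=True returns a timezone-aware result. The sketch below reuses the timestamp from above; the target timezone 'Asia/Taipei' is only an assumption chosen for illustration.

```python
import pandas as pd

seconds = 1630899296

# Same instant expressed in seconds and in milliseconds
ts_s = pd.to_datetime(seconds, unit='s')
ts_ms = pd.to_datetime(seconds * 1000, unit='ms')

# Timezone-aware result in UTC, then converted to a local timezone
ts_utc = pd.to_datetime(seconds, unit='s', utc=True)
ts_local = ts_utc.tz_convert('Asia/Taipei')

print(ts_s == ts_ms)  # True
print(ts_local)       # 2021-09-06 11:34:56+08:00
```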
Convert multiple date columns in a DataFrame:
data = {'date1': ['2023-09-06', '2023-09-07', '2023-09-08'],
        'date2': ['2023-09-09', '2023-09-10', '2023-09-11']}
df = pd.DataFrame(data)
df['date1'] = pd.to_datetime(df['date1'])
df['date2'] = pd.to_datetime(df['date2'])
df['date1']
0 2023-09-06
1 2023-09-07
2 2023-09-08
Name: date1, dtype: datetime64[ns]
df['date2']
0 2023-09-09
1 2023-09-10
2 2023-09-11
Name: date2, dtype: datetime64[ns]
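When there are many date columns, converting them one by one gets repetitive. One possible alternative, sketched here with illustrative names (frame, date_cols, gap), is to apply pd.to_datetime to all of them at once; the converted columns then support date arithmetic directly.

```python
import pandas as pd

frame = pd.DataFrame({'date1': ['2023-09-06', '2023-09-07', '2023-09-08'],
                      'date2': ['2023-09-09', '2023-09-10', '2023-09-11']})

# Convert every listed column in one step
date_cols = ['date1', 'date2']
frame[date_cols] = frame[date_cols].apply(pd.to_datetime)

# Datetime columns can be subtracted to give a timedelta column
gap = frame['date2'] - frame['date1']
print(gap.iloc[0])  # 3 days 00:00:00
```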
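Assemble a datetime from multiple DataFrame columns
The column names can be common abbreviations such as ['year', 'month', 'day', 'minute', 'second', 'ms', 'us', 'ns'] or their plural forms. At minimum, the year, month, and day columns are required.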
df = pd.DataFrame({'year': [2015, 2016],
                   'month': [2, 3],
                   'day': [4, 5]})
df
| | year | month | day |
|---|---|---|---|
| 0 | 2015 | 2 | 4 |
| 1 | 2016 | 3 | 5 |
pd.to_datetime(df)
0 2015-02-04
1 2016-03-05
dtype: datetime64[ns]
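Optional time columns can be added alongside the required year, month, and day. This is a minimal sketch; the frame parts and its hour and minute values are made up for illustration.

```python
import pandas as pd

parts = pd.DataFrame({'year': [2015, 2016],
                      'month': [2, 3],
                      'day': [4, 5],
                      'hour': [10, 23],
                      'minute': [30, 59]})

# Each row is assembled into a single timestamp
combined = pd.to_datetime(parts)
print(combined.iloc[0])  # 2015-02-04 10:30:00
print(combined.iloc[1])  # 2016-03-05 23:59:00
```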