1. date_range
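- Timestamp (timestamp)
- Fixed period (period)
- Time interval (interval)
You can specify a start time together with a number of periods and a frequency (freq):
- H: hour (spelled h in pandas 2.2 and later)
- D: day
- M: month end (spelled ME in pandas 2.2 and later)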
import numpy as np
import pandas as pd
rng = pd.date_range('2018-08-08', periods=10, freq='3D')
rng
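The output of the call above is not shown, so here is a small sketch of what it produces: ten timestamps spaced three days apart. The names `first`, `last`, `step`, and `same` are introduced here only for illustration.

```python
import pandas as pd

# Same call as above: 10 timestamps, one every 3 days, starting 2018-08-08.
rng = pd.date_range('2018-08-08', periods=10, freq='3D')

first, last = rng[0], rng[-1]
step = rng[1] - rng[0]

print(first)  # 2018-08-08 00:00:00
print(last)   # 2018-09-04 00:00:00
print(step)   # 3 days 00:00:00

# start + end + freq is an alternative to start + periods + freq.
same = pd.date_range('2018-08-08', '2018-09-04', freq='3D')
print(rng.equals(same))  # True
```

Giving `periods` is convenient when you know how many points you need; giving an end date is convenient when you know the span.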
import datetime
time = pd.Series(np.random.randn(20), index=pd.date_range(datetime.datetime(2018,8,8),periods=20))
time
2018-08-08 -0.116898
2018-08-09 0.236001
2018-08-10 0.465807
…
2018-08-26 1.008301
2018-08-27 0.225361
Freq: D, dtype: float64
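2. Filtering with truncate
time.truncate(before = '2018-8-15')  # drop everything before 2018-08-15 (output shown below)
time.truncate(after = '2018-8-15')   # drop everything after 2018-08-15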
2018-08-15 -1.244359
2018-08-16 1.043819
2018-08-17 1.870143
…
2018-08-26 1.008301
2018-08-27 0.225361
Freq: D, dtype: float64
time['2018-8-10' : '2018-8-14']
2018-08-10 0.465807
2018-08-11 1.365110
2018-08-12 -2.545710
2018-08-13 1.568111
2018-08-15 -1.244359
Freq: D, dtype: float64
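Because the series above holds random numbers, the boundaries are easier to check on predictable data. The sketch below uses a stand-in series `s` with the values 0 to 19 (an assumption made for illustration, not the original data) to show that `truncate` and label slicing both include the boundary dates.

```python
import pandas as pd

# Deterministic stand-in for the random series: values 0..19 on 20 consecutive days.
s = pd.Series(range(20), index=pd.date_range('2018-08-08', periods=20))

after_15 = s.truncate(before='2018-08-15')  # keeps 2018-08-15 and later
up_to_15 = s.truncate(after='2018-08-15')   # keeps 2018-08-15 and earlier
window = s['2018-08-10':'2018-08-14']       # both end points are included

print(after_15.index[0], up_to_15.index[-1])
print(window.index[0], window.index[-1], len(window))

# truncate(before=...) selects the same rows as an open-ended label slice.
print(after_15.equals(s['2018-08-15':]))  # True
```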
data = pd.date_range('2018-1-1', '2019-1-1', freq='M')
data
3. Timestamp, Period, Timedelta
3.1 Timestamp
print(pd.Timestamp('2019-9-9 10'))
pd.Timestamp('2019-9-9 10:15')
2019-09-09 10:00:00
Timestamp('2019-09-09 10:15:00')
3.2 Period (time span)
pd.Period('2019-1')
Period('2019-01', 'M')
pd.Period('2019-1-1')
Period('2019-01-01', 'D')
3.3 Timedelta (time difference)
pd.Timedelta('1 day')
Timedelta('1 days 00:00:00')
3.4 Time arithmetic
pd.Period('2019-1-1 10:25') + pd.Timedelta('1day')
Period('2019-01-02 10:25', 'T')
pd.Timestamp('2019-1-1 10:25') + pd.Timedelta('1day')
Timestamp('2019-01-02 10:25:00')
pd.Timestamp('2019-1-1 10:25') + pd.Timedelta('15 ns')
Timestamp('2019-01-01 10:25:00.000000015')
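To tie the three types together, the following sketch shows how a Timestamp relates to the Period that contains it, and how a Timedelta arises from subtracting two Timestamps. The variable names are chosen here for illustration.

```python
import pandas as pd

ts = pd.Timestamp('2019-01-01 10:25')
day = ts.to_period('D')  # the calendar day that contains ts

print(day)             # 2019-01-01
print(day.start_time)  # 2019-01-01 00:00:00
print(day.start_time <= ts <= day.end_time)  # True

# Adding a Timedelta shifts a Timestamp by an exact duration.
shifted = ts + pd.Timedelta('1 day')
print(shifted)         # 2019-01-02 10:25:00

# Subtracting two Timestamps gives a Timedelta.
print(shifted - ts)    # 1 days 00:00:00
```

A Timestamp is a single instant, while a Period covers everything between its `start_time` and `end_time`.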
4. period_range
p1 = pd.period_range('2019-1-1 10:25', freq='25H', periods=10)
p2 = pd.period_range('2019-1-1 10:25', freq='1D1H', periods=10)
p1
# p2
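The frequencies above use a multiplier ('25H') and a combination ('1D1H'). As a simpler sketch of how a multiplied frequency spaces the periods, the example below uses '2D', chosen here only because it is easy to verify by hand.

```python
import pandas as pd

# Four periods, each two days long, starting on 2019-01-01.
pr = pd.period_range('2019-01-01', periods=4, freq='2D')

# Consecutive periods start two days apart.
starts = [p.start_time for p in pr]
for start in starts:
    print(start)
```

The printed start times step from 2019-01-01 to 2019-01-07 in two-day increments.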
5. Time indexes
rng = pd.date_range('2019 Jul 1', periods=10, freq='D')
pd.Series(range(len(rng)), index = rng)
periods = [pd.Period('2019-05'), pd.Period('2019-06'), pd.Period('2019-07')]
ts = pd.Series(np.random.randn(len(periods)), index = periods)
print(type(ts.index))
ts
<class 'pandas.core.indexes.period.PeriodIndex'>
2019-05 0.295038
2019-06 2.265023
2019-07 -0.108697
Freq: M, dtype: float64
6. Converting between Timestamp and Period
ts = pd.Series(range(10), pd.date_range('2019-11-11 12:20', periods=10, freq='H'))
ts
ts_period = ts.to_period()
ts_period
ts_period['2019-11-11 14:30' : '2019-11-11 16:30']
ts['2019-11-11 14:30' : '2019-11-11 16:30']
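The two slices above look identical but run against different index types, so they need not return the same rows. The sketch below rebuilds the series with `pd.offsets.Hour()` as the frequency (a substitution made here so it runs regardless of which hour alias your pandas version prefers) and shows what each index holds.

```python
import pandas as pd

idx = pd.date_range('2019-11-11 12:20', periods=10, freq=pd.offsets.Hour())
ts = pd.Series(range(10), index=idx)
ts_period = ts.to_period()  # hourly periods, inferred from the index frequency

# Each timestamp is replaced by the hour-long period that contains it.
print(ts.index[0], '->', ts_period.index[0])

# Slicing the timestamp series keeps the points that fall inside 14:30 to 16:30.
picked = ts['2019-11-11 14:30':'2019-11-11 16:30']
print(list(picked.index.strftime('%H:%M')), list(picked))  # ['15:20', '16:20'] [3, 4]

# to_timestamp() converts back, using the start of each period by default.
back = ts_period.to_timestamp()
print(back.index[0])  # 2019-11-11 12:00:00
```

Note that the round trip loses the minutes: 12:20 becomes the 12:00 period and comes back as 12:00.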
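7. Resampling with resample
- Converts time series data from one frequency to another
- Downsampling: from a higher frequency to a lower one (for example daily to monthly), which requires an aggregation
- Upsampling: from a lower frequency to a higher one, which leaves gaps to fill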
import numpy as np
import pandas as pd
rng = pd.date_range('1/1/2019', periods=90, freq='D')
ts = pd.Series(np.random.randn(len(rng)), index=rng)
ts.head(6)
ts.resample('M').sum()
day10Ts = ts.resample('10D').mean()
day10Ts
day10Ts.resample('D').asfreq()
2019-01-01 0.354829
2019-01-02 NaN
2019-01-03 NaN
2019-01-04 NaN
2019-01-05 NaN
2019-01-06 NaN
2019-01-07 NaN
2019-01-08 NaN
2019-01-09 NaN
2019-01-10 NaN
2019-01-11 -0.635114
2019-01-12 NaN
…
2019-03-21 NaN
2019-03-22 -0.242809
Freq: D, Length: 81, dtype: float64
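With random input the numbers are hard to verify, so here is the same downsample-then-upsample round trip on a predictable series (the values 0 to 89, an assumption made for illustration). The names `day10` and `daily` are introduced here.

```python
import numpy as np
import pandas as pd

rng = pd.date_range('2019-01-01', periods=90, freq='D')
ts = pd.Series(np.arange(90, dtype=float), index=rng)

# Downsampling: 90 daily values become 9 bins of 10 days, one mean per bin.
day10 = ts.resample('10D').mean()
print(len(day10), day10.iloc[0], day10.iloc[-1])  # 9 4.5 84.5

# Upsampling: back to daily; days that had no bin label become NaN.
daily = day10.resample('D').asfreq()
print(len(daily), int(daily.isna().sum()))        # 81 72
```

The upsampled series has 81 rows rather than 90 because it spans only from the first bin label (2019-01-01) to the last one (2019-03-22).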
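8. Filling and interpolation methods
- ffill: fill a missing value with the previous value
- bfill: fill a missing value with the next value
- interpolate: fill missing values by linear interpolation
day10Ts.resample('D').ffill(limit=1)  # limit=1: fill at most one consecutive missing value
day10Ts.resample('D').bfill(limit=1)
day10Ts.resample('D').interpolate('linear')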
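The difference between the three methods is easiest to see on a tiny series. The sketch below uses two known values three days apart (made-up data for illustration) and applies each method to the two missing days in between.

```python
import numpy as np
import pandas as pd

# Two known values three days apart; upsampling to daily leaves two gaps.
s = pd.Series([0.0, 3.0], index=pd.to_datetime(['2019-01-01', '2019-01-04']))

ffilled = s.resample('D').ffill().tolist()
bfilled = s.resample('D').bfill().tolist()
linear = s.resample('D').interpolate('linear').tolist()
limited = s.resample('D').ffill(limit=1).tolist()

print(ffilled)  # [0.0, 0.0, 0.0, 3.0]
print(bfilled)  # [0.0, 3.0, 3.0, 3.0]
print(linear)   # [0.0, 1.0, 2.0, 3.0]

# limit=1 fills only the first missing day of each gap; the rest stay NaN.
print(limited)  # [0.0, 0.0, nan, 3.0]
```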
9. Moving Window Functions: rolling
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
df = pd.Series(np.random.randn(600), index=pd.date_range('1/15/2020', freq='D', periods=600))
df.head()
r = df.rolling(window = 10)
r
Rolling [window=10,center=False,axis=0]
# r.max(), r.median(), r.std(), r.sum(), r.var()
# r.skew(): skewness of the data
r.mean()
Freq: D, Length: 600, dtype: float64
plt.figure(figsize = (18, 6))
df.plot(style = 'r--')
df.rolling(window = 10).mean().plot(style = 'b')
plt.figure(figsize = (18, 6))
df.plot(style = 'r--')
df.rolling(window = 10).skew().plot(style = 'b')
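The plots show the smoothing effect but not how the window is filled at the edges. The sketch below uses a five-point series (made-up values for illustration) and also shows `min_periods` and `center`, two `rolling` parameters that the examples above leave at their defaults.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0],
              index=pd.date_range('2020-01-15', periods=5, freq='D'))

# A window of 3 needs 3 observations, so the first two results are NaN.
default = s.rolling(window=3).mean().tolist()
print(default)   # [nan, nan, 2.0, 3.0, 4.0]

# min_periods=1 emits a value as soon as one observation is available.
partial = s.rolling(window=3, min_periods=1).mean().tolist()
print(partial)   # [1.0, 1.5, 2.0, 3.0, 4.0]

# center=True labels each window by its middle point instead of its right edge.
centered = s.rolling(window=3, center=True).mean().tolist()
print(centered)  # [nan, 2.0, 3.0, 4.0, nan]
```

With the default settings, a window of 10 on the 600-point series above leaves the first nine rolling values as NaN.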
[ For more time-related operations, see https://blog.csdn.net/sanjianjixiang/article/details/102892864 ]