Pair Trading Strategy
A small demo with ETF50 & ETF500
import pandas as pd
import numpy as np
import tushare as ts
import seaborn
from matplotlib import pyplot as plt
plt.style.use('seaborn')  # Matplotlib >= 3.6 renamed this style to 'seaborn-v0_8'
%matplotlib inline
stocks_pair = ['50ETF', '500ETF']
1. Data preparation
# Load daily closing prices for the 50ETF (510050) and the 500ETF (510500)
data1 = ts.get_k_data('510050', start='2018-04-01', end='2019-04-01')[['date','close']]
data2 = ts.get_k_data('510500', start='2018-04-01', end='2019-04-01')['close']
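# Concatenate the closing prices column-wise; concat aligns on the row index,
# so this assumes both ETFs have rows for the same trading days
data = pd.concat([data1, data2], axis=1)
data.set_index('date',inplace = True)
# Rename the columns ('50ETF', '500ETF')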
data.columns = stocks_pair
data.head()
| date | 50ETF | 500ETF |
|---|---|---|
| 2018-04-02 | 2.702 | 6.424 |
| 2018-04-03 | 2.693 | 6.373 |
| 2018-04-04 | 2.694 | 6.321 |
| 2018-04-09 | 2.711 | 6.331 |
| 2018-04-10 | 2.775 | 6.380 |
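`pd.concat(..., axis=1)` matches rows by index label, not by date, so the cell above is only correct if both downloads contain exactly the same trading days in the same order. A minimal sketch of a date-indexed alternative follows; the prices and the names `a`, `b`, `aligned`, and `common` are made up for illustration and are not tushare output.

```python
import pandas as pd

# Hypothetical prices: asset B has no row for 2018-04-04, asset A none for 2018-04-09
a = pd.DataFrame({'date': ['2018-04-02', '2018-04-03', '2018-04-04'],
                  'close': [2.70, 2.69, 2.69]})
b = pd.DataFrame({'date': ['2018-04-02', '2018-04-03', '2018-04-09'],
                  'close': [6.42, 6.37, 6.33]})

# Index each series by date first, so concat matches rows by date
aligned = pd.concat(
    [a.set_index('date')['close'], b.set_index('date')['close']],
    axis=1, keys=['A', 'B'],
)

# Days on which only one asset traded show up as NaN; keep the common days
common = aligned.dropna()
print(aligned)
print(common)
```

With this approach a missing day becomes a visible NaN instead of silently shifting one column against the other.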
Plot
data.plot(figsize= (8,6));
2. Strategy development approach
data.corr() # correlation matrix
| | 50ETF | 500ETF |
|---|---|---|
| 50ETF | 1.000000 | 0.800654 |
| 500ETF | 0.800654 | 1.000000 |
# Visualize the data to inspect the correlation
plt.figure(figsize =(8,6))
plt.title('Stock Correlation')
plt.plot(data['50ETF'], data['500ETF'], '.');
plt.xlabel('50ETF')
plt.ylabel('500ETF')
data.dropna(inplace = True)
# Linear regression of the 500ETF price on the 50ETF price (the residual is assumed to be normally distributed white noise)
[slope, intercept] = np.polyfit(data.iloc[:,0], data.iloc[:,1], 1).round(2)
slope,intercept
(3.75, -4.27)
The residual (y + 4.27 - 3.75x), with y the 500ETF price and x the 50ETF price, is assumed to be stationary
# Compute the (y + 4.27 - 3.75x) column
data['spread'] = data.iloc[:,1] - (data.iloc[:,0]*slope + intercept)
data.head()
| date | 50ETF | 500ETF | spread |
|---|---|---|---|
| 2018-04-02 | 2.702 | 6.424 | 0.56150 |
| 2018-04-03 | 2.693 | 6.373 | 0.54425 |
| 2018-04-04 | 2.694 | 6.321 | 0.48850 |
| 2018-04-09 | 2.711 | 6.331 | 0.43475 |
| 2018-04-10 | 2.775 | 6.380 | 0.24375 |
data['spread'].plot(figsize = (8,6),title = 'Price Spread');
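The spread is treated as stationary here, but the notebook does not test that. One way to check, sketched below with NumPy only and on synthetic series rather than the ETF data, is a Dickey-Fuller style regression of the daily change on the lagged level. The helper `df_tstat` is a hypothetical name, and this is a simplified version without lagged difference terms.

```python
import numpy as np

def df_tstat(series):
    """t-statistic of rho in the regression: s[t] - s[t-1] = c + rho * s[t-1] + e[t]."""
    s = np.asarray(series, dtype=float)
    dy = np.diff(s)                                   # daily changes
    X = np.column_stack([np.ones(len(dy)), s[:-1]])   # constant and lagged level
    beta, _, _, _ = np.linalg.lstsq(X, dy, rcond=None)
    resid = dy - X @ beta
    sigma2 = resid @ resid / (len(dy) - X.shape[1])   # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)             # OLS coefficient covariance
    return beta[1] / np.sqrt(cov[1, 1])

rng = np.random.default_rng(0)
noise = rng.normal(size=500)

random_walk = np.cumsum(noise)          # non-stationary by construction
mean_reverting = np.zeros(500)          # AR(1) with coefficient 0.5, stationary
for t in range(1, 500):
    mean_reverting[t] = 0.5 * mean_reverting[t - 1] + noise[t]

t_rw = df_tstat(random_walk)
t_mr = df_tstat(mean_reverting)
print(t_rw, t_mr)
```

A strongly negative statistic argues against a unit root. The statistic does not follow a Student t distribution, so it has to be compared with Dickey-Fuller critical values, and with the stricter Engle-Granger values when the series is itself a regression residual, as the spread is. In practice `adfuller` or `coint` from `statsmodels.tsa.stattools` would normally be used instead of a hand-rolled test.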
# Standardize the spread (z-score)
data['zscore'] = (data['spread'] - data['spread'].mean())/data['spread'].std()
data.head()
| date | 50ETF | 500ETF | spread | zscore |
|---|---|---|---|---|
| 2018-04-02 | 2.702 | 6.424 | 0.56150 | 1.523889 |
| 2018-04-03 | 2.693 | 6.373 | 0.54425 | 1.477487 |
| 2018-04-04 | 2.694 | 6.321 | 0.48850 | 1.327522 |
| 2018-04-09 | 2.711 | 6.331 | 0.43475 | 1.182938 |
| 2018-04-10 | 2.775 | 6.380 | 0.24375 | 0.669158 |
# Plot the standardized values with the entry thresholds (+/-1.5) and the mean (0)
data['zscore'].plot(figsize = (10,8),title = 'Z-score');
plt.axhline(1.5)
plt.axhline(0)
plt.axhline(-1.5)
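Generating trading signals
# z > 1.5: the 500ETF is rich relative to the fitted line, so go long the 50ETF (+1)
data['position_1'] = np.where(data['zscore'] > 1.5, 1, np.nan)
# z < -1.5: the 500ETF is cheap relative to the fitted line, so go short the 50ETF (-1)
data['position_1'] = np.where(data['zscore'] < -1.5, -1, data['position_1'])
# |z| < 0.5: the spread has reverted, so close the position (0)
data['position_1'] = np.where(abs(data['zscore']) < 0.5, 0, data['position_1'])
# Otherwise keep the previous position (fillna(method='ffill') is deprecated in recent pandas)
data['position_1'] = data['position_1'].ffill()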
data['position_1'].plot(ylim=[-1.1, 1.1], figsize=(10, 6),title = 'Trading Signal_Uptrade');
data['position_2'] = -np.sign(data['position_1'])
data['position_2'].plot(ylim=[-1.1, 1.1], figsize=(10, 6),title = 'Trading Signal_Downtrade');
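The entry and exit rule can be traced on a short hand-made z-score series. This is a sketch only: `zscore_positions` is a hypothetical helper, the thresholds 1.5 and 0.5 are taken from the cell above, and filling the days before the first signal with 0 is an added assumption (the notebook leaves them as NaN).

```python
import numpy as np
import pandas as pd

def zscore_positions(z, entry=1.5, exit_=0.5):
    """Position in the first asset: +1 above entry, -1 below -entry, 0 inside the exit band."""
    z = pd.Series(z, dtype=float)
    pos = pd.Series(np.nan, index=z.index)
    pos[z > entry] = 1.0          # spread too high: long asset 1
    pos[z < -entry] = -1.0        # spread too low: short asset 1
    pos[z.abs() < exit_] = 0.0    # spread reverted: flat
    # Hold the last signal between thresholds; start flat (assumption)
    return pos.ffill().fillna(0.0)

z = [0.2, 1.6, 1.0, 0.7, 0.4, -1.7, -0.9, 0.1]
pos_1 = zscore_positions(z)
pos_2 = -pos_1                    # the second asset takes the opposite side
print(pos_1.tolist())  # [0.0, 1.0, 1.0, 1.0, 0.0, -1.0, -1.0, 0.0]
```

The values 1.0 and 0.7 lie between the exit and entry bands, so the position opened at 1.6 is simply carried forward until the z-score drops below 0.5.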
3. Computing and visualizing the strategy's annualized return
# Compute daily log returns
data['returns_1'] = np.log(data['50ETF'] / data['50ETF'].shift(1))
data['returns_2'] = np.log(data['500ETF'] / data['500ETF'].shift(1))
# Strategy return: 50% weight on each leg, using the previous day's position
data['strategy'] = 0.5*(data['position_1'].shift(1) * data['returns_1']) + 0.5*(data['position_2'].shift(1) * data['returns_2'])
# Compute cumulative returns (growth of 1 unit over the period)
data[['returns_1','returns_2','strategy']].dropna().cumsum().apply(np.exp).tail(1)
| date | returns_1 | returns_2 | strategy |
|---|---|---|---|
| 2019-04-01 | 1.063657 | 0.96155 | 1.114828 |
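The strategy column ignores trading costs. A rough sketch of how a proportional cost could be deducted each time a position changes is shown below. The helper `net_leg_returns`, the cost rate of 0.001 per unit of position traded, and the toy numbers are all assumptions for illustration, and subtracting a cost directly from a log return is itself an approximation.

```python
import pandas as pd

def net_leg_returns(position, log_returns, weight=0.5, cost_rate=0.001):
    """Return of one leg after a proportional cost on every position change.

    cost_rate is a hypothetical cost per unit of position traded.
    """
    held = position.shift(1).fillna(0.0)       # position carried into each day
    gross = weight * held * log_returns        # same form as the strategy column
    turnover = held.diff().abs().fillna(0.0)   # size of each position change
    return gross - weight * cost_rate * turnover

position = pd.Series([0, 1, 1, 0, -1], dtype=float)
log_returns = pd.Series([0.0, 0.01, 0.02, -0.01, 0.03])

net = net_leg_returns(position, log_returns)
gross = net_leg_returns(position, log_returns, cost_rate=0.0)
print(gross.sum(), net.sum())
```

Applying the same function to both legs and summing the results would give a cost-adjusted counterpart of the strategy column.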
# Plot cumulative returns
data[['returns_1','returns_2','strategy']].dropna().cumsum().apply(np.exp).plot(figsize=(10, 8),title = 'Strategy_Backtesting');
# Compute annualized (log) returns, assuming 252 trading days per year
data[['returns_1','returns_2','strategy']].dropna().mean() * 252
returns_1 0.064263
returns_2 -0.040828
strategy 0.113192
dtype: float64
# Compute annualized risk (volatility)
data[['returns_1','returns_2','strategy']].dropna().std() * 252 ** 0.5
returns_1 0.236365
returns_2 0.264628
strategy 0.057670
dtype: float64
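Because the returns are log returns, the mean multiplied by 252 is an annualized log return, and the volatility is scaled by the square root of 252 on the assumption that daily returns are uncorrelated. A small sketch with made-up daily values shows the scaling and the conversion to a simple annual return; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

TRADING_DAYS = 252  # convention used in the cells above

daily_log_returns = pd.Series([0.001, -0.002, 0.003, 0.002])  # made-up values

# Log returns add over time, so the mean scales linearly with the horizon
annual_log_return = daily_log_returns.mean() * TRADING_DAYS
# Variance scales linearly, so standard deviation scales with the square root
annual_volatility = daily_log_returns.std() * np.sqrt(TRADING_DAYS)
# Convert the annualized log return into a simple return
annual_simple_return = np.expm1(annual_log_return)

print(annual_log_return, annual_volatility, annual_simple_return)
```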
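# Cumulative return of the strategy
data['cumret'] = data['strategy'].dropna().cumsum().apply(np.exp)
# Running maximum of the cumulative return
data['cummax'] = data['cumret'].cummax()
# Drawdown series (absolute: running peak minus current value)
drawdown = (data['cummax'] - data['cumret'])
# Maximum drawdown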
drawdown.max()
0.053458095761447444
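The drawdown above is measured in absolute units of the cumulative return curve. Maximum drawdown is often quoted relative to the running peak instead; the two are close here only because the curve stays near 1. A sketch of both measures on a hypothetical equity curve:

```python
import pandas as pd

cumret = pd.Series([1.00, 1.10, 1.045, 1.20, 1.08])  # hypothetical equity curve

running_peak = cumret.cummax()
abs_drawdown = running_peak - cumret          # same definition as the cell above
rel_drawdown = abs_drawdown / running_peak    # fraction lost from the peak

print(abs_drawdown.max(), rel_drawdown.max())
```

On this toy curve the largest absolute drawdown is about 0.12, while the largest relative drawdown is about 10% of the peak value of 1.20.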
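Summary
Thoughts on the strategy
- Pair trading across multiple ETFs is a strategy run by many live quantitative funds;
Risks and problems of the strategy:
- Non-reversion risk: when the market structure changes materially, a spread estimated by regression on past data may fail to revert to its mean;
- Short selling is restricted in the Chinese market, so part of the return attributed to short positions cannot actually be captured;
- The regression coefficients need rebalancing, that is, periodic re-estimation;
- The strategy does not account for transaction costs or other costs.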
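The point about re-estimating the regression coefficients can be sketched as a rolling regression. The example below is an illustration on synthetic prices, not the notebook's method: `rolling_hedge_ratio` is a hypothetical helper, the 60-day window is an arbitrary choice, and the synthetic series are generated around the fitted line from section 2.

```python
import numpy as np
import pandas as pd

def rolling_hedge_ratio(x, y, window=60):
    """Trailing-window OLS slope and intercept of y on x, from rolling moments."""
    slope = x.rolling(window).cov(y) / x.rolling(window).var()
    intercept = y.rolling(window).mean() - slope * x.rolling(window).mean()
    return slope, intercept

rng = np.random.default_rng(1)
# Synthetic prices: y follows 3.75 * x - 4.27 plus noise
x = pd.Series(2.7 + np.cumsum(rng.normal(0, 0.02, 250)))
y = 3.75 * x - 4.27 + rng.normal(0, 0.05, 250)

slope, intercept = rolling_hedge_ratio(x, y)
# Use yesterday's coefficients for today's spread to avoid look-ahead
spread = y - (slope.shift(1) * x + intercept.shift(1))
print(slope.dropna().describe())
```

Shifting the coefficients by one day means each spread value only uses information available before that day, unlike the full-sample fit used in the demo.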