
Building a Portfolio with Factor Analysis: The German DAX Index as an Example

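Related links

Results

By extending the traditional CAPM model and identifying factors that may generate excess returns, the portfolio outperformed the DAX index by 7% in 2021.

Code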

Get data from Yahoo Finance

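import yfinance as yf
import numpy as np
import pandas as pd

# in-sample (backtest) period
start_date = '2020-01-01'
end_date = '2020-12-31'

# DAX constituents from Wikipedia; the table position [3] and the column name
# 'Ticker symbol' depend on the current layout of that page
Constituent = list(pd.read_html('https://en.wikipedia.org/wiki/DAX')[3]['Ticker symbol'])
shares = {}  # ticker -> float shares
df = pd.DataFrame()  # daily adjusted close, one column per ticker

for ticker in Constituent:
    try:
        # newer yfinance releases adjust prices by default and drop 'Adj Close';
        # auto_adjust=False keeps that column
        df[ticker] = yf.download(ticker, start=start_date, end=end_date, auto_adjust=False)['Adj Close']
    except Exception:
        print('Failed to download or merge: ' + ticker)

    try:
        shares[ticker] = yf.Ticker(ticker).info['floatShares']
    except Exception:
        print(ticker + ' shares not found')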

Slice price and market equity data

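# get price
PRC = df.copy()
PRC.drop(['ENR.DE'], axis=1, inplace=True)  # ENR.DE was not listed in the DAX 30 until 2020-12, so drop it

# get market equity (price x float shares); df holds market equity from here on
for ticker in df.columns:
    # tickers whose float shares could not be downloaded become NaN
    df[ticker] *= shares.get(ticker, np.nan)

# market equity cutoff: the daily cross-sectional median
cutoff = df.quantile(0.5, axis=1)
port_df = pd.DataFrame(columns=df.columns, index=df.index)
port_df[df.gt(cutoff, axis=0)] = 'Big'
port_df[df.le(cutoff, axis=0)] = 'Small'

ME_port = port_df.copy()
ME_port.drop(columns=['ENR.DE'], inplace=True)
# skip the first 13 rows, which have no momentum value yet
# (1 row for the return, 10 more for the 11-day window, 2 for the lag)
ME_port = ME_port.iloc[13:, :]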
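
To make the size split above easier to check, here is a minimal sketch on toy data. The helper name `size_labels` and the tickers AAA, BBB, CCC are hypothetical and not part of the original post. The sketch applies the same rule: a stock is 'Big' if its market equity is strictly above that day's cross-sectional median, otherwise 'Small'.

```python
import pandas as pd

def size_labels(market_equity: pd.DataFrame) -> pd.DataFrame:
    """Label each stock 'Big' or 'Small' relative to the daily cross-sectional median."""
    cutoff = market_equity.median(axis=1)  # one cutoff per day
    labels = pd.DataFrame("Small", index=market_equity.index, columns=market_equity.columns)
    labels = labels.mask(market_equity.gt(cutoff, axis=0), "Big")
    # keep missing market equity as missing instead of labelling it 'Small'
    return labels.where(market_equity.notna())

# hypothetical market equity for three made-up tickers on two days
toy_size_me = pd.DataFrame(
    {"AAA": [10.0, 12.0], "BBB": [50.0, 40.0], "CCC": [30.0, 45.0]},
    index=pd.to_datetime(["2020-01-02", "2020-01-03"]),
)
toy_size_labels = size_labels(toy_size_me)
print(toy_size_labels)
```

A stock that sits exactly on the median is labelled 'Small', which matches the `le` comparison used in the post.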

Get momentum according to formula

RET = PRC.pct_change()
tmp_RET = (RET + 1)  # gross daily returns
tmp_RET = tmp_RET.iloc[1:, :]  # drop the first row, which has no return
# momentum: product of 11 daily gross returns, lagged by 2 days
MOM = tmp_RET.rolling(11).apply(np.prod).shift(2)
MOM = MOM.apply(pd.to_numeric)
# daily cross-sectional 30% and 70% cutoffs
L_cutoff = pd.to_numeric(MOM.quantile(.3, axis=1, numeric_only=False))
H_cutoff = pd.to_numeric(MOM.quantile(.7, axis=1, numeric_only=False))
WL_port = pd.DataFrame(index=RET.index, columns=RET.columns)
WL_port[MOM.gt(H_cutoff, axis=0)] = 'Winner'
WL_port[(MOM.le(H_cutoff, axis=0)) & (MOM.ge(L_cutoff, axis=0))] = 'Neutral'
WL_port[MOM.lt(L_cutoff, axis=0)] = 'Loser'
WL_port.tail()
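
The post does not spell out the momentum formula, but the code above computes, for each day t, the product of (1 + r) over the 11 daily returns from t-12 to t-2. The sketch below isolates that calculation on a toy price series. The function name `momentum` and its parameters `window` and `skip` are illustrative assumptions, not names from the original.

```python
import numpy as np
import pandas as pd

def momentum(prices: pd.DataFrame, window: int = 11, skip: int = 2) -> pd.DataFrame:
    """Rolling product of gross daily returns over `window` days, lagged by `skip` days."""
    gross = prices.pct_change().iloc[1:] + 1  # drop the first row, which has no return
    return gross.rolling(window).apply(np.prod, raw=True).shift(skip)

# hypothetical price series that grows by exactly 1% per day
toy_prices = pd.DataFrame({"AAA": 100 * 1.01 ** np.arange(20)})
toy_mom = momentum(toy_prices)
print(toy_mom.dropna().head())
```

With a constant 1% daily return, every available momentum value is approximately 1.01 ** 11, and the first 12 rows of the result are NaN because the window and the lag have not filled yet.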

Set the rebalancing rule and combine the two cutoff portfolios

Momentum is updated every day, while market equity does not change significantly from day to day, so we decided to take momentum data only on the last day of each month.

# move every date to its month end
WL_port.index = pd.to_datetime(WL_port.index) + pd.offsets.MonthEnd(0)

# store the February dates and the dates of all other months
other_list = []
feb_list = []

for i in PRC.index:
    if i.month == 2:
        feb_list.append(i)
    else:
        other_list.append(i)

feb = WL_port[WL_port.index.month == 2]
other = WL_port[WL_port.index.month != 2]

# restore the original trading dates
feb.index = feb_list
other.index = other_list

# combine the two dataframes (Feb and non-Feb) and put the rows back in
# chronological order (Jan, Feb, Mar, ...); concat keeps each row with its own
# date, whereas a merge without keys joins on every column and can reorder rows
WL_port = pd.concat([other, feb]).sort_index()
WL_port.dropna(inplace=True)

# string concatenation, e.g. 'Big' + 'Loser' -> 'BigLoser'
ME_MOM_port = ME_port + WL_port
ME_MOM_port.head(5)
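
One possible reading of the month-end rule described in this section is: take each stock's label on the last trading day of a month and hold it on every trading day of the following month. The sketch below illustrates that reading only. It is an assumption about the intent, not the author's implementation, and the helper name `hold_month_end_labels` is hypothetical.

```python
import pandas as pd

def hold_month_end_labels(daily_labels: pd.DataFrame) -> pd.DataFrame:
    """Hold each month's last available label on every trading day of the next month."""
    month = daily_labels.index.to_period("M")
    month_end = daily_labels.groupby(month).last()  # one row per calendar month
    # use the previous month's row; assumes every calendar month has trading days
    month_end = month_end.shift(1)
    held = month_end.reindex(month)  # broadcast the monthly rows back to daily rows
    held.index = daily_labels.index
    return held

# hypothetical daily labels for one made-up ticker
toy_dates = pd.to_datetime(
    ["2020-01-30", "2020-01-31", "2020-02-03", "2020-02-28", "2020-03-02"]
)
toy_daily = pd.DataFrame(
    {"AAA": ["Loser", "Winner", "Neutral", "Loser", "Winner"]}, index=toy_dates
)
toy_held = hold_month_end_labels(toy_daily)
print(toy_held)
```

In this toy case the February rows carry the 2020-01-31 label, the March row carries the 2020-02-28 label, and the January rows are empty because no earlier month exists.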

Simulate real-world trading and visualize the strategy (backtesting)

import matplotlib.pyplot as plt

ME_lag = df.shift(1)  # previous day's market equity, used as the weight
unique_port = ['SmallLoser', 'SmallNeutral', 'SmallWinner', 'BigLoser', 'BigNeutral', 'BigWinner']

RET_port = pd.DataFrame(index=RET.index, columns=unique_port)
N_firm = pd.DataFrame(index=RET.index, columns=unique_port)  # number of firms per portfolio

for p in unique_port:
    TMP_RET = RET[ME_MOM_port == p].apply(pd.to_numeric)
    TMP_ME = ME_lag[ME_MOM_port == p].apply(pd.to_numeric)
    TMP_PROD = TMP_RET * TMP_ME
    # value-weighted daily return of portfolio p
    RET_port[p] = TMP_PROD.sum(axis=1) / TMP_ME.sum(axis=1)
    N_firm[p] = TMP_RET.count(axis=1)
RET_port = RET_port.dropna()

fig = plt.figure(figsize=(15, 10))
ax = fig.add_subplot(111)

tmp = (RET_port + 1).cumprod()  # growth of 1 unit invested in each portfolio

ax.plot(tmp, label=unique_port)
ax.legend(loc='best')
ax.set_xlabel('Time')
ax.set_ylabel('Cumulative return')
ax.set_title('Backtest: Momentum-Market Equity Strategy on DAX30')
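
The value-weighted return used in the loop above can be checked by hand on a single day. The following is a minimal sketch with made-up numbers. The function name `value_weighted_return` and the tickers are hypothetical.

```python
import pandas as pd

def value_weighted_return(returns: pd.DataFrame, lagged_me: pd.DataFrame,
                          members: pd.DataFrame) -> pd.Series:
    """Daily return of a portfolio whose members are weighted by lagged market equity."""
    r = returns.where(members)    # keep only the portfolio's members
    w = lagged_me.where(members)
    return (r * w).sum(axis=1) / w.sum(axis=1)

# one hypothetical day: AAA and BBB are in the portfolio, CCC is not
toy_vw_ret = pd.DataFrame({"AAA": [0.10], "BBB": [-0.02], "CCC": [0.05]})
toy_vw_me = pd.DataFrame({"AAA": [100.0], "BBB": [300.0], "CCC": [999.0]})
toy_vw_members = pd.DataFrame({"AAA": [True], "BBB": [True], "CCC": [False]})
toy_vw = value_weighted_return(toy_vw_ret, toy_vw_me, toy_vw_members)
print(toy_vw)
```

Here the portfolio return is (0.10 * 100 - 0.02 * 300) / (100 + 300) = 0.01, so the larger stock's small loss offsets most of the smaller stock's gain.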

[Figure: backtest result]

[Explanation] After backtesting on one year of data, we define our strategy as follows:

  • Long BigLoser, SmallNeutral, and BigNeutral (the three best performers in 2020); short SmallLoser, SmallWinner, and BigWinner (the three worst performers in 2020)
  • How to determine the weights: use the weights on 2020-12-30
  • How to rank the stocks: use the ranking on 2020-12-30

Identify the stocks to long (the "good" stocks) and their weights as of 2020-12-30

# assumption: last_rank is not defined in the original post; it is taken here
# to be the final row (2020-12-30) of the combined size/momentum labels
last_rank = ME_MOM_port.iloc[-1:]

# identify the stocks that we are going to long
# (this filter selects SmallLoser, whereas the strategy description lists
# BigNeutral on the long side)
long_ports = ['BigLoser', 'SmallNeutral', 'SmallLoser']
filter_ = last_rank.T['2020-12-30'].isin(long_ports)
good_rank = list(last_rank.T[filter_].index)

good_weights = {}  # store the weight for each stock
sum_me = 0
last_date = df.iloc[-1:]  # market equity on the last in-sample day

for c in df.columns:
    if c in good_rank:
        sum_me += last_date[c]  # total market equity of the long leg

for c in df.columns:
    if c in good_rank:
        good_weights[c] = last_date[c] / sum_me  # value-weighted weight

Identify the stocks to short (the "bad" stocks) and their weights as of 2020-12-30

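# identify the stocks that we are going to short
# (this filter selects BigNeutral, whereas the strategy description lists
# SmallLoser on the short side)
short_ports = ['BigNeutral', 'BigWinner', 'SmallWinner']
filter_ = last_rank.T['2020-12-30'].isin(short_ports)
bad_rank = list(last_rank.T[filter_].index)
bad_weights = {}  # store the weight for each stock
sum_me = 0
last_date = df.iloc[-1:]  # market equity on the last in-sample day

for c in df.columns:
    if c in bad_rank:
        sum_me += last_date[c]  # total market equity of the short leg

for c in df.columns:
    if c in bad_rank:
        bad_weights[c] = last_date[c] / sum_me  # value-weighted weight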
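
The two weighting loops follow the same pattern: divide each selected stock's market equity by the total market equity of the selected group. A compact sketch of that calculation, using a hypothetical helper `value_weights` and made-up tickers, could look like this:

```python
import pandas as pd

def value_weights(market_equity: pd.Series, tickers) -> pd.Series:
    """Weights proportional to market equity, restricted to the given tickers."""
    selected = market_equity.reindex(tickers).dropna()
    return selected / selected.sum()

# hypothetical market equity on the weighting date
toy_cap = pd.Series({"AAA": 100.0, "BBB": 300.0, "CCC": 600.0})
toy_weights = value_weights(toy_cap, ["AAA", "BBB"])
print(toy_weights)
```

The weights of each group sum to 1, so AAA and BBB receive 0.25 and 0.75 here while CCC is left out.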

Plot the out-of-sample performance

# merge the two dicts
# note: the short-leg weights are merged with a positive sign here; they would
# need to be negated for that leg to be sold short
for ticker, value in good_weights.items():
    bad_weights[ticker] = value
weights = bad_weights

# out-of-sample period
start_date = '2021-01-01'
end_date = '2021-12-22'

Constituent = list(pd.read_html('https://en.wikipedia.org/wiki/DAX')[3]['Ticker symbol'])
df_2021 = pd.DataFrame()
for ticker in Constituent:
    try:
        df_2021[ticker] = yf.download(ticker, start=start_date, end=end_date)['Close']
    except Exception:
        print('Failed to download or merge: ' + ticker)

# get the daily return
ret_2021 = df_2021.pct_change()
ret_2021 = ret_2021.iloc[1:, ]
ret_2021.drop(columns=['ENR.DE'], inplace=True)

# weight_df stores each stock's fixed weight on every day
weight_df = pd.DataFrame(np.nan, index=ret_2021.index, columns=ret_2021.columns)
for c in weight_df:
    if c in weights:
        weight_df[c] = weights[c].iloc[0]  # weights[c] is a one-row Series

# calculate return
res_df = weight_df * ret_2021

# the 'seaborn' style was renamed 'seaborn-v0_8' in newer matplotlib releases
style = 'seaborn-v0_8' if 'seaborn-v0_8' in plt.style.available else 'seaborn'
plt.style.use(style)
fig = plt.figure(figsize=(15, 10))
ax = fig.add_subplot(111)

tmp = (res_df.sum(axis=1) + 1).cumprod()  # growth of 1 unit invested

ax.plot(tmp)
ax.set_xlabel('Time')
ax.set_ylabel('Cumulative return')
ax.set_title('Out of Sample: LONG top three SHORT last three strategy')
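
For comparison with the strategy description, the sketch below shows one way a long-short daily return could be computed with signed weights, where the long leg sums to +1 and the short leg to -1. This is an illustrative assumption rather than the calculation used in the post. The function name `long_short_returns` and the toy numbers are hypothetical.

```python
import pandas as pd

def long_short_returns(daily_returns: pd.DataFrame, long_w: pd.Series,
                       short_w: pd.Series) -> pd.Series:
    """Daily portfolio return with positive long weights and negated short weights."""
    # assumes no ticker appears in both legs
    weights = pd.concat([long_w, -short_w])
    weights = weights.reindex(daily_returns.columns).fillna(0.0)
    return daily_returns.mul(weights, axis=1).sum(axis=1)

# hypothetical returns for two made-up tickers over two days
toy_ls_ret = pd.DataFrame({"AAA": [0.02, -0.01], "BBB": [0.01, -0.03]})
toy_ls = long_short_returns(
    toy_ls_ret, long_w=pd.Series({"AAA": 1.0}), short_w=pd.Series({"BBB": 1.0})
)
toy_ls_growth = (1 + toy_ls).cumprod()
print(toy_ls_growth)
```

On the second day both stocks fall, but the short position in BBB gains more than the long position in AAA loses, so the portfolio still earns 0.02.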

[Figure: out-of-sample result]

final_res = (res_df.sum(axis=1) + 1).cumprod()
VAR = (res_df.sum(axis=1) * 100).var()    # variance of daily returns, in percent
MEAN = (res_df.sum(axis=1) * 100).mean()  # mean daily return, in percent

# Sharpe ratio (daily, not annualized)
# risk-free rate: 0%
SR = (MEAN - 0) / VAR ** 0.5
SR
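
The ratio above is computed from daily returns, so it is a daily Sharpe ratio. A common convention for comparing it with annual figures is to multiply by the square root of the number of trading days per year. The sketch below assumes 252 trading days, which is not stated in the post, and uses a hypothetical helper `sharpe_ratio` with made-up returns.

```python
import numpy as np
import pandas as pd

def sharpe_ratio(daily_returns: pd.Series, risk_free_daily: float = 0.0,
                 periods_per_year: int = 252) -> float:
    """Annualized Sharpe ratio from daily returns (252 trading days is an assumption)."""
    excess = daily_returns - risk_free_daily
    return float(np.sqrt(periods_per_year) * excess.mean() / excess.std(ddof=1))

# hypothetical daily returns
toy_daily_ret = pd.Series([0.01, -0.01, 0.02, 0.00])
toy_sr = sharpe_ratio(toy_daily_ret)
print(round(toy_sr, 3))
```

Scaling the returns by 100, as the post does, leaves the ratio unchanged because the mean and the standard deviation are scaled by the same factor.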

Feedback and presentation video link

This post is licensed under CC BY 4.0 by the author.