Adjusting portfolio weights based on stock price using Pandas

Date      Category    Company       Price  weight
1/1/2007  Automative  Audi           1000   0.146
1/1/2007  Automative  Alfa Romeo      400   0.143
1/1/2007  Automative  Aston Martin    500   0.002
1/1/2007  Automative  Bentley        2000   0.025
1/1/2007  Automative  Mercedes       3000   0.063
1/1/2007  Automative  BMW              40   0.154
1/1/2007  Automative  Volvo          3000   0.163
1/1/2007  Automative  VW              200   0.003
1/1/2007  Technology  Apple           400   0.120
1/1/2007  Technology  Microsoft      5500   0.048
1/1/2007  Technology  Google          230   0.069
1/1/2007  Technology  Lenova           36   0.036
1/1/2007  Technology  IBM             250   0.016
1/1/2007  Technology  Sprint          231   0.013

          Category    Company       Price  weight    Pctile
Date
1/1/2007  Automative  Audi           1000   0.146  0.625000
1/1/2007  Automative  Alfa Romeo      400   0.143  0.375000
1/1/2007  Automative  Aston Martin    500   0.002  0.500000
1/1/2007  Automative  Bentley        2000   0.025  0.750000
1/1/2007  Automative  Mercedes       3000   0.063  0.937500
1/1/2007  Automative  BMW              40   0.154  0.125000
1/1/2007  Automative  Volvo          3000   0.163  0.937500
1/1/2007  Automative  VW              200   0.003  0.250000
1/1/2007  Technology  Apple           400   0.120  0.833333
1/1/2007  Technology  Microsoft      5500   0.048  1.000000
1/1/2007  Technology  Google          230   0.069  0.333333
1/1/2007  Technology  Lenova           36   0.036  0.166667
1/1/2007  Technology  IBM             250   0.016  0.666667
1/1/2007  Technology  Sprint          231   0.013  0.500000
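The Pctile values above look like a percentile rank of Price within each Date/Category group. As a rough check, and assuming that reading is right, the Technology rows can be reproduced with `Series.rank(pct=True)`; the `tech` frame below is just those six rows retyped for illustration.

```python
import pandas as pd

# Technology rows from the table above (retyped for illustration)
tech = pd.DataFrame({
    'Company': ['Apple', 'Microsoft', 'Google', 'Lenova', 'IBM', 'Sprint'],
    'Price': [400, 5500, 230, 36, 250, 231],
})

# Percentile rank: rank of each price divided by the number of rows
tech['Pctile'] = tech['Price'].rank(pct=True)
print(tech)
```

With the default `method='average'`, tied prices share the mean of their ranks, which would explain why the two 3000-priced Automative rows both show 0.9375.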

          Category    Company       Price  weight    Pctile  Final_weight
Date
1/1/2007  Automative  Audi           1000   0.146  0.625000         0.146
1/1/2007  Automative  Alfa Romeo      400   0.143  0.375000         0.143
1/1/2007  Automative  Aston Martin    500   0.002  0.500000         0.002
1/1/2007  Automative  Bentley        2000   0.025  0.750000         0.041
1/1/2007  Automative  Mercedes       3000   0.063  0.937500         0.102
1/1/2007  Automative  BMW              40   0.154  0.125000         0.000
1/1/2007  Automative  Volvo          3000   0.163  0.937500         0.265
1/1/2007  Automative  VW              200   0.003  0.250000         0.000
1/1/2007  Technology  Apple           400   0.120  0.833333         0.146
1/1/2007  Technology  Microsoft      5500   0.048  1.000000         0.058
1/1/2007  Technology  Google          230   0.069  0.333333         0.069
1/1/2007  Technology  Lenova           36   0.036  0.166667         0.000
1/1/2007  Technology  IBM             250   0.016  0.666667         0.016
1/1/2007  Technology  Sprint          231   0.013  0.500000         0.013
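The rule behind Final_weight is not spelled out here, but the numbers are consistent with this reading: within each category, rows with Pctile below 0.3 drop to zero, and the weight they give up is shared among rows with Pctile above 0.7 in proportion to their existing weights. A quick arithmetic check on the Automative rows, assuming that reading (the variable names are only for illustration):

```python
# Weights of the Automative rows, taken from the table above
upper = {'Bentley': 0.025, 'Mercedes': 0.063, 'Volvo': 0.163}  # Pctile > 0.7
lower = {'BMW': 0.154, 'VW': 0.003}                            # Pctile < 0.3

upper_sum = sum(upper.values())
lower_sum = sum(lower.values())

# Each upper row keeps its share of the upper total, applied to upper + lower
final = {name: w / upper_sum * (upper_sum + lower_sum) for name, w in upper.items()}

for name, w in final.items():
    print(name, round(w, 3))
# Bentley 0.041
# Mercedes 0.102
# Volvo 0.265
```

These match the Final_weight column, and the category total is unchanged because the upper rows absorb exactly what the lower rows lose.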

1 answer

User

#1 · Posted 2024-06-28 10:46:42

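I would have liked this to be a groupby-on-a-groupby solution, but it isn't; it's a dirty hack. The reason I can't use a groupby solution is that, as far as I know, there is no way to select columns with groupby and pass them to multiple-argument functions. Enough about what can't be done...

I did say it's a hack, so try it on your dataset. I don't know how fast it will be on a large dataset, but let me know.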

import pandas as pd

#make a lazy example
date = ['1/1/2017']*10
category = ['car']*5 + ['tech']*5
company = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
price = [10, 300, 100, 400, 500, 230, 324, 543, 234, 124]
weight = [0.2, 0.1, 0.3, 0.2, 0.2, 0.15, 0.15, 0.4, 0.1, 0.2]

data = {'date': date, 'category': category, 'company': company, 'price': price, 'weight': weight}
df = pd.DataFrame(data)

# percentile rank of each price within its (date, category) group
df['pctile'] = df.groupby(['date', 'category'])['price'].rank(pct=True)

# label rows 'upper' (pctile > 0.7) or 'lower' (pctile < 0.3);
# rows in between stay NaN
def seventy_thirty(df):
    labels = pd.Series(index=df.index, dtype=object)
    labels[df.pctile > 0.7] = 'upper'
    labels[df.pctile < 0.3] = 'lower'
    return labels

df['pctile_summary'] = seventy_thirty(df)

# sum the weights per (date, category, pctile_summary) so they can be merged back in as a column
# (rows with a NaN pctile_summary are dropped by groupby)
weighted = df.groupby(['date', 'category', 'pctile_summary'])[['weight']].sum()

# copy each 'lower' total onto the matching 'upper' row, as final_weight needs both
add_lower = weighted.loc[weighted.index.get_level_values('pctile_summary') == 'lower', ['weight']].reset_index(level=2)
add_lower['pctile_summary'] = 'upper'
add_lower = add_lower.set_index('pctile_summary', append=True)
weighted = pd.merge(weighted, add_lower, how='left', left_index=True, right_index=True, suffixes=('', '_lower'))

# now add all the new columns and calculate final_weight
df1 = pd.merge(df, weighted.reset_index(), how='left', on=['date', 'category', 'pctile_summary'], suffixes=('', '_sum'))
# a group with 'upper' rows but no 'lower' rows has nothing to redistribute
lower_total = df1.weight_lower.fillna(0)
df1.loc[df1.pctile_summary == 'lower', 'final_weight'] = 0
df1.loc[df1.pctile_summary.isnull(), 'final_weight'] = df1.weight
df1.loc[df1.pctile_summary == 'upper', 'final_weight'] = (df1.weight / df1.weight_sum) * (df1.weight_sum + lower_total)

# finally tidy up (delete all that hard work!)
df1 = df1.drop(['pctile_summary', 'weight_sum', 'weight_lower'], axis=1)
df1
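For comparison, here is a sketch of a different route that avoids the merges: `groupby(...).transform('sum')` broadcasts each group's total back onto every row, so the upper and lower totals can sit alongside the original weights. This is not the approach above, just one possible alternative; the function name `redistribute`, its `lo`/`hi` parameters and the temporary `_lower_w`/`_upper_w` columns are made up for illustration, and it assumes the same lowercase column names as the example data.

```python
import pandas as pd

def redistribute(df, lo=0.3, hi=0.7):
    """Move the weight of low-percentile rows onto high-percentile rows, per group."""
    out = df.copy()
    keys = ['date', 'category']
    out['pctile'] = out.groupby(keys)['price'].rank(pct=True)

    is_lower = out['pctile'] < lo
    is_upper = out['pctile'] > hi

    # Keep the weight only where the mask holds, then total it per group
    out['_lower_w'] = out['weight'].where(is_lower, 0.0)
    out['_upper_w'] = out['weight'].where(is_upper, 0.0)
    grouped = out.groupby(keys)
    lower_sum = grouped['_lower_w'].transform('sum')
    upper_sum = grouped['_upper_w'].transform('sum')

    # Middle rows keep their weight, lower rows go to zero,
    # upper rows are scaled up to absorb the lower total
    out['final_weight'] = out['weight']
    out.loc[is_lower, 'final_weight'] = 0.0
    out.loc[is_upper, 'final_weight'] = out['weight'] * (upper_sum + lower_sum) / upper_sum

    return out.drop(columns=['_lower_w', '_upper_w'])

# Same example data as in the answer
data = {
    'date': ['1/1/2017'] * 10,
    'category': ['car'] * 5 + ['tech'] * 5,
    'company': ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j'],
    'price': [10, 300, 100, 400, 500, 230, 324, 543, 234, 124],
    'weight': [0.2, 0.1, 0.3, 0.2, 0.2, 0.15, 0.15, 0.4, 0.1, 0.2],
}
result = redistribute(pd.DataFrame(data))
print(result)
```

On this example each category's final weights still sum to 1. A group that has lower rows but no upper rows would simply lose that weight under this sketch, so that case would need its own handling.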
