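I'm trying to create a DataFrame containing the name of each election, the result (the Republican minus Democratic share, as a fraction), and the poll difference for every poll of that election. My code so far: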
import pandas as pd

def results_polls_diff(editinfo, polls):
    # `candidates` is a DataFrame defined elsewhere (not shown here)
    rows = []
    for i, election in enumerate(editinfo):
        polls_key = election['slug']
        this_election = polls[polls_key]
        npolls = this_election.shape[0]
        # .ix was removed in pandas 1.0; .iloc assumes positional lookup was intended
        diff = (this_election[candidates['R'].iloc[i]] - this_election[candidates['D or I'].iloc[i]])/100
        for c in election['estimates']:
            if c['party'] == 'Rep':
                r1 = c['value']
        for c in election['estimates']:
            if c['party'] == 'Dem' or c['party'] == 'ind':
                r2 = c['value']
        result = (r1-r2)/100
        #init_rows = []
        #for d in diff:
        #    init_rows.append((polls_key, result, d))
        #return init_rows
        # one row per election, with every poll difference packed into a single list
        rows.append((polls_key, result, [d for d in diff]))
    return rows

result_df = pd.DataFrame(results_polls_diff(editinfo, polls), columns = ['race', 'result', 'diff_list'])
result_df.head()
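This gives one row per race, with the whole list of poll differences packed into a single diff_list cell. What I'm aiming for is this: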
                                 race  result  diff_list
0  2014-delaware-senate-wade-vs-coons   -0.22      -0.18
1  2014-delaware-senate-wade-vs-coons   -0.22      -0.16
2  2014-delaware-senate-wade-vs-coons   -0.22      -0.25
3  2014-delaware-senate-wade-vs-coons   -0.22      -0.15
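If I use the commented-out part of the code and change the append to rows.append((init_rows)), I get this result, but it no longer seems to iterate over all the elections.

You want to take the list out of each diff_list cell so that every element occupies its own cell in that column, and then copy the rest of the row alongside it.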
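The long layout can also be produced while the rows are being built, by emitting one tuple per poll difference and returning only after the loop has finished. A minimal sketch under simplifying assumptions: `expand_rows` and the `races` list are hypothetical stand-ins for the question's function and inputs, and the second race is made up purely for illustration.

```python
import pandas as pd

def expand_rows(races):
    """Return one (race, result, diff) tuple per poll.

    `races` is a hypothetical list of (race, result, diffs) triples.
    """
    rows = []
    for race, result, diffs in races:
        # extend (not append) so each difference becomes its own row
        rows.extend((race, result, d) for d in diffs)
    # return after the loop, so every race is processed
    return rows

races = [
    ("2014-delaware-senate-wade-vs-coons", -0.22, [-0.18, -0.16, -0.25, -0.15]),
    ("hypothetical-second-race", 0.05, [0.03, 0.07]),  # made-up values
]

long_df = pd.DataFrame(expand_rows(races), columns=["race", "result", "diff_list"])
print(long_df)
```

A `return` placed inside the `for` loop exits the function during the first iteration, which would explain why only one race shows up when the commented-out lines are enabled.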
Here is one strategy. Consider the DataFrame df from the question, with a list in each diff_list cell.
Option 1
Use set_index, apply, and unstack.
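A minimal sketch of this kind of chain, not necessarily the answerer's exact code. The `df` below is a hypothetical stand-in for the question's result_df (the second race is made up), and this version uses `stack` for the final reshape, since that is the call that moves the spread-out columns back into rows here.

```python
import pandas as pd

# Hypothetical stand-in for result_df: one list per diff_list cell.
df = pd.DataFrame({
    "race": ["2014-delaware-senate-wade-vs-coons", "hypothetical-second-race"],
    "result": [-0.22, 0.05],
    "diff_list": [[-0.18, -0.16, -0.25, -0.15], [0.03, 0.07]],
})

# Spread each list across columns 0..n-1; shorter lists are padded with NaN.
wide = df.set_index(["race", "result"])["diff_list"].apply(pd.Series)

long_df = (
    wide.stack()                        # columns become the innermost index level
        .dropna()                       # drop NaN padding (also drops real NaNs)
        .reset_index(level=-1, drop=True)  # discard the list-position level
        .rename("diff_list")
        .reset_index()                  # race and result back to columns
)
print(long_df)
```

Note that `apply(pd.Series)` builds one Series per row, so it can be slow on large frames.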
Option 2
Build a new index and DataFrame, then unstack.
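One way this could look, as a sketch under assumptions rather than the answerer's exact code: the index is built explicitly from the race and result columns, the lists are passed straight to the DataFrame constructor, and `stack` performs the final reshape. The `df` is a hypothetical stand-in for the question's result_df, with a made-up second race.

```python
import pandas as pd

# Hypothetical stand-in for result_df: one list per diff_list cell.
df = pd.DataFrame({
    "race": ["2014-delaware-senate-wade-vs-coons", "hypothetical-second-race"],
    "result": [-0.22, 0.05],
    "diff_list": [[-0.18, -0.16, -0.25, -0.15], [0.03, 0.07]],
})

# New index from the columns that should be repeated on every row.
idx = pd.MultiIndex.from_frame(df[["race", "result"]])

# One column per list position; shorter lists are padded with NaN.
wide = pd.DataFrame(df["diff_list"].tolist(), index=idx)

long_df = (
    wide.stack()
        .dropna()                          # remove the NaN padding
        .reset_index(level=-1, drop=True)  # discard the list-position level
        .rename("diff_list")
        .reset_index()
)
print(long_df)
```

This avoids the per-row `apply`, so it tends to scale better. On pandas 0.25 or later, `df.explode("diff_list")` performs the same expansion in a single call, though it leaves the column with object dtype and repeats the original index.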