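Thanks to @Woody Pride's answer: https://stackoverflow.com/a/19791302/5608428, I've got 95% of the way to what I want to achieve.
That is, by the way, creating a dict of sub-DataFrames from one large df.
All I need to do now is sort each DataFrame in the dictionary. It's a small thing, but I can't find an answer on here or on Google.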
import pandas as pd
import numpy as np
import itertools
def points(row):
    if row['Ob1'] > row['Ob2']:
        val = 2
    else:
        val = 1
    return val
#create some data with Names column
data = pd.DataFrame({'Names': ['Joe', 'John', 'Jasper', 'Jez'] * 4,
                     'Ob1': np.random.rand(16), 'Ob2': np.random.rand(16)})
#create list of unique pairs
comboNames = list(itertools.combinations(data.Names.unique(), 2))
#create a data frame dictionary to store your data frames
DataFrameDict = {elem : pd.DataFrame for elem in comboNames}
for key in DataFrameDict.keys():
    # .copy() makes each sub-frame independent of data, so adding a column later does not warn
    DataFrameDict[key] = data[data.Names.isin(key)].copy()
#Add test calculated column
for tbl in DataFrameDict:
    DataFrameDict[tbl]['Test'] = DataFrameDict[tbl].apply(points, axis=1)
#############################
#Checking test and sorts
##############################
#access df's to print head
for tbl in DataFrameDict:
    print(DataFrameDict[tbl].head())
    print()
#access df's to print summary
for tbl in DataFrameDict:
    print(str(tbl[0])+" vs "+str(tbl[1])+": "+str(DataFrameDict[tbl]['Ob2'].sum()))
print()
#trying to sort each df
for tbl in DataFrameDict:
    #Doesn't work
    DataFrameDict[tbl].sort_values(['Ob1'])
    #mistakenly deleted other attempts (facepalm)
for tbl in DataFrameDict:
    print(DataFrameDict[tbl].head())
    print()
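The code runs, but whatever I try, none of the dfs end up sorted. I can access each df, print it, and so on with no problem, but .sort_values() has no effect.
On a separate note, creating the dfs with tuples as the names (keys) is a bit awkward. Is there a better way?
Many thanks.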
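It looks like you just need to assign the sorted DataFrame back to the dict. By default, sort_values returns a new, sorted DataFrame and leaves the one it was called on unchanged.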
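A minimal sketch of that fix, rebuilding a small version of the question's dict so the snippet runs on its own. The random seed and the `.copy()` call are my own additions for the example, not part of the original code:

```python
import itertools

import numpy as np
import pandas as pd

# Stand-in for the question's data; the seed is arbitrary and only keeps runs repeatable
rng = np.random.default_rng(0)
data = pd.DataFrame({
    'Names': ['Joe', 'John', 'Jasper', 'Jez'] * 4,
    'Ob1': rng.random(16),
    'Ob2': rng.random(16),
})

# One sub-frame per unique pair of names, keyed by the pair tuple
DataFrameDict = {
    pair: data[data.Names.isin(pair)].copy()
    for pair in itertools.combinations(data.Names.unique(), 2)
}

# sort_values returns a new DataFrame, so store the result back under the same key
for key in DataFrameDict:
    DataFrameDict[key] = DataFrameDict[key].sort_values('Ob1')

for key, df in DataFrameDict.items():
    print(key)
    print(df.head())
    print()
```

Reassigning the value of an existing key while looping over the dict is safe here, because no keys are added or removed.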