<p>The index can be normalized with the following function:</p>
<pre><code>def normalizeIndex(x):
    # Split the input string on spaces and drop the empty tokens
    splittedString = list(filter(None, x.split(" ")))
    # Sort the tokens so that "RF FITB" and "FITB RF" give the same result
    splittedString.sort()
    # Re-join the sorted tokens into the normalized string
    return " ".join(splittedString)
</code></pre>
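<p>As a quick sanity check, here is a minimal sketch of what the normalization does to a few labels. The function is repeated so the snippet runs on its own; the sample labels, including the one with extra spaces, are illustrative assumptions.</p>

```python
def normalizeIndex(x):
    # Same logic as the function above: split on spaces, drop empties, sort, re-join
    tokens = list(filter(None, x.split(" ")))
    tokens.sort()
    return " ".join(tokens)

# Both orderings of a pair collapse to the same key; stray spaces are ignored
labels = ["FITB RF", "RF FITB", "  WAT   SWK "]
normalized = [normalizeIndex(label) for label in labels]
print(normalized)  # ['FITB RF', 'FITB RF', 'SWK WAT']
```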
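<p>Apply this function to the index of <code>df</code>, then group by the index and keep the first occurrence in each group (other aggregations can be used instead of <code>first()</code>):</p>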
<pre><code>import pandas as pd

df = pd.DataFrame({'Correlations': [0.984395, 0.981778, 0.981778, 0.984395, 0.973801],
                   'adf': [-5.484766, -5.465284, -5.420976, -5.175268, -4.919812]},
                  index=['FITB RF', 'WAT SWK', 'SWK WAT', 'RF FITB', 'MCO BK'])
df.index = df.index.map(normalizeIndex)  # apply the reordering function to the index
df = df.groupby(df.index).first()  # group by the normalized index and take the first occurrence
print(df)
</code></pre>
<p>Output:</p>
<pre><code>         Correlations       adf
BK MCO       0.973801 -4.919812
FITB RF      0.984395 -5.484766
SWK WAT      0.981778 -5.465284
</code></pre>
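<p>To illustrate other grouping options, the sketch below summarizes each normalized pair instead of keeping only its first row. The choice of aggregations and the column names <code>adf_mean</code> and <code>n_rows</code> are assumptions made for illustration, not part of the original answer; the setup is repeated so the snippet runs on its own.</p>

```python
import pandas as pd

def normalizeIndex(x):
    # Split on spaces, drop empty tokens, sort, and re-join
    return " ".join(sorted(filter(None, x.split(" "))))

df = pd.DataFrame(
    {'Correlations': [0.984395, 0.981778, 0.981778, 0.984395, 0.973801],
     'adf': [-5.484766, -5.465284, -5.420976, -5.175268, -4.919812]},
    index=['FITB RF', 'WAT SWK', 'SWK WAT', 'RF FITB', 'MCO BK'])
df.index = df.index.map(normalizeIndex)

# Assumed summary: first correlation, mean adf, and number of rows per pair
summary = df.groupby(level=0).agg(
    Correlations=('Correlations', 'first'),
    adf_mean=('adf', 'mean'),
    n_rows=('adf', 'count'),
)
print(summary)
```

<p>Named aggregation keeps the result columns explicit, which makes it easy to swap <code>'mean'</code> for <code>'min'</code> or <code>'max'</code> depending on which duplicate should win.</p>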