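<p>To get these results I wrote a much faster version of the transform. You can call np.ushort inside the list comprehension and it is still fast, but calling it once outside is much faster:</p>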
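<pre><code>import time

import numpy as np
import pandas as pd

# namednumber2numbername is the lookup dict; it is defined elsewhere (not shown here).
df = pd.DataFrame(
    np.random.randn(8, 4**7),
    index=[np.array(['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux']),
           np.array(['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two'])])

start = time.time()
# Convert the whole frame once, outside the comprehension.
df.loc[:,] = np.ushort(df)
# x.name is the (level 0, level 1) row label; x.name[1] selects the lookup table.
df = df.transform(lambda x: [i if i > 10 else namednumber2numbername[x.name[1]][i] for i in x], axis=1)
end = time.time()
print(end - start)
# 1.150895118713379
</code></pre>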
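<p>For a version that runs on its own, here is a minimal sketch of the same idea on a small frame. The contents of namednumber2numbername and the sample values below are hypothetical stand-ins, since the real dict is not shown here. The sketch uses apply with result_type='expand' instead of transform, a substitution made for this example, so it is not necessarily what the timing above measured:</p>

```python
import numpy as np
import pandas as pd

# Hypothetical lookup table: one sub-dict per level-1 label.
namednumber2numbername = {
    'one': {0: 'zero', 1: 'one', 2: 'two'},
    'two': {0: 'i', 1: 'ii', 2: 'iii'},
}

index = pd.MultiIndex.from_product([['bar', 'baz'], ['one', 'two']])
values = np.array([[0.0, 1.0, 2.0, 40.0],
                   [2.0, 0.0, 11.0, 1.0],
                   [1.0, 1.0, 0.0, 2.0],
                   [0.0, 99.0, 2.0, 1.0]])
df = pd.DataFrame(values, index=index)

# Convert the whole frame once, outside the per-row comprehension.
ints = df.astype(np.uint16)

def name_row(row):
    # row.name is the (level 0, level 1) label of this row.
    table = namednumber2numbername[row.name[1]]
    # Keep values above 10 as numbers, look the rest up by name.
    return [int(v) if v > 10 else table[int(v)] for v in row]

named = ints.apply(name_row, axis=1, result_type='expand')
print(named)
```

<p>The sample values are non-negative on purpose: casting negative floats to an unsigned 16-bit type does not give meaningful small integers, which is presumably where values such as 65535 in the outputs further down come from.</p>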
<p>The original timing was:</p>
^{pr2}$
<p>I got it down to one line:</p>
^{3}$
<p>OK, here is a version without .items():</p>
<pre><code>def what(x):
    # x is one column of a single-row group; x.index[0][1] is its level-1 label.
    if type(x[0]) == np.float64:
        if np.ushort(x[0]) > 10:
            return np.ushort(x[0])
        else:
            return namednumber2numbername[x.index[0][1]][np.ushort(x[0])]

df.groupby(level=[0,1]).transform(what)

# Output:
            0     1      2      3
bar one  zero   one   zero   zero
    two     i    ii  65535      i
baz one  zero  zero  65535   zero
    two     i     i      i      i
foo one  zero   one   zero   zero
    two     i     i      i      i
qux one   two  zero   zero  65534
    two     i     i      i     ii
</code></pre>
<p>And as a one-liner, with no .items(), as you asked. We group by levels 0 and 1, then run the calculation that determines each value:</p>
<pre><code>df.groupby(level=[0,1]).transform(lambda x: np.ushort(x[0]) if type(x[0]) == np.float64 and np.ushort(x[0]) > 10 else namednumber2numbername[x.index[0][1]][np.ushort(x[0])])

# Output:
            0     1      2      3
bar one  zero   one   zero   zero
    two     i    ii  65535      i
baz one  zero  zero  65535   zero
    two     i     i      i      i
foo one  zero   one   zero   zero
    two     i     i      i      i
qux one   two  zero   zero  65534
    two     i     i      i     ii
</code></pre>
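<p>To see why x.index[0][1] gives the second-level label, it can help to look at the groups directly. The following is a small sketch on made-up data (the frame and its values are assumptions for illustration): grouping by both index levels puts every row in its own group, and each column of such a group is a one-element Series whose index still carries the full (level 0, level 1) tuple.</p>

```python
import pandas as pd

index = pd.MultiIndex.from_product([['bar', 'baz'], ['one', 'two']])
df = pd.DataFrame([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]], index=index)

# Collect the groups produced by grouping on both index levels.
groups = {key: group for key, group in df.groupby(level=[0, 1])}
group_keys = list(groups)

# One column of one group: a one-element Series, like the x seen per column.
column = groups[('bar', 'two')][0]
second_level = column.index[0][1]  # label from index level 1
first_value = column.iloc[0]       # the single value in this column

print(group_keys)
print(second_level, first_value)
```

<p>The sketch uses .iloc[0] instead of x[0]. On a Series with a non-integer index, x[0] relies on positional fallback, which, as far as I know, recent pandas versions deprecate, so .iloc[0] is the more explicit spelling.</p>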
<p>To get the other values, I did this:</p>
<pre><code># x is one column: x.name is its (level 0, level 1) column label, pos is the row label.
df.transform(lambda x: [str(x.name[0]) + '_' + str(x.name[1]) + '_' + str(pos) + '_' + str(value) for pos, value in x.items()])

# Assuming `what` is bound to the lambda above:
print('Transformed DataFrame:\n',
      df.transform(what), sep='')

Transformed DataFrame:
     α                                                                                ...  ω                                                      ε
f    a                          b                          c                          ...  b                          c                           j
one  α_a_one_79.96465755359696  α_b_one_31.32938096131651  α_c_one_2.61444370203201   ...  ω_b_one_35.7457972161041   ω_c_one_40.224465043054195  ε_j_one_43.527184108357496
two  α_a_two_42.66244395377804  α_b_two_65.92020941618344  α_c_two_77.26467264185487  ...  ω_b_two_40.91908469505522  ω_c_two_50.395561828234555  ε_j_two_71.67418483119914
one  α_a_one_47.9769845681328   α_b_one_38.90671671550259  α_c_one_67.13601594352508  ...  ω_b_one_23.23799084164898  ω_c_one_63.551178212994465  ε_j_one_16.975582723809303
</code></pre>
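<p>The same labelling can be tried on a small frame. This is a minimal sketch with made-up column labels and values, and label_column is a hypothetical helper name. It returns a Series with the column's own index, rather than a plain list, so that the row labels are carried through explicitly:</p>

```python
import pandas as pd

columns = pd.MultiIndex.from_tuples([('α', 'a'), ('α', 'b'), ('ω', 'b')])
df = pd.DataFrame([[1.5, 2.5, 3.5], [4.5, 5.5, 6.5]],
                  index=['one', 'two'], columns=columns)

def label_column(col):
    # col.name is the (top level, sub level) tuple of this column.
    top, sub = col.name
    labels = [f'{top}_{sub}_{row}_{value}' for row, value in col.items()]
    # Reuse the column's index so the row labels are preserved.
    return pd.Series(labels, index=col.index)

labelled = df.apply(label_column)
print(labelled)
```

<p>Each cell then holds its column labels, row label and value joined by underscores, in the same shape as the input frame.</p>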
<p>Here is one without .items():</p>
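<pre><code># zip(x.index, x) pairs each row label with its value, so .items() is not needed.
# The fallback is the string '0' because str.join only accepts strings.
df.transform(lambda x: ['_'.join((x.name[0], x.name[1], pos, str(i) if type(i) == float else '0')) for pos, i in zip(x.index, x)])
</code></pre>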
<p>Output:</p>
^{8}$
<p>I also did it without grouping:</p>
<pre><code>df.T.apply(lambda x: x.name[0] + '_' + x.name[1] + '_' + df.T.eq(x).columns + '_' + x.astype(str), axis=1).T

# or, even better and simplest:
df.T.apply(lambda x: x.name[0] + '_' + x.name[1] + '_' + x.index + '_' + x.astype(str), axis=1).T

# or:
df.T.transform(lambda x: x.name[0] + '_' + x.name[1] + '_' + x.index + '_' + x.astype(str), axis=1).T

# or with no .T: here x is a row, so the column labels come from x.index
# (one per cell, not just x.index[0]) and the row label from x.name:
df.transform(lambda x: x.index.get_level_values(0) + '_' + x.index.get_level_values(1) + '_' + x.name + '_' + x.astype(str), axis=1)

# Output:
     α                                                                                ...  ω                                                      ε
f    a                          b                          c                          ...  b                          c                           j
one  α_a_one_79.96465755359696  α_b_one_31.32938096131651  α_c_one_2.61444370203201   ...  ω_b_one_35.7457972161041   ω_c_one_40.224465043054195  ε_j_one_43.527184108357496
two  α_a_two_42.66244395377804  α_b_two_65.92020941618344  α_c_two_77.26467264185487  ...  ω_b_two_40.91908469505522  ω_c_two_50.395561828234555  ε_j_two_71.67418483119914
one  α_a_one_47.9769845681328   α_b_one_38.90671671550259  α_c_one_67.13601594352508  ...  ω_b_one_23.23799084164898  ω_c_one_63.551178212994465  ε_j_one_16.975582723809303
</code></pre>