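<p>Consider replacing <a href="https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.apply.html" rel="nofollow noreferrer"><code>DataFrame.apply</code></a>, since you are not running an independent operation on each column or row but applying conditional logic based on other columns, <em>Field1</em> and <em>Field2</em>, to build the <em>new</em> column. In addition, <a href="https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.assign.html" rel="nofollow noreferrer"><code>DataFrame.assign</code></a> takes the new column name as an unquoted keyword argument, so it will not work directly with a name passed in as a string.</p>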
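<p>Instead, run a plain Python function that assigns columns by string name and returns the updated DataFrame. The example below uses random data, seeded for reproducibility, to conditionally populate <em>Field3</em> through <em>Field5</em>:</p>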
<pre><code>import pandas as pd
import numpy as np

np.random.seed(55)  # FIXED SEED FOR REPRODUCIBILITY

Table1 = pd.DataFrame({'ID': [np.random.randint(15) for _ in range(50)],
                       'Field1': [np.random.choice(['CEQTY','LPCEQ','ABCDE','WVXYZ','12345'],1).item(0)
                                  for _ in range(50)],
                       'Field2': np.random.randn(50)*100
                      }, columns=['ID', 'Field1', 'Field2'])

def func(df):
    # ITERATE THROUGH LIST OF TUPLES (NEW COL AND LIST OF SEARCH ITEMS)
    for i in [('Field3',['CEQTY','LPCEQ']),
              ('Field4',['ABCDE','WVXYZ']),
              ('Field5',['12345'])]:
        # ASSIGN NEW COL, i[0], BY STRING BASED ON SEARCH LIST, i[1]
        df[i[0]] = np.where(df.Field1.astype(str).str[0:5].isin(i[1]), df.Field2, 0)
    return df

output = func(Table1)
print(output.head(10))
#    ID  Field1       Field2      Field3       Field4      Field5
# 0  13   LPCEQ   105.640854  105.640854     0.000000    0.000000
# 1  10   12345   -13.049038    0.000000     0.000000  -13.049038
# 2   7   CEQTY   -85.079280  -85.079280     0.000000    0.000000
# 3   8   12345    12.047304    0.000000     0.000000   12.047304
# 4  13   12345   -29.095108    0.000000     0.000000  -29.095108
# 5  13   12345   -24.229704    0.000000     0.000000  -24.229704
# 6  13   LPCEQ   -97.472869  -97.472869     0.000000    0.000000
# 7   5   ABCDE  -221.743951    0.000000  -221.743951    0.000000
# 8   7   LPCEQ    -0.155842   -0.155842     0.000000    0.000000
# 9   5   CEQTY     2.297829    2.297829     0.000000    0.000000
</code></pre>
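<p>On the earlier point about <code>DataFrame.assign</code>: if you do want to keep using it with column names held in strings, one option is to build a dict and unpack it with <code>**</code>. The following is only a minimal sketch under that assumption, not part of the approach above. The helper name <code>add_conditional_columns</code>, its <code>mapping</code>, <code>key</code> and <code>value</code> parameters, and the three-row <code>sample</code> frame are all hypothetical and chosen for illustration. Unlike <code>func</code>, which adds columns to the DataFrame it receives, this version leaves the input untouched because <code>assign</code> returns a copy.</p>

```python
import numpy as np
import pandas as pd


def add_conditional_columns(df, mapping, key="Field1", value="Field2"):
    """Return a copy of df with one new column per entry in mapping.

    mapping: dict of {new column name (str): list of search items}.
    All names here are hypothetical, for illustration only.
    """
    new_cols = {
        # Same condition as in func: match the first five characters of key
        name: np.where(df[key].astype(str).str[0:5].isin(items), df[value], 0)
        for name, items in mapping.items()
    }
    # ** unpacking lets assign() accept column names stored as strings
    return df.assign(**new_cols)


# Small hand-made frame (assumed data, not the random Table1 above)
sample = pd.DataFrame({
    "Field1": ["CEQTY", "ABCDE", "12345"],
    "Field2": [1.5, -2.0, 3.25],
})

result = add_conditional_columns(sample, {
    "Field3": ["CEQTY", "LPCEQ"],
    "Field4": ["ABCDE", "WVXYZ"],
    "Field5": ["12345"],
})
print(result)
```

<p>Whether to mutate in place (as <code>func</code> does) or return a copy is a design choice; the copy costs some memory but avoids surprising callers who still hold a reference to the original DataFrame.</p>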