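<p>You should lean more on <code>apply</code> and pandas' own methods. Looping over rows with a <code>for</code> loop in pandas is a very bad idea: it will ruin your performance.</p>
<p>One possible solution is as follows:</p>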
<pre><code>import pandas as pd
# read the file
emp=pd.read_csv("employee_huge.txt", sep=" ")
# generate unique lists containing LocationX and TitleX
lnewcols_location=set(emp["Location"].to_list())
lnewcols_title=set(emp["Title"].to_list())
# a function to compare a cell (like "Location1") to a string that is the name of the column
# like "Location2". If they match return 1, otherwise 0
def same_as_col(acell, col):
    if acell == col:
        return 1
    else:
        return 0

# generate all the LocationN columns with 1 or 0 if there is a match
for i in lnewcols_location:
    emp[i] = emp["Location"].apply(same_as_col, col=i)

# generate all the TitleN columns with 1 or 0 if there is a match
for i in lnewcols_title:
    emp[i] = emp["Title"].apply(same_as_col, col=i)
# removing Location and Title columns
emp=emp.drop(["Location", "Title"], axis=1)
</code></pre>
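<p>If you want to drop the remaining loops over column names as well, the same 0/1 columns can probably be built with <code>pd.get_dummies</code>. This is only a sketch of that alternative, not the code above: the small in-memory frame stands in for the file, and the name <code>encoded</code> is made up for illustration.</p>

```python
import pandas as pd

# Small in-memory stand-in for the file (assumed sample, same layout as the data)
emp = pd.DataFrame({
    "Employee": [0, 1, 2, 3, 4],
    "Location": ["Location4", "Location1", "Location1", "Location1", "Location4"],
    "Title": ["Title1", "Title3", "Title2", "Title4", "Title1"],
})

# get_dummies on a Series creates one 0/1 column per distinct value,
# named after that value (e.g. "Location1"), without any Python-level loop
location_flags = pd.get_dummies(emp["Location"], dtype=int)
title_flags = pd.get_dummies(emp["Title"], dtype=int)

# Keep the remaining columns and attach the indicator columns side by side
encoded = pd.concat([emp[["Employee"]], location_flags, title_flags], axis=1)
print(encoded)
```

<p>The indicator columns come out sorted by value, and only values that actually occur in the data get a column, which matches what the <code>set</code>-based version produces.</p>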
<p>Finally, I generated a file named <code>employee_huge.txt</code> to test this. Its contents are formatted as follows:</p>
<pre><code>Employee Location Title
0 Location4 Title1
1 Location1 Title3
2 Location1 Title2
3 Location1 Title4
4 Location4 Title1
</code></pre>
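<p>For anyone who wants to reproduce the test, a file in this format can be generated along these lines. This is a hedged sketch: the helper <code>make_employee_file</code>, the row count, the seed, and the choice of four distinct locations and titles are all assumptions, not the script actually used.</p>

```python
import random


def make_employee_file(path, n_rows, n_locations=4, n_titles=4, seed=0):
    """Write a space-separated file with Employee, Location and Title columns."""
    rng = random.Random(seed)  # fixed seed so the file is reproducible
    with open(path, "w") as fh:
        fh.write("Employee Location Title\n")
        for i in range(n_rows):
            loc = rng.randint(1, n_locations)   # inclusive on both ends
            title = rng.randint(1, n_titles)
            fh.write(f"{i} Location{loc} Title{title}\n")


# Assumed size; raise n_rows to stress-test the encoding step
make_employee_file("employee_huge.txt", n_rows=1000)
```

<p>The result can be read back with <code>pd.read_csv("employee_huge.txt", sep=" ")</code>, as in the solution above.</p>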