<p>Have you heard of <a href="http://pandas.pydata.org/" rel="nofollow"><code>pandas</code></a>? It can help you!</p>
<pre><code>import numpy as np
import pandas as pd
# Load data set
data = pd.read_csv(inputFile, delimiter='|')
# Tag
def func(ssn):
    if ssn == 123456789:
        return 10001
    if ssn == 987654321:
        return 10002
data['ID'] = data['SSN'].apply(func)
# Reorder columns
new_cols = np.concatenate((data.columns[-1:], data.columns[:-1]), axis=0)
data = data[new_cols]
# Save file
data.to_csv(outputFile, sep='|', index=False)
</code></pre>
<p>The output is:</p>
<pre><code>ID|RefID|FirstName|MiddleName|LastName|SSN|DOB|School Year|Age|District LEA|District Description|School LEA|Location Description|title|frng_amt
10001|1|JULIE|A|ADAMS|123456789|654321|20142015|47|101000|DEWITTSCHOOLDISTRICT|P|14||
10001|2|JULIE|A|ADAMS|123456789|654321|20132014|46|101000|DEWITTSCHOOLDISTRICT|S|13100||
10001|3|JULIE|A|ADAMS|123456789|654321|20122013|45|101000|DEWITTSCHOOLDISTRICT|P|14||
10001|4|JULIE|A|ADAMS|123456789|654321|20132014|46|101000|DEWITTSCHOOLDISTRICT|P|14||
10001|5|JULIE|A|ADAMS|123456789|654321|20142015|47|101000|DEWITTSCHOOLDISTRICT|S|15000||
10001|6|JULIE|A|ADAMS|123456789|654321|20122013|45|101000|DEWITTSCHOOLDISTRICT|S|13100||
10002|7|SHIRLEY||ADAMS|987654321|987890|20122013|49|101000|DEWITTSCHOOLDISTRICT|S|13100||
10002|8|SHIRLEY||ADAMS|987654321|987890|20092010|46|101000|DEWITTSCHOOLDISTRICT|P|14||
10002|9|SHIRLEY||ADAMS|987654321|987890|20102011|47|101000|DEWITTSCHOOLDISTRICT|P|14||
10002|10|SHIRLEY||ADAMS|987654321|987890|20132014|50|101000|DEWITTSCHOOLDISTRICT|S|13100||
10002|11|SHIRLEY||ADAMS|987654321|987890|20132014|50|101000|DEWITTSCHOOLDISTRICT|P|14||
10002|12|SHIRLEY||ADAMS|987654321|987890|20122013|49|101000|DEWITTSCHOOLDISTRICT|P|14||
10002|13|SHIRLEY||ADAMS|987654321|987890|20102011|47|101000|DEWITTSCHOOLDISTRICT|A|13100||
10002|14|SHIRLEY||ADAMS|987654321|987890|20142015|51|101000|DEWITTSCHOOLDISTRICT|S|15000||
10002|15|SHIRLEY||ADAMS|987654321|987890|20092010|46|101000|DEWITTSCHOOLDISTRICT|A|13100||
10002|16|SHIRLEY||ADAMS|987654321|987890|20142015|51|101000|DEWITTSCHOOLDISTRICT|P|14||
</code></pre>
<p><strong>Update</strong></p>
<p>As discussed with <a href="https://stackoverflow.com/users/2141635/padraic-cunningham">Padraic Cunningham</a>, the OP may have more than two <code>SSN</code> values. In that case, the best solution is:</p>
<pre><code>import numpy as np
import pandas as pd
# Load data set
data = pd.read_csv(inputFile, delimiter='|')
# Tag
# Number the unique SSNs 10001, 10002, ... in order of first appearance
tag = {k: 10001 + i for i, k in enumerate(data['SSN'].unique())}
data['ID'] = data['SSN'].apply(lambda ssn: tag[ssn])
# Reorder columns
new_cols = np.concatenate((data.columns[-1:], data.columns[:-1]), axis=0)
data = data[new_cols]
# Save file
data.to_csv(outputFile, sep='|', index=False)
</code></pre>
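<p>As a possible alternative to building the dictionary by hand, <code>pd.factorize</code> should give the same numbering, and <code>DataFrame.insert</code> places the new column first without a separate reordering step. This is only a sketch: the three-row sample below is hypothetical and stands in for the real input file, and only a few of the original columns are kept.</p>

```python
import io

import pandas as pd

# Hypothetical sample standing in for the pipe-delimited input file.
sample = io.StringIO(
    "RefID|FirstName|SSN\n"
    "1|JULIE|123456789\n"
    "2|JULIE|123456789\n"
    "3|SHIRLEY|987654321\n"
)
data = pd.read_csv(sample, delimiter="|")

# factorize() returns integer codes 0, 1, 2, ... in order of first appearance.
codes, uniques = pd.factorize(data["SSN"])

# Insert the ID column at position 0, so the columns need no reordering.
data.insert(0, "ID", 10001 + codes)

# With no path given, to_csv() returns the CSV text as a string.
result = data.to_csv(sep="|", index=False)
print(result)
```

<p>For the real file, pass <code>inputFile</code> to <code>read_csv</code> and <code>outputFile</code> to <code>to_csv</code> as in the code above.</p>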