<p>Create a <code>DataFrame</code>, then split its single column into new columns with <a href="http://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.str.split.html" rel="nofollow noreferrer"><code>Series.str.split</code></a> and <code>expand=True</code>:</p>
<pre><code>import numpy as np
import pandas as pd

# Each row holds one semicolon-delimited string
a = np.array([['1;"Female";133;132;124;"118";"64.5";816932'],
['2;"Male";140;150;124;".";"72.5";1001121'],
['3;"Male";139;123;150;"143";"73.3";1038437'],
['4;"Male";133;129;128;"172";"68.8";965353'],
['5;"Female";137;132;134;"147";"65.0";951545'],
['6;"Female";99;90;110;"146";"69.0";928799'],
['7;"Female";138;136;131;"138";"64.5";991305']], dtype=object)
df = pd.DataFrame(a)[0].str.split(';', expand=True)
df.columns = ['ID',"Gender","FSIQ","VIQ","PIQ","Weight","Height","MRI_Count"]
</code></pre>
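<p>Finally, some data cleaning: remove the double quotes (<code>"</code>) with <a href="http://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.str.strip.html" rel="nofollow noreferrer"><code>Series.str.strip</code></a>, and convert the columns to numeric with <a href="http://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.to_numeric.html" rel="nofollow noreferrer"><code>to_numeric</code></a> and <a href="http://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.apply.html" rel="nofollow noreferrer"><code>DataFrame.apply</code></a>. With <code>errors='coerce'</code>, values that cannot be parsed, such as <code>"."</code>, become <code>NaN</code>:</p>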
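<pre><code># Strip the surrounding double quotes from the text column
df['Gender'] = df['Gender'].str.strip('"')
c = ["ID", "FSIQ","VIQ","PIQ","Weight","Height","MRI_Count"]
# Strip quotes, then convert to numbers; unparseable values become NaN
df[c] = df[c].apply(lambda x: pd.to_numeric(x.str.strip('"'), errors='coerce'))
print(df)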
   ID  Gender  FSIQ  VIQ  PIQ  Weight  Height  MRI_Count
0   1  Female   133  132  124   118.0    64.5     816932
1   2    Male   140  150  124     NaN    72.5    1001121
2   3    Male   139  123  150   143.0    73.3    1038437
3   4    Male   133  129  128   172.0    68.8     965353
4   5  Female   137  132  134   147.0    65.0     951545
5   6  Female    99   90  110   146.0    69.0     928799
6   7  Female   138  136  131   138.0    64.5     991305
</code></pre>
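<p>If the rows are available as plain text rather than as a NumPy array, a possible alternative is to let <code>pandas.read_csv</code> handle the splitting, unquoting and type conversion in one step. This is only a sketch under that assumption: the names <code>raw</code>, <code>cols</code> and <code>df2</code> are made up for illustration, and <code>na_values="."</code> is assumed to be the only missing-value marker in the data.</p>

```python
import io

import pandas as pd

# Hypothetical input: the same seven records, joined into one text block.
raw = "\n".join([
    '1;"Female";133;132;124;"118";"64.5";816932',
    '2;"Male";140;150;124;".";"72.5";1001121',
    '3;"Male";139;123;150;"143";"73.3";1038437',
    '4;"Male";133;129;128;"172";"68.8";965353',
    '5;"Female";137;132;134;"147";"65.0";951545',
    '6;"Female";99;90;110;"146";"69.0";928799',
    '7;"Female";138;136;131;"138";"64.5";991305',
])

cols = ["ID", "Gender", "FSIQ", "VIQ", "PIQ", "Weight", "Height", "MRI_Count"]

# sep=";" splits the fields, the default quotechar '"' removes the quotes,
# and na_values="." marks the lone "." entries as missing.
df2 = pd.read_csv(io.StringIO(raw), sep=";", header=None, names=cols, na_values=".")

print(df2.dtypes)
```

<p>This avoids the manual <code>str.strip</code> and <code>to_numeric</code> steps, but it only applies when the data can be passed to the parser as text or as a file.</p>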