<p>If I understand correctly, you can use <a href="http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.str.extract.html" rel="nofollow noreferrer"><code>str.extract</code></a>:</p>
<pre><code>print (df)
                                                data
0  SubjectUserName=myuser, SubjectDomainName=XX, ...
1  SubjectUserName=myuser, SubjectDomainName=XX, ...
# temporarily set the display width of a column to 100
with pd.option_context('display.max_colwidth', 100):
    print (df.data)
0    SubjectUserName=myuser, SubjectDomainName=XX, TargetUserName=XXXXX, TargetDomainName=XXXXX
1    SubjectUserName=myuser, SubjectDomainName=XX, TargetUserName=XXXXX, TargetDomainName=XXXXX
Name: data, dtype: object
print (df.data.str.extract('SubjectUserName=(.*), SubjectDomainName', expand=False))
0    myuser
1    myuser
Name: data, dtype: object
</code></pre>
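<p>If all four fields are needed rather than just one, the same <code>str.extract</code> idea can be extended with named groups, which become the column names of the result. The following is only a sketch: the sample frame is rebuilt from the data shown above, and the pattern assumes the keys always appear in this order, separated by a comma and a space.</p>

```python
import pandas as pd

# Sample frame rebuilt from the data shown above.
df = pd.DataFrame({
    "data": [
        "SubjectUserName=myuser, SubjectDomainName=XX, TargetUserName=XXXXX, TargetDomainName=XXXXX",
        "SubjectUserName=myuser, SubjectDomainName=XX, TargetUserName=XXXXX, TargetDomainName=XXXXX",
    ]
})

# Each named group becomes a column; [^,]* stops at the next comma.
# Assumption: keys always occur in this order with ", " between them.
pattern = (
    r"SubjectUserName=(?P<SubjectUserName>[^,]*), "
    r"SubjectDomainName=(?P<SubjectDomainName>[^,]*), "
    r"TargetUserName=(?P<TargetUserName>[^,]*), "
    r"TargetDomainName=(?P<TargetDomainName>[^,]*)"
)

fields = df["data"].str.extract(pattern, expand=True)
print(fields)
```

<p>Rows that do not match the pattern come back as <code>NaN</code> in every column, so this is probably only suitable when the log format is fixed.</p>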
<p>Another possible solution is to use <a href="http://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html" rel="nofollow noreferrer"><code>read_csv</code></a> to split the data into <code>4</code> columns on <code>,</code> (the default separator), and then <a href="http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.str.replace.html" rel="nofollow noreferrer"><code>str.replace</code></a>:</p>
<pre><code>import pandas as pd
import numpy as np
# pandas.compat.StringIO was removed in newer pandas versions; use the standard library
from io import StringIO
temp=u"""SubjectUserName=myuser, SubjectDomainName=XX, TargetUserName=XXXXX, TargetDomainName=XXXXX
SubjectUserName=myuser, SubjectDomainName=XX, TargetUserName=XXXXX, TargetDomainName=XXXXX
"""
# after testing, replace 'StringIO(temp)' with 'filename.csv'
df = pd.read_csv(StringIO(temp), skipinitialspace=True, names=['SubjectUserName','SubjectDomainName','TargetUserName','TargetDomainName'])
print (df)
          SubjectUserName     SubjectDomainName        TargetUserName  \
0  SubjectUserName=myuser  SubjectDomainName=XX  TargetUserName=XXXXX
1  SubjectUserName=myuser  SubjectDomainName=XX  TargetUserName=XXXXX

         TargetDomainName
0  TargetDomainName=XXXXX
1  TargetDomainName=XXXXX
print (df.SubjectUserName.str.replace('SubjectUserName=', ''))
0    myuser
1    myuser
Name: SubjectUserName, dtype: object
</code></pre>
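<p>To strip the <code>Key=</code> prefix from every column rather than only <code>SubjectUserName</code>, one option is to split each value on the first <code>=</code> and keep the right-hand part. This is a minimal sketch that assumes every cell has the form <code>Key=value</code>; the name <code>cleaned</code> is only illustrative.</p>

```python
from io import StringIO

import pandas as pd

temp = """SubjectUserName=myuser, SubjectDomainName=XX, TargetUserName=XXXXX, TargetDomainName=XXXXX
SubjectUserName=myuser, SubjectDomainName=XX, TargetUserName=XXXXX, TargetDomainName=XXXXX
"""

names = ["SubjectUserName", "SubjectDomainName", "TargetUserName", "TargetDomainName"]
df = pd.read_csv(StringIO(temp), skipinitialspace=True, names=names)

# For every column, split on the first '=' only and keep what follows it.
# Assumption: each cell looks like 'Key=value'.
cleaned = df.apply(lambda col: col.str.split("=", n=1).str[1])
print(cleaned)
```

<p>Splitting with <code>n=1</code> keeps any further <code>=</code> characters inside the value intact.</p>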