<p>You can get the answer with a <a href="https://docs.python.org/3/howto/regex.html" rel="nofollow noreferrer">regex</a>; the link points to the documentation.</p>
<p>First, I replace every <code>\n</code> with <code>''</code>. This removes all line breaks from the <code>Details</code> column.</p>
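<p>A small sketch of why this step matters, using only the standard <code>re</code> module. The sample string is a shortened version of one <code>Details</code> value and the variable names are made up for illustration. By default <code>.</code> does not match a line break, so the pattern finds nothing until the line breaks are gone:</p>
<pre><code>import re

# Shortened sample value, for illustration only.
text = 'Type:\nHouse\nVector:\nTriangle'

# With the line breaks still in place, `.` stops at each one, so nothing matches.
before = re.search(r'Type:(.*)Vector', text)

# After removing the line breaks, the same pattern captures the text between the keywords.
flat = text.replace('\n', '')
after = re.search(r'Type:(.*)Vector', flat)

print(before)          # None
print(after.group(1))  # House
</code></pre>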
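<p>Then I grab all the text between two keywords.
For Type, the data lies between <code>'Type:'</code> and <code>'Vector:'</code>. The same goes for Vector and Mission. Note that for the note I grab everything after <code>'Note user:'</code>. Once the data has been extracted from the <code>Details</code> column, that column can be dropped.</p>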
<pre><code>import pandas as pd
data = {'ID': ['A0001', 'A0002', 'A0003', 'A0004', 'A0005'],
'Name': ['John', 'Micheal', 'Angle', 'Jim', 'Rome'],
'Details': ['Type:\nHouse\nVector:\nTriangle\n\nMission:\nCompleted,lv5\n\nNote user:\n#', 'Type:\n#\nVector:\n\n\nMission:\nFailed\nNote user:\n#', 'Type:\nCar\nVector:\nSquare\nMission:\nCompleted\nNote user:\n', 'Type:\n#\nVector:\n#\nMission:\nCompleted without award\n\nNote user:\nNo end', 'Type:\n#\nVector:\n#\nMission:\n\n\nNote user:\nThere are many mistake.\nI cant choose.\nI cant buy.']
}
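df = pd.DataFrame(data, columns=['ID', 'Name', 'Details'])

# Remove line breaks so that `.` can match across what used to be separate lines.
df['Details'] = df.Details.str.replace('\n', '', regex=True)

# Capture the text between each pair of keywords (":" needs no escaping in a regex).
df['Type'] = df.Details.str.extract(r'Type:(.*)Vector', expand=False)
df['Vector'] = df.Details.str.extract(r'Vector:(.*)Mission', expand=False)
df['Mission'] = df.Details.str.extract(r'Mission:(.*)Note', expand=False)
# Everything after the last keyword belongs to the note.
df['Note'] = df.Details.str.extract(r'Note user:(.*)', expand=False)

print(df[['ID', 'Name', 'Type', 'Vector']])
print(df[['Mission', 'Note']])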
</code></pre>
<p>The output will be:</p>
<pre><code>      ID     Name   Type    Vector
0  A0001     John  House  Triangle
1  A0002  Micheal      #
2  A0003    Angle    Car    Square
3  A0004      Jim      #         #
4  A0005     Rome      #         #
                   Mission                                              Note
0            Completed,lv5                                                 #
1                   Failed                                                 #
2                Completed
3  Completed without award                                            No end
4                           There are many mistake.I cant choose.I cant buy.
</code></pre>
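<p>As a variation, the extractions can be driven by one list of keywords, with the <code>Details</code> column dropped at the end. This is only a sketch: <code>split_details</code> and <code>keywords</code> are names made up for illustration (they are not part of pandas), the sample frame is cut down to two rows, and it assumes the keywords always appear in this order.</p>
<pre><code>import re
import pandas as pd

# Hypothetical helper, not part of pandas: one new column per keyword.
def split_details(df, keywords):
    # Remove line breaks first (literal replacement, no regex needed).
    flat = df['Details'].str.replace('\n', '', regex=False)
    # Work on a copy without the Details column.
    out = df.drop(columns=['Details'])
    for i, (column, keyword) in enumerate(keywords):
        # Capture up to the next keyword, or to the end of the text for the last one.
        if i + 1 == len(keywords):
            stop = '$'
        else:
            stop = re.escape(keywords[i + 1][1])
        pattern = re.escape(keyword) + '(.*?)' + stop
        out[column] = flat.str.extract(pattern, expand=False)
    return out

# (new column name, keyword that starts the field), in order of appearance.
keywords = [('Type', 'Type:'), ('Vector', 'Vector:'),
            ('Mission', 'Mission:'), ('Note', 'Note user:')]

df = pd.DataFrame({
    'ID': ['A0001', 'A0002'],
    'Details': ['Type:\nHouse\nVector:\nTriangle\n\nMission:\nCompleted,lv5\n\nNote user:\n#',
                'Type:\n#\nVector:\n\n\nMission:\nFailed\nNote user:\n#'],
})

result = split_details(df, keywords)
print(result)
</code></pre>
<p>Keeping the keywords in one list means a new field only needs one more entry, at the cost of relying on the fields always appearing in the same order.</p>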