<p>Suppose your file is named <code>file_of_text.txt</code> and contains the following:</p>
<pre><code>level country ^ layla
hello sandra ^ organization
hello people ^ layla
hello samar ^ organization
</code></pre>
<p>You can use the following code to load the data from the file into a DataFrame that resembles your desired output:</p>
<pre><code>import re
import pandas as pd

def main(myfile):
    # Open the file and read the lines (the file is closed automatically)
    with open(myfile, 'r') as f:
        text = f.readlines()
    # Split each line on whitespace, treating the "^" separator as part of the gap
    text = [re.split(r"\s[\^\s]*", line.strip()) for line in text]
    # Put it in a DataFrame
    data = pd.DataFrame(text, columns=['A', 'B', 'C'])
    # Create an output DataFrame with rows "item0" and "item1"
    final_data = pd.DataFrame(['item0', 'item1'], columns=['D'])
    # Group column B by column C and collect each group into a tuple
    final_data['E'] = data.groupby('C')['B'].apply(lambda x: tuple(x.values)).values
    print(final_data)

if __name__ == "__main__":
    myfile = "file_of_text.txt"
    main(myfile)
</code></pre>
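<p>The idea is to read the lines from the text file and split each one with the <code>split</code> function from the <code>re</code> module. The result is passed to the <code>DataFrame</code> constructor to build a DataFrame named <code>data</code>, which is then grouped by column <code>C</code> to create the desired DataFrame <code>final_data</code>. The two DataFrames should look like this:</p>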
<pre><code># data
       A        B             C
0  level  country         layla
1  hello   sandra  organization
2  hello   people         layla
3  hello    samar  organization
# final_data
       D                  E
0  item0  (country, people)
1  item1    (sandra, samar)
</code></pre>
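<p>The script above hardcodes the two row labels <code>item0</code> and <code>item1</code>, so it only works when column <code>C</code> has exactly two distinct values. As a rough sketch, and not necessarily what you need, the variant below derives the labels from the number of groups. The helper names <code>split_line</code> and <code>group_column_b</code> are made up for illustration, the extra <code>C</code> column is kept only to show which group each row came from, and the sample lines are held in memory instead of being read from the file:</p>

```python
import re

import pandas as pd

# Sample lines copied from file_of_text.txt
lines = [
    "level country ^ layla",
    "hello sandra ^ organization",
    "hello people ^ layla",
    "hello samar ^ organization",
]


def split_line(line):
    # Hypothetical helper: split on whitespace, absorbing the "^" separator
    return re.split(r"\s[\^\s]*", line.strip())


def group_column_b(rows):
    # Hypothetical helper: rows is a list of [A, B, C] lists
    data = pd.DataFrame(rows, columns=["A", "B", "C"])
    # One tuple of B values per distinct value of C (groups are sorted by C)
    grouped = data.groupby("C")["B"].apply(lambda s: tuple(s.values))
    grouped = grouped.reset_index(name="E")
    # Generate item0, item1, ... for however many groups exist
    grouped.insert(0, "D", [f"item{i}" for i in range(len(grouped))])
    return grouped


rows = [split_line(line) for line in lines]
result = group_column_b(rows)
print(result)
```

<p>Because <code>groupby</code> sorts the group keys by default, <code>item0</code> corresponds to <code>layla</code> and <code>item1</code> to <code>organization</code>, the same order as in the output above.</p>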
<p>If anything is unclear, take a look at the script and feel free to ask follow-up questions.</p>
<p>I hope this helps.</p>