Reading and manipulating files in Python

def text_file(file):
    list=[]
    file_of_text = "text.txt"
    with open(file_of_context) as f:
        for l in f:
            l_dict = l.split(" ")
            list.append(l_dict)
    return(list)

def items(file_of_text):
    list_of_items= text_file(file_of_text)
    for a in list_of_items:
        for b in a:
            if a[-1]==

def main():
    file_of_text = "text.txt"

if __name__ == "__main__":
    main()

2 Answers

User

#1 · Edited 2024-10-03 02:33:53

Start by reading the file with "^" as the delimiter and arbitrary column names:

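import pandas as pd

# '^' is a regex metacharacter, so it is escaped inside a raw string;
# a regex separator requires the python parsing engine
df = pd.read_csv('data.csv', delimiter=r'\^', names=['A', 'B'], engine='python')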
print (df)
                A              B
0  level country           layla
1  hello sandra     organization
2   hello people           layla
3   hello samar     organization
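
To try this step without a file on disk, here is a minimal sketch that feeds the sample lines through `io.StringIO`. The helper name `read_caret_file` and the inlined `SAMPLE` text are assumptions made for illustration, and the separator `r"\s*\^\s*"` is a choice made for this sketch so that the spaces around "^" do not end up in the values.

```python
import io
import pandas as pd

# Sample data inlined so the example runs without data.csv (an assumption).
SAMPLE = (
    "level country ^ layla\n"
    "hello sandra  ^ organization\n"
    "hello people ^ layla\n"
    "hello samar  ^ organization\n"
)

def read_caret_file(source):
    """Read '^'-separated text into columns A and B, trimming nearby whitespace."""
    # A multi-character separator is treated as a regex and needs the python engine.
    return pd.read_csv(source, sep=r"\s*\^\s*", names=["A", "B"], engine="python")

df = read_caret_file(io.StringIO(SAMPLE))
print(df)
```

Trimming in the separator keeps the group keys in column B free of leading spaces, which matters once the rows are grouped by that column.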

Then we split column A to get the value we want. I believe the expand argument is new as of pandas 0.16.

df['A'] = df['A'].str.split(' ', expand=True)[1]
print(df)
         A              B
0  country          layla
1   sandra   organization
2   people          layla
3    samar   organization
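
As a small illustration of what the split does, the sketch below runs it on two hand-written rows (the inline DataFrame is an assumption, not data read from the file). It also shows `.str[1]` as an alternative way to pick the second token without building the intermediate DataFrame.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["level country", "hello sandra"],
    "B": ["layla", "organization"],
})

# expand=True returns a DataFrame with one column per token;
# column 1 holds the second word of each row.
parts = df["A"].str.split(" ", expand=True)
second = parts[1]

# Alternative: keep the lists of tokens and index into each one.
second_alt = df["A"].str.split(" ").str[1]

print(second.tolist())
```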

Then we group by column B and apply the tuple function. Note: we reset the index so that we can use it later.

g = df.groupby('B')['A'].apply(tuple).reset_index()
print(g)
              B                  A
0          layla  (country, people)
1   organization    (sandra, samar)
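
The grouping step can be checked in isolation with a hand-built DataFrame, as in this rough sketch (the inline data is an assumption that mirrors the table above). Note that `groupby` sorts the group keys by default, which is why "layla" comes before "organization".

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["country", "sandra", "people", "samar"],
    "B": ["layla", "organization", "layla", "organization"],
})

# Collect the values of A for each distinct B into a tuple,
# then turn the group keys back into an ordinary column.
g = df.groupby("B")["A"].apply(tuple).reset_index()
print(g)
```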

Create a new column from the string "item" and the index:

g['item'] = 'item' + g.index.astype(str)
print(g[['item', 'A']])
    item                  A
0  item0  (country, people)
1  item1    (sandra, samar)
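
If a plain mapping is more convenient than a DataFrame, the labelled rows can be zipped into a dict. This is a hypothetical follow-up step, not part of the answer above; the variable name `items` and the hand-built `g` are assumptions for illustration.

```python
import pandas as pd

# Stand-in for the grouped result from the previous step.
g = pd.DataFrame({
    "B": ["layla", "organization"],
    "A": [("country", "people"), ("sandra", "samar")],
})

# Build the labels item0, item1, ... from the positional index.
g["item"] = "item" + g.index.astype(str)

# Pair each label with its tuple of values.
items = dict(zip(g["item"], g["A"]))
print(items)
```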

User

#2 · Edited 2024-10-03 02:33:53

Suppose your file is named file_of_text.txt and contains the following:

level country ^ layla
hello sandra  ^ organization
hello people ^ layla
hello samar  ^ organization

You can use the following code to load the data from the file into a DataFrame similar to the desired output:

import re
import pandas as pd

def main(myfile):
    # Open the file and read the lines
    with open(myfile, 'r') as f:
        text = f.readlines()

    # Split the lines into lists
    text = list(map(lambda x: re.split(r"\s[\^\s]*",x.strip()), text))

    # Put it in a DataFrame
    data = pd.DataFrame(text, columns = ['A','B','C'])

    # Create an output DataFrame with rows "item0" and "item1"
    final_data = pd.DataFrame(['item0','item1'],columns=['D'])

    # Create your desired column
    final_data['E'] = data.groupby('C')['B'].apply(lambda x: tuple(x.values)).values

    print(final_data)

if __name__ == "__main__":
    myfile = "file_of_text.txt"
    main(myfile)

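The idea is to read the lines from the text file and split each one with the split function from the re module. The result is then passed to the DataFrame constructor to build a DataFrame named data, which is used to create the desired DataFrame final_data. The result should look like this: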

# data

       A        B             C
0  level  country         layla
1  hello   sandra  organization
2  hello   people         layla
3  hello    samar  organization


# final_data

       D                  E
0  item0  (country, people)
1  item1    (sandra, samar)
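
For comparison, the same grouping can be sketched without pandas, using only `re` and `collections.defaultdict`. This is a different technique from the answer above, and the function name `group_second_words` is hypothetical. Groups are numbered in order of first appearance rather than sorted, which happens to give the same labels for this sample; lines are assumed to contain one "^" and at least two words before it.

```python
import re
from collections import defaultdict

def group_second_words(lines):
    """Group the second word of each line by the text after '^'."""
    groups = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split once on '^', dropping the whitespace around it.
        left, right = re.split(r"\s*\^\s*", line, maxsplit=1)
        groups[right].append(left.split()[1])
    # Label the groups item0, item1, ... in order of first appearance.
    return {f"item{i}": tuple(words) for i, words in enumerate(groups.values())}

sample = [
    "level country ^ layla",
    "hello sandra  ^ organization",
    "hello people ^ layla",
    "hello samar  ^ organization",
]
result = group_second_words(sample)
print(result)
```

Unlike the hard-coded ['item0', 'item1'] list in the script above, the labels here are generated from however many groups the input contains.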

If anything is unclear, take a look at the script and feel free to ask further questions.

I hope this helps.
