Parsing a file into a DataFrame

Posted 2024-10-02 12:28:39


with open("state_towns.txt") as data:
    for line in data:
        print(line)

This prints the following lines:

Colorado[edit]

Alamosa (Adams State College)[2]

Boulder (University of Colorado at Boulder)[12]

Durango (Fort Lewis College)[2]

Connecticut[edit]

Fairfield (Fairfield University, Sacred Heart University)

Middletown (Wesleyan University)

New Britain (Central Connecticut State University)

I want to end up with a DataFrame that has two columns, State and Town, like this:

    State        Town
0   Colorado     Alamosa
1   Colorado     Boulder
2   Colorado     Durango 
3   Connecticut  Fairfield
4   Connecticut  Middletown
5   Connecticut  New Britain

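How can I split the lines so that any line containing "[edit]" goes into the State column?

Also, how can I remove all the text in parentheses from the town entries?

Thanks.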


1 answer
Answer #1 · Posted 2024-10-02 12:28:39
import pandas as pd

d = {"State": [], "Town": []}  # dictionary to hold the data
state = ""  # most recently seen state

with open("state_towns.txt") as data:
    for line in data:
        line = line.strip()  # drop the trailing newline
        if not line:
            continue  # skip blank lines
        if "[edit]" in line:
            state = line.replace("[edit]", "")  # state header: remember it
        elif state:
            town = line.split(" (")[0]  # keep only the text before the first " ("
            d["State"].append(state)
            d["Town"].append(town)

df = pd.DataFrame(d)
print(df)

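It's a bit hacky, but it does the job.

The loop keeps track of the current state and pulls the town name out of each remaining line. Each state/town pair is appended to the dictionary, and once the loop finishes the dictionary is converted to a DataFrame.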
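For comparison, the same result can probably be reached with pandas string methods instead of an explicit loop. The following is only a sketch under a few assumptions: every state header contains the literal text "[edit]", every town name ends at the first "(" or "[", and the helper name `parse_state_towns` is made up for illustration (it is not part of pandas).

```python
import pandas as pd


def parse_state_towns(lines):
    """Hypothetical helper: build a State/Town DataFrame from text lines."""
    # Strip whitespace and drop blank lines.
    s = pd.Series([ln.strip() for ln in lines if ln.strip()], dtype="object")

    # True for state header lines such as "Colorado[edit]".
    is_state = s.str.contains("[edit]", regex=False)

    df = pd.DataFrame({
        # Keep state names only on header rows, then carry them forward.
        "State": s.where(is_state).str.replace("[edit]", "", regex=False).ffill(),
        # Cut everything from the first "(" or "[" to the end of the line.
        "Town": s.str.replace(r"\s*[(\[].*", "", regex=True),
    })

    # Header rows are not towns, so drop them and renumber the index.
    return df[~is_state].reset_index(drop=True)


# In-memory sample taken from the question; a file object works the same way.
sample = [
    "Colorado[edit]\n",
    "\n",
    "Alamosa (Adams State College)[2]\n",
    "Boulder (University of Colorado at Boulder)[12]\n",
    "Connecticut[edit]\n",
    "New Britain (Central Connecticut State University)\n",
]
df = parse_state_towns(sample)
print(df)
```

To read from the file in the question, pass the open file object: `with open("state_towns.txt") as f: df = parse_state_towns(f)`. The forward fill (`ffill`) is what attaches each town to the most recent state header above it.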
