Pandas DataFrame: split one column into two columns based on a condition

Posted 2024-09-28 01:25:13

I have a DataFrame, shown below:

             0  
    ____________________________________
0     Country| India  
60        Delhi  
62       Mumbai  
68       Chennai  
75    Country| Italy  
78        Rome  
80       Venice  
85        Milan  
88    Country| Australia  
100      Sydney  
103      Melbourne  
107      Perth  

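I want to split the data into two columns, so that one column holds the country and the other holds the city. I don't know where to start. I would like something like this: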

             0                    1
    ____________________________________
0     Country| India           Delhi
1     Country| India           Mumbai
2     Country| India           Chennai         
3    Country| Italy           Rome
4    Country| Italy           Venice   
5    Country| Italy           Milan        
6    Country| Australia       Sydney
7   Country| Australia       Melbourne
8   Country| Australia       Perth     

Any idea how to do this?


2 answers

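Use Series.where with Series.str.startswith to replace the non-matching values with missing values, forward-fill those missing values with ffill, and then remove the rows where both values are the same, using Series.ne (not equal) in boolean indexing: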

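# Put the header rows in a new 'country' column (NaN elsewhere), then forward-fill.
df.insert(0, 'country', df[0].where(df[0].str.startswith('Country')).ffill())
# Drop the header rows themselves, i.e. rows where 'country' equals the original value.
df = df[df['country'].ne(df[0])].reset_index(drop=True).rename(columns={0:'city'})
print(df)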
             country       city
0      Country|India      Delhi
1      Country|India     Mumbai
2      Country|India    Chennai
3      Country|Italy       Rome
4      Country|Italy     Venice
5      Country|Italy      Milan
6  Country|Australia     Sydney
7  Country|Australia  Melbourne
8  Country|Australia      Perth
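
For anyone who wants to run the approach above end to end, here is a minimal self-contained sketch. The sample frame is rebuilt from the question; the integer column label 0 and the index labels are assumptions about the original data, and the names is_country, country and result are only illustrative.

```python
import pandas as pd

# Sample data rebuilt from the question (column label 0 and the index
# labels are assumptions about how the original frame was built).
df = pd.DataFrame(
    {0: ["Country| India", "Delhi", "Mumbai", "Chennai",
         "Country| Italy", "Rome", "Venice", "Milan",
         "Country| Australia", "Sydney", "Melbourne", "Perth"]},
    index=[0, 60, 62, 68, 75, 78, 80, 85, 88, 100, 103, 107],
)

# True for the header rows, False for the city rows.
is_country = df[0].str.startswith("Country")

# Keep only the header values, turn everything else into NaN, fill down.
country = df[0].where(is_country).ffill()

# Pair each city with its country and drop the header rows themselves.
result = (
    pd.DataFrame({"country": country, "city": df[0]})
    .loc[~is_country]
    .reset_index(drop=True)
)
print(result)
```

Filtering with the boolean mask instead of comparing the two columns is a small variation; it should give the same rows as long as no city name starts with "Country".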

Find the rows that contain | and pull them into another column, then fill down the newly created column:

(
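    # this assumes the column label is the string "0"; use {0: "city"}
    # instead if the label is the integer 0
    df.rename(columns={"0": "city"})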
    # this looks for rows that contain '|' and puts them into a 
    # new column called Country. rows that do not match will be
    # null in the new column.
    .assign(Country=lambda x: x.loc[x.city.str.contains(r"\|"), "city"])
    # fill down on the Country column, this also has the benefit
    # of linking the Country with the City, 
    .ffill()
    # here we get rid of duplicate Country entries in city and Country
    # this ensures that only Country entries are in the Country column
    # and cities are in the City column
    .query("city != Country")
    # here we reverse the column positions to match your expected output 
    .iloc[:, ::-1]
)


      Country           city
60  Country| India      Delhi
62  Country| India      Mumbai
68  Country| India      Chennai
78  Country| Italy      Rome
80  Country| Italy      Venice
85  Country| Italy      Milan
100 Country| Australia  Sydney
103 Country| Australia  Melbourne
107 Country| Australia  Perth
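
As a rough, runnable sketch of the same chain, the snippet below uses a shortened, hypothetical sample frame whose column is labelled with the string "0". It passes regex=False to str.contains so that | is matched literally and needs no escaping. Note that this variant keeps the original index labels rather than resetting them.

```python
import pandas as pd

# Hypothetical, shortened sample; the string column label "0" is an assumption.
df = pd.DataFrame(
    {"0": ["Country| India", "Delhi", "Mumbai", "Country| Italy", "Rome"]},
    index=[0, 60, 62, 75, 78],
)

out = (
    df.rename(columns={"0": "city"})
    # select only the rows containing a literal "|"; assign aligns on the
    # index, so every other row gets NaN in the new Country column
    .assign(Country=lambda x: x.loc[x["city"].str.contains("|", regex=False), "city"])
    # fill each country name down over the city rows that follow it
    .ffill()
    # drop the header rows, where city and Country hold the same value
    .query("city != Country")
    # reverse the column order so Country comes first
    .iloc[:, ::-1]
)
print(out)
```

Appending .reset_index(drop=True) to the chain would renumber the rows from 0, as in the expected output in the question.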
