Get a new column from rows of multiple columns (where the entries are lists)

Posted 2024-09-24 02:22:16


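I have a dictionary that I want to turn into a DataFrame, and then I want to combine some of the DataFrame's columns into one.

My dictionary looks like this: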

mydict = {'Participants': {'source': ['1', '2', '3'],
                           'name': ['A', 'B', 'C'],
                           'Entry (1)': ['Address1', 'Address2', 'Address3'],
                           'Entry (2)': ['Number1', 'Number2', 'Number2'],
                           'Entry (3)': ['Start1', 'Start2', 'Start3']},
            'Countries': {'DK': ['1', '2', '3'],
                      'UK': ['1', '3', '2'],
                      'CDN': ['3', '2', '1'],
                      'FR': ['1', '2', '3']}}

The DataFrame is created with df = pd.DataFrame(mydict).

df:

           Countries                    Participants
CDN        [3, 2, 1]                             NaN
DK         [1, 2, 3]                             NaN
Entry (1)        NaN  [Address1, Address2, Address3]
Entry (2)        NaN     [Number1, Number2, Number2]
Entry (3)        NaN        [Start1, Start2, Start3]
FR         [1, 2, 3]                             NaN
UK         [1, 3, 2]                             NaN
name             NaN                       [A, B, C]
source           NaN                       [1, 2, 3]

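I have several "Entry (n)" columns (shown as rows in the DataFrame above) that hold the Address, Number, and Start information for each participant (df['Participants']['name']). What I need now is an additional column, "Entries", that combines the information from Entry (1), Entry (2), and Entry (3) for each participant. Because the number of entries (Entry (n)) differs between data sources, I collect the entry names like this: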

entries = re.findall(r'Entry \(\d\)', str(mydict['Participants'].keys()))

This leaves a list of all the entries: ['Entry (1)', 'Entry (2)', 'Entry (3)'].

In the end I would like a DataFrame like this:
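As a side note, the pattern above matches only single-digit entry numbers and searches the string form of the keys. A minimal sketch of a variant that matches each key directly and orders the results numerically is shown below. The helper name entry_keys and the variable participants are made up for illustration and are not part of the question.

```python
import re

# Same shape as mydict['Participants'] in the question.
participants = {
    'source': ['1', '2', '3'],
    'name': ['A', 'B', 'C'],
    'Entry (1)': ['Address1', 'Address2', 'Address3'],
    'Entry (2)': ['Number1', 'Number2', 'Number2'],
    'Entry (3)': ['Start1', 'Start2', 'Start3'],
}

# \d+ accepts entry numbers with more than one digit, e.g. 'Entry (10)'.
entry_pattern = re.compile(r'Entry \((\d+)\)')


def entry_keys(mapping):
    """Return the keys of the form 'Entry (n)', ordered by n (hypothetical helper)."""
    numbered = []
    for key in mapping:
        match = entry_pattern.fullmatch(key)
        if match:
            numbered.append((int(match.group(1)), key))
    return [key for _, key in sorted(numbered)]


entries = entry_keys(participants)
print(entries)
```

Sorting on the captured number rather than on the key text keeps 'Entry (10)' after 'Entry (2)', which plain string sorting would not.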

           Countries                    Participants
CDN        [3, 2, 1]                             NaN
DK         [1, 2, 3]                             NaN
Entry (1)        NaN  [Address1, Address2, Address3]
Entry (2)        NaN  [Number1, Number2, Number2]
Entry (3)        NaN  [Start1, Start2, Start3]
Entries          NaN  ['Address1\nNumber1\nStart1', 'Address2\nNumber2\nStart2', 'Address3\nNumber2\nStart3']  <<-- I need this
FR         [1, 2, 3]                             NaN
UK         [1, 3, 2]                             NaN
name             NaN                       [A, B, C]
source           NaN                       [1, 2, 3]

Can anyone show me a concrete way to achieve this?


2 Answers

It looks like you need

s = pd.DataFrame(df.filter(like='Entry', axis=0).Participants.tolist()).apply('\n'.join).tolist()
df.loc['Entries', 'Participants'] = s
df
Out[64]: 
                                                Participants  Countries
CDN                                                      NaN  [3, 2, 1]
DK                                                       NaN  [1, 2, 3]
Entry (1)                     [Address1, Address2, Address3]        NaN
Entry (2)                        [Number1, Number2, Number2]        NaN
Entry (3)                           [Start1, Start2, Start3]        NaN
FR                                                       NaN  [1, 2, 3]
UK                                                       NaN  [1, 3, 2]
name                                               [A, B, C]        NaN
source                                             [1, 2, 3]        NaN
Entries    [Address1\nNumber1\nStart1, Address2\nNumber2\...        NaN

Note that you can add sort_index at the end.

Let's try this:
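Whether a list can be assigned to a single new cell may depend on the pandas version. As a hedged alternative, here is a sketch that builds the new row as a one-row DataFrame, appends it with pd.concat, and then calls sort_index. The names entry_lists, combined, new_row, and result are chosen for illustration only.

```python
import pandas as pd

mydict = {'Participants': {'source': ['1', '2', '3'],
                           'name': ['A', 'B', 'C'],
                           'Entry (1)': ['Address1', 'Address2', 'Address3'],
                           'Entry (2)': ['Number1', 'Number2', 'Number2'],
                           'Entry (3)': ['Start1', 'Start2', 'Start3']},
          'Countries': {'DK': ['1', '2', '3'],
                        'UK': ['1', '3', '2'],
                        'CDN': ['3', '2', '1'],
                        'FR': ['1', '2', '3']}}

df = pd.DataFrame(mydict)

# Select the rows whose label contains 'Entry'; each cell holds one list.
entry_lists = df.filter(like='Entry', axis=0)['Participants']

# zip(*...) regroups the lists per participant; join each group with newlines.
combined = ['\n'.join(parts) for parts in zip(*entry_lists)]

# Wrapping the list in another list puts the whole list into a single cell.
new_row = pd.DataFrame({'Participants': [combined]}, index=['Entries'])

# Append the new row and sort the index labels alphabetically.
result = pd.concat([df, new_row]).sort_index()
print(result)
```

The 'Countries' cell of the new row is left as NaN because new_row has no such column.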

df.at['Entries','Participants'] = ['\n'.join(i) for i in (zip(*df.loc[['Entry (1)', 'Entry (2)', 'Entry (3)'], 'Participants']))]

Borrowing from @W-B's solution, using filter instead of a list of index labels:

df.at['Entries','Participants'] = ['\n'.join(i) for i in (zip(*df.filter(like='Entry', axis=0)['Participants']))]
df.sort_index()

Output:

                                                Participants  Countries
CDN                                                      NaN  [3, 2, 1]
DK                                                       NaN  [1, 2, 3]
Entries    [Address1\nNumber1\nStart1, Address2\nNumber2\...        NaN
Entry (1)                     [Address1, Address2, Address3]        NaN
Entry (2)                        [Number1, Number2, Number2]        NaN
Entry (3)                           [Start1, Start2, Start3]        NaN
FR                                                       NaN  [1, 2, 3]
UK                                                       NaN  [1, 3, 2]
name                                               [A, B, C]        NaN
source                                             [1, 2, 3]        NaN
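To see what zip(*...) contributes here, the idiom can be tried on plain lists without pandas. This is only an illustrative sketch with the question's data typed in by hand. The names entry_lists, per_participant, and combined are not from the answers.

```python
# One list per 'Entry (n)' row, as in the question's data.
entry_lists = [
    ['Address1', 'Address2', 'Address3'],  # Entry (1)
    ['Number1', 'Number2', 'Number2'],     # Entry (2)
    ['Start1', 'Start2', 'Start3'],        # Entry (3)
]

# zip(*entry_lists) transposes the lists: one tuple per participant.
per_participant = list(zip(*entry_lists))
print(per_participant[0])

# Join each participant's values with a newline.
combined = ['\n'.join(parts) for parts in per_participant]
print(combined)
```

Because zip stops at the shortest input, this assumes every entry list has one value per participant.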
