大Pandas中的python vlookup应用%LIKE%

3条回答

网友

1楼 · 编辑于 2024-05-19 10:23:36

通过Series解决^{}：

from  itertools import chain

#split values by `,` to lists
lens = df2['Phone'].str.split(',')
#if some zero list remove it
df2 = df2.dropna(subset=['Phone'])

#explode Names by length of lists, flat values by chain.from_iterable
s = pd.Series(np.repeat(df2.Name.values, lens), 
              index= list(chain.from_iterable(df2.Phone.values)))
#convert index to int for match
s.index = s.index.astype(int)
print (s)
123     Winnie
234     Winnie
456     Winnie
777    Celeste
888    Celeste
999    Celeste
dtype: object

#map values to new column
df1['RecipientName'] = df1['Recipient'].map(s)
print(df1)
  Sender  Recipient RecipientName
0   Adam        123        Winnie
1   Alex        234        Winnie
2   John        123        Winnie
3   Adam        888       Celeste

#write to csv
df.to_csv('out.csv', sep=';', header=None)

Sender;Recipient;RecipientName
Adam;123;Winnie
Alex;234;Winnie
John;123;Winnie
Adam;888;Celeste

与^{}的解决方案类似：

df2['Phone'] = df2['Phone'].str.split(',')
df2 = df2.dropna(subset=['Phone'])

s = pd.Series(np.repeat(df2.Name.values, df2.Phone.str.len()), 
              index= list(chain.from_iterable(df2.Phone.values)))
s.index = s.index.astype(int)
s.name = 'RecipientName'
print (s)

df1 = df1.join(s, on='Recipient')
print(df1)
  Sender  Recipient RecipientName
0   Adam        123        Winnie
1   Alex        234        Winnie
2   John        123        Winnie
3   Adam        888       Celeste

编辑：

我的数据示例：

import pandas as pd
from pandas.compat import StringIO

temp=u"""
Sender;Recipient
Adam;123
Alex;234
John;123
Adam;888"""
#after testing replace 'StringIO(temp)' to 'filename.csv'
df1 = pd.read_csv(StringIO(temp), sep=";")
print (df1)
  Sender  Recipient
0   Adam        123
1   Alex        234
2   John        123
3   Adam        888

temp=u"""
Name;Phone
Winnie;123,234,456
Celeste;777,888,999"""
#after testing replace 'StringIO(temp)' to 'filename.csv'
df2 = pd.read_csv(StringIO(temp), sep=";")
print (df2)
      Name        Phone
0   Winnie  123,234,456
1  Celeste  777,888,999

网友

2楼 · 编辑于 2024-05-19 10:23:36

使用str.split将Phone列转换为列表
使用str.len()查找每个列表的长度。我们将使用它来分解'Name'列
把所有的清单放在一起。确保过滤掉长度为零的列表
使用repeat分解'Name'
创建一个字典，其中键是电话号码，值是姓名
创建一个d1的副本，在这里我们使用map和我们创建的新字典添加了新列。你知道吗

p = d2.Phone.str.split(',')
p = p[p.astype(bool)]
l = p.str.len()
p2 = np.concatenate(p.values).astype(int)
nm = d2.Name.repeat(l)
m = dict(zip(p2, nm))

df = d1.assign(RecipientName=d1.Recipient.map(m))
print(df)

  Sender  Recipient RecipientName
0   Adam        123        Winnie
1   Alex        234        Winnie
2   John        123        Winnie
3   Adam        888       Celeste

df.to_csv('out.csv', sep=';', header=None)

Sender;Recipient;RecipientName
Adam;123;Winnie
Alex;234;Winnie
John;123;Winnie
Adam;888;Celeste

网友

3楼 · 编辑于 2024-05-19 10:23:36

下面是一些伪代码和如何做到这一点的想法。你知道吗

我将从分析CSV2文件开始。跳过第一行，然后为下面的行解析出姓名和电话号码，然后维护一个字典，其中的姓名与每个电话号码相关联。你知道吗

numbers_to_names = {}
for line in open("csv2", "r").splitlines():
    name, phone_numbers = line.split(";")
    for phone_number in phone_numbers.split(","):
        numbers_to_names[phone_number] = name

然后当再次通过CSV1时，跳过第一行，然后解析出发送者和接收者，并与以前的字典结果相结合。你知道吗

for line in open("csv1", "r").splitlines():
    sender, recipient = line.split(";")
    print "%s;%s;%s" % (sender, recipient, numbers_to_names[recipient])

相关问题更多 >

编程相关推荐

热门问题

热门文章

大Pandas中的python vlookup应用%LIKE%

相关问题 更多 >

编程相关推荐

热门问题

热门文章

相关问题更多 >