Find a row in a DataFrame and enter values from another DataFrame into specific columns

Published 2024-05-02 10:28:42

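I have two DataFrames. I currently have the code below, but I don't know how to look up each customer ID in DataFrame 2 and put the quantity into the corresponding product column.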

Dataframe 1 (INPUT DF): 
CustomerID Product Quantity
123        Ball    2
123        Leash   2
456        Ball    1

Dataframe 2 (OUTPUT DF):
CustomerID Ball Leash
123        0    0
456        0    0     

I want them to look like this:

Dataframe 1 (INPUT):
CustomerID Product Quantity
123        Ball    2
123        Leash   2
456        Ball    1

Dataframe 2 (OUTPUT):
CustomerID Ball Leash
123        2    2
456        1    0

Let me know if I need to explain further.

### Adding how many for each customer
for index, row in df2.iterrows():
   ID = row["Customer_ID"]
   Product = row["Product_Name"]
   Quantity = row["Quantity"]
   df.loc[df.index[df[ID]], Product] = Quantity

2 Answers

I think you're looking for df.pivot. From the pandas documentation:

Reshape data (produce a “pivot” table) based on column values. Uses unique values from specified index / columns to form axes of the resulting DataFrame. This function does not support data aggregation, multiple values will result in a MultiIndex in the columns. See the User Guide for more on reshaping.

Using pivot in your case puts CustomerID in the index and Product in the columns:

In [4]: df.pivot(index='CustomerID', columns='Product', values='Quantity')
Out[4]: 
Product     Ball  Leash
CustomerID             
123          2.0    2.0
456          1.0    NaN

You can then use fillna to get 0 in the remaining cells.
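
As a minimal sketch of the pivot-then-fill approach described above, assuming the input frame is named df1 and has the columns shown in the question (the variable names df1 and wide are only illustrative), the pivot and the fill might be chained like this:

```python
import pandas as pd

# Input frame rebuilt from the question's example data
df1 = pd.DataFrame({
    "CustomerID": [123, 123, 456],
    "Product": ["Ball", "Leash", "Ball"],
    "Quantity": [2, 2, 1],
})

wide = (
    df1.pivot(index="CustomerID", columns="Product", values="Quantity")
       .fillna(0)      # customers who never bought a product get 0
       .astype(int)    # pivot produced floats because of the NaN
       .reset_index()  # turn CustomerID back into a regular column
)
wide.columns.name = None  # drop the leftover "Product" axis label

print(wide)
#    CustomerID  Ball  Leash
# 0         123     2      2
# 1         456     1      0
```

Note that pivot raises a ValueError if the same CustomerID/Product pair occurs more than once, so this only works when each pair is unique.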

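If I understand correctly, you want to fill in the quantity of each product for each customer from df1. I would build the result from df1 itself:

  1. Create one column per product, holding the corresponding quantity
  2. Group by CustomerID and sum all the product columns
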
import pandas as pd
from io import StringIO

# Build the input frame from semicolon-separated text
csv_data = StringIO("""CustomerID;Product;Quantity
123;Ball;2
123;Leash;2
456;Ball;1""")

df1 = pd.read_csv(csv_data, sep=";")
unique_columns = list(df1["Product"].unique())

def productsAsColumns(row):
    # One entry per product: 0 everywhere except the product in this row
    columns = {c: 0 for c in unique_columns}
    columns[row["Product"]] = row["Quantity"]
    return columns

# Step 1: expand each row into one column per product
df1[unique_columns] = df1.apply(productsAsColumns, axis=1, result_type="expand")
df1 = df1.drop(columns=["Product", "Quantity"])
# Step 2: sum the product columns for each customer
df1 = df1.groupby("CustomerID")[unique_columns].sum().reset_index()
print(df1)

Output

   CustomerID  Ball  Leash
0         123     2      2
1         456     1      0
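
The same two steps (one column per product, then a per-customer sum) can probably also be expressed in a single call with pivot_table, which aggregates repeated CustomerID/Product pairs instead of failing on them. This is only a sketch: the extra fourth row below is made-up data added to show the aggregation, and the name totals is illustrative.

```python
import pandas as pd

# Example data; the last row is a hypothetical repeat purchase
# (customer 456 buys Ball a second time) to show the summing.
df1 = pd.DataFrame({
    "CustomerID": [123, 123, 456, 456],
    "Product": ["Ball", "Leash", "Ball", "Ball"],
    "Quantity": [2, 2, 1, 3],
})

totals = (
    df1.pivot_table(
        index="CustomerID",
        columns="Product",
        values="Quantity",
        aggfunc="sum",   # add up repeated CustomerID/Product pairs
        fill_value=0,    # missing combinations become 0 instead of NaN
    )
    .astype(int)
    .reset_index()
)
totals.columns.name = None  # drop the leftover "Product" axis label

print(totals)
#    CustomerID  Ball  Leash
# 0         123     2      2
# 1         456     4      0
```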
