Look up a row in a DataFrame and write values from another DataFrame into specific columns

Dataframe 1 (INPUT DF):

CustomerID  Product  Quantity
123         Ball     2
123         Leash    2
456         Ball     1

Dataframe 2 (OUTPUT DF):

CustomerID  Ball  Leash
123         0     0
456         0     0

### Adding how many for each customer
for index, row in df2.iterrows():
    ID = row["Customer_ID"]
    Product = row["Product_Name"]
    Quantity = row["Quantity"]
    df.loc[df.index[df[ID]], Product] = Quantity

2 Answers

User

Answer 1 · edited 2024-05-02 10:28:42

I think you are looking for df.pivot. From the pandas documentation:

Reshape data (produce a “pivot” table) based on column values. Uses unique values from specified index / columns to form axes of the resulting DataFrame. This function does not support data aggregation, multiple values will result in a MultiIndex in the columns. See the User Guide for more on reshaping.

Using pivot on your data puts CustomerID in the index and Product in the columns:

In [4]: df.pivot(index='CustomerID', columns='Product', values='Quantity')
Out[4]: 
Product     Ball  Leash
CustomerID             
123          2.0    2.0
456          1.0    NaN

You can then use fillna to get 0 in the remaining cells.
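The pivot-then-fill approach can be sketched end to end as follows. This is a minimal sketch, assuming the input looks exactly like the question's sample; the variable name `wide` and the final cast back to integers are my own additions, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question's input DataFrame.
df = pd.DataFrame({
    "CustomerID": [123, 123, 456],
    "Product": ["Ball", "Leash", "Ball"],
    "Quantity": [2, 2, 1],
})

# Reshape to one row per customer and one column per product.
wide = df.pivot(index="CustomerID", columns="Product", values="Quantity")

# Customer/product pairs that never occur come back as NaN; replace them
# with 0 and cast to int, since the NaN forced the columns to float.
wide = wide.fillna(0).astype(int)

print(wide)
```

Note that pivot expects each CustomerID/Product pair to appear at most once, so this only fits data shaped like the sample.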

User

Answer 2 · edited 2024-05-02 10:28:42

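If I understand correctly, you want to fill in the quantity of each product for each customer from df1. I would build the result from df1 itself:

Create one column per product, holding the corresponding quantity
Group by CustomerID and sum all the product columns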

import pandas as pd
from io import StringIO

df1= StringIO("""CustomerID;Product;Quantity
123;Ball;2
123;Leash;2
456;Ball;1""")

df1=pd.read_csv(df1,sep=";")
unique_columns = list(df1["Product"].unique())

def productsAsColumns(row):
    columns = {c:0 for c in unique_columns}
    columns[row["Product"]] = row["Quantity"]
    return columns 

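# Expand each row's dict into one column per product.
df1[unique_columns] = df1.apply(productsAsColumns, axis=1, result_type="expand")
df1.drop(columns=["Product", "Quantity"], inplace=True)
# Sum the product columns per customer (avoids passing the builtin sum to apply).
df1 = df1.groupby("CustomerID")[unique_columns].sum().reset_index()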
print(df1)

Output

   CustomerID  Ball  Leash
0         123     2      2
1         456     1      0
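For comparison, the same two steps (one column per product, then a per-customer sum) can probably be collapsed into a single pivot_table call. This is a sketch under the assumption that summing is the right way to combine repeated CustomerID/Product pairs; the variable name `result` is hypothetical.

```python
import pandas as pd

# Sample data copied from the question's input DataFrame.
df1 = pd.DataFrame({
    "CustomerID": [123, 123, 456],
    "Product": ["Ball", "Leash", "Ball"],
    "Quantity": [2, 2, 1],
})

# aggfunc="sum" adds up repeated customer/product pairs;
# fill_value=0 fills pairs that never occur.
result = df1.pivot_table(
    index="CustomerID",
    columns="Product",
    values="Quantity",
    aggfunc="sum",
    fill_value=0,
).reset_index()

# Drop the leftover "Product" label on the column axis.
result.columns.name = None

print(result)
```

Unlike pivot, pivot_table aggregates, so it should also cope with a customer who has several rows for the same product.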
