I have a dataset containing data like this:
Month, Year, Quantity Sold, Product Name
11, 2017, 13, "Creatine Powder Supplement - 500g"
11, 2017, 10, "Gummies 1 bag"
11, 2017, 12, "Creatine Powder Supplement - 1000g"
11, 2017, 15, "Creatine Powder Supplement - 1500g"
11, 2017, 11, "Glucosamine - 500g"
11, 2017, 23, "Glucosamine - 1500g"
12, 2017, 17, "Creatine Powder Supplement - 1000g"
12, 2017, 24, "Glucosamine - 500g"
12, 2017, 13, "Glucosamine - 1500g"
1, 2018, 16, "Creatine Powder Supplement - 500g"
1, 2018, 13, "Creatine Powder Supplement - 1000g"
1, 2018, 10, "Gummies 1 bag"
1, 2018, 11, "Glucosamine - 500g"
1, 2018, 21, "Glucosamine - 1500g"
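I want to calculate the total weight of products sold, broken down by month and year. This requires extracting each product's weight from the "Product Name" column, multiplying it by the "Quantity Sold" column, and then summing the results to give the total weight for each product.
Expected output (I have only calculated the total weight sold for the first row: 13 × 500 + 12 × 1000 + 15 × 1500 = 41000):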
Matched data set:
Month, Year, Product Name, Total Weight Sold
11, 2017, Creatine Powder Supplement, 41000
11, 2017, Glucosamine, <total>
12, 2017, Creatine Powder Supplement, <total>
12, 2017, Glucosamine, <total>
1, 2018, Creatine Powder Supplement, <total>
1, 2018, Glucosamine, <total>
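In addition, for any product whose name does not end with the pattern " - <number>g", I would like to output those rows to a separate dataset so they can be reviewed.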
UNmatched data set:
Month, Year, Quantity Sold, Product Name
11, 2017, 10, "Gummies 1 bag"
1, 2018, 10, "Gummies 1 bag"
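I am thinking of using str.extract, but I am not entirely sure how to do the math and then add each computed total to the other rows for the same product, placing the result in a new DataFrame or something similar.
Thanks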
Here is a Python solution. It writes the bad rows to an output file and the good rows to the terminal.
Printed to the terminal:
Printed to err.txt:
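A minimal sketch of the approach described in this answer, using only the standard library. The names PATTERN, SAMPLE, read_rows and summarise are hypothetical, and SAMPLE is a subset of the question's data. Matched totals go to the terminal and unmatched rows go to err.txt.

```python
import csv
import io
import re
from collections import defaultdict

# Hypothetical pattern: a product name ending in " - <number>g".
PATTERN = re.compile(r"^(?P<name>.+?)\s*-\s*(?P<grams>\d+)g$")

# Subset of the question's data, inlined so the sketch is self-contained.
SAMPLE = '''Month, Year, Quantity Sold, Product Name
11, 2017, 13, "Creatine Powder Supplement - 500g"
11, 2017, 10, "Gummies 1 bag"
11, 2017, 12, "Creatine Powder Supplement - 1000g"
11, 2017, 15, "Creatine Powder Supplement - 1500g"
'''


def read_rows(text):
    """Parse the CSV text and return the data rows without the header."""
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    next(reader)  # skip the header line
    return [row for row in reader if row]


def summarise(rows, err_file):
    """Sum quantity * grams per (month, year, name); write other rows to err_file."""
    totals = defaultdict(int)
    for month, year, qty, product in rows:
        match = PATTERN.match(product)
        if match is None:
            # No trailing weight: send the row to the error output for review.
            err_file.write(f'{month}, {year}, {qty}, "{product}"\n')
            continue
        key = (int(month), int(year), match["name"])
        totals[key] += int(qty) * int(match["grams"])
    return dict(totals)


if __name__ == "__main__":
    with open("err.txt", "w") as err:
        totals = summarise(read_rows(SAMPLE), err)
    print("Month, Year, Product Name, Total Weight Sold")
    for (month, year, name), grams in totals.items():
        print(f"{month}, {year}, {name}, {grams}")
```

Keeping the totals in a dictionary keyed by (month, year, name) means rows for the same product in the same month are added together regardless of the order in which they appear.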
The most straightforward solution I can think of is
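A pandas sketch along these lines, assuming str.extract with a regular expression for the trailing " - <number>g". The helper name split_and_total, the intermediate column names and the inline SAMPLE (a subset of the question's data) are illustrative and not taken from the original answer.

```python
import io

import pandas as pd

# Subset of the question's data, inlined so the sketch is self-contained.
SAMPLE = '''Month, Year, Quantity Sold, Product Name
11, 2017, 13, "Creatine Powder Supplement - 500g"
11, 2017, 10, "Gummies 1 bag"
11, 2017, 12, "Creatine Powder Supplement - 1000g"
11, 2017, 15, "Creatine Powder Supplement - 1500g"
11, 2017, 11, "Glucosamine - 500g"
11, 2017, 23, "Glucosamine - 1500g"
'''


def split_and_total(df):
    """Return (totals, unmatched) for a frame shaped like the question's data."""
    # Named groups become the column names of the extracted frame;
    # rows that do not match the pattern get NaN in both columns.
    parts = df["Product Name"].str.extract(
        r"^(?P<Product>.+?)\s*-\s*(?P<Grams>\d+)g$"
    )
    ok = parts["Grams"].notna()

    unmatched = df[~ok]
    matched = df[ok].assign(
        Product=parts.loc[ok, "Product"],
        Weight=parts.loc[ok, "Grams"].astype(int) * df.loc[ok, "Quantity Sold"],
    )
    totals = (
        matched.groupby(["Month", "Year", "Product"], as_index=False, sort=False)["Weight"]
        .sum()
        .rename(columns={"Product": "Product Name", "Weight": "Total Weight Sold"})
    )
    return totals, unmatched


df = pd.read_csv(io.StringIO(SAMPLE), skipinitialspace=True)
totals, unmatched = split_and_total(df)
print(totals.to_string(index=False))
print(unmatched.to_string(index=False))
```

Passing sort=False to groupby keeps the groups in order of first appearance, so January 2018 would not be sorted ahead of November 2017 on the full data.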
which outputs