Answers to the question: A function without having to return a value? / Looping through a df quickly?



I have a large recipe database (about 145,000 rows) that I am working on. It has a "parsed_ingredients" column that looks like this (several dicts per row):

<pre><code>[{'orig_name': '1,00 kg Kalbsbraten ', 'orig_amount': '1.00', 'orig_unit': 'kg',
  'amount': 0.25, 'unit': 'g', 'splitted_ingredient': 'Kalbsbraten',
  'splitted_slized_ingredient': 'Kalbsbraten', 'further_specification': '',
  'alternatives': '', 'matched_ingredient_id': 'U030100',
  'matched_ingredient_st': 'Kalb Hackfleisch roh', 'calorie': 148,
  'protein': 19.726, 'carb': 0.0, 'fat': 7.713},
 {'orig_name': '1,00 Zwiebel(n) ', 'orig_amount': '1.00', 'orig_unit': 'Anzahl',
  'amount': 9.0, 'unit': 'g', 'splitted_ingredient': 'Zwiebel(n)', ... ]
</code></pre>

Basically, I am trying to prepare the df for item-based and user-based (content-based) recommender systems, so I am trying to build a matrix with one column for each ingredient contained in the recipes. I tried the following, but with this many rows it is very slow:

<pre><code>for index, row in df.iterrows():
    extracted_ingredient = ""
    for ingredient in row["parsed_ingredients"]:
        extracted_ingredient = ingredient["matched_ingredient_st"]
        if not extracted_ingredient == "None":
            df.loc[index, extracted_ingredient] = 1
</code></pre>

So I tried to write a function to use with apply, since I read that it computes much faster, but then realized that apply always expects me to return something to be stored in the df (otherwise I get 'TypeError: 'NoneType' object is not callable'):

<pre><code>def ingredient_extraction(content, dataframe=df):
    for newrow in content:
        for entry in newrow:
            if not entry["matched_ingredient_st"] == "None":
                df[entry["matched_ingredient_st"]] = 1

df.apply(ingredient_extraction(df["parsed_ingredients"], df), axis=1)
</code></pre>

Is there any way to have pandas apply this function to my df? Or is there a better way to speed up what is done with iterrows?
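A note on the `apply` attempt above: `ingredient_extraction(df["parsed_ingredients"], df)` is called before `apply` runs, so `apply` receives the function's return value (`None`) rather than a function, which is consistent with the reported `TypeError`. In addition, `df[col] = 1` sets the whole column to 1 for every recipe, not just the current one. One possible way to avoid the row-by-row loop altogether is sketched below: extract the ingredient names per recipe, `explode` them into one row per (recipe, ingredient) pair, and build the indicator columns with `pd.get_dummies`. The toy DataFrame, the `recipe` column, and the helper name `ingredient_names` are assumptions made for illustration; only `parsed_ingredients` and `matched_ingredient_st` come from the question.

```python
import pandas as pd

# Toy stand-in for the real data (hypothetical values); only the key used
# below, "matched_ingredient_st", is included in each dict.
df = pd.DataFrame({
    "recipe": ["roast", "soup", "plain"],
    "parsed_ingredients": [
        [{"matched_ingredient_st": "Kalb Hackfleisch roh"},
         {"matched_ingredient_st": "Zwiebel"}],
        [{"matched_ingredient_st": "Zwiebel"},
         {"matched_ingredient_st": "None"}],
        [],  # a recipe without any parsed ingredient
    ],
})


def ingredient_names(parsed):
    """Return the matched ingredient names of one recipe, skipping "None"."""
    return [d["matched_ingredient_st"] for d in parsed
            if d["matched_ingredient_st"] != "None"]


# One row per (recipe, ingredient) pair; an empty list becomes a single NaN row.
names = df["parsed_ingredients"].apply(ingredient_names).explode()

# One 0/1 column per ingredient (NaN rows stay all zero), then collapse the
# pairs back to one row per recipe.
indicator = pd.get_dummies(names, dtype=int).groupby(level=0).max()

# Attach the indicator columns to the original frame.
result = df.join(indicator)
print(result.drop(columns="parsed_ingredients"))
```

This has not been timed against the 145,000-row data, so any speed-up is an expectation rather than a measurement. With many distinct ingredients the dense 0/1 matrix can become large, in which case a sparse representation may be worth considering.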


1 answer

Anonymous, 1 day ago
