Answers to the question: replacing special characters in Python

Replacing special characters in Python


1 answer

Anonymous, 1 day ago


Two improvements can be made to the posted code:
<ul>
<li>Use DataFrame apply instead of a Python for or while loop to process each title (the loop is very slow)</li>
<li>Use a regular expression to check whether a comma is followed by a letter, instead of looping over every letter of the alphabet (also slow)</li>
</ul>
Code
<pre><code>import re

def clean_title(title):
    """Remove any comma that is immediately followed by a word character."""
    # ',(\w)' matches a comma directly followed by a letter, digit or underscore;
    # the replacement keeps only the captured character
    return re.sub(r',(\w)', r'\1', title)

# Clean titles
df['title'] = df['title'].apply(clean_title)
</code></pre>
Test
<ul>
<li>Build a dataset of movie titles and release years</li>
<li>The titles contain both wanted and unwanted commas</li>
</ul>
Example of an unwanted comma:
<ul>
<li>The S,even Samurai</li>
</ul>
Example of a wanted comma:
<ul>
<li>"I, Tonya"</li>
</ul>
Create the dataset
<pre><code>import pandas as pd

df = pd.DataFrame({
    'title': ['Lock, Stock and Two Smoking Barrels',
              'The S,even Samurai',
              'B,onnie and C,lyde',
              'Reser,voir Dogs',
              'A,irplane!',
              'Doct,or Zhiva,go',
              'I, Tonya'],
    'Year': ['1998', '1954', '1967', '1992', '1980', '1965', '2017']})
print(df)
</code></pre>
Dataset before cleaning
<pre><code>                                 title  Year
0  Lock, Stock and Two Smoking Barrels  1998
1                   The S,even Samurai  1954
2                   B,onnie and C,lyde  1967
3                      Reser,voir Dogs  1992
4                           A,irplane!  1980
5                     Doct,or Zhiva,go  1965
6                             I, Tonya  2017
</code></pre>
Dataset after cleaning
<pre><code>                                 title  Year
0  Lock, Stock and Two Smoking Barrels  1998
1                    The Seven Samurai  1954
2                     Bonnie and Clyde  1967
3                       Reservoir Dogs  1992
4                            Airplane!  1980
5                       Doctor Zhivago  1965
6                             I, Tonya  2017
</code></pre>
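The regex can also be checked without pandas. The sketch below runs the same substitution over a plain list of the test titles. One caveat: `\w` matches digits and underscores as well as letters, so a title such as "10,000 BC" (not part of the original test data, added here as an assumption) would lose its comma. The `clean_title_letters_only` helper is a hypothetical variant, not part of the answer above, that restricts the match to letters.

```python
import re

def clean_title(title):
    """Drop a comma when a word character follows it directly."""
    return re.sub(r',(\w)', r'\1', title)

def clean_title_letters_only(title):
    """Hypothetical variant: drop the comma only when a letter follows."""
    # [^\W\d_] means "a word character that is neither a digit nor an underscore"
    return re.sub(r',([^\W\d_])', r'\1', title)

titles = [
    'Lock, Stock and Two Smoking Barrels',
    'The S,even Samurai',
    'B,onnie and C,lyde',
    'Reser,voir Dogs',
    'A,irplane!',
    'Doct,or Zhiva,go',
    'I, Tonya',
]

cleaned = [clean_title(t) for t in titles]
for before, after in zip(titles, cleaned):
    print(f'{before!r} -> {after!r}')

# Assumed extra case: a comma used as a thousands separator
print(clean_title('10,000 BC'))               # comma removed
print(clean_title_letters_only('10,000 BC'))  # comma kept
```

Commas followed by a space ("Lock, Stock", "I, Tonya") are left alone by both functions, because the pattern requires the next character to be a word character.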
