How do I remove \uxxx from a list of strings? - Q&A - Python中文网


Posted 2024-09-30 01:29:39



I want to remove every word that starts with \u. I believe these are Unicode escapes of the form "\uxxx".

Original string:

"RT  \u2066als \u2066@WBHoekstra\u2069 zijn poot maar stijf houdt in de Italiaanse kwestie. Leest Mattheus 25, 2-13 '"

Expected output:

"RT @WBHoekstra zijn poot maar stijf houdt in de Italiaanse kwestie. Leest Mattheus 25, 2-13 '"

I tried using a regular expression like this:

re.sub('\u\w+','',item )

But I got the following error:

"SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 0-1: truncated \uXXXX escape"
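The error comes from Python's string-literal parser rather than from `re`: in a normal (non-raw) string, `\u` must be followed by exactly four hex digits, so `'\u\w+'` is rejected before the pattern ever reaches `re.sub`. A minimal sketch of two patterns that do compile is shown here. It assumes the characters to remove are the directional isolates U+2066 to U+2069 seen in the sample, and the variable names (`item`, `isolates`, `literal`) are illustrative only.

```python
import re

# Case 1: the string holds the real control characters (as in the sample).
item = "RT  \u2066als \u2066@WBHoekstra\u2069 zijn poot"

# Raw string: the \uXXXX escapes are interpreted by re, not by the
# string-literal parser. U+2066..U+2069 are directional isolate controls.
isolates = re.compile(r'[\u2066-\u2069]')
cleaned = isolates.sub('', item)
print(cleaned)

# Case 2: the text holds a literal backslash followed by 'u' and four hex
# digits (for example, text that was never unescaped). The backslash must
# then be escaped in the pattern.
literal = r"RT \u2066als"
cleaned_literal = re.sub(r'\\u[0-9a-fA-F]{4}', '', literal)
print(cleaned_literal)
```

Which case applies depends on how the strings were loaded; `print(repr(item))` shows whether the backslashes are real characters or escapes.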


1 answer

User

#1 · Posted 2024-09-30 01:29:39

You can do this with .encode('ascii', 'ignore'), which drops every character that cannot be encoded as ASCII and returns a bytes object:

"RT  \u2066als \u2066@WBHoekstra\u2069 zijn poot maar stijf houdt in de Italiaanse kwestie. Leest Mattheus 25, 2-13 '".encode('ascii', 'ignore')

Output:

 b"RT  als @WBHoekstra zijn poot maar stijf houdt in de Italiaanse kwestie. Leest Mattheus 25, 2-13 '"
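Note that the result above is `bytes`, and that the word "als" is still present, so it does not exactly match the expected output in the question. As a rough sketch, the snippet here decodes the result back to `str` and also shows an alternative that removes only Unicode format characters (general category `Cf`, which includes U+2066 to U+2069) while keeping other non-ASCII text such as accented letters. The helper name `strip_format_chars` and the sample list `items` are assumptions made for illustration.

```python
import unicodedata

text = "RT  \u2066als \u2066@WBHoekstra\u2069 zijn poot maar stijf houdt"

# Option 1: round-trip through ASCII. This drops every non-ASCII character,
# including legitimate ones such as accented letters.
ascii_only = text.encode('ascii', 'ignore').decode('ascii')
print(ascii_only)


# Option 2: drop only invisible format characters (category "Cf").
def strip_format_chars(s: str) -> str:
    return ''.join(ch for ch in s if unicodedata.category(ch) != 'Cf')


# Applying the helper to a list of strings, as the question title asks.
items = [text, "caf\u00e9 \u2066ok\u2069"]
cleaned_items = [strip_format_chars(s) for s in items]
print(cleaned_items)
```

Option 1 is the shortest when the text is known to be ASCII apart from the control characters; option 2 is the safer choice for text in languages that use non-ASCII letters.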
