How to encode a Python list

    # Python 2 code
    import codecs
    import re
    from collections import Counter

    # Read the text file as UTF-8
    with codecs.open('projectsinline.txt', 'r', encoding="utf-8") as f:
        for line in f:
            # Use the re module to extract words of 4 to 20 characters
            unicode_pattern = re.compile(r'\b\w{4,20}\b', re.UNICODE)
            result = unicode_pattern.findall(line)
            # Counter maps each word to its number of occurrences
            word_counts = Counter(result)
            Allwords = []
            for clave in word_counts:
                # Keep only the most repeated words
                if word_counts[clave] >= 10:
                    Allwords.append(clave)
            print Allwords

2 Answers

User

#1 · Edited 2024-05-19 00:00:24

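Your list of Unicode strings is correct. When you print a list, its items are displayed using repr(). When you print an item by itself, it is displayed using str(). This is only a display choice, similar to printing an integer in decimal or in hexadecimal.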

So if you want to see the words rendered properly, print them one at a time. For comparisons, the content is already correct.
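The point above can be sketched as follows. This is a minimal illustration written for Python 3, and the sample words are made up rather than taken from the asker's file.

```python
# Hypothetical sample words, written with escape codes on purpose.
words = ["caf\xe9", "ni\xf1o"]

# The escaped spelling and the literal spelling are the same string,
# so comparisons work no matter how the list was displayed.
same = words[0] == "café"

# Joining the items (or printing them one by one) goes through str(),
# so the characters come out as written, without escape codes.
one_per_line = "\n".join(words)
```

Calling `print(one_per_line)` then shows each word on its own line, provided the terminal can display the characters.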

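It is worth noting that Python 3 changed the behavior of repr(): it now shows non-ASCII characters without escape codes when the terminal supports them directly, and the ascii() function reproduces the Python 2 repr() behavior.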
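A small Python 3 sketch of that difference, using a made-up sample word. Note that Python 2 would also add a `u` prefix to the escaped form, which `ascii()` does not.

```python
# Hypothetical sample word containing one non-ASCII character.
word = "ma\xf1ana"

# Python 3 repr() keeps printable non-ASCII characters as they are.
py3_repr = repr(word)

# ascii() escapes them, much like repr() did in Python 2.
py2_style = ascii(word)

print(py2_style)
```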

User

#2 · Edited 2024-05-19 00:00:24

You can try this:

    print('[' + ', '.join(Allwords) + ']')
