Decoding tweets from snscrape in Python

Posted 2024-09-24 22:31:49

Male | A programmer who enjoys writing Python code.

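I am trying to retrieve some tweets with snscrape, but the resulting JSON file is encoded as "cp1252". I couldn't find anything in the documentation about requesting a specific encoding for the JSON file. If that isn't possible, how can I convert a fairly large text file from cp1252 to UTF-8? I've seen many questions like this one, but they all explain how to print the text correctly, not how to store it in a file.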

This question is not a duplicate of this one, because I want to solve the problem in Python rather than through cmd.

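Edit: I'll try to explain the situation better. I'm retrieving tweets, and they happen to contain Unicode characters. Here is an example of a sentence I want to decode: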

La mia vita \u00e8 fantastica

I extracted the encoding of the file this sentence is written in and it is 'cp-1252'. I'm not sure anymore if this is a 'cp-1252' file containing unicode characters (is this even possible?), but I had no luck converting that "\u00e8" to my "è".
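One possible reading of the example sentence, offered as a sketch rather than a confirmed diagnosis: the file may hold the six literal ASCII characters `\`, `u`, `0`, `0`, `e`, `8` (a JSON string escape) instead of the single character "è". Under that assumption the bytes are the same in ASCII, cp1252 and UTF-8, which would explain why an encoding detector reports cp1252. The variable names below are illustrative only.

```python
import json

# The sentence as it would sit in the file: a backslash, 'u', then four hex digits.
raw = r"La mia vita \u00e8 fantastica"

# Every character is plain ASCII, so the bytes are identical in cp1252 and UTF-8.
assert raw.isascii()
assert raw.encode("cp1252") == raw.encode("utf-8")

# Parsing the text as a JSON string value lets the JSON parser resolve the escape.
decoded = json.loads('"' + raw + '"')
print(decoded)  # La mia vita è fantastica
```

If this assumption holds, the task is less about changing the file's encoding and more about parsing the JSON escapes.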

After the first comment, I tried the following:

file = open(file_name_input, encoding='cp1252')
file_output = open(file_name_output, 'w')
for line in file:
    file_output.write(line.encode('utf-8').decode())
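For context, `line.encode('utf-8').decode()` returns the same string it started with, so the loop above copies the text unchanged, and any `\u00e8` escape stays an escape. Below is a minimal sketch of one alternative. It assumes the input is a JSON Lines file (one JSON object per line, which is what snscrape's `--jsonl` option produces); the question does not confirm this. The function name and paths are hypothetical.

```python
import json


def rewrite_jsonl_as_utf8(src_path, dst_path, src_encoding="cp1252"):
    """Copy a JSON Lines file, writing real characters instead of JSON escapes."""
    with open(src_path, encoding=src_encoding) as src, \
         open(dst_path, "w", encoding="utf-8", newline="\n") as dst:
        # Stream line by line so a large file never has to fit in memory.
        for line in src:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            record = json.loads(line)  # the parser resolves escapes such as \u00e8
            # ensure_ascii=False keeps non-ASCII characters as-is in the output.
            dst.write(json.dumps(record, ensure_ascii=False) + "\n")
```

A call such as `rewrite_jsonl_as_utf8(file_name_input, file_name_output)` would then produce a UTF-8 file in which the example sentence contains "è" directly. If the file is a single JSON document rather than JSON Lines, `json.load` and `json.dump(..., ensure_ascii=False)` on the whole file would be the analogous approach.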


0 answers

There are no answers yet.
