How to handle invalid strings in Python text processing

Posted 2024-10-02 20:34:37


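I am working on text classification, and my data contains invalid characters like the ones shown below. Can someone help me decode these characters into their actual values? Any pointers would also be helpful.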

"It wouldn\'t take much to do for **Ã\x86sop**,\n\n\n\n\n            would it?**â\x80\x9d** whispered Ivan to Alyosha.\n\n\n\n\n\n\n\n\n\n            **â\x80\x9c**God forbid!**â\x80\x9d** cried Alyosha.\n\n\n\n\n\n\n\n\n\n            **â\x80\x9c**Why should He forbid?**â\x80\x9d** Ivan went on in the\n\n\n\n\n            same whisper, with a malignant grimace. **â\x80\x9c**One reptile will devour the other., And serve them\n\n\n\n\n            both right, too.â\x80\x9d\n\n\n\n\n\n\n\n\n\n            Alyosha\n\n\n\n\n            shuddered.\n\n\n\n\n\n\n\n\n\n            â\x80\x9cOf course I won\'t let him be murdered as I didn\'t\n\n\n\n\n            just now., Stay here, Alyosha, I\'ll go for a turn in the yard., My\n\n\n\n\n            head\'s begun to ache.â\x80\x9d\n\n\n\n\n\n\n\n\n\n            Alyosha went\n\n\n\n\n            to his father\'s bedroom and sat by his bedside behind the screen\n\n\n\n\n            for about an hour., The old man suddenly opened his eyes and gazed\n\n\n\n\n            for a long while at Alyosha, evidently remembering and\n\n\n\n\n            meditating., All at once his face betrayed extraordinary\n\n\n\n\n            excitement.\n\n\n\n\n\n\n\n\n\n            â\x80\x9cAlyosha,â\x80\x9d he whispered apprehensively,\n\n\n\n\n            â\x80\x9cwhere\'s Ivan?â\x80\x9d\n\n\n\n\n\n\n\n\n\n            â\x80\x9cIn the yard., He\'s got a headache., He\'s on the\n\n\n\n\n            watch.â\x80\x9d\n\n\n\n\n\n\n\n\n\n            â\x80\x9cGive me that looking-glass., It stands over there.\n\n\n\n\n            Give it me.â\x80\x9d\n\n\n\n\n\n\n\n\n\n            Alyosha gave\n\n\n\n\n            him a little round folding looking-glass which stood on the chest\n\n\n\n\n            of drawers., The old man looked at himself in it; his nose was\n\n\n\n\n            considerably swollen, and on the left side of his forehead there\n\n\n\n\n            was a rather large crimson bruise.\n\n\n\n\n\n\n\n\n\n            â\x80\x9cWhat does Ivan 
say?"

1 answer
User
#1 · Posted 2024-10-02 20:34:37

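It looks like the data was double-encoded (are you using Python 2?). At some point the text's UTF-8 bytes were decoded as Latin-1, so each multi-byte character became several characters: â\x80\x9d, for example, is the three UTF-8 bytes E2 80 9D of the closing quotation mark ”. You can fix it by encoding to Latin-1 and then decoding as UTF-8.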

>>> data.encode('latin-1').decode('utf-8')
"It wouldn't take much to do for **Æsop**,\n\n\n\n\n            would it?**”** whispered Ivan to Alyosha.\n\n\n\n\n\n\n\n\n\n            **“**God forbid!**”** cried Alyosha.\n\n\n\n\n\n\n\n\n\n            **“**Why should He forbid?**”** Ivan went on in the\n\n\n\n\n            same whisper, with a malignant grimace. **“**One reptile will devour the other., And serve them\n\n\n\n\n            both right, too.”\n\n\n\n\n\n\n\n\n\n            Alyosha\n\n\n\n\n            shuddered.\n\n\n\n\n\n\n\n\n\n            “Of course I won't let him be murdered as I didn't\n\n\n\n\n            just now., Stay here, Alyosha, I'll go for a turn in the yard., My\n\n\n\n\n            head's begun to ache.”\n\n\n\n\n\n\n\n\n\n            Alyosha went\n\n\n\n\n            to his father's bedroom and sat by his bedside behind the screen\n\n\n\n\n            for about an hour., The old man suddenly opened his eyes and gazed\n\n\n\n\n            for a long while at Alyosha, evidently remembering and\n\n\n\n\n            meditating., All at once his face betrayed extraordinary\n\n\n\n\n            excitement.\n\n\n\n\n\n\n\n\n\n            “Alyosha,” he whispered apprehensively,\n\n\n\n\n            “where's Ivan?”\n\n\n\n\n\n\n\n\n\n            “In the yard., He's got a headache., He's on the\n\n\n\n\n            watch.”\n\n\n\n\n\n\n\n\n\n            “Give me that looking-glass., It stands over there.\n\n\n\n\n            Give it me.”\n\n\n\n\n\n\n\n\n\n            Alyosha gave\n\n\n\n\n            him a little round folding looking-glass which stood on the chest\n\n\n\n\n            of drawers., The old man looked at himself in it; his nose was\n\n\n\n\n            considerably swollen, and on the left side of his forehead there\n\n\n\n\n            was a rather large crimson bruise.\n\n\n\n\n\n\n\n\n\n            “What does Ivan say?"
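If only some of the strings in a corpus are damaged this way, the one-liner above will raise an exception on the clean ones. Here is a minimal sketch of a guarded version, assuming Python 3 and assuming the damage is always UTF-8 read as Latin-1. The name `fix_mojibake` and the sample string are made up for illustration and do not come from the original data.

```python
def fix_mojibake(text: str) -> str:
    """Undo UTF-8 bytes that were mistakenly decoded as Latin-1.

    Returns the input unchanged when the round trip fails, which
    usually means the text was not damaged in this particular way.
    """
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


# Hypothetical sample: curly quotes that went through the bad decoding step.
clean = "\u201cGod forbid!\u201d cried Alyosha."
garbled = clean.encode("utf-8").decode("latin-1")  # reproduces the corruption

print(repr(garbled))
print(fix_mojibake(garbled))
print(fix_mojibake(clean))  # already-clean text is returned unchanged
```

The fallback is a design choice: text that is already correct either cannot be encoded as Latin-1 (curly quotes) or does not decode as valid UTF-8 (a lone é), so it is left untouched. Pure ASCII passes through the round trip unchanged.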
