Python code that creates JSON containing Marathi text produces unreadable JSON

Posted 2024-06-28 11:39:57


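I am trying to create a JSON file with Python code. The file is created correctly when the text is English, but not when it is Marathi. Please see the code: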

import os
import json

jsonFilePath = "E:/file/"
captchaImgLocation = "E:/file/captchaimg/"

path_to_tesseract = r"C:/Program Files/Tesseract-OCR/tesseract.exe"
image_path = r"E:/file/captchaimg/captcha.png"


x = {
    "FName": "प्रवीण",
}

# convert into JSON:
y = json.dumps(x, ensure_ascii=False).encode('utf8')

# the result is a JSON string:
print(y.decode())

completeName = os.path.join(jsonFilePath, "searchResult_Unicode.json")
print(str(completeName))
file1 = open(completeName, "w")
file1.write(str(y))
file1.close()

Output on console:
{"FName": "प्रवीण"}

Contents of the file created in the folder:
b'{"FName": "\xe0\xa4\xaa\xe0\xa5\x8d\xe0\xa4\xb0\xe0\xa4\xb5\xe0\xa5\x80\xe0\xa4\xa3"}'

There are no runtime or compile-time errors, but the JSON file is created in the format shown above. Please suggest a solution.


3 answers

Open the file with the encoding you want, then json.dump to it:

import os
import json

data = { "FName": "प्रवीण" }

# Writing human-readable.  Note some text viewers on Windows require UTF-8 w/ BOM
# to *display* correctly.  It's not a problem with writing, but you can use
# encoding='utf-8-sig' to hint to those programs that the file is UTF-8 if
# you see that issue (then read it back with 'utf-8-sig' as well).  A file
# written with encoding='utf8' MUST be read back with encoding='utf8'.
with open('out.json', 'w', encoding='utf8') as f:
    json.dump(data, f, ensure_ascii=False)

# Writing non-human-readable for non-ASCII, but others will have few
# problems reading it back into Python because all common encodings are ASCII-compatible.
# Using the default encoding this will work.  I'm being explicit about encoding
# because it is good practice.
with open('out2.json', 'w', encoding='ascii') as f:
    json.dump(data, f, ensure_ascii=True) # True is the default anyway

# reading either one is the same
with open('out.json', encoding='utf8') as f:
    data2 = json.load(f)

with open('out2.json', encoding='utf8') as f:  # UTF-8 is ASCII-compatible
    data3 = json.load(f)

# Round-tripping test
print(data == data2, data2)
print(data == data3, data3)

Output:

True {'FName': 'प्रवीण'}
True {'FName': 'प्रवीण'}

out.json (UTF-8 encoded):

{"FName": "प्रवीण"}

out2.json (ASCII encoded):

{"FName": "\u092a\u094d\u0930\u0935\u0940\u0923"}
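The code comment above mentions encoding='utf-8-sig' for Windows viewers that need a BOM. A minimal sketch of that variant follows. The file name out_bom.json and the temporary directory are assumptions made for illustration and are not part of the original answer. It checks that the file starts with the UTF-8 byte order mark and that reading with the same codec strips it again.

```python
import json
import os
import tempfile

data = {"FName": "प्रवीण"}

# Hypothetical file name, written to a temporary directory for this sketch.
bom_path = os.path.join(tempfile.mkdtemp(), "out_bom.json")

# 'utf-8-sig' writes a UTF-8 byte order mark (BOM) before the JSON text.
with open(bom_path, "w", encoding="utf-8-sig") as f:
    json.dump(data, f, ensure_ascii=False)

# Inspect the raw bytes: the file begins with the 3-byte BOM.
with open(bom_path, "rb") as f:
    raw = f.read()
starts_with_bom = raw.startswith(b"\xef\xbb\xbf")

# Reading with 'utf-8-sig' strips the BOM, so json.load sees clean text.
with open(bom_path, encoding="utf-8-sig") as f:
    data_bom = json.load(f)

print(starts_with_bom, data_bom == data)
```

If a file is written with 'utf-8-sig', reading it back with the same codec is the safe choice, since the BOM is not part of the JSON text itself.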

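Do you need the JSON to be human-readable? That is usually bad practice, because you never know which encoding will be used to read it.
You can use the json module to write and read the JSON file without worrying about encoding, because by default json.dump escapes every non-ASCII character as \uXXXX: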

import json

json_path = "test.json"
x = {"FName": "प्रवीण"}

with open(json_path, "w") as outfile:
    json.dump(x, outfile, indent=4)

with open(json_path, "r") as infile:
    print(json.load(infile))
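Why the default settings sidestep the encoding question can be checked directly. The sketch below is an illustration rather than part of the original answer, and the variable names escaped and readable are chosen here. It compares the default ensure_ascii=True output with the ensure_ascii=False output for the same dictionary.

```python
import json

x = {"FName": "प्रवीण"}

# Default behaviour: non-ASCII characters become \uXXXX escapes.
escaped = json.dumps(x)

# ensure_ascii=False keeps the Devanagari characters as they are.
readable = json.dumps(x, ensure_ascii=False)

print(escaped.isascii())   # the escaped form contains only ASCII characters
print(readable.isascii())  # the readable form does not

# Both strings parse back to the same Python dictionary.
print(json.loads(escaped) == json.loads(readable) == x)
```

Because the escaped form is pure ASCII, it survives being written with almost any default file encoding, at the cost of not being readable in a text editor.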

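You have already encoded the JSON string to bytes, so you must either open the file in binary mode or decode the bytes back to a string before writing. Either of these works: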

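# Option 1: write the already-encoded bytes in binary mode
with open(completeName, "wb") as file1:
    file1.write(y)

# Option 2: decode back to str and write in text mode, explicitly as UTF-8
# (the platform default encoding may not be able to represent Marathi text)
with open(completeName, "w", encoding="utf-8") as file1:
    file1.write(y.decode('utf-8'))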

file1 = open(completeName, "w")
file1.write(str(y))

This, by contrast, writes the string representation of the bytes object to the file, which is always the wrong thing to do.
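The difference between str(y) and y.decode() can be seen without touching a file. The following is a small sketch using the dictionary from the question. The names as_repr and as_text are introduced here for illustration only.

```python
import json

x = {"FName": "प्रवीण"}
y = json.dumps(x, ensure_ascii=False).encode("utf8")  # y is a bytes object

# str() on bytes returns its repr: the b'...' literal with \x escapes.
as_repr = str(y)

# decode() turns the bytes back into the original JSON text.
as_text = y.decode("utf8")

print(as_repr[:2])                # the first two characters are b'
print(json.loads(as_text) == x)   # the decoded text is valid JSON

# The repr is not valid JSON, so parsing it fails.
try:
    json.loads(as_repr)
    repr_is_json = True
except json.JSONDecodeError:
    repr_is_json = False
print(repr_is_json)
```

This matches the file contents shown in the question, which begin with b' because the repr of the bytes was written instead of the text.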
