Python: add quotation marks around every word and symbol in a text file

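#!/usr/bin/env python
# -*- coding: utf-8 -*-
import io
import re

# Read the input file as Unicode text.
with io.open("turkish.txt", "r", encoding="utf-8") as myfile:
    text = myfile.read()

# Match a word (letters, digits, apostrophes, hyphens) or a single punctuation mark.
replacer = re.compile(r"([\w'-]+|[.,!?;()%])", re.UNICODE)
# Quote every match, then put a space between quoted tokens that ended up adjacent.
output_text = replacer.sub(r'"\1"', text).replace('""', '" "')

# Write the result as UTF-8; the file object does the encoding.
with io.open("Output.txt", "w", encoding="utf-8") as text_file:
    text_file.write(output_text)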

2 answers

User

Answer 1 · edited 2024-09-26 18:19:55

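Adding [IVXLCDM]+\.|[\d\.]+(?:'\w+)? to the start of the regex pattern makes it match "10.000", "10.000'lerde", and "I." as expected. The first alternative covers Roman numerals followed by a period; the second covers numbers that contain dots, optionally followed by an apostrophe and a suffix.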

replacer = re.compile(r"\b([IVXLCDM]+\.|[\d\.]+(?:'\w+)?|[\w'-]+|[.,!?;()%])", re.UNICODE)
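To see what the extended pattern does on a small input, here is a minimal sketch. The helper name quote_tokens and the sample sentence are made up for illustration. Note one deliberate difference from the line above: the sketch leaves out the leading \b, on the assumption that punctuation following a non-word character (for example "(" after a space, or "!" after ")") should still be quoted; with \b in place, those characters have no word boundary in front of them and are skipped.

```python
import re

# Assumption: same alternatives as the answer's pattern, minus the leading \b.
TOKEN = re.compile(
    r"([IVXLCDM]+\.|[\d.]+(?:'\w+)?|[\w'-]+|[.,!?;()%])",
    re.UNICODE,
)


def quote_tokens(text):
    """Wrap each matched token in double quotes (hypothetical helper)."""
    quoted = TOKEN.sub(r'"\1"', text)
    # Adjacent tokens such as `(test` come out as `"(""test"`; separate them.
    return quoted.replace('""', '" "')


sample = "I. Murat 10.000'lerde (test)!"
print(quote_tokens(sample))
```

Because the alternatives are tried from left to right, the Roman-numeral and number branches get the first chance at each position, and ordinary words fall through to [\w'-]+.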

User

Answer 2 · edited 2024-09-26 18:19:55

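import io

# Read the input as UTF-8 so Turkish characters are decoded correctly.
with io.open("turkish.txt", "r", encoding="utf-8") as myfile:
    text = myfile.read()

# Split on single spaces; punctuation stays attached to its word.
output_text = text.split(" ")

with io.open("Output.txt", "w", encoding="utf-8") as outfile:
    for word in output_text:
        # Each word is written with a space on both sides of the quotes.
        outfile.write(u' "' + word + u'" ')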

This may be a better solution.
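For comparison, a small sketch of the split-based idea as a function, so its behavior is easy to check. The name quote_by_whitespace is hypothetical, and the sketch uses split() with no argument (an assumption, not what the answer does) so that newlines and repeated spaces are also treated as separators. Punctuation is not separated from words here, so this approach alone does not quote symbols individually.

```python
def quote_by_whitespace(text):
    """Quote each whitespace-separated chunk (hypothetical helper)."""
    # split() with no argument splits on any run of whitespace,
    # including newlines, and drops empty strings.
    return " ".join('"{}"'.format(word) for word in text.split())


sample = "Merhaba, Murat!\n10.000'lerde (test)"
print(quote_by_whitespace(sample))
```

In the output, "Merhaba," and "(test)" each stay a single quoted chunk, which is the main difference from the regex-based answers.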
