I am trying to process a text file sentence by sentence. I split the text into sentences and run each sentence through some Python functions.
import io

with io.open("articles.html", encoding="utf-8") as myfile:
    data = myfile.read()
# The with block closes the file, so no explicit close() is needed.
data = data.split('\n')
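Basically, how I process each sentence depends on its length and on a few regular-expression filters. The Python functions (process_movie_1, process_movie_2, process_movie_3) are stored in other files and imported into the main script.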
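As a minimal sketch of the length-based selection described here, the word-count ranges used in the loop further down (0 to 4, 5 to 6, 7 to 15) can be expressed as a small pure function. The names pick_handler_name and count_words are hypothetical, and count_words uses plain whitespace splitting as a stand-in for TextBlob's word count, so the numbers may differ slightly from the real script. The extra conditions in the loop (the two-part split and the length of the first part) are not covered.

```python
def count_words(sentence):
    # Assumption: whitespace splitting approximates len(TextBlob(sentence).words).
    return len(sentence.split())


def pick_handler_name(wordcount):
    """Return the name of the function that would handle a sentence of this length."""
    if 0 <= wordcount <= 4:
        return "process_movie_1"
    if 5 <= wordcount <= 6:
        return "process_movie_2"
    if 7 <= wordcount <= 15:
        return "process_movie_3"
    # Sentences longer than 15 words are not handled by the loop at all.
    return None
```

Keeping the selection separate from the actual rendering call makes it easier to check the routing on its own before running the heavier processing.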
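I handle the sentences in a for loop. With the current structure, the script processes one sentence with one function at a time inside the loop. I need to modify the script so that all the sentences are processed simultaneously. I would like your ideas: I could call all of these functions from the main script, or fork it, and some of you may come up with something better. Right now I run the main script from an IDE, but I am ready to use the command prompt to process all the sentences at once. I am also open to any open-source software that could help in my situation.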
import re
from textblob import TextBlob

from clip1 import *
from clip2 import *
from clip3 import *
from clip4 import *
for idx, sentence in enumerate(data):
    serial = str(idx)
    folder = str(idx)
    string = str(sentence)
    tokens = TextBlob(string)
    wordcounts = len(tokens.words)
    # Split the sentence on "; ", "*", a newline, or "--".
    sep = re.split(r'; |\*|\n|--', string)
    # Only sentences that split into exactly two parts are processed.
    if len(sep) == 2:
        a, b = [TextBlob(str(e)) for e in sep]
        # gradient, fontface, etc. are assumed to be defined earlier in the script.
        if 0 <= wordcounts <= 4:
            process_movie_1(folder, gradient, fontface,
                            fontface_italic, highlight,
                            highlight_color, font_color, key_color,
                            first_key, second_key, third_key, string,
                            stroke_color, stroke_width, txt_under_color,
                            serial)
        elif 5 <= wordcounts <= 6:
            process_movie_2(folder, gradient, fontface, fontface_italic, highlight,
                            highlight_color, font_color, key_color,
                            first_key, second_key, third_key, string,
                            stroke_color, stroke_width, txt_under_color,
                            serial)
        elif 7 <= wordcounts <= 15:
            # The longest range is only handled when the first part is short.
            if 1 <= len(a.words) <= 3:
                print(idx, "(clip29 -- done)", len(tokens.words), sep)
                process_movie_3(folder, gradient, fontface, fontface_italic, highlight,
                                highlight_color, font_color, key_color,
                                first_key, second_key, third_key, string,
                                stroke_color, stroke_width, txt_under_color,
                                serial)
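One possible way to run the sentences concurrently instead of one after another is the standard-library concurrent.futures module. The following is only a sketch under stated assumptions: it targets Python 3, process_sentence and run_all are hypothetical names, and the worker shown here just counts words where the real script would run the body of the loop above (the split, the length checks, and the call to process_movie_1/2/3). It also assumes that each sentence can be processed independently of the others.

```python
from concurrent.futures import ProcessPoolExecutor, as_completed


def process_sentence(idx, sentence):
    """Hypothetical worker: handle one sentence and return (index, result).

    In the real script, the body of the for loop would go here.
    """
    return idx, len(sentence.split())


def run_all(sentences, worker=process_sentence, max_workers=None,
            executor_cls=ProcessPoolExecutor):
    """Submit one task per sentence and collect the results keyed by index."""
    results = {}
    with executor_cls(max_workers=max_workers) as pool:
        futures = [pool.submit(worker, idx, s) for idx, s in enumerate(sentences)]
        for future in as_completed(futures):
            idx, value = future.result()  # re-raises any exception from the worker
            results[idx] = value
    return results
```

With the default ProcessPoolExecutor, the worker has to live at module top level so it can be pickled, and on platforms that spawn new interpreters (Windows, for example) the call to run_all(data) should sit under an if __name__ == "__main__": guard. If the process_movie functions mostly wait on external programs or disk I/O, passing executor_cls=ThreadPoolExecutor may be enough; which one is faster depends on what those functions actually do.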