How to rewrite the results sent to the terminal

Published 2024-09-28 15:33:16


I am using a Python library called nlpnet. This library is a part-of-speech tagger for Brazilian Portuguese; after many attempts I finally got output like this: Output of tagged data in terminal

As the image shows, it classifies each word individually with an abbreviation of its grammatical class. The challenge for the algorithm is to scan the entire analyzed document and rewrite only the sentences that contain 5 or more words of certain grammatical classes that I choose.

Example: analyze a text document containing several sentences, and rewrite into another file only the sentences that have 5 or more verbs or adjectives.

Code used. Class that prepares the tagger:

#!/usr/bin/python
# -*- coding: utf8 -*-
import nlpnet


def get_tags(content):
    # Directory containing the trained tagging models
    data_dir = 'pos-pt'
    # Create the tagger for that model directory and language
    tagger = nlpnet.POSTagger(data_dir, language='pt')

    for sentence in content:
        # Tag the sentence
        tagged_sentence = tagger.tag(sentence)
        print(tagged_sentence)

    return content

File class:


2 answers

When running the program from the command line, write `$ python python_filename.py > savingfilename.txt`. This saves everything printed to the screen into a text file.
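The same effect can also be had from inside Python, without shell redirection. A minimal sketch using only the standard library (the filename is a placeholder):

```python
import contextlib

# Redirect everything print() writes into a file instead of the terminal
with open('savingfilename.txt', 'w', encoding='utf8') as f:
    with contextlib.redirect_stdout(f):
        print('this line goes to the file, not the screen')
```

This is handy when you only want part of the program's output captured rather than the whole run.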

If all you want is to POS-tag the sentences in the document and dump the sentences containing N or more words of the chosen POS into a file, you don't need the second script you posted.

Here is an extremely simplified example:

import os
import nlpnet

TAGGER = nlpnet.POSTagger('pos-pt', language='pt')


# You could have a function that tagged and verified if a
# sentence meets the criteria for storage.

def is_worth_saving(text, pos, pos_count):
    # tagged sentences are lists of tagged words, which in
    # nlpnet are (word, pos) tuples. Tagged texts may contain
    # several sentences.
    pos_words = [word for sentence in TAGGER.tag(text)
                 for word in sentence
                 if word[1] == pos]
    return len(pos_words) >= pos_count


# Then you'd just need to open your original file, read a sentence, tag
# it, decide if it's worth saving, and save it or not. Until you consume 
# the entire original file. Thus not loading the entire dataset in memory 
# and keeping a small memory footprint.

with open('opiniaoaborto.txt', encoding='utf8') as original_file:
    with open('oracaos_interessantes.txt', 'w') as output_file:
        for text in original_file:
            # For example, only save sentences with at least 5 verbs in them
            if is_worth_saving(text, 'V', 5):
                output_file.write(text + os.linesep)
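For reference, `TAGGER.tag(text)` returns a list of sentences, each a list of `(word, tag)` tuples. The filtering inside `is_worth_saving` can be checked on hand-made data (the sample words and tags below are made up for illustration, not real tagger output):

```python
# Simulated return value of TAGGER.tag() for a two-sentence text
tagged = [[('O', 'ART'), ('gato', 'N'), ('come', 'V')],
          [('Ele', 'PROPESS'), ('dorme', 'V')]]

# The same list comprehension used in is_worth_saving, for pos='V'
pos_words = [word for sentence in tagged
             for word in sentence
             if word[1] == 'V']
print(len(pos_words))  # 2 verbs in total
```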

To answer your follow-up: you want to check whether a sentence contains 5 words tagged with any POS from a given list. I see two scenarios:

A) The 5 words must all belong to the same grammatical class. For example, 5 verbs ('Comendo, dançando, procurando, olhando e falando') or 5 nouns ('O gato, o sapo, o cão, o louro e o rato foram às compras'), but not 5 verbs + nouns ('O gato está querendo comer o rato' [2 nouns, 3 verbs]).

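For case A, one option is to count each POS of interest separately and require that a single class reach the threshold on its own. A minimal sketch, assuming pre-tagged input in nlpnet's `(word, tag)` format (the function name is mine; with nlpnet you would pass `TAGGER.tag(text)` as `tagged_text`):

```python
from collections import Counter


def is_worth_saving_same_pos(tagged_text, pos_list, pos_count):
    # Count occurrences of each POS of interest across all sentences
    counts = Counter(tag for sentence in tagged_text
                     for _, tag in sentence
                     if tag in pos_list)
    # True only if some single POS reaches the threshold by itself
    return any(n >= pos_count for n in counts.values())
```

With this check, a sentence with 2 nouns and 3 verbs fails for `pos_count=5`, which is exactly the case A behavior.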

B) The sentence contains 5 POS-tagged words made up of any combination of classes from the list. For example: 'O gato está querendo comer o rato' (2 nouns + 3 verbs).

import os
import nlpnet

TAGGER = nlpnet.POSTagger('pos-pt', language='pt')

# Again, one of the arguments would have to take a list of valid POS
def is_worth_saving(text, pos_list, pos_count):
    pos_words = [word for sentence in TAGGER.tag(text)
                 for word in sentence
                 if word[1] in pos_list]
    return len(pos_words) >= pos_count

with open('opiniaoaborto.txt', encoding='utf8') as original_file:
    with open('oracaos_interessantes.txt', 'w') as output_file:
        for text in original_file:
            # For example, only save sentences whose combined count of verbs and nouns is at least 5
            if is_worth_saving(text, ['V', 'N'], 5):
                output_file.write(text + os.linesep)
