<p>I have a DataFrame that looks like this:</p>
<pre><code>id created_at text month
0 911721027587231746 2017-09-23 22:36:46 تفاصيل استخدام سيارات الإسعاف لتهريب المواد ال... 9
1 911719688257851397 2017-09-23 22:31:27 تطوير لقاح جديد لمحاربة تسوس الأسنان\n https:/... 9
2 911715658395725826 2017-09-23 22:15:26 "حمدي الميرغني" يشارك جمهوره بصورة جديدة من شه... 9
3 911715466166587392 2017-09-23 22:14:40 شخصية مصر.. في عيون جمال حمدان (2) https://t.c... 9
</code></pre>
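<p>The month column takes values from 1 to 11. I want to build a model on the text data for each month. I am trying to capture the output and save it to a txt file, but when I open the files, each one contains only a single line.</p>
<p>What I want is 11 text files, each named after its month index, with 12 lines in each file.</p>
<p>Here is my code:</p>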
<pre><code>def model(final_text):
    sentences = [clean(raw_sentence) for raw_sentence in final_text]
    doc_clean = [i.split() for i in sentences]
    dictionary = corpora.Dictionary(doc_clean)
    doc_term_matrix = [dictionary.doc2bow(doc) for doc in doc_clean]
    Lda = gensim.models.ldamodel.LdaModel
    ldamodel = Lda(doc_term_matrix, num_topics=12, id2word = dictionary, passes = 100, alpha='auto', update_every=5)
    x = ldamodel.print_topics(num_topics=12, num_words=5)
    y = ldamodel.show_topics(num_topics=12, num_words=5, formatted=False)
    topics_words = [(tp[0], [wd[0] for wd in tp[1]]) for tp in y]
    for topic,words in topics_words:
        #print(" ".join(words).encode('utf-8'))
        #print(words)
        f = open(str(i)+'.txt', 'wb')
        f.write(" ".join(words).encode('utf-8'))
        #f.write(words.encode('utf-8'))
        f.close()

#clean is just a function for cleaning data and it returns text
for i in range(1,12):
    df = parsed[parsed['month'] == i]
    text = df.text
    model(text)
</code></pre>
<p>What am I doing wrong?</p>
<p>Thanks in advance.</p>
<pre><code>with open(str(i)+'.txt', 'wb') as f:
    for topic,words in topics_words:
        # the trailing newline puts each topic on its own line
        f.write((" ".join(words) + "\n").encode('utf-8'))
</code></pre>
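<p>I opened the file once, before the loop, and ran the loop inside the <code>with</code> block, and that solved the problem. Previously the file was reopened in <code>'wb'</code> mode on every iteration, which truncates it each time, so only the last topic was kept.</p>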
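<p>For anyone who wants to try the fix without training a model, here is a minimal sketch. It assumes <code>topics_words</code> is a list of <code>(topic_id, [words])</code> pairs like the one built from <code>show_topics()</code> above. The helper <code>write_topics</code>, the sample words, and the temporary output directory are hypothetical and only for illustration. It also passes the month in explicitly rather than reading the global <code>i</code>, which is a design choice of this sketch and not part of the original fix.</p>

```python
import os
import tempfile


def write_topics(path, topics_words):
    """Write one line per topic: the topic's words joined by spaces."""
    # Open the file once. Mode "w" truncates on open, so opening inside
    # the loop would discard every topic except the last one.
    with open(path, "w", encoding="utf-8") as f:
        for _topic_id, words in topics_words:
            f.write(" ".join(words) + "\n")


# Hypothetical stand-in for the (topic_id, [words]) pairs from show_topics().
topics_words = [
    (0, ["مصر", "جديد", "صورة"]),
    (1, ["لقاح", "تطوير", "أسنان"]),
    (2, ["سيارات", "إسعاف", "مواد"]),
]

out_dir = tempfile.mkdtemp()  # assumption: any writable directory works
month = 9  # passed explicitly instead of relying on a global loop variable
out_path = os.path.join(out_dir, f"{month}.txt")
write_topics(out_path, topics_words)

# Read the file back to confirm there is one line per topic.
with open(out_path, encoding="utf-8") as f:
    lines = f.read().splitlines()
print(len(lines))
```

<p>With 12 topics in <code>topics_words</code>, the same helper would produce the 12-line file per month that the question asks for.</p>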