Python: invalid literal for int() with base 10

Posted 2024-09-27 22:33:16

I have an encoding problem. I wrote Python code that reads a text file, splits each line into words, and puts those words into an array. Here is my code:

from whoosh import fields, index
import os.path
import csv
import codecs

# This list associates a name with each position in a row
columns = ["juza","chapter","verse","voc"]

schema = fields.Schema(juza=fields.NUMERIC,
                       chapter=fields.NUMERIC,
                       verse=fields.NUMERIC,
                       voc=fields.TEXT)

# Create the Whoosh index
indexname = "indexdir"
if not os.path.exists(indexname):
  os.mkdir(indexname)
ix = index.create_in(indexname, schema)

# Open a writer for the index
with ix.writer() as writer:
  # Open the CSV file
  with open("tt.txt", 'r') as txtfile:
    # Create a csv reader object for the file
    #csvreader = csv.reader(csvfile,delimiter='\t')
    lines=txtfile.readlines()

    # Read each row in the file
    for i in lines:

      # Create a dictionary to hold the document values for this row
      doc = {}
      thisline=i.split()
      u=0

      # Read the values for the row enumerated like
      # (0, "juza"), (1, "chapter"), etc.
      for w in thisline:

        # Get the field name from the "columns" list
          fieldname = columns[u]
          type(w)
          u+=1

        # Strip any whitespace and convert to unicode
        # NOTE: you need to pass the right encoding here!
        #value = unicode(value.strip(), "utf-8")

        # Put the value in the dictionary
          doc[fieldname] =u"w"

      # Pass the dictionary to the add_document method
      writer.add_document(**doc)
    writer.commit()

But I run into the error named in the title: invalid literal for int() with base 10.

Here is a sample of the input text file:

1   1   1   hala
1   1   2   yomna
1   1   3   hala
1   1   4   yomna
1   2   1   hala
1   2   2   yoomna

2 Answers

What you are doing:

doc[fieldname] =u"w"

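You are assigning the literal unicode string "w" to the field named by fieldname, not the value held in the variable w. If that field is numeric (juza, chapter, and verse are declared as fields.NUMERIC), this will obviously fail.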
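To see why the literal fails, here is a minimal sketch written for Python 3 (the question's code is Python 2, as the u"w" literal suggests). It assumes the NUMERIC field ends up calling int() on the value, which is what the error in the title suggests; the helper name to_int_or_error is made up for illustration.

```python
# Hypothetical helper, not part of Whoosh: shows what int() does with each value.
def to_int_or_error(text):
    """Return int(text), or the ValueError message if the conversion fails."""
    try:
        return int(text)
    except ValueError as exc:
        return str(exc)

print(to_int_or_error("1"))  # a real value from the file converts fine -> 1
print(to_int_or_error("w"))  # -> invalid literal for int() with base 10: 'w'
```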

Instead, assign the variable w itself (without quotes) rather than the string literal.
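A minimal sketch of that fix, in Python 3 and without the Whoosh dependency so it runs on its own. The helper parse_line and the NUMERIC_COLUMNS set are hypothetical names introduced here, and the explicit int() conversion is an added assumption rather than something the answer requires.

```python
# Column names taken from the question's "columns" list.
COLUMNS = ["juza", "chapter", "verse", "voc"]
# Assumption: these three match the fields declared as fields.NUMERIC.
NUMERIC_COLUMNS = {"juza", "chapter", "verse"}

def parse_line(line):
    """Build the document dict for one whitespace-separated input line."""
    doc = {}
    for fieldname, w in zip(COLUMNS, line.split()):
        # Use the variable w, not the literal "w".
        doc[fieldname] = int(w) if fieldname in NUMERIC_COLUMNS else w
    return doc

print(parse_line("1   1   2   yomna"))
# {'juza': 1, 'chapter': 1, 'verse': 2, 'voc': 'yomna'}
```

In the question's code, the resulting dictionary would then be passed on as writer.add_document(**doc).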

Try converting hala and the other elements to strings, using the str() function.
