ValueError: invalid literal for int() with base 10: '3"\r'

Posted 2024-09-27 19:25:45


A sample of my CSV file (test.csv) is shown below. Note: the full test.csv is about 60 MB.

"Position","Value"
"2545600","19"
"2545601","19"
"2545602","19"
"2545603","19"
"2545604","20"
"2545605","20"
"2545606","21"
"2545607","22"
"2545608","21"
"2545609","20"
"2545610","21"
"2545611","18"
"2545612","19"
"2545613","21"
"2545614","21"
"2545615","21"
"2545616","21"
"2545617","22"
"2545618","25"
"2545619","25"

My Python code (test.py) is as follows:

^{pr2}$

My command line:

python test.py test.csv test.igv 5

After running the command, I get this error:

Traceback (most recent call last):
  File "test.py", line 15, in <module>
    line[1] = str(int(line[1])/mil)
ValueError: invalid literal for int() with base 10: '3"\r'

However, if I create a new, empty CSV file (small.csv) and copy/paste just a few lines from test.csv into it, the command runs successfully.

python test.py small.csv small.igv 5

Input small.csv:

"Position","Value"
"2545600","19"
"2545601","19"
"2545602","19"
"2545603","19"
"2545604","20"
"2545605","20"
"2545606","21"
"2545607","22"
"2545608","21"
"2545609","20"

Output small.igv:

chr start   end feature small.igv
gi|255767013|ref|NC_000964.3|   2545600 2545600     3.8
gi|255767013|ref|NC_000964.3|   2545601 2545601     3.8
gi|255767013|ref|NC_000964.3|   2545602 2545602     3.8
gi|255767013|ref|NC_000964.3|   2545603 2545603     3.8
gi|255767013|ref|NC_000964.3|   2545604 2545604     4.0
gi|255767013|ref|NC_000964.3|   2545605 2545605     4.0
gi|255767013|ref|NC_000964.3|   2545606 2545606     4.2
gi|255767013|ref|NC_000964.3|   2545607 2545607     4.4
gi|255767013|ref|NC_000964.3|   2545608 2545608     4.2
gi|255767013|ref|NC_000964.3|   2545609 2545609     4.0

This is exactly what I want. So the question is: why doesn't it work on the larger CSV file?


3 answers

As already suggested, the csv module is more helpful here.

For example:

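import csv

# newline="" lets the csv module deal with \r\n line endings itself (Python 3).
with open("ex.csv", newline="") as f:
    for line in csv.reader(f):
        print(line)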

with the data

^{pr2}$

produces the output

['Position', 'Value']
['2545600', '19']
['2545601', '19']
['2545602', '19']
['2545603', '19']

which is much easier to work with.

The csv module can also write CSV files.
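As a minimal sketch of the writing side (Python 3), the tab-separated .igv lines from the question could be produced with csv.writer. The helper name write_igv and the in-memory buffer are assumptions made for illustration; they are not part of the original script.

```python
import csv
import io


def write_igv(rows, scale, track_name, out):
    """Write (position, value) string pairs as tab-separated IGV-style lines."""
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["chr", "start", "end", "feature", track_name])
    for position, value in rows:
        # The empty string keeps the "feature" column blank, as in the question's output.
        writer.writerow(["gi|255767013|ref|NC_000964.3|", position, position, "", int(value) / scale])


# Hypothetical usage with an in-memory buffer instead of a real file.
buf = io.StringIO()
write_igv([("2545600", "19"), ("2545604", "20")], 5.0, "small.igv", buf)
print(buf.getvalue())
```

When writing to a real file in Python 3, opening it with open(path, "w", newline="") avoids doubled line endings on Windows.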

Try

for line in ..... :
     line = line.strip()

This removes the line ending from the line string.
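The question's script is not preserved on this page, so the following is only a sketch of what stripping does to the value from the traceback. Note that int() already tolerates surrounding whitespace such as \r, so the stray double quote is what actually triggers the ValueError; strip() alone leaves it in place. The helper parse_value is a hypothetical name used for illustration.

```python
def parse_value(field):
    """Convert a raw CSV field that may carry quotes and a line ending to int."""
    # strip() with no arguments removes surrounding whitespace (spaces, \r, \n);
    # strip('"') then removes any double quotes left at either end.
    return int(field.strip().strip('"'))


raw = '3"\r'  # the literal reported in the traceback

try:
    int(raw)
except ValueError as exc:
    print("int() fails:", exc)

print(parse_value(raw))
print(parse_value('"19"\r\n'))
```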

A better solution: use Python's csv module, which takes care of these details.

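In this case it is much better to use the csv module. Each row read from the CSV file is returned as a list of strings, so there is no whitespace or quote stripping to worry about, and you can pass the delimiter as an argument to csv.reader (not needed here, since the comma is the default).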

import csv
import sys

mil = float(sys.argv[3])

# newline='' lets the csv module handle \r\n line endings itself (Python 3).
with open(sys.argv[1], newline='') as f, open(sys.argv[2], 'w') as out:
    out.write('chr\tstart\tend\tfeature\t' + sys.argv[2] + '\n')
    reader = csv.reader(f, delimiter=',')
    headers = next(reader)    # Consider headers separately
    for line in reader:
        # csv.reader has already removed the quotes and the line ending.
        line[1] = str(int(line[1]) / mil)
        out.write('gi|255767013|ref|NC_000964.3|\t' + line[0] + '\t' + line[0] + '\t\t' + line[1] + '\n')

Running python test.py test.csv test.igv 5 && cat test.igv should then show the expected output.
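To check the behaviour without the 60 MB file, here is a small self-contained sketch (Python 3) that feeds csv.reader in-memory data with \r\n line endings, which is presumably what the large file contains given the \r in the traceback. The sample rows and the helper name scaled_rows are assumptions for illustration.

```python
import csv
import io

# Quoted fields with Windows-style \r\n line endings.
data = '"Position","Value"\r\n"2545600","19"\r\n"2545601","3"\r\n'


def scaled_rows(handle, scale):
    """Return (position, value / scale) pairs, skipping the header row."""
    reader = csv.reader(handle)
    next(reader)  # skip the "Position","Value" header
    return [(position, int(value) / scale) for position, value in reader]


# newline="" mirrors open(path, newline="") for a real file.
rows = scaled_rows(io.StringIO(data, newline=""), 5.0)
print(rows)
```

Because csv.reader removes both the quotes and the line terminator, int() receives clean digit strings and no manual stripping is needed.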
