How to access a specific cell of a CSV file in Python

Published 2024-09-27 00:21:23

I have the following CSV file:

Sample,Forward,Reverse
Micro1,EF30159600_EF30159600,EF30159601_EF30159601
Micro2,EF30159603_EF30159603,EF30159604_EF30159604
PseudaA,EF30159607_EF30159607,EF30159608_EF30159608

And here is my code:

#!/miniconda/bin/python
import csv

with open("/home/lamma/local-blast/scripts/test.csv", 'r') as file:
    samples = csv.reader(file)
    for row in samples:
        print(row[1])

I would like to be able to print a whole row, for example:

Micro2,EF30159603_EF30159603,EF30159604_EF30159604

instead of:

Sample
Micro1
Micro2
PseudaA

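which is what currently happens.

I would also like to iterate over the CSV file and pull out the individual values in each column, for example EF30159603_EF30159603. I want this because the values in columns 2 and 3 are file names, and I need to rename those files using the value in column 1.

Any help would be greatly appreciated :)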

Edit:

Adding the final code, written with help from @accdias:

import csv
import os

# args is assumed to be parsed elsewhere in the script (e.g. with argparse)
with open(args.csv, 'r') as file:
    samples = csv.reader(file)
    for row in list(samples)[1:]:  # [1:] skips the header row
        os.rename(args.path + '/' + row[1] + '.seq', args.path + '/' + row[0] + '_1.fasta')
        os.rename(args.path + '/' + row[2] + '.seq', args.path + '/' + row[0] + '_2.fasta')

3 Answers

Try DictReader:

#!/miniconda/bin/python
import csv
with open("/home/lamma/local-blast/scripts/test.csv", 'r') as file:
    samples = csv.DictReader(file)
    for idx, row in enumerate(samples):
        print(*row.values(), sep="\t")
        print("Row", idx, "Forward: ", row["Forward"])
        print("Row", idx, "Reverse: ", row["Reverse"])
        # you can do anything with this row's values

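The reader turns each row into a dictionary keyed by the column headers (an OrderedDict on Python 3.6 and 3.7, a plain dict from 3.8 onward), so you can access the values by name.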
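
To print one specific row, such as the Micro2 line asked about in the question, the same DictReader approach can be combined with a filter on the Sample column. This is only a minimal sketch: `find_row` and `CSV_TEXT` are hypothetical names introduced here, and the CSV is inlined through `io.StringIO` so the example runs without the original file.

```python
import csv
import io

# Inline copy of the CSV from the question, so the sketch runs without the file.
CSV_TEXT = """Sample,Forward,Reverse
Micro1,EF30159600_EF30159600,EF30159601_EF30159601
Micro2,EF30159603_EF30159603,EF30159604_EF30159604
PseudaA,EF30159607_EF30159607,EF30159608_EF30159608
"""


def find_row(csv_text, sample_name):
    """Return the first row whose Sample column equals sample_name, or None."""
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        if row["Sample"] == sample_name:
            return row
    return None


micro2 = find_row(CSV_TEXT, "Micro2")
print(",".join(micro2.values()))  # Micro2,EF30159603_EF30159603,EF30159604_EF30159604
print(micro2["Forward"])          # EF30159603_EF30159603
```

With a real file, `io.StringIO(csv_text)` would be replaced by the file object returned from `open(...)`.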

If you work with data files often, I suggest using the pandas library. With pandas you can do the following:

import pandas as pd

df = pd.read_csv('/home/lamma/local-blast/scripts/test.csv')
# access row by index
row = df.iloc[1]
# convert the pandas series to a list
row_list = row.tolist()
# access cell by index
cell = df.iloc[1, 1]

In general, it makes working with tabular data much easier.

See the documentation for iloc and loc.
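
As a rough illustration of the difference between the two, the sketch below reads the same cell once by position with `iloc` and once by label with `loc`. It assumes pandas is installed; `CSV_TEXT`, `by_sample` and the two `cell_*` names are placeholders made up for this example, and the CSV is inlined so the snippet does not depend on the original file.

```python
import io

import pandas as pd

# Inline copy of the CSV from the question.
CSV_TEXT = """Sample,Forward,Reverse
Micro1,EF30159600_EF30159600,EF30159601_EF30159601
Micro2,EF30159603_EF30159603,EF30159604_EF30159604
PseudaA,EF30159607_EF30159607,EF30159608_EF30159608
"""

df = pd.read_csv(io.StringIO(CSV_TEXT))

# iloc is position-based: data row 1, column 1 (both zero-based).
cell_by_position = df.iloc[1, 1]

# loc is label-based: use the Sample column as the index, then name the cell.
by_sample = df.set_index("Sample")
cell_by_label = by_sample.loc["Micro2", "Forward"]

print(cell_by_position)  # EF30159603_EF30159603
print(cell_by_label)     # EF30159603_EF30159603
```

Looking cells up by label tends to be more robust than by position if rows are later added or reordered.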

You can also easily select columns as lists:

import pandas as pd

df = pd.read_csv('/home/lamma/local-blast/scripts/test.csv')

# get the columns as lists
samples = df['Sample'].tolist()
forward = df['Forward'].tolist()
reverse = df['Reverse'].tolist()

To summarize my comments, you can iterate over the rows of the CSV file in a more intuitive way like this:

import csv

csvfile = '/home/lamma/local-blast/scripts/test.csv'

with open(csvfile) as f:
    rows = csv.reader(f)

    headers = next(rows)

    for sample, forward, reverse in rows:
        # do something with sample, forward, and reverse
        print(sample, forward, reverse)
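
If a single cell is needed rather than a loop over all rows, one option is to read the rows into a list and index it directly. The sketch below is an illustration under that assumption, not part of the original answer; `CSV_TEXT`, `header`, `data` and `reverse_col` are names chosen for this example, and the CSV is inlined so it runs on its own.

```python
import csv
import io

# Inline copy of the CSV from the question.
CSV_TEXT = """Sample,Forward,Reverse
Micro1,EF30159600_EF30159600,EF30159601_EF30159601
Micro2,EF30159603_EF30159603,EF30159604_EF30159604
PseudaA,EF30159607_EF30159607,EF30159608_EF30159608
"""

# list() reads every row into memory, which is fine for small files like this.
rows = list(csv.reader(io.StringIO(CSV_TEXT)))
header, data = rows[0], rows[1:]

# data[1] is the second data row; data[1][1] is its Forward cell.
print(",".join(data[1]))  # Micro2,EF30159603_EF30159603,EF30159604_EF30159604
print(data[1][1])         # EF30159603_EF30159603

# Look the column position up by name instead of hard-coding it.
reverse_col = header.index("Reverse")
print(data[1][reverse_col])  # EF30159604_EF30159604
```

For large files, iterating as shown above avoids holding every row in memory at once.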

Here is an alternative that updates your code to use pathlib and f-strings (which, as I said in the comments, require Python 3.6+):

import csv
from pathlib import Path

# I'm assuming you are processing args somewhere else
# in your code
path = Path(args.path)
csvfile = Path(args.csv)

with csvfile.open() as f:
    rows = csv.reader(f)

    headers = next(rows)

    # avoid using generic indexed elements of rows
    # for clarity in the code
    for sample, forward, reverse in rows:
        # process forward sample file
        seq = path / f'{forward}.seq'
        fasta = path / f'{sample}_1.fasta'

        if seq.exists():
            seq.rename(fasta)

        # process reverse sample file
        seq = path / f'{reverse}.seq'
        fasta = path / f'{sample}_2.fasta'

        if seq.exists():
            seq.rename(fasta)
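
Because renaming is hard to undo, it can be worth previewing which files would be renamed before touching anything. The following is a self-contained sketch of that idea run against a temporary directory: `plan_renames` and `CSV_TEXT` are hypothetical names, the CSV is inlined, and only one of the six expected `.seq` files is created to show that missing files are skipped.

```python
import csv
import io
import tempfile
from pathlib import Path

# Inline copy of the CSV from the question.
CSV_TEXT = """Sample,Forward,Reverse
Micro1,EF30159600_EF30159600,EF30159601_EF30159601
Micro2,EF30159603_EF30159603,EF30159604_EF30159604
PseudaA,EF30159607_EF30159607,EF30159608_EF30159608
"""


def plan_renames(csv_text, directory):
    """Return (source, target) path pairs for every .seq file that exists."""
    rows = csv.reader(io.StringIO(csv_text))
    next(rows)  # skip the header row
    plan = []
    for sample, forward, reverse in rows:
        for suffix, name in (("_1", forward), ("_2", reverse)):
            seq = directory / f"{name}.seq"
            if seq.exists():
                plan.append((seq, directory / f"{sample}{suffix}.fasta"))
    return plan


with tempfile.TemporaryDirectory() as tmp:
    workdir = Path(tmp)
    # Create a single input file; the other five are deliberately missing.
    (workdir / "EF30159603_EF30159603.seq").touch()

    plan = plan_renames(CSV_TEXT, workdir)
    for source, target in plan:
        print(source.name, "->", target.name)  # preview before renaming
        source.rename(target)

    renamed = sorted(p.name for p in workdir.iterdir())

print(renamed)  # ['Micro2_1.fasta']
```

Separating the planning step from the actual rename makes it easy to print the plan first and only call `rename` once it looks right.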

I hope that helps.
