pandas dataframe: read a CSV, extract specific columns, and keep the whole line as a string

User

Reply #1 · edited 2024-10-01 02:25:11

I moved conn.commit() outside the for loop. That cut the load time to a few minutes, although I suspect it is less safe.
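A minimal sketch of that change, assuming the database is SQLite accessed through the standard sqlite3 module (the thread does not say which driver is used). The in-memory database and the two sample rows are made up for illustration; the table and column names are taken from the INSERT statement in this thread.

```python
import sqlite3

# Hypothetical in-memory database and sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE mytable (mycol1id TEXT, mycol15id TEXT, wholeline TEXT)")

rows = [
    ("1", "15", "1^a^15"),
    ("2", "25", "2^b^25"),
]

try:
    for col1id, col15id, line in rows:
        cur.execute(
            "INSERT INTO mytable (mycol1id, mycol15id, wholeline) VALUES (?, ?, ?)",
            (col1id, col15id, line),
        )
    conn.commit()  # one commit for the whole batch instead of one per row
except sqlite3.Error:
    conn.rollback()  # a failure discards every uncommitted row in the batch
    raise

row_count = cur.execute("SELECT COUNT(*) FROM mytable").fetchone()[0]
```

The trade-off is the one hinted at above: if anything fails partway through, every row since the last commit is lost, so the load has to be restarted from that point.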

Anyway, thanks for your help.

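User

Reply #2 · edited 2024-10-01 02:25:11

Read each whole line into a single-column df by passing a dummy separator that never occurs in the data (I use & below). Then read the file a second time with usecols, giving the indices of columns 1 and 15, and add those two columns to the first df.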

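import pandas as pd

# First pass: "&" is assumed not to occur in the data, so each line lands in one column.
my_df_full = pd.read_csv("tablefile.txt", sep="&", lineterminator="\r", low_memory=False)
my_df_full.columns = ['full_line']

# Second pass: split on the real separator "^" and keep only the two columns of interest.
my_df_cols = pd.read_csv("tablefile.txt", sep="^", lineterminator="\r", low_memory=False, usecols=[1,15])

my_df_full[['col1', 'col15']] = my_df_cols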
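A single-pass variant of the same idea, as a rough sketch rather than a drop-in replacement: read the raw lines once, then split them with Series.str.split. The sample data below is made up and has only three fields, so the positions 0 and 2 and the names col_a and col_b are placeholders to adjust to the real file.

```python
import io

import pandas as pd

# Made-up sample data: three "^"-separated fields per line.
sample = "1^a^15\n2^b^25\n"

# Keep every line untouched in a one-column frame.
lines = [ln.rstrip("\r\n") for ln in io.StringIO(sample)]
df = pd.DataFrame({"full_line": lines})

# Split each line on the literal "^" and pick fields by position.
parts = df["full_line"].str.split("^", expand=True)
df["col_a"] = parts[0]  # hypothetical position of the first wanted field
df["col_b"] = parts[2]  # hypothetical position of the second wanted field
```

For a real file, replace io.StringIO(sample) with an open file handle. The extracted values are strings here; convert them with astype if numeric columns are needed.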

User

Reply #3 · edited 2024-10-01 02:25:11

First, you can compile the regular expressions once so that they are not re-parsed for every line.

import re

# Compile once, outside the loop. Raw strings keep \d and \^ from being treated as string escapes.
reCol1id = re.compile(r'^(\d+)\^')
reCol15id = re.compile(r'^.*\^.*\^(\d+)\^.*\^.*\^.*\^.*\^.*\^.*\^.*\^.*\^.*\^.*\^.*')

# cur and conn are an already-open database cursor and connection.
count_1 = 0
with open('tablefile.txt') as f:
    for line in f:
        if count_1 > 70:  # stop after the first 71 rows
            break
        col1id = reCol1id.findall(line)[0]
        col15id = reCol15id.findall(line)[0]
        line = line.strip()

        count_1 += 1

        cur.execute('''INSERT INTO mytable (mycol1id, mycol15id, wholeline) VALUES (?, ?, ?)''',
                    (col1id, col15id, line))

        conn.commit()  # commits after every row
        print('row count_1=', count_1)
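If the long pattern for the second id turns out to be hard to maintain, splitting on the separator is a possible alternative. This is only a sketch: extract_ids is a hypothetical helper, and the default positions 0 and 2 fit the made-up three-field sample line, not necessarily the real file.

```python
def extract_ids(line, first=0, second=2, sep="^"):
    """Return (field at `first`, field at `second`, stripped line) for one record."""
    line = line.strip()
    fields = line.split(sep)  # plain string split, no regex involved
    return fields[first], fields[second], line


# Made-up sample line with three "^"-separated fields.
col1id, col15id, whole = extract_ids("1^a^15\n")
```

The returned tuple has the same shape as the parameters passed to cur.execute above, so it could be handed to the INSERT directly.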
