pandas read_csv error: Expected 1 fields, saw 9

Posted 2024-10-17 02:33:06

I am trying to parse a .csv file:

planets = pd.read_csv("planets.csv", sep=',')

But I keep getting this error:

ParserError: Error tokenizing data. C error: Expected 1 fields in line 13, saw 9

The first few lines of my csv file look like this:

# This file was produced by the test
# Tue Apr  3 06:03:27 2018
#
# COLUMN pl_hostname:    Host Name
# COLUMN pl_discmethod:  Discovery Method
# COLUMN pl_pnum:        Number of Planets in System
# COLUMN pl_orbper:      Orbital Period [days]
# COLUMN pl_orbsmax:     Orbit Semi-Major Axis [AU])
# COLUMN st_dist:        Distance [pc]
# COLUMN st_teff:        Effective Temperature [K]
# COLUMN st_mass:        Stellar Mass [Solar mass] 
#
loc_rowid,pl_hostname,pl_discmethod,pl_pnum,pl_orbper,pl_orbsmax,st_dist,st_teff,st_mass
1,11 Com,Radial Velocity,1,326.03000000,1.290000,110.62,4742.00,2.70
2,11 UMi,Radial Velocity,1,516.22000000,1.540000,119.47,4340.00,1.80
3,14 And,Radial Velocity,1,185.84000000,0.830000,76.39,4813.00,2.20
4,14 Her,Radial Velocity,1,1773.40000000,2.770000,18.15,5311.00,0.90
5,16 Cyg B,Radial Velocity,1,798.50000000,1.681000,21.41,5674.00,0.99
6,18 Del,Radial Velocity,1,993.30000000,2.600000,73.10,4979.00,2.30
7,1RXS J160929.1-210524,Imaging,1,,330.000000,145.00,4060.00,0.85

Edit: this is line 13:

loc_rowid,pl_hostname,pl_discmethod,pl_pnum,pl_orbper,pl_orbsmax,st_dist,st_teff,st_mass

Edit: thanks to @Rakesh, skipping the first 12 lines solved the problem:

planets = pd.read_csv("planets.csv", sep=',', skiprows=12)


3 answers

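The function pandas.read_csv takes the number of columns and their names from the first line. By default, it does not consider the possibility that the first lines are comments.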

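What happens is that pandas reads the first line, splits it, and finds only one column, rather than doing this with line 13, the first line that is not a comment. It then expects one field per row, so line 13 with its 9 fields raises the error. To solve this, you can use the comment argument: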

planets = pd.read_csv("planets.csv", comment='#')

Compared with skiprows, this lets the same code load planets.csv files even when the number of comment lines differs.
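To try the comment argument without the original file, here is a minimal sketch. The `sample` string is a shortened, hypothetical stand-in for planets.csv (three comment lines, three columns, two data rows), read from memory through io.StringIO.

```python
import io

import pandas as pd

# Hypothetical, shortened stand-in for planets.csv.
sample = """# This file was produced by the test
# COLUMN pl_hostname:    Host Name
#
loc_rowid,pl_hostname,pl_discmethod
1,11 Com,Radial Velocity
2,11 UMi,Radial Velocity
"""

# Lines that start with '#' are ignored, so the header is taken from
# the first line that is not a comment.
planets = pd.read_csv(io.StringIO(sample), comment="#")

print(planets.columns.tolist())
print(len(planets))
```

One thing to keep in mind: comment='#' also cuts off the rest of any data line at a '#' character, so it is only safe when '#' never appears inside the values.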

It looks like you need skiprows. You can skip all the comment lines.

For example:

planets = pd.read_csv("planets.csv", sep=',', skiprows=12)
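If you would rather not hard-code the 12, one possible sketch is to count the leading comment lines first and pass that number to skiprows. The helper `count_leading_comments` and the `sample` string are hypothetical, introduced only for this illustration, and the approach assumes all comment lines sit at the top of the file.

```python
import io
import itertools

import pandas as pd


def count_leading_comments(lines, prefix="#"):
    """Count the consecutive lines at the start that begin with prefix."""
    leading = itertools.takewhile(lambda line: line.startswith(prefix), lines)
    return sum(1 for _ in leading)


# Hypothetical, shortened stand-in for planets.csv.
sample = """# This file was produced by the test
# COLUMN pl_hostname:    Host Name
#
loc_rowid,pl_hostname,pl_discmethod
1,11 Com,Radial Velocity
2,11 UMi,Radial Velocity
"""

n_skip = count_leading_comments(io.StringIO(sample))
planets = pd.read_csv(io.StringIO(sample), sep=",", skiprows=n_skip)

print(n_skip)
print(planets.columns.tolist())
```

With a real file you would open it once to count (for example `with open("planets.csv") as f: n_skip = count_leading_comments(f)`) and then call read_csv with that value.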

When I could not figure out the exact cause of the error, I got it to work with:

planets = pd.read_csv('planets.csv', sep=',', on_bad_lines='skip')  # pandas >= 1.3; earlier versions: error_bad_lines=False
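A word of caution on skipping bad lines (error_bad_lines=False in older pandas, on_bad_lines='skip' from 1.3 on): it removes the error but not its cause. The sketch below, which assumes pandas 1.3 or later and uses a hypothetical shortened sample, suggests that for a file like the one in the question the data rows are the ones that get dropped, because the expected field count comes from the comment lines.

```python
import io

import pandas as pd

# Hypothetical, shortened stand-in for planets.csv.
sample = """# This file was produced by the test
# COLUMN pl_hostname:    Host Name
#
loc_rowid,pl_hostname,pl_discmethod
1,11 Com,Radial Velocity
2,11 UMi,Radial Velocity
"""

# Skipping "bad" lines: the first comment line becomes the single-column
# header, and every line with more fields than expected is discarded.
skipped = pd.read_csv(io.StringIO(sample), sep=",", on_bad_lines="skip")
print(skipped.shape)
print(skipped.columns.tolist())

# For comparison, ignoring the comment lines keeps the real columns.
cleaned = pd.read_csv(io.StringIO(sample), comment="#")
print(cleaned.shape)
```

Comparing the two shapes is a quick way to check whether rows were silently lost.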
