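All-
This is my first time asking a question here, so if the formatting is poor, please tell me how to improve my question.
I am looking for an explanation of the pandas.read_csv() function.
Below is a sample of the raw data I am trying to read with Python: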
MiniSonde 5 43656
"Log File Name : lwrhyp_deploy_20170104"
"Setup Date (MMDDYY) : 010417"
"Setup Time (HHMMSS) : 114539"
"Starting Date (MMDDYY) : 010417"
"Starting Time (HHMMSS) : 140000"
"Stopping Date (MMDDYY) : 123169"
"Stopping Time (HHMMSS) : 235959"
"Interval (HHMMSS) : 010000"
"Sensor warmup (HHMMSS) : 000100"
"Circltr warmup (HHMMSS) : 000030"
"Date","Time","","Temp","","SpCond","","Sal","","Dep25","","TDG","","TDG","","LDO%","","LDO","","IBatt",""
"MMDDYY","HHMMSS","","øC","","mS/cm","","ppt","","meters","","mmHg","","psia","","Sat","","mg/l","","Volts",""
01/04/17,14:00:00,"",7.97,"",.0691,"",.02,"",.75,"",735,"",14.22,"",52.7,"",6.15,"",11.4,""
01/04/17,15:00:00,"",7.9,"",.0692,"",.02,"",.76,"",736,"",14.23,"",52.8,"",6.17,"",11.4,""
01/04/17,16:00:00,"",7.89,"",.0694,"",.02,"",.77,"",736,"",14.23,"",52.3,"",6.12,"",11.4,""
01/04/17,17:00:00,"",7.88,"",.0699,"",.02,"",.78,"",735,"",14.21,"",51.8,"",6.06,"",11.4,""
01/04/17,18:00:00,"",7.85,"",.0699,"",.02,"",.78,"",733,"",14.18,"",51.3,"",6.01,"",11.4,""
01/04/17,19:00:00,"",7.83,"",.0706,"",.02,"",.78,"",731,"",14.14,"",51.3,"",6.01,"",11.4,""
01/04/17,20:00:00,"",7.81,"",.0706,"",.02,"",.79,"",730,"",14.12,"",51.1,"",5.99,"",11.4,""
01/04/17,21:00:00,"",7.81,"",.0699,"",.02,"",.79,"",730,"",14.11,"",50.8,"",5.95,"",11.4,""
01/04/17,22:00:00,"",7.76,"",.0702,"",.02,"",.8,"",729,"",14.1,"",50.5,"",5.92,"",11.3,""
01/04/17,23:00:00,"",7.76,"",.0704,"",.02,"",.8,"",729,"",14.09,"",50.5,"",5.93,"",11.3,""
01/05/17,00:00:00,"",7.76,"",.07,"",.02,"",.8,"",729,"",14.09,"",50.5,"",5.92,"",11.3,""
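I am trying to use either the row beginning with "Date" or the row beginning with "MMDDYY" as the header row. When I open the raw data in a text editor, the row beginning with "Date" is line 14, which would be line 13 in zero-indexed Python land.
I used the following code, thinking it would skip the first 12 lines and start reading data at line 13:
^{pr2}$
But this produced the error: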
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf8 in position 0: invalid start byte
After some trial and error, I found that the following code produces the kind of result I am after, but I don't understand why it works:
test = pd.read_csv(filepath, skiprows=[14], header=11, skip_blank_lines=True)
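I don't understand how the lines are being counted. The header row is not on line 11, it is on line 13; or am I wrong about that? The code only works with skiprows=[14]; why is that?
On a side note, is there a way to prevent the blank columns in the raw data from being read into the DataFrame?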
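First, skiprows does not do what you think it does. When you give it a list as input, it skips exactly those rows while parsing the file. For what you want, just use header.
Second, pandas zero-indexes the rows of the file.
Third, when you have skip_blank_lines=True, it appears to re-index the rows of the file before it considers the header value. So in your example, the blank lines 11 and 12, which sit just before the header row (that is, after the header information), are not counted. Keeping in mind that pandas zero-indexes the file rows, we can see that the header row ends up at header=11.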
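The counting described above can be checked on a small in-memory copy of the file. This is only a sketch under stated assumptions: the sample is shortened to three data columns, the two blank lines before the "Date" row are assumed from the line numbers given in the question, and the file is assumed to be Latin-1 encoded, since byte 0xf8 is "ø" in that encoding and is not valid on its own in UTF-8. That would also explain why skiprows=[14] matters: zero-indexed raw line 14 is the units row, the only row containing "øC". The name filter at the end is one possible way to drop the blank columns, which pandas labels "Unnamed: N".

```python
import io
import pandas as pd

# Shortened stand-in for the logger file. The two empty strings are the
# assumed blank lines, so "Date" sits at zero-indexed raw line 13.
lines = [
    'MiniSonde 5 43656',
    '"Log File Name : lwrhyp_deploy_20170104"',
    '"Setup Date (MMDDYY) : 010417"',
    '"Setup Time (HHMMSS) : 114539"',
    '"Starting Date (MMDDYY) : 010417"',
    '"Starting Time (HHMMSS) : 140000"',
    '"Stopping Date (MMDDYY) : 123169"',
    '"Stopping Time (HHMMSS) : 235959"',
    '"Interval (HHMMSS) : 010000"',
    '"Sensor warmup (HHMMSS) : 000100"',
    '"Circltr warmup (HHMMSS) : 000030"',
    '',
    '',
    '"Date","Time","","Temp","","SpCond",""',
    '"MMDDYY","HHMMSS","","øC","","mS/cm",""',
    '01/04/17,14:00:00,"",7.97,"",.0691,""',
    '01/04/17,15:00:00,"",7.9,"",.0692,""',
]
raw_bytes = ("\n".join(lines) + "\n").encode("latin-1")  # assumed encoding

df = pd.read_csv(
    io.BytesIO(raw_bytes),
    header=11,          # counted after blank lines are dropped
    skiprows=[14],      # raw zero-indexed line number: the units row
    skip_blank_lines=True,
    encoding="latin-1",
)

# Empty header cells become "Unnamed: N"; keep only the named columns.
clean = df.loc[:, ~df.columns.str.startswith("Unnamed")]
print(clean)
```

With an explicit encoding the units row no longer has to be skipped to avoid the decode error, although it would then be read as the first data row unless it is skipped or removed afterwards.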