Null values in Python when loading a CSV report

Posted 2024-10-03 23:30:05

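I'm having a problem importing a CSV file into Python. Every cell in the .csv file holds a normal value, but something goes wrong while the data is loaded into a DataFrame and null values appear, so I can't work with the data.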

import pandas as pd

excel_path = ...  # path to the CSV file (not given in the original post)
df = pd.read_csv(excel_path, error_bad_lines=False, sep=';', dtype='c')
print(df)

I also tried another approach, with the same result:

import csv

excel_path = ...  # path to the CSV file (not given in the original post)
with open(excel_path, 'r') as csv_file:
    csv_reader = csv.reader(csv_file)

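Do you know how I could change the way the data is loaded into Python? I've checked the existing threads and tried different encodings. My CSV file contains numbers, strings and dates.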

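The nulls don't raise an error. The problem is that they appear at all: the CSV file contains ordinary strings and integers. I need that data, which is why I can't simply skip the nulls.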

Here is what the file looks like as a DataFrame:

     R  Unnamed: 1  Unnamed: 2  Unnamed: 3  Unnamed: 4  Unnamed: 5  
0    NaN         NaN         NaN         NaN         NaN         NaN   
1    NaN         NaN         NaN         NaN         NaN         NaN   
2    NaN         NaN         NaN         NaN         NaN         NaN   
3    NaN         NaN         NaN         NaN         NaN         NaN   
4    NaN         NaN         NaN         NaN         NaN         NaN   
...   ..         ...         ...         ...         ...         ...   
1326 NaN         NaN         NaN         NaN         NaN         NaN   
1327 NaN         NaN         NaN         NaN         NaN         NaN   
1328 NaN         NaN         NaN         NaN         NaN         NaN   
1329 NaN         NaN         NaN         NaN         NaN         NaN   
1330 NaN         NaN         NaN         NaN         NaN         NaN   

When I add dtype='c' to this line of code: df = pd.read_csv(excel_path, error_bad_lines=False), I get this output:

   R Unnamed: 1 Unnamed: 2 Unnamed: 3 Unnamed: 4 Unnamed: 5 Unnamed: 6  \
0     b''        b''        b''        b''        b''        b''        b''   
1     b''        b''        b''        b''        b''        b''        b''   
2     b''        b''        b''        b''        b''        b''        b''   
3     b''        b''        b''        b''        b''        b''        b''   
4     b''        b''        b''        b''        b''        b''        b''   
...   ...        ...        ...        ...        ...        ...        ...   
1326  b''        b''        b''        b''        b''        b''        b''   
1327  b''        b''        b''        b''        b''        b''        b''   
1328  b''        b''        b''        b''        b''        b''        b''   
1329  b''        b''        b''        b''        b''        b''        b''   
1330  b''        b''        b''        b''        b''        b''        b''   

My CSV file looks like this:

RNSP_ID;AUTHOR ID;PRODUCT FAMILY;REQUEST SCOPE;HC Prog;DRAWING NUMBER;VALIDITY;PART TYPE;DOCUMENT OF DEFINITION;CLASSIFICATION;ENGLISH DESIGNATION;FRENCH DESIGNATION;CREATION DATE;
RNSP11701;G700895;Fasteners;Selection;H60;U533A;Serial;Normalised;AS3510;4.7.6;"CABLE SAFETY KIT;"CABLE DE SECURITE;"SICHERUNGSDRAHTKIT;"CABLE SAFETY KIT;17/03/2015 13:38:23;
RNSP11701;G700895;Fasteners;Selection;H60;U533A;Serial;Normalised;AS3510;4.7.6;"CABLE SAFETY KIT;"CABLE DE SECURITE;"SICHERUNGSDRAHTKIT;"CABLE SAFETY KIT;17/03/2015 13:38:23;
RNSP11707;xa434956;Fasteners;Creation;H60;U311A;Serial;Normalised;NSA 551.33;4.8.1;"STUD;"FERMETURE RAPIDE;"VERSCHLUSSZAPFEN;"PASADOR DE CIERRE;19/03/2015 09:28:18;
RNSP11746;xa444992;Fasteners;Use of a new;H60;U...;Serial;Non Aero;ISO7070;4.7.1.1;"NUT HEXA;"ECROU;"NUSS;"NUT;27/03/2015 12:47:53;
RNSP11746;xa444992;Fasteners;Use of a new;H60;U...;Serial;Non Aero;ISO7071;4.7.1.1;"NUT HEXA;"ECROU;"NUSS;"NUT;27/03/2015 12:47:53;
RNSP11747;xa444992;Fasteners;Addition;H60;U...;Serial;Non Aero;DIN950;4.7.1.1;"HANDWHEELS;"VOLANTS;"HANDRADER;"HANDWHEELS;27/03/2015 13:19:24;
RNSP11749;xa444992;Fasteners;Addition;H60;U...;Serial;Non Aero;DIN934;4.2.1.1;"HEXAGONAL NUT;"HEXAGONAL NUT;"SECHSKANTMUTTER;"HEXAGONAL NUT;27/03/2015 13:48:24;
RNSP11749;xa444992;Fasteners;Addition;H60;U...;Serial;Non Aero;DIN934;4.2.1.1;"HEXAGONAL NUT;"HEXAGONAL NUT;"SECHSKANTMUTTER;"HEXAGONAL NUT;27/03/2015 13:48:24;
RNSP11750;xa444992;Fasteners;Addition;H10;U...;Serial;Non Aero;ISO7089;4.3.1;"WASHER, FLAT;"RONDELLE;"SCHEIBE, FLACH;"WASHER, FLAT;27/03/2015 14:01:53;
RNSP11750;xa444992;Fasteners;Addition;H10;U...;Serial;Non Aero;ISO7089;4.3.1;"WASHER, FLAT;"RONDELLE;"SCHEIBE, FLACH;"WASHER, FLAT;27/03/2015 14:01:53;

Thanks


1 answer

Answer 1 · posted 2024-10-03 23:30:05

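I think your CSV file has two problems:

  1. As @TrentonMcKinney pointed out, the number of columns in the data rows differs from the number of headers by two (15 fields per row against 13 headers), and
  2. It contains stray " characters (single double quotes that are never closed), which drive the parser crazy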
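Both problems can be checked directly on the sample lines from the question. The following is only a minimal sketch: the `header` and `row` strings are copied from the post, and `count_fields` is a hypothetical helper introduced here for illustration, not part of pandas or the `csv` module.

```python
# Header line and first data row, copied verbatim from the question.
header = (
    "RNSP_ID;AUTHOR ID;PRODUCT FAMILY;REQUEST SCOPE;HC Prog;DRAWING NUMBER;"
    "VALIDITY;PART TYPE;DOCUMENT OF DEFINITION;CLASSIFICATION;"
    "ENGLISH DESIGNATION;FRENCH DESIGNATION;CREATION DATE;"
)
row = (
    'RNSP11701;G700895;Fasteners;Selection;H60;U533A;Serial;Normalised;AS3510;'
    '4.7.6;"CABLE SAFETY KIT;"CABLE DE SECURITE;"SICHERUNGSDRAHTKIT;'
    '"CABLE SAFETY KIT;17/03/2015 13:38:23;'
)


def count_fields(line: str) -> int:
    """Count ';'-separated fields, ignoring the trailing delimiter (hypothetical helper)."""
    return len(line.rstrip(";").split(";"))


header_fields = count_fields(header)
row_fields = count_fields(row)
quote_count = row.count('"')  # opening quotes that are never closed

print(header_fields, row_fields, quote_count)  # prints: 13 15 4
```

A plain string split is used on purpose: it ignores quoting rules entirely, so the unbalanced quotes cannot distort the field count.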

After removing every " from the file and importing it with the following command, I was able to parse it correctly:

>>> df = pd.read_csv('20210108.csv', skiprows=[0], names=list(range(1, 16)), delimiter=';', index_col=False)

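Details:

  • The " characters were removed from the CSV file
  • The delimiter is set to ;
  • I told pandas to skip the first row (the misaligned headers) and gave the columns numbered headers from 1 to 15 (one missing header is obviously a language, something like "<German?> DESIGNATION"; the other is not obvious, so I replaced the headers with numbers)
  • I also forced pandas to import the file without an index column, because the first column doesn't look like an index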
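The manual clean-up can also be done in code rather than by editing the file by hand. The sketch below assumes the file content is already available as a string: `raw` holds the header and two rows copied from the question, and `cleaned` is an illustrative name. With a real file you would read its text from disk first. The `read_csv` arguments mirror the ones used in the answer.

```python
import io

import pandas as pd

# Header plus two data rows copied from the question (stand-in for the real file).
raw = (
    "RNSP_ID;AUTHOR ID;PRODUCT FAMILY;REQUEST SCOPE;HC Prog;DRAWING NUMBER;"
    "VALIDITY;PART TYPE;DOCUMENT OF DEFINITION;CLASSIFICATION;"
    "ENGLISH DESIGNATION;FRENCH DESIGNATION;CREATION DATE;\n"
    'RNSP11701;G700895;Fasteners;Selection;H60;U533A;Serial;Normalised;AS3510;'
    '4.7.6;"CABLE SAFETY KIT;"CABLE DE SECURITE;"SICHERUNGSDRAHTKIT;'
    '"CABLE SAFETY KIT;17/03/2015 13:38:23;\n'
    'RNSP11707;xa434956;Fasteners;Creation;H60;U311A;Serial;Normalised;'
    'NSA 551.33;4.8.1;"STUD;"FERMETURE RAPIDE;"VERSCHLUSSZAPFEN;'
    '"PASADOR DE CIERRE;19/03/2015 09:28:18;\n'
)

# Step 1: drop the stray double quotes that confuse the parser.
cleaned = raw.replace('"', "")

# Step 2: parse with the same options as in the answer.
df = pd.read_csv(
    io.StringIO(cleaned),
    skiprows=[0],              # skip the misaligned header row
    names=list(range(1, 16)),  # numbered column labels 1..15
    delimiter=";",
    index_col=False,           # do not use the first column as the index
)

print(df.shape)
print(df[[1, 11, 15]])
```

Each line ends with a trailing `;`, which produces an empty sixteenth field. The pandas documentation describes `index_col=False` as the option for this kind of file with delimiters at the end of each line.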

Now the DataFrame looks reasonable:

>>> df
          1         2          3             4    5      6       7   \
0  RNSP11701   G700895  Fasteners     Selection  H60  U533A  Serial   
1  RNSP11701   G700895  Fasteners     Selection  H60  U533A  Serial   
2  RNSP11707  xa434956  Fasteners      Creation  H60  U311A  Serial   
3  RNSP11746  xa444992  Fasteners  Use of a new  H60   U...  Serial   
4  RNSP11746  xa444992  Fasteners  Use of a new  H60   U...  Serial   
5  RNSP11747  xa444992  Fasteners      Addition  H60   U...  Serial   
6  RNSP11749  xa444992  Fasteners      Addition  H60   U...  Serial   
7  RNSP11749  xa444992  Fasteners      Addition  H60   U...  Serial   
8  RNSP11750  xa444992  Fasteners      Addition  H10   U...  Serial   
9  RNSP11750  xa444992  Fasteners      Addition  H10   U...  Serial   

           8           9        10                11                 12  \
0  Normalised      AS3510    4.7.6  CABLE SAFETY KIT  CABLE DE SECURITE   
1  Normalised      AS3510    4.7.6  CABLE SAFETY KIT  CABLE DE SECURITE   
2  Normalised  NSA 551.33    4.8.1              STUD   FERMETURE RAPIDE   
3    Non Aero     ISO7070  4.7.1.1          NUT HEXA              ECROU   
4    Non Aero     ISO7071  4.7.1.1          NUT HEXA              ECROU   
5    Non Aero      DIN950  4.7.1.1        HANDWHEELS            VOLANTS   
6    Non Aero      DIN934  4.2.1.1     HEXAGONAL NUT      HEXAGONAL NUT   
7    Non Aero      DIN934  4.2.1.1     HEXAGONAL NUT      HEXAGONAL NUT   
8    Non Aero     ISO7089    4.3.1      WASHER, FLAT           RONDELLE   
9    Non Aero     ISO7089    4.3.1      WASHER, FLAT           RONDELLE   

                   13                 14                   15  
0  SICHERUNGSDRAHTKIT   CABLE SAFETY KIT  17/03/2015 13:38:23  
1  SICHERUNGSDRAHTKIT   CABLE SAFETY KIT  17/03/2015 13:38:23  
2    VERSCHLUSSZAPFEN  PASADOR DE CIERRE  19/03/2015 09:28:18  
3                NUSS                NUT  27/03/2015 12:47:53  
4                NUSS                NUT  27/03/2015 12:47:53  
5           HANDRADER         HANDWHEELS  27/03/2015 13:19:24  
6     SECHSKANTMUTTER      HEXAGONAL NUT  27/03/2015 13:48:24  
7     SECHSKANTMUTTER      HEXAGONAL NUT  27/03/2015 13:48:24  
8      SCHEIBE, FLACH       WASHER, FLAT  27/03/2015 14:01:53  
9      SCHEIBE, FLACH       WASHER, FLAT  27/03/2015 14:01:53
