Python with pandas: how do I ignore delimiters inside ""?

Posted 2024-05-19 13:59:25


My CSV file has a header with 16 columns. Each data row contains 16 values separated by ",".

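I just discovered that some rows contain values enclosed in "" that themselves contain a comma. This confuses the parser: instead of the expected 15 commas, it finds 18. Here is an example: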

"23210","Cosmetic","Lancome","Eyes Virtuose Palette Makeup","**7,2g**","W","Decorative range","5x**1,2**g Eye Shadow + **1,2**g Powder","http://image.jpg","","3660732000104","","No","","1","1"

How can I make the parser ignore commas inside ""?
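
To make the mismatch concrete, here is a minimal sketch that counts the fields in the sample row two ways: with a naive split on every comma, and with the standard-library `csv.reader`, which respects double quotes. The row is the one from the question with the emphasis markers removed; the variable names are illustrative only.

```python
import csv
import io

# The sample row from the question, without the emphasis markers.
row = ('"23210","Cosmetic","Lancome","Eyes Virtuose Palette Makeup","7,2g","W",'
       '"Decorative range","5x1,2g Eye Shadow + 1,2g Powder","http://image.jpg",'
       '"","3660732000104","","No","","1","1"')

# Naive splitting treats every comma as a delimiter, quoted or not.
naive_fields = row.split(",")

# csv.reader honours the double quotes, so embedded commas stay inside the field.
quoted_fields = next(csv.reader(io.StringIO(row)))

print(row.count(","))      # 18 commas in total
print(len(naive_fields))   # 19 pieces
print(len(quoted_fields))  # 16 fields
print(quoted_fields[4])    # 7,2g
```

Three of the 18 commas sit inside quoted values, which is exactly the difference between the 15 delimiters the header implies and what a quote-unaware parser sees.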

My code is as follows:

^{pr2}$

1 answer
Anonymous user
#1 · Posted 2024-05-19 13:59:25

Pass the parameter quotechar='"'. From the pandas documentation:

quotechar : str (length 1), optional

The character used to denote the start and end of a quoted item. Quoted items can include the delimiter and it will be ignored.

For example:

In [9]:

import io
import pandas as pd

t='''"23210","Cosmetic","Lancome","Eyes Virtuose Palette Makeup","7,2g","W","Decorative range","5x1,2g Eye Shadow + 1,2g Powder","http://image.jpg","","3660732000104","","No","","1","1"'''
df = pd.read_csv(io.StringIO(t), quotechar='"', header=None)
df
Out[9]:
      0         1        2                             3     4  5   \
0  23210  Cosmetic  Lancome  Eyes Virtuose Palette Makeup  7,2g  W   

                 6                                7                 8   9   \
0  Decorative range  5x1,2g Eye Shadow + 1,2g Powder  http://image.jpg NaN   

              10  11  12  13  14  15  
0  3660732000104 NaN  No NaN   1   1  
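
The same idea with a header row, as a small sketch. The three column names and the shortened row are made up for illustration and are not from the question's file. Note that '"' is already the default value of `quotechar` in `pd.read_csv`, so if quoted commas are still being split, one possible cause (an assumption, since the question's code is not shown) is that quote handling was switched off, for example with `quoting=csv.QUOTE_NONE`. The second call below shows what that looks like.

```python
import csv
import io

import pandas as pd

# Hypothetical three-column sample; the column names are made up for illustration.
data = '''"id","weight","description"
"23210","7,2g","5x1,2g Eye Shadow + 1,2g Powder"
'''

# Explicit quotechar; '"' is also the pandas default.
df = pd.read_csv(io.StringIO(data), quotechar='"')
print(df.shape)             # (1, 3)
print(df.loc[0, "weight"])  # 7,2g

# With quoting disabled, the quotes are treated as ordinary characters,
# so the embedded commas split the data row into more pieces.
raw = pd.read_csv(io.StringIO(data), quoting=csv.QUOTE_NONE,
                  header=None, skiprows=1)
print(raw.shape[1])         # 6
```

If the quoted values in a real file use a different quote character, that character would be passed as `quotechar` instead.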
