How do I read a CSV that has multiple quoted delimiters inside a single field?



1 answer

Anonymous, 1 day ago

I will only answer the first part of your question: the built-in <code>csv</code> module cannot do this.

Looking at the CPython source code, the <code>quotechar</code> option is <a href="https://github.com/python/cpython/blob/09eb81711597725f853e4f3b659ce185488b0d8c/Modules/_csv.c#L651" rel="nofollow noreferrer">only processed</a> at the start of a field:

<pre class="lang-c prettyprint-override"><code>    case START_FIELD:
        /* expecting field */
        ...
        else if (c == dialect->quotechar &&
                 dialect->quoting != QUOTE_NONE) {
            /* start quoted field */
            self->state = IN_QUOTED_FIELD;
        }
        ...
        break;
</code></pre>

Within a field, <a href="https://github.com/python/cpython/blob/09eb81711597725f853e4f3b659ce185488b0d8c/Modules/_csv.c#L697" rel="nofollow noreferrer">there is no such check</a>:

<pre class="lang-c prettyprint-override"><code>    case IN_FIELD:
        /* in unquoted field */
        if (c == '\n' || c == '\r' || c == '\0') {
            /* end of line - return [fields] */
            if (parse_save_field(self) < 0)
                return -1;
            self->state = (c == '\0' ? START_RECORD : EAT_CRNL);
        }
        else if (c == dialect->escapechar) {
            /* possible escaped character */
            self->state = ESCAPED_CHAR;
        }
        else if (c == dialect->delimiter) {
            /* save field - wait for new field */
            if (parse_save_field(self) < 0)
                return -1;
            self->state = START_FIELD;
        }
        else {
            /* normal character - save in field */
            if (parse_add_char(self, module_state, c) < 0)
                return -1;
        }
        break;
</code></pre>

When the parser is in the <code>IN_QUOTED_FIELD</code> state, it does check for <code>quotechar</code>; however, once it encounters the closing quote, it goes back to the <code>IN_FIELD</code> state, which means we are now in an unquoted field. So this is possible:

<pre><code>>>> import csv
>>> import io
>>> print(next(csv.reader(io.StringIO('"a,b"cd,e'))))
['a,bcd', 'e']
</code></pre>

But once the parser has reached the end of the initial quoted section, it treats any subsequent quote characters as part of the data. I do not know whether this behavior conforms to any (written or unwritten) CSV specification, or whether it is just a bug.
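To make the behavior described above concrete, here is a minimal sketch that compares <code>csv.reader</code> with a hand-written splitter that toggles quoted mode on every quote character, not only at the start of a field. The helper <code>split_quoted_anywhere</code> and the sample line are my own assumptions for illustration, not part of the <code>csv</code> module. It handles single lines only and does not support doubled quotes, escape characters, or embedded newlines.

```python
import csv
import io


def split_quoted_anywhere(line, delimiter=",", quotechar='"'):
    """Hypothetical helper: split one line into fields.

    Every quotechar toggles quoted mode, wherever it appears in the field,
    so delimiters inside any quoted section are kept as data. The quote
    characters themselves are dropped from the output.
    """
    fields = []
    current = []
    in_quotes = False
    for ch in line:
        if ch == quotechar:
            # Open or close a quoted section; do not keep the quote itself.
            in_quotes = not in_quotes
        elif ch == delimiter and not in_quotes:
            # Unquoted delimiter: the current field is complete.
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
    fields.append("".join(current))
    return fields


# Assumed sample: the first field contains two quoted sections.
sample = '"a,b"c"d,e",f'

# The built-in reader only honours the quote at the start of the field;
# the later quotes are plain data, so the second comma splits the field.
builtin_row = next(csv.reader(io.StringIO(sample)))
print(builtin_row)  # ['a,bc"d', 'e"', 'f']

# The sketch treats both quoted sections as quoted.
custom_row = split_quoted_anywhere(sample)
print(custom_row)  # ['a,bcd,e', 'f']
```

Whether dropping the quote characters is the right choice depends on the data; if they need to be preserved, append <code>ch</code> in the quote branch as well.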
