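<p>I want to read a CSV file in Python 3, but due to some restrictions I cannot use any libraries. In almost every row, one or more columns contain a comma (","), so <code>row.split(',')</code> produces more columns than expected.</p>
<p>My code is:</p>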
<pre><code>file_name = "train_1.csv"
columns = [
    "PassengerId",
    "Survived",
    "Pclass",
    "Name",
    "Sex",
    "Age",
    "SibSp",
    "Parch",
    "Ticket",
    "Fare",
    "Cabin",
    "Embarked"
]
print("Total columns should be: {}".format(len(columns)))
with open(file_name, 'r') as reader:
    for line in reader.readlines():
        # Naive split: this also splits on commas inside quoted fields
        row_data = line.split(',')
        if len(row_data) != len(columns):
            print('This row does not have the required # of columns: {}'.format(
                len(row_data)))
            print(row_data)
</code></pre>
<p>My (incorrect) output is:</p>
<pre><code>['1', '0', '3', '"Braund', ' Mr. Owen Harris"', 'male', '22', '1', '0', 'A/5 21171', '7.25', '', 'S\n']
</code></pre>
<p>Instead, it should be:</p>
<pre><code>['1', '0', '3', '"Braund, Mr. Owen Harris"', 'male', '22', '1', '0', 'A/5 21171', '7.25', '', 'S']
</code></pre>
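<p>The extra column appears because the name is split into two fields instead of one. The last field also keeps a trailing <code>\n</code>.</p>
<p>My main concern, however, is the extra column produced by the split. Note: a CSV reader solves this problem, but because of the library restriction I cannot use one.</p>
<p>Part of the input is:</p>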
<pre><code>PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
1,0,3,"Braund, Mr. Owen Harris",male,22,1,0,A/5 21171,7.25,,S
2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Thayer)",female,38,1,0,PC 17599,71.2833,C85,C
3,1,3,"Heikkinen, Miss. Laina",female,26,0,0,STON/O2. 3101282,7.925,,S
4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35,1,0,113803,53.1,C123,S
</code></pre>
<p>The full data is available <a href="https://pastebin.com/FxZPzP23" rel="nofollow noreferrer">here</a>.</p>
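<p>For reference, one possible direction without any imports is to walk each line character by character and only treat a comma as a separator when it is outside double quotes. The sketch below is an illustration under assumptions, not a full CSV parser: the function name <code>split_csv_line</code> is hypothetical, it assumes double quotes are the quoting character, that a doubled quote (<code>""</code>) inside a quoted field means a literal quote, and that no quoted field spans more than one line. Unlike the expected output shown above, it drops the surrounding quotes from the name field, which is also what a typical CSV reader does.</p>

```python
def split_csv_line(line, delimiter=",", quote='"'):
    """Split one CSV line on delimiters that are outside quoted fields.

    Hypothetical helper: assumes fields never contain embedded newlines.
    """
    line = line.rstrip("\r\n")  # drop the trailing newline kept by readlines()
    fields = []
    current = []        # characters of the field being built
    in_quotes = False
    i = 0
    while i < len(line):
        ch = line[i]
        if in_quotes:
            if ch == quote:
                if i + 1 < len(line) and line[i + 1] == quote:
                    current.append(quote)  # doubled quote -> literal quote
                    i += 1
                else:
                    in_quotes = False      # closing quote
            else:
                current.append(ch)         # commas here are plain text
        elif ch == quote:
            in_quotes = True               # opening quote
        elif ch == delimiter:
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    fields.append("".join(current))        # last field has no trailing comma
    return fields


sample = '1,0,3,"Braund, Mr. Owen Harris",male,22,1,0,A/5 21171,7.25,,S\n'
row = split_csv_line(sample)
print(len(row))  # 12
print(row[3])    # Braund, Mr. Owen Harris
```

<p>In the loop from the question, <code>line.split(',')</code> could then be swapped for <code>split_csv_line(line)</code>. If the file may contain quoted fields with line breaks inside them, this per-line approach would not be enough.</p>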