Cleaning a dataset with Python

1 answer

User

#1 · Posted 2024-09-27 22:08:25

The problem is if some tweets contain commas, I don't want this character to be removed due to the splitting.

The csv module in Python's standard library handles this case well:

>>> import csv
>>> s = '''15,Oct 11,785816454042124288,/realDonaldTrump/status/785816454042124288,False,"Despite winning the second debate in a landslide (every poll), it is hard to do well when Paul Ryan and others give zero support!",DonaldTrump
16,Oct 10,785563318652178432,/realDonaldTrump/status/785563318652178432,False,"Wow, @CNN got caught fixing their ""focus group"" in order to make Crooked Hillary look better. Really pathetic and totally dishonest!",DonaldTrump
'''.splitlines()
>>> for fields in csv.reader(s):
        print(fields[2], fields[5])


785816454042124288 Despite winning the second debate in a landslide (every poll), it is hard to do well when Paul Ryan and others give zero support!
785563318652178432 Wow, @CNN got caught fixing their "focus group" in order to make Crooked Hillary look better. Really pathetic and totally dishonest!
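
To see why this matters, here is a minimal sketch comparing a plain str.split(',') with csv.reader on one row. The row is a shortened, made-up variant of the second sample line above, and io.StringIO only stands in for a real file object; neither comes from the original question.

```python
import csv
import io

# Shortened, hypothetical row in the same column layout as the sample data.
raw = (
    '16,Oct 10,785563318652178432,'
    '/realDonaldTrump/status/785563318652178432,False,'
    '"Wow, @CNN got caught fixing their ""focus group"".",DonaldTrump\n'
)

# Naive splitting cuts the quoted tweet at its internal comma,
# so the row ends up with one field too many.
naive_fields = raw.rstrip("\n").split(",")

# csv.reader respects the surrounding quotes and turns the doubled ""
# back into a single " character.
csv_fields = next(csv.reader(io.StringIO(raw, newline="")))

print(len(naive_fields))  # 8 fields: the tweet was split in two
print(len(csv_fields))    # 7 fields: the tweet stays intact
print(csv_fields[5])
```

When reading from a real file, the csv documentation recommends opening it with newline='' (for example open(path, newline='', encoding='utf-8'), where path is whatever your file is called) and passing that file object to csv.reader in the same way.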
