I have a .txt file that I need to read into Python, but its format prevents me from using a simple pandas.read_csv call. The .txt file looks like this:
"giver_username_if_known", "N/A"
"in_test_set", false
"number_of_downvotes_of_request_at_retrieval", 2
"number_of_upvotes_of_request_at_retrieval", 6
"post_was_edited", false
"request_id", "t3_w5491"
"request_number_of_comments_at_retrieval", 7
"request_text", "I'm not in College, or a starving artist or anything like that. I've just been a bit unlucky lately."
"request_text_edit_aware", "I'm not in College, or a starving artist or anything like that. I've just been a bit unlucky lately. I'm a 36 year old single guy with a job. But rent, and other bills killed me this month."
"request_title", "[Request] Ontario, Canada - On my 3rd of 5 days without food, and it's getting unbearable. Can anyone help?"
"requester_account_age_in_days_at_request", 14.416875
"requester_account_age_in_days_at_retrieval", 531.9697222222222
"requester_days_since_first_post_on_raop_at_request", 0.0
"requester_days_since_first_post_on_raop_at_retrieval", 517.5111805555556
"requester_number_of_comments_at_request", 8
"requester_number_of_comments_at_retrieval", 93
"requester_number_of_comments_in_raop_at_request", 0
"requester_number_of_comments_in_raop_at_retrieval", 4
"requester_number_of_posts_at_request", 1
"requester_number_of_posts_at_retrieval", 6
"requester_number_of_posts_on_raop_at_request", 0
"requester_number_of_posts_on_raop_at_retrieval", 2
"requester_number_of_subreddits_at_request", 8
"requester_received_pizza", true
"requester_subreddits_at_request", {
"AdviceAnimals"
"WTF"
"funny"
"gaming"
"movies"
"technology"
"todayilearned"
"videos"
}
%%%%%%%%%%
%%%%%%%%%%
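After each group of "%" lines there is another entry in the same format (5,671 in total). The first string on each line is the column name, and the string or number that follows is the data value. How can I extract the data that follows each column name?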
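I have two suggestions:
1) In your call to read_csv, add sep=','. This tells the parser how the column name and the data are separated on each line.
2) In your call to read_csv, add comment='%'. The comment parameter takes a single character, so pass '%' rather than the whole "%%%%%%%%%%" string. The parser then ignores everything from a "%" to the end of the line, and a line that starts with "%" is skipped entirely.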
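Because the subreddit list spans several lines between "{" and "}", a small hand-written parser may be easier than fitting this layout into read_csv. The following is only a minimal sketch under a few assumptions: every value is a valid JSON literal (quoted string, number, true/false), quotes and newlines inside text fields are JSON-escaped, and each list item sits alone on its own line. The name parse_records is a hypothetical helper, not part of pandas.

```python
import json

_decoder = json.JSONDecoder()


def parse_records(lines):
    """Yield one dict per entry; entries are separated by lines made only of '%'."""
    record = {}
    list_key = None  # column whose { ... } block is currently open, if any
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if list_key is not None:
            # Inside a { ... } block: each line is one quoted list item.
            if line == "}":
                list_key = None
            else:
                record[list_key].append(json.loads(line))
            continue
        if set(line) == {"%"}:
            # Separator line: emit the finished entry (repeated separators are skipped).
            if record:
                yield record
                record = {}
            continue
        # The column name is a JSON string at the start of the line.
        key, end = _decoder.raw_decode(line)
        value_text = line[end:].lstrip(", ")
        if value_text == "{":
            record[key] = []
            list_key = key
        else:
            record[key] = json.loads(value_text)
    if record:
        yield record
```

To get a table, the records can then be handed to pandas, for example pd.DataFrame(list(parse_records(f))) with f being the opened .txt file; each column name becomes a DataFrame column and the subreddit list stays a Python list in its cell.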