Reading a txt file without formatted columns into Python

Posted 2024-10-04 05:24:21


I have a .txt file that I need to read into Python, but its format prevents me from using a plain pandas.read_csv call. The .txt file looks like this:

"giver_username_if_known", "N/A"

"in_test_set", false 

"number_of_downvotes_of_request_at_retrieval", 2 

"number_of_upvotes_of_request_at_retrieval", 6 

"post_was_edited", false 

"request_id", "t3_w5491" 

"request_number_of_comments_at_retrieval", 7 

"request_text", "I'm not in College, or a starving artist or anything like that. I've just been a bit unlucky lately."

"request_text_edit_aware", "I'm not in College, or a starving artist or anything like that. I've just been a bit unlucky lately. I'm a 36 year old single guy with a job. But rent, and other bills killed me this month." 

"request_title", "[Request] Ontario, Canada - On my 3rd of 5 days without food, and it's getting unbearable. Can anyone help?" 

"requester_account_age_in_days_at_request", 14.416875 

"requester_account_age_in_days_at_retrieval", 531.9697222222222 

"requester_days_since_first_post_on_raop_at_request", 0.0 

"requester_days_since_first_post_on_raop_at_retrieval", 517.5111805555556 

"requester_number_of_comments_at_request", 8 

"requester_number_of_comments_at_retrieval", 93 

"requester_number_of_comments_in_raop_at_request", 0 

"requester_number_of_comments_in_raop_at_retrieval", 4 

"requester_number_of_posts_at_request", 1 

"requester_number_of_posts_at_retrieval", 6 

"requester_number_of_posts_on_raop_at_request", 0 

"requester_number_of_posts_on_raop_at_retrieval", 2 

"requester_number_of_subreddits_at_request", 8 

"requester_received_pizza", true 

"requester_subreddits_at_request", {
  "AdviceAnimals" 
  "WTF" 
  "funny" 
  "gaming" 
  "movies" 
  "technology" 
  "todayilearned" 
  "videos"
    } 

%%%%%%%%%%

%%%%%%%%%%

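After each group of "%" lines there is another entry in the same format (5,671 entries in total). The first string on each line is the column name, and the string or number that follows is the data value. How can I extract the data that follows each column name?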


1 answer
User
Reply #1 · posted 2024-10-04 05:24:21

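I have two suggestions:

1) When calling read_csv, add sep=','. This tells the parser how the column name and the data value on each line are delimited. Because every comma in this file is followed by a space, skipinitialspace=True is probably needed as well, so that the quoted text fields (which contain commas of their own) are still recognized as quoted.

2) In the same read_csv call, add comment='%'. This tells the parser to treat everything from a "%" character to the end of the line as a comment, so the "%%%%%%%%%%" separator lines are ignored.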
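Even with those two options, read_csv would return one long key/value table rather than one row per entry, and the multi-line "requester_subreddits_at_request" block does not fit a two-column layout. As an alternative, here is a minimal sketch of a hand-written parser that uses only the standard library. It is not from the original answer. It assumes that column names never contain a comma, that every value is a valid JSON literal (a quoted string, a number, true/false), and that a "{ ... }" block holds one quoted string per line. The function name parse_records and the sample text (including the second, shortened entry) are made up for illustration.

```python
import json

def parse_records(text):
    """Parse '"key", value' lines into one dict per entry.

    Entries are separated by lines consisting only of '%' characters.
    """
    records = []
    current = {}
    list_key = None  # key whose multi-line { ... } block is being read

    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines

        if set(line) == {"%"}:
            # Separator line: close the current entry, if it has any data.
            if current:
                records.append(current)
                current = {}
            list_key = None
            continue

        if list_key is not None:
            # Inside a { ... } block: one quoted string per line.
            if line.startswith("}"):
                list_key = None
            else:
                current[list_key].append(json.loads(line.rstrip(",")))
            continue

        # Regular line: split on the FIRST comma only (assumes the key
        # itself contains no comma), so commas inside the value survive.
        key_part, _, value_part = line.partition(",")
        key = json.loads(key_part.strip())
        value_part = value_part.strip()

        if value_part.startswith("{"):
            current[key] = []
            if not value_part.endswith("}"):
                list_key = key  # block continues on the following lines
        else:
            current[key] = json.loads(value_part)

    if current:
        records.append(current)  # last entry has no trailing separator
    return records


# Made-up sample in the same layout as the file in the question.
sample = '''"in_test_set", false
"request_id", "t3_w5491"
"request_text", "I'm not in College, or a starving artist."
"requester_account_age_in_days_at_request", 14.416875
"requester_subreddits_at_request", {
  "AdviceAnimals"
  "WTF"
    }

%%%%%%%%%%

%%%%%%%%%%

"request_id", "t3_example"
"requester_received_pizza", true
'''

records = parse_records(sample)
print(len(records))
print(records[0]["requester_subreddits_at_request"])
```

For the real file, read it with open(path, encoding="utf-8").read() and pass the text to parse_records. The resulting list of dicts can be handed to pandas.DataFrame(records) to get one row per entry and one column per name.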
