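CSV filtering and ascending order

2 answers

User

Reply #1 · Edited 2024-09-28 21:39:16

I suggest using pandas, since it makes the filtering faster and lets you do further analysis afterwards.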

# import pandas
import pandas as pd

# read csv file
df = pd.read_csv("sample_data.csv")

# convert created_at from unix time (seconds) to datetime
df['created_at'] = pd.to_datetime(df['created_at'], unit='s')

# contents of df at this point
#    id          created_at first_name last_name
# 0   1 2011-06-29 20:50:45    Cecelia      Holt
# 1   2 2009-03-16 04:35:09       Emma   Allison
# 2   3 2011-04-23 19:08:31    Desiree      King
# 3   4 2009-01-05 17:15:16        Sam  Davidson

# filtering example
# (compare against pd.Timestamp: recent pandas versions raise a TypeError
# when a datetime64 column is ordered against a datetime.date)
df_filtered = df[df['created_at'] <= pd.Timestamp("2011-03-22")]

# output of df_filtered
#    id          created_at first_name last_name
# 1   2 2009-03-16 04:35:09       Emma   Allison
# 3   4 2009-01-05 17:15:16        Sam  Davidson

# filter based on dates mentioned in the question
df_filtered = df[(df['created_at'] >= pd.Timestamp("2016-03-22")) & (df['created_at'] <= pd.Timestamp("2016-04-15"))]

# output of df_filtered would be empty at this point since none of
# the sample dates fall within this range

# sort in ascending date order
df_sorted = df_filtered.sort_values(['created_at'])
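As a self-contained variant of the steps above, the sketch below builds the four sample rows in memory instead of reading sample_data.csv, then filters and sorts in one pass. The Unix timestamps for the first two rows are reconstructed from the dates shown above, and the 2011 date range is a hypothetical one chosen so that some rows match; neither comes from the original question.

```python
import pandas as pd

# Sample rows from the answer above, built in memory instead of read from a file.
# The first two timestamps are reconstructed from the displayed dates (assumption).
df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "created_at": [1309380645, 1237178109, 1303585711, 1231175716],
    "first_name": ["Cecelia", "Emma", "Desiree", "Sam"],
    "last_name": ["Holt", "Allison", "King", "Davidson"],
})
df["created_at"] = pd.to_datetime(df["created_at"], unit="s")

# Hypothetical date range chosen so that some sample rows match.
start = pd.Timestamp("2011-01-01")
end = pd.Timestamp("2011-12-31")

# Series.between is inclusive on both ends by default.
in_range = df["created_at"].between(start, end)

# Keep the matching rows and sort them in ascending date order.
result = df[in_range].sort_values("created_at")
print(result)
```

Using `between` avoids writing the two comparisons and the `&` by hand; the explicit form in the answer works just as well.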

Explanation of pandas filtering:

The first thing to know is that applying a comparison operator to a DataFrame column returns a Series of boolean values.

df['id'] > 2

returns

False
False
 True
 True

pandas also supports boolean (logical) indexing: if you index a DataFrame with a boolean Series, it returns only the rows where the value is True.

df[df['id'] > 2]

returns

3   1303585711  Desiree    King
4   1231175716  Sam        Davidson

That is how easily you can filter with pandas.
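To make the mask idea concrete, here is a minimal sketch on a cut-down version of the sample data (only the id and first_name columns are kept, which is a simplification for illustration). It prints the mask itself, then shows how two masks combine.

```python
import pandas as pd

df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "first_name": ["Cecelia", "Emma", "Desiree", "Sam"],
})

# A comparison produces a boolean Series aligned with the rows of df.
mask = df["id"] > 2
print(mask.tolist())

# Masks combine with & (and), | (or) and ~ (not). Each comparison needs
# its own parentheses because & binds tighter than > or <=.
combined = (df["id"] > 1) & (df["id"] <= 3)

# Indexing with a mask keeps only the rows where it is True.
print(df[combined])
```

The date-range filter in the answer follows the same pattern, with two comparisons on created_at joined by `&`.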

User

Reply #2 · Edited 2024-09-28 21:39:16

Downloading and installing (and learning) pandas just to do this seems like overkill.

Here is how to do it using only Python's built-in modules:

import csv
from datetime import date

start_date = date(2011, 1, 1)
end_date = date(2011, 12, 31)

# Read csv data into memory filtering rows by the date in column 2 (row[1]).
csv_data = []
with open("sample_data.csv", newline='') as f:
    reader = csv.reader(f, delimiter='\t')
    header = next(reader)
    csv_data.append(header)
    for row in reader:
        creation_date = date.fromtimestamp(int(row[1]))
        if start_date <= creation_date <= end_date:
            csv_data.append(row)

if len(csv_data) > 1:  # Anything found besides the header row?
    # Print the results in ascending date order.
    print(" ".join(csv_data[0]))
    # Converting the timestamp to int may not be necessary (but doesn't hurt)
    for row in sorted(csv_data[1:], key=lambda r: int(r[1])): 
        print(" ".join(row))
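One detail worth noting: `date.fromtimestamp` interprets the timestamp in the machine's local time zone, so a row created close to midnight on a boundary date could be included or excluded depending on where the script runs. The sketch below is a variant that converts in UTC instead. It reads from an in-memory string rather than sample_data.csv; the helper name `filter_rows`, the tab-separated sample text, and the first two timestamps (reconstructed from the dates in the first answer) are assumptions for illustration.

```python
import csv
import io
from datetime import date, datetime, timezone

# Hypothetical tab-separated sample standing in for sample_data.csv.
SAMPLE = (
    "id\tcreated_at\tfirst_name\tlast_name\n"
    "1\t1309380645\tCecelia\tHolt\n"
    "2\t1237178109\tEmma\tAllison\n"
    "3\t1303585711\tDesiree\tKing\n"
    "4\t1231175716\tSam\tDavidson\n"
)


def filter_rows(lines, start_date, end_date):
    """Return (header, rows) with rows inside the date range, sorted ascending."""
    reader = csv.reader(lines, delimiter="\t")
    header = next(reader)
    matches = []
    for row in reader:
        # Convert in UTC so the result does not depend on the local time zone.
        created = datetime.fromtimestamp(int(row[1]), tz=timezone.utc).date()
        if start_date <= created <= end_date:
            matches.append(row)
    # Sort numerically by timestamp, i.e. in ascending date order.
    matches.sort(key=lambda r: int(r[1]))
    return header, matches


header, rows = filter_rows(io.StringIO(SAMPLE), date(2011, 1, 1), date(2011, 12, 31))
print(" ".join(header))
for row in rows:
    print(" ".join(row))
```

Returning the header separately from the matching rows also makes the "anything found?" check a plain `if rows:`.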
