Comparing dates in days in Python and pandas, from a CSV file

Posted 2024-07-05 09:43:25


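I have a large CSV file from which I want to select one column, read the dates in it, and compare them with today's date. I read the pandas documentation and used to_datetime to convert the column to datetime format, but I get the runtime error "AttributeError: 'Series' object has no attribute 'days'". Is my whole approach to converting the dates flawed, or am I just misusing to_datetime? This is my code so far: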

`df = pd.read_csv('roadData.csv',delimiter = ';',encoding = "latin1",error_bad_lines=False)
thisDate = datetime.date.today()
correctCars.dateRegistered = correctCars.dateRegistered.apply(str)
paivat = pd.to_datetime(correctCars.dateRegistered, errors='coerce')
fiveYears = paivat[(paivat.days - thisDate.days >= 0) & (paivat.days - thisDate.days <= 1825)]
print(fiveYears.count())
`

1 answer
User
#1 · Posted 2024-07-05 09:43:25

If you are working with large data, use the usecols parameter to read only the columns you need, then filter the rows with between:

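paivat = pd.read_csv('roadData.csv',
                     sep=';',
                     encoding='latin1',
                     on_bad_lines='skip',  # pandas >= 1.3; older versions used error_bad_lines=False
                     usecols=['dateRegistered'],
                     parse_dates=['dateRegistered'])

# if parse_dates does not return datetimes, convert the column explicitly
#paivat['dateRegistered'] = pd.to_datetime(paivat.dateRegistered, errors='coerce')

# the subtraction needs a datetime, not a date
thisDate = datetime.datetime.now()

# whole days from now to each registration date (negative for past dates)
d = (paivat.dateRegistered - thisDate).dt.days
# keep the rows whose difference lies between 0 and 1825 days (about five years)
fiveYears = paivat[d.between(0, 1825)]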

Or:

fiveYears = paivat[(d >= 0) & (d <= 1825)]

If you only need the count:

print (d.between(0, 1825).sum())

Or:

print (((d >= 0) & (d <= 1825)).sum())
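
For context on the error in the question: a Series has no days attribute of its own; on a timedelta Series the day counts are reached through the .dt accessor. The following is a minimal sketch of that point, not the asker's actual data. The sample values and the fixed reference date are assumptions chosen so the result is reproducible.

```python
import pandas as pd

# Hypothetical values standing in for the dateRegistered column.
registered = pd.Series(["2017-11-25", "2017-11-01", "not a date"],
                       name="dateRegistered")

# errors='coerce' turns unparseable strings into NaT instead of raising.
dates = pd.to_datetime(registered, errors="coerce")

# Assumed fixed reference date (instead of "today") for reproducibility.
today = pd.Timestamp("2017-11-10")

delta = dates - today                    # Series of timedeltas
has_days_attr = hasattr(delta, "days")   # False: this is what raised AttributeError
days = delta.dt.days                     # day counts via the .dt accessor; NaT becomes NaN

# NaN never satisfies between(), so unparseable rows are excluded.
in_range = days.between(0, 1825)
print(has_days_attr)
print(int(in_range.sum()))
```

Rows that could not be parsed end up as NaN in days and simply fall outside the between() filter, so no separate dropna step is needed for the count.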

Sample:

import datetime
from io import StringIO  # pandas.compat.StringIO was removed in newer pandas versions

import pandas as pd

temp=u"""dateRegistered;col
2017-11-25;0
2017-12-26;1
2017-12-27;2
2017-11-28;3
2017-11-29;4
2017-11-30;5
2017-11-01;7
2017-11-02;8
2017-11-03;9"""
# after testing, replace StringIO(temp) with 'roadData.csv'
paivat = pd.read_csv(StringIO(temp),
                     sep=';',
                     encoding='latin1',
                     on_bad_lines='skip',  # pandas >= 1.3; older versions used error_bad_lines=False
                     usecols=['dateRegistered'],
                     parse_dates=['dateRegistered'])

thisDate = datetime.datetime.now()

# the values depend on the current date; the output below is consistent with a run on 2017-11-10
d = (paivat.dateRegistered - thisDate).dt.days
print (d)
0    14
1    45
2    46
3    17
4    18
5    19
6   -10
7    -9
8    -8
Name: dateRegistered, dtype: int64

print (d.between(0, 15).sum())
1
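
The day counts in the sample depend on the time of day at which the code runs, because .dt.days keeps only the whole-day component of each difference. A small sketch of that effect follows, assuming a fixed reference moment (the timestamp below is an assumption, picked to be consistent with the sample output) and truncating it to midnight with Timestamp.normalize():

```python
from io import StringIO

import pandas as pd

# Two rows taken from the sample data above.
temp = """dateRegistered;col
2017-11-25;0
2017-11-01;7"""

paivat = pd.read_csv(StringIO(temp),
                     sep=";",
                     usecols=["dateRegistered"],
                     parse_dates=["dateRegistered"])

# Assumed reference moment with a time-of-day part, in place of now().
reference = pd.Timestamp("2017-11-10 13:30:00")

# With the time of day included, partial days are floored.
raw = (paivat["dateRegistered"] - reference).dt.days

# normalize() resets the time to midnight, giving calendar-day differences.
whole = (paivat["dateRegistered"] - reference.normalize()).dt.days

print(raw.tolist())    # [14, -10]
print(whole.tolist())  # [15, -9]
```

If calendar-day differences are what you want, pd.Timestamp.now().normalize() can be used in place of datetime.datetime.now() in the code above.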
