pandas to_datetime function does not change the dtype

Posted 2024-06-03 06:57:11


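I have been working with Python recently and ran into a problem I cannot solve. I am using a pandas dataset, and when I use the to_datetime function to change a variable's data type from "object" to "datetime64", the result does not have the desired "datetime64" data type.

So far I have only tried the to_datetime function, but it does not solve the problem. I am looking for a way to make to_datetime work, or for any other code that changes my variable's data type from "object" to "datetime64".

Here is some information about the dataset: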

df.head()
Formatted Date                      Summary  Precip Type Temperature (C)   Apparent Temperature (C)   Humidity   Wind Speed (km/h)   Wind Bearing (degrees)  Visibility (km)  Loud Cover Pressure (millibars)   Daily Summary
0   2006-04-01 00:00:00.000 +0200   Partly Cloudy   rain    9.472222    7.388889    0.89    14.1197     251.0   15.8263     0.0     1015.13     Partly cloudy throughout the day.
1   2006-04-01 01:00:00.000 +0200   Partly Cloudy   rain    9.355556    7.227778    0.86    14.2646     259.0   15.8263     0.0     1015.63     Partly cloudy throughout the day.
2   2006-04-01 02:00:00.000 +0200   Mostly Cloudy   rain    9.377778    9.377778    0.89    3.9284  204.0   14.9569     0.0     1015.94     Partly cloudy throughout the day.
3   2006-04-01 03:00:00.000 +0200   Partly Cloudy   rain    8.288889    5.944444    0.83    14.1036     269.0   15.8263     0.0     1016.41     Partly cloudy throughout the day.
4   2006-04-01 04:00:00.000 +0200   Mostly Cloudy   rain    8.755556    6.977778    0.83    11.0446     259.0   15.8263     0.0     1016.51     Partly cloudy throughout the day.

Before calling to_datetime, the Formatted Date column has dtype object. Here are the data types after calling to_datetime:

df['Date'] = pd.to_datetime(df['Formatted Date'])
df.dtypes

Formatted Date               object
Summary                      object
Precip Type                  object
Temperature (C)             float64
Apparent Temperature (C)    float64
Humidity                    float64
Wind Speed (km/h)           float64
Wind Bearing (degrees)      float64
Visibility (km)             float64
Loud Cover                  float64
Pressure (millibars)        float64
Daily Summary                object
Date                         object
dtype: object

Can you tell me what I am doing wrong? Thanks in advance!


3 answers

Problem

You want to change the dtype from object to datetime64.

df = pd.DataFrame(data={'col':["2006-04-01 00:00:00.000 +0200"]})
df.dtypes

Output:

col    object
dtype: object

Solution

To change the type, apply pd.to_datetime.

df['col'] = df['col'].apply(pd.to_datetime)
df.dtypes

Output:

col    datetime64[ns, pytz.FixedOffset(120)]
dtype: object

If this does not work, the Formatted Date column may contain inconsistent date formats or NaN values.
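One way to check for inconsistent values is to look at the UTC offset at the end of each string and to count missing entries. The following is a minimal sketch, not taken from the original answer; the sample Series is hypothetical and only mimics the Formatted Date layout, with one row assumed to carry a different offset (+0100).

```python
import pandas as pd

# Hypothetical sample that mimics the 'Formatted Date' column.
# The last row is assumed to use a different UTC offset (+0100).
sample = pd.Series([
    "2006-04-01 00:00:00.000 +0200",
    "2006-04-01 01:00:00.000 +0200",
    "2006-11-01 00:00:00.000 +0100",
])

# The UTC offset is the last five characters of each string.
offset_counts = sample.str[-5:].value_counts()
print(offset_counts)

# Number of missing (NaN) entries in the column.
n_missing = int(sample.isna().sum())
print(n_missing)
```

If more than one offset shows up, the column mixes time zones, which is one plausible reason for the result staying at dtype object.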

Real data

Using the dataset (https://www.kaggle.com/budincsevity/szeged-weather/):

import pandas as pd

# load dataset
df = pd.read_csv('weatherHistory.csv')
df.dtypes

Formatted Date               object
Summary                      object
Precip Type                  object
Temperature (C)             float64
Apparent Temperature (C)    float64
Humidity                    float64
Wind Speed (km/h)           float64
Wind Bearing (degrees)      float64
Visibility (km)             float64
Loud Cover                  float64
Pressure (millibars)        float64
Daily Summary                object
dtype: object

df['Date'] = df['Formatted Date'].apply(pd.to_datetime)
df.dtypes

Formatted Date                      object
Summary                             object
Precip Type                         object
Temperature (C)                    float64
Apparent Temperature (C)           float64
Humidity                           float64
Wind Speed (km/h)                  float64
Wind Bearing (degrees)             float64
Visibility (km)                    float64
Loud Cover                         float64
Pressure (millibars)               float64
Daily Summary                       object
Date                        datetime64[ns]
dtype: object

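I have had trouble with pandas when accessing elements by column label. I made a simplified DataFrame and was able to change the column's data type by selecting the column by its index position.

Try replacing your call:

 pd.to_datetime(df['Formatted Date'])

with one that selects the column by position using iloc. This worked for me: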

  import pandas as pd

  data = ['2006-04-01 00:00:00.000 +0200']
  df = pd.DataFrame(data)
  # iloc[:, 0] selects the first column (iloc[0] would select the first row)
  df2 = pd.to_datetime(df.iloc[:, 0])
  print(df2.dtype)

The result is:

  datetime64[ns, pytz.FixedOffset(120)]

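I downloaded the same data you are using. A possible solution for your dataset is to extend the original code with an explicit date format. The UTC offset at the end of each string is matched by %z (%p matches AM/PM instead, so with errors='coerce' every value would become NaT), and utc=True yields a single timezone-aware dtype even if the offsets differ between rows:

  df['Date'] = pd.to_datetime(df['Formatted Date'], format='%Y-%m-%d %H:%M:%S.%f %z', utc=True, errors='coerce')

As you can see, the "Date" column now has a datetime data type:

Formatted Date                           object
Summary                                  object
Precip Type                              object
Temperature (C)                         float64
Apparent Temperature (C)                float64
Humidity                                float64
Wind Speed (km/h)                       float64
Wind Bearing (degrees)                  float64
Visibility (km)                         float64
Loud Cover                              float64
Pressure (millibars)                    float64
Daily Summary                            object
Date                        datetime64[ns, UTC]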
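The format argument uses strptime-style directives, so a candidate format string can be tried on a single value with the standard library before it is applied to the whole column. This is only a sketch, assuming every value looks like the rows shown in df.head(); %z is the directive for a UTC offset such as +0200.

```python
from datetime import datetime, timedelta

# Candidate format for values like '2006-04-01 00:00:00.000 +0200'.
# %f parses the fractional seconds, %z parses the UTC offset.
fmt = "%Y-%m-%d %H:%M:%S.%f %z"

parsed = datetime.strptime("2006-04-01 00:00:00.000 +0200", fmt)
offset = parsed.utcoffset()

print(parsed.isoformat())
print(offset)
```

A format string that does not match raises ValueError here, which is easier to notice than values silently coerced to NaT.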

For pandas >= 0.24, you need to add the argument utc=True.

import pandas as pd

# load dataset
df = pd.read_csv('weatherHistory.csv')

df['Date'] = df['Formatted Date'].apply(pd.to_datetime, utc=True)
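As a sketch of what utc=True does, the example below calls pd.to_datetime on the whole column at once rather than through apply. The two-row DataFrame is made up for illustration, with the offsets +0200 and +0100 assumed to stand in for summer and winter timestamps; the Europe/Budapest zone is likewise an assumption about where the data was recorded.

```python
import pandas as pd

# Made-up rows in the same layout as 'Formatted Date',
# deliberately using two different UTC offsets.
df = pd.DataFrame({"Formatted Date": [
    "2006-04-01 00:00:00.000 +0200",
    "2006-11-01 00:00:00.000 +0100",
]})

# utc=True converts every timestamp to UTC, so the column gets a
# single timezone-aware datetime dtype instead of object.
df["Date"] = pd.to_datetime(df["Formatted Date"], utc=True)
print(df["Date"].dtype)

# Optional: convert back to local wall-clock time (assumed zone).
df["Local"] = df["Date"].dt.tz_convert("Europe/Budapest")
print(df["Local"])
```

Once the column is timezone-aware, accessors such as df["Date"].dt.hour work directly, which is usually the reason for wanting a datetime dtype in the first place.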
