Pandas, Postgres, and Daylight Saving Time (DST)

Posted 2024-09-26 22:10:53


I am using read_sql_query in Pandas version 0.22.0 to extract time series data from a local PostgreSQL database. If I do not parse the date column, I get the following DataFrame:

    dataid  localminute                 use
0   1642    2012-05-11 19:00:00-05:00   0.827
1   1642    2012-05-11 19:01:00-05:00   0.830
2   1642    2012-05-11 19:02:00-05:00   0.833
3   1642    2012-05-11 19:03:00-05:00   0.835
4   1642    2012-05-11 19:04:00-05:00   0.837

The localminute column has dtype=object and contains a mix of objects such as:

^{pr2}$

Note that these objects have different tzinfo.

If I try to pass parse_dates=["localminute"] when calling pd.read_sql_query, I get the following errors and traceback.

--------------------------------------------------------------------------
ValueError                               Traceback (most recent call last)
~\AppData\Local\Continuum\Anaconda3\envs\pecanpy-dev\lib\site-packages\pandas\core\tools\datetimes.py in _convert_listlike(arg, box, format, name, tz)
    302             try:
--> 303                 values, tz = tslib.datetime_to_datetime64(arg)
    304                 return DatetimeIndex._simple_new(values, name=name, tz=tz)

pandas/_libs/tslib.pyx in pandas._libs.tslib.datetime_to_datetime64()

ValueError: Array must be all same time zone

During handling of the above exception, another exception occurred:

ValueError                               Traceback (most recent call last)
<ipython-input-8-0d092e8e183b> in <module>()
      6     end_time = pd.Timestamp("2015-06-01", tz="US/Central")
      7 
----> 8     usage_df = pecanpy.read_electricity_egauge_minutes_query(local_con, "public", "all", dataid, start_time, end_time)

c:\users\pughdr\research\pecanpy\pecanpy\api.py in read_electricity_egauge_minutes_query(con, schema, columns, dataid, start_time, end_time, tz)
     32     query = template.format(**kwargs)
     33     #parse_dates= {"localminute": {}}
---> 34     df = pd.read_sql_query(query, con=con, parse_dates=["localminute"])
     35 
     36     # if the time period of interest contains observations only from within the

~\AppData\Local\Continuum\Anaconda3\envs\pecanpy-dev\lib\site-packages\pandas\io\sql.py in read_sql_query(sql, con, index_col, coerce_float, params, parse_dates, chunksize)
    330     return pandas_sql.read_query(
    331         sql, index_col=index_col, params=params, coerce_float=coerce_float,
--> 332         parse_dates=parse_dates, chunksize=chunksize)
    333 
    334 

~\AppData\Local\Continuum\Anaconda3\envs\pecanpy-dev\lib\site-packages\pandas\io\sql.py in read_query(self, sql, index_col, coerce_float, parse_dates, params, chunksize)
   1102             frame = _wrap_result(data, columns, index_col=index_col,
   1103                                  coerce_float=coerce_float,
-> 1104                                  parse_dates=parse_dates)
   1105             return frame
   1106 

~\AppData\Local\Continuum\Anaconda3\envs\pecanpy-dev\lib\site-packages\pandas\io\sql.py in _wrap_result(data, columns, index_col, coerce_float, parse_dates)
    157                                    coerce_float=coerce_float)
    158 
--> 159     _parse_date_columns(frame, parse_dates)
    160 
    161     if index_col is not None:

~\AppData\Local\Continuum\Anaconda3\envs\pecanpy-dev\lib\site-packages\pandas\io\sql.py in _parse_date_columns(data_frame, parse_dates)
    138         except TypeError:
    139             fmt = None
--> 140         data_frame[col_name] = _handle_date_column(df_col, format=fmt)
    141 
    142     # we want to coerce datetime64_tz dtypes for now

~\AppData\Local\Continuum\Anaconda3\envs\pecanpy-dev\lib\site-packages\pandas\io\sql.py in _handle_date_column(col, utc, format)
    117                     .astype('datetime64[ns, UTC]'))
    118         else:
--> 119             return to_datetime(col, errors='coerce', format=format, utc=utc)
    120 
    121 

~\AppData\Local\Continuum\Anaconda3\envs\pecanpy-dev\lib\site-packages\pandas\core\tools\datetimes.py in to_datetime(arg, errors, dayfirst, yearfirst, utc, box, format, exact, unit, infer_datetime_format, origin)
    371     elif isinstance(arg, ABCSeries):
    372         from pandas import Series
--> 373         values = _convert_listlike(arg._values, True, format)
    374         result = Series(values, index=arg.index, name=arg.name)
    375     elif isinstance(arg, (ABCDataFrame, MutableMapping)):

~\AppData\Local\Continuum\Anaconda3\envs\pecanpy-dev\lib\site-packages\pandas\core\tools\datetimes.py in _convert_listlike(arg, box, format, name, tz)
    304                 return DatetimeIndex._simple_new(values, name=name, tz=tz)
    305             except (ValueError, TypeError):
--> 306                 raise e
    307 
    308     if arg is None:

~\AppData\Local\Continuum\Anaconda3\envs\pecanpy-dev\lib\site-packages\pandas\core\tools\datetimes.py in _convert_listlike(arg, box, format, name, tz)
    292                     dayfirst=dayfirst,
    293                     yearfirst=yearfirst,
--> 294                     require_iso8601=require_iso8601
    295                 )
    296 

pandas/_libs/tslib.pyx in pandas._libs.tslib.array_to_datetime()

pandas/_libs/tslib.pyx in pandas._libs.tslib.array_to_datetime()

ValueError: Tz-aware datetime.datetime cannot be converted to datetime64 unless utc=True

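The first ValueError is raised because, owing to daylight saving time (DST), the timestamps carry different time zone offsets. I don't understand where the second error comes from.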
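To make the diagnosis above concrete, here is a minimal standard-library sketch. The two fixed offsets (-05:00 for daylight time, -06:00 for standard time) are assumed stand-ins for the tzinfo objects the database driver attaches, and the sample timestamps are illustrative values, not rows from the actual table.

```python
from datetime import datetime, timedelta, timezone

# Assumed stand-ins for the fixed-offset tzinfo objects a driver might attach.
CDT = timezone(timedelta(hours=-5))  # daylight time, UTC-05:00
CST = timezone(timedelta(hours=-6))  # standard time, UTC-06:00

# One timestamp on each side of a DST change (illustrative values).
summer = datetime(2012, 5, 11, 19, 0, tzinfo=CDT)
winter = datetime(2012, 12, 11, 19, 0, tzinfo=CST)

# The offsets differ, so the values do not share a single time zone.
offsets = {ts.utcoffset() for ts in (summer, winter)}
mixed = len(offsets) > 1

# Converting every value to UTC leaves one common tzinfo.
as_utc = [ts.astimezone(timezone.utc) for ts in (summer, winter)]
```

A column holding both kinds of value cannot be packed into one tz-aware datetime64 column unless the values are first normalized to a common zone such as UTC, which appears to be what the "unless utc=True" wording in the second message refers to.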

How can I use read_sql_query to load a DataFrame column that is timezone-aware in the "US/Central" time zone?
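One possible approach, sketched under assumptions rather than tested against pandas 0.22.0 or the real database: load the column without parse_dates, then normalize it to UTC and convert it to "US/Central" afterwards. The Series below is a hypothetical stand-in for the object-dtype localminute column, built from illustrative values.

```python
from datetime import datetime, timedelta, timezone

import pandas as pd

# Hypothetical stand-in for the object-dtype column returned by the query.
cdt = timezone(timedelta(hours=-5))
cst = timezone(timedelta(hours=-6))
raw = pd.Series(
    [
        datetime(2012, 5, 11, 19, 0, tzinfo=cdt),
        datetime(2012, 12, 11, 19, 0, tzinfo=cst),
    ],
    dtype=object,
    name="localminute",
)

# utc=True normalizes every value to UTC, yielding one tz-aware dtype.
utc_times = pd.to_datetime(raw, utc=True)

# Convert to the time zone of interest; DST is handled by the tz database.
local_times = utc_times.dt.tz_convert("US/Central")
```

The resulting column has a single tz-aware dtype, and each value keeps its original wall-clock time, with the offset determined by the "US/Central" DST rules instead of a per-row fixed offset.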

