Changing the data type of a DataFrame date column

Posted 2024-09-28 22:05:19


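I have a cleaned dataset in a DataFrame I created, and I am now changing the data types of its columns. I tried the astype method, but it raised an error.

Second question: how do I convert the column data types into types that can be used for data visualisation?

I used the astype method, but it did not work.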

import pandas as pd

#Import data into a dataframe
raw_data = pd.read_csv('FuelPrices2016 -2019 ulsp.csv')
raw_data.head()
         Date  Pump price in pence/litre ULSP  Duty rate in pence/litre/ULSP  VAT percentage rate  Unnamed: 4
0  02/01/2012                          132.40                          57.95                   20         NaN
1  09/01/2012                          132.68                          57.95                   20         NaN
2  16/01/2012                          133.29                          57.95                   20         NaN
3  23/01/2012                          133.72                          57.95                   20         NaN
4  30/01/2012                          134.10                          57.95                   20         NaN


#Drop unnamed column
raw_b = raw_data.drop(columns=['Unnamed: 4',])
raw_b

           Date  Pump price in pence/litre ULSP  Duty rate in pence/litre/ULSP  VAT percentage rate
0    02/01/2012                          132.40                          57.95                   20
1    09/01/2012                          132.68                          57.95                   20
2    16/01/2012                          133.29                          57.95                   20
3    23/01/2012                          133.72                          57.95                   20
4    30/01/2012                          134.10                          57.95                   20
...         ...                             ...                            ...                  ...
396  05/08/2019                          128.37                          57.95                   20
397  12/08/2019                          128.36                          57.95                   20
398  19/08/2019                          128.17                          57.95                   20
399  26/08/2019                          128.22                          57.95                   20
400  02/09/2019                          127.86                          57.95                   20
401 rows × 4 columns

#Describe the data
raw_b.describe()
       Pump price in pence/litre ULSP  Duty rate in pence/litre/ULSP  VAT percentage rate
count                      401.000000                   4.010000e+02                401.0
mean                       123.043840                   5.795000e+01                 20.0
std                         10.175522                   7.114304e-15                  0.0
min                        101.360000                   5.795000e+01                 20.0
25%                        115.600000                   5.795000e+01                 20.0
50%                        123.270000                   5.795000e+01                 20.0
75%                        130.830000                   5.795000e+01                 20.0
max                        142.170000                   5.795000e+01                 20.0


#Check the types of the columns 
raw_b.dtypes
Date                               object
Pump price in pence/litre ULSP    float64
Duty rate in pence/litre/ULSP     float64
VAT percentage rate                 int64
dtype: object
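
A note on the dtypes listing above: object is the dtype pandas has traditionally used for columns of Python strings, so the Date column may already hold str values. A minimal sketch, assuming a tiny in-memory CSV in place of the real file (the name sample_csv is made up for illustration, and its two rows are copied from the head() output above):

```python
import io

import pandas as pd

# Hypothetical stand-in for the real CSV file; rows copied from the head() output.
sample_csv = io.StringIO(
    "Date,Pump price in pence/litre ULSP\n"
    "02/01/2012,132.40\n"
    "09/01/2012,132.68\n"
)
sample = pd.read_csv(sample_csv)

# Inspect the Python type of each value rather than the column dtype.
value_types = {type(value).__name__ for value in sample["Date"]}
print(value_types)
```

If every value turns out to be a str already, casting with astype(str) would leave the contents unchanged, and parsing the column into a datetime type is likely to be more useful for plotting.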

#Change date into a string 
raw_c = raw_b.astype({'Date': str})
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-20-64e4aadc3ce7> in <module>
----> 1 raw_d = raw_c.astype({'Date': str})

~\Anaconda3\envs\py3-TF2.0\lib\site-packages\pandas\core\generic.py in astype(self, dtype, copy, errors, **kwargs)
   5855                 if col_name not in self:
   5856                     raise KeyError(
-> 5857                         "Only a column name can be used for the "
   5858                         "key in a dtype mappings argument."
   5859                     )

KeyError: 'Only a column name can be used for the key in a dtype mappings argument.'
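
The check visible in the traceback (if col_name not in self) shows that this KeyError is raised when a key in the dtype mapping is not one of the DataFrame's column labels. It is therefore worth confirming that 'Date' really is a column of the frame astype was called on; note that the traceback shows the call being made on raw_c rather than raw_b. A minimal sketch, assuming the cause is a label that differs slightly from 'Date' through stray whitespace (the frame below is made up for illustration and is not the asker's data):

```python
import pandas as pd

# Hypothetical frame whose date label carries a trailing space.
frame = pd.DataFrame({"Date ": ["02/01/2012", "09/01/2012"], "VAT": [20, 20]})

# List the exact labels, then test membership the same way astype does.
print(frame.columns.tolist())
print("Date" in frame.columns)

# Strip surrounding whitespace from every column label, then convert.
cleaned = frame.rename(columns=lambda label: label.strip())
converted = cleaned.astype({"Date": str})
print(converted.columns.tolist())
```

Another possibility is that the dates were moved into the index at some earlier step, in which case reset_index() would bring them back as a regular column before astype is called.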

I expected the DataFrame column to be changed to a string, but the output is the error message shown above.

2 Answers

This question is closed; I have posted a better-worded question to address my problem.

I believe this will help? Since I cannot use the data you showed as input, I created a small example DataFrame.

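import pandas as pd

# Small example frame: two integer columns and one string column
a = {'a1':[2,3,1],'b2':[3,2,3],'c3':['a','b','c']}
raw_1 = pd.DataFrame(a)
raw_2 = pd.DataFrame(a)
# Calling list() on a DataFrame returns its column names
column_names = list(raw_1)
# Cast every column of raw_2 to object, one column at a time
for i in column_names:
    raw_2[i] = raw_2[i].astype(object)
print(raw_1.dtypes)
print(raw_2.dtypes)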

Here is the output for the first DataFrame, where the first two columns are of type int and the third is of type object:

a1     int64
b2     int64
c3    object
dtype: object

And for the second DataFrame, where all columns have been changed to type object:

a1    object
b2    object
c3    object
dtype: object
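
For the second part of the question (a type suitable for visualisation), parsing the dates is usually more useful than casting them to string or object. A minimal sketch, assuming the Date column holds day-first strings, which the question's sample supports since values such as 23/01/2012 cannot be month-first. The frame prices and the shortened column name 'Pump price' are made up for illustration:

```python
import pandas as pd

# Hypothetical frame with three rows copied from the question's output.
prices = pd.DataFrame(
    {
        "Date": ["02/01/2012", "09/01/2012", "16/01/2012"],
        "Pump price": [132.40, 132.68, 133.29],
    }
)

# Parse day/month/year strings into datetime values.
prices["Date"] = pd.to_datetime(prices["Date"], format="%d/%m/%Y")
print(prices.dtypes)

# A datetime index lets plotting libraries place points on a real time axis.
by_date = prices.set_index("Date")
print(by_date.index.min(), by_date.index.max())
```

With matplotlib installed, by_date['Pump price'].plot() should then draw the prices against a date axis.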
