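I'm currently trying to calculate the length of each disaster in days, and then use that column (the difference between the start date and the end date) with groupby, I think, to work out disaster lengths per year, since my dataset runs from 1960 to the present. Eventually I'd also like to group by disaster type to see how the duration of specific disasters has changed over time, but one step at a time.
So far I have converted the dates to pandas datetime format and then used the code below to create a column holding the difference between the two dates: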
#Create new Column == Disaster Length
df_time['Disaster_Length'] = (df_time.Start_Date_A - df_time.End_Date_A)
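Two-part question:
A. How would I create a loop along the lines of: for i in column Start_Date_A == 0, add +1? Sorry, I'm not very familiar with this. I need it so that a disaster that starts and ends on the same day counts as 1 day rather than 0.
B. What is the best way to change the Disaster Length column from a Series to integers so that I can do calculations with it?
Full code: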
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
#Import Dataset
df = pd.read_csv('database.csv')
df_time = (df[['County','Disaster Type','Start Date', 'End Date']][0: :])
#Number of NaN values
df_nan = df[['County','Disaster Type','Start Date', 'End Date']].isna().sum()
#NaN values as a percentage as total
df_nan_percent = (df_nan.sum(axis=0))
NAN_percentage = ['0.0116%']
#Remove NaN values
df_time.dropna(subset = ["County", 'End Date'], inplace=True)
#Set Date Format
df_time['Start_Date_A'] = pd.to_datetime(df['Start Date'], format='%m/%d/%Y')
df_time['End_Date_A'] = pd.to_datetime(df['End Date'], format='%m/%d/%Y')
#Create new Column == Disaster Length
df_time['Disaster_Length'] = (df_time.Start_Date_A - df_time.End_Date_A)
#Dropped Date Old Date Formats from df
df_time = df_time.drop(columns=['Start Date', 'End Date'], axis=1)
#Make County Column the Index, as NaN altered Index Consistency
df_time.set_index('County', inplace=True)
Reproducible df:
County,Disaster Type,Start_Date_A,End_Date_A,Disaster_Length
Clay County,Flood,1959-01-29,1959-01-29,0 days
Alpine County,Flood,1964-12-24,1964-12-24,0 days
Amador County,Flood,1964-12-24,1964-12-24,0 days
Butte County,Flood,1964-12-24,1964-12-24,0 days
Colusa County,Flood,1964-12-24,1964-12-24,0 days
Del Norte County,Flood,1964-12-24,1964-12-24,0 days
El Dorado County,Flood,1964-12-24,1964-12-24,0 days
Glenn County,Flood,1964-12-24,1964-12-24,0 days
Humboldt County,Flood,1964-12-24,1964-12-24,0 days
Lake County,Flood,1964-12-24,1964-12-24,0 days
Lassen County,Flood,1964-12-24,1964-12-24,0 days
Marin County,Flood,1964-12-24,1964-12-24,0 days
Mendocino County,Flood,1964-12-24,1964-12-24,0 days
Modoc County,Flood,1964-12-24,1964-12-24,0 days
Napa County,Flood,1964-12-24,1964-12-24,0 days
Nevada County,Flood,1964-12-24,1964-12-24,0 days
Placer County,Flood,1964-12-24,1964-12-24,0 days
Plumas County,Flood,1964-12-24,1964-12-24,0 days
Sacramento County,Flood,1964-12-24,1964-12-24,0 days
San Joaquin County,Flood,1964-12-24,1964-12-24,0 days
Shasta County,Flood,1964-12-24,1964-12-24,0 days
Sierra County,Flood,1964-12-24,1964-12-24,0 days
Siskiyou County,Flood,1964-12-24,1964-12-24,0 days
Solano County,Flood,1964-12-24,1964-12-24,0 days
Sonoma County,Flood,1964-12-24,1964-12-24,0 days
Stanislaus County,Flood,1964-12-24,1964-12-24,0 days
Sutter County,Flood,1964-12-24,1964-12-24,0 days
Tehama County,Flood,1964-12-24,1964-12-24,0 days
Trinity County,Flood,1964-12-24,1964-12-24,0 days
Tuolumne County,Flood,1964-12-24,1964-12-24,0 days
Yolo County,Flood,1964-12-24,1964-12-24,0 days
Yuba County,Flood,1964-12-24,1964-12-24,0 days
Baker County,Flood,1964-12-24,1964-12-24,0 days
Benton County,Flood,1964-12-24,1964-12-24,0 days
Clackamas County,Flood,1964-12-24,1964-12-24,0 days
Clatsop County,Flood,1964-12-24,1964-12-24,0 days
Columbia County,Flood,1964-12-24,1964-12-24,0 days
Coos County,Flood,1964-12-24,1964-12-24,0 days
Crook County,Flood,1964-12-24,1964-12-24,0 days
Curry County,Flood,1964-12-24,1964-12-24,0 days
Deschutes County,Flood,1964-12-24,1964-12-24,0 days
Douglas County,Flood,1964-12-24,1964-12-24,0 days
Gilliam County,Flood,1964-12-24,1964-12-24,0 days
Grant County,Flood,1964-12-24,1964-12-24,0 days
Harney County,Flood,1964-12-24,1964-12-24,0 days
Hood River County,Flood,1964-12-24,1964-12-24,0 days
Jackson County,Flood,1964-12-24,1964-12-24,0 days
Jefferson County,Flood,1964-12-24,1964-12-24,0 days
Josephine County,Flood,1964-12-24,1964-12-24,0 days
Klamath County,Flood,1964-12-24,1964-12-24,0 days
Lake County,Flood,1964-12-24,1964-12-24,0 days
Lane County,Flood,1964-12-24,1964-12-24,0 days
Lincoln County,Flood,1964-12-24,1964-12-24,0 days
Linn County,Flood,1964-12-24,1964-12-24,0 days
Malheur County,Flood,1964-12-24,1964-12-24,0 days
Marion County,Flood,1964-12-24,1964-12-24,0 days
Morrow County,Flood,1964-12-24,1964-12-24,0 days
Multnomah County,Flood,1964-12-24,1964-12-24,0 days
Polk County,Flood,1964-12-24,1964-12-24,0 days
Sherman County,Flood,1964-12-24,1964-12-24,0 days
Tillamook County,Flood,1964-12-24,1964-12-24,0 days
Umatilla County,Flood,1964-12-24,1964-12-24,0 days
Union County,Flood,1964-12-24,1964-12-24,0 days
Wallowa County,Flood,1964-12-24,1964-12-24,0 days
Wasco County,Flood,1964-12-24,1964-12-24,0 days
Washington County,Flood,1964-12-24,1964-12-24,0 days
Wheeler County,Flood,1964-12-24,1964-12-24,0 days
Yamhill County,Flood,1964-12-24,1964-12-24,0 days
Asotin County,Flood,1964-12-29,1964-12-29,0 days
Benton County,Flood,1964-12-29,1964-12-29,0 days
Clark County,Flood,1964-12-29,1964-12-29,0 days
Columbia County,Flood,1964-12-29,1964-12-29,0 days
Cowlitz County,Flood,1964-12-29,1964-12-29,0 days
Garfield County,Flood,1964-12-29,1964-12-29,0 days
Grays Harbor County,Flood,1964-12-29,1964-12-29,0 days
King County,Flood,1964-12-29,1964-12-29,0 days
Kittitas County,Flood,1964-12-29,1964-12-29,0 days
Klickitat County,Flood,1964-12-29,1964-12-29,0 days
Lewis County,Flood,1964-12-29,1964-12-29,0 days
Mason County,Flood,1964-12-29,1964-12-29,0 days
First, change the column-creation code to:
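The snippet for this step did not survive in the post. A minimal sketch of what it may have looked like, assuming the change is to subtract the start date from the end date (the question's code subtracts the other way round, which gives negative lengths for multi-day disasters) and to take whole days with .dt.days. The sample rows are made up for illustration:

```python
import pandas as pd

# Hypothetical sample rows, only for illustration
df_time = pd.DataFrame({
    "Start_Date_A": pd.to_datetime(["1964-12-24", "1964-12-24"]),
    "End_Date_A": pd.to_datetime(["1964-12-24", "1964-12-27"]),
})

# End minus start gives a non-negative Timedelta;
# .dt.days turns it into an integer number of whole days
df_time["Disaster_Length"] = (df_time.End_Date_A - df_time.Start_Date_A).dt.days
```

This also covers part B of the question, because the resulting column has an integer dtype.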
Then this should work:
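The original snippet is missing here as well. One guess at it, assuming Disaster_Length already holds whole days as integers, is to add 1 to every row so that both the start day and the end day are counted:

```python
import pandas as pd

# Hypothetical day counts (end date minus start date, in whole days)
df_time = pd.DataFrame({"Disaster_Length": [0, 3, 10]})

# Count both the start day and the end day,
# so a same-day disaster becomes 1 day long
df_time["Disaster_Length"] = df_time["Disaster_Length"] + 1
```

Note that this shifts every length by one day, not only the zero-day rows. If only the zero-day rows should change, replacing 0 with 1 is the closer match to the question.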
That way, any disaster that lasts a single day will equal 1 rather than 0.
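Once the lengths are integers, the per-year grouping mentioned in the question could be sketched as follows. This is only an illustration: the rows are invented, and using the mean and the year of the start date are assumptions, not something stated in the question.

```python
import pandas as pd

# Hypothetical rows, only for illustration
df_time = pd.DataFrame({
    "Disaster Type": ["Flood", "Flood", "Fire", "Flood"],
    "Start_Date_A": pd.to_datetime(
        ["1964-12-24", "1964-12-29", "1965-07-01", "1965-01-10"]
    ),
    "Disaster_Length": [1, 1, 5, 3],
})

# Year of the start date, used as the grouping key
year = df_time["Start_Date_A"].dt.year.rename("Year")

# Mean disaster length per year
per_year = df_time.groupby(year)["Disaster_Length"].mean()

# Mean disaster length per year and disaster type
per_year_type = df_time.groupby([year, "Disaster Type"])["Disaster_Length"].mean()
```

Swapping mean() for sum() or median() gives other summaries over the same groups.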
Convert the Timedelta to an integer, then use replace, or set the values to 1 with a mask:
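The code for both variants is missing from the post. A minimal sketch, assuming the conversion is done with .dt.days (the original reference was lost) and using made-up sample rows. The column name Disaster_Length_mask is hypothetical and only there to show the two options side by side:

```python
import pandas as pd

# Hypothetical sample rows, only for illustration
df_time = pd.DataFrame({
    "Start_Date_A": pd.to_datetime(["1964-12-24", "1964-12-24"]),
    "End_Date_A": pd.to_datetime(["1964-12-24", "1964-12-27"]),
})

# Timedelta -> integer number of whole days
days = (df_time["End_Date_A"] - df_time["Start_Date_A"]).dt.days

# Option 1: replace zero-day lengths with 1
df_time["Disaster_Length"] = days.replace(0, 1)

# Option 2: same result, set through a boolean mask
df_time["Disaster_Length_mask"] = days
df_time.loc[df_time["Disaster_Length_mask"] == 0, "Disaster_Length_mask"] = 1
```

Both options leave multi-day lengths untouched and only change rows where the start and end dates are the same.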
You can also use a lambda function with the apply method, as follows:
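The answer's code is not preserved. A minimal sketch of the idea, assuming the lambda is applied to the Timedelta column and should return at least 1 day. The sample rows are invented:

```python
import pandas as pd

# Hypothetical sample rows, only for illustration
df_time = pd.DataFrame({
    "Start_Date_A": pd.to_datetime(["1964-12-24", "1964-12-24"]),
    "End_Date_A": pd.to_datetime(["1964-12-24", "1964-12-27"]),
})

delta = df_time["End_Date_A"] - df_time["Start_Date_A"]

# Each element passed to the lambda is a pandas Timedelta;
# .days is its whole-day part, and max(..., 1) lifts 0 to 1
df_time["Disaster_Length"] = delta.apply(lambda d: max(d.days, 1))
```

apply calls the lambda once per row, so on a large dataset the vectorized .dt.days approach is likely to be faster.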