Pivoting COVID-19 JH data into time-series rows

import pandas as pd
import numpy as np

deaths_url = 'https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_deaths_US.csv'
confirmed_url = 'https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_US.csv'

dea = pd.read_csv(deaths_url)
con = pd.read_csv(confirmed_url)

# keep only the Texas rows
dea = dea[(dea['Province_State'] == 'Texas')]
con = con[(con['Province_State'] == 'Texas')]

# `data` is never defined in the snippet as posted; this assumes it is the Texas confirmed-cases frame
data = con

# get the most recent date in the data (the label of the last column)
mostRecentDate = con.columns[-1]

# show the data frame
con.sort_values(by=mostRecentDate, ascending=False).head(10)

# save this index variable to save the order.
index = data.columns.drop(['Province_State'])

# The pivot_table method will eliminate duplicate entries from Countries with more than one city
# (note: the result is not assigned, so this line leaves `data` unchanged)
data.pivot_table(index='Admin2', aggfunc='sum')

# formatting using a variety of methods to process and sort data
finalFrame = data.transpose().reindex(index).transpose().set_index('Admin2').sort_values(by=mostRecentDate, ascending=False).transpose()

Date       iso2  iso3  code3  FIPS   Admin2    Province_State  Country_Region  Lat          Long_         Combined_Key
1/22/2020  US    USA   840    48001  Anderson  Texas           US              31.81534745  -95.65354823  Anderson, Texas, US
1/22/2020  US    USA   840    48003  Andrews   Texas           US              32.30468633  -102.6376548  Andrews, Texas, US
1/22/2020  US    USA   840    48005  Angelina  Texas           US              31.25457347  -94.60901487  Angelina, Texas, US
1/22/2020  US    USA   840    48007  Aransas   Texas           US              28.10556197  -96.9995047   Aransas, Texas, US

1 Answer

User

#1 · Posted 2024-09-26 22:08:56

Use pandas melt. There is a great example here

For example:

In [41]: cheese = pd.DataFrame({'first': ['John', 'Mary'],
   ....:                        'last': ['Doe', 'Bo'],
   ....:                        'height': [5.5, 6.0],
   ....:                        'weight': [130, 150]})
   ....: 

In [42]: cheese
Out[42]: 
  first last  height  weight
0  John  Doe     5.5     130
1  Mary   Bo     6.0     150

In [43]: cheese.melt(id_vars=['first', 'last'])
Out[43]: 
  first last variable  value
0  John  Doe   height    5.5
1  Mary   Bo   height    6.0
2  John  Doe   weight  130.0
3  Mary   Bo   weight  150.0

In [44]: cheese.melt(id_vars=['first', 'last'], var_name='quantity')
Out[44]: 
  first last quantity  value
0  John  Doe   height    5.5
1  Mary   Bo   height    6.0
2  John  Doe   weight  130.0
3  Mary   Bo   weight  150.0

In your case, you need to operate on the data frame (i.e. con or finalFrame, or wherever your date columns are). For example:

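# id_vars takes the identifier (non-date) columns; the date columns are the ones melted into rows
con.melt(id_vars=id_columns, value_vars=date_columns, var_name='Date')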

See a specific example here
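Here is a minimal sketch of how melt might be applied to a frame shaped like the JH table. It uses a small made-up frame in place of the downloaded CSV, so the counts are illustrative only. The names id_columns, date_columns, long_df and the 'Confirmed' value column are assumptions for this example, not part of the original code. Note that id_vars receives the identifier columns and the date columns are the ones turned into rows.

```python
import pandas as pd

# Toy stand-in for the Texas slice of the JH confirmed-cases table.
# The counts are invented; only the column layout mirrors the question.
con = pd.DataFrame({
    'FIPS': [48001, 48003],
    'Admin2': ['Anderson', 'Andrews'],
    'Province_State': ['Texas', 'Texas'],
    'Combined_Key': ['Anderson, Texas, US', 'Andrews, Texas, US'],
    '1/22/20': [0, 0],
    '1/23/20': [1, 0],
    '1/24/20': [3, 2],
})

# Assumption: every column that is not an identifier is a date column.
id_columns = ['FIPS', 'Admin2', 'Province_State', 'Combined_Key']
date_columns = [c for c in con.columns if c not in id_columns]

# Wide -> long: one row per (county, date) pair.
long_df = con.melt(id_vars=id_columns, value_vars=date_columns,
                   var_name='Date', value_name='Confirmed')

# Parse the column labels into real dates so they sort chronologically.
long_df['Date'] = pd.to_datetime(long_df['Date'], format='%m/%d/%y')
long_df = long_df.sort_values(['Admin2', 'Date']).reset_index(drop=True)

print(long_df)
```

With the real data, the same call should work after replacing the toy frame with the filtered con frame and listing all of its non-date columns in id_columns.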
