Create a Python DataFrame by flattening two columns

Posted 2024-10-03 11:20:30


I have a DataFrame like this:

zip      season   season_start_date   season_end_date
zip1     winter   2015-11-25          2016-03-09

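I need to expand each row into one row per day between the start date and the end date, inclusive. The output I want looks like this: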

zip       season   date   
zip1      winter   2015-11-25
zip1      winter   2015-11-26
.
.
zip1      winter   2016-03-09

Is there an elegant way to do this? Here is some sample data:

import pandas as pd

data = {
    "zip": ["zip1", "zip1"],
    "season": ["s6", "s6"],
    "season_start_date": ["2011-01-01", "2011-01-01"],
    "season_end_date": ["2012-01-05", "2012-01-05"],
}
df = pd.DataFrame(data=data)

Thanks.


2 Answers
from datetime import datetime, timedelta

import pandas as pd

Row_to_split = df.loc[1]
Season = Row_to_split['season']
Start_Date = datetime.strptime(Row_to_split['season_start_date'], '%Y-%m-%d')
End_Date = datetime.strptime(Row_to_split['season_end_date'], '%Y-%m-%d')
# initialize new_df with the output columns
new_df = pd.DataFrame(columns=['season', 'date'])
# one output row per day, including both the start and the end date
for i in range((End_Date - Start_Date).days + 1):
    new_df.loc[i] = [Season, (Start_Date + timedelta(days=i)).strftime('%Y-%m-%d')]

Is this what you want? I'm not sure whether the zip column is an index, but it should be straightforward to add it to each row.
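For what it's worth, here is a minimal sketch of how the zip column could be carried along while handling every row rather than just one. It assumes zip is an ordinary column (not the index) and that the dates are `YYYY-MM-DD` strings; the helper name `expand_seasons` and the `sample` frame are made up for illustration.

```python
from datetime import datetime, timedelta

import pandas as pd


def expand_seasons(df):
    """Return one row per (zip, season, day), inclusive of both endpoints.

    Hypothetical helper: assumes string dates in YYYY-MM-DD format and
    a plain 'zip' column.
    """
    rows = []
    for _, row in df.iterrows():
        start = datetime.strptime(row["season_start_date"], "%Y-%m-%d")
        end = datetime.strptime(row["season_end_date"], "%Y-%m-%d")
        # +1 so that the end date itself is included
        for i in range((end - start).days + 1):
            day = (start + timedelta(days=i)).strftime("%Y-%m-%d")
            rows.append({"zip": row["zip"], "season": row["season"], "date": day})
    return pd.DataFrame(rows, columns=["zip", "season", "date"])


# Illustrative input taken from the example in the question
sample = pd.DataFrame({
    "zip": ["zip1"],
    "season": ["winter"],
    "season_start_date": ["2015-11-25"],
    "season_end_date": ["2016-03-09"],
})
expanded = expand_seasons(sample)
```

Collecting plain dicts and building the DataFrame once at the end tends to be faster than assigning to `new_df.loc[i]` inside the loop, since each `.loc` assignment on a growing frame has to reallocate.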

You need to build a DataFrame from each row and then concatenate them:

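# pd.date_range is used because recent pandas versions no longer accept
# DatetimeIndex(start=..., end=...); iterate over df, not the data dict
res = pd.concat([
    pd.DataFrame({
        'zip': r['zip'], 'season': r['season'], 'date': pd.date_range(
            start=r['season_start_date'], end=r['season_end_date'], freq='D'
        )
    }) for _, r in df.iterrows()
], sort=False)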
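If your pandas version has `DataFrame.explode` (added in 0.25, as far as I know), a possible alternative is to attach a list of dates to each row and explode it, which avoids building many small DataFrames. This is only a sketch under that assumption; the two-row `df` below is made-up illustration data, not the data from the question.

```python
import pandas as pd

# Illustrative data (an assumption, not from the question)
df = pd.DataFrame({
    "zip": ["zip1", "zip2"],
    "season": ["winter", "summer"],
    "season_start_date": ["2015-11-25", "2016-06-01"],
    "season_end_date": ["2015-11-27", "2016-06-02"],
})

# One list of daily timestamps per input row, inclusive of both endpoints
dates = pd.Series(
    [
        list(pd.date_range(start, end, freq="D"))
        for start, end in zip(df["season_start_date"], df["season_end_date"])
    ],
    index=df.index,
)

res = (
    df[["zip", "season"]]
    .assign(date=dates)
    .explode("date")          # one output row per element of each list
    .reset_index(drop=True)
)
# explode leaves an object column, so convert it back to datetime64
res["date"] = pd.to_datetime(res["date"])
```

Here `res` ends up with three rows for `zip1` and two for `zip2`, each with its `zip` and `season` values repeated alongside one date.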
