Answers to: Python: stack and enumerate dates to create new records

1 answer

Anonymous, 1 day ago

Edit: On second look, it seems you do want to fill the gaps between the dates and then forward-fill the data, which is easy to do:

<pre><code>df.set_index('transaction_dt').resample('D').ffill()
</code></pre>

If, however, you don't want a continuous index and would rather add an arbitrary number of rows <code>N</code> for each date, first move <code>transaction_dt</code> into the index:

<pre><code>df.set_index('transaction_dt', inplace=True)
</code></pre>

...then define this lambda function (it uses NumPy methods):

<pre><code>import numpy as np
import pandas as pd

# For every date in the index, produce that date plus the following n-1 days.
add_n_dates = lambda n: df.index.repeat(n) + \
    np.tile(np.arange(n) * pd.Timedelta('1 days'), df.index.size)
</code></pre>

...which expands each element of the index into <code>n</code> consecutive dates, before the final reindex and forward fill:

<pre><code>df.reindex(add_n_dates(5), method='ffill')

#                  id  units  measure
# transaction_dt
# 2014-01-06      1.0   30.0     30.5
# 2014-01-07      1.0   30.0     30.5
# 2014-01-08      1.0   30.0     30.5
# 2014-01-09      1.0   30.0     30.5
# 2014-01-10      1.0   30.0     30.5
# 2014-02-04      1.0    5.0     22.6
# 2014-02-05      1.0    5.0     22.6
# 2014-02-06      1.0    5.0     22.6
# 2014-02-07      1.0    5.0     22.6
# 2014-02-08      1.0    5.0     22.6
</code></pre>

Edit #2: Again assuming you have already set the index to <code>transaction_dt</code>, this is probably the simplest way to use the values in <code>units</code> to decide how many rows to add. It uses <code>pd.date_range</code>, passing <code>row.name</code> (that is, the row's index value) as the start and <code>row.units</code> as the number of periods, to create the required dates.

<pre><code>(
    df.apply(
        # int(): units shows up as a float above, and periods should be an integer
        lambda row: pd.Series(pd.date_range(row.name, periods=int(row.units))),
        axis=1,
    )
    .stack()
    .reset_index(level=1)
    .join(df['measure'])
    .drop('level_1', axis=1)
    .reset_index()
    .rename(columns={0: 'enumerated_dt'})
)

#    transaction_dt enumerated_dt  measure
# 0      2014-01-06    2014-01-06     30.5
# 1      2014-01-06    2014-01-07     30.5
# 2      2014-01-06    2014-01-08     30.5
# 3      2014-01-06    2014-01-09     30.5
# 4      2014-01-06    2014-01-10     30.5
# ...
# 29     2014-01-06    2014-02-04     30.5
# 30     2014-02-04    2014-02-04     22.6
# 31     2014-02-04    2014-02-05     22.6
# 32     2014-02-04    2014-02-06     22.6
# 33     2014-02-04    2014-02-07     22.6
# 34     2014-02-04    2014-02-08     22.6
</code></pre>
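For comparison, here is a minimal sketch of a vectorized way to get the same kind of result as Edit #2 without a row-wise <code>apply</code>. It is not part of the original answer. The sample frame is an assumption reconstructed from the output shown above (two transactions with 30 and 5 units), and the names <code>expanded</code>, <code>day_offset</code> and <code>result</code> are made up for illustration. Unlike the snippets above, it keeps <code>transaction_dt</code> as an ordinary column.

```python
import pandas as pd

# Hypothetical sample data, reconstructed from the output shown in the answer.
df = pd.DataFrame(
    {
        "transaction_dt": pd.to_datetime(["2014-01-06", "2014-02-04"]),
        "id": [1, 1],
        "units": [30, 5],
        "measure": [30.5, 22.6],
    }
)

# Repeat each row `units` times; the repeated rows keep their original label.
expanded = df.loc[df.index.repeat(df["units"].to_numpy())]

# Number the copies of each original row 0, 1, 2, ... within its group.
day_offset = expanded.groupby(level=0).cumcount().to_numpy()

# Shift each copy's transaction date forward by its position in days.
result = expanded.reset_index(drop=True)
result["enumerated_dt"] = result["transaction_dt"] + pd.to_timedelta(day_offset, unit="D")
result = result[["transaction_dt", "enumerated_dt", "measure"]]
```

With this sample data, <code>result</code> has 35 rows (30 + 5), one per enumerated date. Whether this is preferable to the <code>apply</code> version probably depends on the size of the frame; on large inputs, avoiding a Python-level function call per row tends to help.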
