How to reindex a pandas DataFrame by a date range without deleting values

CEISales.head(10)
Out[194]:
  Order_DateC   RegionC     SalesC
0  2014-01-30  Domestic    3530.00
1  2011-10-11  Domestic     136.00
2  1999-01-13  Domestic      30.00
3  1999-01-13  Domestic   55615.00
4  1999-01-13  Domestic     440.00
5  1999-01-13  Domestic      94.00
6  1999-01-05  Domestic     612.00
7  1999-01-14  Domestic    1067.00
8  1999-01-14  Domestic   26345.05
9  1999-01-15  Domestic  161858.72

CEIFinal.head(5)
Out[206]:
           Order_DateC RegionC  SalesC
2010-01-01         NaT     NaN     NaN
2010-01-02         NaT     NaN     NaN
2010-01-03         NaT     NaN     NaN
2010-01-04         NaT     NaN     NaN
2010-01-05         NaT     NaN     NaN

CEITest[CEITest['Order_DateC'] == '2010-01-04']
Out[210]:
      Order_DateC   RegionC   SalesC
18156  2010-01-04   Foreign    450.0
18155  2010-01-04  Domestic   1990.4
18154  2010-01-04  Domestic  37477.0
18152  2010-01-04  Domestic      0.0
18153  2010-01-04  Domestic    783.0

2 Answers

User

#1 · edited 2024-06-25 07:25:39

I think you need to set the index from the Order_DateC column before reindexing:

CEITest = CEITest.set_index('Order_DateC')

Finally, you can check for non-null values with notnull and any:

print(CEIFinal[CEIFinal.notnull().any(axis=1)])

All together:

import pandas as pd

print(CEISales)
  Order_DateC   RegionC     SalesC
0  2014-01-30  Domestic    3530.00
1  2011-10-11  Domestic     136.00
2  1999-01-13  Domestic      30.00
3  1999-01-13  Domestic   55615.00
4  1999-01-13  Domestic     440.00
5  1999-01-13  Domestic      94.00
6  1999-01-05  Domestic     612.00
7  1999-01-14  Domestic    1067.00
8  1999-01-14  Domestic   26345.05
9  1999-01-15  Domestic  161858.72

CEIFilter = CEISales[CEISales['Order_DateC'] > '2010-01-01']
CEITest = CEIFilter.sort_values('Order_DateC')
print(CEITest)
  Order_DateC   RegionC  SalesC
1  2011-10-11  Domestic     136
0  2014-01-30  Domestic    3530

# set Order_DateC as the index so the frame has a DatetimeIndex
CEITest = CEITest.set_index('Order_DateC')
print(CEITest)
              RegionC  SalesC
Order_DateC                  
2011-10-11   Domestic     136
2014-01-30   Domestic    3530

# one label per calendar day in the target range
date_index = pd.date_range(start='2010-01-01', end='2015-12-23', freq='D')

CEIFinal = CEITest.reindex(date_index)

print(CEIFinal.head())
           RegionC  SalesC
2010-01-01     NaN     NaN
2010-01-02     NaN     NaN
2010-01-03     NaN     NaN
2010-01-04     NaN     NaN
2010-01-05     NaN     NaN

The result can contain many NaT and NaN values, so check the data:

print(CEIFinal[CEIFinal.notnull().any(axis=1)])
             RegionC  SalesC
2011-10-11  Domestic     136
2014-01-30  Domestic    3530
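As a side note, the same check can probably be written with dropna. The sketch below is an illustration, not part of the original answer: it rebuilds a two-row stand-in for CEITest (hypothetical data mirroring the output above) and compares the two spellings.

```python
import pandas as pd

# Hypothetical two-row stand-in for the filtered CEITest shown above.
CEITest = pd.DataFrame(
    {"RegionC": ["Domestic", "Domestic"], "SalesC": [136.0, 3530.0]},
    index=pd.to_datetime(["2011-10-11", "2014-01-30"]),
)

date_index = pd.date_range(start="2010-01-01", end="2015-12-23", freq="D")
CEIFinal = CEITest.reindex(date_index)

# Keep rows where at least one column is non-null.
with_any = CEIFinal[CEIFinal.notnull().any(axis=1)]

# dropna(how="all") drops only the rows in which every column is null.
with_dropna = CEIFinal.dropna(how="all")

print(with_dropna)
```

Both expressions keep a row as soon as one column holds a value, so they should select the same rows here.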

Finally, you can set the index name and call reset_index; the new column takes the index name:

CEIFinal.index.name = 'CEIFinal'
CEIFinal = CEIFinal.reset_index()
print(CEIFinal.head())
   CEIFinal RegionC  SalesC
0 2010-01-01     NaN     NaN
1 2010-01-02     NaN     NaN
2 2010-01-03     NaN     NaN
3 2010-01-04     NaN     NaN
4 2010-01-05     NaN     NaN
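The two steps above can also be chained with rename_axis. This is a minimal sketch on a hypothetical two-row frame; the column name Order_DateC is chosen here only for illustration.

```python
import pandas as pd

# Hypothetical frame with an unnamed DatetimeIndex.
frame = pd.DataFrame(
    {"SalesC": [136.0, 3530.0]},
    index=pd.to_datetime(["2011-10-11", "2014-01-30"]),
)

# rename_axis names the index; reset_index then moves it into a
# regular column carrying that name and restores a default index.
flat = frame.rename_axis("Order_DateC").reset_index()

print(flat.columns.tolist())
```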

User

#2 · edited 2024-06-25 07:25:39

You are reindexing with a DatetimeIndex while the frame's index is not a DatetimeIndex:

      Order_DateC   RegionC   SalesC
18156  2010-01-04   Foreign    450.0
18155  2010-01-04  Domestic   1990.4
18154  2010-01-04  Domestic  37477.0
18152  2010-01-04  Domestic      0.0
18153  2010-01-04  Domestic    783.0

Hence the NaNs and NaTs.
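To make the cause concrete, here is a small sketch with made-up rows (the values are hypothetical and the dates are kept unique on purpose). It assumes a default integer index, as in the frame printed above.

```python
import pandas as pd

# Hypothetical rows with a default integer index, like CEITest before set_index.
CEITest = pd.DataFrame({
    "Order_DateC": pd.to_datetime(["2010-01-04", "2010-01-05"]),
    "SalesC": [450.0, 783.0],
})

date_index = pd.date_range("2010-01-01", "2010-01-05", freq="D")

# reindex aligns on index labels. No integer label equals a date,
# so every row of the result comes back empty.
mismatched = CEITest.reindex(date_index)
print(mismatched.isna().all().all())

# With the dates as the index, the labels line up and the values survive.
aligned = CEITest.set_index("Order_DateC").reindex(date_index)
print(aligned["SalesC"].notna().sum())
```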

Perhaps you want to make Order_DateC the index:

CEITest = CEITest.set_index('Order_DateC')

and then resample.

If you reindex, you will lose the rows that share a date.
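Two ways of covering every day in a range while keeping the information from repeated dates are sketched below. The sample rows are hypothetical, and summing SalesC per day is an assumption about what the asker wants. As far as I know, recent pandas versions raise an error when reindex is called on an index with duplicate labels, so neither option relies on it for the duplicated frame.

```python
import pandas as pd

# Hypothetical sample with a repeated date, similar to the 2010-01-04 rows.
CEITest = pd.DataFrame({
    "Order_DateC": pd.to_datetime(["2010-01-04", "2010-01-04", "2010-01-06"]),
    "RegionC": ["Foreign", "Domestic", "Domestic"],
    "SalesC": [450.0, 1990.4, 783.0],
})

# Option 1: one row per calendar day, summing sales that share a date.
daily = CEITest.set_index("Order_DateC")["SalesC"].resample("D").sum()

# Option 2: keep every original row and add empty rows for missing days
# by left-joining onto a frame that holds the full date range.
calendar = pd.DataFrame(
    {"Order_DateC": pd.date_range("2010-01-01", "2010-01-06", freq="D")}
)
full = calendar.merge(CEITest, on="Order_DateC", how="left")

print(daily)
print(full)
```

Option 1 collapses duplicates into a daily total; option 2 leaves each order on its own row, which seems closer to "without deleting values".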
