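I have a table of entries, shown below. Some month-end dates are missing from it, and I want to add a row for each missing month-end date. In the added rows, the first columns (FirstName, MiddleName, LastName) should carry the same values as the rest of their group, and the value columns should be filled with 0.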
FirstName MiddleName LastName Date Value1 Value2 Value3
first1 middle1 last1 1/31/2020 51 80 19
first1 middle1 last1 2/29/2020 14 44 56
first1 middle1 last1 4/30/2020 57 96 40
first1 middle1 last1 6/30/2020 58 65 3
first1 middle1 last1 8/31/2020 1 34 4
first1 middle1 last1 10/31/2020 40 38 53
first1 middle1 last1 12/31/2020 93 65 41
first1 middle1 last1 2/28/2021 3 43 0
first1 middle1 last1 4/30/2021 46 61 52
first2 middle2 last2 1/31/2020 64 19 33
first2 middle2 last2 2/29/2020 28 71 16
first2 middle2 last2 4/30/2020 2 94 78
first2 middle2 last2 5/31/2020 78 99 87
first2 middle2 last2 6/30/2020 10 70 14
first2 middle2 last2 7/31/2020 30 30 59
first2 middle2 last2 8/31/2020 55 96 73
first2 middle2 last2 10/31/2020 22 43 23
first2 middle2 last2 11/30/2020 12 4 84
first2 middle2 last2 1/31/2021 59 93 1
first2 middle2 last2 2/28/2021 19 33 52
first2 middle2 last2 3/31/2021 46 12 97
first2 middle2 last2 4/30/2021 41 44 59
first2 middle2 last2 5/31/2021 67 84 96
first2 middle2 last2 6/30/2021 52 69 78
first3 middle3 last3 4/30/2020 5 63 30
first3 middle3 last3 5/31/2020 45 22 7
first3 middle3 last3 6/30/2020 76 2 33
first3 middle3 last3 8/31/2020 81 25 52
first3 middle3 last3 9/30/2020 55 3 32
first3 middle3 last3 11/30/2020 46 45 80
first3 middle3 last3 12/31/2020 17 81 74
first3 middle3 last3 1/31/2021 98 6 55
Expected output
FirstName MiddleName LastName Date Value1 Value2 Value3
first1 middle1 last1 1/31/2020 51 80 19
first1 middle1 last1 2/29/2020 14 44 56
first1 middle1 last1 3/31/2020 0 0 0
first1 middle1 last1 4/30/2020 57 96 40
first1 middle1 last1 5/31/2020 0 0 0
first1 middle1 last1 6/30/2020 58 65 3
first1 middle1 last1 7/31/2020 0 0 0
first1 middle1 last1 8/31/2020 1 34 4
first1 middle1 last1 9/30/2020 0 0 0
first1 middle1 last1 10/31/2020 40 38 53
first1 middle1 last1 11/30/2020 0 0 0
first1 middle1 last1 12/31/2020 93 65 41
first1 middle1 last1 1/31/2021 0 0 0
first1 middle1 last1 2/28/2021 3 43 0
first1 middle1 last1 3/31/2021 0 0 0
first1 middle1 last1 4/30/2021 46 61 52
first2 middle2 last2 1/31/2020 64 19 33
first2 middle2 last2 2/29/2020 28 71 16
first2 middle2 last2 3/31/2020 0 0 0
first2 middle2 last2 4/30/2020 2 94 78
first2 middle2 last2 5/31/2020 78 99 87
first2 middle2 last2 6/30/2020 10 70 14
first2 middle2 last2 7/31/2020 30 30 59
first2 middle2 last2 8/31/2020 55 96 73
first2 middle2 last2 9/30/2020 0 0 0
first2 middle2 last2 10/31/2020 22 43 23
first2 middle2 last2 11/30/2020 12 4 84
first2 middle2 last2 12/31/2020 0 0 0
first2 middle2 last2 1/31/2021 59 93 1
first2 middle2 last2 2/28/2021 19 33 52
first2 middle2 last2 3/31/2021 46 12 97
first2 middle2 last2 4/30/2021 41 44 59
first2 middle2 last2 5/31/2021 67 84 96
first2 middle2 last2 6/30/2021 52 69 78
first3 middle3 last3 4/30/2020 5 63 30
first3 middle3 last3 5/31/2020 45 22 7
first3 middle3 last3 6/30/2020 76 2 33
first3 middle3 last3 7/31/2020 0 0 0
first3 middle3 last3 8/31/2020 81 25 52
first3 middle3 last3 9/30/2020 55 3 32
first3 middle3 last3 10/31/2020 0 0 0
first3 middle3 last3 11/30/2020 46 45 80
first3 middle3 last3 12/31/2020 17 81 74
first3 middle3 last3 1/31/2021 98 6 55
I tried using resample on the dataframe, but did not get the desired output: df = df.set_index('Date').resample('M').ffill().reset_index()
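EDIT: there is a better solution than the second one below.
Your logic was close. You can just include .groupby() before .resample and .asfreq() after it. Because .asfreq() leaves NaN in the newly added rows, the value columns still need .fillna(0).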
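A minimal sketch of that groupby / resample / asfreq chain, assuming a small made-up subset of the question's data. The helper lists names and values, and the use of pd.offsets.MonthEnd() in place of the 'M' string (which newer pandas releases deprecate), are choices made for this illustration and are not taken from the original answer.

```python
import pandas as pd

names = ["FirstName", "MiddleName", "LastName"]
values = ["Value1", "Value2", "Value3"]

# Small sample in the same shape as the question's table.
df = pd.DataFrame(
    [
        ["first1", "middle1", "last1", "1/31/2020", 51, 80, 19],
        ["first1", "middle1", "last1", "2/29/2020", 14, 44, 56],
        ["first1", "middle1", "last1", "4/30/2020", 57, 96, 40],
        ["first3", "middle3", "last3", "4/30/2020", 5, 63, 30],
        ["first3", "middle3", "last3", "6/30/2020", 76, 2, 33],
    ],
    columns=names + ["Date"] + values,
)
df["Date"] = pd.to_datetime(df["Date"], format="%m/%d/%Y")

filled = (
    df.set_index("Date")
      .groupby(names)[values]            # resample each person separately
      .resample(pd.offsets.MonthEnd())   # month-end frequency
      .asfreq()                          # missing months appear as NaN rows
      .fillna(0)                         # value columns become 0
      .astype(int)
      .reset_index()                     # names and Date back to columns
)
print(filled)
```

Each group is only filled between its own first and last date, which is what the expected output in the question shows (first3 starts at 4/30/2020 there).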
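Method 2, with reindex (more complicated)
It gets a bit complicated because you need to fill values within a group. I'm not sure whether there is a simpler way than the following solution: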
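1. Create a date_range() with freq='M' that contains the missing month-end dates. You will reindex with it later.
2. Create a unique identifier, FullName, for the groups. This new column is also passed when reindexing, so that the reindexing happens within each group.
3. Use pd.MultiIndex.from_product() to create a MultiIndex made up of #1 and #2 above.
4. Use set_index() to set FullName and Date as the index. Then .reindex the dataframe, passing the MultiIndex created in #3.
5. Use .groupby and .transform(max) to fill FirstName, MiddleName and LastName with the correct values for each FullName.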
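The steps above can be sketched roughly as follows. This is an illustration on a small made-up subset of the question's data, not the original answer's code: the helper lists names and values are hypothetical, pd.offsets.MonthEnd() stands in for freq='M', and transform("first") is swapped in for transform(max) because it picks the first non-null name in each group without comparing strings to NaN.

```python
import pandas as pd

names = ["FirstName", "MiddleName", "LastName"]
values = ["Value1", "Value2", "Value3"]

df = pd.DataFrame(
    [
        ["first1", "middle1", "last1", "1/31/2020", 51, 80, 19],
        ["first1", "middle1", "last1", "2/29/2020", 14, 44, 56],
        ["first1", "middle1", "last1", "4/30/2020", 57, 96, 40],
        ["first3", "middle3", "last3", "4/30/2020", 5, 63, 30],
        ["first3", "middle3", "last3", "6/30/2020", 76, 2, 33],
    ],
    columns=names + ["Date"] + values,
)
df["Date"] = pd.to_datetime(df["Date"], format="%m/%d/%Y")

# 1. Every month-end date between the earliest and latest date in the table.
s = pd.date_range(df["Date"].min(), df["Date"].max(), freq=pd.offsets.MonthEnd())

# 2. One identifier per person.
df["FullName"] = df["FirstName"] + " " + df["MiddleName"] + " " + df["LastName"]

# 3. All (FullName, Date) combinations.
idx = pd.MultiIndex.from_product(
    [df["FullName"].unique(), s], names=["FullName", "Date"]
)

# 4. Reindex against the full grid; missing combinations become NaN rows.
out = df.set_index(["FullName", "Date"]).reindex(idx).reset_index()

# 5. Fill the name columns within each group and zero the value columns.
out[names] = out.groupby("FullName")[names].transform("first")
out[values] = out[values].fillna(0).astype(int)
out = out[names + ["Date"] + values]
print(out)
```

Note that from_product pairs every person with every date in s, so each group is padded over the whole date span of the table. With the question's data that would add zero rows for first3 before 4/30/2020, which the expected output does not contain, so those rows would need trimming if the per-group range matters.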
You need to reindex with pd.MultiIndex.from_product(), built from FirstName and the date_range of dates, called s. This is the key to it: it is what lets you set_index and reindex to fill in the missing dates.