duplicated() returns False for everything, even though the rows are identical apart from their index
Input
import pandas as pd

# old_data and journal.data are defined elsewhere in the program.
# Drop any leftover "Unnamed" columns from the old data.
old_data = old_data.loc[:, ~old_data.columns.str.contains('^Unnamed')]

# Skip the first 10% of the old rows and keep the rest.
print("bottom_slice")
bottom_slice_length = len(old_data.index)
adjusted_bottom_slice_length = int(bottom_slice_length * 0.1)
bottom_slice = old_data[adjusted_bottom_slice_length:]
print(bottom_slice)

# Keep the first 90% of the new rows.
new_data = pd.DataFrame.from_records(journal.data)
top_slice_length = len(new_data.index)
print("top slice")
adjusted_top_slice_length = int(top_slice_length * 0.9)
top_slice = new_data[:adjusted_top_slice_length]
print(top_slice)

# Stack both slices; each keeps its own index labels.
kimera = pd.concat([top_slice, bottom_slice])
#print("kimera")
#print(kimera)
print(kimera.duplicated())
#kimera = kimera.drop_duplicates()
print("kimera1")
print(kimera)
Output
bottom_slice
client_id date ... type_id unit_price
4 94904480 2019-06-30T01:31:01+00:00 ... 11186 37177999.84
5 2113704258 2019-06-29T10:46:53+00:00 ... 12044 33996998.00
6 2115385566 2019-06-27T12:07:58+00:00 ... 11393 44899999.98
7 1732767131 2019-06-27T09:22:24+00:00 ... 38 325.24
8 93204128 2019-06-26T20:47:01+00:00 ... 11198 35999999.98
9 90216786 2019-06-25T23:51:48+00:00 ... 11172 35999999.99
10 91205905 2019-06-25T19:59:21+00:00 ... 16275 600.00
11 2113996003 2019-06-25T16:52:14+00:00 ... 11190 39999999.96
12 96345205 2019-06-25T16:39:49+00:00 ... 16275 600.00
13 95103814 2019-06-25T01:16:28+00:00 ... 11202 29999998.93
14 543983309 2019-06-24T14:05:49+00:00 ... 11172 27415377.17
15 2114159703 2019-06-23T21:20:04+00:00 ... 34 6.30
16 2114159703 2019-06-23T15:28:37+00:00 ... 16274 850.00
17 1872130440 2019-06-23T10:02:21+00:00 ... 11400 38498999.98
18 2112790910 2019-06-23T00:00:46+00:00 ... 11202 28394499.36
19 2115326382 2019-06-22T22:42:00+00:00 ... 11371 37150194.88
20 96768321 2019-06-22T17:02:14+00:00 ... 37481 88999999.99
21 1009077082 2019-06-21T23:35:03+00:00 ... 11379 42000000.00
22 755876330 2019-06-21T12:27:59+00:00 ... 11186 37177999.86
23 1556713165 2019-06-20T23:27:23+00:00 ... 11393 36997999.87
24 513171897 2019-06-19T15:58:51+00:00 ... 11381 43817993.86
25 96711003 2019-06-18T17:50:15+00:00 ... 11198 36999999.99
26 408059764 2019-06-18T15:36:49+00:00 ... 11172 35000000.00
27 1276544138 2019-06-17T21:32:47+00:00 ... 11379 41000000.00
28 94184713 2019-06-17T03:30:26+00:00 ... 37481 86999999.99
29 2113441660 2019-06-16T04:12:59+00:00 ... 37458 34948998.99
30 755284989 2019-06-15T19:54:44+00:00 ... 37458 34999999.97
31 1731319339 2019-06-13T12:00:14+00:00 ... 11379 42000000.00
32 96053157 2019-06-12T04:07:15+00:00 ... 37483 85500002.17
33 1690931127 2019-06-12T00:44:40+00:00 ... 37482 61699999.97
34 92812153 2019-06-11T05:23:09+00:00 ... 37460 36499999.99
35 2114791711 2019-06-10T16:14:59+00:00 ... 11371 41499999.99
36 1547875730 2019-06-10T15:22:53+00:00 ... 17887 999.99
37 227535700 2019-06-10T15:12:06+00:00 ... 16272 544.50
38 95165645 2019-06-10T06:32:52+00:00 ... 11393 53989999.99
39 1859791498 2019-06-10T05:35:57+00:00 ... 22460 62000000.00
40 2112629749 2019-06-09T15:46:46+00:00 ... 2549 1800000.00
41 94391975 2019-06-08T00:06:12+00:00 ... 37460 36499999.99
42 91521700 2019-06-07T14:11:45+00:00 ... 11393 49997999.98
43 1171184159 2019-06-06T18:10:19+00:00 ... 12044 33997997.81
44 96410073 2019-06-05T17:32:01+00:00 ... 11371 46999999.96
[41 rows x 10 columns]
top slice
client_id date ... type_id unit_price
0 96644839 2019-07-07T02:02:45+00:00 ... 37457 2.900000e+07
1 2113806433 2019-07-06T18:13:12+00:00 ... 37482 7.300000e+07
2 1240358507 2019-07-05T19:38:20+00:00 ... 11381 4.399900e+07
3 97005654 2019-07-05T04:12:23+00:00 ... 38 3.999900e+02
4 97005654 2019-07-05T02:49:26+00:00 ... 38 3.999900e+02
5 1857838543 2019-07-03T20:08:15+00:00 ... 37482 6.900000e+07
6 92337897 2019-07-03T14:44:32+00:00 ... 11365 4.480000e+07
7 2114793091 2019-07-01T23:04:26+00:00 ... 12044 3.000000e+07
8 95826459 2019-06-30T07:22:45+00:00 ... 37482 1.190000e+08
9 94904480 2019-06-30T01:31:01+00:00 ... 11186 3.717800e+07
10 2113704258 2019-06-29T10:46:53+00:00 ... 12044 3.399700e+07
11 2115385566 2019-06-27T12:07:58+00:00 ... 11393 4.490000e+07
12 1732767131 2019-06-27T09:22:24+00:00 ... 38 3.252400e+02
13 93204128 2019-06-26T20:47:01+00:00 ... 11198 3.600000e+07
14 90216786 2019-06-25T23:51:48+00:00 ... 11172 3.600000e+07
15 91205905 2019-06-25T19:59:21+00:00 ... 16275 6.000000e+02
16 2113996003 2019-06-25T16:52:14+00:00 ... 11190 4.000000e+07
17 96345205 2019-06-25T16:39:49+00:00 ... 16275 6.000000e+02
18 95103814 2019-06-25T01:16:28+00:00 ... 11202 3.000000e+07
19 543983309 2019-06-24T14:05:49+00:00 ... 11172 2.741538e+07
20 2114159703 2019-06-23T21:20:04+00:00 ... 34 6.300000e+00
21 2114159703 2019-06-23T15:28:37+00:00 ... 16274 8.500000e+02
22 1872130440 2019-06-23T10:02:21+00:00 ... 11400 3.849900e+07
23 2112790910 2019-06-23T00:00:46+00:00 ... 11202 2.839450e+07
24 2115326382 2019-06-22T22:42:00+00:00 ... 11371 3.715019e+07
25 96768321 2019-06-22T17:02:14+00:00 ... 37481 8.900000e+07
26 1009077082 2019-06-21T23:35:03+00:00 ... 11379 4.200000e+07
27 755876330 2019-06-21T12:27:59+00:00 ... 11186 3.717800e+07
28 1556713165 2019-06-20T23:27:23+00:00 ... 11393 3.699800e+07
29 513171897 2019-06-19T15:58:51+00:00 ... 11381 4.381799e+07
30 96711003 2019-06-18T17:50:15+00:00 ... 11198 3.700000e+07
31 408059764 2019-06-18T15:36:49+00:00 ... 11172 3.500000e+07
32 1276544138 2019-06-17T21:32:47+00:00 ... 11379 4.100000e+07
33 94184713 2019-06-17T03:30:26+00:00 ... 37481 8.700000e+07
34 2113441660 2019-06-16T04:12:59+00:00 ... 37458 3.494900e+07
35 755284989 2019-06-15T19:54:44+00:00 ... 37458 3.500000e+07
36 1731319339 2019-06-13T12:00:14+00:00 ... 11379 4.200000e+07
37 96053157 2019-06-12T04:07:15+00:00 ... 37483 8.550000e+07
38 1690931127 2019-06-12T00:44:40+00:00 ... 37482 6.170000e+07
39 92812153 2019-06-11T05:23:09+00:00 ... 37460 3.650000e+07
40 2114791711 2019-06-10T16:14:59+00:00 ... 11371 4.150000e+07
41 1547875730 2019-06-10T15:22:53+00:00 ... 17887 9.999900e+02
[42 rows x 10 columns]
0 False
1 False
2 False
3 False
4 False
5 False
6 False
7 False
8 False
9 False
10 False
11 False
12 False
13 False
14 False
15 False
16 False
17 False
18 False
19 False
20 False
21 False
22 False
23 False
24 False
25 False
26 False
27 False
28 False
29 False
...
15 False
16 False
17 False
18 False
19 False
20 False
21 False
22 False
23 False
24 False
25 False
26 False
27 False
28 False
29 False
30 False
31 False
32 False
33 False
34 False
35 False
36 False
37 False
38 False
39 False
40 False
41 False
42 False
43 False
44 False
Length: 83, dtype: bool
kimera1
client_id date ... type_id unit_price
0 96644839 2019-07-07T02:02:45+00:00 ... 37457 2.900000e+07
1 2113806433 2019-07-06T18:13:12+00:00 ... 37482 7.300000e+07
2 1240358507 2019-07-05T19:38:20+00:00 ... 11381 4.399900e+07
3 97005654 2019-07-05T04:12:23+00:00 ... 38 3.999900e+02
4 97005654 2019-07-05T02:49:26+00:00 ... 38 3.999900e+02
5 1857838543 2019-07-03T20:08:15+00:00 ... 37482 6.900000e+07
6 92337897 2019-07-03T14:44:32+00:00 ... 11365 4.480000e+07
7 2114793091 2019-07-01T23:04:26+00:00 ... 12044 3.000000e+07
8 95826459 2019-06-30T07:22:45+00:00 ... 37482 1.190000e+08
9 94904480 2019-06-30T01:31:01+00:00 ... 11186 3.717800e+07
10 2113704258 2019-06-29T10:46:53+00:00 ... 12044 3.399700e+07
11 2115385566 2019-06-27T12:07:58+00:00 ... 11393 4.490000e+07
12 1732767131 2019-06-27T09:22:24+00:00 ... 38 3.252400e+02
13 93204128 2019-06-26T20:47:01+00:00 ... 11198 3.600000e+07
14 90216786 2019-06-25T23:51:48+00:00 ... 11172 3.600000e+07
15 91205905 2019-06-25T19:59:21+00:00 ... 16275 6.000000e+02
16 2113996003 2019-06-25T16:52:14+00:00 ... 11190 4.000000e+07
17 96345205 2019-06-25T16:39:49+00:00 ... 16275 6.000000e+02
18 95103814 2019-06-25T01:16:28+00:00 ... 11202 3.000000e+07
19 543983309 2019-06-24T14:05:49+00:00 ... 11172 2.741538e+07
20 2114159703 2019-06-23T21:20:04+00:00 ... 34 6.300000e+00
21 2114159703 2019-06-23T15:28:37+00:00 ... 16274 8.500000e+02
22 1872130440 2019-06-23T10:02:21+00:00 ... 11400 3.849900e+07
23 2112790910 2019-06-23T00:00:46+00:00 ... 11202 2.839450e+07
24 2115326382 2019-06-22T22:42:00+00:00 ... 11371 3.715019e+07
25 96768321 2019-06-22T17:02:14+00:00 ... 37481 8.900000e+07
26 1009077082 2019-06-21T23:35:03+00:00 ... 11379 4.200000e+07
27 755876330 2019-06-21T12:27:59+00:00 ... 11186 3.717800e+07
28 1556713165 2019-06-20T23:27:23+00:00 ... 11393 3.699800e+07
29 513171897 2019-06-19T15:58:51+00:00 ... 11381 4.381799e+07
.. ... ... ... ... ...
15 2114159703 2019-06-23T21:20:04+00:00 ... 34 6.300000e+00
16 2114159703 2019-06-23T15:28:37+00:00 ... 16274 8.500000e+02
17 1872130440 2019-06-23T10:02:21+00:00 ... 11400 3.849900e+07
18 2112790910 2019-06-23T00:00:46+00:00 ... 11202 2.839450e+07
19 2115326382 2019-06-22T22:42:00+00:00 ... 11371 3.715019e+07
20 96768321 2019-06-22T17:02:14+00:00 ... 37481 8.900000e+07
21 1009077082 2019-06-21T23:35:03+00:00 ... 11379 4.200000e+07
22 755876330 2019-06-21T12:27:59+00:00 ... 11186 3.717800e+07
23 1556713165 2019-06-20T23:27:23+00:00 ... 11393 3.699800e+07
24 513171897 2019-06-19T15:58:51+00:00 ... 11381 4.381799e+07
25 96711003 2019-06-18T17:50:15+00:00 ... 11198 3.700000e+07
26 408059764 2019-06-18T15:36:49+00:00 ... 11172 3.500000e+07
27 1276544138 2019-06-17T21:32:47+00:00 ... 11379 4.100000e+07
28 94184713 2019-06-17T03:30:26+00:00 ... 37481 8.700000e+07
29 2113441660 2019-06-16T04:12:59+00:00 ... 37458 3.494900e+07
30 755284989 2019-06-15T19:54:44+00:00 ... 37458 3.500000e+07
31 1731319339 2019-06-13T12:00:14+00:00 ... 11379 4.200000e+07
32 96053157 2019-06-12T04:07:15+00:00 ... 37483 8.550000e+07
33 1690931127 2019-06-12T00:44:40+00:00 ... 37482 6.170000e+07
34 92812153 2019-06-11T05:23:09+00:00 ... 37460 3.650000e+07
35 2114791711 2019-06-10T16:14:59+00:00 ... 11371 4.150000e+07
36 1547875730 2019-06-10T15:22:53+00:00 ... 17887 9.999900e+02
37 227535700 2019-06-10T15:12:06+00:00 ... 16272 5.445000e+02
38 95165645 2019-06-10T06:32:52+00:00 ... 11393 5.399000e+07
39 1859791498 2019-06-10T05:35:57+00:00 ... 22460 6.200000e+07
40 2112629749 2019-06-09T15:46:46+00:00 ... 2549 1.800000e+06
41 94391975 2019-06-08T00:06:12+00:00 ... 37460 3.650000e+07
42 91521700 2019-06-07T14:11:45+00:00 ... 11393 4.999800e+07
43 1171184159 2019-06-06T18:10:19+00:00 ... 12044 3.399800e+07
44 96410073 2019-06-05T17:32:01+00:00 ... 11371 4.700000e+07
[83 rows x 10 columns]
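I want to combine two different dataframes and eliminate the duplicates. Since every row carries a date, I hope I can re-sort the result by date afterwards. But right now I can't get rid of any duplicates.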
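Choose which columns to compare. For example, if you don't care whether client_id differs, leave it out of the comparison. I would do this: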
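The following is only a minimal sketch of that idea, not the answerer's original code. It assumes `date` and `type_id` are enough to identify a row (the `key_columns` choice is an assumption), and it uses small made-up frames shaped like the question's `top_slice` and `bottom_slice`. Note that `duplicated()` compares column values only and ignores the index, so the differing index labels are not what prevents a match.

```python
import pandas as pd

# Made-up stand-ins for the question's top_slice and bottom_slice.
top_slice = pd.DataFrame({
    "client_id": [96644839, 94904480, 2113704258],
    "date": ["2019-07-07T02:02:45+00:00",
             "2019-06-30T01:31:01+00:00",
             "2019-06-29T10:46:53+00:00"],
    "type_id": [37457, 11186, 12044],
    "unit_price": [29000000.0, 37177999.84, 33996998.0],
})
bottom_slice = pd.DataFrame({
    "client_id": [94904480, 2113704258, 2115385566],
    "date": ["2019-06-30T01:31:01+00:00",
             "2019-06-29T10:46:53+00:00",
             "2019-06-27T12:07:58+00:00"],
    "type_id": [11186, 12044, 11393],
    "unit_price": [37177999.84, 33996998.0, 44899999.98],
}, index=[4, 5, 6])

kimera = pd.concat([top_slice, bottom_slice])

# Assumption: these columns identify a row; adjust to your data.
key_columns = ["date", "type_id"]

# True for every row whose key columns repeat an earlier row.
mask = kimera.duplicated(subset=key_columns)
print(mask)

# Keep the first occurrence, then sort newest first and renumber.
# ISO 8601 strings with the same UTC offset sort chronologically as text.
deduped = (
    kimera.drop_duplicates(subset=key_columns, keep="first")
          .sort_values("date", ascending=False)
          .reset_index(drop=True)
)
print(deduped)
```

If the subset comparison finds matches where the full-row comparison did not, one of the left-out columns probably differs between the two frames, for example in dtype or floating-point precision; comparing `top_slice.dtypes` with `bottom_slice.dtypes` is a quick way to check.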
Does that work for you?