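<p>I doubt there is a pandas method that solves this directly.
You have to compute the intersections manually to get the result you want. The <a href="https://pypi.python.org/pypi/intervaltree/2.0.4" rel="nofollow noreferrer">intervaltree</a> library at least makes the interval overlap computation simpler and more efficient.</p>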
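<p><code>IntervalTree.search()</code> returns the (full) intervals that overlap the provided interval, but it does not compute their intersection. That is why I also apply the <code>intersect()</code> function I defined. (In intervaltree 3.x, <code>search()</code> was replaced by <code>overlap()</code>.)</p>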
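<p>To make the distinction concrete, here is a minimal sketch that uses plain tuples rather than intervaltree itself. The helper names <code>overlaps</code> and <code>clip</code> are hypothetical, and the half-open <code>[begin, end)</code> convention is assumed to match how intervaltree treats its intervals:</p>

```python
def overlaps(a, b):
    """True if the half-open intervals a and b share at least one point."""
    return a[0] < b[1] and b[0] < a[1]


def clip(a, b):
    """Part of interval b that lies inside interval a (assumes they overlap)."""
    return max(a[0], b[0]), min(a[1], b[1])


query = (5, 12)
stored = [(0, 6), (10, 20), (12, 15)]

# An overlap query returns the matching stored intervals in full...
hits = [s for s in stored if overlaps(query, s)]
# ...while the intersection clips each hit to the query range.
clipped = [clip(query, s) for s in hits]
```

<p>Here <code>(12, 15)</code> is not a hit, because under the half-open convention the query ends just before 12.</p>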
<pre><code>import pandas as pd
from intervaltree import Interval, IntervalTree


def intersect(a, b):
    """Intersection of two intervals."""
    intersection = max(a[0], b[0]), min(a[1], b[1])
    if intersection[0] > intersection[1]:
        return None
    return intersection


def interval_df_intersection(df1, df2):
    """Calculate the intersection of two sets of intervals stored in DataFrames.

    The intervals are defined by the "Start" and "End" columns.
    The data in the rest of the columns of df1 is included with the resulting
    intervals."""
    # Store each df1 interval in the tree, with its remaining columns as data
    tree = IntervalTree.from_tuples(zip(
        df1.Start.values,
        df1.End.values,
        df1.drop(["Start", "End"], axis=1).values.tolist()
    ))

    intersections = []
    for row in df2.itertuples():
        i1 = Interval(row.Start, row.End)
        # Slice query: every tree interval overlapping [Start, End)
        intersections += [list(intersect(i1, i2)) + i2.data
                          for i2 in tree[i1.begin:i1.end]]

    # Make sure the column names are in the correct order
    data_cols = list(df1.columns)
    data_cols.remove("Start")
    data_cols.remove("End")
    return pd.DataFrame(intersections, columns=["Start", "End"] + data_cols)


interval_df_intersection(mydataframe2, mydataframe1)
</code></pre>
<p>The result is exactly what you are after.</p>
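<p>For small inputs, the output can be cross-checked against a brute-force version that compares every pair of rows. This is only a sketch: the function name <code>brute_force_intersection</code> and the sample rows are made up for illustration, rows are assumed to be <code>(Start, End, data...)</code> tuples, and intervals are treated as half-open.</p>

```python
def brute_force_intersection(rows1, rows2):
    """Intersect every interval in rows1 with every interval in rows2.

    Each row is (start, end, *data); the extra data from rows1 is kept.
    Runs in O(len(rows1) * len(rows2)), so it is only meant for checking.
    """
    result = []
    for start2, end2, *_ in rows2:
        for start1, end1, *data in rows1:
            start, end = max(start1, start2), min(end1, end2)
            if start < end:  # keep only non-empty overlaps
                result.append((start, end, *data))
    return result


# Hypothetical sample data, not taken from the question
rows1 = [(0, 10, "a"), (20, 30, "b")]
rows2 = [(5, 25)]
check = brute_force_intersection(rows1, rows2)
```

<p>The row order may differ from the tree-based version, so sort both results before comparing them.</p>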