<p>You can compare parts of the <code>Interval</code> strings and then filter out the unwanted rows with a boolean mask:</p>
<pre><code># Hour at the start of each interval (first two characters)
print(df.Interval.str[0:2])
1867    00
1868    04
1869    04
1870    08
1871    12
2838    00
2839    04
2840    04
2841    08
Name: Interval, dtype: object

# True where the start hour differs from the end hour (characters 9-10)
print(df.Interval.str[0:2] != df.Interval.str[9:11])
1867     True
1868    False
1869     True
1870     True
1871     True
2838     True
2839    False
2840     True
2841     True
Name: Interval, dtype: bool

# Keep only the rows where the two hours differ
print(df[df.Interval.str[0:2] != df.Interval.str[9:11]])
      UNIT  EXITSn_hourly           Interval
1867  R081            104  00:00:00-04:00:00
1869  R081            129  04:00:00-08:00:00
1870  R081            521  08:00:00-12:00:00
1871  R081           1048  12:00:00-16:00:00
2838  R032             38  00:00:00-04:00:00
2840  R032             89  04:00:00-08:00:00
2841  R032            470  08:00:00-12:00:00
</code></pre>
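<p>For reference, here is a minimal self-contained sketch of the same filtering step. The sample frame is hypothetical (modeled on the output above, with an invented value for the row that gets dropped), and the names <code>start_hour</code>, <code>end_hour</code>, <code>mask</code> and <code>filtered</code> are only illustrative:</p>

```python
import pandas as pd

# Hypothetical sample data modeled on the output above; values are illustrative.
df = pd.DataFrame(
    {
        "UNIT": ["R081", "R081", "R081", "R081"],
        "EXITSn_hourly": [104, 0, 129, 521],
        "Interval": [
            "00:00:00-04:00:00",
            "04:00:00-04:00:00",  # start hour equals end hour
            "04:00:00-08:00:00",
            "08:00:00-12:00:00",
        ],
    },
    index=[1867, 1868, 1869, 1870],
)

start_hour = df["Interval"].str[0:2]  # characters 0-1, e.g. "00"
end_hour = df["Interval"].str[9:11]   # characters 9-10, e.g. "04"

# Boolean mask: True where the interval spans two different hours
mask = start_hour != end_hour

# Boolean indexing keeps only the rows where the mask is True
filtered = df[mask]
print(filtered)
```

<p>This relies on every <code>Interval</code> value having the fixed <code>HH:MM:SS-HH:MM:SS</code> layout, so the slice positions always land on the hour fields.</p>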
<p>Edit:</p>
<p>I checked your code; you can probably drop <code>copy.deepcopy</code> and use <a href="http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.copy.html" rel="nofollow"><code>DataFrame.copy</code></a> instead:</p>
<pre><code>df = turnstile_data.copy(deep=True)
# Hourly counts: difference from the previous row's cumulative counter
df['ENTRIESn_hourly'] = (df['ENTRIESn'] - df['ENTRIESn'].shift(periods=1)).fillna(0)
df['EXITSn_hourly'] = (df['EXITSn'] - df['EXITSn'].shift(periods=1)).fillna(0)
# Interval label: "previous time-current time"
df['Interval'] = (df['TIMEn'].shift(periods=1) + '-' + df['TIMEn']).fillna(0)
# Zero the hourly values where the cumulative counter itself is 0
df.loc[(df['ENTRIESn'] == 0), 'ENTRIESn_hourly'] = 0
df.loc[(df['EXITSn'] == 0), 'EXITSn_hourly'] = 0
# Reset rows where C/A, UNIT or SCP changes, i.e. a new turnstile starts
df.loc[(df['C/A'] != df['C/A'].shift(periods=1)) |
       (df['UNIT'] != df['UNIT'].shift(periods=1)) |
       (df['SCP'] != df['SCP'].shift(periods=1)),
       ['ENTRIESn_hourly', 'EXITSn_hourly', 'Interval']] = 0
print(df.head(5))
   ENTRIESn_hourly  EXITSn_hourly           Interval
0                0              0                  0
1               36              3  00:00:00-04:00:00
2               13             31  04:00:00-08:00:00
3              100             69  08:00:00-12:00:00
4              195             51  12:00:00-16:00:00
required_df = df[['UNIT', 'EXITSn_hourly', 'Interval']].groupby(df.UNIT)
print(required_df.head(5))
   UNIT  EXITSn_hourly           Interval
0  R051              0                  0
1  R051              3  00:00:00-04:00:00
2  R051             31  04:00:00-08:00:00
3  R051             69  08:00:00-12:00:00
4  R051             51  12:00:00-16:00:00
</code></pre>
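<p>To try the copy-and-shift idea in isolation, here is a reduced sketch on a hypothetical two-unit sample. The data values and the helper name <code>new_unit</code> are assumptions for illustration, and only the <code>EXITSn</code> counter is handled:</p>

```python
import pandas as pd

# Hypothetical sample: two units with cumulative exit counters.
turnstile_data = pd.DataFrame(
    {
        "UNIT": ["R051", "R051", "R051", "R022", "R022"],
        "TIMEn": ["00:00:00", "04:00:00", "08:00:00", "00:00:00", "04:00:00"],
        "EXITSn": [100, 103, 134, 500, 520],
    }
)

# Deep copy: adding columns to df leaves turnstile_data untouched
df = turnstile_data.copy(deep=True)

# Difference from the previous row; the first row has no predecessor, so fill with 0
df["EXITSn_hourly"] = (df["EXITSn"] - df["EXITSn"].shift(periods=1)).fillna(0)

# Rows where the unit differs from the previous row start a new series
new_unit = df["UNIT"] != df["UNIT"].shift(periods=1)

# Reset those rows so a difference is never taken across two units
df.loc[new_unit, "EXITSn_hourly"] = 0

print(df)
```

<p>With this sample, <code>EXITSn_hourly</code> ends up as 0, 3, 31, 0, 20, and <code>turnstile_data</code> still has only its original three columns.</p>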