<p>With pandas, this task becomes more readable:</p>
<pre><code>import pandas as pd
import io
data = '''\
datatype1 designator1 3:30:14AM
datatype1 designator1 3:30:18AM
datatype1 designator1 3:45:14AM
datatype1 designator1 3:45:19AM
datatype1 designator1 3:45:26AM
datatype1 designator1 3:45:31AM
datatype1 designator1 4:10:05AM
datatype1 designator1 4:10:21AM
datatype1 designator1 4:10:30AM
datatype1 designator1 4:10:46AM'''
# Recreate dataset
df = pd.read_csv(io.StringIO(data), sep=r'\s+', header=None)
# Use this instead of above for real file
#df = pd.read_csv('path/to/file', sep=r'\s+', header=None)
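# Parse the times and sort whole rows chronologically (keeps each row intact)
df[2] = pd.to_datetime(df[2], format='%I:%M:%S%p')
df = df.sort_values(2)
# Get first and last per hour and quarter-hour bucket
hour = df[2].dt.hour.rename('hour')
quarter = (df[2].dt.minute // 15).rename('quarter')
newdf = df.groupby([hour, quarter]).agg(['first', 'last'])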
# Rename columns and drop duplicates
newdf.columns = list(range(len(newdf.columns)))
newdf.drop(newdf.columns[[1,3]], axis=1, inplace=True)
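# Format time as 12-hour with AM/PM; strip the leading zero portably
# ('%#H' only works on Windows, and %H is the 24-hour field)
newdf[[4,5]] = newdf[[4,5]].apply(lambda x: x.dt.strftime('%I:%M:%S%p').str.lstrip('0'))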
# Output (to_csv returns None when given a path, so there is nothing to print)
newdf.to_csv('output.csv', index=False, header=False, sep=' ')
</code></pre>
<p>The resulting output.csv:</p>
<pre><code>datatype1 designator1 3:30:14AM 3:30:18AM
datatype1 designator1 3:45:14AM 3:45:31AM
datatype1 designator1 4:10:05AM 4:10:46AM
</code></pre>