<p>I usually use the <a href="http://docs.python.org/2.7/library/array.html#module-array" rel="nofollow">array module</a> and its <a href="http://docs.python.org/2.7/library/array.html#array.array.fromstring" rel="nofollow">fromstring</a> method.</p>
<p>My standard pattern for working through the data in chunks is:</p>
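<pre><code>import array

def bytesfromfile(f):
    """Yield the contents of f as arrays of unsigned bytes, 8192 at a time."""
    while True:
        raw = array.array('B')
        # Python 3: use raw.frombytes(); fromstring() was removed in 3.9
        raw.fromstring(f.read(8192))
        if not raw:
            break
        yield raw

# f_in is the path to the input file
with open(f_in, 'rb') as fd_in:
    for chunk in bytesfromfile(fd_in):
        pass  # do stuff with the chunk
</code></pre>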
<p>The <code>'B'</code> above stands for unsigned char, i.e. 1 byte.</p>
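<p>To see what the <code>'B'</code> typecode means in practice, here is a small sketch. It assumes Python 3, where <code>frombytes</code> is the spelling of <code>fromstring</code>; the sample bytes are made up for illustration.</p>

```python
import array

# 'B' stores each item as an unsigned char: one byte, values 0..255.
raw = array.array('B')
raw.frombytes(b'\x00\x7f\xff')  # Python 3 spelling of fromstring()

print(raw.itemsize)  # size of a single item, in bytes
print(list(raw))     # the same bytes as plain integers
```

<p>Each item takes one byte, so a chunk of 8192 bytes read from the file becomes an array of 8192 small integers.</p>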
<p>If the file is not very large, you can simply slurp it in one go:</p>
^{pr2}$
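<p>One way to slurp a small file could look like the sketch below. It assumes Python 3; <code>slurp_bytes</code> is a hypothetical helper name, and the temporary file exists only to make the example self-contained.</p>

```python
import array
import os
import tempfile

def slurp_bytes(path):
    """Read a whole file into an array of unsigned bytes (hypothetical helper)."""
    with open(path, 'rb') as fd:
        # A bytes initializer is handed to the new array's frombytes()
        return array.array('B', fd.read())

# Demo on a throwaway temporary file
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(bytes([1, 2, 3, 250]))
data = slurp_bytes(tmp.name)
os.remove(tmp.name)

print(list(data))
```

<p>This holds the entire file in memory at once, so it only makes sense when the file comfortably fits in RAM.</p>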
<p><a href="http://www.python.org/doc/essays/list2str.html" rel="nofollow">Guido can't be wrong</a>...</p>
<p>If you prefer <a href="http://www.numpy.org/" rel="nofollow">numpy</a>, I tend to use:</p>
<pre><code>import numpy as np

fd_i = open('file.bin', 'rb')
fd_o = open('out.bin', 'wb')
while True:
    # Read as uint8
    chunk = np.fromfile(fd_i, dtype=np.uint8, count=8192)
    # An empty chunk means end of file (chunk.any() would also stop on all-zero data)
    if chunk.size == 0:
        break
    # use int for calculations since uint wraps (np.int was removed in NumPy 1.24)
    chunk = chunk.astype(int)
    # do some calculations
    data = ...
    # convert back to uint8 prior to writing.
    data = data.astype(np.uint8)
    data.tofile(fd_o)
fd_i.close()
fd_o.close()
</code></pre>
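<p>The reason for widening to a signed integer type before doing arithmetic can be shown with a short sketch. The sample values are made up, and clipping to 0..255 before converting back is an assumption about the desired behavior (a plain <code>astype(np.uint8)</code> would wrap again instead).</p>

```python
import numpy as np

chunk = np.array([250, 10, 128], dtype=np.uint8)

# uint8 array arithmetic wraps modulo 256: 250 + 10 becomes 4
wrapped = chunk + np.uint8(10)

# Widen first, compute, then clip into 0..255 before narrowing again
wide = chunk.astype(np.int64) + 10
safe = np.clip(wide, 0, 255).astype(np.uint8)

print(wrapped.tolist())
print(safe.tolist())
```

<p>Whether to clip, rescale, or deliberately wrap depends on what the data represents; the point is only that the intermediate result needs more room than 8 bits.</p>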
<p>Or read the whole file:</p>
<pre><code>In [18]: import numpy as np
In [19]: f = open('foreman_cif_frame_0.yuv', 'rb')
In [20]: data = np.fromfile(f, dtype=np.uint8)
In [21]: data[0:10]
Out[21]: array([ 10, 40, 201, 255, 247, 254, 254, 254, 254, 254], dtype=uint8)
</code></pre>