<p>You can't send bzip2-compressed data as utf-8. It is binary data, not text.</p>
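<p>A minimal sketch of that distinction, using only the standard library (the variable names are illustrative): the text is encoded to utf-8 bytes first, and the compressed result is an opaque byte string that has to be decompressed before it can be decoded as text again.</p>

```python
import bz2

text = u'Hello \N{SNOWMAN}\n' * 10
raw = text.encode('utf-8')        # text -> utf-8 bytes
compressed = bz2.compress(raw)    # utf-8 bytes -> opaque binary data

# A bzip2 stream starts with the magic bytes "BZh"; it is not text.
print(repr(compressed[:3]))

try:
    compressed.decode('utf-8')
    decodable = True
except UnicodeDecodeError:
    decodable = False
# Decoding the compressed bytes as utf-8 is expected to fail.
print(decodable)

# The round trip works only in this order: decompress, then decode.
restored = bz2.decompress(compressed).decode('utf-8')
print(restored == text)
```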
<p>If your http client accepts the bzip2 content-encoding (<a href="http://www.iana.org/assignments/http-parameters/http-parameters.xhtml#content-coding" rel="nofollow noreferrer"><code>bzip2</code> is not standard</a>), then you can send utf-8 encoded text that is compressed using bzip2:</p>
<pre><code>#!/usr/bin/env python
import bz2


def app(environ, start_response):
    status = '200 OK'
    headers = [('Content-type', 'text/plain; charset=utf-8')]
    data = (u'Hello \N{SNOWMAN}\n' * 10).encode('utf-8')
    # use bzip2 only if requested
    if 'bzip2' in environ.get('HTTP_ACCEPT_ENCODING', ''):
        data = bz2.compress(data)
        headers.append(('Content-Encoding', 'bzip2'))
    headers.append(('Content-Length', str(len(data))))
    start_response(status, headers)
    # WSGI expects an iterable of bytestrings, so wrap the body in a list
    return [data]
</code></pre>
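<p>The substring test in the handler is deliberately simple: it also matches a value such as <code>bzip2;q=0</code>, which means the client explicitly refuses bzip2. A stricter check could look roughly like the sketch below. The helper name <code>accepts_coding</code> is hypothetical (it is not part of WSGI or any library), and the sketch ignores the <code>*</code> wildcard.</p>

```python
def accepts_coding(header_value, coding):
    """Return True if `coding` is listed in an Accept-Encoding value with q > 0.

    Hypothetical helper for illustration; it does not handle the "*" wildcard.
    """
    for item in header_value.split(','):
        parts = [p.strip() for p in item.split(';')]
        if parts[0].lower() != coding:
            continue
        q = 1.0  # a coding without an explicit q-value counts as q=1
        for param in parts[1:]:
            name, _, value = param.partition('=')
            if name.strip().lower() == 'q':
                try:
                    q = float(value.strip())
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        return q > 0
    return False


print(accepts_coding('gzip, deflate, bzip2', 'bzip2'))
print(accepts_coding('bzip2;q=0', 'bzip2'))
```

<p>In the handler it would replace the substring test, e.g. <code>accepts_coding(environ.get('HTTP_ACCEPT_ENCODING', ''), 'bzip2')</code>.</p>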
<h3>Example</h3>
<p>Uncompressed response:</p>
^{pr2}$
<p>bzip2-compressed response if the client specifies that it accepts bzip2:</p>
<pre><code>$ http -v 127.0.0.1:8000 Accept-Encoding:bzip2
GET / HTTP/1.1
Accept: */*
Accept-Encoding: bzip2
Connection: keep-alive
Host: 127.0.0.1:8000
User-Agent: HTTPie/0.9.2

HTTP/1.1 200 OK
Connection: close
Content-Encoding: bzip2
Content-Length: 65
Content-type: text/plain; charset=utf-8
Date: Sun, 17 May 2015 18:48:23 GMT
Server: gunicorn/19.3.0

+-----------------------------------------+
| NOTE: binary data not shown in terminal |
+-----------------------------------------+
</code></pre>
<p>Here's the corresponding http client using the <code>requests</code> library:</p>
<pre><code>#!/usr/bin/env python
from __future__ import print_function
import bz2

import requests  # $ pip install requests

r = requests.get('http://localhost:8000',
                 headers={'Accept-Encoding': 'gzip, deflate, bzip2'})
content = r.content
print(len(content))  # size of the body as received
# requests doesn't understand bzip2, so decompress manually;
# .get() avoids a KeyError when the header is absent
if r.headers.get('Content-Encoding', '').endswith('bzip2'):
    content = bz2.decompress(content)
    print(len(content))  # size in bytes after decompression
text = content.decode(r.encoding)
print(len(text))  # size in characters
print(text, end='')
</code></pre>
<h3>Output</h3>
<pre><code>65
100
80
Hello ☃
Hello ☃
Hello ☃
Hello ☃
Hello ☃
Hello ☃
Hello ☃
Hello ☃
Hello ☃
Hello ☃
</code></pre>
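<p>The three numbers are the compressed size, the size of the utf-8 bytes, and the number of characters. The last two can be checked without a server; this is only an arithmetic sketch, and the compressed size is printed rather than assumed:</p>

```python
import bz2

line = u'Hello \N{SNOWMAN}\n'
text = line * 10
data = text.encode('utf-8')

print(len(line))                  # 8 characters per line
print(len(line.encode('utf-8')))  # 10 bytes per line: the snowman takes 3 bytes
print(len(text))                  # 80 characters in total
print(len(data))                  # 100 bytes in total
print(len(bz2.compress(data)))    # compressed size (65 in the run shown above)
```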
<hr/>
<p>Otherwise (no non-standard accept-encoding), you should send the data as <code>application/octet-stream</code>, as <a href="https://stackoverflow.com/a/30253928/4279">@icedtrees suggested</a>:</p>
<pre><code>#!/usr/bin/env python
import bz2


def app(environ, start_response):
    status = '200 OK'
    headers = [('Content-type', 'application/octet-stream')]
    data = bz2.compress((u'Hello \N{SNOWMAN}\n' * 10).encode('utf-8'))
    headers.append(('Content-Length', str(len(data))))
    start_response(status, headers)
    # WSGI expects an iterable of bytestrings, so wrap the body in a list
    return [data]
</code></pre>
<h3>Example</h3>
<pre><code>$ http 127.0.0.1:8000
HTTP/1.1 200 OK
Connection: close
Content-Length: 65
Content-type: application/octet-stream
Date: Sun, 17 May 2015 18:53:55 GMT
Server: gunicorn/19.3.0

+-----------------------------------------+
| NOTE: binary data not shown in terminal |
+-----------------------------------------+
</code></pre>
<p><code>bzcat</code> accepts the bzip2 content:</p>
<pre><code>$ http 127.0.0.1:8000 | bzcat
Hello ☃
Hello ☃
Hello ☃
Hello ☃
Hello ☃
Hello ☃
Hello ☃
Hello ☃
Hello ☃
Hello ☃
</code></pre>
<p>The data is displayed correctly because the terminal uses utf-8 encoding.</p>
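<p>If <code>bzcat</code> is not available, roughly the same step can be done in Python. This is a sketch, not a full replacement: the function name <code>bzcat</code> is reused here only for illustration, the whole body is read into memory, and an in-memory stream stands in for the http response.</p>

```python
import bz2
import io


def bzcat(stream_in, stream_out):
    """Decompress a whole bzip2 stream from stream_in and write it to stream_out."""
    stream_out.write(bz2.decompress(stream_in.read()))


# Stand-in for the response body produced by the application/octet-stream app.
body = bz2.compress((u'Hello \N{SNOWMAN}\n' * 10).encode('utf-8'))

out = io.BytesIO()
bzcat(io.BytesIO(body), out)

# The decompressed bytes are utf-8, so they decode back to the original text.
print(out.getvalue().decode('utf-8') == u'Hello \N{SNOWMAN}\n' * 10)
```

<p>In a shell pipeline on Python 3, the binary streams <code>sys.stdin.buffer</code> and <code>sys.stdout.buffer</code> would take the place of the two <code>io.BytesIO</code> objects.</p>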