UnicodeDecodeError in a Python/Django application

Published 2024-06-28 11:06:10

I am getting this error:

UnicodeDecodeError at /select_text
'utf-8' codec can't decode byte 0xe7 in position 92: invalid continuation byte

Request Method: POST
Request URL: http://agata.pgie.ufrgs.br/select_text
Django Version: 2.0.1
Exception Type: UnicodeDecodeError
Exception Value: 'utf-8' codec can't decode byte 0xe7 in position 92: invalid continuation byte
Exception Location: /home/metis/public_html/AGATA/agataenv/lib/python3.4/codecs.py in decode, line 319
Python Executable: /usr/bin/python3
Python Version: 3.4.3
Python Path:
['/home/metis/public_html/AGATA',
 '/home/metis/public_html/AGATA/agataenv/lib/python3.4',
 '/home/metis/public_html/AGATA/agataenv/lib/python3.4/plat-x86_64-linux-gnu',
 '/home/metis/public_html/AGATA/agataenv/lib/python3.4/lib-dynload',
 '/usr/lib/python3.4',
 '/usr/lib/python3.4/plat-x86_64-linux-gnu',
 '/home/metis/public_html/AGATA/agataenv/lib/python3.4/site-packages']
Server time: Thu, 22 Feb 2018 12:29:51 +0000

Unicode error hint
The string that could not be encoded/decoded was: Varia��es

Environment:

Request Method: POST
Request URL: http://agata.pgie.ufrgs.br/select_text

Django Version: 2.0.1
Python Version: 3.4.3
Installed Applications:
['django.contrib.admin',
 'django.contrib.auth',
 'django.contrib.contenttypes',
 'django.contrib.sessions',
 'django.contrib.messages',
 'django.contrib.staticfiles',
 'textMining',
 'bootstrapform']
Installed Middleware:
['django.middleware.security.SecurityMiddleware',
 'django.contrib.sessions.middleware.SessionMiddleware',
 'django.middleware.common.CommonMiddleware',
 'django.middleware.csrf.CsrfViewMiddleware',
 'django.contrib.auth.middleware.AuthenticationMiddleware',
 'django.contrib.messages.middleware.MessageMiddleware',
 'django.middleware.clickjacking.XFrameOptionsMiddleware']



Traceback:

File "/home/metis/public_html/AGATA/agataenv/lib/python3.4/site-packages/django/core/handlers/exception.py" in inner
  35.             response = get_response(request)

File "/home/metis/public_html/AGATA/agataenv/lib/python3.4/site-packages/django/core/handlers/base.py" in _get_response
  128.                 response = self.process_exception_by_middleware(e, request)

File "/home/metis/public_html/AGATA/agataenv/lib/python3.4/site-packages/django/core/handlers/base.py" in _get_response
  126.                 response = wrapped_callback(request, *callback_args, **callback_kwargs)

File "/home/metis/public_html/AGATA/textMining/views.py" in select_text
  59.     text_mining = TextMining(file_path, keywords)

File "/home/metis/public_html/AGATA/textMining/TextMining.py" in __init__
  15.         self.separete_file_sentences()

File "/home/metis/public_html/AGATA/textMining/TextMining.py" in separete_file_sentences
  31.             file_text = text_file.read().decode('string-escape').decode("utf-8")

File "/home/metis/public_html/AGATA/agataenv/lib/python3.4/codecs.py" in decode
  319.         (result, consumed) = self._buffer_decode(data, self.errors, final)

Exception Type: UnicodeDecodeError at /select_text
Exception Value: 'utf-8' codec can't decode byte 0xe7 in position 92: invalid continuation byte

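My Django application is already deployed on Apache, and I cannot figure out where the problem is, because I am handling the encoding (at least I think I am...).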

My code (in order):

The relevant methods of the TextMining class:

class TextMining(object):
    def __init__(self, file_path, keywords):
        self._file_path = file_path
        self._keywords = keywords
        self._sentences = list()
        self._keyword_sentences = dict()

        self.lower_keywords()
        self.separete_file_sentences()
...
    def separete_file_sentences(self):
        with open(self._file_path, "r", encoding='utf-8') as text_file:
            file_text = text_file.read()
            sentences = nltk.tokenize.sent_tokenize(file_text)

            for i in range(len(sentences)):
                if(len(sentences[i]) > 0):
                    self._sentences.append(sentences[i])

I have been dealing with this for several days and have tried many approaches, but nothing has worked.

In urls.py (textMining app):

urlpatterns = [
        url(r'^$', views.index, name='index'),
        url(r'^select_text', views.select_text, name = 'select_text'),
        url(r'^edit_text', views.edit_text, name = 'edit_text'),
        url(r'^generate_aiml', views.generate_aiml, name = 'generate_aiml'),
]

In urls.py (text management project):

urlpatterns = [
    url(r'^admin/', admin.site.urls),
    url(r'^', include('textMining.urls')),
] + static(settings.STATIC_URL, document_root=settings.STATIC_ROOT)

if settings.DEBUG is True:
    urlpatterns += static(settings.MEDIA_URL, document_root=settings.MEDIA_ROOT)
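A note on the error itself: byte 0xe7 is "ç" in Latin-1 (ISO-8859-1), and the hint string looks like "Variações", so the uploaded file may be saved as Latin-1 rather than UTF-8. That is an assumption; the file's real encoding is not shown in the question. A minimal sketch of reading the raw bytes and trying encodings in order follows. The helper name read_text_with_fallback and the encoding list are hypothetical and not part of the original code.

```python
import os
import tempfile


def read_text_with_fallback(path, encodings=("utf-8", "latin-1")):
    """Return the file's text, trying each encoding in order.

    Hypothetical helper: the encoding list is an assumption about
    which encodings the uploaded files might use.
    """
    # Read bytes so that no implicit decoding happens inside open().
    with open(path, "rb") as raw_file:
        raw = raw_file.read()

    for encoding in encodings:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            # This encoding does not fit the bytes; try the next one.
            continue

    raise ValueError(
        "could not decode {!r} with any of {!r}".format(path, encodings)
    )


def demo():
    # Write "Variações" as Latin-1, which produces the 0xe7 byte
    # that the 'utf-8' codec rejects.
    fd, path = tempfile.mkstemp(suffix=".txt")
    try:
        os.write(fd, "Variações".encode("latin-1"))
        os.close(fd)
        print(read_text_with_fallback(path))
    finally:
        os.remove(path)


if __name__ == "__main__":
    demo()
```

In separete_file_sentences, the open(..., encoding='utf-8') and read() pair could be replaced with a call to such a helper. Note that Latin-1 accepts any byte sequence, so the fallback never fails but can silently produce wrong characters if the file actually uses another encoding; re-saving the input files as UTF-8 avoids the guesswork.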
