<p>I couldn't find any Python profanity-filter library, so I wrote my own.</p>
<h2>Parameters</h2>
<hr/>
<h3><code>filterlist</code></h3>
<p>A list of regular expressions that match the forbidden words. Do not use <code>\b</code> in them; it is inserted automatically depending on <code>inside_words</code>.</p>
<p>Example:
<code>['bad', r'un\w+']</code></p>
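<p>As a rough sketch of how such a list can be turned into a single pattern, the snippet below joins the entries with <code>|</code> and wraps the group in <code>\b</code> once, which is why the individual entries should not carry their own boundaries. The variable names are illustrative only and are not part of the module.</p>

```python
import re

# Illustrative filter list, mirroring the example above.
filterlist = ['bad', r'un\w+']

# Join the entries into one alternation inside a single group.
combined = '|'.join(filterlist)

# Add the word boundaries once, around the whole group.
pattern = re.compile(r'\b(%s)\b' % combined, re.IGNORECASE)

matches = pattern.findall("I am doing bad ungood badlike things.")
print(matches)  # ['bad', 'ungood']
```

<p>"badlike" is not matched here because the trailing <code>\b</code> requires the word to end right after "bad".</p>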
<h3><code>ignore_case</code></h3>
<p>Default: <code>True</code></p>
<p>Self-explanatory: whether matching ignores letter case.</p>
<h3><code>replacements</code></h3>
<p>Default: <code>"$@%-?!"</code></p>
<p>A string of characters from which the replacement string is randomly generated.</p>
<p>Examples: <code>"%&$?!"</code>, <code>"-"</code>, etc.</p>
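<p>A minimal sketch of the random generation, assuming one character is drawn independently for each position of the matched word. <code>make_clean_word</code> here is a standalone stand-in for illustration, not the module's method.</p>

```python
import random


def make_clean_word(length, replacements="$@%-?!"):
    """Return `length` characters drawn at random from `replacements`."""
    return ''.join(random.choice(replacements) for _ in range(length))


masked = make_clean_word(6)                    # e.g. "?%-$@!", varies per run
single = make_clean_word(4, replacements="-")  # only one choice per position

print(masked)
print(single)  # ----
```

<p>With a single-character string such as <code>"-"</code> the result is deterministic, which is convenient for tests and examples.</p>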
<h3><code>complete</code></h3>
<p>Default: <code>True</code></p>
<p>Controls whether the entire matched word is replaced, or whether its first and last characters are kept.</p>
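<p>The two modes can be illustrated with a small hypothetical helper. <code>mask</code> and its <code>fill</code> parameter are assumptions made for this sketch, and masking words shorter than three characters completely is a choice made here, since such words have no interior to hide.</p>

```python
def mask(word, complete=True, fill="-"):
    """Hypothetical helper: mask a word fully, or keep its outer characters."""
    if complete or len(word) < 3:
        # Assumption: very short words are masked completely.
        return fill * len(word)
    # Keep the first and last character, mask everything in between.
    return word[0] + fill * (len(word) - 2) + word[-1]


full = mask("ungood", complete=True)
partial = mask("ungood", complete=False)
short = mask("bad", complete=False)

print(full)     # ------
print(partial)  # u----d
print(short)    # b-d
```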
<h3><code>inside_words</code></h3>
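<p>Default: <code>False</code></p>
<p>Controls whether forbidden words are also matched inside other words. When this is disabled, the combined pattern is wrapped in <code>\b</code> word boundaries, so only whole words match.</p>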
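<p>To make the difference concrete, here is a small comparison of the same pattern with and without word boundaries, using plain <code>re</code> calls and a fixed <code>---</code> replacement rather than the filter class itself.</p>

```python
import re

text = "I am doing bad ungood badlike things."

# Comparable to inside_words=False: boundaries restrict matches to whole words.
whole_words = re.compile(r'\b(bad)\b')
# Comparable to inside_words=True: no boundaries, so substrings match too.
anywhere = re.compile(r'(bad)')

strict = whole_words.sub('---', text)
loose = anywhere.sub('---', text)

print(strict)  # I am doing --- ungood badlike things.
print(loose)   # I am doing --- ungood ---like things.
```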
<h2>Module source</h2>
<hr/>
<p>(usage examples are at the end)</p>
<pre><code>"""
Module that provides a class that filters profanities.
"""

__author__ = "leoluk"
__version__ = '0.0.1'

import random
import re


class ProfanitiesFilter(object):
    def __init__(self, filterlist, ignore_case=True, replacements="$@%-?!",
                 complete=True, inside_words=False):
        """
        Inits the profanity filter.

        filterlist -- a list of regular expressions that
        match words that are forbidden
        ignore_case -- ignore capitalization
        replacements -- string with characters to replace the forbidden word
        complete -- completely remove the word or keep the first and last char?
        inside_words -- search inside other words?
        """
        self.badwords = filterlist
        self.ignore_case = ignore_case
        self.replacements = replacements
        self.complete = complete
        self.inside_words = inside_words

    def _make_clean_word(self, length):
        """
        Generates a random replacement string of a given length
        using the chars in self.replacements.
        """
        return ''.join(random.choice(self.replacements)
                       for _ in range(length))

    def __replacer(self, match):
        value = match.group()
        # Matches shorter than three characters have no interior to mask,
        # so they are replaced completely (this also avoids duplicating
        # a one-character match).
        if self.complete or len(value) &lt; 3:
            return self._make_clean_word(len(value))
        return value[0] + self._make_clean_word(len(value) - 2) + value[-1]

    def clean(self, text):
        """Cleans a string from profanity."""
        # Wrap the alternation in word boundaries unless matches
        # inside other words are wanted.
        regexp_insidewords = {
            True: r'(%s)',
            False: r'\b(%s)\b',
        }
        regexp = (regexp_insidewords[self.inside_words] %
                  '|'.join(self.badwords))
        r = re.compile(regexp, re.IGNORECASE if self.ignore_case else 0)
        return r.sub(self.__replacer, text)


if __name__ == '__main__':
    f = ProfanitiesFilter(['bad', r'un\w+'], replacements="-")
    example = "I am doing bad ungood badlike things."

    print(f.clean(example))
    # Prints "I am doing --- ------ badlike things."

    f.inside_words = True
    print(f.clean(example))
    # Prints "I am doing --- ------ ---like things."

    f.complete = False
    print(f.clean(example))
    # Prints "I am doing b-d u----d b-dlike things."
</code></pre>