Subtitles extremely clean
Project page: https://github.com/ratoaq2/cleanit
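cleanit is a command-line tool (written in Python) that helps you keep your subtitles clean. You can specify rules to detect subtitle entries to be removed or patterns to be replaced. Rules can use simple text matching or complex regular expressions.
Usage
CLI
Clean subtitles: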
    $ cleanit --config my-config.yml my-subtitle.srt
    Collected 1 subtitles
    Saving <Subtitle [my-subtitle.srt]>
    Saved <Subtitle [my-subtitle.srt]>
Library
How to clean the subtitle at a specific path using a specific configuration:
    from cleanit.api import clean_subtitle, save_subtitle
    from cleanit.config import Config
    from cleanit.subtitle import Subtitle

    subtitle = Subtitle('/subtitle/path')
    config = Config.from_file('/config/path')
    # Save only when cleaning actually changed the subtitle
    if clean_subtitle(subtitle, config.rules):
        save_subtitle(subtitle)
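YAML configuration file
The YAML configuration file has two main sections: templates and groups.
- templates: help you define common configuration snippets to be reused in several groups.
- groups: where you define your rules.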
    # Reference:
    #   type: [text*, regex]
    #   match: [contains*, exact, startswith, endswith]
    #   flags: [ignorecase, dotall, multiline, locale, unicode, verbose]
    #   whitelist: no*
    #   rules:
    #     - sometext
    #     - (\b)(\d{1,2})x(\d{1,2})(\b): {replacement: \1S\2E\3\4, type: regex, match: contains, flags: [unicode], whitelist: no}

    templates:
      common:
        type: text
        match: contains

    groups:
      # Groups can have any name. In this case, 'blacklist' holds all the rules to remove subtitle entries
      blacklist:
        template: common
        rules:
          # Removes any subtitle entry that contains the word FooBar
          - FooBar

          # Removes any subtitle entry that contains the pattern S00E00
          # Example:
          #   My Series S01E02
          - \bs\d{2}\s?e\d{2}\b: { type: regex, flags: ignorecase }

          # Removes any subtitle entry that is exactly the word: 'Ah' or 'Oh' (with 1 or more h)
          # Example:
          #   Ohhh!
          - ((Ah+)|(Oh+))\W?: { match: exact }

      # The group 'tidy' has all rules to replace certain patterns in your subtitles.
      tidy:
        template: common
        type: regex
        rules:
          # Description: Replace extra spaces with a single space
          # Example:
          #   Foo      bar.
          #   to
          #   Foo bar.
          - \s{2,}: ' '

          # Description: Add space when starting phrase with '-'. It ignores tags, such as <i>, <b>
          # Example:
          #   <i>-Francine, what has happened?
          #   -What has happened? You tell me!</i>
          #   to
          #   <i>- Francine, what has happened?
          #   - What has happened? You tell me!</i>
          - '(?:^(|(?:\<\w\>)))-([''"]?\w+)': { replacement: '\1- \2', flags: [multiline, unicode] }
* Default value if no value is defined
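To get a feel for what the example rules match before putting them in a configuration file, the patterns can be tried directly with Python's standard `re` module. This is only a sketch and does not call cleanit itself: the helper names `tidy` and `is_blacklisted` are made up for illustration, and it assumes that `contains` behaves like `re.search` and `exact` like `re.fullmatch`, which may differ from how cleanit evaluates rules internally.

```python
import re

# Patterns copied from the example configuration; flags mirror the YAML 'flags' entries.
EXTRA_SPACES = re.compile(r"\s{2,}")
LEADING_DASH = re.compile(r"""(?:^(|(?:\<\w\>)))-(['"]?\w+)""", re.MULTILINE | re.UNICODE)
EPISODE_TAG = re.compile(r"\bs\d{2}\s?e\d{2}\b", re.IGNORECASE)
INTERJECTION = re.compile(r"((Ah+)|(Oh+))\W?")


def tidy(text):
    """Hypothetical helper: apply the two 'tidy' replacements in order."""
    text = EXTRA_SPACES.sub(" ", text)
    return LEADING_DASH.sub(r"\1- \2", text)


def is_blacklisted(text):
    """Hypothetical helper: True if any 'blacklist' rule would match the entry."""
    # Assumption: 'contains' ~ substring / re.search, 'exact' ~ re.fullmatch.
    return bool(
        "FooBar" in text
        or EPISODE_TAG.search(text)
        or INTERJECTION.fullmatch(text)
    )


if __name__ == "__main__":
    print(tidy("<i>-Francine, what has happened?\n-What has happened? You tell me!</i>"))
    print(is_blacklisted("My Series S01E02"))
```

Checking patterns this way is a quick sanity test for the regular expressions only; the final behavior still depends on how cleanit applies the rule type, match mode and flags.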
If no configuration file is specified, cleanit will try to load one from ~/.config/cleanit/config.yml.
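When calling cleanit as a library, a script may want to mirror this fallback itself. The following is a minimal sketch under that assumption: `resolve_config_path` is a hypothetical helper, not part of cleanit, and it only computes a path without checking that the file exists.

```python
import os

# Default location documented for the command-line tool.
DEFAULT_CONFIG = "~/.config/cleanit/config.yml"


def resolve_config_path(cli_path=None):
    """Hypothetical helper: prefer an explicit path, else fall back to the default."""
    chosen = cli_path if cli_path else DEFAULT_CONFIG
    # Expand '~' to the user's home directory and normalize to an absolute path.
    return os.path.abspath(os.path.expanduser(chosen))


if __name__ == "__main__":
    print(resolve_config_path())
    print(resolve_config_path("my-config.yml"))
```

The resulting path could then be passed to `Config.from_file`, as in the library example.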
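Changelog
0.2.1
Release date: 2016-02-28
- Added encoding guessing without the python-magic dependency.
0.2
Release date: 2016-02-27
- Removed the chardet and python-magic dependencies. The encoding is either specified explicitly or guessed by pysrt.
0.1
Release date: 2015-10-16
- Initial release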