argparse for mail-style headers
Detailed description of the headerparser Python project
GitHub | PyPI | Documentation | Issues | Changelog
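headerparser parses key-value pairs in the style of RFC 822 (e-mail) headers and converts them into case-insensitive dictionaries with the trailing message body (if any) attached. Fields can be converted to other types, marked required, or given default values using an API based on the standard library's argparse module. (Everyone loves argparse, right?) Low-level functions for just scanning header fields (breaking them into sequences of key-value pairs without any further processing) are also included.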
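The Format
RFC 822-style headers are header fields that follow the general format of e-mail headers as specified by RFC 822 and its successors: each field is a line of the form "Name: Value", with long values continued onto multiple lines ("folded") by indenting the extra lines. A blank line marks the end of the header section and the beginning of the message body.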
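To make the grammar concrete, here is a minimal pure-Python sketch of the three rules described above (one "Name: Value" field per line, indented continuation lines, a blank line before the body). The function name scan_headers is hypothetical and this is not headerparser's implementation; it only illustrates the format and skips the edge cases a real scanner has to handle.

```python
def scan_headers(text):
    """Split RFC 822-style text into a list of (name, value) pairs and a body.

    Illustrative sketch only; not how headerparser itself is implemented.
    """
    lines = text.splitlines()
    fields = []
    body = None
    for i, line in enumerate(lines):
        if line == "":
            # A blank line ends the header section; everything after it is the body.
            body = "\n".join(lines[i + 1:])
            break
        if line[0] in " \t" and fields:
            # An indented line continues ("folds") the previous field's value.
            name, value = fields[-1]
            fields[-1] = (name, value + "\n" + line)
        else:
            # An ordinary field: split at the first colon.
            name, sep, value = line.partition(":")
            if not sep:
                raise ValueError(f"not a header line: {line!r}")
            fields.append((name.strip(), value.strip()))
    return fields, body


sample = "Subject: A long\n  folded value\nTo: someone@example.com\n\nHello!\n"
fields, body = scan_headers(sample)
print(fields)
print(body)
```

The folded value keeps its embedded newline and indentation, which mirrors how the multi-line values appear in the examples further down.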
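Besides e-mail, this basic grammar is used by numerous textual formats, including but not limited to:
- HTTP request and response headers
- Usenet messages
- most Python packaging metadata files
- Debian packaging control files
- META-INF/MANIFEST.MF files in Java JARs
- a subset of the YAML serialization format
All of these can be parsed by this package.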
Examples
Define a parser:
>>> import headerparser
>>> parser = headerparser.HeaderParser()
>>> parser.add_field('Name', required=True)
>>> parser.add_field('Type', choices=['example', 'demonstration', 'prototype'], default='example')
>>> parser.add_field('Public', type=headerparser.BOOL, default=False)
>>> parser.add_field('Tag', multiple=True)
>>> parser.add_field('Data')
Parse some headers and inspect the results:
>>> msg = parser.parse_string('''\
... Name: Sample Input
... Public: yes
... tag: doctest, examples,
...  whatever
... TAG: README
...
... Wait, why I am using a body instead of the "Data" field?
... ''')
>>> sorted(msg.keys())
['Name', 'Public', 'Tag', 'Type']
>>> msg['Name']
'Sample Input'
>>> msg['Public']
True
>>> msg['Tag']
['doctest, examples,\n whatever', 'README']
>>> msg['TYPE']
'example'
>>> msg['Data']
Traceback (most recent call last):
    ...
KeyError: 'data'
>>> msg.body
'Wait, why I am using a body instead of the "Data" field?\n'
Parsing fails for headers that don't meet your requirements:
>>> parser.parse_string('Type: demonstration')
Traceback (most recent call last):
    ...
headerparser.errors.MissingFieldError: Required header field 'Name' is not present
>>> parser.parse_string('Name: Bad type\nType: other')
Traceback (most recent call last):
    ...
headerparser.errors.InvalidChoiceError: 'other' is not a valid choice for 'Type'
>>> parser.parse_string('Name: unknown field\nField: Value')
Traceback (most recent call last):
    ...
headerparser.errors.UnknownFieldError: Unknown header field 'Field'
Allow fields you didn't even think of:
>>> parser.add_additional()
>>> msg = parser.parse_string('Name: unknown field\nField: Value')
>>> msg['Field']
'Value'
Just split some headers into names and values, and worry about validity later:
>>> for field in headerparser.scan_string('''\
... Name: Scanner Sample
... Unknown headers: no problem
... Unparsed-Boolean: yes
... CaSe-SeNsItIvE-rEsUlTs: true
... Whitespace around colons:optional
... Whitespace around colons : I already said it's optional.
...  That means you have the _option_ to use as much as you want!
...
... And there's a body, too, I guess.
... '''): print(field)
('Name', 'Scanner Sample')
('Unknown headers', 'no problem')
('Unparsed-Boolean', 'yes')
('CaSe-SeNsItIvE-rEsUlTs', 'true')
('Whitespace around colons', 'optional')
('Whitespace around colons', "I already said it's optional.\n That means you have the _option_ to use as much as you want!")
(None, "And there's a body, too, I guess.\n")
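Since the scanner yields plain tuples, "worrying about validity later" can be ordinary Python. The sketch below is one possible way to post-process pairs shaped like the ones printed above; it assumes the body, if present, arrives as a pair whose name is None. The helper name split_scanned is hypothetical and not part of headerparser, and the input is a hand-written list so the example runs without the library installed.

```python
from collections import defaultdict


def split_scanned(pairs):
    """Separate scanned (name, value) pairs into a header mapping and a body.

    Hypothetical helper; assumes the body is reported as a pair named None.
    """
    headers = defaultdict(list)
    body = None
    for name, value in pairs:
        if name is None:
            body = value
        else:
            # Lowercase the name so lookups ignore case, and keep every
            # value because a field may appear more than once.
            headers[name.lower()].append(value)
    return dict(headers), body


# Hand-written pairs in the same shape as the scanner output shown above.
pairs = [
    ("Name", "Scanner Sample"),
    ("Whitespace around colons", "optional"),
    ("Whitespace around colons", "I already said it's optional."),
    (None, "And there's a body, too, I guess.\n"),
]
headers, body = split_scanned(pairs)
print(headers["whitespace around colons"])
print(body)
```

Collecting values into lists keeps repeated fields intact, leaving it to later validation code to decide whether duplicates are an error.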