Parsing Bash redirects in Python 3

Posted 2024-10-02 02:38:07


I'm currently writing a command-parser module in Python for a library. It takes fairly complex bash pipelines, splits them up, and parses the individual pieces.

The regex the parser uses isn't complicated, but it does use named groups:

/(?P<command>.*?)( ((?P<redirect>[&\d-]?)>+ ?&?(?P<filename>\S+)))( ?< ?(?P<infile>.*))?/g

With the regex in this form, I have been testing against the following [contrived] strings:

sed 's/24/25/g'                            # Doesn't pass but not necessary
sed 's/24/25/g' &>/dev/null                # Works
sed 's/24/25/g' 1>&2                       # Works
sed 's/24/25/g' 2>&1 1>/dev/null           # Works
sed 's/24/25/g' &>/dev/null < infile.txt   # Works
grep -rin --col 'i < 24\|b>19' > /dev/null            # Works
grep -rin --col 'i < 24\|b > 19' > /dev/null          # Doesn't work

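I'm not concerned about matching the bare sed 's/24/25/g', because when nothing matches I can simply assign the whole string as command. The last grep is a problem, though, because it is entirely plausible for a command supplied this way to contain a > symbol inside a quoted argument.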
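To make the failure concrete, the following minimal sketch applies the pattern from the Python example in this question to the last test string and prints what the first match captures. Nothing new is assumed beyond the variable names `failing` and `first`, which are only illustrative.

```python
import re

# Same pattern as in the question's Python example.
redirect_regex = re.compile(
    r"(?P<command>.*?)( (?P<redirect>[&\d]?)>+ ?&?(?P<filename>\S+))( ?< ?(?P<infile>.*))?",
    re.DOTALL,
)

failing = r"grep -rin --col 'i < 24\|b > 19' > /dev/null"
first = redirect_regex.search(failing)

# The lazy `command` group stops at the first " >", which sits inside the
# quoted grep pattern, so the quote is split across two groups.
print(repr(first.group("command")))
print(repr(first.group("filename")))
```

The captured command ends with an unbalanced single quote, which is why `shlex.split` later reports "No closing quotation" for this input.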

Question: can this regex be rewritten to handle the last example without having to resort to PCRE?

Example (Python 3):

import re
import shlex
from collections import namedtuple

redirect_regex = re.compile(r"(?P<command>.*?)( (?P<redirect>[&\d]?)>+ ?&?(?P<filename>\S+))( ?< ?(?P<infile>.*))?", re.DOTALL)
command_list = [
    "sed 's/24/25/g'",
    "sed 's/24/25/g' &>/dev/null",
    "sed 's/24/25/g' 1>&2",
    "sed 's/24/25/g' 2>&1 1>/dev/null",
    "sed 's/24/25/g' &>/dev/null < infile.txt",
    # raw strings keep the literal backslash in \| without an invalid-escape warning
    r"grep -rin --col 'i < 24\|b>19' > /dev/null",
    r"grep -rin --col 'i < 24\|b > 19' > /dev/null"
]

command_structure = namedtuple('CommandStructure', 'command arguments redirects')
redirect = namedtuple('Redirect', 'stdout stderr stdin')
commands = []

for command in command_list:
    for com in command.split(' | '):
        structure = None
        matches = [match.groupdict() for match in redirect_regex.finditer(com)]
        if len(matches) == 0:
            structure = shlex.split(com)
            commands.append(
                command_structure(
                    command=structure[:1],
                    arguments=structure[1:],
                    redirects=None
                )
            )
        else:
            try:
                structure = shlex.split(matches[0]['command'])
            except ValueError as exception:
                print('Failed to parse command "' + com + '"')
                print('    reason was: ' + str(exception))
                continue
            structure_redirects = []
            for match in matches:
                stdout = match['filename'] if match['redirect'] in ['1', '', '&'] else None
                stderr = match['filename'] if match['redirect'] in ['2', '&'] else None
                # groupdict() maps an unmatched optional group to None
                stdin = match['infile']
                structure_redirects.append(
                    redirect(stdout=stdout, stderr=stderr, stdin=stdin)
                )
            commands.append(
                command_structure(
                    command=structure[:1],
                    arguments=structure[1:],
                    redirects=structure_redirects
                )
            )

print('--------------------------------')
for command in commands:
    print(command)

Output

Failed to parse command "grep -rin --col 'i < 24\|b > 19' > /dev/null"
    reason was: No closing quotation
--------------------------------
CommandStructure(command=['sed'], arguments=['s/24/25/g'], redirects=None)
CommandStructure(command=['sed'], arguments=['s/24/25/g'], redirects=[Redirect(stdout='/dev/null', stderr='/dev/null', stdin=None)])
CommandStructure(command=['sed'], arguments=['s/24/25/g'], redirects=[Redirect(stdout='2', stderr=None, stdin=None)])
CommandStructure(command=['sed'], arguments=['s/24/25/g'], redirects=[Redirect(stdout=None, stderr='1', stdin=None), Redirect(stdout='/dev/null', stderr=None, stdin=None)])
CommandStructure(command=['sed'], arguments=['s/24/25/g'], redirects=[Redirect(stdout='/dev/null', stderr='/dev/null', stdin='infile.txt')])
CommandStructure(command=['grep'], arguments=['-rin', '--col', 'i < 24\\|b>19'], redirects=[Redirect(stdout='/dev/null', stderr=None, stdin=None)])
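One possible direction for handling the last grep example with only the standard `re` module is to let the command group consume quoted strings whole, so a `>` inside quotes can never begin a redirect. The sketch below is an assumption-laden illustration, not a complete shell grammar: the names `QUOTED_OR_PLAIN`, `quote_aware_regex`, and `find_redirects` are hypothetical, and backslash-escaped quotes (such as `\"` inside double quotes) are not handled.

```python
import re

# Command text in which '...' and "..." are consumed as single units.
# A bare quote character can only be matched as part of a complete quoted
# string, so the lazy repetition can never stop inside one.
QUOTED_OR_PLAIN = r"""(?:'[^']*'|"[^"]*"|[^'"])*?"""

quote_aware_regex = re.compile(
    r"(?P<command>" + QUOTED_OR_PLAIN + r")"
    r"( (?P<redirect>[&\d]?)>+ ?&?(?P<filename>\S+))"
    r"( ?< ?(?P<infile>.*))?",
    re.DOTALL,
)


def find_redirects(text):
    """Collect one groupdict per redirect found in `text`."""
    matches, pos = [], 0
    while True:
        # match() is anchored at `pos`; search()/finditer() could restart
        # scanning from a position inside a quoted string.
        m = quote_aware_regex.match(text, pos)
        if m is None:
            return matches
        matches.append(m.groupdict())
        pos = m.end()


for cmd in (
    r"grep -rin --col 'i < 24\|b > 19' > /dev/null",
    "sed 's/24/25/g' 2>&1 1>/dev/null",
    "sed 's/24/25/g'",
):
    print(cmd, "->", find_redirects(cmd))
```

The anchored loop replaces `finditer` on purpose: every match starts either at the beginning of the string or where the previous redirect ended, both of which lie outside any quoted text. An input with an unclosed quote simply yields no matches, so the existing "no match" fallback still applies.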
