如何将这些嵌套的字符串转换成前缀表示法中的元组最为有效?

2024-09-27 00:21:07 发布

您现在位置:Python中文网/ 问答频道 /正文

我想要一个通用的解决方案来转换字符串的前缀符号成元组。下面的三个例子说明了我希望实现的目标。每个字符串都是列表中的一个元素(只有第二个示例有多个元素)。你知道吗

['neg(or(p,q))']
['imp(p,q)', 'imp(neg(r),neg(q))']
['and(and(p,q),r)']

变成

[('neg',('or','p','q'))]
[('imp','p','q'), ('imp',('neg','r'),('neg','q'))]
[('and',('and','p','q'),'r')]

一般来说,我想要的格式是('connective'、'parameter'、'parameter'),其中connective可以是'and'、'or'、'iff'、'imp'、'neg'。每个参数后面都有两个参数,除了“neg”只需要一个参数。参数可以是上面示例中所示的其他元组,它演示了嵌套公式。你知道吗

另一种看待这一点的方法是首先考虑激发问题的公式。你知道吗

例1 ((p∧q)∨(∨p∧q))最终成为

[('or',('and','p','q'),('and',('neg','p'),('neg','q')))]

例2 (p→q),(r→q)最终变成

[('imp','p','q'), ('imp',('neg','r'),('neg','q'))]

我尝试过:

import re

def tuplify(raw_sequent):

    first_split = re.split('\(', raw_sequent, 1)
    connective = first_split[0]
    second_split = re.split('\,', first_split[1])
    right_arg = second_split[-1]
    second_split = second_split[:-1]
    left_arg = ','.join(second_split)
    tup = (connective, left_arg, right_arg[:-1])

    return tup

但是,这并不能很好地概括,只适用于输入

'and(and(p,q),or(r))'

我想这个解决方案可能需要regex和递归的结合。你知道吗


Tags: orand字符串re元素参数arg解决方案
3条回答

这是一个使用Regex和字符串操作的解决方案,希望注释能解释发生了什么:

import re


def convert_to_tuple(text):
    """ Take an expression and convert it to tuple
      'neg(or(p,q))'  -->  ('neg', ('or', 'p', 'q'))
    """
    # used to extract not nested expression
    pattern = re.compile('(\w+)\(([^\(]*?)\)')
    # extract the index of the expression
    index_pattern = re.compile('#(\d+)#')
    # contains all expression extracted from the text
    expressions = []
    # you only need to extract most global expression only
    global_expressions = []


    def fix_expression(expression):
        """  a recursive solution to rebuild nested expression. """
        if isinstance(expression, str):
            # if the expression is like #index# return the correct expression else return this str expression
            m = index_pattern.search(expression)
            if m:
                return tuple(fix_expression(expressions[int(m.group(1))]))
            return expression
        # if the expression is a tuple extract all fixed nested expression in a tuple
        return tuple(fix_expression(subexp) for subexp in expression)


    def extract_expression(code):
        """ keep extracting nested expressions list,  """

        def add_exp(match):
            """ handle the match return by sub to save the expression and return its index."""
            expressions.append(None)
            index = len(expressions) - 1
            name, args = match.groups()

            if ',' in args:
                # if what is between parentheses contains ',' split is
                args = tuple(args.split(','))
            else:
                # args should always be a tuple to support unpacking operation
                args = (args,)

            # expression transformed to a tuple
            expressions[index] = (name, *args)
            return '#{}#'.format(index)

        while pattern.search(code):
            code = re.sub(pattern, add_exp, code)

        # for debuging string after removing all expression
        # print(code)


    # this extract all nested expression in the  text
    extract_expression(text)

    # Global expression there index is not used in any other expression
    # the last expression is for sure a global one because.
    global_expressions.append(expressions[-1])
    for i, exp in enumerate(expressions[:-1]):
        for other_exp in expressions[i + 1:]:
            if any('#{}#'.format(i) in sub_exp for sub_exp in other_exp):
                break
        else:
            # this expression is not used in any expression it's a global one
            global_expressions.append(exp)

    # for debugging
    # print(expressions)
    # print(global_expressions)

    return fix_expression(global_expressions[0])


print([convert_to_tuple(x) for x in ['neg(or(p,q))']])  # [('neg', ('or', 'p', 'q'))]
print([convert_to_tuple(x) for x in ['imp(p,q)', 'imp(neg(r),neg(q))']])  # [('imp', 'p', 'q'), ('imp', ('neg', 'r'), ('neg', 'q'))]
print([convert_to_tuple(x) for x in ['and(and(p,q),r)']])  # [('and', ('and', 'p', 'q'), 'r')]

当我们使用Regex时,嵌套表达式中的技巧是我使用re.sub首先将一个非嵌套表达式提取到一个列表中,并用它的索引替换它。继续这样做,直到没有表达式可提取。然后处理提取的表达式列表以执行所需的工作。你知道吗

查看我对类似问题的回答,以便对这一技巧有一个清晰的认识:

Answer for Extract all variables from a string of Python code (regex or AST)

使用生成器的较短递归解决方案:

import re
def to_tuple(d, stop=None):
   a = next(d, None)
   if a and a != stop:
      b = next(d, None)
      if b is None or b == stop:
        yield a
      else:
         if b == '(':
            yield (a, *to_tuple(d, stop=')'))
         else:
            yield from filter(None, [a, b])
         yield from to_tuple(d, stop=stop)

def format_input(_d):
   return iter(re.findall('\(|\)|\w+', _d))

n = [['neg(or(p,q))'], ['imp(p,q)', 'imp(neg(r),neg(q))'], ['and(and(p,q),r)']]
r = [[tuple(to_tuple(format_input(i)))[0] for i in k] for k in n]

输出:

[[('neg', ('or', 'p', 'q'))], 
[('imp', 'p', 'q'), ('imp', ('neg', 'r'), ('neg', 'q'))], 
[('and', ('and', 'p', 'q'), 'r')]]

以下代码

import re

OPERATORS_DIC = {'and': 2, 'or': 2, 'iff': 2, 'imp': 2, 'neg': 1}
EXPRESSIONS = [ 'neg(or(p,q))'
                , 'imp(p,q)'
                , 'imp(neg(r),neg(q))'
                , 'and(and(p,q),r)' ]

def digest(expression):
    """expression is an expression that should be recursively digested
    """
    for op in OPERATORS_DIC:
        match = re.search( op + '\((.*)\)', expression )
        if not match:
            continue
        inside = match.groups()[0]
        # examples: inside is 'p,q' or 'p,neg(q)', or 'p,neg(and(q,r))'
        if OPERATORS_DIC[op] == 1:
            return (op, digest(inside))
        # else we try to get the two arguments by counting parentheses
        for k in range(len(inside)):
            if inside[k] != ',':
                continue
            before, after = inside[:k], inside[k+1:]
            if ( before.count('(') == before.count(')') and
                 after .count('(') == after .count(')') ):
                return (op, digest(before), digest(after))
    return expression


for expr in EXPRESSIONS:
    print('{} -> {}'.format(expr, digest(expr)))

提供:

neg(or(p,q)) -> ('neg', ('or', 'p', 'q'))
imp(p,q) -> ('imp', 'p', 'q')
imp(neg(r),neg(q)) -> ('imp', ('neg', 'r'), ('neg', 'q'))
and(and(p,q),r) -> ('and', ('and', 'p', 'q'), 'r')

因此EXPRESSIONS的输入所需的输出是

[ digest(expr) for expr in EXPRESSIONS ]

相关问题 更多 >

    热门问题