Parsing a string with nested quotes

Posted 2024-06-26 16:40:55


I need to parse a string that looks like this:

"prefix 'field1', '', 'field2', 'field3', 'select ... where (column1 = '2017') and ((('literal1', 'literal2', 'literal3', 'literal4', 'literal5', 'literal6', 'literal7') OVERLAPS column2 Or ('literal8') OVERLAPS column3 And (column4 > 0.0 Or column6 > 0.0)) And column7 IN_COMMUNITY [int1] And column5 = 'literal9')  LIMIT 0 ', 'field5', 'field6', 'field7', 'field8', 'field9', '', 'field10'"

I would like to get a list of the individual fields, with the entire pseudo-SQL statement kept as a single element.

I tried regular expressions, but they break on the substring that contains the pseudo-SQL statement.

How can I get that list?


3 answers

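If you know what the SQL string is supposed to look like, here is a fairly simple approach.

We match the SQL string first and split the rest of the input into a start string (everything before the match) and an end string (everything after it).

Then we match the much simpler field pattern: collect the fields from the start string, append the SQL match, and finally append the fields from the end string.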

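import re

# mystring is the input string from the question
sqlmatch = 'select .* LIMIT 0'  # SQL statement: starts with "select", ends with "LIMIT 0"
fieldmatch = r"'(|\w+)'"        # an empty field or a plain word between single quotes
match = re.search(sqlmatch, mystring)
startstring = mystring[:match.start()]
sql = match.group(0)
endstring = mystring[match.end():]

result = []
result.extend(re.findall(fieldmatch, startstring))  # fields before the SQL
result.append(sql)                                  # the SQL statement as one element
result.extend(re.findall(fieldmatch, endstring))    # fields after the SQL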

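The result is a list holding the four fields before the SQL statement, then the SQL statement itself as a single element, then the seven fields after it. The leading word prefix is not included, because it is not quoted.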
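As a rough sketch, the same steps can be wrapped in a function and tried on a shortened input. The function name split_fields, the fallback for a missing SQL match, and the sample string are assumptions made for illustration; they are not part of the original answer.

```python
import re


def split_fields(text, sql_pattern=r"select .* LIMIT 0"):
    """Split quoted fields, keeping the SQL statement as one element."""
    field_pattern = r"'(|\w+)'"  # empty field or a plain word in single quotes
    match = re.search(sql_pattern, text)
    if match is None:
        # Assumption: with no SQL statement, just collect the simple fields.
        return re.findall(field_pattern, text)
    before = re.findall(field_pattern, text[:match.start()])
    after = re.findall(field_pattern, text[match.end():])
    return before + [match.group(0)] + after


# Shortened, made-up input with the same shape as the string in the question.
sample = "prefix 'a', '', 'select x where (c = '1') LIMIT 0 ', 'b', ''"
print(split_fields(sample))
# ['a', '', "select x where (c = '1') LIMIT 0", 'b', '']
```

Note that the SQL element ends at "LIMIT 0", so the trailing space before the closing quote is dropped, and that the greedy `.*` assumes only one "LIMIT 0" belongs to the statement.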

It was pointed out that your string is not well-formed, so I used this one:

mystr = "prefix 'field1', '', 'field2', 'field3', 'select ... where (column1 = '2017') and ((('literal1', 'literal2', 'literal3', 'literal4', 'literal5', 'literal6', 'literal7') OVERLAPS column2 Or ('literal8') OVERLAPS column3 And (column4 > 0.0 Or column6 > 0.0)) And column7 IN_COMMUNITY [int1] And column5 = 'literal9')  LIMIT 0 ', 'field5', 'field6', 'field7', 'field8', 'field9', '', 'field10'"

found = [a.replace("'", '').replace(',', '') for a in mystr.split(' ') if "'" in a]

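This returns every space-separated token that contains a single quote, with the quotes and commas stripped out.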
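To see how this split-based one-liner treats the quoted literals inside the SQL statement, here is a quick check on a shortened input. The sample string is made up for illustration and is not from the original answer.

```python
# Shortened, made-up input with the same shape as the string in the question.
sample = "prefix 'a', '', 'select x where (c = '1') LIMIT 0 ', 'b'"

# Same expression as in the answer: keep tokens containing a quote,
# then strip quotes and commas from them.
found = [a.replace("'", '').replace(',', '')
         for a in sample.split(' ') if "'" in a]
print(found)
# ['a', '', 'select', '1)', '', 'b']
```

The SQL statement is not kept whole here: only its tokens that happen to contain a quote survive, so this fits when the quoted tokens themselves are what you are after.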

Since the number of fields is fixed and the non-SQL fields contain no embedded quotes, there is a simple three-line solution:

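# string is the input string from the question
prefix, other = string.partition(' ')[::2]   # split off the unquoted prefix
fields = other.strip("'").split("', '")      # naive split; this also cuts the SQL into pieces
# 4 fields come before the SQL and 7 after it; re-join the SQL pieces in between
fields[4:-7] = ["', '".join(fields[4:-7])]

print(fields)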

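The output is a list with the four leading fields, the pieces of the SQL statement merged into a single element at index 4, and the seven trailing fields.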
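A minimal sketch of the same idea as a reusable function, assuming the field counts before and after the SQL statement are known in advance. The name split_fixed_fields, its parameters, and the sample string are hypothetical. The pieces are re-joined with the separator "', '" so the SQL text comes back unchanged.

```python
def split_fixed_fields(text, n_before=4, n_after=7):
    """Return (prefix, fields) with the SQL statement as a single field.

    Assumes exactly n_before simple fields precede the SQL statement and
    n_after follow it, and that simple fields contain no quotes.
    """
    prefix, _, rest = text.partition(" ")
    parts = rest.strip("'").split("', '")
    end = len(parts) - n_after
    # Everything between the fixed leading and trailing fields is SQL that
    # the naive split cut apart; glue it back with the same separator.
    parts[n_before:end] = ["', '".join(parts[n_before:end])]
    return prefix, parts


# Shortened, made-up input with the same field counts as the question.
sample = ("prefix 'f1', '', 'f2', 'f3', "
          "'select x where a IN ('p', 'q') LIMIT 0 ', "
          "'f5', 'f6', 'f7', 'f8', 'f9', '', 'f10'")
prefix, fields = split_fixed_fields(sample)
print(prefix)     # prefix
print(fields[4])  # select x where a IN ('p', 'q') LIMIT 0
```

Unlike the regex approach, this does not depend on what the SQL looks like, only on the field counts around it.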
