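<p>I was working on a similar problem and found a simpler solution that seems to work well. It returns a dictionary that maps each table name to the tokens in the query that start with that name (for example, qualified column references such as <code>table.column</code>).</p>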
<pre><code>import re


def tables_in_query(sql_str):
    # remove the /* */ comments
    q = re.sub(r"/\*[^*]*\*+(?:[^*/][^*]*\*+)*/", "", sql_str)
    # remove whole line -- and # comments
    lines = [line for line in q.splitlines() if not re.match(r"^\s*(--|#)", line)]
    # remove trailing -- and # comments
    q = " ".join([re.split("--|#", line)[0] for line in lines])
    # split on blanks, parens and semicolons
    tokens = re.split(r"[\s)(;]+", q)
    # scan the tokens. if we see a FROM or JOIN, we set the get_next
    # flag, and grab the next one (unless it's SELECT).
    tables = set()
    get_next = False
    for tok in tokens:
        if get_next:
            if tok.lower() not in ["", "select"]:
                tables.add(tok)
            get_next = False
        get_next = tok.lower() in ["from", "join"]
    # for each table, collect the distinct tokens that start with the
    # table name but are not the bare table name itself
    dictTables = dict()
    for table in tables:
        fields = []
        for token in tokens:
            if token.startswith(table):
                if token != table:
                    fields.append(token)
        unique_fields = list(set(fields))
        if unique_fields:
            dictTables[table] = unique_fields
    return dictTables
</code></pre>
<p>The code is adapted from <a href="https://grisha.org/blog/2016/11/14/table-names-from-sql/" rel="nofollow noreferrer">https://grisha.org/blog/2016/11/14/table-names-from-sql/</a></p>
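<p>To see the idea end to end, here is a minimal, self-contained sketch of a slightly stricter variant. The function name <code>qualified_columns</code> and the sample query are hypothetical, not part of the original code. It assumes that commas should also act as token separators and that only tokens of the form <code>table.column</code> count as fields. Like the approach above, it is a regex heuristic rather than a real SQL parser, so comment markers inside string literals, aliases, and schema-qualified names are not handled.</p>

```python
import re


def qualified_columns(sql_str):
    """Hypothetical compact variant: map each table name to the sorted
    list of `table.column` tokens that reference it."""
    # Strip /* */ block comments, then -- and # comments up to end of line.
    q = re.sub(r"/\*.*?\*/", " ", sql_str, flags=re.DOTALL)
    q = re.sub(r"(--|#).*", " ", q)
    # Split on whitespace, parentheses, semicolons and commas.
    tokens = [t for t in re.split(r"[\s)(;,]+", q) if t]

    # A token directly after FROM or JOIN is taken as a table name,
    # unless it is SELECT (the start of a subquery).
    tables = set()
    for prev, tok in zip(tokens, tokens[1:]):
        if prev.lower() in ("from", "join") and tok.lower() != "select":
            tables.add(tok)

    # Keep only tokens of the form "<table>.<something>".
    result = {}
    for table in tables:
        fields = sorted({t for t in tokens if t.startswith(table + ".")})
        if fields:
            result[table] = fields
    return result


sample = """
SELECT orders.id, customers.name  -- trailing comment
FROM orders
JOIN customers ON orders.customer_id = customers.id
/* block
   comment */
WHERE orders.total > 100;
"""
print(qualified_columns(sample))
```

<p>Matching on <code>table + "."</code> instead of the bare table name avoids picking up unrelated tokens that merely share a prefix (for example <code>orders_archive</code> when the table is <code>orders</code>), and splitting on commas keeps trailing commas out of the field names.</p>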