JavaScript parser in Python

Posted 2024-05-18 09:09:49


There is a JavaScript parser in at least C and Java (Mozilla), in JavaScript (Mozilla again), and in Ruby. Is there one currently available for Python?

I don't need a JavaScript interpreter as such, just a parser that conforms to the ECMA-262 standard.

A quick Google search turned up no immediate answers, so I'm asking the SO community.


3 answers

As pib mentioned, pynarcissus is a JavaScript tokenizer written in Python. It seems to have some rough edges, but so far it has been working for what I want to accomplish.

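Update: I took another crack at pynarcissus, and below is a working direction for using it in a visitor-pattern-like system. Unfortunately, my current client paid for the next iteration of my experiments and decided not to release it as open source. A cleaner version of the code below is available in a gist here.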

from pynarcissus import jsparser
from collections import defaultdict

class Visitor(object):

    # Attributes that may hold child nodes not reachable by iterating the node
    CHILD_ATTRS = ['thenPart', 'elsePart', 'expression', 'body', 'initializer']

    def __init__(self, filepath):
        self.filepath = filepath
        # Functions keyed by line number, each with a set of names
        self.functions = defaultdict(set)
        with open(filepath) as myFile:
            self.source = myFile.read()

        self.root = jsparser.parse(self.source, self.filepath)
        self.visit(self.root)

    def look4Childen(self, node):
        for attr in self.CHILD_ATTRS:
            child = getattr(node, attr, None)
            if child:
                self.visit(child)

    def visit_NOOP(self, node):
        pass

    def visit_FUNCTION(self, node):
        # Named functions
        if node.type == "FUNCTION" and getattr(node, "name", None):
            print(str(node.lineno) + " | function " + node.name + " | " + self.source[node.start:node.end])

    def visit_IDENTIFIER(self, node):
        # Anonymous functions declared with var name = function() {};
        try:
            if node.type == "IDENTIFIER" and hasattr(node, "initializer") and node.initializer.type == "FUNCTION":
                print(str(node.lineno) + " | function " + node.name + " | " + self.source[node.start:node.initializer.end])
        except Exception:
            pass

    def visit_PROPERTY_INIT(self, node):
        # Anonymous functions declared as a property of an object
        try:
            if node.type == "PROPERTY_INIT" and node[1].type == "FUNCTION":
                print(str(node.lineno) + " | function " + node[0].value + " | " + self.source[node.start:node[1].end])
        except Exception:
            pass

    def visit(self, root):
        # Dispatch to visit_<TYPE> if it exists, otherwise to visit_NOOP
        call = lambda n: getattr(self, "visit_%s" % n.type, self.visit_NOOP)(n)
        call(root)
        self.look4Childen(root)
        for node in root:
            self.visit(node)

filepath = r"C:\Users\dward\Dropbox\juggernaut2\juggernaut\parser\test\data\jasmine.js"
outerspace = Visitor(filepath)
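
The dispatch trick used in `visit` above (look up a `visit_<TYPE>` method by name and fall back to a no-op) can be tried without pynarcissus installed. What follows is only a minimal sketch under stated assumptions: `Node` is a hypothetical stand-in for a parser node, not the real pynarcissus node class, and `FunctionCollector` is an illustrative name that does not appear in the answer.

```python
class Node(list):
    """Hypothetical stand-in for a parser node: a list of children plus a type tag."""

    def __init__(self, type_, name=None, children=()):
        super().__init__(children)
        self.type = type_
        self.name = name


class FunctionCollector:
    """Collects the names of FUNCTION nodes using getattr-based dispatch."""

    def __init__(self):
        self.names = []

    def visit_NOOP(self, node):
        # Fallback for node types that have no dedicated handler.
        pass

    def visit_FUNCTION(self, node):
        if node.name:
            self.names.append(node.name)

    def visit(self, node):
        # Look up visit_<TYPE>; fall back to visit_NOOP when it is missing.
        handler = getattr(self, "visit_%s" % node.type, self.visit_NOOP)
        handler(node)
        for child in node:
            self.visit(child)


tree = Node("SCRIPT", children=[
    Node("FUNCTION", name="outer", children=[Node("FUNCTION", name="inner")]),
    Node("VAR"),
])
collector = FunctionCollector()
collector.visit(tree)
print(collector.names)
```

Adding support for another node type then only means adding one more `visit_<TYPE>` method; the traversal itself stays unchanged.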

Nowadays, there is at least one better tool, called slimit:

SlimIt is a JavaScript minifier written in Python. It compiles JavaScript into more compact code so that it downloads and runs faster.

SlimIt also provides a library that includes a JavaScript parser, lexer, pretty printer and a tree visitor.

Demo:

Imagine we have the following JavaScript code:

$.ajax({
    type: "POST",
    url: 'http://www.example.com',
    data: {
        email: 'abc@g.com',
        phone: '9999999999',
        name: 'XYZ'
    }
});

Now we need to get the email, phone and name values from the data object.

The idea here is to instantiate a slimit parser, visit all nodes, filter out the assignments and put them into a dictionary:

from slimit import ast
from slimit.parser import Parser
from slimit.visitors import nodevisitor


data = """
$.ajax({
    type: "POST",
    url: 'http://www.example.com',
    data: {
        email: 'abc@g.com',
        phone: '9999999999',
        name: 'XYZ'
    }
});
"""

parser = Parser()
tree = parser.parse(data)
fields = {getattr(node.left, 'value', ''): getattr(node.right, 'value', '')
          for node in nodevisitor.visit(tree)
          if isinstance(node, ast.Assign)}

print(fields)

It prints:

{'name': "'XYZ'", 
 'url': "'http://www.example.com'", 
 'type': '"POST"', 
 'phone': "'9999999999'", 
 'data': '', 
 'email': "'abc@g.com'"}
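
Note that the values in the output above are raw JavaScript literals and still carry their quotes. A minimal sketch for stripping them is shown here; `unquote` is a hypothetical helper, not part of slimit, and it assumes simple string literals without escape sequences.

```python
def unquote(js_literal):
    """Strip one pair of matching quotes from a raw JavaScript string literal."""
    if (len(js_literal) >= 2
            and js_literal[0] == js_literal[-1]
            and js_literal[0] in "'\""):
        return js_literal[1:-1]
    # Not a quoted literal (for example the empty 'data' entry): return unchanged.
    return js_literal


# A sample of the raw values from the output above.
fields = {
    "name": "'XYZ'",
    "type": '"POST"',
    "phone": "'9999999999'",
    "data": "",
    "email": "'abc@g.com'",
}
wanted = {key: unquote(fields[key]) for key in ("email", "phone", "name")}
print(wanted)
```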

ANTLR, ANother Tool for Language Recognition, is a language tool that provides a framework for constructing recognizers, interpreters, compilers, and translators from grammatical descriptions containing actions in a variety of target languages.

The ANTLR site provides many grammars, including one for JavaScript.

As it happens, there is a Python API available, so you can call the lexer (recognizer) generated from the grammar directly from Python (good luck).
