ANTLR4: Parsing email headers, lookahead not working, Python-targ

parser grammar MyParser;

options { tokenVocab=MyLexer; }

received : Received fromToken byToken withToken idToken SemiColon date EOF ;

fromToken : FromText ;

byToken : ByText ;

withToken : WithText ;

idToken : IdText ;

date : DateContents+ ;

token recognition error at: 'from server.mymailhost.com (mail.mymailhost.com [126.43.75.123]) by pilot01.cl.msu.edu (8.10.2/8.10.2) with ESMTP id NAA23597;Fri, 12 Jul 2002 16:11:20 -0400 (EDT)'
mismatched input '<EOF>' expecting FromText

lexer grammar MyLexer;

Received : 'Received: ' ;
SemiColon : ';' ;

FromText : 'from ' .+?
      { (self.input.LA(1) == 'b') and (self.input.LA(2) == 'y') }?
      ;

ByText : 'by ' .+?
      { (self.input.LA(1) == 'w') and (self.input.LA(2) == 'i') and (self.input.LA(3) == 't') and (self.input.LA(4) == 'h') }?
      ;

WithText : 'with ' .+?
      { (self.input.LA(1) == 'i') and (self.input.LA(2) == 'd') }?
      ;

IdText : 'id ' .+?
      { (self.input.LA(1) == ';') }?
      ;

DateContents : ('Mon' | 'Tue' | 'Wed' | 'Thu' | 'Fri' | 'Sat' | 'Sun') (Letter | Number | Special)+ ;

fragment Letter : 'A'..'Z' | 'a'..'z' ;

fragment Number : '0'..'9' ;

fragment Special : ' ' | '_' | '-' | '.' | ',' | '~' | ':' | '+' | '$' | '=' | '(' | ')' | '[' | ']' | '/' ;

Whitespace : [\t\r\n]+ -> skip ;

1 answer

User

#1 · Posted 2024-09-30 10:41:42

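After a lot of effort, I found the answer. Two things change compared with the lexer in the question: the character stream is accessed as self._input rather than self.input, and each LA() result is compared against ord('x') rather than the string 'x', because LA() returns an integer code point and an integer never compares equal to a string in Python. This is the working lexer: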

lexer grammar MyLexer;

Received : 'Received: ' ;
SemiColon : ';' ;

FromText : 'from ' .+? 
      {(self._input.LA(1) == ord('b')) and (self._input.LA(2) == ord('y'))}?
      ;

ByText : 'by '.+? 
      {(self._input.LA(1) == ord('w')) and (self._input.LA(2) == ord('i')) and (self._input.LA(3) == ord('t')) and (self._input.LA(4) == ord('h'))}? 
      ;

WithText : 'with ' .+? 
      {(self._input.LA(1) == ord('i')) and (self._input.LA(2) == ord('d'))}? 
      ;

IdText : 'id ' .+? 
      {(self._input.LA(1) == ord(';'))}? 
      ;

DateContents : ('Mon' | 'Tue' | 'Wed' | 'Thu' | 'Fri' | 'Sat' | 'Sun') (Letter | Number | Special)+ ;

fragment Letter :  'A'..'Z' | 'a'..'z' ;

fragment Number : '0'..'9' ;

fragment Special : ' ' | '_' | '-' | '.' | ',' | '~' | ':' | '+' | '$' | '=' | '(' | ')' | '[' | ']' | '/' ;

Whitespace : [\t\r\n]+ -> skip ;
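
The ord() comparison used in the predicates above can be tried out in plain Python, without the ANTLR runtime. The sketch below is only an illustration: TinyCharStream and looking_at are hypothetical names, and the class is a simplified stand-in that assumes, as the working lexer implies, that LA() returns integer code points with a 1-based offset.

```python
class TinyCharStream:
    """Hypothetical, simplified stand-in for a character stream (not the ANTLR class)."""

    EOF = -1

    def __init__(self, text, index=0):
        # Store code points, so LA() hands back integers rather than strings.
        self.data = [ord(ch) for ch in text]
        self.index = index

    def LA(self, offset):
        """Return the code point `offset` characters ahead (1-based), or EOF."""
        pos = self.index + offset - 1
        if pos < 0 or pos >= len(self.data):
            return self.EOF
        return self.data[pos]


def looking_at(stream, word):
    """True if the next len(word) characters of the stream spell `word`."""
    return all(stream.LA(i + 1) == ord(ch) for i, ch in enumerate(word))


stream = TinyCharStream("by pilot01.cl.msu.edu")
print(stream.LA(1) == "b")         # False: int compared with str
print(stream.LA(1) == ord("b"))    # True: int compared with int
print(looking_at(stream, "by"))    # True
print(looking_at(stream, "with"))  # False
```

A helper in the style of looking_at could probably shorten the long chained predicates (for example the four-character check for 'with') if it were added to the lexer as a member function, though that variant is untested here.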
