﻿<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/"><channel><title>BlogJava-VIRGIN FOREST OF JAVA-Article Category-ANTLR</title><link>http://www.blogjava.net/RR00/category/14927.html</link><description>Don't just bury your head in hard work; study, study, and study again...
&lt;br&gt;
powered by &lt;font color='orange'&gt;R.Zeus&lt;/font&gt;</description><language>zh-cn</language><lastBuildDate>Tue, 27 Feb 2007 10:37:51 GMT</lastBuildDate><pubDate>Tue, 27 Feb 2007 10:37:51 GMT</pubDate><ttl>60</ttl><item><title>hql-sql.g</title><link>http://www.blogjava.net/RR00/articles/69366.html</link><dc:creator>R.Zeus</dc:creator><author>R.Zeus</author><pubDate>Wed, 13 Sep 2006 06:03:00 GMT</pubDate><guid>http://www.blogjava.net/RR00/articles/69366.html</guid><wfw:comment>http://www.blogjava.net/RR00/comments/69366.html</wfw:comment><comments>http://www.blogjava.net/RR00/articles/69366.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/RR00/comments/commentRss/69366.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/RR00/services/trackbacks/69366.html</trackback:ping><description><![CDATA[p:identifier {<br />     ....<br />#propertyRef = lookupProperty(#propertyRef,false,true); // does the same as <font color="#ff1493">resolve(#p) <font color="#000000">below, for the cases where it has not been resolved.</font><br /></font>     ....<br />    resolve(#p); // replaces the class alias with the table name and primary key (e.g., <font color="#ff1493">a</font> in "select a.name from Aclass a" becomes <font color="#ff1493">aclassTable.id</font>)<br />   #propertyRef = #p;<br />  }<br /><br />---------------------------------------------------------------------------------------------------------------------------<br />selectExpr<br /> : p:propertyRef     { resolveSelectExpression(#p);} // changes the AST node text into readable text.<br /><font color="#000000"><font size="2">    // For example, given </font>DOT LHS RHS, before resolveSelectExpression the<br />    // AST root text is DOT (.); after resolveSelectExpression, the AST root<br />
    // text may be the table alias plus the column name.<br /></font>.....<br /><br />;<br /><br />-------------------------------------------------------------------------------------------------------------------------<img src ="http://www.blogjava.net/RR00/aggbug/69366.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/RR00/" target="_blank">R.Zeus</a> 2006-09-13 14:03 <a href="http://www.blogjava.net/RR00/articles/69366.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>hql.g&amp;&amp;hql-sql.g</title><link>http://www.blogjava.net/RR00/articles/68955.html</link><dc:creator>R.Zeus</dc:creator><author>R.Zeus</author><pubDate>Mon, 11 Sep 2006 06:39:00 GMT</pubDate><guid>http://www.blogjava.net/RR00/articles/68955.html</guid><wfw:comment>http://www.blogjava.net/RR00/comments/68955.html</wfw:comment><comments>http://www.blogjava.net/RR00/articles/68955.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/RR00/comments/commentRss/68955.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/RR00/services/trackbacks/68955.html</trackback:ping><description><![CDATA[
		<font color="#ff1493">hql.g</font> accepts every valid HQL query, but it also accepts some invalid ones;<br />         for example, "select (a,b)" conforms to hql.g but is not valid SQL.<br /><font color="#ff1493">hql-sql.g <br /></font>        converts valid HQL to SQL and throws an exception when it encounters invalid input.<br /><br />It seems that hql-sql.g transforms the AST and sql-gen.g turns the AST into a string. I will look into that next.<img src ="http://www.blogjava.net/RR00/aggbug/68955.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/RR00/" target="_blank">R.Zeus</a> 2006-09-11 14:39 <a href="http://www.blogjava.net/RR00/articles/68955.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>An Introduction To ANTLR</title><link>http://www.blogjava.net/RR00/articles/68916.html</link><dc:creator>R.Zeus</dc:creator><author>R.Zeus</author><pubDate>Mon, 11 Sep 2006 03:02:00 GMT</pubDate><guid>http://www.blogjava.net/RR00/articles/68916.html</guid><wfw:comment>http://www.blogjava.net/RR00/comments/68916.html</wfw:comment><comments>http://www.blogjava.net/RR00/articles/68916.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/RR00/comments/commentRss/68916.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/RR00/services/trackbacks/68916.html</trackback:ping><description><![CDATA[
		<h1>An Introduction To ANTLR</h1>
		<p>
				<a href="http://www.cs.usfca.edu/~parrt">
						<b>Terence Parr</b>
				</a>
		</p>
		<h2>Introduction</h2>
		<p>During the 1980s I built many many recognizers and translators by hand and finally got disgusted enough to try to automate the process; whence my motto: 
</p>
		<blockquote>"<i>Why program by hand in five days what you can spend five years of your life automating.</i>"</blockquote>
		<p>The benefit of building so many projects by hand is that you can see the commonality and what can reasonably be expected to be formalized and automated. I didn't understand <tt>yacc</tt> too well back then and wanted something that reproduced what I built by hand anyway. ANTLR is the result (originally called PCCTS actually). I've been working on it for well over a decade now. [See <a href="http://www.antlr.org/history.html"><b>a quick history</b></a> for more details]. 
</p>
		<p>
				<em>ANTLR</em>, ANother Tool for Language Recognition, is a tool that accepts grammatical language descriptions and generates programs that recognize sentences in those languages. As part of a translator, you may augment your grammars with simple operators and actions to tell ANTLR how to build ASTs and how to generate output. ANTLR knows how to generate recognizers in Java, C++, C#, and soon Python. 
</p>
		<p>ANTLR knows how to build recognizers that apply grammatical structure to three different kinds of input: (i) character streams, (ii) token streams, and (iii) two-dimensional tree structures. Naturally these correspond to lexers, parsers, and tree walkers. The syntax for specifying these grammars, the <em>meta-language</em>, is nearly identical in all cases. 
</p>
		<p>Once you are comfortable with ANTLR or a similar tool, you will start to see programming in a new light. Many tasks cry out for language solutions well outside the stereotypical programming language genre. For example, these course notes are written in TML, Terence's Markup Language. I hate typing HTML and so I built a trivial translator using ANTLR to convert text (with a few extra goodies and conventions) into HTML or PDF or whatever I bother to write a generator for. 
</p>
		<p>Finally, let me point out that ANTLR is just a tool--that's all. It helps you build software by automating the well-understood tedious components, but does not attempt to let you specify an entire compiler, for example, in a single description. Tools that claim this sort of thing are great for writing amazing "one liners" for publishing journal papers, but fail miserably on real projects. 
</p>
		<p>There are, as of early 2003, almost 5,000 ANTLR downloads a month. ANTLR is completely in the public domain without even a copyright and comes with complete source code. 
</p>
		<p>These notes assume you are familiar with basic recognition and translation concepts. For now, you need to get familiar with ANTLR's metalanguage and what it generates. Later, we will focus on building complicated translators. 
</p>
		<h2>A Gentle Introduction to ANTLR Syntax</h2>
		<p>Getting to know ANTLR is best done via example. A simple calculator is often used to get started and with good reason: it's easy to understand and simple. There are a number of similar examples and tutorials for ANTLR, but I will describe a calculator here in my own words. First we will build something that directly evaluates simple expressions. Then we will generate trees and evaluate the trees to get the same answer. 
</p>
		<p>While you know that eventually you have to break up an input stream of characters into tokens, thinking about the grammatical structure of an expression is a good place to start. 
</p>
		<h3>Syntax-directed execution</h3>
		<h4>Recognition</h4>
		<p>Let's accept arithmetic expressions with operators plus, minus, and multiply such as <tt>3+4*5-1</tt> or expressions with parentheses such as <tt>(3+4)*5</tt> to enforce an evaluation order. 
</p>
		<p>All ANTLR grammars are subclasses of <tt>Lexer</tt>, <tt>Parser</tt>, or <tt>TreeParser</tt> and, since you should start thinking about this at the syntactic level, you will build a <tt>Parser</tt> subclass. After the class declaration, you will specify the rules in EBNF notation: 
</p>
		<div class="code">
				<pre>class ExprParser extends Parser;

expr:   mexpr ((PLUS|MINUS) mexpr)*
    ;      

mexpr      
    :   atom (STAR atom)*
    ;    

atom:   INT 
    |   LPAREN expr RPAREN 
    ;
</pre>
		</div>
		<p>The lexer follows a similar pattern and only needs to define some operators and whitespace. Putting the lexer into the same file, say <tt>expr.g</tt>, is the easiest thing to do: 
</p>
		<div class="code">
				<pre> 
class ExprLexer extends Lexer;

options {
    k=2; // needed for newline junk
    charVocabulary='\u0000'..'\u007F'; // allow ascii
}

LPAREN: '(' ;
RPAREN: ')' ;
PLUS  : '+' ;
MINUS : '-' ;
STAR  : '*' ;
INT   : ('0'..'9')+ ;
WS    : ( ' '
        | '\r' '\n'
        | '\n'
        | '\t'
        )
        {$setType(Token.SKIP);}
      ;    
</pre>
		</div>
		<p>To generate a program (in Java) from this grammar, <tt>expr.g</tt>, run ANTLR on it: 
</p>
		<div class="code">
				<pre>$ java antlr.Tool expr.g
ANTLR Parser Generator   Version 2.7.2   1989-2003 jGuru.com
$ 
</pre>
		</div>
		<h4>What does ANTLR generate?</h4>
		<p>While not necessary for the completion of this tutorial, you may find it illuminating to see what ANTLR generates inside the recognizer files. ANTLR generates recognizers that mimic what you would build by hand--recursive-descent parsers; <tt>yacc</tt> and friends, on the other hand, generate tables full of integers as they simulate push-down-automata. 
</p>
		<p>ANTLR will generate the following files: 
</p>
		<div class="code">
				<pre>ExprLexer.java
ExprParser.java
ExprParserTokenTypes.java
ExprParserTokenTypes.txt
</pre>
		</div>
		<p>If you take a look inside, for example, <tt>ExprParser.java</tt>, you will see a method for every rule defined in the parser grammar in <tt>expr.g</tt>. For example, the code for rules <tt>mexpr</tt> and <tt>atom</tt> looks very much like this: 
</p>
		<div class="code">
				<pre>public void mexpr() {
  atom();
  while ( LA(1)==STAR ) {
    match(STAR);
    atom();
  }
}

public void atom() {
  switch ( LA(1) ) { // switch on lookahead token type
    case INT :
      match(INT); 
      break;
    case LPAREN :
      match(LPAREN);
      expr();
      match(RPAREN);
      break; 
    default :
      // error
  }
}
</pre>
		</div>
		<p>Notice that rule references are translated to method calls and token references are translated to <tt>match(TOKEN)</tt> calls. The only hard thing about building a parser from a grammar is computing the lookahead information. 
</p>
		<p>The token types class defines all the constant token type numbers your lexer and parser use: 
</p>
		<div class="code">
				<pre>// $ANTLR 2.7.2: "expr.g" -&gt; "ExprParser.java"$

public interface ExprParserTokenTypes {
        int EOF = 1;
        int NULL_TREE_LOOKAHEAD = 3;
        int PLUS = 4;
        int MINUS = 5;
        int STAR = 6;
        int INT = 7;
        int LPAREN = 8;
        int RPAREN = 9;
        int WS = 10;
}
</pre>
		</div>
		<h4>Testing the lexer/parser</h4>
		<p>To actually use the resulting parser, in <tt>ExprParser.java</tt>, use a <tt>main()</tt> such as the following: 
</p>
		<div class="code">
				<pre>import antlr.*;
public class Main {
    public static void main(String[] args) throws Exception {
        ExprLexer lexer = new ExprLexer(System.in);
        ExprParser parser = new ExprParser(lexer);
        parser.expr();
    }
}
</pre>
		</div>
		<div class="code">
				<pre>$ java Main
3+(4*5)
$ 
</pre>
		</div>
		<p>Or for bad input: 
</p>
		<div class="code">
				<pre>$ java Main
3++
line 1:3: unexpected token: +
$ 
</pre>
		</div>
		<p>or 
</p>
		<div class="code">
				<pre>$ java Main
3+(4
line 1:6: expecting RPAREN, found 'null'
$ 
</pre>
		</div>
		<h4>Expression evaluation</h4>
		<p>To actually evaluate the expressions on the fly as the tokens come in, just add actions to the parser: 
</p>
		<div class="code">
				<pre>class ExprParser extends Parser;

expr returns [int value=0]
{int x;}
    :   value=mexpr
        ( PLUS x=mexpr  {value += x;}
        | MINUS x=mexpr {value -= x;} 
        )*
    ;

mexpr returns [int value=0]
{int x;}
    :   value=atom ( STAR x=atom {value *= x;} )*
    ;

atom returns [int value=0]
    :   i:INT {value=Integer.parseInt(i.getText());}
    |   LPAREN value=expr RPAREN
    ;
</pre>
		</div>
		<p>The lexer is still the same, but add a print statement in the main program: 
</p>
		<div class="code">
				<pre>import antlr.*;

public class Main {
        public static void main(String[] args) throws Exception {
                ExprLexer lexer = new ExprLexer(System.in);
                ExprParser parser = new ExprParser(lexer);
                int x = parser.expr();
                System.out.println(x);
        }
}
</pre>
		</div>
		<p>Now, when you run the program you get the resulting computations: 
</p>
		<div class="code">
				<pre>$ java Main
3+4*5
23
$ java Main
(3+4)*5
35
$ 
</pre>
		</div>
		<h4>How ANTLR translates actions</h4>
		<p>Actions are generally placed into the generated parser verbatim at the spot corresponding to the position in the grammar. 
</p>
		<p>Rule return specifications like 
</p>
		<div class="code">
				<pre>mexpr returns [int value=0]
  : ...
  ;
</pre>
		</div>
		<p>are translated to 
</p>
		<div class="code">
				<pre>public int mexpr() {
  int value=0;
  ...
  return value;
}
</pre>
		</div>
		<p>If you were to add an incoming parameter specification, the args are also just copied to the method declaration: 
</p>
		<div class="code">
				<pre>mexpr[int x] returns [int value=0]
  : ... {value = x;}
  ;
</pre>
		</div>
		<p>yields 
</p>
		<div class="code">
				<pre>public int mexpr(int x) {
  int value=0;
  ...
  value = x;
  return value;
}
</pre>
		</div>
		<p>So, the full translation for the <tt>mexpr</tt> and <tt>atom</tt> rules looks like: 
</p>
		<div class="code">
				<pre>public int mexpr() {
  int value=0;
  int x; // local variable def from rule mexpr
  value = atom();
  while ( LA(1)==STAR ) {
    match(STAR);
    x = atom();
    value *= x;
  }
  return value;
}

public int atom() {
  int value=0;
  switch ( LA(1) ) { // switch on lookahead token type
    case INT :
      Token i = LT(1); // make label i point to next lookahead token object
      match(INT); 
      value=Integer.parseInt(i.getText()); // compute int value of token
      break;
    case LPAREN :
      match(LPAREN);
      value = expr(); // return whatever expr() computes
      match(RPAREN);
      break; 
    default :
      // error
  }
  return value;
}
</pre>
		</div>
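		<p>To see the same shape outside of ANTLR, here is a minimal hand-written sketch of a recursive-descent evaluator for this grammar. It is not ANTLR output: it works on characters directly instead of a token stream, and the names <tt>ExprSketch</tt>, <tt>LA()</tt>, <tt>match()</tt>, and <tt>eval()</tt> are assumptions chosen to echo the generated code above. Unlike the tutorial's <tt>main()</tt>, the <tt>eval()</tt> helper also rejects trailing input.</p>

```java
// Hand-written sketch of a recursive-descent evaluator; ANTLR is not required.
// ExprSketch, LA, match and eval are illustrative names, not ANTLR API.
class ExprSketch {
    private final String in;  // input expression text
    private int pos = 0;      // index of the next unread character

    ExprSketch(String in) { this.in = in; }

    // LA(1) analogue: skip blanks, then peek at the next character ('\0' at end).
    private char LA() {
        while (pos < in.length() && Character.isWhitespace(in.charAt(pos))) pos++;
        return pos < in.length() ? in.charAt(pos) : '\0';
    }

    // match(TOKEN) analogue: consume the expected character or fail.
    private void match(char c) {
        if (LA() != c) throw new IllegalStateException("expecting '" + c + "' at " + pos);
        pos++;
    }

    // expr : mexpr ((PLUS|MINUS) mexpr)*
    int expr() {
        int value = mexpr();
        while (LA() == '+' || LA() == '-') {
            char op = LA();
            match(op);
            int x = mexpr();
            if (op == '+') value += x; else value -= x;
        }
        return value;
    }

    // mexpr : atom (STAR atom)*
    int mexpr() {
        int value = atom();
        while (LA() == '*') {
            match('*');
            value *= atom();
        }
        return value;
    }

    // atom : INT | LPAREN expr RPAREN
    int atom() {
        if (LA() == '(') {
            match('(');
            int value = expr();
            match(')');
            return value;
        }
        if (!Character.isDigit(LA())) throw new IllegalStateException("unexpected input at " + pos);
        int start = pos; // LA() has already skipped any blanks
        while (pos < in.length() && Character.isDigit(in.charAt(pos))) pos++;
        return Integer.parseInt(in.substring(start, pos));
    }

    // Parses and evaluates a complete expression.
    static int eval(String text) {
        ExprSketch p = new ExprSketch(text);
        int v = p.expr();
        if (p.LA() != '\0') throw new IllegalStateException("trailing input at " + p.pos);
        return v;
    }

    public static void main(String[] args) {
        System.out.println(eval("3+4*5"));
        System.out.println(eval("(3+4)*5"));
    }
}
```

		<p>Each grammar rule becomes one method and each rule reference becomes one call, which is why operator precedence falls out of the rule nesting rather than from a precedence table.</p>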
		<h3>Evaluation via AST intermediate form</h3>
		<p>So now you've seen basic syntax-directed translation/computation where the grammar/syntax dictates when to do an action. A more powerful strategy is to build an intermediate representation that holds all or most of the input symbols and has encoded, in the structure of the data, the relationship between those tokens. For example, input "3+4" is represented as an abstract syntax tree (AST) via: 
</p>
		<div class="code">
				<pre>  +
 / \
3   4
</pre>
		</div>
		<p>For this kind of tree, you will use a tree walker (generated by ANTLR from a tree grammar) to compute the same values as before, but using a different strategy. 
</p>
		<p>The utility of ASTs becomes clear when you must do multiple walks over the tree to figure out what to compute or to do tree rewrites, morphing the tree towards another language. 
</p>
		<h4>AST construction</h4>
		<p>To get ANTLR to generate a useful AST is pretty simple for many grammars. In our case, turn on the <tt>buildAST</tt> option and then add a few suffix operators to tell ANTLR what tokens should be subtree roots. 
</p>
		<div class="code">
				<pre>class ExprParser extends Parser;

options {
        buildAST=true;
}

expr:   mexpr ((PLUS^|MINUS^) mexpr)*
    ;

mexpr
    :   atom (STAR^ atom)*
    ;

atom:   INT
    |   LPAREN! expr RPAREN!
    ;
</pre>
		</div>
		<p>Again, the lexer doesn't change. The main program asks for the resulting tree and prints it out: 
</p>
		<div class="code">
				<pre>import antlr.*;
import antlr.collections.*;

public class Main {   
    public static void main(String[] args) throws Exception {    
        ExprLexer lexer = new ExprLexer(System.in);
        ExprParser parser = new ExprParser(lexer);
        parser.expr();
        AST t = parser.getAST();
        System.out.println(t.toStringTree());
    }    
}    
</pre>
		</div>
		<div class="code">
				<pre>$ java Main
3+4
 ( + 3 4 )
$ java Main
3+4*5 
 ( + 3 ( * 4 5 ) )
$ java Main
(3+4)*5
 ( * ( + 3 4 ) 5 )
$ 
</pre>
		</div>
		<h4>AST parsing and evaluation</h4>
		<p>The trees built by the above parser are pretty simple. A single rule in the tree parser suffices. 
</p>
		<div class="code">
				<pre>class ExprTreeParser extends TreeParser;

options {
    importVocab=ExprParser;
}

expr returns [int r=0]
{ int a,b; }
    :   #(PLUS  a=expr b=expr)  {r = a+b;}
    |   #(MINUS a=expr b=expr)  {r = a-b;}   
    |   #(STAR  a=expr b=expr)  {r = a*b;}
    |   i:INT                   {r = (int)Integer.parseInt(i.getText());}
    ;
</pre>
		</div>
		<p>The main program is modified to use the new tree parser for evaluation: 
</p>
		<div class="code">
				<pre>import antlr.*;
import antlr.collections.*;

public class Main {
    public static void main(String[] args) throws Exception {    
        ExprLexer lexer = new ExprLexer(System.in);
        ExprParser parser = new ExprParser(lexer);
        parser.expr();
        AST t = parser.getAST();
        System.out.println(t.toStringTree());
        ExprTreeParser treeParser = new ExprTreeParser();
        int x = treeParser.expr(t);
        System.out.println(x);
    }    
}
</pre>
		</div>
		<p>Now you get the tree structure and the resulting value. 
</p>
		<div class="code">
				<pre>$ java Main
3+4
 ( + 3 4 )
7
$ java Main
3+(4*5)+10
 ( + ( + 3 ( * 4 5 ) ) 10 )
33
$ 
</pre>
		</div>
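		<p>The two-phase approach (build a tree, then walk it) can also be sketched by hand without ANTLR. The following is only an illustration under stated assumptions: <tt>AstSketch</tt> and its nested <tt>Node</tt> class are hypothetical names, not ANTLR's <tt>AST</tt> interface, and the <tt>toStringTree()</tt> here merely imitates the LISP-like format shown above (without the leading blank).</p>

```java
// Sketch of "build an AST, then walk it", written by hand without ANTLR.
// AstSketch and Node are illustrative names, not ANTLR API.
class AstSketch {
    static class Node {
        final String text;      // "+", "-", "*", or an integer literal
        final Node left, right; // both null for INT leaves

        Node(String text, Node left, Node right) {
            this.text = text; this.left = left; this.right = right;
        }

        // LISP-like rendering: ( root child child )
        String toStringTree() {
            if (left == null) return text;
            return "( " + text + " " + left.toStringTree() + " " + right.toStringTree() + " )";
        }
    }

    private final String in;
    private int pos = 0;

    AstSketch(String in) { this.in = in.replace(" ", ""); } // drop blanks up front

    private char LA() { return pos < in.length() ? in.charAt(pos) : '\0'; }

    // expr : mexpr ((PLUS^|MINUS^) mexpr)*  -- each operator becomes the new root
    Node expr() {
        Node root = mexpr();
        while (LA() == '+' || LA() == '-') {
            String op = String.valueOf(in.charAt(pos++));
            root = new Node(op, root, mexpr());
        }
        return root;
    }

    // mexpr : atom (STAR^ atom)*
    Node mexpr() {
        Node root = atom();
        while (LA() == '*') {
            pos++;
            root = new Node("*", root, atom());
        }
        return root;
    }

    // atom : INT | LPAREN! expr RPAREN!  -- parentheses are left out of the tree
    Node atom() {
        if (LA() == '(') {
            pos++;
            Node inner = expr();
            if (LA() != ')') throw new IllegalStateException("expecting ')' at " + pos);
            pos++;
            return inner;
        }
        int start = pos;
        while (Character.isDigit(LA())) pos++;
        if (start == pos) throw new IllegalStateException("expecting INT at " + pos);
        return new Node(in.substring(start, pos), null, null);
    }

    // Tree walk: one case per alternative of the tree grammar rule.
    static int eval(Node t) {
        switch (t.text) {
            case "+": return eval(t.left) + eval(t.right);
            case "-": return eval(t.left) - eval(t.right);
            case "*": return eval(t.left) * eval(t.right);
            default:  return Integer.parseInt(t.text);
        }
    }

    public static void main(String[] args) {
        Node t = new AstSketch("3+(4*5)+10").expr();
        System.out.println(t.toStringTree());
        System.out.println(eval(t));
    }
}
```

		<p>Because the tree survives after parsing, <tt>eval()</tt> could be run several times or replaced by a different walk (for example, a rewrite pass) without touching the parser.</p>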
<img src ="http://www.blogjava.net/RR00/aggbug/68916.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/RR00/" target="_blank">R.Zeus</a> 2006-09-11 11:02 <a href="http://www.blogjava.net/RR00/articles/68916.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>hql.g</title><link>http://www.blogjava.net/RR00/articles/68537.html</link><dc:creator>R.Zeus</dc:creator><author>R.Zeus</author><pubDate>Fri, 08 Sep 2006 08:06:00 GMT</pubDate><guid>http://www.blogjava.net/RR00/articles/68537.html</guid><wfw:comment>http://www.blogjava.net/RR00/comments/68537.html</wfw:comment><comments>http://www.blogjava.net/RR00/articles/68537.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/RR00/comments/commentRss/68537.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/RR00/services/trackbacks/68537.html</trackback:ping><description><![CDATA[
		<font color="#ff0000">expressionOrVector</font> handles an expression wrapped in parentheses ('()').<br /><font color="#ff0000">sum(..expr..)</font> includes the parentheses, so sum must be followed by a parenthesized expression. The same applies to everything defined in the <font color="#800080">aggregate rule.<br /></font><img src ="http://www.blogjava.net/RR00/aggbug/68537.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/RR00/" target="_blank">R.Zeus</a> 2006-09-08 16:06 <a href="http://www.blogjava.net/RR00/articles/68537.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>use tokens in parser</title><link>http://www.blogjava.net/RR00/articles/68296.html</link><dc:creator>R.Zeus</dc:creator><author>R.Zeus</author><pubDate>Thu, 07 Sep 2006 07:12:00 GMT</pubDate><guid>http://www.blogjava.net/RR00/articles/68296.html</guid><wfw:comment>http://www.blogjava.net/RR00/comments/68296.html</wfw:comment><comments>http://www.blogjava.net/RR00/articles/68296.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/RR00/comments/commentRss/68296.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/RR00/services/trackbacks/68296.html</trackback:ping><description><![CDATA[
		<p>
				<font color="#a52a2a">
						<font style="BACKGROUND-COLOR: #deb887" color="#000000">class Pascal extends Parser;<br />options{<br />buildAST=true;<br />}</font>
				</font>
		</p>
		<p>
				<font color="#a52a2a">
						<font color="#000000">
								<font style="BACKGROUND-COLOR: #deb887">tokens<br />{<br /> // -- HQL Keyword tokens --<br /> ALL="all";<br /> } <br /></font>
								<br />// Define IDENT with <font color="#a52a2a"> testLiterals=true </font>and set <font style="BACKGROUND-COLOR: #deb887">buildAST=true;</font> so that the parser can use the keywords defined in <font style="BACKGROUND-COLOR: #deb887">tokens{ }</font>;<br />otherwise it cannot.</font>
						<br />IDENT options { testLiterals=true; }</font>
				<br /> : ID_START_LETTER ( ID_LETTER )*  <br /> ;<br />protected<br />ID_START_LETTER<br />    :    '_'<br />    |    '$'<br />    |    'a'..'z'<br />    |    '\u0080'..'\ufffe'       // HHH-558 : Allow unicode chars in identifiers<br />    ;<br />protected<br />ID_LETTER<br />    :    ID_START_LETTER<br />    |    '0'..'9'<br />    ;</p>
<img src ="http://www.blogjava.net/RR00/aggbug/68296.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/RR00/" target="_blank">R.Zeus</a> 2006-09-07 15:12 <a href="http://www.blogjava.net/RR00/articles/68296.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>What is a "protected" lexer rule?</title><link>http://www.blogjava.net/RR00/articles/68072.html</link><dc:creator>R.Zeus</dc:creator><author>R.Zeus</author><pubDate>Wed, 06 Sep 2006 09:18:00 GMT</pubDate><guid>http://www.blogjava.net/RR00/articles/68072.html</guid><wfw:comment>http://www.blogjava.net/RR00/comments/68072.html</wfw:comment><comments>http://www.blogjava.net/RR00/articles/68072.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/RR00/comments/commentRss/68072.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/RR00/services/trackbacks/68072.html</trackback:ping><description><![CDATA[
		<table cellspacing="0" cellpadding="5" width="100%" bgcolor="#ffffff" border="0">
				<tbody>
						<tr>
								<td valign="top" align="right">
										<font face="Verdana,Arial,Helvetica" size="2">
												<b>Author</b>
										</font>
								</td>
								<td valign="top" colspan="3">
										<font face="Verdana,Arial,Helvetica" size="2">
												<a href="http://www.jguru.com/guru/viewfaqs.jsp?EID=1">
														<strong>Terence Parr</strong>
												</a> <font color="#669900">PREMIUM</font></font>
								</td>
						</tr>
				</tbody>
		</table>
		<table cellspacing="0" cellpadding="5" width="100%" bgcolor="#ffffff" border="0">
				<tbody>
						<tr>
								<td valign="top" align="right">
										<p align="left">
												<font face="Verdana,Arial,Helvetica" size="2">
														<b>
																<em>                                 Created   <font face="Verdana,Arial,Helvetica" size="2">Sep 3, 1999<br /><br /></font></em>
														</b>
												</font>
										</p>
										<hr />
										<p>
										</p>
										<p align="left"> </p>
								</td>
								<td valign="top">
										<p align="left">
												<font face="Verdana,Arial,Helvetica" size="2">
												</font> </p>
								</td>
						</tr>
				</tbody>
		</table>
		<font face="Verdana">
				<font size="2">
						<strong>Answer</strong>
						<br />
				</font>
				<font size="2">A lexer is a <tt>TokenStream</tt> source that merely spits out a stream of <tt>Token</tt> objects to the parser (or another stream consumer). As such, a lexer implements method <tt>nextToken()</tt> to satisfy interface <tt>TokenStream</tt>. The parser repeatedly calls <tt><i>yourlexer</i>.nextToken()</tt> to get tokens. </font>
		</font>
		<p>
				<font size="2">
						<font face="Verdana">What token definitions result in token objects that get sent to the parser? The answer you'd expect or the one you're used to is, "You get a </font>
						<tt>
								<font face="Verdana,Arial,Helvetica">Token</font>
						</tt>object for every lexical rule in your lexer grammar." This is indeed the default case for ANTLR's lexer grammars. </font>
		</p>
		<p>
				<font size="2">What if you want to break up the definition of a complicated rule into multiple rules? Surely you don't want every rule to result in a complete <tt>Token</tt> object in this case. Some rules are only around to help other rules construct tokens. To distinguish these "helper" rules from rules that result in tokens, use the <tt>protected</tt> modifier. This overloading of the access-visibility Java term occurs because if the rule is not visible, it cannot be "seen" by the parser. </font>
		</p>
		<p>
				<font size="2">Another, more practical, way to look at this is to note that only non-protected rules get called by <tt>nextToken()</tt> and, hence, only non-protected rules can generate tokens that get shoved down the <tt>TokenStream</tt> pipe to the parser. </font>
		</p>
		<p>
				<font size="2">I now recognize this approach as a mistake. I have a number of other proposals to fix this, none that seems to satisfy everyone. </font>
				<tt>
						<pre>
								<font color="#ff0000" size="2">class L extends Lexer;

/** This rule is "visible" to the parser
 *  and a Token object is sent to the
 *  parser when an INT is matched.
 */
INT : (DIGIT)+ ;


/** This rule does not result in a token
 *  object that is passed to the parser.
 *  It merely recognizes a portion of INT.
 */
protected
DIGIT : '0'..'9' ;
</font>
						</pre>
				</tt>
		</p>
		<p>
		</p>
		<p>
				<font size="2">By definition, all lexical rules return <tt>Token</tt> objects (ANTLR optimizes away many of these object creations, however), but only the <tt>Token</tt> objects of non-protected rules get pulled out of the lexer itself.</font>
		</p>
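		<p>The distinction can be illustrated with a small hand-written lexer for the grammar <tt>L</tt>. This is a sketch, not what ANTLR generates: the class name <tt>ProtectedSketch</tt>, the method names, and the string-based tokens (<tt>"INT:12"</tt>, <tt>"EOF"</tt>) are assumptions made to keep the example self-contained. The point it shows is that <tt>nextToken()</tt> only ever dispatches to the non-protected rule, while the helper rule is reachable solely from inside another rule.</p>

```java
// Hand-written sketch of the lexer L; not ANTLR output.
// ProtectedSketch and the "INT:..."/"EOF" token strings are illustrative assumptions.
class ProtectedSketch {
    private final String in;
    private int pos = 0;

    ProtectedSketch(String in) { this.in = in; }

    private boolean atDigit() {
        return pos < in.length() && in.charAt(pos) >= '0' && in.charAt(pos) <= '9';
    }

    // "protected" helper rule DIGIT: consumes one character,
    // but never produces a token on its own.
    private void mDIGIT() {
        if (!atDigit()) throw new IllegalStateException("expecting digit at " + pos);
        pos++;
    }

    // Non-protected rule INT : (DIGIT)+ ; yields a token for the parser.
    private String mINT() {
        int start = pos;
        do { mDIGIT(); } while (atDigit());
        return "INT:" + in.substring(start, pos);
    }

    // nextToken() dispatches only to non-protected rules, so DIGIT
    // can never reach the parser as a token of its own.
    String nextToken() {
        while (pos < in.length() && in.charAt(pos) == ' ') pos++; // skip blanks
        if (pos >= in.length()) return "EOF";
        return mINT();
    }

    // Collects all tokens of the input, separated by single spaces.
    static String tokens(String text) {
        ProtectedSketch lexer = new ProtectedSketch(text);
        StringBuilder out = new StringBuilder();
        for (String t = lexer.nextToken(); !t.equals("EOF"); t = lexer.nextToken()) {
            if (out.length() > 0) out.append(' ');
            out.append(t);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(tokens("12 345"));
    }
}
```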
<img src ="http://www.blogjava.net/RR00/aggbug/68072.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/RR00/" target="_blank">R.Zeus</a> 2006-09-06 17:18 <a href="http://www.blogjava.net/RR00/articles/68072.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item><item><title>class LexPascal extends Lexer;</title><link>http://www.blogjava.net/RR00/articles/68066.html</link><dc:creator>R.Zeus</dc:creator><author>R.Zeus</author><pubDate>Wed, 06 Sep 2006 08:53:00 GMT</pubDate><guid>http://www.blogjava.net/RR00/articles/68066.html</guid><wfw:comment>http://www.blogjava.net/RR00/comments/68066.html</wfw:comment><comments>http://www.blogjava.net/RR00/articles/68066.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.blogjava.net/RR00/comments/commentRss/68066.html</wfw:commentRss><trackback:ping>http://www.blogjava.net/RR00/services/trackbacks/68066.html</trackback:ping><description><![CDATA[
		<p>
				<font color="#a52a2a">
						<hr />
				</font>
		</p>
		<p>
		</p>
		<p>
		</p>
		<p>
				<font color="#a52a2a">header<br />{<br />//   $Id: hql.g 8805 2005-12-09 12:22:18Z pgmjsd $</font>
		</p>
		<p>
				<font color="#a52a2a">package test;</font>
		</p>
		<p>
				<br />
				<font color="#a52a2a">}</font>
		</p>
		<p>
				<font color="#a52a2a">class Pascal extends Parser;</font>
		</p>
		<p>
				<font color="#a52a2a">prog:   INT<br />           EOF<br />            { System.out.println("plain old INT"); }<br />        <br />    |   REAL { System.out.println("token REAL"); }<br />    |RANGE { System.out.println("token RANGE");}<br />    |MORERANGE{ System.out.println("token MORERANGE");}<br />    ;</font>
		</p>
		<p>
				<font color="#a52a2a">class LexPascal extends Lexer;</font>
		</p>
		<p>
				<font color="#a52a2a">WS  :   (' '<br />    |   '\t'<br />    |   '\n'<br />    |   '\r')+<br />        { $setType(Token.SKIP); }<br />    ;</font>
		</p>
		<p>
				<font color="#a52a2a">
						<font color="#008000">protected</font>
						<br />INT :   ('0'..'9')+<br />    ;</font>
		</p>
		<p>
				<font color="#a52a2a">
						<font color="#008000">protected</font>
						<br />REAL:   INT '.' INT<br />    ;<br /><font color="#008000">protected</font><br />RANGE<br />    :   INT ".." INT<br />    ;<br /><font color="#008000">protected<br /></font>MORERANGE<br /> :INT "..."  INT;<br /> <br />RANGE_OR_INT<br />    :( INT "..." ) =&gt; MORERANGE  { $setType(MORERANGE); }   <br />    |( INT ".." ) =&gt; RANGE  { $setType(RANGE); }<br />    |   ( INT '.' )  =&gt; REAL { $setType(REAL); }<br />    |   INT                  { $setType(INT); }<br />    ;    <br /><hr /></font>
		</p>
		<p>
		</p>
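		<p>The decision made by <tt>RANGE_OR_INT</tt> amounts to scanning past the leading digits and checking which separator follows before committing to a token type. A minimal hand-written sketch of that decision is given here, assuming the whole lexeme is available as a string; <tt>PrefixSketch</tt> and <tt>classify()</tt> are hypothetical names, and this is not the code ANTLR generates for the predicates.</p>

```java
// Hand-written sketch of the decision made by RANGE_OR_INT; not ANTLR output.
// PrefixSketch and classify are hypothetical names used only for illustration.
class PrefixSketch {
    // Returns the index just past the run of digits that starts at 'from'.
    private static int digits(String s, int from) {
        int i = from;
        while (i < s.length() && s.charAt(i) >= '0' && s.charAt(i) <= '9') i++;
        return i;
    }

    // Classifies a whole lexeme, testing the longest separator first,
    // in the same order as the alternatives of RANGE_OR_INT.
    static String classify(String s) {
        int i = digits(s, 0);
        if (i == 0) throw new IllegalArgumentException("expecting INT: " + s);
        String type;
        int next;
        if (s.startsWith("...", i))     { type = "MORERANGE"; next = i + 3; }
        else if (s.startsWith("..", i)) { type = "RANGE";     next = i + 2; }
        else if (s.startsWith(".", i))  { type = "REAL";      next = i + 1; }
        else                            { type = "INT";       next = i; }
        if (!type.equals("INT")) {
            // every non-INT alternative requires a second INT after the separator
            int end = digits(s, next);
            if (end == next) throw new IllegalArgumentException("expecting INT after separator: " + s);
            next = end;
        }
        if (next != s.length()) throw new IllegalArgumentException("trailing input: " + s);
        return type;
    }

    public static void main(String[] args) {
        String[] samples = { "42", "3.14", "1..10", "1...10" };
        for (String s : samples) System.out.println(s + " -> " + classify(s));
    }
}
```

		<p>Testing the longest separator first matters: if <tt>"."</tt> were checked before <tt>".."</tt>, the input <tt>1..10</tt> would be misread as the start of a REAL.</p>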
		<p>
		</p>Use <font color="#008000">protected</font> and '<font color="#008000">=&gt;</font>' to distinguish tokens that share a common left prefix. Are there other ways?<br />See 
<a href="/RR00/articles/68072.html">What is a "protected" lexer rule? </a><img src ="http://www.blogjava.net/RR00/aggbug/68066.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.blogjava.net/RR00/" target="_blank">R.Zeus</a> 2006-09-06 16:53 <a href="http://www.blogjava.net/RR00/articles/68066.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item></channel></rss>