這里介紹一個極其簡單的編程語言的部分實現,即Straight-Line編程語言,沒有循環和if-else的結構。
Straight-Line語言的語法(Grammar)
Stm ::= Stm; Stm
|
CompoundStm
|
Stm ::= id := Exp
|
AssignStm
|
Stm ::= print(ExpList)
|
PrintStm
|
ExpList ::= Exp, ExpList
|
PairExpList
|
ExpList ::= Exp
|
LastExpList
|
Exp ::= id
|
IdExp
|
Exp ::= num
|
NumExp
|
Exp ::= Exp Binop Exp
|
OpExp
|
Exp ::= (Stm, Exp)
|
EseqExp
|
Binop::= + | - | * | /
|
Arithmetic Operators
|
Straight-Line語言的語義如下,s
1;s
2先執行s
1,再執行s
2。i := e, 計算e的值,保存在變量i中。print(e
1, e
2, ..., e
n)打印出e
1, e
2, ..., e
n的值,用空格隔離,結尾換行。(Stm, Exp)類似C中的逗號表達式,先執行Stm,為了Stm的Side Effect,然后計算Exp,返回Exp的值。舉個例子,
a := 5 + 3; b := (print (a, a-1), 10*a); print(b)
輸出
8 7
80
怎樣在編譯器中表達Straight-Line語言的程序呢?Straight-Line程序是一個樹狀結構,可以用樹狀的數據結構來表示, 節點表示程序的不同單元。從Straight-Line語言的語法我們可以推出怎樣在Java中用類的結構表達Straight-Line的程序。每個符號對應一個抽象類,比如Stm抽象類對應Stm符號。每條語法規則用一個具體類的構造函數實現。比如CompoundStm的右邊有兩個Stm組成,那么繼承自Stm的CompoundStm的一個構造函數的參數是兩個Stm。這兩個Stm保存在CompoundStm的屬性里。
abstract class Stm {}
abstract class Exp {}
abstract class ExpList {}
class CompoundStm extends Stm {
Stm stm1, stm2;
CompoundStm(Stm s1, Stm s2) { stm1 = s1; stm2 = s2; }
}
class AssignStm extends Stm {
String id; Exp exp;
AssignStm(String i, Exp e) { id = i; exp = e; }
}
class PrintStm extends Stm {
ExpList exps;
PrintStm(ExpList e) { exps = e; }
}
class PairExpList extends ExpList {
Exp head; ExpList tail;
PairExpList(Exp h, ExpList t) { head = h; tail = t; }
}
class LastExpList extends ExpList {
Exp exp;
LastExpList(Exp e) { exp = e; }
}
class IdExp extends Exp {
String id;
IdExp(String i) { id = i; }
}
class NumExp extends Exp {
int num;
NumExp(int n) { num = n; }
}
class OpExp extends Exp {
final static int Plus = 1, Minus = 2, Times = 3, Div = 4;
Exp left, right;
int oper;
OpExp(Exp l, int o, Exp r) { left = l; oper = o; right = r; }
}
class EseqExp extends Exp {
Stm stm; Exp exp;
EseqExp(Stm s, Exp e) { stm = s; exp = e; }
}
上面這幾個Java類描述了Straight-Line語言
的語法結構。使用Parsing的技術可以將a := 5 + 3; b := (print (a, a-1), 10*a); print(b)
這段程序轉化為如下的樹狀結構
Stm testprog = new CompoundStm(new AssignStm("a", new OpExp(
new NumExp(5), OpExp.Plus, new NumExp(3))), new CompoundStm(
new AssignStm("b", new EseqExp(new PrintStm(new PairExpList(
new IdExp("a"), new LastExpList(new OpExp(new IdExp("a"),
OpExp.Minus, new NumExp(1))))), new OpExp(
new NumExp(10), OpExp.Times, new IdExp("a")))),
new PrintStm(new LastExpList(new IdExp("b")))));
在這里,我要略過Parsing,從上面這段樹狀結構開始,對Straight-Line程序做分析和解釋。分析是指分析一個Straight-Line程序的屬性,比如int maxargs(Stm stm)分析stm中的Print表達式的最大參數個數。解釋就是執行一個Straight-Line程序。下面我們來實現maxargs和Straight-Line程序的一個解釋器。我們采用一種沒有Side Effect的實現方式,也就是變量和對象屬性除了在構造時不能改變,對局部變量用定義時即賦值的方式。
首先是maxargs。我們先寫測試代碼。
package tiger.slpl;
import junit.framework.TestCase;
public class TestCountMaxPrintStmArgs extends TestCase {
public TestCountMaxPrintStmArgs(String m) {
super(m);
}
public void testMaxargs() {
CountMaxPrintStmArgs counter = new CountMaxPrintStmArgs();
assertEquals(2, counter.maxargs(TestProg.testprog));
}
}
TestProg.testprog即是上面給出的程序,print表達式參數個數的最大值是2. 現在實現maxargs。
package tiger.slpl;
import static java.lang.Math.max;
/**
* This Interpreter is too in functional style.<br>
* no assignment<br>
* definition with initialization<br>
* <code>int i = 1;</code> introduces a new variable
*
* @author pan
*/
class CountMaxPrintStmArgs {
/**
* Entry Point
*/
int maxargs(Stm stm) {
return _maxargs(stm);
}
/*
* Because ExpList can occurs only for PrintStm.
* That is the same to count length of ExpList
* but here I use another approach to count only for
* PrintStm
*
* if you want to avoid instanceof, then you can
* pack maxargs methods in classes e.g. Stm
*/
private int _maxargs(Stm stm) {
if (stm instanceof CompoundStm) {
CompoundStm cstm = (CompoundStm) stm;
return max(_maxargs(cstm.stm1), _maxargs(cstm.stm2));
} else if (stm instanceof AssignStm) {
AssignStm astm = (AssignStm) stm;
return _maxargs(astm.exp);
} else { // Then it can be only PrintStm
PrintStm pstm = (PrintStm) stm;
return max(countargs(pstm.exps), _maxargs(pstm.exps));
}
}
private int _maxargs(ExpList exps) {
if (exps instanceof PairExpList) {
PairExpList pexps = (PairExpList) exps;
return max(_maxargs(pexps.head), _maxargs(pexps.tail));
} else { // Then it can be LastExpList
LastExpList lexps = (LastExpList) exps;
return _maxargs(lexps.exp);
}
}
private int _maxargs(Exp exp) {
if (exp instanceof IdExp)
return 0;
else if (exp instanceof NumExp)
return 0;
else if (exp instanceof OpExp) {
OpExp oexp = (OpExp) exp;
return max(_maxargs(oexp.left), _maxargs(oexp.right));
} else { // Then it can be EseqExp
EseqExp eexp = (EseqExp) exp;
return max(_maxargs(eexp.stm), _maxargs(eexp.exp));
}
}
private int countargs(ExpList exps) {
if (exps instanceof LastExpList)
return 1;
else { // Then it is a PairExpList
PairExpList pexps = (PairExpList) exps;
return 1 + countargs(pexps.tail);
}
}
}
這里解釋一下int _maxargs(Stm stm)。一個Stm可以是CompoundStm, AssignStm或者PrintStm。如果是CompoundStm,那么_maxargs(Stm stm)等于stm下兩個子Stm的maxargs的較大值。如果是AssignStm,等于AssignStm的表達式的maxargs。如果是PrintStm,那么是PrintStm的參數個數(countargs數PrintStm的參數個數),或者ExpList的maxargs,看哪個更大。其他的函數的解釋也是類似的,對照Straight Line語言的語法不難理解。
上面的maxargs的實現中用了很多instanceof,另外的一種實現方式可以把各個maxargs放在各自的類下,比如CompoundStm.maxargs計算一個CompoundStm的maxargs。這種方式的一個缺點是,將分析算法放在模型結構類中。如果有很多種分析要做,模型類就比較混亂。可以使用Visitor設計模式,對不同的算法定義不同的Visitor類,兼顧前面兩種方式的優點。當然這是一篇有關編譯技術的隨筆,代碼采用最容易理解的實現方式。
下一篇來介紹如何解釋Straight Line語言的程序。
轉載請保留
http://m.tkk7.com/xilaile/archive/2007/05/13/117159.html