Category: javacc
2010/01/11
2007/10/29
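Getting Io to run on Java: diary, part 3
It now handles very simple arithmetic (put anything more complex into the syntax and it's a parse error, orz).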
io(123 + 321 + 456) = 900
io(1 * 2 + 3) = 5
io(5 + 2 / 3) = 5.666667
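It looks like using JavaCC pretty much forces the Visitor pattern on you, and I can't get used to it.
I understand that the processing is done per Node generated from the BNF, but for a dynamic language like Io you have to create lots of Contexts, and if you hold those Contexts as a composite you end up with nodes that never got evaluated, which is a problem...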
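To make the "processing per Node" part concrete, here is a minimal sketch of the double dispatch that a JJTree-style visitor relies on. Every name in it (MiniVisitor, MiniNode, NumNode, AddNode, SumVisitor) is made up for illustration and is not one of the generated classes from this project; the generated code goes through jjtAccept instead of the accept method used here.

```java
import java.util.ArrayList;
import java.util.List;

// Entry point of the sketch; all classes below are hypothetical stand-ins.
class VisitorSketch {
    static int eval(MiniNode root) {
        return (Integer) root.accept(new SumVisitor(), null);
    }

    public static void main(String[] args) {
        MiniNode tree = new AddNode(
                new AddNode(new NumNode(123), new NumNode(321)),
                new NumNode(456));
        System.out.println(eval(tree)); // 900
    }
}

// One visit overload per node class, like a generated *Visitor interface.
interface MiniVisitor {
    Object visit(NumNode node, Object data);
    Object visit(AddNode node, Object data);
}

abstract class MiniNode {
    final List<MiniNode> children = new ArrayList<>();
    // Each concrete node calls back into the overload for its own type.
    abstract Object accept(MiniVisitor visitor, Object data);
}

class NumNode extends MiniNode {
    final int value;
    NumNode(int value) { this.value = value; }
    Object accept(MiniVisitor visitor, Object data) { return visitor.visit(this, data); }
}

class AddNode extends MiniNode {
    AddNode(MiniNode left, MiniNode right) {
        children.add(left);
        children.add(right);
    }
    Object accept(MiniVisitor visitor, Object data) { return visitor.visit(this, data); }
}

// The visitor decides when children are evaluated, so nothing is skipped by accident.
class SumVisitor implements MiniVisitor {
    public Object visit(NumNode node, Object data) {
        return node.value;
    }
    public Object visit(AddNode node, Object data) {
        int total = 0;
        for (MiniNode child : node.children) {
            total += (Integer) child.accept(this, data);
        }
        return total;
    }
}
```

The point of the sketch is only that the visitor, not the node, owns the traversal: a node stays unevaluated unless some visit method explicitly walks into it.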
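Or rather, since Io sends a Message to every object, I feel this should be possible to build very simply, but it's hard. Hmm.
Anyway, to get this far I read through the sources of JRuby, Jython and the like, and ended up with this layout.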
lang.io
|-- IOMain.java
|-- compiler
| |-- ArgumentCompiler.java
| |-- CodeCompiler.java
| |-- ExpressionCompiler.java
| |-- IdentifierCompiler.java
| |-- MessageCompiler.java
| |-- NodeCompiler.java
| |-- NodeCompilerFactory.java
| |-- NumberCompiler.java
| |-- OperatorCompiler.java
| |-- QuoteCompiler.java
| |-- StandardVisitor.java
| `-- Visitor.java
|-- core
| |-- Arguments.java
| |-- Context.java
| |-- Message.java
| `-- Scope.java
|-- evaluator
| |-- Evaluator.java
| |-- EvaluatorContext.java
| `-- IOEvaluator.java
|-- node
| `-- IONode.java
|-- parser
| |-- IONodeArgument.java
| |-- IONodeExpression.java
| |-- IONodeIdentifier.java
| |-- IONodeMessage.java
| |-- IONodeNumber.java
| |-- IONodeOperator.java
| |-- IONodeQuote.java
| |-- IOParser.java
| |-- IOParser.jj
| |-- IOParser.jjt
| |-- IOParserConstants.java
| |-- IOParserTokenManager.java
| |-- IOParserTreeConstants.java
| |-- IOParserVisitor.java
| |-- JJTIOParserState.java
| |-- Node.java
| |-- ParseException.java
| |-- SimpleCharStream.java
| |-- SimpleNode.java
| |-- Token.java
| `-- TokenMgrError.java
`-- runtime
  |-- IBlock.java
  |-- IoArguments.java
  |-- IoBlock.java
  |-- IoCall.java
  |-- IoMessage.java
  |-- IoMethod.java
  |-- IoNumber.java
  |-- IoObject.java
  |-- IoString.java
  `-- Slot.java
The Compiler classes hold the Context and so on, do the post-parse processing, and store the results in the Context (probably).
public interface NodeCompiler {
    public void compile(Visitor visitor, SimpleNode node, Context context);
}

public class NodeCompilerFactory implements IOParserVisitor {
    private static final NodeCompilerFactory self = new NodeCompilerFactory();

    private NodeCompilerFactory() {
        // no operation
    }

    public static NodeCompiler getCompiler(Node node) {
        return (NodeCompiler) node.jjtAccept(self, null);
    }

    public Object visit(IONodeExpression node, Object data) {
        return new ExpressionCompiler();
    }

    public Object visit(IONodeMessage node, Object data) {
        return new MessageCompiler();
    }
    :
    :
    :
}

public class MessageCompiler implements NodeCompiler {
    public void compile(Visitor visitor, SimpleNode node, Context context) {
        context.setMessage(new Message());
        visitor.acceptChildren(node, context);
    }
}

public class NumberCompiler implements NodeCompiler {
    public void compile(Visitor visitor, SimpleNode node, Context context) {
        BigDecimal decimal = new BigDecimal(node.getNodeValue());
        context.setSelf(new IoNumber(decimal));
    }
}

public class OperatorCompiler implements NodeCompiler {
    public void compile(Visitor visitor, SimpleNode node, Context context) {
        context.getMessage().setName(node.getNodeValue());
    }
}

public class QuoteCompiler implements NodeCompiler {
    public void compile(Visitor visitor, SimpleNode node, Context context) {
        context.setSelf(new IoString(node.getNodeValue()));
    }
}
When a number comes in, BigDecimal should do. It has methods like divide, after all.
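One thing worth knowing about BigDecimal.divide: without a scale it throws on results like 2 / 3 that have no exact decimal form. The sketch below shows the scaled variant, using 6 digits and rounding up to mirror the DivideMethod in this post. DivideSketch is a name made up for this example, and it uses RoundingMode.UP rather than the older BigDecimal.ROUND_UP constant.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

class DivideSketch {
    // Divide with an explicit scale of 6 digits, rounding away from zero.
    static BigDecimal divide(BigDecimal dividend, BigDecimal divisor) {
        return dividend.divide(divisor, 6, RoundingMode.UP);
    }

    public static void main(String[] args) {
        BigDecimal two = new BigDecimal("2");
        BigDecimal three = new BigDecimal("3");

        System.out.println(divide(two, three));                          // 0.666667
        System.out.println(new BigDecimal("5").add(divide(two, three))); // 5.666667

        try {
            // No scale given: 2/3 has a non-terminating decimal expansion.
            two.divide(three);
        } catch (ArithmeticException e) {
            System.out.println("ArithmeticException: " + e.getMessage());
        }
    }
}
```

The second line of output matches the io(5 + 2 / 3) result shown at the top of this entry.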
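After that, each Message in the generated Context is handed over to the objects to process. This is Io's message chain, as is. (That said, with the current implementation messages can only be passed around for the four arithmetic operations... maybe I need to buy a JavaCC book.)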
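As a rough picture of "hand each Message to the object", here is a small sketch of a message chain over numbers. It assumes strict left-to-right evaluation with no operator precedence, which is an assumption about the evaluation order and not something this post confirms. Msg, SLOTS and send are hypothetical names and are not the project's Message, Slot or IoObject classes.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;

class MessageChainSketch {
    // Hypothetical message: a slot name plus one already-evaluated argument.
    static final class Msg {
        final String name;
        final BigDecimal arg;
        Msg(String name, BigDecimal arg) {
            this.name = name;
            this.arg = arg;
        }
    }

    // Slot table for numbers: message name -> behaviour.
    static final Map<String, BinaryOperator<BigDecimal>> SLOTS = new HashMap<>();
    static {
        SLOTS.put("+", BigDecimal::add);
        SLOTS.put("-", BigDecimal::subtract);
        SLOTS.put("*", BigDecimal::multiply);
    }

    // Send each message to the result of the previous one, left to right.
    static BigDecimal send(BigDecimal receiver, List<Msg> chain) {
        BigDecimal current = receiver;
        for (Msg m : chain) {
            BinaryOperator<BigDecimal> slot = SLOTS.get(m.name);
            if (slot == null) {
                continue; // unknown slot: keep the current receiver unchanged
            }
            current = slot.apply(current, m.arg);
        }
        return current;
    }

    public static void main(String[] args) {
        List<Msg> chain = new ArrayList<>();
        chain.add(new Msg("+", new BigDecimal("321")));
        chain.add(new Msg("+", new BigDecimal("456")));
        System.out.println(send(new BigDecimal("123"), chain)); // 900
    }
}
```

Adding a new operation is then just one more entry in the slot table, which is the appeal of the "everything is a message" design.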
The IoObject and IoNumber classes in the runtime package, which emulate Io's behavior on Java, look like this.
public class IoObject {
    private Map<String, Slot> slot = new HashMap<String, Slot>();

    public IoObject() {
        :
        : something
        :
    }

    public Slot getSlot(String slotName) {
        return slot.get(slotName);
    }

    public Slot setSlot(String slotName, Slot slotObject) {
        return slot.put(slotName, slotObject);
    }

    public Slot setSlot(String slotName, IoObject slotObject) {
        return slot.put(slotName, new Slot(createReturnSelf(slotObject)));
    }

    public IoObject call(IoCall call) {
        IoMessage message = call.getMessage();
        Slot s = slot.get(message.getName());
        if (s == null) {
            return this;
        }
        IBlock block = s.getSlot();
        return block.call(this, message.getArguments());
    }
    :
    :
    :
}
public class IoNumber extends IoObject {
    private BigDecimal number;
    private static final Map<String, Slot> slot = new HashMap<String, Slot>();

    static {
        slot.put("+", new Slot(new AddMethod()));
        slot.put("-", new Slot(new SubtractMethod()));
        slot.put("*", new Slot(new MultiplyMethod()));
        slot.put("/", new Slot(new DivideMethod()));
    }

    public IoNumber(BigDecimal number) {
        super();
        this.number = number;
    }

    protected BigDecimal getNumber() {
        return this.number;
    }

    public String toString() {
        return number.toString();
    }

    protected Map<String, Slot> allSlots() {
        return slot;
    }

    protected static class AddMethod extends IoMethod {
        public IoObject call(IoObject object, IoArguments parameter) {
            BigDecimal target = ((IoNumber) object).getNumber();
            IoNumber arg0 = (IoNumber) parameter.get(0);
            BigDecimal result = target.add(arg0.getNumber());
            return new IoNumber(result);
        }
    }

    protected static class SubtractMethod extends IoMethod {
        public IoObject call(IoObject object, IoArguments parameter) {
            BigDecimal target = ((IoNumber) object).getNumber();
            IoNumber arg0 = (IoNumber) parameter.get(0);
            BigDecimal result = target.subtract(arg0.getNumber());
            return new IoNumber(result);
        }
    }

    protected static class MultiplyMethod extends IoMethod {
        public IoObject call(IoObject object, IoArguments parameter) {
            BigDecimal target = ((IoNumber) object).getNumber();
            IoNumber arg0 = (IoNumber) parameter.get(0);
            BigDecimal result = target.multiply(arg0.getNumber());
            return new IoNumber(result);
        }
    }

    protected static class DivideMethod extends IoMethod {
        public IoObject call(IoObject object, IoArguments parameter) {
            BigDecimal target = ((IoNumber) object).getNumber();
            IoNumber arg0 = (IoNumber) parameter.get(0);
            // Note: this computes arg0 / target (the argument divided by the receiver),
            // with 6 decimal places and rounding up.
            BigDecimal result = arg0.getNumber().divide(target, 6, BigDecimal.ROUND_UP);
            return new IoNumber(result);
        }
    }
}
I really need to get more used to the Visitor pattern. Also, I should make lazy evaluation possible.
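For the lazy evaluation part, one common way to get it in Java is to wrap an argument in a thunk that only runs when someone asks for the value. The following is a minimal sketch of that idea, not how this project does it; Lazy and LazySketch are hypothetical names.

```java
import java.util.function.Supplier;

class LazySketch {
    // Hypothetical thunk: wraps a computation and runs it at most once, on first use.
    static final class Lazy<T> implements Supplier<T> {
        private Supplier<T> compute;
        private T value;
        private boolean done;

        Lazy(Supplier<T> compute) {
            this.compute = compute;
        }

        public T get() {
            if (!done) {
                value = compute.get();
                done = true;
                compute = null; // drop the closure once the value is cached
            }
            return value;
        }
    }

    static int evaluations = 0;

    public static void main(String[] args) {
        // The argument expression is captured, not computed, at this point.
        Lazy<Integer> arg = new Lazy<>(() -> {
            evaluations++;
            return 456 * 321;
        });
        System.out.println(evaluations); // 0
        System.out.println(arg.get());   // 146376
        arg.get();
        System.out.println(evaluations); // 1
    }
}
```

A method receiving such an argument can decide whether to evaluate it at all, which is what an Io-style message argument needs.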
The grammar needs more care too, since it's parse errors all over the place...
There's a long way to go...
2007/10/23
Getting Io to run on Java: diary, part 2
For now, the parsing is probably fine as it stands. Next I'll build the evaluation.
> Hoge foo = 123 - (456 * 321)
Call: Expression
Call: Message(LOOKING AHEAD...)
Visited token: <<IDENTIFIER>: "Hoge" at line 1 column 1>; Expected token: <<WCPAD>>
Call: Symbol(LOOKING AHEAD...)
Call: Number(LOOKING AHEAD...)
Visited token: <<IDENTIFIER>: "Hoge" at line 1 column 1>; Expected token: <<NUMBER>>
Return: Number(LOOKAHEAD FAILED)
Call: Identifier(LOOKING AHEAD...)
Visited token: <<IDENTIFIER>: "Hoge" at line 1 column 1>; Expected token: <<IDENTIFIER>>
Return: Identifier(LOOKAHEAD SUCCEEDED)
Return: Symbol(LOOKAHEAD SUCCEEDED)
Visited token: <<IDENTIFIER>: "foo" at line 1 column 6>; Expected token: <<SCPAD>>
Call: Arguments(LOOKING AHEAD...)
Visited token: <<IDENTIFIER>: "foo" at line 1 column 6>; Expected token: <<OPEN>>
Return: Arguments(LOOKAHEAD FAILED)
Return: Message(LOOKAHEAD SUCCEEDED)
Call: Message
Call: Symbol
Call: Identifier
Consumed token: <<IDENTIFIER>: "Hoge" at line 1 column 1>
Return: Identifier
Return: Symbol
Return: Message
Call: Message
Call: Symbol
Call: Identifier
Consumed token: <<IDENTIFIER>: "foo" at line 1 column 6>
Return: Identifier
Return: Symbol
Return: Message
Call: Message
Call: Symbol
Call: Operator
Consumed token: <<OPERATOR>: "=" at line 1 column 10>
Return: Operator
Return: Symbol
Return: Message
Call: Message
Call: Symbol
Call: Number
Consumed token: <<NUMBER>: "123" at line 1 column 12>
Return: Number
Return: Symbol
Return: Message
Call: Message
Call: Symbol
Call: Operator
Consumed token: <<OPERATOR>: "-" at line 1 column 16>
Return: Operator
Return: Symbol
Call: Arguments
Consumed token: <<OPEN>: "(" at line 1 column 18>
Call: Argument
Call: Expression
Call: Message(LOOKING AHEAD...)
Visited token: <<NUMBER>: "456" at line 1 column 19>; Expected token: <<WCPAD>>
Call: Symbol(LOOKING AHEAD...)
Call: Number(LOOKING AHEAD...)
Visited token: <<NUMBER>: "456" at line 1 column 19>; Expected token: <<NUMBER>>
Return: Number(LOOKAHEAD SUCCEEDED)
Return: Symbol(LOOKAHEAD SUCCEEDED)
Visited token: <<OPERATOR>: "*" at line 1 column 23>; Expected token: <<SCPAD>>
Call: Arguments(LOOKING AHEAD...)
Visited token: <<OPERATOR>: "*" at line 1 column 23>; Expected token: <<OPEN>>
Return: Arguments(LOOKAHEAD FAILED)
Return: Message(LOOKAHEAD SUCCEEDED)
Call: Message
Call: Symbol
Call: Number
Consumed token: <<NUMBER>: "456" at line 1 column 19>
Return: Number
Return: Symbol
Return: Message
Call: Message
Call: Symbol
Call: Operator
Consumed token: <<OPERATOR>: "*" at line 1 column 23>
Return: Operator
Return: Symbol
Return: Message
Call: Message
Call: Symbol
Call: Number
Consumed token: <<NUMBER>: "321" at line 1 column 25>
Return: Number
Return: Symbol
Return: Message
Return: Expression
Return: Argument
Consumed token: <<CLOSE>: ")" at line 1 column 28>
Return: Arguments
Return: Message
Return: Expression
[IOExpression]:Expression
And here is the jjt I wrote (with comments added).
// Option definitions
options {
JDK_VERSION = "1.5";
DEBUG_PARSER = true;
DEBUG_LOOKAHEAD = true;
DEBUG_TOKEN_MANAGER = false;
ERROR_REPORTING = true;
//* <boolean: true> Make all methods static
STATIC = false;
//* <boolean: false> Generate a class for each node
MULTI = true;
//* <boolean: false> Use the Visitor pattern
VISITOR = true;
//* <String> Default value for LOOKAHEAD
// LOOKAHEAD
//
USER_CHAR_STREAM = false;
//
USER_TOKEN_MANAGER = false;
//* <boolean> Make the generated parser accept Unicode input
UNICODE_INPUT = true;
//
JAVA_UNICODE_ESCAPE = false;
//* <boolean: true> Generate sample implementations of SimpleNode and any other nodes used in the grammar
// BUILD_NODE_FILES
//* <boolean: false> Do not distinguish upper and lower case
IGNORE_CASE = false;
//* <String> Base class for nodes
NODE_EXTENDS = "jp.s2php5.io.node.IONode";
//* <boolean: false> Insert calls to user-defined parser methods on entry to and exit from every node scope
// NODE_SCOPE_HOOK = true;
//* <boolean: false> Use a factory method with the following signature when creating nodes
NODE_FACTORY = true;
//* <boolean: false> Use the alternate form of the node construction routines, which is passed the parser object
NODE_USES_PARSER = true;
//* <String: ""> Package for the generated node classes. Defaults to the parser's package
// NODE_PACKAGE
//* <String: "AST"> Prefix used to build node class names from node identifiers in multi mode
NODE_PREFIX = "IO";
//* <boolean: false> Do not create nodes unless explicitly specified
// NODE_DEFAULT_VOID
//
BUILD_PARSER = true;
BUILD_TOKEN_MANAGER = true;
SANITY_CHECK = true;
FORCE_LA_CHECK = true;
//* <boolean: false> void CommonTokenAction(Token token) is called whenever a Token is found
COMMON_TOKEN_ACTION = true;
}
// Parser class definition
PARSER_BEGIN(IOParser)
package jp.s2php5.io.parser;
public class IOParser {}
PARSER_END(IOParser)
TOKEN_MGR_DECLS: {
void CommonTokenAction(Token t) {
}
}
/*
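* SKIP: the matched string is thrown away.
* MORE: matching continues. The string matched here, joined with whatever matches afterwards, becomes the token.
* TOKEN: a token is formed from the matched string and returned.
* SPECIAL_TOKEN: a special token is created but not returned. It can be seen through the specialToken field of the next matched token.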
*/
/* operators */
TOKEN: {
< OPERATOR:
<ASSIGN>
| <DOT>
| <SINGLEQUOTE>
| <DOUBLEQUOTE>
| <COLON>
| <TILDE>
| <BANG>
| <HOOK>
| <ATMARK>
| <DLR>
| <REM>
| <XOR>
| <BIT_AND>
| <BIT_OR>
| <BACKSLASH>
| <STAR>
| <MINUS>
| <PLUS>
| <SLASH>
| <GT>
| <LT>
>
| < #ASSIGN: "=" >
| < #GT: ">" >
| < #LT: "<" >
| < #BANG: "!" >
| < #HOOK: "?" >
| < #TILDE: "~" >
| < #PLUS: "+" >
| < #MINUS: "-" >
| < #STAR: "*" >
| < #SLASH: "/" >
| < #BACKSLASH: "\\">
| < #BIT_AND: "&" >
| < #BIT_OR: "|" >
| < #XOR: "^" >
| < #REM: "%" >
| < #ATMARK: "@" >
| < #DLR: "$" >
}
/* separators */
TOKEN: {
< SEMICOLON: ";" >
| < COMMA: "," >
| < DOT: "." >
| < COLON: ":" >
| < BACKQUOTE: "`" >
| < SINGLEQUOTE: "'" >
| < DOUBLEQUOTE: "\"" >
}
/* spans */
TOKEN: {
< TERMINATOR:
(<SEPERATOR>)? <SEMICOLON> | <NEWLINE> | <NEWLINE> (<SEPERATOR>)?
>
}
SPECIAL_TOKEN: {
< SEPERATOR: " " | "\t" | "\f" >
}
SPECIAL_TOKEN: {
< WHITESPACE: " " | "\t" | "\n" | "\r" | "\f" >
}
TOKEN: {
< SCTPAD: <SEPERATOR> | <COMMENT> | <TERMINATOR> >
}
TOKEN: {
< SCPAD: <SEPERATOR> | <COMMENT> >
}
TOKEN: {
< WCPAD: <WHITESPACE> | <COMMENT> >
}
TOKEN: {
< NEWLINE: "\r\n"|"\n"|"\r" >
}
/* identifier */
TOKEN: {
< IDENTIFIER: (<COLON>)? <LETTER> (<LETTER> | <NUMBER> | <SPECIALCHAR>)+ >
| < #LETTER: ["$","A"-"Z","_","a"-"z"] >
| < #SPECIALCHAR: [":", ".", "-"] >
}
/* numbers */
TOKEN: {
< NUMBER: <HEXNUMBER> | <DECIMAL> | <OCTET> | <FLOAT> >
}
TOKEN: {
< HEXNUMBER: "0" ["x","X"] (["0"-"9","a"-"f","A"-"F"])+ >
| < DECIMAL: ["1"-"9"] (["0"-"9"])* | "0" >
| < OCTET: "0" (["0"-"7"])* >
| < FLOAT: (["0"-"9"])+ "." (["0"-"9"])* (<EXPONENT>)?
| "." (["0"-"9"])+ (<EXPONENT>)?
| (["0"-"9"])+ <EXPONENT> >
| < #EXPONENT: ["e","E"] (["+","-"])? (["0"-"9"])+ >
}
/* comments */
TOKEN: {
< COMMENT: <SINGLE_LINE_COMMENT> | <MULTI_LINE_COMMENT> >
}
TOKEN: {
<SINGLE_LINE_COMMENT: ("//"|"#") (~["\n","\r"])*>
| <MULTI_LINE_COMMENT: "/*" (~["*"])* "*" ("*" | (~["*","/"] (~["*"])* "*"))* "/">
}
/* quotes */
TOKEN: {
< QUOTE: <BACKQUOTES> | <SINGLEQUOTES> | <DOUBLEQUOTES> |<TRIQUOTES> >
| < BACKQUOTES: <BACKQUOTE> > : IN_BACKQUOTE
| < SINGLEQUOTES: <SINGLEQUOTE> > : IN_SINGLEQUOTE
| < DOUBLEQUOTES: <DOUBLEQUOTE> > : IN_DOUBLEQUOTE
| < TRIQUOTES: "\"\"\"" > : IN_TRIQUOTE
}
< IN_BACKQUOTE > TOKEN: {
< BACKQUOTE > : DEFAULT
}
< IN_SINGLEQUOTE > TOKEN: {
< SINGLEQUOTE > : DEFAULT
}
< IN_DOUBLEQUOTE > TOKEN : {
< DOUBLEQUOTE > : DEFAULT
}
< IN_TRIQUOTE > TOKEN: {
< TRIQUOTE > : DEFAULT
}
/* characters */
TOKEN: {
< OPEN: <LPAREN> | <LBRACE> | <LBRACKET> >
| < #LPAREN: "(" >
| < #LBRACE: "{" >
| < #LBRACKET: "[" >
}
TOKEN: {
< CLOSE: <RPAREN> | <RBRACE> | <RBRACKET> >
| < #RPAREN: ")" >
| < #RBRACE: "}" >
| < #RBRACKET: "]" >
}
TOKEN: {
< ANYTHING_ELSE: (~[]) >
}
// Grammar definition
Node Expression():
{}
{
(
LOOKAHEAD(Message()) (Message())+ {
return jjtree.rootNode();
}
| <SCTPAD>
| <EOF>
)
}
Node Message():
{}
{
(<WCPAD>)* Symbol() (<SCPAD>)* (Arguments())* {
return jjtThis;
}
}
void Arguments() #void :
{}
{
<OPEN>
(Argument() (<COMMA> Argument())*)*
<CLOSE>
}
Node Argument() :
{}
{
(<WCPAD>)*
Expression()
(<WCPAD>)* {
return jjtThis;
}
}
void Symbol() :
{}
{
Number()
| Identifier()
| Quote()
| Operator()
}
void Identifier() :
{ Token token; }
{
token = <IDENTIFIER> {
jjtThis.setNodeValue(token.image);
}
}
void Number() :
{ Token token; }
{
token = <NUMBER> {
jjtThis.setNodeValue(token.image);
}
}
void Operator() :
{ Token token; }
{
token = <OPERATOR> {
jjtThis.setNodeValue(token.image);
}
}
void Quote() :
{ Token token; }
{
token = <QUOTE> {
jjtThis.setNodeValue(token.image);
}
}
There's a long way to go...
2007/10/22
Getting Io to run on Java: diary, part 1.5
Hmm, Expression is recursive in the original BNF, but JavaCC complains about something like an endless loop. This won't do.
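If the complaint comes from a rule that refers to itself before consuming any input (that is a guess, since the failing rule isn't shown here), the usual way out in a recursive-descent parser is to turn the recursion into a loop. Here is a tiny hand-written sketch of that shape; LoopInsteadOfRecursion and its methods are made-up names, and the "grammar" is simplified to whitespace-separated tokens ending at ";".

```java
class LoopInsteadOfRecursion {
    private final String[] tokens;
    private int pos = 0;

    LoopInsteadOfRecursion(String input) {
        this.tokens = input.trim().split("\\s+");
    }

    // expression := message (message)*
    // written as iteration instead of "expression := expression message"
    int countMessages() {
        int count = 0;
        while (pos < tokens.length && !tokens[pos].equals(";")) {
            message();
            count++;
        }
        return count;
    }

    // message := any single token in this sketch; it always consumes input,
    // so the loop above is guaranteed to make progress.
    private void message() {
        pos++;
    }

    public static void main(String[] args) {
        System.out.println(new LoopInsteadOfRecursion("Hoge foo bar baz ;").countMessages()); // 4
    }
}
```

In JavaCC terms this corresponds to writing the production as (Message())+ rather than having Expression call itself first.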
Call: Expression
Hoge foo bar baz;
Consumed token: <<EXPRESSION>: "Hoge " at line 1 column 1>
Call: Command
Consumed token: <<EXPRESSION>: "foo " at line 1 column 6>
Return: Command
Return: Expression
Call: Expression
Consumed token: <<EXPRESSION>: "bar " at line 1 column 10>
Call: Command
Consumed token: <<EXPRESSION>: "baz" at line 1 column 14>
Return: Command
Return: Expression
So it can only parse two Expressions, orz.
public class IOMain implements IOParserVisitor {
    public static void main(String[] args) {
        try {
            IOParser parser = new IOParser(System.in);
            IOMain visitor = new IOMain();
            parseNode(visitor, parser.Expression());
            parseNode(visitor, parser.Expression());
        } catch (ParseException e) {
            e.printStackTrace();
        }
    }

    private static void parseNode(IOParserVisitor visitor, Node node) {
        node.jjtAccept(visitor, null);
        for (int i = 0; i < node.jjtGetNumChildren(); ++i) {
            parseNode(visitor, node.jjtGetChild(i));
        }
    }

    public Object visit(SimpleNode node, Object data) {
        return "[Node]: " + node;
    }

    public Object visit(IOExpression node, Object data) {
        String word = node.jjtGetChild(0).jjtAccept(this, null).toString();
        return "[Expression]: " + word;
    }

    public Object visit(IOCommand node, Object data) {
        return "[IOCommand]: " + node.getNodeValue();
    }
}
Let's try making it print the Tokens.
options {
JDK_VERSION = "1.5";
DEBUG_PARSER = true;
DEBUG_LOOKAHEAD = false;
DEBUG_TOKEN_MANAGER = false;
ERROR_REPORTING = true;
STATIC = false;
MULTI = true;
VISITOR = true;
USER_CHAR_STREAM = false;
USER_TOKEN_MANAGER = false;
UNICODE_INPUT = true;
JAVA_UNICODE_ESCAPE = false;
IGNORE_CASE = false;
NODE_EXTENDS = "jp.s2php5.io.node.IONode";
NODE_FACTORY = true;
NODE_USES_PARSER = true;
NODE_PREFIX = "IO";
BUILD_PARSER = true;
BUILD_TOKEN_MANAGER = true;
SANITY_CHECK = true;
FORCE_LA_CHECK = true;
COMMON_TOKEN_ACTION = true;
}
PARSER_BEGIN(IOParser)
package jp.s2php5.io.parser;
public class IOParser {}
PARSER_END(IOParser)
TOKEN_MGR_DECLS: {
void CommonTokenAction(Token t) {
}
}
/* operators */
TOKEN: {
< ASSIGN: "=" >
| < GT: ">" >
| < LT: "<" >
| < BANG: "!" >
| < HOOK: "?" >
| < TILDE: "~" >
| < PLUS: "+" >
| < MINUS: "-" >
| < STAR: "*" >
| < SLASH: "/" >
| < BACKSLASH: "\\">
| < BIT_AND: "&" >
| < BIT_OR: "|" >
| < XOR: "^" >
| < REM: "%" >
| < ATMARK: "@" >
| < DLR: "$" >
}
/* separators */
TOKEN: {
< LPAREN: "(" >
| < RPAREN: ")" >
| < LBRACE: "{" >
| < RBRACE: "}" >
| < LBRACKET: "[" >
| < RBRACKET: "]" >
| < SEMICOLON: ";" >
| < COMMA: "," >
| < DOT: "." >
| < COLON: ":" >
| < BACKQUOTE: "`" >
| < SINGLEQUOTE: "'" >
| < DOUBLEQUOTE: "\"" >
}
TOKEN: {
< OPERATOR:
<ASSIGN>
| <DOT>
| <SINGLEQUOTE>
| <DOUBLEQUOTE>
| <COLON>
| <TILDE>
| <BANG>
| <HOOK>
| <ATMARK>
| <DLR>
| <REM>
| <XOR>
| <BIT_AND>
| <BIT_OR>
| <BACKSLASH>
| <STAR>
| <MINUS>
| <PLUS>
| <SLASH>
| <LPAREN>
| <RPAREN>
| <LBRACE>
| <RBRACE>
| <LBRACKET>
| <RBRACKET>
| <GT>
| <LT>
>
}
/* message */
<DEFAULT> TOKEN: {
< EXPRESSION: <MESSAGE> | <SCTPAD> >
}
TOKEN: {
< MESSAGE: (<WCPAD>)* <SYMBOL> (<SCPAD>)* (<ARGUMENTS>)* >
}
TOKEN: {
< ARGUMENTS:
<OPEN>
(<ARGUMENT> (<COMMA> <ARGUMENT>)*)*
<CLOSE>
>
}
TOKEN: {
< ARGUMENT: (<WCPAD>)* <SYMBOL> (<WCPAD>)* >
}
/* symbols */
TOKEN: {
< SYMBOL: <IDENTIFIER> | <NUMBER> | <OPERATOR> | <QUOTE> > : DEFAULT
}
TOKEN: {
< IDENTIFIER: (<COLON>)? (<LETTER> | <NUMBER> | <SPECIALCHAR>)+ >
| < #LETTER: ["$","A"-"Z","_","a"-"z"] >
}
TOKEN: {
< SPECIALCHAR: [":", ".", "-"] >
}
TOKEN: {
< QUOTE: <BACKQUOTES> | <SINGLEQUOTES> | <DOUBLEQUOTES> |<TRIQUOTES> >
| < BACKQUOTES: <BACKQUOTE> > : IN_BACKQUOTE
| < SINGLEQUOTES: <SINGLEQUOTE> > : IN_SINGLEQUOTE
| < DOUBLEQUOTES: <DOUBLEQUOTE> > : IN_DOUBLEQUOTE
| < TRIQUOTES: "\"\"\"" > : IN_TRIQUOTE
}
< IN_BACKQUOTE > TOKEN: {
< BACKQUOTE > : DEFAULT
}
< IN_SINGLEQUOTE > TOKEN: {
< SINGLEQUOTE > : DEFAULT
}
< IN_DOUBLEQUOTE > TOKEN : {
< DOUBLEQUOTE > : DEFAULT
}
< IN_TRIQUOTE > TOKEN: {
< TRIQUOTE > : DEFAULT
}
/* spans */
TOKEN: {
< TERMINATOR:
(<SEPERATOR>)? <SEMICOLON> | <NEWLINE> | (<NEWLINE> | <SEPERATOR>)?
>
}
TOKEN: {
< SEPERATOR: " " | "\t" | "\f" >
}
TOKEN: {
< WHITESPACE: " " | "\t" | "\n" | "\r" | "\f" >
}
TOKEN: {
< SCTPAD: <SEPERATOR> | <COMMENT> | <TERMINATOR> >
}
TOKEN: {
< SCPAD: <SEPERATOR> | <COMMENT> >
}
TOKEN: {
< WCPAD: <WHITESPACE> | <COMMENT> >
}
TOKEN: {
< NEWLINE: "\r\n"|"\n"|"\r" >
}
/* comments */
TOKEN: {
< COMMENT: <SINGLE_LINE_COMMENT> | <MULTI_LINE_COMMENT> >
}
TOKEN: {
<SINGLE_LINE_COMMENT: ("//"|"#") (~["\n","\r"])*>
| <MULTI_LINE_COMMENT: "/*" (~["*"])* "*" ("*" | (~["*","/"] (~["*"])* "*"))* "/">
}
/* numbers */
TOKEN: {
< NUMBER: <HEXNUMBER> | <DECIMAL> | <OCTET> | <FLOAT> >
}
TOKEN: {
< HEXNUMBER: "0" ["x","X"] (["0"-"9","a"-"f","A"-"F"])+ >
| < DECIMAL: ["1"-"9"] (["0"-"9"])* | "0" >
| < OCTET: "0" (["0"-"7"])* >
| < FLOAT: (["0"-"9"])+ "." (["0"-"9"])* (<EXPONENT>)?
| "." (["0"-"9"])+ (<EXPONENT>)?
| (["0"-"9"])+ <EXPONENT> >
| < #EXPONENT: ["e","E"] (["+","-"])? (["0"-"9"])+ >
}
/* characters */
TOKEN: {
< OPEN: <LPAREN> | <LBRACKET> | <LBRACE> >
}
TOKEN: {
< CLOSE: <RPAREN> | <RBRACKET> | <RBRACE> >
}
// Grammar definition
Node Expression():
{}
{
<EXPRESSION> Command() {
return jjtree.rootNode();
}
}
void Command() :
{ Token t;}
{
t = <EXPRESSION> {
jjtThis.setNodeValue(t.image);
}
}
Before getting to arithmetic, I need to sort out Expression.
2007/10/21
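Getting Io to run on Java: diary, part 1
The number of languages I'd like to extend has been growing lately. PHP is one, and Io is another (the only reason being that it can't handle Japanese).
But it's a real pity that most of them (the languages) are written in C. I don't get C. If only they were written in Java.
(snip)
Looking around, I found IO.pg in what seems to be Parrot's samples(?), so I'm experimenting with rewriting it JavaCC-style.
I'm seasoning it using the sample code in JavaCC's grammars, such as the PHP5 and C samples, as a reference.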
options {
LOOKAHEAD = 1;
CHOICE_AMBIGUITY_CHECK = 2;
OTHER_AMBIGUITY_CHECK = 1;
STATIC = false;
DEBUG_PARSER = false;
DEBUG_LOOKAHEAD = false;
DEBUG_TOKEN_MANAGER = false;
ERROR_REPORTING = true;
JAVA_UNICODE_ESCAPE = true;
UNICODE_INPUT = true;
IGNORE_CASE = false;
USER_TOKEN_MANAGER = false;
USER_CHAR_STREAM = false;
BUILD_PARSER = true;
BUILD_TOKEN_MANAGER = true;
SANITY_CHECK = true;
FORCE_LA_CHECK = true;
MULTI = true;
VISITOR = false;
}
PARSER_BEGIN(IOParser)
package jp.s2php5.io;
public class IOParser {
}
PARSER_END(IOParser)
/* SEPARATORS */
TOKEN: {
< LPAREN: "(" >
| < RPAREN: ")" >
| < LBRACE: "{" >
| < RBRACE: "}" >
| < LBRACKET: "[" >
| < RBRACKET: "]" >
| < SEMICOLON: ";" >
| < COMMA: "," >
| < DOT: "." >
| < COLON: ":" >
| < BACKQUOTE: "`" >
| < SINGLEQUOTE: "'" >
| < DOUBLEQUOTE: "\"" >
}
/* Message */
TOKEN: {
< EXPRESSION: <MESSAGE> | <SCTPAD> >
}
TOKEN: {
< MESSAGE: (<WCPAD>)* <SYMBOL> (<SCPAD>)* (<ARGUMENTS>)* >
}
TOKEN: {
< ARGUMENTS:
<OPEN>
(<ARGUMENT> (<COMMA><ARGUMENT>)*)*
<CLOSE>
>
}
TOKEN: {
< ARGUMENT: (<WCPAD>)* <SYMBOL> (<WCPAD>)* >
}
/* symbols */
TOKEN: {
< SYMBOL: <IDENTIFIER> | <NUMBER> | <OPERATOR> | <QUOTE> >
}
TOKEN: {
< IDENTIFIER: (<COLON>)? (["a"-"z", "A"-"Z"] | <NUMBER> | <SPECIALCHAR>)+ >
}
TOKEN: {
< SPECIALCHAR: [":", ".", "-"] >
}
TOKEN: {
< OPERATOR:
"="
| "."
| "'"
| "~"
| "!"
| "@"
| "$"
| "%"
| "^"
| "&"
| "*"
| "-"
| "+"
| "/"
| "="
| "{"
| "}"
| "["
| "]"
| "|"
| "\\"
| "<"
| ">"
| "?"
>
}
TOKEN: {
< QUOTE: <BACKQUOTE> | <SINGLEQUOTE> | <DOUBLEQUOTE> >
}
/* spans */
TOKEN: {
< TERMINATOR:
(<SEPERATOR>)? ";" | ("\r\n"|"\n"|"\r") | (<SEPERATOR>)?
>
}
TOKEN: {
< SEPERATOR: (<WHITESPACE> | <DOT>) >
}
TOKEN: {
< WHITESPACE: " " | "\t" | "\n" | "\r" | "\f" >
}
TOKEN: {
< SCTPAD: <SEPERATOR> | <COMMENT> | <TERMINATOR> >
}
TOKEN: {
< SCPAD: <SEPERATOR> | <COMMENT> >
}
TOKEN: {
< WCPAD: <WHITESPACE> | <COMMENT> >
}
/* comments */
TOKEN: {
< COMMENT: <SINGLE_LINE_COMMENT> | <MULTI_LINE_COMMENT> >
}
SPECIAL_TOKEN: {
<SINGLE_LINE_COMMENT: ("//"|"#") (~["\n","\r"])*>
| <MULTI_LINE_COMMENT: "/*" (~["*"])* "*" ("*" | (~["*","/"] (~["*"])* "*"))* "/">
}
/* numbers */
TOKEN: {
< NUMBER: <HEXNUMBER> | <DECIMAL> | <OCTET> | <FLOAT> >
}
TOKEN: {
< HEXNUMBER: "0" ["x","X"] (["0"-"9","a"-"f","A"-"F"])+ >
| < DECIMAL: ["1"-"9"] (["0"-"9"])* | "0" >
| < OCTET: "0" (["0"-"7"])* >
| < FLOAT: (["0"-"9"])+ "." (["0"-"9"])* (<EXPONENT>)?
| "." (["0"-"9"])+ (<EXPONENT>)?
| (["0"-"9"])+ <EXPONENT> >
| < #EXPONENT: ["e","E"] (["+","-"])? (["0"-"9"])+ >
}
/* characters */
TOKEN: {
< OPEN: <LPAREN> | <LBRACKET> | <LBRACE> >
}
TOKEN: {
< CLOSE: <RPAREN> | <RBRACKET> | <RBRACE> >
}
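By the way, this is only the parsing-ish part. It doesn't run; I'll be building it up until it does. (That's the plan, anyway.)
My references are:
- CodeZine: Creating a scripting language with JavaCC, part 2 (parsing, scripting languages, JJTree, JavaCC)
- CodeZine: Creating a scripting language with JavaCC, part 3 (parsing, scripting languages, JJTree, JavaCC)
# Don't post stuff that doesn't run...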

There was one for ANTLR, but I couldn't find one for JavaCC, so I cobbled one together.
via - http://harward.us/~nharward/antlr/memcached_protocol.g
The result looks like this.
SKIP: { " " | "\t" | "\r" | "\n" }

TOKEN: {
  < NUMBER: ["1"-"9"] (["0"-"9"])* | "0" >
| < FLAGS: < NUMBER > >
| < TIME: < NUMBER > >
| < LENGTH: < NUMBER > >
| < CREMENT_VALUE: < NUMBER > >
| < CAS_UNIQUE: < NUMBER > >
}

TOKEN: {
  < SET_STATEMENT: "set" >
| < ADD_STATEMENT: "add" >
| < REPLACE_STATEMENT: "replace" >
| < APPEND_STATEMENT: "append" >
| < PREPEND_STATEMENT: "prepend" >
| < CAS_STATEMENT: "cas" >
| < STORAGE_STATEMENT: < SET_STATEMENT > | < ADD_STATEMENT > | < REPLACE_STATEMENT > | < APPEND_STATEMENT > | < PREPEND_STATEMENT > >
| < STORAGE_COMMAND: (
      < STORAGE_STATEMENT > < KEY > < FLAGS > < TIME > < LENGTH >
    | < CAS_STATEMENT > < KEY > < FLAGS > < TIME > < LENGTH > < CAS_UNIQUE >
  ) (< NOREPLY >)? >
}

TOKEN: {
  < RETRIEVAL_STATEMENT: "get" | "gets" >
| < RETRIEVAL_COMMAND: < RETRIEVAL_STATEMENT > < KEY > >
}

TOKEN: {
  < DELETE_STATEMENT: "delete" >
| < DELETE_COMMAND: < DELETE_STATEMENT > < KEY > (< TIME >)? (< NOREPLY >)? >
}

TOKEN: {
  < INCREMENT_STATEMENT: "incr" >
| < INCREMENT_COMMAND: < INCREMENT_STATEMENT > < KEY > < CREMENT_VALUE > (< NOREPLY >)? >
}

TOKEN: {
  < DECREMENT_STATEMENT: "decr" >
| < DECREMENT_COMMAND: < DECREMENT_STATEMENT > < KEY > < CREMENT_VALUE > (< NOREPLY >)? >
}

TOKEN: {
  < STATISTICS_STATEMENT: "STAT" >
| < STATISTICS_OPTION: "items" | "slabs" | "sizes" >
| < STATISTICS_COMMAND: < STATISTICS_STATEMENT > (< STATISTICS_OPTION >)? >
}

TOKEN: {
  < FLUSH_STATEMENT: "flush_all" >
| < FLUSH_COMMAND: < FLUSH_STATEMENT > (< TIME >)? (< NOREPLY >)? >
}

TOKEN: {
  < VERSION_STATEMENT: "version" >
| < VERSION_COMMAND: < VERSION_STATEMENT > >
}

TOKEN: { < NOREPLY: "noreply" > }

// last match
TOKEN: { < KEY: (~[" ", "\r","\n"])+ > }

The Tokens around numbers (TIME, LENGTH and so on) are probably a bit too sloppy.
Then I had it parse the input into some rough-and-ready Nodes.
Command Command():
{ Command command; }
{
  (
      command = RetrievalCommand()
    | command = StorageCommand()
    | command = DeleteCommand()
    | command = VersionCommand()
  )
  { return command; }
}

StorageCommand StorageCommand():
{
  StorageCommand command;
  String key;
  Long flags = 0L;
  Long time = 0L;
  Long length = 0L;
  Boolean noreply = Boolean.FALSE;
}
{
  command = createStorageCommand()
  key = Key()
  flags = Flags()
  time = Time()
  length = Length()
  noreply = Noreply()
  {
    command.setNode(jjtThis);
    command.setKey(key);
    command.setFlags(flags);
    command.setExpTime(time);
    command.setLength(length);
    command.setNoreply(noreply);
    return command;
  }
}

StorageCommand createStorageCommand():
{}
{
  (
      < SET_STATEMENT > { return new SetCommand(); }
    | < ADD_STATEMENT > { return new AddCommand(); }
    | < REPLACE_STATEMENT > { return new ReplaceCommand(); }
    | < APPEND_STATEMENT > { return new AppendCommand(); }
    | < PREPEND_STATEMENT > { return new PrependCommand(); }
  )
}

RetrievalCommand RetrievalCommand():
{
  RetrievalCommand command = new RetrievalCommand();
  String key;
}
{
  < RETRIEVAL_STATEMENT >
  ( key = Key() { command.addKey(key); } )+
  {
    command.setNode(jjtThis);
    return command;
  }
}

DeleteCommand DeleteCommand():
{
  DeleteCommand command = new DeleteCommand();
  String key;
  Long time = 0L;
  Boolean noreply = Boolean.FALSE;
}
{
  < DELETE_STATEMENT >
  key = Key()
  time = Time()
  noreply = Noreply()
  {
    command.setNode(jjtThis);
    command.setKey(key);
    command.setExpTime(time);
    command.setNoreply(noreply);
    return command;
  }
}

VersionCommand VersionCommand():
{}
{
  < VERSION_STATEMENT >
  {
    VersionCommand command = new VersionCommand();
    command.setNode(jjtThis);
    return command;
  }
}

String Key():
{ Token key; }
{
  key = < KEY > { return key.image; }
}

Long Flags():
{ Token flags; }
{
  flags = < NUMBER > { return Long.valueOf(flags.image); }
}

Long Time():
{ Token time; }
{
  time = < NUMBER > { return Long.valueOf(time.image); }
}

Long Length():
{ Token length; }
{
  length = < NUMBER > { return Long.valueOf(length.image); }
}

Boolean Noreply():
{ Boolean noreply = Boolean.FALSE; }
{
  [ < NOREPLY > { noreply = Boolean.TRUE; } ]
  { return noreply; }
}

Throw some arbitrary input at this:
{
    StringReader reader = new StringReader("get hoge\r\n");
    MemcacheParser parser = new MemcacheParser(reader);
    try {
        parser.Command();
    } catch (ParseException e) {
        e.printStackTrace();
    }
}
{
    StringReader reader = new StringReader("gets hoge foo\r\n");
    MemcacheParser parser = new MemcacheParser(reader);
    try {
        parser.Command();
    } catch (ParseException e) {
        e.printStackTrace();
    }
}
{
    StringReader reader = new StringReader("set xyzkey 0 0 6\r\n");
    MemcacheParser parser = new MemcacheParser(reader);
    try {
        parser.Command();
    } catch (ParseException e) {
        e.printStackTrace();
    }
}

Call: Command
Call: RetrievalCommand
Consumed token: <<RETRIEVAL_STATEMENT>: "get" at line 1 column 1>
Call: Key
Consumed token: <<KEY>: "hoge" at line 1 column 5>
Return: Key
Return: RetrievalCommand
Return: Command
Call: Command
Call: RetrievalCommand
Consumed token: <<RETRIEVAL_STATEMENT>: "gets" at line 1 column 1>
Call: Key
Consumed token: <<KEY>: "hoge" at line 1 column 6>
Return: Key
Call: Key
Consumed token: <<KEY>: "foo" at line 1 column 11>
Return: Key
Return: RetrievalCommand
Return: Command
Call: Command
Call: StorageCommand
Call: createStorageCommand
Consumed token: <"set" at line 1 column 1>
Return: createStorageCommand
Call: Key
Consumed token: <<KEY>: "xyzkey" at line 1 column 5>
Return: Key
Call: Flags
Consumed token: <<NUMBER>: "0" at line 1 column 12>
Return: Flags
Call: Time
Consumed token: <<NUMBER>: "0" at line 1 column 14>
Return: Time
Call: Length
Consumed token: <<NUMBER>: "6" at line 1 column 16>
Return: Length
Call: Noreply
Return: Noreply
Return: StorageCommand
Return: Command

and it comes out like this.
<NUMBER> and friends: I really haven't written the tokens properly...
Having got this far, I cobbled together a memcached-compatible thing that has only set and get.
Sure enough, nothing but set and get is implemented. And the handling is a bit sloppy.
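For a sense of what "only set and get" amounts to, here is a minimal in-memory sketch of handling those two commands from the memcached text protocol. It is not the implementation from this post: MiniCacheSketch and handle are hypothetical names, flags and exptime are ignored, there is no networking, and the length check counts characters, so it assumes ASCII payloads.

```java
import java.util.HashMap;
import java.util.Map;

class MiniCacheSketch {
    private final Map<String, String> store = new HashMap<>();

    // Handles one command line; `data` is the payload line that follows a "set" command.
    String handle(String commandLine, String data) {
        String[] parts = commandLine.trim().split("\\s+");

        if (parts[0].equals("set") && parts.length >= 5) {
            // set <key> <flags> <exptime> <bytes>
            int length = Integer.parseInt(parts[4]);
            if (data == null || data.length() != length) {
                return "CLIENT_ERROR bad data chunk\r\n";
            }
            store.put(parts[1], data);
            return "STORED\r\n";
        }

        if (parts[0].equals("get") && parts.length >= 2) {
            // get <key>*: one VALUE block per key that exists, then END
            StringBuilder out = new StringBuilder();
            for (int i = 1; i < parts.length; i++) {
                String value = store.get(parts[i]);
                if (value != null) {
                    out.append("VALUE ").append(parts[i]).append(" 0 ")
                       .append(value.length()).append("\r\n")
                       .append(value).append("\r\n");
                }
            }
            return out.append("END\r\n").toString();
        }

        return "ERROR\r\n";
    }

    public static void main(String[] args) {
        MiniCacheSketch cache = new MiniCacheSketch();
        System.out.print(cache.handle("set hoge 0 0 3", "123")); // STORED
        System.out.print(cache.handle("get hoge", null));        // VALUE hoge 0 3 / 123 / END
    }
}
```

In a real server the command line would come out of the parser above as a Command object, and the payload would be read from the socket using the parsed length.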
So I compared this one (written in Java) against memcached (written in C).
$target = array(
    array('host' => 'localhost', 'port' => 11211),
    array('host' => 'localhost', 'port' => 12221)
);
foreach($target as $t){
    $memcache = new Memcache;
    $memcache->connect($t['host'], $t['port']);
    $elapsed = microtime(true);
    for($i = 0; $i < 1000; ++$i){
        $memcache->set('hoge', '123');
        $memcache->set('hoge', '124');
        $memcache->get('hoge');
    }
    echo 'target host => ', $t['host'], ' port =>', $t['port'], PHP_EOL;
    echo 'elapsed: ', (microtime(true) - $elapsed), PHP_EOL;
}

How to put it... it earns a "try a bit harder" award, which is a little disappointing. (About 3 times slower.)
It looks like it's going to work, so I'll keep at it with the rest of the implementation.