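How Ruby "understands" the program text you give it
*.rb => tokenize => parse => compile => YARV instructions
- The program text is converted into a stream of tokens
- An LALR parser converts the token stream into an AST data structure
- The AST data structure is converted into bytecode instructions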
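The three stages above can be observed from a single script. This is only a rough sketch: it assumes CRuby (RubyVM is not available on other implementations), and the variable names are chosen for illustration.

```ruby
require 'ripper'

source = "print(\"hello world\")\n"

# Stage 1: tokenize. Each entry holds a position, an event name, and the text.
tokens = Ripper.lex(source)

# Stage 2: parse. Ripper returns the tree as nested arrays (an S-expression).
ast = Ripper.sexp(source)

# Stage 3: compile to YARV instructions (CRuby only).
iseq = RubyVM::InstructionSequence.compile(source)

puts "#{tokens.size} tokens"
puts "AST root: #{ast.first.inspect}"
puts iseq.disasm
```

The exact disassembly text differs between Ruby versions, so it is best read as a debugging aid rather than a stable format.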
Tokenization
Take print("hello world") as an example:
require 'ripper'
require 'pp'
pp Ripper.lex <<CODE
print("hello world")
CODE
Output:
[[[1, 0], :on_ident, "print", CMDARG],
 [[1, 5], :on_lparen, "(", BEG|LABEL],
 [[1, 6], :on_tstring_beg, "\"", BEG|LABEL],
 [[1, 7], :on_tstring_content, "hello world", BEG|LABEL],
 [[1, 18], :on_tstring_end, "\"", END],
 [[1, 19], :on_rparen, ")", ENDFN],
 [[1, 20], :on_nl, "\n", BEG]]
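Each entry in the output is [[line, column], event, text, lexer state]. As a small sketch of how these fields might be used, the snippet below pulls out just the event names and checks that the token texts, joined in order, reproduce the source for this simple input. The variable names are illustrative only.

```ruby
require 'ripper'

code = "print(\"hello world\")\n"

# Destructure only the first three fields so this also works on Ruby
# versions whose entries carry no lexer-state field.
events = Ripper.lex(code).map { |(_pos, event, _text)| event }
texts  = Ripper.lex(code).map { |(_pos, _event, text)| text }

p events
puts texts.join == code
```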
Parsing
Bison
parse.y is the grammar file from which the parser is generated; it defines the grammar rules that Ruby code must follow.
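One way to see the grammar rules being enforced, without reading parse.y itself, is to hand the parser text that breaks them. The sketch below relies on Ripper.sexp returning nil when the input has a syntax error; the two sample strings are made up for illustration.

```ruby
require 'ripper'

valid   = "print(\"hello world\")\n"
invalid = "print(\"hello world\"\n"   # closing parenthesis is missing

valid_tree   = Ripper.sexp(valid)     # nested arrays describing the program
invalid_tree = Ripper.sexp(invalid)   # nil, because the grammar rejects it

puts "valid parses?   #{!valid_tree.nil?}"
puts "invalid parses? #{!invalid_tree.nil?}"
```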
AST
require 'ripper'
require 'pp'
code = <<CODE
10.times do |n|
  puts n
end
CODE
pp code
pp Ripper.sexp(code)
Output:
"10.times do |n|\n" + "  puts n\n" + "end\n"
[:program,
 [[:method_add_block,
   [:call,
    [:@int, "10", [1, 0]],
    [:@period, ".", [1, 2]],
    [:@ident, "times", [1, 3]]],
   [:do_block,
    [:block_var,
     [:params, [[:@ident, "n", [1, 13]]], nil, nil, nil, nil, nil, nil],
     false],
    [:bodystmt,
     [[:command,
       [:@ident, "puts", [2, 2]],
       [:args_add_block, [[:var_ref, [:@ident, "n", [2, 7]]]], false]]],
     nil,
     nil,
     nil]]]]]
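Because Ripper.sexp returns plain nested arrays, the tree can be walked with ordinary recursion. The following is a minimal sketch that collects every identifier leaf (nodes of the form [:@ident, name, position]); collect_idents is a hypothetical helper written for this note, not part of Ripper.

```ruby
require 'ripper'

# Hypothetical helper: depth-first walk that gathers identifier names.
def collect_idents(node, found = [])
  return found unless node.is_a?(Array)

  if node[0] == :@ident
    found << node[1]                 # node is [:@ident, "name", [line, col]]
  else
    node.each { |child| collect_idents(child, found) }
  end
  found
end

code = "10.times do |n|\n  puts n\nend\n"
idents = collect_idents(Ripper.sexp(code))
p idents
```

The identifiers come out in source order because the walk visits children left to right.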
Displaying structural information about the AST nodes
Contents of demo.rb:
a = 1;
Run:
ruby --dump parsetree demo.rb
Output:
###########################################################
## Do NOT use this node dump for any purpose other than ##
## debug and research. Compatibility is not guaranteed. ##
###########################################################
# @ NODE_SCOPE (line: 1, location: (1,0)-(1,6))
# +- nd_tbl: :a
# +- nd_args:
# |   (null node)
# +- nd_body:
#     @ NODE_LASGN (line: 1, location: (1,0)-(1,5))*
#     +- nd_vid: :a
#     +- nd_value:
#         @ NODE_LIT (line: 1, location: (1,4)-(1,5))
#         +- nd_lit: 1
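The same node structure can also be inspected from inside a program. This sketch assumes CRuby 2.6 or later, where RubyVM::AbstractSyntaxTree is available; like the dump above, it is meant for debugging and research only, and the type name of the literal node may vary between Ruby versions.

```ruby
# Parse the same statement that demo.rb contains.
root = RubyVM::AbstractSyntaxTree.parse("a = 1;")

# A SCOPE node's children are the local table, the arguments, and the body.
tbl, _args, body = root.children

# An LASGN node's children are the variable name and the value node.
name, value = body.children

puts root.type    # corresponds to NODE_SCOPE
p tbl             # local variable table
puts body.type    # corresponds to NODE_LASGN
p name
puts value.type   # literal node; name depends on the Ruby version
```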
References
- Ruby Under a Microscope (Chinese edition: 《Ruby 原理剖析》)