The language will have one statement per line, and there will only be these statements:

read name
print name
set name = expression

# will start a comment that will go until the end of the current line.

Here's an example program:

# Program for converting miles to km
read miles
set kms = miles * 1.60934
print kms
So we know what a statement is, but what is an expression? An expression is one of these six things:

1.60934 (or any other number)
( expression )
expression + expression
expression - expression
expression * expression
expression / expression
This grammar is ambiguous. It doesn't know if 3 + 4 * 5 means 3 + ( 4 * 5 ) or (3 + 4) * 5. It also doesn't know if 2 - 3 - 4 is 2 - (3 - 4) or (2 - 3) - 4.

One solution is to specify, for each operator, its precedence (whether + or * goes first), and for each operator its associativity (left or right or not allowed - it determines which way 2 - 3 - 4 is parsed).
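As a quick worked example of what those rules decide: with the usual conventions (* before +, and - left-associative), 3 + 4 * 5 is 3 + 20 = 23 rather than 7 * 5 = 35, and 2 - 3 - 4 is (2 - 3) - 4 = -5 rather than 2 - (3 - 4) = 3.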
The other solution, the one our parser will follow, is to restructure the grammar. Instead of expression with 6 possibilities, it will now only have these:

expression + product
expression - product
product
product is:

product * factor
product / factor
factor

factor is:

( expression )
number
This grammar is no longer ambiguous. A product cannot have + on either side, unless we use parentheses, because none of the rules it refers to has +. And 2 - 3 - 4 cannot possibly mean 2 - (3 - 4), because the right side of - is a product, so it cannot have any - in it without explicit parentheses.
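As a worked example, 3 + 4 * 5 now has only one possible parse, since the * has to happen inside a product:

expression
-> expression + product
-> product + product
-> factor + product
-> 3 + product
-> 3 + product * factor
-> 3 + factor * factor
-> 3 + 4 * factor
-> 3 + 4 * 5

so it can only mean 3 + (4 * 5).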
There's nothing special about the names product or factor - I don't find them terribly intuitive myself. You could call them expression2, expression3, or such.

We can also write the grammar in a more compact, regexp-like notation. expression is:
is:product ( ("+"|"-") product )*
product is:

factor ( ("*"|"/") factor )*

factor is:

"(" expression ")"
number
a|b means a or b. ( ... )* means zero or more of whatever's in the parentheses. And as there are special characters now, I'm quoting all literal symbols like "+" or "(".
For tokenizing, I'll use StringScanner from the Ruby standard library and some regexps. Here's a tokenizer program for our math language:

#!/usr/bin/env ruby
require "strscan"
require "pathname"
class MathLanguage
def initialize(path)
@lines = Pathname(path).readlines
end
def tokenize(string)
scanner = StringScanner.new(string)
tokens = []
until scanner.eos?
if scanner.scan(/\s+/)
# do nothing
elsif scanner.scan(/#.*/)
# comments - ignore rest of line
elsif scanner.scan(/-?\d+(?:\.\d*)?/)
tokens << { type: "number", value: scanner.matched.to_f }
elsif scanner.scan(/[\+\-\*\/=()]|set|read|print/)
tokens << { type: scanner.matched }
elsif scanner.scan(/[a-zA-Z][a-zA-Z0-9]*/)
tokens << { type: "id", value: scanner.matched }
else
raise "Invalid character #{scanner.rest}"
end
end
tokens
end
def call
@lines.each do |line|
tokens = tokenize(line)
puts "Line: \"#{line.chomp}\" has tokens:"
tokens.each do |token|
p token
end
end
end
end
unless ARGV.size == 1
STDERR.puts "Usage: #{$0} file.math"
exit 1
end
MathLanguage.new(ARGV[0]).call
$ ./math_tokenizer.rb miles_to_km.math
Line: "# Program for converting miles to km" has tokens:
Line: "read miles" has tokens:
{:type=>"read"}
{:type=>"id", :value=>"miles"}
Line: "set kms = miles * 1.60934" has tokens:
{:type=>"set"}
{:type=>"id", :value=>"kms"}
{:type=>"="}
{:type=>"id", :value=>"miles"}
{:type=>"*"}
{:type=>"number", :value=>1.60934}
Line: "print kms" has tokens:
{:type=>"print"}
{:type=>"id", :value=>"kms"}
Now let's turn the token list into a parse tree. Each parse method for an X should expect that X will be on the left of the token list, and raise an exception if it isn't - but they should be OK with there being some leftover tokens.

def call
  @lines.each do |line|
    @tokens = tokenize(line)
    next if @tokens.empty?
    statement = parse_statement
    raise "Extra tokens left over" unless @tokens.empty?
    pp statement
  end
end
I moved @tokens to an instance variable, so we don't need to pass it to every method.

next_token_is? checks if the next token is of any of the expected types and returns true or false. expect_token shifts the token if it's of one of the expected types, or raises an exception if it's of a wrong type:

def next_token_is?(*types)
  @tokens[0] && types.include?(@tokens[0][:type])
end

def expect_token(*types)
  raise "Parse error" unless next_token_is?(*types)
  @tokens.shift
end
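# A quick illustration of how these two helpers behave (hypothetical session,
# not part of the program):
#
#   @tokens = [{type: "set"}, {type: "id", value: "kms"}]
#   next_token_is?("read", "set")   # => true
#   expect_token("read", "set")     # => {:type=>"set"}, and removes it from @tokens
#   next_token_is?("read", "set")   # => false - an "id" token is next now
#   expect_token("=")               # raises "Parse error"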
# factor: a number, a variable name, or a parenthesized expression
def parse_factor
  token = expect_token("number", "id", "(")
  case token[:type]
  when "number", "id"
    token
  when "("
    result = parse_expression
    expect_token(")")
    result
  end
end

# product: factor ( ("*"|"/") factor )* - builds left-associative nodes
def parse_product
  result = parse_factor
  while next_token_is?("*", "/")
    op_token = @tokens.shift
    result = {type: op_token[:type], left: result, right: parse_factor}
  end
  result
end

# expression: product ( ("+"|"-") product )* - builds left-associative nodes
def parse_expression
  result = parse_product
  while next_token_is?("+", "-")
    op_token = @tokens.shift
    result = {type: op_token[:type], left: result, right: parse_product}
  end
  result
end

# statement: read id, print id, or set id = expression
def parse_statement
  token = expect_token("read", "set", "print")
  case token[:type]
  when "read"
    token = expect_token("id")
    {type: "read", id: token[:value]}
  when "set"
    var_token = expect_token("id")
    expect_token("=")
    expr = parse_expression
    {type: "set", id: var_token[:value], expr: expr}
  when "print"
    token = expect_token("id")
    {type: "print", id: token[:value]}
  end
end
$ ./math_parser.rb miles_to_km.math
{:type=>"read",
:id=>"miles"}
{:type=>"set",
:id=>"kms",
:expr=>
{:type=>"*",
:left=>{:type=>"id", :value=>"miles"},
:right=>{:type=>"number", :value=>1.60934}}}
{:type=>"print",
:id=>"kms"}
Real parsers give much better error messages, like At line 20 char 1: expected one of "read", "set", or "print", but got token: "number". We're not doing anything like that - just a bare raise "Parse error" - but we could, and we will in the future.

The last step is actually evaluating the parsed statements:

def call
  @variables = {}
  @lines.each do |line|
    @tokens = tokenize(line)
    next if @tokens.empty?
    statement = parse_statement
    raise "Extra tokens left over" unless @tokens.empty?
    eval_statement(statement)
  end
end
def eval_expression(expr)
  case expr[:type]
  when "number"
    expr[:value]
  when "id"
    @variables[expr[:value]]
  when "+", "-", "*", "/"
    left = eval_expression(expr[:left])
    right = eval_expression(expr[:right])
    left.send(expr[:type], right)
  else
    raise
  end
end

def eval_statement(statement)
  case statement[:type]
  when "read"
    @variables[statement[:id]] = STDIN.readline.to_f
  when "print"
    puts @variables[statement[:id]]
  when "set"
    @variables[statement[:id]] = eval_expression(statement[:expr])
  else
    raise
  end
end
I put the else raise there to catch programming errors. If we coded things correctly, it should never be reached.

left.send(expr[:type], right) is equivalent to left + right, left - right, left * right, or left / right, depending on expr[:type].
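Here's a tiny standalone illustration of that (plain Ruby, nothing specific to our interpreter):

5.send("+", 3)     # => 8
10.0.send("/", 4)  # => 2.5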
$ ./math.rb miles_to_km.math
500
804.67
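The 500 here is what we typed in for read miles, and 804.67 is what print kms printed back: 500 * 1.60934 = 804.67.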