YACC - "bottom-up" parsers Exercise 1. (easy) ------------------------------------------------------------------- Use the YACC generator to create a parser to check whether an input file contains a sentence of the form 'an bn' (a' raised to the power of n and 'b' raised to the power of n) or if there is a "Syntax Error!!!"(being signalled together with the line number). The spaces, newlines and tabs should be omitted. For example if the input sentence is: aaa b bb the output should be accepted. In the case of: aaa ba bb the output looks like: 2: Syntax Error!!! Exercise 2. (easy) ----------------------------------------------------------------------------- Use the YACC generator to create a program to parse Boolean expressions being built of: constants 'true' and 'false', logical connectives 'or', 'and' and parentheses. Example input: false or true and (false or true) is accepted. The other input: true or false and causes the signal: Syntax Error!!! The parser is partially defined as: %token or and true false %start S %% S : E ; E : E or T | T ; %% #include yyerror(char *s) { printf("Syntax Error!!!"); } Your task is to augument the given definition. Exercise 3. (medium) ---------------------------------------------------------------------------- Let us consider Boolean expressions in Pascal consisting of: integer variables identifiers, equality symbol '=' and conjunction connective 'and'. Let us assume that expressions of the form: a=b=c are illegal. The scanner looks like: %{ #include "gram.h" %} %% "and" {return and;} "=" {return eq;} [a-zA-Z]+ {return var;} \ |\n|\t {;} . {yyerror(" Unexpected character!");} Let us consider the following definition: %token and eq var %start S %% S : E ; E : E eq E | E and E | var ; %% #include yyerror(char *s) { printf("Syntax Error!!!"); } Unfortunately, this parser is incorrect for it accepts the input: a=c=n Your task is to correct the parser in the way that for: a=c=n it gives Syntax Error!!! and for: a=b and b=c it accepts the input. Exercise 4.(easy) ------------------------------------------------------------------------- Use the YACC generator to create a parser to check whether an input text file is of the form: x|y where 'x' is a sequence, possibly empty, of octal digits and 'y' is its reverse. Example correct file is as follows: 1234|4321 but 1234|3321 is incorrect and should be signalled as "Syntax error!". Exercise 5.(easy) --------------------------------------------------------------------------- Use the YACC generator to create a parser to check whether an input text file contains a sequence of correctly paired parentheses. Any incorrect sequence should be signaleed as "Syntax error!". Example correct data: (()()((())())()) but (() is incorrect. YACC - "bottom-up" parsers Exercise 6. ---------------------------------------------------------------------------- Let us consider the metalanguage to define regular expressions being built on the alphabet A={a,b,c,d, f,g,...,z, 0,1,...,9}. Let us also assume that: - "e" denotes the language consisting of the empty word, - "O" denotes the empty language (set), - every symbol of the alphabet denotes the one-element language containing only this symbol, - let r, r1 i r2 be regular expressions, then "r1 | r2", "r1 r2", "r *" , "r +" and "( r) " are regular expressions. Nothing more is considered to be a regular expression. Use YACC to create a parser for the defined language. The scanner looks like: %{ #include "gram.h" %} %% O {return 'O';} e {return 'e';} [a-df-z] {return(letter);} [0-9] {return(digit);} "+" {return('+');} "*" {return('*');} "|" {return('|');} "(" { return('('); } ")" { return(')'); } \ |\n|\t {;} . {yyerror(" Unexpected character!");} Example correct data are for instance: (abc|f*|e)+ | O Example illegal data: (x|y|)*(www Exercise 7. ------------------------------------------------------------------------------ Let us consider a simple Pascal-like language to declare variables of types "integer" and "real". The program starts with the 'program' keyword followed by name (a sequence of letters and digits starting with a letter) and the semicolon ';'. Program body starts with 'var' followed by nonempty sequence of declarations. Each declaration consists of nonempty sequence of variable names followed by ':' and a word 'integer' or 'real'. Declarations are separated by ';'. The last declaration ends with '.' Use the YACC generator to create a parser for the language checking whether either the input is correct or there is the "Syntax error!". The scanner is given below: %{ #include "gram.h" %} %% "program" {return( program);} "var" {return( var);} "integer" {return( integer);} "real" {return( real);} [a-zA-Z][0-9a-zA-Z]* {return(nazwa);} ";" { return(';'); } ":" { return(':'); } \. { return('.'); } "," { return(','); } \ |\n ; . {yyerror("Unexpected character!\n");} The scanner ignores spaces, newlines and tabs and gives 5 lexical items to be declared in '%token' specification in YACC. The example correct program looks like: program test; var a,Beta,c21:integer; c,wart:real. The given below program contains 2 bugs - a lexical one and a syntactical one: program xxx; var# a,beta,, c21:integer; c,wart:real. YACC - "bottom-up" parsers Exercise 8. ------------------------------------------------------------------------------ Let us consider a simple Pascal-like language to declare variables of types "integer" and "real". The program starts with the 'program' keyword followed by name (a sequence of letters and digits starting with a letter) and the semicolon ';'. Program body starts with 'var' followed by nonempty sequence of declarations. Each declaration consists of nonempty sequence of variable names followed by ':' and a word 'integer' or 'real'. Declarations are separated by ';'. The last declaration ends with '.' Use the YACC generator to create a parser for the language checking whether either the input is correct or there is the "Syntax error!". The scanner is given below: %{ #include "gram.h" %} %% "program" {return( program);} "var" {return( var);} "integer" {return( integer);} "real" {return( real);} [a-zA-Z][0-9a-zA-Z]* {return(nazwa);} ";" { return(';'); } ":" { return(':'); } \. { return('.'); } "," { return(','); } \ |\n ; . {yyerror("Unexpected character!\n");} The scanner ignores spaces, newlines and tabs and gives 5 lexical items to be declared in '%token' specification in YACC. The example correct program looks like: program test; var a,Beta,c21:integer; c,wart:real. The given below program contains 2 bugs - a lexical one and a syntactical one: program xxx; var# a,beta,, c21:integer; c,wart:real.