CS 241 — Spring 2018 — Assignment 7

Assignments for CS 241
← Assignment 6 Assignment 7 Assignment 8 →
Wednesday, June 27 at 5:00pm Wednesday, July 4 at 5:00pm Wednesday, July 11 at 5:00pm
P1P2P3P4P5P6

In this assignment, you will implement the semantic analysis phase of the WLP4 compiler for which you wrote a scanner and parser in the previous two assignments. In particular, it is the job of this assignment to catch all remaining compile-time errors in the source program; a program which makes it through this phase of compilation is guaranteed to be free of compile-time errors.

Note: the following two assignments will build upon your solution for this assignment. In order to proceed to those assignments, you must at minimum complete the part of this assignment devoted to building the symbol table (Problem 1). As long as your solution does at least this much correctly, you will not be disadvantaged when you start the following assignment. However, in order to complete assignment 9 you will need a working type checker.

Resources

The specific semantic rules that must be enforced are those outlined in the "Context-sensitive Syntax" section of the WLP4 Specification. A summary of the semantic rules of WLP4 related to types is available for download, and you should use it as a reference. The summary contains formal descriptions of type rules for WLP4; although these should match the English descriptions on the WLP4 specification page, the WLP4 specification should be considered the definitive resource. Please report any discrepancies between the two documents.

Testing

You must test your semantic analyzer yourself in Linux. To test your semantic analyzer, you will need .wlp4i files as generated by the parser specified in the previous assignment. In case you do not wish to use your own parser, you can use the one we have provided. Invoke it using the command wlp4parse.

Starting from a .wlp4 source file, the complete sequence of commands to test your semantic analyzer on it is:

wlp4scan < foo.wlp4 > foo.scanned
wlp4parse < foo.scanned > foo.wlp4i
racket wlp4gen.rkt < foo.wlp4i

or

wlp4scan < foo.wlp4 > foo.scanned
wlp4parse < foo.scanned > foo.wlp4i
./wlp4gen < foo.wlp4i

This can be abbreviated to

cat foo.wlp4 | wlp4scan | wlp4parse | racket wlp4gen.rkt

or

cat foo.wlp4 | wlp4scan | wlp4parse | ./wlp4gen

It is strongly recommended that you create a parse tree which can then be traversed to check for various compile-time errors. It may be useful to store the following items at each node in the tree:

Future assignments will build upon the code written in this assignment. In future assignments you may want to add additional capabilities to nodes in the tree, such as keeping track of what the node does (addition, subtraction, etc).

Remember the leaf nodes of the parse tree are terminals, so you can detect if you reached a leaf in the tree if the LHS of the rule is one of: BOF, BECOMES, COMMA, ELSE, EOF, EQ, GE, GT, ID, IF, INT, LBRACE, LE, LPAREN, LT, MINUS, NE, NUM, PCT, PLUS, PRINTLN, RBRACE, RETURN, RPAREN, SEMI, SLASH, STAR, WAIN, WHILE, AMP, LBRACK, RBRACK, NEW, DELETE, NULL.

Recall that in A1P1 you had to read a tree as part of your solution. You may find your solution to A1P1 helpful for starting this assignment.

Each solution in this assignment builds upon the previous question, so you must do the questions in order and your solution for any Pi should also solve Pi-1.

Problem 1 — 18% (wlp4gen.rkt or wlp4gen.cc)

For this problem, assume that the WLP4 file contains only the procedure wain. In other words, assume the production procedures → procedure procedures is not used, so that the non-terminal procedures is guaranteed to derive wain directly.

Implement reading in a .wlp4i file and creating a parse tree. Modify the program so that it collects information about declarations and types. To solve this problem, you must build a symbol table which contains an entry for every declared identifier in the program, including the parameters of wain. If an identifier is declared more than once, or is used without first having been declared, you must output a string containing ERROR to standard error. You are encouraged, but not required, to make your error messages more descriptive than this as long as the string ERROR occurs. These are the only errors you need to check for this problem.

If the input program contains no errors of the kind described above, then you must output a symbol table to standard error containing the name and type of every identifier declared in the program, including the parameters of wain. The symbol table must be formatted as follows: the first line contains the name of the procedure (wain); subsequently, the output contains one line for each declared identifier, and each line consists of the (case-sensitive) name of the identifier, followed by a space, followed by the identifier's type (int or int*). The order in which the entries in this symbol table occur is not important.

For example, if the input is:

int wain(int *a, int b) {
  int c = 0;
  return *a + b + c;
}

then the output on stderr should be:

wain
a int*
b int
c int

Click here to go back to the top of the page.

Problem 2 — 6% (wlp4gen.rkt or wlp4gen.cc)

Extend your solution to the previous problem to account for the possible presence of other procedures in the program, but still only process wain. For example, if the input is

int id (int x) { return x; }
int wain(int *a, int b) {
  int c = 0;
  return *a + id(b) + c;
}

then the output on stderr should be:

wain
a int*
b int
c int

Click here to go back to the top of the page.

Problem 3 — 12% (wlp4gen.rkt or wlp4gen.cc)

Extend your solution to the previous problem to include signatures for all procedures encountered, but still only show the variables local to procedure wain. The signature of a procedure is the sequence of parameter types that the procedure was declared to take; it does not include the result type. Leave an extra newline between procedures so that we can tell where one procedure ends and the next one begins. For this problem, you must indicate an error by writing a string containing ERROR to stderr if a procedure is declared more than once.

For example, when using the same input to the previous problem, the output on stderr should be:

id int

wain int* int
a int*
b int
c int

Click here to go back to the top of the page.

Problem 4 — 18% (wlp4gen.rkt or wlp4gen.cc)

Extend your solution to the previous problem to include type information for all variables in all procedures. Leave an extra newline between procedures so that we can tell where one procedure ends and the next begins. If any variable is declared more than once in the same scope, or used in a scope in which it is not declared, output a string containing ERROR to stderr. By the end of this problem, all errors related to declarations and scope should be caught.

For example, when using the same input to the previous problem, the output on stderr should be:

id int
x int

wain int* int
a int*
b int
c int

Click here to go back to the top of the page.

Problem 5 — 23% (wlp4gen.rkt or wlp4gen.cc)

Extend your solution to the previous problem to check for type errors in expressions. In particular, for every subtree with an expr or lvalue node at its root, compute the type associated with that subtree, or output a string containing ERROR to stderr if the expression cannot be validly typed. If every expression in the program has a valid type and there are no declaration errors then your program should not produce any output.

For example, in a node corresponding to the rule expr → expr PLUS term if the expr and term on the right-hand side both have type int*, then your program should output ERROR to stderr. Otherwise the expr on the left-hand side is well-typed, and its type is given on the semantic rules handout.

Click here to go back to the top of the page.

Problem 6 — 23% (wlp4gen.rkt or wlp4gen.cc)

Extend your solution to the previous problem to check for type errors in statements and tests. These are program elements that do not themselves have types, but demand that some of their subelements be given particular types. For example, println can only be used with type int, and delete [] can only be used with type int*.

By the end of this problem, your program should check and enforce every type rule given on the semantic rules handout. If a source program contains any kind of semantic error, you must output a string containing ERROR to standard error. If the source program is free of semantic errors, your program should produce no output.

Click here to go back to the top of the page.