CS 444/644 - Compiler Construction (Winter 2020) - Assignment 1
Scanning, Parsing, Weeding, AST Building
The first assignment to implement lexical and syntactic analysis.
The scanner splits up the input into tokens and catches lexical
errors (inputs that cannot form valid tokens).
The parser checks that the input list of tokens conforms to
a syntax specified using a context-free grammar.
The weeding stage detects simple errors that generally could
have been checked by a context-free grammar, but
this would make the grammar very complicated.
Once the checks have been done, it is convenient to convert
the initial parse tree, which follows the context-free
grammar used for parsing, into a simpler Abstract Syntax
It is recommended that your design follow the above four
stages, though it is not required, as long as you somehow
implement the lexical and syntactic specification of Joos 1W.
You must hand in to Marmoset a .zip archive containing
your source code. The .zip file must contain a file called
Makefile. Marmoset will run make on this Makefile
to compile your compiler. The Makefile must generate an executable
(binary or shell script) called joosc. When run with a single
argument, a filename, joosc should process the given Joos 1W file,
produce appropriate diagnostic messages on standard error,
and exit with one of the following Unix return codes:
- 0: the input file is lexically/syntactically valid Joos 1W
- 42: the input file is not lexically/syntactically valid Joos 1W
- any other value: your compiler crashed
The Marmoset tests for this assignment take several minutes to run.
Do not submit more than one submission at a time to Marmoset.
If Marmoset reports that your previous submission has not been tested
yet, do not submit another one. Denial-of-service attacks on Marmoset
will result in disciplinary action.
The following restrictions of the Joos 1W language must be
checked (either in the parser or in the weeder):
This phase of the compiler should also determine the values of all literals
in the program. For integer literals, it must check that the literal
is within the range of a Java int. For character and
string literals, it must expand any escape sequences in the literal.
- All characters in the input program must be in the range of 7-bit ASCII (0 to 127).
- A class cannot be both abstract and final.
- A method has a body if and only if it is neither abstract nor native.
- An abstract method cannot be static or final.
- A static method cannot be final.
- A native method must be static.
- The type void may only be used as the return type of a method.
- A formal parameter of a method must not have an initializer.
- A class/interface must be declared in a .java file with the same base name as the class/interface.
- An interface cannot contain fields or constructors.
- An interface method cannot be static, final, or native.
- An interface method cannot have a body.
- Every class must contain at least one explicit constructor.
- No field can be final.
- No multidimensional array types or multidimensional array creation expressions are allowed.
- A method or constructor must not contain explicit this() or super() calls.
Submit to Marmoset a PDF document of no more than five pages explaining
the design of this phase of your compiler. The document should be
organized to enable someone unfamiliar with your code to understand the
structure of (this phase of) your compiler. In the document, discuss
challenges that you encountered and how you tried to overcome them in
your design and implementation. Also explain the testing that you
did before submitting to Marmoset.
The document will be hand-marked, with 50% of the marks for
organization, clarity, and style, and 50% of the marks for