CS 241 -- Winter 2012 -- Assignment 3

Assignments 3 and 4 may be done in either Scheme, C++, or Java. See language-specific notes for each option at the end of this document.

In assignments 3 and 4, you will incrementally write an assembler for MIPS assembly language (CS241 dialect).

Note carefully: In order to do assignment 4, you must do assignment 3 first. Furthermore, one of the problems on assignment 5 cannot be done if you have not completed assignment 4. We will not be distributing a solution to assignment 3 for you to use as a starting point for assignment 4. Similarly, we will not be distributing a solution to assignment 4 as a starting point for assignment 5.

Submission deadline: Thursday, January 26, 2012 at 10:00 pm

Reminder: For this and future assignments, be sure to run the command source /u/cs241/setup to gain access to the CS 241 tools.

Part I - Scheme or C++ or Java

This exercise allows you to brush up on your Scheme, C++, or Java programming skills. It also ensures you know or learn how to use these languages in the Unix environment.

Assignment 3, Problem 1 — Scheme, C++, or Java in the Unix Environment (2 marks of 60) (filename: output)

Do one of the following options (you may do more than one, but one is sufficient for full marks):

A3P1 Option 1: Scheme

The following Scheme program reads a list of numbers from the standard input, and prints them to the standard output, one number per line:
#lang scheme
(define (copyinput)
 (let ((i (read)))
  (cond 
   ((eof-object? i) '())
   (#t (display i) 
       (display "\n")
       (copyinput)
))))

(void (copyinput))
Use an editor to create a file called Numbers.ss containing the above program.

Run the program using the command: racket Numbers.ss

Enter some numbers to test the program. To signal the end of your input, press Ctrl-D at the beginning of a new line.

Use an editor to create an input file for the program called input containing the sequence of numbers 1 2 3 4 5.

Run the program, redirecting standard input from the file named input as follows: racket Numbers.ss < input

The numbers should be printed to your terminal, one number per line.

Run the program again, this time redirecting standard input from the file named input and standard output to a file named output, as follows: racket Numbers.ss < input > output

View the contents of the output file using the cat command: cat output

Submit to Marmoset the output file output that was generated by running the program.

A3P1 Option 2: C++

The following C++ program reads a list of numbers from the standard input, and prints them to the standard output, one number per line:
#include <iostream>

using namespace std;

int main() {
  while(true) {
    int i;
    cin >> i;
    if(cin.fail()) break;
    cout << i << endl;
  }
}
Use an editor to create a file called Numbers.cc containing the above program.

Compile the program using the command "g++ -o Numbers Numbers.cc".

This command will create a file called Numbers containing the compiled code.

Run the program using the command: ./Numbers

Enter some numbers to test the program. To signal the end of your input, press Ctrl-D at the beginning of a new line.

Use an editor to create an input file for the program called input containing the sequence of numbers 1 2 3 4 5.

Run the program, redirecting standard input from the file named input as follows: ./Numbers < input

The numbers should be printed to your terminal, one number per line.

Run the program again, this time redirecting standard input from the file named input and standard output to a file named output, as follows: ./Numbers < input > output

View the contents of the output file using the cat command: cat output

Submit to Marmoset the output file output that was generated by running the program.

A3P1 Option 3: Java

The following Java program reads a list of numbers from the standard input, and prints them to the standard output, one number per line:
import java.util.*;

public class Numbers {
    public static final void main(String[] args) {
        Scanner s = new Scanner(System.in);
        while(s.hasNext()) {
            int i = new Integer(s.next());
            System.out.println(i);
        }
    }
}
Use an editor to create a file called Numbers.java containing the above program.

Compile the Java program using the command: javac Numbers.java

This command will create a file called Numbers.class containing the compiled code.

Run the program using the command: java Numbers

Enter some numbers to test the program. To signal the end of your input, press Ctrl-D at the beginning of a new line.

Use an editor to create an input file for the program called input containing the sequence of numbers 1 2 3 4 5.

Run the program, redirecting standard input from the file named input as follows: java Numbers < input

The numbers should be printed to your terminal, one number per line.

Run the program again, this time redirecting standard input from the file named input and standard output to a file named output, as follows: java Numbers < input > output

View the contents of the output file using the cat command: cat output

Submit to Marmoset the output file output that was generated by running the program.

Assignment 3, Problem 2 — Scheme/C++/Java Review (7 marks of 60) (filename: twice.ss or twice.cc or Twice.java)

Write a Scheme, C++, or Java program that reads a list of integers from standard input, and prints the list twice, one integer per line. Your program should first print the list in the same order as given in the input; it should then print the numbers again in the same order. You may use, but are not required to use, any part of the code we have provided for Problem 1 above. Your solution must be reasonably efficient.

Test your program.

Submit a file called twice.ss or twice.cc or Twice.java containing the Scheme, C++, or Java source code for your program.

Part II - Writing an Assembler

For the remaining problems in assignment 3 and assignment 4 you will implement an assembler for progressively larger subsets of MIPS assembly language. Subject to the assumptions stated in the problems, your assembler must report all errors, and must correctly translate all correct assembly language programs to MIPS machine language.

We have provided a scanner (also called a tokenizer) for MIPS assembly language for each available language option (see language-specific notes).

Each problem in Part II requires you to submit a program that reads from standard input and writes to standard output as well as standard error. The input and output specifications are identical regardless of which language you choose. The only difference is that you must submit the appropriate .ss, .cc, or .java file depending on your choice of language.

For each problem, we ask you to implement support for additional instructions. You may submit the same assembler for all the problems. We encourage you to submit to Marmoset early. As soon as you implement support for the instructions specified by a problem, submit the current version of your assembler to Marmoset. That way, if you do not complete all of the problems before the deadline, you will still get credit for those that you did complete.

Hint: Depending on the design decisions you make in your solutions to problems 3 and 4, you may have to restructure your code to get a working solution to problem 5. Therefore, you may want to read and understand all of the problems (especially up to and including problem 5) before beginning problem 3. However, if you find this overwhelming, you may find it easier to just focus on solving problems 3 and 4 first, and deal with problem 5 when you come to it. The decision is yours.

Assignment 3, Problem 3 (17 marks out of 60; filename: asm.ss or asm.cc or Asm.java)

Begin by writing an assembler that correctly translates input containing no labels and no instructions other than .word. You may assume that the input to your assembler contains no labels and no instructions other than .word.

Your assembler should never crash, even if the input is not a valid assembly language program. Your assembler should not silently ignore errors in the input program. If the input contains a line that is not valid in MIPS assembly language, your assembler should print an appropriate error message containing the word ERROR in all capitals to standard error and stop. It is good practice, but not a requirement, to embed ERROR within a meaningful error message.

Hint: there are relatively few ways in which an assembly language program can be valid (and all the valid forms are spelled out here), but many ways in which it can be invalid. You will find it much easier to write code that looks for valid input and rejects everything unexpected, rather than code that explicitly looks for all the different ways in which the input could be invalid.

If the input contains a correct MIPS assembly language program, your assembler should output the equivalent MIPS machine language to standard output.

Assignment 3, Problem 4 (17 marks out of 60; filename: asm.ss or asm.cc or Asm.java)

Add support for label definitions to your assembler. Other than the inclusion of label definitions, the restrictions, assumptions and output requirements (including error-checking) stated in problem 3 apply to problem 4.

In addition, if the input is a correct MIPS assembly program, your assembler should output a symbol table: a listing of the names and values of all defined labels to standard error. The list should be printed on several lines, one line for each label in the input. Each line should consist of the label (without the trailing colon), followed by a space, followed by the value of the label (in decimal). The labels may appear in the symbol table in any order.

In handling labels, you may use any data structure or data structures you choose, but be sure to take efficiency into account.

Assignment 3, Problem 5 (17 marks out of 60; filename: asm.ss or asm.cc or Asm.java)

Modify your assembler to allow labels to be defined and also to be used as operands.

Other than the inclusion of label definitions and labels as operands, the restrictions, assumptions, and output requirements (including error-checking) stated in problem 3 apply to problem 5. (Note that you need not list the names and values of defined labels as in problem 4.)

Assignment 4

Submission deadline: Thursday, February 2, 2012 at 10:00 pm

Note: The restrictions, assumptions, and output requirements (including error-checking) as stated in assignment 3 apply throughout assignment 4 as well. In addition, your solution for each problem should continue to be a correct solution for each problem that came before it (for example, a correct solution of A4P3 will also meet the requirements of A4P1 and A4P2).

Assignment 4, Problem 1 (9 marks out of 60; filename: asm.ss or asm.cc or Asm.java)

Modify your assembler to correctly handle jr and jalr instructions.

Assignment 4, Problem 2 (9 marks out of 60; filename asm.ss or asm.cc or Asm.java)

Modify your assembler to correctly handle add, sub, slt, and sltu instructions.

Assignment 4, Problem 3 (9 marks out of 60; filename: asm.ss or asm.cc or Asm.java)

Modify your assembler to correctly handle beq and bne instructions with an integer or hex constant as the branch offset.

Assignment 4, Problem 4 (9 marks out of 60; filename asm.ss or asm.cc or Asm.java)

Modify your assembler to correctly handle beq and bne instructions with a label as the branch target operand.

Assignment 4, Problem 5 (8 marks out of 60; filename: asm.ss or asm.cc or Asm.java)

Modify your assembler to correctly handle the lis, mflo, and mfhi instructions.

Assignment 4, Problem 6 (8 marks out of 60; filename asm.ss or asm.cc or Asm.java)

Modify your assembler to correctly handle the mult, multu, div, and divu instructions.

Assignment 4, Problem 7 (8 marks out of 60; filename asm.ss or asm.cc or Asm.java)

Modify your assembler to correctly handle the sw and lw instructions.

Your assembler should correctly translate any MIPS assembly language program, and write ERROR to standard error for any input that is not a valid MIPS assembly program.

Language-Specific Details

Scheme

The provided starter asm.ss has a function called scan that takes as input a string and returns a list of tokens.

The Using Scheme in CS241 document contains hints and techniques for using Scheme to write the assembler. See also the comments in the provided scanner.

C++

The provided starter asm.cc has a function called scan that returns a vector of tokens.

You are strongly advised to check for pointer-related errors by vetting your programs with valgrind. To do this, run "valgrind program optionsAndArguments" instead of just "program optionsAndArguments" on a Linux or Macintosh computer — valgrind is not available for the Solaris or Windows environments. Marmoset will run your submissions with valgrind as well, and will reject any submission that is reported to leak memory. Be aware that running valgrind increases the execution time of your program by a factor of 5 to 20.

Java

This provided starter Asm.java has a method called Token[] scan(String) that takes as input a string intended to be one line from the assembly language source file. The method partitions the string into tokens and returns an array of Token objects.

The provided scanner ignores all comments (anything on a line following a semicolon) and whitespace.

Each Token contains a field called kind, which indicates what kind of token it is (such as an integer, register, or instruction), and a field called lexeme, which contains the original text that makes up the token. Also, the Token contains a method called toInt(), which returns the integer representation if the token is a decimal or hexadecimal integer constant, or the register number if the token is a register.

The provided scanner comes with a main driver class called Asm. The provided class reads from standard input one line at a time, feeds each line to the scanner, and prints the tokens that the scanner produces. The loop that prints the tokens is for illustration purposes only; comment it out or remove it before beginning to work on the assignment.

Your task is to start with the provided driver class, and extend it into an assembler for MIPS assembly language. Your assembler should output a MIPS machine language binary on standard output. Hint: If you have a one-byte value stored in a Java int, the function System.out.write(int) will output this byte directly to standard output. You can use the xxd command in Unix to convert your binary output back to hexadecimal so you can read it.

You may find it useful to refer to the Java standard class library documentation.