M03: Design Recipe

Commentary

The design recipe is a process for developing programs. The intent is to make your life easier by asking the right questions and performing the right tasks in the right order.

It’s really tempting, even for experienced programmers, to just jump in and start writing code. Don’t! The design recipe helps you think about the problem first so you can use your time efficiently.

You should have read the Thrival Guide by now. It makes sense to leave the Style Guide until after you’ve finished reading the M03 slides but before you start the assignment.

Some instructors say that this is the most important slide in the entire course.

The communication between you and the computer must describe precisely what the program should do, even though it may not be very readable by humans.

“The future” may be as little as five minutes later (depending on one’s short-term memory), or as much as years later.

Even programs that one expects will never be seen by others should be written as if they were; it will help in getting them working properly without wasting time, and one never knows when something will prove useful to others.

All programs written on assignments and exams are going to be read for marking purposes; that alone merits careful attention to communication.

A sample Racket comment:

;; Add x twice
(define (twice x) (+ x x))

Some definitions:

  • Correct: The software gives the correct answer.
  • Efficient: The program uses resources efficiently. Resources could include processor time or memory.
  • Maintainable: The program is easy to modify if a bug is found or a new requirement is added.
  • Portable: The program will work (without modification) on many different kinds of computers.
  • Testable: The program is built in parts, each of which can be easily tested for correctness.

We won’t define them all because that’s not the point. The point is that there are many possible goals when developing software.

The goals that will be most important in CS135 are correctness, readability, and testability. Efficiency will have a minor role towards the end of CS135 and a more major role in CS136.

The focus is on the process you follow while you develop the function. The design recipe is not something you tack on at the end. Tacking it on at the end turns it into a make-work project. At that point most of the benefit of the design recipe will have been lost.

Purpose: more English-like.

Contract: more math-like. So far, our functions have only consumed numbers and produced a number. We’ll soon see different kinds of data. As our data becomes more complex, the contract will take on an increasingly important role.

The function definition can be divided into two pieces, the header and the body. The header is the function name, parameters and the define keyword. The body is the expression that computes the result.

We’ll have much more to say about examples and tests in a little bit.

Course personnel will be reluctant to assist students who have problems with code that lacks a contract, purpose, or examples. Please do your best on them before you seek help.

It’s really common for students to use the following order instead of the suggested order:

  1. Attempt to write the function (step 5).
  2. Debug.
  3. Debug some more.
  4. Seek help.
  5. Finally get the function working for some inputs.
  6. Do some informal testing in the REPL.
  7. Fix a couple of bugs.
  8. Write the purpose and contract.
  9. Write down the examples and tests previously used for informal testing.
  10. Wonder why course staff makes you “waste” time on the design recipe.

Doing the design recipe steps in the suggested order is important.

As you read these slides on using the design recipe, think about how you would apply each step to write a function that sums the integers from 0 to n.

Purpose: The purpose says what the function is supposed to do. That’s helpful for potential users of your function. But it’s also necessary for you to write the function. If you don’t have a clear understanding of what the function is supposed to do, you’ll have a difficult time writing it.

The purpose says what the function does; it doesn’t say how.

The purpose is placed in a comment. Racket comments that take the entire line (as this one does) traditionally start with two semi-colons. Comments that appear at the end of a line of code traditionally start with a single semi-colon.
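As a small sketch of the two comment styles, here is the twice function from the earlier sample again. The wording of both comments is illustrative and not taken from the slides.

```racket
;; Produces double the value of x.
(define (twice x)
  (+ x x)) ; an end-of-line comment starts with a single semi-colon
```

The first comment takes an entire line, so it starts with two semi-colons; the second shares a line with code, so it starts with one.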

Examples: Examples serve three purposes.

The first purpose is to show the function’s user a typical use of the function. What does the code look like that uses this function?

The second purpose is a simple test to ensure it does what it is supposed to do. That is, given an example use of the function (the first purpose), what is the correct result? We write this as executable code, not a comment, so the computer can check it for us. More on this in a moment.

The third purpose is to go through the process of finding the answer used for that simple test. If you can’t find the answer manually, you surely can’t write a function that tells the computer how to do it.

Examples don’t need to be big or use large numbers. Usually the smallest non-trivial examples are the best: they’re easy to work out by hand and easy to verify. But don’t choose examples that deliberately avoid steps in the solution, either.

Include examples in your program using the built-in function check-expect. It takes two arguments. When you click “Run”, DrRacket will apply check-expect to the arguments. If the two arguments evaluate to the same thing, it will simply print “The test passed” in the interactions pane and go on to execute the rest of the program. If the arguments evaluate to different values, it will print those values and stop execution.
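A minimal sketch of examples written with check-expect, assuming DrRacket’s Beginning Student language. The function is the sum-of-squares mentioned in these notes; its body and the particular example values are assumptions made for illustration.

```racket
;; Examples: each expected value was worked out by hand first.
(check-expect (sum-of-squares 3 4) 25)      ; 9 + 16
(check-expect (sum-of-squares 0 2.5) 6.25)  ; 0 + 6.25

;; The body below is an assumed implementation.
(define (sum-of-squares n1 n2)
  (+ (sqr n1) (sqr n2)))
```

The first argument of each check-expect is a typical application of the function; the second is the answer derived independently of the code.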

Header: The header is the whole function except for the body expression. Most importantly, it includes the function name and parameter names.

You’ll often be given the name of the function in the assignment specification, although sometimes you’ll choose the name yourself. In that case:

  • you will already have chosen the name as part of writing down the examples;
  • choose a name that’s meaningful (the style guide has helpful suggestions).

Likewise, choose parameter names that are meaningful. Names like interest-rate and student-name are great. Some functions will just consume numbers that don’t have a specific meaning. In those cases n or i (if an integer) is fine.

Contract: The contract says what type of data the function consumes and what type of data it produces. The contract always contains an arrow. We’ll often typeset it as shown in the slide, but you should write it as a dash and greater-than sign (->) in your code.

The left side of the arrow will contain a data type for each parameter. The right side will contain the data type the function produces. We’ll discuss the possible data types a little later in this lecture module and add to them as the course progresses. For now, Num means any number (e.g. 3, 22/7, or π). That is, sum-of-squares consumes two numbers (one for n1 and one for n2) and produces another number.

Contracts give us an opportunity to carefully think through the data consumed by the function. Is it really all numbers or only integers? All integers or only non-negative integers?

Contracts may feel trivial now, when we only have a few data types to choose from. As we add more data types and techniques, the contracts will become more complex and offer real help in designing our functions. Looking ahead to M14, we’ll eventually see a contract such as (X Y -> Y) Y (listof X) -> Y.

Note that the contract begins with the name of the function and a colon.
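The contract described above might be written in code as follows. This is a sketch: the function body is an assumed implementation, and the exact placement of the contract relative to the other design recipe elements is set by the Style Guide.

```racket
;; sum-of-squares: Num Num -> Num
(define (sum-of-squares n1 n2)
  (+ (sqr n1) (sqr n2)))
```

The contract lists one type per parameter on the left of the arrow (Num for n1, Num for n2) and the type of the produced value on the right.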

Purpose: Now that the names of the function and its parameters are established, we can polish the purpose statement. It begins with the name and parameters, mimicking an application of the function. The parameter names are used in the purpose statement to clarify their roles.

Function Body: Finally, we’re ready to write the function body. Hopefully, after working out a number of examples by hand, this is reasonably easy to do.

Tests: The last step in the design recipe process is to write additional tests to cover any complexities not covered by the examples. We’ll have more to say about tests and the relationship between tests and examples a little later.

Tests are usually written after (in time) the function is written. That way the tests can take into account the specifics of the code. There is a school of thought that says tests should always be written first.

Tests are written after (on the page) the function.

Small, directed tests make it clear where the problem is when they fail. With one large test, it may not be clear that all of the code is exercised (this is not evident now, but will be once we see conditional expressions) and even if all of the code is exercised, it is hard to tell where the error is.
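Putting the steps together, one plausible layout of a finished function is sketched below. The ordering of the elements, the body, and the chosen test values are assumptions; the Style Guide is the authority on the required format.

```racket
;; (sum-of-squares n1 n2) produces the sum of the squares of n1 and n2.
;; Examples:
(check-expect (sum-of-squares 3 4) 25)

;; sum-of-squares: Num Num -> Num
(define (sum-of-squares n1 n2)
  (+ (sqr n1) (sqr n2)))

;; Tests: each one is small and targets a single situation.
(check-expect (sum-of-squares 0 0) 0)     ; both arguments zero
(check-expect (sum-of-squares -3 4) 25)   ; a negative argument
(check-expect (sum-of-squares 1/2 0) 1/4) ; a non-integer argument
```

If the negative-argument test fails while the others pass, the failure points directly at how negative numbers are handled.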

Working out the answer to a test “independently” does not necessarily mean with pencil and paper. It might involve a calculator or spreadsheet or published examples or …. The point is, you derive the answer without using the code you’re trying to test.

Implementing check-expect is actually pretty tricky – to the point that it didn’t exist when the first version of the textbook (the one CS135 is based on) was written. So the textbook uses testing methods that are now obsolete.

The last parameter to check-within is the tolerance. The example is actually checking that 1.414 - 0.001 <= (sqrt 2) <= 1.414 + 0.001 is true.
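A short sketch of check-within, assuming the slide’s example has the form shown on the first line. The diagonal function is hypothetical and only illustrates testing a function whose result is inexact.

```racket
;; Passes because (sqrt 2) lies within 0.001 of 1.414.
(check-within (sqrt 2) 1.414 0.001)

;; (diagonal side) produces the length of the diagonal of a square
;; with the given side length. Hypothetical helper for illustration.
;; diagonal: Num -> Num
(define (diagonal side)
  (* side (sqrt 2)))

(check-within (diagonal 1) 1.414 0.001)
```

check-expect is not suitable here because (sqrt 2) produces an inexact number, which cannot be compared for exact equality.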

The contract says what kind of data our function consumes and what kind of data it produces. Right now, the only kind of data we know about are numbers. But even here there are different kinds of numbers. A function might only work for integers, for example, and fail for non-integers.

As we saw in the previous video, contracts have the form _____ -> _____ where the left-hand side describes the data the function consumes and the right-hand side describes the data the function produces.

On the left, give the most general data type for which the function will always work. Suppose your function works for all Ints, fails for some Nums, and is typically used on Nats. Then the left-hand side should be Int. It’s not Nat because Int is more general.

On the right, give the least general data type. It’s incorrect to say Int -> Num if the function only produces integers. In that case, say Int -> Int.
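To illustrate both rules, consider the hypothetical function below (not from the slides). It relies on quotient, which accepts only integers, so it works for every Int but fails for a Num such as 2.5, even though it would typically be applied to Nats.

```racket
;; (dozens n) produces the number of whole dozens in n,
;; truncating toward zero. Hypothetical example.
;; dozens: Int -> Int
(define (dozens n)
  (quotient n 12))

(check-expect (dozens 30) 2)
(check-expect (dozens -30) -2)
```

The left-hand side is Int rather than Nat because Int is the most general type that always works. The right-hand side is Int rather than Num because the function only ever produces integers.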

Watch a demo of applying the design recipe. This example is more complex than sum-of-squares and illustrates how to handle a “helper function”.

Problem Statement: Write a function, sum-range, which sums the numbers from a to b. For example, (sum-range 3 6) should produce 3 + 4 + 5 + 6 or 18.

Video: Download m03/m03.50_dr_demo
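One possible write-up of sum-range is sketched below. It is not necessarily the approach taken in the demo. It assumes that a and b are natural numbers with a <= b, and it uses a hypothetical helper, sum-to, based on the closed form n(n+1)/2 for the sum of the integers from 0 to n.

```racket
;; (sum-to n) produces the sum of the integers from 0 to n.
;; Example:
(check-expect (sum-to 6) 21)

;; sum-to: Nat -> Nat
(define (sum-to n)
  (/ (* n (+ n 1)) 2))

;; (sum-range a b) produces the sum of the integers from a to b, inclusive.
;; Examples:
(check-expect (sum-range 3 6) 18)
(check-expect (sum-range 4 4) 4)

;; sum-range: Nat Nat -> Nat
;; requires: a <= b
(define (sum-range a b)
  ;; (sum-to b) - (sum-to a) drops a itself, so add it back.
  (+ (- (sum-to b) (sum-to a)) a))

;; Tests:
(check-expect (sum-range 0 5) 15)
```

The helper gets its own purpose, example, and contract, just like the main function.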

In the sample code, we have no idea what type is associated with the value of q unless we trace through the evaluation of (mystery-fn 5) or trust its contract.

Errors in a dynamically typed program are only found if the code is executed – and then only if the right values are used (think division by zero). Sometimes those errors are found by users, years after the program is “finished”.
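The division-by-zero case can be sketched with a hypothetical function (the name and values are illustrative only):

```racket
;; (average-speed distance time) produces distance divided by time.
;; average-speed: Num Num -> Num
(define (average-speed distance time)
  (/ distance time))

;; This use runs without any problem.
(check-expect (average-speed 100 4) 25)

;; (average-speed 100 0) would signal a division-by-zero error,
;; but only if that application is actually evaluated.
```

Nothing about the definition is flagged in advance; the problem appears only when some caller supplies 0 for time.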

In contrast, a statically typed language is processed by the computer (compiled) before it is executed. The compilation process can find a large class of errors automatically so they can be fixed before the program is allowed to run.

Note that when we say “all arguments…will obey the contract”, that’s what we (the course instructors) think the contract is. That might be different from what you write in your assignment!

Now would be a good time to read the Style Guide. It gives lots of concrete direction on how to write up your assignments.

Most programming teams have some kind of style guide that specifies how code is formatted, how identifiers are chosen, what documentation is expected, which techniques to prefer (or avoid), etc. The goal is code that is easier to read, understand, and maintain. Everyone on the team is expected to follow the style guide.

You are part of the CS135 team. It consists, at a minimum, of you, your instructor(s), and the staff that are marking your assignments and exams. As a member of this team, you are expected to follow the CS135 style guide.