M06: Lists

Slides

Commentary

Video: Download m02/m02.00_pl_design

video m02/m02.00_pl_design

In general, a list of n items is an item followed by a list of n-1 items. With the caveat that a 0-item list is special, of course.

Lists can be built up, one item at a time, with cons. The simplest list is the empty list, signified by the value empty. Every list starts with the empty list.

Lists can be bound to constants; for example, concerts0 and concerts1.

This slide also shows a way of visualizing lists. The empty list is shown as a solid black bar. You’ll see it at the end of every list. Elements of a list are shown as boxes.

Lists are typically drawn with empty on the right. The first item to be added will be right beside it. The most recently added item will be on the left. This also matches the order in which the Racket code is written.

If you have programmed before, this may look a lot like an array. It’s not. The computer can easily access the ith element of an array. That’s not the case with a list. Lists are more restrictive but have other advantages.

By the way, The Water Boys is an auditioned UW a cappella singing club that competes internationally. DaCapo is a local chamber choir growing out of UW-affiliated Conrad Grebel that focuses on new music.

The list of primitive (most basic) functions on lists is short and sweet. Watch the video to learn more.

Video: Download m06/m06.30_basic_constructs

video m06/m06.30_basic_constructs

The primary tools for extracting values from a list are first and rest. They may be used in combination to extract any value in the list.

The following will extract the second item, "eggs", from the list.

This function should handle any list of concerts, which includes the empty list. Applying first to an empty list produces an error, so we need to test for the empty list and handle it specially.

Note that the examples make use of the “concerts” “a” and “b” to keep them more concise.

If a function produces a Boolean value it may be more natural to write the body as a Boolean expression. A first attempt seems simple: check if the first element of the list (that is, (first los)) is the same as the second element, (first (rest los))).

That works great, as long as the list is at least two elements long. If there are either zero or one concerts it fails miserably when either first or rest is applied to an empty list (try it!).

The function makes use of short-circuit evaluation to avoid those errors. The order of the arguments to and matters. The check for whether los is empty must come first (why?).

The four images at the bottom of the slide illustrate the four examples.

The functions on the previous slides include contracts but using a new notation. Let’s understand that notation. The fundamental question is “list of what?”.

The left-hand-side of the contract should be as general as possible. If your function can correctly process a list of numbers such as (cons 1.1 (cons 22/7 empty)) use (listof Num) in the contract rather than the overly restrictive (listof Int) or (listof Nat).

On the other hand, the right-hand-side of the contract should be as specific as possible. For example, if your function always produces a natural number, use Nat in the contract rather than Int or Num.

We don’t currently have a way of writing a contract for a function consuming a list of strings and symbols such as (cons "Hello" (cons 'world empty)) – but we will.

Reviewing the contracts for next-concert and same-consec? indicates they both consume a list of concerts. Concerts are represented by strings, so they both consume a (listof Str). next-concert produces a string; same-consec? produces a Boolean. So their contracts are:

next-concert: (listof Str) -> Str
same-consec?: (listof Str) -> Bool

Formalizing list values (the top three lines of the slide) show us how to construct new list values.

  • empty is a value, by definition.
  • (cons 1 empty) is a value by the second rule and the fact that 1 and empty are both values.
  • (cons 2 (cons 1 empty)) is a value by the second rule and the fact that 2 is a value and, as we just established, (cons 1 empty) is a value.
  • (cons "code" (cons 2 (cons 1 empty))) is a value by the second rule and the fact that ….

We can also run that logic backwards. “(cons 1 (cons 2 empty)) is a list by the second rule and because 1 is a value and (cons 2 empty) is a value. We know that (cons 2 empty) is a value by rule 2 because 2 is a value and empty is a value.”

This slide looks very similar to the slide “Basic list constructs”. What’s the difference?

Basic list constructs specifies that the function cons consumes any value and a list value.
We could have given contracts instead:

  • cons: Any (listof Any) -> (listof Any)
  • first: (listof Any) -> Any (requires the list to be non-empty)
  • etc.

This slide is affirming that values provided to cons can be provided by expressions. For example:

(define lst (cons 1 (cons 2 (cons 3 empty))))
(cons (first (rest lst)) (rest lst))

This is a valid expression because (first (rest lst)) and (rest lst) are both valid expressions.

We now add substitution rules for lists to our existing list of fourteen substitution rules, with examples of two in action.

There’s no substitution rule for cons because of its role in list values. If we would include it, it would be

(cons a b) => (cons a b) where a is a value and b is a list.

In other words, “an X is an X”. And how do we know not apply the substitution rule again? And again. And again. So, it’s not included.

Understanding the next dozen slides is really important. Data definitions, templates, and recursion are core concepts in CS135. Watch the videos in addition to reading the slides and doing the exercises.

Video: Download m06/m06.40_data_def

video m06/m06.40_data_def

We can use this data definition to show rigorously that (cons "a" (cons "b" empty)) is a (listof Str):

According to the data definition, it is a (listof Str) if (and only if) "a" is a string and (cons "b" empty) is a (listof Str).

"a" is certainly a string.

Is (cons "b" empty) a (listof Str)?

Because the template will be used many times to create functions, the video goes farther than the slides and includes as much of the design recipe as possible (nothing in the design recipe goes away once we start using templates). The video also suggests a way to integrate the template with your function development workflow.

Video: Download m06/m06.50_template

video m06/m06.50_template

Here’s the template developed in the video, except that it had a few extra ...s.

;; (listof-X-template lox) PURPOSE
;; Examples:
(check-expect (listof-X-template empty) ANSWER)
(check-expect (listof-X-template (cons X empty)) ANSWER)

;; listof-X-template: (listof X) -> Any
(define (listof-X-template lox)
  (cond [(empty? lox) ...]
        [(cons? lox) (... (first lox)
                          (listof-X-template (rest lox)))]))

;; Tests

Video: Download m06/m06.60_count_concerts

video m06/m06.60_count_concerts

Note that the only parts of the solution that are not in the template are the 0 and the + 1.

Unfortunately, our stepping tool doesn’t support condensed traces. So we’ll be seeing less of it from now on.

Fire up DrRacket and write this code! Be sure to make use of the listof-X-template.

A lot of the design recipe has already been done for you. Study it before going on to actually write the function body.

Think about the “three crucial questions” from earlier in the module:

  • What does the function produce in the base case?
  • What does the function do with the first element in a non-empty list?
  • How does the function combine the value produced from the first element with the value obtained by applying the function to the rest of the list?

As you answer each of these questions, fill in the corresponding part of the template.

Copy and paste your code for count-waterboys and adapt it to count-string. Or go through the whole design recipe from scratch – it would be good practise!

This is the final form of the listof-X-template. You may (should!) use it in your assignments for functions that consume a list. You do not need to include it explicitly.

Simple recursion is easier to use than other forms of recursion (accumulative and generative). We are introducing patterns of recursion as guidance for you so that you can steer clear of the other (harder) forms until we have need of them.

This is a rule of thumb or heuristic for identifying simple recursion.

In the first example, the only argument is one step closer to the base case (lst with the first element removed). Note that the data definition is in terms of the first element and the rest of the list. So “one step closer to the base case” means removing the first element. A list that is one item shorter is not “one step closer to the base case” unless it is shorter because it’s missing the first element. Removing the second element doesn’t count!

The second example is like the first except that there is an additional parameter that is passed along unchanged. The count-string exercise is an example of this.

In the third example, process is meant to capture doing something to lst other than removing the first element. In the fourth example, math-function applied to x indicates that x is changed. In both cases, the function is no longer “simple recursion”.

DrRacket’s Help Desk contains documentation for all of the built-in list functions. There are a lot of them that CS135 students will never need or use. Trust us that we’ll tell you about the ones that are useful when the appropriate time comes.

In the mean time, a few notes:

  1. In addition to the ones we’ve already covered (cons, first, rest, empty?, cons?, list?), the useful ones are append, length, member?, remove, and reverse. But don’t use append, remove, and reverse until we introduce them.
  2. Functions that begin with ‘c’, end with ‘r’ and have a mixture of ‘a’s and ’d’s in between are historical relics from Lisp. The updated versions in Racket are first, second, third, etc. first is used A LOT. secondeighth are used only very occasionally.
  3. We will often state that you cannot use a particular function in a give problem or assignment. You’ll lose marks if you violate that. Why do we do it? To guide you to a particular insight or solution that we think is important.

So far all our list functions have produced a single value – a number or a boolean. But there’s no reason it can’t produce a list. The contract specifies that negate-list produces a list of numbers.

The function so far is simply the listof-X-template with the function renamed to negate-list.

Consider the three questions from earlier:

  1. What should the function produce in the base case?
    This is answered by the first example. It’s a great idea to include the base case(s) as examples.
  2. What does the function do to the first element in a non-empty list?
    In this case, negate it. We already have (first lon) to get the first element. All we need to do is add (- ...) around it.
  3. How does the function combine the value produced from the first element with the value obtained by applying the function to the rest of the list?
    In count-concerts we combined 1 and the recursive application using +. In sum you would have combined the first number with the sum of the rest of the numbers using + again. In my-member? you may have combined Boolean results with or.
    So what operator do we have to combine a value with a list? cons.

This produces a full trace; not the condensed trace shown in the slide.

Try to come up with the non-empty list data definition and template on your own. Then watch the video.

Video: Download m06/m06.70_nelist

video m06/m06.70_nelist

Recall that Racket (and most other languages) represent a string value with double quotes: “Hello, world!” or “To be, or not to be, that is the question.”

Racket has quite a few string-processing functions built-in. Examples include string-append, string-downcase, string-length, and substring (See the Help Desk for more details.)

But what if those functions didn’t exist or we needed something very specific to our project? We need a way to rip a string apart into its underlying characters, work on those characters, and then (maybe) put them back together again.

Check it out in DrRacket’s interactions pane! Type in (string->list "Funky @*π012é") to see what you get. Use cut-n-paste if you don’t know how to produce π on your keyboard.

Note the contract. This is the first function we’re writing that consumes a character (abbreviated Char) or a (listof X) where X happens to be a character.

The characters in the list don’t seem to have any specific meaning, so naming the parameter based on its structure is OK. loc is short for “list of characters”.

In one of the tests the author chose to spell out the list of characters explicitly. In the first two the author chose to use string->list to easily generate the list of characters from a string. Either approach is fine.

Compare this function to the listof-X-template (it’s consuming a (listof Char) after all):

;; listof-X-template: (listof X) -> Any
(define (listof-X-template lox)
  (cond [(empty? lox) ...]
        [(cons? lox) (... (first lox)
                          (listof-X-template (rest lox)))]))

The second question we’ve been asking, “What does the function do to the first element in a non-empty list?” has a more interesting answer in this case. If the first element matches the character we’re counting, count 1. If it does not, count 0. Whichever it is (1 or 0), add it with the result of processing the rest of the list to get the final answer.

There is another way to structure this function that’s worth a look. It doesn’t match the data definition and template as clearly, but is often used.

;; count-char/list: Char (listof Char) -> Nat
(define (count-char/list ch loc)
  (cond
    [(empty? loc)            0]
    [(char=? ch (first loc)) (+ 1 (count-char/list ch (rest loc)))]
    [else                    (count-char/list ch (rest loc))]))

The (listof X) template is a starting place for our code. It’s OK for the final result to look different, as is the case here.

Finally, a third version. This works but is considered poor style because it can be trivially transformed into the simpler and more readable code shown above.

;; count-char/list: Char (listof Char) -> Nat
(define (count-char/list ch loc)
  (cond
    [(empty? loc)            0]
    [else 
       (cond [(char=? ch (first loc)) 
              (+ 1 (count-char/list ch (rest loc)))]
             [else                    
              (count-char/list ch (rest loc))])]))

This is different from the code shown in the slide because in the slide the else clause is not a cond. It is an addition (that happens to involve a cond).

“additions made to the semantic model” means the new stepper rules.