CS 115 Module 7: The online part

Hello CS 115 Students,

In our last in-person classes (for Grarme’s and Yangtian’s sections), we covered up to and including slide 42 of Module 7. Here are the “lecture notes” compiled for the rest of the module, in the style that Yangtian has been posting for some of the topics throughout the term, so that you may read along and finish the content of Module 7. Please see the announcements on Learn for future updates on all aspects of the course.

Thank you for your understanding and stay safe.
Graeme, Sandy, Yangtian

Local expressions

Introduction

At this time, you should know how to write a recursive function that consumes a list and a number using the abstract list functions.

Here is a new question: how do you write a function that consumes a value along for the ride?

Defining a function within another

For example, the one in Slide 43:

Suppose we want to use abstract list functions to solve:

Write a function multiples-of that consumes a positive integer, n, and
a list of integers, ilist, and produces a new list containing only those
values in ilist which are multiples of n.

So it might be tempting to write this:

(define (is-mult? m)
    (zero? (remainder m n)))
(define (multiples-of n ilist)
    (filter is-mult? ilist))

However this does not work! In function is-mult, we have used n, but n is not defined as a variable or a parameter.

Can we simply change the parameters of is-mult?
No, filter requires the function that is passed in to take only one parameter.

Wouldn't it be nice if we can somehow define the function is-mult? in side of multiples-of? That is the only place we can get access to n. Note that the goal here is to make sure is-mult? works for every single n, not just for some fixed n; therefore, we cannot simply define n to be a constant.

So far, the functions and special forms we’ve seen can be arbitrarily nested – except define and check-expect. We can only make definitions at the top level, outside of any expressions.

To define a constant and function within the context of a function, the intermediate language provides the special form local, which creates a series of local definition and an expression using them, of the form

(local [def1 ... defn] exp)

The produced value of this special form is the evaluated value of exp. In exp we can make use of the definitions def1, ..., defn.

So to define is-mult? within multiples-of:

(define (multiples-of n ilist)
    (local
        [;; (is-mult? m) produces true if m is a multiple of n,
         ;; and false otherwise
         ;; is-mult?: Int -> Bool
         (define (is-mult? m)
            (zero? (remainder m n)))]
        (filter is-mult? ilist)))

Note the following:

In summary, in this example we are using local for defining a function within another function, for the inner function needs to access a parameter in the context of the outer one.

There is actually many use of local, and we will see some of the most important ones.

Defining a constant within a function (part 1)

Sometimes we would like to define a constant within a function to document parts of the program that looks confusing.

In this example, we are re-writing swap-parts from Module 2:

(define (mid num)
    (quotient num 2))
(define (front-part mystring)
    (substring mystring 0 (mid (string-length mystring))))
(define (back-part mystring)
    (substring mystring (mid (string-length mystring))))
(define (swap-parts mystring)
    (string-append (back-part mystring) (front-part mystring)))

Note that in front-part and back-part, we are repeating the function application (mid (string-length mystring)). This function application looks somewhat confusing and make the code a bit hard to read.

Now that we have introduced local we can make use of it. In this case, we are defining a constant mid instead of using another function for computing the middle index of the string:

(define (swap-parts mystring)
    (local
        [(define mid (quotient (string-length mystring) 2))
         (define front (substring mystring 0 mid))
         (define back (substring mystring mid))]
        (string-append back front)))

Instead of using three functions that compute the front and back of the string, we now just let the constants storing their values. Not only do we have fewer functions to write, the code looks a lot cleaner; we don't need to pass strings into multiple functions at a bunch of different points.

Also note that we are using mid in the definition of front and back. Just as global definitions, you must define a constant or a function before using it in another definition.

Something you might also have noticed in this example is that the version in Module 2 computes (mid (string-length mystring)) twice, and in the version that we uses local, the middle index is only computed once. This means that our new version is more efficient since it is doing less work. Efficiency as a topic will be discussed in CS 116 in more detail.

Re-using names in local definition

Previously, we have showed that you can reuse the names as long as there is no place where two identifiers of the same name being introduced at the same scope.

In the following example, n is both bound to a value and is also a name of a parameter under different contexts.

(define n 10)
(define (myfun n) (+ 2 n))
(myfun 6)

A local definition, in fact, enables us to define new names as constants and functions in the context of the function.

(define (my-fun n)
  (local
    [(define (local-fun n) (* n 10))]
    (+ n (local-fun n))))

In this example, the n in the expression part of local refers to the parameter of my-fun, and n in the body of local-fun refers to the parameter n of local-fun. In this example, the two ns will have the same vlaue

To confirm this in Dr. Racket, paste the above code to the editor, click check syntax and hover over the two parameters. These are what you should be seeing:

A screenshot in Dr.Racket

Nested local expressions

Sometimes using local at the beginning of the function does not make sense, since the definition of the data might make additional assumptions which is true in some part of the code.

(define (list-template-with-local alist)
  (cond
    [(empty? alist) ...]
    [else 
      ;; this local can't be moved to the top level of this function, since alist might be empty.
      (local
        [(define x (* 2 (first alist)))]
        ...)]))

Since local itself is treated as an expression, we can nest one local within another:

;; What does this expression produce?
(local
  [(define x 1)]
  (+ 1 
     (local 
       [(define y 2)]
       (+ x y))))

However, as you can see, having more than one local expression in a function definition hurts readability. That is why we are not using them often!

Defining a constant within a function (part 2)

The important observation in a purely functional world is that given a subexpression in the same context (that means, in the same function or local expression body) always produces the same value. If in the same context the same subexpression appears twice, the value produced is the same, but the expressions are evaluated twice.

To make sure we are only computing a subexpression once, we can pre-compute it and store its value in an local-ly defined constant. Every single time we would like to use the value of the subexpression, we can just refer to the constant.

For example, in Slide 54:

Suppose you are trying to calculate the roots of a quadratic equation
ax2+bx+c=0a x^2 + bx + c = 0. Assuming they exist, they are calculated as bb24ac2a\frac{-b-\sqrt{b^2-4ac}}{2a} and b+b24ac2a\frac{-b+\sqrt{b^2-4ac}}{2a}.

Without local you have to compute the b24ac\sqrt{b^2-4ac} twice:

(define (roots a b c)
        (list (/ (- (- b) (sqrt (- (sqr b) (* 4 a c)))) (* 2 a))
              (/ (+ (- b) (sqrt (- (sqr b) (* 4 a c)))) (* 2 a))))

Now, with local, you can factor that part out and define it in a constant:

(define (roots a b c)
    (local
        ;; d is the square root of determinant
        [(define d (sqrt (- (sqr b) (* 4 a c))))]
        (list (/ (- (- b) d) (* 2 a))
              (/ (+ (- b) d) (* 2 a)))))

Isn't it better?

Sometimes a expression can take long to compute. Consider the two versions of the same function:

;; (list-max alon) produces maximum in alon.
;; list-max: (listof Num) -> Num
;; requires: alon is nonempty
(define (list-max alon)
  (cond
    [(empty? (rest alon)) (first alon)]
    [else 
     (cond
       [(> (first alon) (list-max (rest alon))) (first alon)]
       [else  (list-max (rest alon))])]))

(define (list-max2 alon)
  (cond
    [(empty? (rest alon)) (first alon)]
    [else
     (local [(define max-rest (list-max2 (rest alon)))]
       (cond
         [(> (first alon) max-rest)  (first alon)]
         [else max-rest]))]))

The only difference between them is that in the second version, we are using a constant max-rest to denote the maximum of the rest of the alon.

Let's see when given a big list, how long does it take to find out the maximum?

All of the teaching languages have a utility function time which keeps track of the time taken to compute a certain expression and displays it to the screen. You do not need to know this for the final evaluation.

Entering these in the interactions window on a gaming computer (that is, a computer with decent hardware):

> (define big-list (range 0 25 1))
> (time (list-max big-list))
cpu time: 53026 real time: 53180 gc time: 106
24
> (time (list-max2 big-list))
cpu time: 0 real time: 0 gc time: 0
24

Note that the time is rounded to nearest milliseconds. You might see different value for time since different computers have different speeds when executing code. You will see a more detailed discussion on how to measure efficiency of code in CS 116 or CS 136.

While the two function produces the same result, but we see a significant difference in time. Why is that?

You might think since there is one extra duplicated function application, the time taken by the first version will merely be double of the second version. This seems counter-intuitive, but think about it in this way: the first version makes two recursive calls to the rest of the list. Given a list of length 3, say (list 1 2 3), list-max will be used in the following ways:

So in the first version we have used this function 15 times! This is something called exponential growth, and is extremely bad. We programmers would like to avoid it as much as possible.

In contrast, the second version only invokes list-max2 four times: each time applying it on a list of length 3,2,1 and 0 respectively.

In summary, sometimes we would like to create a constant for intermediate results, so we do not need to compute the same expression twice. In some cases these intermediate results can take a lot of time to compute, especially when the intermediate result is a recursive application; in this case, putting the intermediate result as a constant is required for efficiency.

Defining a constant within a function (part 3)

An approach for writing clean and readable code is use a meaningful names instead of comments to explain the code. Sometimes we just want to document our code by using a name attached to a value (i.e. a constant).

The following code illustrates this idea.

;; distance: Point Point -> Num
(define (distance point1 point2)
  (sqrt (+ (sqr (- (first point1) (first point2)))
           (sqr (- (second point1) (second point2))))))

(define (distance point1 point2)
  (local [(define delta-x (- (first point1) (first point2)))
          (define delta-y (- (second point1) (second point2)))
          (define sqrsum (+ (sqr delta-x) (sqr delta-y)))]
     (sqrt sqrsum)))

In this example, without giving the (- (first point1) (first point2)) a name, it might take a while to figure out that we are computing the difference in the x-coordinate of two points; however, giving this part a descriptive name (delta-x) makes the program much more readable.

Using local for encapsulation

encapsulation - noun. the action of enclosing something in or as if in a capsule.

-- Oxford Dictionary of English

In the context of computer science, encapsulation is often used to describe the action of grouping items together. In addition, this term also refers to hiding information that other people such as users of the code do not need to see.

For example:

(define (insert n alon)
  (cond
    [(empty? alon) (cons n empty)]
    [(<= n (first alon)) (cons n alon)]
    [else (cons (first alon) (insert n (rest alon)))]))

(define (my-sort alon)
  (cond
    [(empty? alon) empty]
    [else (insert (first alon) (my-sort (rest alon)))]))

In this case, my-sort is the function other programmer will use to sort their list. It does not seem that they will be using insert. We can then hide the implementation of insert in my-sort.

(define (my-sort alon)
  (local
      ;; (insert n alon) produces sorted list including n and all 
      ;; values in alon 
      ;; insert: Num (listof Num) -> (listof Num)
      ;; requires: alon is sorted in nondecreasing order
    [(define (insert n alon)
       (cond
         [(empty? alon) (cons n empty)]
         [(<= n (first alon)) (cons n alon)]
         [else (cons (first alon) (insert n (rest alon)))]))]
    (cond
      [(empty? alon) empty]
      [else (insert (first alon) (my-sort (rest alon)))])))

Again, for function definitions within local you only need contract and purpose.

Another Example

Here is an example where you might need all of the techniques above.

Here is another example, shorter-than-avg, in slide 69.

Use abstract list functions and local to write a function
shorter-than-avg that consumes a list of strings,
slist, and produces a list of all those strings in slist
that are shorter than the average length of all strings in slist.

Try write it yourself before looking at the model solution.

Click here to check your answer

;; shorter-than-avg: (listof Str) -> (listof Str) 
(define (shorter-than-avg slist)
   (cond
      [(empty? slist) empty]
      [else
         (local
            [(define all-lengths (map string-length slist))
             (define sum-lengths (foldr + 0 all-lengths))
             (define avg-len (/ sum-lengths (length all-lengths)))
             ;; shorter-than?: String -> Boolean
             (define (shorter-than? s)
                (< (string-length s) avg-len))]
           (filter shorter-than? slist))]))

Anonymous ("single use") helper functions


Note
At this time you should switch the language in Dr.Racket to Intermediate Students With Lambda.


Sometimes the helper function we write does not involve recursion itself and does only exists in the context of the main function, where it is passed as an argument to another function (in this case, likely an abstract list function):

(define (longer-strings los n)
  (local
     [(define (longer? s)
         (> (string-length s) n))]
     (filter longer? los)))

In this case, longer? is defined so it can be passed into filter.

Sometimes even just come up with a name of the helper function is a challenging task; sometimes a function is just used once since it will be passed into an abstract list function; sometimes a function is just short enough that writing another definition for it might just does not make sense.

lambda creates a function without giving it a name, hence we call functions defined in this way anonymous functions. Some examples of lambda definition looks like this:

;; A function that consumes a number x and produces 1 + x * x
(lambda (x) (+ 1 (sqr x)))

;; The function neg-combine (Slide 39), writing in lambda form.
(lambda (item result-on-rest) 
  (cons (- item) result-on-rest))

More generally,

(lambda (p1 p2 ... pN)
  function-body)

creates an anonymous function that consumes N parameters and produce the value the expression would evaluate to in function-body.

As mentioned in the beginning of the module, function is treated as a value. lambda creates a function, hence we do not simplify lambda expressions.

However, if you enter a lambda expression in the interactions window in Dr. Racket, it will rename the parameters and replace the body with ....

> (lambda (x) x)
(lambda (a1) ...)

Now instead of have a separate define for the function an abstract list function would require, we just write lambda functions in place of the function name you would normally give:

;; (first-points n a b) produces a line containing the points (list x y)
;; where x = 0,1,...,n-1, and y = a*x+b.

(define (first-points n a b)
   (build-list n
       (lambda (x) (list x (+ (* a x) b)))))

In fact the following two definitions are equivalent:

(define (double k) (* k 2))

(define double (lambda (k) (* k 2)))

In Module 1, we said a function application has the form (f a1 a2 ... an), where f has to be the name of a function.

Now the semantics have been changed due to the fact that we are regarding function as a value: f just needs to be a function value, so we can use a lambda expression in it's place:

((lambda (x) (* 2 x)) 4) => 8
((lambda (s t) (+ (string-length s) t)) "hello" 10) 
=> (+ (string-length "hello") 10)
=> (+ 5 10)
=> 15

When to use lambda?

In general, use lambda when the function

An example: count-starters

Complete count-starters that consumes a list of non-empty strings (called phrases) and a string of length one (called start) and produces the number of strings in phrases that start with start.

Try write it yourself before looking at the model solution. Remember, your solution should use lambda with some abstract list functions!

Click here to check your answer

;; Solution 1: use foldr, add 1 once seeing a string that starts with `start`
(define (count-starters phrases start)
  (foldr
    (lambda (s count)
      (+ count
        (cond [(string=? (substring s 0 1) start) 1]
              [else 0]))) 0 phrases))

;; Solution 2: use filter to get a list that only contains strings that start 
;; with start, and produce the length of this list.
(define (alt-count-starters phrases start)
  (length (filter (lambda (s) (string=? (substring s 0 1) start)) 
                  phrases)))


Small aside

This part explains why are we using the term lambda to denote functions without a name. It's interesting CS trivia and definitely not covered in class. If you don't want to read it, just skip this part.

Click here if you want to read it

The term lambda comes from Alonzo Church, a theoretical computer scientist who invented Lambda Calculus.

The most important feature of lambda calculus is the functions in this system do not have names.
In fact, the term "function" is not used; it is called abstraction in Lambda Calculus. Now you should see why earlier in the module each time we would like to make some part "abstract", we use a function value for that part.

For a short introduction, imagine we have a math function f(x)=x+1f(x) = x + 1. The lambda calculus equivalent is λx.x+1\lambda x. x + 1. In general, the format is a lower case greek letter lambda (λ\lambda), follow by a name for the parameter, followed by a dot and then the body of the function. Note that the name of the function (ff) is gone!

Church invented Lambda Calculus with the goal of using it as a notation for all of mathematics; however, Lambda Calculus is just not suitable for that. Instead, Lambda Calculus becomes a underlying model for computation for many functional programming languages. That's why many functional programming languages including Racket and Haskell incorporates a lambda character into their logo.

Would like to know more? Consider taking CS 442 when you are in fourth year.


Conclusion

This is the end of Module 7. By this time, you should be able to do all things the goals slide has outlined (please check it out!)