Orders of Growth
What does it mean to say a procedure takes Θ(N) time? The lecture notes give a pretty formal definition, but the basic idea is that this means "the amount of work I do is proportional to N", where N is something like "the number of items in the input sentence". Thus, if I have twice as many items, I have to do twice as much work. (We also call this "linear" time.) A good example of this is the COUNT procedure, which has to look at every word in the sentence to know how long it is.
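For reference, a straightforward recursive COUNT might look like this (a sketch; any equivalent definition behaves the same way):

(define (count sent)               ; how many words are in SENT?
  (if (empty? sent)
      0                            ; empty sentence: no words
      (+ 1 (count (bf sent))) ))   ; one word, plus however many follow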
How about some other cases? Θ(1), or "constant" time, applies to things that don't care how big the input is. A good example here is FIRST. You can pass in a sentence with one word or a million words, and it will take the same amount of time to get the first word.
Let's try a few practice problems. If it takes 10 minutes for me to process a sentence of N words on my laptop, and I borrow a friend's server that works four times as fast, how long of a sentence can I process in 10 minutes if my algorithm...
- ...takes Θ(N) time? Answer: 4N words.

  This is the same sort of example we did before. Since work is proportional to the length of the sentence, and we can do four times as much work, we can handle sentences that are four times as long. Pretty simple.
- ...takes Θ(N²) time? Answer: 2N words.

  A little bit trickier! Again, we phrase it as "work is proportional to the length of the sentence, squared". So a sentence that's twice as long requires four times as much work. When would this happen? Remember the card-sorting example from discussion.

  A quick way to use your intuition to your advantage: just because you can sort two hands of cards, (2 3 10) and (5 6 K), doesn't mean you can put them together and get one sorted hand! That shows it's harder to sort one big hand than two smaller ones.
- ...takes Θ(2^N) time? Answer: N+2 words.
  This is hard! The book mentions the prototypical Θ(2^N) algorithm: Fibonacci. To find (fib N), you need to know the answers for N-1 and N-2. To find the answer for N-1, you need N-2 and N-3... and so on. You end up making about 2^N calls to FIB (there's a sketch of FIB after these problems). Θ(2^N) problems are horrible because work is proportional to an exponential: adding one word to the sentence (or asking for the next Fibonacci number) doubles the time. So even though our computer speed quadrupled, and 4 = 2², we only get two more words in the input sentence before it takes as long as it did before.
- ...takes Θ(1) time? Answer: N words, but it's a trick question.
  The example here again is FIRST. If I really have a laptop so slow that FIRST takes 10 minutes, it will take 10 minutes on any size input. Now, of course a faster computer can usually do these simple things more quickly as well! But let's say the delay is inherent: it sends information to the Moon and back, for no good reason. Being able to do "more" work doesn't help if the computer is just sitting idle waiting for an answer.

  This sort of question usually won't be phrased like this; instead, it will be more like "if I put in a sentence that's twice as long, how much longer does it take?" The answer, of course, is that it takes the same amount of time no matter how long the sentence is.
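Since FIB came up above, here's the prototypical tree-recursive version (a sketch; the book's definition is essentially this):

(define (fib n)
  (if (< n 2)
      n                        ; (fib 0) = 0 and (fib 1) = 1
      (+ (fib (- n 1))         ; every other call spawns two more calls...
         (fib (- n 2)) )))     ; ...which is where the 2^N blowup comes from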
Occasionally you will also see things like O(N) (instead of Θ(N)). This just means a procedure's work is at most proportional to the length of its input, i.e. if I put in twice as many words, it can take up to twice as long. Thus, FIRST is Θ(1), which means it is O(1), but it is also O(N), because doubling the length of its input takes no more than double the time. Conversely, COUNT is O(N) but not O(1), because doubling the length of its input does affect the time. It's just good to know the difference, though sometimes people say O when they mean Θ.
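In symbols, roughly (a sketch; the lecture notes state this carefully):

  T(N) is O(g(N)) if T(N) ≤ c·g(N) for some constant c, once N is large enough.
  T(N) is Θ(g(N)) if c₁·g(N) ≤ T(N) ≤ c₂·g(N) for some constants c₁ and c₂, once N is large enough.

So Θ pins the growth down from both sides, while O only promises an upper bound.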
Some more practice! What is the order of growth of each of the following procedures? What is N?
- (define (fact x)
    (if (= x 0)
        1
        (* x (fact (- x 1))) ))

  Answer: Θ(N), where N is X. Not too tricky; the number of recursive calls is equal to the original argument, and each call does a constant amount of work.
- (define (sum-of-facts x n)
    (if (= n 0)
        0
        (+ (fact x) (sum-of-facts x (- n 1)) )))

  Answer: Θ(N*X). Let's break this down. SUM-OF-FACTS calls itself N times before reaching the base case. And in each call to SUM-OF-FACTS (except the base case), we do a Θ(X) operation: (fact x). So the amount of time the procedure takes is proportional to N times X.
- (define (vowel? letter) (member? letter '(a e i o u)) )

  Answer: Θ(1). This one surprises some people. MEMBER? is ordinarily a Θ(N) operation, since you have to look at every word in the sentence to see if it matches the one you're looking for. But in this particular case, we know the sentence is always of length 5. That means we know how much time MEMBER? will take, no matter what the input to VOWEL? is.

  If you think about it, what letter you put in shouldn't change how long this takes anyway, beyond whether it's a vowel or not. There's no idea of "larger" input, so there's no "growth".
- (define (pl-done? wd) (vowel? (first wd)) )

  Answer: Θ(1). Very simple. Both FIRST and VOWEL? are Θ(1) operations, and there is no recursion.
- (define (pigl wd)
    (if (pl-done? wd)
        (word wd 'ay)
        (pigl (word (bf wd) (first wd)) )))

  Answer: Θ(N), where N is the number of letters in the word. There are two paths through this procedure: the base case and the recursive case. But both paths only do Θ(1) operations. How many recursive calls are there? Well, if the word is pig-Latin-able, there must be a vowel somewhere in it. At worst it will be at the end of the word, so we might have to look at every letter. Thus there will be at most as many recursive calls as there are letters in the word. (That "at most" is another reason why CS people sometimes prefer writing O(N) to Θ(N).)
  We don't take into account "bad" input here, but technically the running time could be infinite, as when we evaluate (pigl 'sky).
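To see the number of calls in action, here's a trace of PIGL (a sketch of the successive calls):

(pigl 'scheme)        ; S is not a vowel, so move it to the end
→ (pigl 'chemes)      ; C is not a vowel either
→ (pigl 'hemesc)      ; neither is H
→ (pigl 'emesch)      ; E is a vowel, so this call is done
→ 'emeschay           ; (word 'emesch 'ay)

One call per leading consonant, so at worst about one call per letter.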
Iteration
Another way to measure the cost of a program is by how much space it requires. For a simple program like the straightforward COUNT, you need space for every recursive call. Why? Because the first call to COUNT isn't done until it can add 1 to the result of the next call to COUNT, which isn't done until... all the way down to the base case. Only once we reach the base case can all the intermediate calls actually release their memory (called "stack space") back to the computer.
If a program is written more carefully, however, it can use a fixed amount of memory, rather than one that grows with the amount of recursion you have. The term for such a program is iterative. For example, here's an iterative version of COUNT:

(define (count sent)
  (define (iter sent count-so-far)
    (if (empty? sent)
        count-so-far
        (iter (bf sent) (+ 1 count-so-far)) ))
  (iter sent 0) )
It's a bit more complicated than the standard COUNT, for sure. But there's an important difference. Notice how the call to the helper procedure ITER is the last thing that happens in COUNT. This is called a tail call. Moreover, each recursive call in ITER is also a tail call; this is often called being tail-recursive.
You might think we have the same problem as before: doesn't COUNT still have to wait for ITER to finish before returning its result? Scheme, however, is smart enough to realize that all COUNT is going to do with the value it gets from ITER is return it, and so it just tells ITER to return its result to whoever called COUNT. That means that COUNT doesn't have to sit around waiting for ITER, and can give its memory back right away.
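Compare the trace of the iterative version: each call hands the count-so-far forward, so nothing is ever left waiting (a sketch):

(count '(the quick brown fox))
→ (iter '(the quick brown fox) 0)
→ (iter '(quick brown fox) 1)
→ (iter '(brown fox) 2)
→ (iter '(fox) 3)
→ (iter '() 4)
→ 4                                     ; no pending additions at any step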
If a procedure calls another procedure that isn't iterative, does the first procedure still count as iterative? Not really—the space required to run the first procedure has to include the space for the second. Moral of the story: if every recursive call in a procedure is a tail call, and everything the procedure calls is iterative, that procedure is iterative as well!
Let's try a few examples. Which of the following generate an iterative process, and which generate a recursive (not iterative) process?
- (define (every fn sent)
    (if (empty? sent)
        '()
        (se (fn (first sent))
            (every fn (bf sent)) )))

  Answer: Recursive. After a recursive call finishes, the EVERY that called it still has to do work putting the result together with SE. That means it has to stick around, which means the process is not iterative.
- (define (squares sent) (every (lambda (x) (* x x)) sent) )

  Answer: Recursive. Although there are no recursive calls here, the version of EVERY we're using is the one above, which is not iterative. Therefore, any procedure that uses it in this way can't be iterative either.
- (define (alternating-pos sent)
    (cond ((empty? sent) #t)
          ((< 0 (first sent)) (alternating-neg (bf sent)) )
          (else #f) ))

  (define (alternating-neg sent)
    (cond ((empty? sent) #t)
          ((> 0 (first sent)) (alternating-pos (bf sent)) )
          (else #f) ))

  Answer: Iterative. Even though we have mutual recursion here, every call is still a tail call, which means every call can give up its space once it gets to the mutually recursive step. Thus the total space required is still fixed, and the process is iterative.
Note: "recursive" just means a function that calls itself. So in Scheme, even "iterative" functions are written recursively! (Other languages use loops.) But when we refer to a recursive process, we're talking about the trait of requiring more and more memory as each call waits for the next to finish. Iterative processes only require a fixed amount of space.
A few thought questions
- Can a function that has multiple calls to itself generate an iterative process?

  Sure! The catch is that all of them have to be tail calls. How can that work? Put each recursive call in a different IF or COND clause, so only one of them actually runs on any given call. (There's a sketch after these questions.)
Can a function that has only one call to itself generate a recursive process (not iterative?)
Sure. Our usual implementation of
COUNT
only has one call to itself, and is not iterative. -
Can a function that has no calls to itself generate a recursive process?
Sure. This was the case with
SQUARES
, above.
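Here's a sketch of what the first thought question describes (COUNT-POS is a made-up example, not a course procedure): it counts the positive numbers in a sentence, has two calls to itself, and still generates an iterative process because each call sits alone in its own COND clause as a tail call.

(define (count-pos sent so-far)                ; hypothetical example
  (cond ((empty? sent) so-far)
        ((> (first sent) 0)
         (count-pos (bf sent) (+ so-far 1)))   ; tail call #1
        (else (count-pos (bf sent) so-far)) )) ; tail call #2

For example, (count-pos '(3 -1 4 -1 5) 0) returns 3 in constant space.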