# 61A Homework 7

Due by 11:59pm on Saturday, 7/19

Readings: You might find the following references useful:

This assignment is a bit different from the rest. For the first half, you'll be using dictionaries to create your own Shakespearean sentences. In the second half, you'll be diving into cryptography and using what you've learned this week to create a cypher. The entire second half of this assignment is optional. While it is good practice on the material for this week, you don't have to turn it in to get full points for this assignment. You worked very hard last week, so hopefully this homework will be a bit of a break.

Submission: See the online submission instructions. We have provided a hw7.py starter file for the questions below.

## Dictionaries and Shakespeare

Now that you know how dictionaries work, we can approximate the entire works of Shakespeare! We're going to use a bigram language model. Here's the idea: We start with some word — we'll use "The" as an example. Then we look through all of the texts of Shakespeare and for every instance of "The" we record the word that follows "The" and add it to a list, known as the successors of "The". Now suppose we've done this for every word Shakespeare has used, ever.

Let's go back to "The". Now, we randomly choose a word from this list, say "cat". Then we look up the successors of "cat" and randomly choose a word from that list, and we continue this process. This eventually will terminate in a period (".") and we will have generated a Shakespearean sentence!

The object that we'll be looking things up in is called a "successor table", although really it's just a dictionary. The keys in this dictionary are words, and the values are lists of successors to those words.

### Question 1

Here's an incomplete definition of the `build_successors_table` function. The input is a list of words (corresponding to a Shakespearean text), and the output is a successors table. (By default, the first word is a successor to "."). See the example below:

``````def build_successors_table(tokens):
"""Return a dictionary: keys are words; values are lists of
successors.

>>> text = ['We', 'came', 'to', 'investigate', ',', 'catch', \
'bad', 'guys', 'and', 'to', 'eat', 'pie', '.']
>>> table = build_successors_table(text)
>>> expected = {'and': ['to'], 'We': ['came'], 'bad': ['guys'], \
'pie': ['.'], ',': ['catch'], '.': ['We'], \
'to': ['investigate', 'eat'], 'investigate': [','], \
'catch': ['bad'], 'guys': ['and'], 'eat': ['pie'], \
'came': ['to']}
>>> expected == table
True
"""
table = {}
prev = '.'
for word in tokens:
if prev in table:
else:

prev = word
return table``````

### Question 2

Let's generate some sentences! Suppose we're given a starting word. We can look up this word in our table to find its list of successors, and then randomly select a word from this list to be the next word in the sentence. Then we just repeat until we reach some ending punctuation.

Hint: to randomly select from a list, first make sure you import the Python random library with `import random` and then use the expression `random.choice(my_list)`)

This might not be a bad time to play around with adding strings together as well. Let's fill in the `construct_sent` function!

``````def construct_sent(word, table):
"""Returns a random sentence starting with word, sampling from
table.
"""
import random
result = ' '
while word not in ['.', '!', '?']:
return result + word``````

Great! Now all that's left is to run our functions with some actual code. The following snippet included in the skeleton code will return a list containing the words in all of the works of Shakespeare.

Warning: do NOT try to print the return result of this function):

``````def shakespeare_tokens(path = 'shakespeare.txt', url = 'http://goo.gl/SztLfX'):
"""Return the words of Shakespeare's plays as a list."""
import os
from urllib.request import urlopen
if os.path.exists(path):
else:
shakespeare = urlopen(url)

Next, we probably want an easy way to refer to our list of tokens and our successors table. Let's make the following assignments:

``````>>> tokens = shakespeare_tokens()
>>> table = build_successors_table(tokens)``````

Finally, let's define an easy to call utility function:

``````>>> def sent():
return construct_sent('The', table)

>>> sent()
' The plebeians have done us must be news-cramm'd '

>>> sent()
' The ravish'd thee , with the mercy of beauty '

>>> sent()
' The bird of Tunis , or two white and plucker down with better ; that's God's sake '``````

Now, if we want to start the sentence with a random word, we can use the folowing:

``````def random_sent():
import random
return construct_sent(random.choice(table['.']), table)

>>> random_sent()
' You have our thoughts to blame his next to be praised and think ?'

>>> random_sent()
' Long live by thy name , then , Dost thou more angel , good Master Deep-vow , And tak'st more ado but following her , my sight Of speaking false !'

>>> random_sent()
' Yes , why blame him , as is as I shall find a case , That plays at the public weal or the ghost .'``````

## Creating a Cypher (Optional)

In this section, we'll be making our own message encryptor using a technique called a one-time pad.

Basically, this technique works by randomly generating a pad for each word in a message. A pad is just a random string of letters the same length as the word. For example, a pad for the word `'dog'` might be the string `'yaz'`.

Then, we create our encryption by adding the letters of the original word to our pad, mod 26, then using that as the new value for that letter.

For example, the first letter our word `'dog'` is d, which is the 3rd letter in the alphabet (indexing from 0). The first letter in our pad `'yaz'` is y, the 24th letter in the alphabet. Therefore, the first letter in our encryption would be b (`24 + 3 = 27 % 26 = 1`).

Since decrypting this type of message relies on the pad we randomly generate, after encrypting our message, we're going to use higher order functions and nonlocal to lock our randomly generated pad away.

In the Cypher section of your starter file, you've already been given the function `pad_creator`. This function takes in a single word, and returns a pad for it. See the doctests for further clarification on the domain and range of `pad_creator`.

You've also been given the dictionary `letter_dict`, which contains key-value pairs of letters of the alphabet with their corresponding index.

### Question 3

What is a pad? What is it used for? How is it different from our encrypted word?

How are we encrypting our words? Review the Wikipedia article linked above if you are still unclear on this process.

Make sure you can answer these questions before moving on.

### Question 4

The first thing we need to do is to find a way to encrypt a single word.

Write the function `word_mutator` which takes in a single word and a pad, and returns the encrypted version of that word. You may find `letter_dict` useful for this task. You may also find the value `string.ascii_lowercase` useful. `string.ascii_lowercase` returns a string of all of the lowercase ASCII characters in order.

``````def word_mutator(word, pad):
"""Returns an encrypted version of word using
the one-time pass techinique.

>>> word_mutator('charms', 'secret')
'ulciql'
"""
new_word = ''

### Question 5

Switching gears briefly for this problem, we need to consider the security of our pad. We can only decrypt our original message if we have the pad that encrypted it. However, we don't just want to return the pad with our encrypted message. Instead, lets create a higher order function `make_lock`.

`make_lock` takes in the pad we want to secure, a password, and a number of guesses which defaults to 3. `make_lock` returns a function which takes in a password attempt.

If the password is not the same as what was given to `make_lock`, we want to reduce the number of guesses left by 1, and store that attempted guess away. If a password is attempted more than once, or the amount of guess attempts left reduces to zero, the pad should be locked away forever.

However, if one gives the correct password before the guess attempts hits zero, the lock should return the pad given to `make_lock`, and then never return the pad again.

Make sure you read and understand the doctests for `make_lock` before you start writing your code.

``````def make_lock(pad, password, n=3):
"""Returns a function which takes in password attempts.
locked away forever.

is locked away forever.

never be retrieved again from this lock.

>>> lock1 = make_lock('correcthorsebatterystaple', 'letmein')
>>> lock1('123456')
>>> lock1('letmein')
'correcthorsebatterystaple'
>>> lock1('letmein')
>>> lock2 = make_lock('xyzzy', 'worst. password. ever.')
>>> lock2('Pikachu')
>>> lock2('Pikachu')
'Password attempt repeated: security system locked!'
"""
attempts = []
def lock(attempt):
return lock``````

### Question 6

Now we can finally write our encryption function!

Write the function `OTP_encrypter`, which takes in a list of words to encrypt, and a password to lock the corresponding pad with. `OTP_encrypter` should encrypt the original list of words, and mutate the `message` to store the encrypted words in it. It should return a lock for the pad corresponding to the message. In order to recover the list of words, we would have to use the password to recover the pad (which would be a list of strings, one for each word).

``````def OTP_encrypter(message, password):
"""Encrypts the words in the orignial list message, and

>>> message = ['robbery', 'planned', 'on', 'monday']
>>> message_copy = message[:]
>>> lock = OTP_encrypter(message, 'open sesame')
>>> message == message_copy
False