Fleeting memory

I’ve had a couple of questions about how to approach task 2 on homework 6, so let me share a little program that has kind of an analogous “memory” (or “history”) property to what we need there.

There is an example in the book of how to get a part-of-speech analyzer to use the history of its choices, but the task there was a little bit different.

Here, really all we want to do is add the “previous result” to the set of features that we know about a post, so we want to go through the posts, extract the training features (which are features like “contains(the)” for the words), and add a feature named “prev-class” that holds the class of the previous post.

Perhaps it is due to idiosyncrasies of the kind of programming I usually do, but there is a kind of “idiom” that I use all the time to do this kind of thing, where an element in a list needs to know something about the previous element. So, let me present that to you here, as a kind of conceptual model of how it might be done.

Suppose that we have a list of words like this:

basic_words = ['the', 'cat', 'pounced']

Suppose that for whatever reason, we want to convert this into a list like this:

word_list = [{'start': 't', 'word': 'the', 'prev': 'NONE'},
    {'start': 'c', 'word': 'cat', 'prev': 'the'},
    {'start': 'p', 'word': 'pounced', 'prev': 'cat'}]

This is a list of dictionaries. There is a dictionary for each word. In that dictionary, we have values for ‘start’, ‘word’, and ‘prev’. The value of ‘start’ is the first letter, the value of ‘word’ is the word, and the value of ‘prev’ is the preceding word.

print(word_list[0]['word'])
the
print(word_list[1]['prev'])
the

So what we want to do is define a function that will take a list of words and return a list of these dictionaries, where one of the things we need to know is what the previous word was.

I would do this as follows:

def words2dicts(words):
    return_list = []   # we are building this list, it starts out empty
    prev_word = "NONE" # at the beginning, there is no previous word, so use "NONE"
    for w in words:
        # build the feature dictionary
        f = {'word': w, 'start': w[0]}
        # add in the previous word too
        f['prev'] = prev_word
        # add this dictionary to our running list of dictionaries
        return_list.append(f)
        # set prev_word to be the word we just processed
        prev_word = w
    # return the list we built
    return return_list

In words2dicts above, I built f, the feature dictionary, in two steps. The first step defined the keys ‘word’ and ‘start’, and the second step added one more, ‘prev’. This could all have been done in one step, but I thought it would be useful to see both ways of adding to the dictionary.
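For reference, here is what the one-step version would look like. It behaves exactly the same as words2dicts above; the only difference is that all three keys go into the dictionary literal at once:

```python
def words2dicts_onestep(words):
    return_list = []
    prev_word = "NONE"  # no previous word at the start
    for w in words:
        # build the whole feature dictionary in a single step
        f = {'word': w, 'start': w[0], 'prev': prev_word}
        return_list.append(f)
        prev_word = w
    return return_list
```

Which style you prefer is a matter of taste; the two-step version is handy when some features are computed conditionally.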

The “idiom” I was referring to before is just the practice of using a variable to remember what we processed the previous time through a loop. So, I set it to something that means “nothing” before we start the loop, and then once we have processed a word w, I set that variable to be the word w we just processed. Then, the next time through the loop, we can use that variable to refer to the word we did the previous time.

This is probably simpler than precomputing all the tags ahead of time and finding them in a list. The memory is “fleeting” in a way, because we don’t remember in any long-term way what words we have processed (except to the extent that the word itself is recorded in the dictionaries we are adding to the overall list), all we remember is just the immediately preceding word.

print(basic_words)
['the', 'cat', 'pounced']
dictlist = words2dicts(basic_words)
print(dictlist)
[{'word': 'the', 'prev': 'NONE', 'start': 't'}, {'word': 'cat', 'prev': 'the', 'start': 'c'},
{'word': 'pounced', 'prev': 'cat', 'start': 'p'}]
print(dictlist == word_list)
True

Anyway, this example has the basic contours of the function you’d probably want to create for task 2 in the homework. There are other ways to do it, but this one is the one that’s always my first reflex.
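Just to illustrate one of those other ways: you could precompute the list of predecessors up front and pair it with the word list using zip. This is a sketch, not necessarily better; it does the precomputing I mentioned avoiding, but some people find it more readable:

```python
def words2dicts_zip(words):
    # the predecessor of word i is word i-1; "NONE" stands in
    # for the missing predecessor of the first word
    prevs = ["NONE"] + list(words[:-1])
    return [{'word': w, 'start': w[0], 'prev': p}
            for w, p in zip(words, prevs)]
```

It produces the same list of dictionaries as words2dicts.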

One other thing to be careful about, specifically regarding task 2 and Naive Bayes classifiers: the training and test sets must each be a list of pairs. The pairs are of the form (dictionary1, answer1), and what the classifier is trying to learn is which features of the dictionary correspond to which answers, so that it can look at a new dictionary and guess the answer. So, what you are trying to make in task 2 is a list of that form, and the real problem is essentially how to add “prev-class” as one of the features contained within the dictionary in the first position of each of those pairs.
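To make that concrete, here is a sketch of how the idiom above transfers to that list-of-pairs shape. The function name add_prev_class and the example feature names are my own inventions for illustration, not names from the homework or the book; the only thing taken from the description above is the (dictionary, answer) structure:

```python
def add_prev_class(labeled_featuresets):
    """Given a list of (feature_dict, answer) pairs, return a new list
    where each feature dict also has a 'prev-class' feature holding
    the answer from the preceding pair ("NONE" for the first one)."""
    result = []
    prev_class = "NONE"  # no previous post at the start
    for features, answer in labeled_featuresets:
        f = dict(features)           # copy, so we don't mutate the input
        f['prev-class'] = prev_class
        result.append((f, answer))
        prev_class = answer          # remember this answer for the next pair
    return result
```

The loop variable prev_class plays exactly the role prev_word played in words2dicts, except that what we remember is the previous answer rather than the previous word.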