Connections and Symbols

By the way, the yellow book I was referring to in class on Monday is Connections and Symbols (eds. Steven Pinker and Jacques Mehler, MIT Press, 1988), which is available through the BU Library here:

Connections and Symbols

You may need to sign in on the library site first, but the full text is available once you have signed in.

  • readings

Homewoooork Fiiive

I’ve had a couple of questions about homework 5 and, as has happened before, it seems to be a bit harder in spots than anticipated. So I’ll make it due on Friday this week rather than tomorrow. If you’ve gotten through it already, great! Otherwise, it’s worth looking over before class so you can ask questions, and I’ll see if there are things I want to bring up as well.

(Also, don’t forget to read the immediately preceding post about the error in the book’s example for the first of the assigned problems! It doesn’t work as “printed.”)

  • homework

Home-doesn't-work 5 notes

It has been observed that the first homework problem I assigned is made… difficult… by the fact that the example code does not actually work.

This is in fact still an open bug in the NLTK book, but there are some details in a related bug report.

The main point is that the listing you would base your approach to problem 9 on, the one that starts with text = 'That U.S.A. poster-print costs $12.40...', does not work on the current version of NLTK. The reason is that the handling of “capturing groups” has changed.

Basically, regexp_tokenize misbehaves when the pattern uses grouping parentheses, so wherever you have a capturing group like (...), use a non-capturing group like (?:...) instead. Concretely, the example from chapter 3 should read:

>>> import nltk
>>> text = "That U.S.A. poster-print costs $12.40..."
>>> pattern = r'''(?x)          # set flag to allow verbose regexps
...     (?:[A-Z]\.)+            # abbreviations, e.g. U.S.A.
...   | \w+(?:-\w+)*            # words with optional internal hyphens
...   | \$?\d+(?:\.\d+)?%?      # currency and percentages, e.g. $12.40, 82%
...   | \.\.\.                  # ellipsis
...   | [][.,;"'?():_`-]        # separate tokens; includes ], [ ('-' is last so it is literal, not a range)
... '''
>>> nltk.regexp_tokenize(text, pattern)
['That', 'U.S.A.', 'poster-print', 'costs', '$12.40', '...']

That worked for me, anyway, and it should give you a basis to work from when doing problem 9.
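If you want to see why the parentheses matter, here is a small sketch using Python’s built-in re.findall directly. I am assuming (without having checked every NLTK version) that regexp_tokenize hands the pattern to findall more or less as-is. The names capturing, non_capturing, broken, and fixed are just mine for illustration, and I left out the punctuation alternative to keep it short. When a pattern contains capturing groups, findall returns the contents of the groups rather than the whole match, which is where the odd output comes from.

```python
import re

text = "That U.S.A. poster-print costs $12.40..."

# Pattern written with ordinary (capturing) parentheses.
capturing = r'''(?x)
    ([A-Z]\.)+            # abbreviations, e.g. U.S.A.
  | \w+(-\w+)*            # words with optional internal hyphens
  | \$?\d+(\.\d+)?%?      # currency and percentages
  | \.\.\.                # ellipsis
'''

# Same pattern, but every group is non-capturing: (?:...)
non_capturing = r'''(?x)
    (?:[A-Z]\.)+
  | \w+(?:-\w+)*
  | \$?\d+(?:\.\d+)?%?
  | \.\.\.
'''

# With capturing groups, findall returns one tuple per match,
# holding only what each group captured (often empty strings).
broken = re.findall(capturing, text)
print(broken)

# With non-capturing groups, findall returns the full matched strings.
fixed = re.findall(non_capturing, text)
print(fixed)
```

The second list is the tokenization you actually want; the first is a list of tuples of group contents.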

  • errata
  • homework

Homework 5

For homework 5, I’ll go back to using a few exercises from the book.

These are due on October 26.

From chapter 3, numbers 9, 10, 25, 29, 38.

For 9 and 38, it would be useful for me to have the text that you were working with as well.

  • homework

pythontutor.com

In class, a question came up about how you can debug your programs. I talked a little about what debuggers have in common and showed a bit of how to use the debugger in Anaconda/Spyder, although I don’t have a great deal of experience with that debugger myself (so I immediately ran into the problem that I didn’t know what commands were available at the Python debugger prompt).

But apparently when people are taught Python in CS classes, the site pythontutor.com is recommended as a troubleshooting resource. It looks pretty nice. For a short program, you can copy and paste it in, and then step through your program as it runs, line by line, to see how variables evolve.

I have not really played with it, but I expect that, while it will likely be useful for short functions that do not depend on NLTK, you’ll need to use a different debugger (like the one in Spyder) once you start using things that are not built into Python. (That is to say, for most things that require you to import <something> before using them.)
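As an illustration of the kind of thing that should be a good fit, here is a short made-up function that uses only built-in Python (the name count_word_lengths and the test sentence are just for the example). Something of this size can be pasted in and stepped through to watch counts fill in one word at a time.

```python
def count_word_lengths(sentence):
    """Map each word length to the number of words with that length."""
    counts = {}
    for word in sentence.split():
        n = len(word)
        # Add one to the tally for this length, starting from 0 if unseen.
        counts[n] = counts.get(n, 0) + 1
    return counts


result = count_word_lengths("the cat sat on the mat")
print(result)
```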

Still, it provides quite a nice visualization tool for simple things and can help you test your logic and see what is happening to your variables. I’ll see if I can come up with either some tutorials or documentation for the Spyder debugging functions. I haven’t looked for any at this point.
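In the meantime, here is a minimal sketch of Python’s own built-in debugger, pdb, which works even when you have imported other packages. The function average_word_length is a made-up example. I am assuming that the prompt in Spyder accepts the same basic commands as plain pdb, which I have not verified.

```python
def average_word_length(words):
    total = 0
    for w in words:
        # Uncomment the next line to pause here and get a (Pdb) prompt
        # (breakpoint() is built in as of Python 3.7):
        # breakpoint()
        total += len(w)
    return total / len(words)


# Some standard pdb commands you can type at the prompt:
#   n        run the next line
#   s        step into a function call
#   c        continue until the next breakpoint
#   p total  print the value of an expression (here, total)
#   l        list the source code around the current line
#   h        show help on the available commands
#   q        quit the debugger
print(average_word_length(["That", "poster-print", "costs"]))
```

As written, with the breakpoint() line commented out, this just runs and prints the average.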

But, still, check out pythontutor.com if your functions aren’t doing what you think they should be.

  • resources