Course information
NLP&CL, Fall 2017
(CAS LX 390, GRS LX 690)
Meeting time | T 3:30-6:15, CAS 327 |
Instructor | Paul Hagstrom |
Email | hagstrom@bu.edu |
Phone | (617) 353-6220 |
Office | 621 Commonwealth Ave., Rm. 105 |
Office Hours | TR 11-12; W 2-3 (and by appointment) |
Prerequisite: CAS LX 250 (Introduction to Linguistics), or consent of instructor.
Course Description: Introduction to computational techniques to explore linguistic models and test empirical claims. Serves as an introduction to programming, algorithms, and data structures, focused on modern applications to NLP. Topics include tagging and classification, parsing models, meaning representation, and information extraction.
Learning objectives
Students completing this course will:
- Gain a basic understanding of the types of research done in the field of natural language processing
- Gain experience approaching and solving these problems using Python and available corpora and libraries
Course Requirements
Readings. There will be readings for each class session. All readings mentioned on the schedule are required, and should be completed by the beginning of class.
Attendance and participation. Regular attendance is required, and participation in classroom discussions is expected.
Homework. There will be homework assignments on roughly a weekly schedule.
Exams. There will be two exams: a midterm around the middle of the term and a final at the end of the semester. Both will be take-home projects. The midterm will focus a bit more on Python; the final will focus a bit more on using NLTK to address language issues.
Project (LX 690). Students registered for LX 690 will prepare a final project in place of the final exam; the project must be proposed and approved by November 14. It should present a question or problem that the tools used in the course can help answer, and it should have implications (to be discussed in the project write-up) for some aspect of linguistic theory. The final write-up should be roughly 15–20 pages.
Late assignments. Late assignments will not be accepted without prior arrangement.
Electronic communication
We live in an electronic age. You (unlike me) have always lived in an electronic age. You are expected to be reachable via your BU email address. The central communication center for the course is the course blog. Announcements, notes on readings, homework errata, and other information will be posted there on a regular basis, and things that are posted there will be assumed to have been communicated. Homework assignments can be sent (whenever feasible, and unless otherwise indicated) by email, or handed in on paper. It is your responsibility to ensure that electronically submitted material is in a readable format—if there is a question (for example, if you use a special font or an obscure word processor), send it early for verification. Unreadable submissions do not count as having been handed in.
Readings
The textbook for the course is Steven Bird, Ewan Klein, and Edward Loper (2016). Natural Language Processing with Python (Python 3, NLTK 3 version). Other readings may be assigned from time to time.
Grading Scheme
Homework (lowest dropped) | 50% |
Midterm exam | 15% |
Final exam / project | 20% |
Regular attendance, participation | 15% |
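As a rough illustration of how the weights above combine, here is a minimal Python sketch. It assumes every component is scored on a 0–100 scale and that homework scores are averaged with equal weight after the single lowest is dropped; the function name and the sample scores are hypothetical, and the instructor's actual computation may differ.

```python
def course_grade(homework, midterm, final, participation):
    """Combine component scores (each assumed to be 0-100) using the syllabus weights."""
    # Drop the single lowest homework score when there is more than one.
    kept = sorted(homework)[1:] if len(homework) > 1 else list(homework)
    # Assumption: the remaining homework scores count equally.
    hw_avg = sum(kept) / len(kept) if kept else 0.0
    # Weights from the grading scheme: 50% / 15% / 20% / 15%.
    return 0.50 * hw_avg + 0.15 * midterm + 0.20 * final + 0.15 * participation


# Hypothetical scores, for illustration only.
example = course_grade([90, 80, 100, 60], midterm=85, final=90, participation=100)
print(round(example, 2))
```

With these sample scores, the 60 is dropped, the homework average is 90, and the weighted total comes to 90.75.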
CAS/GRS Academic Conduct Code
It is essential that you read and adhere to the CAS Student Academic Conduct Code. Graduate students must also follow the policies of the GRS Academic Conduct Code.