College of Humanities & Social Sciences

Summer 2019 400-Level Courses

*Please note: if you plan to register for more than one LING 402 course per quarter, you must register in person at the Registrar's Office in OM 230.

LING 402: Computational Linguistics - James Hearne, Ph.D.

MTWRF 12-1:20

Prerequisite: LING 310; one course from LING 204, ENG 270 or ANTH 347

Credits: 5

Course Description: This course addresses the use of computer systems to study natural language and related data sets. Intended to be accessible to both linguists and computer scientists, it suits linguistics students with little knowledge of computers beyond ordinary computer literacy, as well as computer science students with little knowledge of linguistics. The hope is that these two communities will develop a fruitful collaboration and perhaps make novel discoveries. The corpora we will study include ancient Akkadian, Sumerian, Etruscan and Chinese texts, as well as J.R.R. Tolkien’s artificial languages and contemporary English corpora.

The primary instrument for computational linguistics in this course will be the R language. Although R can be approached as a standard programming language, it can also be used as a collection of utilities accessible to the non-programmer.

Here are the topics we will address:

  • Descriptive statistics applied to language.
  • Visualization of statistical results.
  • Inferential statistics applied to language.
    • Correlation analysis
    • Regression
  • Association measures.
    • Collocations
    • Collostructions