Syllabus and Assessment

The following is a list of the topics covered in this course, along with the recommended readings. Each topic consists of several videos, uploaded to an unlisted YouTube channel, and is followed by one programming assignment. The link to the channel will be emailed before the course starts. We will have 1-2 Zoom meetings per topic, on the assumption that you have already watched the videos before we meet. The link and schedule for the Zoom meetings will also be sent by email before the course starts. If you want, you can post your questions in advance as comments under the videos.

The course is graded Pass/Fail only, due to the current (rather stressful) situation.

Here are the topics. Assignments will be uploaded in due course.

1. Introduction

  • Course overview
  • Introduction to NLP
  • NLP, Machine Learning, and Economics: an overview

Readings:

(Note that you are not obligated to read everything thoroughly.)

  • Chapter 1 from “Speech and Language Processing” by Jurafsky and Martin (available online)
  • Gentzkow, M., Kelly, B., & Taddy, M. (2019). Text as data. Journal of Economic Literature, 57(3), 535-74.

2. Python fundamentals

  • Installing Python on your personal or lab machine and writing a hello world program. See this link for instructions: \url{https://www.py4e.com/install}
  • Installing Jupyter Notebook
  • Declaring variables and performing arithmetic operations
  • Basic data structures (strings, lists, dictionaries)
  • Basic programming: conditional statements, loops, functions, error handling
  • Reading and writing text files
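
As a rough illustration of how the items above fit together, here is a minimal sketch that uses only the Python standard library. The function names (`count_words`, `read_text`), the file name `sample.txt`, and the sample sentence are made up for this example and are not part of the course material.

```python
import tempfile
from pathlib import Path


def count_words(text):
    """Return a dictionary mapping each lowercase word to its frequency."""
    counts = {}
    for word in text.lower().split():
        word = word.strip(".,;:!?\"'()")  # trim punctuation at the edges
        if word:  # skip tokens that were only punctuation
            counts[word] = counts.get(word, 0) + 1
    return counts


def read_text(path):
    """Read a text file; return an empty string if the file does not exist."""
    try:
        return Path(path).read_text(encoding="utf-8")
    except FileNotFoundError:
        return ""


# Write a small file into a temporary folder, then read it back and count words.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.txt"  # hypothetical file name
    sample.write_text("Prices rose. Prices fell, then rose again.", encoding="utf-8")
    counts = count_words(read_text(sample))
    missing = read_text(Path(tmp) / "missing.txt")  # handled by the except branch

print(counts)
```

The sketch touches strings, a dictionary, a loop, a conditional, two functions, error handling, and file reading and writing in about twenty lines; the topic videos cover each of these in more detail.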

Readings:

“Python for Everybody” by Charles Severance. \url{https://www.py4e.com/html3/}. The content covered in this topic is taken from the first 10 chapters of the book.

3. Python & textual data

  • How to install various libraries
  • Reading and writing files in different formats (e.g., PDF, HTML, text, DOC)
  • Pre-processing text (e.g., sentence splitting, removing punctuation or digits if needed)
  • Representing text as a numeric vector (e.g., bag of words, TF-IDF, embeddings)
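
To make the last two items concrete, the following is a minimal sketch of pre-processing, bag-of-words counts, and one common unsmoothed TF-IDF variant (term frequency times log(N / document frequency)), written with the standard library only. The two example sentences and all function names are assumptions made for illustration. Libraries such as scikit-learn use smoothed variants of TF-IDF, so their numbers will differ from the ones computed here.

```python
import math
import re
from collections import Counter


def preprocess(text):
    """Lowercase the text, drop punctuation and digits, and split into tokens."""
    return re.sub(r"[^a-z\s]", " ", text.lower()).split()


def bag_of_words(tokens, vocabulary):
    """Count how often each vocabulary word occurs in one document."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]


def tf_idf(docs_tokens, vocabulary):
    """Weight term frequency by log(N / document frequency), unsmoothed."""
    n_docs = len(docs_tokens)
    doc_freq = {w: sum(1 for toks in docs_tokens if w in toks) for w in vocabulary}
    vectors = []
    for toks in docs_tokens:
        counts = Counter(toks)
        length = max(len(toks), 1)  # avoid dividing by zero for empty documents
        vectors.append([
            (counts[w] / length) * math.log(n_docs / doc_freq[w])
            for w in vocabulary
        ])
    return vectors


# Two made-up example documents.
docs = [
    "Inflation rose by 2 percent.",
    "Inflation fell; unemployment rose.",
]
tokens = [preprocess(d) for d in docs]
vocabulary = sorted({w for toks in tokens for w in toks})
bow = [bag_of_words(toks, vocabulary) for toks in tokens]
weights = tf_idf(tokens, vocabulary)

print(vocabulary)
print(bow)
```

Words that occur in every document (here "inflation" and "rose") get a TF-IDF weight of zero under this variant, which is the intended effect: they do not help distinguish one document from another.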

Readings:

Chapters 2 and 3 in “Practical Natural Language Processing”

4. NLP and Machine Learning methods

(with econ-specific datasets where possible)

  • Corpus collection (e.g., social media text, ethical issues)
  • Corpus analysis (basic analysis, e.g., frequent words/phrases)
  • Text classification
  • Information extraction (e.g., regular expressions, key phrase extraction, named entity recognition/linking)
  • Topic modeling
  • Text summarization
  • Visualizing textual data
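
As a small taste of two of the items above (basic corpus analysis and information extraction with regular expressions), here is a minimal sketch using only the standard library. The example text, the tiny stop-word list, and the patterns are invented for illustration; real datasets usually need more careful patterns.

```python
import re
from collections import Counter

# Made-up example text.
text = (
    "The central bank raised rates by 0.25% in 2022. "
    "Analysts expect the bank to hold rates near 4.5% through 2024."
)

# Percentages: digits, an optional decimal part, then a percent sign.
percentages = re.findall(r"\d+(?:\.\d+)?%", text)

# Four-digit years starting with 19 or 20, matched as whole words.
years = re.findall(r"\b(?:19|20)\d{2}\b", text)

# Frequent words: lowercase alphabetic tokens minus a tiny hand-picked stop list.
stop_words = {"the", "by", "in", "to"}
words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in stop_words]
top_words = Counter(words).most_common(2)

print(percentages, years, top_words)
```

Regular expressions work well for values with a predictable surface form, such as percentages or years. Named entity recognition and key phrase extraction typically rely on trained models instead.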

Readings:

Chapters 1-2 from the “NLTK book” (\url{https://nltk.org/book}) and Chapters 4-7 in “Practical Natural Language Processing”

5. NLP and Economics: selected readings + Group discussion

(perhaps working in groups of 2-3 people?) You can choose from some of these papers

6. Student term papers

Briefly summarize what you learnt about the intersection of NLP and Economics in this course, and note some thoughts on how it is useful for your own research topics. Depending on time and interest, we can decide whether to have a presentation session or only written submissions.

7. Recap

  • Discussion on topics covered
  • Review of exercises
  • Resources for the future