There is no single textbook. My teaching material will draw on a few reference books, listed here. We will rely on publicly accessible resources as much as possible.
One book that may be useful is “Text as Data: A New Framework for Machine Learning and the Social Sciences” by Justin Grimmer, Margaret E. Roberts, and Brandon M. Stewart, published by Princeton University Press in 2022. The Google Books preview seems interesting.
The following is a list of papers at the intersection of NLP and Economics, which I compiled from past readings and from browsing Google Scholar’s list of the 20 most cited Economics journals. The list is neither complete nor objective, and I have not read everything on it. You are free to choose any other relevant paper for the group discussions. You can find links to these papers by searching for their titles on Google Scholar.
(Organized chronologically, latest first)
Kong, N., Dulleck, U., Jaffe, A. B., Sun, S., & Vajjala, S. (2020). Linguistic Metrics for Patent Disclosure: Evidence from University Versus Corporate Patents (No. w27803). National Bureau of Economic Research.
Ruan, Q., Wang, Z., Zhou, Y., & Lv, D. (2020). A new investor sentiment indicator (ISI) based on artificial intelligence: A powerful return predictor in China. Economic Modelling, 88, 47-58.
Wu, D. X., Yao, X., & Guo, J. L. (2020). Is Textual Tone Informative or Inflated for Firm’s Future Value? Evidence from Chinese Listed Firms. Economic Modelling.
Benchimol, J., & El-Shagi, M. (2020). Forecast performance in times of terrorism. Economic Modelling.
Larsen, V. H., Thorsrud, L. A., & Zhulanova, J. (2020). News-driven inflation expectations and information rigidities. Journal of Monetary Economics.
Chuthanondha, S. (2020). Do managements tell us the whole truth and nothing but the truth? Impact of textual sentiment in financial disclosure to future firm performance and market response in Thailand. International Journal of Monetary Economics and Finance, 13(3), 244-252.
Baylis, P. (2020). Temperature and temperament: Evidence from Twitter. Journal of Public Economics, 184, 104161.
Huang, Y., & Luk, P. (2020). Measuring economic policy uncertainty in China. China Economic Review, 59, 101367.
Rybinski, K. (2020). Should asset managers pay for economic research? A machine learning evaluation. The Journal of Finance and Data Science.
Cohen, L., Malloy, C., & Nguyen, Q. (2020). Lazy prices. The Journal of Finance, 75(3), 1371-1415.
Engle, R. F., Giglio, S., Kelly, B., Lee, H., & Stroebel, J. (2020). Hedging climate change news. The Review of Financial Studies, 33(3), 1184-1216.
Palaniswamy, N., Parthasarathy, R., & Rao, V. (2019). Unheard voices: The challenge of inducing women’s civic speech. World Development, 115, 64-77.
Hassan, T. A., Hollander, S., van Lent, L., & Tahoun, A. (2019). Firm-level political risk: Measurement and effects. The Quarterly Journal of Economics, 134(4), 2135-2202.
Boudoukh, J., Feldman, R., Kogan, S., & Richardson, M. (2019). Information, trading, and volatility: Evidence from firm-specific news. The Review of Financial Studies, 32(3), 992-1033.
Hanley, K. W., & Hoberg, G. (2019). Dynamic interpretation of emerging risks in the financial sector. The Review of Financial Studies, 32(12), 4543-4603.
Fryer Jr, R. G. (2019). An empirical analysis of racial differences in police use of force. Journal of Political Economy, 127(3), 1210-1261.
Fetzer, T. (2019). Can workfare programs moderate conflict? Evidence from India. Journal of the European Economic Association.
Chandra, Y. (2018). New narratives of development work? Making sense of social entrepreneurs’ development narratives across time and economies. World Development, 107, 306-326.
Gutmann, M. P., Merchant, E. K., & Roberts, E. (2018). “Big data” in economic history. The Journal of Economic History, 78(1), 268.
Atkins, A., Niranjan, M., & Gerding, E. (2018). Financial news predicts stock market volatility better than close price. The Journal of Finance and Data Science, 4(2), 120-137.
Buehlmaier, M. M., & Whited, T. M. (2018). Are financial constraints priced? Evidence from textual analysis. The Review of Financial Studies, 31(7), 2693-2728.
Baker, S. R. (2018). Debt and the response to household income shocks: Validation and application of linked financial account data. Journal of Political Economy, 126(4), 1504-1557.
Lu, Y., Shao, X., & Tao, Z. (2018). Exposure to Chinese imports and media slant: Evidence from 147 US local newspapers over 1998–2012. Journal of International Economics, 114, 316-330.
Saia, A. (2018). Random interactions in the Chamber: Legislators’ behavior and political distance. Journal of Public Economics, 164, 225-240.
Iaria, A., Schwarz, C., & Waldinger, F. (2018). Frontier knowledge and scientific production: Evidence from the collapse of international science. The Quarterly Journal of Economics, 133(2), 927-991.
Hansen, S., McMahon, M., & Prat, A. (2018). Transparency and deliberation within the FOMC: a computational linguistics approach. The Quarterly Journal of Economics, 133(2), 801-870.
Shapiro, A., Sudhof, M., & Wilson, D. J. (2017). Measuring News Sentiment. Federal Reserve Bank of San Francisco Working Paper 2017-01.
Li, X., Shen, D., Xue, M., & Zhang, W. (2017). Daily happiness and stock returns: The case of Chinese company listed in the United States. Economic Modelling, 64, 496-501.
Gissler, S., Oldfather, J., & Ruffino, D. (2016). Lending on hold: Regulatory uncertainty and bank lending standards. Journal of Monetary Economics, 81, 89-101.
Gao, L. (2016). Applications of Machine Learning and Computational Linguistics in Financial Economics (Doctoral dissertation, Carnegie Mellon University).
Bholat, D., Hansen, S., Santos, P., & Schonhardt-Bailey, C. (2015). Text mining for central banks: Handbook. Centre for Central Banking Studies, 33, 1-19.
Moro, S., Cortez, P., & Rita, P. (2015). Business intelligence in banking: A literature analysis from 2002 to 2013 using text mining and latent Dirichlet allocation. Expert Systems with Applications, 42(3), 1314-1324.
Cagé, J., Hervé, N., & Viaud, M. L. (2015). The production of information in an online world. The Review of Economic Studies.
Lawrence, A. (2013). Individual investors and financial disclosure. Journal of Accounting and Economics, 56(1), 130-147.
Loughran, T., & McDonald, B. (2011). When is a liability not a liability? Textual analysis, dictionaries, and 10‐Ks. The Journal of Finance, 66(1), 35-65.
Tetlock, P. C. (2011). All the news that’s fit to reprint: Do investors react to stale information? The Review of Financial Studies, 24(5), 1481-1512.
Li, F. (2010). Textual Analysis of Corporate Disclosures: A Survey of the Literature. Journal of Accounting Literature, 29, 143-165.
Li, F. (2008). Annual report readability, current earnings, and earnings persistence. Journal of Accounting and Economics, 45(2-3), 221-247.
Ros, R., van Erp, M., Rijpma, A., & Zijdeman, R. (2020). Mining Wages in Nineteenth-Century Job Advertisements. The Application of Language Resources and Language Technology to study Economic and Social Inequality. In Proceedings of the Workshop about Language Resources for the SSH Cloud (pp. 27-32).
Moreno-Ortiz, A., Fernandez-Cruz, J., & Hernández, C. P. C. (2020). Design and Evaluation of SentiEcon: a fine-grained Economic/Financial Sentiment Lexicon from a Corpus of Business News. In Proceedings of The 12th Language Resources and Evaluation Conference (pp. 5065-5072).
Masson, C., & Paroubek, P. (2020). NLP analytics in finance with DoRe: A French 250M tokens corpus of corporate annual reports. In Proceedings of The 12th Language Resources and Evaluation Conference (pp. 2261-2267).
Qin, Y., & Yang, Y. (2019). What you say and how you say it matters: Predicting stock volatility using verbal and vocal cues. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (pp. 390-401).
Händschke, S. G., Buechel, S., Goldenstein, J., Poschmann, P., Duan, T., Walgenbach, P., & Hahn, U. (2018, July). A corpus of corporate annual and social responsibility reports: 280 million tokens of balanced organizational writing. In Proceedings of the First Workshop on Economics and Natural Language Processing (pp. 20-31).
Zamani, M., & Schwartz, H. A. (2017, April). Using Twitter language to predict the real estate market. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers (pp. 28-33).
Lefever, E., & Hoste, V. (2016, May). A classification-based approach to economic event detection in Dutch news text. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC’16) (pp. 330-335).
Jelveh, Z., Kogut, B., & Naidu, S. (2014, October). Detecting latent ideology in expert text: Evidence from academic papers in economics. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP) (pp. 1804-1809).
Takala, P., Malo, P., Sinha, A., & Ahlgren, O. (2014, May). Gold-standard for Topic-specific Sentiment Analysis of Economic Texts. In LREC (Vol. 2014, pp. 2152-2157).
Lertcheva, N., & Aroonmanakun, W. (2011, November). Product name identification and classification in Thai economic news. In Proceedings of the 3rd Named Entities Workshop (NEWS 2011) (pp. 58-64).
Ghose, A., Ipeirotis, P., & Sundararajan, A. (2007, June). Opinion mining using econometrics: A case study on reputation systems. In Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics (pp. 416-423).
Brekke, M., Innselset, K., Kristiansen, M., & Øvsthus, K. (2006, May). Automatic Term Extraction from Knowledge Bank of Economics. In Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06).
Kermanidis, K. L., Fakotakis, N., & Kokkinakis, G. (2002, May). DELOS: An Automatically Tagged Economic Corpus for Modern Greek. In LREC.
Proceedings of the First Workshop on Financial Technology and Natural Language Processing. 2019
Proceedings of the Second Workshop on Financial Technology and Natural Language Processing. 2020
Proceedings of the Second Workshop on Economics and Natural Language Processing. 2019
Proceedings of the First Workshop on Economics and Natural Language Processing (9 papers). 2018