Natural Language Processing Books, Course Data and Tutorials
Course Contents for Natural Language Processing
This outline will be similar to your university course outline.
1. Introduction and Overview: Ambiguity and uncertainty in language.
2. Regular Expressions: Chomsky hierarchy, regular languages, and their limitations. Finite-state automata. Practical regular expressions for finding and counting language phenomena. A little morphology. In-class demonstrations of exploring a large corpus with regex tools.
3. String Edit Distance and Alignment: A key algorithmic tool, dynamic programming, introduced first with a simple example and then applied to optimal alignment of sequences. String edit operations, edit distance, and examples of their use in spelling correction and machine translation.
4. Context-Free Grammars: Constituency; CFG definition, use, and limitations. Chomsky Normal Form. Top-down and bottom-up parsing, the problems with each, and the desirability of combining evidence from both directions.
5. Information Theory: What is information? Measuring it in bits. The "noisy channel model." The "Shannon game," motivated by language. Entropy, cross-entropy, and information gain, with applications to language phenomena.
6. Language Modeling and Naive Bayes: Probabilistic language modeling and its applications. Markov models. N-grams. Estimating the probability of a word, and smoothing. Generative models of language, applied to building an automatically trained email spam filter and to language identification.
7. Part-of-Speech Tagging and Hidden Markov Models: The concept of parts of speech, with examples and usage. The Penn Treebank and Brown Corpus. Probabilistic (weighted) finite-state automata. Hidden Markov models (HMMs): definition and use.
8. Probabilistic Context-Free Grammars: Weighted context-free grammars.
9. Maximum Entropy Classifiers: The maximum entropy principle and its relation to maximum likelihood. The need in NLP to integrate many pieces of weak evidence. Maximum entropy classifiers and their application to document classification and sentence segmentation.
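The regular-expressions topic mentions finding and counting language phenomena in a corpus. As a minimal sketch of that kind of exercise (the sample sentence and the `-ing` pattern are illustrative choices, not from the course), one might count word tokens and pull out all present-participle-like forms:

```python
import re

# Illustrative mini-corpus; any text would do.
text = ("Parsing and tagging are recurring themes; "
        "searching a corpus with regex is a starting point.")

# \w+ matches runs of word characters, a crude but common tokenizer.
tokens = re.findall(r"\w+", text.lower())

# Keep only tokens that end in "-ing" (a rough morphological probe).
ing_forms = [t for t in tokens if re.fullmatch(r"\w+ing", t)]

print(len(tokens))   # 15
print(ing_forms)     # ['parsing', 'tagging', 'recurring', 'searching', 'starting']
```

The same two calls, `re.findall` for tokenizing and `re.fullmatch` for filtering, scale directly to exploring a large corpus file read from disk.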
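The edit-distance topic names dynamic programming as the key tool. A standard sketch of that idea is the Levenshtein distance, where a table `dp[i][j]` holds the cheapest way to turn one prefix into another:

```python
def edit_distance(a: str, b: str) -> int:
    """Minimum number of insertions, deletions, and substitutions
    needed to turn string a into string b (Levenshtein distance)."""
    # dp[i][j] = edit distance between a[:i] and b[:j]
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        dp[i][0] = i                      # delete all of a[:i]
    for j in range(len(b) + 1):
        dp[0][j] = j                      # insert all of b[:j]
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + cost) # substitute or match
    return dp[len(a)][len(b)]

print(edit_distance("kitten", "sitting"))  # 3
```

The same table, with the operations traced back, yields the optimal alignment used in spelling correction and machine-translation evaluation.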
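The information-theory topic asks how to measure information in bits. A one-function sketch of Shannon entropy makes the idea concrete: a fair coin carries exactly one bit of uncertainty per toss, and a certain outcome carries none.

```python
import math

def entropy(probs):
    """Shannon entropy in bits: H(p) = -sum(p * log2(p)).
    Terms with p == 0 contribute nothing, by convention."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))          # 1.0  (fair coin: one bit per toss)
print(entropy([1.0]))               # 0.0  (no uncertainty at all)
print(entropy([0.25] * 4))          # 2.0  (four equally likely outcomes)
```

Cross-entropy and information gain, also listed in the outline, are built from this same `p * log2(p)` term.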
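The language-modeling topic covers N-grams, estimating word probabilities, and smoothing. A minimal sketch of a bigram model with Laplace (add-one) smoothing, one common smoothing scheme among those a course might cover, could look like this (the toy sentence is illustrative):

```python
from collections import Counter

def train_bigram(tokens):
    """Return a smoothed bigram probability function P(w | w_prev)."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    vocab = len(set(tokens))

    def prob(w_prev, w):
        # Add-one smoothing: every bigram gets a pseudo-count of 1,
        # so unseen pairs still receive nonzero probability.
        return (bigrams[(w_prev, w)] + 1) / (unigrams[w_prev] + vocab)

    return prob

tokens = "the cat sat on the mat".split()
p = train_bigram(tokens)
print(p("the", "cat"))  # (1 + 1) / (2 + 5) = 2/7, a seen bigram
print(p("the", "sat"))  # (0 + 1) / (2 + 5) = 1/7, unseen but not zero
```

Multiplying such conditional probabilities along a word sequence gives the generative language-model score used in applications like spam filtering and language identification.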
Reference Materials Recommended By HEC
1. Daniel Jurafsky and James H. Martin. 2008. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. Second Edition. Prentice Hall.
2. Christopher D. Manning and Hinrich Schütze. 1999. Foundations of Statistical Natural Language Processing. MIT Press, Cambridge, MA.
All of this data is extracted from the official HEC website. The purpose of this page is to gather all course subject data in one place.