- Basic algorithms: Chapters 1 through 7 discuss the classical algorithms for machine learning from text, such as preprocessing, similarity computation, topic modeling, matrix factorization, clustering, classification, regression, and ensemble analysis.
- Domain-sensitive mining: Chapters 8 and 9 discuss learning methods for text combined with data from other domains, such as multimedia and the Web. The problem of information retrieval and Web search is also discussed in the context of its relationship with ranking and machine learning methods.
- Sequence-centric mining: Chapters 10 through 14 discuss various sequence-centric and natural language applications, such as feature engineering, neural language models, deep learning, text summarization, information extraction, opinion mining, text segmentation, and event detection.
This textbook covers machine learning topics for text in detail. Since the coverage is extensive, multiple courses can be offered from the same book, depending on course level. Even though the presentation is text-centric, Chapters 3 to 7 cover machine learning algorithms that are often used in domains beyond text data. Therefore, the book can be used to offer courses not just in text analytics but also from the broader perspective of machine learning (with text as a backdrop).
This textbook targets graduate students in computer science, as well as researchers, professors, and industrial practitioners working in these related fields. It is accompanied by a solution manual for classroom teaching.
2. Text Preparation and Similarity Computation
3. Matrix Factorization and Topic Modeling
4. Text Clustering
5. Text Classification: Basic Models
6. Linear Classification and Regression for Text
7. Classifier Performance and Evaluation
8. Joint Text Mining with Heterogeneous Data
9. Information Retrieval and Search Engines
10. Text Sequence Modeling and Deep Learning
11. Text Summarization
12. Information Extraction
13. Opinion Mining and Sentiment Analysis
14. Text Segmentation and Event Detection