# Information Retrieval

Information retrieval (IR) is the technology that connects people to information.

![[IR pillars.jpg]]

1. Text preprocessing and indexing
    - [[Indexing]]
    - [[Jelinek-Mercer smoothing]]
        - (Interpolation of n-gram models)
    - [[Dirichlet smoothing]]
    - [[Query processing]]
2. Offline evaluation metrics
    - Unranked: [[Precision and Recall]]
    - Ranked: AP, [[Discounted Cumulative Gain]]
    - User-based: [[Expected Reciprocal Rank]], [[Rank Biased Precision]]
3. Test collections for offline evaluation
4. Term-based retrieval
    - Vector space model and [[TF-IDF]]
    - [[Query Likelihood Model]]
    - [[BM25]]
5. Semantic retrieval
    - Vector space models such as [[Count-based Distributional Models]], [[Latent Semantic Analysis]], Average Word Embeddings (AWEs), Doc2Vec
    - Distribution-based: [[Query Likelihood Model]], [[Probabilistic Generative Models|LDA]]
6. Offline LTR
    - [[Learning to Rank]]
    - [[RankNet]]
    - [[LambdaRank]]
7. IR-user interaction
    - [[Click models]]
8. [[Counterfactual Evaluation and LTR]]
9. [[Online Evaluation and LTR]]
10. Recommender systems
    - A ranking problem with a user profile instead of a query, with one distinctive feature: explicit user ratings.
    - [[Recommender Systems]]
    - [[Content-based recommendation]]
    - [[Collaborative filtering]]
    - [[Deep recommenders]]
    - [[Sequential recommendation]]
    - [[Conversational recommendation]]
11. Conversational IR
    - [[Conversational Information Retrieval]]
12. Current developments
    - Neural models for passage matching and ranking
    - Query and document expansion
    - Weak supervision in LTR

---

## References