Compare methods
Review your selected methods side by side; rows that differ are highlighted.
| | TREC Pooling and Relevance Judgments | Relevance Feedback Evaluation |
|---|---|---|
| Field | Library Information Science | Library Information Science |
| Family | Process / pipeline | Process / pipeline |
| Year of origin≠ | 2005 | 1990 |
| Originator≠ | Ellen M. Voorhees & Donna K. Harman (NIST TREC) | Gerard Salton & Chris Buckley (building on J. J. Rocchio) |
| Type≠ | Pooled relevance-assessment pipeline for large test collections | Evaluation pipeline for relevance-feedback query reformulation |
| Seminal source≠ | Voorhees, E. M., & Harman, D. K. (Eds.). (2005). TREC: Experiment and Evaluation in Information Retrieval. MIT Press. ISBN: 9780262220736 | Salton, G., & Buckley, C. (1990). Improving retrieval performance by relevance feedback. Journal of the American Society for Information Science, 41(4), 288-297. |
| Aliases | Pooling Method, Depth Pooling, TREC Pooling, Pooled Relevance Assessment | Rocchio Feedback Evaluation, Feedback Effectiveness Measurement, Residual Collection Evaluation, Relevance Feedback Assessment |
| Related | 3 | 3 |
| Summary≠ | Pooling is the technique that lets the Cranfield evaluation paradigm scale to collections of millions of documents, where judging every document for every topic is impossible. Developed and institutionalized at the US National Institute of Standards and Technology for the Text REtrieval Conference (TREC), pooling gathers the top-ranked documents returned by many participating systems for each topic, merges them into a single pool, has human assessors judge only that pool, and treats every unjudged document as non-relevant. The result is a reusable test collection — documents, topics, and pooled relevance judgments (qrels) — on which new systems can later be scored without further assessment. Pooling is what made large-scale, reproducible retrieval evaluation feasible. | Relevance feedback evaluation measures how much a retrieval system improves when it reformulates a query using user judgments on the first results. The technique that defined the field is Rocchio's vector-space feedback, in which documents the user marks relevant pull the query vector toward themselves and documents marked non-relevant push it away; Salton and Buckley's 1990 study systematized its evaluation and showed substantial effectiveness gains. The central methodological challenge is fairness: because the user has already seen and judged some documents, naively re-scoring the whole collection rewards the system for re-finding documents it was just told about. Residual-collection and frozen-rank evaluation solve this by measuring improvement only on documents the user has not yet seen. |
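
The two summaries describe procedures that are easy to misread in prose, so a small Python sketch may help. It assumes a single topic and uses hypothetical toy data throughout: the system names, document IDs, the pool depth of 2, and the helper names `build_pool`, `precision_at_k`, and `residual_ranking` are all illustrative and do not come from TREC or from Salton and Buckley. It shows depth-k pooling with unjudged documents counted as non-relevant, and a residual-collection comparison in which documents the user has already judged are removed before scoring.

```python
from typing import Dict, List, Set


def build_pool(runs: Dict[str, List[str]], depth: int) -> Set[str]:
    """Merge the top-`depth` documents of every run into one pool."""
    pool: Set[str] = set()
    for ranking in runs.values():
        pool.update(ranking[:depth])
    return pool


def precision_at_k(ranking: List[str], relevant: Set[str], k: int) -> float:
    """Precision at rank k. Any document not in `relevant`, including
    unjudged ones, counts as non-relevant (the pooling assumption)."""
    return sum(1 for doc in ranking[:k] if doc in relevant) / k


def residual_ranking(ranking: List[str], seen: Set[str]) -> List[str]:
    """Remove documents the user already judged, preserving rank order."""
    return [doc for doc in ranking if doc not in seen]


# Hypothetical runs from three systems for one topic.
runs = {
    "sysA": ["d1", "d2", "d3", "d4"],
    "sysB": ["d2", "d5", "d1", "d6"],
    "sysC": ["d7", "d1", "d8", "d2"],
}
pool = build_pool(runs, depth=2)  # only these documents get judged

# Assumed assessor decisions on the pool: d1 and d5 are relevant.
# d3 was never pooled, so it is treated as non-relevant.
relevant = {"d1", "d5"}

# Feedback scenario: the user judges the top 2 of the initial ranking,
# and the reformulated query produces a new ranking.
initial = ["d2", "d1", "d5", "d3", "d7"]
seen = set(initial[:2])
feedback = ["d1", "d5", "d3", "d7", "d2"]

# Naive scoring over the whole collection rewards re-finding d1.
naive_before = precision_at_k(initial, relevant, 2)
naive_after = precision_at_k(feedback, relevant, 2)

# Residual-collection scoring looks only at unseen documents.
residual_before = precision_at_k(residual_ranking(initial, seen), relevant, 2)
residual_after = precision_at_k(residual_ranking(feedback, seen), relevant, 2)

print(sorted(pool))
print(naive_before, naive_after)
print(residual_before, residual_after)
```

In this toy case the naive comparison rises from 0.5 to 1.0, while the residual comparison stays at 0.5 in both rankings: the apparent gain comes entirely from promoting a document the user had already marked relevant. Real evaluations would average over many topics and typically use measures other than precision at a fixed cutoff.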