Method Comparison
Review the selected methods side by side. Rows that differ are marked with ≠.
| | Mean Average Precision (MAP) | BM25 Probabilistic Ranking (Okapi) |
|---|---|---|
| Field | Bibliometrics | Bibliometrics |
| Family | Process / pipeline | Process / pipeline |
| Year of origin ≠ | 2000 | 2009 |
| Originators ≠ | TREC / information-retrieval evaluation community; Chris Buckley & Ellen Voorhees (stability analysis) | Stephen Robertson; Karen Spärck Jones; Hugo Zaragoza (Okapi team, City University London) |
| Type ≠ | Binary-relevance ranked-retrieval evaluation pipeline | Probabilistic term-weighting and document-scoring pipeline for ranked retrieval |
| Primary source ≠ | Buckley, C., & Voorhees, E. M. (2000). Evaluating evaluation measure stability. In Proceedings of the 23rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '00), 33-40. | Robertson, S., & Zaragoza, H. (2009). The Probabilistic Relevance Framework: BM25 and Beyond. Foundations and Trends in Information Retrieval, 3(4), 333-389. |
| Aliases | MAP, Average Precision, AP, Mean AP | Okapi BM25, Best Matching 25, Probabilistic Relevance Ranking, BM25 Term Weighting |
| Related | 3 | 3 |
| Summary ≠ | Mean Average Precision (MAP) is the classic single-number summary of ranked-retrieval effectiveness under binary relevance and the headline metric of the TREC ad hoc retrieval tracks. For a single query, average precision (AP) computes the precision of the result list at each rank where a relevant document appears and averages those values, rewarding systems that rank all relevant documents highly; MAP is then the mean of AP across a set of queries. Buckley and Voorhees's 2000 SIGIR analysis of evaluation-measure stability showed that average precision is among the most stable and discriminating IR measures, requiring fewer queries than alternatives like precision at a fixed cutoff to reliably tell two systems apart. MAP remains a standard reporting metric for ranked retrieval, complementing graded-relevance measures such as nDCG. | BM25, the Okapi 'Best Matching 25' function, is the dominant classical ranking function in information retrieval and the workhorse term-weighting scheme behind most lexical search engines and bibliographic databases. Developed by Stephen Robertson, Karen Spärck Jones and colleagues at City University London and formalized in Robertson and Zaragoza's 2009 monograph on the Probabilistic Relevance Framework, BM25 scores a document against a query as a sum, over query terms, of inverse-document-frequency weights multiplied by a saturating, length-normalized transform of within-document term frequency. Two free parameters control how quickly repeated terms stop adding evidence (k1) and how strongly document length is penalized (b). BM25 consistently outperformed plain TF-IDF in the TREC evaluations and remains the standard first-stage retrieval baseline against which modern neural rankers are measured. |
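The two summaries above describe one evaluation measure (AP/MAP) and one scoring function (BM25). The Python sketch below illustrates both under stated assumptions and is not taken from either cited source. The function names, the whitespace-free token lists, the defaults `k1=1.2` and `b=0.75`, and the non-negative IDF variant `ln(1 + (N - n + 0.5) / (n + 0.5))` are all illustrative choices. It also assumes a non-empty document collection and, unless `num_relevant` is passed, that every relevant document appears in the ranked list.

```python
import math
from collections import Counter


def average_precision(relevance, num_relevant=None):
    """AP for one query; `relevance` is a rank-ordered list of 0/1 labels."""
    if num_relevant is None:
        # Assumption: every relevant document was retrieved.
        num_relevant = sum(relevance)
    if num_relevant == 0:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant rank
    return precision_sum / num_relevant


def mean_average_precision(runs):
    """MAP: the mean of AP over a list of per-query relevance lists."""
    return sum(average_precision(r) for r in runs) / len(runs)


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized document in `docs` against the tokenized `query`.

    k1 controls term-frequency saturation; b controls length normalization.
    The defaults are common illustrative values, not values from the source.
    """
    n_docs = len(docs)
    avgdl = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        norm = k1 * (1 - b + b * len(d) / avgdl)
        score = 0.0
        for term in query:
            if tf[term] == 0:
                continue
            # Assumed IDF variant that stays non-negative for common terms.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            score += idf * tf[term] * (k1 + 1) / (tf[term] + norm)
        scores.append(score)
    return scores
```

As a worked check, the ranked list `[1, 0, 1, 0]` has relevant documents at ranks 1 and 3, so its AP is (1/1 + 2/3) / 2 ≈ 0.833. For BM25, setting `b=0.0` switches off length normalization, which makes the saturation visible: a document containing a query term four times scores higher than one containing it once, but by much less than a factor of four.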