In this second episode of the Neural Information Retrieval Talks podcast, Andrew Yates and Sergi Castella discuss the paper “The Curse of Dense Low-Dimensional Information Retrieval for Large Index Sizes”.
Paper:
This paper puts the spotlight on the popular IR benchmark MS MARCO and investigates whether modern neural retrieval models retrieve documents that are even more relevant than the original top relevance annotations. The results have important implications and raise the question of to what degree this benchmark is still an informative north star to follow.
Contact: castella@
Timestamps:
00:00 Co-host introduction
00:26 Paper introduction
02:18 Dense vs. Sparse retrieval
05:46 Theoretical analysis of false positives (1)
08:17 What are low- vs. high-dimensional representations
11:49 Theoretical analysis of false positives (2)
20:10 First results: growing the MS MARCO index
28:35 Adding r...