TY - GEN
T1 - SCV
T2 - 31st International Conference on Computational Linguistics, COLING 2025
AU - Park, Cheoneum
AU - Jeong, Seohyeong
AU - Kim, Minsang
AU - Lim, Kyeongtae
AU - Lee, Yonghoon
N1 - Publisher Copyright:
©2025 Association for Computational Linguistics.
PY - 2025
Y1 - 2025
N2 - Recent advances in language models (LMs) have driven progress in information retrieval (IR), effectively extracting semantically relevant information. However, they face challenges in balancing computational costs with deeper query-document interactions. To tackle this, we present two mechanisms: 1) a light and effective multi-vector retrieval with sequence compression vectors, dubbed SCV, and 2) coarse-to-fine vector search. The strengths of SCV stem from its application of span compressive vectors for scoring. By employing a non-linear operation to examine every token in the document, we abstract these into a span-level representation. These vectors effectively reduce the document’s dimensional representation, enabling the model to engage comprehensively with tokens across the entire collection of documents, rather than the subset retrieved by Approximate Nearest Neighbor. Therefore, our framework performs a coarse single-vector search during the inference stage and conducts a fine-grained multi-vector search end-to-end. This approach effectively reduces the cost required for search. We empirically show that SCV achieves the fastest latency compared to other state-of-the-art models and can obtain competitive performance on both in-domain and out-of-domain benchmark datasets.
AB - Recent advances in language models (LMs) have driven progress in information retrieval (IR), effectively extracting semantically relevant information. However, they face challenges in balancing computational costs with deeper query-document interactions. To tackle this, we present two mechanisms: 1) a light and effective multi-vector retrieval with sequence compression vectors, dubbed SCV, and 2) coarse-to-fine vector search. The strengths of SCV stem from its application of span compressive vectors for scoring. By employing a non-linear operation to examine every token in the document, we abstract these into a span-level representation. These vectors effectively reduce the document’s dimensional representation, enabling the model to engage comprehensively with tokens across the entire collection of documents, rather than the subset retrieved by Approximate Nearest Neighbor. Therefore, our framework performs a coarse single-vector search during the inference stage and conducts a fine-grained multi-vector search end-to-end. This approach effectively reduces the cost required for search. We empirically show that SCV achieves the fastest latency compared to other state-of-the-art models and can obtain competitive performance on both in-domain and out-of-domain benchmark datasets.
UR - http://www.scopus.com/inward/record.url?scp=105000217429&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:105000217429
T3 - Proceedings - International Conference on Computational Linguistics, COLING
SP - 760
EP - 770
BT - Industry Track
A2 - Rambow, Owen
A2 - Wanner, Leo
A2 - Apidianaki, Marianna
A2 - Al-Khalifa, Hend
A2 - Di Eugenio, Barbara
A2 - Schockaert, Steven
A2 - Darwish, Kareem
A2 - Agarwal, Apoorv
PB - Association for Computational Linguistics (ACL)
Y2 - 19 January 2025 through 24 January 2025
ER -