research trends

RAG

InsertRank: LLMs can reason over BM25 scores to Improve Listwise Reranking

Large Language Models (LLMs) have made significant strides across information retrieval tasks, particularly as rerankers, owing to the strong generalization and knowledge-transfer capabilities they acquire from extensive pretraining. In parallel, the rise of LLM-based chat interfaces has raised user expectations, encouraging users to pose more complex queries that require retrieval by "reasoning" over documents rather than simple keyword matching or semantic similarity. While some recent efforts have exploited the reasoning abilities of LLMs for reranking such queries, considerable room for improvement remains. In that regard, we introduce InsertRank, an LLM-based reranker that leverages lexical signals such as BM25 scores during reranking to further improve retrieval performance. InsertRank demonstrates improved retrieval effectiveness on two benchmarks: BRIGHT, a reasoning benchmark spanning 12 diverse domains, and R2MED, a specialized medical reasoning retrieval benchmark spanning 8 different tasks. We conduct an exhaustive evaluation and several ablation studies and demonstrate that InsertRank consistently improves retrieval effectiveness across multiple families of LLMs, including GPT, Gemini, and DeepSeek models. In addition, we conduct ablation studies on normalization, by varying the scale of the BM25 scores, and on positional bias, by shuffling the order of the documents. With DeepSeek-R1, InsertRank achieves a score of 37.5 on the BRIGHT benchmark and 51.1 on the R2MED benchmark, surpassing previous methods.
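
The abstract's core idea, exposing BM25 scores to the LLM inside a listwise reranking prompt, can be illustrated with a minimal sketch. This is not the authors' implementation or prompt: the function names `bm25_scores` and `build_listwise_prompt`, the whitespace tokenizer, the `k1` and `b` defaults, and the prompt wording are all assumptions made for illustration. The resulting prompt would be sent to an LLM of your choice, which is omitted here.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each document against the query with Okapi BM25.

    Assumption: naive lowercase whitespace tokenization; a real system
    would use a proper analyzer and a prebuilt index.
    """
    tokenized = [d.lower().split() for d in docs]
    n = len(tokenized)
    avgdl = sum(len(t) for t in tokenized) / n
    # Document frequency: number of documents containing each term.
    df = Counter(term for t in tokenized for term in set(t))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def build_listwise_prompt(query, docs, scores):
    """Build a listwise prompt that shows each passage with its BM25 score.

    Assumption: this prompt template is hypothetical, not the paper's.
    """
    lines = [f"Query: {query}", "", "Candidate passages (with BM25 scores):"]
    for i, (doc, s) in enumerate(zip(docs, scores), start=1):
        lines.append(f"[{i}] (BM25: {s:.2f}) {doc}")
    lines.append("")
    lines.append(
        "Rank the passages from most to least relevant to the query, "
        "using the BM25 scores as one signal. Answer as: [2] > [1] > [3]"
    )
    return "\n".join(lines)


docs = [
    "bm25 is a lexical ranking function",
    "cats sleep all day",
    "lexical signals help rerankers",
]
query = "lexical bm25"
prompt = build_listwise_prompt(query, docs, bm25_scores(query, docs))
print(prompt)
```

The ablations mentioned in the abstract could be approximated in this sketch by rescaling `scores` before building the prompt (normalization) or by shuffling `docs` and `scores` together (positional bias).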

Gallery

RAG

InsertRank: LLMs can reason over BM25 scores to Improve Listwise Reranking

LocationReasoner: Evaluating LLMs on Real-World Site Selection Reasoning

LogiPlan: A Structured Benchmark for Logical Planning and Relational Reasoning in LLMs

VersaVid-R1: A Versatile Video Understanding and Reasoning Model from Question Answering to Captioning Tasks

LlamaRec-LKG-RAG: A Single-Pass, Learnable Knowledge Graph-RAG Framework for LLM-Based Ranking

Let's CONFER: A Dataset for Evaluating Natural Language Inference Models on CONditional InFERence and Presupposition

Multi-Layer GRPO: Enhancing Reasoning and Self-Correction in Large Language Models

Stronger Baselines for Retrieval-Augmented Generation with Long-Context Language Models

EPiC: Towards Lossless Speedup for Reasoning Training through Edge-Preserving CoT Condensation

LLM

Serving Large Language Models on Huawei CloudMatrix384

Give Me FP32 or Give Me Death? Challenges and Solutions for Reproducible Reasoning

LEANN: A Low-Storage Vector Index

EdgeProfiler: A Fast Profiling Framework for Lightweight LLMs on Edge Using Analytical Model

EPiC: Towards Lossless Speedup for Reasoning Training through Edge-Preserving CoT Condensation

TL;DR: Too Long, Do Re-weighting for Efficient LLM Reasoning Compression

MINDSTORES: Memory-Informed Neural Decision Synthesis for Task-Oriented Reinforcement in Embodied Systems

CLIP

A Navigation Framework Utilizing Vision-Language Models

Efficient Medical Vision-Language Alignment Through Adapting Masked Vision Models

VersaVid-R1: A Versatile Video Understanding and Reasoning Model from Question Answering to Captioning Tasks

MAGNET: A Multi-agent Framework for Finding Audio-Visual Needles by Reasoning over Multi-Video Haystacks

Guiding Cross-Modal Representations with MLLM Priors via Preference Alignment

Experimental Evaluation of Static Image Sub-Region-Based Search Models Using CLIP

Zero Shot Composed Image Retrieval

From Play to Replay: Composed Video Retrieval for Temporally Fine-Grained Videos