This article details the evaluation of OpenSearch's agentic search capabilities, which use LLMs to enable natural language interaction with data. Testing focused on two key areas: **search relevance** (how well relevant documents are ranked) and **execution accuracy** (correctness of the generated search queries). Results were measured using standard benchmarks such as BEIR and BRIGHT, comparing traditional search methods with agentic search variants whose prompting strategies are tailored to the strengths of lexical and neural retrieval techniques.