CORE Talk: From Text Mining to Biological Reasoning with LLMs

CORE Talk by Sidsel Boldsen, Novo Nordisk

Info about event

Time

Thursday 23 April 2026, 10:00 - 11:00

Location

Jens Chr. Skous vej 4, bldg. 1481, room 366

Abstract: 

Large language models have become valuable research tools — from summarising literature to generating hypotheses — yet they inherently struggle with long-tail knowledge: facts that appear rarely in training data may not be captured at all, or may be stored too unreliably to surface when needed. Retrieval-augmented generation offers a path forward by letting models access tools that inject relevant evidence into their context at inference time. However, effective retrieval is far from trivial: it often demands multi-hop reasoning, iterative refinement, and domain-specific background knowledge.
In this talk, I will present how we at Novo Nordisk have addressed these challenges to enable researchers to explore key questions in drug discovery and target validation. I will walk through the main components of a deep research framework for biological reasoning that we developed, in which language models navigate knowledge graphs derived from large-scale text mining of the scientific literature. While benchmarking such report-style answers remains an open challenge, early adoption shows that researchers find the responses insightful, verifiable, and, most importantly, actionable, demonstrating the potential of deep research systems to support real-world research decisions.

Join via Zoom: