CORE Talk: From Text Mining to Biological Reasoning with LLMs
CORE Talk by Sidsel Boldsen, Novo Nordisk
Info about event
Time
Location
Jens Chr. Skous vej 4, bldg. 1481, room 366
Abstract:
Large language models have become valuable research tools — from summarising literature to generating hypotheses — yet they inherently struggle with long-tail knowledge: facts that appear rarely in training data may not be captured at all, or may be stored too unreliably to surface when needed. Retrieval-augmented generation offers a path forward by letting models access tools that inject relevant evidence into their context at inference time. However, effective retrieval is far from trivial: it often demands multi-hop reasoning, iterative refinement, and domain-specific background knowledge.
In this talk, I will present how we at Novo Nordisk have addressed these challenges to enable researchers to explore key questions in drug discovery and target validation. I will walk through the main components of a deep research framework for biological reasoning that we developed, in which language models navigate knowledge graphs derived from large-scale text mining of scientific literature. While benchmarking such report-style answers remains an open challenge, early adoption shows that researchers find the responses insightful, verifiable, and, most importantly, actionable, demonstrating the potential for deep research systems to support real-world research decisions.
Join via Zoom: