
About

Conductor is a natural-language interface to the L-functions and Modular Forms Database (LMFDB), a comprehensive repository of mathematical objects arising in number theory and arithmetic geometry.

While the LMFDB web interface is excellent for looking up individual objects, querying and analysing data across the entire database often requires considerable familiarity with its extensive underlying schema and with programming in general.

Conductor is a research-grade tool which aims to lower this barrier through its natural-language interface. It can retrieve data from across the LMFDB and analyse it through plots or statistical methods, all from plain-English queries. For mathematicians with coding knowledge, it also attaches all of the code it generates, allowing for full transparency and rigour in verifying its outputs. We expect that this software will not only aid mathematicians without technical backgrounds but also help to accelerate active research in the growing field of AI-assisted mathematics.

Limitations

  • Conductor connects to devmirror.lmfdb.xyz, a PostgreSQL mirror of the LMFDB which may not have complete coverage of all tables. Moreover, the LMFDB itself is not exhaustive — some queries may return no results due to a genuine lack of data.
  • Because SQL generation is not infallible, the model may produce inaccurate results. It is good practice to double-check the underlying code produced by the model if you intend to rely on its output for research purposes.
  • Conductor is under active development. If you encounter an error or unexpected behaviour, please open an issue on GitHub.

Acknowledgements

Conductor was built by Ritik Jain. You can learn more about its architecture in the Architecture section below.

This work would not be possible without the LMFDB itself, which is the product of an enormous collective effort. A list of contributors is available here.

Miscellany

  • The name "Conductor" has a double meaning: it refers to both the notion of a conductor in number theory, and a conductor as a conduit of knowledge.
  • The pixelated background which appears on the home page and during thinking phases evolves according to the Game of Life, a cellular automaton invented by John Conway in 1970.

Architecture

Conductor's backend pipeline.

The backend of Conductor consists of a seven-stage FastAPI pipeline with error handling. We use Claude Haiku 4.5 for classification and Claude Sonnet 4.6 for everything else, including user interactions and more complex tasks.

  1. An intent classifier determines whether the incoming message is a mathematical query or a conversational message. Conversational messages receive a natural response and skip all subsequent stages.
  2. An LLM-as-judge assesses query precision before any database interaction. If the query is ambiguous, it asks a follow-up question. If the query is clear, it returns a refined restatement that is passed to all subsequent stages.
  3. A lightweight object resolution stage fires when the query references a concrete mathematical object and resolves it to a database identifier. If no concrete object is found, the query passes through unchanged.
  4. Our LLM maps the query to a list of relevant LMFDB table names using a two-layer hierarchical schema index (16 domains, 86 tables).
  5. Our LLM produces a validated SQL query using the tables identified in Stage 4. Correctness is enforced by using our preloaded schema as a ground truth.
  6. We run the SQL over a read-only SQLAlchemy connection with a 15-second timeout, returning a pandas DataFrame.
  7. (Optional) We translate a follow-up natural language instruction into Python. Plots and subsequent data analysis are captured in-memory and returned as base64-encoded PNGs alongside the generated code.
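
The control flow of Stages 1 to 6 can be sketched in a few lines of Python. This is only an illustrative sketch, not Conductor's actual code: the names Stages, Result, run_pipeline and SCHEMA_INDEX are hypothetical, the LLM and database calls are replaced by plain callables, and the two-entry schema index is a placeholder for the real 16-domain, 86-table index. It shows the two early exits (conversational messages and ambiguous queries) and a check of the selected tables against the preloaded schema before any SQL is generated. Stage 7 is optional and is not modelled here.

```python
from dataclasses import dataclass, field
from typing import Callable

# Placeholder two-layer index (domain -> tables). Assumption: the real index
# has the same shape but covers 16 domains and 86 tables.
SCHEMA_INDEX: dict[str, list[str]] = {
    "elliptic_curves": ["ec_curvedata"],
    "number_fields": ["nf_fields"],
}


@dataclass
class Stages:
    """Hypothetical callables standing in for the LLM and database stages."""
    classify: Callable[[str], str]                 # Stage 1: "chat" or "math"
    judge: Callable[[str], tuple[bool, str]]       # Stage 2: (is_clear, text)
    resolve: Callable[[str], str]                  # Stage 3: object resolution
    select_tables: Callable[[str], list[str]]      # Stage 4: table mapping
    generate_sql: Callable[[str, list[str]], str]  # Stage 5: SQL generation
    execute: Callable[[str], list[tuple]]          # Stage 6: read-only query


@dataclass
class Result:
    kind: str                      # "chat", "clarification", or "data"
    text: str = ""
    sql: str = ""
    rows: list = field(default_factory=list)


def run_pipeline(message: str, stages: Stages) -> Result:
    # Stage 1: conversational messages skip every later stage.
    if stages.classify(message) == "chat":
        return Result("chat", text="(conversational reply)")

    # Stage 2: an ambiguous query yields a follow-up question instead.
    is_clear, text = stages.judge(message)
    if not is_clear:
        return Result("clarification", text=text)

    # Stage 3: a query with no concrete object passes through unchanged.
    query = stages.resolve(text)

    # Stage 4: reject any table name that is absent from the schema index.
    tables = stages.select_tables(query)
    known = {t for names in SCHEMA_INDEX.values() for t in names}
    unknown = [t for t in tables if t not in known]
    if unknown:
        raise ValueError(f"tables not in schema index: {unknown}")

    # Stages 5 and 6: generate the SQL, then run it.
    sql = stages.generate_sql(query, tables)
    return Result("data", sql=sql, rows=stages.execute(sql))


# Stub stages for demonstration only; no model or database is contacted.
demo = Stages(
    classify=lambda m: "chat" if m.lower().startswith("hello") else "math",
    judge=lambda m: (True, m) if len(m.split()) > 3
    else (False, "Which objects do you mean?"),
    resolve=lambda q: q,
    select_tables=lambda q: ["ec_curvedata"],
    generate_sql=lambda q, ts: f"SELECT lmfdb_label FROM {ts[0]} WHERE rank = 2 LIMIT 5",
    execute=lambda sql: [],
)

result = run_pipeline("list five elliptic curves of rank 2", demo)
```

With these stubs, a greeting returns a Result of kind "chat", a one-word query returns a "clarification", and the full query above returns a "data" result carrying the generated SQL. Passing the stages in as callables keeps the control flow testable without a model or database connection.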