Getting Started

From zero to your first completed research run. No prior experience with agents required.

Table of contents

  1. Before you begin
  2. Installation
  3. Start the app
  4. Your first research run
    1. Step 1 — Start the interview
    2. Step 2 — Review the workflow plan
    3. Step 3 — Watch the swarm
    4. Step 4 — Read the results
    5. Step 5 — Run it again
  5. Bring your own models
  6. Understanding what’s stored
  7. Common questions
  8. What to investigate first

Before you begin

What you need:

Requirement Details
Python 3.11+ Check with python3 --version
Git For cloning and the run audit trail
Gemini API key Free tier works for exploration
Perplexity API key Powers live web search

Perplexity is optional if your research doesn’t require live web search. The system will still run — specialists will use web browsing and code execution instead.

What you do not need:

  • Docker
  • A database
  • An account with any platform
  • Anything beyond the two API keys above

Installation

git clone https://github.com/ai-agents-for-humans/slow-ai
cd slow-ai
bash install.sh

The install script does four things in sequence:

  1. Checks your environment — confirms Python 3.11+ is available
  2. Installs uv — the fast Python package manager that powers sandboxed execution
  3. Installs dependencies — uv sync pulls everything from pyproject.toml
  4. Configures your API keys — prompts for Gemini and Perplexity keys, writes them to .env and your shell profile

Your keys are stored locally and never leave your machine.
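Once the script finishes, you can sanity-check the result. The key names grepped for below are assumptions — open your .env to see exactly what install.sh wrote:

```shell
# Confirm the interpreter version and that keys landed in .env.
python3 --version                                # should report 3.11 or newer
grep -E 'GEMINI|PERPLEXITY' .env 2>/dev/null \
  || echo "no keys found in .env yet"
```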

When it finishes you will see:

  ╔══════════════════════════════════════════════════════════╗
  ║                     you're ready                        ║
  ╚══════════════════════════════════════════════════════════╝

  Load keys in this terminal
    source ~/.zshrc   (or ~/.bashrc depending on your shell)

  Run the app
    PYTHONPATH=. uv run uvicorn app.main:app --reload

Start the app

PYTHONPATH=. uv run uvicorn app.main:app --reload

This starts the web interface at http://localhost:8000. Keep the terminal running while you work — the app uses server-sent events to stream live updates to your browser.


Your first research run

Step 1 — Start the interview

Click New in the sidebar. The interview begins immediately.

The interviewer agent asks you questions one at a time to understand your research goal. Answer naturally — as if you were briefing a colleague who needs to understand exactly what you need and why.

What the interview is doing:

  • Identifying the domain and goal
  • Clarifying scope (geography, timeframe, perspective)
  • Surfacing assumptions you may not have articulated
  • Producing a structured brief precise enough to plan against

What good interview answers look like:

  • Be specific about why you need this — the purpose shapes what evidence matters
  • Don’t worry about framing it perfectly — the agent will push back if something is vague
  • Answer one question at a time; don’t pre-answer questions that haven’t been asked

When the agent has enough, it presents a structured Problem Brief directly in the conversation — goal, domain, constraints, unknowns, and success criteria laid out clearly. Read it carefully. If something is missing or wrong, type your correction and the agent will revise. When you’re satisfied, click Confirm Brief →.

The brief is the cornerstone. Everything that follows — the context graph, the agent assignments, the synthesis — is built on top of it. A five-minute interview is the highest-leverage thing the system does.


Step 2 — Review the workflow plan

After confirming your brief, the planner generates a workflow plan: a structured breakdown of the research question into phases and parallel work items, rendered as an interactive graph.

You will see:

  • A visual graph — phases as dark nodes, work items as white nodes connected by dependency edges
  • A narrative summary on the right explaining the logic of the breakdown
  • A chat panel for refinement

Read the narrative first. It explains why the graph is structured the way it is. If the narrative doesn’t match your mental model of the work, that is the signal to refine.

Refining the plan through conversation:

You don’t edit the graph directly — you tell the system what you want changed in plain language. Type in the “Suggest a change…” panel and the graph regenerates with a new narrative.

  "The regulatory phase should come before the market sizing —
   reimbursement rules will constrain what's commercially viable."

  "Add a phase on competitive dynamics — I need to understand
   who the incumbents are before we assess entry strategy."

  "Remove the pricing phase entirely — the client already has this data."

Repeat until the shape of the work matches how an expert in your domain would approach it.

You are the expert. The system proposes, you decide. The graph is not locked until you launch — take the time to get the direction right. A well-shaped graph produces dramatically better results than a vague one.

When you’re satisfied: click Launch Agent Swarm → at the bottom.

The direction matters more than the detail. You are approving the shape of the investigation, not the implementation. Your job is to confirm the phases make sense and nothing critical is missing.


Step 3 — Watch the swarm

After launch, the approved plan drives a swarm of specialist agents running in parallel. The Run tab shows the agent DAG filling in as agents complete their work.

What you’re seeing:

Node colour Meaning
Grey Waiting — dependency not yet satisfied
Blue Running — agent is actively working
Green Complete — evidence envelope produced
Red Failed or partial — agent flagged a gap

The Log strip at the bottom streams every agent action as it happens — phase launches, specialist completions, synthesis steps, confidence scores. Click any completed node to open the evidence envelope on the right — what the agent found, its confidence score, the sources it cited, and what it couldn’t determine.

You do not need to watch it run. You can close the browser and come back later. The run continues in the background and the state is always written to disk.


Step 4 — Read the results

When the run completes, the Results tab opens automatically. Everything is in one place.

The Final Research Report — at the very top of the Results tab, the system synthesises all agent findings into a single long-form document: executive summary, thematic findings, open questions, limitations, recommendations, and an inline source list. Every claim is cited back to the agent that produced it.

Export the report — click ↓ Export HTML to download a self-contained, standalone HTML file. No external dependencies, print-ready, shareable without the app running.

Below the report: the full phase tree. Expanding a phase shows the synthesis for that phase, then each agent card with its full findings and confidence score. Each agent card has a collapsible Tool Calls section — every search query, web browse, or code execution, colour-coded by source. Agents that ran code show an Artefacts section — click any chip to open the file in a full-screen viewer with syntax highlighting.

Ask about the run — the chat bar at the bottom is always available. Ask follow-up questions grounded in the run’s evidence, or instruct the agent to rewrite or expand a section of the report. The document updates in place.

The Run Summary panel on the right shows confidence per phase and total tokens.


Step 5 — Run it again

The run doesn’t have to end here. Click Run again on the project in the sidebar.

The system reuses the approved brief and context graph from the original run, launches a fresh swarm, and produces a new report. The next run’s specialists can pull specific evidence from prior runs when their work item needs it — the understanding compounds.

  Run 1  →  Market landscape. Gaps: regulatory detail, pricing pressure.
  Run 2  →  Fresh swarm, same brief. Specialists build on Run 1 evidence.
  Run 3  →  Competitive synthesis. Uses Runs 1 + 2. Strategic recommendation.

Each run starts smarter than the last. This is the loop that turns a one-off investigation into a compounding body of knowledge.


Bring your own models

Out of the box, Slow AI uses Gemini models. To swap any model slot, edit src/slow_ai/llm/registry.json:

{
  "models": [
    {
      "name": "reasoning",
      "model_id": "google-gla:gemini-3.1-pro",
      "use_for": ["context_planning", "orchestration", "assessment"]
    },
    {
      "name": "fast",
      "model_id": "google-gla:gemini-3.1-flash",
      "use_for": ["skill_synthesis", "report_synthesis", "interview"]
    },
    {
      "name": "code",
      "model_id": "google-gla:gemini-3.1-pro",
      "use_for": ["code_generation"]
    },
    {
      "name": "specialist",
      "model_id": "google-gla:gemini-3.1-pro",
      "use_for": ["specialist_research"]
    }
  ]
}

Change model_id to any supported provider. No code changes needed.

Supported providers:

Provider Example model_id
Google google-gla:gemini-3.1-pro
OpenAI openai:gpt-4o
Anthropic anthropic:claude-opus-4-6
Ollama (local) ollama:qwen2.5-coder:7b
vLLM / LM Studio openai:model-name with custom base URL

For regulated environments: point every slot at a local Ollama instance. No data leaves your infrastructure. The agents do not know or care which provider they use.
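A fully local registry might look like this — same structure as above, with every slot pointed at Ollama. The qwen2.5 model names are illustrative; use whatever you have pulled locally:

```json
{
  "models": [
    {
      "name": "reasoning",
      "model_id": "ollama:qwen2.5:14b",
      "use_for": ["context_planning", "orchestration", "assessment"]
    },
    {
      "name": "fast",
      "model_id": "ollama:qwen2.5:7b",
      "use_for": ["skill_synthesis", "report_synthesis", "interview"]
    },
    {
      "name": "code",
      "model_id": "ollama:qwen2.5-coder:7b",
      "use_for": ["code_generation"]
    },
    {
      "name": "specialist",
      "model_id": "ollama:qwen2.5:14b",
      "use_for": ["specialist_research"]
    }
  ]
}
```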


Understanding what’s stored

Every run creates a directory under runs/:

runs/
  {run_id}/
    input_brief.json      ← the brief that started the run
    input_graph.json      ← the approved context graph
    envelopes/            ← one JSON file per specialist agent
    artefacts/            ← generated code, datasets, parsed documents
    live/                 ← real-time state files read by the UI
    conversation.jsonl    ← full history: interview, review, post-run chat
    runner.log            ← structured run log

Projects are stored under output/:

output/
  {project_id}/
    problem_brief.json
    runs.jsonl            ← index of all runs for this project

Everything is plain files. Nothing requires a running service to read. Open any envelope JSON to see exactly what an agent produced.
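Because everything is plain JSON, a few lines of Python are enough to summarise a run. A minimal sketch — the field names "agent" and "confidence" are assumptions here; open one of your own envelope files to see the actual schema:

```python
import json
from pathlib import Path

def summarise_envelopes(run_dir):
    """Return (agent, confidence) pairs for every envelope in a run.

    Field names ("agent", "confidence") are assumptions; adapt them to
    the actual envelope schema in your runs/ directory.
    """
    rows = []
    for path in sorted(Path(run_dir, "envelopes").glob("*.json")):
        data = json.loads(path.read_text())
        rows.append((data.get("agent", path.stem), data.get("confidence")))
    return rows

if __name__ == "__main__":
    for agent, confidence in summarise_envelopes("runs/example"):
        print(f"{agent}: {confidence}")
```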


Common questions

The run is taking a long time — is it stuck?

Check runs/{run_id}/runner.log for the most recent log entry. If agents are still writing to their envelopes, the run is active. Some research questions — especially those requiring document parsing or dataset analysis — take 10–20 minutes per phase.
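If you don't want to hunt for the run_id by hand, a small sketch that tails the most recently modified runner.log under runs/:

```python
from pathlib import Path

def latest_log_tail(root="runs", lines=10):
    """Return the last `lines` lines of the newest runner.log under root."""
    logs = sorted(Path(root).glob("*/runner.log"),
                  key=lambda p: p.stat().st_mtime)
    if not logs:
        return []
    return logs[-1].read_text().splitlines()[-lines:]

if __name__ == "__main__":
    print("\n".join(latest_log_tail()))
```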

An agent failed — what now?

Red nodes in the DAG indicate a failed or partial agent. Click the node to see the evidence envelope — it will include an error description and what the agent attempted. The run continues with whatever work items were not blocked by that failure.

I want to add a new tool or API integration.

Tools are added to src/slow_ai/tools/. A tool is a Python function decorated with the pydantic-ai tool pattern. Once added, create a corresponding entry in src/slow_ai/skills/catalog/{skill_name}/SKILL.md with YAML frontmatter (name, description, tools, tags) and a full playbook body (when to use, how to execute, output contract, quality bar). The skill catalog is what the context planner draws from when assigning skills to work items.
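A minimal SKILL.md sketch using the frontmatter fields named above — the skill name, tool name, and playbook content are placeholders, not part of the shipped catalog:

```markdown
---
name: patent_search
description: Search patent databases for prior art relevant to a work item.
tools: [search_patents]
tags: [ip, legal, search]
---

When to use: work items that ask about prior art, patent landscapes,
or freedom to operate.

How to execute: formulate 3-5 targeted queries, run search_patents for
each, and deduplicate results by patent number.

Output contract: a list of patents with number, title, filing date, and
a one-sentence relevance note.

Quality bar: every cited patent must be verifiable; flag queries that
returned nothing rather than padding the list.
```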

Can I run it without the web UI?

Yes. The execution plane runs independently:

uv run python -m slow_ai.research --brief path/to/brief.json --graph path/to/graph.json

The UI reads the files the runner writes. Any other reader — a script, a different UI, a CLI — works the same way.
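For example, a sketch of a reader that lists a project's runs from the runs.jsonl index — it assumes one JSON object per line, as the .jsonl extension suggests; the fields inside each object depend on the actual index schema:

```python
import json
from pathlib import Path

def list_runs(project_dir):
    """Return one dict per recorded run from a project's runs.jsonl index."""
    index = Path(project_dir) / "runs.jsonl"
    if not index.exists():
        return []
    return [json.loads(line)
            for line in index.read_text().splitlines()
            if line.strip()]
```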


What to investigate first

If you’re not sure where to start, here are briefs that work well as first runs:

  • A market or competitive question in your professional domain
  • A technology evaluation — comparing frameworks, libraries, or approaches
  • A regulatory or policy question in a domain you know
  • A medical or scientific question you’ve been meaning to research properly

The interview will help you get the brief right regardless of where you start. The more specific your problem, the sharper the context graph — and the more useful the output.

To read how it works in depth, explore the architecture documentation.


Systems that learn. Agents that work. Humans who stay in control.