How it works

Unlike the simpler /answer endpoint, which does a single vector search, the agent runs a two-phase loop:

Phase 1 — Tool loop (up to 4 turns)

The LLM decides autonomously which tools to call and in what order. Every tool call emits an agent_step SSE event so your UI can show live progress. Example reasoning chain for “AWS Revenue Q4 2025”:
  1. news_search → finds Amazon reported Q4 2025 earnings on Feb 5, 2026
  2. search_filings → locates the 10-K filed 2026-02-07
  3. retrieve_from_filing → semantic search inside that specific 10-K for AWS revenue

Phase 2 — Streaming answer

After all tools complete, one streaming LLM call synthesises everything into a cited Markdown answer. Events arrive in order: sources → token × N → conversation_state → done.
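
A client can sanity-check a finished stream against the order described above. The following is a minimal sketch, assuming that all agent_step events precede the Phase 2 events and that a successful stream contains no error event; the helper name is_expected_order is illustrative and not part of the API.

```python
import re

# Single-letter codes for the event types named in this section.
_CODES = {
    "agent_step": "a",
    "sources": "s",
    "token": "t",
    "conversation_state": "c",
    "done": "d",
}

# Phase 1: any number of agent_step events.
# Phase 2: sources, then tokens, then conversation_state, then done.
_ORDER = re.compile(r"a*st*cd")


def is_expected_order(event_types):
    """Return True if the list of event types follows the documented order."""
    try:
        encoded = "".join(_CODES[t] for t in event_types)
    except KeyError:
        return False  # unknown type for this check, e.g. "error"
    return _ORDER.fullmatch(encoded) is not None


ok = is_expected_order(
    ["agent_step", "agent_step", "sources", "token", "token",
     "conversation_state", "done"]
)
```

Collecting event["type"] values while streaming and running this check once at the end is one way to catch truncated or reordered streams in tests.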

Temporal reasoning

This is the core advantage over /answer. Pure KNN vector search cannot tell “Q4 2025 10-K” from “Q3 2025 10-Q” — the text is nearly identical. The agent solves this by:
  1. Using news_search to find the actual earnings report date
  2. Using search_filings with a precise after_date to locate the right filing
  3. Using retrieve_from_filing scoped to that one accession number
Query:  "AWS Revenue Q4 2025"

Agent:  news_search("Amazon Q4 2025 earnings date")
          → Amazon reported Feb 5, 2026

        search_filings(cik="0001018724", form_types=["10-K"], after_date="2026-01-01")
          → 10-K filed 2026-02-07 | Accession: 0001018724-26-000027

        retrieve_from_filing(accession="0001018724-26-000027", query="AWS cloud revenue Q4 2025")
          → [S1] AWS Segment Results: net sales $28.8B in Q4 2025

Answer: "Amazon Web Services generated **$28.8 billion** in Q4 2025 [S1]…"
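
The effect of the date filter can be illustrated locally. This is a conceptual sketch only, not the server's implementation: the record layout and the select_filings helper are hypothetical, the 10-Q entry is invented for contrast, and treating after_date as inclusive is an assumption.

```python
from datetime import date

# Illustrative records only. The 10-K entry mirrors the example above;
# the 10-Q entry (date and accession) is made up to show the contrast.
filings = [
    {"form": "10-Q", "filed": date(2025, 10, 31), "accession": "EXAMPLE-10Q"},
    {"form": "10-K", "filed": date(2026, 2, 7), "accession": "0001018724-26-000027"},
]


def select_filings(filings, form_types, after_date):
    """Keep filings of the requested form types filed on or after after_date."""
    return [
        f for f in filings
        if f["form"] in form_types and f["filed"] >= after_date
    ]


matches = select_filings(filings, ["10-K"], date(2026, 1, 1))
print(matches[0]["accession"])
```

Filtering on structured metadata (form type and filing date) removes the near-identical Q3 text before any semantic search runs, which is the step a pure vector search lacks.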

Conversation history

The agent supports stateless (tenant-owned) chat persistence. Your system stores messages; you replay them on every request. The server never stores end-user messages.
| Field | Type | Description |
| --- | --- | --- |
| chat_id | string \| null | Opaque ID echoed back in conversation_state. Use to route the updated history to the right record. |
| conversation_history | ConversationTurn[] \| null | Prior user/assistant turns injected as context before the current query. Supersedes session_id when both are provided. |
{
  "query": "What about gross margin?",
  "chat_id": "chat_user-42_abc123",
  "conversation_history": [
    { "role": "user",      "content": "What was AWS revenue in Q4 2025?" },
    { "role": "assistant", "content": "AWS generated $28.8 billion in Q4 2025 [S1]." }
  ]
}
At the end of every stream, a conversation_state event is emitted containing all user-facing turns (including the current one). Persist this as your new history for the next request.
{
  "type": "conversation_state",
  "chat_id": "chat_user-42_abc123",
  "messages": [
    { "role": "user",      "content": "What was AWS revenue in Q4 2025?" },
    { "role": "assistant", "content": "AWS generated $28.8 billion in Q4 2025 [S1]." },
    { "role": "user",      "content": "What about gross margin?" },
    { "role": "assistant", "content": "Amazon's consolidated gross margin for Q4 2025 was approximately 48% [S2]." }
  ]
}
conversation_history and session_id are independent mechanisms. conversation_history is for tenant-managed cross-session persistence (Model B). session_id is a server-side Valkey cache with a 24-hour TTL, useful for within-session continuity without replaying history. If both are provided, conversation_history takes precedence.
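
The replay-and-persist cycle described above might look like the following on the tenant side. This is a minimal sketch, assuming an in-memory dict as a stand-in for your database; chat_store, build_request, and handle_conversation_state are hypothetical names, not part of the API.

```python
# In-memory stand-in for your own database (hypothetical).
chat_store = {}


def build_request(query, chat_id):
    """Build a request body that replays the stored history for chat_id."""
    return {
        "query": query,
        "chat_id": chat_id,
        "conversation_history": chat_store.get(chat_id, []),
    }


def handle_conversation_state(event):
    """Replace the stored history with the messages from conversation_state."""
    if event.get("type") != "conversation_state":
        return
    chat_store[event["chat_id"]] = list(event["messages"])


# First turn: nothing stored yet, so the history is empty.
first = build_request("What was AWS revenue in Q4 2025?", "chat_user-42_abc123")

# Simulate the conversation_state event that ends the first stream.
handle_conversation_state({
    "type": "conversation_state",
    "chat_id": "chat_user-42_abc123",
    "messages": [
        {"role": "user", "content": "What was AWS revenue in Q4 2025?"},
        {"role": "assistant", "content": "AWS generated $28.8 billion in Q4 2025 [S1]."},
    ],
})

# Second turn: the stored messages are replayed as conversation_history.
second = build_request("What about gross margin?", "chat_user-42_abc123")
```

Overwriting the stored history with the messages array, rather than appending to it, keeps the tenant's copy identical to what the server last emitted.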

SSE stream — code examples

Connect with the header Accept: text/event-stream. Each event arrives as a single line of the form data: <JSON>, followed by a blank line.
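import json

import httpx  # third-party: pip install httpx


def save_to_your_db(chat_id, messages):
    """Placeholder: replace with your own persistence layer."""


# Load your stored history (empty for first turn)
history = []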

with httpx.stream(
    "POST",
    "https://api.ivory.finance/v1/rag/answer/agent",
    headers={"X-API-Key": "YOUR_KEY", "Accept": "text/event-stream"},
    json={
        "query": "What was AWS revenue in Q4 2025?",
        "cik": "0001018724",
        "chat_id": "chat_user-42_abc123",
        "conversation_history": history,  # [] on first turn
    },
    timeout=60,
) as r:
    r.raise_for_status()  # fail fast on HTTP errors before reading the stream
    for line in r.iter_lines():
        if not line.startswith("data: "):
            continue
        event = json.loads(line[6:])
        match event["type"]:
            case "agent_step":
                print(f"[{event['status']}] {event['tool']}")
            case "sources":
                print(f"Sources: {len(event['sources'])} SEC, {len(event['web_sources'])} web")
            case "token":
                print(event["token"], end="", flush=True)
            case "conversation_state":
                # Persist this as the history for the next request
                history = event["messages"]
                save_to_your_db(event["chat_id"], history)
            case "done":
                print("\n\nComplete.")
                break
            case "error":
                print(f"Error: {event['detail']}")
                break
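
The line-parsing step can be exercised without a network connection. The sketch below uses the same data: prefix convention as the example above; the iter_events helper and the sample lines are illustrative.

```python
import json


def iter_events(lines):
    """Yield decoded JSON events from an iterable of SSE text lines."""
    prefix = "data: "
    for line in lines:
        if not line.startswith(prefix):
            continue  # skip blank separators and any non-data lines
        yield json.loads(line[len(prefix):])


# Hand-written sample stream: two events separated by blank lines.
sample = [
    'data: {"type": "token", "token": "AWS"}',
    "",
    'data: {"type": "done"}',
    "",
]
events = list(iter_events(sample))
```

Keeping the parser separate from the HTTP call makes it straightforward to unit-test event handling against recorded streams.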

Multi-tenant configuration

Pass an optional config object to choose any supported LLM or connect the agent to your own database. All fields are optional — omit config entirely to use server defaults (GPT-4o + server Postgres).
{
  "query": "What is our portfolio exposure to NVDA risk factors?",
  "config": {
    "llm_provider": "anthropic",
    "llm_model": "claude-opus-4-6",
    "db_connection_string": "postgresql://user:pass@your-db.com:5432/portfolio"
  }
}

LLM Providers

OpenAI (GPT-4o) or Anthropic (Claude). Switch per request with no redeployment.

Database Connections

Connect PostgreSQL, MySQL, MariaDB, MongoDB, or Oracle. The agent queries your data alongside SEC filings.
See the Agent Configuration guide for API key setup, connection string formats, read-only user instructions, and security best practices.
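
One way to assemble the config object is to include only the fields you want to override and to read the connection string from the environment instead of hard-coding credentials. This is a sketch under stated assumptions: build_config and the PORTFOLIO_DB_URL variable name are hypothetical, and only the three field names shown in the example above are used.

```python
import os


def build_config(provider=None, model=None, db_env_var="PORTFOLIO_DB_URL"):
    """Return a config dict containing only the fields that are set."""
    config = {}
    if provider:
        config["llm_provider"] = provider
    if model:
        config["llm_model"] = model
    # Hypothetical environment variable holding the connection string.
    db_url = os.environ.get(db_env_var)
    if db_url:
        config["db_connection_string"] = db_url
    return config


payload = {"query": "What is our portfolio exposure to NVDA risk factors?"}
config = build_config(provider="anthropic", model="claude-opus-4-6")
if config:  # leave config out entirely to fall back to server defaults
    payload["config"] = config
```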

Citation scheme

| Bracket | Source | List in response |
| --- | --- | --- |
| [S1], [S2], … | SEC filing chunks from retrieve_from_filing | sources[] |
| [W1], [W2], … | Web results from web_search | web_sources[] (source_type: "web") |
| [N1], [N2], … | News results from news_search | web_sources[] (source_type: "news") |
Numbering is sequential across all tool calls: for example, [S1] and [S2] may come from two retrieve_from_filing calls on different filings, and both appear in sources[] in that order.
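
To render citations as links, a client can map each marker in the answer text back to its entry. The sketch below assumes that the n-th [S] marker refers to the n-th element of sources[], and that the n-th [W] or [N] marker refers to the n-th web_sources[] entry with the matching source_type; that indexing rule, the resolve_citations helper, and the title field in the sample data are assumptions rather than documented behaviour.

```python
import re

CITATION = re.compile(r"\[([SWN])(\d+)\]")


def resolve_citations(answer, sources, web_sources):
    """Map each citation marker found in answer to its source entry."""
    pools = {
        "S": sources,
        "W": [w for w in web_sources if w.get("source_type") == "web"],
        "N": [w for w in web_sources if w.get("source_type") == "news"],
    }
    resolved = {}
    for kind, number in CITATION.findall(answer):
        index = int(number) - 1  # markers are 1-based
        pool = pools[kind]
        if 0 <= index < len(pool):  # ignore markers with no matching entry
            resolved[f"[{kind}{number}]"] = pool[index]
    return resolved


# Sample data with a hypothetical "title" field.
sources = [{"title": "10-K"}]
web_sources = [
    {"source_type": "news", "title": "Earnings coverage"},
    {"source_type": "web", "title": "Investor relations page"},
]
resolved = resolve_citations(
    "Revenue rose [S1], as reported [N1] and confirmed [W1].",
    sources,
    web_sources,
)
```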