Agentic Answer with Tool Calling (Streaming)

How it works

Unlike the simpler /answer endpoint, which runs a single vector search, the agent runs a two-phase loop:

Phase 1 — Tool loop (up to 4 turns)

The LLM decides autonomously which tools to call and in what order. Every tool call emits an agent_step SSE event so your UI can show live progress.

Example reasoning chain for “AWS Revenue Q4 2025”:

news_search → finds Amazon reported Q4 2025 earnings on Feb 5, 2026
search_filings → locates the 10-K filed 2026-02-07
retrieve_from_filing → semantic search inside that specific 10-K for AWS revenue

Phase 2 — Streaming answer

After all tools complete, one streaming LLM call synthesises everything into a cited Markdown answer. Events arrive in order: sources → token × N → conversation_state → done.
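
The two phases can be sketched as a single generator. This is an illustration of the control flow only, not the server's implementation: `run_agent`, `choose_tool`, `stream_answer`, the `"running"`/`"done"` status values, and the shape of the `sources` payload are all assumptions made for the example. Only the event type names, the four-turn cap, and the event order come from the description above.

```python
from typing import Callable, Iterator, Optional

MAX_TOOL_TURNS = 4  # "up to 4 turns" in Phase 1


def run_agent(
    query: str,
    choose_tool: Callable[[str, list], Optional[tuple]],
    tools: dict,
    stream_answer: Callable[[str, list], Iterator[str]],
) -> Iterator[dict]:
    """Yield SSE-style event dicts for one agent run (hypothetical sketch)."""
    observations = []

    # Phase 1: the planner picks the next tool, or returns None to stop early.
    for _ in range(MAX_TOOL_TURNS):
        decision = choose_tool(query, observations)
        if decision is None:
            break
        name, kwargs = decision
        yield {"type": "agent_step", "tool": name, "status": "running"}
        observations.append({"tool": name, "result": tools[name](**kwargs)})
        yield {"type": "agent_step", "tool": name, "status": "done"}

    # Phase 2: one streaming synthesis call over everything gathered.
    yield {"type": "sources", "sources": observations, "web_sources": []}
    answer = ""
    for token in stream_answer(query, observations):
        answer += token
        yield {"type": "token", "token": token}
    yield {
        "type": "conversation_state",
        "messages": [
            {"role": "user", "content": query},
            {"role": "assistant", "content": answer},
        ],
    }
    yield {"type": "done"}
```

Passing the planner and tools in as callables keeps the loop testable without a real LLM: a stub planner that never returns None shows that the loop still stops after four tool calls.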

Temporal reasoning

This is the core advantage over /answer. Pure KNN vector search cannot tell “Q4 2025 10-K” from “Q3 2025 10-Q” — the text is nearly identical. The agent solves this by:

Using news_search to find the actual earnings report date
Using search_filings with a precise after_date to locate the right filing
Using retrieve_from_filing scoped to that one accession number

Query:  "AWS Revenue Q4 2025"

Agent:  news_search("Amazon Q4 2025 earnings date")
          → Amazon reported Feb 5, 2026

        search_filings(cik="0001018724", form_types=["10-K"], after_date="2026-01-01")
          → 10-K filed 2026-02-07 | Accession: 0001018724-26-000027

        retrieve_from_filing(accession="0001018724-26-000027", query="AWS cloud revenue Q4 2025")
          → [S1] AWS Segment Results: net sales $28.8B in Q4 2025

Answer: "Amazon Web Services generated **$28.8 billion** in Q4 2025 [S1]…"
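
The date filter is the step that separates the right filing from its near-identical neighbours. A minimal sketch of that selection, run on local metadata rather than the real search_filings tool: `pick_filing` and the record layout are hypothetical, and treating after_date as inclusive is an assumption.

```python
from datetime import date
from typing import Optional

# Illustrative metadata only. The 10-K row reuses the values from the trace
# above; the 10-Q row is a made-up placeholder, not a real filing record.
FILINGS = [
    {"form_type": "10-Q", "filed": "2025-10-31", "accession": "PLACEHOLDER-10Q"},
    {"form_type": "10-K", "filed": "2026-02-07", "accession": "0001018724-26-000027"},
]


def pick_filing(filings: list, form_type: str, after_date: str) -> Optional[dict]:
    """Return the earliest filing of form_type filed on or after after_date."""
    cutoff = date.fromisoformat(after_date)
    matches = [
        f for f in filings
        if f["form_type"] == form_type and date.fromisoformat(f["filed"]) >= cutoff
    ]
    # Earliest match on or after the cutoff, or None when nothing qualifies.
    return min(matches, key=lambda f: date.fromisoformat(f["filed"]), default=None)


chosen = pick_filing(FILINGS, "10-K", "2026-01-01")
```

Because the filter works on form type and filing date instead of text similarity, the 10-Q is excluded even if its wording closely matches the 10-K.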

Conversation history

The agent supports stateless (tenant-owned) chat persistence. Your system stores messages; you replay them on every request. The server never stores end-user messages.

| Field | Type | Description |
| --- | --- | --- |
| `chat_id` | `string \| null` | Opaque ID echoed back in `conversation_state`. Use it to route the updated history to the right record. |
| `conversation_history` | `ConversationTurn[] \| null` | Prior `user`/`assistant` turns injected as context before the current query. Supersedes `session_id` when both are provided. |

{
  "query": "What about gross margin?",
  "chat_id": "chat_user-42_abc123",
  "conversation_history": [
    { "role": "user",      "content": "What was AWS revenue in Q4 2025?" },
    { "role": "assistant", "content": "AWS generated $28.8 billion in Q4 2025 [S1]." }
  ]
}

At the end of every stream, a conversation_state event is emitted containing all user-facing turns (including the current one). Persist this as your new history for the next request.

{
  "type": "conversation_state",
  "chat_id": "chat_user-42_abc123",
  "messages": [
    { "role": "user",      "content": "What was AWS revenue in Q4 2025?" },
    { "role": "assistant", "content": "AWS generated $28.8 billion in Q4 2025 [S1]." },
    { "role": "user",      "content": "What about gross margin?" },
    { "role": "assistant", "content": "Amazon's consolidated gross margin for Q4 2025 was approximately 48% [S2]." }
  ]
}
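
The persist-and-replay cycle on the tenant side might look like the sketch below. The in-memory `STORE`, `persist_state`, and `build_request` are hypothetical stand-ins for your own database layer; only the field names (`chat_id`, `messages`, `conversation_history`) come from the payloads above.

```python
# Hypothetical in-memory store keyed by chat_id; swap in your own database.
STORE = {}


def persist_state(event: dict) -> None:
    """Save the messages from a conversation_state event as the new history."""
    STORE[event["chat_id"]] = list(event["messages"])


def build_request(query: str, chat_id: str) -> dict:
    """Build the next request body, replaying whatever history is stored."""
    return {
        "query": query,
        "chat_id": chat_id,
        "conversation_history": list(STORE.get(chat_id, [])),  # [] on first turn
    }


persist_state({
    "type": "conversation_state",
    "chat_id": "chat_user-42_abc123",
    "messages": [
        {"role": "user", "content": "What was AWS revenue in Q4 2025?"},
        {"role": "assistant", "content": "AWS generated $28.8 billion in Q4 2025 [S1]."},
    ],
})
followup = build_request("What about gross margin?", "chat_user-42_abc123")
```

Overwriting the stored history with each conversation_state, instead of appending to it, avoids duplicated turns because the event already contains every user-facing turn.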

conversation_history and session_id are independent mechanisms. conversation_history is for tenant-managed cross-session persistence (Model B). session_id is a server-side Valkey cache with a 24-hour TTL, useful for within-session continuity without replaying history. If both are provided, conversation_history takes precedence.

SSE stream — code examples

Send the request with the header Accept: text/event-stream. Each event arrives as a single line of the form data: <JSON>, followed by a blank line (\n\n).

import json

import httpx


def save_to_your_db(chat_id, messages):
    """Placeholder: replace with your own persistence layer."""


# Load your stored history (empty for first turn)
history = []

with httpx.stream(
    "POST",
    "https://api.ivory.finance/v1/rag/answer/agent",
    headers={"X-API-Key": "YOUR_KEY", "Accept": "text/event-stream"},
    json={
        "query": "What was AWS revenue in Q4 2025?",
        "cik": "0001018724",
        "chat_id": "chat_user-42_abc123",
        "conversation_history": history,  # [] on first turn
    },
    timeout=60,
) as r:
    r.raise_for_status()  # fail fast on 4xx/5xx before reading the stream
    for line in r.iter_lines():
        # Skip the blank separator lines between events
        if not line.startswith("data: "):
            continue
        event = json.loads(line[6:])  # strip the "data: " prefix
        match event["type"]:  # match/case requires Python 3.10+
            case "agent_step":
                print(f"[{event['status']}] {event['tool']}")
            case "sources":
                print(f"Sources: {len(event['sources'])} SEC, {len(event['web_sources'])} web")
            case "token":
                print(event["token"], end="", flush=True)
            case "conversation_state":
                # Persist this as the history for the next request
                history = event["messages"]
                save_to_your_db(event["chat_id"], history)
            case "done":
                print("\n\nComplete.")
                break
            case "error":
                print(f"Error: {event['detail']}")
                break
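
Separating event handling from the network call makes the client easier to unit-test. The sketch below folds an iterable of SSE lines into a result dict; `collect_stream` and the shape of its return value are hypothetical, while the event fields it reads (`type`, `token`, `tool`, `status`, `messages`, `detail`) are the ones used in the example above.

```python
import json
from typing import Iterable


def collect_stream(lines: Iterable[str]) -> dict:
    """Fold SSE lines into the answer text, tool-step log, and final history."""
    result = {"answer": "", "steps": [], "messages": None, "error": None}
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip blank separators and any non-data lines
        event = json.loads(line[len("data: "):])
        kind = event["type"]
        if kind == "token":
            result["answer"] += event["token"]
        elif kind == "agent_step":
            result["steps"].append((event["tool"], event["status"]))
        elif kind == "conversation_state":
            result["messages"] = event["messages"]
        elif kind == "error":
            result["error"] = event["detail"]
            break
        elif kind == "done":
            break
    return result
```

In production the same function could be fed `r.iter_lines()` from the httpx response; in tests it can be fed a plain list of strings.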

Multi-tenant configuration

Pass an optional config object to choose any supported LLM or connect the agent to your own database. All fields are optional — omit config entirely to use server defaults (GPT-4o + server Postgres).

{
  "query": "What is our portfolio exposure to NVDA risk factors?",
  "config": {
    "llm_provider": "anthropic",
    "llm_model": "claude-opus-4-6",
    "db_connection_string": "postgresql://user:pass@your-db.com:5432/portfolio"
  }
}
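
Since every config field is optional, a client can build the request body so that unset fields are left out and config disappears entirely when nothing is overridden. A minimal sketch: `build_payload` is a hypothetical helper, and only the three field names shown in the JSON above are assumed to exist.

```python
from typing import Optional


def build_payload(
    query: str,
    llm_provider: Optional[str] = None,
    llm_model: Optional[str] = None,
    db_connection_string: Optional[str] = None,
) -> dict:
    """Build a request body, including only the config fields that were set."""
    overrides = {
        "llm_provider": llm_provider,
        "llm_model": llm_model,
        "db_connection_string": db_connection_string,
    }
    config = {key: value for key, value in overrides.items() if value is not None}
    payload = {"query": query}
    if config:  # omit config entirely to fall back to server defaults
        payload["config"] = config
    return payload


default_request = build_payload("What is our portfolio exposure to NVDA risk factors?")
custom_request = build_payload(
    "What is our portfolio exposure to NVDA risk factors?",
    llm_provider="anthropic",
    llm_model="claude-opus-4-6",
)
```

In real use, the connection string would typically be read from an environment variable or secret store instead of being hard-coded in source.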

LLM Providers

OpenAI (GPT-4o) or Anthropic (Claude). Switch per request with no redeployment.

Database Connections

Connect PostgreSQL, MySQL, MariaDB, MongoDB, or Oracle. The agent queries your data alongside SEC filings.

See the Agent Configuration guide for API key setup, connection string formats, read-only user instructions, and security best practices.

Citation scheme

| Bracket | Source | List in response |
| --- | --- | --- |
| `[S1], [S2], …` | SEC filing chunks from `retrieve_from_filing` | `sources[]` |
| `[W1], [W2], …` | Web results from `web_search` | `web_sources[]` with `source_type: "web"` |
| `[N1], [N2], …` | News results from `news_search` | `web_sources[]` with `source_type: "news"` |

Numbers are sequential across all tool calls: when [S1] and [S2] come from two different retrieve_from_filing calls on different filings, both still appear in sources[] in order.
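
To render citations as links or footnotes, a client first needs to find the markers in the answer text. The sketch below extracts them and records which response list each prefix points to, following the table above. `extract_citations` and `CITATION_LISTS` are hypothetical names, and the sketch deliberately stops short of resolving a number to a specific list entry, since that indexing is not specified here.

```python
import re

# Matches markers such as [S1], [W2], [N10].
CITATION = re.compile(r"\[([SWN])(\d+)\]")

# Prefix -> (response list, source_type value), taken from the table above.
CITATION_LISTS = {
    "S": ("sources", None),
    "W": ("web_sources", "web"),
    "N": ("web_sources", "news"),
}


def extract_citations(answer: str) -> list:
    """Return unique (prefix, number) pairs in order of first appearance."""
    seen = []
    for prefix, number in CITATION.findall(answer):
        key = (prefix, int(number))
        if key not in seen:
            seen.append(key)
    return seen


cited = extract_citations("AWS generated $28.8 billion in Q4 2025 [S1], as reported [N2] [S1].")
```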
