ETL Connectors

Architecture
Multi-tenancy
Connector lifecycle
Scheduler
Supported cron schedules
Endpoints summary

Architecture

Singer tap (600+ sources)
    │
    ▼
meltano_runner_worker   ←── etl.meltano_connectors (scheduler)
    │  runs subprocess: meltano run tap-{name} target-postgres --state-id {id}
    │
    ▼
etl_staging_{source}    ←── Postgres staging schema (shared, per-tap)
    │
    ▼
POST /v1/etl/meltano/webhook  (HMAC-SHA256 signed)
    │
    ▼
_transform_and_load()   ←── etl.field_mappings (per-connector, LLM-mapped)
    │  applies Python transform expressions, batches 500 rows per INSERT
    ▼
IOMETE Iceberg          ←── ivory_tenant_{slug}.{table}  (per-tenant namespace)
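
The transform-and-load step in the diagram above can be sketched roughly as follows. This is a minimal illustration, not the actual `_transform_and_load()` implementation: the field mappings are modeled as plain Python callables keyed by target column, and `insert_batch` is a hypothetical callback standing in for whatever writes one batch to Iceberg. Only the 500-row batch size comes from the diagram.

```python
from itertools import islice
from typing import Any, Callable, Iterable, Iterator

BATCH_SIZE = 500  # from the diagram: 500 rows per INSERT

Row = dict[str, Any]
Mappings = dict[str, Callable[[Row], Any]]  # assumption: target column -> transform


def apply_mappings(row: Row, mappings: Mappings) -> Row:
    """Build one target row by evaluating a transform per target column."""
    return {target: transform(row) for target, transform in mappings.items()}


def batched(rows: Iterable[Row], size: int = BATCH_SIZE) -> Iterator[list[Row]]:
    """Yield successive lists of at most `size` rows."""
    it = iter(rows)
    while chunk := list(islice(it, size)):
        yield chunk


def transform_and_load(
    rows: Iterable[Row],
    mappings: Mappings,
    insert_batch: Callable[[list[Row]], None],  # hypothetical writer callback
) -> int:
    """Transform staging rows and write them in batches; returns rows written."""
    written = 0
    for chunk in batched(apply_mappings(r, mappings) for r in rows):
        insert_batch(chunk)
        written += len(chunk)
    return written
```

With 1,201 staging rows, this sketch calls `insert_batch` three times, with 500, 500, and 201 rows.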

Multi-tenancy

Every connector is scoped to a tenant via the JWT `sub` claim. Isolation is enforced at three layers:

Layer	Mechanism
API	JWT Bearer → `tenant_id` scopes every query
Staging	Meltano runner uses per-tenant DB credentials
Lakehouse	IOMETE namespace `ivory_tenant_{slug}` per tenant
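
As an illustration of the lakehouse layer, the per-tenant namespace could be derived from the tenant slug as shown below. The function names and the slug validation rule (lowercase letters, digits, underscores) are assumptions made for this sketch. Only the `ivory_tenant_{slug}.{table}` pattern comes from the text above.

```python
import re

# Assumption: slugs and table names are restricted to safe identifier characters.
_IDENT_RE = re.compile(r"[a-z0-9_]+")


def tenant_namespace(slug: str) -> str:
    """Return the namespace for a tenant, e.g. 'ivory_tenant_acme'."""
    if not _IDENT_RE.fullmatch(slug):
        raise ValueError(f"invalid tenant slug: {slug!r}")
    return f"ivory_tenant_{slug}"


def qualified_table(slug: str, table: str) -> str:
    """Return '{namespace}.{table}' for a tenant-owned table."""
    if not _IDENT_RE.fullmatch(table):
        raise ValueError(f"invalid table name: {table!r}")
    return f"{tenant_namespace(slug)}.{table}"
```

Rejecting anything outside a strict identifier pattern keeps a malformed slug from escaping its own namespace when the name is later interpolated into SQL.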

Connector lifecycle

register (POST /connectors)
    │   status = 'active',  next_run_at = now() + 5 min
    ▼
runner polls next_run_at
    │
    ▼
meltano subprocess runs
    │
    ▼
webhook callback  →  _transform_and_load  →  Iceberg
    │   on success: reset consecutive_failures, advance next_run_at
    │   on failure: consecutive_failures++
    │               after 3 failures: status = 'error'
    ▼
pause / resume / trigger (manual)
    │
    ▼
delete (irreversible — cascade-deletes all runs, mappings, jobs)
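
The success and failure branches of the lifecycle above can be sketched as a small state update. The `Connector` class and `record_run` function are hypothetical names used for illustration. The threshold of 3 and the `active` and `error` status values come from the diagram.

```python
from dataclasses import dataclass

MAX_CONSECUTIVE_FAILURES = 3  # from the lifecycle diagram


@dataclass
class Connector:
    status: str = "active"
    consecutive_failures: int = 0


def record_run(connector: Connector, succeeded: bool) -> Connector:
    """Update failure tracking after a webhook callback reports a run result."""
    if succeeded:
        # A single success clears the failure streak.
        connector.consecutive_failures = 0
    else:
        connector.consecutive_failures += 1
        if connector.consecutive_failures >= MAX_CONSECUTIVE_FAILURES:
            # Third failure in a row: take the connector out of rotation.
            connector.status = "error"
    return connector
```

Because the counter tracks consecutive failures, two failures followed by a success leave the connector active with a clean count.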

Scheduler

The meltano_runner_worker pod polls etl.meltano_connectors every 30 seconds for rows where status = 'active' AND next_run_at <= now(). It then:

1. Inserts an etl.runner_jobs row (tracks subprocess PID + log tail)
2. Runs meltano run tap-{name} target-postgres --state-id {connector_id}
3. Passes the stored tap_state as --state for incremental syncs
4. Calls the webhook when complete
5. Advances next_run_at based on sync_schedule (cron expression)

Circuit breaker: after 3 consecutive failures the connector is set to status = 'error' and removed from the scheduler queue. Use POST /v1/admin/connectors/{id}/resume to re-activate it after fixing the root cause.
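
One polling pass of the scheduler might look like the sketch below. It assumes connectors are available as plain dictionaries with `status` and `next_run_at` keys, and both function names are hypothetical. The command arguments are copied verbatim from the step list above and have not been checked against the Meltano CLI here.

```python
from datetime import datetime, timezone

POLL_INTERVAL_SECONDS = 30  # from the text: the worker polls every 30 seconds


def due_connectors(connectors: list[dict], now: datetime) -> list[dict]:
    """Select rows where status = 'active' AND next_run_at <= now."""
    return [
        c for c in connectors
        if c["status"] == "active" and c["next_run_at"] <= now
    ]


def meltano_command(tap_name: str, connector_id: str) -> list[str]:
    """Build the subprocess argument list for one connector run."""
    return [
        "meltano", "run",
        f"tap-{tap_name}", "target-postgres",
        "--state-id", str(connector_id),
    ]
```

Building the command as an argument list, rather than a shell string, lets it be passed to `subprocess.run` without shell interpolation of the tap name or connector ID.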

Supported cron schedules

Expression	Interval
`*/15 * * * *`	Every 15 minutes
`0 * * * *`	Hourly
`0 */6 * * *`	Every 6 hours
`0 2 * * *`	Daily at 02:00 UTC (default)
`0 0 * * 0`	Weekly
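
To show how a five-field expression from the table maps to the next run time, here is a deliberately small evaluator. It is a sketch rather than the scheduler's actual logic: it supports only `*`, `*/n`, and single integers (enough for the schedules listed above), and it treats all five fields as AND conditions, which differs from standard cron when both day-of-month and day-of-week are restricted.

```python
from datetime import datetime, timedelta, timezone


def _field_matches(field: str, value: int, lowest: int) -> bool:
    """Match one cron field; `lowest` is the field's minimum (0 or 1)."""
    if field == "*":
        return True
    if field.startswith("*/"):
        return (value - lowest) % int(field[2:]) == 0
    return value == int(field)


def cron_matches(expr: str, t: datetime) -> bool:
    minute, hour, dom, month, dow = expr.split()
    cron_dow = (t.weekday() + 1) % 7  # cron: 0 = Sunday; Python: 0 = Monday
    return (
        _field_matches(minute, t.minute, 0)
        and _field_matches(hour, t.hour, 0)
        and _field_matches(dom, t.day, 1)
        and _field_matches(month, t.month, 1)
        and _field_matches(dow, cron_dow, 0)
    )


def next_run(expr: str, after: datetime) -> datetime:
    """Return the first whole minute strictly after `after` that matches."""
    t = after.replace(second=0, microsecond=0) + timedelta(minutes=1)
    for _ in range(366 * 24 * 60):  # give up after about one year
        if cron_matches(expr, t):
            return t
        t += timedelta(minutes=1)
    raise ValueError(f"no matching time found for {expr!r}")
```

For example, starting from Monday 2024-01-01 10:07 UTC, `0 */6 * * *` yields 12:00 the same day, and `0 0 * * 0` yields Sunday 2024-01-07 00:00.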

Endpoints summary

# Webhook (no JWT — HMAC auth)
POST   /v1/etl/meltano/webhook

# Connector management (JWT required)
POST   /v1/admin/connectors
GET    /v1/admin/connectors
GET    /v1/admin/connectors/{id}
POST   /v1/admin/connectors/{id}/trigger
POST   /v1/admin/connectors/{id}/pause
POST   /v1/admin/connectors/{id}/resume
DELETE /v1/admin/connectors/{id}

# Field mappings
POST   /v1/admin/connectors/{id}/map-fields
GET    /v1/admin/connectors/{id}/mappings
PUT    /v1/admin/connectors/{id}/mappings/{mapping_id}

# Run history
GET    /v1/admin/connectors/{id}/runs

# Lakehouse catalogs
POST   /v1/admin/tenants/{tenant_id}/catalog
GET    /v1/admin/tenants/{tenant_id}/catalog
GET    /v1/admin/lakehouse/catalogs
GET    /v1/admin/lakehouse/health
POST   /v1/admin/lakehouse/sync-targets
GET    /v1/admin/lakehouse/sync-targets
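
The webhook endpoint listed above is authenticated with an HMAC-SHA256 signature rather than a JWT. A minimal verification sketch follows. It assumes the signature is the hex-encoded HMAC of the raw request body under a shared secret. How the signature is transmitted (header name, encoding, any timestamp component) is not specified here, so treat those details as assumptions.

```python
import hashlib
import hmac


def sign(secret: bytes, body: bytes) -> str:
    """Return the hex-encoded HMAC-SHA256 of the raw request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify(secret: bytes, body: bytes, signature: str) -> bool:
    """Check a received signature using a constant-time comparison."""
    expected = sign(secret, body)
    # Compare as bytes so non-ASCII input is rejected instead of raising.
    return hmac.compare_digest(expected.encode(), signature.encode())
```

The signature must be computed over the exact bytes received. Re-serializing parsed JSON before verifying can change whitespace or key order and make a valid signature fail.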
