AutoML Training - Ivory Finance

Authentication

Send a bearer token in the Authorization header of every request:

Authorization: Bearer <access_token>

Model lifecycle

POST /models/train         → queued → training → done (or failed)
POST /models/tune          → Katib Bayesian HPO → tuning (up to 12 trials)
GET  /models               → list all runs + metrics
GET  /models/{id}          → full metrics, feature importance, imbalance ratio
GET  /models/{id}/explain  → plain-English explanation (GPT-4o-mini)
GET  /models/{id}/shap     → SHAP mean |value| top-20 features         ← new
GET  /models/{id}/tune     → HPO trial progress + best params
POST /models/{id}/deploy   → KServe InferenceService → deploying → active
POST /models/{id}/threshold → set decision threshold (default 0.5)     ← new
GET  /models/{id}/drift    → per-feature PSI vs training baseline       ← new
POST /models/{id}/predict  → live inference with threshold applied

All paths above are relative to https://api.ivory.finance/v1/foundry.

Train a model

Launches a full 23-algorithm competition across four problem types. The winning algorithm is selected automatically based on validation performance, its artifact is stored in Cloudflare R2, and the run record is updated with complete metrics, SHAP values, and a drift-monitoring baseline.

What the worker does on each training run:

Loads up to IOMETE_ROW_LIMIT rows (default 500,000) from your Iceberg table
Auto-detects classification vs regression when problem_type is omitted
Computes class imbalance ratio and applies SMOTE when ratio exceeds 10:1
Runs stratified train_test_split (80/20) for classification
Trains all applicable algorithms with a per-algorithm timeout (default 5 min)
Picks the winner by highest val_accuracy (classification) or lowest MAE (regression)
Computes SHAP mean absolute values for the top-20 features
Records per-feature decile distributions as the PSI drift baseline
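
The imbalance check in the steps above can be sketched as follows. This is a minimal illustration, not the worker's actual code: the function names are hypothetical, and the ratio is assumed to be reported as minority count divided by majority count (which would match a value such as 0.08), while the SMOTE trigger compares majority to minority against 10:1.

```python
from collections import Counter

SMOTE_TRIGGER = 10.0  # majority:minority ratio above which SMOTE is applied


def class_counts(labels):
    """Count rows per class; at least two classes are required."""
    counts = Counter(labels)
    if len(counts) < 2:
        raise ValueError("need at least two classes")
    return counts


def imbalance_ratio(labels):
    """Minority / majority frequency (assumed definition, hypothetical helper)."""
    counts = class_counts(labels)
    return min(counts.values()) / max(counts.values())


def needs_smote(labels):
    """True when the majority class outnumbers the minority by more than 10:1."""
    counts = class_counts(labels)
    return max(counts.values()) / min(counts.values()) > SMOTE_TRIGGER
```

With 100 negative and 8 positive rows the majority-to-minority ratio is 12.5:1, so this sketch would flag the dataset for SMOTE.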

Returns 202 Accepted immediately. Poll GET /models/{id} until status is done or failed.
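
A polling loop for the run status might look like the sketch below. It assumes the caller supplies `fetch_run`, a hypothetical callable that performs GET /models/{id} and returns the parsed JSON; the interval and timeout defaults are illustrative, not documented values.

```python
import time

TERMINAL_STATUSES = {"done", "failed"}  # terminal states from the lifecycle list


def wait_for_run(fetch_run, interval_s=5.0, timeout_s=1800.0, sleep=time.sleep):
    """Call fetch_run() until the run reaches a terminal status.

    fetch_run is a caller-supplied function returning a dict with a "status" key.
    Raises TimeoutError if no terminal status is seen within timeout_s.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        run = fetch_run()
        if run["status"] in TERMINAL_STATUSES:
            return run
        if time.monotonic() >= deadline:
            raise TimeoutError("run did not finish before the timeout")
        sleep(interval_s)
```

Passing `sleep` as a parameter keeps the helper easy to test with a stub.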

Algorithm roster

Problem type	Algorithms	Winner criterion
Classification	XGBoost, LightGBM, CatBoost, RandomForest, ExtraTrees, GradientBoosting, LogisticRegression, SVC (≤5k rows)	Highest `val_accuracy`
Regression	XGBoost, LightGBM, CatBoost, RandomForest, ExtraTrees, GradientBoosting, Ridge, Lasso, ElasticNet, SVR (≤10k rows)	Lowest MAE
Time-series	Prophet, ARIMA (pmdarima auto_arima)	Lowest MAE
Anomaly	IsolationForest, LOF, OneClassSVM	All three scored; IsolationForest preferred for inference
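
The winner criteria in the table can be expressed as a small selection function. This is a sketch under assumptions: `pick_winner` is a hypothetical name, the `competition` mapping mirrors the `{"score": ...}` shape shown in the model details response, and for regression and time-series the score is assumed to hold MAE.

```python
def pick_winner(competition, problem_type):
    """Select the winning algorithm from {algorithm: {"score": float}}.

    classification        -> highest score (val_accuracy)
    regression/timeseries -> lowest score (assumed to be MAE)
    anomaly               -> IsolationForest, preferred for inference
    """
    if not competition:
        raise ValueError("competition is empty")
    if problem_type == "classification":
        return max(competition, key=lambda name: competition[name]["score"])
    if problem_type in ("regression", "timeseries"):
        return min(competition, key=lambda name: competition[name]["score"])
    if problem_type == "anomaly":
        if "IsolationForest" not in competition:
            raise ValueError("IsolationForest result missing")
        return "IsolationForest"
    raise ValueError(f"unknown problem_type: {problem_type!r}")
```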

Request body

Field	Required	Default	Description
`dataset_id`	Yes	—	UUID of a `ready` dataset
`name`	Yes	—	Human name for this run
`target_column`	Yes	—	Column to predict
`feature_columns`	No	`[]`	Features to use — empty = all columns except target
`problem_type`	No	`null`	`classification`, `regression`, `timeseries`, `anomaly`, or `auto`; when omitted, classification vs regression is auto-detected

curl -X POST https://api.ivory.finance/v1/foundry/models/train \
  -H "Authorization: Bearer $IVORY_JWT" \
  -H "Content-Type: application/json" \
  -d '{
    "dataset_id": "3f7a1b2c-...",
    "name": "Revenue Growth Predictor v1",
    "target_column": "revenue_growth_pct",
    "problem_type": "regression"
  }'

{
  "run_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "status": "queued",
  "message": "AutoML job submitted. Poll GET /v1/foundry/models/{run_id} for status."
}
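
The same request can be assembled from Python with the standard library. The sketch below only builds the request object and does not send it; `build_train_request` and `BASE_URL` are hypothetical names, and the base URL is taken from the curl example.

```python
import json
import urllib.request

BASE_URL = "https://api.ivory.finance/v1/foundry"  # from the curl examples
PROBLEM_TYPES = {"classification", "regression", "timeseries", "anomaly", "auto"}


def build_train_request(token, dataset_id, name, target_column,
                        feature_columns=None, problem_type=None):
    """Build (but do not send) a POST /models/train request."""
    if problem_type is not None and problem_type not in PROBLEM_TYPES:
        raise ValueError(f"unsupported problem_type: {problem_type!r}")
    body = {"dataset_id": dataset_id, "name": name, "target_column": target_column}
    if feature_columns:
        # Omitting the field lets the server use all columns except the target.
        body["feature_columns"] = list(feature_columns)
    if problem_type is not None:
        body["problem_type"] = problem_type
    return urllib.request.Request(
        f"{BASE_URL}/models/train",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Sending it would be a matter of passing the result to `urllib.request.urlopen` and expecting a 202 response.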

List models

curl https://api.ivory.finance/v1/foundry/models \
  -H "Authorization: Bearer $IVORY_JWT"

{
  "models": [
    {
      "id": "a1b2c3d4-...",
      "name": "Revenue Growth Predictor v1",
      "target_column": "revenue_growth_pct",
      "problem_type": "regression",
      "algorithm": "LightGBM",
      "mae": 3.21,
      "rmse": 4.87,
      "status": "done",
      "trial_count": 10,
      "deploy_status": "active",
      "endpoint_url": "http://model-acme-a1b2c3d4-predictor.kserve.svc.cluster.local",
      "created_at": "2026-03-25T14:00:00Z",
      "completed_at": "2026-03-25T14:09:11Z"
    }
  ]
}

Get model details

Returns full metrics, feature importance, the class imbalance ratio, and the competition results (score and elapsed time) for each algorithm that was trained.

curl https://api.ivory.finance/v1/foundry/models/a1b2c3d4-... \
  -H "Authorization: Bearer $IVORY_JWT"

{
  "id": "a1b2c3d4-...",
  "name": "Deal Score v2",
  "status": "done",
  "algorithm": "XGBoost",
  "problem_type": "classification",
  "metrics": {
    "val_accuracy": 0.891,
    "train_accuracy": 0.934,
    "f1_score": 0.887,
    "roc_auc": 0.941
  },
  "imbalance_ratio": 0.08,
  "trial_count": 8,
  "hyperparams": {
    "winner": "XGBoost",
    "competition": {
      "XGBoost":          { "score": 0.891, "elapsed_s": 12.4 },
      "LightGBM":         { "score": 0.883, "elapsed_s": 7.1  },
      "CatBoost":         { "score": 0.871, "elapsed_s": 18.3 },
      "RandomForest":     { "score": 0.856, "elapsed_s": 9.7  },
      "LogisticRegression":{ "score": 0.812, "elapsed_s": 2.1 }
    }
  },
  "feature_importance": [
    { "feature": "deal_size_usd",   "importance": 0.382 },
    { "feature": "sector_slug",     "importance": 0.241 },
    { "feature": "buyer_hhi",       "importance": 0.198 }
  ],
  "artifact_path": "tenants/acme/models/a1b2c3d4-.../model.pkl",
  "created_at": "2026-03-25T14:00:00Z",
  "completed_at": "2026-03-25T14:09:11Z"
}

Explain model

Returns a plain-English explanation of what the model learned, powered by GPT-4o-mini and the top feature importances stored at training time.

curl https://api.ivory.finance/v1/foundry/models/a1b2c3d4-.../explain \
  -H "Authorization: Bearer $IVORY_JWT"

{
  "run_id": "a1b2c3d4-...",
  "model_name": "Deal Score v2",
  "top_features": [
    { "feature": "deal_size_usd", "importance": 0.382 },
    { "feature": "sector_slug",   "importance": 0.241 }
  ],
  "explanation": "The model predicts deal success primarily from transaction size (38%) — larger deals close more reliably once in late-stage diligence. Sector classification (24%) captures cyclical patterns: healthcare and tech consistently outperform industrials. Buyer HHI (20%) reflects market concentration — acquirers with dominant market share face fewer regulatory hurdles and close faster. One caveat: the model was trained on historical completed deals; early-stage opportunities with no comparable precedent may be systematically underscored."
}

SHAP feature importance

Returns SHAP (SHapley Additive exPlanations) mean absolute values for the top-20 features, computed at training time on up to 500 validation rows. SHAP values attribute each individual prediction to the features that produced it, which generally makes them a more reliable ranking than standard tree feature importance, which reflects only how often or how usefully a feature is split on during training.

Algorithm type	SHAP method used
XGBoost, LightGBM, CatBoost, RandomForest, ExtraTrees, GradientBoosting	`TreeExplainer` (exact, fast)
LogisticRegression, Ridge, Lasso, ElasticNet	`LinearExplainer`
SVC, SVR, anomaly models	Skipped (too slow for production use)

curl https://api.ivory.finance/v1/foundry/models/a1b2c3d4-.../shap \
  -H "Authorization: Bearer $IVORY_JWT"

{
  "run_id": "a1b2c3d4-...",
  "algorithm": "XGBoost",
  "shap_summary": [
    { "feature": "deal_size_usd",        "mean_abs_shap": 0.2841 },
    { "feature": "sector_slug",          "mean_abs_shap": 0.1923 },
    { "feature": "buyer_hhi",            "mean_abs_shap": 0.1544 },
    { "feature": "days_in_diligence",    "mean_abs_shap": 0.1201 },
    { "feature": "target_revenue_ttm",   "mean_abs_shap": 0.0987 },
    { "feature": "ebitda_margin",        "mean_abs_shap": 0.0812 },
    { "feature": "buyer_leverage_ratio", "mean_abs_shap": 0.0634 }
  ]
}
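
To make the `mean_abs_shap` figures concrete, the sketch below aggregates a matrix of per-row SHAP values into a ranked summary. It assumes the SHAP values have already been computed by an explainer; `shap_summary` is a hypothetical helper and the four-decimal rounding simply mirrors the example response.

```python
def shap_summary(shap_rows, feature_names, top_k=20):
    """Rank features by mean absolute SHAP value.

    shap_rows: list of rows, one per validation sample, each holding one
               SHAP value per feature (same order as feature_names).
    """
    if not shap_rows:
        raise ValueError("no SHAP rows supplied")
    totals = [0.0] * len(feature_names)
    for row in shap_rows:
        if len(row) != len(feature_names):
            raise ValueError("row length does not match feature_names")
        for j, value in enumerate(row):
            totals[j] += abs(value)  # sign is discarded: magnitude only
    means = [total / len(shap_rows) for total in totals]
    ranked = sorted(zip(feature_names, means), key=lambda pair: pair[1], reverse=True)
    return [
        {"feature": name, "mean_abs_shap": round(mean, 4)}
        for name, mean in ranked[:top_k]
    ]
```

Because the absolute value is taken before averaging, a feature that pushes some predictions up and others down still ranks highly.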

Hyperparameter optimisation (Katib HPO)

Before a final production run, let Foundry search for optimal hyperparameters via Kubeflow Katib Bayesian optimisation — up to 12 trials, 2 in parallel. When the experiment finishes, call GET /models/{id}/tune to retrieve the best params, then feed them into POST /models/train for the final run.

Request body

Field	Required	Default	Description
`dataset_id`	Yes	—	UUID of a `ready` dataset
`name`	Yes	—	Human name for this tuning run
`target_column`	Yes	—	Column to predict
`problem_type`	No	`auto`	`classification`, `regression`, or `auto`

curl -X POST https://api.ivory.finance/v1/foundry/models/tune \
  -H "Authorization: Bearer $IVORY_JWT" \
  -H "Content-Type: application/json" \
  -d '{
    "dataset_id": "3f7a1b2c-...",
    "name": "Revenue Growth — Katib Search",
    "target_column": "revenue_growth_pct",
    "problem_type": "regression"
  }'

{
  "run_id": "c3d4e5f6-7890-abcd-ef12-34567890abcd",
  "katib_experiment": "foundry-tune-c3d4e5f6",
  "katib_namespace": "kubeflow",
  "max_trials": 12,
  "parallel_trials": 2,
  "status": "tuning",
  "message": "Katib HPO running. Poll GET /v1/foundry/models/{run_id}/tune for status."
}

Deploy model

Wraps a trained model as a live KServe InferenceService (RawDeployment mode, so no serverless cold starts) and registers it in the Kubeflow Model Registry. The status flips from "deploying" to "active" within a few seconds as Kubernetes brings up the pod. Until then, retry POST /models/{id}/predict until it returns 200.

Request body

Field	Required	Description
`model_name`	Yes	Short identifier for this deployed version (lowercase, hyphens OK)

curl -X POST https://api.ivory.finance/v1/foundry/models/a1b2c3d4-.../deploy \
  -H "Authorization: Bearer $IVORY_JWT" \
  -H "Content-Type: application/json" \
  -d '{ "model_name": "deal-score-v2" }'

{
  "registry_id": "b2c3d4e5-...",
  "model_name": "deal-score-v2",
  "inference_service": "model-acme-corp-a1b2c3d4",
  "kserve_namespace": "kserve",
  "endpoint_url": "http://model-acme-corp-a1b2c3d4-predictor.kserve.svc.cluster.local",
  "status": "deploying",
  "predict_url": "/v1/foundry/models/a1b2c3d4-.../predict"
}
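
Retrying the first prediction while the pod starts can be sketched with a capped exponential backoff. The helper below is illustrative only: `send` is a hypothetical caller-supplied function that issues POST /models/{id}/predict and returns `(status_code, body)`, and the attempt count and delays are assumptions rather than documented limits.

```python
import time


def retry_until_ok(send, max_attempts=10, base_delay_s=0.5, max_delay_s=8.0,
                   sleep=time.sleep):
    """Call send() until it returns HTTP 200, doubling the wait each time."""
    delay = base_delay_s
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if status == 200:
            return body
        if attempt < max_attempts:
            sleep(delay)
            delay = min(delay * 2, max_delay_s)  # cap the backoff
    raise RuntimeError(f"endpoint not ready after {max_attempts} attempts")
```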

Set decision threshold

Adjust the classification threshold for a deployed binary model. Scores ≥ threshold → label 1, scores < threshold → label 0. The default is 0.5. Tune it based on your risk appetite:

Use case	Recommended threshold	Rationale
Fraud / AML detection	`0.25 – 0.35`	Maximise recall — a missed fraud is worse than a false alarm reviewed by compliance
Deal scoring	`0.65 – 0.75`	Surface only high-conviction opportunities — reduce analyst noise
Earnings beat prediction	`0.50`	Symmetric precision/recall trade-off

Request body

Field	Required	Description
`threshold`	Yes	Float strictly between 0.0 and 1.0
`threshold_metric`	No	Context label: `f1`, `precision`, `recall`, or `balanced_accuracy`

curl -X POST https://api.ivory.finance/v1/foundry/models/a1b2c3d4-.../threshold \
  -H "Authorization: Bearer $IVORY_JWT" \
  -H "Content-Type: application/json" \
  -d '{ "threshold": 0.3, "threshold_metric": "recall" }'

{
  "registry_id": "b2c3d4e5-...",
  "threshold": 0.3,
  "threshold_metric": "recall",
  "message": "Decision threshold updated to 0.3"
}
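
Client-side validation of the request body, following the field constraints in the table above, might look like this. `threshold_payload` is a hypothetical helper; the server remains the authority on what it accepts.

```python
ALLOWED_METRICS = {"f1", "precision", "recall", "balanced_accuracy"}


def threshold_payload(threshold, threshold_metric=None):
    """Build the JSON body for POST /models/{id}/threshold.

    The threshold must lie strictly between 0.0 and 1.0; NaN is rejected
    because every comparison with NaN is false.
    """
    value = float(threshold)
    if not 0.0 < value < 1.0:
        raise ValueError("threshold must be strictly between 0.0 and 1.0")
    payload = {"threshold": value}
    if threshold_metric is not None:
        if threshold_metric not in ALLOWED_METRICS:
            raise ValueError(f"unknown threshold_metric: {threshold_metric!r}")
        payload["threshold_metric"] = threshold_metric
    return payload
```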

Drift monitoring

Computes Population Stability Index (PSI) for each feature by comparing the current distribution of incoming prediction inputs against the training-time baseline (decile bins computed pre-SMOTE on the training set). PSI is evaluated over the last 500 prediction log entries.

PSI range	Status	Action
`< 0.10`	`stable`	No action needed
`0.10 – 0.20`	`moderate`	Monitor closely; consider retraining if trend continues
`> 0.20`	`critical`	Significant input distribution shift — retrain recommended

curl https://api.ivory.finance/v1/foundry/models/a1b2c3d4-.../drift \
  -H "Authorization: Bearer $IVORY_JWT"

{
  "run_id": "a1b2c3d4-...",
  "drift_status": "stable",
  "features_checked": 7,
  "prediction_sample_size": 500,
  "features": [
    { "feature": "deal_size_usd",     "psi": 0.031, "status": "stable",   "baseline_mean": 182400000, "current_mean": 191200000, "sample_size": 500 },
    { "feature": "ebitda_margin",     "psi": 0.048, "status": "stable",   "baseline_mean": 0.21,      "current_mean": 0.22,      "sample_size": 500 },
    { "feature": "days_in_diligence", "psi": 0.019, "status": "stable",   "baseline_mean": 47.3,      "current_mean": 49.1,      "sample_size": 498 }
  ]
}
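
For readers unfamiliar with PSI, the sketch below computes it for one numeric feature using decile bins taken from the baseline, then maps the value to the status bands in the table. It is an illustration rather than the service's implementation: the function names, the epsilon floor for empty bins, and treating a PSI of exactly 0.20 as `moderate` are all assumptions.

```python
import bisect
import math


def decile_edges(baseline):
    """Nine cut points that split the sorted baseline into ten bins."""
    xs = sorted(baseline)
    n = len(xs)
    if n == 0:
        raise ValueError("baseline is empty")
    return [xs[min(n - 1, (n * k) // 10)] for k in range(1, 10)]


def bin_fractions(values, edges, eps=1e-6):
    """Share of values per bin, floored at eps so the log stays finite."""
    if not values:
        raise ValueError("no values to bin")
    counts = [0] * (len(edges) + 1)
    for v in values:
        counts[bisect.bisect_right(edges, v)] += 1
    return [max(c / len(values), eps) for c in counts]


def psi(baseline, current):
    """PSI = sum over bins of (actual - expected) * ln(actual / expected)."""
    edges = decile_edges(baseline)
    expected = bin_fractions(baseline, edges)
    actual = bin_fractions(current, edges)
    return sum((a - e) * math.log(a / e) for a, e in zip(actual, expected))


def drift_status(value):
    """Map a PSI value to the status bands in the table."""
    if value < 0.10:
        return "stable"
    if value <= 0.20:  # boundary handling is an assumption
        return "moderate"
    return "critical"
```

Identical distributions give a PSI of zero, and the value grows as incoming inputs concentrate in bins that were sparse at training time.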

Predict

Proxy a prediction request to the deployed KServe endpoint. For binary classification models the stored decision threshold is applied automatically:

score ≥ threshold → label: 1 (positive class)
score < threshold → label: 0 (negative class)

Both the raw score and the thresholded label are returned so you can audit or override the decision at the application layer.
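
The thresholding rule is simple enough to state in code, which also shows how an application could re-derive or override the label from the raw score. The function and the `score`/`label` keys are hypothetical names for illustration; the actual response field names for classification models are not shown in this page.

```python
def apply_threshold(score, threshold=0.5):
    """Return the raw score with its thresholded label (1 if score >= threshold)."""
    return {"score": score, "label": 1 if score >= threshold else 0}
```

For example, a score of 0.31 is labelled 1 under a fraud-style threshold of 0.3 but 0 under the default of 0.5.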

# Regression
curl -X POST https://api.ivory.finance/v1/foundry/models/a1b2c3d4-.../predict \
  -H "Authorization: Bearer $IVORY_JWT" \
  -H "Content-Type: application/json" \
  -d '{
    "input": {
      "yoy_revenue": 0.18,
      "gross_margin": 0.62,
      "r_and_d_ratio": 0.22,
      "sector_slug": "semiconductors"
    }
  }'

{
  "prediction": { "value": 14.7 },
  "latency_ms": 7.4
}
