
Overview

Enterprise-ready reference for the conversation-intelligence platform that powers docs.illocution.ai. The FastAPI service exposes REST and Server-Sent Events (SSE) endpoints for uploading media, replaying recordings with real-time feedback, and streaming fully live sessions. This document is structured so it can be dropped directly into Mintlify.

Primary Customer Use Cases

  1. Sales / Success intelligence - highlight objections, low-confidence answers, and emotional swings in discovery calls.
  2. Behavioral interview review - surface commitment drops and fear spikes for hiring managers.
  3. Research panels & UX testing - quantify engagement levels and narrative arcs across respondents.

Quickstart Playbook

Follow step-by-step guides for batch upload, realtime replay, and live capture workflows.

Base URL

Deployment specific (e.g., https://api.illocution.ai)

Version

Surfaced via the FastAPI app's title/version metadata and the /metrics endpoint

Contact

[email protected] for keys, quotas, and SLA adjustments

Open CORS

Open CORS enabled for web clients when authenticated

Product Snapshot

| Capability | Description |
| --- | --- |
| Multimodal timeline | Speaker diarization, prosody analysis, and AI agents for emotion, sentiment, cognitive state, commitment, and passion per utterance. |
| Summaries & insights | Structured ConversationSummary, segmentation, key transitions, moment detection, and aggregated stats ready for dashboards. |
| Deployment modes | Batch upload, Realtime Replay (upload-now, stream-now), and Live Capture (chunked ingest from browsers/softphones). |
| Artifacts | Normalized JSON bundles stored per conversation for auditing or re-processing. |

Access, Security, and Limits

| Item | Detail |
| --- | --- |
| Authentication | Pass `X-API-Key: <client key>` on every request. A fallback `api_key` query param is supported for clients that cannot set custom headers. |
| Key provisioning | API keys are configured server-side and can be rotated without downtime. |
| CORS | Open CORS is enabled so web clients can call the API directly when authenticated. |
| Payload size | Default limit `MAX_FILE_SIZE_MB=500`. Oversized uploads fail before processing. |
| Concurrency | Recommended 3 simultaneous batch uploads per tenant; streaming endpoints support one SSE consumer per conversation. |
| Data retention | Uploaded media and artifacts persist on the API host; configure lifecycle management externally if required. |
| TLS | Assume HTTPS termination at the deployment layer (Railway, Render, etc.). |

Missing or invalid API keys result in a `401 Unauthorized` response with `{"detail": "Invalid or missing API key"}`.
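The two authentication conventions above can be sketched as small helpers. This is a minimal sketch assuming only the header and query-param rules in the table; the key value and example paths are placeholders:

```python
from urllib.parse import urlencode

def auth_headers(api_key: str) -> dict:
    """Preferred: send the key as a custom header on every request."""
    return {"X-API-Key": api_key}

def auth_query_url(base_url: str, path: str, api_key: str) -> str:
    """Fallback for clients that cannot set custom headers."""
    return f"{base_url}{path}?{urlencode({'api_key': api_key})}"
```

Prefer the header form where possible; a key passed as a query param tends to end up in server logs and browser history.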

Error Handling Guide

Learn about error codes, retry strategies, and troubleshooting common issues.

Workflow Deep Dive (Data Flow)

The platform offers three ingestion patterns. Each feeds the same downstream analytics stack but differs in how media is supplied and how results are streamed back.

Batch Upload (POST /analyze)

  1. Client sends a single multipart/form-data request with the media file.
  2. Backend processes the file and generates complete analysis results.
  3. Server returns the full JSON payload with all insights and artifacts.
  4. Recommended when latency is less critical than simplicity (e.g., overnight processing).
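The steps above can be sketched with the third-party requests library. The form field name `file` and the timeout are assumptions, not confirmed parameter names; check the Batch Analyze reference for the exact request shape:

```python
import requests

def analyze_url(base_url: str) -> str:
    """Normalize a deployment base URL into the /analyze endpoint."""
    return base_url.rstrip("/") + "/analyze"

def analyze_batch(base_url: str, api_key: str, media_path: str) -> dict:
    """Send one multipart request and block until the full JSON returns."""
    with open(media_path, "rb") as f:
        resp = requests.post(
            analyze_url(base_url),
            headers={"X-API-Key": api_key},
            files={"file": f},  # multipart/form-data upload; field name assumed
            timeout=600,        # batch processing can take minutes
        )
    resp.raise_for_status()
    return resp.json()
```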

Batch Analyze Endpoint

Full endpoint documentation with request parameters, response examples, and error handling.

Realtime Replay (POST /analyze/stream)

  1. Client uploads the entire file and immediately receives an SSE stream.
  2. Analysis results are streamed in real time as processing completes.
  3. Consumers receive incremental updates (status, transcripts, emotion, cognitive, transitions).
  4. Ideal for coaching portals that want “live” feedback from recorded calls without implementing chunk uploads.
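A minimal parser for the incremental updates in step 3, using only the standard library. It assumes each SSE `data:` payload is JSON, which the event names above suggest but the SSE Events reference should confirm:

```python
import json

def parse_sse(stream_lines):
    """Yield (event, payload) tuples from an iterable of SSE text lines."""
    event, data = "message", []
    for line in stream_lines:
        line = line.rstrip("\n")
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].strip())
        elif line == "" and data:
            # Blank line terminates one event; reset for the next.
            yield event, json.loads("\n".join(data))
            event, data = "message", []
```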

Live Capture Sessions (/analyze/live/*)

  1. Client requests a session, receiving chunk, events, and control endpoints.
  2. Audio or video chunks are uploaded sequentially and processed in real time.
  3. Analysis events (final_transcript, emotion, cognitive, transition, moment) are streamed as they become available.
  4. Clients call /control with finalize when the live source ends (or cancel if interrupted). Summary + segmentation follow automatically.
  5. Designed for browser microphones, SIP bridges, or any streaming producer.
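The sequential chunk upload in step 2 can be sketched as a generator over captured bytes. The chunk size here is an illustrative assumption, not a documented requirement; see the Live Capture reference for the actual endpoint contract:

```python
def iter_chunks(audio: bytes, chunk_size: int = 64_000):
    """Yield (sequence_number, chunk_bytes) pairs in upload order."""
    for seq, start in enumerate(range(0, len(audio), chunk_size)):
        yield seq, audio[start:start + chunk_size]
```

Each yielded pair would be POSTed to the session's chunk endpoint in order, followed by a finalize call on /control when the source ends.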

Live Capture Endpoint

Complete guide to session management, chunk uploads, and real-time streaming.

Regardless of mode, the resulting timeline, summary, and artifacts share the same schema, so downstream analytics do not need to care how the conversation was ingested.

Processing Pipeline & Artifacts

Uploaded media undergoes comprehensive analysis to produce multimodal insights including transcripts, emotion detection, cognitive state analysis, prosody metrics, and conversation-level summaries. The platform generates structured outputs including:
  • Timeline - Per-utterance analysis with transcripts, emotions, cognitive signals, and prosody markers
  • Summary - High-level conversation insights and key takeaways
  • Segmentation - Conversation phases with transitions and narrative arcs
  • Moments - Detected key moments such as objections, CTAs, and topic shifts
  • Artifacts - Complete analysis results persisted for auditing or re-processing
Consumers typically ingest the HTTP/SSE response and optionally fetch artifacts for auditing.
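As one illustration of working with the shared timeline output, the sketch below aggregates per-utterance entries into per-speaker emotion counts. The field names `speaker` and `emotion` are hypothetical stand-ins; consult the Response Schemas reference for the real entry shape:

```python
from collections import Counter

def emotion_counts_by_speaker(timeline):
    """Tally detected emotions per speaker across timeline entries."""
    counts = {}
    for entry in timeline:
        counts.setdefault(entry["speaker"], Counter())[entry["emotion"]] += 1
    return counts
```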

Response Schemas

Detailed schemas for timeline entries, summaries, segmentation, and all response types.

Endpoint Overview

| Endpoint | Mode | Use when | Output | Documentation |
| --- | --- | --- | --- | --- |
| POST /analyze | Batch | You have a full recording and can wait for a single JSON payload. | Full analysis JSON in the HTTP response. | Batch Analyze |
| POST /analyze/stream | Realtime Replay | You have the recording but want SSE updates immediately while it processes. | SSE feed with transcripts, signals, summary, status. | Realtime Replay |
| /analyze/live/* | Live Capture | Audio is generated incrementally (mic/softphone). | Chunk ingest + SSE feed. | Live Capture |
| POST /segment/conversation | Re-analysis | You already have a timeline and only need segmentation. | ConversationSegmentation. | Segment Conversation |
| GET /metrics, GET /health | Ops | Health checks and lightweight counters. | JSON. | Metrics & Health |

Each endpoint is documented with request tables, typical responses, and failure semantics. Click the links above for detailed documentation.

Supported Media Formats

  • Video: .mp4, .mov, .webm
  • Audio: .mp3, .wav, .m4a, .aac

Content Types

  • Upload endpoints: Expect multipart/form-data
  • SSE consumers: Must set Accept: text/event-stream
  • JSON endpoints: Use application/json

OpenAPI Specification

View the complete OpenAPI specification

Implementation Tips

  • JavaScript SSE clients: use EventSource/EventSourcePolyfill; close the stream on done. See the SSE Events reference for complete event handling examples.
  • Timeout budget: allow at least media_length + 60s for replay requests; live sessions stay open until you finalize.
  • Artifacts: artifacts can be persisted to external storage (S3, GCS) if you need to retain outputs beyond the container lifecycle.
  • Response structure: review the Response Schemas documentation to understand all available fields and data types.