curl --request POST \
  --url https://api.lighton.ai/api/v3/ask \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "query": "How are JWT tokens signed?",
  "max_results": 5,
  "workspace_id": [
    42
  ]
}
'

{
  "results": [
    {
      "chunk_id": "550e8400-e29b-41d4-a716-446655440000",
      "content": "JWT tokens are signed using RS256 and expire after 1 hour.",
      "score": 0.95,
      "scores": {
        "text": 0.91,
        "vision": null,
        "keyword": 0.43,
        "multivector": 0.78,
        "relevance": 0.95
      },
      "source": {
        "file_id": 512,
        "filename": "auth-system.pdf",
        "title": "Authentication System Design",
        "mime_type": "pdf",
        "size_bytes": 482113,
        "page_start": 3,
        "page_end": 4,
        "total_pages": 12,
        "tags": [
          {
            "id": 7,
            "name": "security"
          }
        ],
        "content_types": [
          {
            "path": "engineering:security",
            "label": "Security",
            "attribute_values": {
              "topic": {
                "value": "authentication",
                "type": "text"
              }
            }
          }
        ],
        "external_metadata": null
      },
      "workspace": {
        "id": 42,
        "name": "Engineering Docs"
      }
    }
  ],
  "answer": "Based on the authentication system design document, JWT tokens are signed using RS256 and have a 1-hour expiration (auth-system.pdf, page 3)."
}
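
The request above can also be sketched in Python using only the standard library. The helper names `build_ask_request` and `ask` are assumptions made for illustration; the URL, headers, and body fields are taken from the curl example.

```python
import json
import urllib.request

ASK_URL = "https://api.lighton.ai/api/v3/ask"


def build_ask_request(token, query, max_results=5, workspace_id=None):
    """Build (but do not send) a POST request mirroring the curl example."""
    body = {"query": query, "max_results": max_results}
    if workspace_id:
        body["workspace_id"] = list(workspace_id)
    return urllib.request.Request(
        ASK_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def ask(token, query, **kwargs):
    """Send the request and return the decoded JSON response."""
    request = build_ask_request(token, query, **kwargs)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes the payload easy to inspect before any network call is made.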

Ask a question over your documents

Retrieval-augmented generation: searches your indexed corpus, then generates an LLM answer grounded in the retrieved passages.

Modes:

stream=false (default): returns a single JSON response with results and answer.
stream=true: returns Server-Sent Events — event: sources (retrieved chunks), event: token (answer tokens), event: done (stream complete), or event: error (generation failure).
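
A minimal sketch of consuming the streaming mode, assuming the response body has already been decoded into text lines. The function names are hypothetical, and the payload shapes (JSON for `sources`, a raw text fragment for `token`) are assumptions, since the page does not specify them.

```python
import json


def iter_sse_events(lines):
    """Yield (event, data) pairs from an iterable of decoded SSE lines."""
    event, data = None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":  # a blank line terminates the current event
            if event is not None or data:
                yield event or "message", "\n".join(data)
            event, data = None, []
            continue
        if line.startswith(":"):  # comment / keep-alive line
            continue
        field, _, value = line.partition(":")
        if value.startswith(" "):  # a single leading space is not part of the value
            value = value[1:]
        if field == "event":
            event = value
        elif field == "data":
            data.append(value)
    if event is not None or data:  # flush an event that lacks a final blank line
        yield event or "message", "\n".join(data)


def collect_answer(lines):
    """Accumulate token events until done; raise on an error event."""
    sources, tokens = None, []
    for event, data in iter_sse_events(lines):
        if event == "sources":
            sources = json.loads(data)  # assumption: JSON payload
        elif event == "token":
            tokens.append(data)  # assumption: raw text fragment
        elif event == "error":
            raise RuntimeError(f"generation failed: {data}")
        elif event == "done":
            break
    return sources, "".join(tokens)
```

Unlike a strict Server-Sent Events parser, this sketch also yields events that carry no `data:` field, so that a bare `event: done` is still observed.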

Model: defaults to mistral-large-latest (flagship, best answer quality). Pass model=alfred-ft5 for the lighter, faster LightOn fine-tune. Company-specific custom models (custom-{company_id}-{uuid}) are also accepted. Any other value returns 422.

Relevance scoring: by default, scoring runs in scoring_and_filtering mode (see the relevance_scoring parameter): candidates are scored for relevance and only those above the quality threshold are used as context. In this mode, score equals the relevance score (scores.relevance, 0–1). Results are returned in descending order of score. If the scoring model is temporarily unavailable, score falls back to the combined retrieval score (higher is better, no fixed upper bound) and scores.relevance is null.
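
Because score changes meaning when the scoring model is unavailable, a client may want to check scores.relevance before comparing values. A minimal sketch, with the helper name `interpret_score` being an assumption:

```python
def interpret_score(result):
    """Return (score, kind), where kind says how `score` should be read."""
    relevance = result["scores"].get("relevance")
    if relevance is None:
        # Fallback: combined retrieval score, higher is better, no fixed upper bound.
        return result["score"], "retrieval"
    # Normal case: relevance score on a 0-1 scale.
    return relevance, "relevance"
```

Thresholds tuned on the 0–1 relevance scale should not be applied when the kind is "retrieval".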

Scoping: same rules as /api/v3/search — use workspace_id and/or tag_id to narrow results, or file_id to target specific files. file_id cannot be combined with workspace_id or tag_id (422).
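
The combination rule can be checked client-side before sending, which avoids a round trip that would end in a 422. A minimal sketch; `build_scope` is a hypothetical helper, not part of any SDK:

```python
def build_scope(workspace_id=None, tag_id=None, file_id=None):
    """Return the scoping fields for a request body, rejecting invalid mixes."""
    if file_id and (workspace_id or tag_id):
        raise ValueError("file_id cannot be combined with workspace_id or tag_id")
    scope = {}
    for name, ids in (
        ("workspace_id", workspace_id),
        ("tag_id", tag_id),
        ("file_id", file_id),
    ):
        if ids:
            scope[name] = [int(i) for i in ids]
    return scope
```

For example, `build_scope(workspace_id=[42], tag_id=[7])` yields both fields, while adding `file_id` to that call raises `ValueError`.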

Facet filtering: use content_type and attribute to narrow results by facet metadata. Content type uses colon-separated paths (e.g. legal:contract:nda). Repeated attribute entries are ANDed; values inside one entry are ORed with | (pipe, recommended). Example: attribute=fiscal_year:2024|2025&attribute=status:active → (fiscal_year 2024 OR 2025) AND (status active). Supports operators (>, >=, <, <=), prefix (name:prefix*), smart dates, and content-type scoping.
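
The AND/OR structure of attribute entries can be assembled from a plain mapping. A minimal sketch, assuming the entries are sent as the `attribute` string array in the JSON body; `build_attribute_filters` is a hypothetical helper:

```python
def build_attribute_filters(conditions):
    """Turn {name: value-or-values} into a list of `attribute` entries.

    Values within one entry are joined with `|` (OR); separate entries are ANDed.
    """
    entries = []
    for name, values in conditions.items():
        if isinstance(values, (str, int, float)):
            values = [values]  # a single value becomes a one-item list
        entries.append(f"{name}:{'|'.join(str(v) for v in values)}")
    return entries
```

Calling it with `{"fiscal_year": [2024, 2025], "status": "active"}` reproduces the example above: (fiscal_year 2024 OR 2025) AND (status active).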

If the reranker is temporarily unavailable, results are returned in retrieval order and each result item includes a warnings array. Each warning has a code matching the degraded scores key (e.g. relevance) and a reason classifying the failure: model_not_found, timeout, service_error, or unknown. The warnings key is absent from result items when all pipeline steps succeed.
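
Since the warnings key is absent on success, clients should read it defensively. A minimal sketch that gathers degraded results from a response; the helper name `collect_warnings` is an assumption, while the `code` and `reason` fields are those described above:

```python
def collect_warnings(response):
    """Map chunk_id -> list of (code, reason) for results that carry warnings."""
    degraded = {}
    for result in response.get("results", []):
        warnings = result.get("warnings")  # key is absent when all steps succeed
        if warnings:
            degraded[result["chunk_id"]] = [
                (w["code"], w["reason"]) for w in warnings
            ]
    return degraded
```

An empty mapping means every pipeline step succeeded for every returned chunk.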

Billing: 1 search-with-generation credit per request.

POST /api/v3/ask

Authorizations

Authorization

string

header

required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Body

DRF serializer mixin providing content_type and attribute fields.

Compose into any request serializer via multiple inheritance::

class SearchRequestSerializer(FacetFilterFieldsMixin, serializers.Serializer):
    query = serializers.CharField(...)
    # content_type and attribute inherited from the mixin

query

string

required

Natural-language question. Maximum 1500 characters.

Maximum string length: 1500

content_type

string[]

Filter by content type path. Multiple values are ORed. Matching is exact-or-subtree by default (e.g. legal matches legal and legal:contract). Wildcards: *contract* (contains), legal:contract* (prefix).
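
The matching rules can be illustrated with a small client-side model. This is only a sketch of the semantics as described here, not the server's implementation; the function name is hypothetical, and case-sensitive comparison is an assumption.

```python
from fnmatch import fnmatchcase


def matches_content_type(pattern, path):
    """Approximate the documented content_type matching for one pattern."""
    if "*" in pattern:
        # Wildcard forms: *contract* (contains), legal:contract* (prefix).
        return fnmatchcase(path, pattern)
    # Default: exact match, or any descendant in the colon-separated tree.
    return path == pattern or path.startswith(pattern + ":")


def matches_any(patterns, path):
    """Multiple content_type values are ORed."""
    return any(matches_content_type(p, path) for p in patterns)
```

Note that subtree matching requires a colon boundary, so in this model `legal` does not match a sibling path such as `legalese`.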

attribute

string[]

Filter by attribute value. Repeated attribute entries are ANDed; values inside one entry are ORed with | (pipe is the recommended OR delimiter — comma also works but can be ambiguous with multi-key values). Example: attribute=fiscal_year:2024|2025&attribute=status:active → (fiscal_year 2024 OR 2025) AND (status active). Formats: name (has any value), name:value (exact), name:>value / name:>=value (gt/gte), name:<value / name:<=value (lt/lte), name:prefix* (starts with, case-insensitive), name:*text* (contains, case-insensitive), name:a|b (OR). Smart dates: filing_date:2023 (year), filing_date:2023-06 (month). Type-aware: booleans (true/false), multi-select (membership check). Scoped: content_type(legal:compliance).regulation:AML.

max_results

integer

default:10

Maximum number of chunks to retrieve for context. Range: 1–50.

Required range: 1 <= x <= 50

workspace_id

integer[]

Restrict search to these workspace IDs. Cannot combine with file_id.

tag_id

integer[]

Restrict to documents carrying any of these tag IDs (OR). Cannot combine with file_id.

file_id

integer[]

Restrict to specific file IDs. Cannot combine with workspace_id or tag_id.

relevance_scoring

enum<string>

default:scoring_and_filtering

Controls the relevance scoring step used during retrieval. "none": skip scoring (lowest latency; the relevance score is null in each result). "scoring_only": score every candidate but return them all. "scoring_and_filtering" (default, used when the field is omitted): score candidates and keep only those above the quality threshold.

Available options: none, scoring_only, scoring_and_filtering

stream

boolean

default:false

When true, response is streamed as Server-Sent Events.

model

string

default:mistral-large-latest

LLM used for answer generation. Standard values:

mistral-large-latest: Mistral Large 2, the flagship general-purpose model. Best answer quality (default).
alfred-ft5: Alfred FT5, the LightOn fine-tuned model, lighter and faster for straightforward questions.

Custom model technical names (e.g. custom-{company_id}-{uuid}) are also accepted.

Response

Synchronous mode (stream=false): complete answer with sources.

Streaming mode (stream=true): Server-Sent Events with event: sources, event: token, and event: done (or event: error).

results

object[]

required

Retrieved chunks used as context, ordered by relevance score descending.

answer

string

required

LLM-generated answer grounded in the retrieved results.
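
Each result carries enough source metadata to display a citation next to the answer. A minimal sketch using the fields from the sample response; `format_citation` is a hypothetical helper, and treating page_start and page_end as possibly null is an assumption:

```python
def format_citation(result):
    """Render 'filename, pages X-Y' from a result's source metadata."""
    source = result["source"]
    start, end = source.get("page_start"), source.get("page_end")
    if start is None:
        pages = ""  # no page information available
    elif end is None or end == start:
        pages = f", page {start}"
    else:
        pages = f", pages {start}-{end}"
    return f"{source['filename']}{pages}"


def list_citations(response):
    """Citations for all results, in the order returned (score descending)."""
    return [format_citation(r) for r in response.get("results", [])]
```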
