Ask a question over your documents
Retrieval-augmented generation: searches your indexed corpus, then generates an LLM answer grounded in the retrieved passages.
Modes:
stream=false (default): returns a single JSON response with results and answer.
stream=true: returns Server-Sent Events: event: sources (retrieved chunks), event: token (answer tokens), event: done (stream complete), or event: error (generation failure).
Model: defaults to mistral-large-latest (flagship, best answer quality). Pass model=alfred-ft5 for the lighter, faster LightOn fine-tune. Only these two models are supported; any other value returns 422.
Scoping: follows the same rules as /api/v3/search. Use workspace_id and/or tag_id to narrow results, or file_id to target specific files. file_id cannot be combined with workspace_id or tag_id (returns 422).
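The scoping rule can be checked client-side before a request is sent, which avoids a round trip that would end in a 422. The following is a minimal sketch: the helper name build_scope is hypothetical, and the ID element types are left open because this page does not state them. Only the field names workspace_id, tag_id, and file_id come from the text above.

```python
from typing import Dict, List, Optional, Sequence


def build_scope(
    workspace_id: Optional[Sequence] = None,
    tag_id: Optional[Sequence] = None,
    file_id: Optional[Sequence] = None,
) -> Dict[str, List]:
    """Return the scoping fields to merge into a request body.

    Hypothetical helper: mirrors the documented rule that file_id
    cannot be combined with workspace_id or tag_id.
    """
    if file_id and (workspace_id or tag_id):
        raise ValueError("file_id cannot be combined with workspace_id or tag_id")

    scope: Dict[str, List] = {}
    for key, value in (
        ("workspace_id", workspace_id),
        ("tag_id", tag_id),
        ("file_id", file_id),
    ):
        if value:  # skip filters that are unset or empty
            scope[key] = list(value)
    return scope
```

Combining workspace_id and tag_id is allowed, so build_scope(workspace_id=["w1"], tag_id=["t1"]) returns both keys, while adding file_id to either raises ValueError locally.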
Billing: 1 search-with-generation credit per request.
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
Body
Natural-language question. Maximum 1500 characters.
Maximum number of chunks to retrieve for context. Range: 1–50.
Restrict search to these workspace IDs. Cannot combine with file_id.
Restrict to documents carrying any of these tag IDs (OR). Cannot combine with file_id.
Restrict to specific file IDs. Cannot combine with workspace_id or tag_id.
When true, the response is streamed as Server-Sent Events. Defaults to false.
LLM used for answer generation. Allowed values:
mistral-large-latest: Mistral Large 2, flagship general-purpose model. Best answer quality (default).
alfred-ft5: Alfred FT5, LightOn fine-tuned model, lighter and faster for straightforward questions.
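The limits listed for the body (question length, chunk range, allowed models) can be mirrored in a small pre-flight check. This is a sketch under stated assumptions: check_ask_inputs and its argument names question and max_chunks are hypothetical Python names, not the API's field names, which this page does not show for those two parameters.

```python
ALLOWED_MODELS = ("mistral-large-latest", "alfred-ft5")
DEFAULT_MODEL = "mistral-large-latest"
MAX_QUESTION_CHARS = 1500
MIN_CHUNKS, MAX_CHUNKS = 1, 50


def check_ask_inputs(question: str, max_chunks: int, model: str = DEFAULT_MODEL) -> None:
    """Raise ValueError if a value falls outside the documented limits.

    Argument names are illustrative only; they are not taken from the API.
    """
    if len(question) > MAX_QUESTION_CHARS:
        raise ValueError(f"question exceeds {MAX_QUESTION_CHARS} characters")
    if not MIN_CHUNKS <= max_chunks <= MAX_CHUNKS:
        raise ValueError(f"chunk count must be between {MIN_CHUNKS} and {MAX_CHUNKS}")
    if model not in ALLOWED_MODELS:
        # The API is documented to answer 422 for any other model value.
        raise ValueError(f"unsupported model: {model!r}")
```

A client-side check like this only duplicates the server's validation; the server remains the source of truth, so a 422 should still be handled.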
Response
Synchronous mode (stream=false): complete answer with sources.
Streaming mode (stream=true): Server-Sent Events with event: sources, event: token, and event: done (or event: error).
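In streaming mode a client has to split the event stream into individual events and dispatch on the event name. The sketch below parses generic Server-Sent Events text lines and handles the four event names listed above. It makes two assumptions this page does not confirm: that each token event carries plain answer text in its data field, and that the sources payload can be kept as an opaque string. The function names are hypothetical.

```python
from typing import Iterable, Iterator, Optional, Tuple


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event, data) pairs from an iterable of SSE text lines."""
    event, data, seen = "message", [], False
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if seen:
                yield event, "\n".join(data)
            event, data, seen = "message", [], False
            continue
        if line.startswith(":"):
            continue  # SSE comment, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # SSE strips one leading space from the value
        if field == "event":
            event, seen = value, True
        elif field == "data":
            data.append(value)
            seen = True
    if seen:
        yield event, "\n".join(data)


def collect_answer(lines: Iterable[str]) -> Tuple[Optional[str], str]:
    """Return (raw sources payload, concatenated answer text)."""
    sources, tokens = None, []
    for event, data in iter_sse_events(lines):
        if event == "sources":
            sources = data  # payload shape is not specified here; kept raw
        elif event == "token":
            tokens.append(data)  # assumption: data is plain answer text
        elif event == "error":
            raise RuntimeError(f"generation failed: {data}")
        elif event == "done":
            break
    return sources, "".join(tokens)
```

With an HTTP client that exposes the response body as decoded text lines, the line iterator can be passed directly to collect_answer; for incremental display, loop over iter_sse_events and print each token as it arrives.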