Before you can search, your documents need to be in LightOn. Uploading a file triggers an ingestion pipeline that parses the content, splits it into chunks, generates embeddings, and indexes everything. The whole process typically takes a few seconds for a standard PDF. Ingestion is asynchronous: the upload returns immediately with a pending status, and you poll GET /api/v3/files/{id} until processing completes.
This tutorial covers POST /api/v3/files and GET /api/v3/files. The full schema for every endpoint and parameter lives in the API reference.

Upload a file

Send the file as multipart/form-data with the destination workspace_id:
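import os

import requests

# Read the key from the environment: Python does not expand "$CONSOLE_API_KEY" inside a string literal.
headers = {"Authorization": f"Bearer {os.environ['CONSOLE_API_KEY']}"}

# Open the file in a with-block so the handle is closed once the upload finishes.
with open("handbook.pdf", "rb") as f:
    response = requests.post(
        "https://api.lighton.ai/api/v3/files",
        headers=headers,
        data={"workspace_id": 42},
        files={"file": f},
    )

file = response.json()
print(file["id"], file["status"], file["upload_session_uuid"])
# → 12345 pending 550e8400-e29b-41d4-a716-446655440000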
The response is a 201 with the new file record, including an upload_session_uuid you can use later to find every file uploaded in the same batch.

Wait for indexing to complete

Poll GET /api/v3/files/{id} until status reaches embedded:
import time

file_id = file["id"]
while True:
    r = requests.get(f"https://api.lighton.ai/api/v3/files/{file_id}", headers=headers)
    body = r.json()
    if body["status"] == "embedded":
        print("Ready to search")
        break
    if body["status"] in ("parsing_failed", "embedding_failed", "fail"):
        print("Ingestion failed:", body.get("status_detail"))
        break
    time.sleep(2)
The status field moves through these stages:
Status            What’s happening
pending           Queued for processing
parsing           Extracting text from the document
parsing_failed    Parsing failed; see status_detail
embedding         Generating vector embeddings
embedding_failed  Embedding failed; see status_detail
embedded          Indexed and ready to search
updating          Re-indexing in progress
fail              Generic failure; see status_detail
status_vision tracks the same lifecycle for vision/image embeddings: pending, processing, embedded, fail, or - (not available for this file).
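
The polling loop shown earlier runs until the file reaches a terminal status, so a file stuck in pending would block forever. Below is a minimal sketch of the same loop with a deadline. The helper name wait_until_embedded, its parameters, and the 120-second default are assumptions made for illustration; they are not part of the LightOn API. The helper takes a fetch callable, so it can be exercised without network access.

```python
import time
from typing import Callable

# Failure statuses taken from the status table above.
FAILED = {"parsing_failed", "embedding_failed", "fail"}


def wait_until_embedded(
    fetch: Callable[[], dict],
    timeout: float = 120.0,   # assumed default, tune for your documents
    interval: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch() until the file is embedded, has failed, or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        body = fetch()
        status = body.get("status")
        if status == "embedded":
            return body
        if status in FAILED:
            raise RuntimeError(
                f"Ingestion failed ({status}): {body.get('status_detail')}"
            )
        if time.monotonic() >= deadline:
            raise TimeoutError(f"File still in status {status!r} after {timeout}s")
        sleep(interval)
```

With requests, fetch could be lambda: requests.get(f"https://api.lighton.ai/api/v3/files/{file_id}", headers=headers).json(), reusing file_id and headers from the examples above.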

Organising documents with tags and titles

Add a human-readable title and assign tag IDs at upload time. Tags can be sent as a JSON-encoded array string or as repeated form fields with the same name.
requests.post(
    "https://api.lighton.ai/api/v3/files",
    headers=headers,
    data={
        "workspace_id": 42,
        "title": "Q4 Financial Report",
        "tags": "[1, 2]",        # JSON-encoded list of tag IDs
    },
    files={"file": open("q4-report.pdf", "rb")},
)
If a tag ID is invalid, the file is still created, but the response is a 207 (Multi-Status) with a message explaining which tags were rejected. To replace tags after upload, send PATCH /api/v3/files/{id} with a new tags array; it replaces all existing tags, both manual and auto-assigned. When using multipart format, send the sentinel value [0] to remove every tag. To add tags without touching existing ones, use POST /api/v3/files/{id}/tags.
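
The two tag encodings and the [0] sentinel can be sketched as small helpers that build the form data passed to requests. The function names are hypothetical, and the sketch assumes the sentinel is JSON-encoded the same way as a normal tag list; check the API reference before relying on that.

```python
import json


def tags_as_json(tag_ids: list[int]) -> dict[str, str]:
    """Encode tags as a single JSON-encoded string field."""
    return {"tags": json.dumps(tag_ids)}


def tags_as_repeated_fields(tag_ids: list[int]) -> list[tuple[str, str]]:
    """Encode tags as repeated form fields that share the name 'tags'."""
    return [("tags", str(tag_id)) for tag_id in tag_ids]


def tags_for_multipart_patch(tag_ids: list[int]) -> dict[str, str]:
    """Build the tags field for a multipart PATCH; an empty list becomes the [0] sentinel."""
    return {"tags": json.dumps(tag_ids if tag_ids else [0])}
```

requests accepts both shapes for its data argument: a dict for the JSON form, or a list of (name, value) tuples for repeated fields, for example data=[("workspace_id", "42"), *tags_as_repeated_fields([1, 2])].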

Tracking documents from external systems

If you’re ingesting documents from a third-party system (ServiceNow, Confluence, SharePoint, etc.), store the source identifier in external_metadata. This lets you find the LightOn file later given only the external ID, and surface the original URL in your UI.
import json

requests.post(
    "https://api.lighton.ai/api/v3/files",
    headers=headers,
    data={
        "workspace_id": 42,
        "external_metadata": json.dumps({
            "external_id": "SRV-456789",
            "doc_type": "incident",
            "additional_metadata": {
                "external_url": "https://servicenow.example.com/incident/SRV-456789",
            },
        }),
    },
    files={"file": open("srv-456789.pdf", "rb")},
)
external_id is required when creating; doc_type and additional_metadata are optional. When sent via multipart/form-data, the whole external_metadata value must be a JSON string. Retrieve it later by external ID:
GET /api/v3/files?external_metadata__external_id=SRV-456789
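
External IDs that come from another system may contain spaces, ampersands, or other characters that need escaping in a query string. A minimal sketch of building the lookup URL safely follows; the helper name external_id_lookup_url is an assumption, and the shape of the response body is not covered here.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.lighton.ai/api/v3/files"


def external_id_lookup_url(external_id: str) -> str:
    """Return the GET URL that filters files by their external ID."""
    query = urlencode({"external_metadata__external_id": external_id})
    return f"{BASE_URL}?{query}"


print(external_id_lookup_url("SRV-456789"))
```

When calling through requests, passing params={"external_metadata__external_id": external_id} applies the same escaping.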

Listing and filtering your documents

GET /api/v3/files supports rich filtering. A few common patterns:
# All files in a workspace
requests.get("https://api.lighton.ai/api/v3/files", headers=headers,
             params={"workspace_id": "42"})

# Semantic search across filenames and titles, with the top chunk inline
requests.get("https://api.lighton.ai/api/v3/files", headers=headers,
             params={"search": "security policy", "search_details": True})

# PDFs tagged 'legal', most recent first
requests.get("https://api.lighton.ai/api/v3/files", headers=headers,
             params={"tag_id": "3", "extension": "pdf", "ordering": "-created_at"})

# Files in a 10–50 page window
requests.get("https://api.lighton.ai/api/v3/files", headers=headers,
             params={"total_pages_min": 10, "total_pages_max": 50})
Set include_details=true to receive the signature (TLSH hash for duplicate detection) and parser fields on each result. For schema-driven filtering, the attribute and content_type query parameters support facet filters with operators (=, >, <, * for prefix matching, and | or , for OR). See the API reference for the full DSL.
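
One detail worth knowing when combining these filters from Python: requests serialises the boolean True as the string "True", while the parameter is written above as include_details=true. Whether the API also accepts the capitalised form is not stated here, so the sketch below normalises booleans to lowercase and drops unset filters. The helper name file_list_params is hypothetical.

```python
def file_list_params(**filters) -> dict[str, str]:
    """Build query parameters for GET /api/v3/files from keyword filters."""
    params = {}
    for name, value in filters.items():
        if value is None:
            continue  # skip filters the caller left unset
        if isinstance(value, bool):
            params[name] = "true" if value else "false"
        else:
            params[name] = str(value)
    return params


params = file_list_params(workspace_id=42, extension="pdf", include_details=True, tag_id=None)
print(params)
```

The result can be passed directly as the params argument of requests.get.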

Deleting files

Single delete:
requests.delete(f"https://api.lighton.ai/api/v3/files/{file_id}", headers=headers)
Bulk delete:
requests.post(
    "https://api.lighton.ai/api/v3/files/bulk-delete",
    headers=headers,
    json={"ids": [123, 124, 125]},
)
Both return 204 No Content on success. Files in synced (datasource-managed) workspaces cannot be deleted manually — the API returns 400.

Common errors

Status  Cause
400     Validation error, unsupported file type, or synced-workspace constraint
401     Missing or invalid API key
403     Permission denied (no upload/delete rights)
404     File does not exist or is not accessible
429     Too many concurrent uploads for this session
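
A client can turn the table above into readable error messages and retry only the 429 case, which signals a temporary concurrency limit. The sketch below is one possible approach, not documented LightOn guidance: the exponential backoff schedule, the attempt limit, and the helper name call_with_retry are all assumptions. The send callable returns a (status_code, body) pair so the logic can be tested without network access.

```python
import time
from typing import Any, Callable

# Causes copied from the error table above.
ERROR_CAUSES = {
    400: "Validation error, unsupported file type, or synced-workspace constraint",
    401: "Missing or invalid API key",
    403: "Permission denied (no upload/delete rights)",
    404: "File does not exist or is not accessible",
    429: "Too many concurrent uploads for this session",
}


def call_with_retry(
    send: Callable[[], tuple[int, Any]],
    max_attempts: int = 4,      # assumed limit
    base_delay: float = 1.0,    # assumed backoff: 1s, 2s, 4s, ...
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Run send(), retrying on 429 with exponential backoff; raise on other errors."""
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            break
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)
    if status >= 400:
        cause = ERROR_CAUSES.get(status, "Unexpected error")
        raise RuntimeError(f"{status}: {cause}")
    return body
```

With requests, send could wrap an upload and return (response.status_code, response.json()) for the caller.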