This tutorial uses POST /api/v3/files/{file_id}/facets and GET /api/v3/files/{file_id}/facets. The full schema lives in the API reference.
Your classification tree exists. Now you’ll classify actual documents and fill in their metadata, turning unstructured files into queryable, structured data. Applying facets to a document is a two-step process:
  1. Classify: tell LightOn what type this document is (contract:nda)
  2. Set values: fill in the attribute values for that classification (counterparty, jurisdiction, …)
All actions are idempotent. Send a JSON array to batch multiple actions. LightOn queues exactly one BM25 reindex per batch, regardless of how many actions it contains.

Step 1: Upload a document

Before classifying, you need a file. If you already have one, skip this step and use your existing file_id.
Replace workspace_id with your own. You can list your workspaces via GET /api/v3/workspaces.
upload_file.py
import os
import requests

headers = {"Authorization": f"Bearer {os.environ['LIGHTON_API_KEY']}"}
workspace_id = 42  # replace with your workspace ID

with open("acme_nda_2025.pdf", "rb") as f:
    response = requests.post(
        "https://api.lighton.ai/api/v3/files",
        headers=headers,
        files={"file": f},
        data={"workspace_id": workspace_id, "title": "NDA with Acme Corp"},
    )
print(response.json())
The response has an id field: that’s your file_id. Ingestion is asynchronous (status: pending), but you don’t need to wait before classifying.

Step 2: Classify the document

Tell LightOn this file is a contract:nda:
classify_file.py
import os
import requests

file_id = 1234  # replace with your file ID
headers = {"Authorization": f"Bearer {os.environ['LIGHTON_API_KEY']}"}

response = requests.post(
    f"https://api.lighton.ai/api/v3/files/{file_id}/facets",
    headers=headers,
    json={"action": "classify", "content_type_path": "contract:nda"},
)
print(response.json())
Returns 201 Created (or 200 if already classified, since classify is idempotent). This is lightweight: it only creates the link between the file and the content type path. No attribute values are set yet. A file can hold classifications from multiple trees, but only one per tree. If you need to reclassify within the same tree, see Rules & constraints for what’s allowed.

Step 3: Set attribute values

Fill in the structured metadata. Notice that you can set attributes like counterparty and jurisdiction on contract:nda even though they were defined on the parent contract node. This works because inherit_attributes is true in the classification tree you built earlier.
set_values.py
import os
import requests

file_id = 1234  # replace with your file ID
headers = {"Authorization": f"Bearer {os.environ['LIGHTON_API_KEY']}"}
url = f"https://api.lighton.ai/api/v3/files/{file_id}/facets"

actions = [
    {"action": "set_value", "content_type_path": "contract:nda", "attribute_name": "counterparty", "value": "Acme Corp"},
    {"action": "set_value", "content_type_path": "contract:nda", "attribute_name": "jurisdiction", "value": ["FR", "DE"]},
    {"action": "set_value", "content_type_path": "contract:nda", "attribute_name": "effective_date", "value": "2025-03-01"},
    {"action": "set_value", "content_type_path": "contract:nda", "attribute_name": "signed", "value": True},
    {"action": "set_value", "content_type_path": "contract:nda", "attribute_name": "is_mutual", "value": True},
    {"action": "set_value", "content_type_path": "contract:nda", "attribute_name": "duration_years", "value": 3},
]

for payload in actions:
    response = requests.post(url, headers=headers, json=payload)
    print(response.json())
The file must be classified before you can set values. Each attribute type enforces strict validation. See Rules & constraints for the full type-by-type reference.

The efficient way: batch everything in one call

In production, always batch the classify + all set_value actions together. This triggers exactly one BM25 reindex, not seven.
batch_apply.py
import os
import requests

file_id = 1234  # replace with your file ID
headers = {"Authorization": f"Bearer {os.environ['LIGHTON_API_KEY']}"}

response = requests.post(
    f"https://api.lighton.ai/api/v3/files/{file_id}/facets",
    headers=headers,
    json=[
        {"action": "classify", "content_type_path": "contract:nda"},
        {"action": "set_value", "content_type_path": "contract:nda", "attribute_name": "counterparty", "value": "Acme Corp"},
        {"action": "set_value", "content_type_path": "contract:nda", "attribute_name": "jurisdiction", "value": ["FR", "DE"]},
        {"action": "set_value", "content_type_path": "contract:nda", "attribute_name": "effective_date", "value": "2025-03-01"},
        {"action": "set_value", "content_type_path": "contract:nda", "attribute_name": "signed", "value": True},
        {"action": "set_value", "content_type_path": "contract:nda", "attribute_name": "is_mutual", "value": True},
        {"action": "set_value", "content_type_path": "contract:nda", "attribute_name": "duration_years", "value": 3},
    ],
)
print(response.json())
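If your metadata already lives in a dictionary, you may prefer to generate the batch rather than write each action by hand. The sketch below is one possible approach, not part of the LightOn SDK: build_facet_batch is a hypothetical helper name, and it simply reproduces the action shapes used in batch_apply.py.

```python
from typing import Any


def build_facet_batch(content_type_path: str, values: dict[str, Any]) -> list[dict[str, Any]]:
    """Return one classify action followed by one set_value action per attribute.

    Hypothetical helper: it only mirrors the action dictionaries shown in
    batch_apply.py and performs no validation of attribute types.
    """
    batch: list[dict[str, Any]] = [
        {"action": "classify", "content_type_path": content_type_path}
    ]
    for attribute_name, value in values.items():
        batch.append(
            {
                "action": "set_value",
                "content_type_path": content_type_path,
                "attribute_name": attribute_name,
                "value": value,
            }
        )
    return batch


nda_batch = build_facet_batch(
    "contract:nda",
    {"counterparty": "Acme Corp", "jurisdiction": ["FR", "DE"], "duration_years": 3},
)
print(len(nda_batch))  # one classify action plus three set_value actions
```

You would then pass the resulting list as the json= argument of the same POST call, so the whole batch still goes out in a single request.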

Step 4: Read back the full facets on a file

read_facets.py
import os
import requests

file_id = 1234  # replace with your file ID
headers = {"Authorization": f"Bearer {os.environ['LIGHTON_API_KEY']}"}

response = requests.get(
    f"https://api.lighton.ai/api/v3/files/{file_id}/facets",
    headers=headers,
)
print(response.json())
The response includes:
  • labels: breadcrumb from root to leaf (e.g. ["Contract", "Non-Disclosure Agreement"]), useful for display in a UI
  • inherited: true: attributes that came from a parent node (contract), like counterparty and jurisdiction
  • inherited: false: attributes defined directly on contract:nda, like is_mutual and duration_years
  • can_edit: true: whether the current user can modify this file’s facets
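A UI often needs to show inherited attributes separately from the ones defined on the leaf. The sketch below illustrates one way to split them using the inherited flag. The exact response layout is an assumption here: the code supposes each attribute arrives as a dictionary with name, value, and inherited keys, and split_by_inheritance is a hypothetical helper. Check the API reference for the real schema before relying on these key names.

```python
from typing import Any


def split_by_inheritance(
    attributes: list[dict[str, Any]],
) -> tuple[list[dict[str, Any]], list[dict[str, Any]]]:
    """Split attribute entries into (inherited, own) using the `inherited` flag.

    Assumption: each entry is a dict carrying an `inherited` boolean.
    Entries without the flag are treated as not inherited.
    """
    inherited = [a for a in attributes if a.get("inherited")]
    own = [a for a in attributes if not a.get("inherited")]
    return inherited, own


# Hand-written sample data in the ASSUMED shape, not a real API response.
sample_attributes = [
    {"name": "counterparty", "value": "Acme Corp", "inherited": True},
    {"name": "jurisdiction", "value": ["FR", "DE"], "inherited": True},
    {"name": "is_mutual", "value": True, "inherited": False},
    {"name": "duration_years", "value": 3, "inherited": False},
]

from_parent, from_leaf = split_by_inheritance(sample_attributes)
print([a["name"] for a in from_parent])
print([a["name"] for a in from_leaf])
```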

Multi-classification: a document with two types

A file can be classified under multiple content types, but only from different trees. You cannot have two classifications from the same tree (see Rules & constraints). For example, if you have a second tree called regulation, you can classify a file as both contract:nda and regulation:gdpr:
multi_classify.py
import os
import requests

file_id = 1234  # replace with your file ID
headers = {"Authorization": f"Bearer {os.environ['LIGHTON_API_KEY']}"}

# The file is already classified as contract:nda (from Step 2).
# Multi-classification requires a DIFFERENT tree. You cannot have two
# classifications in the same tree. Here we add a second classification
# from the "regulation" tree.

response = requests.post(
    f"https://api.lighton.ai/api/v3/files/{file_id}/facets",
    headers=headers,
    json=[
        {"action": "classify", "content_type_path": "regulation:gdpr"},
        {"action": "set_value", "content_type_path": "regulation:gdpr", "attribute_name": "compliance_status", "value": "under-review"},
    ],
)
print(response.json())
The file now has two classifications from two independent trees. Each classification has its own set of attribute values. They don’t interact.
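Before sending several classify actions, you can check locally that no two paths belong to the same tree. The sketch below assumes the tree is identified by the first colon-separated segment of the path (contract in contract:nda), which matches the examples in this tutorial. The function names and the contract:msa path are hypothetical, and the server remains the authority on what is allowed.

```python
from collections import defaultdict


def tree_of(content_type_path: str) -> str:
    """Return the root segment of a path, assumed to name its tree."""
    return content_type_path.split(":", 1)[0]


def find_same_tree_conflicts(paths: list[str]) -> dict[str, list[str]]:
    """Map each tree to its paths when more than one distinct path targets it."""
    by_tree: dict[str, set[str]] = defaultdict(set)
    for path in paths:
        by_tree[tree_of(path)].add(path)
    return {
        tree: sorted(members)
        for tree, members in by_tree.items()
        if len(members) > 1
    }


# Two different trees: no conflict is reported.
print(find_same_tree_conflicts(["contract:nda", "regulation:gdpr"]))

# Two paths in the "contract" tree (contract:msa is a made-up example path).
print(find_same_tree_conflicts(["contract:nda", "contract:msa"]))
```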

Remove a classification or a value

clear_unclassify.py
import os
import requests

file_id = 1234  # replace with your file ID
headers = {"Authorization": f"Bearer {os.environ['LIGHTON_API_KEY']}"}
url = f"https://api.lighton.ai/api/v3/files/{file_id}/facets"

response_clear = requests.post(
    url,
    headers=headers,
    json={"action": "clear_value", "content_type_path": "contract:nda", "attribute_name": "duration_years"},
)

response_unclassify = requests.post(
    url,
    headers=headers,
    json={"action": "unclassify", "content_type_path": "contract:nda"},
)

print(response_clear.status_code, response_unclassify.status_code)
Both actions return 204 No Content on success. unclassify cascades: it removes the classification and all its attribute values. See Rules & constraints for all cascade behaviors.
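Because a 204 response carries no body, calling response.json() on it raises a decoding error in requests. If you route every facets call through shared code, a small guard like the one sketched below avoids that. decode_facets_response is a hypothetical helper name, and the sample body is made up for illustration.

```python
import json
from typing import Any, Optional


def decode_facets_response(status_code: int, body: str) -> Optional[Any]:
    """Return parsed JSON, or None when the response has no content.

    Hypothetical helper: treats 204 and empty bodies as "nothing to parse".
    """
    if status_code == 204 or not body:
        return None
    return json.loads(body)


# A 204 from clear_value or unclassify yields None instead of an exception.
print(decode_facets_response(204, ""))

# Any response with a JSON body is parsed as usual (sample body is invented).
print(decode_facets_response(201, '{"example": true}'))
```

With requests, you would call it as decode_facets_response(response.status_code, response.text).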

Action reference

Action | What it does
classify | Assign a content type path to a file (idempotent)
unclassify | Remove a classification and cascade-delete all its attribute values
set_value | Set an attribute value on a classified file (idempotent)
clear_value | Remove a single attribute value
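To catch malformed actions before they reach the API, you could check each dictionary against the fields its action uses. The field sets below are inferred from the examples in this tutorial rather than taken from the official schema, so treat them as an assumption. REQUIRED_FIELDS and missing_fields are hypothetical names.

```python
from typing import Any

# Fields per action, inferred from this tutorial's examples (an assumption,
# not the authoritative schema).
REQUIRED_FIELDS: dict[str, set[str]] = {
    "classify": {"content_type_path"},
    "unclassify": {"content_type_path"},
    "set_value": {"content_type_path", "attribute_name", "value"},
    "clear_value": {"content_type_path", "attribute_name"},
}


def missing_fields(action: dict[str, Any]) -> list[str]:
    """Return the sorted names of fields the action lacks.

    Raises ValueError when the action name is not one of the four known ones.
    """
    name = action.get("action")
    if name not in REQUIRED_FIELDS:
        raise ValueError(f"unknown action: {name!r}")
    return sorted(REQUIRED_FIELDS[name] - set(action))


# A set_value action that forgot its attribute name and value.
incomplete = {"action": "set_value", "content_type_path": "contract:nda"}
print(missing_fields(incomplete))

# A complete clear_value action has nothing missing.
complete = {
    "action": "clear_value",
    "content_type_path": "contract:nda",
    "attribute_name": "duration_years",
}
print(missing_fields(complete))
```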