“Give me all NDAs, signed, valid in France, where the counterparty is Acme.” That query doesn’t rely on finding those exact words in the document body. It relies on structured metadata you’ve attached to the file: a type (nda), a flag (signed = true), a jurisdiction, a counterparty name. This works for any document type: contracts, invoices, HR policies, technical specs, whatever your app needs.
Workspaces, tags, or facets?
Facets aren’t the only way to organise documents. Workspaces are containers that isolate a team’s or customer’s files, and tags are flat labels that group files into collections (even across workspaces) with zero schema to design. The three compose: a file lives in one workspace, can carry several tags, and can be classified with facets.

| | Workspaces | Tags | Facets |
|---|---|---|---|
| What it is | A container; every file lives in exactly one | Flat, reusable labels | Typed, hierarchical metadata with a schema |
| A file belongs to | exactly one workspace | many tags, across workspaces | many content types, across workspaces |
| Best for | Isolating teams, customers, tenants | Cross-cutting collections (project, topic) | Precise structured queries |
| Setup cost | Create a workspace | Create a tag | Design a content-type tree |
| Scope a query with | workspace_id | tag_id | content_type / attribute |
| Access control | Yes: API keys can be scoped to a workspace with a per-key role | No: not a permission boundary | No: not a permission boundary |
How it works
Facets are built on three layers:

- Build your classification trees: create content types and their attributes (POST /api/v3/content-types), or adopt a ready-made starter kit (legal, finance, healthcare, tech, manufacturing) via GET /api/v3/content-types/templates
- Classify files & set values: assign a content type to each file and fill in its attribute values (POST /api/v3/files/{id}/facets)
- Query by metadata: filter files by content type and attribute values (GET /api/v3/files)
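The three layers can be sketched as three HTTP requests. This is a minimal sketch that only builds the requests and does not send them. The host, the bearer-token header, the file id and the JSON body shapes (`content_type`, `values`) are assumptions made for illustration; only the endpoint paths and the content-type field names come from this page.

```python
import json
from urllib.request import Request

BASE_URL = "https://api.example.com"  # hypothetical host, replace with your own


def build_request(method, path, body=None, api_key="YOUR_API_KEY"):
    """Build (but do not send) a JSON request against the API."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    # Assumption: the API accepts a bearer token; check your authentication docs.
    headers = {"Authorization": f"Bearer {api_key}"}
    if data is not None:
        headers["Content-Type"] = "application/json"
    return Request(BASE_URL + path, data=data, headers=headers, method=method)


# Layer 1: create a content type (field names taken from the glossary).
create_type = build_request("POST", "/api/v3/content-types", {
    "code": "nda",
    "label": "Non-Disclosure Agreement",
    "description": "A contract restricting disclosure of confidential information",
    "inherit_attributes": True,
})

# Layer 2: classify file 1234 and set a value (body shape is an assumption).
classify = build_request("POST", "/api/v3/files/1234/facets", {
    "content_type": "contract:nda",
    "values": {"counterparty": "Acme Corp"},
})

# Layer 3: query files by content type.
query = build_request("GET", "/api/v3/files?content_type=contract:nda")
```

Passing any of these objects to `urllib.request.urlopen` would send it; keeping construction separate makes the payloads easy to inspect first.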
Glossary
Content Type
A Content Type is a node in your company’s classification tree: a named, hierarchical label that describes what a document is. Content types live in a tree up to 4 levels deep. Levels in a path are separated by a colon (:), for example contract:nda.
Each Content Type has:
- code: a short kebab-case identifier, unique among siblings (nda, service-agreement)
- label: the human-readable name ("Non-Disclosure Agreement")
- description: explains what this type means, used by AI to classify documents automatically
- inherit_attributes: whether documents classified here also get the parent node’s attributes
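A client-side check of a content type path could look like the sketch below. It applies the constraints stated on this page (colon separator, at most 4 levels, kebab-case codes, 768-character paths). The exact rules the server enforces are not confirmed here: the regular expression, which rejects leading, trailing and doubled hyphens, is an assumption, and `validate_path` is a hypothetical helper name.

```python
import re

MAX_DEPTH = 4            # "up to 4 levels deep"
MAX_PATH_LENGTH = 768    # from the limits table on this page
# Assumption: a code is lowercase alphanumeric groups joined by single hyphens.
CODE_PATTERN = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def validate_path(path: str) -> list:
    """Split a path such as 'contract:nda' into codes, or raise ValueError."""
    if len(path) > MAX_PATH_LENGTH:
        raise ValueError(f"path longer than {MAX_PATH_LENGTH} characters")
    codes = path.split(":")
    if len(codes) > MAX_DEPTH:
        raise ValueError(f"path deeper than {MAX_DEPTH} levels")
    for code in codes:
        if not CODE_PATTERN.fullmatch(code):
            raise ValueError(f"invalid code: {code!r}")
    return codes
```

For example, `validate_path("contract:nda")` returns the two codes, while `validate_path("Contract:NDA")` raises because the codes are not lowercase.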
Attribute
An Attribute is a custom field attached to a Content Type. It describes what data a document of that type should carry.

| Field | Meaning |
|---|---|
| name | machine name, e.g. jurisdiction |
| label | display name, e.g. "Jurisdiction" |
| attribute_type | the data type (see table below) |
| required | whether this field must have a value |
| choices | for select/multi-select: the allowed values |
| description | explains what this field means, used by AI to extract values |

| Type | attribute_type | What you store |
|---|---|---|
| Short text | "text" | Any string |
| Rich text | "rich-text" | Markdown string |
| Number | "number" | Stored as a float (e.g. 50000, "1.5") |
| Date | "date" | ISO 8601 string: "2025-01-31" |
| Boolean | "boolean" | true or false |
| Single option | "select" | One string from a fixed list (choices) |
| Multiple options | "multi-select" | Array of strings from a fixed list (choices) |
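
The type table above can be read as a set of validation rules. The sketch below is one possible client-side check before sending values; `validate_value` is a hypothetical helper, not part of the API, and the server's own validation and coercion rules may differ. Accepting numeric strings for number follows the "1.5" example in the table and is an assumption.

```python
from datetime import date


def validate_value(attribute_type, value, choices=None):
    """Return True if value fits attribute_type as described in the table."""
    choices = choices or []
    if attribute_type in ("text", "rich-text"):
        return isinstance(value, str)
    if attribute_type == "number":
        if isinstance(value, bool):  # bool is an int subclass in Python
            return False
        if isinstance(value, (int, float)):
            return True
        if isinstance(value, str):   # assumption: numeric strings are accepted
            try:
                float(value)
            except ValueError:
                return False
            return True
        return False
    if attribute_type == "date":
        if not isinstance(value, str):
            return False
        try:
            date.fromisoformat(value)  # e.g. "2025-01-31"
        except ValueError:
            return False
        return True
    if attribute_type == "boolean":
        return isinstance(value, bool)
    if attribute_type == "select":
        return isinstance(value, str) and value in choices
    if attribute_type == "multi-select":
        return isinstance(value, list) and all(
            isinstance(item, str) and item in choices for item in value
        )
    raise ValueError(f"unknown attribute_type: {attribute_type!r}")
```

A select value is checked against `choices`, so `validate_value("select", "FR", ["FR", "DE"])` passes while a value outside the list does not.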
Classification
A Classification is the link between a specific file and a Content Type path. It answers the question: “What type is this document?” One file can have multiple classifications. A single document might be both a contract:nda and a contract:data-processing-agreement at the same time.
Attribute Value
An Attribute Value is the actual data stored for a specific attribute on a specific classification of a specific file. When a file is classified as contract:nda and you set counterparty = "Acme Corp", you’ve created an Attribute Value: {file: #1234, path: "contract:nda", name: "counterparty", value: "Acme Corp"}.
Values are scoped by content type path. If a file has two classifications, each classification has its own set of values. They don’t mix.
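One way to picture this scoping is a nested mapping keyed by file, then content type path, then attribute name. The sketch below is an in-memory illustration only, not how the service stores data; the helper names and the second counterparty value are made up for the example.

```python
from collections import defaultdict

# file id -> content type path -> attribute name -> value
values = defaultdict(lambda: defaultdict(dict))


def set_value(file_id, path, name, value):
    """Store a value under one specific classification of one file."""
    values[file_id][path][name] = value


def get_values(file_id, path):
    """Return a copy of the values held by one classification."""
    return dict(values[file_id].get(path, {}))


# The same attribute name on two classifications of file 1234.
set_value(1234, "contract:nda", "counterparty", "Acme Corp")
set_value(1234, "contract:data-processing-agreement", "counterparty", "Acme Ireland Ltd")

nda_values = get_values(1234, "contract:nda")
dpa_values = get_values(1234, "contract:data-processing-agreement")
```

Each classification keeps its own `counterparty`, so updating one path leaves the other untouched.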
Inherited Attributes
When a Content Type node has inherit_attributes: true, files classified at any descendant node automatically have access to the attributes defined on that ancestor.
For example, if contract defines counterparty and jurisdiction, and contract:nda also defines its own attribute is_mutual (boolean), then a file classified as contract:nda sees all three fields: counterparty and jurisdiction (inherited from contract) plus is_mutual (defined directly on contract:nda).
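The resolution of inherited attributes can be sketched as a walk from the root of the path down to the classified node. This is an illustrative model, not the service's implementation: it assumes the flag is read on each ancestor, as the paragraph above describes, and that an ancestor without the flag is simply skipped. `ContentType` and `effective_attributes` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class ContentType:
    code: str
    attributes: list = field(default_factory=list)
    inherit_attributes: bool = False


def effective_attributes(tree, path):
    """List attribute names visible to a file classified at path."""
    codes = path.split(":")
    result = []
    # Visit ancestors root-first: "contract", then "contract:x", and so on.
    for depth in range(1, len(codes)):
        ancestor = tree[":".join(codes[:depth])]
        if ancestor.inherit_attributes:
            result.extend(ancestor.attributes)
    # Attributes defined directly on the node always apply.
    result.extend(tree[path].attributes)
    return result


tree = {
    "contract": ContentType(
        "contract", ["counterparty", "jurisdiction"], inherit_attributes=True
    ),
    "contract:nda": ContentType("nda", ["is_mutual"]),
}

nda_fields = effective_attributes(tree, "contract:nda")
```

Here `nda_fields` holds the two attributes inherited from contract followed by is_mutual.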
Why it matters
| Without Facets | With Facets |
|---|---|
| Find NDAs → search “NDA” in file names | Filter ?content_type=contract:nda: precise, instant |
| Find French contracts → full-text search for “France” | Filter ?attribute=jurisdiction:FR: structured, no false positives |
| Find unsigned contracts → manual review | Filter ?attribute=signed:false |
| Build a dashboard of contracts by type | Group by content_type: structured counts, no parsing |
| AI answers “what NDAs do we have with Acme?” | BM25 index enriched with metadata: finds the right documents |
Limits at a glance
| Area | Constraint |
|---|---|
| Tree depth | 4 levels max |
| Code format | Lowercase alphanumeric + hyphens (kebab-case) |
| Path length | 768 characters max |
| Attribute types | 7: text, number, date, boolean, select, multi-select, rich-text |
| Reserved names | Some attribute names are reserved by the system and rejected on creation |
| Classifications per tree | 1 per file per tree (multiple trees allowed) |
| Type immutability | attribute_type cannot be changed after creation |
| Filterable types | text, number, date, boolean, select. multi-select and rich-text are not filterable |
| Scope | Content types are company-scoped, not workspace-scoped |
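
The opening query (signed NDAs valid in France with Acme as counterparty) could be turned into a GET /api/v3/files URL as sketched below, refusing attribute types the table above lists as not filterable. The `content_type` and `attribute=name:value` parameters follow the examples on this page; repeating `attribute` once per filter and the `build_files_query` helper itself are assumptions.

```python
from urllib.parse import urlencode

# Filterable types, from the "Filterable types" row above.
FILTERABLE_TYPES = {"text", "number", "date", "boolean", "select"}


def build_files_query(content_type, filters, schema):
    """Build a query path from filters (name -> value).

    schema maps each attribute name to its attribute_type.
    """
    params = [("content_type", content_type)]
    for name, value in filters.items():
        attribute_type = schema[name]
        if attribute_type not in FILTERABLE_TYPES:
            raise ValueError(f"{name} ({attribute_type}) is not filterable")
        if isinstance(value, bool):
            value = "true" if value else "false"  # matches ?attribute=signed:false
        # Assumption: several filters are sent as repeated attribute parameters.
        params.append(("attribute", f"{name}:{value}"))
    # Keep ':' unescaped so paths read like the examples on this page.
    return "/api/v3/files?" + urlencode(params, safe=":")


schema = {"signed": "boolean", "jurisdiction": "select", "counterparty": "text"}
url = build_files_query(
    "contract:nda",
    {"signed": True, "jurisdiction": "FR", "counterparty": "Acme"},
    schema,
)
```

Checking the type locally avoids sending a filter on a multi-select or rich-text attribute that the API would not support.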
Next steps
- Building a classification tree: create content types and custom attributes
- Classifying files & setting metadata: classify files and set attribute values
- Filtering documents by metadata: query by content type and attribute value