Embedded documentSchema

DocumentInput accepts either a schemaRef (pointer to a built-in schema) or an embedded JSON Schema body via documentSchema. This page documents how documentSchema is handled server-side, what JSON Schema features are supported, and the composition patterns that work.

input DocumentInput {
    name: String!
    schemaRef: String            # option A — one of the built-ins
    documentSchema: JSON         # option B — inline JSON Schema (Draft 2020-12)
    description: String
    uniquePerVault: Boolean
    data: JSON!
}

schemaRef vs documentSchema — which do I use?

Situation                                               | Use
Data matches a single built-in exactly                  | schemaRef
Need to combine several built-ins (e.g. name + address) | documentSchema (with $ref)
Need extra required fields on top of a built-in         | documentSchema (allOf + required)
Need fields that no built-in covers                     | documentSchema
One-off document tied to a specific flow or app         | documentSchema

Warning

Mutual exclusion is enforced: exactly one of schemaRef and documentSchema must be set. Providing neither returns a “no schema definition” error; providing both returns a “both schema definitions” error. The check lives in DocumentMetadata.ValidateSchemaDefinition() in the domain layer (pkg/domain/document.go).
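
A client can mirror this rule before sending a request. The following is a minimal sketch, not part of any SDK: check_schema_definition is a hypothetical helper, the schemaRef value is a placeholder, and the messages only echo the wording quoted above rather than the server's exact error text.

```python
def check_schema_definition(doc_input: dict) -> None:
    """Raise ValueError unless exactly one of schemaRef / documentSchema is set.

    Hypothetical client-side mirror of the server rule; the messages are
    illustrative, not the server's exact wording.
    """
    has_ref = doc_input.get("schemaRef") is not None
    has_embedded = doc_input.get("documentSchema") is not None
    if has_ref and has_embedded:
        raise ValueError("both schema definitions")
    if not has_ref and not has_embedded:
        raise ValueError("no schema definition")


# Passes silently: exactly one schema definition is present.
check_schema_definition(
    {"name": "Nickname", "documentSchema": {"type": "object"}, "data": {}}
)
```

The server check remains authoritative; this only saves a round trip for an obviously malformed input.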

Lifecycle contract

  1. At create time — the backend parses documentSchema as a JSON Schema, registers it at URI embedded://document-schema.json, compiles it, and validates data against it. Failure → GraphQL error, nothing is persisted.
  2. At write time — the compiled schema is stored as bytea alongside the encrypted data. The schema itself is not encrypted (it’s metadata, not data).
  3. On update — the schema is preserved verbatim from the existing version. The backend ignores documentSchema / schemaRef on update inputs. If you need a different schema, create a new document.
  4. On validate — same pipeline as create, but the document is never persisted. Useful for client-side pre-flight checks.
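
Step 3 of the contract can be pictured with a toy in-memory model. This is only a sketch under assumptions: StoredDocument and apply_update are hypothetical names, the real storage model is not shown on this page, and incrementing the version by one is an illustrative guess.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class StoredDocument:
    """Toy stand-in for a persisted document version (not the real model)."""
    version: int
    schema: dict  # fixed when the document is created
    data: dict


def apply_update(existing: StoredDocument, update_input: dict) -> StoredDocument:
    """Return the next version; schema fields on the update input are ignored."""
    # documentSchema / schemaRef in update_input are deliberately never read.
    return replace(existing, version=existing.version + 1, data=update_input["data"])


v1 = StoredDocument(version=1, schema={"type": "object"}, data={"nickname": "alice"})
v2 = apply_update(v1, {"data": {"nickname": "bob"}, "documentSchema": {"type": "string"}})
# v2.schema is still {"type": "object"}: the schema sent on update had no effect.
```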

Supported JSON Schema

The compiler is santhosh-tekuri/jsonschema v6 running in strict mode (AssertContent, AssertFormat, AssertVocabs).

Supported: JSON Schema Draft 2020-12, including:

  • Standard keywords: type, properties, required, additionalProperties, patternProperties, allOf, anyOf, oneOf, not, if/then/else, enum, const, pattern, minLength/maxLength, minimum/maximum/exclusiveMinimum/exclusiveMaximum, multipleOf, minItems/maxItems, uniqueItems, minProperties/maxProperties
  • $ref, $defs, $anchor, $dynamicRef
  • Nested arrays (items, prefixItems, contains)
  • Discrimination via oneOf + $defs

Custom string format values

The compiler registers two non-standard formats on top of the library’s built-in ones:

format    | Validation
country   | ISO 3166-1 alpha-2 or alpha-3 (via the countries Go package)
currency  | Exactly 3 uppercase ASCII letters (ISO 4217)
email     | Library default (RFC 5322)
date      | Library default (ISO 8601 date)
date-time | Library default
uri       | Library default

Formats are asserted — a value that violates the format fails validation with a GraphQL error. They are not just annotations.
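
The currency rule is simple enough to pre-check on the client. The sketch below assumes only the shape described in the table (exactly three uppercase ASCII letters); looks_like_currency is a hypothetical helper name. The country format is not covered because it depends on the lookup performed by the countries Go package.

```python
import re

# Exactly three uppercase ASCII letters, as described for format "currency".
_CURRENCY_SHAPE = re.compile(r"[A-Z]{3}")


def looks_like_currency(value: str) -> bool:
    """Shape-only check; it does not confirm the code is an assigned ISO 4217 code."""
    return _CURRENCY_SHAPE.fullmatch(value) is not None


candidates = ["EUR", "eur", "EURO", "EU1"]
accepted = [c for c in candidates if looks_like_currency(c)]
# accepted == ["EUR"]
```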

$ref to built-in schemas

This is the reason you’d usually reach for documentSchema. The compiler has the schema registry wired in as a URL loader, so any $ref to a canonical built-in URL is resolved from the embedded filesystem:

https://schema.identa.io/core/{SchemaName}.json

No network is used. External URLs (e.g. https://example.com/foo.json) are rejected at compile time with “schema not found”.
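
Because unresolvable refs only surface at compile time, it can help to lint a schema locally first. The sketch below is a heuristic under stated assumptions: collect_refs and unresolvable_refs are hypothetical helpers, every "$ref" key is treated as a reference (so a data property literally named $ref would be misread), $dynamicRef is ignored, and the schema name is not checked against the list of built-ins.

```python
BUILTIN_PREFIX = "https://schema.identa.io/core/"


def collect_refs(node, found=None):
    """Walk a schema (nested dicts/lists) and return every string $ref value."""
    if found is None:
        found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "$ref" and isinstance(value, str):
                found.append(value)
            else:
                collect_refs(value, found)
    elif isinstance(node, list):
        for item in node:
            collect_refs(item, found)
    return found


def unresolvable_refs(schema: dict) -> list:
    """Return refs that are neither local fragments nor canonical built-in URLs."""
    bad = []
    for ref in collect_refs(schema):
        if ref.startswith("#"):
            continue  # local reference, e.g. "#/$defs/Contact"
        if ref.startswith(BUILTIN_PREFIX) and ref.endswith(".json"):
            continue  # canonical built-in URL shape
        bad.append(ref)
    return bad
```

For a schema that references PersonAddress.json, a local #/$defs entry, and https://example.com/foo.json, only the last one is reported.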

See Built-in schemas for the full list of 24 resolvable refs.

Examples

1. Minimal inline schema — no refs

A free-form profile with no built-in fields:

{
  "title": "Nickname",
  "type": "object",
  "properties": {
    "nickname": { "type": "string", "minLength": 1, "maxLength": 32 },
    "tagline":  { "type": "string" }
  },
  "required": ["nickname"]
}

With data:

{ "nickname": "alice", "tagline": "Curiouser and curiouser" }

2. Extend a built-in with required fields

The built-in PersonAddress marks every field as optional. If your app wants streetAddress, city, postalCode, and addressCountry to be mandatory, wrap it in allOf + required:

{
  "title": "RequiredAddressForm",
  "type": "object",
  "allOf": [
    { "$ref": "https://schema.identa.io/core/PersonAddress.json" }
  ],
  "required": ["streetAddress", "city", "postalCode", "addressCountry"]
}

The built-in’s format: "country" on addressCountry still applies, so "addressCountry": "INVALID" fails.

3. Compose multiple built-ins

Combine PersonFullName and PersonAddress into a single document with both nested under named properties:

{
  "title": "PersonWithAddress",
  "type": "object",
  "properties": {
    "person":  { "$ref": "https://schema.identa.io/core/PersonFullName.json" },
    "address": { "$ref": "https://schema.identa.io/core/PersonAddress.json" }
  },
  "required": ["person", "address"]
}

With data:

{
  "person":  { "firstName": "Alice", "lastName": "Liddell" },
  "address": { "streetAddress": "123 Main St", "city": "London",
               "postalCode": "SW1A 1AA", "addressCountry": "GB" }
}
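
When several documents follow this composition pattern, the schema can be generated rather than hand-written. A minimal sketch, assuming a hypothetical helper named compose_builtins; it only reproduces the shape shown above and does not verify that the named built-ins exist.

```python
BUILTIN_BASE = "https://schema.identa.io/core/"


def compose_builtins(title: str, parts: dict) -> dict:
    """Build a schema that nests built-ins under named, required properties.

    parts maps a property name to a built-in schema name,
    e.g. {"person": "PersonFullName"}.
    """
    return {
        "title": title,
        "type": "object",
        "properties": {
            prop: {"$ref": f"{BUILTIN_BASE}{name}.json"}
            for prop, name in parts.items()
        },
        "required": list(parts),
    }


schema = compose_builtins(
    "PersonWithAddress",
    {"person": "PersonFullName", "address": "PersonAddress"},
)
```

The resulting dict matches the PersonWithAddress schema above and can be passed as documentSchema unchanged.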

4. Extend with your own properties

Add app-specific fields alongside a built-in using the “second allOf branch” pattern:

{
  "title": "VerifiedAddress",
  "type": "object",
  "allOf": [
    { "$ref": "https://schema.identa.io/core/PersonAddress.json" },
    {
      "properties": {
        "verified":         { "type": "boolean" },
        "verificationDate": { "type": "string", "format": "date" }
      },
      "required": ["streetAddress", "city", "addressCountry", "verified"]
    }
  ]
}

5. Discrimination with $defs + oneOf

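When a document can be one of several shapes, model it the way the built-in UBOStructure does: define the variants in $defs and discriminate on a property whose enum is fixed to a single value per variant. UBOStructure uses ownerType for this; the example below uses kind: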

{
  "title": "Contact",
  "type": "object",
  "properties": {
    "contact": { "$ref": "#/$defs/Contact" }
  },
  "required": ["contact"],
  "$defs": {
    "Contact": {
      "oneOf": [
        { "$ref": "#/$defs/PersonContact" },
        { "$ref": "#/$defs/CompanyContact" }
      ]
    },
    "PersonContact": {
      "type": "object",
      "properties": {
        "kind":      { "type": "string", "enum": ["person"] },
        "fullName":  { "$ref": "https://schema.identa.io/core/PersonFullName.json" },
        "email":     { "type": "string", "format": "email" }
      },
      "required": ["kind", "fullName"]
    },
    "CompanyContact": {
      "type": "object",
      "properties": {
        "kind":         { "type": "string", "enum": ["company"] },
        "legalEntity":  { "$ref": "https://schema.identa.io/core/LegalEntity.json" }
      },
      "required": ["kind", "legalEntity"]
    }
  }
}
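
Because each variant pins kind to a different single value, at most one oneOf branch can match a given object. The sketch below mirrors that selection on the client; missing_contact_fields and REQUIRED_BY_KIND are hypothetical names, and the check covers only the required lists from the schema above, not the nested built-in schemas or formats.

```python
# Required keys per variant, copied from the "required" lists in the schema above.
REQUIRED_BY_KIND = {
    "person": ("kind", "fullName"),
    "company": ("kind", "legalEntity"),
}


def missing_contact_fields(contact: dict) -> list:
    """Pick the variant by its discriminator and list its missing required keys."""
    kind = contact.get("kind")
    if kind not in REQUIRED_BY_KIND:
        raise ValueError(f"unknown contact kind: {kind!r}")
    return [field for field in REQUIRED_BY_KIND[kind] if field not in contact]


person = {"kind": "person", "fullName": {"firstName": "Alice", "lastName": "Liddell"}}
company = {"kind": "company"}
# missing_contact_fields(person)  -> []
# missing_contact_fields(company) -> ["legalEntity"]
```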

6. Enum-constrained field

For fixed vocabularies, pair type: "string" with enum:

{
  "title": "IncidentReport",
  "type": "object",
  "properties": {
    "severity":    { "type": "string", "enum": ["low", "medium", "high", "critical"] },
    "description": { "type": "string", "minLength": 1 },
    "occurredAt":  { "type": "string", "format": "date-time" }
  },
  "required": ["severity", "description", "occurredAt"]
}

Passing documentSchema over GraphQL

documentSchema is typed as the JSON scalar, which the backend receives as a json.RawMessage. You can send it as a JSON object literal in variables:

curl -X POST https://api.test.geena.eu/graphql \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "mutation($i: DocumentInput!) { user { document { create(input: $i) { id version } } } }",
    "variables": {
      "i": {
        "name": "Verified address",
        "documentSchema": {
          "title": "VerifiedAddress",
          "type": "object",
          "allOf": [
            { "$ref": "https://schema.identa.io/core/PersonAddress.json" },
            { "properties": { "verified": { "type": "boolean" } },
              "required": ["streetAddress", "city", "addressCountry", "verified"] }
          ]
        },
        "data": {
          "streetAddress": "123 Main St", "city": "London",
          "addressCountry": "GB", "verified": true
        }
      }
    }
  }'

Use user.document.validate with the same input shape to dry-run the schema + data before creating.
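
The curl call above translates directly to other clients. Here is a minimal Python sketch using only the standard library; build_create_request is a hypothetical helper, and the endpoint and mutation text are copied from the curl example. It builds the request without sending it and passes documentSchema as a nested object in variables.

```python
import json
import urllib.request

CREATE_MUTATION = (
    "mutation($i: DocumentInput!) "
    "{ user { document { create(input: $i) { id version } } } }"
)


def build_create_request(endpoint: str, token: str, doc_input: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request for the create mutation."""
    body = json.dumps({"query": CREATE_MUTATION, "variables": {"i": doc_input}})
    return urllib.request.Request(
        endpoint,
        data=body.encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_create_request(
    "https://api.test.geena.eu/graphql",
    "TOKEN",  # placeholder; substitute a real bearer token
    {
        "name": "Nickname",
        "documentSchema": {"type": "object", "required": ["nickname"]},
        "data": {"nickname": "alice"},
    },
)
```

Passing request to urllib.request.urlopen would send it; error handling and response parsing are left out of the sketch.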

Gotchas

  • Schema is locked at create. update keeps the original schema. There is no “migrate schema” API; create a new document instead.
  • uniquePerVault is locked too and enforced by a PostgreSQL unique constraint. If you set uniquePerVault: true and a second create would collide, the write fails with “document schema already exists in vault”. Uniqueness is scoped per (vault, schemaRef) or per (vault, documentSchema).
  • Only registered URLs are resolvable. $ref to a built-in URL works. $ref to any other absolute URL errors with “schema not found”. Relative $ref into #/$defs within the same schema works.
  • Strict format assertion. format: "country" on a value that isn’t a valid ISO 3166 code fails. Library defaults (email, uri, date) are similarly strict.
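  • Extra properties are allowed by default. JSON Schema accepts any property the schema does not mention. If you want strict rejection, set "additionalProperties": false explicitly on the outermost object. Note that additionalProperties only sees properties declared in the same schema object, so combined with the allOf + $ref patterns above it also rejects the fields defined by the referenced built-in; Draft 2020-12 defines unevaluatedProperties: false for that case.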
  • Schema is not encrypted. It is metadata — only the data payload is encrypted with the vault’s DEK. Don’t put secrets in description or in the schema itself.
  • Size. Schemas are stored as bytea in vault_documents.document_schema. There is no hard upper bound, but validation time scales with schema complexity. Keep schemas focused.