API Documentation

Overview

Skely converts PDFs with an embedded text layer into deterministic, byte-stable JSON, Markdown, or ASCII by reading the document's own text and vector data, with no probabilistic models. The same input and the same engine version always produce the same bytes. The core endpoint is POST /v1/convert; companion routes handle large uploads, status polling, and engine-version discovery.

1 credit = 1 page. Failed conversions are never charged. Credits drain from your monthly subscription bucket first, then from purchased top-up credits (which never expire).
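The drain order can be sketched as plain arithmetic. This is a minimal client-side illustration for forecasting balances, not Skely's billing code; the function name drain_credits and its argument order are assumptions made for the example.

```shell
# Sketch: split a charge across the two buckets, subscription first.
# Usage: drain_credits COST SUBSCRIPTION PURCHASED
# Prints "SUBSCRIPTION_LEFT PURCHASED_LEFT"; returns 1 if the combined balance is short.
drain_credits() (
  cost=$1 sub=$2 purchased=$3
  [ $((sub + purchased)) -ge "$cost" ] || return 1
  if [ "$cost" -le "$sub" ]; then
    # The subscription bucket covers the whole charge.
    sub=$((sub - cost))
  else
    # Subscription is emptied; the remainder comes from purchased credits.
    purchased=$((purchased - (cost - sub)))
    sub=0
  fi
  echo "$sub $purchased"
)
```

For example, a 12-page conversion against 10 subscription and 5 purchased credits leaves 0 and 3.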

Base URL

All endpoints are served under /v1 at:

https://api.skely.io

Your first call

Send the raw PDF bytes as the request body and get structured JSON back. Every parameter rides in the query string; the body is the PDF.

curl -X POST "https://api.skely.io/v1/convert?format=json" \
  -H "Authorization: Bearer sk_live_YOUR_API_KEY" \
  -H "Content-Type: application/pdf" \
  --data-binary @invoice.pdf

Replace sk_live_YOUR_API_KEY with a real key; create one in the dashboard.

Authentication

Every /v1 route requires a bearer token. Send your API key in the Authorization header:

HTTP header

Authorization: Bearer sk_live_YOUR_API_KEY

The token is an API key (sk_live_…) tied to your account. Create and manage keys in the dashboard; the secret is shown only once, so store it securely.

Keep keys server-side. A missing key returns ERR::AUTH::MISSING_TOKEN and an invalid key returns ERR::AUTH::INVALID_KEY (see Errors).

POST /v1/convert

POST https://api.skely.io/v1/convert

The main reference endpoint. It is always a POST. Provide the PDF in one of three input modes, optionally scope the work to specific pages, and optionally run a no-charge cost estimate. The API key goes in the Authorization: Bearer header only, never in a query or body param.

Input modes

Provide exactly one PDF source — raw bytes in the body, a url, or a gcs_ref. Supplying more than one returns ERR::INPUT::AMBIGUOUS_INPUT; supplying none returns ERR::INPUT::NO_INPUT.

Mode	Transport	Source	Notes
Raw bytes	`application/pdf` body	`--data-binary`	The PDF bytes in the request body, up to 20 MiB inline (a larger inline body is rejected with `413 ERR::INPUT::TOO_LARGE`). `Content-Type` is optional and not used for format detection (the format is read from the bytes); the examples send `application/pdf` by convention. For a larger file (up to the 50 MiB hard ceiling), upload it via /v1/uploads and pass the returned `gcs_ref`.
Public URL	query `url=` (no body)	`url`	A public `https` URL to a PDF. Private/internal/metadata addresses are rejected (SSRF guard).
Upload reference	query `gcs_ref=` (no body)	`gcs_ref`	A reference returned by /v1/uploads — your own upload only. The way to convert a file larger than the 20 MiB inline cap.
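The exactly-one-source rule can be checked on the client before a request is sent. This is a minimal sketch of such a pre-check, not the server's validation; the function name check_input_sources and its three-argument convention (pass an empty string for an unused source) are assumptions made for illustration.

```shell
# Sketch: mirror the "exactly one PDF source" rule client-side.
# Usage: check_input_sources BODY_FILE URL GCS_REF   (use "" for sources you are not sending)
# Prints "ok", or the error code the API documents for that situation, and returns 1.
check_input_sources() (
  count=0
  for src in "${1:-}" "${2:-}" "${3:-}"; do
    if [ -n "$src" ]; then count=$((count + 1)); fi
  done
  case "$count" in
    0) echo "ERR::INPUT::NO_INPUT"; return 1 ;;
    1) echo "ok" ;;
    *) echo "ERR::INPUT::AMBIGUOUS_INPUT"; return 1 ;;
  esac
)
```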

Never put the key in the URL. The API key belongs in the Authorization: Bearer header only. A credential-looking query param (key, api_key, token, …) is rejected with ERR::AUTH::KEY_IN_URL rather than silently honoured.

Parameters

All non-secret parameters are query-string parameters. The body is reserved for the raw PDF bytes. Unrecognized query parameters are ignored — except credential-looking names (key, api_key, …), rejected with ERR::AUTH::KEY_IN_URL; an unknown value for a recognized param (e.g. mode, format, engine) is also rejected, not ignored. The output controls bounds / fonts / semantic are documented under Output options.

Name	Type	Required	Description
mode	"convert" \| "info"	optional	Default `convert` (full extraction); `info` is a lite metadata probe (see Document info). An unknown value returns `ERR::INPUT::BAD_MODE`.
format	"json" \| "md" \| "ascii"	optional	Output representation — see Output formats. Defaults to `json`.
pages	string	optional	Scope to a 1-based page selection — a range string like `"1-11,18,20-22"` (ranges + singletons, comma-separated, whitespace tolerated). You are billed only for the selected pages. Omit to convert the whole document.
url	https URL	conditional	A public `https` URL to a PDF, fetched server-side. Use when there is no body. Mutually exclusive with a body and with `gcs_ref`.
gcs_ref	string	conditional	A ref from POST /v1/uploads (your own upload only) — the way to convert a PDF larger than the 20 MiB inline cap (up to the 50 MiB hard ceiling). Use when there is no body. Mutually exclusive with a body and with `url`.
cost	boolean	optional	Dry run. Returns the page count and estimated credits without converting or charging anything. Defaults to `false`.
callback_url	https URL	optional	Webhook target — Skely POSTs a signed completion notification when the conversion settles. See Webhooks. Must be `https`; otherwise `ERR::INPUT::BAD_CALLBACK_URL`.
async	boolean	optional	Opt into the async (queued) path — the way to convert more than 500 pages, and to run any conversion fire-and-forget. Returns `202` with a `Location` to poll instead of an inline `200`. Equivalent to the `Prefer: respond-async` header. See Large files & async. Defaults to `false`.
engine	string	optional	Pin the engine — `<format>@<version>` (e.g. `pdf@2026-06-21`), a bare `pdf` (its latest), or `latest` / `stable`. Pin a concrete version only together with its format — a bare version like `2026-06-21` is rejected with `ERR::INPUT::BAD_ENGINE`. Discover valid versions with GET /v1/engine-versions. If the pinned format isn't the format detected from the bytes → `ERR::INPUT::FORMAT_MISMATCH`; an unparseable pin → `ERR::INPUT::BAD_ENGINE`; an unknown version → `ERR::INPUT::BAD_VERSION`. The resolved engine is echoed as the qualified token `<format>@<version>` (e.g. `pdf@2026-06-21`) in the `engine` response field; the `X-Skely-Engine-Version` header carries the bare version (`2026-06-21`), so to re-pin combine it with `X-Skely-Source-Format`.
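Because a full convert bills the selected pages, it can help to count a pages selection locally before sending it. The helper below is a rough sketch: count_selected_pages is a hypothetical name, and counting overlapping entries once is an assumption, since the reference does not say how the server treats overlaps.

```shell
# Sketch: count the distinct 1-based pages named by a selection such as "1-11,18,20-22".
# Whitespace is stripped first, matching the "whitespace tolerated" note.
# Assumption: overlapping entries are counted once.
count_selected_pages() {
  printf '%s\n' "$1" | tr -d ' \t' | awk -F, '{
    for (i = 1; i <= NF; i++) {
      if ($i == "") continue
      n = split($i, r, "-")
      lo = r[1] + 0
      hi = (n >= 2 ? r[2] : r[1]) + 0
      for (p = lo; p <= hi; p++) seen[p] = 1
    }
    count = 0
    for (p in seen) count++
    print count
  }'
}
```

With the selection used in the request examples, count_selected_pages "1-11,18" prints 12, which matches the 12 entries in the dry-run's selected_pages.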

Request examples

# 1) raw PDF bytes in the body (<= 20 MiB), scoped to a page selection
curl -X POST "https://api.skely.io/v1/convert?format=md&pages=1-11,18" \
  -H "Authorization: Bearer sk_live_YOUR_API_KEY" \
  -H "Content-Type: application/pdf" \
  --data-binary @report.pdf

# 2) public URL (no body)
curl -X POST "https://api.skely.io/v1/convert?format=json&url=https://example.com/report.pdf" \
  -H "Authorization: Bearer sk_live_YOUR_API_KEY"

# 3) gcs_ref from POST /v1/uploads (for large files, no body)
curl -X POST "https://api.skely.io/v1/convert?format=json&gcs_ref=uploads/<uid>/<id>" \
  -H "Authorization: Bearer sk_live_YOUR_API_KEY"

Success response

pages is what you were charged for; total_pages is the document's full length. A successful conversion returns status: "success". (The example's result.sections is abbreviated to one page; a multi-page convert returns one section per converted page.) Every convert response (this success body, the dry-run, info, and the async 202) also echoes source_format, media_type, determinism, and a billable object ({ unit, count }). source_format and determinism mirror the X-Skely-Source-Format / X-Skely-Determinism headers; media_type and billable have no header counterpart.

200 · application/json

{
  "request_id": "<request_id>",
  "status": "success",
  "engine": "pdf@2026-06-21",
  "source_format": "pdf",
  "media_type": "application/pdf",
  "determinism": "byte",
  "billable": { "unit": "page", "count": 12 },
  "pages": 12,
  "total_pages": 24,
  "credits_charged": 12,
  "format": "json",
  "result": {
    "documentType": {
      "type": "invoice",
      "subtype": "hotel_folio",
      "confidence": 0.91,
      "alternatives": [{ "type": "statement", "score": 6 }],
      "signals": ["anchor:\"folio\"", "room", "guest"]
    },
    "meta": {
      "schemaVersion": 3,
      "sourceFormat": "pdf",
      "mediaType": "application/pdf",
      "engineVersion": "2026-06-21",
      "determinism": "byte",
      "totalSections": 24
    },
    "sections": [
      {
        "kind": "page",
        "index": 0,
        "width": 612,
        "height": 792,
        "blocks": [
          {
            "type": "cluster",
            "semantic": "address",
            "entries": [
              { "kind": "text", "text": "6865 West 103rd Avenue", "semantic": "street" },
              { "kind": "kv", "key": "Tel", "value": "303-464-1997", "semantic": "phone" }
            ]
          },
          {
            "type": "table",
            "columns": [
              { "text": "Date", "alignment": "left" },
              { "text": "Charges", "alignment": "right" }
            ],
            "rows": [
              { "cells": [
                { "column": "Date", "text": "09-03-23", "semantic": "date" },
                { "column": "Charges", "text": "3.69", "semantic": "currency", "normalized": 3.69 }
              ] }
            ],
            "footers": [
              { "label": "Total", "cells": [
                { "column": "Charges", "text": "12.00", "semantic": "currency", "normalized": 12 }
              ] }
            ]
          }
        ]
      }
    ]
  }
}

The result is a structured document object. It leads with documentType (type, subtype, confidence, plus alternatives and signals) and a meta block, then a sections[] array — for a PDF each section has kind: "page" and a 0-based index, holding ordered blocks. Each block is one of two kinds: a table block (with columns[], rows[].cells[], and footers[]) or a cluster block — a group of free text and key/value (kv) entries that aren't part of a table — each carrying semantic / normalized tags (a cluster may be flagged residual: true — a catch-all of leftover text no other block claimed). The exact shape is tied to the engine version you pin (see engine). For md and ascii, result is a string instead — see Output formats.

Dry-run (cost estimate)

Pass cost=true to get a price up front. Nothing is converted and nothing is charged. The dry-run envelope carries dry_run: true and estimated_credits, with no status field. A dry run cannot use the async path, so it is still subject to the 500-billable-unit synchronous cap: cost=true on a document over 500 units returns 413 ERR::CONVERT::MAX_PAGES_EXCEEDED unless you scope it with pages. selected_pages appears when you supplied a pages scope.

# cost=true estimates only — nothing is converted, nothing is charged
curl -X POST "https://api.skely.io/v1/convert?cost=true&pages=1-11,18" \
  -H "Authorization: Bearer sk_live_YOUR_API_KEY" \
  -H "Content-Type: application/pdf" \
  --data-binary @report.pdf

200 · dry-run response

{
  "request_id": "<request_id>",
  "dry_run": true,
  "mode": "convert",
  "engine": "pdf@2026-06-21",
  "source_format": "pdf",
  "media_type": "application/pdf",
  "determinism": "byte",
  "billable": { "unit": "page", "count": 12 },
  "pages": 12,
  "total_pages": 24,
  "estimated_credits": 12,
  "format": "json",
  "selected_pages": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 18]
}

Discovering engine versions

GET /v1/engine-versions (authenticated) lists the pinnable <format>@<version> values per detected format — the values you can pass to the engine parameter above. Each version reports its schema_version, status, and whether it is the latest / stable pick for its format. Experimental (unreleased) versions are not listed and cannot be pinned.

cURL

# List the pinnable engine versions per detected format
curl "https://api.skely.io/v1/engine-versions" \
  -H "Authorization: Bearer sk_live_YOUR_API_KEY"

# => {
#   "formats": [
#     { "format": "pdf", "versions": [
#       { "version": "2026-06-21", "schema_version": 3, "status": "current",
#         "latest": true, "stable": true }
#     ] }
#   ]
# }

Output options

Three boolean query params control which fields appear in the output. They apply to every format — the same projection drives json, md, and ascii. A section's width / height (present for fixed-layout pages such as PDF) are always kept.

Name	Type	Default	Description
bounds	boolean	false	Include per-element positional geometry: block `bounds` plus entry / cell `bounds` (and a `kv`'s `keyBounds` / `valueBounds`) as `[x, y, w, h]` tuples in points (top-left origin). Off by default, so the response carries no per-element geometry.
fonts	boolean	false	Include typography: a document-level `styles` table plus a per-element `style` key (into that table) and `size` (point size of the run). The query flag is `fonts`; the output keys are `styles` (table), `style` (per-element), and `size` (per-element point size). Off by default, so the response omits the `styles` table and every `style` / `size`.
semantic	boolean	true	Include semantic annotation: `semantic` tags plus `unit` / `normalized` typed values. On by default. Set `semantic=false` for a plain structural view — and because `md` / `ascii` render from the projected doc, that also drops Markdown links derived from semantic tags.

The defaults (bounds=false, fonts=false, semantic=true) give a clean output: no geometry, no style / size or styles table, semantics kept. Turn a flag on only when you need that facet.

Example

The sample below shows a source block under the default option set (format=json). It is abbreviated: the always-present meta block and the documentType alternatives / signals are elided to highlight the projection (see the full Success response).

{
  "documentType": { "type": "invoice", "subtype": "hotel_folio", "confidence": 0.91 },
  "sections": [
    {
      "kind": "page",
      "index": 0,
      "width": 612,
      "height": 792,
      "blocks": [
        {
          "type": "cluster",
          "semantic": "address",
          "entries": [
            { "kind": "text", "text": "6865 West 103rd Avenue", "semantic": "street" },
            { "kind": "kv", "key": "Tel", "value": "303-464-1997", "semantic": "phone" }
          ]
        }
      ]
    }
  ]
}

Add the flags to the query string, e.g. ?format=json&bounds=true&fonts=true to include everything, or ?semantic=false for a purely structural result.

Document info

Pass mode=info on /v1/convert for a lite metadata probe: Skely reports the document's high-level facts — detected type, authoritative page count, word count, and (opt-in) fonts and page sizes — without any layout or data extraction. No tables, key/values, or text blocks are returned. The default mode=convert is the full conversion described above; an unknown mode returns ERR::INPUT::BAD_MODE (422).

What it returns

The response is a flat envelope (request_id, mode, status, engine, source_format, media_type, determinism, billable, pages, total_pages, credits_charged) plus these top-level fields:

Field	Type	When	Description
documentType	object	always	`{ type, subtype?, confidence, alternatives: [{ type, score }], signals: […] }` — the detected type with runner-up `alternatives` and the matched `signals`.
wordCount	integer	always	Total words across the inspected pages.
fonts	object	fonts=true	Document-level `fonts` table (`{ name, family, bold, italic }` per font key). Included only when `fonts=true`. Note: info mode names this table `fonts`, whereas a full convert exposes the same typography under `styles` / `style`.
pageSizes	array	bounds=true	`[{ page, width, height }, …]` in points (1/72 inch), top-left origin — the same coordinate system as `bounds`. Included only when `bounds=true`.

Info mode costs 1 credit per 50 pages, rounded up (minimum 1) — a 140-page document costs 3 credits, versus 1 credit/page for a full convert. pages scoping still applies: cost is ceil(inspected / 50) while total_pages remains the document's true total. So you can scope to a few pages (e.g. pages=1) to cheaply read the type and true page count of a huge document. A dry run (cost=true) returns estimated_credits on the same basis.
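The info-mode pricing rule is a ceiling division with a floor of 1. A minimal sketch of that arithmetic, with info_credits as a hypothetical helper name:

```shell
# Sketch: credits for mode=info = ceil(inspected pages / 50), minimum 1.
# Usage: info_credits INSPECTED_PAGES
info_credits() (
  credits=$(( ($1 + 49) / 50 ))          # integer ceiling of pages / 50
  [ "$credits" -ge 1 ] || credits=1      # never less than 1 credit
  echo "$credits"
)
```

info_credits 140 prints 3, the figure quoted above; a full convert of the same 140 pages would cost 140 credits.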

Request example

cURL

# Lite metadata probe — type, page count, fonts (no extraction). 1 credit / 50 pages.
curl -X POST "https://api.skely.io/v1/convert?mode=info&fonts=true" \
  -H "Authorization: Bearer sk_live_YOUR_API_KEY" \
  -H "Content-Type: application/pdf" \
  --data-binary @report.pdf

Response

This example sets fonts=true, so the fonts table is present; pageSizes would appear the same way when you add bounds=true.

200 · mode=info

{
  "request_id": "<request_id>",
  "mode": "info",
  "status": "success",
  "engine": "pdf@2026-06-21",
  "source_format": "pdf",
  "media_type": "application/pdf",
  "determinism": "byte",
  "billable": { "unit": "page", "count": 3 },
  "pages": 3,
  "total_pages": 3,
  "credits_charged": 1,
  "documentType": {
    "type": "invoice",
    "subtype": "receipt",
    "confidence": 0.82,
    "alternatives": [{ "type": "statement", "score": 6 }],
    "signals": ["anchor:\"invoice\"", "anchor:\"receipt\"", "subtotal", "total"]
  },
  "wordCount": 412,
  "fonts": {
    "g_f1": { "name": "Helvetica", "family": "Helvetica", "bold": false, "italic": false },
    "g_f2": { "name": "Helvetica-Bold", "family": "Helvetica", "bold": true, "italic": false }
  }
}

semantic is ignored in info mode (nothing is rendered or extracted), and format does not change the output — though an unrecognized format value is still rejected with ERR::INPUT::BAD_FORMAT. No webhook fires for an info-mode request. The inspected-page cap (max 500) still applies; scope with pages for larger documents. The detected type can occasionally differ from a full convert (it is classified from the inspected pages' text only, so a narrow scope can change it); for the overwhelming majority of documents it matches.

Output formats

The format query param selects one of three representations of the same underlying document model. All three are deterministic: the same input and engine pin always produce the same bytes.

format	result type	What it is
json	object	The structured document object — `documentType`, `meta`, and `sections[]` of ordered `cluster` / `table` blocks (a section's `kind` is `page` for PDFs). The default.
md	string	GitHub-flavored Markdown rendered from the same document model as `json`: pipe tables, `key: value` lines, `##` cluster headers, in reading order.
ascii	string	A fixed-width plaintext rendering (monospace `+--+` tables + plain `key: value` lines). Ideal for terminals, logs, and diffs.

format=json

The structured document object. This is the example shown under Success response above — result is a JSON object.

format=md

GitHub-flavored Markdown rendered from the projected document (it honors the projection flags — e.g. semantic=false drops links). result is a JSON string (newlines are \n-escaped in the JSON envelope). A table footer renders as a normal row keyed by column — the JSON footers[].label (e.g. "Total") is not emitted in md / ascii.

200 · format=md

{
  "request_id": "<request_id>",
  "status": "success",
  "engine": "pdf@2026-06-21",
  "source_format": "pdf",
  "media_type": "application/pdf",
  "determinism": "byte",
  "billable": { "unit": "page", "count": 1 },
  "pages": 1,
  "total_pages": 1,
  "credits_charged": 1,
  "format": "md",
  "result": "6865 West 103rd Avenue\n**Tel:** 303-464-1997\n\n| Date | Charges |\n| --- | --- |\n| 09-03-23 | 3.69 |\n|  | 12.00 |\n"
}

format=ascii

A fixed-width plaintext document. result is a JSON string; download it as .txt. Below is the decoded string you get from result (trailing newline omitted):

format=ascii · decoded result

# invoice/hotel_folio (confidence 0.91)

== Page 1 ==

6865 West 103rd Avenue
Tel: 303-464-1997

+----------+---------+
| Date     | Charges |
+----------+---------+
| 09-03-23 |    3.69 |
+----------+---------+
|          |   12.00 |
+----------+---------+

The ascii renderer is a pure, deterministic function of the JSON result — no timestamps, locale, or randomness. Same document → byte-identical ASCII.
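For md and ascii, result arrives as an escaped JSON string, so it has to be decoded before it is saved. One way to do that, assuming jq is installed (the quickstart in this reference already relies on it) and that the response body was saved to a file; save_result is a hypothetical helper name.

```shell
# Sketch: decode the string in .result and write it to a file.
# Usage: save_result RESPONSE_JSON_FILE OUTPUT_FILE
# jq -j prints the raw string without appending an extra newline,
# so the file holds exactly the decoded bytes of result.
save_result() {
  jq -j '.result' "$1" > "$2"
}

# Example (hypothetical file names):
#   save_result response.json report.md
```

This only makes sense for format=md and format=ascii; for format=json, result is an object rather than a string.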

Response headers

Every successful /v1/convert response (and the async 202) carries the cost of the request and your remaining balances, so you can track usage without a separate call. Error responses always carry X-Skely-Error-Code and X-Skely-Request-Id; a 429 also carries Retry-After, and an error raised after format detection (e.g. bad page range, too-many-pages, insufficient credits) additionally echoes X-Skely-Source-Format / X-Skely-Engine-Version / X-Skely-Determinism.

Header	Meaning
X-Skely-Credits-Cost	Credits quoted for this request. For a full convert it equals the selected page count; for `mode=info` it is `ceil(selected pages / 50)`. On a partial conversion you are charged for the pages that actually converted (the `credits_charged` body field), which may be lower than this header.
X-Skely-Subscription-Credits-Remaining	Credits left in your monthly subscription bucket (drained first; resets each period).
X-Skely-Purchased-Credits-Remaining	Credits left in your purchased top-up bucket (these never expire).
X-Skely-Request-Id	This request's id. It is also the key for GET /v1/convert/{id} and appears in error bodies.
X-Skely-Source-Format	The source format detected from the document bytes (e.g. `pdf`).
X-Skely-Engine-Version	The resolved, concrete engine version that ran (never an alias such as `latest` / `stable`; e.g. `2026-06-21`). To re-pin exactly what produced a result, combine it with `X-Skely-Source-Format` as `<format>@<version>` (e.g. `pdf@2026-06-21`) — the bare version alone is rejected with `ERR::INPUT::BAD_ENGINE`.
X-Skely-Determinism	The resolved engine's reproducibility tier for these bytes — currently `byte` for every supported format (same document → identical bytes). Present on every successful convert / info, the dry-run (`cost=true`) 200, and the 202 enqueue.

On an error response, X-Skely-Error-Code and X-Skely-Request-Id are always returned (the cost/balance headers are dropped, though an error after format detection — e.g. insufficient credits — still echoes X-Skely-Credits-Cost and the engine headers). A Retry-After header (seconds) accompanies both a rate-limited (429) response and an async 202, and the 202 also carries a Location header with the path to poll.
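Reading these headers from a saved header dump (for instance one written with curl -D) takes only a small amount of awk. The sketch below rebuilds the re-pinnable engine token from X-Skely-Source-Format and X-Skely-Engine-Version; header_value and engine_pin are hypothetical helper names, and the parsing assumes single-token header values.

```shell
# Sketch: print the value of one header from a header dump on stdin.
# Matching is case-insensitive; only the first whitespace-free token of the value is returned.
# Usage: header_value NAME < headers.txt
header_value() (
  want=$(printf '%s:' "$1" | tr '[:upper:]' '[:lower:]')
  awk -v want="$want" 'tolower($1) == want { print $2; exit }' | tr -d '\r'
)

# Sketch: combine source format and bare engine version into <format>@<version>.
# Usage: engine_pin HEADERS_FILE
engine_pin() (
  fmt=$(header_value X-Skely-Source-Format < "$1")
  ver=$(header_value X-Skely-Engine-Version < "$1")
  [ -n "$fmt" ] && [ -n "$ver" ] || return 1
  echo "${fmt}@${ver}"
)
```

The printed token can then be passed back as the engine query parameter. The same header_value helper also reads Retry-After or X-Skely-Credits-Cost.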

The API is built for server-side use (keep your key server-side). Cross-origin browser access is restricted to Skely's own web app — only those origins may make credentialed requests and read the X-Skely-* / Retry-After / Location headers (exposed via CORS); call from your server, not the browser.

POST /v1/uploads

POST https://api.skely.io/v1/uploads

For PDFs over the 20 MiB inline body cap (up to the 50 MiB hard ceiling), skip the inline upload. Request a short-lived signed URL, PUT the raw bytes to it (with Content-Type: application/octet-stream and no Skely auth header on the PUT), then pass the returned gcs_ref to /v1/convert. If the document is over 500 pages, also opt into the async path. The response also sets an X-Skely-Request-Id header (the upload id).

Response field	Type	Description
upload_url	uri	Short-lived signed `PUT` URL. PUT the raw bytes here with `Content-Type: application/octet-stream` and no Skely auth header.
gcs_ref	string	An opaque reference to pass back as `gcs_ref` on /v1/convert (e.g. `uploads/<uid>/<id>`) — send it verbatim; don't construct it yourself.
expires_in	integer	Seconds the signed URL stays valid (`900` — 15 minutes).
content_type	string	The exact `Content-Type` the PUT must send (`application/octet-stream`); the signed URL binds it.

Full flow

# 1) request a signed upload URL (no body)
curl -X POST "https://api.skely.io/v1/uploads" \
  -H "Authorization: Bearer sk_live_YOUR_API_KEY"
# => { "upload_url": "https://...", "gcs_ref": "uploads/<uid>/<id>",
#      "expires_in": 900, "content_type": "application/octet-stream" }

# 2) PUT the PDF straight to the signed URL (octet-stream; NO Skely auth header)
curl -X PUT "<upload_url>" \
  -H "Content-Type: application/octet-stream" \
  --data-binary @big-report.pdf

# 3) convert using the returned gcs_ref (query param, no body)
#    add async=true if the document is over 500 pages — see "Large files & async"
curl -X POST "https://api.skely.io/v1/convert?format=json&gcs_ref=uploads/<uid>/<id>" \
  -H "Authorization: Bearer sk_live_YOUR_API_KEY"

Large files & async conversion

A conversion runs synchronously by default and returns the result in the 200 response — but the synchronous path is capped at 500 pages per request. To convert a larger document (or to run any conversion fire-and-forget), opt into the async path: the request is queued, returns 202 with a Location to poll, and you fetch the result from GET /v1/convert/{id} when the job finishes. The async ceiling is 5000 pages.

Large files are not an error. Files over the 20 MiB inline body cap go through /v1/uploads (up to the 50 MiB hard ceiling); documents over 500 pages go through the async path below. Nothing about a big document makes the API reject it outright — you just pick the right transport.

Quickstart: upload → convert async → poll → download

The full large-file flow end to end. Steps 1–2 (the signed upload) are only needed for a file over 20 MiB; for a smaller file you can send the bytes inline and just add async=true. Honor the 202's Retry-After between polls rather than busy-waiting.

# 1) (large file) get a signed upload URL — no body
UP=$(curl -s -X POST "https://api.skely.io/v1/uploads" \
  -H "Authorization: Bearer sk_live_YOUR_API_KEY")
URL=$(echo "$UP" | jq -r .upload_url)
REF=$(echo "$UP" | jq -r .gcs_ref)

# 2) PUT the bytes straight to storage (octet-stream, NO Skely auth header)
curl -X PUT "$URL" \
  -H "Content-Type: application/octet-stream" \
  --data-binary @big-report.pdf

# 3) enqueue: async=true => 202 + Location to poll. Capture the poll path from the
#    Location header (Prefer: respond-async is an equivalent opt-in to async=true).
LOC=$(curl -s -D - -o /dev/null -X POST "https://api.skely.io/v1/convert?format=json&async=true&gcs_ref=$REF" \
  -H "Authorization: Bearer sk_live_YOUR_API_KEY" \
  | awk 'tolower($1) == "location:" { print $2 }' | tr -d '\r')
# 202 body: { "request_id": "...", "status": "queued", "credits_reserved": 1200, ... }
# LOC is the poll path, e.g. /v1/convert/<request_id>

# 4) poll the captured path until terminal, honoring Retry-After
curl "https://api.skely.io${LOC}" \
  -H "Authorization: Bearer sk_live_YOUR_API_KEY"
# => succeeded: { "status": "succeeded", "pages_succeeded": 1200, "result": {...} }
#    or large:  { "status": "succeeded", "result_url": "https://<signed-download-url>", "result_url_expires_in": 900 }

Opting in

Two equivalent ways to request the async path — either is enough:

Knob	Where	Notes
async=true	query param	Only `true` / `1` turns it on.
Prefer: respond-async	HTTP request header	RFC 7240. Token matched case-insensitively; tolerates `;params` and comma lists.

What async does and doesn't apply to. Opting in enqueues convert requests of any size (even ≤ 500 pages — handy for fire-and-forget with a webhook). A dry run (cost=true) and mode=info are never queued — they do no extraction and always respond synchronously.

The 202 enqueue response

On an accepted async request you get 202 Accepted with a Location header (the path to poll) and a Retry-After header (default 5 s; the body mirrors it as retry_after). Credits equal to credits_reserved are reserved up front. In this 202 body result_url is the relative poll path (/v1/convert/{request_id}) — not a downloadable result; the signed download URL appears on the poll response only once the job succeeds.

202 · application/json

{
  "request_id": "<request_id>",
  "status": "queued",
  "engine": "pdf@2026-06-21",
  "source_format": "pdf",
  "media_type": "application/pdf",
  "determinism": "byte",
  "billable": { "unit": "page", "count": 1200 },
  "pages": 1200,
  "total_pages": 1200,
  "credits_reserved": 1200,
  "format": "json",
  "result_url": "/v1/convert/<request_id>",
  "retry_after": 5
}

Polling for the result

Poll GET /v1/convert/{request_id} (the Location). status moves queued → processing → a terminal state: succeeded, partial (some pages converted — you are billed only for those), or failed (nothing converted; the reservation is refunded). On terminal success the result is delivered two ways:

Inline when small (≤ 256 KiB): a result field (a structured object for json, a string for md / ascii).
Signed URL when large: a result_url (a short-lived 15-minute signed GET URL you fetch directly from storage, no auth) plus result_url_expires_in. A retried poll returns a fresh URL.

The converted output is retained for 24 hours after completion. After that the result is reaped: GET /v1/convert/{request_id} still returns 200 with a compact status record (request_id, status, pages, format) from your request history (plus the webhook object if a callback_url was set), but with no result / result_url — re-run the conversion to regenerate the output. (In the brief window where the record outlives its stored object, the poll returns result_expired: true; same remedy.) A 404 ERR::REQUEST::NOT_FOUND means an unknown id or one you don't own. Note: after reaping, an async job's status reads success | partial | failed (e.g. succeeded → success).

200 · terminal · inline result

{
  "request_id": "<request_id>",
  "status": "succeeded",
  "format": "md",
  "pages": 1200,
  "total_pages": 1200,
  "pages_succeeded": 1200,
  "result": "6865 West 103rd Avenue\n**Tel:** 303-464-1997\n\n| Date | Charges |\n| --- | --- |\n…"
}

200 · terminal · signed-URL result

{
  "request_id": "<request_id>",
  "status": "succeeded",
  "format": "json",
  "pages": 1200,
  "total_pages": 1200,
  "pages_succeeded": 1200,
  "result_url": "https://<signed-download-url>?…signature…",
  "result_url_expires_in": 900
}
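The poll step can be wrapped in a loop that stops on any terminal state. This is a rough sketch rather than an official client: fetch_status, poll_until_terminal, the SKELY_API_KEY environment variable, and the maximum-polls guard are all assumptions introduced for the example, and jq is assumed to be installed.

```shell
# Hypothetical helper: GET the poll path and print the job's status field.
fetch_status() {
  curl -s "https://api.skely.io$1" \
    -H "Authorization: Bearer $SKELY_API_KEY" | jq -r '.status'
}

# Sketch: poll until the job reaches a terminal state, then print that state.
# Usage: poll_until_terminal POLL_PATH [DELAY_SECONDS] [MAX_POLLS]
# Pass the 202's Retry-After as DELAY_SECONDS (defaults to 5 here).
# Returns 1 if a fetch fails and 2 if MAX_POLLS is exhausted.
poll_until_terminal() (
  delay=${2:-5} max=${3:-120} n=0
  while [ "$n" -lt "$max" ]; do
    state=$(fetch_status "$1") || return 1
    case "$state" in
      # "success" is included because history records use that spelling.
      succeeded|success|partial|failed) echo "$state"; return 0 ;;
    esac
    n=$((n + 1))
    sleep "$delay"
  done
  return 2
)

# Example: poll_until_terminal "$LOC" 5
```

Once the loop reports succeeded or partial, fetch the poll path once more and read either result or result_url from the body.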

Limits & errors

Situation	Result
≤ 500 pages, not opted in	Synchronous `200` (the default).
> 500 pages, not opted in	`413 ERR::CONVERT::MAX_PAGES_EXCEEDED`: the message explains how to scope with `pages`, split client-side, or opt into async.
Up to 5000 pages, opted into async	`202` enqueue (any size up to the async ceiling is accepted once you opt in).
> 5000 pages, even opted in	`413 ERR::CONVERT::MAX_PAGES_EXCEEDED`: opting in does not lift the async ceiling; split the document client-side.
Inline body > 20 MiB / file > 50 MiB	`413 ERR::INPUT::TOO_LARGE`: use /v1/uploads; 50 MiB is the hard ceiling on every path.
Not enough credits	`402 ERR::BILLING::INSUFFICIENT_CREDITS` (with `needed` / `available`), checked before enqueue, so nothing is reserved or charged.
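The size and page limits in this table can be folded into one client-side decision. The sketch below only restates the documented thresholds; plan_conversion and its output vocabulary (inline / upload, sync / async, too_large, split) are hypothetical names chosen for the example.

```shell
INLINE_CAP=$((20 * 1024 * 1024))   # 20 MiB inline body cap
HARD_CAP=$((50 * 1024 * 1024))     # 50 MiB ceiling on every path
SYNC_PAGES=500                     # synchronous page cap
ASYNC_PAGES=5000                   # async page ceiling

# Sketch: choose transport and execution path from file size and page count.
# Usage: plan_conversion SIZE_BYTES PAGES
# Prints "<transport> <route>", e.g. "upload async".
# Prints "too_large" or "split" and returns 1 when one request cannot cover the document.
plan_conversion() (
  size=$1 pages=$2
  if [ "$size" -gt "$HARD_CAP" ]; then echo "too_large"; return 1; fi
  if [ "$pages" -gt "$ASYNC_PAGES" ]; then echo "split"; return 1; fi
  if [ "$size" -le "$INLINE_CAP" ]; then transport=inline; else transport=upload; fi
  if [ "$pages" -le "$SYNC_PAGES" ]; then route=sync; else route=async; fi
  echo "$transport $route"
)
```

The page count is not known before the first call; a dry run or mode=info scoped with pages=1 reports total_pages.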

Billing

Reserve → settle → refund. At 1 credit per page, the full estimated cost is reserved when the job is enqueued (subscription bucket first, then purchased) — the 402 check happens before that, so you are never enqueued without the credits. When the job finishes it is settled: you are charged for the pages that actually converted and the unused reservation is refunded. A failed job refunds the entire reservation. Usage is recorded once, under the same request_id, and appears on your Requests page.
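The settle step reduces to one subtraction at the documented rate of 1 credit per page. A minimal sketch of the arithmetic, not the billing implementation itself; settle is a hypothetical name.

```shell
# Sketch: settle an async reservation at 1 credit per converted page.
# Usage: settle CREDITS_RESERVED PAGES_SUCCEEDED
# Prints "CHARGED REFUNDED".
settle() (
  reserved=$1 succeeded=$2
  charged=$succeeded                  # billed only for pages that converted
  refunded=$((reserved - charged))    # the unused reservation is returned
  echo "$charged $refunded"
)
```

A partial job that converts 900 of 1200 reserved pages settles as "900 300"; a failed job (0 pages) settles as "0 1200", which is the full refund described above.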

GET /v1/convert/{id}

GET https://api.skely.io/v1/convert/{id}

Owner-scoped status for a request_id from a prior conversion (also returned in the X-Skely-Request-Id header and the 202 Location). This is the poll endpoint for async jobs and also returns a compact record for a completed synchronous conversion. Poll no faster than the 202's Retry-After.

Response field	Type	When	Description
request_id	string	always	The request's id.
status	string	always	Async (live job): `queued` \| `processing` \| `succeeded` \| `partial` \| `failed`. Sync / reaped record: `success` \| `partial` \| `failed` (a reaped partial async job reads `partial`).
format	string	always	The output format.
pages	integer	always	Async: billable pages reserved. Sync record: pages charged.
total_pages	integer	async	The document's full page count.
pages_succeeded	integer \| null	async	Pages actually converted (what you are charged for). `null` until terminal.
result	object \| string	terminal, small	The converted output, inline when ≤ 256 KiB (a structured object for `json`; a string for `md` / `ascii`).
result_url	uri	terminal, large	A short-lived (15-minute) signed GET URL for a result over 256 KiB. Fetch directly — no Skely auth. A retry returns a fresh URL.
result_url_expires_in	integer	with result_url	Seconds the signed URL stays valid (`900`).
result_expired	boolean	reaped	`true` in the brief window where the record still exists but its stored output is gone — re-run the conversion. (Once fully reaped past 24h, the response is a compact usage record with no result fields instead.)
error	string	failed	A short failure reason on a `failed` job.
webhook	object	callback set, terminal	The latest webhook delivery state (`delivery_id`, `event_type`, `status`, `attempts`, `max_attempts`, `last_status_code`, `last_error`) when a callback_url was supplied. Present once the job settles — absent while `queued` / `processing`.
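
A client usually needs to branch on which of the result fields in the table is present. The following is a minimal sketch of that branching; `locate_result` and its return labels are hypothetical names chosen for illustration, and the input is assumed to be the already-parsed JSON status record.

```python
def locate_result(record: dict) -> tuple[str, object]:
    """Classify a status record by where (or whether) the output can be found."""
    if record.get("status") == "failed":
        return ("failed", record.get("error"))       # short failure reason
    if record.get("result_expired"):
        return ("expired", None)                     # stored output is gone: re-run
    if record.get("result") is not None:
        return ("inline", record["result"])          # object for json, string for md / ascii
    if record.get("result_url"):
        return ("signed_url", record["result_url"])  # fetch directly, no Skely auth
    # Still queued / processing, or a compact record with no result fields.
    return ("no_result", None)
```

For a `signed_url` outcome, `result_url_expires_in` gives the remaining validity in seconds; polling the status endpoint again returns a fresh URL.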

cURL

curl "https://api.skely.io/v1/convert/<request_id>" \
  -H "Authorization: Bearer sk_live_YOUR_API_KEY"

# Completed synchronous conversion (compact record):
# => { "request_id": "<request_id>", "status": "success", "pages": 12, "format": "json" }

# Async job mid-flight:
# => { "request_id": "<request_id>", "status": "processing", "format": "json",
#      "pages": 1200, "total_pages": 1200, "pages_succeeded": null }

See Large files & async for the full queued → terminal poll loop with inline and signed-URL result examples. An unknown id (or one you don't own) returns 404 with ERR::REQUEST::NOT_FOUND. For a completed synchronous conversion the record is a compact subset (request_id, status, pages, format) — it does not re-echo the cost/engine metadata from the original POST response; keep that response if you need it.

Terminal-success spelling. A live async poll reports succeeded; the webhook and any record served from history (synchronous conversions, and async jobs polled after the 24h window) report success. Treat both as the same success state — { "succeeded", "success" }.
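
One way to handle the two spellings is to normalize the status as soon as a poll returns. The sketch below assumes a hypothetical `fetch` callable that performs the GET and returns the parsed JSON record; `poll_until_terminal`, `max_polls`, and the injected `sleep` are illustrative names, not part of the API. Set `interval_s` to at least the 202's Retry-After.

```python
import time
from typing import Callable

TERMINAL = {"succeeded", "success", "partial", "failed"}


def normalize_status(status: str) -> str:
    """Fold the live-poll spelling "succeeded" into the history spelling "success"."""
    return "success" if status == "succeeded" else status


def poll_until_terminal(
    fetch: Callable[[], dict],
    interval_s: float,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch() until the record is terminal, waiting interval_s between polls."""
    for _ in range(max_polls):
        record = fetch()
        if record["status"] in TERMINAL:
            return {**record, "status": normalize_status(record["status"])}
        sleep(interval_s)
    raise TimeoutError("job did not reach a terminal state within max_polls")
```

Injecting `sleep` keeps the loop easy to test without real delays.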

Webhooks

Pass callback_url= on a non-dry-run /v1/convert and Skely POSTs one signed event to that https URL when the conversion settles — conversion.succeeded or conversion.failed. The webhook is always a completion notification: data.result is null and data.result_url points at the request status endpoint (GET /v1/convert/{request_id}).

Pairs with the async path for fire-and-forget. On a synchronous conversion the document is already in the 200 response, so the webhook is just a signal. On an async conversion (async=true) the webhook lets you skip polling entirely: receive the event when the job settles, then fetch the result (inline or via the signed URL) from the status endpoint.

Opting in

The callback_url must be an https URL and is SSRF-checked (private/internal/metadata addresses are rejected). A webhook fires only when a conversion actually runs and reaches a terminal state. It does not fire for dry runs (cost=true) or for pre-flight rejections — auth failures, rate limits, insufficient credits, bad input, or too-many-pages errors all return synchronously and run no conversion, so no event is sent.

cURL

# Opt in per request with callback_url (https only)
curl -X POST "https://api.skely.io/v1/convert?format=json&callback_url=https://you.example.com/hooks/skely" \
  -H "Authorization: Bearer sk_live_YOUR_API_KEY" \
  -H "Content-Type: application/pdf" \
  --data-binary @report.pdf
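
When the request URL is built in code, percent-encoding the callback_url keeps its own characters from being confused with the outer query string. A minimal sketch using only the standard library follows; `convert_url` is a hypothetical helper, and its local https check only fails early on the client (the server still applies its own validation and SSRF rules).

```python
from urllib.parse import urlencode, urlsplit

CONVERT_ENDPOINT = "https://api.skely.io/v1/convert"


def convert_url(fmt: str, callback_url: str) -> str:
    """Build the /v1/convert URL with a percent-encoded callback_url."""
    if urlsplit(callback_url).scheme != "https":
        raise ValueError("callback_url must be an https URL")
    query = urlencode({"format": fmt, "callback_url": callback_url})
    return f"{CONVERT_ENDPOINT}?{query}"


url = convert_url("json", "https://you.example.com/hooks/skely")
# https://api.skely.io/v1/convert?format=json&callback_url=https%3A%2F%2Fyou.example.com%2Fhooks%2Fskely
```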

Delivery

Skely POSTs application/json to your callback_url with these headers:

Header	Meaning
X-Skely-Webhook-Id	Stable delivery id — constant across all retries of the same event. Dedupe on it.
X-Skely-Event	The event type: `conversion.succeeded` or `conversion.failed`.
X-Skely-Signature	HMAC signature, format `t=<unix_seconds>,v1=<hex>` (see below).
X-Skely-Signature-Timestamp	Unix seconds the body was signed — mirrors the `t=` value inside `X-Skely-Signature` (verify the replay window against that `t=`; this header is informational). Each retry is re-signed with a fresh timestamp.
User-Agent	`Skely-Webhooks/1`

Event payload

The top-level request_id equals data.request_id and the original X-Skely-Request-Id — the join key. data.status is one of success, partial, or failed: a partial async settlement fires the conversion.succeeded event with data.status: "partial", and you are billed only for the converted pages (credits_charged = pages, with pages < total_pages). data.result is always null — fetch the output from the status endpoint via data.result_url. On failure, data carries status: "failed" and credits_charged: 0. Note the webhook uses success, whereas the GET poll's async status is succeeded — match them accordingly.

conversion.succeeded

{
  "id": "<event_id>",
  "type": "conversion.succeeded",
  "created": 1718924400,
  "api_version": "v1",
  "request_id": "<request_id>",
  "data": {
    "request_id": "<request_id>",
    "status": "success",
    "format": "json",
    "pages": 12,
    "total_pages": 12,
    "credits_charged": 12,
    "result_url": "https://api.skely.io/v1/convert/<request_id>",
    "result": null
  }
}
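
After the signature has been verified, a handler typically reduces the event to the few facts it acts on. The sketch below assumes the payload shape shown above; `summarize_event` and the keys of the dict it returns are hypothetical.

```python
def summarize_event(event: dict) -> dict:
    """Reduce a verified webhook event to the fields a handler acts on."""
    data = event["data"]
    if event["request_id"] != data["request_id"]:
        raise ValueError("request_id mismatch between envelope and data")
    status = data["status"]
    if status not in {"success", "partial", "failed"}:
        raise ValueError(f"unexpected data.status: {status!r}")
    converted = status in {"success", "partial"}
    return {
        "request_id": event["request_id"],        # the join key
        "converted": converted,                   # partial still carries output
        "partial": status == "partial",
        "credits_charged": data.get("credits_charged", 0),
        # data.result is always null: the output lives at the status endpoint.
        "fetch_from": data.get("result_url") if converted else None,
    }
```

The handler would then call the URL in `fetch_from` with its API key to retrieve the output.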

Verifying the signature

Signatures are HMAC-SHA256 over "{timestamp}.{raw_body}", keyed with your per-account signing secret (whsec_… — see signing secret below). Verify before parsing JSON. Recompute the HMAC, compare to the v1 value in constant time, and reject if the t timestamp is outside a ±5-minute (300 s) window. Because each retry is re-signed with a fresh timestamp, the window applies to the actual send time of that attempt.

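import hashlib
import hmac
import time

def verify(secret: str, headers: dict, raw_body: bytes) -> bool:
    """Return True only if X-Skely-Signature matches raw_body and is fresh."""
    # Parse "t=<unix_seconds>,v1=<hex>" into {"t": ..., "v1": ...}.
    header = headers.get("X-Skely-Signature", "")
    sig = dict(p.strip().split("=", 1) for p in header.split(",") if "=" in p)
    if "t" not in sig or "v1" not in sig:
        return False
    try:
        ts = int(sig["t"])
    except ValueError:  # malformed timestamp
        return False
    if abs(time.time() - ts) > 300:  # +/- 5 min replay window
        return False
    # The signed payload is "{timestamp}.{raw_body}" over the exact bytes received.
    signed = f"{ts}.".encode() + raw_body
    expected = hmac.new(secret.encode(), signed, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected.encode(), sig["v1"].encode())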
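
To exercise a verifier locally without waiting for a real delivery, a test can build a header with the same scheme. This is a sketch for test fixtures only: `sign_for_test` is a hypothetical helper, and the secret and body below are made-up values.

```python
import hashlib
import hmac
import time
from typing import Optional


def sign_for_test(secret: str, raw_body: bytes, ts: Optional[int] = None) -> str:
    """Build an X-Skely-Signature value over "{timestamp}.{raw_body}"."""
    ts = int(time.time()) if ts is None else ts
    mac = hmac.new(secret.encode(), f"{ts}.".encode() + raw_body, hashlib.sha256)
    return f"t={ts},v1={mac.hexdigest()}"


body = b'{"type": "conversion.succeeded"}'
header = sign_for_test("whsec_test_only", body)
# Passing {"X-Skely-Signature": header} and the same body to the verifier should
# succeed; changing a single byte of the body should make it fail.
```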

Retries & idempotency

The first attempt fires immediately. Skely waits 10 s per attempt for a response; any 2xx is success, and anything else (non-2xx, timeout, connection error) is a failed attempt. Failed deliveries retry with backoff after +1m, +5m, +30m, +2h, +6h — 6 attempts over ~8.6 h — after which the delivery is marked failed. Acknowledge fast: return 2xx within 10 s and do any heavy work asynchronously.
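
The ~8.6 h figure follows from adding up the backoff steps. The sketch below assumes each delay is measured from the previous attempt, which is the reading consistent with that total; `attempt_offsets` is a hypothetical helper.

```python
from datetime import timedelta

RETRY_DELAYS = [
    timedelta(minutes=1),
    timedelta(minutes=5),
    timedelta(minutes=30),
    timedelta(hours=2),
    timedelta(hours=6),
]


def attempt_offsets(delays=RETRY_DELAYS) -> list:
    """Time of each attempt relative to the first one (which fires at offset 0)."""
    offsets = [timedelta(0)]
    for delay in delays:
        offsets.append(offsets[-1] + delay)
    return offsets


offsets = attempt_offsets()
# 6 attempts at 0, 1m, 6m, 36m, 2h36m, and 8h36m; 8h36m is 516 minutes, i.e. 8.6 h.
```

A receiver that is down for less than that span should therefore still get at least one more attempt.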

Delivery is at-least-once, so duplicates are possible. Dedupe on X-Skely-Webhook-Id (stable across retries of an event) and the event id, and keep your handler idempotent. Skely ignores your response body — only the status code matters.
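
A minimal sketch of the dedupe step, under the assumption that a delivery counts as a duplicate if either its X-Skely-Webhook-Id or its event id has been seen before. `WebhookDeduper` is a hypothetical name, and the in-memory sets stand in for what would normally be a persistent store with an atomic insert.

```python
class WebhookDeduper:
    """Remember processed deliveries so retries and duplicates are skipped."""

    def __init__(self) -> None:
        self._webhook_ids = set()
        self._event_ids = set()

    def is_new(self, webhook_id: str, event_id: str) -> bool:
        """Return True the first time a delivery is seen, False on repeats."""
        if webhook_id in self._webhook_ids or event_id in self._event_ids:
            return False
        self._webhook_ids.add(webhook_id)
        self._event_ids.add(event_id)
        return True


deduper = WebhookDeduper()
first = deduper.is_new("wh_123", "evt_abc")   # True: process the event
again = deduper.is_new("wh_123", "evt_abc")   # False: already handled, just return 2xx
```

A duplicate should still be acknowledged with a 2xx so that Skely stops retrying it.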

Signing secret

Each account has one webhook signing secret (prefix whsec_). View and rotate it on the API Keys page. Every event — including every retry — is signed with the current secret. Rotating invalidates the old secret immediately, so roll the new value into your verifier as part of rotation.
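
One way to roll a new value into a verifier without a gap is to let it hold a short list of candidate secrets and drop the old one afterwards. This is only a sketch of that idea: `matching_secret` is a hypothetical helper, and it covers the HMAC comparison only (the timestamp window check still applies).

```python
import hashlib
import hmac
from typing import Iterable, Optional


def matching_secret(
    secrets: Iterable[str], ts: int, raw_body: bytes, v1_hex: str
) -> Optional[str]:
    """Return the first candidate secret whose HMAC matches v1_hex, else None."""
    signed = f"{ts}.".encode() + raw_body
    for secret in secrets:
        expected = hmac.new(secret.encode(), signed, hashlib.sha256).hexdigest()
        if hmac.compare_digest(expected.encode(), v1_hex.encode()):
            return secret
    return None
```

Because every retry is signed with the current secret, events rejected while the verifier still held only the old value should verify on a later retry once the new value is deployed.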

Errors

Errors are namespaced and machine-readable. Error codes follow the format ERR::SCOPE::REASON, for example ERR::INPUT::BLOCKED_URL; the one exception is the generic fallback ERR::INTERNAL, which has only two segments. The code appears both in the response body and in the X-Skely-Error-Code response header.

Error body

application/json

{
  "request_id": "<request_id>",
  "code": "ERR::INPUT::BLOCKED_URL",
  "message": "URL resolves to a private/internal address",
  "is_retryable": false,
  "docs_url": "https://skely.io/docs#err-input-blocked_url"
}

Field	Description
request_id	The id of the failed request — quote it in support tickets.
code	The namespaced `ERR::SCOPE::REASON` code (also in `X-Skely-Error-Code`).
message	A human-readable explanation.
is_retryable	Whether retrying the same request could succeed.
docs_url	A link to the relevant documentation for this error.

Some statuses extend the body with extra fields: a 402 (ERR::BILLING::INSUFFICIENT_CREDITS) adds needed and available; a 413 (ERR::CONVERT::MAX_PAGES_EXCEEDED) adds pages, total_pages, and max_pages; a 429 (ERR::RATE::LIMITED) adds retry_after (mirrors the Retry-After header); a 415 (ERR::INPUT::UNSUPPORTED_FORMAT) adds candidates and supported; a 415 (ERR::INPUT::FORMAT_MISMATCH) adds expected and detected; a 422 (ERR::INPUT::BAD_VERSION) adds available.
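
A client can keep the five common fields typed and collect whatever else a status adds. The following is a sketch under that assumption; `SkelyError`, `parse_error`, and the `extra` and `scope` attributes are hypothetical names, not part of any official SDK.

```python
from dataclasses import dataclass, field
from typing import Optional

BASE_FIELDS = ("request_id", "code", "message", "is_retryable", "docs_url")


@dataclass
class SkelyError:
    request_id: Optional[str]
    code: Optional[str]
    message: Optional[str]
    is_retryable: Optional[bool]
    docs_url: Optional[str]
    extra: dict = field(default_factory=dict)  # e.g. needed / available on a 402

    @property
    def scope(self) -> Optional[str]:
        """The SCOPE segment of ERR::SCOPE::REASON, or None for ERR::INTERNAL."""
        parts = (self.code or "").split("::")
        return parts[1] if len(parts) == 3 else None


def parse_error(body: dict) -> SkelyError:
    """Split a parsed error body into the common fields and status-specific extras."""
    base = {name: body.get(name) for name in BASE_FIELDS}
    extra = {k: v for k, v in body.items() if k not in BASE_FIELDS}
    return SkelyError(**base, extra=extra)
```

For a 402, `parse_error(body).extra` would then hold `needed` and `available`.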

Error code catalog

Every code the API can return, with its HTTP status, whether it is safe to retry, and what to do.

Code	HTTP	Retryable	Meaning / what to do
ERR::AUTH::MISSING_TOKEN	401	no	No Authorization: Bearer key was sent.
ERR::AUTH::INVALID_KEY	401	no	The API key is malformed or not recognized.
ERR::AUTH::REVOKED_KEY	401	no	The API key was revoked — create a new one.
ERR::AUTH::INVALID_TOKEN	401	no	The session token is invalid or expired.
ERR::AUTH::EMAIL_UNVERIFIED	403	no	Verify your account email before calling the API.
ERR::AUTH::APP_CHECK	401	no	Browser verification failed (web callers only); API-key callers never hit this.
ERR::AUTH::FORBIDDEN_ORIGIN	403	no	Request origin is not allowed — call the documented base URL; contact support if it persists.
ERR::AUTH::KEY_IN_URL	400	no	A credential was put in the URL — send it only in the Authorization header.
ERR::RATE::LIMITED	429	yes	Per-key rate limit hit — back off and retry after Retry-After.
ERR::SERVICE::UNAVAILABLE	503	yes	The service is temporarily unavailable — retry with exponential backoff.
ERR::BILLING::INSUFFICIENT_CREDITS	402	no	Not enough credits (see needed / available) — top up or upgrade.
ERR::INPUT::NO_INPUT	422	no	No document source — send body bytes, url, or gcs_ref.
ERR::INPUT::EMPTY_PDF	422	no	The supplied document was empty.
ERR::INPUT::BAD_FORMAT	422	no	format must be one of json, md, or ascii.
ERR::INPUT::BAD_MODE	422	no	mode must be convert or info.
ERR::INPUT::AMBIGUOUS_INPUT	422	no	More than one document source supplied — send exactly one.
ERR::INPUT::UNSUPPORTED_FORMAT	415	no	The bytes match no supported format.
ERR::INPUT::FORMAT_MISMATCH	415	no	The engine pin's format isn't the format detected from the bytes.
ERR::INPUT::BAD_CALLBACK_URL	422	no	callback_url must be a valid https URL (SSRF rules apply).
ERR::INPUT::BAD_ENGINE	422	no	The engine pin is unparseable (e.g. a bare version, unknown format).
ERR::INPUT::BAD_VERSION	422	no	Unknown engine version for the detected format.
ERR::INPUT::BAD_URL	422	no	The url is malformed or could not be resolved.
ERR::INPUT::BLOCKED_URL	403	no	The url resolves to a private/internal/metadata address (SSRF guard).
ERR::INPUT::FORBIDDEN_REF	403	no	The gcs_ref does not reference your own upload.
ERR::INPUT::BAD_REF	422	no	The gcs_ref is malformed.
ERR::INPUT::BAD_PAGE_RANGE	422	no	The pages selection is malformed or out of range.
ERR::INPUT::FETCH_FAILED	422	yes	Could not fetch the url — transient; retry.
ERR::INPUT::GCS_FETCH_FAILED	422	yes	Could not read the referenced upload — transient; retry.
ERR::INPUT::TOO_MANY_REDIRECTS	422	no	The url redirected too many times.
ERR::INPUT::TOO_LARGE	413	no	Over the size cap — use /v1/uploads; 50 MiB is the hard ceiling.
ERR::CONVERT::MAX_PAGES_EXCEEDED	413	no	Over the page cap — scope with pages, split, or use the async path.
ERR::REQUEST::NOT_FOUND	404	no	No such request, or not yours. (A reaped result is a 200 status record, not a 404.)
ERR::INTERNAL	500	no	Unexpected server error. Not marked retryable; contact support if it persists.

Retry only codes marked retryable, and back off — a 429 (ERR::RATE::LIMITED) tells you how long to wait via Retry-After; a 503 (ERR::SERVICE::UNAVAILABLE) carries no Retry-After — retry after a few seconds with exponential backoff. Failed conversions are never charged.
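
The retry guidance can be captured in a small decision function. This is a sketch, not a prescribed policy: `retry_delay` is a hypothetical helper, and the 2-second base and 60-second cap are assumed values standing in for "a few seconds with exponential backoff".

```python
from typing import Optional


def retry_delay(
    is_retryable: bool,
    attempt: int,
    retry_after: Optional[float] = None,
    base: float = 2.0,
    cap: float = 60.0,
) -> Optional[float]:
    """Seconds to wait before the next try, or None if the error is not retryable.

    attempt is 0 for the first retry, 1 for the second, and so on.
    """
    if not is_retryable:
        return None
    if retry_after is not None:
        return float(retry_after)          # 429: honor Retry-After / retry_after
    return min(cap, base * 2 ** attempt)   # 503: exponential backoff, capped
```

In practice a little random jitter on the backoff branch helps keep many clients from retrying in lockstep.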