Overview
Skely converts PDFs with an embedded text layer into deterministic, byte-stable JSON, Markdown, or ASCII by reading the document's own text and vector data, with no probabilistic models. The same input and the same engine version always produce the same bytes. The core endpoint is POST /v1/convert; companion routes handle large uploads, status polling, and engine-version discovery.
Base URL
All endpoints are served under /v1 at:
https://api.skely.io/v1
Your first call
Send the raw PDF bytes as the request body and get structured JSON back. Every parameter rides in the query string; the body is the PDF.
curl -X POST "https://api.skely.io/v1/convert?format=json" \
-H "Authorization: Bearer sk_live_YOUR_API_KEY" \
-H "Content-Type: application/pdf" \
--data-binary @invoice.pdf
Replace sk_live_YOUR_API_KEY with a real key; you can create one in the dashboard.
Authentication
Every /v1 route requires a bearer token. Send your API key in the Authorization header:
Authorization: Bearer sk_live_YOUR_API_KEY
The token is an API key (sk_live_…) tied to your account. Create and manage keys in the dashboard; the secret is shown only once, so store it securely.
A missing or invalid token is rejected with ERR::AUTH::MISSING_TOKEN / ERR::AUTH::INVALID_KEY (see Errors).
POST /v1/convert
The main reference endpoint. It is always a POST. Provide the PDF in one of three input modes, optionally scope the work to specific pages, and optionally run a no-charge cost estimate. The API key goes in the Authorization: Bearer header only, never in a query or body param.
Input modes
Provide exactly one PDF source — raw bytes in the body, a url, or a gcs_ref. Supplying more than one returns ERR::INPUT::AMBIGUOUS_INPUT; supplying none returns ERR::INPUT::NO_INPUT.
| Mode | Transport | Source | Notes |
|---|---|---|---|
| Raw bytes | application/pdf body | --data-binary | The PDF bytes in the request body, up to 20 MiB inline (a larger inline body is rejected with 413 ERR::INPUT::TOO_LARGE). Content-Type is optional and not used for format detection (the format is read from the bytes); the examples send application/pdf by convention. For a larger file (up to the 50 MiB hard ceiling), upload it via /v1/uploads and pass the returned gcs_ref. |
| Public URL | query url= (no body) | url | A public https URL to a PDF. Private/internal/metadata addresses are rejected (SSRF guard). |
| Upload reference | query gcs_ref= (no body) | gcs_ref | A reference returned by /v1/uploads — your own upload only. The way to convert a file larger than the 20 MiB inline cap. |
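The size limits in the table above can be turned into a small client-side check before any request is sent. This is a minimal sketch in POSIX sh; the function name pick_input_mode and its return labels are hypothetical (not part of Skely), and the thresholds are simply the 20 MiB and 50 MiB figures quoted above.

```shell
# Hypothetical helper: choose an input mode from a file size in bytes.
# Thresholds come from the table above: 20 MiB inline cap, 50 MiB hard ceiling.
INLINE_CAP=$((20 * 1024 * 1024))     # 20971520 bytes
HARD_CEILING=$((50 * 1024 * 1024))   # 52428800 bytes

pick_input_mode() {
  size=$1
  if [ "$size" -le "$INLINE_CAP" ]; then
    echo "inline"      # send the bytes in the request body
  elif [ "$size" -le "$HARD_CEILING" ]; then
    echo "upload"      # go through /v1/uploads and pass the gcs_ref
  else
    echo "too_large"   # over the hard ceiling on every path
  fi
}
```

For a local file, something like `pick_input_mode $(wc -c < report.pdf)` would supply the size.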
The API key goes in the Authorization: Bearer header only. A credential-looking query param (key, api_key, token, …) is rejected with ERR::AUTH::KEY_IN_URL rather than silently honoured.
Parameters
All non-secret parameters are query-string parameters. The body is reserved for the raw PDF bytes. Unrecognized query parameters are ignored — except credential-looking names (key, api_key, …), rejected with ERR::AUTH::KEY_IN_URL; an unknown value for a recognized param (e.g. mode, format, engine) is also rejected, not ignored. The output controls bounds / fonts / semantic are documented under Output options.
| Name | Type | Required | Description |
|---|---|---|---|
| mode | "convert" \| "info" | optional | Default convert (full extraction); info is a lite metadata probe (see Document info). An unknown value returns ERR::INPUT::BAD_MODE. |
| format | "json" \| "md" \| "ascii" | optional | Output representation; see Output formats. Defaults to json. |
| pages | string | optional | Scope to a 1-based page selection — a range string like "1-11,18,20-22" (ranges + singletons, comma-separated, whitespace tolerated). You are billed only for the selected pages. Omit to convert the whole document. |
| url | https URL | conditional | A public https URL to a PDF, fetched server-side. Use when there is no body. Mutually exclusive with a body and with gcs_ref. |
| gcs_ref | string | conditional | A ref from POST /v1/uploads (your own upload only) — the way to convert a PDF larger than the 20 MiB inline cap (up to the 50 MiB hard ceiling). Use when there is no body. Mutually exclusive with a body and with url. |
| cost | boolean | optional | Dry run. Returns the page count and estimated credits without converting or charging anything. Defaults to false. |
| callback_url | https URL | optional | Webhook target — Skely POSTs a signed completion notification when the conversion settles. See Webhooks. Must be https; otherwise ERR::INPUT::BAD_CALLBACK_URL. |
| async | boolean | optional | Opt into the async (queued) path — the way to convert more than 500 pages, and to run any conversion fire-and-forget. Returns 202 with a Location to poll instead of an inline 200. Equivalent to the Prefer: respond-async header. See Large files & async. Defaults to false. |
| engine | string | optional | Pin the engine — <format>@<version> (e.g. pdf@2026-06-21), a bare pdf (its latest), or latest / stable. Pin a concrete version only together with its format — a bare version like 2026-06-21 is rejected with ERR::INPUT::BAD_ENGINE. Discover valid versions with GET /v1/engine-versions. If the pinned format isn't the format detected from the bytes → ERR::INPUT::FORMAT_MISMATCH; an unparseable pin → ERR::INPUT::BAD_ENGINE; an unknown version → ERR::INPUT::BAD_VERSION. The resolved engine is echoed as the qualified token <format>@<version> (e.g. pdf@2026-06-21) in the engine response field; the X-Skely-Engine-Version header carries the bare version (2026-06-21), so to re-pin combine it with X-Skely-Source-Format. |
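The re-pin rule in the engine row (combine X-Skely-Source-Format with the bare X-Skely-Engine-Version) can be sketched as a small header filter. This assumes POSIX sh with tr and awk; the function name engine_pin_from_headers is hypothetical, and the input is assumed to be a raw header dump such as the one `curl -D -` prints.

```shell
# Hypothetical helper: read a response-header dump on stdin and print the
# qualified pin <format>@<version>. Fails if either header is missing.
engine_pin_from_headers() {
  tr -d '\r' | awk '
    tolower($1) == "x-skely-source-format:"  { fmt = $2 }
    tolower($1) == "x-skely-engine-version:" { ver = $2 }
    END {
      if (fmt == "" || ver == "") exit 1   # a header is missing: no pin
      print fmt "@" ver
    }'
}
```

The printed token (for example pdf@2026-06-21) is the form the engine parameter accepts; the bare version alone would be rejected as described above.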
Request examples
# 1) raw PDF bytes in the body (<= 20 MiB), scoped to a page selection
curl -X POST "https://api.skely.io/v1/convert?format=md&pages=1-11,18" \
-H "Authorization: Bearer sk_live_YOUR_API_KEY" \
-H "Content-Type: application/pdf" \
--data-binary @report.pdf
# 2) public URL (no body)
curl -X POST "https://api.skely.io/v1/convert?format=json&url=https://example.com/report.pdf" \
-H "Authorization: Bearer sk_live_YOUR_API_KEY"
# 3) gcs_ref from POST /v1/uploads (for large files, no body)
curl -X POST "https://api.skely.io/v1/convert?format=json&gcs_ref=uploads/<uid>/<id>" \
-H "Authorization: Bearer sk_live_YOUR_API_KEY"
Success response
pages is what you were charged for; total_pages is the document's full length. A successful conversion returns status: "success". (The example's result.sections is abbreviated to one page; a multi-page convert returns one section per converted page.) Every convert response (this success body, the dry-run, info, and the async 202) also echoes source_format, media_type, determinism, and a billable object ({ unit, count }). source_format and determinism mirror the X-Skely-Source-Format / X-Skely-Determinism headers; media_type and billable have no header counterpart.
{
"request_id": "<request_id>",
"status": "success",
"engine": "pdf@2026-06-21",
"source_format": "pdf",
"media_type": "application/pdf",
"determinism": "byte",
"billable": { "unit": "page", "count": 12 },
"pages": 12,
"total_pages": 24,
"credits_charged": 12,
"format": "json",
"result": {
"documentType": {
"type": "invoice",
"subtype": "hotel_folio",
"confidence": 0.91,
"alternatives": [{ "type": "statement", "score": 6 }],
"signals": ["anchor:\"folio\"", "room", "guest"]
},
"meta": {
"schemaVersion": 3,
"sourceFormat": "pdf",
"mediaType": "application/pdf",
"engineVersion": "2026-06-21",
"determinism": "byte",
"totalSections": 24
},
"sections": [
{
"kind": "page",
"index": 0,
"width": 612,
"height": 792,
"blocks": [
{
"type": "cluster",
"semantic": "address",
"entries": [
{ "kind": "text", "text": "6865 West 103rd Avenue", "semantic": "street" },
{ "kind": "kv", "key": "Tel", "value": "303-464-1997", "semantic": "phone" }
]
},
{
"type": "table",
"columns": [
{ "text": "Date", "alignment": "left" },
{ "text": "Charges", "alignment": "right" }
],
"rows": [
{ "cells": [
{ "column": "Date", "text": "09-03-23", "semantic": "date" },
{ "column": "Charges", "text": "3.69", "semantic": "currency", "normalized": 3.69 }
] }
],
"footers": [
{ "label": "Total", "cells": [
{ "column": "Charges", "text": "12.00", "semantic": "currency", "normalized": 12 }
] }
]
}
]
}
]
}
}
result is a structured document object. It leads with documentType (type, subtype, confidence, plus alternatives and signals) and a meta block, then a sections[] array; for a PDF each section has kind: "page" and a 0-based index, holding ordered blocks. Each block is one of two kinds: a table block (with columns[], rows[].cells[], and footers[]) or a cluster block (a group of free text and key/value (kv) entries that aren't part of a table), each carrying semantic / normalized tags. A cluster may be flagged residual: true, marking a catch-all of leftover text no other block claimed. The exact shape is tied to the engine version you pin (see engine). For md and ascii, result is a string instead; see Output formats.
Dry-run (cost estimate)
Pass cost=true to get a price up front. Nothing is converted and nothing is charged. The dry-run envelope carries dry_run: true and estimated_credits (no status field), and is still subject to the 500-billable-unit synchronous cap (it cannot use the async path), so cost=true on a document over 500 units returns 413 ERR::CONVERT::MAX_PAGES_EXCEEDED — scope it with pages. selected_pages appears when you supplied a pages scope.
# cost=true estimates only — nothing is converted, nothing is charged
curl -X POST "https://api.skely.io/v1/convert?cost=true&pages=1-11,18" \
-H "Authorization: Bearer sk_live_YOUR_API_KEY" \
-H "Content-Type: application/pdf" \
--data-binary @report.pdf
{
"request_id": "<request_id>",
"dry_run": true,
"mode": "convert",
"engine": "pdf@2026-06-21",
"source_format": "pdf",
"media_type": "application/pdf",
"determinism": "byte",
"billable": { "unit": "page", "count": 12 },
"pages": 12,
"total_pages": 24,
"estimated_credits": 12,
"format": "json",
"selected_pages": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 18]
}
Discovering engine versions
GET /v1/engine-versions (authenticated) lists the pinnable <format>@<version> values per detected format — the values you can pass to the engine parameter above. Each version reports its schema_version, status, and whether it is the latest / stable pick for its format. Experimental (unreleased) versions are not listed and cannot be pinned.
# List the pinnable engine versions per detected format
curl "https://api.skely.io/v1/engine-versions" \
-H "Authorization: Bearer sk_live_YOUR_API_KEY"
# => {
# "formats": [
# { "format": "pdf", "versions": [
# { "version": "2026-06-21", "schema_version": 3, "status": "current",
# "latest": true, "stable": true }
# ] }
# ]
# }
Output options
Three boolean query params control which fields appear in the output. They apply to every format — the same projection drives json, md, and ascii. A section's width / height (present for fixed-layout pages such as PDF) are always kept.
| Name | Type | Default | Description |
|---|---|---|---|
| bounds | boolean | false | Include per-element positional geometry: block bounds plus entry / cell bounds (and a kv's keyBounds / valueBounds) as [x, y, w, h] tuples in points (top-left origin). Off by default, so the response carries no per-element geometry. |
| fonts | boolean | false | Include typography: a document-level styles table plus a per-element style key (into that table) and size (point size of the run). The query flag is fonts; the output keys are styles (table), style (per-element), and size (per-element point size). Off by default, so the response omits the styles table and every style / size. |
| semantic | boolean | true | Include semantic annotation: semantic tags plus unit / normalized typed values. On by default. Set semantic=false for a plain structural view — and because md / ascii render from the projected doc, that also drops Markdown links derived from semantic tags. |
By default (bounds=false, fonts=false, semantic=true) the output carries no geometry and no style / size or styles table, and keeps semantics. Turn a flag on only when you need that facet.
Example
The same source block can be projected under each option set; the sample below shows the default projection for format=json. It is abbreviated: the always-present meta block and the documentType alternatives / signals are elided to highlight the projection (see the full Success response).
{
"documentType": { "type": "invoice", "subtype": "hotel_folio", "confidence": 0.91 },
"sections": [
{
"kind": "page",
"index": 0,
"width": 612,
"height": 792,
"blocks": [
{
"type": "cluster",
"semantic": "address",
"entries": [
{ "kind": "text", "text": "6865 West 103rd Avenue", "semantic": "street" },
{ "kind": "kv", "key": "Tel", "value": "303-464-1997", "semantic": "phone" }
]
}
]
}
]
}
Add the flags to the query string, e.g. ?format=json&bounds=true&fonts=true to include everything, or ?semantic=false for a purely structural result.
Document info
Pass mode=info on /v1/convert for a lite metadata probe: Skely reports the document's high-level facts — detected type, authoritative page count, word count, and (opt-in) fonts and page sizes — without any layout or data extraction. No tables, key/values, or text blocks are returned. The default mode=convert is the full conversion described above; an unknown mode returns ERR::INPUT::BAD_MODE (422).
What it returns
The response is a flat envelope (request_id, mode, status, engine, source_format, media_type, determinism, billable, pages, total_pages, credits_charged) plus these top-level fields:
| Field | Type | When | Description |
|---|---|---|---|
| documentType | object | always | { type, subtype?, confidence, alternatives: [{ type, score }], signals: […] } — the detected type with runner-up alternatives and the matched signals. |
| wordCount | integer | always | Total words across the inspected pages. |
| fonts | object | fonts=true | Document-level fonts table ({ name, family, bold, italic } per font key). Included only when fonts=true. Note: info mode names this table fonts, whereas a full convert exposes the same typography under styles / style. |
| pageSizes | array | bounds=true | [{ page, width, height }, …] in points (1/72 inch), top-left origin — the same coordinate system as bounds. Included only when bounds=true. |
pages scoping still applies: cost is ceil(inspected / 50) while total_pages remains the document's true total. So you can scope to a few pages (e.g. pages=1) to cheaply read the type and true page count of a huge document. A dry run (cost=true) returns estimated_credits on the same basis.
Request example
# Lite metadata probe — type, page count, fonts (no extraction). 1 credit / 50 pages.
curl -X POST "https://api.skely.io/v1/convert?mode=info&fonts=true" \
-H "Authorization: Bearer sk_live_YOUR_API_KEY" \
-H "Content-Type: application/pdf" \
--data-binary @report.pdf
Response
This example sets fonts=true, so the fonts table is present; pageSizes would appear the same way when you add bounds=true.
{
"request_id": "<request_id>",
"mode": "info",
"status": "success",
"engine": "pdf@2026-06-21",
"source_format": "pdf",
"media_type": "application/pdf",
"determinism": "byte",
"billable": { "unit": "page", "count": 3 },
"pages": 3,
"total_pages": 3,
"credits_charged": 1,
"documentType": {
"type": "invoice",
"subtype": "receipt",
"confidence": 0.82,
"alternatives": [{ "type": "statement", "score": 6 }],
"signals": ["anchor:\"invoice\"", "anchor:\"receipt\"", "subtotal", "total"]
},
"wordCount": 412,
"fonts": {
"g_f1": { "name": "Helvetica", "family": "Helvetica", "bold": false, "italic": false },
"g_f2": { "name": "Helvetica-Bold", "family": "Helvetica", "bold": true, "italic": false }
}
}
semantic is ignored in info mode (nothing is rendered or extracted), and format does not change the output, though an unrecognized format value is still rejected with ERR::INPUT::BAD_FORMAT. No webhook fires for an info-mode request. The inspected-page cap (max 500) still applies; scope with pages for larger documents. The detected type can occasionally differ from a full convert (it is classified from the inspected pages' text only, so a narrow scope can change it); for the overwhelming majority of documents it matches.
Output formats
The format query param selects one of three representations of the same underlying document model. All three are deterministic: the same input and engine pin always produce the same bytes.
| format | result type | What it is |
|---|---|---|
| json | object | The structured document object — documentType, meta, and sections[] of ordered cluster / table blocks (a section's kind is page for PDFs). The default. |
| md | string | GitHub-flavored Markdown rendered from the same document model as json: pipe tables, **key:** value lines, ## cluster headers, in reading order. |
| ascii | string | A fixed-width plaintext rendering (monospace +--+ tables + plain key: value lines). Ideal for terminals, logs, and diffs. |
format=json
The structured document object. This is the example shown under Success response above — result is a JSON object.
format=md
GitHub-flavored Markdown rendered from the projected document (it honors the projection flags — e.g. semantic=false drops links). result is a JSON string (newlines are \n-escaped in the JSON envelope). A table footer renders as a normal row keyed by column — the JSON footers[].label (e.g. "Total") is not emitted in md / ascii.
{
"request_id": "<request_id>",
"status": "success",
"engine": "pdf@2026-06-21",
"source_format": "pdf",
"media_type": "application/pdf",
"determinism": "byte",
"billable": { "unit": "page", "count": 1 },
"pages": 1,
"total_pages": 1,
"credits_charged": 1,
"format": "md",
"result": "6865 West 103rd Avenue\n**Tel:** 303-464-1997\n\n| Date | Charges |\n| --- | --- |\n| 09-03-23 | 3.69 |\n| | 12.00 |\n"
}
format=ascii
A fixed-width plaintext document. result is a JSON string; download it as .txt. Below is the decoded string (trailing newline omitted; what you get from result):
# invoice/hotel_folio (confidence 0.91)
== Page 1 ==
6865 West 103rd Avenue
Tel: 303-464-1997
+----------+---------+
| Date | Charges |
+----------+---------+
| 09-03-23 | 3.69 |
+----------+---------+
| | 12.00 |
+----------+---------+
The ascii renderer is a pure, deterministic function of the JSON result: no timestamps, locale, or randomness. Same document → byte-identical ASCII.
Response headers
Every successful /v1/convert response (and the async 202) carries the cost of the request and your remaining balances, so you can track usage without a separate call. Error responses always carry X-Skely-Error-Code and X-Skely-Request-Id; a 429 also carries Retry-After, and an error raised after format detection (e.g. bad page range, too-many-pages, insufficient credits) additionally echoes X-Skely-Source-Format / X-Skely-Engine-Version / X-Skely-Determinism.
| Header | Meaning |
|---|---|
| X-Skely-Credits-Cost | Credits quoted for this request. For a full convert it equals the selected page count; for mode=info it is ceil(selected pages / 50). On a partial conversion you are charged for the pages that actually converted (the credits_charged body field), which may be lower than this header. |
| X-Skely-Subscription-Credits-Remaining | Credits left in your monthly subscription bucket (drained first; resets each period). |
| X-Skely-Purchased-Credits-Remaining | Credits left in your purchased top-up bucket (these never expire). |
| X-Skely-Request-Id | This request's id. It is also the key for GET /v1/convert/{id} and appears in error bodies. |
| X-Skely-Source-Format | The source format detected from the document bytes (e.g. pdf). |
| X-Skely-Engine-Version | The resolved, concrete engine version that ran (never an alias such as latest / stable; e.g. 2026-06-21). To re-pin exactly what produced a result, combine it with X-Skely-Source-Format as <format>@<version> (e.g. pdf@2026-06-21) — the bare version alone is rejected with ERR::INPUT::BAD_ENGINE. |
| X-Skely-Determinism | The resolved engine's reproducibility tier for these bytes — currently byte for every supported format (same document → identical bytes). Present on every successful convert / info, the dry-run (cost=true) 200, and the 202 enqueue. |
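The X-Skely-Credits-Cost rule above (selected page count for a full convert, ceil(selected pages / 50) for mode=info) can be worked through locally. The sketch below assumes POSIX sh with awk and a well-formed, non-overlapping pages selection; both function names are hypothetical and only mirror the arithmetic stated in this document.

```shell
# Hypothetical helper: count the pages in a selection such as "1-11,18,20-22".
# Assumes a well-formed selection with no overlapping ranges.
count_selected_pages() {
  printf '%s\n' "$1" | tr -d ' ' | tr ',' '\n' | awk -F- '
    NF == 1 { n += 1 }             # singleton, e.g. 18
    NF == 2 { n += $2 - $1 + 1 }   # inclusive range, e.g. 1-11
    END { print n + 0 }'
}

# Hypothetical helper: credits quoted for a request.
# $1 = mode (convert|info), $2 = selected page count.
quoted_credits() {
  case "$1" in
    convert) echo "$2" ;;                    # one credit per selected page
    info)    echo $(( ($2 + 49) / 50 )) ;;   # ceil(pages / 50) in integer math
    *)       return 1 ;;                     # unknown mode
  esac
}
```

For example, `quoted_credits convert "$(count_selected_pages "1-11,18")"` gives 12, matching the dry-run sample earlier in this document, and `quoted_credits info 3` gives 1, matching the info-mode sample.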
On error responses, X-Skely-Error-Code and X-Skely-Request-Id are always returned (the cost/balance headers are dropped, though an error after format detection, e.g. insufficient credits, still echoes X-Skely-Credits-Cost and the engine headers). A Retry-After header (seconds) accompanies both a rate-limited (429) response and an async 202, and the 202 also carries a Location header with the path to poll.
The X-Skely-* / Retry-After / Location headers are exposed via CORS, but call the API from your server, not the browser.
POST /v1/uploads
For PDFs over the 20 MiB inline body cap (up to the 50 MiB hard ceiling), skip the inline upload. Request a short-lived signed URL, PUT the raw bytes to it (with Content-Type: application/octet-stream and no Skely auth header on the PUT), then pass the returned gcs_ref to /v1/convert; if the document is over 500 pages, also opt into the async path. The response also sets an X-Skely-Request-Id header (the upload id).
| Response field | Type | Description |
|---|---|---|
| upload_url | uri | Short-lived signed PUT URL. PUT the raw bytes here with Content-Type: application/octet-stream and no Skely auth header. |
| gcs_ref | string | An opaque reference to pass back as gcs_ref on /v1/convert (e.g. uploads/<uid>/<id>) — send it verbatim; don't construct it yourself. |
| expires_in | integer | Seconds the signed URL stays valid (900 — 15 minutes). |
| content_type | string | The exact Content-Type the PUT must send (application/octet-stream); the signed URL binds it. |
Full flow
# 1) request a signed upload URL (no body)
curl -X POST "https://api.skely.io/v1/uploads" \
-H "Authorization: Bearer sk_live_YOUR_API_KEY"
# => { "upload_url": "https://...", "gcs_ref": "uploads/<uid>/<id>",
# "expires_in": 900, "content_type": "application/octet-stream" }
# 2) PUT the PDF straight to the signed URL (octet-stream; NO Skely auth header)
curl -X PUT "<upload_url>" \
-H "Content-Type: application/octet-stream" \
--data-binary @big-report.pdf
# 3) convert using the returned gcs_ref (query param, no body)
# add async=true if the document is over 500 pages — see "Large files & async"
curl -X POST "https://api.skely.io/v1/convert?format=json&gcs_ref=uploads/<uid>/<id>" \
-H "Authorization: Bearer sk_live_YOUR_API_KEY"
Large files & async conversion
A conversion runs synchronously by default and returns the result in the 200 response — but the synchronous path is capped at 500 pages per request. To convert a larger document (or to run any conversion fire-and-forget), opt into the async path: the request is queued, returns 202 with a Location to poll, and you fetch the result from GET /v1/convert/{id} when the job finishes. The async ceiling is 5000 pages.
Quickstart: upload → convert async → poll → download
The full large-file flow end to end. Steps 1–2 (the signed upload) are only needed for a file over 20 MiB; for a smaller file you can send the bytes inline and just add async=true. Honor the 202's Retry-After between polls rather than busy-waiting.
# 1) (large file) get a signed upload URL — no body
UP=$(curl -s -X POST "https://api.skely.io/v1/uploads" \
-H "Authorization: Bearer sk_live_YOUR_API_KEY")
URL=$(echo "$UP" | jq -r .upload_url)
REF=$(echo "$UP" | jq -r .gcs_ref)
# 2) PUT the bytes straight to storage (octet-stream, NO Skely auth header)
curl -X PUT "$URL" \
-H "Content-Type: application/octet-stream" \
--data-binary @big-report.pdf
# 3) enqueue: async=true => 202 + Location to poll. Capture the poll path from the
# Location header (Prefer: respond-async is an equivalent opt-in to async=true).
LOC=$(curl -s -D - -o /dev/null -X POST "https://api.skely.io/v1/convert?format=json&async=true&gcs_ref=$REF" \
-H "Authorization: Bearer sk_live_YOUR_API_KEY" \
| awk 'tolower($1) == "location:" { print $2 }' | tr -d '\r')
# 202 body: { "request_id": "...", "status": "queued", "credits_reserved": 1200, ... }
# LOC is the poll path, e.g. /v1/convert/<request_id>
# 4) poll the captured path until terminal, honoring Retry-After
curl "https://api.skely.io${LOC}" \
-H "Authorization: Bearer sk_live_YOUR_API_KEY"
# => succeeded: { "status": "succeeded", "pages_succeeded": 1200, "result": {...} }
# or large: { "status": "succeeded", "result_url": "https://<signed-download-url>", "result_url_expires_in": 900 }
Opting in
Two equivalent ways to request the async path — either is enough:
| Knob | Where | Notes |
|---|---|---|
| async=true | query param | Only true / 1 turns it on. |
| Prefer: respond-async | HTTP request header | RFC 7240. Token matched case-insensitively; tolerates ;params and comma lists. |
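The two opt-in rules in the table can be illustrated with a local check. This is only a sketch that mirrors the stated matching rules (true / 1 for the query param; a case-insensitive respond-async token that tolerates ;params and comma lists), not Skely's actual implementation; the function name wants_async is hypothetical.

```shell
# Hypothetical check mirroring the stated rules; not Skely's implementation.
# $1 = value of the async query param (may be empty)
# $2 = value of the Prefer request header (may be empty)
wants_async() {
  case "$1" in
    true|1) return 0 ;;   # only these two values turn the query param on
  esac
  # Lower-case the header, split the comma list, drop ;params, trim blanks.
  printf '%s\n' "$2" | tr '[:upper:]' '[:lower:]' | tr ',' '\n' | awk -F';' '
    { gsub(/^[ \t]+|[ \t]+$/, "", $1) }
    $1 == "respond-async" { found = 1 }
    END { exit (found ? 0 : 1) }'
}
```

A call such as `wants_async "" "wait=10, respond-async"` succeeds, while `wants_async yes ""` does not.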
The async path accepts convert requests of any size up to the 5000-page async ceiling (even ≤ 500 pages, which is handy for fire-and-forget with a webhook). A dry run (cost=true) and mode=info are never queued: they do no extraction and always respond synchronously.
The 202 enqueue response
On an accepted async request you get 202 Accepted with a Location header (the path to poll) and a Retry-After header (default 5 s; the body mirrors it as retry_after). Credits equal to credits_reserved are reserved up front. In this 202 body result_url is the relative poll path (/v1/convert/{request_id}) — not a downloadable result; the signed download URL appears on the poll response only once the job succeeds.
{
"request_id": "<request_id>",
"status": "queued",
"engine": "pdf@2026-06-21",
"source_format": "pdf",
"media_type": "application/pdf",
"determinism": "byte",
"billable": { "unit": "page", "count": 1200 },
"pages": 1200,
"total_pages": 1200,
"credits_reserved": 1200,
"format": "json",
"result_url": "/v1/convert/<request_id>",
"retry_after": 5
}
Polling for the result
Poll GET /v1/convert/{request_id} (the Location). status moves queued → processing → a terminal state: succeeded, partial (some pages converted — you are billed only for those), or failed (nothing converted; the reservation is refunded). On terminal success the result is delivered two ways:
- Inline when small (≤ 256 KiB): a result field (a structured object for json, a string for md / ascii).
- Signed URL when large: a result_url (a short-lived 15-minute signed GET URL you fetch directly from storage, no auth) plus result_url_expires_in. A retried poll returns a fresh URL.
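The status progression described above can be sketched as a poll loop. This assumes POSIX sh with curl and jq installed; the function names and the SKELY_API_KEY environment variable are hypothetical, and the fixed delay stands in for the Retry-After value returned with the 202. Because records served from history report success rather than succeeded, the classifier treats both as the same state.

```shell
# Hypothetical helper: collapse a status string into one of
# pending | success | partial | failed | unknown.
status_class() {
  case "$1" in
    queued|processing) echo "pending" ;;
    succeeded|success) echo "success" ;;   # live poll vs. history record
    partial)           echo "partial" ;;
    failed)            echo "failed" ;;
    *)                 echo "unknown" ;;
  esac
}

# Poll a status path until the job is no longer pending, then print the body.
# $1 = poll path from the Location header, $2 = seconds between polls.
# An unrecognized status also stops the loop so the caller can inspect the body.
poll_until_terminal() {
  path=$1
  delay=${2:-5}
  while :; do
    body=$(curl -s "https://api.skely.io${path}" \
      -H "Authorization: Bearer ${SKELY_API_KEY}") || return 1
    status=$(printf '%s' "$body" | jq -r '.status')
    if [ "$(status_class "$status")" != "pending" ]; then
      printf '%s\n' "$body"
      return 0
    fi
    sleep "$delay"
  done
}
```

The printed body then carries either result or result_url, depending on the size of the output.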
The converted output is retained for 24 hours after completion. After that the result is reaped: GET /v1/convert/{request_id} still returns 200 with a compact status record (request_id, status, pages, format) from your request history (plus the webhook object if a callback_url was set), but with no result / result_url — re-run the conversion to regenerate the output. (In the brief window where the record outlives its stored object, the poll returns result_expired: true; same remedy.) A 404 ERR::REQUEST::NOT_FOUND means an unknown id or one you don't own. Note: after reaping, an async job's status reads success | partial | failed (e.g. succeeded → success).
{
"request_id": "<request_id>",
"status": "succeeded",
"format": "md",
"pages": 1200,
"total_pages": 1200,
"pages_succeeded": 1200,
"result": "6865 West 103rd Avenue\n**Tel:** 303-464-1997\n\n| Date | Charges |\n| --- | --- |\n…"
}
{
"request_id": "<request_id>",
"status": "succeeded",
"format": "json",
"pages": 1200,
"total_pages": 1200,
"pages_succeeded": 1200,
"result_url": "https://<signed-download-url>?…signature…",
"result_url_expires_in": 900
}
Limits & errors
| Situation | Result |
|---|---|
| ≤ 500 pages, not opted in | Synchronous 200 (the default). |
| > 500 pages, not opted in | 413 ERR::CONVERT::MAX_PAGES_EXCEEDED; the message explains how to scope with pages, split client-side, or opt into async. |
| Up to 5000 pages, opted into async | 202 enqueue (any size up to the async ceiling is accepted once you opt into async). |
| > 5000 pages, even opted in | 413 ERR::CONVERT::MAX_PAGES_EXCEEDED; opting in does not lift the async ceiling, so split the document client-side. |
| Inline body > 20 MiB / file > 50 MiB | 413 ERR::INPUT::TOO_LARGE; use /v1/uploads for a body over 20 MiB. 50 MiB is the hard ceiling on every path. |
| Not enough credits | 402 ERR::BILLING::INSUFFICIENT_CREDITS (with needed / available); checked before enqueue, so nothing is reserved or charged. |
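The page-count rows of the table above reduce to a three-way decision a client can make before sending the request. A minimal sketch in POSIX sh; the function name pick_convert_path and its labels are hypothetical, and the two caps are the 500-page synchronous limit and the 5000-page async ceiling stated in this section.

```shell
SYNC_CAP=500     # synchronous cap, pages per request
ASYNC_CAP=5000   # async ceiling, pages per request

# Hypothetical helper: decide how to submit a conversion of N selected pages.
pick_convert_path() {
  pages=$1
  if [ "$pages" -le "$SYNC_CAP" ]; then
    echo "sync"    # plain POST, result arrives in the 200 response
  elif [ "$pages" -le "$ASYNC_CAP" ]; then
    echo "async"   # add async=true (or Prefer: respond-async) and poll
  else
    echo "split"   # over the async ceiling: split the document client-side
  fi
}
```

One way to obtain the page count cheaply is the total_pages field of a mode=info probe scoped with pages=1, as described under Document info.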
Billing
An async job reserves its credits at enqueue. The 402 check happens before that, so you are never enqueued without the credits. When the job finishes it is settled: you are charged for the pages that actually converted and the unused reservation is refunded. A failed job refunds the entire reservation. Usage is recorded once, under the same request_id, and appears on your Requests page.
GET /v1/convert/{id}
Owner-scoped status for a request_id from a prior conversion (also returned in the X-Skely-Request-Id header and the 202 Location). This is the poll endpoint for async jobs and also returns a compact record for a completed synchronous conversion. Poll no faster than the 202's Retry-After.
| Response field | Type | When | Description |
|---|---|---|---|
| request_id | string | always | The request's id. |
| status | string | always | Async (live job): queued \| processing \| succeeded \| partial \| failed. Sync / reaped record: success \| partial \| failed (a reaped partial async job reads partial). |
| format | string | always | The output format. |
| pages | integer | always | Async: billable pages reserved. Sync record: pages charged. |
| total_pages | integer | async | The document's full page count. |
| pages_succeeded | integer \| null | async | Pages actually converted (what you are charged for). null until terminal. |
| result | object \| string | terminal, small | The converted output, inline when ≤ 256 KiB (a structured object for json; a string for md / ascii). |
| result_url | uri | terminal, large | A short-lived (15-minute) signed GET URL for a result over 256 KiB. Fetch directly — no Skely auth. A retry returns a fresh URL. |
| result_url_expires_in | integer | with result_url | Seconds the signed URL stays valid (900). |
| result_expired | boolean | reaped | true in the brief window where the record still exists but its stored output is gone — re-run the conversion. (Once fully reaped past 24h, the response is a compact usage record with no result fields instead.) |
| error | string | failed | A short failure reason on a failed job. |
| webhook | object | callback set, terminal | The latest webhook delivery state (delivery_id, event_type, status, attempts, max_attempts, last_status_code, last_error) when a callback_url was supplied. Present once the job settles — absent while queued / processing. |
curl "https://api.skely.io/v1/convert/<request_id>" \
-H "Authorization: Bearer sk_live_YOUR_API_KEY"
# Completed synchronous conversion (compact record):
# => { "request_id": "<request_id>", "status": "success", "pages": 12, "format": "json" }
# Async job mid-flight:
# => { "request_id": "<request_id>", "status": "processing", "format": "json",
# "pages": 1200, "total_pages": 1200, "pages_succeeded": null }
See Large files & async for the full queued → terminal poll loop with inline and signed-URL result examples. An unknown id (or one you don't own) returns 404 with ERR::REQUEST::NOT_FOUND. For a completed synchronous conversion the record is a compact subset (request_id, status, pages, format); it does not re-echo the cost/engine metadata from the original POST response, so keep that response if you need it.
A live async poll reports succeeded; the webhook and any record served from history (synchronous conversions, and async jobs polled after the 24h window) report success. Treat both as the same success state: { "succeeded", "success" }.
Webhooks
Pass callback_url= on a non-dry-run /v1/convert and Skely POSTs one signed event to that https URL when the conversion settles — conversion.succeeded or conversion.failed. The webhook is always a completion notification: data.result is null and data.result_url points at the request status endpoint (GET /v1/convert/{request_id}).
On a synchronous conversion the result is already in the 200 response, so the webhook is just a signal. On an async conversion (async=true) the webhook lets you skip polling entirely: receive the event when the job settles, then fetch the result (inline or via the signed URL) from the status endpoint.
Opting in
The callback_url must be an https URL and is SSRF-checked (private/internal/metadata addresses are rejected). A webhook fires only when a conversion actually runs and reaches a terminal state. It does not fire for dry runs (cost=true) or for pre-flight rejections — auth failures, rate limits, insufficient credits, bad input, or too-many-pages errors all return synchronously and run no conversion, so no event is sent.
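Because callback_url travels in the query string, it is worth percent-encoding it and catching obvious mistakes before the request is sent. The sketch below is a client-side convenience only: build_convert_url is a hypothetical helper, and its checks (https scheme, no non-public IP literal) are an assumption about what is useful to catch early. The server's SSRF check remains authoritative, since only it sees what a DNS name resolves to.

```python
import ipaddress
from urllib.parse import urlencode, urlsplit

BASE = "https://api.skely.io/v1/convert"


def build_convert_url(callback_url: str, fmt: str = "json") -> str:
    """Pre-check callback_url locally and percent-encode it into the query string."""
    parts = urlsplit(callback_url)
    if parts.scheme != "https" or not parts.hostname:
        raise ValueError("callback_url must be an https URL")
    try:
        ip = ipaddress.ip_address(parts.hostname)
    except ValueError:
        ip = None  # a DNS name; only the server can judge what it resolves to
    if ip is not None and not ip.is_global:
        raise ValueError("callback_url points at a non-public address")
    query = urlencode({"format": fmt, "callback_url": callback_url})
    return f"{BASE}?{query}"
```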
# Opt in per request with callback_url (https only)
curl -X POST "https://api.skely.io/v1/convert?format=json&callback_url=https://you.example.com/hooks/skely" \
-H "Authorization: Bearer sk_live_YOUR_API_KEY" \
-H "Content-Type: application/pdf" \
--data-binary @report.pdf
Delivery
Skely POSTs application/json to your callback_url with these headers:
| Header | Meaning |
|---|---|
| X-Skely-Webhook-Id | Stable delivery id — constant across all retries of the same event. Dedupe on it. |
| X-Skely-Event | The event type: conversion.succeeded or conversion.failed. |
| X-Skely-Signature | HMAC signature, format t=<unix_seconds>,v1=<hex> (see below). |
| X-Skely-Signature-Timestamp | Unix seconds the body was signed — mirrors the t= value inside X-Skely-Signature (verify the replay window against that t=; this header is informational). Each retry is re-signed with a fresh timestamp. |
| User-Agent | Skely-Webhooks/1 |
Event payload
The top-level request_id equals data.request_id and the original X-Skely-Request-Id — the join key. data.status is one of success, partial, or failed: a partial async settlement fires the conversion.succeeded event with data.status: "partial", and you are billed only for the converted pages (credits_charged = pages, with pages < total_pages). data.result is always null — fetch the output from the status endpoint via data.result_url. On failure, data carries status: "failed" and credits_charged: 0. Note the webhook uses success, whereas the GET poll's async status is succeeded — match them accordingly.
{
"id": "<event_id>",
"type": "conversion.succeeded",
"created": 1718924400,
"api_version": "v1",
"request_id": "<request_id>",
"data": {
"request_id": "<request_id>",
"status": "success",
"format": "json",
"pages": 12,
"total_pages": 12,
"credits_charged": 12,
"result_url": "https://api.skely.io/v1/convert/<request_id>",
"result": null
}
}
Verifying the signature
Signatures are HMAC-SHA256 over "{timestamp}.{raw_body}", keyed with your per-account signing secret (whsec_… — see signing secret below). Verify before parsing JSON. Recompute the HMAC, compare to the v1 value in constant time, and reject if the t timestamp is outside a ±5-minute (300 s) window. Because each retry is re-signed with a fresh timestamp, the window applies to the actual send time of that attempt.
import hashlib, hmac, time

def verify(secret: str, headers: dict, raw_body: bytes) -> bool:
    # Parse "t=<unix_seconds>,v1=<hex>" into {"t": ..., "v1": ...}.
    sig = dict(p.split("=", 1) for p in headers.get("X-Skely-Signature", "").split(",") if "=" in p)
    if "t" not in sig or "v1" not in sig:
        return False
    try:
        ts = int(sig["t"])
    except ValueError:  # malformed timestamp
        return False
    if abs(time.time() - ts) > 300:  # +/- 5 min replay window
        return False
    # The signed payload is "{timestamp}.{raw_body}" over the unparsed request bytes.
    signed = f"{ts}.".encode() + raw_body
    expected = hmac.new(secret.encode(), signed, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, sig["v1"])
Retries & idempotency
The first attempt fires immediately. Skely waits 10 s per attempt for a response; any 2xx is success, and anything else (non-2xx, timeout, connection error) is a failed attempt. Failed deliveries retry with backoff after +1m, +5m, +30m, +2h, +6h — 6 attempts over ~8.6 h — after which the delivery is marked failed. Acknowledge fast: return 2xx within 10 s and do any heavy work asynchronously.
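The schedule above can be checked with a little arithmetic. This sketch assumes each backoff gap is measured from the previous attempt and ignores the up-to-10 s wait per attempt; under that reading the sixth attempt starts 516 minutes (about 8.6 h) after the first.

```python
from datetime import timedelta

# Backoff gaps after each failed attempt, as listed above.
GAPS = [timedelta(minutes=1), timedelta(minutes=5), timedelta(minutes=30),
        timedelta(hours=2), timedelta(hours=6)]


def attempt_offsets() -> list:
    """Elapsed time from the first attempt to the start of each attempt."""
    offsets = [timedelta(0)]  # attempt 1 fires immediately
    for gap in GAPS:
        offsets.append(offsets[-1] + gap)
    return offsets


offsets = attempt_offsets()
print(len(offsets), offsets[-1])  # prints: 6 8:36:00
```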
Delivery is at-least-once, so duplicates are possible. Dedupe on X-Skely-Webhook-Id (stable across retries of an event) and the event id, and keep your handler idempotent. Skely ignores your response body — only the status code matters.
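One way to combine fast acknowledgement with deduplication is sketched below. Everything here is illustrative: handle_webhook, the in-memory seen set, and the work queue are assumptions standing in for your framework's handler, a shared store with a TTL, and a background worker. Signature verification is omitted for brevity and should run before the body is parsed.

```python
import json
import queue

seen: set = set()                  # assumption: in-memory; use a shared store in production
work: queue.Queue = queue.Queue()  # assumption: drained by a background worker


def handle_webhook(headers: dict, raw_body: bytes) -> int:
    """Return the HTTP status to send back; heavy work is deferred to `work`."""
    delivery_id = headers.get("X-Skely-Webhook-Id")
    if not delivery_id:
        return 400
    event = json.loads(raw_body)  # in real code, verify the signature first
    # Dedupe on both the delivery id and the event id.
    keys = {f"delivery:{delivery_id}"}
    if event.get("id"):
        keys.add(f"event:{event['id']}")
    if keys & seen:
        return 200  # duplicate: acknowledge and do nothing
    seen.update(keys)
    work.put(event)  # fetch data.result_url later, off the request path
    return 200
```

Returning 200 for a duplicate matters: any non-2xx would count as a failed attempt and trigger another retry.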
Signing secret
Each account has one webhook signing secret (prefix whsec_). View and rotate it on the API Keys page. Every event — including every retry — is signed with the current secret. Rotating invalidates the old secret immediately, so roll the new value into your verifier as part of rotation.
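After a rotation it is useful to confirm that your verifier accepts the new secret without waiting for a real event. The sketch below builds a signature header locally in the format described under Verifying the signature. The function name sign_for_test is hypothetical, and the output is only as faithful as that description.

```python
import hashlib
import hmac
import time
from typing import Optional


def sign_for_test(secret: str, raw_body: bytes, ts: Optional[int] = None) -> str:
    """Build an X-Skely-Signature value: t=<unix_seconds>,v1=<hex HMAC-SHA256>."""
    ts = int(time.time()) if ts is None else ts
    signed = f"{ts}.".encode() + raw_body  # "{timestamp}.{raw_body}"
    digest = hmac.new(secret.encode(), signed, hashlib.sha256).hexdigest()
    return f"t={ts},v1={digest}"
```

Passing the returned value as the X-Skely-Signature header, together with the same raw_body and secret, should make the verify function from the signature section return True.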
Errors
Errors are namespaced and machine-readable. Error codes follow the format ERR::SCOPE::REASON, for example ERR::INPUT::BLOCKED_URL. The one exception is the generic fallback ERR::INTERNAL, which has only two segments. The code appears both in the response body and in the X-Skely-Error-Code response header.
Error body
{
"request_id": "<request_id>",
"code": "ERR::INPUT::BLOCKED_URL",
"message": "URL resolves to a private/internal address",
"is_retryable": false,
"docs_url": "https://skely.io/docs#err-input-blocked_url"
}
| Field | Description |
|---|---|
| request_id | The id of the failed request — quote it in support tickets. |
| code | The namespaced ERR::SCOPE::REASON code (also in X-Skely-Error-Code). |
| message | A human-readable explanation. |
| is_retryable | Whether retrying the same request could succeed. |
| docs_url | A link to the relevant documentation for this error. |
Some statuses extend the body with extra fields: a 402 (ERR::BILLING::INSUFFICIENT_CREDITS) adds needed and available; a 413 (ERR::CONVERT::MAX_PAGES_EXCEEDED) adds pages, total_pages, and max_pages; a 429 (ERR::RATE::LIMITED) adds retry_after (mirrors the Retry-After header); a 415 (ERR::INPUT::UNSUPPORTED_FORMAT) adds candidates and supported; a 415 (ERR::INPUT::FORMAT_MISMATCH) adds expected and detected; a 422 (ERR::INPUT::BAD_VERSION) adds available.
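A client can keep the five core fields typed and carry the status-specific extras in a separate mapping. The sketch below shows one way to do that. ApiError and parse_error are hypothetical names, and the fallback defaults for missing fields are an assumption.

```python
from dataclasses import dataclass, field

# The five fields documented for every error body.
CORE = {"request_id", "code", "message", "is_retryable", "docs_url"}


@dataclass
class ApiError:
    code: str
    message: str
    is_retryable: bool
    request_id: str = ""
    docs_url: str = ""
    extra: dict = field(default_factory=dict)  # e.g. needed/available, retry_after


def parse_error(body: dict) -> ApiError:
    """Split an error body into its core fields and any status-specific extras."""
    return ApiError(
        code=body.get("code", ""),
        message=body.get("message", ""),
        is_retryable=bool(body.get("is_retryable", False)),
        request_id=body.get("request_id", ""),
        docs_url=body.get("docs_url", ""),
        extra={k: v for k, v in body.items() if k not in CORE},
    )
```

For a 402 body, extra would then hold needed and available; for a 429, retry_after.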
Error code catalog
Every code the API can return, with its HTTP status, whether it is safe to retry, and what to do.
| Code | HTTP | Retryable | Meaning / what to do |
|---|---|---|---|
| ERR::AUTH::MISSING_TOKEN | 401 | no | No Authorization: Bearer key was sent. |
| ERR::AUTH::INVALID_KEY | 401 | no | The API key is malformed or not recognized. |
| ERR::AUTH::REVOKED_KEY | 401 | no | The API key was revoked — create a new one. |
| ERR::AUTH::INVALID_TOKEN | 401 | no | The session token is invalid or expired. |
| ERR::AUTH::EMAIL_UNVERIFIED | 403 | no | Verify your account email before calling the API. |
| ERR::AUTH::APP_CHECK | 401 | no | Browser verification failed (web callers only); API-key callers never hit this. |
| ERR::AUTH::FORBIDDEN_ORIGIN | 403 | no | Request origin is not allowed — call the documented base URL; contact support if it persists. |
| ERR::AUTH::KEY_IN_URL | 400 | no | A credential was put in the URL — send it only in the Authorization header. |
| ERR::RATE::LIMITED | 429 | yes | Per-key rate limit hit — back off and retry after Retry-After. |
| ERR::SERVICE::UNAVAILABLE | 503 | yes | The service is temporarily unavailable — retry with exponential backoff. |
| ERR::BILLING::INSUFFICIENT_CREDITS | 402 | no | Not enough credits (see needed / available) — top up or upgrade. |
| ERR::INPUT::NO_INPUT | 422 | no | No document source — send body bytes, url, or gcs_ref. |
| ERR::INPUT::EMPTY_PDF | 422 | no | The supplied document was empty. |
| ERR::INPUT::BAD_FORMAT | 422 | no | format must be one of json, md, or ascii. |
| ERR::INPUT::BAD_MODE | 422 | no | mode must be convert or info. |
| ERR::INPUT::AMBIGUOUS_INPUT | 422 | no | More than one document source supplied — send exactly one. |
| ERR::INPUT::UNSUPPORTED_FORMAT | 415 | no | The bytes match no supported format. |
| ERR::INPUT::FORMAT_MISMATCH | 415 | no | The engine pin's format isn't the format detected from the bytes. |
| ERR::INPUT::BAD_CALLBACK_URL | 422 | no | callback_url must be a valid https URL (SSRF rules apply). |
| ERR::INPUT::BAD_ENGINE | 422 | no | The engine pin is unparseable (e.g. a bare version, unknown format). |
| ERR::INPUT::BAD_VERSION | 422 | no | Unknown engine version for the detected format. |
| ERR::INPUT::BAD_URL | 422 | no | The url is malformed or could not be resolved. |
| ERR::INPUT::BLOCKED_URL | 403 | no | The url resolves to a private/internal/metadata address (SSRF guard). |
| ERR::INPUT::FORBIDDEN_REF | 403 | no | The gcs_ref does not reference your own upload. |
| ERR::INPUT::BAD_REF | 422 | no | The gcs_ref is malformed. |
| ERR::INPUT::BAD_PAGE_RANGE | 422 | no | The pages selection is malformed or out of range. |
| ERR::INPUT::FETCH_FAILED | 422 | yes | Could not fetch the url — transient; retry. |
| ERR::INPUT::GCS_FETCH_FAILED | 422 | yes | Could not read the referenced upload — transient; retry. |
| ERR::INPUT::TOO_MANY_REDIRECTS | 422 | no | The url redirected too many times. |
| ERR::INPUT::TOO_LARGE | 413 | no | Over the size cap — use /v1/uploads; 50 MiB is the hard ceiling. |
| ERR::CONVERT::MAX_PAGES_EXCEEDED | 413 | no | Over the page cap — scope with pages, split, or use the async path. |
| ERR::REQUEST::NOT_FOUND | 404 | no | No such request, or not yours. (A reaped result is a 200 status record, not a 404.) |
| ERR::INTERNAL | 500 | no | Unexpected server error — retry; contact support if it persists. |
A 429 (ERR::RATE::LIMITED) tells you how long to wait via Retry-After; a 503 (ERR::SERVICE::UNAVAILABLE) carries no Retry-After, so retry after a few seconds with exponential backoff. Failed conversions are never charged.
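The two retry rules can be folded into one delay function. This is a sketch under stated assumptions: retry_delay is a hypothetical helper, Retry-After is assumed to be a whole number of seconds, and the 2 s base, 60 s cap, and jitter are illustrative choices rather than documented values. It covers only the 429 and 503 cases; for other codes, rely on is_retryable in the error body.

```python
import random
from typing import Optional


def retry_delay(status: int, attempt: int, retry_after: Optional[str] = None,
                base: float = 2.0, cap: float = 60.0) -> Optional[float]:
    """Seconds to wait before retry number `attempt` (1-based), or None to give up."""
    if status == 429:
        if retry_after is not None and retry_after.isdigit():
            return float(retry_after)  # honor the server's hint
        # No usable header: fall through to backoff.
    elif status != 503:
        return None
    ceiling = min(cap, base * 2 ** (attempt - 1))
    return random.uniform(ceiling / 2, ceiling)  # jittered exponential backoff
```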