Building a Custom Renderer

This guide explains how to build a renderer that converts SciFlow's ProseMirror document JSON into any target format — HTML, Markdown, LaTeX, a custom XML vocabulary, or anything else. It uses the JATS XML renderer (jats-body-generator.ts) as a reference implementation, and walks through the key decisions and patterns you will encounter.

Before you start, familiarize yourself with the Schema Reference. Every node type, attribute, and mark described there is what your renderer needs to handle.

Architecture overview

A renderer walks the document tree (the JSON produced by ProseMirror's Node.toJSON()) and emits output for each node it encounters. The tree has three conceptual layers:

graph TD
    doc["doc"] --> structural["Structural nodes<br/>(part, header)"]
    doc --> block["Block nodes<br/>(paragraph, figure, table, …)"]
    structural --> block
    block --> inline["Inline nodes + marks<br/>(text, citation, footnote, …)"]

    classDef s fill:#e8f5e9,stroke:#388e3c
    classDef b fill:#e3f2fd,stroke:#1565c0
    classDef i fill:#fff3e0,stroke:#e65100

    class doc,structural s
    class block b
    class inline i

Your renderer is built around three functions that together cover these layers:

| Function | Responsibility |
| --- | --- |
| `renderBlockNode` | Dispatch a single block or structural node to its handler |
| `renderInline` | Walk an array of inline nodes, handling marks |
| `renderBlock` | Walk an array of block nodes (convenience wrapper) |

A shared context object threads state through the recursion — for example, collecting footnotes encountered inline so they can be emitted at the end of the document.

Step 1 — Define your input and output types

The input is always the same: ProseMirror document JSON.

/** Minimal ProseMirror JSON node. */
interface PMNode {
  type: string;
  attrs?: Record<string, unknown>;
  content?: PMNode[];
  text?: string;
  marks?: PMMark[];
}

interface PMMark {
  type: string;
  attrs?: Record<string, unknown>;
}

Your output type depends on your target. For string-based formats (XML, HTML, Markdown, LaTeX) the renderer returns string. For DOM-based formats you would return Element or DocumentFragment.

Define a context type for any cross-cutting state:

interface Ctx {
  footnotes: PMNode[];   // collected for end-of-document rendering
  // add whatever your format needs: link targets, image manifests, …
}

Step 2 — Register block node renderers

Create a map from ProseMirror node type names to render functions. Every node type in the schema should have an entry (or a sensible fallback).

BLOCK_RENDERERS = {
  "doc"        → renderBlock(node.content, ctx)   (Step 5 refines this into renderDoc)
  "paragraph"  → wrap node.content with <p>
  "heading"    → wrap node.content with <h{level}>
  "blockquote" → wrap node.content with <blockquote>
  "code_block" → extract plain text, escape, wrap with <pre><code>
  "bullet_list"  → wrap with <ul>
  "ordered_list" → wrap with <ol>
  "list_item"    → wrap with <li>
  "figure"     → delegate to renderFigure
  "table"      → delegate to renderTable
  …
}

function renderBlockNode(node, ctx):
  if BLOCK_RENDERERS has node.type:
    return BLOCK_RENDERERS[node.type](node, ctx)
  else:
    return renderBlock(node.content or [], ctx)   # fallback: render children (leaf nodes have none)

function renderBlock(nodes, ctx):
  return concatenate renderBlockNode(n, ctx) for each n in nodes

Unknown node types

Always include a fallback that recurses into node.content. The schema may gain new node types in future versions, and a graceful fallback keeps your renderer forward-compatible.
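
The dispatch map and fallback above could be sketched in TypeScript roughly as follows. This is a minimal sketch that emits HTML-like tags; the helper names `wrapInline`, `wrapBlocks`, and `escapeText` are assumptions made for this example, and `renderInline` is a text-only stand-in for the mark-aware version built in Step 4. Handlers for `code_block`, `figure`, and `table` are left out for brevity.

```typescript
interface PMNode {
  type: string;
  attrs?: Record<string, unknown>;
  content?: PMNode[];
  text?: string;
}
interface Ctx {
  footnotes: PMNode[];
}
type BlockRenderer = (node: PMNode, ctx: Ctx) => string;

/** Escape characters that are special in XML/HTML text content. */
const escapeText = (s: string): string =>
  s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");

/** Text-only stand-in for the mark-aware renderInline of Step 4. */
function renderInline(nodes: PMNode[] | undefined, _ctx: Ctx): string {
  return (nodes ?? []).map((n) => escapeText(n.text ?? "")).join("");
}

// Hypothetical helpers that build handlers for the two common wrapping cases.
const wrapInline = (tag: string): BlockRenderer => (node, ctx) =>
  `<${tag}>${renderInline(node.content, ctx)}</${tag}>`;
const wrapBlocks = (tag: string): BlockRenderer => (node, ctx) =>
  `<${tag}>${renderBlock(node.content, ctx)}</${tag}>`;

// A Map avoids accidental hits on Object.prototype keys such as "constructor".
const BLOCK_RENDERERS = new Map<string, BlockRenderer>([
  ["doc", (node, ctx) => renderBlock(node.content, ctx)],
  ["paragraph", wrapInline("p")],
  ["heading", (node, ctx) => wrapInline(`h${Number(node.attrs?.level ?? 1)}`)(node, ctx)],
  ["blockquote", wrapBlocks("blockquote")],
  ["bullet_list", wrapBlocks("ul")],
  ["ordered_list", wrapBlocks("ol")],
  ["list_item", wrapBlocks("li")],
]);

function renderBlockNode(node: PMNode, ctx: Ctx): string {
  const handler = BLOCK_RENDERERS.get(node.type);
  // Unknown type: recurse into children so known descendants still render.
  return handler ? handler(node, ctx) : renderBlock(node.content, ctx);
}

function renderBlock(nodes: PMNode[] | undefined, ctx: Ctx): string {
  return (nodes ?? []).map((n) => renderBlockNode(n, ctx)).join("");
}
```

With this sketch, a paragraph nested inside a node type the map does not know is still rendered, because the fallback descends into the unknown node's children.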

Step 3 — Register inline node renderers

Inline nodes appear inside block nodes (typically inside paragraph, heading, and caption). The special one is text: it carries the actual characters and a marks array. ProseMirror allows marks on any inline node, but the algorithm in this guide only applies marks to text nodes.

INLINE_RENDERERS = {
  "hard_break" → emit line break
  "citation"   → emit citation reference (decode source attr for IDs)
  "footnote"   → push node onto ctx.footnotes, emit reference marker
  "math"       → emit formula (inline or display depending on style attr)
  "image"      → emit image tag with src, alt
  "link"       → emit hyperlink wrapping renderInline(node.content)
}

function renderInlineNode(node, ctx):
  if INLINE_RENDERERS has node.type:
    return INLINE_RENDERERS[node.type](node, ctx)
  else:
    return renderInline(node.content or [], ctx)    # fallback (leaf nodes have no content)

Notice that text is not in this map. Text nodes are handled directly by renderInline (next step) because their marks need special treatment.
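
As an illustration, the inline dispatch could look like the sketch below. It assumes the `source` attribute of `citation` is a URI-encoded JSON array of `{id}` objects, as listed in the checklist. The output tags (`<cite ref=…>`, `<sup>`), the helper `decodeCitationIds`, and the text-only `renderInline` stand-in are assumptions for this example, not the JATS renderer's actual output.

```typescript
interface PMNode {
  type: string;
  attrs?: Record<string, unknown>;
  content?: PMNode[];
  text?: string;
}
interface Ctx {
  footnotes: PMNode[];
}
type InlineRenderer = (node: PMNode, ctx: Ctx) => string;

const escapeXml = (s: string): string =>
  s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;").replace(/"/g, "&quot;");

/** Decode a citation `source` attr: URI-encoded JSON array of { id }. */
function decodeCitationIds(source: unknown): string[] {
  if (typeof source !== "string") return [];
  try {
    const parsed: unknown = JSON.parse(decodeURIComponent(source));
    if (!Array.isArray(parsed)) return [];
    return parsed
      .map((entry) => (entry as { id?: unknown } | null)?.id)
      .filter((id): id is string => typeof id === "string");
  } catch {
    return []; // malformed attribute: emit no references rather than throw
  }
}

/** Text-only stand-in for the mark-aware renderInline of Step 4. */
function renderInline(nodes: PMNode[] | undefined, ctx: Ctx): string {
  return (nodes ?? [])
    .map((n) => (n.type === "text" ? escapeXml(n.text ?? "") : renderInlineNode(n, ctx)))
    .join("");
}

const INLINE_RENDERERS = new Map<string, InlineRenderer>([
  ["hard_break", () => "<br/>"],
  ["citation", (node) =>
    decodeCitationIds(node.attrs?.source)
      .map((id) => `<cite ref="${escapeXml(id)}"/>`)
      .join("")],
  ["footnote", (node, ctx) => {
    ctx.footnotes.push(node); // content is rendered later (Step 6)
    return `<sup>${ctx.footnotes.length}</sup>`;
  }],
  ["link", (node, ctx) =>
    `<a href="${escapeXml(String(node.attrs?.href ?? ""))}">${renderInline(node.content, ctx)}</a>`],
]);

function renderInlineNode(node: PMNode, ctx: Ctx): string {
  const handler = INLINE_RENDERERS.get(node.type);
  return handler ? handler(node, ctx) : renderInline(node.content, ctx);
}
```

Returning an empty list for a malformed `source` is one possible policy; a stricter renderer might prefer to log or fail instead.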

Step 4 — Handle marks (inline formatting)

Marks represent inline formatting: bold, italic, links, index entries, and so on. Each ProseMirror text node carries an ordered array of marks. When adjacent text nodes share marks, a naive per-node approach produces redundant tags:

"Te"[strong] + "s"[em, strong] + "t"[strong]

Naive:    <b>Te</b><i><b>s</b></i><b>t</b>     ← 3 × <b>
Merged:   <b>Te<i>s</i>t</b>                   ← 1 × <b>

The mark-merging algorithm

The renderer tracks which marks are currently "open" and only opens/closes tags when the set changes. This is the same approach ProseMirror's own DOMSerializer uses, with one enhancement described below.

function renderInline(nodes, ctx):
  result = ""
  activeMarks = []

  for each node in nodes:
    if node is not a text node:
      close all activeMarks (innermost first)
      activeMarks = []
      result += renderInlineNode(node, ctx)
      continue

    nodeMarks = reorderMarksForMerging(node.marks, activeMarks)

    # Find longest common prefix with active marks
    common = 0
    while common < |activeMarks| AND common < |nodeMarks|
          AND activeMarks[common] equals nodeMarks[common]:
      common += 1

    # Close marks no longer active (innermost first)
    for i from |activeMarks| - 1 down to common:
      result += closeTag(activeMarks[i])

    # Open newly needed marks
    for i from common to |nodeMarks| - 1:
      result += openTag(nodeMarks[i])

    activeMarks = nodeMarks
    result += escape(node.text)

  # Close any remaining marks
  for i from |activeMarks| - 1 down to 0:
    result += closeTag(activeMarks[i])

  return result
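
A TypeScript rendering of the pseudocode above might look like this. It is a sketch under a few assumptions: the tag table covers only four marks, `renderInlineNode` is a stub standing in for the Step 3 dispatcher, mark equality is approximated by comparing `JSON.stringify` of the attrs (which is sensitive to key order), and the reordering step is passed in as an optional `reorder` parameter that defaults to keeping the stored order.

```typescript
interface PMMark { type: string; attrs?: Record<string, unknown> }
interface PMNode { type: string; text?: string; marks?: PMMark[]; content?: PMNode[] }
interface Ctx { footnotes: PMNode[] }
type Reorder = (marks: PMMark[], active: PMMark[]) => PMMark[];

// Hypothetical tag table; a real renderer covers every mark in the schema.
const TAG_NAMES = new Map<string, string>([
  ["strong", "b"], ["em", "i"], ["sup", "sup"], ["sub", "sub"],
]);
const openTag = (m: PMMark): string => {
  const tag = TAG_NAMES.get(m.type);
  return tag ? `<${tag}>` : "";
};
const closeTag = (m: PMMark): string => {
  const tag = TAG_NAMES.get(m.type);
  return tag ? `</${tag}>` : "";
};
const escapeText = (s: string): string =>
  s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");

/** Marks are equal when type and attrs match (attrs compared via JSON). */
const sameMark = (a: PMMark, b: PMMark): boolean =>
  a.type === b.type && JSON.stringify(a.attrs ?? {}) === JSON.stringify(b.attrs ?? {});

/** Stub for the Step 3 dispatcher: only knows hard_break. */
function renderInlineNode(node: PMNode, _ctx: Ctx): string {
  return node.type === "hard_break" ? "<br/>" : "";
}

function renderInline(nodes: PMNode[], ctx: Ctx, reorder: Reorder = (marks) => marks): string {
  let result = "";
  let active: PMMark[] = [];
  // Close open marks from the innermost down to (not including) index `keep`.
  const closeDownTo = (keep: number): void => {
    for (let i = active.length - 1; i >= keep; i--) result += closeTag(active[i]);
  };

  for (const node of nodes) {
    if (node.type !== "text") {
      closeDownTo(0);
      active = [];
      result += renderInlineNode(node, ctx);
      continue;
    }
    const marks = reorder(node.marks ?? [], active);
    // Longest common prefix between the open marks and this node's marks.
    let common = 0;
    while (common < active.length && common < marks.length && sameMark(active[common], marks[common])) {
      common++;
    }
    closeDownTo(common);
    for (let i = common; i < marks.length; i++) result += openTag(marks[i]);
    active = marks;
    result += escapeText(node.text ?? "");
  }
  closeDownTo(0);
  return result;
}
```

Called without a `reorder` function on the "Te" / "s" / "t" example, this sketch yields the three-`<b>` output shown earlier; supplying a function that moves shared marks to the front yields the merged form.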

Mark reordering

ProseMirror stores marks in a fixed order defined by the schema (the rank — determined by the order marks are declared in the schema spec). Two text nodes with different mark sets can have a shared mark at different array positions — for example [strong] vs [em, strong]. A prefix-only comparison would miss the shared strong.

ProseMirror's own DOMSerializer does not reorder marks — it respects the schema rank strictly and accepts the redundant open/close tags. This is fine for HTML because browsers render adjacent identical elements identically.

However, for most output formats (JATS XML, LaTeX, Markdown, etc.) mark nesting is commutative: <bold><italic>x</italic></bold> and <italic><bold>x</bold></italic> are semantically identical. In these cases, we reorder each node's marks so that marks shared with the previous node come first. This maximizes the common prefix, producing more compact output.

function reorderMarksForMerging(marks, activeMarks):
  if activeMarks is empty OR marks is empty:
    return marks

  shared = []
  added  = []

  for each mark in marks:
    if mark is found in activeMarks (by type + attrs equality):
      shared += mark
    else:
      added += mark

  sort shared to match their order in activeMarks
  return shared + added
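
The reordering step could be written in TypeScript as in the sketch below. The `sameMark` helper is an assumption of this example: it compares attrs through `JSON.stringify`, which is adequate for small attribute objects with a stable key order but is not a general deep-equality check.

```typescript
interface PMMark { type: string; attrs?: Record<string, unknown> }

/** Marks are equal when type and attrs match (attrs compared via JSON). */
const sameMark = (a: PMMark, b: PMMark): boolean =>
  a.type === b.type && JSON.stringify(a.attrs ?? {}) === JSON.stringify(b.attrs ?? {});

/**
 * Move marks that are already open to the front, in the order they were
 * opened, so the common prefix with `active` is as long as possible.
 */
function reorderMarksForMerging(marks: PMMark[], active: PMMark[]): PMMark[] {
  if (active.length === 0 || marks.length === 0) return marks;

  const indexInActive = (m: PMMark): number => active.findIndex((a) => sameMark(a, m));
  const shared = marks.filter((m) => indexInActive(m) !== -1);
  const added = marks.filter((m) => indexInActive(m) === -1);

  // Shared marks must follow the nesting order that is already open.
  shared.sort((a, b) => indexInActive(a) - indexInActive(b));
  return [...shared, ...added];
}
```

For example, with `[strong]` open and a node carrying `[em, strong]`, the function returns `[strong, em]`, so only `em` has to be opened.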

When nesting order matters

If your target format assigns different semantics to nesting order (rare — but possible with custom marks), skip the reordering and use the ProseMirror schema rank order directly, as ProseMirror's own DOMSerializer does.

Mark open/close helpers

Each mark needs an opening and closing tag (or equivalent in your format):

function openTag(mark):
  match mark.type:
    "strong"  → "<b>"
    "em"      → "<i>"
    "sup"     → "<sup>"
    "sub"     → "<sub>"
    "anchor"  → "<a href=…>"     (read href, title from mark.attrs)
    "indexEntry" → "<index-term>" (render term entries from mark.attrs)
    "bdi", "tags" → ""           (passthrough, no output)

function closeTag(mark):
  match mark.type:
    "strong"  → "</b>"
    "em"      → "</i>"
    "sup"     → "</sup>"
    "sub"     → "</sub>"
    "anchor"  → "</a>"
    "indexEntry" → "</index-term>"
    "bdi", "tags" → ""

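One way to keep the two helpers in sync is to store the opening and closing form of each mark in a single table entry, as in the sketch below. The names `MARK_TAGS`, `simple`, and `escapeAttr` are assumptions of this example, and `indexEntry` is omitted because its output depends heavily on the target format. Marks without an entry (including the passthrough marks `bdi` and `tags`) produce no output.

```typescript
interface PMMark { type: string; attrs?: Record<string, unknown> }

const escapeAttr = (s: string): string =>
  s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/"/g, "&quot;");

interface MarkTags {
  open: (mark: PMMark) => string;
  close: string;
}

/** Entry for a mark that maps to a plain element without attributes. */
const simple = (tag: string): MarkTags => ({ open: () => `<${tag}>`, close: `</${tag}>` });

const MARK_TAGS = new Map<string, MarkTags>([
  ["strong", simple("b")],
  ["em", simple("i")],
  ["sup", simple("sup")],
  ["sub", simple("sub")],
  ["anchor", {
    open: (mark) => {
      const href = escapeAttr(String(mark.attrs?.href ?? ""));
      const title = mark.attrs?.title;
      const titleAttr =
        typeof title === "string" && title !== "" ? ` title="${escapeAttr(title)}"` : "";
      return `<a href="${href}"${titleAttr}>`;
    },
    close: "</a>",
  }],
]);

// Marks with no entry fall through to the empty string.
const openTag = (mark: PMMark): string => MARK_TAGS.get(mark.type)?.open(mark) ?? "";
const closeTag = (mark: PMMark): string => MARK_TAGS.get(mark.type)?.close ?? "";
```

Because both halves live in one entry, adding a mark cannot leave an opening tag without its matching close.
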
Step 5 — Handle document structure (sectioning)

SciFlow documents can be structured in two ways, and your renderer should handle both:

Path A — `part` nodes provide explicit structure

When the document contains part nodes, they define sections directly. The first heading inside a part is its title; deeper headings create sub-sections.

doc
├── part (id: "intro")
│   ├── heading (level 1) "Introduction"    ← becomes the section title
│   ├── paragraph …
│   ├── heading (level 2) "Background"      ← nested sub-section
│   └── paragraph …
└── part (id: "methods")
    ├── heading (level 1) "Methods"
    └── paragraph …

If a part contains multiple headings at the same level as its title, they should be split into sibling sections rather than nested inside the first one:

function renderPart(node, ctx):
  children = node.content
  titleNode = first heading in children
  titleLevel = titleNode.attrs.level

  # Split remaining content at headings with the same level as the title
  segments = split children-after-title at each heading where level == titleLevel

  # First segment belongs to the original part (with its id, type, etc.)
  emit section(titleNode, firstSegment, partAttrs)

  # Each subsequent same-level heading becomes a sibling section
  for each (heading, content) in remaining segments:
    emit section(heading, content)
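
The splitting step could be sketched as below. `splitPart` and the `Segment` shape are hypothetical names for this example. Two behaviours are assumptions that go beyond the pseudocode: a part without any heading becomes a single untitled segment, and content that precedes the title heading is kept in the first segment.

```typescript
interface PMNode {
  type: string;
  attrs?: Record<string, unknown>;
  content?: PMNode[];
  text?: string;
}

interface Segment {
  heading: PMNode | null;   // section title, or null if the part has no heading
  body: PMNode[];           // everything that belongs to this section
  isOriginalPart: boolean;  // true for the segment that keeps the part's id/type
}

const headingLevel = (n: PMNode): number | null =>
  n.type === "heading" ? Number(n.attrs?.level ?? 1) : null;

function splitPart(part: PMNode): Segment[] {
  const children = part.content ?? [];
  const titleIndex = children.findIndex((n) => n.type === "heading");
  if (titleIndex === -1) {
    return [{ heading: null, body: children, isOriginalPart: true }];
  }

  const titleLevel = headingLevel(children[titleIndex]);
  const segments: Segment[] = [
    { heading: children[titleIndex], body: children.slice(0, titleIndex), isOriginalPart: true },
  ];

  for (const node of children.slice(titleIndex + 1)) {
    if (headingLevel(node) === titleLevel) {
      // Same level as the title: start a sibling section.
      segments.push({ heading: node, body: [], isOriginalPart: false });
    } else {
      // Deeper headings and ordinary blocks stay inside the current section.
      segments[segments.length - 1].body.push(node);
    }
  }
  return segments;
}
```

Each returned segment can then be emitted as one section, with the part's own attributes applied only where `isOriginalPart` is true.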

Path B — Flat headings, no parts

When the document is flat (no part nodes), headings define sections implicitly. Your renderer builds a section tree from the heading levels:

function buildSectionTree(nodes):
  preamble = []          # content before the first heading
  rootSections = []
  stack = []             # tracks nesting: stack[0] is outermost open section

  for each node in nodes:
    if node is a heading:
      level = node.attrs.level

      # Pop sections at the same level or deeper — they are now closed.
      while stack is not empty AND stack.top.level >= level:
        pop stack

      section = { heading: node, level, body: [], children: [] }

      if stack is empty:
        append section to rootSections
      else:
        append section to stack.top.children

      push section onto stack

    else if stack is empty:
      append node to preamble      # before any heading

    else:
      append node to stack.top.body # belongs to current section

  return { preamble, rootSections }
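
In TypeScript the same stack-based construction might look like the following sketch. The `Section` interface mirrors the record used in the pseudocode; defaulting a missing `level` attribute to 1 is an assumption of this example.

```typescript
interface PMNode {
  type: string;
  attrs?: Record<string, unknown>;
  content?: PMNode[];
  text?: string;
}

interface Section {
  heading: PMNode;
  level: number;
  body: PMNode[];       // non-heading blocks directly inside this section
  children: Section[];  // nested sub-sections
}

function buildSectionTree(nodes: PMNode[]): { preamble: PMNode[]; rootSections: Section[] } {
  const preamble: PMNode[] = [];
  const rootSections: Section[] = [];
  const stack: Section[] = []; // stack[0] is the outermost open section

  for (const node of nodes) {
    if (node.type === "heading") {
      const level = Number(node.attrs?.level ?? 1);
      // Sections at the same level or deeper are closed by this heading.
      while (stack.length > 0 && stack[stack.length - 1].level >= level) stack.pop();

      const section: Section = { heading: node, level, body: [], children: [] };
      if (stack.length === 0) rootSections.push(section);
      else stack[stack.length - 1].children.push(section);
      stack.push(section);
    } else if (stack.length === 0) {
      preamble.push(node); // content before the first heading
    } else {
      stack[stack.length - 1].body.push(node); // belongs to the innermost open section
    }
  }
  return { preamble, rootSections };
}
```

Because a skipped level (for example a level 3 heading directly under level 1) simply nests under whatever section is open, the sketch does not need headings to be strictly sequential.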

Your doc renderer checks which path applies:

function renderDoc(node, ctx):
  children = node.content
  hasParts    = any child has type "part"
  hasHeadings = any child has type "heading"

  if hasParts or hasHeadings:
    return renderMixedDocChildren(children, ctx)
  else:
    return renderBlock(children, ctx)

renderMixedDocChildren handles documents that mix part nodes with flat headings — it renders parts directly and auto-sectionizes any consecutive non-part nodes that contain headings.
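
The grouping that renderMixedDocChildren performs is not spelled out above; one plausible first step is to partition the doc's children into runs, as sketched here. `groupDocChildren` and the `Run` type are hypothetical names for this example and are not taken from the JATS renderer.

```typescript
interface PMNode {
  type: string;
  attrs?: Record<string, unknown>;
  content?: PMNode[];
  text?: string;
}

type Run =
  | { kind: "part"; node: PMNode }
  | { kind: "flat"; nodes: PMNode[]; hasHeadings: boolean };

/** Partition doc children into part nodes and runs of consecutive non-part nodes. */
function groupDocChildren(children: PMNode[]): Run[] {
  const runs: Run[] = [];
  for (const node of children) {
    if (node.type === "part") {
      runs.push({ kind: "part", node });
      continue;
    }
    const last = runs[runs.length - 1];
    if (last && last.kind === "flat") {
      // Extend the current run of non-part nodes.
      last.nodes.push(node);
      last.hasHeadings = last.hasHeadings || node.type === "heading";
    } else {
      runs.push({ kind: "flat", nodes: [node], hasHeadings: node.type === "heading" });
    }
  }
  return runs;
}
```

A renderer built on this would hand each `part` run to renderPart, pass flat runs with `hasHeadings` through buildSectionTree, and send the remaining flat runs straight to renderBlock.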

Step 6 — Collect deferred content

Some inline nodes produce output in a different location than where they appear. The most common example is footnotes: they appear inline as a reference marker, but their content is rendered in a group at the end of the document (or section).

The pattern:

1. The inline renderer for footnote pushes the node onto ctx.footnotes and returns a reference marker.
2. After the main tree walk, a separate pass renders the collected footnotes.

# During inline rendering:
# "footnote" handler:
#   push node onto ctx.footnotes
#   return reference marker (e.g. superscript number linking to footnote id)

# After rendering all block content:
function renderFootnotes(ctx):
  if ctx.footnotes is empty: return ""

  for each (footnote, index) in ctx.footnotes:
    emit footnote block with id, index+1, renderInline(footnote.content, ctx)

You can use the same pattern for endnotes, glossary terms, or any content that needs to be relocated.
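
A compact sketch of the collect-then-render pattern follows. The positional ids (`fn1`, `fn2`, …), the `<ol class="footnotes">` wrapper, and the text-only `renderInline` stand-in are assumptions for illustration; a real renderer would more likely reuse the footnote's own `id` attribute and the mark-aware walker from Step 4.

```typescript
interface PMNode {
  type: string;
  attrs?: Record<string, unknown>;
  content?: PMNode[];
  text?: string;
}
interface Ctx {
  footnotes: PMNode[];
}

const escapeText = (s: string): string =>
  s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");

/** Inline handler: remember the footnote, emit only a numbered marker. */
function renderFootnoteRef(node: PMNode, ctx: Ctx): string {
  ctx.footnotes.push(node);
  const n = ctx.footnotes.length;
  return `<sup><a href="#fn${n}">${n}</a></sup>`;
}

/** Text-only stand-in for the mark-aware renderInline of Step 4. */
function renderInline(nodes: PMNode[] | undefined, ctx: Ctx): string {
  return (nodes ?? [])
    .map((n) => (n.type === "footnote" ? renderFootnoteRef(n, ctx) : escapeText(n.text ?? "")))
    .join("");
}

/** Second pass: render everything collected during the main walk. */
function renderFootnotes(ctx: Ctx): string {
  if (ctx.footnotes.length === 0) return "";
  let items = "";
  // Index loop: footnotes nested inside footnotes are appended to the
  // list while it is being rendered, and are still visited.
  for (let i = 0; i < ctx.footnotes.length; i++) {
    items += `<li id="fn${i + 1}">${renderInline(ctx.footnotes[i].content, ctx)}</li>`;
  }
  return `<ol class="footnotes">${items}</ol>`;
}
```

The numbering stays consistent because the marker number and the list position both derive from the order in which footnotes were pushed onto `ctx.footnotes`.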

Step 7 — Handle figures and tables

Figures and tables have richer structure than simple block nodes.

Figures

A figure node wraps an image or a native table, plus an optional caption. The type attribute distinguishes them:

type: "figure" — image figure with src and alt attributes
type: "native-table" — table figure containing a table node instead of an image

function renderFigure(node, ctx):
  if node.attrs.type == "native-table":
    emit table-wrapper with id
    render node.content (contains table + caption)
  else:
    emit figure with id
    emit image with src, alt
    render node.content (contains caption)
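
The branch on the `type` attribute could be sketched as follows. The output elements (`<div class="table-wrapper">`, `<figure>`, `<img>`) are placeholders for whatever the target format uses, and `renderChildren` is a hypothetical callback standing in for renderBlock so that the example stays self-contained.

```typescript
interface PMNode {
  type: string;
  attrs?: Record<string, unknown>;
  content?: PMNode[];
  text?: string;
}

const escapeAttr = (s: string): string =>
  s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/"/g, "&quot;");

/** Read a string attribute, escaped for use inside a double-quoted attribute. */
const attr = (node: PMNode, name: string): string =>
  escapeAttr(String(node.attrs?.[name] ?? ""));

function renderFigure(node: PMNode, renderChildren: (nodes: PMNode[]) => string): string {
  const id = attr(node, "id");
  const children = renderChildren(node.content ?? []);

  if (node.attrs?.type === "native-table") {
    // Children are the table plus its caption; there is no image to emit.
    return `<div class="table-wrapper" id="${id}">${children}</div>`;
  }
  // Image figure: the image comes from attrs, children hold the caption.
  return `<figure id="${id}"><img src="${attr(node, "src")}" alt="${attr(node, "alt")}"/>${children}</figure>`;
}
```

Passing the child renderer in as a parameter is only a convenience for this sketch; inside a full renderer the function would call renderBlock with the shared context instead.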

Tables

Tables contain table_row nodes, each with table_cell or table_header children. Leading rows in which every cell is a table_header form the thead; an all-header row that appears after the first body row stays in the tbody.

function renderTable(node, ctx):
  theadRows = []
  tbodyRows = []

  for each row in node.content:
    if every cell in row is a "table_header" AND tbodyRows is still empty:
      append row to theadRows
    else:
      append row to tbodyRows

  emit <table>
    if theadRows: emit <thead> with rendered rows
    if tbodyRows: emit <tbody> with rendered rows

  # For each cell, check colspan/rowspan attributes
  # (emit only when > 1)
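
A TypeScript sketch of the head/body split and the span attributes is shown below. `renderCell` is a hypothetical callback for rendering a cell's content, and treating an empty row as a body row is an assumption of this example.

```typescript
interface PMNode {
  type: string;
  attrs?: Record<string, unknown>;
  content?: PMNode[];
  text?: string;
}

/** Emit colspan/rowspan only when the value is greater than 1. */
const spanAttrs = (cell: PMNode): string =>
  (["colspan", "rowspan"] as const)
    .map((name) => {
      const value = Number(cell.attrs?.[name] ?? 1);
      return value > 1 ? ` ${name}="${value}"` : "";
    })
    .join("");

function renderTable(node: PMNode, renderCell: (cell: PMNode) => string): string {
  const theadRows: PMNode[] = [];
  const tbodyRows: PMNode[] = [];

  for (const row of node.content ?? []) {
    const cells = row.content ?? [];
    const allHeaders = cells.length > 0 && cells.every((c) => c.type === "table_header");
    // Only leading all-header rows go into the head.
    if (allHeaders && tbodyRows.length === 0) theadRows.push(row);
    else tbodyRows.push(row);
  }

  const renderRow = (row: PMNode): string => {
    const cells = (row.content ?? []).map((cell) => {
      const tag = cell.type === "table_header" ? "th" : "td";
      return `<${tag}${spanAttrs(cell)}>${renderCell(cell)}</${tag}>`;
    });
    return `<tr>${cells.join("")}</tr>`;
  };
  const section = (tag: string, rows: PMNode[]): string =>
    rows.length > 0 ? `<${tag}>${rows.map(renderRow).join("")}</${tag}>` : "";

  return `<table>${section("thead", theadRows)}${section("tbody", tbodyRows)}</table>`;
}
```

Header cells inside body rows are still emitted as `th`, so row headers in the first column keep their meaning even though the row itself belongs to the body.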

Step 8 — Wire up the public API

Wrap everything in a single entry point:

function render(doc):
  ctx = { footnotes: [] }
  body = renderBlockNode(doc, ctx)
  footnotes = renderFootnotes(ctx)
  return body + footnotes

Checklist

Use this checklist to verify your renderer handles all schema features:

Block nodes

| Node type | Key attributes | Notes |
| --- | --- | --- |
| `doc` | `type` | Root node; may contain `part`, `header`, or flat content |
| `part` | `id`, `type`, `locale` | Structural section; first heading is its title |
| `header` | — | Unwrap and render children directly |
| `heading` | `id`, `level` (1–6) | Used as section titles and for auto-sectioning |
| `subtitle` | — | Appears inside `header` |
| `paragraph` | `id` | Standard text block |
| `blockquote` | `id`, `lang` | Quoted content |
| `code_block` | `id`, `language` | Preformatted code |
| `bullet_list` | — | Unordered list |
| `ordered_list` | `order` | Ordered list; `order` is the start number |
| `list_item` | — | List entry |
| `figure` | `id`, `type`, `src`, `alt`, `orientation` | Image or table wrapper |
| `caption` | — | Inside `figure`; may contain `label` + `paragraph` |
| `label` | — | Figure/table label text |
| `table` | `id` | Contains `table_row` nodes |
| `table_row` | — | Contains `table_cell` / `table_header` |
| `table_cell` | `colspan`, `rowspan` | Data cell |
| `table_header` | `colspan`, `rowspan` | Header cell |
| `reference` | `id`, `refId` | Bibliography entry |
| `horizontal_rule` | — | Thematic break |
| `pageBreak` | — | Page break hint |
| `placeHolder` | `id`, `type` | Skip in output |

Inline nodes

| Node type | Key attributes | Notes |
| --- | --- | --- |
| `text` | — | Carries `text` string and `marks` array |
| `hard_break` | — | Line break |
| `citation` | `source`, `style`, `id` | `source` is URI-encoded JSON array of `{id}` |
| `footnote` | `id` | Inline content collected for deferred rendering |
| `math` | `id`, `tex`, `style`, `label` | `style`: `"inline"` or `"display"` |
| `image` | `src`, `alt` | Inline image |
| `link` | `href` | Cross-reference; `href` starts with `#` |

Marks

| Mark type | Attributes | Output |
| --- | --- | --- |
| `strong` | — | Bold |
| `em` | — | Italic |
| `sup` | — | Superscript |
| `sub` | — | Subscript |
| `anchor` | `href`, `title` | External hyperlink |
| `indexEntry` | `entries: [{raw}]` | Index term wrapper |
| `bdi` | — | Bidirectional isolation (passthrough) |
| `tags` | — | Tagging (passthrough) |

Reference: JATS renderer source

The complete JATS renderer lives at:

packages/schema/prosemirror/src/lib/jats-body-generator.tsx

It demonstrates the patterns described above (block/inline dispatch maps, mark merging with reordering, auto-sectioning, deferred footnotes, and figure/table handling), plus pretty-printing, which this guide does not cover, in roughly 780 lines of annotated TypeScript.