
Add built-in extensions infrastructure with lipsum #105

Draft
gordonwoodhull wants to merge 6 commits into main from feature/builtin-extensions

gordonwoodhull (Collaborator) commented Apr 1, 2026

Summary

This PR ports five built-in extensions from TS Quarto — lipsum, version, kbd, placeholder, and video — and builds the general infrastructure needed to run them unchanged in both the native CLI and the hub-client WASM environment.

The extensions themselves are relatively thin. The real substance of this PR is the infrastructure: a Lua API layer, a dependency pipeline, WASM file I/O overrides, and hub-client post-processing that together allow TS Quarto extensions to run as-is on our Rust engine.

Infrastructure

1. Lua constructor type coercion

pandoc.Para("text") now auto-coerces strings to Inlines via peek_inlines_fuzzy, matching Pandoc's Lua marshal behavior. TS Quarto extensions routinely pass bare strings to constructors that expect Inline lists, so this is a prerequisite for running them without modification.
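
The coercion rules can be illustrated with a minimal sketch. The `Inline`, `Block`, and `LuaArg` types below are simplified, hypothetical stand-ins (not the engine's real AST or the actual `peek_inlines_fuzzy` signature), and whitespace handling at the start and end of a string is an assumption that may differ from Pandoc's `B.text`.

```rust
// Simplified stand-ins for the Pandoc AST; the real types have many more variants.
#[derive(Debug, Clone, PartialEq)]
enum Inline {
    Str(String),
    Space,
    SoftBreak,
}

#[derive(Debug, Clone, PartialEq)]
enum Block {
    Plain(Vec<Inline>),
}

// Hypothetical model of what a Lua caller may hand to a constructor.
#[derive(Debug, Clone)]
enum LuaArg {
    Text(String),
    OneInline(Inline),
    Mixed(Vec<LuaArg>),
}

// Split a string into words. A whitespace run containing a newline becomes
// SoftBreak, any other run becomes Space. Assumption: leading and trailing
// whitespace is dropped in this sketch.
fn text_to_inlines(s: &str) -> Vec<Inline> {
    let mut out = Vec::new();
    let mut word = String::new();
    let mut saw_space = false;
    let mut saw_newline = false;
    for ch in s.chars() {
        if ch.is_whitespace() {
            if !word.is_empty() {
                out.push(Inline::Str(std::mem::take(&mut word)));
            }
            saw_space = true;
            saw_newline |= ch == '\n';
        } else {
            if saw_space && !out.is_empty() {
                out.push(if saw_newline { Inline::SoftBreak } else { Inline::Space });
            }
            saw_space = false;
            saw_newline = false;
            word.push(ch);
        }
    }
    if !word.is_empty() {
        out.push(Inline::Str(word));
    }
    out
}

// Strings are word-split, single elements are wrapped in a singleton list,
// and mixed tables are flattened element by element.
fn coerce_inlines(arg: LuaArg) -> Vec<Inline> {
    match arg {
        LuaArg::Text(s) => text_to_inlines(&s),
        LuaArg::OneInline(inline) => vec![inline],
        LuaArg::Mixed(items) => items.into_iter().flat_map(coerce_inlines).collect(),
    }
}

// Inline-like values appearing in block context are wrapped in Plain.
fn coerce_blocks(arg: LuaArg) -> Vec<Block> {
    vec![Block::Plain(coerce_inlines(arg))]
}
```

Under these assumptions, `coerce_inlines(LuaArg::Text("hello world".into()))` yields `Str`, `Space`, `Str`, which is the shape a constructor such as `pandoc.Para("hello world")` needs.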

2. Built-in extension discovery and embedding

Extensions in resources/extensions/quarto/ are embedded in the binary at compile time via include_dir!. At runtime they're extracted to a temp directory (native) or loaded into VFS (WASM). The discovery system merges built-in extensions with user-installed extensions, with user extensions winning on name collision — the same semantics as TS Quarto.
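
As a rough illustration of the merge rule, the sketch below inserts built-in extensions first and user extensions second, so a user entry replaces a built-in one with the same name. The `Extension` and `Source` types and the `merge_extensions` function are hypothetical names for this example, not the discovery code in this PR.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, PartialEq)]
enum Source {
    BuiltIn,
    User,
}

#[derive(Debug, Clone, PartialEq)]
struct Extension {
    name: String,
    source: Source,
}

// Later inserts overwrite earlier ones, so inserting user extensions last
// gives them priority on a name collision.
fn merge_extensions(built_in: Vec<Extension>, user: Vec<Extension>) -> BTreeMap<String, Extension> {
    let mut merged = BTreeMap::new();
    for ext in built_in.into_iter().chain(user) {
        merged.insert(ext.name.clone(), ext);
    }
    merged
}
```

With a built-in `kbd` and a user-installed `kbd`, the merged map holds a single `kbd` entry whose source is `User`.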

3. quarto.doc Lua API and dofile/loadfile for WASM

The quarto.doc namespace provides the APIs that extensions use to interact with the rendering pipeline: is_format(), add_html_dependency(), include_text(), and has_bootstrap(). A script-directory stack ensures quarto.utils.resolve_path() resolves relative to the calling script rather than the document, which matters when extensions load sibling files.

dofile() and loadfile() are overridden in the WASM and test environments to route through SystemRuntime, since the native C fopen doesn't work in WASM. The dofile override pushes and pops the script-directory stack, so nested resolve_path calls work correctly (this is the pattern video-filter.lua uses to load shared code from video.lua).
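
The push/pop behavior can be sketched roughly as follows. `ScriptDirStack` and its methods are hypothetical names for illustration; the sketch only models path resolution and does not execute Lua or go through SystemRuntime, and it does not restore the stack if the callback panics.

```rust
use std::path::{Path, PathBuf};

struct ScriptDirStack {
    dirs: Vec<PathBuf>,
}

impl ScriptDirStack {
    // The bottom entry stands in for the document's directory.
    fn new(document_dir: &str) -> Self {
        ScriptDirStack { dirs: vec![PathBuf::from(document_dir)] }
    }

    // Resolve relative to the directory of the script currently running.
    fn resolve_path(&self, relative: &str) -> PathBuf {
        match self.dirs.last() {
            Some(dir) => dir.join(relative),
            None => PathBuf::from(relative),
        }
    }

    // Model of a dofile override: push the script's directory, run the
    // script body, then pop so the caller's directory is current again.
    fn with_script<T>(&mut self, script: &Path, run: impl FnOnce(&mut Self) -> T) -> T {
        let dir = script.parent().map(Path::to_path_buf).unwrap_or_default();
        self.dirs.push(dir);
        let result = run(&mut *self);
        self.dirs.pop();
        result
    }
}
```

Inside `with_script` for a path such as `ext/video/video-filter.lua` (an illustrative location), `resolve_path("video.lua")` resolves next to the filter rather than next to the document, and the document directory is current again once the call returns.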

4. Pipeline wiring for HTML dependencies and text includes

A new dependency.rs module provides store_html_dependencies(), which reads CSS/JS files via the runtime and stores them as artifacts at libs/{name}/{filename}, and push_text_includes(), which routes include_text("in-header"|"before-body"|"after-body") calls into PandocIncludes. The template integration adds $for(scripts)$ and $for(include-after)$ loops, and render_to_file writes dependency artifacts to {stem}_files/libs/ on disk.
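
A minimal sketch of the two routing rules, assuming a simplified `Includes` struct in place of PandocIncludes. The function names `artifact_path` and `route_text_include` are hypothetical and chosen to avoid implying anything about the real signatures of store_html_dependencies() or push_text_includes().

```rust
// Simplified stand-in for PandocIncludes: one bucket per include location.
#[derive(Debug, Default, PartialEq)]
struct Includes {
    header: Vec<String>,
    before_body: Vec<String>,
    after_body: Vec<String>,
}

// Dependency files are stored under libs/{name}/{filename}.
fn artifact_path(name: &str, filename: &str) -> String {
    format!("libs/{}/{}", name, filename)
}

// Route an include_text(location, text) call into the matching bucket.
// Rejecting unknown locations is an assumption of this sketch.
fn route_text_include(includes: &mut Includes, location: &str, text: &str) -> Result<(), String> {
    let slot = match location {
        "in-header" => &mut includes.header,
        "before-body" => &mut includes.before_body,
        "after-body" => &mut includes.after_body,
        other => return Err(format!("unknown include location: {}", other)),
    };
    slot.push(text.to_string());
    Ok(())
}
```

For the kbd extension this gives `libs/kbd/kbd.css` as the artifact path, matching the layout described above.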

5. Resource prefix for native file rendering

Extension artifact paths like libs/kbd/kbd.css need a {stem}_files/ prefix in the HTML <link> and <script> tags to match where the files are actually written on disk. A resource_prefix field threads from render_to_file through to the template stage.
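
To make the prefixing concrete, here is a small sketch that derives the prefix from the output file name and applies it to the generated tags. The helper names and the exact tag markup are assumptions for illustration; only the `{stem}_files/` convention comes from this PR.

```rust
use std::path::Path;

// Derive "{stem}_files/" from the output file, e.g. report.html -> report_files/.
fn resource_prefix(output: &Path) -> Option<String> {
    let stem = output.file_stem()?.to_str()?;
    Some(format!("{}_files/", stem))
}

// Emit a stylesheet link whose href points at the prefixed artifact path.
fn stylesheet_tag(prefix: &str, artifact: &str) -> String {
    format!("<link href=\"{}{}\" rel=\"stylesheet\" />", prefix, artifact)
}

// Emit a script tag whose src points at the prefixed artifact path.
fn script_tag(prefix: &str, artifact: &str) -> String {
    format!("<script src=\"{}{}\"></script>", prefix, artifact)
}
```

With an empty prefix the tags fall back to the bare `libs/...` paths, which is the form the hub-client preview works with.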

6. Hub-client post-processor for VFS resources

The preview iframe can't fetch libs/ URLs — those files only exist in VFS. The post-processor now reads extension CSS and JS from VFS and inlines them: CSS is converted to data URIs on <link> tags, and JS is inserted as inline <script> elements. A synthetic DOMContentLoaded event is dispatched after inlining so that extension scripts which register listeners (like kbd.js) get their initialization callback.
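
The CSS half of this step can be sketched as below. This is an illustration in Rust rather than the hub-client's own code, and it assumes a percent-encoded `data:` URI; the PR does not say which encoding the post-processor actually uses. Both function names are hypothetical.

```rust
// Percent-encode every byte outside the RFC 3986 unreserved set.
fn percent_encode(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for byte in input.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'_' | b'.' | b'~' => {
                out.push(byte as char)
            }
            _ => out.push_str(&format!("%{:02X}", byte)),
        }
    }
    out
}

// Build a data URI that can replace the href of a <link rel="stylesheet"> tag.
fn css_data_uri(css: &str) -> String {
    format!("data:text/css;charset=utf-8,{}", percent_encode(css))
}
```

Because the stylesheet travels inside the attribute value, the iframe never has to fetch anything from `libs/`.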

7. quarto.utils.as_inlines and as_blocks

Type coercion functions matching TS Quarto's _utils.lua. These convert between Pandoc AST types (Inlines, Blocks, single elements, tables, strings) without parsing markdown. Our implementation uses lookup tables for block/inline type classification because our pandoc.utils.type() returns specific element names ("Para", "Str") rather than generic categories ("Block", "Inline").
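
The lookup-table idea can be sketched as follows. The tables list Pandoc's standard block and inline constructor names; the `Kind` enum and `classify` function are hypothetical, and the actual implementation in this PR may be organized differently.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Kind {
    Block,
    Inline,
    Other,
}

const BLOCK_TYPES: &[&str] = &[
    "BlockQuote", "BulletList", "CodeBlock", "DefinitionList", "Div", "Figure",
    "Header", "HorizontalRule", "LineBlock", "OrderedList", "Para", "Plain",
    "RawBlock", "Table",
];

const INLINE_TYPES: &[&str] = &[
    "Cite", "Code", "Emph", "Image", "LineBreak", "Link", "Math", "Note",
    "Quoted", "RawInline", "SmallCaps", "SoftBreak", "Space", "Span", "Str",
    "Strikeout", "Strong", "Subscript", "Superscript", "Underline",
];

// Map a specific element name such as "Para" or "Str" to its generic category.
fn classify(type_name: &str) -> Kind {
    if BLOCK_TYPES.iter().any(|t| *t == type_name) {
        Kind::Block
    } else if INLINE_TYPES.iter().any(|t| *t == type_name) {
        Kind::Inline
    } else {
        Kind::Other
    }
}
```

A coercion function can then branch on the category: wrap an `Inline` element in a list, unwrap the contents of a `Block`, and so on.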

8. Testing infrastructure

A SMOKE_FILTER environment variable allows running individual smoke-all fixtures across all three test runners (native Rust, WASM Vitest, and Playwright E2E), which made iterating on extension bugs much faster.
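
A rough sketch of how a runner might apply the variable, assuming substring matching on the fixture name (the PR does not state the exact matching rule). `select_fixtures` and `filter_from_env` are hypothetical names, and `builtin-kbd` is a made-up fixture name for the example.

```rust
// Read the filter from the environment; None means "run everything".
fn filter_from_env() -> Option<String> {
    std::env::var("SMOKE_FILTER").ok()
}

// Keep only fixtures whose name contains the filter. An unset or empty
// filter selects every fixture.
fn select_fixtures<'a>(fixtures: &[&'a str], filter: Option<&str>) -> Vec<&'a str> {
    fixtures
        .iter()
        .copied()
        .filter(|name| match filter {
            Some(f) if !f.is_empty() => name.contains(f),
            _ => true,
        })
        .collect()
}
```

A runner would call `select_fixtures(&all, filter_from_env().as_deref())`; with `SMOKE_FILTER=video` only the two `builtin-video-*` fixtures would remain.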

9. Preview iframe sandbox (for discussion)

The preview iframe's sandbox attribute currently blocks all script execution (allow-same-origin allow-popups without allow-scripts). This restriction was added recently in d6eb060 without specific rationale — before that commit, the iframes had no sandbox at all. The final commit in this PR adds allow-scripts to re-enable extension JS execution. This needs team discussion before merging: do we want to allow scripts in the preview, or should extensions degrade gracefully when JS can't run?

Extensions

All five extensions are byte-identical copies from TS Quarto, with one exception: placeholder.lua has a small simplification where TS Quarto checks quarto.format.is_typst_output() to choose between SVG and PNG output formats. Since we don't have that API yet, we default to SVG, which works for HTML output.

Pandoc's Lua API automatically coerces arguments passed to constructors
(e.g. pandoc.Para("text") or pandoc.Div(pandoc.Para(...))). Our
constructors were strict, only accepting tables of exact userdata types,
causing real-world extensions like lipsum to fail.

Add peek_inlines_fuzzy/peek_blocks_fuzzy matching pandoc-lua-marshal's
peekInlinesFuzzy/peekBlocksFuzzy behavior:
- Strings word-split into Str/Space/SoftBreak (matching B.text)
- Single userdata wrapped in singleton lists
- Mixed tables of strings and userdata accepted
- Inlines-like values in block context wrapped in Plain

Update all constructors (Para, Emph, Div, etc.), helper functions
(parse_list_items, parse_caption, parse_single_citation, etc.), and
pandoc.Inlines()/pandoc.Blocks() to use fuzzy coercion uniformly.

Built-in extensions ship with the binary, embedded at compile time via
include_dir! and extracted at runtime (temp dir on native, VFS on WASM).
Discovery merges them with user extensions, user-wins on name collision.

Lua API layer (quarto.doc namespace):
- is_format(), has_bootstrap(), add_html_dependency(), include_text()
- Script-dir stack for correct resolve_path() in nested dofile() calls
- dofile/loadfile overrides for WASM/test routing through SystemRuntime

Pipeline wiring (FilterOutput, dependency.rs, template integration):
- HTML dependencies stored as artifacts at libs/{name}/{filename}
- Text includes routed into PandocIncludes (header/before-body/after-body)
- Template $for() loops for scripts, includes, matching Quarto 1 behavior
- render_to_file writes dependency artifacts to {stem}_files/libs/

First built-in extension (lipsum) included as proof of infrastructure.

Copy three extensions from TS Quarto:
- version: {{< version >}} outputs the Quarto version string
- kbd: {{< kbd Ctrl-C >}} renders keyboard shortcuts with OS-specific
  display, CSS/JS dependencies, and accessibility markup
- placeholder: {{< placeholder 200 >}} generates SVG placeholder images

The kbd extension exercises the full HTML dependency pipeline end-to-end:
Lua add_html_dependency → artifact storage → template tags → file output.

Also adds SMOKE_FILTER env var for running individual smoke-all fixtures
across all three runners (Rust, WASM Vitest, Playwright).

Copy the video extension from TS Quarto (shortcode + filter + VideoJS
resources). The shortcode handles YouTube, Vimeo, Brightcove, and local
video embedding. The filter (video-filter.lua) handles reveal.js
background-video attributes and is only activated by explicit user
reference, matching TS Quarto behavior.

Implement quarto.utils.as_inlines() and quarto.utils.as_blocks() type
coercion functions matching TS Quarto's _utils.lua. These use lookup
tables for block/inline type classification since our pandoc.utils.type()
returns specific names (e.g., "Para") rather than generic "Block".

Smoke tests: builtin-video-youtube (iframe embed) and builtin-video-local
(VideoJS with add_html_dependency + include_text("after-body")).

Native render: extension <link>/<script> tags pointed to libs/kbd/kbd.css
instead of {stem}_files/libs/kbd/kbd.css. Add resource_prefix to
ApplyTemplateConfig, set to "{stem}_files/" in render_to_file.

Hub-client: the iframe post-processor only handled /.quarto/ CSS links.
Extend to also handle libs/ prefix for both CSS and JS. Scripts are read
from VFS and inlined since the iframe can't fetch relative src paths.
A synthetic DOMContentLoaded is dispatched after inlining so extension
scripts that register listeners (e.g., kbd.js) run their initialization.

Add allow-scripts to the sandbox attribute on MorphIframe and
DoubleBufferedIframe. This is required for extension JS (kbd.js,
video.min.js, etc.) to execute in the preview.

The sandbox restriction (allow-same-origin allow-popups, no scripts)
was added in d6eb060 without specific rationale. Without allow-scripts,
extension scripts inlined by the post-processor are blocked by the
browser. Verified working with kbd extension via Playwright.

Security note: this allows any script in the rendered HTML to execute.
The previous restriction was only 12 days old and the iframes had no
sandbox at all before that. Discuss with team before merging.

gordonwoodhull force-pushed the feature/builtin-extensions branch from ec2c748 to 0c42679 on April 3, 2026 18:25
gordonwoodhull marked this pull request as draft on April 3, 2026 21:41
gordonwoodhull (Collaborator, Author) commented:

I'm converting this to draft while I investigate alternatives to allow-scripts that would still let at least some blessed scripts run.

If we can't have extensions add JS, most of the work on this PR is still valuable, but we'd have to rewrite or drop the kbd and video shortcodes.

Might be worth merging this without those shortcodes for now. Will reconsider next week.
