2 Workflows
A project’s workflow reads files from the project’s assets directory and writes files to the project’s distribution directory.
A workflow’s implementation is a subclass of unlike-compiler%, which allows any asset to depend on other assets. Any workflow class is a valid value for polyglot+% in .polyglotrc.rkt.
2.1 Built-In Workflows
To save time for new users, Polyglot ships with three built-in workflows: polyglot/base%, polyglot/imperative%, and polyglot/functional%. Each may be extended, overridden, or ignored entirely.
polyglot/imperative% and polyglot/functional% both operate under the assumption that they must produce HTML5 documents from Markdown files. In addition, each Markdown file may contain application elements and library elements. Application elements are of form <script type="application/racket">...</script>, and library elements are of form <script type="text/racket">...</script>. The text content of these elements is copied to Racket modules contained in (polyglot-temp-directory). The name of each Racket module is equal to ID.rkt, where ID is the value of the id attribute of the <script> element. If the id attribute is not present or empty, a temporary name will be used instead.
The Racket modules created on disk from application elements are loaded using dynamic-require in the order they are encountered in Markdown. The intended behavior and responsibility of an application element’s Racket module depends on the workflow executing it. Library elements, on the other hand, are saved to disk as-is and used according to the whims of the Racket modules produced from application elements.
2.1.1 polyglot/base
(require polyglot/base)    package: polyglot-lib
The base workflow sets rules shared by polyglot/functional% and polyglot/imperative% by specializing the behavior of unlike-compiler%.
class
polyglot/base% : class?
  superclass: unlike-compiler%
If the string looks like a readable path on your system, returns a complete path. Relative paths are completed using (assets-rel). Complete paths are used as-is.
If the completed path does not refer to a readable file, this will raise exn:fail unless the path extension equals #".literal".
method
(send a-polyglot/base delegate clear) → unlike-asset/c
clear : clear/c
Selects an advance/c procedure to act as the first representation of an asset with the given clear path. The procedure depends on the result of (path-get-extension clear):
#".literal": The path will be treated as a fulfilled asset without further processing. The path may refer to a non-existent file in the project’s distribution.
#".css": The path is assumed to refer to a CSS file. For every url(X) expression in the stylesheet, the compiler will add! X as a dependency of the CSS file.
#".rkt": The path is assumed to refer to a Racket module. The module must (provide write-dist-file), where write-dist-file is an advance/c procedure that may return a complete path as a fulfilled file in a distribution.
In all other cases, the file located at path clear will be copied directly to the distribution directory, and renamed. The new name of the file will equal the first 8 characters of the SHA-1 hash of the file’s own contents, with the same extension. This is for cache-busting purposes. The asset is then considered fulfilled with a path to the newly-renamed file.
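The cache-busting rename described above can be sketched in a few lines. This is an illustration rather than Polyglot's own code: hashed-name is a hypothetical helper, and it works on in-memory bytes instead of a file on disk.

```racket
#lang racket/base
(require file/sha1    ; sha1: input port -> hex digest string
         racket/path) ; path-get-extension

;; hashed-name is a hypothetical helper, not part of Polyglot.
;; It builds a name from the first 8 hex characters of the SHA-1
;; digest of the content, keeping the original extension.
(define (hashed-name content-bytes original-path)
  (define digest (sha1 (open-input-bytes content-bytes)))
  (define ext (or (path-get-extension original-path) #""))
  (string-append (substring digest 0 8)
                 (bytes->string/utf-8 ext)))

(hashed-name #"hello" "logo.png") ; => "aaf4c61d.png"
```

Because the name depends only on the content, an unchanged file keeps its name between builds, while any edit produces a new name that browsers will not have cached.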
procedure
(make-minimal-html-page body) → txexpr?
body : (listof xexpr?)
procedure
(add-dependencies! clear compiler txexpr/expanded) → advance/c
clear : clear/c
compiler : (is-a?/c unlike-compiler%)
txexpr/expanded : txexpr?
Specifically, add-dependencies! maps the output of (discover-dependencies txexpr/expanded) to clear names using compiler and adds them to the build using (send compiler add!). As a special case, Markdown dependencies are added to the compilation but without a dependency relationship on clear. This is to avoid a circular dependency locking up a build when two pages link to each other.
The returned advance/c procedure must be used as the next step for the asset named by clear. That procedure will resolve the dependencies in txexpr/expanded, write a finished HTML5 document to a distribution, and fulfill the asset using the path of the HTML5 document.
You will likely not call this yourself, but you can use it for custom workflows derived from polyglot/base% if you aim to create HTML5 documents from Markdown using a different set of rules. The dependency resolution step is tedious, and this will take care of it for you.
For an example, see the functional workflow’s delegate implementation.
2.1.2 polyglot/imperative
(require polyglot/imperative)    package: polyglot-lib
(require (submod polyglot/imperative safe))
The Imperative Workflow seeks application elements within Markdown files and runs them under the expectation that they will produce content as a side-effect. In that sense, app elements behave similarly to <?php echo ...; ?> in PHP.
To use module-level contracts, require the safe submodule.
class
polyglot/imperative% : class?
  superclass: polyglot/base%
method
(send a-polyglot/imperative delegate clear) → unlike-asset/c
clear : clear/c
Like polyglot/base%’s delegate, except Markdown files (identified by a #".md" extension) are parsed and processed using this advance/c procedure:
(λ (clear compiler)
  (define txexpr/parsed (parse-markdown clear))
  (define txexpr/preprocessed (send compiler preprocess-txexprs txexpr/parsed))
  (define txexpr/processed (run-txexpr/imperative! txexpr/preprocessed))
  (add-dependencies! clear compiler txexpr/processed))
method
(send a-polyglot/imperative preprocess-txexprs tx-expressions) → (listof txexpr?)
tx-expressions : (listof txexpr?)
This method transforms tx-expressions parsed from a source Markdown file into a new list of tagged X-expressions. This transformation occurs before the instance uses run-txexpr/imperative!. Use this to sanitize untrusted code, generate application elements based on content, or attach common metadata to documents.
The default implementation searches tx-expressions for elements with a data-macro attribute. If the attribute exists, it must be a string of at most two words (e.g. "holiday" or "holiday halloween"). If a second word is not specified, it is assumed to be "replace-element".
The first word is converted to a path to a Racket module in the assets directory, and the second word is converted to a symbol used to extract a provided identifier via dynamic-require, with reload support for live builds (e.g. (dynamic-require (assets-rel "holiday.rkt") 'halloween)).
The dynamic-require must return a (-> txexpr? (listof txexpr?)) procedure that transforms the original tagged X-expression holding the data-macro attribute into zero or more new elements.
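The parsing rule for data-macro values can be sketched as follows. This is a minimal model under stated assumptions: parse-data-macro is a hypothetical name, and it returns a bare file name where the default implementation would complete the path against the assets directory.

```racket
#lang racket/base
(require racket/string)

;; parse-data-macro is a hypothetical helper, not part of Polyglot.
;; It splits a data-macro attribute value into a module file name and
;; the symbol to request from that module. A missing second word
;; defaults to replace-element, as described above.
(define (parse-data-macro attr-value)
  (define words (string-split attr-value))
  (unless (<= 1 (length words) 2)
    (raise-argument-error 'parse-data-macro
                          "a string of one or two words"
                          attr-value))
  (values (string-append (car words) ".rkt")
          (string->symbol (if (null? (cdr words))
                              "replace-element"
                              (cadr words)))))

(parse-data-macro "holiday")           ; => "holiday.rkt" 'replace-element
(parse-data-macro "holiday halloween") ; => "holiday.rkt" 'halloween
```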
procedure
(run-txexpr/imperative! target [ initial-layout]) → (or/c (listof txexpr?) txexpr?) target : (or/c txexpr? (non-empty-listof txexpr?))
initial-layout : (-> (listof txexpr?) (or/c txexpr? (listof txexpr?))) = identity
The transformation does not mutate target, but does depend on side-effects:
1. Remember initial-layout as the page layout.
2. Save all Racket modules from application elements and library elements found in target-prime to disk. Remove all library elements from target-prime.
3. For each Racket module M written to disk, in order matching app elements encountered:
   a. Instantiate the module using (dynamic-require M #f).
   b. Read all tagged X-expressions produced by write calls in the module as a side-effect.
   c. Replace the page layout with (dynamic-require M 'layout), if possible. Otherwise keep the existing layout.
   d. Replace the app element that sourced M with the content written by that element.
4. Return (layout target-prime), where layout is bound to the layout procedure after Step 3.
Side-effects:
Unique and short-lived directories and Racket modules will appear in (system-temp-rel). They are deleted by the time control leaves this procedure.
Events will appear on unlike-assets-logger. Info-level events will report summarized content fragments created by your application elements. Any output on (current-error-port) caused by script elements will appear as error-level events.
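For orientation, the body of an application element under this workflow might look like the module below. It is a sketch based only on the steps above: content is emitted with write, and the optional layout export replaces the page layout. The heading text and page title are placeholder values.

```racket
#lang racket/base
;; Sketch of the module body an application element might contain.
;; The workflow is described as reading what this module writes and
;; then asking the module for an optional `layout` procedure.
(provide layout)

;; Content is produced as a side-effect: each write call emits one
;; tagged X-expression.
(write '(h1 "Hello from Racket"))
(write '(p "Written at build time."))

;; Optional replacement layout. It receives the processed page
;; content as a list of elements and wraps it in a document.
(define (layout kids)
  `(html (head (title "My Page"))
         (body ,@kids)))
```

A module that omits the layout export still works under the steps above. In that case the existing layout stays in effect.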
2.1.3 polyglot/functional
(require polyglot/functional)    package: polyglot-lib
(require (submod polyglot/functional safe))
To use module-level contracts, require the safe submodule.
class
polyglot/functional% : class?
  superclass: polyglot/base%
method
(send a-polyglot/functional delegate clear) → unlike-asset/c
clear : clear/c
Like polyglot/base%’s implementation, except Markdown files (identified by a #".md" extension) are handled with this advance/c procedure:
(λ (clear compiler)
  (define fragment (parse-markdown clear))
  (define base-page (make-minimal-html-page fragment))
  (define preprocessed (preprocess-page base-page))
  (add-dependencies! clear
                     compiler
                     (postprocess-page (run-txexpr/functional! preprocessed))))
method
(send a-polyglot/functional preprocess-page page-tx) → txexpr?
page-tx : txexpr?
A page-replacing method that runs before any replace-page procedure provided by an application element’s module. By default, this is the identity function.
method
(send a-polyglot/functional postprocess-page page-tx) → txexpr?
page-tx : txexpr?
A page-replacing method that runs after every replace-page procedure provided by an application element’s module. By default, this is the identity function.
procedure
(run-txexpr/functional! target [ #:max-passes max-passes]) → txexpr?
target : (or/c (listof txexpr?) txexpr?)
max-passes : exact-integer? = 1000
The transformation is a fold on target into a new page target-prime that repeats these steps until no substitutions occur:
1. Save all Racket modules from application elements and library elements found in target-prime to disk.
2. Evaluate (dynamic-require path 'replace-page (lambda () (lambda (x) x))) for each path derived from an app element, in order.
3. For each replace-page procedure, functionally replace target-prime with
   (parameterize ([current-replace-element-predicate F]) (replace-page target-prime))
   where F is a predicate that matches the application element that sourced the replace-page procedure.
4. Remove from target-prime all app and lib elements discovered in Step 1.
If this process repeats more than max-passes times, the procedure will raise exn:fail.
In addition to all side-effects produced by app or library elements, status events will appear on unlike-assets-logger. Temporary Racket modules are deleted by the time control leaves this procedure.
value
current-replace-element-predicate : (parameter/c (-> txexpr-element? any/c)) = (λ _ #f)
2.2 polyglot/txexpr: Workflows from Scratch
(require polyglot/txexpr)    package: polyglot-lib
(require (submod polyglot/txexpr safe))
This module provides all bindings from the txexpr and xml modules, plus the below.
To use module-level contracts, require the safe submodule.
polyglot/txexpr offers workflow-independent tools to define where programs exist within annotated documents, and to create new documents according to how you process those programs. Combining this module with your own subclass of unlike-compiler% allows you to write static site generators that evolve independently of built-in workflows.
2.2.1 Analysis
procedure
(get-text-elements tx) → (listof string?)
tx : txexpr?
procedure
(tx-search-tagged tx tag) → (listof txexpr?)
tx : txexpr? tag : symbol?
If an element matches, the search will not descend into the child elements.
procedure
(tag-equal? tag tx) → boolean?
tag : symbol? tx : any/c
procedure
(make-tag-predicate tags) → (-> any/c boolean?)
tags : (non-empty-listof symbol?)
procedure
(discover-dependencies tx) → (listof string?)
tx : txexpr?
If a single element contains multiple eligible attribute values, they will all appear in the returned output.
Relative paths will not be made complete. It’s up to you to decide a base directory. This frees you from needing to use (assets-rel).
Values that appear on parent elements will come before values that appear on child elements in the output. In the event multiple dependency values appear on a single element, they will appear in the order respecting the attribute list on that element.
> (discover-dependencies '(parent ((href "a.png")) (child ((href "b.png") (src "c.png")))))
2.2.2 Replacing Elements
The following procedures find and replace elements using callbacks. The callback that defines replacement elements is called replace for the sake of this documentation.
The procedures are either passive or aggressive. The aggressive procedures will replace all matching elements, including the ones that appear in the replacements it already made. The passive procedures will replace all matching elements, but if the replacement produces more matching elements, they will simply leave them in the output.
Consider the following replacement rules:
<p title="Hi">...</p> --> <section><h1>Hi</h1><p>...</p></section>
<p>...</p> --> <section><p>...</p></section>
In this case you want a passive procedure, because it will replace all <p> elements in a document with <section> elements, but will leave the <p> elements from the replacement as part of the intended output.
An aggressive procedure would not terminate because the substitution rules would disallow any <p> elements, even in the replacement.
<p title="Hi">...</p>
--> <section><h1>Hi</h1><p>...</p></section>
--> <section><h1>Hi</h1><section><p>...</p></section></section>
--> <section><h1>Hi</h1><section><section><p>...</p></section></section></section>
...
This is not to say that aggressive procedures are generally unhelpful. They are meant for use cases where replacements may vary and eventually stop producing matching elements.
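The distinction can be modeled without Polyglot itself. The following is a minimal sketch, not the library's implementation: elements are simplified to (tag child ...) with no attribute lists, the root element is never replaced, and element?, has-match?, replace-passive, and replace-aggressive are hypothetical names.

```racket
#lang racket/base

;; Simplified element: (tag child ...). Children are strings or
;; elements. Attribute lists are left out for brevity (assumption).
(define (element? x) (and (pair? x) (symbol? (car x))))

;; True if any descendant of tx satisfies match?.
(define (has-match? tx match?)
  (for/or ([child (in-list (cdr tx))])
    (or (match? child)
        (and (element? child) (has-match? child match?)))))

;; Passive: one traversal. Matching children are swapped for the list
;; returned by replace, and that replacement is not searched again.
(define (replace-passive tx match? replace)
  (cons (car tx)
        (apply append
               (for/list ([child (in-list (cdr tx))])
                 (cond [(match? child) (replace child)]
                       [(element? child)
                        (list (replace-passive child match? replace))]
                       [else (list child)])))))

;; Aggressive: repeat passive passes until nothing matches, giving up
;; with exn:fail once max-passes passes have run.
(define (replace-aggressive tx match? replace #:max-passes [max-passes 1000])
  (let loop ([current tx] [passes 0])
    (cond [(not (has-match? current match?)) current]
          [(>= passes max-passes)
           (error 'replace-aggressive
                  "still matching after ~a passes" max-passes)]
          [else (loop (replace-passive current match? replace)
                      (add1 passes))])))

;; Rule: a becomes b, b becomes c. Replacements eventually stop matching.
(define (a-or-b? x) (and (element? x) (memq (car x) '(a b))))
(define (advance x) (list (cons (if (eq? (car x) 'a) 'b 'c) (cdr x))))

(replace-passive '(main (a)) a-or-b? advance)    ; => '(main (b))
(replace-aggressive '(main (a)) a-or-b? advance) ; => '(main (c))
```

With a rule whose output always matches again, such as wrapping every p in a section that still contains the p, the aggressive sketch stops only by raising once the pass limit is reached. That mirrors the non-terminating chain shown above.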
2.2.2.1 Passive Replacement
procedure
(tx-replace tx predicate replace) → txexpr?
tx : txexpr? predicate : (-> txexpr-element? any/c) replace : (-> txexpr-element? (listof txexpr?))
procedure
(tx-replace-tagged tx tag replace) → txexpr?
tx : txexpr? tag : symbol? replace : (-> txexpr? (listof txexpr?))
e.g. (tx-replace-tagged tx 'h2 (lambda (x) `((h3 ,@(get-elements x)))))
procedure
(substitute-many-in-txexpr tx replace? replace)
 → (or/c (listof txexpr-element?) txexpr?) (listof txexpr?)
tx : txexpr?
replace? : (-> txexpr-element? any/c)
replace : (-> txexpr-element? (listof txexpr-element?))
Pay careful attention to the wording here.
Normally you do not need to call this directly, but it is helpful to understand how it works. This is useful if you want to build a "stepper" to inspect replacements in a broader pipeline.
A matching element can be replaced by zero or more elements, so the replace procedure must return a list of elements.
> (substitute-many-in-txexpr '(main (div (p "1") (p "2"))) (λ (x) (tag-equal? 'p x)) (λ _ '((b) (b))))
'(main (div (b) (b) (b) (b)))
'((div (p "1") (p "2")))
Return an empty list to remove the element outright (possibly leaving an empty parent element).
> (substitute-many-in-txexpr '(main (div (p "1") (p "2"))) (λ (x) (tag-equal? 'p x)) (λ _ null))
'(main (div))
'((div (p "1") (p "2")))
As a special case, if (replace? tx) is true, then the return values will be (values (replace tx) (list tx)). This is the only case where the first returned value matches (listof txexpr-element?) and not txexpr? in the range contract.
> (substitute-many-in-txexpr '(main (div (p "1") (p "2"))) (λ (x) (tag-equal? 'main x)) (λ _ '((root))))
'((root))
'((main (div (p "1") (p "2"))))
Take care to understand that while all elements with at least one matching child are reconstructed, the substitution will not account for nested children. This avoids the risk of infinite loops in the event replacement elements always include other matching elements.
To guarantee full replacement of elements, use substitute-many-in-txexpr/loop.
> (substitute-many-in-txexpr '(p "old" (p "old") "old") string? (λ _ '("new")))
'(p "new" (p "old") "new")
'((p "old" (p "old") "old"))
procedure
(apply-manifest tx manifest [rewrite]) → txexpr?
tx : txexpr?
manifest : dict?
rewrite : (-> string? string?) = (lambda ...)
Pair this with discover-dependencies to set up a workflow where discovered build-time assets are replaced with production-ready assets.
(define page (run-txexpr! (parse-markdown md-file) layout))

; Map each discovered dependency to its optimized counterpart.
(define optimized
  (foldl (λ (dep res)
           (dict-set res dep (write-optimized-to-disk! dep)))
         #hash()
         (discover-dependencies page)))

; Replace things like <img src="logo.png" /> with <img src="809a2d.png" />
(define production-ready (apply-manifest page optimized))

(with-output-to-file "page.html" #:exists 'truncate
  (λ ()
    (displayln "<!DOCTYPE html>")
    (displayln (xexpr->html production-ready))))
2.2.2.2 Aggressive Replacement
procedure
(tx-replace/aggressive tx predicate replace) → txexpr?
tx : txexpr? predicate : (-> txexpr-element? any/c) replace : (-> txexpr-element? (listof txexpr?))
procedure
(tx-replace-tagged/aggressive tx tag replace) → txexpr?
tx : txexpr?
tag : symbol?
replace : (-> txexpr? (listof txexpr?))
Acts as a shorthand for substitute-many-in-txexpr/loop, except it only returns the first value.
procedure
(substitute-many-in-txexpr/loop tx replace? replace [#:max-replacements max-replacements])
 → (or/c (listof txexpr-element?) txexpr?) (listof txexpr?)
tx : txexpr?
replace? : (-> txexpr? any/c)
replace : (-> txexpr? (listof txexpr?))
max-replacements : exact-integer? = 1000
(substitute-many-in-txexpr/loop '(p) (λ (x) (tag-equal? 'p x)) (λ (x) '((p))))
substitute-many-in-txexpr/loop raises exn:fail if it iterates once more after performing max-replacements.
The return values are like those returned from substitute-many-in-txexpr.
procedure
(interlace-txexprs tx-expressions
                   replace?/list-or-proc
                   replace/list-or-proc
                   [#:max-replacements max-replacements
                    #:max-passes max-passes])
 → (non-empty-listof txexpr?)
tx-expressions : (or/c txexpr? (non-empty-listof txexpr?))
replace?/list-or-proc : (or/c (-> txexpr? any/c) (non-empty-listof (-> txexpr? any/c)))
replace/list-or-proc : (or/c (-> txexpr? (listof txexpr?)) (non-empty-listof (-> txexpr? (listof txexpr?))))
max-replacements : exact-integer? = 1000
max-passes : exact-integer? = 50
Unlike the other substitution procedures, interlace-txexprs accepts multiple pairings of replace? and replace. If replace?/list-or-proc or replace/list-or-proc is not a list, it will be treated as if it were a list containing the original value as the only element. The lists must have the same number of elements, just as if you had provided them to map or foldl.
For each pass, the following happens:
- For each replace? and replace procedure, do this:
  (substitute-many-in-txexpr/loop (cons (gensym) tx-expressions) replace? replace #:max-replacements max-replacements)
- If any replacements occurred, repeat.
interlace-txexprs returns only the transformed list of tagged X-expressions, or raises exn:fail if it would exceed max-passes.
This is the procedure you would likely use to write more flexible workflows. Here is an example program that parses a Markdown file, and defines a pass to remove all script and style elements, then all elements with no children. Because the procedure will continue until no substitutions are possible, only the heading will remain.
(require racket/list
         racket/string
         markdown
         polyglot/txexpr)

(define (discard . _) null)

(define md
  (string-join '("# Hello, world"
                 "<script>blah</script>"
                 "<b><i><br></i></b>")
               "\n"))

(interlace-txexprs
 (parse-markdown md)
 (list (make-tag-predicate '(script style))
       (λ (x) (and (txexpr? x) (empty? (get-elements x)))))
 (list discard discard))
'((h1 ((id "hello-world")) "Hello, world"))
2.2.3 Content Generation
2.3 polyglot/elements
(require polyglot/elements)    package: polyglot-lib
polyglot/elements integrates the Racket module system with tagged X-expressions via dynamic-require.
procedure
(script-element? tx) → boolean?
tx : any/c
procedure
(script-element-of-type? type tx) → boolean?
type : string? tx : any/c
procedure
(app-element? tx) → boolean?
tx : any/c
(or (script-element-of-type? "application/rackdown" x) (script-element-of-type? "application/racket" x))
procedure
(lib-element? tx) → boolean?
tx : any/c
(or (lib-element? x) (app-element? x))
procedure
(write-script script dir) → path?
script : script-element? dir : directory-exists?
Writes the text children of script to a file in dir and returns a path to the created file.
If script contains multiple separate strings as children, then they will be separated by newline characters in the output file.
Side-effect: Sends "Wrote script: ~a" info-level event to unlike-assets-logger, where ~a is the displayed form of the returned path.
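The naming and joining rules for script elements can be sketched without touching the file system. This is a hedged model, not write-script itself: script-module-name and script-source are hypothetical helpers, the element is assumed to carry an attribute list, and the fallback name produced by gensym only stands in for whatever temporary name Polyglot actually picks.

```racket
#lang racket/base
(require racket/string)

;; Assumed element shape: (script ((attr "value") ...) child ...).
;; Both helpers below are hypothetical, not part of Polyglot.

;; ID.rkt when a non-empty id attribute exists, else a generated name.
(define (script-module-name script)
  (define id-attr (assq 'id (cadr script)))
  (define id (and id-attr (cadr id-attr)))
  (string-append (if (and id (non-empty-string? id))
                     id
                     (symbol->string (gensym "tmp")))
                 ".rkt"))

;; Separate string children are joined with newline characters.
(define (script-source script)
  (string-join (filter string? (cddr script)) "\n"))

(define example
  '(script ((id "blah")) "#lang racket" "(displayln 1)"))

(script-module-name example) ; => "blah.rkt"
(script-source example)      ; => "#lang racket\n(displayln 1)"
```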
procedure
(load-script path [make-input]) → input-port? output-port? input-port?
path : path?
make-input : (-> output-port? any) = void
Like (dynamic-require path #f), except any use of current-output-port, current-input-port, or current-error-port is reflected in the returned ports. load-script will apply make-input to an output port before loading the module to populate a buffer. That buffer may be consumed via current-input-port in the module’s top-level forms.
(define-values (readable-stdout writeable-stdin readable-stderr)
  (load-script
   (write-script '(script ((id "blah"))
                          "#lang racket"
                          "(displayln \"What's your name?\")"
                          "(define name (read-line))"
                          "(printf \"Hi, ~a!\" name)")
                 (current-directory))
   (λ (to-module) (displayln "Sage" to-module))))

(displayln (read-line readable-stdout))
(displayln (read-line readable-stdout))
Take care to note that if the module found at path waits for input, you will need to provide it via make-input or else control will not leave load-script.
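The idea of pre-filling input and capturing output can be shown with plain string ports. This is a simplified stand-in for load-script, not its implementation: run-with-captured-io is a hypothetical helper that runs a thunk rather than loading a module, and it returns the captured output as a string rather than as ports.

```racket
#lang racket/base

;; run-with-captured-io is a hypothetical helper, not part of Polyglot.
;; It runs thunk with current-input-port reading from input-string and
;; returns everything the thunk printed to current-output-port.
(define (run-with-captured-io thunk input-string)
  (define out (open-output-string))
  (parameterize ([current-input-port (open-input-string input-string)]
                 [current-output-port out])
    (thunk))
  (get-output-string out))

(define (greet)
  (displayln "What's your name?")
  (define name (read-line))
  (printf "Hi, ~a!" name))

(run-with-captured-io greet "Sage\n")
;; => "What's your name?\nHi, Sage!"
```

In this sketch the input is a finite string, so a read past its end yields an end-of-file value instead of blocking. That differs from the caveat above, where a module waiting on input that was never supplied keeps control inside load-script.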