2 Workflows

A project’s workflow reads files from the project’s assets directory and writes files to the project’s distribution directory.

A workflow’s implementation is a subclass of unlike-compiler%, which allows any asset to depend on other assets. Any workflow class is a valid value for polyglot+% in .polyglotrc.rkt.

2.1 Built-In Workflows

To save time for new users, Polyglot ships with three built-in workflows: polyglot/base%, polyglot/imperative%, and polyglot/functional%. Each may be extended, overridden, or ignored entirely.

polyglot/imperative% and polyglot/functional% both operate under the assumption that they must produce HTML5 documents from Markdown files. In addition, each Markdown file may contain application elements and library elements. Application elements are of the form <script type="application/racket">...</script>, and library elements are of the form <script type="text/racket">...</script>. The text content of these elements is copied to Racket modules contained in (polyglot-temp-directory). The name of each Racket module is ID.rkt, where ID is the value of the id attribute of the <script> element. If the id attribute is missing or empty, a temporary name is used instead.

The Racket modules created on disk from application elements are loaded using dynamic-require in the order they are encountered in Markdown. The intended behavior and responsibility of an application element’s Racket module depend on the workflow executing it. Library elements, on the other hand, are saved to disk as-is and used according to the whims of the Racket modules produced from application elements.

2.1.1 polyglot/base

(require polyglot/base)

package: polyglot-lib

The base workflow sets rules shared by polyglot/functional% and polyglot/imperative% by specializing the behavior of unlike-compiler%.

class
polyglot/base% : class?

superclass: unlike-compiler%

Implements the base workflow. An instance of polyglot/base% does not set any rules for application elements.

In the terminology of unlike-assets, polyglot/base% uses complete paths as clear/c names. Fulfilled assets are represented as complete paths to files in a distribution directory.

method
(send a-polyglot/base clarify unclear) → clear/c
unclear : unclear/c
If the string looks like a readable path on your system, returns a complete path.
Relative paths are completed using (assets-rel). Complete paths are used as-is.
If the completed path does not refer to a readable file, this will raise exn:fail unless the path extension equals #".literal".
method
(send a-polyglot/base delegate clear) → unlike-asset/c
clear : clear/c
Selects an advance/c procedure to act as the first representation of an asset with the given clear path.
The procedure depends on the result of (path-get-extension clear):
#".literal": The path will be treated as a fulfilled asset without further processing. The path may refer to a non-existent file in the project’s distribution.
#".css": The path is assumed to refer to a CSS file. For every url(X) expression in the stylesheet, the compiler will add! X as a dependency of the CSS file.
#".rkt": The path is assumed to refer to a Racket module. The module must (provide write-dist-file), where write-dist-file is an advance/c procedure that may return a complete path as a fulfilled file in a distribution.
In all other cases, the file located at path clear will be copied directly to the distribution directory, and renamed. The new name of the file will equal the first 8 characters of the SHA-1 hash of the file’s own contents, with the same extension. This is for cache-busting purposes. The asset is then considered fulfilled with a path to the newly-renamed file.
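
The renaming rule in the last case can be approximated in a few lines. This is a minimal sketch and not Polyglot's own implementation: cache-busted-name is a hypothetical helper, and it only computes the new file name without copying anything to a distribution directory.

```racket
#lang racket/base
(require file/sha1)

;; Hypothetical helper: the name a file would receive under the
;; "first 8 characters of the SHA-1 hash, same extension" rule.
(define (cache-busted-name src)
  (define digest (call-with-input-file src sha1))   ; hex digest of the file contents
  (define ext (or (path-get-extension src) #""))    ; e.g. #".png", or #f when absent
  (string-append (substring digest 0 8)
                 (bytes->string/utf-8 ext)))
```

Under this sketch, a file named logo.png whose contents are the three bytes abc would be named a9993e36.png, because the SHA-1 digest of abc begins with a9993e36.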

procedure
(make-minimal-html-page body) → txexpr?
body : (listof xexpr?)

Returns `(html (head (title "Untitled")) (body ,@body))
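
To make the shape of the result concrete, here is a stand-in that mirrors the documented return value. The name minimal-page/sketch is hypothetical and exists only so the snippet runs without Polyglot installed.

```racket
#lang racket/base

;; Illustrative stand-in that mirrors the documented return value.
(define (minimal-page/sketch body)
  `(html (head (title "Untitled")) (body ,@body)))

(minimal-page/sketch '((h1 "Hello") (p "First paragraph")))
;; → '(html (head (title "Untitled")) (body (h1 "Hello") (p "First paragraph")))
```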

procedure
(add-dependencies! clear
compiler
txexpr/expanded) → advance/c
  clear : clear/c
  compiler : (is-a?/c unlike-compiler%)
  txexpr/expanded : txexpr?

Assuming clear is a path to a Markdown file, and txexpr/expanded is derived from that file in a given workflow, this procedure will discover and add! dependencies in txexpr/expanded and return an advance/c procedure that prepares a final HTML5 document with rewritten links to production-ready assets in a distribution.

Specifically, add-dependencies! maps the output of (discover-dependencies txexpr/expanded) to clear names using compiler and adds them to the build using (send compiler add!). As a special case, Markdown dependencies are added to the compilation but without a dependency relationship on clear. This is to avoid a circular dependency locking up a build when two pages link to each other.

The returned advance/c procedure must be used as the next step for the asset named by clear. That procedure will resolve the dependencies in txexpr/expanded, write a finished HTML5 document to a distribution, and fulfill the asset using the path of the HTML5 document.

You will likely not call this yourself, but you can use it for custom workflows derived from polyglot/base% if you aim to create HTML5 documents from Markdown using a different set of rules. The dependency resolution step is tedious, and this will take care of it for you.

For an example, see the functional workflow’s delegate implementation.
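
As a rough illustration of such a custom workflow, the sketch below derives a class from polyglot/base% that turns Markdown into a minimal page with no application-element processing. The class name plain-markdown% is hypothetical, parse-markdown is assumed to come from the markdown package, and the override assumes delegate can be specialized with define/override in the same way the built-in workflows specialize it.

```racket
#lang racket/base
(require racket/class
         markdown        ; assumed source of parse-markdown
         polyglot/base)

;; Hypothetical workflow: Markdown files become minimal HTML5 pages.
(define plain-markdown%
  (class polyglot/base%
    (super-new)
    (define/override (delegate clear)
      (if (equal? (path-get-extension clear) #".md")
          ;; An advance/c procedure receives the clear name and the compiler.
          (λ (clear compiler)
            (add-dependencies! clear
                               compiler
                               (make-minimal-html-page (parse-markdown clear))))
          ;; Every other asset keeps the base workflow's rules.
          (super delegate clear)))))
```

The procedure returned by add-dependencies! becomes the next step for the asset, so the sketch does not need to resolve links or write the HTML5 file itself.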

2.1.2 polyglot/imperative

(require polyglot/imperative)	package: polyglot-lib
(require (submod polyglot/imperative safe))

The Imperative Workflow seeks application elements within Markdown files and runs them under the expectation that they will produce content as a side-effect. In that sense, app elements behave similarly to <?php echo ...; ?> in PHP.

To use module-level contracts, require the safe submodule.

class
polyglot/imperative% : class?

superclass: polyglot/base%

Implements the imperative workflow.

method
(send a-polyglot/imperative delegate clear) → unlike-asset/c
  clear : clear/c
Like polyglot/base%’s delegate, except Markdown files (identified by a #".md" extension) are parsed and processed using this advance/c procedure:
(λ (clear compiler)
  (define txexpr/parsed (parse-markdown clear))
  (define txexpr/preprocessed (send compiler preprocess-txexprs txexpr/parsed))
  (define txexpr/processed (run-txexpr/imperative! txexpr/preprocessed))
  (add-dependencies! clear compiler txexpr/processed))
method
(send a-polyglot/imperative preprocess-txexprs tx-expressions)
→ (listof txexpr?)
  tx-expressions : (listof txexpr?)
This method transforms tx-expressions parsed from a source Markdown file into a new list of tagged X-expressions. This transformation occurs before the instance uses run-txexpr/imperative!.
Use this to sanitize untrusted code, generate application elements based on content, or attach common metadata to documents.
The default implementation searches tx-expressions for elements with a data-macro attribute. If the attribute exists, it must be a string of at most two words (e.g. "holiday" or "holiday halloween"). If a second word is not specified, it is assumed to be "replace-element".
The first word is converted to a path to a Racket module in the assets directory, and the second word is converted to a symbol used to extract a provided identifier via dynamic-require, with reload support for live builds (e.g. (dynamic-require (assets-rel "holiday.rkt") 'halloween)).
The dynamic-require must return a (-> txexpr? (listof txexpr?)) procedure that transforms the original tagged X-expression holding the data-macro attribute into zero or more new elements.
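
The way a data-macro value maps to a module path and an identifier can be sketched as follows. parse-data-macro is a hypothetical helper, and its assets-dir argument stands in for the project's assets directory. The real default implementation also handles reloading for live builds, which this sketch ignores.

```racket
#lang racket/base
(require racket/string)

;; Hypothetical helper: split a data-macro value into the module to load
;; and the identifier to extract from it.
(define (parse-data-macro value assets-dir)
  (define words (string-split value))
  (unless (<= 1 (length words) 2)
    (raise-argument-error 'parse-data-macro "a string of one or two words" value))
  (values (build-path assets-dir (string-append (car words) ".rkt"))
          (string->symbol (if (null? (cdr words))
                              "replace-element"   ; documented default for a missing second word
                              (cadr words)))))

;; A macro procedure of the documented shape (-> txexpr? (listof txexpr?)).
(define (halloween tx)
  '((p "Happy Halloween!")))
```

With this sketch, "holiday halloween" yields the path holiday.rkt inside the assets directory together with the symbol halloween, while "holiday" alone yields the same path with the symbol replace-element.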

procedure
(run-txexpr/imperative! target
[ initial-layout])
→ (or/c (listof txexpr?) txexpr?)
target : (or/c txexpr? (non-empty-listof txexpr?))
initial-layout : (-> (listof txexpr?) (or/c txexpr? (listof txexpr?)))
= identity

Transforms target into a new tagged X-expression, target-prime, presumably representing HTML5.

The transformation does not mutate target, but does depend on side-effects:

Remember initial-layout as the page layout.
Save all Racket modules from application elements and library elements found in target-prime to disk. Remove all library elements from target-prime.
For each Racket module M written to disk, in order matching app elements encountered:
1. Instantiate the module using (dynamic-require M #f)
2. Read all tagged X-expressions produced by write calls in the module as a side-effect.
3. Replace the page layout with (dynamic-require M 'layout), if possible. Otherwise keep the existing layout.
4. Replace the app element that sourced M with the content written by that element.
Return (layout target-prime), where layout is bound to the layout procedure after Step 3.

Side-effects:

Unique and short-lived directories and Racket modules will appear in (system-temp-rel). They are deleted by the time control leaves this procedure.
Events will appear on unlike-assets-logger. Info-level events will report summarized content fragments created by your application elements. Any output on (current-error-port) caused by script elements will appear as error-level events.
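
From the author's side, the body of an application element under these rules might look like the module below. It is a sketch based only on the steps listed above: the page title and content are made up, and layout follows the documented contract of accepting a list of tagged X-expressions.

```racket
#lang racket/base
(provide layout)

;; Content is produced as a side effect: each write emits one tagged
;; X-expression that replaces the application element.
(write '(h1 "Release notes"))
(write `(p "Generated with Racket " ,(version)))

;; Optional: replaces the page layout for the rest of the run.
(define (layout kids)
  `(html (head (title "Release notes"))
         (body ,@kids)))
```

A module that omits layout leaves the current layout in place, which is initial-layout unless an earlier application element replaced it.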

value
polyglot% : class?

An alias for polyglot/imperative% kept for backwards compatibility.

value
run-txexpr! : procedure?

An alias for run-txexpr/imperative! kept for backwards compatibility.

2.1.3 polyglot/functional

(require polyglot/functional)	package: polyglot-lib
(require (submod polyglot/functional safe))

To use module-level contracts, require the safe submodule.

class
polyglot/functional% : class?

superclass: polyglot/base%

Specializes polyglot/base% to process Markdown files where application elements can replace page contents without side-effects.

method
(send a-polyglot/functional delegate clear) → unlike-asset/c
  clear : clear/c
Like polyglot/base%’s implementation, except Markdown files (identified by a #".md" extension) are handled with this advance/c procedure:
(λ (clear compiler)
   (define fragment (parse-markdown clear))
   (define base-page (make-minimal-html-page fragment))
   (define preprocessed (preprocess-page base-page))
   (add-dependencies!
        clear
        compiler
        (postprocess-page (run-txexpr/functional! preprocessed))))
method
(send a-polyglot/functional preprocess-page page-tx) → txexpr?
  page-tx : txexpr?
A page-replacing method that runs before any replace-page procedure provided by an application element’s module.
By default, this is the identity function.
method
(send a-polyglot/functional postprocess-page page-tx) → txexpr?
  page-tx : txexpr?
A page-replacing method that runs after every replace-page procedure provided by an application element’s module.
By default, this is the identity function.

procedure
(run-txexpr/functional! target
[ #:max-passes max-passes]) → txexpr?
   target :
(or/c (listof txexpr?)
      txexpr?)
  max-passes : exact-integer? = 1000

Transforms target into a new tagged X-expression, target-prime, presumably representing HTML5. If target is a list of elements, then this procedure will begin processing with (make-minimal-html-page target) instead.

The transformation is a fold on target into a new page target-prime that repeats these steps until no substitutions occur:

1. Save all Racket modules from application elements and library elements found in target-prime to disk.
2. Evaluate (dynamic-require path 'replace-page (lambda () (lambda (x) x))) for each path derived from an app element, in order.
3. For each replace-page procedure, functionally replace target-prime with
   (parameterize ([current-replace-element-predicate F])
     (replace-page target-prime))
   where F is a predicate that matches the application element that sourced the replace-page procedure.
4. Remove from target-prime all app and lib elements discovered in Step 1.

If this process repeats more than max-passes times, the procedure will raise exn:fail.

In addition to all side-effects produced by app or library elements, status events will appear on unlike-assets-logger. Temporary Racket modules are deleted by the time control leaves this procedure.

value
current-replace-element-predicate
: (parameter/c (-> txexpr-element? any/c))
= (λ _ #f)

A parameter used to set the target for tx-replace-me. The functional workflow will set this parameter to a predicate that matches the application element in which a replace-page procedure is evaluating.

procedure
(tx-replace-me tx replace) → txexpr?
tx : txexpr?
replace : (-> txexpr? (listof txexpr?))

Like tx-replace, except the predicate is (current-replace-element-predicate).
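
For the functional workflow, an application element's module might look like the sketch below. It uses only the bindings documented in this section. The replacement paragraph is made up, and the snippet assumes tx-replace-me is available from polyglot/functional as documented here.

```racket
#lang racket/base
(require polyglot/functional)
(provide replace-page)

;; Receives the whole page and returns a new page. tx-replace-me targets
;; the application element that this module came from.
(define (replace-page page-tx)
  (tx-replace-me page-tx
                 (λ (me) '((p "Rendered by an application element")))))
```

Because the workflow sets current-replace-element-predicate before calling replace-page, the module does not need to locate its own script element.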

2.2 polyglot/txexpr: Workflows from Scratch

(require polyglot/txexpr)	package: polyglot-lib
(require (submod polyglot/txexpr safe))

This module provides all bindings from the txexpr and xml modules, plus the below.

To use module-level contracts, require the safe submodule.

polyglot/txexpr offers workflow-independent tools to define where programs exist within annotated documents, and to create new documents according to how you process those programs. Combining this module with your own subclass of unlike-compiler% allows you to write static site generators that evolve independently of built-in workflows.

2.2.1 Analysis

procedure
(get-text-elements tx) → (listof string?)
tx : txexpr?

Returns all string children of tx.

procedure
(tx-search-tagged tx tag) → (listof txexpr?)
tx : txexpr?
tag : symbol?

Return all elements in tx with the given tag. Returns an empty list if there are no matches.

If an element matches, the search will not descend into the child elements.
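
The "does not descend" rule is easiest to see in a simplified model. The sketch below is not the library's implementation: search-tagged/sketch is a hypothetical name, and it assumes elements carry no attribute lists.

```racket
#lang racket/base
(require racket/list)

;; Simplified model: elements are (tag child ...) with no attribute list.
(define (search-tagged/sketch tx tag)
  (cond [(not (pair? tx)) '()]                 ; strings and other leaves never match
        [(eq? (car tx) tag) (list tx)]         ; match found: stop here, do not descend
        [else (append-map (λ (child) (search-tagged/sketch child tag))
                          (cdr tx))]))

(search-tagged/sketch '(div (p "a" (p "nested")) (span (p "b"))) 'p)
;; → '((p "a" (p "nested")) (p "b"))
```

The nested p is not reported separately. It only appears inside the first match.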

procedure
(tag-equal? tag tx) → boolean?
tag : symbol?
tx : any/c

Returns #t if tx is a tagged X-expression and its tag is equal? to tag.

procedure
(make-tag-predicate tags) → (-> any/c boolean?)
tags : (non-empty-listof symbol?)

Returns a procedure that checks if a value causes tag-equal? to return #t for any of the given tags.

procedure
(discover-dependencies tx) → (listof string?)
tx : txexpr?

Returns the values of href, src, or srcset attributes in tx that appear to refer to assets on a local file system. This will check for complete paths, relative paths, and URLs with the "file://" scheme.

If a single element contains multiple eligible attribute values, they will all appear in the returned output.

Relative paths will not be made complete. It’s up to you to decide a base directory. This frees you from needing to use (assets-rel).

Values that appear on parent elements will come before values that appear on child elements in the output. In the event multiple dependency values appear on a single element, they will appear in the order respecting the attribute list on that element.

> (discover-dependencies
'(parent ((href "a.png"))
(child ((href "b.png") (src "c.png")))))

'("a.png" "b.png" "c.png")

2.2.2 Replacing Elements

The following procedures find and replace elements using callbacks. The callback that defines replacement elements is called replace for the sake of this documentation.

The procedures are either passive or aggressive. The aggressive procedures will replace all matching elements, including the ones that appear in the replacements it already made. The passive procedures will replace all matching elements, but if the replacement produces more matching elements, they will simply leave them in the output.

Consider a replacement rule that wraps every <p> element in a <section> element that still contains the original <p>.

In this case you want a passive procedure, because it will replace all <p> elements in a document with <section> elements, but will leave the <p> elements from the replacement as part of the intended output.

An aggressive procedure would not terminate because the substitution rules would disallow any <p> elements, even in the replacement.

--> <section><h1>Hi</h1><p>...</p></section>

--> <section><h1>Hi</h1><section><p>...</p></section></section>

--> <section><h1>Hi</h1><section><section><p>...</p></section></section></section>

...

This is not to say that aggressive procedures are generally unhelpful. They are meant for use cases where replacements may vary and eventually stop producing matching elements.
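
The difference between the two styles can be modeled in plain Racket. The sketch below is a simplified model and not the library's implementation: replace/passive and replace/aggressive are hypothetical names, elements are assumed to have no attribute lists, and the pass limit plays the role of the documented safety caps.

```racket
#lang racket/base
(require racket/list)

;; Simplified model: an element is (tag child ...) with no attribute list.
(define (tagged? tag v) (and (pair? v) (eq? (car v) tag)))

;; Passive: one pass. Replacements are spliced in and never revisited.
(define (replace/passive tx match? replace)
  (cons (car tx)
        (append-map (λ (child)
                      (cond [(match? child) (replace child)]
                            [(pair? child) (list (replace/passive child match? replace))]
                            [else (list child)]))
                    (cdr tx))))

;; Aggressive: repeat the passive pass until nothing changes, up to a cap.
(define (replace/aggressive tx match? replace #:max-passes [max-passes 1000])
  (let loop ([current tx] [passes 0])
    (define next (replace/passive current match? replace))
    (cond [(equal? next current) current]
          [(>= passes max-passes)
           (error 'replace/aggressive "exceeded ~a passes" max-passes)]
          [else (loop next (add1 passes))])))

(define (p? v) (tagged? 'p v))
(define (wrap-p p) `((section (h1 "Hi") ,p)))

(replace/passive '(body (p "a")) p? wrap-p)
;; → '(body (section (h1 "Hi") (p "a")))
```

Calling replace/aggressive with the same predicate and wrap-p never reaches a fixed point, so it ends by raising exn:fail once the pass limit is reached. A rule whose replacements eventually stop matching terminates normally.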

2.2.2.1 Passive Replacement

procedure
(tx-replace tx predicate replace) → txexpr?
  tx : txexpr?
  predicate : (-> txexpr-element? any/c)
  replace : (-> txexpr-element? (listof txexpr?))

Replaces each element E matching a predicate with the list of elements returned from (replace E). Note that this can create empty parent elements. Behaves like substitute-many-in-txexpr, except it only returns the first value.

procedure
(tx-replace-tagged tx tag replace) → txexpr?
  tx : txexpr?
  tag : symbol?
  replace : (-> txexpr? (listof txexpr?))

Like tx-replace, except you can designate all elements of a certain tag.

e.g. (tx-replace-tagged tx 'h2 (lambda (x) `((h3 ,@(get-elements x)))))

procedure
(substitute-many-in-txexpr tx
replace?
replace)
→
(or/c (listof txexpr-element?) txexpr?)
(listof txexpr?)
  tx : txexpr?
  replace? : (-> txexpr-element? any/c)
  replace : (-> txexpr-element? (listof txexpr-element?))

Pay careful attention to the wording here.

Find and replace all elements in tx with at least one child element matching replace?. Each immediate descendent element C is replaced with all elements from (replace C). Returns the new content as the first value, and a list of the reconstructed elements as the second value.

Normally you do not need to call this directly, but it is helpful to understand how it works. This is useful if you want to build a "stepper" to inspect replacements in a broader pipeline.

A matching element can be replaced by zero or more elements, so the replace procedure must return a list of txexpr elements.

> (substitute-many-in-txexpr
    '(main (div (p "1") (p "2")))
    (λ (x) (tag-equal? 'p x))
    (λ _ '((b) (b))))

'(main (div (b) (b) (b) (b)))
'((div (p "1") (p "2")))

Return an empty list to remove the element outright (possibly leaving an empty parent element).

> (substitute-many-in-txexpr
    '(main (div (p "1") (p "2")))
    (λ (x) (tag-equal? 'p x))
    (λ _ null))

'(main (div))
'((div (p "1") (p "2")))

As a special case, if (replace? tx) is true, then the return values will be (values (replace tx) (list tx)). This is the only case where the first returned value matches (listof txexpr-element?) and not txexpr? in the range contract.

> (substitute-many-in-txexpr
    '(main (div (p "1") (p "2")))
    (λ (x) (tag-equal? 'main x))
    (λ _ '((root))))

'((root))
'((main (div (p "1") (p "2"))))

Take care to understand that while all elements with at least one matching child are reconstructed, the substitution will not account for nested children. This avoids the risk of infinite loops in the event that replacement elements always include other matching elements.

To guarantee full replacement of elements, use substitute-many-in-txexpr/loop.

> (substitute-many-in-txexpr '(p "old" (p "old") "old")
string?
(λ _ '("new")))

'(p "new" (p "old") "new")
'((p "old" (p "old") "old"))

procedure
(apply-manifest tx manifest [rewrite]) → txexpr?
  tx : txexpr?
  manifest : dict?
  rewrite : (-> string? string?) = (lambda ...)

Returns a new txexpr such that each href, src, and srcset attribute value that appears as a key K in manifest is replaced with (rewrite (dict-ref manifest K)). By default, rewrite returns only the name value returned from split-path.
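
The default rewrite behavior described above can be approximated like this. name-only is a hypothetical stand-in for the default, and it assumes the manifest value names a file rather than a special element such as "..".

```racket
#lang racket/base

;; Hypothetical equivalent of the documented default: keep only the final
;; path element reported by split-path.
(define (name-only path-string)
  (define-values (base name must-be-dir?) (split-path path-string))
  (path->string name))

(name-only "dist/809a2d.png") ; → "809a2d.png"
```

Passing a different rewrite procedure is the place to add a URL prefix, for example when production assets are served from another host.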

Pair this with discover-dependencies to set up a workflow where discovered build-time assets are replaced with production-ready assets.

; md-file, layout, and write-optimized-to-disk! are placeholders that you define.
(define page (run-txexpr! (parse-markdown md-file) layout))

; Map each discovered dependency to the path of its optimized counterpart.
(define optimized (foldl (λ (dep res)
                           (dict-set res dep (write-optimized-to-disk! dep)))
                         #hash()
                         (discover-dependencies page)))

; Replace things like <img src="logo.png" /> with <img src="809a2d.png" />
(define production-ready (apply-manifest page optimized))

; Write the rewritten page, not the original one.
(with-output-to-file "page.html"
  #:exists 'truncate
  (λ ()
    (displayln "<!DOCTYPE html>")
    (displayln (xexpr->html production-ready))))

2.2.2.2 Aggressive Replacement

procedure
(tx-replace/aggressive tx predicate replace) → txexpr?
  tx : txexpr?
  predicate : (-> txexpr-element? any/c)
  replace : (-> txexpr-element? (listof txexpr?))
procedure
(tx-replace-tagged/aggressive tx
tag
replace) → txexpr?
  tx : txexpr?
  tag : symbol?
  replace : (-> txexpr? (listof txexpr?))

Aggressive variants of tx-replace and tx-replace-tagged.

Acts as a shorthand for substitute-many-in-txexpr/loop, except it only returns the first value.

procedure
(substitute-many-in-txexpr/loop
tx
replace?
replace
[ #:max-replacements max-replacements])
→
(or/c (listof txexpr-element?) txexpr?)
(listof txexpr?)
  tx : txexpr?
  replace? : (-> txexpr? any/c)
  replace : (-> txexpr? (listof txexpr?))
  max-replacements : exact-integer? = 1000

Repeats substitute-many-in-txexpr until no substitutions are possible. To illustrate, this would not terminate if it weren’t for max-replacements:

(substitute-many-in-txexpr/loop '(p)
(λ (x) (tag-equal? 'p x))
(λ (x) '((p))))

substitute-many-in-txexpr/loop raises exn:fail if it iterates once more after performing max-replacements.

The return values are like those returned from substitute-many-in-txexpr.

procedure
(interlace-txexprs tx-expressions
replace?/list-or-proc
replace/list-or-proc
[ #:max-replacements max-replacements
#:max-passes max-passes])
→ (non-empty-listof txexpr?)
  tx-expressions : (or/c txexpr? (non-empty-listof txexpr?))
   replace?/list-or-proc :
(or/c (-> txexpr? any/c)
(non-empty-listof (-> txexpr? any/c)))
   replace/list-or-proc :
(or/c (-> txexpr? (listof txexpr?))
(non-empty-listof (-> txexpr? (listof txexpr?))))
  max-replacements : exact-integer? = 1000
  max-passes : exact-integer? = 50

interlace-txexprs returns a list of tagged X-expressions constructed by a variable number of passes over tx-expressions.

Unlike the other substitution procedures, interlace-txexprs accepts multiple pairings of replace? and replace. If replace?/list-or-proc or replace/list-or-proc are not lists, they will be treated as if they were lists containing the original value as the only element. The lists must have the same number of elements, just like if you had provided them to map or foldl.

For each pass, the following happens:

For each replace? and replace procedure, do this:

(substitute-many-in-txexpr/loop (cons (gensym) tx-expressions)
                                replace?
                                replace
                                #:max-replacements max-replacements)

If any replacements occurred, repeat.

interlace-txexprs returns only the transformed list of tagged X-expressions, or raises exn:fail if it would exceed max-passes.

This is the procedure you would likely use to write more flexible workflows. Here is an example program that parses a Markdown file, and defines a pass to remove all script and style elements, then all elements with no children. Because the procedure will continue until no substitutions are possible, only the heading will remain.

(require racket/list
         racket/string
         markdown
         polyglot/txexpr)

(define (discard . _) null)

(define md (string-join '("# Hello, world"
                          "<script>blah</script>"
                          "<b><i><br></i></b>")
                        "\n"))

(interlace-txexprs (parse-markdown md)
                   (list (make-tag-predicate '(script style))
                         (λ (x) (and (txexpr? x)
                                     (empty? (get-elements x)))))
                   (list discard
                         discard))

'((h1 ((id "hello-world")) "Hello, world"))

2.2.3 Content Generation

procedure
(genid tx) → string?
tx : txexpr?

Returns a value for an id attribute that is not used anywhere in tx.

2.3 polyglot/elements

(require polyglot/elements)

package: polyglot-lib

polyglot/elements integrates the Racket module system with tagged X-expressions via dynamic-require.

procedure
(script-element? tx) → boolean?
tx : any/c

Returns #t if tx is a tagged X-expression with tag 'script.

procedure
(script-element-of-type? type tx) → boolean?
type : string?
tx : any/c

Returns #t if (script-element? tx) is #t and the type attribute equals type.

procedure
(app-element? tx) → boolean?
tx : any/c

Equivalent to

(or (script-element-of-type? "application/rackdown" tx)
(script-element-of-type? "application/racket" tx))

procedure
(lib-element? tx) → boolean?
tx : any/c

Equivalent to (script-element-of-type? "text/racket" tx).

procedure
(app-or-lib-element? tx) → boolean?
tx : any/c

Equivalent to

(or (lib-element? tx)
(app-element? tx))

procedure
(write-script script dir) → path?
script : script-element?
dir : directory-exists?

For use with load-script.

Writes the text children of script to a file in dir and returns a path to the created file.

If script contains multiple separate strings as children, then they will be separated by newline characters in the output file.

Side-effect: Sends "Wrote script: ~a" info-level event to unlike-assets-logger, where ~a is the displayed form of the returned path.
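
The newline rule for multiple string children can be pictured with a small stand-in. script-source is a hypothetical helper that only computes the text. It does not write a file or log an event.

```racket
#lang racket/base
(require racket/string)

;; Hypothetical helper: the text that would land in the file for a script
;; element. Non-string children, such as the attribute list, are skipped.
(define (script-source script)
  (string-join (filter string? (cdr script)) "\n"))

(script-source '(script ((id "blah"))
                        "#lang racket"
                        "(displayln \"hi\")"))
;; → "#lang racket\n(displayln \"hi\")"
```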

procedure
(load-script path [make-input])
→
input-port? output-port? input-port?
path : path?
make-input : (-> output-port? any) = void

For use with write-script.

Like (dynamic-require path #f), except any use of current-output-port, current-input-port, or current-error-port is reflected in the returned ports. load-script will apply make-input to an output port before loading the module to populate a buffer. That buffer may be consumed via current-input-port in the module’s top-level forms.

(define-values (readable-stdout writeable-stdin readable-stderr)
  (load-script (write-script '(script ((id "blah"))
                             "#lang racket"
                             "(displayln \"What's your name?\")"
                             "(define name (read-line))"
                             "(printf \"Hi, ~a!\" name)")
                             (current-directory))
               (λ (to-module)
                 (displayln "Sage" to-module))))

(displayln (read-line readable-stdout))
(displayln (read-line readable-stdout))

Take care to note that if the module found at path waits for input, you will need to provide it via make-input or else control will not leave load-script.
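
The general technique of redirecting the standard ports around a computation can be sketched with string ports. This is not how load-script is implemented: run-capturing is a hypothetical helper that runs a thunk instead of loading a module, and it returns strings rather than ports.

```racket
#lang racket/base

;; Hypothetical helper: run thunk with the standard ports redirected.
(define (run-capturing thunk input-text)
  (define out (open-output-string))
  (define err (open-output-string))
  (parameterize ([current-input-port (open-input-string input-text)]
                 [current-output-port out]
                 [current-error-port err])
    (thunk))
  (values (get-output-string out) (get-output-string err)))

(define-values (stdout stderr)
  (run-capturing (λ ()
                   (define name (read-line))
                   (printf "Hi, ~a!" name))
                 "Sage\n"))
;; stdout is "Hi, Sage!" and stderr is ""
```

Because the input is supplied up front, the thunk never blocks on read-line, which is the same reason make-input should provide everything the module expects to read.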
