<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" xmlns:px="http://www.daisy.org/ns/pipeline/xproc" xmlns:cx="http://xmlcalabash.com/ns/extensions" xmlns:d="http://www.daisy.org/ns/pipeline/data" version="1.0" type="px:epub-to-daisy.script" name="main" px:input-filesets="epub2 epub3" px:output-filesets="daisy202 daisy3"> <p:documentation xmlns="http://www.w3.org/1999/xhtml"> <h1 px:role="name">EPUB to DAISY</h1> <p px:role="desc">Transforms an EPUB 2 or EPUB 3 publication into DAISY 2.02 and DAISY 3.</p> <a px:role="homepage" href="http://daisy.github.io/pipeline/Get-Help/User-Guide/Scripts/epub-to-daisy/"> Online documentation </a> </p:documentation> <p:option name="source" required="true" px:type="anyFileURI" px:media-type="application/epub+zip application/oebps-package+xml"> <p:documentation xmlns="http://www.w3.org/1999/xhtml"> <h2 px:role="name">EPUB Publication</h2> <p px:role="desc" xml:space="preserve">The EPUB 2 or EPUB 3 you want to transform. You may alternatively use the "mimetype" document if your input is an unzipped/"exploded" version of an EPUB.</p> </p:documentation> </p:option> <p:option name="validation" select="'off'" required="false"><p:pipeinfo> <px:type> <choice xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"> <value>off</value> <a:documentation xml:lang="en">No validation</a:documentation> <value>report</value> <a:documentation xml:lang="en">Report validation issues</a:documentation> <value>abort</value> <a:documentation xml:lang="en">Abort on validation issues</a:documentation> </choice> </px:type></p:pipeinfo><p:documentation xmlns="http://www.w3.org/1999/xhtml"> <h2 px:role="name">Validation</h2> <p px:role="desc">Whether to validate the input EPUB and, if so, whether to report validation issues or abort on them.</p> </p:documentation> </p:option> <p:option name="tts" required="false" select="'default'"> <p:pipeinfo> <px:type> <choice xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"> <value>true</value> <a:documentation xml:lang="en">Yes</a:documentation> 
<value>false</value> <a:documentation xml:lang="en">No</a:documentation> <value>default</value> <a:documentation xml:lang="en">If publication has no media overlays yet</a:documentation> </choice> </px:type> </p:pipeinfo> <p:documentation xmlns="http://www.w3.org/1999/xhtml"> <h2 px:role="name">Perform text-to-speech</h2> <p px:role="desc" xml:space="preserve">Whether to use a speech synthesizer to produce media overlays. This will remove any existing media overlays in the EPUB.</p> </p:documentation> </p:option> <p:input port="tts-config" primary="false" px:media-type="application/vnd.pipeline.tts-config+xml"><p:documentation xmlns="http://www.w3.org/1999/xhtml"> <h2 px:role="name">Text-to-speech configuration file</h2> <p px:role="desc" xml:space="preserve">Configuration file for text-to-speech. [More details on the configuration file format](http://daisy.github.io/pipeline/Get-Help/User-Guide/Text-To-Speech/).</p> </p:documentation> <p:inline><d:config/></p:inline> </p:input> <p:option xmlns:_="tts" name="_:stylesheet" select="''" required="false" px:type="anyURI" px:sequence="true" px:separator=" " px:media-type="text/css text/x-scss"><p:documentation xmlns="http://www.w3.org/1999/xhtml"> <h2 px:role="name">Style sheets</h2> <p px:role="desc" xml:space="preserve">A list of CSS style sheets to take into account. Must be a space separated list of URIs, absolute or relative to the input. Style sheets specified through this option are called "[user style sheets](https://www.w3.org/TR/CSS2/cascade.html#cascade)". Style sheets can also be attached to the source document. These are referred to as "[author style sheets](https://www.w3.org/TR/CSS2/cascade.html#cascade)". 
They can be linked (using an ['xml-stylesheet' processing instruction](https://www.w3.org/TR/xml-stylesheet) or a ['link' element](https://www.w3.org/Style/styling-XML#External)), embedded (using a ['style' element](https://www.w3.org/Style/styling-XML#Embedded)) and/or inlined (using ['style' attributes](https://www.w3.org/TR/css-style-attr/)). Only author styles that apply to "[speech](https://www.w3.org/TR/CSS2/aural.html)" media are taken into account. All style sheets are applied at once, but the order in which they are specified has an influence on the [cascading order](https://www.w3.org/TR/CSS2/cascade.html#cascading-order). Author styles take precedence over user styles. </p> </p:documentation> </p:option> <p:option name="stylesheet-parameters" select="'()'" required="false" px:type="stylesheet-parameters"><p:documentation xmlns="http://www.w3.org/1999/xhtml"> <h2 px:role="name">Style sheet parameters</h2> <p px:role="desc" xml:space="preserve">A list of parameters passed to the style sheets. Style sheets, whether they're user style sheets (specified with the "Style sheets" option) or author style sheets (associated with the source), may have parameters (Sass variables). This option, which takes a comma-separated list of key-value pairs enclosed in parentheses, can be used to set these variables. For example, if a style sheet uses the Sass variable "foo": ~~~sass @if $foo { /* some style that should only be enabled when "foo" is truthy */ } ~~~ you can control that variable with the following parameters list: `(foo:true)`.</p> </p:documentation> </p:option> <p:option name="lexicon" select="p:system-property('d:org.daisy.pipeline.tts.default-lexicon')" required="false" px:type="anyURI" px:sequence="true" px:separator=" " px:media-type="application/pls+xml"><p:documentation xmlns="http://www.w3.org/1999/xhtml"> <h2 px:role="name">Lexicons</h2>

<p px:role="desc" xml:space="preserve">A list of PLS lexicons to take into account. Must be a space separated list of URIs, absolute or relative to the input. 
Lexicons can also be attached to the source document, using a ['link' element](http://kb.daisy.org/publishing/docs/text-to-speech/pls.html#ex-07). PLS lexicons allow you to define custom pronunciations of words. They are meant to help TTS processors deal with ambiguous abbreviations and the pronunciation of proper names. When a word is defined in a lexicon, the processor will use the provided pronunciation instead of the default rendering. The syntax of a PLS lexicon is defined in [Pronunciation Lexicon Specification (PLS) Version 1.0](https://www.w3.org/TR/pronunciation-lexicon), extended with regular expression matching. To enable regular expression matching, add the "regex" attribute, as follows: ~~~xml <lexicon xmlns="http://www.w3.org/2005/01/pronunciation-lexicon" version="1.0" alphabet="ipa" xml:lang="en"> <lexeme regex="true"> <grapheme>([0-9]+)-([0-9]+)</grapheme> <alias>between $1 and $2</alias> </lexeme> </lexicon> ~~~ The regex feature works only with alias-based substitutions. The regex syntax used is that from [XQuery 1.0 and XPath 2.0](https://www.w3.org/TR/xpath-functions/#regex-syntax). Whether or not the regex attribute is set to "true", the grapheme matching can be made more accurate by specifying the "positive-lookahead" and "negative-lookahead" attributes: ~~~xml <lexicon version="1.0" xmlns="http://www.w3.org/2005/01/pronunciation-lexicon" alphabet="ipa" xml:lang="en"> <lexeme> <grapheme positive-lookahead="[ ]+is">SB</grapheme> <alias>somebody</alias> </lexeme> <lexeme> <grapheme>SB</grapheme> <alias>should be</alias> </lexeme> <lexeme xml:lang="fr"> <grapheme positive-lookahead="[ ]+[cC]ity">boston</grapheme> <phoneme>bɔstøn</phoneme> </lexeme> </lexicon> ~~~ Graphemes with "positive-lookahead" will match if the beginning of what follows matches the "positive-lookahead" pattern. Graphemes with "negative-lookahead" will match if the beginning of what follows does not match the "negative-lookahead" pattern. 
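
As a sketch of how the two lookaheads can be combined, the following hypothetical lexicon (the abbreviation "St" and both aliases are assumptions chosen for illustration, not taken from the PLS specification) reads "St" as "Saint" when a capitalized word follows and as "Street" otherwise:

~~~xml
<lexicon version="1.0" xmlns="http://www.w3.org/2005/01/pronunciation-lexicon" alphabet="ipa" xml:lang="en">
  <!-- Hypothetical entries, for illustration only -->
  <lexeme>
    <!-- Applies only when a space and an upper-case letter follow -->
    <grapheme positive-lookahead="[ ]+[A-Z]">St</grapheme>
    <alias>Saint</alias>
  </lexeme>
  <lexeme>
    <!-- Applies only when what follows does NOT start with a space and an upper-case letter -->
    <grapheme negative-lookahead="[ ]+[A-Z]">St</grapheme>
    <alias>Street</alias>
  </lexeme>
</lexicon>
~~~

Because both lexemes use the same pattern, one as a positive and one as a negative lookahead, they are mutually exclusive, so the outcome in this sketch should not depend on the order in which they are listed.
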
The lookaheads are case-sensitive while the grapheme contents are not. The lexemes are matched in this order: 1. Graphemes with regex="false" come first, no matter if there is a lookahead or not; 2. then come graphemes with regex="true" and no lookahead; 3. then graphemes with regex="true" and one or two lookaheads. Within these categories, lexemes are matched in the same order as they appear in the lexicons.</p> </p:documentation> </p:option> <p:option name="include-tts-log" select="p:system-property('d:org.daisy.pipeline.tts.log')" px:type="boolean"><p:documentation xmlns="http://www.w3.org/1999/xhtml"> <h2 px:role="name">Enable TTS log</h2> <p px:role="desc" xml:space="preserve">Whether or not to make the TTS log available. The TTS log contains a great deal of additional information that is not present in the main job log and that is helpful for troubleshooting. Most of the log entries concern particular chunks of text of the input document. The default can be changed using the [`org.daisy.pipeline.tts.log`](http://daisy.github.io/pipeline/Get-Help/User-Guide/Text-To-Speech/#common-settings) property. 
</p> </p:documentation> </p:option> <p:output port="tts-log" sequence="true"><p:documentation xmlns="http://www.w3.org/1999/xhtml"> <h2 px:role="name">TTS log</h2> <p px:role="desc">Log file with information about the text-to-speech process.</p> </p:documentation> <p:pipe step="status" port="tts-log"/> </p:output> <p:serialization port="tts-log" indent="true" omit-xml-declaration="false"/> <p:option name="epub3" required="true" px:output="result" px:type="anyDirURI"> <p:documentation xmlns="http://www.w3.org/1999/xhtml"> <h2 px:role="name">Intermediary EPUB 3 with media-overlays</h2> <p>Note that the conversion may fail but still output an EPUB 3 document.</p> </p:documentation> </p:option> <p:option name="daisy202" required="true" px:output="result" px:type="anyDirURI"> <p:documentation xmlns="http://www.w3.org/1999/xhtml"> <h2 px:role="name">DAISY 2.02</h2> </p:documentation> </p:option> <p:option name="daisy3" required="true" px:output="result" px:type="anyDirURI"> <p:documentation xmlns="http://www.w3.org/1999/xhtml"> <h2 px:role="name">DAISY 3</h2> </p:documentation> </p:option> <p:output port="validation-report" sequence="true" px:media-type="application/vnd.pipeline.report+xml"><p:documentation xmlns="http://www.w3.org/1999/xhtml"> <h2 px:role="name">Validation reports</h2> </p:documentation> <p:pipe step="load" port="validation-report"/> </p:output> <p:output port="status" px:media-type="application/vnd.pipeline.status+xml"> <p:pipe step="status" port="result"/> </p:output> <p:import href="http://www.daisy.org/pipeline/modules/epub-utils/library.xpl"> <p:documentation> px:epub-load </p:documentation> </p:import> <p:import href="http://www.daisy.org/pipeline/modules/fileset-utils/library.xpl"> <p:documentation> px:fileset-store px:fileset-delete </p:documentation> </p:import> <p:import href="epub-to-daisy.xpl"> <p:documentation> px:epub-to-daisy </p:documentation> </p:import> <px:epub-load name="load" px:message="Loading EPUB" px:progress="1/20"> <p:with-option 
name="href" select="$source"/> <p:with-option name="validation" select="not($validation='off')"/> <p:with-option name="temp-dir" select="$temp-dir"/> </px:epub-load> <p:sink/> <p:identity> <p:input port="source"> <p:pipe step="load" port="validation-status"/> </p:input> </p:identity> <p:choose> <p:when test="/d:validation-status[@result='error']"> <p:choose> <p:when test="$validation='abort'"> <p:identity px:message="The EPUB input is invalid. See validation report for more info." px:message-severity="ERROR"/> </p:when> <p:otherwise> <p:identity px:message="The EPUB input is invalid. See validation report for more info." px:message-severity="WARN"/> </p:otherwise> </p:choose> </p:when> <p:otherwise> <p:identity/> </p:otherwise> </p:choose> <p:choose name="status" px:progress="19/20"> <p:when test="/d:validation-status[@result='error'] and $validation='abort'"> <p:output port="result" primary="true"/> <p:output port="tts-log" sequence="true"> <p:empty/> </p:output> <p:identity/> </p:when> <p:otherwise> <p:output port="result" primary="true"/> <p:output port="tts-log" sequence="true"> <p:pipe step="convert" port="tts-log"/> </p:output> <px:epub-to-daisy name="convert" px:message="Converting to DAISY" px:progress="15/19"> <p:input port="source.fileset"> <p:pipe step="load" port="result.fileset"/> </p:input> <p:input port="source.in-memory"> <p:pipe step="load" port="result.in-memory"/> </p:input> <p:with-option name="epub3-output-dir" select="$epub3"/> <p:with-option name="daisy202-output-dir" select="$daisy202"/> <p:with-option name="daisy3-output-dir" select="$daisy3"/> <p:with-option name="tts" select="$tts"/> <p:input port="tts-config"> <p:pipe step="main" port="tts-config"/> </p:input> <p:with-option name="include-tts-log" select="$include-tts-log='true'"/> <p:with-option xmlns:_="tts" name="stylesheet" select="string-join( for $s in tokenize($_:stylesheet,'\s+')[not(.='')] return resolve-uri($s,$source), ' ')"/> <p:with-option name="stylesheet-parameters" 
select="$stylesheet-parameters"/> <p:with-option name="lexicon" select="for $l in tokenize($lexicon,'\s+')[not(.='')] return resolve-uri($l,$source)"/> <p:with-option name="temp-dir" select="$temp-dir"/> </px:epub-to-daisy> <px:fileset-store name="store-epub3" px:message="Storing intermediary EPUB 3" px:progress="1/19"> <p:input port="fileset.in"> <p:pipe step="convert" port="epub3.fileset"/> </p:input> <p:input port="in-memory.in"> <p:pipe step="convert" port="epub3.in-memory"/> </p:input> </px:fileset-store> <p:choose name="store-daisy202" px:progress="1/19"> <p:xpath-context> <p:pipe step="convert" port="daisy202.fileset"/> </p:xpath-context> <p:when test="//d:file"> <p:output port="result"> <p:pipe step="store" port="fileset.out"/> </p:output> <px:fileset-store name="store" px:progress="1" px:message="Storing DAISY 2.02"> <p:input port="fileset.in"> <p:pipe step="convert" port="daisy202.fileset"/> </p:input> <p:input port="in-memory.in"> <p:pipe step="convert" port="daisy202.in-memory"/> </p:input> </px:fileset-store> </p:when> <p:otherwise> <p:output port="result"/> <p:identity> <p:input port="source"> <p:pipe step="convert" port="daisy202.fileset"/> </p:input> </p:identity> </p:otherwise> </p:choose> <p:choose name="store-daisy3" px:progress="1/19"> <p:xpath-context> <p:pipe step="convert" port="daisy3.fileset"/> </p:xpath-context> <p:when test="//d:file"> <p:output port="result"> <p:pipe step="store" port="fileset.out"/> </p:output> <px:fileset-store name="store" px:progress="1" px:message="Storing DAISY 3"> <p:input port="fileset.in"> <p:pipe step="convert" port="daisy3.fileset"/> </p:input> <p:input port="in-memory.in"> <p:pipe step="convert" port="daisy3.in-memory"/> </p:input> </px:fileset-store> </p:when> <p:otherwise> <p:output port="result"/> <p:identity> <p:input port="source"> <p:pipe step="convert" port="daisy3.fileset"/> </p:input> </p:identity> </p:otherwise> </p:choose> <p:identity cx:depends-on="store-epub3"> <p:input port="source"> <p:pipe 
step="convert" port="temp-audio-files"/> </p:input> </p:identity> <p:identity cx:depends-on="store-daisy3"/> <px:fileset-delete cx:depends-on="store-daisy202" name="delete-temp-files" px:progress="1/19" px:message="Cleaning up temporary files"/> <p:identity cx:depends-on="delete-temp-files"> <p:input port="source"> <p:inline><d:validation-status result="ok"/></p:inline> </p:input> </p:identity> </p:otherwise> </p:choose> </p:declare-step>