xA;add the "regex" attribute, as follows: ~~~xml <lexicon xmlns="http://www.w3.org/2005/01/pronunciation-lexicon" version="1.0" alphabet="ipa" xml:lang="en"> <lexeme regex="true"> <grapheme>([0-9]+)-([0-9]+)</grapheme> <alias>between $1 and $2</alias> </lexeme> </lexicon> ~~~ The regex feature works only with alias-based substitutions. The regex syntax used is that from [XQuery 1.0 and XPath 2.0](https://www.w3.org/TR/xpath-functions/#regex-syntax). Whether or not the regex attribute is set to "true", the grapheme matching can be made more accurate by specifying the "positive-lookahead" and "negative-lookahead" attributes: ~~~xml <lexicon version="1.0" xmlns="http://www.w3.org/2005/01/pronunciation-lexicon" alphabet="ipa" xml:lang="en"> <lexeme> <grapheme positive-lookahead="[ ]+is">SB</grapheme> <alias>somebody</alias> </lexeme> <lexeme> <grapheme>SB</grapheme> <alias>should be</alias> </lexeme> <lexeme xml:lang="fr"> <grapheme positive-lookahead="[ ]+[cC]ity">boston</grapheme> <phoneme>bɔstøn</phoneme> </lexeme> </lexicon> ~~~ Graphemes with "positive-lookahead" will match if the beginning of what follows matches the "positive-lookahead" pattern. Graphemes with "negative-lookahead" will match if the beginning of what follows does not match the "negative-lookahead" pattern. The lookaheads are case-sensitive while the grapheme contents are not. The lexemes are matched in this order: 1. Graphemes with regex="false" come first, whether or not they have a lookahead; 2. then come graphemes with regex="true" and no lookahead; 3. then graphemes with regex="true" and one or two lookaheads. Within these categories, lexemes are matched in the same order as they appear in the lexicons."><p px:role="desc" xml:space="preserve">A list of PLS lexicons to take into account. Must be a space-separated list of URIs, absolute or relative to the input. 
Lexicons can also be attached to the source document, using a ['link' element](http://kb.daisy.org/publishing/docs/text-to-speech/pls.html#ex-07).

PLS lexicons allow you to define custom pronunciations of words. They are meant to help TTS processors deal with ambiguous abbreviations and the pronunciation of proper names. When a word is defined in a lexicon, the processor will use the provided pronunciation instead of the default rendering.

The syntax of a PLS lexicon is defined in [Pronunciation Lexicon Specification (PLS) Version 1.0](https://www.w3.org/TR/pronunciation-lexicon), extended with regular expression matching. To enable regular expression matching, add the "regex" attribute, as follows:

~~~xml
<lexicon xmlns="http://www.w3.org/2005/01/pronunciation-lexicon" version="1.0" alphabet="ipa" xml:lang="en">
  <lexeme regex="true">
    <grapheme>([0-9]+)-([0-9]+)</grapheme>
    <alias>between $1 and $2</alias>
  </lexeme>
</lexicon>
~~~

The regex feature works only with alias-based substitutions. The regex syntax used is that from [XQuery 1.0 and XPath 2.0](https://www.w3.org/TR/xpath-functions/#regex-syntax).

Whether or not the regex attribute is set to "true", the grapheme matching can be made more accurate by specifying the "positive-lookahead" and "negative-lookahead" attributes:

~~~xml
<lexicon version="1.0" xmlns="http://www.w3.org/2005/01/pronunciation-lexicon" alphabet="ipa" xml:lang="en">
  <lexeme>
    <grapheme positive-lookahead="[ ]+is">SB</grapheme>
    <alias>somebody</alias>
  </lexeme>
  <lexeme>
    <grapheme>SB</grapheme>
    <alias>should be</alias>
  </lexeme>
  <lexeme xml:lang="fr">
    <grapheme positive-lookahead="[ ]+[cC]ity">boston</grapheme>
    <phoneme>bɔstøn</phoneme>
  </lexeme>
</lexicon>
~~~

Graphemes with "positive-lookahead" will match if the beginning of what follows matches the "positive-lookahead" pattern. Graphemes with "negative-lookahead" will match if the beginning of what follows does not match the "negative-lookahead" pattern.
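
As a further illustration of "negative-lookahead", here is a minimal sketch that is not taken from an existing lexicon: the "Dr" abbreviation, its two readings and the lookahead pattern are assumptions chosen only for the example.

~~~xml
<lexicon version="1.0" xmlns="http://www.w3.org/2005/01/pronunciation-lexicon" alphabet="ipa" xml:lang="en">
  <!-- hypothetical rule: read "Dr" as "drive" unless a space and a capital letter follow -->
  <lexeme>
    <grapheme negative-lookahead="[ ]+[A-Z]">Dr</grapheme>
    <alias>drive</alias>
  </lexeme>
  <!-- hypothetical fallback for the remaining cases, such as "Dr Smith" -->
  <lexeme>
    <grapheme>Dr</grapheme>
    <alias>doctor</alias>
  </lexeme>
</lexicon>
~~~

Under these assumptions, "Elm Dr is closed" should be read with "drive", while in "Dr Smith" the first lexeme does not match and the second one should apply.
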
The lookaheads are case-sensitive while the grapheme contents are not.

The lexemes are matched in this order:

1. Graphemes with regex="false" come first, whether or not they have a lookahead;
2. then come graphemes with regex="true" and no lookahead;
3. then graphemes with regex="true" and one or two lookaheads.

Within these categories, lexemes are matched in the same order as they appear in the lexicons.</p> </p:documentation> </p:option> <p:option name="include-tts-log" select="p:system-property('d:org.daisy.pipeline.tts.log')" px:type="boolean"><p:documentation xmlns="http://www.w3.org/1999/xhtml"> <h2 px:role="name">Enable TTS log</h2> <p px:role="desc" xml:space="preserve">Whether or not to make the TTS log available. The TTS log contains a great deal of additional information that is not present in the main job log and that is helpful for troubleshooting. Most of the log entries concern particular chunks of text of the input document. The default can be changed using the [`org.daisy.pipeline.tts.log`](http://daisy.github.io/pipeline/Get-Help/User-Guide/Text-To-Speech/#common-settings) property. 
</p> </p:documentation> </p:option> <p:output port="tts-log" sequence="true"><p:documentation xmlns="http://www.w3.org/1999/xhtml"> <h2 px:role="name">TTS log</h2> <p px:role="desc">Log file with information about the text-to-speech process.</p> </p:documentation> <p:pipe step="status" port="tts-log"/> </p:output> <p:serialization port="tts-log" indent="true" omit-xml-declaration="false"/> <p:option name="epub3" required="true" px:output="result" px:type="anyDirURI"> <p:documentation xmlns="http://www.w3.org/1999/xhtml"> <h2 px:role="name">Intermediary EPUB 3 with media-overlays</h2> <p>Note that the conversion may fail but still output an EPUB 3 document.</p> </p:documentation> </p:option> <p:option name="daisy202" required="true" px:output="result" px:type="anyDirURI"> <p:documentation xmlns="http://www.w3.org/1999/xhtml"> <h2 px:role="name">DAISY 2.02</h2> </p:documentation> </p:option> <p:option name="daisy3" required="true" px:output="result" px:type="anyDirURI"> <p:documentation xmlns="http://www.w3.org/1999/xhtml"> <h2 px:role="name">DAISY 3</h2> </p:documentation> </p:option> <p:output port="validation-report" sequence="true" px:media-type="application/vnd.pipeline.report+xml"><p:documentation xmlns="http://www.w3.org/1999/xhtml"> <h2 px:role="name">Validation reports</h2> </p:documentation> <p:pipe step="load" port="validation-report"/> </p:output> <p:output port="status" px:media-type="application/vnd.pipeline.status+xml"> <p:pipe step="status" port="result"/> </p:output> <p:import href="http://www.daisy.org/pipeline/modules/epub-utils/library.xpl"> <p:documentation> px:epub-load </p:documentation> </p:import> <p:import href="http://www.daisy.org/pipeline/modules/fileset-utils/library.xpl"> <p:documentation> px:fileset-store px:fileset-delete </p:documentation> </p:import> <p:import href="epub-to-daisy.xpl"> <p:documentation> px:epub-to-daisy </p:documentation> </p:import> <px:epub-load name="load" px:message="Loading EPUB" px:progress="1/20"> <p:with-option 
name="href" select="$source"/> <p:with-option name="validation" select="not($validation='off')"/> <p:with-option name="temp-dir" select="$temp-dir"/> </px:epub-load> <p:sink/> <p:identity> <p:input port="source"> <p:pipe step="load" port="validation-status"/> </p:input> </p:identity> <p:choose> <p:when test="/d:validation-status[@result='error']"> <p:choose> <p:when test="$validation='abort'"> <p:identity px:message="The EPUB input is invalid. See validation report for more info." px:message-severity="ERROR"/> </p:when> <p:otherwise> <p:identity px:message="The EPUB input is invalid. See validation report for more info." px:message-severity="WARN"/> </p:otherwise> </p:choose> </p:when> <p:otherwise> <p:identity/> </p:otherwise> </p:choose> <p:choose name="status" px:progress="19/20"> <p:when test="/d:validation-status[@result='error'] and $validation='abort'"> <p:output port="result" primary="true"/> <p:output port="tts-log" sequence="true"> <p:empty/> </p:output> <p:identity/> </p:when> <p:otherwise> <p:output port="result" primary="true"/> <p:output port="tts-log" sequence="true"> <p:pipe step="convert" port="tts-log"/> </p:output> <px:epub-to-daisy name="convert" px:message="Converting to DAISY" px:progress="15/19"> <p:input port="source.fileset"> <p:pipe step="load" port="result.fileset"/> </p:input> <p:input port="source.in-memory"> <p:pipe step="load" port="result.in-memory"/> </p:input> <p:with-option name="epub3-output-dir" select="$epub3"/> <p:with-option name="daisy202-output-dir" select="$daisy202"/> <p:with-option name="daisy3-output-dir" select="$daisy3"/> <p:with-option name="tts" select="$tts"/> <p:input port="tts-config"> <p:pipe step="main" port="tts-config"/> </p:input> <p:with-option name="include-tts-log" select="$include-tts-log='true'"/> <p:with-option xmlns:_="tts" name="stylesheet" select="string-join( for $s in tokenize($_:stylesheet,'\s+')[not(.='')] return resolve-uri($s,$source), ' ')"/> <p:with-option name="stylesheet-parameters" 
select="$stylesheet-parameters"/> <p:with-option name="lexicon" select="for $l in tokenize($lexicon,'\s+')[not(.='')] return resolve-uri($l,$source)"/> <p:with-option name="temp-dir" select="$temp-dir"/> </px:epub-to-daisy> <px:fileset-store name="store-epub3" px:message="Storing intermediary EPUB 3" px:progress="1/19"> <p:input port="fileset.in"> <p:pipe step="convert" port="epub3.fileset"/> </p:input> <p:input port="in-memory.in"> <p:pipe step="convert" port="epub3.in-memory"/> </p:input> </px:fileset-store> <p:choose name="store-daisy202" px:progress="1/19"> <p:xpath-context> <p:pipe step="convert" port="daisy202.fileset"/> </p:xpath-context> <p:when test="//d:file"> <p:output port="result"> <p:pipe step="store" port="fileset.out"/> </p:output> <px:fileset-store name="store" px:progress="1" px:message="Storing DAISY 2.02"> <p:input port="fileset.in"> <p:pipe step="convert" port="daisy202.fileset"/> </p:input> <p:input port="in-memory.in"> <p:pipe step="convert" port="daisy202.in-memory"/> </p:input> </px:fileset-store> </p:when> <p:otherwise> <p:output port="result"/> <p:identity> <p:input port="source"> <p:pipe step="convert" port="daisy202.fileset"/> </p:input> </p:identity> </p:otherwise> </p:choose> <p:choose name="store-daisy3" px:progress="1/19"> <p:xpath-context> <p:pipe step="convert" port="daisy3.fileset"/> </p:xpath-context> <p:when test="//d:file"> <p:output port="result"> <p:pipe step="store" port="fileset.out"/> </p:output> <px:fileset-store name="store" px:progress="1" px:message="Storing DAISY 3"> <p:input port="fileset.in"> <p:pipe step="convert" port="daisy3.fileset"/> </p:input> <p:input port="in-memory.in"> <p:pipe step="convert" port="daisy3.in-memory"/> </p:input> </px:fileset-store> </p:when> <p:otherwise> <p:output port="result"/> <p:identity> <p:input port="source"> <p:pipe step="convert" port="daisy3.fileset"/> </p:input> </p:identity> </p:otherwise> </p:choose> <p:identity cx:depends-on="store-epub3"> <p:input port="source"> <p:pipe 
step="convert" port="temp-audio-files"/> </p:input> </p:identity> <p:identity cx:depends-on="store-daisy3"/> <px:fileset-delete cx:depends-on="store-daisy202" name="delete-temp-files" px:progress="1/19" px:message="Cleaning up temporary files"/> <p:identity cx:depends-on="delete-temp-files"> <p:input port="source"> <p:inline><d:validation-status result="ok"/></p:inline> </p:input> </p:identity> </p:otherwise> </p:choose> </p:declare-step>