Step html-load {http://www.daisy.org/ns/pipeline/xproc}

Creates a fileset document for a set of HTML documents.

Defined in: http://www.daisy.org/pipeline/modules/html-utils/library.xpl

Input Ports

Port Description
source.fileset primary

The input fileset containing the HTML files (marked with media-type="text/html" or media-type="application/xhtml+xml").

This fileset is also used for loading other resources. Files that are present in memory are expected to be c:data documents. An attempt to load a file from disk is made only when the file is not present in this fileset.

source.in-memory sequence
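
As a rough sketch of how the two input ports might be wired up, an invocation from a surrounding pipeline could look like the following. Only px:html-load, its port names, and the library URI come from this page. The wrapper step ex:load-book, its namespace, the base URI, and the file name book.xhtml are made up for illustration. The d prefix is assumed to be bound to the DAISY Pipeline data namespace used for fileset documents.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:px="http://www.daisy.org/ns/pipeline/xproc"
                xmlns:d="http://www.daisy.org/ns/pipeline/data"
                xmlns:ex="http://example.org/ns/illustration"
                type="ex:load-book" version="1.0">

  <!-- Hypothetical wrapper step: exposes both result ports of px:html-load. -->
  <p:output port="fileset" primary="true">
    <p:pipe step="load" port="result.fileset"/>
  </p:output>
  <p:output port="in-memory" sequence="true">
    <p:pipe step="load" port="result.in-memory"/>
  </p:output>

  <!-- Library that defines px:html-load (see "Defined in" above). -->
  <p:import href="http://www.daisy.org/pipeline/modules/html-utils/library.xpl"/>

  <px:html-load name="load">
    <!-- Input fileset listing one XHTML file (hypothetical path and name). -->
    <p:input port="source.fileset">
      <p:inline>
        <d:fileset xml:base="file:/path/to/book/">
          <d:file href="book.xhtml" media-type="application/xhtml+xml"/>
        </d:fileset>
      </p:inline>
    </p:input>
    <!-- Nothing is supplied in memory here, so files would be read from disk. -->
    <p:input port="source.in-memory">
      <p:empty/>
    </p:input>
  </px:html-load>

</p:declare-step>
```

Binding source.in-memory to p:empty is only one possible choice. A pipeline that has already produced c:data documents for some of the files would connect them to that port instead.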

Output Ports

Port Description
result.fileset primary

The result fileset with the HTML files and all the resources referenced from the HTML. Some media types are inferred, so users may have to apply additional type detection. A @kind attribute annotates the kind of each resource:

  • stylesheet
  • media
  • image
  • video
  • audio
  • script
  • content
  • description
  • text-track
  • animation
  • font

The fileset only contains resources that actually exist on disk. The HTML documents are loaded into memory. The original-href attributes reflect which files are stored on disk.

result.in-memory sequence
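
For illustration only, a result fileset for a single XHTML file that references a stylesheet and an image might look roughly like the sketch below. This is not output captured from the step: the base URI, file names, media types, and attribute values are all hypothetical, and the actual serialization may differ. Whether the entry for the HTML document itself carries a @kind attribute is not stated on this page, so the sketch leaves it out.

```xml
<d:fileset xmlns:d="http://www.daisy.org/ns/pipeline/data"
           xml:base="file:/path/to/book/">
  <!-- The HTML document itself, loaded into memory by the step. -->
  <d:file href="book.xhtml"
          media-type="application/xhtml+xml"
          original-href="file:/path/to/book/book.xhtml"/>
  <!-- Resources referenced from the HTML, annotated with @kind. -->
  <d:file href="css/style.css"
          media-type="text/css"
          kind="stylesheet"
          original-href="file:/path/to/book/css/style.css"/>
  <d:file href="images/cover.png"
          media-type="image/png"
          kind="image"
          original-href="file:/path/to/book/images/cover.png"/>
</d:fileset>
```

Under these assumptions, a downstream step could select all images with an XPath expression such as /d:fileset/d:file[@kind='image'].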