htmltoc allows you to specify "significant elements" that will be hyperlinked to in
a "Table of Contents" (ToC) for a given set of HTML documents.
The generated ToC is a multi-level list containing links to the
significant elements. htmltoc places the link to each significant element in the ToC
at a level specified by the user.
For example, if H1s are specified as level 1, then they appear in the first-level list of the
ToC. If H2s are specified as level 2, then they appear in a second-level list
in the ToC.
A ToC map file tells htmltoc which elements are significant and at
what level they should occur in the ToC.
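The relationship between levels and nested lists can be sketched in a few lines of Python. This is only an illustration of the idea, not htmltoc's own code: the function name render_toc, the (level, text, href) tuples, and the exact markup emitted are all assumptions made for the example.

```python
def render_toc(entries, list_tag="UL"):
    """Render (level, text, href) tuples as a nested HTML list.

    Illustrative only: the markup htmltoc really emits may differ.
    """
    lines = []
    depth = 0  # number of lists currently open
    for level, text, href in entries:
        # Open lists until we reach the level of this entry.
        while depth < level:
            lines.append("  " * depth + f"<{list_tag}>")
            depth += 1
        # Close lists when the entry is at a shallower level.
        while depth > level:
            depth -= 1
            lines.append("  " * depth + f"</{list_tag}>")
        lines.append("  " * depth + f'<LI><A HREF="{href}">{text}</A>')
    # Close whatever is still open.
    while depth > 0:
        depth -= 1
        lines.append("  " * depth + f"</{list_tag}>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_toc([
        (1, "Introduction", "doc.html#xtocid1"),  # e.g. an H1
        (2, "Scope", "doc.html#xtocid2"),         # e.g. an H2
        (1, "Usage", "doc.html#xtocid3"),
    ]))
```

Each level 2 entry ends up in a list nested inside the level 1 list, which is the structure described above.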
In standard operation, the created ToC is sent to standard output, or to the file specified by the -toc command-line option. For more information on controlling the contents of the created ToC, see Formatting the ToC.
htmltoc also supports the ability to incorporate the ToC into the HTML
document itself via the -inline command-line option. This only works if a single
HTML file is being processed. See Inlining the ToC for more information.
In order for htmltoc to support linking to significant elements, htmltoc inserts
anchors into the significant elements. Since this requires modification of the
original HTML document(s), the originals are backed up with a ".org" suffix
appended to the filenames.
The following sections give more information on htmltoc:
htmltoc is invoked from a Unix shell, with the following syntax:
% htmltoc [options] file ... > tocfile
% htmltoc -toc tocfile [options] file ...
% htmltoc -inline [options] file
The following options are available:
-footer filename
-header filename
-help
Print a usage summary for htmltoc.
-inline
-noorg
htmltoc normally backs up the original HTML files with a ".org"
extension. When this option is specified, htmltoc will remove the backup
files when done.
-ol
Use ordered lists (OL) for the ToC entries.
-prefix string
Use string as the prefix for the NAME/HREF attributes in
anchors (A) for linking ToC entries to the document(s). The default prefix
is "xtocid".
-quiet
htmltoc normally prints out informative messages on what it is doing.
This option suppresses these messages.
-textonly
By default, htmltoc preserves HTML markup that exists in a significant
element so that it appears in the ToC. This option tells htmltoc to use only the
textual content of the significant element. Even if this option is not specified, htmltoc
still ignores the following tags: A, HR, P, IMG.
-title string
Set the title (TITLE element) of the generated ToC document to string.
This option has no effect if the -header or -inline options are specified. The
default title is "Table of Contents".
-toc file
Tells htmltoc to output the ToC to file.
-toclabel string
Use string as the heading, or label, placed before the list of ToC entries. The default is "<H1>Table of Contents</H1>".
-tocmap filename
Use filename as the ToC map file. By default, htmltoc only indexes H1s
and H2s. See ToC Map File for more information.
-useorg
Tells htmltoc to use the ".org" backup files that already exist.
In normal operation, htmltoc copies the files to be processed to the same
filenames with ".org" suffixes. Then, htmltoc reads the ".org" files to
find significant elements, and writes the new (modified) files to the
filenames without the ".org" suffix. This operation gives the appearance
that the files were edited in-place.
In other words, the -useorg option tells htmltoc not to perform the
initial copying of the files to ".org" files. However, if a ".org" file does not
exist for a given file, htmltoc will perform the initial copy operation.
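The backup behavior described for -useorg can be sketched as follows. This is a minimal Python sketch under stated assumptions, not htmltoc's implementation: the function name prepare_backup and its use_org parameter are hypothetical, chosen only to mirror the option.

```python
import shutil
from pathlib import Path


def prepare_backup(path, use_org=False):
    """Return the ".org" file that should be read for `path`.

    Hypothetical helper mirroring the description above: the original is
    copied to "<name>.org" unless use_org is set and a backup already exists.
    """
    src = Path(path)
    backup = src.with_name(src.name + ".org")
    if not (use_org and backup.exists()):
        # Normal operation, or -useorg with no existing backup:
        # perform the initial copy.
        shutil.copyfile(src, backup)
    return backup
```

The caller would then read significant elements from the returned ".org" file and write the modified document back to the original filename.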
While running, htmltoc normally prints informative
messages about what it is doing or has done. These messages can be suppressed
via the -quiet option.
The ToC map file tells htmltoc which significant elements to include in
the ToC, at what level they should appear in the ToC, and any text to include before
and/or after the ToC entry. The format of the map file is as follows:
significant_element:level:sig_element_end:before_text,after_text
significant_element:level:sig_element_end:before_text,after_text
...
Each line of the map file contains a series of fields separated by the `:' character.
The definition of each field is as follows:
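significant_element
The tag name of the significant element. Examples: H1, H2, H5.
This field is case-insensitive.
level
The level at which the significant element appears in the ToC.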
sig_element_end
The tag that marks the end of the significant element. Example: The DT tag is a marker in HTML and not a container. However,
one can index DT sections of a definition list by using the value DD in the
sig_element_end field (this does assume that each DT has a DD following it).
If the sig_element_end is empty, then the corresponding end tag of the
specified significant_element is used. Example: If H1 is the
significant_element, then htmltoc looks for a "</H1>" for terminating the
significant_element.
Caution: the sig_element_end value should not contain the `<' and `>' tag
delimiters. If you want the sig_element_end to be the end tag of an
element other than the significant_element, then use "/element_name".
The sig_element_end field is case-insensitive.
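before_text,after_text
Text to place before and after the ToC entry. The two values are separated
by the `,' character (which implies a comma cannot be contained
in the before/after text). See the examples that follow for the use of this field.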
Following are a few examples to help illustrate how a ToC map file works.
The following map file reflects the default mapping htmltoc uses if no map file is
explicitly specified:
# Default mapping for htmltoc
# Comments can be inserted in the map file via the '#' character
H1:1 # H1 are level 1 ToC entries
H2:2 # H2 are level 2 ToC entries
# A ToC map file that adds some formatting
H1:1::<STRONG>,</STRONG> # Make level 1 ToC entries <STRONG>
H2:2::<EM>,</EM> # Make level 2 entries <EM>
H3:3 # Make level 3 entries as is
# A ToC map file that can work for Glossary type documents
H1:1
H2:2
DT:3:DD:<EM>,</EM> # Assumes document has a DD for each DT, otherwise ToC
# will get entries with a lot of text.
# A ToC map file that wraps ToC entries in header tags. This is illegal
# HTML, but it looks pretty good in Mosaic.
H1:1::<H3>,</H3>
H2:2::<H4>,</H4>
H3:3::<H5>,</H5>
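To make the field layout concrete, here is a minimal Python sketch of how a map file in this format could be parsed. It is not htmltoc's parser: the names MapEntry and parse_tocmap are hypothetical, and the sketch assumes that a '#' anywhere on a line starts a comment and that the before/after text is everything after the third `:'.

```python
from typing import List, NamedTuple


class MapEntry(NamedTuple):
    element: str  # significant_element, upper-cased (field is case-insensitive)
    level: int    # ToC level
    end: str      # terminating tag name, without the < and > delimiters
    before: str   # text placed before the ToC entry
    after: str    # text placed after the ToC entry


def parse_tocmap(text: str) -> List[MapEntry]:
    """Parse map-file text into entries (illustrative sketch only)."""
    entries = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # assumption: '#' starts a comment
        if not line:
            continue
        fields = line.split(":", 3)
        fields += [""] * (4 - len(fields))   # trailing fields may be omitted
        element, level, end, wrap = fields
        element = element.strip().upper()
        # An empty sig_element_end means the element's own end tag.
        end = end.strip().upper() or "/" + element
        before, _, after = wrap.partition(",")
        entries.append(MapEntry(element, int(level), end, before, after))
    return entries


if __name__ == "__main__":
    sample = "H1:1\nH2:2\nDT:3:DD:<EM>,</EM> # glossary terms\n"
    for entry in parse_tocmap(sample):
        print(entry)
```

Splitting on at most three colons keeps any `:' that appears in the before/after text intact, while the comma restriction noted above still applies.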
htmltoc has other options to affect the final appearance of the ToC file created.
With the -header option, htmltoc will prepend the contents of the file before the
generated ToC. This allows you to have introductory text, or any other text, before
the ToC.
The header file should contain the opening HTML tag, the HEAD element (containing the TITLE element), and
the opening BODY tag. However, these tags/elements should not be in the
header file if the -inline option is used. See Inlining the ToC for
information on what the header file should contain for inlining the ToC.
With the -footer option, htmltoc will append the contents of the file after the
generated ToC.
The footer file should contain the closing BODY and HTML
tags.
If either the -header or -footer option is not specified, htmltoc adds the
appropriate HTML markup to ensure that a valid HTML document is created for the ToC.
If you do not want or need to deal with header and footer files, htmltoc
lets you specify the title of the ToC file with the -title option, and
a heading, or label, to put before the list of ToC entries with the -toclabel option. Both options
have default values; see Usage for more information on each option.
htmltoc supports the ability to incorporate the ToC directly into an HTML
document via the -inline option. Inlining can only occur if one, and ONLY one,
HTML file is being processed, AND the HTML file contains an opening BODY tag.
The ToC generated is inserted right after the opening BODY tag, and before any
other HTML markup in the file. If the -header option is specified, then the contents
of the specified file are inserted after the BODY tag, but before the ToC. Otherwise,
htmltoc inserts the text specified by the -toclabel option.
When inlining, the header file should not contain the HTML tag and HEAD
element, since the HTML file being processed should already contain
these tags/elements.
htmltoc.
htmltoc is smart enough to detect anchors inside significant elements. If
the anchor defines the NAME attribute, htmltoc uses the value. Otherwise, it
adds its own NAME attribute to the anchor.
htmltoc will not process files related to command-line options if they are
also specified to be processed for ToC significant elements. Example: The
command, "htmltoc -header header.html -toc toc.html
*.html" will cause header.html and toc.html to be included in the
HTML files to be processed due to shell filename globbing of "*.html".
htmltoc is smart enough to detect this, and exempts header.html
and toc.html from being processed for ToC significant elements.
The TITLE element is treated specially if specified in the ToC map file. It
is illegal to insert anchors (A) into TITLE elements. Therefore, htmltoc
will actually link to the filename itself instead of the TITLE element of the
document.
htmltoc will ignore a significant element if it does not contain any
non-whitespace characters. A warning message is generated if such a
condition exists.
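The anchor handling and the empty-element rule from the notes above can be illustrated with a short Python sketch. It is a rough approximation built on regular expressions, not htmltoc's actual logic: the function anchor_element, its counter argument, and the choice to test for emptiness after stripping tags are all assumptions made for the example. Only the default prefix "xtocid" comes from the option description.

```python
import re

# Finds the value of a NAME attribute on an existing anchor, if any.
NAME_RE = re.compile(r'<A\b[^>]*\bNAME\s*=\s*"?([^"\s>]+)', re.IGNORECASE)
# Finds the start of an anchor tag (but not, say, <ADDRESS>).
ANCHOR_RE = re.compile(r"<A\b", re.IGNORECASE)
TAG_RE = re.compile(r"<[^>]*>")


def anchor_element(content, counter, prefix="xtocid"):
    """Return (new_content, anchor_name) for one significant element.

    anchor_name is None when the element is skipped as empty.
    """
    # Assumption: "empty" is judged on the text left after removing tags.
    if not TAG_RE.sub("", content).strip():
        return content, None
    match = NAME_RE.search(content)
    if match:
        # Existing anchor already defines NAME: reuse its value.
        return content, match.group(1)
    name = f"{prefix}{counter}"
    if ANCHOR_RE.search(content):
        # Existing anchor without NAME: add the attribute to it.
        updated = ANCHOR_RE.sub(
            lambda m: f'{m.group(0)} NAME="{name}"', content, count=1
        )
        return updated, name
    # No anchor at all: wrap the content in a new one.
    return f'<A NAME="{name}">{content}</A>', name
```

A ToC entry would then link to the file name followed by "#" and the returned anchor name.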