html2latex - user manual
Running html2latex
The program accepts the following command-line arguments:
- -input <fileName> input HTML file
- -output <fileName> output LaTeX file
- -css <fileName> file with CSS (optional)
- -config <file> file with configuration (optional; config.xml is used by default)
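As a small illustration of how the arguments combine, the sketch below assembles an argument list from the four options listed above. The helper name build_arguments and the file names are hypothetical. How the program itself is launched is not covered here, so the list contains only the arguments and no launcher command.

```python
from typing import List, Optional


def build_arguments(input_file: str, output_file: str,
                    css_file: Optional[str] = None,
                    config_file: Optional[str] = None) -> List[str]:
    """Assemble html2latex arguments (hypothetical helper, not part of the program)."""
    # -input and -output are always passed.
    args = ["-input", input_file, "-output", output_file]
    # -css is optional; leave it out when no stylesheet is given.
    if css_file is not None:
        args += ["-css", css_file]
    # -config is optional; when omitted, the program falls back to config.xml.
    if config_file is not None:
        args += ["-config", config_file]
    return args


if __name__ == "__main__":
    # File names are placeholders chosen for this example.
    print(" ".join(build_arguments("page.html", "page.tex", css_file="style.css")))
```

Leaving out config_file relies on the default config.xml described above.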
Configuration
All configuration options are stored in an XML file (config.xml).
The format of the configuration is described at the beginning of that file. The file must
be well-formed and must match its XML Schema (config.xsd).
Features
- CSS is supported (but only properties that define formatting)
- only basic style names are supported (e.g. .class, p.class, #id, p#id); multiple
style names are supported (e.g. h1, h2 { ... });
entries like ul li li or p>span are not supported
- mappings between HTML tags and LaTeX commands can be defined
- mappings between HTML entities (both named and numerical) and LaTeX commands can be defined
- mappings between CSS properties and LaTeX commands can be defined
- hyperlinks can be converted to footnotes, bibliography items, or links using the
hyperref package, or they can be ignored
- HTML comments are converted to LaTeX comments
- HTML comments starting with "LaTex:" (case is ignored) are written to the output
file as regular content rather than as comments (so LaTeX commands can be included in the HTML file)
- the program tries to cope with badly formed HTML documents, e.g.
<b><i>foo</b></i>
will be converted as
<b><i>foo</i></b>
Still, it is strongly recommended to convert only valid (or at least completely
well-formed) documents.
- title and cite attributes are converted to footnotes
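The style-name rules above can be checked ahead of a conversion with a short script. This is a minimal sketch and not the program's own CSS parser: the regular expression and the function name is_supported_selector are assumptions made for illustration. A plain element name such as h1 is treated as supported because the multiple-style-name example uses one.

```python
import re

# One basic style name: an optional element name, optionally followed by a
# single .class or #id part (e.g. "p", ".note", "p.note", "#top", "p#top").
# Assumed pattern for illustration; html2latex may accept a different set.
_BASIC_NAME = re.compile(r"^(?:[A-Za-z][A-Za-z0-9]*)?(?:[.#][A-Za-z_][\w-]*)?$")


def is_supported_selector(selector: str) -> bool:
    """Return True if every comma-separated part is a basic style name."""
    parts = [part.strip() for part in selector.split(",")]
    # An empty part (e.g. from a stray comma) is rejected explicitly, because
    # the pattern alone would also match the empty string.
    return all(part != "" and _BASIC_NAME.match(part) is not None
               for part in parts)


if __name__ == "__main__":
    for name in [".class", "p.class", "#id", "p#id", "h1, h2", "ul li li", "p>span"]:
        status = "supported" if is_supported_selector(name) else "not supported"
        print(f"{name}: {status}")
```

Descendant and child selectors are rejected because the space and > characters fall outside the pattern.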
Conversion
Tables
The special table tag attribute latexcols is recognized. Its content
is put just after the \begin{tabular} command.
Example:
<table latexcols='|l|p{5cm}|r|' border='1'>