Theological Markup Language (ThML)
This document describes Theological Markup Language, a new markup
language that is being used to mark up texts for the Christian
Classics Ethereal Library and other projects.
This XML application can be thought of as HTML with additions for
electronic books and rich digital libraries, with special
support for theological needs such as scripture references and
Strong's numbers. When books have been prepared in ThML, many new
features become possible in the CCEL: subject and
scripture reference indexes for books and for the whole library,
intelligent searching, automatic conversion to other formats, "lining
up" documents in various ways such as parallel columns, and more.
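As a rough illustration of the scripture reference index mentioned above, here is a minimal Python sketch. The element names ThML.body, div1, and scripRef and the passage attribute are assumptions made for this example rather than quotations from the DTD, and the sample text is invented. It is not the code used on the CCEL.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Invented sample; element and attribute names are assumptions for illustration.
SAMPLE = """<ThML.body>
  <div1 id="i" title="Sermon I">
    <p>See <scripRef passage="Rom. 3:23">Rom. 3:23</scripRef>.</p>
  </div1>
  <div1 id="ii" title="Sermon II">
    <p>Compare <scripRef passage="John 3:16">John 3:16</scripRef> with
    <scripRef passage="Rom. 3:23">Romans iii. 23</scripRef>.</p>
  </div1>
</ThML.body>"""


def scripture_index(xml_text):
    """Map each cited passage to the ids of the sections that cite it."""
    root = ET.fromstring(xml_text)
    index = defaultdict(list)
    for section in root.iter("div1"):
        for ref in section.iter("scripRef"):
            passage = ref.get("passage")
            if passage:
                index[passage].append(section.get("id"))
    return dict(index)


if __name__ == "__main__":
    for passage, sections in sorted(scripture_index(SAMPLE).items()):
        print(passage, "->", ", ".join(sections))
```

Because the passage is carried in an attribute, differently worded citations of the same verse (here "Rom. 3:23" and "Romans iii. 23") can land under one index entry.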
Most of the work of preparing electronic texts for conversion to ThML
can be done in a word processor. These simple
guidelines are a good starting point. ThML documents can also be
prepared with an XML editor or text editor. The tools used with ThML
documents run under Unix or Windows. Tools currently exist for
converting from Word/RTF to ThML and from ThML to HTML webs.
The following information is available on ThML and related topics:
Resources.
- Perl programs for "finalizing" a ThML document for use on the
CCEL: progs-2004-09.zip
- A Perl program -- mkmeta -- which
will help you enter the bibliographic information for a book and
create the ThML head section, stored in a file called bookID.meta
- A Perl program -- thm2htm -- which will
convert a ThML file to an HTML web. It builds a table of contents
and indexes, handles notes appropriately, links scripture references
to the Bible Gateway, etc.
- Tools for
conversion of Word 2000 documents to ThML and HTML webs. These are
Perl scripts configured for Unix, though they would work under Windows
if the control script h2h were converted to h2h.bat. They are
still likely to have problems and are best used by someone who knows
Perl, but they make nice HTML editions -- see, for example,
NPNF1-01.
Some documentation on their use is also
available.
Tutorial: step-by-step instructions for
converting a Microsoft Word 2000 document to ThML.
Prerequisites: you must have Word 2000 and Perl installed.
(Steps are being linked as they are created; the tutorial is currently incomplete.)
Sample ThML documents. Several documents are available in
ThML -- these have an xml extension. They have been converted to
HTML webs, and in many cases an RTF source version is available as
well. If so, the ThML version is derived from the RTF version by
programs listed below. The HTML version is linked below; other
versions are accessible from the title page.
You should download the xml files to your disk and view them
with a text editor -- some browsers have trouble loading XML.
Note also that ThML which is automatically generated from an RTF
document is a bit grungy; clean, elegant ThML can only be created by
hand editing. See for example Watts' Psalms and Hymns, below.
Other Projects using ThML
For the SGML/XML types:
There is an XML DTD for ThML. The current version of the DTD (1.0)
is available (ThML10.zip).
This DTD includes (is a superset of, for the most part) the
Voyager (XML) DTD of HTML 4.0.
Software for validating ThML documents, converting
ThML to HTML, etc. is available for DOS (thmlx.zip) or Unix (thmlx.tgz). This software still has bugs,
but it works reasonably well.
On Unix, you will need to install the SP package -- see How to get SP.
You will also need to install rtf2xml if you are starting
from Word files.
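Before running a full validation, it can help to confirm that a file is at least well-formed XML. The following is a minimal Python sketch using only the standard library; it is not part of the thmlx package. It does not validate against the ThML DTD (that is the job of a validating parser such as the one in SP), and because it does not read the DTD, entities that are defined only there will be reported as errors.

```python
import sys
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return (True, None) if the text parses as XML, else (False, message)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return False, str(err)
    return True, None


if __name__ == "__main__":
    # Usage: python check_xml.py book1.xml book2.xml
    for path in sys.argv[1:]:
        with open(path, "rb") as handle:
            ok, message = check_well_formed(handle.read())
        print(path, "ok" if ok else "NOT well-formed: " + message)
```

The error message from the parser includes a line and column number, which is usually enough to find an unclosed or mismatched tag in a text editor.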
Help Wanted
- Try your hand at editing a ThML document.
- Write scripts to convert content to ThML. For example, write a
script to convert the Online Bible text files to ThML.
(Check with CCEL first about copyright, etc.)
- Prepare font mapping files to work with Skip Gaeda's program that
maps Greek and Hebrew fonts into Unicode. Or work on a program that
goes the other way: Unicode to particular Greek/Hebrew fonts.
- Write a Microsoft Word/Visual Basic for Applications program to
convert older CCEL Word documents to the stylesheet used for ThML.
- Write Word macros to help format documents in Word for ThML.
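For anyone considering the font mapping item above, the basic idea can be sketched in a few lines of Python. The mapping table here is entirely hypothetical: it describes an imaginary legacy Greek font, not any real font and not the file format of the existing mapping program. A real mapping file would need one entry per glyph, including accented and breathing-marked forms.

```python
# Hypothetical table for an imaginary legacy Greek font: the character typed
# in the legacy font on the left, the Unicode character it displays on the right.
LEGACY_TO_UNICODE = {
    "a": "\u03b1",  # alpha
    "b": "\u03b2",  # beta
    "g": "\u03b3",  # gamma
    "d": "\u03b4",  # delta
    "l": "\u03bb",  # lambda
    "o": "\u03bf",  # omicron
    "s": "\u03c3",  # sigma
    "j": "\u03c2",  # final sigma
}

# The reverse direction (Unicode back to the legacy font) is the inverted table.
UNICODE_TO_LEGACY = {uni: legacy for legacy, uni in LEGACY_TO_UNICODE.items()}


def to_unicode(text):
    """Convert legacy-font text to Unicode, leaving unmapped characters alone."""
    return "".join(LEGACY_TO_UNICODE.get(ch, ch) for ch in text)


def to_legacy(text):
    """Convert Unicode text back to the legacy font encoding."""
    return "".join(UNICODE_TO_LEGACY.get(ch, ch) for ch in text)


if __name__ == "__main__":
    word = to_unicode("logoj")
    print(word)
    print(to_legacy(word))
```

A one-to-one table like this only round-trips cleanly when no two legacy characters map to the same Unicode character; fonts that build accents from separate keystrokes would need a more careful approach.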