Everything hsc knows about HTML, it retrieves from a file named hsc.prefs at startup. This file contains information about all tags, entities and icon entities. Additionally, some special attributes are set up there.
It is a serious problem with HTML that no one can give you a competent answer to the question ``Now which tags are part of HTML?''. On the one hand, there is w3c, which by now you can ignore; on the other hand, there are the developers of popular browsers, who implement whatever they like.
The hsc.prefs coming with this distribution should support most elements needed for everyday use. With the hsc V0.923 release, the prefs have been updated to HTML 4.01; since V0.925 there has also been support for automatic distinction between ``classic'' HTML and XHTML. If you run hsc in XHTML mode, some obsolete attributes will no longer be known, and new ones will be added.
Unless you specify a PrefsFile, hsc will look in several places when trying to open hsc.prefs:
PROGDIR:, which is automatically assigned to the same directory where the hsc binary resides when hsc is invoked
If it is unable to find hsc.prefs anywhere, it will abort with an error message.
If you want to find out where hsc has read hsc.prefs from, you can use STATUS=VERBOSE when invoking hsc. This will display the preferences used.
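The lookup described above can be pictured with a small Python sketch. It is only a model of the behaviour, not hsc's actual code: the function names `find_prefs` and `locate_or_abort` are made up for illustration, and the list of directories is supplied by the caller because the real search order is defined by hsc itself.

```python
from pathlib import Path


def find_prefs(directories, filename="hsc.prefs"):
    """Return the first existing prefs file among the candidate directories."""
    for directory in directories:
        candidate = Path(directory) / filename
        if candidate.is_file():
            return candidate
    return None


def locate_or_abort(directories, verbose=False):
    """Mimic 'abort with an error message' by raising an exception."""
    path = find_prefs(directories)
    if path is None:
        raise FileNotFoundError("unable to find hsc.prefs")
    if verbose:
        # Rough analogue of STATUS=VERBOSE: report which file was used.
        print(f"preferences read from {path}")
    return path
```

The first directory that contains the file wins, so a prefs file placed earlier in the search order shadows any later one.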
The <$defent> tag defines a new entity. The (required) attribute NAME declares the name of the entity, RPLC the character that should be replaced by this entity if found in the hsc-source, and NUM is the numeric representation of this entity.
NUM may be in the range 128-65535, allowing for any Unicode (UCS-2 to be exact) character to be assigned a corresponding entity. Definitions in the range 128-255 are done in the prefs-file to allow users with character sets other than ISO-8859-1 (Latin-1) to change the replacement characters; some other characters such as mathematical symbols or typographical entities are predefined internally by hsc. They reside at fixed positions in the Unicode charset and are unlikely to need changing.
<$defent NAME="uuml" RPLC="ü" NUM="252">
The ENTITYSTYLE commandline option affects the way hsc will render entities in the resulting HTML file. Setting the PREFNUM attribute for an entity will make it use the numeric representation if ENTITYSTYLE=replace, no matter what representation was used in the source text.
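As a rough illustration of the PREFNUM rule, the following Python sketch rewrites entity references the way the paragraph above describes for ENTITYSTYLE=replace. It is a model under assumptions, not hsc's implementation: the names `Entity`, `preferred_form` and `render_replace_style` are invented here, entities without PREFNUM are assumed to come out in symbolic form, and other ENTITYSTYLE settings are not modelled at all.

```python
import re
from dataclasses import dataclass


@dataclass
class Entity:
    name: str              # symbolic name, e.g. "uuml"
    num: int               # numeric code, e.g. 252
    prefnum: bool = False  # models the PREFNUM attribute


def preferred_form(entity):
    """Numeric reference if PREFNUM is set, symbolic otherwise (assumed)."""
    return f"&#{entity.num};" if entity.prefnum else f"&{entity.name};"


# Matches symbolic (&uuml;) and decimal numeric (&#252;) references.
_REFERENCE = re.compile(r"&(#[0-9]+|[A-Za-z][A-Za-z0-9]*);")


def render_replace_style(text, entities):
    """Rewrite every known reference to its preferred representation."""
    by_name = {e.name: e for e in entities}
    by_num = {e.num: e for e in entities}

    def swap(match):
        ref = match.group(1)
        if ref.startswith("#"):
            entity = by_num.get(int(ref[1:]))
        else:
            entity = by_name.get(ref)
        # Unknown references are passed through untouched.
        return preferred_form(entity) if entity is not None else match.group(0)

    return _REFERENCE.sub(swap, text)
```

With `uuml` flagged PREFNUM, both `&uuml;` and `&#252;` in the input end up as `&#252;`, which is the "no matter what representation was used in the source text" part.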
Unlike previous versions, hsc 0.931 and later allow redefinition of entities. In this case, symbolic and numeric representation must match the previous definition; only the PREFNUM flag and the RPLC character will be updated. This allows you to change the default rendering/replacement of internally defined entities. Warning #92 will be issued and should be ignored if you really want to do this.
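The redefinition rule can be summarised in a short Python model. This is a sketch only: `EntityTable` and its methods are hypothetical names, and raising `ValueError` on a mismatch or an out-of-range NUM is an assumption about how a violation might be reported, not a description of what hsc prints.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    name: str
    rplc: str
    num: int
    prefnum: bool = False


class EntityTable:
    def __init__(self):
        self.by_name = {}
        self.warnings = []  # warning numbers issued so far

    def define(self, name, rplc, num, prefnum=False):
        if not 128 <= num <= 65535:
            raise ValueError(f"NUM {num} is outside 128-65535")
        old = self.by_name.get(name)
        if old is None:
            # First definition: store everything.
            self.by_name[name] = Entity(name, rplc, num, prefnum)
            return
        if old.num != num:
            # Symbolic and numeric representation must match the old entry.
            raise ValueError(f"redefinition of {name} must keep NUM {old.num}")
        # Only RPLC and PREFNUM are updated on redefinition.
        old.rplc = rplc
        old.prefnum = prefnum
        self.warnings.append(92)
```

Defining `uuml` twice with the same NUM updates the entry and records warning 92; changing NUM in the second definition is rejected.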
This tag defines a new icon-entity. The only (required) attribute is NAME, which declares the name of the icon.
The <$deftag> tag defines a new tag and is used much like <$macro>, except that a tag definition requires no macro text and no end tag to follow it.
<$deftag IMG SRC:uri/x/z/r ALT:string ALIGN:enum("top|bottom|middle") ISMAP:bool WIDTH:string HEIGHT:string>
For those who are not smart enough or simply too lazy, here are some simple examples, which should also work somehow, though some features of hsc might not work:
<$deftag BODY /CLOSE BGCOLOR:string>
<$deftag IMG SRC:uri ALT:string ALIGN:string ISMAP:bool>
The <$defstyle> tag lets you define a new CSS property and optionally a list of values that are allowed for it. If you omit the VAL attribute, any value will be permitted. Otherwise it should be a list in pretty much the same style as for enum parameters: words (which may include spaces) separated by vertical bars (``|'').
<$defstyle name="text-align" val="left|center|right|justify">
<$defstyle name="text-indent" val="%P">
<$defstyle name="clip" val="%r|auto">
The text-align property has a short list of four possible values, so they are simply listed as an enumeration. text-indent, on the other hand, is numeric, so its values cannot be listed exhaustively. Therefore, a special code reminiscent of C-style format strings is used. The following are supported:
Like %n, but also allows percentages.
A color, e.g. for background-color. One of ``#rgb'' or ``rgb(r,g,b)'', where each of r, g and b may be a decimal value between 0 and 255 or a percentage between 0 and 100.
``...)'', e.g. for background-image. Only partly implemented as of V0.935!
%r: ``rect(a,b,c,d)'' with a, b, c and d being numeric specs with a dimension, e.g. for clip. Unimplemented as of V0.935!
Note: If both these percent-codes and an enumeration of values are used, as for ``clip'', the percent-code must be the first element!
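To make the VAL rules more concrete, here is a small Python sketch of how such a specification could be checked. It is a model built only from the description above, not hsc's parser: the function names are hypothetical, matching a value against a percent-code is left out except for the ``rgb(r,g,b)'' color form, and tolerating spaces inside ``rgb(...)'' is an assumption.

```python
import re


def parse_val(spec):
    """Split a VAL spec into (percent_codes, enumerated_words)."""
    parts = spec.split("|")
    for index, part in enumerate(parts):
        # A percent-code is only accepted as the very first element.
        if part.startswith("%") and index != 0:
            raise ValueError(f"percent-code {part} must be the first element")
    codes = [p for p in parts if p.startswith("%")]
    words = [p for p in parts if not p.startswith("%")]
    return codes, words


def matches_enum(value, spec=None):
    """True if value is one of the enumerated words; any value if VAL is omitted."""
    if spec is None:
        return True
    _, words = parse_val(spec)
    return value in words


_RGB = re.compile(r"rgb\(\s*(\d+%?)\s*,\s*(\d+%?)\s*,\s*(\d+%?)\s*\)", re.ASCII)


def is_rgb_function(value):
    """Check the rgb(r,g,b) form: each part is 0-255 or 0%-100%."""
    match = _RGB.fullmatch(value.strip())
    if match is None:
        return False
    for part in match.groups():
        if part.endswith("%"):
            if int(part[:-1]) > 100:
                return False
        elif int(part) > 255:
            return False
    return True
```

For instance, `parse_val("%r|auto")` is accepted, while `parse_val("auto|%r")` is rejected because the percent-code does not come first.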
This tag defines an attribute list shortcut to support your laziness when editing the prefs file. It allows you to collect an arbitrary number of attribute declarations under a single name that you can use later in <$deftag> or <$macro> tags by putting the shortcut name in square brackets:
<$deftag THEAD /AUTOCLOSE /LAZY=(__attrs) /MBI="table" [HVALIGN]>
This is the same as:
<$deftag THEAD /AUTOCLOSE /LAZY=(__attrs) /MBI="table"
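The effect of such a shortcut can be thought of as plain text substitution, sketched below in Python. This is only a mental model, not how hsc necessarily implements it: `expand_shortcuts` is a hypothetical helper, and the attribute declarations stored under HVALIGN in the usage note are placeholders chosen for illustration, not the contents of the real list.

```python
import re

# A shortcut reference is a name in square brackets, e.g. [HVALIGN].
_SHORTCUT = re.compile(r"\[([A-Za-z_][A-Za-z0-9_]*)\]")


def expand_shortcuts(definition, shortcuts):
    """Replace every [NAME] with the attribute declarations stored under NAME."""

    def swap(match):
        name = match.group(1)
        if name not in shortcuts:
            raise KeyError(f"unknown attribute list shortcut: {name}")
        return shortcuts[name]

    return _SHORTCUT.sub(swap, definition)
```

With `shortcuts = {"HVALIGN": "ALIGN:string VALIGN:string"}` (placeholder declarations), expanding the THEAD definition above yields the same line with `[HVALIGN]` replaced by those two attribute declarations.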
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
Browsers should read that line, obtain the DTD and parse the source according to it. The problem about DTDs: they are written in SGML. And the problem about SGML: It's awful. It's unreadable. It's a pure brain-wanking concept born by some wireheads probably never seriously thinking about using it themselves. Even when there is free code available to SGML-parse text.
As a result, only a few browsers supported this. Because it was so easy to write a browser that spat on the SGML-trash and simply parsed the code ``tag-by-tag'', developers decided to spend more time on making their product more user-friendly than computer-friendly (which is really understandable).
These browsers became even more popular when they supported tags certain people liked, but which were not part of DTDs. As DTDs were published by w3c, and w3c did not like those tags, they did not make it into DTDs for a long time or even not at all (which is really understandable, too).
This worked to a certain degree until HTML-2.0. Several people (at least most of the serious w3-authoring people) preferred to conform to w3c rather than use the funky-crazy-cool tags of some special browsers, and the funky-crazy-cool people did not care about DTDs or HTML-validators anyway.
However, after HTML-2.0, w3c fucked up. They proposed the infamous HTML-3.0 standard, which was never officially released, and tried to ignore things most browsers had already implemented (not all of which were useless crap, I daresay). After more than a year without any remarkable news from w3c, they finally canceled HTML-3.0, and instead came out with the pathetic HTML-0.32.
Nevertheless, many people were very happy about HTML-0.32, as it finally was a statement after which many things became clear. It became clear that you should not expect anything useful from w3c anymore. It became clear that the browser developers rule. It became clear that no one is going to provide useful DTDs in future, as browser developers are too lazy and incompetent to do so. It became clear that anarchy has broken out for HTML-specifications.
So, as a conclusion, reasons not to use DTDs but a custom format instead are:
Quite unexpectedly, with HTML-4.0 this has changed to some extent, as the DTDs are quite readable and well documented. The general syntax of course still sucks, error handling is unbearable for ``normal'' users and so on. Although it will take them more than this to get back the trust they abused in recent years, at least it is a little signal suggesting there are some small pieces of brain intact somewhere in this consortium.
There is also a disadvantage to this concept: reading hsc.prefs every time on startup takes an awful lot of time. Usually, processing your main data takes less time than reading the preferences. You can reduce this time if you create your own hsc.prefs with all tags and entities you don't need removed. But I recommend avoiding this, because you might have to edit your preferences again with the next update of hsc if any new features have been added.