Some Use-Cases

This chapter gives some hints on how you can do certain things you might be interested in. It is probably only of interest to experienced users.

Multi-lingual Documents

This section briefly outlines how to create multiple versions of one source file, for example to support more than one language. Say you want a document with a picture that is the same in all flavours, but with text depending on the language. Maybe you have a source file hugo.hsc with the interesting part looking like
<img src="hugo.gif" alt="hugo">
<english>This is Hugo.</english>
<suomi>Tämä on Hugo.</suomi>
and as a result, you want two documents: an English one
<img src="hugo.gif" alt="hugo">
This is Hugo.
and a Finnish one
<img src="hugo.gif" alt="hugo">
Tämä on Hugo.
This can easily be achieved by defining two macro sets, one stored as english.hsc
<$macro english /close><$content></$english>
<$macro suomi /close></$suomi>
and another one stored as suomi.hsc
<$macro english /close></$english>
<$macro suomi /close><$content></$suomi>

The first one defines two container macros: <english> simply inserts the whole content passed to it every time, while <suomi> always removes any content enclosed in it.

If you now invoke hsc with a call like
hsc english.hsc hugo.hsc to en-hugo.html
the result will look like the first output document described above. To get a result looking like the second one, you only have to use
hsc suomi.hsc hugo.hsc to fi-hugo.html

This is simply because the macros declared in suomi.hsc work just the other way round from those in english.hsc: everything enclosed in <english> will be ignored, and everything being part of <suomi> remains.

Of course you can have multiple occurrences of both macros, and of course you can define similar macros for other languages.
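For instance, a third macro set for German could be stored as a hypothetical deutsch.hsc, following exactly the same pattern (a sketch; the file name and macro name are made up for illustration):
<$macro english /close></$english>
<$macro suomi /close></$suomi>
<$macro deutsch /close><$content></$deutsch>
Note that english.hsc and suomi.hsc would then also need an empty <deutsch> macro, as every macro set must define all language tags appearing in the source. The invocation follows the same scheme:
hsc deutsch.hsc hugo.hsc to de-hugo.html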

Extended Character Encodings

This version of hsc officially supports only Latin-1 as its input character set. The exact definition of that is a bit messy, but basically it refers to most of those 256 characters you can input on your keyboard.

For this character set, all functions described herein should work, especially the commandline option RplcEnt.

Although Latin-1 is widely used within most decadent western countries, it does not provide all the characters some people might need, for instance those from China and Japan, as their writing systems work completely differently.

As the trivial idea of Latin-1 was to use 8 bits instead of the rotten 7 bits of ASCII (note that the ``A'' in ASCII stands for American), the trivial idea of popular encodings like JIS, Shift-JIS or EUC is to use 8 to 24 bits to encode one character.

Now what does hsc say if you feed such a document to it?

As long as you do not specify RPLCENT, it should work without much bother. However, you will need a w3-browser that can also display these encodings, and some fiddling with <META> and related tags.
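For example, a Shift-JIS document would typically announce its encoding in the document head like this (the charset value is the IANA-registered name; adjust it to whatever encoding you actually use):
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=Shift_JIS">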

If you think you are funny and enable RPLCENT, hsc will still not mind your input. But with great pleasure it will cut all your nice multi-byte characters into decadent western 8-bit ``cripplets'' (note the pun). And your browser will display loads of funny western characters - but not a single funny Japanese one.

Recently an old western approach to these encoding problems has gained popularity: Unicode - that's the name of the beast - was created as some waste product of the Taligent project around 1988 or so, as far as I recall.

Initially created as an unpopular gadget not supported by anything, it is now on everybody's lips, because Java, the language-of-hype, several MS-DOS based operating systems and now - finally - the rotten hypertext language-of-hype support it. At least to some limited extent. (Technical note: usually you only read of UCS-2 instead of UCS-4 in all those specifications, and maybe some blurred proposals to use UTF-16 later.)

As hsc is written in the rotten C-language (an American product, by the way), it cannot cope with zero-bytes in its input data, and therefore is unable to read data encoded in UCS-4, UTF-16 or (gag, retch, puke) UCS-2; it will simply stop after the first zero in the input.

Because the rotten C-language is so widely used, there are some zero-byte work-around formats for Unicode, most notably UTF-8 and UTF-7. These work together with hsc, although with the same limitations that apply to the eastern encodings mentioned earlier. Read: don't use the option RplcEnt.

Note that it takes at least five encodings to make Unicode work with most software - in alphabetical order: UCS-2, UCS-4, UTF-16, UTF-7 and UTF-8. I wonder what the ``Uni'' stands for...

Anyway, as a conclusion: you can use several extended character sets, but you must not enable RPLCENT.

HTML 4.0

Once upon a time, HTML 4.0 was released, and it sucked surprisingly less (as far as ``sucks less'' is applicable at all to HTML). Of course there still is no browser capable of displaying all these things, but nevertheless you can use hsc to author for it - with some limitations. This section briefly outlines how.

As already mentioned, HTML now supports those extended character encodings. See above for how to deal with input files using such an encoding, and which ones to avoid.

If your system does not allow you to input funny characters (for instance, one can easily spend ATS 500,000 on a workstation just to be absolutely unable to enter a simple ``ä''), you can use numeric entities, in either their decimal or hexadecimal representation: for example, to insert a Greek Alpha, you can use &#913; or &#x391;; hsc will accept both. However, you still cannot define entities beyond the 8-bit range using <$defent>.
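A minimal sketch of what that looks like in a source file (the surrounding markup is just for illustration):
<P>The Greek letter &#913; (decimal) and &#x391; (hexadecimal)
both come out as a capital Alpha.</P>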

Some highlights are that the ALT attribute of <IMG> is now required and that there are now loads of ``URIs'' instead of ``URLs'' around. Nothing new for old hsc-users... he he he.

Another interesting thing is that the DTD now contains some meta-information that was not part of earlier DTDs, so it may make sense to use the DTD as a base for an hsc.prefs converter.

XHTML

2002 update: it seems the W3 committee has learned a thing or two. XHTML has been out for a while now, and they are working on the 2.0 specification. While the chances of turning the official DTD into an HSC prefs file using a dumb ARexx script have gotten even slimmer (anyone for a real parser using Expat or something?), XHTML seems a move in the right direction regarding the separation of content and presentation and putting an end to the ``tag soup'' that much of the Web is today. It remains to be seen how successful it will be.

HSC now has some rudimentary support for authoring XHTML documents, mainly regarding lowercase tag and attribute names and the new empty-tag syntax with a trailing slash, as in ``<br />''. CSS support should be better though; perhaps some automatic rewriting of obsolete presentation attributes to CSS <style> tags...
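To illustrate the syntactic difference, here is the example image from earlier in this chapter in both styles (a sketch):
<!-- HTML 4.0 style -->
<IMG SRC="hugo.gif" ALT="hugo">
<!-- XHTML style: lowercase names, empty tag closed with a trailing slash -->
<img src="hugo.gif" alt="hugo" />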

Creating A Postscript Version

As you can now optionally read this manual in a Postscript version, there might be some interest in how it was done.

The rudimentarily bearable application used for conversion is (very originally) called html2ps and can be obtained from http://www.tdb.uu.se/~jan/html2ps.html. As is common with such tools, ``it started out as a small hack'' and ``what really needs to be done is a complete rewriting of the code'', but ``it is quite unlikely that this [...] will take place''. The usual standard disclaimer of every public Perl-script. All quotes taken from the html2ps manual.

Basically, the HTML and the Postscript version contain the same words. However, there are still some differences; for example, the printed version does not need the navigation toolbar provided at the top of every HTML page.

Therefore, I wrote two macros, <html-only> and <postscript-only>. The principle works exactly like the one described for <english> and <suomi> earlier in this chapter, and you can find them in docs-source/inc/html.hsc and docs-source/inc/ps.hsc.
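By analogy with the multi-lingual macros, inc/html.hsc presumably boils down to something like this (a sketch, not the verbatim file contents), with inc/ps.hsc swapping the two bodies:
<$macro html-only /close><$content></$html-only>
<$macro postscript-only /close></$postscript-only>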

However, there is a small difference from the multi-lingual examples, as I do not really want to create two versions all the time. Instead, I prefer to create either a fully hypertext-featured version or a crippled, Postscript-prepared HTML document in the same location.

You can inspect docs-source/Makefile to see how this is done: if make is invoked without any special options, the hypertext version is created. But if you instead use make PS=1 and thereby define a symbol named PS, the pattern rule responsible for creating the HTML documents acts differently and produces a reduced, Postscript-prepared document without the toolbar.

Basically, the rule looks like this:
$(DESTDIR)%.html : %.hsc
ifdef PS
        @$(HSC) inc/ps.hsc   $(HSCFLAGS) $<
else
        @$(HSC) inc/html.hsc $(HSCFLAGS) $<
endif
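In practice, the two flavours are then selected like this (from within docs-source, assuming GNU make):

make            builds the hypertext version
make PS=1       builds the Postscript-prepared version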

Needless to say, the conditional in the Makefile does not work with every make - I used GNUmake for that; your make-tool may have a slightly different syntax.

For my convenience, there are two rules called rebuild and rebuild_ps with obvious meanings: they rebuild the whole manual in the desired flavour.
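They presumably amount to something like the following sketch (the actual rules live in docs-source/Makefile; clean and all are assumed target names):

rebuild :
        @$(MAKE) clean all

rebuild_ps :
        @$(MAKE) clean all PS=1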

So after a successful make rebuild_ps, everything only waits for html2ps. Maybe you want to have a look at the docs-source/html2ps.config used, although it is straightforward and does not contain anything special. It should not need any further comments, as there is a quite useful manual supplied with html2ps.

However, making html2ps work on an Amiga deserves some remarks. As you might already have guessed, you will need the Perl-archives of GG/ADE - no comments on that; everybody interested should know what and where GG is.

I suppose you can try the full Unix-alike approach with hsc compiled for AmigaOS/ixemul and GG more or less taking over your machine, and therefore invoke perl directly. This will require a rule like
ps :
        html2ps -W l -f html2ps.config -o ../../hsc.ps ../docs/index.html

As I am a dedicated hater of this, I used the AmigaOS-binary, a SAS-compiled GNUmake and the standard CLI. A usually quite successful way to make such things work is with the help of ksh, which, for your confusion, is in an archive at GG called something like pdksh-xxx.tgz (for ``Public Domain ksh''). Invoking ksh with no arguments will start a whole shell-session (ugh!), but you can use the switch -c to pass a single command to be executed. After that, ksh will automatically exit, and you are back in your cosy CLI, just as if nothing evil had happened seconds before.

So finally the rule to convert all those HTML files into one huge Postscript file on my machine is:

ps :
        ksh -c "perl /bin/html2ps -W l -f html2ps.config -o ../../hsc.ps ../docs/index.html"

Note that html2ps is smart enough to follow those (normally invisible) <LINK REL="next" ..> tags that are part of the HTML documents, so only the first file is provided as an argument, and it will automatically convert the others.
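Such a tag sits in the head of each document and points to the following page; a sketch (the file name is made up):
<HEAD>
<TITLE>Some Use-Cases</TITLE>
<LINK REL="next" HREF="next-chapter.html">
</HEAD>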

Well, at least you see it can be done.