U.bas allows custom characters to be used in text-based DOS applications. Its original goal was to allow using Unicode characters, but that is not the only way u.bas can be used.
The PC BIOS only provides 256 "slots" for storing graphical representations of characters. To get around this limitation, u.bas overwrites the graphic (glyph) slots of uncommon characters. This way Unicode characters can be used the same as any other character.
The glyphs of the control characters (slots &H00 to &H1F) are rarely displayed, because those characters usually control something instead of being printed. U.bas therefore uses these slots first to hold Unicode characters. If enough Unicode characters are loaded, u.bas will also overwrite accented characters, then Greek characters. Note that once all slots are full, slots already in use will be overwritten.
Before a Unicode character can be printed and used, it must be loaded into one of the BIOS's 256 character slots. uchar$() or ucharf$() can be used to do this.
char$ = uchar$(&Hcode&)
uchar$() loads, if not already loaded, the 256-bit glyph specified by CODE and returns the slot it occupies. The glyph is loaded from the file number "FONTFILE", which is by default opened to FONTS\UNIFONT\UNIFONT.BIN. All assigned Unicode character codes and their glyphs are available from http://www.unicode.org/charts/.
The Unicode charts give the character number in hexadecimal; you can too if you use the &H prefix. The default glyphs are part of the Unifont project at http://czyborra.com/unifont/. This project would not have nearly as many glyphs without Unifont.
WARNING: The trailing & is needed to make BASIC treat the number as a long (32-bit) integer; otherwise numbers from &H8000 upward are read as negative 16-bit integers. You can leave it off for lower-numbered characters.
Example:
euro$ = uchar$(&H20AC) ' Euro sign
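The effect of the trailing & described in the warning above can be checked in plain QBasic, without u.bas. This is only a small sketch of how BASIC reads hexadecimal literals; the character numbers were picked arbitrarily for illustration.

```basic
' Plain QBasic, no u.bas needed.
PRINT &H20AC    ' below &H8000: prints 8364 with or without the suffix
PRINT &HFFFD    ' read as a 16-bit integer: prints -3
PRINT &HFFFD&   ' read as a long integer: prints 65533
```

So a call such as uchar$(&HFFFD) would receive -3 as its argument, while uchar$(&HFFFD&) receives the intended character number.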
char$ = ucharf$(&Hcode&, font$)
Loads a 128-bit glyph from FONTS\128BIT\font$. You can make your own or edit existing 128-bit fonts with FontEdit (see FONTS\128BIT\README.txt), or use one of the existing ones.
The only limitation of 128-bit glyphs compared with 256-bit glyphs is that they can only be 8x16, while 256-bit glyphs can be 16x16 (by using two slots). See \fonts\unifont\conv.pl for the file format used for 256-bit glyphs.
Example:
fourtwo$ = ucharf$(&H42, "Hex") ' 4/2 symbol
A loaded Unicode character can be printed to the screen by simply using PRINT:
PRINT "Balance: " + uchar$(&H20A4) + "10,000"
but to save the character to a file, forlater$(wchar$) must be used.
utf$ = forlater$(char$)
utf$ is a UTF-8 representation of char$, suitable for storing in files or anywhere else from which it can be decoded before use.
To decode a string of UTF-8, use unutf$(). unutf$() looks for any UTF-8 sequences and loads any characters that are not yet loaded. utf2scalar(utf$) may be used to decode a single UTF-8 sequence into its Unicode character number; use char$ = uchar$(utf2scalar(utf$)) to load the character.
Example:
OPEN "fax" FOR OUTPUT AS #1
PRINT #1, "You owe us: "
PRINT #1, "33" + forlater$(uchar$(&HA2))
CLOSE 1
OPEN "fax" FOR INPUT AS #1
LINE INPUT #1, l$
PRINT l$
LINE INPUT #1, l$
PRINT unutf$(l$)
'PRINT MID$(l$, 1, 2) + fornow$(uchar$(utf2scalar(MID$(l$, 2, 1))))
CLOSE 1
code = getcode(char$)
Returns the Unicode character number which char$ holds. slots(ASC(wchar$)) may be faster.
Example:
yinyang$ = uchar$(&H262F)
PRINT "Yin-yang: ", fornow$(yinyang$) + " has a code of " + STR$(getcode(yinyang$))
utf$ = scalar2utf$(index)
Encodes the Unicode character INDEX into UTF-8. UTF-8 is a format which stores each Unicode character as a variable-length sequence of bytes, so that low-numbered characters take less space. Used by forlater$().
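As a rough illustration of what such an encoding involves, the sketch below encodes one character number into UTF-8 in plain QBasic. EncodeUtf8$ is a hypothetical stand-in written for this example, not the source of scalar2utf$(), and it assumes character numbers below &H10000 (at most three bytes).

```basic
DECLARE FUNCTION EncodeUtf8$ (scalar AS LONG)

' Encode the Euro sign (U+20AC) and show the resulting bytes in hex.
code& = &H20AC
u$ = EncodeUtf8$(code&)
FOR i% = 1 TO LEN(u$)
    PRINT HEX$(ASC(MID$(u$, i%, 1))); " ";
NEXT i%
PRINT                       ' the line printed is: E2 82 AC

' Hypothetical helper, assumed for illustration only.
' Handles character numbers below &H10000 (one to three bytes).
FUNCTION EncodeUtf8$ (scalar AS LONG)
    IF scalar < &H80 THEN
        ' One byte: 0xxxxxxx
        EncodeUtf8$ = CHR$(scalar)
    ELSEIF scalar < &H800 THEN
        ' Two bytes: 110xxxxx 10xxxxxx
        b1$ = CHR$(&HC0 OR (scalar \ 64))
        b2$ = CHR$(&H80 OR (scalar AND 63))
        EncodeUtf8$ = b1$ + b2$
    ELSE
        ' Three bytes: 1110xxxx 10xxxxxx 10xxxxxx
        b1$ = CHR$(&HE0 OR (scalar \ 4096))
        b2$ = CHR$(&H80 OR ((scalar \ 64) AND 63))
        b3$ = CHR$(&H80 OR (scalar AND 63))
        EncodeUtf8$ = b1$ + b2$ + b3$
    END IF
END FUNCTION
```

Characters below &H80 come out as a single byte, which is why plain ASCII text is unchanged by the encoding.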
index = utf2scalar(utf$)
Decodes a UTF-8 encoded character into the Unicode character number INDEX.
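The reverse direction can be sketched the same way in plain QBasic. DecodeUtf8& is a hypothetical helper written for this example, not the source of utf2scalar(); it assumes a well-formed sequence of one to three bytes and does no error checking.

```basic
DECLARE FUNCTION DecodeUtf8& (utf$)

' The three bytes E2 82 AC are the UTF-8 form of the Euro sign.
euro$ = CHR$(&HE2) + CHR$(&H82) + CHR$(&HAC)
PRINT HEX$(DecodeUtf8&(euro$))      ' prints 20AC

' Hypothetical helper, assumed for illustration only.
FUNCTION DecodeUtf8& (utf$)
    b1% = ASC(utf$)                 ' lead byte selects the length
    IF b1% < &H80 THEN
        ' One byte: 0xxxxxxx
        DecodeUtf8& = b1%
    ELSEIF b1% < &HE0 THEN
        ' Two bytes: 110xxxxx 10xxxxxx
        b2% = ASC(MID$(utf$, 2, 1)) AND &H3F
        DecodeUtf8& = (b1% AND &H1F) * 64 + b2%
    ELSE
        ' Three bytes: 1110xxxx 10xxxxxx 10xxxxxx
        b2% = ASC(MID$(utf$, 2, 1)) AND &H3F
        b3% = ASC(MID$(utf$, 3, 1)) AND &H3F
        DecodeUtf8& = (b1% AND &HF) * 4096& + b2% * 64 + b3%
    END IF
END FUNCTION
```

The result is a long integer, so high character numbers stay positive and can be passed on to uchar$().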
U.bas allows you to completely bypass its Unicode functions and create your own character graphics.
index = newchar(glyph$)
Allocates a free slot in the global slots() array and sets it to contain the graphic glyph$. index is the slot allocated; you can print your new character by using CHR$(index). glyph$ is a bitmap.
PRINT "Gradient: ", CHR$(newchar(STRING$(64, &HAA)))
setchars(start, count, glyphs$)
Sets multiple character glyphs, starting from start and ending at start+count, to the character graphics in glyphs$. glyphs$ is a one-bit-per-pixel bitmap. Setting many characters at a time results in less flashing than setting each glyph individually.
setchar(slot, glyph$)
Same as setchars(slot, 1, glyph$).
Example:
' After running this program, go to a help topic in QBasic. The hidden
' nulls should appear as vertical lines.
setchar 0, STRING$(64, &HAA)
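To picture what a one-bit-per-pixel bitmap means, the sketch below draws a glyph string as text, without touching the character set. It assumes, for illustration only, that each byte of the string is one 8-pixel row with the leftmost pixel in the highest bit. That matches the vertical lines produced by &HAA (binary 10101010) in the example above, but the exact layout u.bas expects is not stated here.

```basic
' Plain QBasic, no u.bas needed.
' Assumed layout: one byte per row, highest bit = leftmost pixel.
glyph$ = STRING$(16, &HAA)          ' 16 rows of 10101010 (an 8x16 glyph)

FOR row% = 1 TO LEN(glyph$)
    bits% = ASC(MID$(glyph$, row%, 1))
    rowtext$ = ""
    mask% = 128                     ' start at the highest bit
    FOR col% = 1 TO 8
        IF bits% AND mask% THEN
            rowtext$ = rowtext$ + "#"
        ELSE
            rowtext$ = rowtext$ + "."
        END IF
        mask% = mask% \ 2           ' move one pixel to the right
    NEXT col%
    PRINT rowtext$                  ' each row prints as #.#.#.#.
NEXT row%
```

Changing individual bytes of glyph$ and rerunning the loop is a quick way to preview a design before passing it to setchar or newchar.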
Unless you want to make other programs look different, your program should restore the default character set. VOLTA.COM from the DosFont distribution is provided to do this.
Windows follows the ISO standard for characters A0-FF, but invents its own characters to fit in slots 80-9F. If you are reading lots of files which use extended Windows characters in DOS, it can be a pain to have to guess what they mean or resort to Notepad. Now you don't have to -- use FONTSEL from DosFont to set FONTS\128BIT\NORMAL\0000WIN.FNT as the DOS font. Of course, box drawing won't work then.
uchar$() cannot be executed in the Immediate Window because #FONTFILE must be opened first. Put your test code in the main module instead.
Send comments about u.bas to unicodeindos@xyzzy.cjb.net. Comments about Unifont should be directed to its maintainer.
Modified Sun Mar 25 08:48:47 2007
generated Sun Mar 25 08:56:33 2007
http://jeff.tk/doschar/manual.html