Thursday 27 April 2017

A method for encoding Egyptian quadrats in Unicode

A new document 'A method for encoding Egyptian quadrats in Unicode' is now available from the UTC document register as L2/17-122 [pdf]. The system described takes into account discussions last July during the Informatique et Égyptologie Cambridge meeting and afterwards about extensions to Unicode plain text support to handle vertical hieroglyphic and various complex forms of quadrat structure.

The place for questions, discussion and suggestions is the Egyptian Hieroglyphs in the UCS mailing list (see Informatique et Égyptologie, Cambridge, 2016).

L2/17-122 contains a feasibility report based on three prototype OpenType font developments (Glass, Nederhof, and Richmond), which I hope goes a long way towards alleviating concerns raised last year by Egyptologists about the viability of flexible hieroglyphic font implementations in Unicode.

L2/17-122 identifies 9 controls as follows:

Basic quadrat structures

EGYPTIAN HIEROGLYPH VERTICAL JOINER
EGYPTIAN HIEROGLYPH HORIZONTAL JOINER

These two controls were proposed in L2/16-018 (January 2016). They are similar to the original Manuel de Codage (MdC85) ':' and '*' controls.

EGYPTIAN HIEROGLYPH SEGMENT START
EGYPTIAN HIEROGLYPH SEGMENT END

These two controls operate in a similar way to MdC85 brackets '(' and ')'.

L2/17-122 does not contain structure extensions such as the group joiners suggested in L2/16-214 [pdf] to simplify encoding of quadrats in vertical text and tall quadrats in horizontal text. Therefore for most applications the basic quadrat structures of L2/17-122 are encoded as exact equivalents to those of MdC85 (itself derived from the Buurman 1976 model).

However, there are subtle differences from MdC85, most importantly (i) L2/17-122 has more clearly defined control behaviour and (ii) quadrat appearance is determined by a font (or equivalent) so there is more flexibility in handling issues such as hieroglyph sizing, kerning, etc. in plain text implementations.
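As a rough illustration of the correspondence with MdC85 described above, the following Python sketch translates a single MdC85-style quadrat into a sequence of sign codes and control names. It is not taken from L2/17-122: the function name and the use of Gardiner codes as placeholders for hieroglyph characters are assumptions made for this example, and the controls are represented by name only because no code points are given here.

```python
from typing import List

# Control names as listed in this post; represented by name only,
# since no code points are assumed in this sketch.
MDC_TO_CONTROL = {
    ":": "EGYPTIAN HIEROGLYPH VERTICAL JOINER",
    "*": "EGYPTIAN HIEROGLYPH HORIZONTAL JOINER",
    "(": "EGYPTIAN HIEROGLYPH SEGMENT START",
    ")": "EGYPTIAN HIEROGLYPH SEGMENT END",
}


def mdc_quadrat_to_controls(mdc: str) -> List[str]:
    """Translate one MdC85-style quadrat (hypothetical helper).

    Sign codes such as 'A1' are passed through unchanged as placeholders;
    the MdC85 operators ':', '*', '(' and ')' become control names.
    """
    tokens: List[str] = []
    sign = ""
    for ch in mdc:
        if ch in MDC_TO_CONTROL:
            # An operator ends any sign code collected so far.
            if sign:
                tokens.append(sign)
                sign = ""
            tokens.append(MDC_TO_CONTROL[ch])
        elif ch.isspace() or ch == "-":
            # Quadrat separators are outside the scope of this sketch.
            raise ValueError("expected a single quadrat, got a separator")
        else:
            sign += ch
    if sign:
        tokens.append(sign)
    return tokens


if __name__ == "__main__":
    for token in mdc_quadrat_to_controls("A1:(B1*C1)"):
        print(token)
```

For the input 'A1:(B1*C1)' the sketch yields A1, the vertical joiner, segment start, B1, the horizontal joiner, C1 and segment end, in that order. It does not attempt to model the more precisely defined control behaviour or the font-driven layout mentioned above.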

Hieroglyph combinations

EGYPTIAN HIEROGLYPH STACK MIDDLE

This control overlays one hieroglyph on top of another - a direct equivalent of the MdC88 '#' control (encoded as '##' in JSesh).

EGYPTIAN HIEROGLYPH INSERT TOP START
EGYPTIAN HIEROGLYPH INSERT BOTTOM START
EGYPTIAN HIEROGLYPH INSERT TOP END
EGYPTIAN HIEROGLYPH INSERT BOTTOM END

These four geometrical ligature controls are proposed in place of the L2/16-018 EGYPTIAN HIEROGLYPH LIGATURE JOINER (which was based on an abstract ligature model for non-grid quadrat elements). This set of four ligature controls originates from a consensus formed at the I&E 2016 meeting that four 'corner control' ligatures are sufficient to meet anticipated plain text ligature needs of corpus projects such as Ramses and TLA and that the Egyptologists present preferred geometrical to abstract ligatures. This is a new approach to ligatures, although they link fairly well to usage of the original MdC '&' ligature and MdC extensions familiar to JSesh users.

Bob Richmond

Digital Encoding of Egyptian Hieroglyphic: Origins

Updated 2017-05-17.

I thought it might be useful to summarise some of the background to digital hieroglyphic to help inform discussion about representations of hieroglyphic writing in Unicode.

This post deals with the early years. Information is thin on the ground so I'd be delighted to learn about any material, published or unpublished, that survives from this formative period.

Apparently, use of computers for hieroglyphic goes back to the 1960s when computers and printing peripherals were hugely expensive and inaccessible to most people except a lucky few. However it was not until the early 1980s that the emergence of personal computer technology started to bring digital techniques and practical tools to Egyptologists and others.

The first Informatique et Égyptologie 'round table' meeting (Paris, 26-28 June, 1984) was pivotal in shaping the first generation of digital hieroglyphic that has been used for the last 30 years. Fortunately, the proceedings of the meeting were published in 1985 (although unfortunately and ironically not yet available in digital format) and this post is mostly based on that publication. I'll summarise some papers from I&E 1984 relevant to encoding.

COMPUTER PRINTING OF HIEROGLYPHS AT THE UNIVERSITY PRESS OXFORD (T. G. H. James) describes the replacement of the traditional metal type system at Oxford University Press (used for typesetting the Gardiner font from 1927 to 1983) by a Monotype LaserComp phototypesetter adapted for hieroglyphic. The final publication to use the original hot metal font was J.E.A. Vol. 69 (1983).
Typesetting instructions for OUP workflow
This paper gives an insight into older typesetting practices as well as the short-lived LaserComp technology soon to be superseded by desktop publishing on personal computers.

INFORMATIQUE APPLIQUEE A L'EGYPTOLOGIE (Dirk van der Plas) gives a snapshot of his experiences with the Buurman GLYPH program and the practical situation for those printing hieroglyph texts on a budget in 1984, including comparative costs of autography and typesetting.

A PROGRAMM SYSTEM FOR THE EDITION OF TEXT (especially hieroglyphic printing) (Norbert Stief) describes hieroglyph plotting software written in Fortran 77 running on an IBM 370 mainframe driving a CALCOMP plotter. The system was developed at the University of Bonn and appears to be what is sometimes later known as the PLOTTEXT system. Mnemonics are used as alternatives to alphanumeric Gardiner codes for encoding purposes in a similar way to Buurman (1976). The 'Bonn Zeichenliste' font catalogue (an extension of the Egyptian Grammar sign list) is given here.

NEW HARDWARE--NEW SOFTWARE (Leonard H. Lesko) gives a short summary of hardware used in his latest setup at Brown University and his earlier hieroglyphic printing workflow at Berkeley during 1973-1982 using the SCRIBE program. The Berkeley system was notable for its use in creating A Dictionary of Late Egyptian (1982-), which I understand to be the first substantial hieroglyphic dictionary to use digital encoding and printing techniques. Not mentioned here is the use of quadrat structure patterns for SCRIBE (rather than a control scheme such as that used by Buurman). Just to prove there's nothing new under the sun, one of the systems I initially considered for Unicode encoding in early 2015 used a similar pattern system, although at the time I was unaware of the Lesko work from 40 years ago.

PRINTING OF EGYPTIAN HIEROGLYPHS BY MEANS OF A COMPUTER (Jan Buurman; Astronomer and hobby-Egyptologist). Buurman gives a history of his system which began as a hobby project in 1969. A sketch of the system was first published as The Composing of Hieroglyphic Texts by means of a Computer in Göttinger Miszellen 19 (1976). As far as I'm aware GM19 contains the earliest publication of what would become the basic MdC controls for quadrat structures.
Quadrat structure notation from "The Composing of Hieroglyphic Texts by means of a Computer" (1976)
GM19 reports that the first texts output (in 1971) took an average of 0.2 seconds per hieroglyph to process and 1.3 seconds to plot. The first version of GLYPH was written in Algol 60, running on a CDC Cyber 73 mainframe.

I understand that Buurman showed a video of the plotter in action to the meeting. Far more fun, I expect, than the laser printers we have grown used to. I'd love to see a video of hieroglyphs being drawn by a plotter! I am grateful to Hans van den Berg for kindly providing two links showing output from a 1980s HP ColorPro plotter: 'wild bull hunt scarab' of Amenhotep III (https://www.youtube.com/watch?v=8z9aCclxV0U) and part of the Tale of Sinuhe (https://www.youtube.com/watch?v=zyp68emXMZM).

RESOLUTION. The meeting resolved to create a standard system for encoding hieroglyphs to be made available before the Fourth International Congress of Egyptology in August 1985. A committee chosen to work on the manual consisted of Jan Buurman, Nicolas Grimal, Michael Hainsworth, Norbert Stief, Robert Vergnieux and Dirk van der Plas.
Proceedings of Informatique et Égyptologie 1984 (Paris, 1985). Page 225.
This proposed system was to become known as Manuel de Codage (MdC85) which I aim to summarise in a subsequent post.

Bob Richmond