Monday 27 June 2016

Unicode Hieroglyphs in web browsers: generic web pages

In an ideal world, Egyptian hieroglyphs in Unicode will simply appear as such to the reader when they are used on a website page. The general reader should not have to do anything special such as install a font or configure the web browser in order to display Unicode text.

Egyptian-aware web pages may render text as images (as mentioned in various earlier posts, e.g. on WikiHiero) or use web-fonts to provide a full text experience. These techniques generally work well with reasonably up to date web browsers on all kinds of devices.

However, this post is concerned with generic web pages meaning web pages without any special coding for Egyptian hieroglyphs. These pages rely on the web browser and/or its underlying operating system to render correctly.

Probably the best known example of a generic web page is the Google search page www.google.com. This blog post itself is also such a web page, hosted on www.blogspot.com, so if all is well with the browser and device on which you are reading you should see hieroglyphs at the end of this sentence - 𓇳𓄟𓋴𓋴.

If you then copy and paste these four hieroglyphs into Google search, you see something like this:


I've highlighted the hieroglyphs in red boxes. Search is just one example. If you want to write hieroglyphs in web forums or read hieroglyphs as text on an arbitrary web page chances are you are reliant on generic hieroglyph support from your web-browser.

Modern versions of macOS/iOS from Apple and Windows 10 from Microsoft are Egyptian hieroglyphic ready 'out of the box' and so are their supplied web browsers (Safari, Edge or Internet Explorer). So is a correctly configured Linux distro. Generic web pages just work for the users of hundreds of millions of these modern devices.

If this works for you, great. The browser or system you are using is Unicode hieroglyphic ready and you are done with this post unless you are curious about technicalities.

However the global picture is not yet entirely rosy. Android, by far the most popular system for mobile phones and tablets, is not yet up to speed. Old versions of iOS/macOS and Windows are not hieroglyph ready. When you try the Google search on these kinds of system you likely see box characters where hieroglyphs are expected unless something has been added to these systems to avoid problems.

Why hieroglyphs work (or don't)

Technically for a web browser to render hieroglyphic text on a generic web page all it needs to do is:

1. Recognise the text characters are hieroglyphs.
2. Use a font with hieroglyphs to render the characters.
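As a rough illustration of the two steps above, here is a minimal Python sketch. It is not how any particular browser is implemented. It only assumes that hieroglyphs are recognised by their Unicode block (U+13000 to U+1342F) and that each run of text is then paired with a font name. The names `font_runs`, `HIEROGLYPH_FONT` and `DEFAULT_FONT` are hypothetical; Segoe UI Historic is simply the Windows 10 font mentioned in this post.

```python
from itertools import groupby

# Egyptian Hieroglyphs block added in Unicode 5.2 (2009): U+13000..U+1342F.
HIEROGLYPH_BLOCK = range(0x13000, 0x13430)

# "Segoe UI Historic" is the Windows 10 font named in this post;
# DEFAULT_FONT is a hypothetical placeholder for the page's normal font.
HIEROGLYPH_FONT = "Segoe UI Historic"
DEFAULT_FONT = "serif"


def is_hieroglyph(ch):
    """Step 1: recognise that a character is an Egyptian hieroglyph."""
    return ord(ch) in HIEROGLYPH_BLOCK


def font_runs(text):
    """Step 2: split text into runs and pair each run with a font name."""
    runs = []
    for is_hiero, chars in groupby(text, key=is_hieroglyph):
        font = HIEROGLYPH_FONT if is_hiero else DEFAULT_FONT
        runs.append((font, "".join(chars)))
    return runs


# The four hieroglyphs used as the example in this post (N5, F31, S29, S29).
print(font_runs("Ramses \U000131F3\U0001311F\U000132F4\U000132F4."))
```

A browser that fails at step 1 never gets as far as asking for a hieroglyphic font, and the reader sees box characters even when a suitable font is installed.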

The Google search illustration above is from Windows 10 using Internet Explorer 11. Everything works as expected because Windows 10 comes with the Segoe UI Historic font (which contains Unicode hieroglyphs). Internet Explorer recognises hieroglyph characters and, in the absence of any more specific font specifications on the web page, uses the Segoe font as default. The Microsoft Edge and Mozilla Firefox browsers also work correctly.

However, even on Windows 10, the Chrome browser (version 59 and earlier) is not hieroglyph ready. It does not detect hieroglyphs on a web page and choose an available font. Chrome 59 serves as an illustration of how things can go wrong when a browser has bugs in font handling.

The latest release of Android (6.0.1) has an additional problem in that Android comes with a very limited number of fonts installed. No hieroglyphic font is included, and hieroglyphic is just one of a large number of writing systems that therefore don't render in Chrome, the standard Android web browser, because no font is available or automatically downloaded when required. In theory an Android device could work when the device maker supplies an enhanced version of Android. In practice I've never seen this happen. Presumably we'll see a working version of Chrome from Google eventually. Meanwhile Chrome should be fine for most Egyptian-aware web pages.

Getting hieroglyphs to work for generic web pages

With the huge range of devices and browsers available nowadays it is impossible to give detailed information about what may or may not be done to address problems if your setup is not hieroglyph ready.

In many cases the simplest solution is to update your system setup if possible and/or ensure your web browser version is kept up to date. The technical world has moved a long way since 2009 when hieroglyphs were introduced to the Unicode standard but initially unsupported in the then latest systems such as Ubuntu 10.04 (Lucid Lynx), Mac OS X 10.6 (Snow Leopard) and Windows 7.

Updating is easy for most Linux distros, Apple now provides macOS/iOS updates free of charge and likewise Microsoft provides updates from Windows 7/8 to Windows 10 (strangely, this is currently stated to be free of charge for only a limited time until late July 2016).

If updating is impossible, it seems the Firefox browser works correctly on Linux and Windows devices I've tried so long as it finds a Unicode hieroglyphic font installed. This may work for you. In particular if you are stuck with an old PC setup with Windows 7 it should be sufficient to install a Unicode hieroglyphic font then use Firefox for generic web-browsing instead of Internet Explorer or Chrome. Other less well known browsers may also work.


Bob Richmond

Tuesday 21 June 2016

Hieroglyphs on the web: The Digital Topographical Bibliography

The Topographical Bibliography of Ancient Egyptian Hieroglyphic Texts, Statues, Reliefs and Paintings (Topographical Bibliography or TopBib for short) is a long running project (work began in the early 20th Century) based at the Griffith Institute, Oxford. The first part of Volume I of the Bibliography - THE THEBAN NECROPOLIS PART 1. PRIVATE TOMBS - was published in 1927. Since then new volumes and revised versions have been published.

The first seven printed volumes of TopBib followed the Topographical approach. The more recent Volume VIII - OBJECTS OF PROVENANCE NOT KNOWN (2000-) instead, for obvious reasons, organises objects by type and period.

A brief History of the Topographical Bibliography is available on the Griffith Institute website.

TopBib is the essential and definitive resource concerning Ancient Egyptian objects.

Behind the printed volumes, modern technology was introduced to TopBib during the editorship of Jaromir Malek with use of databases for digitised versions of the text, hieroglyph encodings and so forth. Some material was made available on the web and the notion of an online version of the Bibliography devised. Aside from the benefits of making digitised material available to scholars, technology helps work on TopBib to continue more efficiently (there is a large amount of material available yet to be analysed and published).

The Digital Topographical Bibliography currently provides access to the printed editions (mostly in PDF format). A small amount of material is also available in new data formats and a useful reference system has been introduced (see DIGITAL TOPOGRAPHICAL BIBLIOGRAPHY: The Digital Approach).

TopBib was studied as part of the process of adding Egyptian Hieroglyphs to the Unicode Standard. The printed editions originally employed the Gardiner 'Oxford' font used as the primary reference for the initial hieroglyph repertoire.

The idea of using Unicode plain text hieroglyphic for Digital TopBib goes back over a decade and it's a good example of a major publication for which plain text meets all hieroglyph requirements. Much work needs to be done to complete the first digital edition but this should benefit greatly from plain text availability.

Bob Richmond

Thursday 16 June 2016

Transcription of Hieratic into Unicode Hieroglyphic: Part 2

This post follows on from my earlier Transcription of Hieratic into Unicode Hieroglyphic: Part 1 which stressed the importance of hieratic transcription as an application of Unicode hieroglyphic.

The initial Unicode collection of Egyptian hieroglyphs released in 2009 was based on the Gardiner font and sign list. There is fairly good coverage of signs required for transcription but it is useful to consider how the situation can be improved in the context of Extending the Hieroglyph repertoire in Unicode.

One influential work on hieratic was Hieratische Paläographie by Georg Möller (1876-1921), published in four volumes: Volumes I-III 1909-12 and Volume IV 1936 (with introduction by Hermann Grapow). [PDF versions are available for download here].

Möller employs numeric codes for hieroglyphs corresponding to hieratic elements as seen in this illustration from Volume I:



Volume II pp 71-74 links these hieratic codes to the alphanumeric hieroglyph codes used in the Theinhardt font produced for Lepsius.

Hieratic examples are given from a variety of sources, organized by different periods from Old to Late Egyptian through to the Greco-Roman period. Some examples from Volume I:


Here, Möller codes 200 and 200b match Gardiner codes G43 and Z7 and hence Unicode 𓅱 G043 and 𓏲 Z007. Gardiner was very familiar with Hieratische Paläographie and its coding system so it is unsurprising that hieroglyphs in the Gardiner font and coding system link to the Möller numeric system. See Identification of the signs from the Hieratische Paläographie [M-J Nederhof website] for a list of matches between encoded hieroglyphs and Möller codes.
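For the curious, the chain from Möller code to Gardiner code to Unicode character can be sketched in a few lines of Python. This is only a sketch: the table holds just the two matches mentioned above (200 and 200b), the function names are hypothetical, and it assumes the Unicode naming convention in which a Gardiner code such as G43 appears in the character name as G043.

```python
import re
import unicodedata

# Only the two matches mentioned in this post; a full table would come
# from a published concordance, which is not reproduced here.
MOELLER_TO_GARDINER = {"200": "G43", "200b": "Z7"}


def gardiner_to_char(code):
    """Turn a Gardiner code such as 'G43' into its Unicode character.

    Unicode character names pad the number to three digits,
    e.g. 'EGYPTIAN HIEROGLYPH G043'.
    """
    match = re.fullmatch(r"(Aa|[A-Z])(\d{1,3})([A-Za-z]?)", code)
    if match is None:
        raise ValueError(f"not a Gardiner-style code: {code!r}")
    category, number, suffix = match.groups()
    name = f"EGYPTIAN HIEROGLYPH {category.upper()}{int(number):03d}{suffix.upper()}"
    # unicodedata.lookup raises KeyError if the sign is not encoded.
    return unicodedata.lookup(name)


def moeller_to_char(moeller_code):
    """Möller code -> Gardiner code -> Unicode character."""
    return gardiner_to_char(MOELLER_TO_GARDINER[moeller_code])


for moeller in ("200", "200b"):
    ch = moeller_to_char(moeller)
    print(moeller, MOELLER_TO_GARDINER[moeller], f"U+{ord(ch):05X}", ch)
```

A sign without a Gardiner equivalent in the encoded repertoire simply has no entry in the table, or raises `KeyError` at the lookup step.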

Nevertheless not all Möller codes are present in the Gardiner font and sign list. One example is Möller 131 (mouse), which is encoded as E130 in Hieroglyphica but is not yet in the Unicode repertoire.

Möller provides lists of groups/ligatures such as:


which need to be available in any Unicode plain text system.

Regarding extensions to the Unicode hieroglyph repertoire, it is desirable to add Möller codes to the Unicode hieroglyph database of candidates for encoding. His work is over a century old but has been influential and any errors known to modern Egyptologists can be identified in the database (when available for review).

I will be recommending that hieroglyphs found in the Möller list but not yet encoded in Unicode be included in the next set of hieroglyphs added to the standard, thereby improving the scope of Unicode for transcription of hieratic to hieroglyphic.

This is not to ignore more recent scholarly work involving hieratic. If well-documented material from modern databases such as the Ramses Project and Thesaurus Linguae Aegyptiae or other publications is available to further improve hieratic transcription this data should also be added to the Unicode hieroglyph database of candidates for encoding.

As a point of interest, I am documenting all Möller groups/ligatures (where applicable) in the cluster list referred to in Foundations of a Universal Egyptian Hieroglyphic Writing System in Unicode plain text.

Bob Richmond


Wednesday 15 June 2016

Hieroglyphs on the web: Thesaurus Linguae Aegyptiae

The Thesaurus Linguae Aegyptiae (TLA) is an Ancient Egyptian virtual dictionary and thesaurus intended to provide a specialist tool for lexicographic research into the Egyptian language. The web application and content is developed at the Berlin-Brandenburg Academy of Sciences and Humanities with contributions from various sources. Much of the content is in German but help text and some content is also available in English.

The TLA corpus includes Ancient Egyptian writings from the entire historical period from Old Kingdom through to the Greco-Roman period. The majority of the content is concerned with the Egyptian Language as written in hieroglyphic and hieratic scripts but there is also a Demotic database included in the current version.

Features of the website include a browsable version of Wörterbuch der Ägyptischen Sprache [Erman, Adolf (editor), and Hermann Grapow (editor)], a digitised slip archive from the Wörterbuch and the Vormanuskript (preliminary manuscript) of the Wörterbuch.

The TLA dictionary itself apparently started with a list of entries from the Wörterbuch by Horst Beinlich (the Beinlich Wordlist). The word list / lemma list has been expanded and developed since then.

The TLA web site went online in 2004 and has undergone various additions and improvements since (the current version is dated October 2014). It remains a work in progress. To access the material you may log in as a guest or register as a user.

The TLA is an invaluable resource for Egyptologists, including a huge volume of material. The database and website designs are very much oriented towards specialists. To make the most of the search and research features it is necessary to study the help text and invest some time and effort in finding your way around the site and studying its search functionality.

Hieratic is transcribed as hieroglyphic. TLA uses a subset of the Hieroglyphica sign list for its hieroglyphs, with some additions. A large proportion of hieroglyphs used are already encoded in Unicode. It would be useful to identify those that are not yet encoded so these can be featured in the next update to the Unicode hieroglyph repertoire.

Hieroglyphic writings in TLA are currently implemented as graphics (as with Ramses Online; as outlined in my recent blog entry). As with Ramses, Unicode plain text would open up interesting possibilities for TLA. This should be fairly straightforward to implement in TLA once related work in the Unicode standard is completed.

Bob Richmond

Wednesday 8 June 2016

Hieroglyphic Fonts and the Universal Shaping Engine

An Egyptian hieroglyphic writing system such as that proposed for Unicode plain text needs to cope with writings such as
where two or more hieroglyphs can be arranged in groups such as one hieroglyph above another as in this example. In digital media, hieroglyphic is therefore an example of a writing system that needs Complex Text Layout (CTL) [Wikipedia] to display text. Much of the world's text doesn't need CTL but Arabic is a popular writing system that does and there are many others including Indic scripts such as Bengali and Devanagari.

From an end user point of view the technical underpinnings of a writing system should be as invisible as possible; ideally everything should just work naturally. That is a goal of the Hieroglyphs Everywhere Project (HEP). However, for that to happen it is necessary for various pieces of the puzzle to fit together. In this note I'll try to explain where the Universal Shaping Engine (USE) fits into the picture.

Traditionally a writing system that requires CTL needs specialist fonts and customized software. The software elements may be part of the Operating System (Android, iOS, Linux, Windows etc.) or part of an application (Microsoft Office, LibreOffice, Web Browser etc.). To include a new writing system or evolve a supported system in this traditional way is a complicated process which can take years to feed through to the user base.

USE reduces the complexity of this situation by providing software features to enable an OpenType font to implement CTL for its chosen writing systems. In practical terms this means once Unicode is released with plain text hieroglyphic functionality, compatible fonts can be made available and they will be usable with applications such as web browsers and systems that support USE. For instance, there is no need in principle to expect a long wait for web browsers to be adapted to support fonts for hieroglyphic writing so long as browser and font support USE.

The first implementations of USE were released in 2015. Windows 10 integrated the engine on initial release. HarfBuzz (an open source component used in Linux, Firefox, Chrome/Chromium and LibreOffice among others) introduced USE support at version 1.0. The USE specification is available from Creating and supporting OpenType fonts for the Universal Shaping Engine [Microsoft Typography].

I've no information on the current status of USE for Android or Apple iOS/Mac systems. All I can say is it would be surprising if USE support were not widely available by around this time next year (the earliest we can expect plain text hieroglyphic to be released in the Unicode standard). I'd also guess that it is unlikely legacy operating systems such as pre-OS X El Capitan or Windows 7/8 will see USE integration (although it is quite possible that applications such as some web browsers may support USE on these legacy systems).

For the Hieroglyphs Everywhere Project I'm working on the assumption that USE is the future for hieroglyphic fonts. However, alternative methods are used in the short to medium term while Unicode plain text is not available. These alternative mechanisms then remain available for situations where for some reason users are unable to work with up to date operating systems and software.

The recent article making fonts for the Universal Shaping Engine [pdf] by John Hudson makes a good read for font developers or those curious to learn more. The post Windows shapes the world’s languages [Windows Experience Blog] gives some interesting background.

Bob Richmond

Tuesday 7 June 2016

Transcription of Hieratic into Unicode Hieroglyphic: Part 1

It has been standard practice for Egyptologists to transcribe hieratic sources into hieroglyphic for many years. This is an important application of Unicode Hieroglyphic so it is worthwhile to consider the background.

Alan Gardiner, writing in the 1920s, reasons:

1. The first and foremost reason for transcription is undoubtedly interpretation. Hieratic hands vary greatly, and beginners always, and advanced students often, require to know what familiar character a particular hieratic sign or scrawl represents. Interpretation reduces diversity to unity, permits the comparison of one variant with another, facilitates translation, and performs a multitude of other valuable services. Interpretation is indisputably the primary function for which transcription is employed.

2. There is, however, another reason and purpose for transcription which is not so clearly and fully recognized by scholars, though it is of equal importance with the last. I refer to the reproductive function of transcription. Practical objections of various kinds - expense, printing difficulties, inaccessibility of the originals, etc. - besides the necessity of interpretation referred to above under 1, make the reproduction of hieratic in exact facsimile sometimes unnecessary, and on occasion definitely undesirable. How inconvenient a grammar of Late Egyptian would be, in which all the examples from papyri and ostraca were given in facsimile! … Here I will touch upon another question of expediency. Late Egyptian hieratic is now so well known that in the case of easily legible, relatively "uncial" hands, it is really superfluous to publish every new document in facsimile. Our Egyptological libraries are already far too expensive. For many literary papyri all that is necessary is a good hieroglyphic transcription, leaving it to doubters to verify their doubts by consulting the originals or by inquiring from other scholars to whom the originals are accessible.

To sum up, our transcriptions of hieratic texts of the New Kingdom should at once provide an interpretation of the original hieratic, and also enable the reader to form in his mind a sufficiently good picture of the reading presented by the manuscript. For my own part, I shall not hesitate to use dots and dashes and diacritical marks whenever these seem appropriate or will aid the reader's visualization of the original. Our transcriptions ought most emphatically not to be translations into contemporary hieroglyphic; they are artificial substitutes for the actual manuscripts, substitutes the fabrication of which must be directed by the twin principles of interpretation and reproduction.

From The Transcription of New Kingdom Hieratic; Journal of Egyptian Archaeology, Vol. 15, No. 1/2 (May, 1929), pp. 48-55 (see http://www.jstor.org/stable/3854012).

Modern technology has eliminated the problems of expense and inconvenience in making an exact facsimile available of an extant source. A legible photograph costs next to nothing to create and distribute on the web. Some practical concerns Gardiner needed to cope with in his era no longer apply. However, the value of the interpretation and reproductive functions highlighted by Gardiner remains fundamental. Indeed now we have machine automation for search and analysis once hieratic is interpreted and reproduced by transcription into hieroglyphs in one or other digital text encoding. This new facet of reproduction opens up ways of working with hieratic sources undreamed of a century ago. 
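To give a flavour of what machine search over transcribed text can look like, here is a minimal Python sketch. It is purely illustrative: the two-entry corpus and its identifiers are hypothetical, made up for the example, and real projects use far richer annotated databases. The point is only that once a transcription is plain text, an ordinary substring search finds a sequence of hieroglyphs.

```python
# The sign sequence N5, F31, S29, S29 used as an example elsewhere on this blog.
RAMSES = "\U000131F3\U0001311F\U000132F4\U000132F4"
QUAIL_CHICK = "\U00013171"  # G43

# Hypothetical toy corpus: document id -> transcription as plain text.
# These strings are invented for illustration, not taken from any source.
corpus = {
    "doc-a": RAMSES + QUAIL_CHICK,
    "doc-b": QUAIL_CHICK + RAMSES + RAMSES,
}


def find_occurrences(corpus, query):
    """Return (document id, character offset) for every match of query."""
    hits = []
    for doc_id, text in corpus.items():
        start = text.find(query)
        while start != -1:
            hits.append((doc_id, start))
            start = text.find(query, start + 1)
    return hits


for doc_id, offset in find_occurrences(corpus, RAMSES):
    print(doc_id, offset)
```

Offsets here count characters (code points), so each hieroglyph counts as one position regardless of how it is stored.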

To date, the actual work of interpretation and reproduction remains a human scholarly activity, often aided by software applications and (potentially) modern technologies such as Artificial Intelligence. Scholarly questions of transliteration and encoding remain open. How to best represent or work with differences among the orthography of Old, Middle and Late Egyptian manuscripts? How to annotate and present hieratic transcriptions as text and/or present sources now we have far richer textual tools available?

One point I'd like to emphasise is the artificial nature of hieratic transcription to hieroglyphic. It is useful to consider this application of a Unicode hieroglyphic writing system in its own right under the overall Unicode umbrella. In my experience it can be confusing to muddle thinking about original hieroglyphic sources and hieratic transcription.

Transcription of Hieratic to Unicode Hieroglyphic plain text need not address all these issues and practicality boils down to two primary considerations.

  1. Ensure the Unicode plain text hieroglyphic writing system captures the layout of hieroglyphs for modern transcription requirements (e.g. as factored into L2/16-018R [pdf]).
  2. Identify extensions to the Unicode hieroglyph repertoire helpful for transcription applications.

Part 2 of this series of posts will focus on the repertoire question.

Bob Richmond

Monday 6 June 2016

Hieroglyphs on the web: Ramses Online

Project Ramses is an initiative concerned with the annotated encoding of Late Egyptian texts (approximately scoped to 1350-750 BC). Work started in 2006 based at the University of Liège in Belgium and an interesting snapshot of progress to 2012 is given in The Ramses Project: Exploring Ancient Egyptian linguistic data using a richly annotated corpus [pdf].

Ramses Online provides a web application to enable specialists to work with the Project Ramses database. Hieratic is transliterated into hieroglyphic for human readability and machine processing. An initial ‘beta’ version became available online in August 2015 and up to date background information is outlined in a document by Stéphane Polis: http://be.dariah.eu/wp-content/uploads/2015/10/Ramses_Stephane_Polis.pdf. The current website is implemented in French but international support (in English) is stated to be planned.

Ramses Online is an invaluable resource for Egyptologists and already contains much useful material from hundreds of ancient hieratic and hieroglyphic writing sources.

A technical point. At present Ramses Online renders hieroglyphic as graphics, not text. Improved functionality of projects such as this should be straightforward to re-implement using Unicode plain text when available. Indeed I envisage the Ramses (𓇳𓄟𓋴𓋴) corpus as a useful test case for plain text and more elaborate schemes based on plain text.

Bob Richmond