Tuesday, 20 December 2016

Hieroglyphs on the web: SINUHE the Hierotyper

The SINUHE the Hierotyper project is about keyboard input and display/edit of Unicode hieroglyphic inside 'Office' and web applications. SINUHE combines an ingenious adaptation of Japanese input methods for typing hieroglyphic by So Miyagawa with a font implementing a version of Simplified Egyptian by Marwan Kilani.

The project website is here and project sources with some documentation are on GitHub at https://github.com/somiyagawa/SINUHE-the-Hierotyper. It is well worth taking a look at their videos on YouTube such as SINUHE the Hierotyper: demo 1. SINUHE is an ongoing work in progress and you need to feel comfortable with downloading the files and getting elements to work if you want to try it yourself. If you have a medium-large amount of hieroglyphic to type, SINUHE is faster than traditional use of codes and sign palettes so worth considering.

Longer term it is likely that standard Egyptian Hieroglyphic input methods will become available in their own right so this Japanese scenario for direct typing into standard word processing documents and the like will no longer be necessary. However, I can't see this happening earlier than a couple of years from now after Unicode 11 is released.

Meanwhile, SINUHE gives a preview of the kind of thing to expect for input as we move beyond the first generation of digital hieroglyphic and the typical input methods and workflow of traditional MdC applications. Of course you will be able to continue using hieroglyphic editors including traditional MdC-based systems to transcribe hieroglyphic then convert to and from Unicode if you prefer.

The notion of Simplified Egyptian Hieroglyphic fonts has been around for quite some time but as far as I know this is the first system using the technique to be made generally available on the web. The neat part about Simplified Hieroglyphic is it can be used in much modern software. However a major problem with this approach to fonts is one font may render the quadrat structure of hieroglyphic text differently to another font unless there is an agreed standard to which fonts conform. I therefore recommend that anyone thinking of making simplified fonts should work alongside the SINUHE project to maintain a common approach to hieroglyph sequence mapping.

I'm planning to support SINUHE text format conversion to and from the control character approach of the Unicode Standard for Hieroglyphic used in the Hieroglyphs Everywhere Project (HEP). The great thing about open standards is all software writers have the information to enable them to do the same thing if they wish.

Thanks So and Marwan for interesting discussions and your demos of the system in action.

Bob Richmond

Friday, 16 December 2016

Web browser test for hieroglyphic (December 2016)

I've released a Hieroglyphs Everywhere Project (HEP) web browser test page for Unicode hieroglyphic to provide a simple test you can use to check at a glance the status of the web browser(s) you are using on PCs or other devices.


HEP uses web fonts so there's no need to install a hieroglyphic font on a PC or other device. HEP resources work within the scope of Unicode (2009) hieroglyphic to preview features relating to the new hieroglyphic writing system (currently expected in Unicode 11, 2018).

I'm making the test page available ahead of time so we can become aware of any unexpected issues.

Up to date versions of Google Chrome, Microsoft Edge/Internet Explorer, and Mozilla Firefox work well as a general rule. Keeping your web browser up to date is highly recommended, irrespective of hieroglyphic, for security and performance reasons.

I've found Firefox works consistently over a wide range of devices (including Android, Linux, macOS and Windows systems) aside from some minor visual variations.

Chrome varies from system to system: for instance polychromatic is missing from Windows 7 and (surprisingly) Android at this time. On Windows 10 a Chrome bug displays bold polychromatic text in black and white.

Safari still renders polychromatic text as monochrome on macOS and iOS. Presumably this will be fixed sometime in 2017. Meanwhile Firefox is available to Mac users for greater web standards compliance including polychromatic text.

Internet Explorer, like Chrome, is limited in capabilities by its host system, for example IE on Windows 7 lacks modern font technology. If you are forced to use a legacy Windows 7 system I'd recommend you switch web browser from IE if you haven't already. I have a HEP work-around for some IE/Windows 7 issues but would rather spend the time needed on something useful unless this turns out to be a major problem.

During 2017, I expect to update and extend the test page from time to time. Several reasons for this can be anticipated.
  1. Progress with the Unicode Standard for the Egyptian Hieroglyphic writing system and repertoire extensions.
  2. Identification of browser bugs, e.g. in OpenType font rendering.
  3. Additional browser features or browser bug avoidance techniques used by the Hieroglyphs Everywhere Project.

Bob Richmond


Wednesday, 14 December 2016

Several Observations concerning Polychromatic Egyptian Hieroglyphic fonts

The Egyptian Hieroglyphic writing system is inherently polychromatic. The colour or colours used for individual hieroglyphs involve naturalistic, symbolic and conventional characteristics adapted to the materials or pigments available to the artist/scribe. For the most part, hieroglyphs were coloured consistently in a single painting or inscription. Conventions were used, for instance green is associated with growing plants and life, blue with the sky and primeval flood. These characteristics of hieroglyphic make it possible to envisage coherent polychromatic fonts with useful applications.

Symbol & Magic in Egyptian Art (R. H. Wilkinson, 1994), Chapter 5, provides a good introduction to Egyptian use of colour.

Hieroglyphic will be the first intrinsically chromatic Unicode writing system. It is important, however, to be clear that Unicode itself and its hieroglyphic content do not define chromatic features. This is done by systems built on Unicode such as CSS/HTML web standards, typically using the OpenType font standard for text rendering. Modern versions of web browsers such as Google Chrome, Microsoft Edge/Internet Explorer, and Mozilla Firefox already implement font technology to work with basic polychromatic, mainly thanks to the popularity of colour emoji.

Once hieroglyphic writing system additions are available in the Unicode standard there can be little doubt that polychromatic will dominate casual use of hieroglyphic. Hundreds of millions of children around the world encounter Egyptian each year as part of their education and colour makes hieroglyphic more comprehensible and interesting. This situation will benefit Egyptology since it will encourage digital industries to support Unicode hieroglyphic in their products. There are questions such as how far to scope casual hieroglyphic but the earliest this widespread deployment could begin is Summer 2018 so there is ample time to experiment with options and discuss details before a universal font rollout.

The situation with polychromatic for scholarly applications is different. Monochrome hieroglyphic has 150 years of publications to inform on strengths and weaknesses of transcriptions but there is no tradition of polychromatic typefaces at all. There are obvious benefits of colour such as hieroglyphs which look similar in monochrome yet have distinctive colour in paintings. Birds G001 ꜣ 𓄿 and G004 tyw 𓅂 are good examples. Nevertheless in the short term I expect most Egyptological work to continue to focus on monochrome. This principle also applies to implementation of Unicode hieroglyphic fonts and the methods, resources and tools for working with the next generation of hieroglyphic technology where there's a lot still to do. Incidentally, hieratic transcriptions that follow traditional use of red and black ink in hieratic do not need polychromatic fonts. Nevertheless if you are producing a book, thesis, museum website or the like aimed at the 2018/19 time-frame it is not too early to consider whether or how polychromatic hieroglyphic might be used to advantage.

My own work on polychromatic began with the simple question: is it technically feasible at the current state of technology? For a long time the answer has been no. As recently as January, when the Unicode hieroglyphic writing system was expected in 2017, polychromatic seemed best left to the second stage of development when technology was slightly further ahead. However, one positive result of the delay to 2018 is it now proves possible to bring polychromatic forward by a year for use by specialists (as noted above widespread deployment should wait on the standards process).

Hieroglyphic fonts are fairly complex to develop. Polychromatic adds more technical complexity. To simplify the first font I decided to focus on two aspects: 1. use in a dictionary application and 2. software user interface (UI). This reduces quadrat layout and chromatic complexity, for instance I could ignore vertical writing and the tall or complex quadrats needed in some transcription scenarios. This is still a work in progress but I hope to write more on the topic during 2017.

Bob Richmond


Thursday, 1 December 2016

Egyptian numbers and MdC transcription of Hieroglyphic

A useful feature of most implementations of Manuel de Codage (MdC) notation is distinct codes for numbers. There are five codes named 1, 2, 3, 4 and 5 used in the same way as hieroglyph alphanumeric codes or mnemonics as quadrat building blocks. This is especially helpful when dealing with units 1 to 9 where it is often useful to distinguish between the concept of unity (Z1) and the number 1, duality (Z4/Z4A) and the number 2, plurality (Z2) and the number 3.


The situation with numbers, plurality and so forth is more nuanced than this summary suggests. Ancient scribes did not think in these terms. Conventions changed over time. Hieratic featured ligated forms. Specialist representations of numbers were used for fractions and other applications. My point here is about basic practical use of MdC not this wider picture.

There are two important reasons to make the numeric distinction.

1. Fonts. Most fonts used for MdC implementations to date do not display '1' differently to Z1. This will change with a new generation of improved fonts. See Egyptian Grammar (Gardiner, 1957) p191-p206 for examples showing how the original Gardiner font makes a distinction between Z1 and some numeric forms.

2. Search and analysis. Such applications using MdC are rare so far but the next generation of software will add these features. For instance searching transcriptions for dates is very useful and it is likely such software will be strict about use of number forms to avoid confusion.

Many examples of the practice of treating Z1 and numbers as if they are the same cropped up in the analysis referenced in my earlier post Analysis of Unicode Egyptian hieroglyphs in a collection of MdC-coded transcriptions. Part of the explanation of this is the fact that some of the transcriptions originated with early MdC editing software that did not support the numeric feature.

Examples of number codes used incorrectly are A1*B1:Z2 (people, plurality) incorrectly transcribed as A1*B1:3; G1&Z1 (ideogram) incorrectly transcribed as G1&1; Z1*Z1*Z1:Z1*Z1 (3:2) incorrectly transcribed as Z1*Z1*1:Z1*Z1.

It is not an MdC error to use the Z1 form. The verbose Z1*Z1*Z1:Z1*Z1 transcription, for instance, is a legitimate alternative to 3:2 in most current MdC implementations although it may render differently and in a less satisfactory way in some of these MdC implementations. Sometimes verbose is unavoidable such as using the current release of WikiHiero that does not recognise numeric codes. However it can be expected that future releases of MdC software may give a warning when verbose sequences are encountered. I'm taking this approach with data and software in the Hieroglyphs Everywhere Project (HEP) to avoid confusion when linking traditional MdC with Unicode-based solutions. In short, as a general rule use numeric representations for units in MdC where feasible.
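The kind of warning mentioned above can be sketched in a few lines of Python. This is only an illustration, not how HEP or any existing MdC editor works: the function names are hypothetical, and it assumes verbose units are written as two or more Z1 codes joined by '*', as in the examples in this post. A single Z1 is left alone because it may be an ideogram stroke rather than a number.

```python
import re

# Assumption: a verbose unit is a run of two or more Z1 codes joined by '*'.
# The lookarounds stop the pattern matching inside longer codes such as Z15.
STROKE_RUN = re.compile(r"(?<![0-9A-Za-z])Z1(?:\*Z1)+(?![0-9A-Za-z])")


def find_verbose_strokes(mdc):
    """Return each run of repeated Z1 strokes found in an MdC string."""
    return [match.group(0) for match in STROKE_RUN.finditer(mdc)]


def suggest_numeric(mdc):
    """Suggest a rewrite that uses the MdC unit codes 1 to 5 for stroke runs."""
    def replace(match):
        count = match.group(0).count("Z1")
        # MdC unit codes stop at 5, so longer runs are left unchanged.
        return str(count) if count <= 5 else match.group(0)

    return STROKE_RUN.sub(replace, mdc)


print(find_verbose_strokes("Z1*Z1*Z1:Z1*Z1"))
print(suggest_numeric("Z1*Z1*Z1:Z1*Z1"))
```

The output is a suggestion only. Whether a given run of strokes really is a number is a decision for the person doing the transcription.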

MdC provides other numeric codes 10, 20, 30, 40, 50 and 100, 200, 300, 400, 500 used in a similar way to the units. I would like to recommend these but there is a bug in JSesh 5.5 that results in less satisfactory rendering of quadrats using these numeric forms. -20:10- is a less satisfactory rendering compared with -V20*V20:V20-. Fortunately, unlike the ambiguous situation with strokes, the linkage with Unicode is less problematic so for the time being it's mostly harmless to use verbose notation.

The narrow Z2 issue

One problem encountered with MdC transcriptions is the fact that Hieroglyphica (1993 and 2000 versions) omitted defining a code for the narrow variant of Z2. The narrow variant is a feature of the Gardiner font and commonplace in Egyptian Grammar but not explicitly given a code in the sign list. This narrow variant is known as Z002A in Unicode and Z2D in Hieroglyphica extensions such as the Aegyptus font. Z2D was used in InScribe (2004) and documented in EGPZ version 1 (2006).

Unfortunately, the Z2D omission from Hieroglyphica means transcriptions that need the narrow variant often resort to using the number 3 as a substitute when using editing software such as JSesh 5.5. This unsatisfactory situation needs to be resolved.

Personally, when working with JSesh material I write 3\NaN (meaning not a number) - a legitimate JSesh construction. So rather than write -D21:X1*3- (as in Egyptian Grammar Exercise XVIII) I use -D21:X1*3\NaN-.

I'm not especially advocating my approach at present, just drawing attention to the problem as something you will need to fix in your transcriptions in the future.

If you are developing MdC-related software such as WikiHiero it is safe and easy to add Z2D support.

Conclusions

In principle MdC support for numbers is fairly effective but has problems in some implementations. A full understanding needs a more detailed analysis than can be given in a blog post.

I hope that by raising these points MdC users will give more thought to representation of numbers when transcribing new material or updating existing transcriptions.

To end on a positive note, the situation with Unicode-based solutions is better defined and potentially easier to use.

Bob Richmond

Tuesday, 29 November 2016

Software Developer Guidance on supporting Egyptian Hieroglyphic in Unicode: Introduction

This is the first in what I hope to be a series of technical posts aimed at software developers interested in supporting the Egyptian Hieroglyphic writing system in their systems or applications.

Most Egyptologists and others interested in Hieroglyphic need not be aware of these technical details and can skip the topic.

Unicode has contained 1071 hieroglyph characters since 2009. Additional features to enable an actual writing system in Unicode were planned for Unicode 10 (2017) but these are on hold until some requests for specialist additions are clarified, meaning the release is delayed to Unicode 11 (2018) or possibly Unicode 12 (2019). These possible additions affect detailed font implementations but do not change what needs to be done to make an application capable of displaying Hieroglyphic when it becomes available.
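For developers who want to check the repertoire figure for themselves, here is a minimal Python sketch. The 1071 characters encoded in 2009 occupy the contiguous range U+13000 to U+1342E; the function names are my own for illustration and are not part of any library.

```python
import unicodedata

# The 1071 hieroglyphs encoded in 2009 run from U+13000 to U+1342E inclusive.
FIRST, LAST = 0x13000, 0x1342E


def basic_hieroglyphs():
    """Return the basic repertoire as a list of one-character strings."""
    return [chr(cp) for cp in range(FIRST, LAST + 1)]


def contains_hieroglyphs(text):
    """True if any character of text lies in the basic repertoire."""
    return any(FIRST <= ord(ch) <= LAST for ch in text)


signs = basic_hieroglyphs()
print(len(signs))
print(unicodedata.name(signs[0]))
print(contains_hieroglyphs("G001 \U0001313F"))
```

A test like contains_hieroglyphs is one simple way for an application to decide when it needs to select a hieroglyphic font for a run of text.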

I'm working on resources for 2017 to enable useful work to be done and software to be tested so a working software ecosystem already exists whenever the standard is formally released.

Two innovative aspects of hieroglyphic writing may be especially interesting to software developers.

1. Hieroglyphs are typically arranged in clusters

This is different to most writing systems where characters follow one after another. This feature means that hieroglyphic fonts are probably the most complex examples of OpenType and can potentially reveal bugs or deficiencies in the font processing software used by applications.

The web-browser situation is in pretty good shape. If you are developing web-apps for modern browsers there is probably little to be concerned about. Conversely one well-known development environment with only limited support for OpenType is Microsoft's .NET-based Universal Windows Platform (UWP) system which at time of writing does not render complex fonts.

2. Hieroglyphic is the first polychromatic writing system in Unicode

Polychromatic fonts are mostly used for Emoji characters at present predominantly via up to date OpenType font support. They are supported in the latest versions of popular web browsers Google Chrome, Mozilla Firefox and Microsoft Edge/Internet Explorer. However even simple color Emoji are still missing from many other applications.

Polychromatic hieroglyphic fonts are capable of rendering in monochrome so support is optional in your application. However casual users of hieroglyphs will likely be engaged by color and the feature also has value for scholarly work. Polychromatic fonts may gain popularity for other purposes so support is something you might like to add to your application road-map.

Many application developers will need access to suitable APIs to add polychromatic support to their application. Therefore at this stage I'm especially interested in hearing from developers of high profile applications and popular API libraries and can be contacted via http://www.hieroglyphseverywhere.net/Home/Contact.

Bob Richmond

Friday, 28 October 2016

Analysis of Unicode Egyptian hieroglyphs in a collection of MdC-coded transcriptions

This is a follow-up to my post last week MdC analysis for Unicode Repertoire Extensions.

I've applied the web app to a collection of 180 MdC files and summarised the results in Analysis of Unicode Egyptian hieroglyphs in a collection of MdC-coded transcriptions [PDF].

There is also a minor update to the MdC analysis for Repertoire Extensions web app itself fixing a couple of bugs and increasing the number of repertoire candidates to 200.

Bob Richmond


Wednesday, 19 October 2016

MdC analysis for Unicode Repertoire Extensions

As part of discussions on expanding the hieroglyph repertoire in Unicode it is useful to be able to inspect existing digital documents in Manuel de Codage (MdC) format. I've therefore made a web app available for this purpose: MdC analysis for Repertoire Extensions.

Most users of MdC will probably find the app instructive, whether interested in Unicode developments or not.

MdC methods of encoding Egyptian hieroglyphs have been around for over 25 years. MdC has proved by far the most popular method of digitally encoding hieroglyphic for publishing and database-type applications.

One complication is the fact that MdC was never technically defined in detail and work on the system appears to have stopped after the publication of the second edition of Hieroglyphica (2000) and before documentation was made available online. Therefore, several interpretations, extensions, variations and subsets of MdC are in existence (e.g. WinHiero, JSesh 5.5, WinGlyph and InScribe 2004). The web app attempts to be fairly permissive on what variation of MdC is analysed.
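To give a rough idea of what inspecting an MdC document involves, here is a small Python sketch that counts how often each code occurs. It is not the method used by the web app: it assumes codes are separated only by '-', ':', '*', '&', parentheses or whitespace, whereas real MdC dialects have many more constructs, and the function name is hypothetical.

```python
import re
from collections import Counter

# Assumption: only these separators occur between sign codes. Real MdC
# dialects (JSesh, WinGlyph and others) add further syntax not handled here.
SEPARATORS = re.compile(r"[-:*&()\s]+")


def count_codes(mdc):
    """Count occurrences of each sign code in a simplified MdC string."""
    tokens = [token for token in SEPARATORS.split(mdc) if token]
    return Counter(tokens)


counts = count_codes("A1*B1:Z2-G1&Z1-Z1*Z1*Z1:Z1*Z1")
for code, number in counts.most_common():
    print(code, number)
```

Codes found this way can then be compared against the Unicode repertoire to see which signs in a document have no Unicode character yet.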

There is something of the chicken and the egg about releasing an app before there is a clear vision of the first expansion of the Unicode hieroglyph set. Bear that in mind.

I hope to evolve and improve the app over the next few months so feel free to send feedback via www.egpz.org.

Bob Richmond

Tuesday, 11 October 2016

Unicode plain text proposal status (October 2016)

Summary

Plain text hieroglyphic writing in Unicode is currently on hold while some technical points are investigated. These are use of EGYPTIAN HIEROGLYPH LIGATURE JOINER, extensions for rare forms of writing and extensions for vertical text (the initial proposal was focused on the forms of horizontal writing that account for the vast majority of hieroglyphic in print and first generation digital formats).

This means the earliest that hieroglyphic writing will be released as part of the Unicode Standard is Unicode 11 (2018), a year later than the Unicode 10 release previously planned. This delay unfortunately means it won't coincide with the 150th anniversary of the first print publication using a hieroglyphic typeface (mentioned in Some remarks about Unicode Hieroglyphic fonts).

This is a nuisance but in practical terms there is no reason to hold up work on building an ecosystem for Unicode hieroglyphic writing. It simply means it will be necessary to use an approach such as the web font referenced in Unicode Hieroglyphs in web browsers: Web Fonts as the basis of fonts and tools, accepting some limitations in what can be done until the Unicode standard is updated and implemented by web browsers, system software, and applications such as word processors.


Discussions

This Summer featured much discussion of the Proposal to encode three control characters for Egyptian Hieroglyphs (L2/16-018R) at the Informatique et Égyptologie - Cambridge - 2016 meeting in July, and afterwards. In early August, several of us from the I&E meeting participated in a telephone discussion with members of the Unicode Technical Committee (UTC).

There were also discussions about expanding the repertoire of Unicode hieroglyphs. In practical terms, this is the main obstacle to fully encoding a range of ancient sources one might expect to be represented in a plain text writing system. Repertoire is not entirely unconnected with control characters but the two can proceed separately in the standardisation process so I'll treat this topic another time.

The purpose of the proposed three control characters is to enable Hieroglyphic writing in Unicode (the current situation is 1071 basic hieroglyph characters were incorporated in the standard in 2009 but there is no way to form quadrats so there is no authentic writing system as such). Most participants were in agreement with the principle of enabling the writing system. However, the Thesaurus Linguae Aegyptiae (TLA) and Ramses corpus projects have objected to moving forward with the three until additional features are provided.

The reason these extensions were not proposed for the first release was to focus on the vast majority of modern hieroglyphic in typeset books and digital formats which use horizontal writing and do not need additions to enable digital encoding.

Interest was expressed in extending the scope of L2/16-018R to deal with vertical writing and the related 'tall quadrat' orthography (used in some horizontal writings). Some are of the opinion it is important this is done for the first release of support for the writing system rather than as a second stage. I've described two extended control characters, EGYPTIAN HIEROGLYPH HORIZONTAL GROUP JOINER and EGYPTIAN HIEROGLYPH HORIZONTAL GROUP JOINER, which I've been using for vertical text evaluation, in a discussion document An Extension to the three control characters for Egyptian Hieroglyphs and some additional remarks (L2/16-214).

Several instances of rare quadrat arrangements were noted that cannot be represented elegantly or unambiguously using L2/16-018R. Analysis so far suggests these account for of the order of 0.01% (sic) of the digitised corpus but may be more common in certain ancient contexts.

Additional controls can be added to deal with rare quadrats but the issue needs to be better characterised and agreed by Egyptologists before deciding what to do. As with the basic 3 characters, data needs to be studied and evaluated before submitting a formal proposal.

Discussions in July yielded a consensus with TLA and Ramses projects on implementation of two of the three characters namely EGYPTIAN HIEROGLYPH HORIZONTAL JOINER and EGYPTIAN HIEROGLYPH VERTICAL JOINER, published as L2/16-227.

M-J Nederhof presented a discussion document L2/16-177 at the I&E meeting based on adapting his RES scheme as an alternative approach to control characters for Unicode quadrat sequences without using the horizontal or vertical joiners. This was followed by a revised document L2/16-210 with addendum L2/16-233 which outline two alternative versions and notations of his system.

There are many alternative ways one might define quadrat sequences using various levels of complexity but there would need to be convincing evidence to drop the vertical and horizontal joiners and/or require complex or hard to read sequences for simple quadrats. It is obviously important that any proposed alternatives are capable of implementation in current technology. 

Discussions continue on the Egyptian Hieroglyphs in the UCS mailing list (see archives at http://evertype.com/pipermail/egyptian_evertype.com/). Egyptologists and others with an interest in the topic are encouraged to join and participate in or follow the list.

Status of the L2/16-018R proposal

L2/16-018R was published in January 2016 as a revision to the original May 2015 publication L2/15-123. No objections had been received by UTC during May-January so the proposal was put out to international ballot as a UTC recommendation in January 2016. Comments were received by UTC in April 2016 (L2/16-090, my reply at L2/16-104) where specific objections were made to the EGYPTIAN HIEROGLYPH LIGATURE JOINER.

Following discussions outlined above, L2/16-018R is on hold until it is determined what additional features are required to obtain consensus. I suspect the earliest this can be reviewed by UTC is January 2017.

Over the next few months it would be really useful if comments, requirements or objections about any additions can be made to UTC in a timely fashion to avoid further unnecessary delays.

Bob Richmond

Thursday, 29 September 2016

Some remarks about Unicode Hieroglyphic fonts

Some topics are so well known that often they are not even mentioned. This post is about one such topic, namely that there are many modern fonts on computers and other digital devices so a writer or publisher needs to choose one or more that work for a specific purpose. Choose the wrong font for your content and the result will be missing or unsatisfactory characters. This is true for hieroglyphic like every other writing system.

Currently all Unicode Egyptian Hieroglyphic font releases I know of implement the whole Unicode Standard (2009) basic repertoire of 1071 hieroglyphs. There are no subset fonts for specific purposes such as early school education where two hundred signs would be more than adequate. So right now we don't see missing characters or need to take a variety of fonts into account to a great extent.

With ongoing work to extend the Unicode repertoire and include quadrat shaping to make an actual writing system, this situation will likely change considerably. A font designer concerned with the classical phase of the writing system will not want to spend months or years dealing with thousands of specialist hieroglyphs attested only from the Ptolemaic period. A font designed to work well for Ramesside hieratic transcription into hieroglyphic cannot be expected to optimise quadrat arrangements irrelevant to hieratic. A font optimised for small print (e.g. 12pt - 18pt) may make different design choices for glyphs and quadrat shaping than one optimised for larger sizes (24pt plus).

Furthermore decorative colour fonts are now possible using OpenType but such fonts may be limited in scope to specific purposes.

In short, over time we can expect a wide range of hieroglyphic fonts to evolve, as with modern writing systems: some aspiring to beauty, others providing specialist functionality. Their scope of use will often be different.

First generation digital hieroglyphic systems (typically using one or another form of Manuel de Codage aka MdC coding) mask these issues. Specialist hieroglyphic software to date typically uses a single 'one size fits all' font such as a Gardiner or Hieroglyphica derivation with a single fixed method of arranging hieroglyphs in quadrats. New thinking is needed to embrace a rich world of multiple fonts and take advantage of the Unicode principle of separating plain text encoding data from the fonts used to render the text.

Some of the difficulties in gaining consensus on next steps for Unicode appear to be grounded in the mistaken notion that Unicode should be used in exactly the same way as MdC practice, as if the goal were a single Unicode hieroglyphic font that does everything an Egyptologist could ever ask for. There is room for a general purpose font or two - a fallback font - but it should be understood it's a small part of the story for many purposes.

One practical consequence of font diversity is Unicode hieroglyphic in plain text can expect to show an 'unsupported character' glyph when displayed using a font that does not support the writer's intent. Likewise a quadrat control character sequence will have visible control characters for quadrats the font does not support and might look like:

(actually all quadrats in this specific illustration will likely be fine in most fonts but it illustrates the principle).

This is no different to the situation with Unicode in many disciplines. Mathematical typesetting is a good complex example but even simple examples are commonplace in everyday life if you read beyond a-z. The solution is to understand what you are doing and use appropriate fonts, formats, and software tools for the task in hand. If specific fonts don't do the job you simply don't use them.
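The idea of checking text against a font can be sketched in Python. The coverage set below is invented for illustration (a real tool would read the list of supported characters from the font file itself), and the function name is hypothetical.

```python
def unsupported_characters(text, supported_codepoints):
    """Return the characters of text missing from the font, in first-seen order."""
    missing = []
    for ch in text:
        if ord(ch) not in supported_codepoints and ch not in missing:
            missing.append(ch)
    return missing


# Hypothetical coverage of a tiny teaching font: G001 and G004 only.
school_font = {0x1313F, 0x13142}

# G001, G004, then A001, which this imaginary font does not contain.
sample = "\U0001313F\U00013142\U00013000"
for ch in unsupported_characters(sample, school_font):
    print(f"U+{ord(ch):04X} is not in the font")
```

Any character reported here is one that would show as an 'unsupported character' glyph, or be drawn from a fallback font, when the text is displayed.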

Like many concepts this topic will be trivial to understand once hieroglyphic is available for use as a writing system and multiple fonts are released. For now it needs a little imagination.

Historical note

As far as I know (and someone correct me if I am wrong!) the first hieroglyphic typeface was that used in the hieroglyphic dictionary and grammar of Samuel Birch, published in 1867. Next year, we can celebrate 150 years of hieroglyphs in print.

Facsimiles of typeface used in Samuel Birch's Grammar and Dictionary from 1867.

The Gardiner/Oxford metal typeface was cut around 60 years after Birch/Bonomi/Longman and several other fonts were made and used during the intervening years. Lepsius/Theinhardt is probably the best known of these. The usual MdC situation with a single all-purpose font has a long tradition in print.

It is only recently that we are starting to see richer use of multiple fonts. A good example is Middle Egyptian Literature (James Allen, 2015) which makes effective use of a distinctive bold face.

Distinctive bold font in Middle Egyptian Literature (2015)

I expect we'll see many more innovations once the additions to Unicode are available and supported. We are approaching the beginning of a new era.


Bob Richmond

Monday, 26 September 2016

Unicode Hieroglyphs in web browsers: Web Fonts

I outlined the situation with hieroglyphic for general purpose web pages in an earlier post Unicode Hieroglyphs in web browsers: generic web pages. This note outlines another way of presenting Unicode hieroglyphs on a web page designed for Egyptian.

Web fonts are fonts that are requested by web pages for download (if necessary) for use on the web page. WOFF (the Web Open Font Format [Wikipedia]) is the international standard for packaging web fonts. All popular browsers have supported WOFF since 2010. Browsers may still support older schemes for 'font embedding' for compatibility issues with old web pages but for the vast majority of web sites WOFF should be used and the older schemes avoided.

Web fonts are especially useful for web pages that want to make use of specialist writing systems without having to rely on the writing system being fully supported on your device in order to read the page. They are therefore potentially useful for Egyptian Hieroglyphic in Unicode.

Some useful points to be aware of.

  • Web fonts are compressed and usually 'cached' on the local device so they need not be downloaded each time they are used. This is especially useful and efficient for mobile devices.
  • Browser software usually needs to be enhanced every time the Unicode standard is updated as relates to a specific writing system. This will affect progress during introduction of the hieroglyphic writing system.
  • Services such as Google Fonts can supply identical fonts to a wide range of web sites. Google Fonts is easy for web site developers to use, and only a single copy of the font needs to be downloaded for sharing among sites that use it. For Egyptian Hieroglyphs the recent delays in Unicode standardisation and the cost of creation and release of suitable Google Fonts means this route will probably be impractical for some time unless there is adequate support for the process.
  • Fonts can be provided on an individual site basis. This is the short term approach to be adopted for the Hieroglyphs Everywhere Project web site which also hopes to provide the tools to enable other sites to do the same once the Unicode standard is updated. The same approach could be adopted for a large web site such as Wikipedia.
  • If you copy hieroglyphic from a web page that uses a web font to another application (or web page), that application (or web page) will need to have access to the same (or similar) font in order to gain a satisfactory result. This is no different from any other writing system but is less obvious in well-supported systems such as the Latin-based scripts used for English and other popular languages.

Much can be said on technical aspects of web fonts so I'll likely return to the topic when there is more use of Egyptian Hieroglyphic in Unicode on the web.

In order to set the ball rolling on hieroglyphic web fonts I've created a Browser Test Page for Hieroglyphic in Unicode development. This page uses a web font with a feature to get around the fact that the hieroglyphic writing system is not yet available in Unicode (we have 1071 hieroglyphs but control characters for quadrat arrangements are still being delayed). You can use the page to confirm your web-browser is working correctly. (If there are problems on specific devices I'd like to hear about them so this can be resolved or documented to inform others in a similar situation. Thanks.)

Bob Richmond

Monday, 15 August 2016

Informatique et Égyptologie - Cambridge - 2016

A meeting of the working group “Informatique et Égyptologie” of the International Association of Egyptologists took place at the Fitzwilliam Museum, Cambridge on 11-12 July, 2016.

Presentations and discussions were almost entirely concerned with the hieroglyphic writing system in future additions to the Unicode Standard. The main areas of focus were:


A summary of the meeting by Debbie Anderson of the Script Encoding Initiative, Berkeley is available as Brief Report from Cambridge meeting of Egyptologists and Update [pdf].

Ongoing discussions following the meeting are taking place on the Egyptian Hieroglyphs in the UCS mailing list (see archives at http://evertype.com/pipermail/egyptian_evertype.com/). I'll try to deal with some of these follow-up activities in future blog posts.

If documents from presentations made at the meeting become available online, I'd like to link to them here. Let me know if or when anything becomes available. Thanks.

Bob Richmond

Monday, 18 July 2016

Simple higher level protocols and Unicode Plain Text hieroglyphic writing

Simple (plain) text has limitations in what can be expressed without additional information. There are various 'Higher Level Protocols' (HLP) used to enrich all kinds of writing systems. HTML/CSS and various document formats such as those used in 'office' applications are kinds of HLP managed as international standards. To use Unicode Egyptian hieroglyphic along with these complex protocols it is desirable for Egyptologists and others to develop conventions for working with these protocols in consistent ways so as to be able to share rich data. Something for the future.

However, not all protocols need be complex like HTML/CSS.

Consider hieroglyph combinations already encoded as characters in Unicode. 𓃁 (ab) is 𓂝 (a) on top of  𓃀 (b). Gardiner calls these combinations Monograms and the current practice in Unicode is to encode monograms as separate characters.

As things stand in Unicode there is no character for 𓂧 (d) on top of 𓃀 (b), the combination (db), which is rare as a monogram. For many applications and users this is not a problem, but it may crop up in some situations such as a corpus database.

The answer is to define or adopt a simple higher level protocol. For instance use the character '+' to indicate 'on top of' so (db) is encoded 𓂧+𓃀. Similarly for any other monograms not yet encoded. This is still plain text just not pure hieroglyphic so can be used in applications and databases. To render this combination visually software will need adaptation but this is something that needs to be done in general for hieroglyphs not yet encoded in Unicode so there is little impact.
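To make the idea concrete, here is a minimal Python sketch of such a protocol, assuming '+' is the only operator and that it joins exactly the sign before it to the sign after it. The function names and the tiny table of encoded monograms (holding just the (ab) example from above) are illustrative assumptions, not part of any standard.

```python
# Signs referred to in the text, looked up by their Unicode character names.
SIGN_A = "\N{EGYPTIAN HIEROGLYPH D036}"   # (a)
SIGN_B = "\N{EGYPTIAN HIEROGLYPH D058}"   # (b)
SIGN_D = "\N{EGYPTIAN HIEROGLYPH D046}"   # (d)
SIGN_AB = "\N{EGYPTIAN HIEROGLYPH D059}"  # (ab) monogram, already encoded

STACK = "+"  # protocol character meaning 'on top of'

# Illustrative table: stacks that already exist as single characters.
ENCODED_MONOGRAMS = {(SIGN_A, SIGN_B): SIGN_AB}


def parse_stacks(text):
    """Split text into clusters; each cluster lists signs top to bottom."""
    clusters = []
    join_next = False
    for ch in text:
        if ch == STACK:
            if not clusters or join_next:
                raise ValueError("'+' must sit between two signs")
            join_next = True
        elif join_next:
            clusters[-1].append(ch)   # stack under the previous sign
            join_next = False
        else:
            clusters.append([ch])     # start a new cluster
    if join_next:
        raise ValueError("text ends with a dangling '+'")
    return clusters


def to_plain_text(clusters):
    """Serialise clusters, preferring an encoded monogram when one exists."""
    parts = []
    for cluster in clusters:
        encoded = ENCODED_MONOGRAMS.get(tuple(cluster))
        parts.append(encoded if encoded else STACK.join(cluster))
    return "".join(parts)


if __name__ == "__main__":
    print(parse_stacks(SIGN_D + STACK + SIGN_B))
    print(to_plain_text([[SIGN_A, SIGN_B], [SIGN_D, SIGN_B]]))
```

A database could store the '+' form as ordinary text and leave rendering of the stacked pair to whatever software displays it.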

This reasoning applies to other specialist writing features for which there is not sufficiently strong evidence to warrant direct support in plain text at this time. I expect various conventions will evolve as users gain more familiarity with Unicode.

There is an active discussion as to whether the policy of encoding monograms as separate characters should be continued or an additional control character should be introduced to take the role of + as given here. Either way, the method described here enables work on Unicode solutions to continue regardless.

Bob Richmond


Thursday, 7 July 2016

Transcription of Hieroglyphic into Unicode Hieroglyphic Plain Text: Part 1

Modern web pages and printed publications are dramatically more elaborate than documents from the 1950s which in turn used advances in printing technologies unavailable to our ancestors. Complex examples surround us online and in print: newspapers and magazines, product labelling, instruction manuals, advertisements, reference works, textbooks and specialist publications. The list goes on and on. Graphics, emojis and icons often complement textual elements which themselves increasingly use a variety of writing systems. Even when written in a language such as English, with its simple alphabet and easy to understand notion of plain text, text in documents can be very elaborate and feature a range of scripts, fonts and typographic styles.

Turning attention to hieroglyphic, consider the Middle Kingdom Stela (BM EA851, from BM page):


There is a line of horizontal text and eight columns of vertical text beneath, along with some pictures/hieroglyphs. Note: traditionally, an Egyptologist might create a line drawing of a stela rather than use a photograph as here; this is especially useful when there is damage or there are hard to read hieroglyphs. To discuss the text content, the text elements may be extracted and transcribed, often transcribing vertical text into horizontal writing. Hieroglyphic fonts have been available for 150 years and can be useful for this purpose. Transcription to Unicode Plain Text Hieroglyphic (when standardised) may be used in a similar way.

Modern technology has changed the way we look at artefacts that feature Ancient Egyptian in hieroglyphic. Our eyes are now used to elaborate web pages and publications in everyday life. The idea of arranging text and graphics in the Egyptian way is not as unusual as it would have appeared to scholars or general readers in the 19th and 20th centuries. Older books on the topic can therefore emphasise the different and unfamiliar, but this view is becoming increasingly anachronistic to the modern reader.

Web technology and editing tools are also designed around increasing complexity of documents and this means there are now natural ways to think of transcription of hieroglyphic sources in terms of modern technology and standards. In some cases this may mean changes, sometimes subtle, to traditional thinking on digital representations of hieroglyphic.

It is instructive to look at any complex web page based around a simple basic writing system such as English and consider how it might be transcribed as plain text. This exercise can help understand what needs to be considered for Egyptian and avoid over-enthusiastic expectations as to what can be expected from hieroglyphic in Unicode plain text.

This post is about transcription of hieroglyphic sources to Unicode plain text (however it ends up being standardised). It is important to distinguish this practice from hieratic transcription to hieroglyphic (see Transcription of Hieratic into Unicode Hieroglyphic: Part 1 and later posts on the topic) which uses the same encoding system but has its own specific issues.

There are several points on transcription I'd like to make now in this introduction.
  • Limitations to plain text transcription implied by fonts. Obviously a general purpose hieroglyphic font cannot be expected to contain hieroglyphs that look exactly the same as those in an original document, inscription or painting. The same hieroglyph may be written differently in the same document.
  • Different fonts for plain text may render hieroglyph clusters in different ways. For example a font optimised to represent text in a traditional Manuel de Codage (MdC) layout may look different to a font designed to make a more pleasing generic representation. A font optimised for Late Egyptian hieratic transcription may make different choices for its subject matter. At some point we will have colour fonts in various styles. So far, most Egyptologists are only familiar with editing tools that are limited to a single font. Choice and variety enable a large step forward but can require some fresh thinking.
  • Higher level protocols are essential for many scholarly requirements from hieroglyphic. This will involve defining conventions for web pages (HTML/CSS) and new encoding formats using XML, near plain text, or adaptations of other formats. Work on this has scarcely begun.
  • Transcription is not meant to replace facsimile, photography or other techniques for representing hieroglyphic sources. In many cases the goal is to augment the source in order to promote search-ability, analysis and readability.
  • Typically an Egyptologist will transcribe into left to right horizontal text. If the source material uses columns the vertical text arrangements of hieroglyphs is often not copied verbatim but adapted into a more horizontal style of writing. Partly this is about convenience and uniformity but it is also the case that hieroglyphic fonts don't yet support right to left. General purpose software applications such as word processors are not yet geared up to work terribly well with column text. This is not an intrinsic limitation of the Unicode concept of plain text but likely a practical issue for the next few years at least.
I hope to return to various facets of this large topic in future posts.

Bob Richmond


Monday, 27 June 2016

Unicode Hieroglyphs in web browsers: generic web pages

In an ideal world, Egyptian hieroglyphs in Unicode will simply appear as such to the reader when they are used on a website page. The general reader should not have to do anything special such as install a font or configure the web browser in order to display Unicode text.

Egyptian-aware web pages may render text as images (as mentioned in various earlier posts, e.g. on WikiHiero) or use web-fonts to provide a full text experience. These techniques generally work well with reasonably up to date web browsers on all kinds of devices.

However, this post is concerned with generic web pages meaning web pages without any special coding for Egyptian hieroglyphs. These pages rely on the web browser and/or its underlying operating system to render correctly.

Probably the best known example of a generic web page is the Google search page www.google.com. This blog post itself is also such a web page, hosted on www.blogspot.com so if all is well with the browser and device on which you are reading you should see hieroglyphs at the end of this sentence - 𓇳𓄟𓋴𓋴.

If you then copy and paste these four hieroglyphs into Google search, you see something like this:


I've highlighted the hieroglyphs in red boxes. Search is just one example. If you want to write hieroglyphs in web forums or read hieroglyphs as text on an arbitrary web page chances are you are reliant on generic hieroglyph support from your web-browser.

Modern versions of macOS/iOS from Apple and Windows 10 from Microsoft are Egyptian hieroglyphic ready 'out of the box' and so are their supplied web browsers (Safari, Edge or Internet Explorer). So is a correctly configured Linux distro. Generic web pages just work for users of hundreds of millions of these modern devices.

If this works for you, great. The browser or system you are using is Unicode hieroglyphic ready and you are done with this post unless you are curious about technicalities.

However the global picture is not yet entirely rosy. Android, by far the most popular system for mobile phones and tablets, is not yet up to speed. Old versions of iOS/macOS and Windows are not hieroglyph ready. When you try the Google search on these kinds of system you likely see box characters where hieroglyphs are expected unless something has been added to these systems to avoid problems.

Why hieroglyphs work (or don't)

Technically for a web browser to render hieroglyphic text on a generic web page all it needs to do is:

1. Recognise the text characters are hieroglyphs.
2. Use a font with hieroglyphs to render the characters.
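These two steps can be sketched in Python as follows. This is a much simplified model of what a browser does, not a description of any real browser's code. It assumes hieroglyphs are identified purely by the Egyptian Hieroglyphs block (U+13000 to U+1342F), and the list of candidate fonts is an illustrative assumption.

```python
from itertools import groupby

# Egyptian Hieroglyphs block as introduced in Unicode 5.2 (2009).
HIERO_FIRST, HIERO_LAST = 0x13000, 0x1342F

# Illustrative preference list of fonts that contain hieroglyphs.
HIERO_FONTS = ["Segoe UI Historic", "Noto Sans Egyptian Hieroglyphs"]


def is_hieroglyph(ch: str) -> bool:
    """Step 1: recognise that a character is a hieroglyph."""
    return HIERO_FIRST <= ord(ch) <= HIERO_LAST


def split_runs(text: str):
    """Split text into (is_hieroglyphic, substring) runs."""
    return [(flag, "".join(chars))
            for flag, chars in groupby(text, key=is_hieroglyph)]


def font_for_run(is_hiero: bool, installed, default: str = "sans-serif"):
    """Step 2: pick a font for a run, or None if nothing suitable exists."""
    if not is_hiero:
        return default
    for name in HIERO_FONTS:
        if name in installed:
            return name
    return None  # no hieroglyphic font: expect box characters


if __name__ == "__main__":
    sample = "Ramses: \U000131F3\U0001311F\U000132F4\U000132F4"
    for flag, run in split_runs(sample):
        print(repr(run), "->", font_for_run(flag, {"Segoe UI Historic"}))
```

When `font_for_run` returns `None` for a hieroglyphic run, that corresponds to the box characters described above: the text was recognised but no installed font could draw it.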

The Google search illustration above is from Windows 10 using Internet Explorer 11. Everything works as expected because Windows 10 comes with the Segoe UI Historic font (which contains Unicode hieroglyphs). Internet Explorer recognises hieroglyph characters and, in the absence of any more specific font specifications on the web page, uses the Segoe font as default. The Microsoft Edge and Mozilla Firefox browsers also work correctly.

However, even on Windows 10, the Chrome browser (version 59 and earlier) is not hieroglyph ready. It does not detect hieroglyphs on a web page and choose an available font. Chrome 59 serves as an illustration of how things can go wrong when a browser has bugs in font handling.

The latest release of Android (6.0.1) has an additional problem in that Android comes with a very limited number of fonts installed. A hieroglyphic font is not among them, so hieroglyphic is just one of a large number of writing systems that don't render in Chrome, the standard Android web browser, because no font is available or automatically downloaded when required. In theory an Android device could work if the device maker supplied an enhanced version of Android. In practice I've never seen this happen. Presumably we'll see a working version of Chrome from Google eventually. Meanwhile Chrome should be fine for most Egyptian-aware web pages.

Getting hieroglyphs to work for generic web pages

With the huge range of devices and browsers available nowadays it is impossible to give detailed information about what may or may not be done to address problems if your setup is not hieroglyph ready.

In many cases the simplest solution is to update your system setup if possible and/or ensure your web browser version is kept up to date. The technical world has moved a long way since 2009 when hieroglyphs were introduced to the Unicode standard but initially unsupported in the then latest systems such as Ubuntu 10.04 (Lucid Lynx), Mac OS X 10.6 (Snow Leopard) and Windows 7.

Updating is easy for most Linux distros; Apple now provides macOS/iOS updates free of charge; and likewise Microsoft provides updates from Windows 7/8 to Windows 10 (strangely, this is currently stated to be free of charge for only a limited time until late July 2016).

If updating is impossible, it seems the Firefox browser works correctly on Linux and Windows devices I've tried so long as it finds a Unicode hieroglyphic font installed. This may work for you. In particular if you are stuck with an old PC setup with Windows 7 it should be sufficient to install a Unicode hieroglyphic font then use Firefox for generic web-browsing instead of Internet Explorer or Chrome. Other less well known browsers may also work.


Bob Richmond

Tuesday, 21 June 2016

Hieroglyphs on the web: The Digital Topographical Bibliography

The Topographical Bibliography of Ancient Egyptian Hieroglyphic Texts, Statues, Reliefs and Paintings (Topographical Bibliography or TopBib for short) is a long running project (work began in the early 20th Century) based at the Griffith Institute, Oxford. The first part of Volume I of the Bibliography - THE THEBAN NECROPOLIS PART 1. PRIVATE TOMBS - was published in 1927. Since then new volumes and revised versions have been published.

The first seven printed volumes of TopBib followed the Topographical approach. The more recent Volume VIII - OBJECTS OF PROVENANCE NOT KNOWN (2000-) - instead, for obvious reasons, organises objects by type and period.

A brief  History of the Topographical Bibliography is available on the Griffith Institute website.

TopBib is the essential and definitive resource concerning Ancient Egyptian objects.

Behind the printed volumes, modern technology was introduced to TopBib during the editorship of Jaromir Malek with use of databases for digitised versions of the text, hieroglyph encodings and so forth. Some material was made available on the web and the notion of an online version of the Bibliography devised. Aside from the benefits of making digitised material available to scholars, technology helps work on TopBib to continue more efficiently (there is a large amount of material available yet to be analysed and published).

The Digital Topographical Bibliography currently provides access to the printed editions (mostly in PDF format). A small amount of material is also available in new data formats and a useful reference system has been introduced (see DIGITAL TOPOGRAPHICAL BIBLIOGRAPHY: The Digital Approach).

TopBib was studied as part of the process of adding Egyptian Hieroglyphs to the Unicode Standard. The printed editions originally employed the Gardiner 'Oxford' font used as the primary reference for the initial hieroglyph repertoire.

The idea of using Unicode plain text hieroglyphic for Digital TopBib goes back over a decade and it's a good example of a major publication for which plain text meets all hieroglyph requirements. Much work needs to be done to complete the first digital edition but this should benefit greatly from plain text availability.

Bob Richmond

Thursday, 16 June 2016

Transcription of Hieratic into Unicode Hieroglyphic: Part 2

This post follows on from my earlier Transcription of Hieratic into Unicode Hieroglyphic: Part 1 which stressed the importance of hieratic transcription as an application of Unicode hieroglyphic.

The initial Unicode collection of Egyptian hieroglyphs released in 2009 was based on the Gardiner font and sign list. There is fairly good coverage of signs required for transcription but it is useful to consider how the situation can be improved in the context of Extending the Hieroglyph repertoire in Unicode.

One influential work on hieratic was Hieratische Paläographie by Georg Möller (1876-1921), published in four volumes: Volumes I-III 1909-12 and Volume IV 1936 (with introduction by Hermann Grapow). [PDF versions are available for download here].

Möller employs numeric codes for hieroglyphs corresponding to hieratic elements as seen in this illustration from Volume I:



Volume II pp 71-74 links these hieratic codes to the alphanumeric hieroglyph codes used in the Theinhardt font produced for Lepsius.

Hieratic examples are given from a variety of sources, organized by different periods from Old to Late Egyptian through to the Greco-Roman period. Some examples from Volume I:


Here, Möller codes 200 and 200b match Gardiner codes G43 and Z7 and hence Unicode 𓅱 G043 and 𓏲 Z007. Gardiner was very familiar with Hieratische Paläographie and its coding system so it is unsurprising that hieroglyphs in the Gardiner font and coding system link to the Möller numeric system. See Identification of the signs from the Hieratische Paläographie [M-J Nederhof website] for a list of matches between encoded hieroglyphs and Möller codes.
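The chain from Möller code to Gardiner code to Unicode character can be sketched in a few lines of Python. The lookup table below holds only the two pairs quoted above (a usable table would be far longer), and the function names are illustrative. The one firm fact relied on is that Unicode character names for this block take the form `EGYPTIAN HIEROGLYPH` followed by the Gardiner category and a number zero-padded to three digits.

```python
import re
import unicodedata

# Only the two matches quoted in the text; not a complete concordance.
MOELLER_TO_GARDINER = {"200": "G43", "200b": "Z7"}


def gardiner_to_unicode_name(code: str) -> str:
    """Turn a Gardiner code such as 'G43' into a Unicode character name."""
    match = re.fullmatch(r"([A-Za-z]+)(\d+)([A-Za-z]?)", code)
    if match is None:
        raise ValueError(f"not a Gardiner-style code: {code!r}")
    category, number, variant = match.groups()
    # Unicode names pad the number to three digits, e.g. G43 -> G043.
    return (f"EGYPTIAN HIEROGLYPH "
            f"{category.upper()}{int(number):03d}{variant.upper()}")


def moeller_to_char(moeller: str) -> str:
    """Return the Unicode hieroglyph for a Möller code in the table."""
    gardiner = MOELLER_TO_GARDINER[moeller]
    # unicodedata.lookup raises KeyError if the sign is not encoded.
    return unicodedata.lookup(gardiner_to_unicode_name(gardiner))


if __name__ == "__main__":
    for code in ("200", "200b"):
        ch = moeller_to_char(code)
        print(code, MOELLER_TO_GARDINER[code], f"U+{ord(ch):04X}", ch)
```

Because `unicodedata.lookup` raises `KeyError` for names that do not exist, the same function gives a quick way to spot Gardiner-style codes that have no Unicode character yet.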

Nevertheless not all Möller codes are present in the Gardiner font and sign list, for example Möller 131 (mouse: encoded as E130 in Hieroglyphica but not yet in the Unicode repertoire).

Möller provides lists of groups/ligatures such as:


which need to be available in any Unicode plain text system.

Regarding extensions to the Unicode hieroglyph repertoire, it is desirable to add Möller codes to the Unicode hieroglyph database of candidates for encoding. His work is over a century old but has been influential and any errors known to modern Egyptologists can be identified in the database (when available for review).

I will be recommending that hieroglyphs found in the Möller list but not yet encoded in Unicode be part of the next set added to the standard, thereby improving the scope of Unicode for transcription of hieratic to hieroglyphic.

This is not to ignore more recent scholarly work involving hieratic. If well-documented material from modern databases such as the Ramses Project and Thesaurus Linguae Aegyptiae or other publications is available to further improve hieratic transcription this data should also be added to the Unicode hieroglyph database of candidates for encoding.

As a point of interest, I am documenting all Möller groups/ligatures (where applicable) in the cluster list referred to in Foundations of a Universal Egyptian Hieroglyphic Writing System in Unicode plain text.

Bob Richmond


Wednesday, 15 June 2016

Hieroglyphs on the web: Thesaurus Linguae Aegyptiae

The Thesaurus Linguae Aegyptiae (TLA) is an Ancient Egyptian virtual dictionary and thesaurus intended to provide a specialist tool for lexicographic research into the Egyptian language. The web application and content is developed at the Berlin-Brandenburg Academy of Sciences and Humanities with contributions from various sources. Much of the content is in German but help text and some content is also available in English.

The TLA corpus includes Ancient Egyptian writings from the entire historical period from Old Kingdom through to the Greco-Roman period. The majority of the content is concerned with the Egyptian Language as written in hieroglyphic and hieratic scripts but there is also a Demotic database included in the current version.

Features of the website include a browsable version of Wörterbuch der Ägyptischen Sprache [Erman, Adolf (editor), and Hermann Grapow (editor)], a digitised slip archive from the Wörterbuch and the Vormanuskript (preliminary manuscript) of the Wörterbuch.

The TLA dictionary itself apparently started with a list of entries from the Wörterbuch by Horst Beinlich (the Beinlich Wordlist). The word list / lemma list has been expanded and developed since then.

The TLA web site went online in 2004 and has undergone various additions and improvements since (the current version is dated October 2014). It remains a work in progress. To access the material you may log in as a guest or register as a user.

The TLA is an invaluable resource for Egyptologists, including a huge volume of material. The database and website designs are very much oriented towards specialists. To make the most of the search and research features it is necessary to study the help text and invest some time and effort in finding your way around the site and studying its search functionality.

Hieratic is transcribed as hieroglyphic. TLA uses a subset of the Hieroglyphica sign list for its hieroglyphs, with some additions. A large proportion of hieroglyphs used are already encoded in Unicode. It would be useful to identify those that are not yet encoded so these can be featured in the next update to the Unicode hieroglyph repertoire.

Hieroglyphic writings in TLA are currently implemented as graphics (as with Ramses Online; as outlined in my recent blog entry). As with Ramses, Unicode plain text would open up interesting possibilities for TLA. This should be fairly straightforward to implement in TLA once related work in the Unicode standard is completed.

Bob Richmond

Wednesday, 8 June 2016

Hieroglyphic Fonts and the Universal Shaping Engine

An Egyptian hieroglyphic writing system such as that proposed for Unicode plain text needs to cope with writings such as
where two or more hieroglyphs can be arranged in groups such as one hieroglyph above another as in this example. In digital media, hieroglyphic is therefore an example of a writing system that needs Complex Text Layout (CTL) [Wikipedia] to display text. Much of the world's text doesn't need CTL but Arabic is a popular writing system that does and there are many others including Indic scripts such as Bengali and Devanagari.

From an end user point of view the technical underpinnings of a writing system should be as invisible as possible; ideally everything should just work naturally. That is a goal of the Hieroglyphs Everywhere Project (HEP). However, for that to happen it is necessary for various pieces of the puzzle to fit together. In this note I'll try to explain where the Universal Shaping Engine (USE) fits into the picture.

Traditionally a writing system that requires CTL needs specialist fonts and customized software. The software elements may be part of the Operating System (Android, iOS, Linux, Windows etc.) or part of an application (Microsoft Office, LibreOffice, Web Browser etc.). To include a new writing system or evolve a supported system in this traditional way is a complicated process which can take years to feed through to the user base.

USE reduces the complexity of this situation by providing software features to enable an OpenType font to implement CTL for its chosen writing systems. In practical terms this means once Unicode is released with plain text hieroglyphic functionality, compatible fonts can be made available and they will be usable with applications such as web browsers and systems that support USE. For instance, there is no need in principle to expect a long wait for web browsers to be adapted to support fonts for hieroglyphic writing so long as browser and font support USE.

The first implementations of USE were released in 2015. Windows 10 integrated the engine on initial release. HarfBuzz (an open source component used in Linux, Firefox, Chrome/Chromium and LibreOffice among others) introduced USE support at version 1.0. The USE specification is available from Creating and supporting OpenType fonts for the Universal Shaping Engine [Microsoft Typography].

I've no information on the current status of USE for Android or Apple iOS/Mac systems. All I can say is it would be surprising if USE support were not widely available by around this time next year (the earliest we can expect plain text hieroglyphic to be released in the Unicode standard). I'd also guess that it is unlikely legacy operating systems such as pre-OS X El Capitan or Windows 7/8 will see USE integration (although it is quite possible that applications such as some web browsers may support USE on these legacy systems).

For the Hieroglyphs Everywhere Project I'm working on the assumption that USE is the future for hieroglyphic fonts. However, alternate methods are used in the short to medium term while Unicode plain text is not available. These alternate methods then remain available for situations where for some reason users are unable to work with up to date operating systems and software.

The recent article making fonts for the Universal Shaping Engine [pdf] by John Hudson makes a good read for font developers or those curious to learn more. The post Windows shapes the world’s languages [Windows Experience Blog] gives some interesting background.

Bob Richmond

Tuesday, 7 June 2016

Transcription of Hieratic into Unicode Hieroglyphic: Part 1

It has been standard practice for Egyptologists to transcribe hieratic sources into hieroglyphic for many years. This is an important application of Unicode Hieroglyphic so it is worthwhile to consider the background.

Alan Gardiner, writing in the 1920s, reasons:

1. The first and foremost reason for transcription is undoubtedly interpretation. Hieratic hands vary greatly, and beginners always, and advanced students often, require to know what familiar character a particular hieratic sign or scrawl represents. Interpretation reduces diversity to unity, permits the comparison of one variant with another, facilitates translation, and performs a multitude of other valuable services. Interpretation is indisputably the primary function for which transcription is employed.

2. There is, however, another reason and purpose for transcription which is not so clearly and fully recognized by scholars, though it is of equal importance with the last. I refer to the reproductive function of transcription. Practical objections of various kinds - expense, printing difficulties, inaccessibility of the originals, etc. - besides the necessity of interpretation referred to above under 1, make the reproduction of hieratic in exact facsimile sometimes unnecessary, and on occasion definitely undesirable. How inconvenient a grammar of Late Egyptian would be, in which all the examples from papyri and ostraca were given in facsimile! … Here I will touch upon another question of expediency. Late Egyptian hieratic is now so well known that in the case of easily legible, relatively "uncial" hands, it is really superfluous to publish every new document in facsimile. Our Egyptological libraries are already far too expensive. For many literary papyri all that is necessary is a good hieroglyphic transcription, leaving it to doubters to verify their doubts by consulting the originals or by inquiring from other scholars to whom the originals are accessible.

To sum up, our transcriptions of hieratic texts of the New Kingdom should at once provide an interpretation of the original hieratic, and also enable the reader to form in his mind a sufficiently good picture of the reading presented by the manuscript. For my own part, I shall not hesitate to use dots and dashes and diacritical marks whenever these seem appropriate or will aid the reader's visualization of the original. Our transcriptions ought most emphatically not to be translations into contemporary hieroglyphic; they are artificial substitutes for the actual manuscripts, substitutes the fabrication of which must be directed by the twin principles of interpretation and reproduction.

From The Transcription of New Kingdom Hieratic; Journal of Egyptian Archaeology, Vol. 15, No. 1/2 (May, 1929), pp. 48-55 (see  http://www.jstor.org/stable/3854012).

Modern technology has eliminated the problems of expense and inconvenience in making an exact facsimile available of an extant source. A legible photograph costs next to nothing to create and distribute on the web. Some practical concerns Gardiner needed to cope with in his era no longer apply. However, the value of the interpretation and reproductive functions highlighted by Gardiner remains fundamental. Indeed now we have machine automation for search and analysis once hieratic is interpreted and reproduced by transcription into hieroglyphs in one or other digital text encoding. This new facet of reproduction opens up ways of working with hieratic sources undreamed of a century ago. 

To date, the actual work of interpretation and reproduction remains a human scholarly activity, often aided by software applications and (potentially) modern technologies such as Artificial Intelligence. Scholarly questions of transliteration and encoding remain open. How best to represent or work with differences among the orthography of Old, Middle and Late Egyptian manuscripts? How to annotate and present hieratic transcriptions as text and/or present sources now we have far richer textual tools available?

One point I'd like to emphasise is the artificial nature of hieratic transcription to hieroglyphic. It is useful to consider this application of a Unicode hieroglyphic writing system in its own right under the overall Unicode umbrella. In my experience it can be confusing to muddle thinking about original hieroglyphic sources and hieratic transcription.

Transcription of hieratic to Unicode hieroglyphic plain text need not address all these issues; in practice it boils down to two primary considerations.

  1. Ensure the Unicode plain text hieroglyphic writing system captures the layout of hieroglyphs for modern transcription requirements (e.g. as factored into L2/16-018R [pdf]).
  2. Identify extensions to the Unicode hieroglyph repertoire helpful for transcription applications.

Part 2 of this series of posts will focus on the repertoire question.

Bob Richmond

Monday, 6 June 2016

Hieroglyphs on the web: Ramses Online

Project Ramses is an initiative concerned with the annotated encoding of Late Egyptian texts (approximately scoped to 1350-750 BC). Work started in 2006, based at the University of Liège in Belgium. An interesting snapshot of progress to 2012 is given in The Ramses Project: Exploring Ancient Egyptian linguistic data using a richly annotated corpus [pdf].

Ramses Online provides a web application to enable specialists to work with the Project Ramses database. Hieratic is transliterated into hieroglyphic for human readability and machine processing. An initial ‘beta’ version became available online in August 2015 and up-to-date background information is outlined in a document by Stéphane Polis: http://be.dariah.eu/wp-content/uploads/2015/10/Ramses_Stephane_Polis.pdf. The current website is implemented in French but international support (in English) is stated to be planned.

Ramses Online is an invaluable resource for Egyptologists and already contains much useful material from hundreds of ancient hieratic and hieroglyphic writing sources.

A technical point. At present Ramses Online renders hieroglyphic as graphics, not text. Improved functionality of projects such as this should be straightforward to re-implement using Unicode plain text when available. Indeed I envisage the Ramses (𓇳𓄟𓋴𓋴) corpus as a useful test case for plain text and more elaborate schemes based on plain text.
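As a small illustration of what text (rather than graphics) makes possible, the four hieroglyphs quoted above are ordinary Unicode characters, so standard library tools can name and search them. The following is only a sketch in Python; the `sample` string is an invented placeholder, not a real corpus entry.

```python
import unicodedata

# The spelling of "Ramses" quoted in the post, as Unicode text.
ramses = "𓇳𓄟𓋴𓋴"

# Each hieroglyph is a single code point with a standard character name.
for sign in ramses:
    print(f"U+{ord(sign):04X}  {unicodedata.name(sign)}")

# Because this is text rather than an image, ordinary substring search works.
# The surrounding string is a made-up placeholder, not a real corpus entry.
sample = "example line: 𓇳𓄟𓋴𓋴 (placeholder context)"
found_at = sample.find(ramses)
print(found_at)
```

Nothing here depends on sign layout, which is the part that still awaits a plain text standard; it only shows that the signs themselves are already searchable once they are stored as characters.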

Bob Richmond

Monday, 23 May 2016

Unicode plain text proposal status (May 2016)

The latest Unicode Technical Committee (UTC) discussion about the Egyptian Hieroglyphic Writing system as Unicode plain text is available from the Unicode web site in Recommendations to UTC #147 May 2016 on Script Proposals [pdf]. This document also contains an update on the status of work being done to extend the repertoire of Egyptian Hieroglyphs in Unicode.

I hope to produce a first draft of a list of clusters of hieroglyphs required for plain text using all three L2/16-018R [pdf] control characters sometime in the next few weeks (this draft data initially to be published on www.egpz.org).

Bob Richmond

Hieroglyphs on the web: Egyptian hieroglyph character picker

A recent web application for working with Unicode hieroglyphs is an Egyptian hieroglyph character picker (EHCP) by Richard Ishida; the latest in his collection of Unicode character pickers on http://r12a.github.io/.

Technical note. EHCP uses Unicode hieroglyphs for most purposes but hieroglyph group rendering follows the WikiHiero method of image arrangement (see my earlier post Hieroglyphs on the web: WikiHiero). One difference is that EHCP replaces the WikiHiero PHP code (which runs on the remote server) with JavaScript code that runs in the local web browser instead. Two benefits of JavaScript are 1. better performance in many circumstances and 2. no need to restrict the software to PHP server pages.

The EHCP application is at http://r12a.github.io/pickers/egyptian/ with documentation at

EHCP is useful for experimenting with Unicode. It is not intended to be a fully-fledged hieroglyphic editor. Once Unicode plain text hieroglyphic is available it should be very straightforward to modify EHCP to work with plain text hieroglyphic fonts and eliminate the need for WikiHiero hieroglyphs as images.


Bob Richmond

Tuesday, 17 May 2016

Foundations of a Universal Egyptian Hieroglyphic Writing System in Unicode plain text

A basic collection of 1071 Egyptian hieroglyph characters was added to Unicode in 2009 (Unicode 5.2), the conclusion of a process that began in 2005 and during which it was decided not to release a hieroglyphic plain text writing system at this first stage.

To put it simply it is impossible at present to use Unicode hieroglyphs as a writing system.

 cannot be written using Unicode alone.

Work is now underway to take Unicode hieroglyphs to the next level and enable a writing system.

One fundamental point to understand is that a Universal Egyptian Hieroglyphic Writing System in Unicode will be accessible by billions of people. This is a dramatic change from the current situation, where specialist tools available to Egyptologists, students and others are used by at most thousands of individuals who all have a greater or lesser degree of knowledge about how hieroglyphic works as a writing system.

The ancient Egyptians did not write or arrange hieroglyphs randomly; the writing system uses a variety of informal and unwritten rules. Certainly there was much flexibility and styles of writing were not static but the overall shape of the writing system was consistent for over 3000 years of everyday use. A Universal writing system must attempt to somehow take these characteristics into account.

Consider the following arrangement of hieroglyphs produced in a traditional Manuel de Codage (MdC) hieroglyph editing application used by Egyptologists:
This sequence of three rectangular arrangements of hieroglyphs contains Egyptian ‘alphabet’ characters spelling out p-a-r-t-y hidden among some random hieroglyphs added for fun. An Egyptologist would recognize this as unauthentic hieroglyphic. Imagine the billions of other random arrangements of hieroglyphs that could be created by accident or for humorous, mischievous or malicious intent. It is unnecessary to know much about the hieroglyphic writing system to appreciate that if Unicode allowed for arbitrary un-Egyptian arrangements of hieroglyphs like p-a-r-t-y the situation would quickly become ludicrous. 

To mitigate this situation, it was clear while designing a plain text hieroglyphic writing system for Unicode that it is essential to limit the ways hieroglyphs can be arranged. The simplest solution is to construct a list of known valid arrangements of groups of hieroglyphs which make sense in the writing system and are attested in ancient sources. Then publish this list alongside the Unicode standard so developers and font designers know exactly what is required from implementations. That way plain text writings can only use well-defined features of the writing system. This approach is included as part of Proposal to encode three control characters for Egyptian Hieroglyphs (latest version L2/16-018R [pdf], January 2016) which is currently being reviewed for possible inclusion in Unicode 10 (2017).
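To make the idea of a published list concrete, here is a minimal Python sketch of how software might check a writing against such a list. Everything in it is an assumption for illustration: the entries in `VALID_CLUSTERS` are hypothetical, they are written in MdC-style notation only for readability, and the rule that a lone sign is always acceptable is my simplification rather than anything stated in the proposal.

```python
# Hypothetical sample of permitted clusters, written in MdC-style notation
# purely for readability. The real list and its encoding are not shown here.
VALID_CLUSTERS = frozenset({
    "X1:R4",
    "Q2:D4",
    "O29:V30",
    "N21*Z1",
})


def is_valid_cluster(cluster: str) -> bool:
    """Assume a lone sign is always fine; a multi-sign cluster must be listed."""
    is_single_sign = ":" not in cluster and "*" not in cluster
    return is_single_sign or cluster in VALID_CLUSTERS


def invalid_clusters(text: str) -> list[str]:
    """Return the clusters in a '-'-separated sequence that are not listed."""
    return [c for c in text.split("-") if c and not is_valid_cluster(c)]


print(invalid_clusters("M23-X1:R4-A1*B1"))
```

A set lookup keeps the check trivial for implementers, and growing the list over time means adding entries rather than changing any logic.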

It is inevitable that the initial release of such a list may be missing some perfectly valid arrangements so it can be expected to grow over time as experts make increasing use of the hieroglyphic plain text writing system and additional valid but less commonplace arrangements for plain text are identified.

Experts using hieroglyphic will encounter the obvious problem that ancient scribes didn't follow technical or style guidelines, so there will be hieroglyphic writings that it might seem desirable to encode digitally as text but that don't entirely fit into a plain text system, either in principle or as it is defined at a given time. The simple answer to those who encounter this limitation is to either 1. represent the original writing in a more standard form, 2. use an existing digital encoding scheme that is not based on Unicode plain text, or 3. use some new system built on Unicode plain text principles but with higher level features that allow for more elaborate writing or rendering of the writing.

Feedback about the three control character proposal since it was published over a year ago has shown this basic point can be difficult to grasp for some who are familiar with the flexibility of traditional MdC systems. I hope this post helps explain the simple reason why limitations are unavoidable whatever plain text system is adopted.

As a point of interest, I’ll note that experiments prior to L2/16-018R suggested a minimum of 3000 entries in the initial ‘valid’ list would address a very large proportion of requirements, although the actual number listed to begin with is likely to be somewhat higher.

Bob Richmond

Monday, 16 May 2016

Hieroglyphs on the web: WikiHiero


Wikipedia uses a straightforward technique to render simple hieroglyphic on a web page: graphics of individual hieroglyphs are arranged to simulate the look of the hieroglyphic writing system. The Wikipedia page https://en.wikipedia.org/wiki/Transliteration_of_Ancient_Egyptian contains examples such as:

This feature of Wikipedia uses software called WikiHiero, first developed by Guillaume Blanchard in 2004. This software is open source, licensed under GPL 2, and continues to be maintained by various contributors.

Technical summary. WikiHiero creates hieroglyph arrangements on a web page as bitmap graphics from a source encoding that follows much of the part of Manuel de Codage (MdC) that deals with hieroglyph encoding. The source of the web page contains elements such as <hiero>M23-X1:R4-X8-Q2:D4-W17-R14-G4-R8-O29:V30-U23-N26-D58-O49:Z1-F13:N31-V30:N16:N21*Z1-D45:N25</hiero> (for the illustration above). These elements are converted into the arrangement of graphics by the WikiHiero software running on the web server before the page is downloaded to a web browser. This process requires that the web page be implemented using PHP at the server, so it is limited to web sites that use PHP, such as Wikipedia. See the WikiHiero home page at https://www.mediawiki.org/wiki/Extension:WikiHiero for details.
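For readers unfamiliar with MdC, the structure of such a string can be sketched in a few lines of Python. This is not WikiHiero's code, just an illustration of the three operators used in the example: '-' separates groups, ':' stacks rows vertically and '*' places signs side by side within a row. The function name `parse_mdc` is my own, and other MdC features (brackets, cartouches, shading and so on) are ignored.

```python
def parse_mdc(mdc: str) -> list[list[list[str]]]:
    """Split an MdC string into groups -> rows -> signs.

    '-' separates groups, ':' stacks rows top to bottom, and '*' places
    signs side by side within a row. Brackets, cartouches and other MdC
    features are deliberately not handled in this sketch.
    """
    return [
        [row.split("*") for row in group.split(":")]
        for group in mdc.split("-")
        if group
    ]


# A short prefix of the example string, plus one group that uses '*'.
print(parse_mdc("M23-X1:R4-X8-Q2:D4"))
print(parse_mdc("V30:N16:N21*Z1"))
```

A renderer in the WikiHiero style would then walk this nested list and place one image per sign, scaling each row to fit the group.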

In my opinion, the greatest strength of the WikiHiero design is the fact that it generates web pages that work over a wide range of web-browsers including many obsolete browser versions. This is a major benefit for a web site like Wikipedia and other web sites built on sufficiently similar technology.

The main downside of WikiHiero for simple hieroglyphic is the fact that hieroglyphs are no more than graphics on the web page. WikiHiero pre-dates Unicode hieroglyphs and has not yet been adapted for use with the Unicode Standard. This means WikiHiero hieroglyphs are not detected by search engines such as Google or Bing and Egyptian hieroglyphic cannot be used in the same way as other writing systems in Wikipedia.

Tip. If you encounter WikiHiero hieroglyphic on Wikipedia and you want to copy the text into a hieroglyph editor such as JSesh choose the [edit] option and you will see the <hiero>…</hiero> encoding. You can then copy the MdC content enclosed between the tags. Likewise, if you want to edit a Wikipedia page you can add your own hieroglyphs by wrapping your MdC in a <hiero>…</hiero> element.
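If you have many pages to harvest, the copy step can be automated. Here is a minimal Python sketch that pulls the MdC out of page source with a regular expression; the `sample` text is invented for illustration, and the pattern assumes lower-case tags without attributes.

```python
import re

# Non-greedy match so that each <hiero> element is captured separately.
HIERO_TAG = re.compile(r"<hiero>(.*?)</hiero>", re.DOTALL)


def extract_mdc(wikitext: str) -> list[str]:
    """Return the MdC content of every <hiero> element, in page order."""
    return [body.strip() for body in HIERO_TAG.findall(wikitext)]


# Invented page source, not taken from a real Wikipedia article.
sample = "A title <hiero>M23-X1:R4</hiero> and a name <hiero>\nF13:N31\n</hiero>."
print(extract_mdc(sample))
```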

Another potential benefit of WikiHiero is worth pointing out: the implementation of <hiero>…</hiero> encoding ought to be fairly simple to update to modern technology when the time is ripe. This means you shouldn’t feel put off contributing to Wikipedia now if your Egyptian MdC content works with WikiHiero. A future implementation of Wikipedia could elect to turn <hiero>…</hiero> into Unicode hieroglyphic plain text rather than graphics, in which case your content would ‘magically’ become accessible as text to search engines and so forth. Text quality would improve considerably through the use of a font and advanced typography rather than the WikiHiero simplified layout of bitmap images.
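Part of that conversion can already be sketched today, because Unicode character names for hieroglyphs follow the Gardiner codes with the number padded to three digits (M23 is named EGYPTIAN HIEROGLYPH M023). The Python below is only an illustration under stated assumptions: the function names are mine, the layout operators are simply discarded since plain text has no way to express them yet, and MdC phonetic shortcuts or signs outside the Unicode repertoire are not handled.

```python
import re
import unicodedata


def gardiner_to_unicode(code: str) -> str:
    """Map a Gardiner code such as 'M23' or 'N35A' to its Unicode character.

    Raises ValueError for a malformed code and KeyError if Unicode has no
    character with the corresponding name.
    """
    match = re.fullmatch(r"([A-Za-z]+)(\d+)([A-Za-z]?)", code)
    if match is None:
        raise ValueError(f"not a Gardiner code: {code!r}")
    category, number, variant = match.groups()
    name = f"EGYPTIAN HIEROGLYPH {category.upper()}{int(number):03d}{variant.upper()}"
    return unicodedata.lookup(name)


def mdc_to_signs(mdc: str) -> str:
    """Flatten MdC to a bare sign sequence, discarding '-', ':' and '*'."""
    codes = [c for c in re.split(r"[-:*]", mdc) if c]
    return "".join(gardiner_to_unicode(c) for c in codes)


print(mdc_to_signs("M23-X1:R4"))
```

The output is a flat run of signs with no grouping, which is exactly the limitation a plain text writing system is meant to remove.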

Aside from its use in Wikipedia, WikiHiero has also been used to implement a simple MdC editor. See http://aoineko.free.fr/index.php?lang=en. This editor is also interesting in that it shows the detailed HTML encoding generated by WikiHiero from MdC.


Bob Richmond