The latest Unicode Technical Committee (UTC) discussion about the Egyptian Hieroglyphic Writing system as Unicode plain text is available from the Unicode web site in Recommendations to UTC #147 May 2016 on Script Proposals [pdf]. This document also contains an update on the status of work being done to extend the repertoire of Egyptian Hieroglyphs in Unicode.
I hope to produce a first draft of a list of clusters of hieroglyphs required for plain text using all three L2/16-018R [pdf] control characters sometime in the next few weeks (this draft data initially to be published on www.egpz.org).
Bob Richmond
Monday, 23 May 2016
Hieroglyphs on the web: Egyptian hieroglyph character picker
A recent web application for working with Unicode
hieroglyphs is an Egyptian hieroglyph
character picker (EHCP) by Richard Ishida, the latest in his collection of Unicode
character pickers on http://r12a.github.io/.
Technical note. EHCP uses Unicode
hieroglyphs for most purposes but hieroglyph group rendering follows the
WikiHiero method of image arrangement (see my earlier post Hieroglyphs on the web: WikiHiero). One difference is that EHCP replaces the
WikiHiero PHP code (which runs on the remote server) with
JavaScript code that runs in the local web browser instead. Two
benefits of JavaScript are 1. better performance in many circumstances and 2.
no need to restrict the software to PHP server pages.
The EHCP application is at http://r12a.github.io/pickers/egyptian/
with documentation at
EHCP is useful for experimenting with Unicode. It is not
intended to be a fully-fledged hieroglyphic editor. Once Unicode plain text
hieroglyphic is available it should be very straightforward to modify EHCP to
work with plain text hieroglyphic fonts and eliminate the need for WikiHiero hieroglyphs as images.
Bob Richmond
Tuesday, 17 May 2016
Foundations of a Universal Egyptian Hieroglyphic Writing System in Unicode plain text
A basic collection of 1071 Egyptian hieroglyph characters was added
to Unicode in 2009 (Unicode 5.2), the conclusion of a process that began in
2005 and during which it was decided not to release a hieroglyphic plain text
writing system at this first stage.
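For readers who want to look at the basic collection programmatically, here is a minimal sketch in Python. It assumes the original 1071 characters occupy the contiguous range U+13000 to U+1342E (this range is not stated above; it is taken from the Unicode code chart) and uses only the standard `unicodedata` module. The function name `basic_hieroglyphs` is made up for this illustration.

```python
import unicodedata

# Assumed code point range of the original 1071 hieroglyphs (U+13000..U+1342E).
FIRST, LAST = 0x13000, 0x1342E


def basic_hieroglyphs():
    """Return the characters in the original Egyptian Hieroglyphs range."""
    return [chr(cp) for cp in range(FIRST, LAST + 1)]


signs = basic_hieroglyphs()
print(len(signs))                   # number of code points in the range
print(unicodedata.name(signs[0]))   # formal name of the first sign
```

Each character is only an isolated sign; nothing in this range says how signs are grouped, which is the gap discussed next.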
To put it simply, it is impossible at present to use Unicode hieroglyphs as a writing system:
Egyptian hieroglyphic cannot be written using Unicode alone.
Work is now underway to take Unicode hieroglyphs to the next level and enable a writing system.
One fundamental point to understand is that a Universal Egyptian Hieroglyphic
Writing System in Unicode will be accessible to billions of people. This is a dramatic change from the current situation, where specialist tools available to
Egyptologists, students and others are used by at most thousands of individuals
who all have a greater or lesser degree of knowledge about how hieroglyphic works
as a writing system.
The ancient Egyptians did not write or arrange hieroglyphs randomly;
the writing system uses a variety of informal and unwritten rules. Certainly there was much flexibility and styles of writing were not static but the overall shape of the
writing system was consistent for over 3000 years of everyday use. A Universal
writing system must attempt to somehow take these characteristics into account.
Consider the following arrangement of hieroglyphs
produced in a traditional Manuel de Codage (MdC) hieroglyph editing application used by Egyptologists:
This sequence of three rectangular arrangements of hieroglyphs contains Egyptian
‘alphabet’ characters spelling out p-a-r-t-y hidden among some random
hieroglyphs added for fun. An Egyptologist would recognize this as unauthentic
hieroglyphic. Imagine the billions of other random arrangements of hieroglyphs that could be
created by accident or for humorous, mischievous or malicious intent. It is
unnecessary to know much about the hieroglyphic writing system to appreciate that
if Unicode allowed for arbitrary un-Egyptian arrangements of hieroglyphs like p-a-r-t-y the situation would quickly become ludicrous.
To avoid this situation, it became clear while designing a plain text hieroglyphic writing system for Unicode that it is essential to
limit the ways hieroglyphs can be arranged. The simplest solution is to construct
a list of known valid arrangements of groups of hieroglyphs which make sense in
the writing system and are attested in ancient sources, and then to publish this list
alongside the Unicode standard so developers and font designers know exactly what
is required of implementations. That way, plain text writings can only use well-defined features
of the writing system. This approach is included as part of Proposal to encode three control characters
for Egyptian Hieroglyphs (latest version L2/16-018R [pdf], January 2016) which is
currently being reviewed for possible inclusion in Unicode 10 (2017).
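The idea of a published list of valid arrangements can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the mechanism defined in the proposal. Groups are written in MdC notation (`:` stacks signs, `*` places them side by side, `-` separates groups) purely for readability. The entries in `VALID_GROUPS` are stand-ins borrowed from a Wikipedia example and are not claimed to be in any real list. The rule that a single sign is always acceptable is likewise an assumption made for this sketch.

```python
# Hypothetical stand-in for the published list of valid arrangements.
# A real list would use Unicode characters and control characters, not MdC.
VALID_GROUPS = frozenset({
    "X1:R4", "Q2:D4", "O29:V30", "O49:Z1", "F13:N31", "D45:N25",
})


def is_valid_group(group: str) -> bool:
    """Assumption: a lone sign is always allowed; a cluster must be listed."""
    is_single_sign = ":" not in group and "*" not in group
    return is_single_sign or group in VALID_GROUPS


def invalid_groups(text: str) -> list[str]:
    """Return the groups of a '-'-separated sequence that are not listed."""
    return [g for g in text.split("-") if g and not is_valid_group(g)]


print(invalid_groups("M23-X1:R4-Z1:Z1:Z1"))
```

A font or editor built this way has a finite, testable set of clusters to support, which is the practical benefit of publishing the list.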
The initial release of such a list will inevitably
be missing some perfectly valid arrangements, so it can be expected to grow over
time as experts make increasing use of the hieroglyphic plain text writing system
and additional valid but less commonplace arrangements for plain text are identified.
Experts using hieroglyphic will encounter the obvious problem that ancient scribes
didn't follow technical or style guidelines so there will be hieroglyphic
writings that it might seem desirable to encode digitally as text but don't entirely
fit into a plain text system either in principle or as it is defined at a given
time. The simple answer to those who encounter this limitation is to 1. represent the original writing in a more standard form, 2. use
an existing digital encoding scheme that is not based on Unicode plain text, or 3.
use some new system built on Unicode plain text principles but with higher
level features that allow for more elaborate writing or rendering of the
writing.
Feedback about the three control character proposal since it was published over a year ago has shown that this basic point can be difficult to grasp for some who are familiar with the flexibility of traditional MdC systems. I hope this post helps explain the simple reason why limitations
are unavoidable whatever plain text system is adopted.
As a point of interest, I’ll note that experiments prior to L2/16-018R
suggested a minimum of 3000 entries in the initial ‘valid’ list would address a
very large proportion of requirements, although the actual number listed to
begin with will likely be somewhat higher.
Bob Richmond
Monday, 16 May 2016
Hieroglyphs on the web: WikiHiero
Wikipedia uses a straightforward technique to render simple
hieroglyphic on a web page by arranging graphics of individual hieroglyphs to
simulate the look of the hieroglyphic writing system. The Wikipedia page https://en.wikipedia.org/wiki/Transliteration_of_Ancient_Egyptian
contains examples such as:
This feature of Wikipedia uses software called WikiHiero, first developed by Guillaume
Blanchard in 2004. This software is open source, licensed under GPL 2, and
continues to be maintained by various contributors.
Technical summary.
WikiHiero creates hieroglyph arrangements on a web page as bitmap graphics from
a source encoding that largely follows the part of Manuel de Codage (MdC) that
deals with hieroglyph encoding. The source of the web page contains elements
such as
<hiero>M23-X1:R4-X8-Q2:D4-W17-R14-G4-R8-O29:V30-U23-N26-D58-O49:Z1-F13:N31-V30:N16:N21*Z1-D45:N25</hiero>
(for the illustration above). These elements are converted into the arrangement
of graphics by the WikiHiero software running on the web server before the page is downloaded to a web browser. This process requires that the web page be implemented
using PHP at the server and is therefore limited to web sites that use PHP, such
as Wikipedia. See the WikiHiero home page at https://www.mediawiki.org/wiki/Extension:WikiHiero
for details.
In my opinion, the greatest strength of the WikiHiero design
is the fact that it generates web pages that work over a wide range of
web-browsers including many obsolete browser versions. This is a major benefit
for a web site like Wikipedia and other web sites built on sufficiently similar
technology.
The main downside of WikiHiero for simple hieroglyphic is the
fact that hieroglyphs are no more than graphics on the web page. WikiHiero
pre-dates Unicode hieroglyphs and has not yet been adapted for use with the
Unicode Standard. This means WikiHiero
hieroglyphs are not detected by search engines such as Google or Bing and Egyptian
hieroglyphic cannot be used in the same way as other writing systems in
Wikipedia.
Tip. If you
encounter WikiHiero hieroglyphic on Wikipedia and you want to copy the text
into a hieroglyph editor such as JSesh, choose the [edit] option and you will
see the <hiero>…</hiero> encoding. You can then copy the MdC
content enclosed between the tags. Likewise, if you want to edit a Wikipedia
page you can add your own hieroglyphs by wrapping your MdC in a <hiero>…</hiero>
element.
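If you have the page source as text, the same copy step can be automated. The following Python sketch pulls the MdC content out of every <hiero>…</hiero> element using a regular expression. It assumes the tags carry no attributes and are not nested. The function name `extract_hiero` is made up for this example.

```python
import re

# Non-greedy match so that each <hiero> element is captured separately.
HIERO_RE = re.compile(r"<hiero>(.*?)</hiero>", re.DOTALL | re.IGNORECASE)


def extract_hiero(wikitext: str) -> list[str]:
    """Return the MdC content of each <hiero> element, whitespace trimmed."""
    return [content.strip() for content in HIERO_RE.findall(wikitext)]


sample = "The name <hiero>M23-X1:R4</hiero> and the sign <hiero> G4 </hiero>."
print(extract_hiero(sample))
```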
I’d like to point out another potential benefit of WikiHiero:
the implementation of <hiero>…</hiero> encoding ought to be fairly
simple to update to modern technology when the time is ripe. This means you
shouldn’t feel put off contributing to Wikipedia now if your Egyptian MdC content
works with WikiHiero. A future implementation of Wikipedia could elect to turn <hiero>…</hiero>
into Unicode hieroglyphic plain text rather than graphics, in which case your
content would ‘magically’ become accessible as text to search engines and so
forth. Text quality would improve considerably by the use of a font and
advanced typography rather than WikiHiero’s simplified layout of bitmap images.
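As a rough sketch of what such a conversion might involve, the Python below maps Gardiner codes found in MdC to Unicode characters by building the formal character name (for example `M23` becomes `EGYPTIAN HIEROGLYPH M023`) and looking it up with the standard `unicodedata` module. This is an illustration only and not how Wikipedia or WikiHiero works today. It converts the signs alone and throws away the layout operators, because plain text controls for grouping are not yet available. It also does not handle MdC phonetic codes, which are written in lower case. The function names are invented for this example.

```python
import re
import unicodedata

# Category letter(s), number, optional variant letter, e.g. "M23", "Aa1", "A5A".
CODE_RE = re.compile(r"([A-Z][a-z]?)(\d+)([A-Za-z]?)")


def gardiner_to_char(code: str) -> str:
    """Map a Gardiner code such as 'M23' to its Unicode character."""
    m = CODE_RE.fullmatch(code)
    if m is None:
        raise ValueError(f"not a Gardiner code: {code!r}")
    category, number, variant = m.groups()
    name = (
        f"EGYPTIAN HIEROGLYPH "
        f"{category.upper()}{int(number):03d}{variant.upper()}"
    )
    return unicodedata.lookup(name)  # raises KeyError if not in Unicode


def mdc_signs_to_unicode(mdc: str) -> str:
    """Convert the signs only; the operators '-', ':' and '*' are discarded."""
    codes = [c for c in re.split(r"[-:*]", mdc) if c]
    return "".join(gardiner_to_char(c) for c in codes)


print(mdc_signs_to_unicode("M23-X1:R4"))
```

The output is a flat string of three characters. Restoring the arrangement of X1 over R4 is exactly what the proposed control characters are intended to make possible.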
Aside from its use in Wikipedia, WikiHiero has also been used
to implement a simple MdC editor. See http://aoineko.free.fr/index.php?lang=en.
This editor is also interesting in that it shows the detailed HTML encoding generated by WikiHiero from MdC.
Bob Richmond
Sunday, 8 May 2016
Extending the Hieroglyph repertoire in Unicode
At the time of writing, the latest draft proposal about
additional Egyptian hieroglyphs is L2/16-079 “Preliminary draft for the
encoding of an extended Egyptian Hieroglyphs repertoire” http://www.unicode.org/L2/L2016/16079-hieroglyphs.pdf
by Michel Suignard, dated 2016-04-11. This is the latest of several iterations
by Suignard, the first of which was a preliminary draft L2/15-240 dated 2015-10-09.
An important part of this proposal is a database containing basic
information about each of the encoded hieroglyphs. This database is to be
maintained on the Unicode website. Over 6000 hieroglyphs are proposed
in addition to the 1071 hieroglyphs encoded in Unicode already, and this basic
data should make it reasonably straightforward for software tools, fonts and so
forth to work with the expanded repertoire.
There are a number of open issues for discussion in the
draft proposal. I hope to write on some of these topics in future blog posts.
One point I’d like to make now: many of the additional
hieroglyphs first appear in the Greco-Roman period, so it is likely that fonts and
tools aimed at classical Egyptian from Old Kingdom to New Kingdom will omit or
downplay these additions. In fact, for much use of digital hieroglyphic writing
systems I suspect popular fonts will contain at most hundreds of additions to
the current Unicode standard set rather than thousands. Time will tell.
I’m not personally involved in developing this proposal but
agree with the overall aim to enrich Unicode support for hieroglyphs. I don’t
know what the thinking is on timescales for completing the proposal but don't see any reason it can't be finished this year. So it seems to me that if Egyptologists or others have ideas that might help
improve what is being proposed, the time to communicate them is during Summer 2016.
Bob Richmond