There are two important reasons to make the numeric distinction.
1. Fonts. Most fonts used in MdC implementations to date do not display '1' differently from Z1. This will change with a new generation of improved fonts. See Egyptian Grammar (Gardiner, 1957) p191-p206 for examples showing how the original Gardiner font makes a distinction between Z1 and some numeric forms.
2. Search and analysis. Applications of this kind are still rare for MdC, but the next generation of software will add these features. For instance, searching transcriptions for dates is very useful, and such software is likely to be strict about the use of number forms to avoid confusion.
Many examples of Z1 and numbers being treated as if they are the same cropped up in the analysis referenced in my earlier post, Analysis of Unicode Egyptian hieroglyphs in a collection of MdC-coded transcriptions. Part of the explanation is that some of the transcriptions originated with early MdC editing software that did not support the numeric feature.
Examples of number codes used incorrectly are A1*B1:Z2 (people, plurality) incorrectly transcribed as A1*B1:3; G1&Z1 (ideogram) incorrectly transcribed as G1&1; Z1*Z1*Z1:Z1*Z1 (3:2) incorrectly transcribed as Z1*Z1*1:Z1*Z1.
It is not an MdC error to use the Z1 form. The verbose Z1*Z1*Z1:Z1*Z1 transcription, for instance, is a legitimate alternative to 3:2 in most current MdC implementations, although some of them may render it differently and less satisfactorily. Sometimes the verbose form is unavoidable, for example with the current release of WikiHiero, which does not recognise numeric codes. However, future releases of MdC software may well give a warning when verbose sequences are encountered. I'm taking this approach with data and software in the Hieroglyphs Everywhere Project (HEP) to avoid confusion when linking traditional MdC with Unicode-based solutions. In short, as a general rule, use numeric representations for units in MdC where feasible.
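To make the idea of such a warning concrete, here is a minimal Python sketch of a check for the two stroke problems described above. It is not taken from HEP or any existing MdC tool: the function names are hypothetical, and it assumes a simplified MdC subset in which quadrats are separated by '-', rows by ':' and signs within a row by '*' or '&'. Cases such as A1*B1:3 still need human judgement and are deliberately not flagged.

```python
import re

STROKE = "Z1"
UNIT_CODES = {str(n) for n in range(1, 10)}  # MdC unit codes 1..9


def split_quadrat(quadrat):
    """Split one quadrat into rows (':'), each a list of signs ('*' or '&')."""
    return [re.split(r"[*&]", row) for row in quadrat.split(":")]


def check_quadrat(quadrat):
    """Return (kind, detail) for a suspicious quadrat, or None if it looks fine.

    kind is "mixed" when Z1 strokes and unit codes share a quadrat, and
    "verbose" when the quadrat consists only of repeated Z1 strokes.
    """
    rows = split_quadrat(quadrat)
    signs = [sign for row in rows for sign in row]
    has_stroke = STROKE in signs
    has_unit = any(sign in UNIT_CODES for sign in signs)
    if has_stroke and has_unit:
        return ("mixed", "Z1 strokes and numeric codes in the same quadrat")
    all_strokes = len(signs) > 1 and all(sign == STROKE for sign in signs)
    if all_strokes and all(len(row) <= 9 for row in rows):
        # Propose the numeric shorthand row by row, e.g. 3:2.
        return ("verbose", ":".join(str(len(row)) for row in rows))
    return None


def lint(text):
    """Check every quadrat in an MdC string and collect the warnings."""
    warnings = []
    for quadrat in text.split("-"):
        if not quadrat:
            continue  # skip empty pieces from leading or trailing '-'
        result = check_quadrat(quadrat)
        if result is not None:
            warnings.append((quadrat, result[0], result[1]))
    return warnings
```

Calling lint("-Z1*Z1*Z1:Z1*Z1-") reports the quadrat as verbose and proposes 3:2, while a lone Z1 after an ideogram, as in G1&Z1, passes without comment.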
MdC provides other numeric codes, 10, 20, 30, 40, 50 and 100, 200, 300, 400, 500, used in a similar way to the units. I would like to recommend these, but a bug in JSesh 5.5 results in less satisfactory rendering of quadrats that use these numeric forms: -20:10- renders less well than the verbose -V20*V20:V20-. Fortunately, unlike the ambiguous situation with strokes, the linkage with Unicode is less problematic, so for the time being it's mostly harmless to use verbose notation.
The narrow Z2 issue
One problem encountered with MdC transcriptions is that Hieroglyphica (1993 and 2000 versions) did not define a code for the narrow variant of Z2. The narrow variant is a feature of the Gardiner font and commonplace in Egyptian Grammar, but it is not explicitly given a code in the sign list. This narrow variant is known as Z002A in Unicode and Z2D in Hieroglyphica extensions such as the Aegyptus font. Z2D was used in InScribe (2004) and documented in EGPZ version 1 (2006).
Unfortunately, the Z2D omission from Hieroglyphica means transcriptions that need the narrow variant often resort to using the number 3 as a substitute when using editing software such as JSesh 5.5. This unsatisfactory situation needs to be resolved.
Personally, when working with JSesh material I write 3\NaN (meaning not a number) - a legitimate JSesh construction. So rather than write -D21:X1*3- (as in Egyptian Grammar Exercise XVIII) I use -D21:X1*3\NaN-.
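As a rough illustration of how such cases might be located in existing transcriptions, the Python sketch below lists quadrats where the code 3 shares a quadrat with a Gardiner-coded sign and proposes a marked form for each. The function name, the regular expression for Gardiner codes and the simplified separators ('-', ':', '*', '&') are all assumptions made for this example, not part of JSesh or any MdC specification. It only proposes changes and does not apply them, because a 3 that was meant as plural Z2 (as in A1*B1:3) is flagged too and needs a different fix.

```python
import re

# Assumed shape of a Gardiner-style code, e.g. D21, X1, Aa1, Z2D.
GARDINER = re.compile(r"^[A-Z][a-z]?\d+[A-Za-z]?$")


def narrow_z2_candidates(text, replacement=r"3\NaN"):
    """List (original, proposed) quadrats where 3 sits beside a Gardiner sign.

    replacement defaults to the 3\\NaN form described above; pass "Z2D"
    when the target software supports that code.
    """
    candidates = []
    for quadrat in text.split("-"):
        parts = re.split(r"([:*&])", quadrat)  # capture group keeps separators
        signs = parts[0::2]  # even positions are signs, odd are separators
        if "3" in signs and any(GARDINER.match(sign) for sign in signs):
            parts[0::2] = [replacement if sign == "3" else sign for sign in signs]
            candidates.append((quadrat, "".join(parts)))
    return candidates
```

For the exercise above, narrow_z2_candidates("-D21:X1*3-") proposes D21:X1*3\NaN, whereas a purely numeric quadrat such as 3:2 is left alone.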
I'm not especially advocating my approach at present, just drawing attention to the problem as something you will need to fix in your transcriptions in the future.
If you are developing MdC-related software such as WikiHiero it is safe and easy to add Z2D support.
Conclusions
In principle, MdC support for numbers is fairly effective, but it has problems in some implementations. A full understanding needs a more detailed analysis than a blog post can adequately provide.
I hope that by raising these points MdC users will give more thought to representation of numbers when transcribing new material or updating existing transcriptions.
To end on a positive note, the situation with Unicode-based solutions is better defined and potentially easier to use.
Bob Richmond