[html4all] pronunciation, homophones, and homographs

Robert J Burns rob at robburns.com
Mon May 26 04:10:38 PDT 2008


Hi Phil,

Thanks for the feedback on the Unicode phonetics idea. Just a few  
things I want to say to help you understand my motivation here. First,  
I'm not trying to supplant the IPA or suggest it is suboptimal. It is  
certainly a widely adopted and effective phonetic alphabet. Second, my  
interest in facilitating other phonetic alphabets is not so much for  
those who speak/hear/read/write English as a second language, but for
those who have little or no familiarity with written English at all.
Imagine, for example, a person from an Arabic-speaking country who
never learned English or any of the languages written using the Latin
script. Perhaps this is becoming rarer (especially among
academics), but I'm also interested in facilitating use of phonetic
alphabets by non-academics (at least a reading and writing literacy
for phonemes).

You do make a good point regarding the need for a phonetic alphabet
for languages that use ideographs. However, in that case my goal is
also undermined, since developing phonetic graphemes that are
mnemonically familiar to native readers of such a language is
already difficult.

I agree that the IPA Unicode characters already facilitate speech
synthesis. My thinking is that this idea is much more about
internationalizing phoneme characters than about aural
rendering and accessibility of phonemes. Different users are familiar
with different phonetic alphabets, and encoding abstract semantic
phonemes allows for effective separation of semantics and grapheme
presentation. So part of my motivation is also to facilitate wider
international comprehension of phonetic alphabets so that they become
more commonplace and in that way facilitate accessibility.
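
As a rough illustration of that separation (a minimal sketch, not part of
any actual Unicode proposal): the text stores abstract phoneme characters,
and a swappable "font" table decides which grapheme each one is drawn
with. Everything here is an assumption made for the example. The phoneme
code points are placeholders borrowed from the Private Use Area, the
function and table names are hypothetical, and the Spanish-oriented table
is invented apart from the H versus J case discussed in the quoted text
below.

```python
# Hypothetical phoneme characters. Private Use Area code points stand in
# for phoneme code points that do not exist in Unicode.
VOICELESS_GLOTTAL_FRICATIVE = "\uE000"
VOICELESS_POSTALVEOLAR_FRICATIVE = "\uE001"

# Each "font" maps the same abstract phonemes to graphemes familiar to a
# particular readership. The Spanish-oriented table is invented for
# illustration, apart from J, which comes from the thread.
IPA_FONT = {
    VOICELESS_GLOTTAL_FRICATIVE: "h",
    VOICELESS_POSTALVEOLAR_FRICATIVE: "ʃ",
}
SPANISH_MNEMONIC_FONT = {
    VOICELESS_GLOTTAL_FRICATIVE: "j",
    VOICELESS_POSTALVEOLAR_FRICATIVE: "sh",
}


def render(phoneme_text: str, font: dict[str, str]) -> str:
    """Replace each phoneme character with the grapheme the font assigns."""
    return "".join(font[ch] for ch in phoneme_text)


# The stored document never changes; only the font used to view it does.
document = VOICELESS_GLOTTAL_FRICATIVE + VOICELESS_POSTALVEOLAR_FRICATIVE
print(render(document, IPA_FONT))
print(render(document, SPANISH_MNEMONIC_FONT))
```

Switching alphabets, or following a revision to one alphabet, would then
mean swapping the table while the stored phoneme text stays untouched.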

Finally, my estimate of 512 code points is a liberal estimate based on  
the current needs of the IPA and other phonetic alphabets. The number  
could even be reduced by using character combinations (such as voiced  
or voiceless modifiers) to modify the previous character. I think I  
came up with one character / code point mapping that required only 64  
characters for all phonemes (I was trying to make it fit somewhere in  
the BMP).
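
To show why modifiers shrink the repertoire, here is a small sketch under
invented assumptions: three base characters (default voiceless) and one
"voiced" modifier that applies to the preceding character. The code
points are Private Use Area placeholders and all names are hypothetical;
this is not the 64-character mapping mentioned above, which is not
reproduced here.

```python
# Hypothetical base phoneme characters (voiceless by default), using
# Private Use Area code points as placeholders.
BASES = {
    "\uE100": "bilabial plosive",
    "\uE101": "alveolar plosive",
    "\uE102": "velar plosive",
}
# Hypothetical modifier that voices the preceding base character.
VOICED = "\uE1F0"


def decode(text: str) -> list[str]:
    """Turn a base-plus-modifier sequence into phoneme descriptions."""
    phonemes: list[str] = []
    for ch in text:
        if ch == VOICED:
            # A modifier needs an unvoiced base immediately before it.
            if not phonemes or phonemes[-1].startswith("voiced "):
                raise ValueError("modifier without an unvoiced base")
            phonemes[-1] = "voiced " + phonemes[-1][len("voiceless "):]
        elif ch in BASES:
            phonemes.append("voiceless " + BASES[ch])
        else:
            raise ValueError(f"unknown character {ch!r}")
    return phonemes


# One code point per phoneme needs bases x 2; with a modifier, bases + 1.
precomposed_count = len(BASES) * 2
combining_count = len(BASES) + 1
print(precomposed_count, combining_count)
print(decode("\uE100" + "\uE100" + VOICED))
```

With more modifiers the gap widens, since precomposed characters grow
multiplicatively while base-plus-modifier sequences grow additively.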

Again, I really appreciate your feedback.

Take care,
Rob



On May 26, 2008, at 10:21 AM, Philip TAYLOR (Ret'd) wrote:

>
>
> Robert J Burns wrote:
>
>> Thanks for the reply. That's certainly a point-of-view I'd like to
>> hear from. I suppose in many ways English has a greater need for a
>> separate phonetic alphabet than other languages,
>
> Not really : consider Chinese (Pinyin, BoPoMoFo) and
> Japanese (Hiragana, Katakana) -- when a language is based
> on ideographs, a phonetic alphabet is /essential/ for descriptive
> / pedagogic purposes ...
>
>> but when someone
>> wants to encode phonemes from languages other than their primary
>> language, it makes sense to me that they would want to use graphemes
>> derived from their own familiar primary language.
>
> Yes, that is the idea I wanted to bounce off my Chinese friends.
> It is not /certain/ that they are in any way unhappy with the
> IPA as-is, so I would be interested to learn their point of
> view.  (They are all teachers of Chinese as a second language).
>
>> One example I can
>> think of is even the use of a Latin Letter H for an unvoiced glottal
>> fricative which matches how that letter is often used in English.
>> However, for someone whose primary language is Spanish a Latin letter
>> J may make a better mnemonic. Likewise for speakers of Arabic, Urdu,
>> Korean, Hebrew, etc.
>
> Agreed.
>
>> Using appropriate mnemonic graphemes for each
>> language makes the use of a phonetic alphabet easier and more likely
>> to be widely adopted.
>
> Less certain of that : the IPA has /very/ widespread takeup.
>
>> Another important issue is that phonetic alphabets change over time —
>> sometimes swapping one grapheme for another in the representation of a
>> particular phoneme. By encoding the phonemes themselves (and not the
>> graphemes representing the phonemes), the changes to a phonetic
>> alphabet can be handled by updating fonts while maintaining the text
>> document completely unchanged.
>
> Agreed.
>
>> Similarly, a user can change from one
>> phonetic alphabet to another simply by changing fonts (like from the
>> IPA to the Uralic Phonetic Alphabet).
>
> Also true.
>
>> Also, input systems can be
>> localized to the user's needs so that a user may use their usual
>> keyboard or a character input palette depicting the graphemes familiar
>> to their primary language even while the input system is actually
>> inputting phoneme characters and not grapheme characters.
>
> A good point.
>
>> Finally, I think this could lead to better international interchange
>> of phoneme text. Every user can view a phoneme text document in the
>> phonetic alphabet they're most familiar with.
>
> OK, but what we need here is evidence that the IPA as-is is not
> the phonetic alphabet with which some users are most familiar.
> I suspect that Chinese / Japanese academics are very familiar
> with the IPA, whilst "normal" Japanese / Chinese citizens are
> far more familiar with Katakana/Hiragana/Pinyin/BoPoMoFo.
>
>> For others, they simply hear the synthesized
>> speech uttering the phonemes.
>
> This last point is important, but can that not already
> be accomplished using Unicode/IPA ?
>
>> As I said before this is a departure for Unicode that I expect would
>> face some resistance. Unicode has, up until now, been focussed
>> exclusively on encoding graphemes as characters: they might have
>> trouble even thinking about a character as a phoneme (and not a
>> grapheme). However, I think this is a natural evolution for Unicode
>> and since it wouldn't need to use any of the precious basic
>> multilingual plane code points, it shouldn't be much of a burden to
>> devote maybe 512 code points to phonemes out of the 800 thousand code
>> points still available for assignment in Unicode.
>
> Definitely unsure you can express the world's phonemic collection
> in 512 slots !  But the departure (for Unicode) is so radical
> that I fear it is doomed from the outset, which is why I'd like (a)
> the input of non-Latin native speakers, and (b) to consider
> (if they agree that the IPA is sub-optimal) whether we
> can find a way to express phonemes without requiring a change
> to Unicode.  Remember that both the IPA and Unicode have a
> /very/ well established user-base; we are even more of an
> edge-case than Flicker in this context :-)
>
> ** Phil.



