I’m reposting this reference post following the demise of my previous blog. I also have an updated page here with a focus on Sanskrit.
If you want to impress your friends (or your blog readers…*ahem*) when you talk about Buddhism, why not use some HTML diacritics?
You see, most of the Buddhist terms you read about derive from one or more non-European languages:
- Sanskrit: the holy language used in Hinduism and in religious literature. Now a dead language.
- Pali: an ancient language in India, mostly used for trade. It was popular as a lingua franca. Also a dead language.
- Classical Chinese: this is how Chinese was in the olden days. There are more Buddhist texts preserved in Classical Chinese than any other language.
- Japanese: actually, most Japanese Buddhist terms are really just Classical Chinese with Japanese pronunciations, as was the style back then.
None of these languages is natively written in the Roman alphabet the way Western European languages are, so it’s up to translators to figure out how to Romanize things. To capture all the sounds that don’t exist in English, linguistics experts recycle Roman letters but add extra marks: diacritics.
Until fairly recently, it was pretty difficult to print non-standard Roman characters on a webpage. Back then, users had to download special fonts, and your browser had to be able to read them.
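Now though, as the Internet becomes more international, you can print pretty much any Romanized character you want using special numeric codes in HTML. These are often called “extended-ASCII” codes, though strictly speaking the numbers are Unicode code points.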
For example, let’s say I want to print an ā character. In the old days, I could use a Character Palette program on Windows or Mac to copy/paste it (if I could find it), but now I can just use the HTML numeric code & # 257 ; (typed without the spaces). It is all one word: an ampersand, a pound sign (#), the code number, and a semicolon. Put these together and the web browser will automatically translate them into the letter you want.
All of these numeric codes in HTML have the same format: & # number ; (typed without the spaces).
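As a quick sanity check on that format, here is a small Python sketch. Python is my own choice for illustration and the function name `numeric_reference` is made up; nothing here is required to use the codes. It builds the code for ā from its number and then asks the standard library's `html` module to decode it back, roughly the way a browser would.

```python
import html

def numeric_reference(char):
    """Return the HTML numeric code for one character, e.g. 'ā' -> '&#257;'."""
    # ord() gives the character's code number (its Unicode code point).
    return "&#" + str(ord(char)) + ";"

code = numeric_reference("ā")
print(code)                 # the text you would type into your HTML: &#257;
print(html.unescape(code))  # what the browser would display: ā
```

If the number printed matches the one in your chart, you know you have the right code for that letter.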
So the trick is just remembering which number you want and filling in the blank. Remember that you have to do this for each special letter you want to print. Here’s a helpful chart of some commonly used diacritics and letters for Buddhist terms. Most are for Pali/Sanskrit, but the long vowels (ā, ī, ō, ū) are used for Japanese too:
- á – 225, the a with an acute mark
- é – 233, the e with an acute mark
- ñ – 241, the n with a tilde over it
- ú – 250, the u with an acute mark
- ā – 257, the long “ah” sound
- ī – 299, the long “ee” sound
- ō – 333, the long “oh” sound
- ś – 347 (346 for upper case), the s with an acute mark
- ū – 363, the long “oo” sound
- ḍ – 7693, a “d” sound in Sanskrit made with the tongue curled back (retroflex)
- ḥ – 7717, a breathy “h” at the end
- ḷ – 7735, the retroflex “l” sound, made with the tongue curled back
- ṁ – 7745, a soft “m” sound
- ṃ – 7747, the “ng” sound
- ṅ – 7749, another “ng” sound
- ṇ – 7751, the retroflex “n” sound, made with the tongue curled back
- ṛ – 7771, the vowel-like “r” sound in Sanskrit (vocalic r).
- ṝ – 7773, a longer version of the same “r” sound.
- ṣ – 7779 (7778 for upper case), the emphatic “s” sound
- ṭ – 7789, the retroflex “t” sound, made with the tongue curled back
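If looking up each number by hand gets tedious, the chart above can be applied automatically. The following is a minimal Python sketch, not part of the original post, and the function name `to_html_codes` is hypothetical. It assumes you can already type or paste the word with its diacritics, and it swaps every non-ASCII letter for its numeric code.

```python
import unicodedata

def to_html_codes(text):
    """Replace every non-ASCII character in text with its &#number; code."""
    # Normalize first, so a letter typed as "a" plus a separate combining
    # macron becomes the single character (ā) that the chart's numbers refer to.
    text = unicodedata.normalize("NFC", text)
    pieces = []
    for char in text:
        if ord(char) < 128:
            pieces.append(char)                  # plain ASCII passes through unchanged
        else:
            pieces.append("&#%d;" % ord(char))   # e.g. ā becomes &#257;
    return "".join(pieces)

print(to_html_codes("Śākyamuni"))  # &#346;&#257;kyamuni
print(to_html_codes("nirvāṇa"))    # nirv&#257;&#7751;a
```

Python can also do this in one step with `text.encode("ascii", "xmlcharrefreplace")`, which returns the same codes as bytes; the loop is spelled out here only to show what is going on.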
Try it out on your webpages and see if it works well for you. After a few times, it gets much easier to accurately represent Buddhist terms in English, and you can pass yourself off as a Buddhist scholar or something. 😉
For further reading, check out this excellent reference:
Namo Amida Butsu