Iskandar Ding: Introduction to Tajik Persian 2 – Differences in Pronunciation

Posted  12 May 2020

Tajik Persian has a few phonetic and phonological idiosyncrasies which speakers and learners of Iranian Persian may not be immediately used to. In this post I will talk about some of the main differences. Bear in mind, however, that Tajik Persian, like Persian varieties elsewhere, is not ‘one language’, but consists of many regional dialects/accents. An individual’s pronunciation may be influenced by their ethnicity, region of origin, educational background (particularly whether they have mainly been educated in Russian, as Tajiks whose predominant language is Russian may have a slight ‘Russian twang’ when speaking Tajik), and exposure to other varieties of Persian. These factors determine how a Tajik Persian speaker deals with the standard phonetics in real-life situations. The following remarks are based on general observations:


  1. The short i and short u

The kasra ِ  (zēr) and ḍamma ُ (pēsh), consistently pronounced in Iranian Persian as a short e and a short o respectively, are written as и and у in the Tajik Cyrillic script and largely pronounced as a short i and a short u. Some speakers may pronounce и (i) with a slight hint of e, and у (u) with a slight hint of o, similar to the Iranian pronunciation, but never really as solid e and o. This makes Tajik Persian words more similar to the transliteration system used by some scholars, especially scholars of classical Persian literature (cf. the transliteration of the name of the poet حافظ as Ḥāfiẓ rather than Ḥāfeẓ, as the second syllable contains a kasra).

N.B. The kasra is not pronounced as a short i in Tajik when it is in word-initial position and followed by an originally ‘throaty’ sound, namely the glottal stop (mostly ع), ح and ه: эълон (eʿlān, اعلان) not *иълон, эҳсос (eḥsās, احساس) not *иҳсос, эҳмол (ehmāl, اهمال) not *иҳмол, making these (mainly Arabic) words sound similar to their pronunciation in Iranian Persian. It is also a short e before ه (h) in a handful of words, such as меҳмон (mehmān, مهمان) not *миҳмон, деҳ (deh, ده) not *диҳ.


  1. The long ē and the long ō

Tajik (as well as Afghan) Persian has retained the classical distinction between the long ē and the long ī, and the distinction between the long ō and the long ū. Both distinctions have been entirely lost in standard Iranian Persian. For example, ‘lion’ is шер (šēr, شیر), and ‘milk’ is шир (šīr, شیر) in Classical Persian as well as Tajik (and Afghan Persian, i.e. ‘Dari’), but both are pronounced as šīr in standard Iranian Persian. ‘Nothing’ is always hēč in Tajik, not hīč as in Iranian Persian. The ی as an indefinite marker is also pronounced as ē in Tajik Persian, including when it is used before the conjunction که (pronounced ki in Tajik): the Iranian وقتی که (vaqti ke) is вақте ки (vaqte ki) in Tajik. This is reminiscent of its etymology, the suffix -ēw (‘one’) in Middle Persian.

The long ō is also preserved in Tajik Persian and written as ӯ, but some speakers may pronounce it with a slight umlaut, that is, similar to the German and the Turkish ö. Compare دوست: Taj. дӯст (dōst) vs. Ir. dūst; کوشش: Taj. кӯшиш (kōšiš) vs. Ir. kūšeš. Some speakers, however, pronounce the long ō with too much lip-fronting, producing a sound almost identical to the long ū in Iranian.

These differences have resulted in interesting ‘letter swaps’ in some conventional transliterations: the poet بیدل, transliterated according to the Iranian pronunciation, would be Bīdel, but Bēdil according to the classical and Tajik/Afghan pronunciation. The famous madrasa complex in Samarkand is conventionally called Registan (‘sandy place’ < rēg ریگ ‘sand’) in English, transliterated according to the Tajik pronunciation, but would be Rīgestān according to the Iranian. Similarly, the Iranian gorūh (گروه) is the Tajik gurōh (гурӯҳ).

Unfortunately, there is no easy way to tell where exactly a long ī in Iranian should be a long ē in Tajik and where a long ū should be a long ō, if you judge purely from the Perso-Arabic script, unless you are familiar with Middle Persian (Pahlavi). In general, a long ū written as و in Iranian is more likely to be a long ō in Tajik than a long ī written as ی is to be a long ē. You will get a ‘feeling’ for it with time. Knowledge of Persian words in Urdu and some Turkic languages, especially Uzbek and Uyghur, would be of great help, as Persian words entered these languages when they were still pronounced classically, i.e. similar to modern Tajik.


  1. The long ā

Perhaps the most confusing aspect of Tajik Cyrillic is the fact that the Cyrillic letter о, identical with the Latin o, is used to write the long ā sound in Persian. This is not unreasonable. In modern Tajik, this sound is pronounced with a more rounded quality than its counterpart in Iranian and Afghan Persian, making it sound almost like the British English short o (in ‘lot’), or even the British English long ō (the sound made by the word ‘or’) in some speakers. Some Tajik linguists have argued that when Tajiks still used the Perso-Arabic script, the ‘o quality’ of the long ā was absent, and it sounded just like the long ā in Iranian and Afghan Persian. They have also lamented that it is the graph o in Tajik Cyrillic, compounded with generations of compulsory Russian-medium education, that has changed the pronunciation of the long ā in Tajik. This theory has some validity, but is contestable, because Tajik is spoken among other Iranian languages in which the Middle Iranian long ā has become an o, and this may have influenced Tajik.

N.B. In Tajik, long ā never changes into long ū before n, as it does in Tehrani Persian. In Tajik, نان is always nān (нон) and never *nūn.
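The Cyrillic letter values mentioned so far can be collected into a small lookup table. What follows is only a rough Python sketch, not a complete transliteration scheme: the name `romanize` and the table itself are illustrative assumptions built from the examples in this post. It assumes that и is always rendered i and е always ē, because the Cyrillic script does not show the difference between short i and long ī, nor the short e of words such as меҳмон.

```python
import unicodedata

# Letter values taken from the examples in this post (assumed, not exhaustive).
# Assumption: и -> i and е -> ē everywhere; vowel length is not recoverable
# from the Cyrillic spelling alone.
CYRILLIC_TO_LATIN = {
    "а": "a", "о": "ā", "и": "i", "у": "u", "е": "ē", "ӯ": "ō", "э": "e",
    "ё": "yā", "я": "ya", "й": "y", "ъ": "ʿ",
    "б": "b", "в": "v", "г": "g", "ғ": "gh", "д": "d", "з": "z",
    "к": "k", "қ": "q", "л": "l", "м": "m", "н": "n", "п": "p",
    "р": "r", "с": "s", "т": "t", "ф": "f", "х": "ḫ", "ҳ": "h",
    "ч": "č", "ҷ": "j", "ш": "š",
}


def romanize(word: str) -> str:
    """Romanise a Tajik Cyrillic word letter by letter.

    Characters that are not in the table are passed through unchanged.
    """
    # NFC makes sure letters such as ӯ and й are single code points.
    word = unicodedata.normalize("NFC", word).lower()
    return "".join(CYRILLIC_TO_LATIN.get(ch, ch) for ch in word)


for example in ["нон", "дӯст", "шер", "равшан", "хона"]:
    print(example, "->", romanize(example))
```

The table makes the point of this section visible at a glance: the Cyrillic о stands for ā, while the sound ō is written ӯ.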


  1. The short a

The short a in Tajik Persian is very similar to the Iranian one, but generally more open. Some speakers may even pronounce it like the sound represented by the ‘u’ in RP English ‘cup’ or ‘duck’.


  1. Diphthongs aw and ay

The Iranian ow, for example in روشن (rowšan), مورد (mowred) etc., is habitually pronounced as aw in Tajik, written as ав (av), e.g. روشن: Taj. равшан (ravšan) vs. Ir. rowšan; مورد: Taj. маврид (mavrid) vs. Ir. mowred; مولانا: Taj. Мавлоно (Mavlānā) vs. Ir. Mowlānā; گوهر: Taj. гавҳар (gavhar) vs. Ir. gowhar; چطور: Taj. читавр (čitavr) vs. Ir. čeṭowr. Note that some Tajik speakers, especially on official occasions, tend to pronounce the Cyrillic в literally as v, so as to be closer to the orthography. As Cyrillic does not have a letter representing the sound w, the letter в ended up representing both v and w. Most speakers in everyday situations, however, pronounce в as w after a vowel. I will explain more below in the Consonants section.

In جو ‘barley’ and رو ‘Go!’, it is the monophthongs, ū and o respectively, in Iranian, which correspond to the diphthong aw in Tajik, where the two words are ҷав (jav) and рав (rav). ‘Beer’, therefore, is оби ҷав (āb-i jav(w)) in Tajik, rather than āb-e jū.

The Iranian ey is habitually pronounced as ay in Tajik, e.g. کی: Taj. кай (kay) vs. Ir. key; غیبت: Taj. ғайбат (ghaybat) vs. Ir. qeybat; غیر: Taj. ғайр (ghayr) vs. Ir. qeyr; حیوان: Taj. ҳайвон (hayvān) vs. Ir. heyvān; کیهان: Taj. кайҳон (kayhān) vs. Ir. keyhān; حیرت: Taj. ҳайрат (hayrat) vs. Ir. heyrat.


  1. Prefixes and personal endings

The pronunciation of some frequent prefixes and endings can be slightly different in Tajik. Apart from the pronunciation of the present prefix می as mē in Tajik (hence Taj. меравам (mēravam) = Ir. میروم mīravam), the subjunctive and imperative prefix بِ is pronounced as bi in Tajik and not be, e.g. Taj. бигир (bigīr) vs. Ir. بگیر (begīr), for the kasra is almost always pronounced as i in Tajik. As is the case with Iranian Persian, mutation of bi does happen in some words in Tajik: бубин (bubīn) (Ir. بِبین bebīn); бубахшед (bubaḫšēd) (Ir. بِبخشید bebaḫšīd), бикунад (bikunad) (Ir. بُکند bokonad), etc. In fact, bi in Tajik is often omitted in the subjunctive and the imperative, which I will mention in a couple of weeks in a post dedicated to grammatical differences.

The first person plural ending ایم and the second person plural ending اید are always pronounced with a long ē, i.e. as ем (ēm) and ед (ēd).

The preposition/prefix بی ‘without’, pronounced bī in Iranian Persian, is always бе (bē) in Tajik (and Afghan) Persian.

It is worth mentioning that the first person possessive and verbal ending م, which is pronounced as am in standard Tajik and standard Iranian Persian, can often become om or um in some dialects and informal registers. E.g. colloq. Ir. میرم (mīram) ‘I go’ vs. colloq. (dial.) Taj. мерум (mērum).

The third person singular present ending -ad is -e in colloquial Iranian Persian, but is always -a, i.e. with the d dropping out, in colloquial Tajik (and Afghan) Persian. Compare colloq. Ir. میکنه (mīkone) with colloq. Taj. мекуна (mēkuna).


The accusative marker را (rā) tends to be shortened to o in colloquial Iranian Persian. In colloquial Tajik, it can also be shortened, not to o but to a. E.g.

Ir. اینو ببین/نگاه کن. (īno bebīn/negāh kon.)

Taj. Ина бин. (īna bīn.)

‘Look at this.’


  1. Pronunciation of final ه as a vowel

The final ه, which is pronounced e in Iranian Persian, is always a in Tajik (and Afghan) Persian, recalling its Middle Persian etymology (-ag):

  1. Past participle: کرده: Taj. карда (karda) vs. Ir. karde; شده: Taj. шуда (šuda) vs. Ir. šode, etc.
  2. Adjective/adverb: دوباره: Taj. дубора (dubāra) vs. Ir. dobāre; دیوانه: Taj. девона (dēvāna) vs. Ir. dīvāne; دو نفره: Taj. ду нафара (du nafara) vs. Ir. do nafare; سالیانه: Taj. солиёна (sāliyāna) vs. Ir. sāliyāne; دخترانه: Taj. духтарона (duḫtarāna) vs. Ir. doḫtarāne, etc.
  3. Nouns: خانه: Taj. хона (ḫāna) vs. Ir. ḫāne; پروانه: Taj. парвона (parvāna) vs. Ir. parvāne; روزه: Taj. рӯза (rōza) vs. Ir. rūze, etc.

The original pronunciation, a, is preserved in Iranian Persian when it is followed immediately by the present forms of ‘to be’, where the initial vowel a contracts with ه or prolongs it: e.g. مادرم در خانه ست (Mādaram dar ḫāna‘(a)st), not *ḫāne’st; من این کار را انجام داده ام (Man īn kār rā anjām dāda‘am), not *dāde’am.
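The regular vowel correspondences described so far (final -e to -a, ow to av, ey to ay, short e to i, short o to u) can be tried out on romanised words. The sketch below is a hedged illustration only: the function name `iranian_to_tajik` is hypothetical, the input is assumed to be a romanised Iranian form in the transliteration used in this post, and the rules deliberately ignore the exceptions mentioned here (for instance mehmān, yek, ke) as well as the ē/ī and ō/ū distinctions, which cannot be predicted from the Iranian form.

```python
import unicodedata


def iranian_to_tajik(word: str) -> str:
    """Apply the regular vowel correspondences to a romanised Iranian word.

    This is a simplification: it knows nothing about exceptions or about
    where Tajik has long ē or long ō.
    """
    # NFC keeps long vowels (ā, ē, ī, ō, ū) as single characters, so the
    # replacements below only touch the plain short vowels e and o.
    w = unicodedata.normalize("NFC", word)
    if w.endswith("e"):        # final ه: -e -> -a
        w = w[:-1] + "a"
    w = w.replace("ow", "av")  # diphthong ow -> aw, written av
    w = w.replace("ey", "ay")  # diphthong ey -> ay
    w = w.replace("e", "i")    # kasra: short e -> short i
    w = w.replace("o", "u")    # ḍamma: short o -> short u
    return w


for example in ["karde", "šode", "dobāre", "mowred", "keyhān", "begīr"]:
    print(example, "->", iranian_to_tajik(example))
```

The order of the steps matters: the final -e and the diphthongs are handled first, so that the general e to i and o to u replacements do not turn them into something else.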


  1. Tavānistan

In Tajik Persian, as in Iranian Persian, the present stem tavān– and the past stem tavānist– of this verb can both be contracted in everyday speech. In colloquial Iranian Persian, they become tūn– and tūnest– respectively, but in colloquial Tajik, they are tān– and tānist–, i.e. colloq. Ir. میتونم (mītūnam) = colloq. Taj. метонам (mētānam). This also occurs in Afghan Persian.


  1. Miscellaneous

Many other words are pronounced differently in Tajik Persian. These include:

  1. Vowel insertion: قدر: Taj. қадар (qadar) vs. Ir. qadr; hence Ir. چقدر (čeghadr, or colloquially čeghad) is Taj. чи қадар (či qadar, which in colloquial Tajik gets shortened to čiqa).
  2. e -> a: This happens often, e.g. یک: Taj. як (yak) vs. Ir. yek; شش: Taj. шаш (šaš) vs. Ir. šeš (or even šīš); کشیدن: Taj. кашидан (kašīdan) vs. Ir. kešīdan, etc.
  3. Different vocalisation: پشیمان: Taj. пушаймон (pušaymān, i.e. پُشَیمان) vs. Ir. pašīmān, i.e. پَشِیْمان.
  4. In general, Arabic loanwords keep their Arabic vocalisation in Tajik Persian, but in Iranian Persian it may be different, e.g. فدا is фидо (fidā) in Tajik but fadā in Iranian.



The behaviour of consonants of Tajik Persian is largely the same as that in Iranian Persian. Some points to bear in mind are:


  1. No merging of ق and غ

In the Perso-Arabic script, the sound q, a ‘throaty k’, is represented by the letter ق, and the sound gh, a sound similar to the French ‘r’, is represented by غ. These two distinct sounds have merged in standard Iranian Persian, so much so that both letters are pronounced as q at the start and end of a word, and gh elsewhere, regardless of their original phonetic values. Tajik Persian, however, keeps q and gh apart. In Tajik, q (қ) is always q (sometimes relaxed into x before t) and gh (ғ) is always gh. Therefore, you wouldn’t hear غم pronounced as قَم, or عقیده pronounced as عغیده in Tajik.
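The contrast between the two systems can be put as two small rules: in Iranian Persian the position of the letter decides the sound, while in Tajik the letter itself does. The Python sketch below illustrates this under stated assumptions: the function name `compare_q_gh` is made up for the example, the positional rule is the simplified one given in this post, and the input is assumed to be a plain Perso-Arabic word without vowel marks.

```python
QAF = "ق"
GHAYN = "غ"


def compare_q_gh(word: str) -> list[tuple[str, str, str]]:
    """List (letter, Iranian value, Tajik value) for every ق or غ in a word.

    Iranian rule (simplified, as described in this post): q at the start
    or end of a word, gh elsewhere, whichever letter is written.
    Tajik rule: ق is always q and غ is always gh.
    """
    results = []
    last = len(word) - 1
    for index, letter in enumerate(word):
        if letter not in (QAF, GHAYN):
            continue
        iranian = "q" if index in (0, last) else "gh"
        tajik = "q" if letter == QAF else "gh"
        results.append((letter, iranian, tajik))
    return results


for example in ["غم", "عقیده"]:
    print(example, compare_q_gh(example))
```

For غم the Iranian rule gives q where Tajik has gh, and for عقیده it gives gh where Tajik has q, which are exactly the two pronunciations you would not hear in Tajik.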


  1. No palatalisation of k

The k sound in Iranian Persian is palatalised, i.e. it acquires a hint of a y sound, when it is followed by a, i/ī, or e. I am sure many learners of Iranian Persian have realised that کردن is not straightforwardly pronounced as kardan, but somewhat like kyardan, and که not as ke, but really as kye. This phenomenon does not exist in Tajik Persian.


  1. v and w

The sound v never really existed in Middle Persian. It was only w. W evolved completely into v in modern standard Iranian Persian, but in Tajik Persian, it ended up as two varieties, v and w. Whereas the Tajik Cyrillic script only writes v (в), in reality, it is pronounced v at the start of a word or in between vowels, but as w when it is part of a diphthong, as I mentioned earlier. Some Tajik speakers will pronounce it as w in between vowels, too (dēvāna -> dēwāna, but only девона in the script). Standard Afghan Persian, on the other hand, pronounces this consonant predominantly as w in all positions.
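The everyday distribution of v and w can be written down as a positional rule on the Cyrillic spelling. The following Python sketch is one possible reading of that rule, not a description of any one speaker: в counts as w when it follows a vowel and is not itself followed by a vowel, and as v otherwise. The name `realise_v`, the vowel list, and the treatment of в after a consonant as v are assumptions made for the illustration.

```python
import unicodedata

# Tajik Cyrillic vowel letters (assumed list for this sketch).
VOWELS = set(unicodedata.normalize("NFC", "аеёиоуэюяӣӯ"))


def realise_v(word: str) -> list[str]:
    """Return 'v' or 'w' for each в in a Tajik Cyrillic word.

    w: after a vowel and not before a vowel (second half of a diphthong).
    v: word-initially, between vowels, and (by assumption) after consonants.
    """
    word = unicodedata.normalize("NFC", word).lower()
    results = []
    for index, letter in enumerate(word):
        if letter != "в":
            continue
        after_vowel = index > 0 and word[index - 1] in VOWELS
        before_vowel = index + 1 < len(word) and word[index + 1] in VOWELS
        results.append("w" if after_vowel and not before_vowel else "v")
    return results


for example in ["равшан", "ҷав", "девона", "вақте"]:
    print(example, realise_v(example))
```

Speakers who follow the orthography closely, or who say w between vowels as well, would need a different rule; this version only models the most common everyday pattern described above.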


  1. The r

Some Iranian Persian speakers may not roll the r completely, giving only a hint of the trill and leaving it sounding largely like the English r, especially when it comes before a consonant (cf. the way some Iranians say the r in کردن). This is not the case in Tajik Persian, where r is always pronounced ‘the hard way’, like, for example, in Italian.


Test yourself

1. Watch this video, narrated in Tajik, about the Persian poet Omar Khayyam, and see how many of the above-mentioned points you can spot, and whether you can hear differences that I have not mentioned.


2. The following sentences are from the video. I have written them out in the Perso-Arabic script. Try transliterating them into Cyrillic:

نیشاپور یکی از شهرهای اساسی صنعت ایران بوده در ولایت خراسان جایگیر شده است. سالهای کودکی و نورسی عمر خیام در نیشاپور گذشته است. عمر خیام اولا در مدرسه ی نیشاپور که در آن وقت شهرت کلان داشت مشغول علم گشته و بعدا تحصیل خود را در بلخ سمرقند و بخارا دوام داده است.

[image: a colourful panel at the Rudaki Mausoleum in Panjakent, Tajikistan]