You may know that one cannot read Anandabazar Patrika (ABP), the largest Bengali newspaper from India, in most modern browsers such as Mozilla Firefox, Apple Safari, Google Chrome, and Opera (the only exception is Microsoft IE). This is because, even in 2010, Anandabazar has failed to adopt Unicode, the international standard for digital Bangla.

I am launching a petition site with a live Unicode proxy which demonstrates that Anandabazar Patrika could be read not only in all modern browsers but also on mobile phones, if they cared to adopt the international standard.

On New Year's Day, Anandabazar Patrika, the largest Bengali newspaper from West Bengal, began one of its editorials with the sentence: “সাম্প্রতিক পশ্চিমবঙ্গের জনপ্রিয়তম শব্দবন্ধ ‘পরিবর্তন চাই’” (“The most popular phrase in recent West Bengal: ‘We want change’”). The same editorial ends with the proposition: “নূতন বৎসরের মূলমন্ত্র হউক ‘পরিবর্তন চাই’” (“Let the mantra for the new year be ‘We want change’”). Leaving aside the politics, there is a serious need for change in technology adoption in West Bengal, and that is to help its beloved language, Bengali, survive in its digital avatar.

In their own words, Anandabazar Patrika (ABP) may sound like a champion of change, but in practice they are no different. As a leader in the Bengali publishing industry, they might be expected to be at the forefront of improving the digital standard for Bengali. Unfortunately, their actions say just the opposite. They continue to use non-standard Bitstream font technology on their website instead of the international standard, Unicode. One of their “supported browsers” is Netscape Communicator, whose official support ended in 2008. They also recommend the Firefox plugin Padma. As the author of ABP support in Padma, I find this rather strange: they are asking users to convert their content to Unicode (by using Padma) rather than serving their content directly in Unicode.

It may be mentioned that, like many other non-Latin scripts, the digital representation of Bengali text suffered from the lack of an encoding standard in its early phase. However, with the advent of Unicode, the universal encoding standard, this is no longer an issue. The Unicode standard has been widely adopted across operating systems, and all recent versions of Windows, Mac, and Linux support Unicode natively. According to statistics from the internet giant Google, Unicode has been the most frequently used encoding on the internet since 2008.
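
To make the idea of a standard encoding concrete, here is a minimal Python sketch. It takes the word বাংলা, lists the Unicode code point and official character name of each character, and then encodes the word as UTF-8. The helper name `describe` is my own and not part of any library.

```python
import unicodedata

def describe(text):
    """Return (code point, official Unicode name) pairs for each character."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]

# The word "Bangla" written with explicit escapes, so the source stays ASCII.
word = "\u09AC\u09BE\u0982\u09B2\u09BE"  # বাংলা

for code_point, name in describe(word):
    print(code_point, name)

# Every Bengali character occupies three bytes in UTF-8.
encoded = word.encode("utf-8")
print(len(word), "characters,", len(encoded), "bytes")

# The round trip is lossless on any system that supports Unicode.
assert encoded.decode("utf-8") == word
```

Because every system agrees on these code points, the same bytes display as the same Bengali word everywhere, with no site-specific font required.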

Unicode adoption for Bengali has also increased significantly in the recent past. Let me mention some examples.


Bangladesh:

It may be noted that Bengali is the national language of Bangladesh, which suffered from the same problem. However, there has been a dramatic increase in the adoption of Unicode there lately. The largest newspaper in Bangladesh by circulation, Prothom Alo, has now switched to Unicode. Until recently it used its own proprietary encoding. Other prominent newspapers that have switched from proprietary encodings to Unicode are Amar Desh, Sangbad, Daily Sangram, Manab Zamin, and Samakal.

West Bengal:

The West Bengal government has now adopted Unicode 5.0 as the standard encoding for Bengali. Their official website, Banglar Mukh, has finally switched to Unicode. Furthermore, with their funding, the entire literary work of Nobel laureate Rabindranath Tagore has been released in Unicode. Tagore’s works are now in the public domain due to the expiration of copyright. The credit for these encouraging developments must go to the Society for Natural Language Technology Research and to MAT-3 Impex, the company behind some of these implementations.

Coming back to the technology front, there is now a new kid in the great browser arena: Google Chrome. This snappy browser supports Unicode natively but currently uses a buggy font for Bengali by default. This causes some Bengali text to appear garbled. Most of these issues can be solved by simply changing the default font. To do so, click on

Wrench-->Options-->Under the Hood-->Change fonts and language settings

and then choose the font you prefer for Bengali. For example, on Ubuntu you can choose FreeSans or FreeSerif. These fonts have nice glyphs for Bengali.

[Update: Please see at the bottom of this post for a link to an improved version of Padma.]

Anandabazar Patrika (ABP) and Bartaman Patrika (BP) are two (of the big four) well-known Bengali newspapers published from West Bengal, India. In the Internet era, their online versions are not just a matter of convenience but the only route of access for many of us. Unfortunately, their online versions continue to live in the past by using non-standard, ancient dynamic font technology instead of upgrading to standard Unicode.

The worst part is that to view their websites you need to have Internet Explorer installed on your machine. So if you are a Linux or Mac (or any other non-Windows) user, you are left on your own.

Fortunately, there is now a simple way for Firefox users on Linux and Mac to read these websites, using a Mozilla extension named Padma by Nagarjuna Venna and his team. To get Padma working, (a) you need a Unicode Bengali font (Linux users may already have one; Mac users can get one from Ekushey), (b) you need Firefox (version 3 is recommended on Linux and required on Mac), and (c) you need to install Padma.

Padma can transform a given non-standard encoding to standard Unicode on the fly. Of course, for Padma to work, it must know the font encoding of the particular website.
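
The general idea of such a conversion can be sketched in a few lines of Python. This is not Padma's code (Padma is a Firefox extension), and the font map below is invented purely for illustration; it is not the real ABP or BP table. The names `HYPOTHETICAL_FONT_MAP` and `legacy_to_unicode` are my own.

```python
# Hypothetical font map: legacy byte value -> Unicode string.
# These pairs are made up for illustration; they are NOT the real ABP or BP mapping.
HYPOTHETICAL_FONT_MAP = {
    0x41: "\u0995",              # ক
    0x42: "\u09AC",              # ব
    0x61: "\u09BE",              # া (vowel sign aa)
    0x62: "\u09B2",              # ল
    0x63: "\u0995\u09CD\u09B7",  # ক্ষ: one legacy glyph can expand to several code points
}

def legacy_to_unicode(data: bytes, font_map: dict) -> str:
    """Translate font-encoded bytes to Unicode text using a lookup table.

    Bytes missing from the table are passed through unchanged as Latin-1 characters.
    """
    return "".join(font_map.get(byte, chr(byte)) for byte in data)

# In the legacy font, the bytes "BaBa" would be displayed as the Bengali word বাবা.
converted = legacy_to_unicode(b"BaBa", HYPOTHETICAL_FONT_MAP)
print(converted)
```

The lookup itself is trivial. The hard part, as described below, is working out the table for each website's font in the first place.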

As it turns out, I wrote support for ABP in Padma more than a year ago. My job was made simple by an earlier CGI program by Tanmoy Bhattacharya, who had already decoded the font mapping for ABP. A couple of months ago, I also added support for Bartaman Patrika in Padma. So, courtesy of Tanmoy’s font-map decoding, the latest version of Padma (0.4.13) supports both ABP and BP.

There is a known issue of incorrect rendering of Bengali matras in certain situations. See for example Runa-Sankarshan’s photostream here. Many of these were due to a simple bug and have been fixed in the latest version (0.4.13). However, fixing the remaining ones requires significant changes in Padma. ABP and BP both use three different fonts simultaneously. Most ligatures come from the 2nd and 3rd fonts, whereas matras come from the 1st font. Padma transforms each font separately and doesn’t merge these different font elements into a single element. This leads to incorrect rendering, which is hard to solve without changing the core of Padma.
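
One way to see why separate per-font conversion can go wrong is the following Python sketch. It rests on an assumption of mine: that the affected matras are the ones drawn to the left of the consonant (such as ি), which legacy fonts store before the consonant and Unicode stores after it. The run layout, font names, and the `reorder` helper are all hypothetical, and this is not Padma's actual algorithm.

```python
# Vowel signs drawn to the left of the consonant: ি, ে, ৈ.
PRE_BASE_SIGNS = {"\u09BF", "\u09C7", "\u09C8"}

def reorder(run: str) -> str:
    """Move each left-side vowel sign from before the next character to after it.

    Simplification: the next item is treated as a single character, so conjuncts
    made of several code points are not handled.
    """
    out = []
    i = 0
    while i < len(run):
        if run[i] in PRE_BASE_SIGNS and i + 1 < len(run):
            out.append(run[i + 1])  # consonant first (logical order)
            out.append(run[i])      # then the vowel sign
            i += 2
        else:
            out.append(run[i])
            i += 1
    return "".join(out)

# Visual-order text split across two font elements (hypothetical font names):
# the matra sits in one element, the consonant in the next.
runs = [("font1", "\u09BF"), ("font2", "\u0995")]

# Converting each element on its own: the matra never sees its consonant.
per_run = "".join(reorder(text) for _, text in runs)

# Merging the elements first lets the reordering cross the font boundary.
merged = reorder("".join(text for _, text in runs))

print(repr(per_run))  # vowel sign still precedes the consonant (wrong order)
print(repr(merged))   # consonant followed by vowel sign: কি
```

Under this assumption, the fix requires looking across element boundaries, which is exactly what a converter working one font element at a time cannot do.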

The bigger issue, however, is the need for Padma itself. I tend to agree with the concerns expressed by Sankarshan in a discussion thread here. The real question, then, is how long these websites are going to keep themselves confined to their own non-standard encoding.

This led me to wonder: don’t their technical staff realize what they are missing by not upgrading to Unicode? Firstly, by upgrading to Unicode they could readily expand their current user base. Secondly, the use of Unicode would make their content searchable in search engines like Google. This could bring them additional revenue from search-engine traffic. The number of Bengali internet users is going to increase in the coming years, and a significant portion of the new users will come from the interior regions. Undoubtedly, many of these users will be more comfortable searching with Bengali keywords. Thirdly, by continuing to use a non-standard encoding, they are piling up an archive of non-standard content that will take a big effort to convert to standard form. So, in my humble opinion, it would be a prudent decision for them to upgrade their websites to Unicode sooner rather than later.

Nevertheless, there is now a positive sign: Star Ananda, a sister group of Anandabazar Patrika, has started using Unicode for its Bengali website (though the “charset” it declares doesn’t say so). I hope this marks the beginning of change.
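
A mismatch between the declared charset and the actual bytes can be checked with a short Python sketch like the one below. The sample page and the declared value `iso-8859-1` are hypothetical; I am not reproducing Star Ananda's actual markup. The function name `looks_like_bengali_utf8` is my own.

```python
def looks_like_bengali_utf8(raw: bytes) -> bool:
    """Return True if raw is valid UTF-8 and contains Bengali-block characters."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    # The Unicode Bengali block spans U+0980 to U+09FF.
    return any("\u0980" <= ch <= "\u09FF" for ch in text)

# Hypothetical page body: the Bengali word খবর ("news"), stored as UTF-8 bytes.
page = "<p>\u0996\u09AC\u09B0</p>".encode("utf-8")

# Hypothetical charset label that does not match the real encoding.
declared = "iso-8859-1"

# Trusting the wrong label yields unreadable text; sniffing the bytes does not.
mislabelled = page.decode(declared)
print(looks_like_bengali_utf8(page))
print(mislabelled == page.decode("utf-8"))
```

A browser that trusts the wrong label shows garbage even though the underlying content is proper Unicode, so getting the declaration right is part of adopting the standard.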

Update (May 9, 2009): Please see this post for an update on the above-mentioned incorrect rendering issue.

