Tuesday, April 25, 2006

Reach of the Wiki

Wikipedia, many of us agree, is a marvelous resource. No less than the prestigious journal Nature ran a study and found that the Wiki is about as accurate as the Encyclopaedia Britannica. A visit to Wikipedia has become almost a daily affair for me. But that’s not the main point of this post.

I think the Wiki also provides a very good and realistic estimate of a number of socio-economic indicators (in this case for India). There’s been too much media talk and hype about IT revolutions in India, and about greater nationwide penetration of the internet. So much so that some of us have even started believing the hype. The Wiki turns out to be a very good indicator of how true that is.

The Wiki, after all, is the ultimate internet tool. It provides information, and is created and maintained by its users. So, the more aware a society is of the internet and its power, the more likely it is to use the Wiki. So, I decided to take a look at the language-wise breakdown in the Wiki. The order of representation of Indian languages (when compared to each other) hardly surprised me, but something else did.

The most represented Indian languages are Telugu, Tamil and Kannada (see the screen capture). This is not that surprising, since they are spoken in the most “IT enabled” states in the country. But it’s the sheer number (or lack) of entries that caught my attention. Only Telugu, a language spoken by around 70 million people, has more than 3000 entries. It is ranked 64th in terms of number of entries, with a similar number of entries to Irish-Gaelic, Kurdish, and Latvian. Tamil has around 2500 entries and Kannada has about 1500. Hindi, India’s most widely spoken language (with an estimated 400 million native speakers, and perhaps a couple of hundred million more non-native speakers), has only around 1200 entries in total. Other major Indian languages (Bengali, Punjabi, Marathi, Gujarati, Kashmiri) have only a few hundred entries at most. In contrast, Russian and Chinese have over 50000 entries each, and Indonesian and Malay have well over 10000 entries each. So, the general situation in India becomes clearer (and I think these numbers reflect it well) without having to resort to expensive or inaccurate surveys.

To me, given that the majority of the population speaks one or the other Indian language, this says a few things:
1) Even if internet penetration has increased (with reports of over 10% of the population at least having access to the internet through net cafes etc), understanding of the power of the internet remains minimal.
2) The concept of the internet as an “enabling tool” is yet to catch on.
3) The internet in India is not widely used as an information gathering, sharing and educational resource.
4) The internet is either not reaching the masses (who are most fluent in their native language), and/or all internet/computer education in India is only in English.

Which means that (a) there is a large untapped market and a fantastic economic opportunity for someone to go in there and create IT enabled learning in vernacular languages. It is not as if all Indian language speakers are poor. In fact, a majority of the population (even in cities) is most comfortable in Indian languages, and Indian language newspapers outsell their nearest English rival severalfold. (b) If tapped, there’s a great deal of creative energy here that’s waiting to be released. The extended question would be: if one were to start educating people to use computers and the internet in Indian languages, where would one start? Would it be in an area with some infrastructure and awareness (e.g. Andhra, Tamil Nadu, Karnataka) or in a completely untapped and unexplored region (e.g. most of North India)?


Vikrum said...


This is an excellent article. You're right that there is tremendous potential for the Internet to develop using Indian languages.

But I don't think that the potential will be realized any day soon. In India, success, technology, the Internet, and the English language all seem to be strongly linked. Whether this is okay or not is a different story - but that seems to be the reality.

oddan said...

I disagree slightly. In India most people who have access to the internet count English as their primary language of writing and reading. In contrast, for the Russians and Chinese, Russian and Chinese are the primary languages, hence more content is available in these languages than in Hindi or Tamil or Telugu or any other Indian vernacular. A good example would be you or me. Though we might be comfortable speaking our mother tongue, we write and read primarily in English.

Sunil said...

Vik......you're right, but it needn't be that way....

Oddan.....disagree with what? That's what I said.....that the internet hasn't penetrated deep enough. At best 5% of the total population of the country is comfortable enough in English to use English print/news media and the internet. The rest are still more comfortable in their own languages. And the power of the internet is not restricted by language......so there's a huge opportunity for that segment to grow! You say Russian or Chinese is the primary language, SO there is more content. But similarly, there are far more people who speak/study in (say) Hindi than in English (even if the school may be an "English medium school"). They don't have much Hindi content on the web.....and it's partly because they haven't been exposed to it, or been shown its reach and power. The best thing about wikis is that THEY get to create it and let it grow.

greatbong said...

Very interesting article. I however tend to agree with Oddan in that most Indian netizens would prefer to write in English---I for instance would never think of using Bengali.

Sunil said...

arnab...perhaps you and Oddan are right...........and it'll never come to be that different languages take off on the net. But I still do think there's a great opportunity here that's being ignored. In some ways it's a pity that (like Vik said) progress is measured in terms of English knowledge.

ravptor said...

Valid point that you have made but the fact is these 'IT' enabled states have the young generation who think in English and not in their mother tongue. Take the instance of blogs.

I found a lot of Arabic, Chinese, Japanese and even Greek blogs but there are negligible numbers of Indian Language blogs. Yes we are the most vocal of the people but then as long as we don't think in our native languages, we cannot develop resources for that language.

Sunil said...

ravi.....yeah....i agree with you. But this was one of those "what if", or "if only", kind of posts. There's a lot that's being missed out on in India, and this (i think) is one of those things!
