Local content generation for Wikipedia, Google

January 9, 2011

Wikipedia article growth for Amharic and Swahili. {w3c: The Multilingual Web: Where are we?}

The new Google in Sub-Saharan Africa resource is a mix of standard Google tools and uniquely African assets. Included in the slew of new features are resources for educators (University programs) and developers (g-Africa). More features will undoubtedly arise as time goes on, so be sure to bookmark the page.

Of particular interest is the Google Africa Community Translation project, which irons out linguistic issues on African domains and helps research volunteers organize their activities. The site also provides a tidy list of the African nations that have their own Google search engines.

In October 2010, Google Africa Community Translation published an insightful presentation for the World Wide Web Consortium (W3C). Included are:

  • highlights of the language density found in Africa
  • low demand for local language services due to the high prevalence of a handful of languages for education, training, ICT, oral history
  • tables and charts showing the gap between Internet user growth and the number of Wikipedia articles for a given language. Amharic and Swahili are specifically listed.
  • ways to grow the online community through publishing contests, tools, and standards
  • finally, the question: Do users first generate content, or does content draw in users?

The document references a Wikipedia Stats page, which was last updated in January 2012. When the table is narrowed to languages spoken in Africa, we can see the number of speakers and editors, the number of articles, and how often that content is visited. The most obvious trend is the lack of content for languages other than English, French, Spanish, Portuguese, and Arabic.

More editors are needed. Hopefully, with the measures outlined by Google, volunteers can be encouraged to contribute minority-language content to the Web:

Code    Language         Speakers (Prim + Sec)  Editors per million speakers  Visits/hr   Articles
en      English          1500 M                 24                            9,889,432   3,455,258
fr      French           200 M                  24                            864,971     1,025,634
es      Spanish          500 M                  8                             1,396,322   667,680
pt      Portuguese       290 M                  6                             496,775     615,648
ar      Arabic           530 M                  1                             73,754      130,833
simple  Simple English   1500 M                 0.1                           8,963       65,811
sw      Swahili          50 M                   0.3                           1,671       21,020
af      Afrikaans        13 M                   2                             2,761       16,412
yo      Yoruba           25 M                   0.1                           700         10,182
arz     Egyptian Arabic  76 M                   0.2                           1,466       6,984
am      Amharic          25 M                   0.3                           513         5,519
mg      Malagasy         20 M                   0.1                           262         2,609
so      Somali           14 M                   0.5                           232         1,485
ln      Lingala          25 M                   0                             204         1,290
wo      Wolof            4 M                    0.3                           210         1,081
kab     Kabyle           8 M                    0.1                           130         895
ig      Igbo             22 M                   0                             158         663
kg      Kongo            7 M                    0                             100         573
bm      Bambara          6 M                    0                             118         348
ss      Siswati          3 M                    0.7                           83          256
ee      Ewe              4 M                    0                             109         252
om      Oromo            26 M                   0                             55          209
zu      Zulu             26 M                   0.1                           75          195
ts      Tsonga           3 M                    0                             81          172
ha      Hausa            39 M                   0                             66          147
ve      Venda            875 k                  0                             50          140
rw      Kinyarwanda      12 M                   0                             46          135
ti      Tigrinya         7 M                    0                             40          131
sg      Sangro           3 M                    0                             67          126
ki      Kikuyu           5 M                    0                             53          112
st      Sesotho          5 M                    0                             65          112
xh      Xhosa            8 M                    0                             52          107
ak      Akan             19 M                   0                             61          97
tn      Setswana         4 M                    0                             45          97
rn      Kirundi          5 M                    0                             26          90
ny      Chichewa         9 M                    0                             27          75
tum     Tumbuka          2 M                    0                             30          66
ff      Fulfulde         13 M                   0                             36          58
sn      Shona            7 M                    0                             33          56
tw      Twi              15 M                   0                             37          52
lg      Ganda            10 M                   0                             24          47
ng      Ndonga           690 k                  0                             11          25
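
One way to make the content gap concrete is to divide each language's article count by its number of speakers. The short Python sketch below does this for four rows, using the speaker and article figures from the table above. The "articles per million speakers" measure and the function name are illustrative assumptions, not something the Wikipedia Stats page or the Google presentation reports.

```python
# Speakers (in millions) and article counts, copied from the table above.
# Only four rows are included to keep the sketch short.
rows = {
    "English": (1500, 3_455_258),
    "Swahili": (50, 21_020),
    "Amharic": (25, 5_519),
    "Hausa": (39, 147),
}


def articles_per_million(speakers_millions, articles):
    """Illustrative measure (an assumption): articles per million speakers."""
    return articles / speakers_millions


density = {
    name: articles_per_million(speakers, articles)
    for name, (speakers, articles) in rows.items()
}

# Print the languages from the highest to the lowest article density.
for name, value in sorted(density.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:<8} {value:10.1f}")
```

By this rough measure, English has several times as many articles per speaker as Swahili, and Hausa, despite its 39 million speakers, has only a handful of articles per million.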