Cybergeography of Language Families

This page analyzes the linguistic evolution of the Internet from a geographic perspective.

Here, "geographic perspective" refers to the geographic region where each language originated.

To this end, the Internet indicators produced by the OBDILCI model for the 343 languages with more than 1 million L1 speakers are grouped by language family.

European languages are the languages which originated in Europe (English, French, Spanish, Hungarian…), regardless of whether their speakers have since spread to other geographic regions. Note that the model currently includes 46 European languages.

African languages include all the local languages from Africa (such as Swahili, Fulfulde or Shona). Note that the model currently includes 143 African languages.

Asian languages include all the local languages from Asia (such as Chinese, Hindi or Vietnamese). Note that the model currently includes 135 Asian languages.

There are currently 8 languages from the Americas (such as Aymara, Quechua or Jamaican English Creole).

The different Arabic languages grouped under the Arabic macro-language are treated as one family.

There are unfortunately no Pacific languages at this stage, as none reaches one million speakers.

The indicators computed by family are the ones produced by the OBDILCI model:

Internauts %: the percentage of each language family's speakers who are connected to the Internet.

Contents %: the percentage of Web content in each language family.

Virtual Presence: the ratio between Contents % and Population % (the family's share of world L1+L2 speakers) for each language family.

Content Productivity: the ratio between Contents % and Connected Speakers % for each language family.

Connected Speakers %: the percentage of connected speakers of each language family among the world's L1+L2 connected speakers.

Note that all figures and percentages are computed on an L1+L2 basis (first language plus second languages).
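The two ratio indicators can be recomputed from the percentage indicators. The sketch below is a minimal illustration, not the OBDILCI implementation: it takes the rounded V5.1 percentages for three families from this page (decimal commas written as points) and assumes that Population % corresponds to the Speakers L1+L2 % row. The function and variable names are hypothetical.

```python
# Rounded V5.1 percentages copied from this page (decimal commas -> points).
# Assumption: "Population %" is the "Speakers L1+L2 %" row of the tables.
V5_1 = {
    "Africa": {"contents": 4.25, "speakers": 11.68, "connected": 6.77},
    "Asia": {"contents": 44.11, "speakers": 48.37, "connected": 47.28},
    "Europe": {"contents": 45.94, "speakers": 32.06, "connected": 38.55},
}


def virtual_presence(contents_pct, speakers_pct):
    """Contents % divided by the family's share of world L1+L2 speakers."""
    return contents_pct / speakers_pct


def content_productivity(contents_pct, connected_pct):
    """Contents % divided by the family's share of connected speakers."""
    return contents_pct / connected_pct


def ratios(table):
    """Return {family: (virtual presence, content productivity)}, rounded to 2 decimals."""
    return {
        family: (
            round(virtual_presence(row["contents"], row["speakers"]), 2),
            round(content_productivity(row["contents"], row["connected"]), 2),
        )
        for family, row in table.items()
    }


if __name__ == "__main__":
    for family, (vp, cp) in ratios(V5_1).items():
        print(f"{family}: virtual presence {vp}, content productivity {cp}")
```

A value above 1 means the family holds a larger share of Web content than of speakers (or of connected speakers). Because the inputs are already rounded, ratios recomputed this way for very small families can drift from the published ones.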

V5.1, April 2024 – Update demo-linguistic figures & ITU

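| Languages from >     | Africa | Americas | Arab world | Asia   | Europe | Pacific | Not Included | TOTAL  |
|----------------------|--------|----------|------------|--------|--------|---------|--------------|--------|
| Internauts %         | 36,5%  | 68,2%    | 67,8%      | 60,1%  | 88,0%  | 62,1%   | 49,11%       | 63,41% |
| Contents %           | 4,25%  | 0,29%    | 3,65%      | 44,11% | 45,94% | 0,03%   | 1,74%        | 100%   |
| Virtual Presence     | 0,36   | 0,83     | 0,89       | 0,91   | 1,43   | 0,52    | 0,51         | 1      |
| Content Productivity | 0,63   | 0,81     | 0,84       | 0,93   | 1,19   | 0,73    | 0,66         | 1      |
| Speakers L1+L2 %     | 11,68% | 0,35%    | 4,08%      | 48,37% | 32,06% | 0,05%   | 3,42%        | 100%   |
| Connected Speakers % | 6,77%  | 0,36%    | 4,37%      | 47,28% | 38,55% | 0,04%   | 2,65%        | 100%   |
| Number of languages  | 152    | 8        | 11         | 147    | 51     | 2       |              | 361    |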

Comments for V5.1: Few changes. Pacific languages appear for the first time.

V4, May 2023 – Update demo-linguistic figures

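| Languages from >     | Africa | Americas | Arab world | Asia   | Europe | Not Included | TOTAL  |
|----------------------|--------|----------|------------|--------|--------|--------------|--------|
| Internauts %         | 36,4%  | 65,8%    | 68,4%      | 56,3%  | 84,9%  | 47,92%       | 61,63% |
| Contents %           | 4,38%  | 0,29%    | 3,91%      | 43,76% | 45,66% | 2,00%        | 100%   |
| Virtual Presence     | 0,40   | 0,83     | 0,92       | 0,90   | 1,44   | 0,50         | 1      |
| Content Productivity | 0,61   | 0,81     | 0,83       | 0,94   | 1,19   | 0,65         | 1      |
| Speakers L1+L2 %     | 11,00% | 0,35%    | 4,23%      | 48,75% | 31,71% | 3,96%        | 100%   |
| Connected Speakers % | 7,17%  | 0,36%    | 4,70%      | 46,44% | 38,26% | 3,08%        | 100%   |
| Number of languages  | 143    | 8        | 11         | 143    | 47     |              | 342    |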

Comment for V4: The start of growth for the African family witnessed in V3.2 is confirmed and amplified.

The difference in Contents % between Asian and European languages will probably disappear in the near future as connectivity rates keep growing in Asia.
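The narrowing gap can be read directly from the Contents % figures published on this page. A minimal sketch, using the rounded published values for Asian and European languages (decimal commas written as points); the names are illustrative only.

```python
# Contents % for Asian and European languages, per model version,
# copied from the tables on this page (decimal commas -> points).
CONTENTS_PCT = {
    "V3.c": {"Asia": 43.32, "Europe": 46.49},
    "V3.1": {"Asia": 43.29, "Europe": 46.45},
    "V3.2": {"Asia": 43.33, "Europe": 46.11},
    "V4": {"Asia": 43.76, "Europe": 45.66},
    "V5.1": {"Asia": 44.11, "Europe": 45.94},
}


def europe_asia_gap(contents):
    """Return {version: Europe minus Asia Contents %, in percentage points}."""
    return {
        version: round(row["Europe"] - row["Asia"], 2)
        for version, row in contents.items()
    }


if __name__ == "__main__":
    for version, gap in europe_asia_gap(CONTENTS_PCT).items():
        print(f"{version}: {gap:.2f} points")
```

On these figures the gap shrinks from a little over 3 percentage points in V3.c to under 2 points in V4 and V5.1.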

One may be surprised to see the global rate of connectivity decrease slightly to 61.63%: this is the consequence of corrections in Ethnologue and/or ITU data which increased the number of speakers in countries with very poor connectivity. For example, the number of English speakers in Pakistan, a country whose connectivity rate has been corrected downwards by the ITU, has increased by 79 million, and that of French speakers in the Democratic Republic of the Congo by 31 million.
Thus the global connectivity rates for English and French decreased between versions 3.2 and 4, as did the global rate.
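The mechanism is that of a weighted average: the global rate weights each group's connectivity rate by its number of speakers, so adding speakers to a poorly connected group pulls the total down even if no individual rate changes. The sketch below illustrates this with invented figures (they are not OBDILCI, Ethnologue or ITU data); `global_rate` is a hypothetical helper.

```python
def global_rate(groups):
    """Speaker-weighted connectivity rate, in percent.

    groups: list of (speakers_in_millions, connectivity_rate_in_percent).
    """
    total_speakers = sum(speakers for speakers, _ in groups)
    connected = sum(speakers * rate / 100 for speakers, rate in groups)
    return 100 * connected / total_speakers


# Invented figures, for illustration only.
# A well-connected group and a poorly connected group of speakers.
before = [(1000, 70.0), (100, 20.0)]
# Same rates, but the speaker count of the poorly connected group is revised upwards.
after = [(1000, 70.0), (180, 20.0)]

if __name__ == "__main__":
    print(f"before revision: {global_rate(before):.2f}%")
    print(f"after revision:  {global_rate(after):.2f}%")
```

Neither group's own rate changes between the two lists, yet the overall rate falls because the poorly connected group now carries more weight.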

V3.2, March 2023 – Update persons connected per country figure

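| Languages from >     | Africa | Americas | Arab world | Asia   | Europe | Not Included | TOTAL  |
|----------------------|--------|----------|------------|--------|--------|--------------|--------|
| Internauts %         | 36,2%  | 65,7%    | 65,0%      | 56,0%  | 85,1%  | 53,65%       | 62,05% |
| Contents %           | 3,38%  | 0,25%    | 3,00%      | 43,33% | 46,11% | 3,92%        | 100%   |
| Virtual Presence     | 0,37   | 0,80     | 0,85       | 0,90   | 1,49   | 0,51         | 1      |
| Content Productivity | 0,58   | 0,75     | 0,81       | 0,95   | 1,22   | 0,59         | 1      |
| Speakers L1+L2 %     | 9,21%  | 0,31%    | 3,53%      | 48,39% | 30,87% | 7,69%        | 100%   |
| Connected Speakers % | 5,83%  | 0,33%    | 3,70%      | 45,81% | 37,67% | 6,65%        | 100%   |
| Number of languages  | 139    | 8        | 11         | 135    | 46     |              | 329    |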

Comment for V3.2: There is finally a start of growth for the African language family, mainly a consequence of the growth of the Internet connection rate in several African countries.

V3.1, August 2022 – Update persons connected per country figure

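| Languages from >     | Africa | Americas | Arab world | Asia   | Europe | Not Included | TOTAL  |
|----------------------|--------|----------|------------|--------|--------|--------------|--------|
| Internauts %         | 31,1%  | 61,4%    | 65,1%      | 51,9%  | 82,6%  | 49,07%       | 58,69% |
| Contents %           | 3,07%  | 0,24%    | 3,14%      | 43,29% | 46,45% | 3,81%        | 100%   |
| Virtual Presence     | 0,33   | 0,79     | 0,89       | 0,89   | 1,50   | 0,50         | 1      |
| Content Productivity | 0,59   | 0,75     | 0,80       | 0,94   | 1,21   | 0,59         | 1      |
| Speakers L1+L2 %     | 9,21%  | 0,31%    | 3,53%      | 48,39% | 30,87% | 7,69%        | 100%   |
| Connected Speakers % | 5,21%  | 0,32%    | 3,92%      | 45,82% | 38,30% | 6,43%        | 100%   |
| Number of languages  | 139    | 8        | 11         | 135    | 46     |              | 329    |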

V3.c, March 2022

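| Languages from >     | Africa | Americas | Arab world | Asia   | Europe | Not Included | TOTAL  |
|----------------------|--------|----------|------------|--------|--------|--------------|--------|
| Internauts %         | 29,8%  | 56,7%    | 64,0%      | 49,3%  | 82,6%  | 47,06%       | 56,91% |
| Contents %           | 3,03%  | 0,24%    | 3,16%      | 43,32% | 46,49% | 3,77%        | 100%   |
| Virtual Presence     | 0,33   | 0,77     | 0,89       | 0,90   | 1,51   | 0,49         | 1      |
| Content Productivity | 0,59   | 0,75     | 0,79       | 0,95   | 1,21   | 0,59         | 1      |
| Speakers L1+L2 %     | 9,21%  | 0,31%    | 3,53%      | 48,39% | 30,87% | 7,69%        | 100%   |
| Connected Speakers % | 5,16%  | 0,32%    | 3,97%      | 45,62% | 38,57% | 6,36%        | 100%   |
| Number of languages  | 139    | 8        | 11         | 135    | 46     |              | 329    |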

Comment for V3.c: This is the start of the computation of indicators by language family.

Note that the contents of Asian languages are getting closer to those of European languages in spite of the huge difference in connectivity rates.

The maps below give an insight into medium-term trends in the cybergeography of languages.
