MECILDI

A new section has been added under the title MAIN PROJECT 2: MECILDI. MECILDI stands for "targeted measurement of languages on the Internet", from its French initials.

It is a new and ambitious project to build a program capable of extracting language distribution and multilingualism parameters from any series of websites, taking due account of the fact that a website can have more than one language.

Taking web multilingualism into account in this process is both a complex problem and a historic breakthrough. When applied to TRANCO, the list of the one million most visited websites, it will make it possible to correct the extremely biased figures from W3Techs (see this article) and to offer the first-ever bias-corrected measurement of language distribution across the web pages of the most visited sites.
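To illustrate why counting a single language per site can skew the figures, here is a minimal sketch in Python. It is not MECILDI's actual method, which is not described here. The sample sites, the language codes, and the choice to split each site's weight evenly across its languages are all assumptions made only for illustration.

```python
from collections import Counter

# Hypothetical sample: each site maps to the list of languages it offers.
# Site names and language codes are invented for illustration only.
SITES = {
    "site-a.example": ["en", "fr", "es"],
    "site-b.example": ["en"],
    "site-c.example": ["fr", "en"],
    "site-d.example": ["es"],
}


def single_language_shares(sites):
    """Naive count: only the first listed language of each site is counted."""
    counts = Counter(langs[0] for langs in sites.values())
    total = sum(counts.values())
    return {lang: n / total for lang, n in counts.items()}


def multilingual_shares(sites):
    """Count every language of a site.

    Assumption: each site has a weight of 1, split evenly across its
    languages. A real measurement could weight by pages or traffic instead.
    """
    weights = Counter()
    for langs in sites.values():
        for lang in langs:
            weights[lang] += 1 / len(langs)
    total = sum(weights.values())
    return {lang: w / total for lang, w in weights.items()}


def multilingualism_rate(sites):
    """Fraction of sites that offer more than one language."""
    return sum(len(langs) > 1 for langs in sites.values()) / len(sites)


if __name__ == "__main__":
    print("single-language shares:", single_language_shares(SITES))
    print("multilingual shares:  ", multilingual_shares(SITES))
    print("multilingualism rate: ", multilingualism_rate(SITES))
```

In this toy sample, the naive count gives a larger share to the language that happens to be listed first on multilingual sites, while the second count spreads each site's weight over all the languages it actually offers.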

Version 1 of MECILDI is in its final testing phase, and we will share results soon. Stay tuned!

The first version of this project has been funded by la délégation générale à la langue française et aux langues de France of the French Ministry of Culture. It will next be applied to obtain multilingualism characteristics and language percentage breakdowns for a series of top-level domains (TLDs) associated with languages of France (.alsace, .bzh, .corsica, .gp, .mq, .yt, .nc, .eus, .pf and .wf).