Languages & Cultures
Project Overview


Between 1996 and 1998, Funredes led several studies on the topic of Languages & Cultures on the Internet. The results are currently available in English on this server. Here is a short presentation of each release, in chronological order.

March 96 : Funredes

L1: the first study on language

The first study on language focuses on French and Spanish. Using AltaVista, it compares the presence of a given set of words on the Web in English, French, and Spanish. The sample of 50 words was assembled without any real linguistic methodology.

The results are quite approximate; they show an English/French ratio of around 22 and a French/Spanish ratio of around 2.4.
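
As an illustration of how a ratio of this kind could be derived, the sketch below takes hit counts for the same concepts in two languages and averages the per-word ratios. The counts, the function name `language_ratio`, and the choice of a simple mean are assumptions made for this example; the summary does not state how the per-word counts were aggregated.

```python
from statistics import mean

def language_ratio(counts_a, counts_b):
    """Mean of the per-word ratios counts_a[w] / counts_b[w] over shared words."""
    ratios = [
        counts_a[word] / counts_b[word]
        for word in counts_a
        if word in counts_b and counts_b[word] > 0
    ]
    if not ratios:
        raise ValueError("no comparable words")
    return mean(ratios)

# Hypothetical hit counts keyed by concept, not data from the study.
english = {"monday": 2200, "heart": 4400, "freedom": 1100}
french = {"monday": 100, "heart": 200, "freedom": 50}

ratio = language_ratio(english, french)
print(f"English/French ratio: {ratio:.1f}")
```

Averaging per-word ratios gives every word the same weight; summing all counts first and dividing once would instead let the most frequent words dominate.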

Translation of the study by John Quatermann.

C1: the first study on culture

The first study on culture focuses on the French-speaking and Spanish-speaking worlds. Using AltaVista, it compares how often people considered culturally representative are cited on Web pages. The sample contains about 500 people, divided into 13 categories. The results are quite biased, but they show a strong presence of representatives of French-speaking culture, compared to the US world, in the domains where culture and business are not fused.

Translation of the study by John Quatermann.

 

March 97 : Funredes

L2: the second study on language

The study is a mere update of the first study on language. It shows a slight progression of French compared with English, and a strong progression of Spanish.

Translation of the study by John Quatermann.


March 98 : Funredes, with contributions of the Latin Union

L3: the third study on language

The study is a major update, presented at the Visionarios conference in Caracas (April 22-24, 1998). New features:

  • an analysis of the limitations of Web search engines and of the relative presence of diacritical characters. It leads to the following recommendation: abandon AltaVista and use HotBot.
  • the application to AltaVista of a method that we have called "complement of the empty set": it gives an approximation of the presence of each language, derived from the algorithm used for language recognition.
  • a review and evaluation of the results of the study conducted by Alis Technologies.

Trend: French continues to progress slowly, and Spanish is now very close to French.


September 98 : joint study by ACCT/Latin Union/Funredes

L4: the fourth study on language

Several important modifications have been made to the method, and the results are more reliable:

  • all the Latin languages are taken into account: Spanish, French, Italian, Portuguese, Romanian
  • a sample that meets strict linguistic criteria is built
  • the measurements are made with HotBot within the Web space
  • the measurements are made with DejaNews within the Usenet space
  • confidence intervals at 90% and 99% have been established
  • the results are weighted according to the size of the linguistic areas
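
To make the 90% and 99% confidence levels listed above concrete, here is a minimal sketch of an interval around the mean of per-word measurements. It assumes a normal approximation and uses made-up values; the summary does not describe the statistical procedure the study actually followed, and the name `confidence_interval` is introduced only for this example.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def confidence_interval(samples, level):
    """Normal-approximation interval for the mean of samples."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    centre = mean(samples)
    standard_error = stdev(samples) / sqrt(len(samples))
    # Two-sided critical value, roughly 1.645 at 90% and 2.576 at 99%.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return centre - z * standard_error, centre + z * standard_error

# Hypothetical per-word percentages relative to English, not study data.
per_word = [2.1, 3.4, 2.8, 1.9, 3.0, 2.6, 2.9, 2.3]

for level in (0.90, 0.99):
    low, high = confidence_interval(per_word, level)
    print(f"{level:.0%} interval: {low:.2f} to {high:.2f}")
```

The 99% interval is always the wider of the two, which is the price of the higher confidence level.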

Synthesis of the results of the study (percentages compared with English):

             WWW      Usenet
Spanish      2.53%    1.93%
French       2.81%    1.15%
Italian      1.50%    2.03%
Portuguese   0.82%    0.90%
Romanian     0.15%    0.11%


C2: the second study on culture, 3 years later

No significant difference:

  • the same methodology has been kept, with improvements to the categories and to the selection and number of people; the study has also been broadened to all the Latin languages.
  • some results have been analyzed by language and with regard to the language of reference.

2000 : New version of the study (Latin Union/Funredes)

L5: Fifth study on language

Several improvements have been made to the methodology:

  • Extending the study to include German.
  • Selecting Google as the search engine after an in-depth study. Measurement on the WWW only.
  • Calculations were automated through the use of software that interfaced between the search terms, which were held in a database, and the search engines.
  • Correction of certain spelling errors in some of the terms and the discarding of others.
  • Beginning to use systematic measuring techniques, and use of extrapolation graphs.
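
The automation described in the list above (search terms held in a database, software passing them to a search engine) might look roughly like the following sketch. Everything in it is an assumption for illustration: the table layout, the `measure_terms` helper, and `fake_count_hits`, which stands in for a real search engine query and returns made-up numbers.

```python
import sqlite3

def measure_terms(connection, count_hits):
    """Store count_hits(term) for every term held in the database."""
    rows = connection.execute("SELECT id, term FROM terms").fetchall()
    for term_id, term in rows:
        connection.execute(
            "INSERT INTO measurements (term_id, hits) VALUES (?, ?)",
            (term_id, count_hits(term)),
        )
    connection.commit()

def fake_count_hits(term):
    """Stand-in for a search engine query; returns a made-up count."""
    return len(term) * 1000

connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE terms (id INTEGER PRIMARY KEY, term TEXT, language TEXT)"
)
connection.execute("CREATE TABLE measurements (term_id INTEGER, hits INTEGER)")
connection.executemany(
    "INSERT INTO terms (term, language) VALUES (?, ?)",
    [("lundi", "fr"), ("lunes", "es"), ("monday", "en")],
)
measure_terms(connection, fake_count_hits)

query = (
    "SELECT t.language, m.hits FROM terms t "
    "JOIN measurements m ON m.term_id = t.id ORDER BY t.id"
)
for language, hits in connection.execute(query):
    print(language, hits)
```

Passing the query function in as an argument keeps the database loop independent of any particular search engine, so the stand-in can be swapped for a real client without touching the rest.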

Synthesis of the results of the study (percentages relative to English):

             WWW
Spanish      10.95%
French       8.86%
Italian      5.88%
Portuguese   5.40%
Romanian     0.32%
German       > 13.4% (estimate)



 

Copyright © 1996-1999 FUNREDES
Created: 24 VIII 1998
Last Modified: 02 VII 1999
