Every now and then we hear stories about the making of China as a scientific superpower. How it overtook France in the global University Rankings:
‘In the 2014 edition of the world’s top 500 research universities, China keeps progressing. China has 9 universities among the top 200 (77 in the USA, 20 in Great Britain, 14 in Germany, 8 in France). The trend is impressive: in 2004, China had only one university in the top 200.’ Source: LesEchos.fr
How it keeps increasing its volume of scientific publications produced:
‘The increased share of scientific publications produced by China (+231% between 2002 and 2012) is another indicator of Chinese scientific growth. According to Ghislaine Filliatreau [of the French OST], it’s not just an increase in volume but also in quality.’ Source: LesEchos.fr
Is China really becoming a scientific superpower? It may be so. We should, however, be careful about how we interpret bibliometric information (the volume and quality of scientific publications). There could be large cultural biases, currently unaccounted for, that affect international bibliometric rankings.
For example, let’s look at the Scimago Journal and Country Ranking, an index based on the Scopus® database (Elsevier B.V.). China already ranks second in the world by number of scientific publications produced between 1996 and 2003. But if we rank countries by the number of citations excluding self-citations, China comes after Spain. The reason is that China’s ratio of citations per citable document (excluding self-citations) is lower than average.
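To make the ratio concrete, here is a minimal Python sketch of the arithmetic. The function name and all the figures are hypothetical and chosen only for illustration; they are not Scimago data. It shows how a country can lead on publication volume and still trail on citations once self-citations are excluded.

```python
def citations_per_document(total_citations, self_citations, citable_documents):
    """Citations per citable document, excluding self-citations."""
    if citable_documents <= 0:
        raise ValueError("citable_documents must be positive")
    return (total_citations - self_citations) / citable_documents


# Hypothetical figures for two imaginary countries (NOT real Scimago data).
countries = {
    "Country A": {"docs": 1_000_000, "cites": 5_000_000, "self_cites": 3_000_000},
    "Country B": {"docs": 300_000, "cites": 3_000_000, "self_cites": 600_000},
}

for name, c in countries.items():
    external = c["cites"] - c["self_cites"]
    ratio = citations_per_document(c["cites"], c["self_cites"], c["docs"])
    print(f"{name}: {c['docs']} documents, {external} external citations, "
          f"{ratio:.1f} external citations per document")
```

With these made-up numbers, Country A publishes more than three times as many documents as Country B, yet receives fewer external citations in total and far fewer per document.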
Next week, at ICOS2014 (the 25th International Congress of Onomastic Sciences, the premier conference in the field of name studies), we will explore some of the cultural biases at play in the life sciences. A presentation of PubMed (MedLine/PMC) data mining using NamSor software, conducted with onomastician Eugène Schochenmayer, will take place at Glasgow University on the 28th of August.
Onomastics to measure cultural bias in medical research (ABSTRACT)
This project involves the analysis of about one million medical research articles from PubMed. We propose to evaluate the correlation between the onomastic class of an article’s authors and that of the authors who cite it. We will demonstrate that cultural bias exists and that it evolves over time. Between 2007 and 2008, the share of articles authored by Chinese scientists (or scientists with Chinese names) nearly tripled. We will evaluate how quickly this surge in Chinese research material (or research material produced by scientists of Chinese origin) came to be cited by other authors with Chinese or non-Chinese names. We hope to find that onomastics provides a good enough estimate of the cultural bias of a research community. The findings can improve the efficiency of a particular research community, for the benefit of science and all of humanity.
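As a rough illustration of what comparing the onomastic class of cited and citing authors could look like, here is a minimal Python sketch. It is not the method used in the study: the function names, the toy citation pairs and the choice of indicator (same-class share divided by a baseline share) are all assumptions made for this example. In practice the class labels would come from a name classifier, whereas here they are hard-coded.

```python
from collections import Counter


def citation_matrix(pairs):
    """Count citations per (cited_class, citing_class) pair."""
    return Counter(pairs)


def same_class_share(matrix, cls):
    """Share of citations received by `cls` authors that come from `cls` authors."""
    received = sum(n for (cited, _), n in matrix.items() if cited == cls)
    return matrix[(cls, cls)] / received if received else None


def baseline_share(matrix, cls):
    """Share of ALL citations in the data set that are made by `cls` authors."""
    total = sum(matrix.values())
    made = sum(n for (_, citing), n in matrix.items() if citing == cls)
    return made / total if total else None


# Toy data: (class of cited author, class of citing author). Purely hypothetical.
pairs = [
    ("Chinese", "Chinese"), ("Chinese", "Chinese"), ("Chinese", "Chinese"),
    ("Chinese", "Italian"),
    ("Italian", "Italian"), ("Italian", "Italian"),
    ("Italian", "Chinese"), ("Italian", "French"),
]

matrix = citation_matrix(pairs)
for cls in ("Chinese", "Italian"):
    same = same_class_share(matrix, cls)
    base = baseline_share(matrix, cls)
    # A ratio above 1 means the class cites itself more than its overall
    # citing activity alone would suggest.
    print(f"{cls}: same-class share {same:.2f}, baseline {base:.2f}, "
          f"ratio {same / base:.2f}")
```

Computing the same indicator separately for each publication year would be one simple way to see whether the bias changes over time.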
Some of the tools we’ve used to produce this research:
- MonetDB, the open-source column-store pioneer; because some queries are multiplicative (e.g. counting articles authored by a scientist with a Chinese name and cited by a scientist with, say, an Italian name), the volume was huge and a classic database was not up to the task
- RapidMiner, a leading open-source data mining and predictive analytics software
- Our own RapidMiner Onomastics Extension, to predict the gender and likely origin of personal names
About Evgeny Shokhenmayer
NamSor™ Applied Onomastics is a European vendor of name recognition software (NamSor sorts names). NamSor’s mission is to help understand international flows of money, ideas and people. NamSor launched @FDIMagnet, a consulting offering that helps Investment Promotion Agencies and high-tech clusters leverage a diaspora to connect with business and scientific communities abroad.