The National Center for Science and Engineering Statistics (NCSES), part of the US National Science Foundation (NSF), funded a study to develop bibliometric indicators and measure women’s contribution to scientific publications.
2.3 Gender name inference
Many solutions are available on the market to determine gender based on an author’s first name and other available information (e.g., last name, ethnicity, location); this study used a solution developed by NamSor™. NamSor is a European designer of name recognition software committed to promoting diversity and equal opportunity. NamSor was selected for this study because it offers a very high degree of accuracy and recall, as well as global coverage.
NamSor claims to cover all languages, alphabets, countries and regions. In addition to using the data mining approaches that are behind most of the solutions available (e.g., using national lists of baby names), NamSor works with linguists, anthropologists and historians to increase their products’ accuracy in various cultural contexts. They also develop solutions to infer the origin or ethnicity of individuals based on their names, and these developments reinforce the quality of gender estimation. Because a surname may change depending on gender in some cultures, the API automatically recognizes if gender can be inferred from the first name (e.g., Carl) or the last name (e.g., Sololova). Finally, the API is quite tolerant of typographic errors and multiple names, a feature that is very handy given the significant number of input errors in the publication databases.
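To make the first-name versus last-name idea concrete, here is a minimal toy sketch in Python. It is not NamSor's API or method: the lookup table, the suffix list, and the function name `infer_gender` are all hypothetical, chosen only to illustrate how a gender signal might come from either part of a name.

```python
from typing import Optional

# Hypothetical toy data; a real system would rely on far larger, curated name lists.
FIRST_NAME_GENDER = {"carl": "male", "maria": "female"}

# Assumption for illustration: some surnames take a feminine ending in certain cultures.
FEMININE_SURNAME_SUFFIXES = ("ova", "eva", "skaya")


def infer_gender(first_name: str, last_name: str) -> Optional[str]:
    """Return 'male', 'female', or None when neither name is informative."""
    # Try the first name first, since it is usually the stronger signal.
    gender = FIRST_NAME_GENDER.get(first_name.strip().lower())
    if gender is not None:
        return gender
    # Fall back to a gendered surname ending when the first name is unknown.
    if last_name.strip().lower().endswith(FEMININE_SURNAME_SUFFIXES):
        return "female"
    return None
```

With this toy data, `infer_gender("Carl", "Smith")` resolves from the first name, while an initial such as "A." combined with a surname ending in "-ova" resolves from the last name. A production system would also need to handle typographic errors and multiple names, which this sketch does not attempt.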
Read the full report on the Science-Metrix website [PDF]
NamSor™ Applied Onomastics is a European vendor of sociolinguistics software (NamSor sorts names). NamSor’s mission is to help understand international flows of money, ideas and people. We proudly support Gender Gap Grader.