API

We publish a small subset of NamSor's functionality as an API. Want to play? Try the Android App or the RapidMiner Extension.

GendRE API: predicting gender from a name

GendRE is our Application Programming Interface (API) for determining the gender of a personal name on a scale from -1 (male) to +1 (female), for a given geography/locale. You can test it on Yahoo Pipes or directly from your web browser by entering a URL of this kind:

or if you prefer a REST JSON format,

The API automatically recognizes which culture to apply when assessing the gender of a personal name. Some examples: “Andrea Rossini” is most likely an Italian name and a male name, whereas “Andrea Parker” is most likely an Anglo-Saxon name and a female name; 声涛周 is most likely male; “O. Sokolova” is most likely female. Try those:

Still, we recommend passing additional geography/locale context (as a two-letter ISO country code) if you know it:

We have designed this API to evolve in coverage and accuracy: the onomastic analysis will gradually be extended to combine the firstName and the lastName (e.g. to handle Baltic and Slavic languages). As of today, the API should serve its purpose well for the US and European markets, as well as Russia/CIS, China, Israel and the Arab world. Please read the API Terms of Use (PDF).
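To illustrate the -1 (male) to +1 (female) scale, a client might turn a score into a label along the lines of the sketch below. The function name `label_from_score`, the 0.2 cut-off and the sample scores are assumptions made up for this example; they are not part of the GendRE API.

```python
def label_from_score(score: float, threshold: float = 0.2) -> str:
    """Map a score on the -1 (male) .. +1 (female) scale to a label.

    `threshold` is a hypothetical cut-off: scores closer to 0 than this
    are treated as too uncertain to call either way.
    """
    if not -1.0 <= score <= 1.0:
        raise ValueError("score must be between -1 and +1")
    if score <= -threshold:
        return "male"
    if score >= threshold:
        return "female"
    return "unknown"


# Made-up scores, only to show the mapping.
for s in (-1.0, -0.1, 0.85):
    print(s, label_from_score(s))
```

A wider threshold trades coverage for fewer wrong calls, so the right value depends on how costly a mistake is in your application.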

Gender  /ˈdʒɛndə/: from Old French gendre (modern genre), based on Latin genus ‘birth, family, nation’. The earliest meanings were ‘kind, sort, genus’ and ‘type or class of noun, etc.’  The state of being male or female (typically used with reference to social and cultural differences rather than biological ones). Source: Oxford Dictionaries

By using the free GendRE API,

  • you contribute data to our calibration set and help us improve our name parser API, which breaks down a name into several components for various languages and locales;
  • you take part in our continuous improvement, making our API accurate enough to support global Open Data initiatives and promote Gender Equality worldwide.

GendRE FREE/Freemium is available with full precision and documentation on Mashape:

GendRE API Samples

GendRE API ChangeLog

  • 2014-09-28 V0.0.16(c) new batch endpoint for users registered on Mashape, better precision for Kenyan names
  • 2014-09-26 V0.0.16 adds a new batch endpoint for users registered on Mashape (up to 1000 names at a time, much faster). NB: due to a deployment problem, this version was running on just one leg; the problem was solved on 2014-09-28.
  • 2014-07-06 V0.0.15 combines both approaches: dictionary (demographics) and sociolinguistics
  • 2014-05-25 V0.0.14 supports Turkish names, API management for RapidMiner Onomastics Extension
  • 2014-05-18 V0.0.13 supports Jewish, Christian and Muslim names in Hebrew.
  • 2014-05-05 V0.0.12 contains technical enhancements.
  • 2014-04-27 V0.0.11 database contains 784,486 names, with improvements for France, Mexico, Brazil, Latvia, India.
  • 2014-04-16 V0.0.10 mostly technical enhancements to support web automation (Zapier.com, Yahoo! Pipes etc.)
  • 2014-04-10 V0.0.9 database contains 771,769 names, supports Greek names (in Greek).
  • 2014-04-06 V0.0.8 database contains 770,759 names, produces 16% fewer errors.
  • 2014-04-02 V0.0.7 database contains 553,767 names, supports Arab names (in Arabic).
  • 2014-03-31 V0.0.6 database contains 549,827 names, supports Slavic/Russian names (in Cyrillic).
  • 2014-03-27 V0.0.5 database contains 540,840 names, supports Chinese names (simplified, traditional).
  • 2014-03-21 V0.0.4 database contains 540,840 names.
  • 2014-03-14 V0.0.3 database contains 400,412 names.
  • 2014-03-12 V0.0.1 database contains 384,816 names.
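The batch endpoint introduced in V0.0.16 takes up to 1000 names at a time, so a client holding a longer list has to split it first. The sketch below shows one way to do that; the helper name `batches` and the placeholder names are assumptions for illustration, and the actual call to the endpoint is left out.

```python
from typing import Iterator, List, Sequence

MAX_BATCH_SIZE = 1000  # limit quoted in the V0.0.16 changelog entry


def batches(names: Sequence[str], size: int = MAX_BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive slices of `names`, each holding at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(names), size):
        yield list(names[start:start + size])


# Placeholder data: 2500 names split into batches of 1000, 1000 and 500.
names = [f"name-{i}" for i in range(2500)]
sizes = [len(batch) for batch in batches(names)]
print(sizes)  # [1000, 1000, 500]
```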

GendRE App

Gendre is an open source mobile application, provided as a sample of how to use the Gendre API: it enriches Android contact titles (Mr., Ms., M.) with the gender inferred from a contact's name.

Gendre Backtesting

We’re continuously improving Gendre and testing it against an independent database of personal names.

The new version combines two algorithms to maximize accuracy:

  • A unique global name sociolinguistics algorithm, which (1) recognizes the origin of the first_name and last_name pair and (2) infers whether the name sounds male or female in that particular culture;
  • As previously, a query against a massive database (800,000 names), which contains statistical information about baby names in each country of the world.

Previous benchmark (using the dictionary/demographics approach only):

Country        | Sample size | V0.0.11 % correct | V0.0.8 % correct | Improvement (points)
---------------|-------------|-------------------|------------------|---------------------
United States  | 8906        | 96.08%            | 95.97%           | 0.11%
Great Britain  | 5928        | 96.47%            | 96.22%           | 0.25%
France         | 5293        | 98.02%            | 93.47%           | 4.55%
Canada         | 4411        | 96.49%            | 96.24%           | 0.25%
Germany        | 4311        | 98.24%            | 98.08%           | 0.16%
Sweden         | 3556        | 98.28%            | 98.03%           | 0.25%
Australia      | 3483        | 96.27%            | 96.10%           | 0.17%
Poland         | 2809        | 98.26%            | 97.76%           | 0.50%
Switzerland    | 2674        | 97.16%            | 96.86%           | 0.29%
Hungary        | 2575        | 99.26%            | 98.84%           | 0.43%
Austria        | 2200        | 98.95%            | 98.64%           | 0.32%
Russia         | 2084        | 97.50%            | 96.88%           | 0.62%
Spain          | 1935        | 98.86%            | 99.85%           | -0.98%
Norway         | 1861        | 97.69%            | 96.40%           | 1.29%
Romania        | 1664        | 96.09%            | 95.67%           | 0.42%
Greece         | 1636        | 95.48%            | 94.71%           | 0.76%
Belgium        | 1626        | 98.83%            | 97.87%           | 0.96%
West Germany   | 1571        | 99.05%            | 98.92%           | 0.13%
Denmark        | 1543        | 96.82%            | 96.63%           | 0.19%
Argentina      | 1496        | 99.13%            | 99.13%           | 0.00%
Mexico         | 1294        | 99.23%            | 68.45%           | 30.78%
New Zealand    | 1202        | 95.76%            | 98.89%           | -3.13%
Brazil         | 989         | 95.55%            | 87.20%           | 8.35%
Ukraine        | 978         | 97.75%            | 97.14%           | 0.61%
Ireland        | 728         | 95.19%            | 94.64%           | 0.55%
Belarus        | 678         | 98.67%            | 98.38%           | 0.29%
Czech Republic | 655         | 100.00%           | 100.00%          | 0.00%
Egypt          | 594         | 96.97%            | 95.46%           | 1.51%
Kazakhstan     | 593         | 95.78%            | 94.27%           | 1.52%
Portugal       | 537         | 99.26%            | 96.38%           | 2.87%
Puerto Rico    | 488         | 97.34%            | 97.34%           | 0.00%
Colombia       | 469         | 97.01%            | 95.31%           | 1.71%
Chile          | 466         | 97.85%            | 97.85%           | 0.00%
Latvia         | 405         | 95.06%            | 79.75%           | 15.31%
Slovakia       | 394         | 99.75%            | 99.75%           | 0.00%
Luxembourg     | 373         | 95.44%            | 95.44%           | 0.00%
Morocco        | 351         | 99.15%            | 98.58%           | 0.57%
Slovenia       | 345         | 96.81%            | 96.52%           | 0.29%
Uruguay        | 335         | 98.21%            | 98.21%           | 0.00%
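The Improvement column is the difference, in percentage points, between the newer and the older accuracy. The sketch below reproduces it for three rows copied from the benchmark and also computes a sample-size-weighted accuracy across them. The helper names `improvement` and `weighted_accuracy` are assumptions for this example, and the weighted figure is our own illustration, not a number published by NamSor.

```python
# (country, sample size, newer % correct, older % correct), copied from the benchmark
rows = [
    ("United States", 8906, 96.08, 95.97),
    ("France", 5293, 98.02, 93.47),
    ("Mexico", 1294, 99.23, 68.45),
]


def improvement(new_pct: float, old_pct: float) -> float:
    """Difference between two accuracies, in percentage points."""
    return round(new_pct - old_pct, 2)


def weighted_accuracy(table, column: int) -> float:
    """Average of an accuracy column, weighted by each row's sample size."""
    total = sum(row[1] for row in table)
    return sum(row[1] * row[column] for row in table) / total


for country, size, new_pct, old_pct in rows:
    print(country, improvement(new_pct, old_pct))

print(round(weighted_accuracy(rows, 2), 2), round(weighted_accuracy(rows, 3), 2))
```

For a handful of rows in the benchmark the published Improvement differs from this subtraction by 0.01, presumably because it was computed from unrounded figures.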

By comparison, for Chinese and Korean names written in the Latin alphabet, the results suggest you might as well toss a coin:

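Country     | Sample size | V0.0.11 % correct | V0.0.8 % correct
------------|-------------|-------------------|-----------------
China       | 2429        | 49.77%            | 49.71%
South Korea | 2136        | 52.62%            | 49.16%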

Gendre API accuracy when predicting the gender of a personal name, by country, with the number of independent personal names in each sample:

  • United States: 96% accurate (sample size 8906)
  • Great Britain: 96% accurate (sample size 5928)
  • France: 98% accurate (sample size 5293)
  • Canada: 96% accurate (sample size 4411)
  • Germany: 98% accurate (sample size 4311)
  • Sweden: 98% accurate (sample size 3556)
  • Australia: 96% accurate (sample size 3483)
  • Poland: 98% accurate (sample size 2809)
  • Soviet Union: 95% accurate (sample size 2694)
  • Switzerland: 97% accurate (sample size 2674)
  • Hungary: 99% accurate (sample size 2575)
  • Austria: 98% accurate (sample size 2200)
  • Russia: 97% accurate (sample size 2084)
  • Czechoslovakia: 99% accurate (sample size 1965)
  • Spain: 98% accurate (sample size 1935)
  • Norway: 97% accurate (sample size 1861)
  • Romania: 96% accurate (sample size 1664)
  • Greece: 95% accurate (sample size 1636)
  • Belgium: 98% accurate (sample size 1626)
  • West Germany: 99% accurate (sample size 1571)
  • Denmark: 96% accurate (sample size 1543)
  • Argentina: 99% accurate (sample size 1496)
  • Mexico: 99% accurate (sample size 1294)
  • East Germany: 98% accurate (sample size 1259)
  • New Zealand: 95% accurate (sample size 1202)
  • Brazil: 95% accurate (sample size 989)
  • Ukraine: 97% accurate (sample size 978)
  • Ireland: 95% accurate (sample size 728)
  • Belarus: 98% accurate (sample size 678)
  • Czech Republic: 100% accurate (sample size 655)
  • Egypt: 96% accurate (sample size 594)
  • Kazakhstan: 95% accurate (sample size 593)
  • Portugal: 99% accurate (sample size 537)
  • Puerto Rico: 97% accurate (sample size 488)
  • Colombia: 97% accurate (sample size 469)
  • Chile: 97% accurate (sample size 466)
  • Latvia: 95% accurate (sample size 405)
  • Slovakia: 99% accurate (sample size 394)
  • Luxembourg: 95% accurate (sample size 373)
  • Morocco: 99% accurate (sample size 351)
  • Slovenia: 96% accurate (sample size 345)
  • Uruguay: 98% accurate (sample size 335)
