Tools, methodology, data sources and data output used to produce the article GenderGapGrader: AngelList.
We’ve opened the free GendRE API, which extracts gender from names. To make it usable by everyone, we’ve built an extension for RapidMiner, a leading open source data mining and predictive analytics tool.
So you can run your own gender gap analysis, where and when it matters to you!
Data Sources:
- The AngelList API (http://angel.co/api)
- Sample API client TheAngelListAPI_SampleClient.zip
- Twitter (http://dev.twitter.com/)
Data Mining Tools:
- RapidMiner v6
- RapidMiner Onomastics Extension (Extract Gender Operator) v0.0.4
- Get it from RapidMiner Market Place, OR
- Get it from GitHub
- Documentation and video Tutorial
- GendRE API v0.0.15/v0.0.16
- Check out this explanation on using names (and onomastics) to estimate the Gender Gap
- Get your API key on Mashape.com for double precision and a batch API with higher throughput (50x or more)
- Plus, special thanks to: MonetDB, PostgreSQL, Apache Tomcat, OpenJDK, Python, Ubuntu… and others.
Data Output:
- AngelList_Genderized.zip (TXT DELIMITED, UTF-8, ZIP)
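If you want to check the gender split yourself, the delimited output can be read straight from the ZIP. The Python sketch below is only an illustration: the tab delimiter, the `gender` column name and the `gender_shares` helper are assumptions, not the documented layout of AngelList_Genderized.zip, so adjust them to match the file you actually download.

```python
import csv
import io
import zipfile
from collections import Counter


def gender_shares(zip_path, member=None, delimiter="\t", gender_column="gender"):
    """Return {gender_label: share} for one delimited UTF-8 file inside a ZIP.

    The delimiter and column name are assumptions; change them to fit your file.
    """
    with zipfile.ZipFile(zip_path) as archive:
        # Use the first file in the archive when no member name is given.
        name = member or archive.namelist()[0]
        with archive.open(name) as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            reader = csv.DictReader(text, delimiter=delimiter)
            counts = Counter(row[gender_column] for row in reader)
    total = sum(counts.values())
    # Convert raw counts to shares; an empty file yields an empty dict.
    return {label: n / total for label, n in counts.items()} if total else {}
```

Calling `gender_shares("AngelList_Genderized.zip")` would then return the share of each gender label found in the file, provided the assumed delimiter and column name match.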
Estimates:
- 201409_AngelList_Preanalysis_v003_Preview_vF.zip (EXCEL, ZIP)