Two weeks ago, we generated several portraits with DALL-E of a hypothetical Fatimata SAWADOGO, a Burkinabè name shared by hundreds of people in Burkina Faso, mostly in the Centre-Nord and Nord regions of the country. Today, we present some new portraits generated from personal names with the tag #thisnamedpersondoesnotexist – and we feature a “classic fail” of AI software.
Mariam KABORE and Hamado KABORE are a female and a male name, respectively, both popular in the Plateau-Central region of Burkina Faso. The main ethnic groups of the Plateau-Central region are the Mossi people [1, 2], the Fula people [1, 2] and the Bissa people.
Here are some other common names in the Plateau-Central region:
| Common female names | Common male names |
|---------------------|-------------------|
| MARIAM KABORE | ADAMA KABORE |
| SALAMATA KABORE | HAMADO KABORE |
| AWA OUEDRAOGO | ISSAKA OUEDRAOGO |
| MAMOUNATA OUEDRAOGO | RASMANE OUEDRAOGO |
This named person does not exist
At NamSor, we’ve worked for over 10 years with various machine learning algorithms, trying to figure out the best ways to build fair, transparent and explainable AI. It’s not an easy task… In fact, I truly believe this is one of the tasks that will require huge efforts from humanity, like sending the James Webb telescope deep into space to explore further into time and space.
Personal names reflect gender, race, ethnicity, cultural origin, and the history of international migrations… but they are not perfect scientific instruments, like telescopes. They are imperfect but useful, because the accuracy of name classification is sufficient to make scientific discoveries, to “see” important sociological and anthropological aspects of society, in a specific country or in our globalized world. They can be used to explore discrimination in complex decision-making processes (recruitment, credit allocation, public employment, housing…). For example, according to NamSor name classification, the name Mariam KABORE is most likely from Burkina Faso – with Mali as the second-best alternative (both countries are in West Africa and share a border).
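The idea of ranking likely countries of origin for a name can be sketched with a toy frequency model. To be clear, this is a minimal illustrative sketch with invented counts, not NamSor’s actual algorithm, which relies on machine learning over large name corpora:

```python
# Toy name-origin classifier: rank countries by how often a surname
# appears in a (hypothetical, invented) per-country frequency table.
from collections import Counter

# Hypothetical corpus: surname frequency per country (made-up numbers).
SURNAME_COUNTS = {
    "KABORE": Counter({"Burkina Faso": 950, "Mali": 40, "Ivory Coast": 10}),
    "OUEDRAOGO": Counter({"Burkina Faso": 900, "Mali": 60, "Ivory Coast": 40}),
}

def classify_origin(surname: str):
    """Return (best_country, second_best_country, P(best)) for a surname,
    or None if the surname is not in the table."""
    counts = SURNAME_COUNTS.get(surname.upper())
    if counts is None:
        return None
    total = sum(counts.values())
    (best, best_n), (second, _) = counts.most_common(2)
    return best, second, best_n / total

print(classify_origin("Kabore"))  # -> ('Burkina Faso', 'Mali', 0.95)
```

A real classifier would also use first names, sub-name features (character n-grams, morphemes) and far larger data, but the output shape – a ranked list of countries with probabilities – is the same.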
When we prompt the DALL-E text-to-image AI with the text “A portrait of Mariam KABORE”, we do expect gender, racial and ethno-cultural bias from the AI. We expect the portrait to represent a female human, Black African, possibly dressed according to our own “stereotype” of what Mariam KABORE would look like. In fact, the result is quite impressive.
The AI has generated some beautiful portraits of women. What about trying DALL-E with a male name?
This Hamado KABORE does not exist
Confusing a human with a gorilla: a “classic fail” for AIs. We’re not aiming for controversy: I do believe most companies doing AI and machine learning are doing their best to put safety measures in place and to reduce biases. Should AI teams be more diverse? Yes. But the lack of gender and racial diversity among data scientists is a complex issue to address. For example, women account for 5% to 10% of data scientists (depending on the country), and this imbalance is only partially caused by discrimination. Another factor is cultural and gender bias towards mathematics at a very early age, affecting women’s access to higher education in mathematics and science. Then there is also the issue of biases in the training / validation / test data. For example, it is no surprise that less open data is available in Burkina Faso (which has one of the lowest Human Development Index scores, ranked 182 out of 191 countries) to train machine learning models, compared to – for example – the United States. The developers of DALL-E have tried various solutions to correct potential gender and racial biases in their AI, like randomly appending “female” or “black” to the text input. But in this case, it wasn’t enough.
Several other companies have had issues with their AI misclassifying Black or African people: Facebook had to apologize after its AI put a ‘Primates’ label on a video of Black men, and Google’s photo-tagging software labeled two African-Americans as gorillas. So it should be no surprise that prompting a text-to-image generator with an African name could trigger the same bias and generate the portrait of a gorilla instead. Luckily, DALL-E allowed us to report the image. I’m sure they will further improve their software, either by improving their training data or by adding additional safety checks. As part of those safety checks, they probably already re-process the generated images through image-classification software – but still, the image went through:
DALL-E also allows end users to report text-to-image generation errors, as part of their continuous improvement process:
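The safety-check pattern suggested above – re-classifying each generated image and blocking it if a forbidden label scores too high – can be sketched as follows. The `classify_image` function here is a stand-in stub (we simply pass in label scores as a dict); a real pipeline would call an actual image-classification model at that step:

```python
# Sketch of a post-generation safety filter: each generated image is
# re-classified, and blocked if any forbidden label exceeds a threshold.

# Labels that must never describe a generated human portrait (example list).
BLOCKED_LABELS = {"gorilla", "primate"}
THRESHOLD = 0.5

def classify_image(image) -> dict:
    # Stub: a real system would run an image-classification model here
    # and return {label: confidence}. For this demo, the "image" already
    # is such a dict of label scores.
    return image

def passes_safety_check(image) -> bool:
    """Return True if no blocked label scores at or above THRESHOLD."""
    scores = classify_image(image)
    return all(scores.get(label, 0.0) < THRESHOLD for label in BLOCKED_LABELS)

# A portrait the filter should allow, and one it should block:
ok_image = {"person": 0.97, "portrait": 0.88}
bad_image = {"gorilla": 0.81, "person": 0.12}

print(passes_safety_check(ok_image))   # True
print(passes_safety_check(bad_image))  # False
```

The failure described in this post suggests that either no such check ran on this image, or the classifier itself shares the bias – which is why reporting tools for end users remain an important backstop.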
Still, DALL-E generated two valid images from the prompt “A portrait of Hamado KABORE”:
Please stay tuned, as we’re going to announce a new project by the end of the year, offering new solutions to address gender and racial biases in artificial intelligence and machine learning. Also, if you are a Spanish speaker, check out our new blog about gender and ethnic diversity in Spanish:
- El uso de la autocita como medidor de disparidad en la investigación interdisciplinar entre hombres y mujeres (The use of self-citation as a measure of disparity between men and women in interdisciplinary research)
- ¿Qué tan “neutrales” pueden ser las IA al disminuir el sesgo racial y de género? (How “neutral” can AIs be in reducing racial and gender bias?)
- ¿Las mujeres matemáticas son menos citadas que los hombres? (Are women mathematicians cited less than men?)
NamSor™ Applied Onomastics is a European vendor of sociolinguistics software (NamSor sorts names). NamSor’s mission is to help understand the international flows of money, ideas and people. We proudly support Gender Gap Grader.