How might we explain how stereotypes are unintentionally amplified by AI systems to a non-technical audience?
Runaway Models is an interactive narrative that explains how stereotypes are unintentionally amplified by AI systems that are powered by neural language models, and how these biases can be mitigated. Designed for a broader audience with limited knowledge of machine learning, Runaway Models can be used by educators to explain the concepts and applications of deep learning with natural language processing, fairness and ethics in AI, information visualisation and human-data interaction.
Runaway Models was designed and developed for a master's dissertation on the Design Informatics programme at the University of Edinburgh. This project earned a distinction.
PROJECT DURATION
6 Months
MY ROLE
Project Lead
Researcher
Data Scientist
Designer
Developer
ADVISORS
Dr. Benjamin Bach
Mentallah El-Assady
Tashfeen Ahmed
TOOLS & METHODS
Deep NLP with Python
Prototype design & testing with Figma
RITE Method for Usability Testing
Web Development in JS, HTML, CSS
THE PROBLEM
Language models such as Google's BERT and OpenAI's GPT-3 have been shown to achieve state-of-the-art performance in powering systems for search engines, machine translation and virtual assistants like Siri and Alexa.
The power of language models lies in the fact that they are pre-trained on large, openly available text corpora from the internet such as Wikipedia and Google News, during which they learn underlying patterns of meaning in natural languages exactly as they are used. As a result, they inherit any inconsistencies and biases captured in this data. The AI systems that are powered by these language models risk perpetuating those biases and can therefore unintentionally amplify harmful stereotypes based on race, gender or religion that are already prevalent in our society.
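The amplification step can be illustrated with a deliberately tiny sketch. It is not how BERT or GPT-3 work, and it is not the code used in this project: the corpus is invented, and simple co-occurrence counting stands in for a neural language model. The function names are hypothetical. The point is only that a system which always picks the most frequent association turns a partial skew in the data into an absolute one in its output.

```python
from collections import Counter

# Invented toy corpus, for illustration only (not real training data).
CORPUS = [
    "she is a nurse",
    "she works as a nurse",
    "he is a nurse",
    "he is an engineer",
    "he works as an engineer",
    "he became an engineer",
    "she is an engineer",
]

PRONOUNS = ("he", "she")


def pronoun_counts(corpus, occupation):
    """Count how often each pronoun appears in a sentence with the occupation."""
    counts = Counter()
    for sentence in corpus:
        tokens = sentence.split()
        if occupation in tokens:
            for pronoun in PRONOUNS:
                if pronoun in tokens:
                    counts[pronoun] += 1
    return counts


def data_skew(corpus, occupation):
    """Share of each pronoun in the data (assumes the occupation occurs at least once)."""
    counts = pronoun_counts(corpus, occupation)
    total = sum(counts.values())
    return {pronoun: counts[pronoun] / total for pronoun in PRONOUNS}


def most_likely_pronoun(corpus, occupation):
    """A 'system' that always outputs the single most frequent pronoun."""
    return pronoun_counts(corpus, occupation).most_common(1)[0][0]


if __name__ == "__main__":
    for job in ("nurse", "engineer"):
        print(job, data_skew(CORPUS, job), "->", most_likely_pronoun(CORPUS, job))
```

In this toy data, two of the three "nurse" sentences use "she", yet the most-likely-pronoun rule would output "she" every time: the skew in the data becomes a certainty in the system's behaviour.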
Given the widespread applications of these AI systems, it is imperative to promote transparency and provide context to the datasets these models are trained on, how they transform data to power AI systems, as well as the stages in the pipeline where potential stereotypes may be learned, propagated and even amplified by these systems.
I set out to design an interactive narrative that explains simply, to a non-technical audience, how stereotypes are unintentionally amplified by AI systems powered by language models.
RESEARCH OBJECTIVES
I explored the following research objectives in this project:
(1) Use strategies of designing interactive narratives to simply communicate how biases are propagated by AI systems powered by language models, and how they can be mitigated.
(2) Evaluate the efficacy of interactive narratives in simply communicating how biases are propagated by AI systems powered by language models, and how they can be mitigated.
DESIGN PROCESS
Discover:
In the requirements gathering phase, I:
Learned about how language models work, and how to fine-tune them for different NLP tasks like machine translation, text generation or question-answering using Python.
Learned about how biases are learned and propagated in language models.
Carried out a comparative analysis of existing interactive narratives to decide which interactive components (e.g. simulations or ‘scrollytelling’) and explanation strategies (e.g. explanation by example or use of metaphors) to use in the final design.
Ran a survey with N=7 participants to get a sense of their understanding of biases in machine learning. These participants were students aged 23-34 years from Design Informatics and various social sciences programmes at the University of Edinburgh.
Explore:
To distil all my research findings, I carried out the following activities:
Prototyping
I designed various prototypes of the Runaway Models interactive narrative using Figma, drawing inspiration from board and video games in order to gamify the experience of learning complex concepts in deep learning and bias in machine learning.
Prototype 1: The earliest version of Runaway Models took the form of an ‘interactive slideshow’.
Prototype 2: The next prototype of Runaway Models had more interactivity and used scrollytelling techniques.
USABILITY TESTING
I ran usability testing sessions (30-35 minutes each) with the same N=7 participants, using Microsoft’s RITE (Rapid Iterative Testing and Evaluation) methodology at various stages of the design process.
The RITE method was an appropriate prototyping and testing methodology to use in the design of an interactive narrative because:
It offered a 'quick and dirty' approach to identify usability issues with regard to the messaging and interactivity of each of the prototypes.
It also enabled me to rapidly implement solutions where possible, depending on the nature of the issue.
During usability testing sessions, participants were invited to engage in verbal 'think aloud' sessions while they completed a list of tasks that I defined for them. During the sessions, I observed their interactions and behaviours with the prototypes.
After each session, wherever an issue had been identified and a quick solution was available, I changed the prototype before the next usability testing session.
Problems were identified through a participant's direct statements or through their behaviour when interacting with the prototypes, e.g. confusion about what to do next, clicking in the wrong places or not clicking at all.
The turnaround time for the implementation of changes depended on the nature of the issue identified. This practice also allowed for the validation of the effectiveness of any changes implemented with the next participant.
THEMATIC ANALYSIS
I coded all the user feedback from usability testing sessions using Reflexive Thematic Analysis. This provided insights on what worked and did not work with the Runaway Models prototype.
What didn’t work
Too much text: Various participants stated that they were overwhelmed with 'too much text'.
Confusion on what buttons to click: Various participants stated that they were confused about which tokens were buttons; in some cases participants clicked in the wrong areas or on the wrong tokens (e.g. the text data bubble).
Confusion on what to do next: As mentioned above, various participants stated that they were unsure what behaviours were expected of them when interacting with the interactive narrative.
No way to navigate to different sections: Two participants mentioned that they did not know that they could go back to other flows or pages in the interactive narrative. This suggested the need to design a navigation affordance such as an omnipresent navigation menu or breadcrumbs.
Palette too broad: One participant mentioned that the colour scheme was 'too broad', stating that it distracted from the messaging of the interactive narrative as there seemed to be 'too much going on'.
What worked
During informal debriefs after usability testing sessions, some participants mentioned that:
Clear Messaging: the design prototype was 'colourful' and 'engaging', stating that the message that neural language models learn and perpetuate human biases and stereotypes and that these biases can be mitigated was 'absolutely clear'. This comment was supported by the fact that the majority of participants answered simple questions on the content of the interactive narrative correctly. This statement suggests that research objectives (1) and (2) have been achieved.
On Interactivity: the design prototype resembled an interactive 'map' such that it 'had an outline, leads you to the next stage, if you want to you can explore the whole topic.' This statement suggests that research objective (2) has been achieved.
FINAL DESIGN
I took into consideration all the user feedback from the usability sessions when building a functional prototype of the Runaway Models interactive narrative. I designed the following solutions:
Problem 1: Too much text
Pace messaging: I broke down explanations into 'bullet point lists' and revealed the content one message at a time, gradually presenting the full narrative so that users do not get overwhelmed with too much information. This narrative pattern helped with the flow and sequence of messaging.
On-demand details: I left it up to the user to explore external links for further information on case studies and complex technical details.
Explorable components: I added more explorable visual components to replace text, such as interactive models, simulations, animations and embedded videos, to keep the reader engaged with what they were learning.
Problem 2: Confusion on which buttons to click
Usable Buttons: I designed button colours and labels with stronger CTAs. For instance, I used orange for buttons that required users to take an action and changed button labels to 'Explore Here' instead of just 'Explore'.
Problem 3: Confusion on what to do next
Pop-up Tooltips with CTAs and Confirmations: I designed pop-up tooltip messages that allow the user to actually practise the interactions that a button token offers e.g. 'Scroll now to see' or 'Click this button to try', with a confirmation message box e.g. 'You did it!'.
Problem 4: No way to jump to different sections of the story
Omnipresent navigation bar: I designed an omnipresent navigation menu to allow the user to jump to different sections of the interactive narrative.
Problem 5: Palette too broad
Neutral palette: I used a more neutral and reduced colour scheme in the functional prototype of Runaway Models.
REFLECTION
Tradeoff between simplicity and statistical nuance: it is challenging to build a simple and compelling narrative structure. Simplified language with examples, metaphors and analogies to describe complex statistical phenomena tends to leave out a lot of technical nuances that are key to fully understanding how neural language models work.
Technical complexity: the design and development of interactive narratives is time- and labour-intensive, requiring the use of multiple tools, libraries and packages. How well the interactivity is implemented in an interactive narrative influences its ability to communicate its key messages to its audience.
Efficacy of Interactive Narratives: The efficacy of interactive narratives is based on the premise that they actually facilitate communication, learning and critical thinking. This is still an area under investigation, with some sources like the New York Times recommending that communication designers steer away from interactivity, as data suggests that very few readers actually interact with non-static content in interactive narratives. Research is ongoing into tools (such as Google Analytics) that capture and analyse reader activity, such as clicks or scroll events, to evaluate engagement with interactive narratives.
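As a minimal sketch of what such an engagement analysis could look like, the Python below summarises a log of click and scroll events. This was not part of the project: the event schema (`reader`, `type`, `depth`, `target`), the sample data and the function name are all hypothetical, and it does not reflect the export format of Google Analytics or any other tool.

```python
from collections import defaultdict

# Hypothetical event log; the field names and values are invented for illustration.
EVENTS = [
    {"reader": "r1", "type": "scroll", "depth": 0.4},
    {"reader": "r1", "type": "click", "target": "explore-button"},
    {"reader": "r1", "type": "scroll", "depth": 0.9},
    {"reader": "r2", "type": "scroll", "depth": 0.3},
    {"reader": "r3", "type": "scroll", "depth": 1.0},
    {"reader": "r3", "type": "click", "target": "nav-menu"},
]


def summarise_engagement(events):
    """Return reader count, share of readers who clicked, and mean deepest scroll."""
    max_depth = defaultdict(float)  # deepest scroll position seen per reader (0.0 to 1.0)
    clickers = set()                # readers who clicked at least once
    for event in events:
        reader = event["reader"]
        max_depth[reader] += 0.0    # register the reader even if they never scrolled
        if event["type"] == "scroll":
            max_depth[reader] = max(max_depth[reader], event["depth"])
        elif event["type"] == "click":
            clickers.add(reader)
    readers = len(max_depth)
    if readers == 0:
        return {"readers": 0, "click_rate": 0.0, "mean_max_scroll_depth": 0.0}
    return {
        "readers": readers,
        "click_rate": len(clickers) / readers,
        "mean_max_scroll_depth": sum(max_depth.values()) / readers,
    }


if __name__ == "__main__":
    print(summarise_engagement(EVENTS))
```

A low click rate alongside a high scroll depth would be consistent with the concern that readers read the narrative but rarely interact with it, though measures this coarse say nothing about whether learning actually took place.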