How to: Avoid Gender Data Bias at Each Stage of the Data Lifecycle
This series explores phases at which gender data bias may be introduced or exacerbated in the simplified data lifecycle. The series contains four parts: requirements gathering, data collection, data cleansing, and analysis and use. This is part one: requirements gathering.
Before any data-related actions are taken, a kind of accounting goes on in which the goals and objectives of data use, collection, or cleansing are defined. This planning phase, called requirements gathering, should lay out the questions that need to be asked and answered to best use a particular set of data, avoid gender data bias, and optimize the data-related solution.
Before any data-related initiative, guiding principles are identified to ensure there is a purpose to the collection, cleansing, and/or analysis of data. Two such principles are usefulness and pragmatism:
- Usefulness and relevance: data must be directly linked to the expected future analysis and outcomes
- Pragmatism: data can be collected easily within a reasonable timeframe, and existing data is used when possible
In addition to common principles, gender bias mitigation principles should be considered.
- Consistency: data must align with a standard definition of gender
- Flexibility: the data collection process is flexible, so studies can adapt definitions to fit their variables
- Balance: disaggregated data is balanced across data types
- Aspiration: the study serves as a model, so future studies aspire to gather data consistent with it
By considering the consistency of the gender definition, the flexibility of the data-related stage or initiative, the balance of the dataset, and the aspiration of the data-related stage or initiative, gender bias may be more easily identified and avoided. It is best practice to incorporate these principles into the earliest action stage possible, namely the collection stage.
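The consistency and balance principles above lend themselves to automated checks. The sketch below is a minimal, hypothetical illustration: the field name `gender`, the set of standard values, and the balance tolerance are all assumptions a real study would define in its own requirements.

```python
from collections import Counter

# Hypothetical standard definition of gender for a study (consistency principle).
STANDARD_GENDER_VALUES = {"female", "male", "nonbinary", "prefer_not_to_say"}

def check_gender_principles(records, field="gender", balance_tolerance=0.2):
    """Flag values that violate the consistency principle and report
    how balanced the disaggregated groups are (balance principle)."""
    values = [r.get(field) for r in records]
    # Consistency: every value must match the standard definition.
    inconsistent = [v for v in values if v not in STANDARD_GENDER_VALUES]
    # Balance: compare each group's share against an even split.
    counts = Counter(v for v in values if v in STANDARD_GENDER_VALUES)
    total = sum(counts.values())
    shares = {g: n / total for g, n in counts.items()} if total else {}
    even_share = 1 / len(counts) if counts else 0
    imbalanced = {g: s for g, s in shares.items()
                  if abs(s - even_share) > balance_tolerance}
    return {"inconsistent_values": inconsistent, "shares": shares,
            "imbalanced_groups": imbalanced}

records = [{"gender": "female"}, {"gender": "male"},
           {"gender": "F"}, {"gender": "male"}]
report = check_gender_principles(records)
```

Here the nonstandard value "F" would be flagged for harmonization before collection proceeds, which is exactly the kind of issue cheapest to fix at the requirements stage.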
Other requirements gathering considerations include thoughtful variable inclusion, study design, and primary outcome. While variable inclusion and study design take form during the second stage, collection, the primary outcome should be considered at all stages, particularly the cleansing and analysis stages.
- Variable Inclusion: Data includes a (typically binary) distinction between male and female. Without this distinction, women are effectively invisible in analysis. Including a gender variable allows women to be identified and results to be analyzed for gender-specific impacts.
- Study Design: Data is collected with gender-equitable considerations in the design, implementation, and analysis of the study. It is important to consider where and how gender bias can arise in study design and how it could affect results. Key questions to consider include: Whose (m/f) needs did they consider? Who (m/f) did they coordinate with? Who (m/f) did they gather feedback from? Who (m/f) was involved in the decision making? Whose (m/f) satisfaction was collected and analyzed?
- Primary outcome: Gender equality and/or empowerment is a primary outcome of the study. Study directly enables monitoring and targeting of gender equality-focused outcomes.
For data to be considered gender disaggregated, it must at minimum meet the level of variable inclusion. Future studies should also consider study design to ensure there is no gender bias in data collection. It is not necessary for studies to focus on gender equality as the primary outcome; however, when available, data from such studies should be prioritized.
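The variable-inclusion minimum described above can be checked mechanically, and disaggregation itself is a simple group-by. The sketch below assumes a hypothetical list-of-records dataset with `gender` and `treatment_success` fields; the names are illustrative only.

```python
def meets_variable_inclusion(columns, gender_field="gender"):
    """Minimum bar for gender-disaggregated data: a gender variable exists."""
    return gender_field in columns

def disaggregate(records, outcome, gender_field="gender"):
    """Average an outcome separately for each gender group, so that
    gender-specific impacts stay visible rather than being averaged away."""
    groups = {}
    for r in records:
        groups.setdefault(r[gender_field], []).append(r[outcome])
    return {g: sum(vals) / len(vals) for g, vals in groups.items()}

records = [
    {"gender": "female", "treatment_success": 1},
    {"gender": "female", "treatment_success": 0},
    {"gender": "male", "treatment_success": 1},
]
assert meets_variable_inclusion(records[0].keys())
by_gender = disaggregate(records, "treatment_success")
```

A pooled average over these records would hide the gap between groups; the disaggregated view surfaces it immediately.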
A common long-term goal of data collection and use is advanced analytics. Once principles and data intentions are defined, it is important to scrutinize data intended for this use especially closely, as models trained and tested on gender-biased data create a negative feedback loop of gender bias in the real-world questions the model seeks to answer.
A common use of data for advanced analytics occurs in the medical field, where model use can reduce human bias and error. The implications of human gender bias in this field are often steep: for example, women are fifty percent more likely to be misdiagnosed when they have a heart attack, as most training is done on the male body, making it difficult to identify female heart attack symptoms, which differ from men's.
The introduction of AI-based medical diagnostic tools aims to reduce this bias by removing the human element. However, many of these tools are trained on medical studies that exclude or underrepresent women (lacking pregnant women, women in menopause, or women using birth control pills). As a result, these tools can perpetuate the human bias that already exists, even without any model-informed decision making.
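One concrete way to surface this kind of underrepresentation before training is to compare each subgroup's share of the training data with its share of the target population. The sketch below uses invented, illustrative numbers only; the function name and groups are assumptions, not part of any real tool.

```python
def representation_gaps(train_counts, population_shares):
    """Compare each subgroup's share of the training data to its share
    of the target population; large negative gaps signal underrepresentation."""
    total = sum(train_counts.values())
    gaps = {}
    for group, pop_share in population_shares.items():
        train_share = train_counts.get(group, 0) / total
        gaps[group] = train_share - pop_share
    return gaps

# Illustrative numbers only: a hypothetical cardiac study where women are
# half the target population but only a quarter of the training records.
gaps = representation_gaps({"female": 250, "male": 750},
                           {"female": 0.5, "male": 0.5})
```

A gap of -0.25 for women here would flag the dataset for remediation, such as sourcing additional female records, before any model is trained on it.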
For example, an online app may advise a female user that her symptoms of pain in the left arm are due to depression and that she should see a doctor in a couple of days. With the same symptoms, the same algorithm is likely to advise a male user to see a doctor immediately based on a diagnosis of a possible heart attack.
Through research, many professionals have concluded that the misrepresentation and exclusion of women in medical datasets comes down to a lack of diversity. The data is collected with a particular perspective in mind, and ultimately, the lack of diversity in the field of technology means that the absence of female data, and its potential harm, often goes unnoticed, leading to technology that exacerbates gender bias.
Team diversity should be considered at every stage of the data process, from requirements gathering through to analysis and use. A diverse team comprises individuals with demographic, personality, and functional diversity.
Bias Impact Statement
Once the data collection principles are understood and the intentions and implications of the data-related action are considered, a bias impact statement should be assembled before any data-related action is taken. To mitigate harmful mistakes and oversights, the statement should be completed before the collection phase of each analytics endeavor, and it may be developed using the questions and criteria from the Brookings Institution below.
What will the automated decision do?
- Who is the audience for the algorithm and who will be most affected by it?
- Do we have training data to make the correct predictions about the decision?
- Is the training data sufficiently diverse and reliable? What is the data lifecycle of the algorithm?
- Which groups are we worried about when it comes to training data errors, disparate treatment, and impact?
How will potential bias be detected?
- How and when will the algorithm be tested? Who will be the targets for testing?
- What will be the threshold for measuring and correcting for bias in the algorithm, especially as it relates to protected groups?
What are the operator incentives?
- What will we gain in the development of the algorithm?
- What are the potential bad outcomes and how will we know?
- How open (e.g., in code or intent) will we make the design process of the algorithm to internal partners, clients, and customers?
- What intervention will be taken if we predict that there might be bad outcomes associated with the development or deployment of the algorithm?
How are other stakeholders being engaged?
- What’s the feedback loop for the algorithm for developers, internal partners, and customers?
- Is there a role for civil society organizations in the design of the algorithm?
Has diversity been considered in the design and execution?
- Will the algorithm have implications for cultural groups and play out differently in cultural contexts?
- Is the design team representative enough to capture these nuances and predict the application of the algorithm within different cultural contexts? If not, what steps are being taken to make these scenarios more salient and understandable to designers?
- Given the algorithm’s purpose, is the training data sufficiently diverse?
- Are there statutory guardrails that companies should be reviewing to ensure that the algorithm is both legal and ethical?
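Because the bias impact statement is meant to be completed before collection begins, it can be enforced as a simple gate in a project workflow. The sketch below is a hypothetical illustration: the condensed question list and the function name are assumptions, not part of the Brookings criteria themselves.

```python
# Hypothetical checklist condensed from the bias impact questions above;
# each question must have a documented answer before collection may begin.
BIAS_IMPACT_QUESTIONS = [
    "Who is the audience for the algorithm and who will be most affected?",
    "Is the training data sufficiently diverse and reliable?",
    "How and when will the algorithm be tested, and on whom?",
    "What is the threshold for measuring and correcting bias?",
    "What intervention is planned if bad outcomes are predicted?",
    "Is the design team representative enough to capture cultural nuances?",
]

def collection_approved(answers):
    """Return (approved, unanswered): collection proceeds only when every
    bias-impact question has a non-empty answer on file."""
    unanswered = [q for q in BIAS_IMPACT_QUESTIONS
                  if not answers.get(q, "").strip()]
    return (len(unanswered) == 0, unanswered)

answers = {q: "documented in the project charter" for q in BIAS_IMPACT_QUESTIONS}
approved, missing = collection_approved(answers)
```

The point of the gate is procedural rather than technical: an empty answer blocks the pipeline, forcing the team to confront each question rather than skip the statement under deadline pressure.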
Understanding all the components of the data through the bias impact lens makes it easier to detect biases at each phase and audit a solution for potential problems.
The onus is on those working with data to do everything they can to ensure equitable outcomes of data-related projects and processes, as data informs not only the decisions made today but also the decisions made in the future.