How can decision making by algorithms discriminate against people, and how to prevent that
Indrė Žliobaitė
Aalto University School of Science, Dept. of Computer Science
Helsinki Institute for Information Technology HIIT
University of Helsinki
23 October 2015
Background
● Algorithms learned from data are increasingly used in decision making:
  – who gets a loan
  – who goes to an extra security check
  – who is released from prison
● Decisions informed by big data could have discriminatory effects even in the absence of discriminatory intent
● Policy recommendation: expand technical expertise to stop discrimination
Legislation
● Discrimination is forbidden by international and national laws
● The scope of protection is expanding
  – in Finland: a new non-discrimination act since Jan 2015
  – a new EU non-discrimination directive is in preparation
● It is not yet clear how to address digital discrimination properly (work in progress)
  – Obama administration report on big data, 2014
● Using sensitive variables (e.g. race) in models is forbidden
  – removing the sensitive variable does not solve the problem (redlining)
Implications
● Human decision makers may discriminate occasionally
● Algorithms would discriminate systematically and continuously
● Algorithms are often considered to be inherently objective
  – but models are only as good as the data and the modeling assumptions
  – algorithms may capture human biases, and may exaggerate them
Why care?
● To protect vulnerable people? Because the law requires it?
● Accountability for algorithm performance
How does discrimination by algorithms happen?
What is an algorithm
● Typically in machine learning, an algorithm is a procedure for learning a model from data
● A (predictive) model is the resulting decision rule(s)
● E.g., linear regression: OLS is the algorithm for finding the parameters; the resulting equation with parameters is the model

[Diagram: historical data → algorithm → model; new data → model → decision]
● In the mass media, “algorithms” typically refers to models
First-principle vs. data science

First-principle models
● deeply study the phenomenon
● understand the phenomenon
● assume the form of the relation (model)
● find the parameters
Examples: rule-based systems, parametric models, linear and non-linear regression, logistic regression

Data-driven models (“black-box”)
● collect a lot of data
● formulate a learning task
● learn the model from data
Examples: Bayesian networks, deep learning, support vector machines, decision trees, instance-based learning, clustering, collaborative filtering
Example – translation: grammar-based translation vs. Google Translate
Can algorithms discriminate?
● Discrimination – inferior treatment based on an ascribed group rather than individual merits
● Algorithms can discriminate even when the sensitive variable is not used in decision making (redlining)
  – indirect discrimination

Source: "Home Owners' Loan Corporation Philadelphia redlining map". Licensed under Public Domain via Wikipedia
Toy example
● Suppose salary is decided as [formula]
● The data scientist assumes [a model form]
● Observes data
● Learns a model
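The formulas on the slide did not survive extraction, but the redlining effect the toy example illustrates can be reproduced with a small simulation (all variable names and numbers below are invented for illustration): historic salaries carry a discriminatory group penalty, the sensitive variable is dropped before training, and a neighborhood proxy correlated with the group lets the learned model reproduce the gap anyway.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Sensitive variable s: 1 = protected group (invented for illustration).
s = rng.integers(0, 2, size=n)

# Proxy variable: "neighborhood" agrees with group membership 80% of the time.
neighborhood = np.where(rng.random(n) < 0.8, s, 1 - s)

# Historic salaries: driven by merit, but carrying a -400 group penalty.
merit = rng.normal(0.0, 1.0, n)
salary = 3000 + 500 * merit - 400 * s + rng.normal(0.0, 50.0, n)

# "Fair" model: drop s, fit OLS on merit and the neighborhood proxy only.
X = np.column_stack([np.ones(n), merit, neighborhood])
beta, *_ = np.linalg.lstsq(X, salary, rcond=None)
pred = X @ beta

# The model never saw s, yet its predictions still differ by group:
# the proxy has absorbed most of the historic penalty (indirect discrimination).
gap = pred[s == 0].mean() - pred[s == 1].mean()
```

Dropping the proxy as well would remove the gap here, but in realistic data many weak proxies jointly predict the sensitive variable, which is why simple removal fails.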
When do algorithms become discriminatory?
● Algorithms can become discriminatory when:
  ● data is incorrect
    – due to discriminatory decisions in the past
    – the population is changing over time
  ● data is incomplete
    – omitted variable bias
    – sampling bias
  ● a global optimization criterion is used
    – maximize performance for the majority without considering how the remaining inaccuracies are distributed
Algorithmic challenges
● If less training data is available for the minority, the model is likely to be less accurate on the minority
● Populations may be non-homogeneous
  – data for the minority may be more complex (e.g. complex names)
  – underlying patterns for the minority may be more complex (e.g. separate models for males and females)
● Accuracy is never 100% – how do the errors distribute?
Algorithms discriminate unintentionally, and often the discrimination is not easy to spot and not trivial to fix
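A minimal sketch of the global-optimization failure mode above (all score distributions invented for illustration): a single decision threshold is tuned for overall accuracy, and because the minority's score pattern differs, the remaining errors concentrate on the minority.

```python
import numpy as np

rng = np.random.default_rng(1)

def sample(n, pos_mu, neg_mu):
    """Scores for one group: positives around pos_mu, negatives around neg_mu."""
    y = rng.integers(0, 2, n)
    x = np.where(y == 1, rng.normal(pos_mu, 1.0, n), rng.normal(neg_mu, 1.0, n))
    return x, y

# Majority (90%) and minority (10%) follow different score patterns.
x_maj, y_maj = sample(9000, 2.0, 0.0)
x_min, y_min = sample(1000, 1.0, -1.0)
x = np.concatenate([x_maj, x_min])
y = np.concatenate([y_maj, y_min])

# One global threshold, chosen to maximize overall accuracy only.
grid = np.linspace(-3, 3, 121)
t_best = grid[np.argmax([((x > t) == (y == 1)).mean() for t in grid])]

# Overall accuracy looks fine, but the errors fall on the minority,
# whose own optimal threshold is lower than the global one.
acc_maj = ((x_maj > t_best) == (y_maj == 1)).mean()
acc_min = ((x_min > t_best) == (y_min == 1)).mean()
```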
How to prevent potential discrimination by algorithms?
Algorithms and discrimination
● Discrimination-aware data mining and machine learning is a new, emerging discipline
● Computer science + law + social science

(book; journal special issue 2014, 22(2))
● Fairness model (how it should be) – from social sciences
● Non-discrimination constraints – from law
● Computer science makes algorithms obey those constraints

Translating fairness models and laws into mathematical constraints is challenging
Questions for social science
● Fairness model
  – what would be the principles for fair allocation? Increase for the deprived community? Reduce for the favored community? Average? Weighted average?
  – to what extent can differences be justified? Location? Neighborhood?
● Discrimination mechanisms
  – what are the different discrimination mechanisms? Score + bias? Is the bias the same for everybody, or does it depend on other characteristics? How does multiple discrimination happen?
Discrimination-aware data mining and machine learning
● Discrimination discovery
  – discover discrimination in data using data mining methods
  – the data records human decisions
● Discrimination prevention
  – incorporate non-discrimination constraints into the algorithms
  – the resulting algorithms (models) are used for decision support
Measuring discrimination by algorithms
● There is no consensus (yet) on how to measure it
● Main types of measures
  – statistical measures for indirect discrimination, applied to model outputs
  – new mathematically convenient measures – the relation between the model output and the sensitive variable
  – structural measures
    ● identify discriminated individuals (discrimination discovery) and count how many
    ● identify and count discriminatory rules (discrimination discovery)
● Active debate on how to account for explanatory variables
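The most common statistical measure of the first kind is the difference in positive-decision rates between the groups (often called statistical parity difference). A minimal sketch, with invented acceptance data:

```python
import numpy as np

def statistical_parity_difference(decisions, sensitive):
    """P(positive decision | unprotected) - P(positive decision | protected).

    `decisions`: 0/1 model outputs; `sensitive`: 0/1, 1 = protected group.
    A value of 0 means parity; positive values favor the unprotected group.
    """
    decisions = np.asarray(decisions, dtype=bool)
    sensitive = np.asarray(sensitive, dtype=bool)
    return decisions[~sensitive].mean() - decisions[sensitive].mean()

# Invented example: 70% acceptance for the unprotected group vs 40%
# for the protected group.
d = np.array([1] * 7 + [0] * 3 + [1] * 4 + [0] * 6)
s = np.array([0] * 10 + [1] * 10)
print(statistical_parity_difference(d, s))  # ≈ 0.3
```

This measure ignores explanatory variables entirely, which is exactly what the "active debate" bullet above is about: some of the group difference may be legitimately explained by other attributes.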
Prevention baselines
● Hiding the sensitive variable from the model does not solve the problem, unless...
● ...any attributes in X that could be used to predict s are changed such that a fairness constraint is satisfied
  – the approach is similar to sanitizing datasets for privacy preservation (Feldman et al., 2014)
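One way to change the attributes in X, sketched below as a rank-preserving "repair" in the spirit of the sanitizing approach attributed above to Feldman et al. (2014): map each group's values of a feature onto the pooled distribution, so the feature no longer predicts s while the within-group ordering survives. The function name and data are invented for illustration.

```python
import numpy as np

def repair_feature(x, s):
    """Rank-preserving repair of one numeric feature (illustrative sketch).

    Each group's values are replaced by the pooled quantiles at their
    within-group ranks, so the feature becomes uninformative about the
    group while preserving each group's internal ordering.
    """
    x = np.asarray(x, dtype=float)
    s = np.asarray(s)
    out = np.empty_like(x)
    for g in np.unique(s):
        m = s == g
        # within-group ranks scaled to [0, 1]
        ranks = x[m].argsort().argsort() / max(m.sum() - 1, 1)
        out[m] = np.quantile(x, ranks)
    return out

# Invented example: two groups whose raw feature values differ by 10.
x = np.array([1, 2, 3, 4, 5, 11, 12, 13, 14, 15], dtype=float)
s = np.array([0] * 5 + [1] * 5)
repaired = repair_feature(x, s)
```

After the repair both groups have the same feature distribution, so a model trained on the repaired data cannot recover s from this feature.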
Resampling
● preferential sampling (Kamiran and Calders, 2010)
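The resampling idea can be sketched as follows. This is a simplified variant, not the exact preferential-sampling scheme of Kamiran and Calders (2010), which prefers borderline instances; here each (group, label) cell is simply resampled uniformly to the size it would have if label and group were independent. Function name and data are invented; every cell is assumed non-empty.

```python
import numpy as np

def resample_for_independence(X, y, s, rng=None):
    """Resample training data so that label y becomes independent of group s.

    Each (group, label) cell is resampled with replacement to its expected
    size under independence of y and s.
    """
    rng = rng or np.random.default_rng(0)
    n = len(y)
    chosen = []
    for g in np.unique(s):
        for c in np.unique(y):
            cell = np.flatnonzero((s == g) & (y == c))
            # expected cell size if y and s were independent
            target = int(round((s == g).sum() * (y == c).sum() / n))
            chosen.append(rng.choice(cell, size=target, replace=True))
    idx = np.concatenate(chosen)
    return X[idx], y[idx], s[idx]

# Invented biased data: 75% positives in group 0, 25% in group 1.
s = np.array([0] * 80 + [1] * 20)
y = np.array([1] * 60 + [0] * 20 + [1] * 5 + [0] * 15)
X = np.arange(100).reshape(-1, 1)
Xr, yr, sr = resample_for_independence(X, y, s)
```

In the resampled set the positive rate is equal across groups, so a model trained on it no longer inherits the historical acceptance bias, at the cost of duplicating and dropping some training instances.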
Regularization
● Regular tree induction: splits are chosen by the information gain with respect to the class label (IGC), computed on the data subset due to each split
● Non-discriminatory tree induction: also compute the information gain (entropy reduction) with respect to the protected characteristic (IGS)
● Split criterion: IGC − IGS (Kamiran et al., 2010)
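The IGC − IGS criterion can be sketched directly (function names invented; a sketch, not the full tree learner of Kamiran et al., 2010): a candidate split scores high when it is informative about the class label but uninformative about the protected attribute.

```python
import numpy as np

def entropy(labels):
    """Shannon entropy of a label array, in bits."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -(p * np.log2(p)).sum()

def info_gain(values, split_mask):
    """Entropy reduction in `values` from splitting by the boolean mask."""
    n = len(values)
    left, right = values[split_mask], values[~split_mask]
    return entropy(values) - (len(left) / n * entropy(left)
                              + len(right) / n * entropy(right))

def discrimination_aware_gain(y, s, split_mask):
    """IGC - IGS split score: information gain on the class label y minus
    information gain on the protected attribute s."""
    return info_gain(y, split_mask) - info_gain(s, split_mask)

# Tiny invented example: mask_a separates the classes but mixes the groups,
# mask_b separates the groups but mixes the classes.
y = np.array([0, 0, 1, 1])
s = np.array([0, 1, 0, 1])
mask_a = np.array([True, True, False, False])   # score +1: informative, fair
mask_b = np.array([True, False, True, False])   # score -1: pure group split
```

A tree grown with this score avoids splits that merely reconstruct the protected attribute, at some cost in predictive accuracy.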
Postprocessing
● Modify the model
  – relabel tree leaves to remove the most discrimination with the least damage to accuracy (Kamiran et al., 2010)
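The leaf-relabelling idea can be sketched greedily (a simplified illustration, not the exact procedure of Kamiran et al., 2010; the `Leaf` class and all counts are invented): flip the predicted label of the leaf that buys the most reduction in the group acceptance gap per correctly classified training instance sacrificed, and repeat until the gap is small enough.

```python
from dataclasses import dataclass

@dataclass
class Leaf:
    pred: int      # label currently assigned by this leaf (0 or 1)
    n0: int        # unprotected (s=0) training instances reaching the leaf
    n1: int        # protected (s=1) training instances reaching the leaf
    correct: int   # instances this leaf currently classifies correctly
    total: int     # all instances reaching the leaf

def relabel(leaves, N0, N1, eps=0.0):
    """Greedy leaf relabelling: flip leaves until the acceptance gap
    P(pred=1 | s=0) - P(pred=1 | s=1) drops to eps."""
    def gap():
        p0 = sum(l.n0 for l in leaves if l.pred == 1) / N0
        p1 = sum(l.n1 for l in leaves if l.pred == 1) / N1
        return p0 - p1
    while gap() > eps:
        best, best_score = None, 0.0
        for l in leaves:
            sign = 1 if l.pred == 1 else -1
            d_gap = sign * (l.n0 / N0 - l.n1 / N1)   # gap reduction if flipped
            if d_gap <= 0:
                continue                             # flip would not help
            d_acc = max(2 * l.correct - l.total, 1)  # accuracy cost of flipping
            if d_gap / d_acc > best_score:
                best, best_score = l, d_gap / d_acc
        if best is None:
            break                                    # no flip reduces the gap
        best.pred = 1 - best.pred
        best.correct = best.total - best.correct
    return leaves

# Invented example: leaf 0 accepts a mostly-unprotected population,
# leaf 1 rejects a mostly-protected one; the initial gap is 0.6.
leaves = [Leaf(pred=1, n0=8, n1=2, correct=6, total=10),
          Leaf(pred=0, n0=2, n1=8, correct=9, total=10)]
relabel(leaves, N0=10, N1=10)
```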
Prevention solutions
● Preprocessing
  – modify the input data X, s or y
  – resample the input data
● Regularization
● Postprocessing
  – modify the models
  – modify the outputs

From a legal perspective:
● decision manipulation – very bad
● data manipulation – quite bad
● the protected characteristic should not be used in decision making
Challenges ahead
● Research challenges
  – defining the right discrimination measures and optimization criteria
  – translating legal requirements into mathematical constraints and back
  – transparency and interpretability of the solutions is critical – stakeholders need to understand and trust the solutions
● Impact challenges
  – what is the scope of potentially discriminatory applications?
  – businesses are reluctant to collaborate, afraid of negative publicity
  – the public is not concerned, thinking that algorithms are always objective