
Tuesday, August 25, 2015

Computer Bias

Programming and Prejudice: How to Find Bias in Machine Learning Algorithms

University of Utah | August 17, 2015


Suresh Venkatasubramanian, an associate professor in the University of Utah’s School of Computing, leads a team of researchers that has discovered a technique to determine whether algorithms used for tasks such as hiring or administering housing loans could unintentionally discriminate. The team has also discovered a way to fix such errors if they exist. Their findings were recently presented at the 21st Association for Computing Machinery SIGKDD Conference on Knowledge Discovery and Data Mining in Sydney, Australia.

<more at http://www.scientificcomputing.com/news/2015/08/programming-and-prejudice-how-find-bias-machine-learning-algorithms; related links: https://followthedata.wordpress.com/2012/06/02/practical-advice-for-machine-learning-bias-variance/ (Practical advice for machine learning: bias, variance and what to do next. June 2, 2012) and http://unews.utah.edu/programming-and-prejudice/ (Programming and Prejudice. Utah Computer Scientists Discover How to Find Bias in Algorithms. August 14, 2015); further: http://arxiv.org/pdf/1412.3756 (Certifying and removing disparate impact. Michael Feldman, Sorelle A. Friedler, John Moeller, Carlos Scheidegger, and Suresh Venkatasubramanian. [Abstract: What does it mean for an algorithm to be biased? In U.S. law, unintentional bias is encoded via disparate impact, which occurs when a selection process has widely different outcomes for different groups, even as it appears to be neutral. This legal determination hinges on a definition of a protected class (ethnicity, gender) and an explicit description of the process. When computers are involved, determining disparate impact (and hence bias) is harder. It might not be possible to disclose the process. In addition, even if the process is open, it might be hard to elucidate in a legal setting how the algorithm makes its decisions. Instead of requiring access to the process, we propose making inferences based on the data it uses. We present four contributions. First, we link disparate impact to a measure of classification accuracy that, while known, has received relatively little attention. Second, we propose a test for disparate impact based on how well the protected class can be predicted from the other attributes. Third, we describe methods by which data might be made unbiased. Finally, we present empirical evidence supporting the effectiveness of our test for disparate impact and our approach for both masking bias and preserving relevant information in the data. Interestingly, our approach resembles some actual selection practices that have recently received legal scrutiny.])>
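
The test described in the abstract can be illustrated in a few lines of code: compare the rates of favorable outcomes across groups, and check how well the protected attribute can be predicted from the remaining attributes (if it can be predicted, simply dropping the protected column does not remove the bias). The sketch below is only an illustration of that idea, not the authors' implementation; the column names, the logistic-regression predictor, and the 4/5 ("80%") flagging threshold are assumptions made for the example.

# Sketch of the disparate-impact test described in Feldman et al. (arXiv:1412.3756).
# Illustrative only: column names, the logistic-regression predictor, and the
# 4/5 ("80%") threshold are assumptions, not the authors' code.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def disparate_impact_ratio(outcomes: pd.Series, protected: pd.Series) -> float:
    """Ratio of positive-outcome rates between the least- and most-favored groups."""
    rates = outcomes.groupby(protected).mean()
    return rates.min() / rates.max()

def protected_class_predictability(features: pd.DataFrame, protected: pd.Series) -> float:
    """Cross-validated balanced accuracy of predicting the protected attribute
    from the other (numeric) attributes; high values mean the protected class
    is encoded in the rest of the data."""
    clf = LogisticRegression(max_iter=1000)
    return cross_val_score(clf, features, protected, cv=5,
                           scoring="balanced_accuracy").mean()

if __name__ == "__main__":
    # Hypothetical hiring dataset with a binary outcome and a protected attribute.
    df = pd.read_csv("hiring.csv")
    y = df["hired"]                           # 1 = hired, 0 = not hired
    protected = df["gender"]                  # protected attribute
    X = df.drop(columns=["hired", "gender"])  # remaining (numeric) attributes

    print(f"Disparate impact ratio: {disparate_impact_ratio(y, protected):.2f}"
          " (values below 0.8 are commonly flagged)")
    print(f"Protected class predictability: "
          f"{protected_class_predictability(X, protected):.2f}")

The predictability score stands in for the classification-accuracy measure the paper links to disparate impact; if the protected class can be recovered from the other attributes, the paper's repair methods aim to mask that bias while preserving the relevant information in the data.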
