Algorithms have been shown to outperform humans in a range of settings, from medical diagnosis to image recognition. However, critics caution that automated approaches could codify existing human biases and disadvantage those from under-represented groups. Examining the use of automation in hiring, MIT Sloan Prof. Danielle Li found that algorithms that value exploration improve the quality of candidates while also increasing demographic diversity.
“There is a growing body of work on the potential gains from following algorithmic recommendations, but this is the first paper to highlight the role of algorithm design on the hiring process,” says Li.
She explains that typical hiring algorithms are designed to solve a static prediction problem. They look at a historical data set of people who were previously selected to predict who will be a good hire from a pool of current applicants. “As a result, those algorithms often end up providing a leg-up to people from groups who have traditionally been successful and grant fewer opportunities to minorities and women.”
In the study, Li and her colleagues focused on the decision to grant first-round interviews for positions in consulting, financial analysis, and data science – sectors which offer well-paid jobs and which have also been criticized for their lack of diversity. They analyzed records on job applications to these types of positions with a Fortune 500 firm. Like many other firms in its sector, the company receives a large number of applications and rejects the majority of candidates on the basis of an initial automated resume screen. Among those who pass this screen and go on to be interviewed, hiring rates are still relatively low, with only 10% receiving an offer.
The researchers built three resume screening algorithms to compare outcomes. The first model used a typical static supervised learning approach (SL), relying on past data sets to make predictions. The second model used a similar SL approach, but it updated the training data used throughout the test period with hiring outcomes of applicants selected for interviews (updating SL). The third approach implemented an upper confidence bound (UCB) model, incorporating exploration bonuses that reward candidates about whose quality the algorithm is more uncertain. Those bonuses tend to be higher for groups of candidates who are underrepresented in the training data, such as those with unusual college majors, different geographies, or unique work histories. The algorithm was not told in advance to select minorities or women.
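The paper's own implementation is not reproduced here, but the core UCB idea can be sketched in a few lines: score each candidate as predicted quality plus a bonus that shrinks the more often candidates with a similar profile have already been selected. The function and parameter names below are hypothetical, chosen only for illustration.

```python
import math

def ucb_score(predicted_quality, n_similar_selected, t, c=1.0):
    """Illustrative UCB scoring rule for a screening decision.

    predicted_quality: the model's estimate of hiring potential
    n_similar_selected: how many previously selected applicants share
        this candidate's profile group (major, geography, work history)
    t: total number of selection decisions made so far
    c: tuning constant controlling how strongly uncertainty is rewarded
    """
    if n_similar_selected == 0:
        # profiles never sampled before get priority, so they are tried at least once
        return float("inf")
    bonus = c * math.sqrt(math.log(t) / n_similar_selected)
    return predicted_quality + bonus

# Two candidates with identical predicted quality: the one from a
# rarely selected profile group receives the larger exploration bonus.
common_profile = ucb_score(0.6, n_similar_selected=900, t=1000)
rare_profile = ucb_score(0.6, n_similar_selected=10, t=1000)
```

Because the bonus depends only on how often a profile has been sampled, not on demographics directly, a rule like this can favor underrepresented candidates without ever being told to do so.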
They found that the UCB model more than doubled the share of selected applicants who are Black or Hispanic, from 10% to 23%. In comparison, static and updated SL algorithms decreased Black and Hispanic representation to approximately 2% and 5% respectively.
Li points out that the increase in diversity from the UCB model was persistent throughout the test sample. If the additional minority applicants selected by the UCB model were weaker, the model would update and learn to select fewer such applicants over time. Instead, the UCB model continued to select more minority applicants relative to both the human and SL models.
As for gender, all algorithms increased the share of selected applicants who are women, from 35% under human recruiting to 41% with the SL model, 50% with the updating SL model, and 39% with the UCB model.
Li explains why the exploration-based model selects fewer women than the SL models: “Although there are fewer women in our data set, increases in female representation under the UCB model were blunted because men tend to be more heterogeneous on other dimensions like geography, education, and race, leading them to receive higher exploration bonuses on average.”
The researchers further found that the machine learning models generated “substantial and comparable” increases in the quality of selected applicants, as measured by their hiring potential.
“Our study shows that firms don’t need to trade off equity for efficiency when it comes to expanding diversity in the workplace. Even though our UCB model placed no value on diversity in and of itself, incorporating exploration in the model led to the firm interviewing twice as many under-represented minorities while more than doubling its predicted hiring yield,” says Li.
However, she cautions that not all algorithms are designed equally. “In our study, the supervised learning approach – which is commonly used by commercial vendors of machine-learning-based hiring tools – would improve hiring rates, but at the cost of virtually eliminating Black and Hispanic representation. This underscores the importance of algorithmic design in labor market outcomes.”
Li is coauthor of the NBER working paper “Hiring as Exploration” with MIT PhD candidate Lindsey R. Raymond and Peter Bergman of Columbia University.