What Are the Risks of Algorithmic Bias in Higher Education?

Artificial intelligence and machine learning are increasingly embedded in the software colleges and universities use for admissions, advising, courseware, and assessment. These technologies hold tremendous promise to help higher education expand access, overcome structural barriers, and close equity gaps. However, higher education also needs to think about the risk of structural inequities informing the software’s recommendations, a phenomenon known as algorithmic bias.

For example, the sophisticated algorithms embedded in courseware can identify struggling students in gateway courses. But as Jessica Rowland Williams, the Director of Every Learner Everywhere®, cautions, “Algorithms are only as informed as the programmers and developers who design them . . . . If an algorithm routinely places a student in a learning track that doesn’t align with their learning needs, then this could ultimately hinder growth.”

Despite the potential to help colleges and universities confront equity gaps, these powerful new technologies also have the potential to reproduce or even amplify existing biases. This is a challenge that many industries — and increasingly higher education — are grappling with as artificial intelligence and machine learning are built into computer software.

What is machine learning?

Machine learning is a methodology in artificial intelligence where a computer program “trains” itself automatically, using a reservoir of data. Without machine learning, the ability of a computer program to make predictions and personalized recommendations is limited by the ability of a software developer to imagine every possible scenario and write instructions into the code for those scenarios. With machine learning, the software developer instead writes instructions for looking back at an existing large set of scenarios and using them to develop inferences, predictions, and recommendations about the present scenario.

For example, when the auto-suggest feature in a messaging or email application suggests the next word or phrase, it’s not because the software developer wrote a line of instructions saying “When the user types X, suggest Y.” Instead, they instructed the software to observe a very large set of examples of past users, to learn from that, and then to develop a recommendation for this user.
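
The difference between writing rules and learning from examples can be sketched in a few lines of Python. This is a minimal illustration, not how any particular messaging application works: the function names and the four sample messages are hypothetical, and a real system would learn from a far larger set of examples with a far more capable model.

```python
from collections import Counter, defaultdict

def train_suggester(messages):
    """Count which word follows each word across past messages."""
    following = defaultdict(Counter)
    for message in messages:
        words = message.lower().split()
        for current_word, next_word in zip(words, words[1:]):
            following[current_word][next_word] += 1
    return following

def suggest_next(following, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    counts = following.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]

# Hypothetical past messages standing in for a very large set of examples.
past_messages = [
    "see you tomorrow",
    "see you soon",
    "see you tomorrow morning",
    "thank you so much",
]
model = train_suggester(past_messages)
print(suggest_next(model, "see"))  # you
print(suggest_next(model, "you"))  # tomorrow
```

Nowhere does the code say “when the user types you, suggest tomorrow.” That suggestion comes entirely from the examples, which is also why the quality of the examples matters so much.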

Machine learning is playing a larger role in higher education technology. For example, software can potentially scan the countless unique learning paths students have taken through a university, identify correlations and patterns related to academic success, and recommend specific interventions for individual students.

What is algorithmic bias?

Algorithmic bias is discrimination that favors one group over another as a result of the recommendations or predictions of a computer program. In theory, this isn’t unique to artificial intelligence and machine learning: any algorithm could be biased if it were written to deliberately weight or discard some factors. In current usage, however, algorithmic bias typically refers to hidden bias that originates in the data used as inputs for a program.

This is increasingly an issue as artificial intelligence and machine learning are embedded into technology tools. The algorithm appears impartial because it seemingly doesn’t have biased instructions in it, so its recommendations are perceived to be impartial. But the data the algorithm is learning from could have structural and historical bias baked into it.

For example, if a college admissions office is using a software program to identify applicants likely to succeed at the college and the algorithm is “learning” from that institution’s previous admissions data, it will tend to make recommendations that reproduce past biases. The software may not have a line of instructions saying to prioritize students with more advanced placement (AP) classes, but in looking back at old data, the software could train itself to treat more AP classes as a signal of quality and thereby replicate a structural bias.
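
A small Python sketch shows how this can happen. It assumes a deliberately simple “model” that scores applicants by the share of similar past applicants who were admitted. The records, function names, and scoring method are all hypothetical and are not drawn from any real admissions product.

```python
from collections import defaultdict

def learn_admit_rates(history):
    """Learn, for each AP-class count, the share of past applicants admitted."""
    admitted = defaultdict(int)
    total = defaultdict(int)
    for ap_classes, was_admitted in history:
        total[ap_classes] += 1
        admitted[ap_classes] += int(was_admitted)
    return {ap: admitted[ap] / total[ap] for ap in total}

def score(rates, ap_classes):
    """Score a new applicant using the nearest AP count seen in the history."""
    nearest = min(rates, key=lambda seen: abs(seen - ap_classes))
    return rates[nearest]

# Hypothetical past decisions: (number of AP classes, admitted?).
history = [
    (0, False), (0, False), (0, True), (0, False),
    (2, False), (2, True), (2, True), (2, False),
    (6, True), (6, True), (6, True), (6, False),
]
rates = learn_admit_rates(history)
print(rates)            # {0: 0.25, 2: 0.5, 6: 0.75}
print(score(rates, 5))  # 0.75
print(score(rates, 0))  # 0.25
```

No line of this code says to prefer applicants with more AP classes, yet the learned scores rise with the AP count because the past decisions did. An applicant whose high school offered no AP classes receives the lowest score regardless of ability.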

Algorithmic bias in higher education

There are many well-documented examples of corporations recognizing and struggling with the potential harm caused by algorithmic bias. Amazon experimented with recruitment software that relied on 10 years of résumés from applicants and hires but had to discontinue it when it consistently favored men over women.

Algorithmic bias in higher education is showing up in similar ways. In 2020, the University of Texas at Austin’s computer science department ditched a machine learning program it used to evaluate applicants to its Ph.D. program. The program’s algorithm drew on a database of past admissions decisions, which critics contended reduced opportunities for students from diverse backgrounds.

Algorithmic bias is also a risk in courseware and other digital learning technology, warns Roxana Marachi, Associate Professor of Education at San Jose State University. “The systems we are putting into place are laying the tracks for institutional racism 2.0 unless we address it — and unless we put guardrails or undo the harms that are pending,” she said in an EdSurge report.

Earlier this year, the technology news site The Markup investigated the advising software Navigate from consulting firm EAB, which is widely used by large public universities. The investigation found that Black students were identified as at “high risk” of not graduating in their selected major at four times the rate of their white peers.

“This opens the door to even more educational steering,” Ruha Benjamin, Professor of African American studies at Princeton University, told The Markup. “College advisors tell Black, Latinx, and Indigenous students not to aim for certain majors. But now these gatekeepers are armed with ‘complex’ math.”

Arguably, this example better illustrates the misuse of an algorithmically driven recommendation than bias in the algorithm itself. But as Maryclare Griffin, a Statistics Professor at The University of Massachusetts Amherst, points out, “You can easily find situations where there are two students who have similar low GPAs and different risk scores, and there’s no obvious explanation for why that’s the case,” except that one is from an underrepresented group for that major.
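
The kind of comparison Griffin describes can be sketched as a simple check. This is an illustration only, assuming an institution can export a GPA, a risk score, and a group label for each student. The field names, tolerances, and student records below are hypothetical and do not reflect Navigate’s data or scoring.

```python
from itertools import combinations

def unexplained_gaps(students, gpa_tolerance=0.1, score_gap=0.2):
    """Find pairs from different groups with similar GPAs but different risk scores."""
    flagged = []
    for a, b in combinations(students, 2):
        similar_gpa = abs(a["gpa"] - b["gpa"]) <= gpa_tolerance
        different_group = a["group"] != b["group"]
        large_gap = abs(a["risk"] - b["risk"]) >= score_gap
        if similar_gpa and different_group and large_gap:
            flagged.append((a["id"], b["id"]))
    return flagged

# Hypothetical students; "risk" is a made-up score between 0 and 1.
students = [
    {"id": "s1", "group": "A", "gpa": 2.1, "risk": 0.35},
    {"id": "s2", "group": "B", "gpa": 2.1, "risk": 0.80},
    {"id": "s3", "group": "A", "gpa": 3.6, "risk": 0.10},
    {"id": "s4", "group": "B", "gpa": 3.6, "risk": 0.15},
]
print(unexplained_gaps(students))  # [('s1', 's2')]
```

A flagged pair is not proof of bias, since the scores may rest on factors other than GPA. It is a prompt for a person to ask what explains the difference.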

One possible improvement might be to “scrub” social categories such as race and gender from the data that algorithms are learning from by omitting signals like names, the college someone went to, or the number of AP classes they took. However, writing in The Journal of Information Policy, three faculty from The University of Arizona School of Information Center for Digital Society and Data Studies argue that leaving out social category data does not necessarily make an algorithm less biased. Social identifiers are so pervasive that machine learning algorithms can detect other proxies or patterns in the data that reveal sex or race, such as social connections or addresses.

The authors go on to recommend including social identifiers in data and then improving the design and use of algorithms to directly confront bias. “When such sensitive information is used responsibly and proactively, ongoing discrimination can be made transparent through data-checking processes that can ultimately improve outcomes for discriminated-against groups,” the authors conclude.

One example of that approach may be Salesforce’s Education Cloud admissions software, which can be configured to alert a user when a data point could reveal race. Kathy Baxter, Architect of Ethical Practice at Salesforce, told Fast Company, “if an admissions officer wants to avoid race or gender bias in the model they’re building to predict which applicants should be admitted, but they include zip code in their data set not knowing that it correlates to race, they will be alerted to the racial bias this can cause.”
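
The idea behind such an alert can be sketched as follows. This is not Salesforce’s implementation. It is a minimal illustration that assumes social identifiers are retained for checking, as the Arizona authors recommend. It asks how much better one can guess a person’s group from a given field than from no information at all. The field names, the 0.2 threshold, and the rows are hypothetical.

```python
from collections import Counter, defaultdict

def proxy_strength(rows, feature, protected):
    """Return (baseline, accuracy) for guessing `protected` from `feature`."""
    overall = Counter(row[protected] for row in rows)
    # Baseline: always guess the most common group.
    baseline = overall.most_common(1)[0][1] / len(rows)
    by_value = defaultdict(Counter)
    for row in rows:
        by_value[row[feature]][row[protected]] += 1
    # Accuracy: guess the most common group among rows sharing the feature value.
    correct = sum(counts.most_common(1)[0][1] for counts in by_value.values())
    return baseline, correct / len(rows)

def proxy_alerts(rows, features, protected, min_gain=0.2):
    """List features that reveal the protected attribute well beyond the baseline."""
    alerts = []
    for feature in features:
        baseline, accuracy = proxy_strength(rows, feature, protected)
        if accuracy - baseline >= min_gain:
            alerts.append(feature)
    return alerts

# Hypothetical applicant rows; the zip codes and groups are invented.
rows = [
    {"zip_code": "11111", "term": "fall", "group": "A"},
    {"zip_code": "11111", "term": "spring", "group": "A"},
    {"zip_code": "11111", "term": "fall", "group": "A"},
    {"zip_code": "11111", "term": "spring", "group": "B"},
    {"zip_code": "22222", "term": "fall", "group": "B"},
    {"zip_code": "22222", "term": "spring", "group": "B"},
    {"zip_code": "22222", "term": "fall", "group": "B"},
    {"zip_code": "22222", "term": "spring", "group": "A"},
]
print(proxy_alerts(rows, ["zip_code", "term"], "group"))  # ['zip_code']
```

In this invented data, zip code raises the accuracy of guessing a person’s group from 50 percent to 75 percent, so it is flagged, while application term adds nothing. With real data, fields that take many distinct values would need a more careful test, because a per-value majority guess can look accurate by chance on small samples.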

Improving algorithm accountability

As institutions increasingly turn to algorithms for recruitment and student assessment, questions around accountability and ethics arise as well. When humans discriminate, it’s easy to assign fault. But if a student is unfairly harmed by an algorithm relying on machine learning, who is accountable for that inequity? Given the “black box” nature of artificial intelligence, and with companies disinclined to reveal their proprietary data sets, discriminatory practices may be harder to spot or reverse.

Lawmakers have begun to think of ways to address this issue. In 2019, the Algorithmic Accountability Act was introduced in the House of Representatives. The bill would require businesses to conduct internal reviews of their automated decision-making processes to uncover any bias, unfairness, or discrimination in the algorithm’s output. Organizations would also need to detail the system’s design, training data, and purpose. The bill did not receive a vote in 2019, but one of the bill’s original sponsors plans to reintroduce the measure.

In the meantime, higher education institutions must ensure their students’ data privacy and take measures to prevent unintended discrimination arising from algorithms, say experts in the field. Rosa Calabrese, manager of digital design at WCET (the WICHE Cooperative for Educational Technologies), writing in Algorithms, Diversity, and Privacy: Better Data Practices to Create Greater Student Equity, suggests that humans should be part of the decision-making process alongside AI-powered algorithms and that institutions must routinely audit algorithms for bias.
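
A routine audit with a human in the loop could start from something as simple as the sketch below. It is one possible approach, not a method prescribed by Calabrese or WCET. It compares how often an algorithm flags students in each group and calls for human review when the rates diverge. The function names, the 1.25 threshold, and the records are hypothetical, and an institution would need to choose its own threshold.

```python
def flag_rates(records):
    """Share of students in each group that the algorithm flagged."""
    totals, flagged = {}, {}
    for group, was_flagged in records:
        totals[group] = totals.get(group, 0) + 1
        flagged[group] = flagged.get(group, 0) + int(was_flagged)
    return {group: flagged[group] / totals[group] for group in totals}

def audit(records, max_ratio=1.25):
    """Compare the highest and lowest flag rates and decide if people should review."""
    rates = flag_rates(records)
    lowest, highest = min(rates.values()), max(rates.values())
    if highest == 0:
        ratio = 1.0           # nobody was flagged, so there is no disparity
    elif lowest == 0:
        ratio = float("inf")  # one group was never flagged, another was
    else:
        ratio = highest / lowest
    return {"rates": rates, "ratio": ratio, "needs_review": ratio > max_ratio}

# Hypothetical audit sample: (group, flagged by the algorithm?).
records = (
    [("A", True)] * 2 + [("A", False)] * 8
    + [("B", True)] * 5 + [("B", False)] * 5
)
report = audit(records)
print(report["rates"])         # {'A': 0.2, 'B': 0.5}
print(report["needs_review"])  # True
```

The audit does not decide anything on its own. Its only output is a signal that people should examine the algorithm’s recommendations before acting on them.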

The best route to algorithmic accountability will be to diversify the group of people involved in creating the algorithms to begin with. As Jessica Rowland Williams said in an earlier interview, “Developers must intentionally avoid status quo design or designing for the ‘average student,’ who often is thought of as white, male, and middle- to upper-income. They also must diversify their pool of developers to include a broader range of perspectives and lived experiences in the development process.”

To reduce the risk of algorithmic bias in digital learning, use the comprehensive Adaptive Courseware Implementation Guide, which advises on ways to center equity and student voice in digital learning technologies.