Can Technology Enable Authentic Assessment at Scale in Higher Education?

Educators know one of the best ways to assess learning is with old-fashioned project-based assignments — creative work, presentations, experiments, and written papers that give students opportunities to demonstrate, practice, or apply their learning.

Educators also know that grading art portfolios, performances, lab work, speeches, original research, and essays is time intensive and difficult to manage across large programs — particularly online programs with a distributed faculty and student body. The growth of online learning has probably meant a resurgence of multiple-choice style tests and quizzes at the expense of assessments that invite creative work.

Moreover, unique individual assessments don’t traditionally generate structured data that can be easily combined with the data from quantitative assessments. Both a presentation and a quiz may evaluate a competency such as using evidence, but student progress reports from a digital learning platform may show only the results of the quiz without incorporating the instructor’s scoring from the presentation.

However, education technology companies may be reaching the point of easing this tension between personalized and scalable assessment, allowing colleges, universities, systems, and credentialing entities to incorporate authentic assessments into their online or hybrid learning programs.

Reimagining high-value assessment

One such company is MZD, founded and led by Zac Henrich. MZD has produced a suite of tools allowing high-stakes assessments, both quantitative and qualitative, to be scored at scale. Henrich sees potential for higher education to use — or return to using — more of what he calls “performance assessments” built on authentic work.

MZD’s clients are interested in incorporating traditional academic essays, lab work, theater performances, and art portfolios into large online learning programs. MZD’s software moves the “red pen on paper” experience of scoring those unique projects to an online interface and makes commenting and marking more efficient for large groups of distributed instructors. Its administrative tools manage the workflow among the people participating in the assessment and enable collaboration among them. A rubric authoring tool allows both informal and formal responses, the latter aggregating into data an instructor or program can then use to evaluate where more attention is needed.

For example, a K-12 district can assign a creative project for all its high school buildings alongside — or in lieu of — a multiple-choice test and end up with a district-wide view of the results. Similarly, the scoring of low-stakes formative assessments, such as response essays, can be integrated with the scoring of summative quantitative assessments to track progress on particular competencies.

Performance assessments can look different for each program. MZD works with Mississippi State University’s Center for Continuing Education to support its welding technician program. To earn certification, students must demonstrate welds that are evaluated by their own instructors as well as by a panel of experts from across the state.

“They’re actually evaluating the student welds at scale,” Henrich says. “The student records themselves doing a welding project, uploads the video, and that’s evaluated and scored.”

Another institution using performance assessment at scale is Vin University (VinUni) in Hanoi. In its English language courses, rather than using a multiple-choice exam to test reading comprehension, VinUni asks students to record themselves reading and discussing passages and to submit written assignments. Instructors listen to the spoken work, read the written work, and score and comment on both, all through the same interface.

How might scaling authentic assessments work for a large-enrollment gateway college class like Biology 101 that has dozens of sections and instructors teaching a thousand students? Henrich says the key to an authentic assessment in that scenario is a good rubric.

“The professor creates their authentic task and sends a link and the students go off, do their work, and submit it,” he explains. “There’s no more ‘Here’s the stack of papers. I’m going to take it home and score it.’ It’s ‘I’m going to log in to a system and I’m going to score against the rubric until there’s no more left.’ And it could be one person or 50 teaching assistants doing that scoring.”

The ability to grade at scale, and to include the input of a variety of instructors, experts, and other stakeholders, also helps make scoring more consistent, particularly when strangers are grading a project. And because student work can be presented to a rater anonymously, distributed collaborative scoring can also reduce bias. Assuming a well-structured rubric, “I’m actually going to evaluate the student on their work and not my preconceived notions of what kind of student they are,” says Henrich.

Authentic assessments also support equity efforts, says Henrich, by giving every student an opportunity to use their personal experience. Rather than answering a reading comprehension question about a text they may or may not have read in a test prep course, students create a project or solve a problem. Not only does that mean they’re producing more personally relevant work, it also means students without access to test prep courses or tutors are able to participate in a meaningful way and demonstrate progress.

Related reading — Getting Started with Equity: A Guide for Academic Leaders includes advice on assessment strategies that benefit racially minoritized and poverty-affected students.

The future of assessment?

Henrich thinks better assessment technology will allow education to leverage the efficiencies of digital learning modalities while returning to its roots: asking students to practice and demonstrate learning through authentic work rather than through memorization and recall of content.

“Developing domain knowledge in any discipline — be it biochemistry or journalism, visual arts or behavioral economics — requires exposure and practice with concepts, skills, and contextual applications,” he says. For example, introductory science courseware might describe the steps of a laboratory procedure, but a quiz showing that students have memorized those steps only goes so far.

“It’s one thing to ask a multiple-choice question about a laboratory procedure and it’s another to allow students to upload a video of them doing the procedure, or to perform a simulation, or author a professional-style protocol document as their test,” he says. “It is increasingly possible to deliver and even automatically score these more authentic assessments online, so let’s use these better tools to measure students’ preparation for advanced studies and careers instead of just their readiness for a particular test.”
