Drawing on interviews with 27 Australian and 30 Swedish teachers across eight schools, researchers have flagged the ‘sustained’ and ‘often unseen’ human labour needed to integrate GenAI tools into classrooms.
They warn that AI technologies are “dependent on the hidden labour of humans to co-produce the illusion of automation”.
All 57 teachers had used AI tools to support their work, including planning lessons, creating resources, grading work and providing feedback to students.
But in many instances, teachers reported that these tasks would “inevitably involve continual acts of checking, validating, judging, discerning, reworking, and sometimes outright rejecting” the outputs produced, the study found.
Many teachers were suspicious of colleagues who claimed to simply ‘lift and shift’ GenAI outputs unproblematically into their practice, suggesting any material that was going to be used in the classroom had to be analysed and modified in some way.
“Some teachers referred to this work in unassuming ways – ‘tweaking’, ‘fiddling around’, ‘double checking’ and providing additional ‘afterthought’.
“Others referred to this work in more substantial terms: ‘negotiating’, ‘going deeper’, and ‘going back to using myself only’,” the researchers note.
Teachers clearly perceive a ‘lack-of-fit’ between AI materials and their own students and teaching expectations, the study states.
Tellingly, GenAI content was also regularly criticised for being ‘not sophisticated enough’ to align with local curriculum expectations.
Legal Studies teachers in both countries, for example, described their teaching as highly locally-specific, with nuances that were missed by AI tools.
One Australian Legal Studies teacher lamented how AI produced exam questions that weren’t relevant to the subject.
“In our Legal Studies there’s an emphasis on specificity within study designs.
“So, for higher order thinking you're only really going to get Discuss, Evaluate or Analyse. But [GenAI] will often want to bring in other kinds of task words or action words to questions. But we don’t use them in the exam!” the teacher noted.
“Also, everything we teach has to be very contextual to Victoria. For example, the principles of justice are defined in a particular way in Victoria that don’t include things such as rights like they do in lots of other countries.
“So, little things like that are a big barrier … and even when you try and teach [GenAI] to produce a study design relevant to the Victorian legal system it just doesn’t have the power to do that,” they added.
Teachers also described how something felt inherently ‘off’, inauthentic or sketchy about AI-produced content. One Australian teacher deemed it ‘flaky’.
“What it wrote seemed very stilted, it basically sounded like spam,” another noted.
Repeated prompting only resulted in outputs that were progressively more ‘generic’ and ‘flat’, or insufficiently ‘vivid’ and ‘lively’, the study found.
Many of the problems teachers ran into with AI content are unlikely to be resolved via more finely-tuned prompting, researchers say.
“Indeed, much of the teacher work highlighted in this paper was not simply focused on ‘double-checking’ GenAI outputs in terms of accuracy, veracity, hallucinations or biased results (although all these issues are clearly important).
“Rather this was often work related to anticipating how GenAI output would fare within the specific social contexts of a particular classroom or small group of students.
“This was work that drew on teachers’ prior experiences of ‘what works’ as well as localised judgements of what is appropriate, useful or simply ‘feels right’,” researchers declare.
Experienced teachers possess a ‘wisdom’ that is simply impossible to automate, the study concludes.
The findings run directly counter to industry and policy rhetoric that presents AI as a helpful and efficient tool for teachers.
“…GenAI tools were certainly not perceived as producing ‘accurate, high-quality content’ for teachers,” researchers flag.
“For sure, some of these perceived shortcomings of GenAI outputs might be ironed out with more pertinent and protracted prompting, yet this goes against the current dominant discourse of GenAI enhancing the ease, convenience, speed and all-knowingness of teachers’ work.”
Most teachers were well aware of the irony at play: they poured considerable time and effort into reworking AI content that was meant to save them precisely that.
“Re-inventing the wheel every time is shooting yourself in the foot,” one participant reflected.
Others seemed to justify their reworking efforts (albeit ‘flippantly’) by pointing to their professional mindset.
“I’m a bit of a control freak,” one said, while another suggested, “Some teachers have personalities where you want to have control over everything”.
“Underpinning these explanations, however, were heartfelt concerns with wanting to do a good job and ensuring that standards were not compromised,” the study notes.
In light of the findings, we need to “remain mindful of the long history of teachers being blamed for the failure of new technologies to take hold in classrooms – be it due to lack of skill and confidence, or a presumed conservative mindset”, researchers urge.
“Blaming teachers therefore remains a convenient way of distracting attention from the ‘continuing wasteful investment [in edtech], and more importantly, significant difficulties for teachers who try to fit their practice to technologists’ unrealistic aspirations.”
EducationHQ were told the research team will be offering media commentary from next year, once the project has progressed further.