AI in Schools: Pilot Before Scale — Personalised Tutoring or Erosion of Core Skills?

AI can scale personalised support, or it can quietly hollow out skills; which path will schools choose? A National Education Union (NEU) poll of 9,000 state school teachers shows teachers already use AI for planning and admin, even as many warn that widespread pupil use of ChatGPT-style tools and voice-to-text is changing how children learn.

A classroom vignette: voice-to-text meets spelling tests

Picture a Year 6 pupil named Amina. She uses voice-to-text to capture ideas quickly for a homework essay, and her teacher praises the richer content. But two weeks later she struggles in a spelling test, and her handwriting practice looks rushed. The convenience of voice-to-text improved the output, not the underlying skill.

That quick scenario captures why the debate matters: AI can enable personalised learning and free teacher time, but it can also let students bypass practice that builds durable skills like spelling, structured argument and face-to-face discussion.

Why teachers are worried

Teachers are a study in contradiction. Sixty-six percent of respondents to the NEU poll reported noticing declines in pupils’ critical thinking that they link to AI use. At the same time, teacher adoption of AI for day-to-day work has jumped: 76% now use AI for tasks such as creating resources, planning lessons and administration (up from 53% last year). Among primary teachers, uses include resource creation (61%), lesson planning (41%), admin (38%) and marking (7%).

Common teacher concerns fall into four clusters:

  • Academic skills: teachers report weaker critical thinking, creativity, writing and conversational ability when students rely on AI-generated answers.
  • Integrity: AI makes cheating easier and more tempting, challenging assessment models.
  • Social development: disadvantaged pupils may miss out on the motivational and interpersonal benefits of human tutoring.
  • Governance and competence: many schools lack policies and staff training, so AI is often used poorly, producing low-quality learning experiences.

“Voice-to-text has reduced pupils’ incentive to learn to spell,” one teacher observed.

“AI is destroying the core purposes of learning: problem-solving, critical thinking and collaborative effort,” warned another teacher.

Daniel Kebede, NEU general secretary: “Students must retain the ability to think for themselves, and rolling out AI tutoring before understanding the impacts is risky.”

Where policy stands

The UK government proposes developing AI tutoring tools for up to 450,000 disadvantaged pupils, framing the move as a way to expand personalised instruction. Education Secretary Bridget Phillipson and the schools white paper position AI as part of a broader digital education drive with promises of guidance on safe, responsible use.

But teachers are sceptical: 49% oppose the government’s AI tutoring plan, while only 14% support it. Policy gaps are tangible: 49% of schools reported having no AI policy for staff, and 66% have no pupil-specific AI policy at all.

What makes generative AI different from past tech?

Calculators and the web changed what students could do; generative AI automates cognitive output. ChatGPT and other AI agents can draft essays, explain concepts, and simulate dialogues. That power makes AI a tool for personalised practice and tutoring — but it also creates a risk: outputs can look polished even when students haven’t internalised the skills. Generative AI shifts some learning tasks from “practice to master” to “prompt to collect.”

Two deployment paths — choose your future

Deployment choices will decide whether AI narrows or widens learning differences. One path: disciplined, evidence-driven pilots plus teacher upskilling that let AI amplify instruction, widen access to personalised practice and free teachers for higher-value coaching. The other path: rapid scale without evaluation, weak governance, and tech-driven shortcuts that hollow out core skills and substitute algorithmic interaction for human support.

Design pilots that answer the real questions

Before scale, schools should run robust trials. A practical 12-week pilot design checklist:

  • Define clear objectives: specify targeted learning outcomes (e.g., improved paragraph structure, not just higher essay scores).
  • Control and comparison: include a control group and randomised assignment where possible to measure causal impact.
  • Mixed-method metrics: combine test scores with socio-emotional measures, skill retention after 3–6 months, incidence of detected AI-assisted cheating, teacher time saved, and parent/pupil satisfaction.
  • Sample size & duration: aim for statistically meaningful cohorts (sample sizes depend on expected effect sizes) and run for at least one term.
  • Ethical safeguards: pupil consent, opt-out options, data minimisation and age-appropriate transparency.
  • Teacher-in-the-loop: teachers must be active partners, trained to integrate AI into formative assessment rather than outsourcing judgement.
  • Evaluation plan: pre-register metrics, appoint an independent evaluator, and publish results—positive or negative.
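The checklist’s point that “sample sizes depend on expected effect sizes” can be made concrete with a standard two-sample power calculation. The sketch below is illustrative, not a prescription: it assumes a simple two-arm pilot comparing mean outcome scores and uses the usual normal-approximation formula, with `pupils_per_arm` as a hypothetical helper name.

```python
from math import ceil
from statistics import NormalDist

def pupils_per_arm(effect_size: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate pupils needed per arm of a two-group pilot.

    Normal-approximation formula for a two-sided, two-sample comparison
    of means: n = 2 * ((z_{1-alpha/2} + z_{power}) / d)^2, where d is the
    expected effect size in Cohen's d units.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    return ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)

# A "medium" expected effect (d = 0.5) needs roughly 63 pupils per arm;
# a more realistic "small" tutoring effect (d = 0.2) needs far more.
for d in (0.2, 0.5, 0.8):
    print(d, pupils_per_arm(d))
```

Edtech trials often find small effects, so a pilot powered only for a medium effect can miss a real but modest benefit. That is why the checklist asks for cohorts sized against the expected effect, not a convenient class or two.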

Teacher training: a practical micro-course

A short, practical teacher-training module (one half-day + two in-class coaching sessions) might include:

  • Learning objectives: prompt design, verifying AI outputs, spotting hallucinations, integrating AI into formative feedback, and ethical use policies.
  • Format: 3-hour workshop (hands-on prompts with ChatGPT-style tools), two 60-minute in-class coaching cycles with peer observation, and a one-page reference cheat-sheet.
  • Assessment: teachers submit a lesson plan using AI and receive targeted coaching on pedagogic alignment.

Vendor evaluation checklist for procurement teams

  • Who owns student data and how long is it retained?
  • Has the model been tested for bias across socio-economic and linguistic groups?
  • How transparent and explainable are the tool’s suggestions?
  • Can the tool integrate with your LMS and operate offline or in restricted-network environments?
  • What are the support and training commitments, and are they included in cost-per-student?
  • Are there contractual clauses for model updates, audits, and incident response?
  • Does the vendor provide results from independent evaluations or peer-reviewed studies?

What AI in schools means for business leaders

Procurement and product leaders should treat school rollouts as experiments, not commodities. The temptation to scale quickly — to capture market share or claim impact — risks reputational and financial costs if products underdeliver or exacerbate inequities. Recommendations for business leaders:

  • Require pilot evidence on learning outcomes before wide deployment.
  • Build teacher training into the product package and fund initial coaching cycles.
  • Include data privacy, explainability and bias-testing clauses in contracts.
  • Measure long-term retention of skills, not just immediate score gains.

Questions leaders should ask (and short answers)

Who owns the student data, and how long is it stored?

Ensure contracts specify data ownership, retention limits, and deletion processes. Prefer minimal retention and local-only storage where feasible.

Has the tool been tested against bias and across different pupil groups?

Demand independent bias audits and subgroup performance reports (e.g., EAL pupils, SEND, socio-economic status).

Can teachers see how the AI arrived at an answer?

Look for transparency features (explanations, source citations) and controls that let teachers override or edit outputs.

What training and ongoing support do you provide?

Vendors should include onboarding, classroom coaching, and refresher sessions as part of the SLA, not as an expensive add-on.

How will this impact assessment integrity?

Agree a plan for assessment design changes, detection strategies, and formative assessment that relies on teacher judgement, not solely on AI-checked outputs.

Final recommendation

Don’t buy scale before you can measure it. Pilot with clear objectives, protect pupil data and equity, and invest in teacher training so AI augments pedagogic craft rather than substitutes for it. Deployment choices now will determine whether AI narrows learning differences or quietly erodes the skills schools aim to build. Pilot, measure, iterate — and govern with care.