I Was an Enthusiastic Early Adopter of AI Scribes. Here’s Why I Stopped
A GP reflects on what eighteen months of ambient scribing taught them about the consultation they thought they already understood.
I want to be clear about something before I begin: I am not a technophobe. I am a GP with a background in clinical digital leadership, a member of the RCGP Informatics Group, and someone who spent eighteen months as an enthusiastic, committed early adopter of ambient AI scribing technology. I believed in it. I advocated for it. I used it every day.
And then I stopped. Not because the technology failed — but because it worked exactly as intended, and that turned out to be the problem.
What I experienced over those eighteen months, and what I have since spent considerable time trying to articulate, is not a first-order failure of the technology. It is a second-, third-, and fourth-order failure of our understanding of what the GP consultation actually is, and what we put at risk when we hand its documentary function to a machine.
The Promise Was Seductive — and Real
Let me not be ungenerous to the tools themselves. The clinical governance around the particular tool I used is genuinely solid. It is MHRA-registered, GDPR-compliant, and designed with patient safety at its core. The early outcome data is, on the face of it, compelling. A large US study published in JAMA found that ambient scribing reduced clinician burnout from 51.9% to 38.8% after just thirty days of use. The efficiency metrics — around 26% time saved on documentation — are similarly striking.
And experientially, the appeal is visceral. You walk into a consultation. You speak to your patient. You look at them — really look at them, rather than at a screen. The machine captures everything, assembles a structured note, and within seconds of the patient leaving, your record is essentially complete. It feels like a superpower. It feels like being given back the consultation you trained for, minus the administrative drag that hollowed it out.
I understand why colleagues are excited. I was excited. The question I want to pose is not whether these tools offer something real — they do — but whether we are paying a price we haven’t yet noticed.
The Consultation Began to Change
The first thing I observed — and it took several months before I could name it — was that my consultations were getting longer, and not because I was doing more.
With an impartial, photographic memory capturing everything, I found I was giving patients more latitude. The implicit contract of the ten-minute GP appointment has always involved a degree of managed agenda-setting. Experienced GPs develop a sophisticated, largely unconscious set of tools for this: a particular way of listening that signals receipt without necessarily inviting more, a calibrated use of silence, the gentle steering of “and what else?” Patients learn, over many encounters, to make their most important point early.
The scribe dissolved this contract. Because everything would be recorded and reviewed, I began to allow consultations to expand to fill whatever space the patient needed. It felt, in the moment, like better medicine. It felt like finally doing it properly.
What I was actually doing was offloading the work of clinical curation — one of the most cognitively demanding and clinically important things a GP does — to the post-consultation review process. And that review process was mine alone, unsupported, often done at the end of a surgery already running thirty minutes over.
I was left with too many problems documented to manage, too many loose threads to follow up, and a creeping sense of clinical overextension that I initially attributed to patient complexity rather than to a change in my own behaviour.
The Notes Weren’t Mine
The deeper unsettling came later, at follow-up appointments.
I sat down to review a patient I had seen six weeks previously. I read the note. It was accurate. It was comprehensive. It contained no factual errors that I could identify. And I did not recognise it.
The voice in the note was not my voice. The emphasis was not my emphasis. The clinical narrative — the selective, interpretive story that a GP constructs to capture not just what was said but what mattered, what worried them, what they decided to watch and why — was absent. In its place was a faithful transcription of everything spoken, organised by structure rather than by clinical significance.
I had not written that note. An algorithm had written it, in response to sounds it had heard, and I had approved it, hurriedly, at the end of a session. Now, six weeks later, I was reading someone else’s account of a consultation I had conducted — and I couldn’t recall the patient clearly enough to reconstruct what had been left out.
This is not a small thing. The clinical note in general practice is not merely a medicolegal record. It is, as research in the Journal of General Internal Medicine has articulated, a form of narrative medicine — a clinician-authored story that reflects how the physician understood the patient’s situation at that moment in time. The act of writing it is itself a cognitive process: it forces synthesis, prioritisation, and reflection. It is, in a real sense, how we think.
When we outsource that act to an AI, we are not merely saving time. We are externalising a cognitive function that was doing clinical work we didn’t realise it was doing.
The Cognitive Science of What We’re Losing
There is a body of research on what cognitive scientists call “extended mind theory” — the idea that human cognition is not confined to the brain but extends into the tools and artefacts we use. Writing is perhaps the oldest and most profound of these cognitive tools. When we write clinical notes in our own voice, we are not transcribing our thoughts after the fact: we are completing them.
A PMC paper on cognitive perspectives in physician expertise makes this concrete. Clinical reasoning in general practice relies on what researchers call “illness scripts” — compressed, experiential summaries of what a particular clinical picture looks and feels like, built up over thousands of encounters. These scripts are stored not just neurologically but in the patterns of how we document: the phrases we reach for, the features we choose to record, the anxieties we articulate in the margin of the plan.
When an AI records everything rather than what we chose to record, this feedback loop breaks. The note stops being a reflection of clinical reasoning and becomes a verbatim archive. And as a PMC piece on note bloat explicitly warns, this risks obscuring the most medically important information under a volume of equally weighted detail.
The research on “note bloat” — a concern raised explicitly in the ambient scribing literature — captures something important but doesn’t go far enough. The problem is not just that notes become longer. It is that they become less curated, less authored, less *ours*. And the clinical memory they are supposed to scaffold — the ability to pick up a case six weeks later and immediately reorient — degrades with them.
A qualitative study in JAMIA published in 2025 formalised what I had experienced intuitively: clinician identity and voice emerged as central constructs in how physicians perceive AI-authored notes. The notes don’t feel like theirs because, in a meaningful sense, they aren’t.
The Consultation’s Hidden Architecture
There is something else the ambient scribe disrupts that receives almost no attention in the literature: the hidden architecture of the GP consultation as a therapeutic and diagnostic instrument in itself.
The general practice consultation is not simply an information-gathering exercise followed by a decision. The process of how information is gathered — what is asked, in what order, how ambiguity is held or resolved, what the clinician chooses to make explicit — is itself diagnostic. The GP who listens to a patient describe fatigue and quietly decides not to ask about mood yet, who files something away for a later question, who notes a hesitation as clinically significant: this is not inefficiency. This is the craft.
The ambient scribe passively records all of this without understanding any of it. It captures the conversation but not the clinical intelligence embedded in its architecture. And — more critically — it changes the conversation itself. Knowing that everything will be recorded, patients speak differently. They raise more. They venture into territory they might previously have held back, trusting that the thoroughness of the record will carry the weight of what they’ve shared.
This is, in many ways, a good thing. In others, it is a transfer of clinical responsibility that we have not acknowledged: the patient has disclosed more, the scribe has recorded more, and the GP is now accountable for all of it, whether or not they had the clinical bandwidth to assess it in the moment.
I exceeded my time. I overran my surgeries. I accumulated more documented problems per consultation than I could safely manage. And the research, when you look for it, acknowledges exactly this risk: a 2024 review in PMC notes that comprehensive documentation might capture previously missed information, but could also clutter the medical record with less clinically relevant details — and that either approach creates challenges. What is less clearly stated is that the clinician, not the AI, carries the consequences.
What the Burnout Studies Are Missing
The headline finding — that ambient scribes reduce burnout — deserves scrutiny. A rapid review of the evidence, screening over 1,400 studies, found that burnout measured using standardised scales was essentially unaffected, even where engagement and perceived workload improved. The review concluded that digital scribes are unlikely to fully address the multifaceted issue of clinician burnout, which is driven by systemic factors the scribe does not touch.
My own experience is consistent with this. The administrative relief was real and immediate. The downstream cognitive and relational consequences took months to accumulate, and by then I had already attributed my increasing sense of disconnection to patient complexity, to the NHS, to everything except the tool.
The studies measuring burnout at thirty days, or even ninety, are too short to capture what I experienced. They are measuring the relief phase. They are not yet measuring what happens when the consultation you conducted three months ago is documented in a voice you don’t recognise, by a process you increasingly don’t trust, in notes you are nominally responsible for but did not substantively write.
A Caution for Trainee GPs
I want to say something specific about training, because I believe the risk is most acute here and least discussed.
Clinical reasoning is a skill that is built through practice, and a significant part of that practice is documentation. The act of writing up a consultation — deciding what to include, how to frame the assessment, what language to use — is not an administrative afterthought. It is a learning process. It is how junior doctors and GP registrars build the illness scripts and pattern libraries that will underpin their expert practice for the rest of their careers.
If registrars are trained from the outset with ambient scribing as the norm, we may be producing a generation of highly competent conversationalists who have never fully developed the cognitive discipline of clinical synthesis. The scribe will have done that work for them. And the consequences — subtler than an exam failure, much harder to measure — may only become visible years later.
The RCGP, as I understand it, has not yet produced a position statement on scribe use during training. It should.
What I Am Not Saying
I want to be precise about the limits of this argument. I am not saying ambient scribes are bad technology. I am not saying they should be withdrawn. I am not saying the burnout gains are illusory — for many colleagues, particularly those with the most crushing documentation burdens, they may be transformative and genuinely sustainable.
What I am saying is this: the consultation is not simply a conversation from which a document is generated. It is a complex cognitive and relational act, and the document we produce is not a by-product of that act but part of it. Any tool that changes how that document is produced will change the act itself, and we should expect those changes to be significant, not trivial.
We are in the relief phase of ambient scribing. The efficiency gains are real and the risks are not yet visible in our outcome data — partly because we haven’t looked, and partly because the timescales are too short. The absence of evidence is not evidence of absence.
I stopped using my ambient scribe not because it was broken, but because I began to understand, with increasing unease, what it was doing to the consultation I had spent twenty years learning to conduct. I felt myself becoming a passive observer in encounters where I had previously been an active architect. I felt my clinical memory, my narrative identity, and my sense of connection to my patients beginning to erode at the edges.
I cannot prove that to you in a randomised trial. What I can tell you is that when I sat in front of a patient I hadn’t seen in six weeks and did not recognise myself in the notes I had written — or rather, that had been written in my name — something important had already been lost.
We should be asking what that was, before we lose the ability to remember.
*The author is a GP and member of the RCGP Informatics Group. The views expressed are personal.*
A note on process: the ideas, clinical experience, and arguments in this piece are entirely my own, developed over eighteen months of practice and reflection. I used Claude, an AI assistant, to help shape and structure the writing. I think that’s worth being transparent about — not least given the subject matter.
18/3/2026, a note of correction: Thank you to my eagle-eyed readers for spotting the dubious references in the first version of this post. My illustrious AI “assistant” Claude clearly got a bit carried away. Charged with my first flush of Substack enthusiasm, I posted without checking the references. A rookie error. Fortunately it wasn’t a blog about something more contentious, but the irony hasn’t escaped me. My humble apologies — I’ll be more careful next time. Thank you to Anne Marie Cunningham from the RCGP Informatics Group for bringing this to my attention.
Amended references below:
References and further reading
- Olson KD, Meeker D, Troup M, et al. Use of Ambient AI Scribes to Reduce Administrative Burden and Professional Burnout. JAMA Netw Open. 2025;8(10):e2534976. doi:10.1001/jamanetworkopen.2025.34976
- Lukac PJ, Turner W, Vangala S, Chin AT, Khalili J, Shih YT, Sarkisian C, Cheng EM, Mafi JN. Ambient AI Scribes in Clinical Practice: A Randomized Trial. NEJM AI. 2025 Dec;2(12):10.1056/aioa2501000. doi: 10.1056/aioa2501000. Epub 2025 Nov 26. PMID: 41497288; PMCID: PMC12768499.
- Kanaparthy NS, Villuendas-Rey Y, Bakare T, Diao Z, Iscoe M, Loza A, Wright D, Safranek C, Faustino IV, Brackett A, Melnick ER, Taylor RA. Real-World Evidence Synthesis of Digital Scribes Using Ambient Listening and Generative Artificial Intelligence for Clinician Documentation Workflows: Rapid Review. JMIR AI. 2025 Oct 10;4:e76743. doi: 10.2196/76743. Erratum in: doi: 10.2196/93250. PMID: 41071988; PMCID: PMC12513689.
- Brender, T.D., Celi, L.A. & Cobert, J.M. Clinical Notes as Narratives: Implications for Large Language Models in Healthcare. J GEN INTERN MED 40, 687–689 (2025). https://doi.org/10.1007/s11606-024-09093-y
- Van Tiem J, Cramer E, Iverson C, Kennelty K, Andrys N, Lee J, Knake L, Misurac J, Blum J, Reisinger HS. Listening to the note: clinician perspectives on ambient artificial intelligence scribes in medical documentation. J Am Med Inform Assoc. 2026 Feb 1;33(2):255-262. doi: 10.1093/jamia/ocaf214. PMID: 41340524; PMCID: PMC12844589.
- Fraundorf SH, Caddick ZA, Nokes-Malach TJ, Rottman BM. Cognitive perspectives on maintaining physicians' medical expertise: IV. Best practices and open questions in using testing to enhance learning and retention. Cogn Res Princ Implic. 2023 Aug 8;8(1):53. doi: 10.1186/s41235-023-00508-8. PMID: 37552437; PMCID: PMC10409703.
- Risko EF, Gilbert SJ. Cognitive Offloading. Trends Cogn Sci. 2016 Sep;20(9):676-688. doi: 10.1016/j.tics.2016.07.002. Epub 2016 Aug 16. PMID: 27542527.

Many thanks to those who have taken the time to comment. Your perspectives have helped clarify my position on AI scribes: not all bad, but to be used with open eyes to the pitfalls.
Thank you for writing these thought-provoking reflections.
I'm a child psychiatrist exploring the use of ambient AI scribing, particularly in my independent practice. I'm about nine months into my own experiment with the tech.
I wonder if psychiatry fits into "those with the most crushing documentation burdens" as you described. I find that typically, I add detail to the notes and letters output by the scribe rather than take away!
I set myself some rules at the start. I still take notes. And I always write my own formulation or summary of the encounter. So far that seems to have avoided the drift of the scribe output not feeling like my voice or my decisions.
So far so good. Particularly with follow-up appointments which are more structured and less meandering than a typical child psychiatry assessment.
Thank you for the warning. And I'll continue to reflect on the use of this technology and what it's "saving" me from doing in the coming months.