Matt Shumer’s “Something Big Is Happening” has gone viral – and understandably so. His core observation is hard to argue with: AI capability – most notably in coding, but in many other fields too – has crossed a threshold. The acceleration is observable and, for many practitioners, already reshaping daily work: they now find themselves guiding work that, six months ago, they would have been doing themselves.
He doesn’t say much about research. There might be a reason for that. Research lacks objective outputs to test against; evidence is inherently contested, differing interpretations are unavoidable, and judgement calls are entangled with politics and power.

But something is definitely happening with AI-assisted research – I want to describe some of my own recent experiments with AI-assisted practice, where my role as researcher is changing, and why I’m simultaneously excited and worried about what it may mean for my work.
An illustrative walkthrough – AI-assisted rapid literature review
Landscape analyses for strategy research in international development are common. It is worth being clear that I am not (yet, anyway) discussing robust and rigorous academic literature reviews, but the sort of snapshots that kick off most UN/NGO-type research to ground everyone in the current state of play for the topic being researched.
I have seen the value and role of AI tools literally transform in the last 3-6 months across every stage:
Sourcing: The first time I tried using AI to generate sources for a rapid lit review, it helped – a little. But the results were patchy – at least 20% of its references didn’t even exist, and it took significant time to review, validate and plug the gaps in the rest.
When I ran the same experiment last week, entirely invented references were close to zero. The results are now solid enough to be usable. They still needed validation and review of course – whittling its ~80 or so suggestions down to the 30-40 that are genuinely relevant, and screening out vendor promotion, one-page marketing pieces, inaccessible paywalled sources and the entirely tangential. What counts as relevant evidence remains a hugely important and human analytical judgement.
Sourcing that took me 3-4 days before is now down to roughly half a day.
Synthesis & drafting: A few months back, this was still a 90% manual process – sure, I’d use AI tools to explore questions, critique drafts, and write first drafts of specific extracts – but if I tried to get it to do more, the results were so poor that it probably took more time to review and correct them than to just do it myself.
Now, I can get a pretty decent first (internal-use-only) synthesis of a large corpus in a day (maybe a day more if I need to familiarise myself with the subject matter first, which AI also helps with), and a good-enough client-facing draft in another day or two. This was more like 2-3 weeks in the past – even at the ‘non-academic’ level of rigour I am working within. Why the difference?
The latest frontier models really are just way better – I won’t lie, probably half the improvement is that simple (the results from Perplexity now that it has Claude Opus 4.6 under the hood are incomparable to its results before) – but the rest comes from a more reflective way of using these models.
- Research needs complex hybrid workflows
I took the time to design a careful step-by-step workflow with researcher intervention at numerous points: to guide, to flag gaps, to add my insights, to inject domain-specific interpretation, to warn off dead-ends and so on – what I’ve been referring to as HGTL rather than HITL (Human Guiding the Loop, not Human In the Loop, which I always felt downplayed the role of the human!). The difference is not just semantics – the human is structuring epistemic boundaries, not just approving outputs. Guiding the loop is a meaningful difference from how things seem to be developing in other types of work. Domain and methodology knowledge remain crucial and are what hold the process together. Without them, flawed studies get accepted as authoritative, personal blogs risk being treated as robust evidence, and poorly designed studies or vendor claims masquerade as research. To be fair, junior researchers often make similar mistakes – my role now feels closer to a traditional oversight role than the ‘OMG what is this mess I need to clean up’ of the not-so-distant past!
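To make this concrete, here is a rough sketch of the shape I mean. It is purely illustrative: the stage names and the run_model placeholder stand in for whatever tools and steps you actually use rather than my real pipeline, but it shows where the human sits, which is between every stage, not just at the end.

```python
# A minimal sketch of a "Human Guiding the Loop" workflow.
# Everything here is illustrative: run_model() stands in for whichever
# frontier-model API or tool you actually use, and the stage list is an
# assumption for the example, not a real research pipeline.

from typing import Callable

def human_checkpoint(stage: str, draft: str) -> str:
    """Pause the pipeline so the researcher can edit, flag gaps, or redirect."""
    print(f"\n--- {stage} output for review ---\n{draft}\n")
    guidance = input("Add guidance / corrections (blank to accept as-is): ")
    return draft if not guidance else f"{draft}\n\n[Researcher guidance: {guidance}]"

def hgtl_review(corpus_summary: str, run_model: Callable[[str], str]) -> str:
    stages = [
        ("Scoping",   "List the key themes and obvious gaps in this corpus:\n"),
        ("Synthesis", "Synthesise the evidence, noting disagreements and weak sources:\n"),
        ("Drafting",  "Draft an internal-use summary, flagging low-confidence claims:\n"),
    ]
    working_text = corpus_summary
    for stage_name, prompt in stages:
        draft = run_model(prompt + working_text)
        # The researcher guides at every stage, not just at final sign-off.
        working_text = human_checkpoint(stage_name, draft)
    return working_text
```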
- Model triangulation and comparison
Crucially, I learned to build triangulation into my workflow. This is a core safeguard and probably the main reason that early AI-produced syntheses are becoming usable. Basically, I run the same prompts on the same corpus through two different frontier models and then get AI to compare the outputs before it does anything else with them. Each model fails differently: one produces narrative completeness but irons out disagreement; another surfaces tensions well but has significant gaps in what it covers. This structured comparison stage seems to force the divergence to be taken into account when a model is later asked to synthesise the two, giving markedly better initial results.
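For anyone who wants to picture the mechanics, a simplified sketch is below. The call_model_a and call_model_b functions are placeholders for whichever two frontier models you have access to, and the prompts are illustrative rather than my exact wording.

```python
# A minimal sketch of the triangulation step, assuming you can wrap each of your
# two chosen models in a simple text-in / text-out function. No specific vendor
# API is implied; call_model_a and call_model_b are placeholders.

from typing import Callable

def triangulate(prompt: str,
                corpus: str,
                call_model_a: Callable[[str], str],
                call_model_b: Callable[[str], str]) -> str:
    # 1. Run the identical prompt over the identical corpus through both models.
    output_a = call_model_a(f"{prompt}\n\nCORPUS:\n{corpus}")
    output_b = call_model_b(f"{prompt}\n\nCORPUS:\n{corpus}")

    # 2. Ask a model to compare the two outputs BEFORE any synthesis,
    #    so the divergences are made explicit rather than averaged away.
    comparison = call_model_a(
        "Compare these two syntheses of the same corpus. List the points where they "
        "disagree, where one covers material the other omits, and where either "
        "smooths over tensions in the evidence.\n\n"
        f"SYNTHESIS A:\n{output_a}\n\nSYNTHESIS B:\n{output_b}"
    )

    # 3. Only then ask for a combined synthesis that must address each divergence.
    return call_model_a(
        "Produce a single synthesis that explicitly resolves or flags each of the "
        f"divergences below.\n\nDIVERGENCES:\n{comparison}\n\n"
        f"SYNTHESIS A:\n{output_a}\n\nSYNTHESIS B:\n{output_b}"
    )
```

The design choice that matters is that the comparison is its own explicit step, so neither model gets to quietly average the disagreements away before synthesis.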
I find AI now behaves less like a fabricator or an over-eager helper, and more like an enthusiastic junior researcher. It still surfaces some irrelevant, weak, or tangential content; it misses some issues around methodological robustness; mistakes still filter through into drafts – but my filtering and review has shifted from fiction and damage control to improving relevance and quality. That is huge. It does not eliminate the need for someone who knows the terrain, but it changes the nature of that person’s role in meaningful ways.
What does this mean for research as a role?
For me personally, the task-level gains are real and substantive, reducing the time I need to reach key milestones and making the type of agile research I’ve always tried to do a reality. Why not do some analysis on 3-4 areas and then decide which to focus on, if each can be done in a few days? Before, this was a nice idea, but each area would have needed weeks, so it rarely happened.
But there is strong emerging evidence suggesting that task-level efficiency does not necessarily translate into job-level or system-level productivity, and sometimes quite the opposite: review burdens shift to senior staff, expectations increase, and workload becomes increasingly mismatched with pay. It is definitely not as simple as saying “AI makes researchers’ jobs faster or better”.
There is also the meta-level worry about what this new way of working may mean from a neuroscience perspective. There is a growing body of research on the idea that AI use can lead to cognitive decline. Might the very aspects I am increasingly delegating to AI (the sourcing, the initial direct engagement with the material, the pattern identification etc) also be the pieces that are keeping my brain working? What gets lost when the slow work of reading widely, noticing what doesn’t fit, and scribbling notes in the margins is reduced? Might the researcher’s instinct for weak evidence start to stagnate if it is not exercised?
Or maybe the opposite is true: maybe by operating at a higher level our brains will become better at spotting patterns, and maybe genuine back-and-forth exchanges with an AI about the material in a corpus will prove better for our cognition than reading it alone.
I don’t know. And research is such a different space to the areas being studied (coding, finance, admin, law etc) that it may be a long time until anyone does know.
Power dynamics – a thorn in AI’s side or can it help us see them?
Most of my work involves thinking about power structures, dynamics, locally-led approaches, Southern voices etc.
How well did this approach cope with these additional complications?
It is well documented that AI tends to reproduce structural power biases – training data reflects existing hierarchies of voice, funding, and institutional authority; global South research, practitioner knowledge, community perspectives are systematically under-represented.
I’ve found that – with the latest models – explicitly building this lens into every prompt helps more than I expected and more than I’ve seen before. Obviously it doesn’t solve it – the core issue is the training data – but it definitely mitigates it to an extent.
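For illustration, the kind of thing I mean is a standing preamble attached to every prompt. The wording below is a simplified, made-up example rather than my actual text, and the model call itself is left out.

```python
# Illustrative only: a standing "power lens" preamble prepended to every prompt.
# The wording is a simplified example, not an exact or recommended formulation.

POWER_LENS_PREAMBLE = (
    "When reviewing this material, explicitly consider power dynamics: "
    "which institutional voices dominate, which Global South, practitioner, or "
    "community perspectives are missing or under-cited, and whose framing is "
    "treated as the default. Flag these gaps rather than smoothing over them.\n\n"
)

def with_power_lens(prompt: str) -> str:
    """Prepend the power-dynamics lens to any prompt before sending it to a model."""
    return POWER_LENS_PREAMBLE + prompt
```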
At the same time, I found that deliberately asking the AI tools about power dynamics in a literature base can actually help me surface things I might not have noticed – which institutional voices are over-represented, which geographies or methodological traditions are missing. It is not perfect, but the trade-off is now real: where 6 months ago I would simply assume that all AI-produced interim outputs would be missing this power lens, now I can treat the AI more as a thought partner in identifying power dynamics (tempered always with the knowledge that its own data and algorithms are also a major part of the problem!).
And of course, the truly interesting part of power dynamics – not just surfacing them, but the interpretive step – reframing, choosing what to foreground, what to recommend etc – stays entirely with me as the human researcher.
Is Shumer right? Is AI coming for research too?
Something big is definitely happening in research. As Shumer notes for coding and other areas, the recent acceleration is real and observable, and in the next six months it is likely to get faster still.
But research does seem slower, trailing behind areas like coding – likely because its outputs are subjective and it lacks the tight feedback loops that much of his argument depends on.
The trajectory I see is toward increasing delegation of stages, not toward fully autonomous research. Reconfiguration, not replacement. For now anyway.
What do I think of this direction of travel? I’m not yet sure. This article is an observation of what is already happening, not a comment on whether that is good or bad!
For me personally, efficiency gains are real and transformative at the task level. What this means at the job role or system level is yet to be seen.
If it turns out that the stages being delegated are the very stages that build the judgement to do the work well, then the question is not whether AI can do more of the research but whether we still have the expertise to know when it has done it badly.
A few links for context:
1. Shumer’s original piece in case you are living under a rock and missed it: https://shumer.dev/something-big-is-happening
2. Harvard piece on potential negative impact of AI on job efficiency and workload: https://hbr.org/2026/02/ai-doesnt-reduce-work-it-intensifies-it
3. Summary of the state of research on links between AI usage and cognitive decline: https://www.ie.edu/center-for-health-and-well-being/blog/ais-cognitive-implications-the-decline-of-our-thinking-skills/