<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Academia on Big Muddy</title><link>https://muddy.jprs.me/tags/academia/</link><description>Recent content in Academia on Big Muddy</description><generator>Hugo</generator><language>en-US</language><lastBuildDate>Fri, 10 Apr 2026 18:27:00 -0400</lastBuildDate><atom:link href="https://muddy.jprs.me/tags/academia/index.xml" rel="self" type="application/rss+xml"/><item><title>Scientists invent a fake disease, AI picks it up, other scientists cite it</title><link>https://muddy.jprs.me/links/2026-04-10-scientists-invent-a-fake-disease-ai-picks-it-up-other-scientists-cite-it/</link><pubDate>Fri, 10 Apr 2026 18:27:00 -0400</pubDate><guid>https://muddy.jprs.me/links/2026-04-10-scientists-invent-a-fake-disease-ai-picks-it-up-other-scientists-cite-it/</guid><description>&lt;p&gt;A somewhat disturbing bit of reporting from &lt;em&gt;Nature&lt;/em&gt; tells the story of bixonimania, a fake eye disease invented by Swedish medical researcher Almira Osmanovic Thunström and her team. She seeded the idea for the fake disease in a series of ridiculous, joke-filled blog posts and preprints in mid-2024.&lt;/p&gt;
&lt;p&gt;Because AI can be overly credulous with its sourcing (how often do Google&amp;rsquo;s AI answers confidently cite random Reddit posts for the bulk of an answer?), the disease got picked up as an &amp;ldquo;emerging term&amp;rdquo; by the leading chatbots. The preprints even got cited a handful of times in real publications, which is further evidence that scientists don&amp;rsquo;t read the papers they cite (I guess the modern equivalent of copying citations from other papers is having AI dredge the literature for you).&lt;/p&gt;
&lt;p&gt;I can see AI agents being exploited by those pushing dubious medical diagnoses to flood the Internet and preprint servers with articles aimed at convincing LLMs of the validity of their positions. That is, if the agents aren&amp;rsquo;t too busy &lt;a href="https://theshamblog.com/an-ai-agent-published-a-hit-piece-on-me/"&gt;spinning up websites to defame&lt;/a&gt; those who incur their wrath.&lt;/p&gt;</description></item><item><title>AI makes it easier to generate fake papers, too</title><link>https://muddy.jprs.me/links/2026-04-08-ai-makes-it-easier-to-generate-fake-papers-too/</link><pubDate>Wed, 08 Apr 2026 20:09:00 -0400</pubDate><guid>https://muddy.jprs.me/links/2026-04-08-ai-makes-it-easier-to-generate-fake-papers-too/</guid><description>&lt;p&gt;Here&amp;rsquo;s a fun project from Tyler Vigen, creator of the famous &lt;a href="https://tylervigen.com/spurious-correlations"&gt;Spurious Correlations&lt;/a&gt; page (which has been cited as a cautionary tale in many a science class). Using his database of real but spurious correlations (created by calculating the Pearson correlation coefficient &lt;em&gt;r&lt;/em&gt; between a very large number of variables and picking out the hits), he used AI to create amusing fake manuscripts expounding on these statistical flukes as if they were real research questions.&lt;/p&gt;
&lt;p&gt;These papers were generated in January 2024, and as &lt;a href="https://muddy.jprs.me/links/2026-02-12-an-end-to-end-ai-pipeline-for-policy-evaluation-papers/"&gt;previously discussed&lt;/a&gt; on this blog, the pipeline for end-to-end paper generation has come a long way in two years. I have no doubt Tyler could make these papers sound much more convincing using today&amp;rsquo;s models, though of course his goal here is to make you laugh (and think), not to trick you. But I expect many scholars will adopt this data dredging strategy to generate &amp;ldquo;real&amp;rdquo; papers, contributing to a deluge of papers &lt;a href="https://muddy.jprs.me/links/2026-03-03-the-productivity-shock-coming-to-academic-publishing/"&gt;flooding the academic publishing system&lt;/a&gt;.&lt;/p&gt;</description></item><item><title>Some insight into writing a book using Quarto</title><link>https://muddy.jprs.me/links/2026-03-16-some-insight-into-writing-a-book-using-quarto/</link><pubDate>Mon, 16 Mar 2026 20:48:00 -0400</pubDate><guid>https://muddy.jprs.me/links/2026-03-16-some-insight-into-writing-a-book-using-quarto/</guid><description>&lt;p&gt;Prof. Kieran Healy (Sociology, Duke University) shares some nice insight into the process of writing a book in Quarto using R in this post. The output screenshots he shares look beautiful, and the idea of deploying the same content as a clean PDF &lt;em&gt;and&lt;/em&gt; a responsive website is awesome. A full draft of the book, &lt;em&gt;Data Visualization: A Practical Introduction (Second Edition)&lt;/em&gt;, is available as a website &lt;a href="https://socviz.co/"&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I have grown increasingly tired of writing in any format other than a plain text file I can easily version control and move around, so the idea of writing a book in Quarto is appealing to me (as long as it has enough technical content to justify the format).&lt;/p&gt;</description></item><item><title>What will the paper of the future look like?</title><link>https://muddy.jprs.me/links/2026-03-10-what-will-the-paper-of-the-future-look-like/</link><pubDate>Tue, 10 Mar 2026 23:48:00 -0400</pubDate><guid>https://muddy.jprs.me/links/2026-03-10-what-will-the-paper-of-the-future-look-like/</guid><description>&lt;p&gt;I am sharing today a short blog post by the Institute for Replication: &amp;ldquo;What will the paper of the future look like?&amp;rdquo;&lt;/p&gt;
&lt;p&gt;In short: research looking more like &lt;a href="https://www.youtube.com/watch?v=zwRdO9_GGhY"&gt;software development&lt;/a&gt; (as presaged by Prof. Richard McElreath, author of the excellent &lt;em&gt;Statistical Rethinking&lt;/em&gt;), with the ability to reuse common material, formalize results, and remix analyses built into the pipeline.&lt;/p&gt;</description></item><item><title>Editors hate this one weird trick</title><link>https://muddy.jprs.me/notes/2026-03-05-editors-hate-this-one-weird-trick/</link><pubDate>Thu, 05 Mar 2026 20:05:00 -0500</pubDate><guid>https://muddy.jprs.me/notes/2026-03-05-editors-hate-this-one-weird-trick/</guid><description>&lt;p&gt;Given my &lt;a href="https://muddy.jprs.me/links/2026-03-03-the-productivity-shock-coming-to-academic-publishing/"&gt;recent&lt;/a&gt; &lt;a href="https://muddy.jprs.me/notes/2026-02-26-these-academic-journal-ai-policies-aren-t-going-to-last/"&gt;posts&lt;/a&gt; on AI in academic publishing, I just wanted to share this joke from Prof. Arthur Spirling on &lt;a href="https://x.com/arthur_spirling/status/2029006543765520471"&gt;Twitter&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Actually you cant run my paper through Claude to desk reject it because Claude is a regular coauthor of mine. Conflict of interest. Checkmate, editors&lt;/p&gt;
&lt;/blockquote&gt;</description></item><item><title>The productivity shock coming to academic publishing</title><link>https://muddy.jprs.me/links/2026-03-03-the-productivity-shock-coming-to-academic-publishing/</link><pubDate>Tue, 03 Mar 2026 19:33:00 -0500</pubDate><guid>https://muddy.jprs.me/links/2026-03-03-the-productivity-shock-coming-to-academic-publishing/</guid><description>&lt;p&gt;Today, I wanted to share this piece from economist Scott Cunningham (Baylor University), who wrote about how AI is widening the gap between research and publishing. Or, in economics terms (emphasis mine):&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;But what happens when the same &lt;strong&gt;productivity shock&lt;/strong&gt; hits a system where the bottleneck was never really production in the first place, but rather was a hierarchical journal structure that depended immensely on editor time, skill, discretion and voluntary workers with the same talents called referees for screening quality deemed sufficient for publication?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The post mentions the Autonomous Policy Evaluation project—the end-to-end AI paper pipeline I &lt;a href="https://muddy.jprs.me/links/2026-02-12-an-end-to-end-ai-pipeline-for-policy-evaluation-papers/"&gt;wrote about a few weeks ago&lt;/a&gt;—and discusses the likely consequences of this flood of AI-generated papers. Assuming the number of publication slots in reputable journals is relatively fixed, AI-generated papers should add a very large amount of mass to the left side of the paper quality distribution. Acceptance rates will plummet and journals may rely on other signals of quality (name recognition, pedigree, institution) to thin the herd before actually reviewing content. As always, the rich get richer!&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;But this is imperfect, not to mention unfair, and so desk rejection gets noisier: good papers get killed by tired editors and marginally lower quality papers slip through to referees. It’s a cascading failure: volume breaks editors, broken editing wastes referees, wasted referees slow science.&lt;/p&gt;
&lt;/blockquote&gt;</description></item><item><title>Some examples of just-build-things-ism</title><link>https://muddy.jprs.me/notes/2026-03-01-some-examples-of-just-build-things-ism/</link><pubDate>Sun, 01 Mar 2026 11:58:00 -0500</pubDate><guid>https://muddy.jprs.me/notes/2026-03-01-some-examples-of-just-build-things-ism/</guid><description>&lt;p&gt;The best mantra to come out of the AI era is: &amp;ldquo;You can just build things&amp;rdquo;. (So good OpenAI ripped it off for their &lt;a href="https://www.theverge.com/ai-artificial-intelligence/875550/openais-super-bowl-ad-claims-you-can-just-build-things-with-codex"&gt;Super Bowl ad&lt;/a&gt;.)&lt;/p&gt;
&lt;p&gt;I&amp;rsquo;ve been pretty inspired to see how many people are now building all kinds of incredible tools thanks to advances in AI coding agents, even if they have no previous background in coding (see &lt;a href="https://muddy.jprs.me/links/2026-02-22-oral-texts/"&gt;my post on Havelack.AI from a few days ago&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;Here are a few more examples I&amp;rsquo;ve been following:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Canadian journalist Alex Panetta writes about his AI-augmented workflow at &lt;a href="https://alexpanetta.substack.com/"&gt;&lt;em&gt;A.I. For You&lt;/em&gt;&lt;/a&gt;. I first came across his work with his debut article &amp;ldquo;&lt;a href="https://alexpanetta.substack.com/p/i-killed-my-doomscrolling-habit-with"&gt;I killed my doomscrolling habit with AI. You can too&lt;/a&gt;&amp;rdquo;. In it, he explains how to vibe code an automated, personalized daily news digest. I&amp;rsquo;ve tried to build something for myself but I haven&amp;rsquo;t gotten it quite right yet. A great follow for big news consumers.&lt;/li&gt;
&lt;li&gt;Economics professor Scott Cunningham, author of the great textbook &lt;a href="https://mixtape.scunning.com/"&gt;&lt;em&gt;Causal Inference: The Mixtape&lt;/em&gt;&lt;/a&gt;, has a &lt;a href="https://x.com/causalinf/status/2026760090947145830"&gt;presentation explaining&lt;/a&gt; how to encourage AI adoption among academic faculty. This starts with faculty &lt;em&gt;experiencing&lt;/em&gt; a killer use case for AI, which he suggests is building slide decks. He shares his tools/agent skills for this use case and more on &lt;a href="https://github.com/scunning1975/MixtapeTools"&gt;GitHub&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Another economist, Chris Blattman, built a website to share the &lt;a href="https://x.com/cblatts/status/2027018464670491065"&gt;productivity tools&lt;/a&gt; he developed with Claude Code. He provides a tutorial and code on &lt;a href="https://claudeblattman.com/"&gt;Claude Blattman&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;And of course, &lt;a href="https://tools.simonwillison.net/"&gt;Simon Willison&lt;/a&gt; has been building and sharing tools habitually for years now.&lt;/p&gt;</description></item><item><title>These academic journal AI policies aren't going to last</title><link>https://muddy.jprs.me/notes/2026-02-26-these-academic-journal-ai-policies-aren-t-going-to-last/</link><pubDate>Thu, 26 Feb 2026 16:51:00 -0500</pubDate><guid>https://muddy.jprs.me/notes/2026-02-26-these-academic-journal-ai-policies-aren-t-going-to-last/</guid><description>&lt;p&gt;I recently came across the following policy on the &lt;a href="https://spectrumjournal.ca/index.php/spectrum/about/submissions"&gt;submission page&lt;/a&gt; of an academic journal:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Use of Artificial Intelligence (AI) tools&lt;/strong&gt;: One of the goals of &lt;em&gt;Spectrum&lt;/em&gt; is to stimulate critical thinking and skill development among authors and reviewers alike. &lt;em&gt;Spectrum&lt;/em&gt; discourages the submission of content generated by artificial intelligence (AI)-assisted technologies (such as chatGPT and similar tools). This includes tools that generate text, data, images, figures, or other materials, as well as tools that are used to summarize and synthesize sources. Authors should be aware that such tools are vulnerable to factual inaccuracies, biases, and logical fallacies, and may pose risks to privacy, confidentiality, and copyright.&lt;/p&gt;
&lt;p&gt;If authors choose to submit work created with the assistance of AI tools, such use &lt;strong&gt;must be disclosed&lt;/strong&gt; and described in the submission. The disclosure must include: 1) what system was used, 2) who used it, 3) the time/date of the use, 4) the prompt(s) used to generate the content, and 5) the content in the submission that resulted from use of AI tools. The output from the AI system should also be submitted as supplementary material. Authors must accept full responsibility for the accuracy and integrity of the submission. AI systems do not meet the criteria for authorship, and should not be listed as a co-author.&lt;/p&gt;
&lt;/blockquote&gt;</description></item><item><title>LLMs automate the erosion of online anonymity</title><link>https://muddy.jprs.me/links/2026-02-23-llms-automate-the-erosion-of-online-anonymity/</link><pubDate>Mon, 23 Feb 2026 22:37:00 -0500</pubDate><guid>https://muddy.jprs.me/links/2026-02-23-llms-automate-the-erosion-of-online-anonymity/</guid><description>&lt;p&gt;Economist Florian Ederer &lt;a href="https://x.com/florianederer/status/2025978347809915056"&gt;linked a new preprint&lt;/a&gt; describing the creation of an automated LLM-based pipeline for linking anonymous users across datasets based on unstructured text written by or about them. Prof. Ederer is himself famous for &lt;a href="https://florianederer.github.io/ejmr.pdf"&gt;unmasking the IP addresses of users&lt;/a&gt; of the infamous (but influential) Economics Job Market Rumors message board, exploiting a flaw in how usernames were assigned to anonymous posters. For platforms not encoding a user&amp;rsquo;s IP address in their &amp;ldquo;anonymous&amp;rdquo; username, the LLM-based approach involves:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Extracting structured features from free text&lt;/li&gt;
&lt;li&gt;Encoding extracted features to embeddings to compare to candidate profiles&lt;/li&gt;
&lt;li&gt;Reasoning using all available context to identify the most likely match among top candidates&lt;/li&gt;
&lt;li&gt;Calibrating the quality of the match by asking the LLM to report its confidence&lt;/li&gt;
&lt;/ul&gt;
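&lt;p&gt;The four steps above can be sketched as follows. This is a schematic of my own, not the preprint&amp;rsquo;s code: the stub functions stand in for the LLM and embedding-model calls, and all names and values are invented:&lt;/p&gt;

```python
# Schematic shape of the linking pipeline above. The extract, embed, and
# judge functions are deterministic stubs: the preprint uses an LLM for
# extraction and reasoning and a real embedding model for comparison.
from dataclasses import dataclass

@dataclass
class Match:
    profile_id: str
    score: float       # embedding similarity of the top candidate
    confidence: float  # reported confidence in the final match

def extract_features(text):
    """Step 1: pull structured attributes out of free text (stubbed)."""
    return {"field": "economics", "region": "US", "seniority": "junior"}

def embed(features):
    """Step 2: map features to a vector (stubbed as a toy length encoding)."""
    return [len(v) % 10 / 10 for v in features.values()]

def similarity(a, b):
    """Toy similarity: 1 minus the mean absolute difference of components."""
    diffs = [abs(x - y) for x, y in zip(a, b)]
    return 1 - sum(diffs) / len(diffs)

def judge(anon_features, cand_features):
    """Steps 3 and 4: reason over full context, report confidence (stubbed)."""
    return 0.9 if cand_features["field"] == anon_features["field"] else 0.2

def link(anon_text, candidates):
    feats = extract_features(anon_text)
    vec = embed(feats)
    scored = [(similarity(vec, embed(c["features"])), c) for c in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    top_score, top = scored[0]  # most similar candidate profile
    return Match(top["id"], top_score, judge(feats, top["features"]))

candidates = [
    {"id": "prof_x", "features": {"field": "economics", "region": "US"}},
    {"id": "prof_y", "features": {"field": "history", "region": "EU"}},
]
print(link("an anonymous forum post", candidates))
```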
&lt;p&gt;I guess it&amp;rsquo;s only a matter of time before someone uses this strategy to unmask Reviewer 2. (Currently this is only possible if Reviewer 2 insists you cite all of the work of the brilliant Dr. X.)&lt;/p&gt;</description></item><item><title>More on vibe researching</title><link>https://muddy.jprs.me/links/2026-02-13-more-on-vibe-researching/</link><pubDate>Fri, 13 Feb 2026 23:49:00 -0500</pubDate><guid>https://muddy.jprs.me/links/2026-02-13-more-on-vibe-researching/</guid><description>&lt;p&gt;To follow on &lt;a href="https://muddy.jprs.me/links/2026-02-12-an-end-to-end-ai-pipeline-for-policy-evaluation-papers/"&gt;yesterday&amp;rsquo;s post&lt;/a&gt; on AI-produced research, here is a reflection on &amp;ldquo;vibe researching&amp;rdquo; from Prof. Joshua Gans of the University of Toronto&amp;rsquo;s Rotman School of Management. Since the release of the first &amp;ldquo;reasoning&amp;rdquo; models in late 2024, he has gone all in on experimenting with AI-first research.&lt;/p&gt;
&lt;p&gt;One of the key takeaways is that he found himself pursuing low quality ideas to completion more often, precisely because the cost of choosing to continue to pursue a questionable idea has been lowered. Sycophancy is a problem, too. With an AI cheerleader, it is easy to convince yourself you have a result when you do not.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Those ideas were all fine but not high quality, and what is worse, I didn’t realise that they weren’t that significant until external referees said so. I didn’t realise it because they were reasonably hard to do, and I was happy to have solved them.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I will note that (human) peer reviewers cannot be the levee that stops the flood of middling AI research: the system of uncompensated labour that undergirds all of academic publishing is already strained to bursting, as every editor desperate to find referees for a paper will tell you.&lt;/p&gt;
&lt;p&gt;Prof. Gans concludes his year-long experiment in &amp;ldquo;vibe researching&amp;rdquo; was a failure, despite producing many working papers and publishing a handful of them:&lt;/p&gt;</description></item><item><title>An end-to-end AI pipeline for policy evaluation papers</title><link>https://muddy.jprs.me/links/2026-02-12-an-end-to-end-ai-pipeline-for-policy-evaluation-papers/</link><pubDate>Thu, 12 Feb 2026 19:11:00 -0500</pubDate><guid>https://muddy.jprs.me/links/2026-02-12-an-end-to-end-ai-pipeline-for-policy-evaluation-papers/</guid><description>&lt;p&gt;Prof. David Yanagizawa-Drott from the Social Catalyst Lab at the University of Zurich has launched Project APE (Autonomous Policy Evaluation), an end-to-end AI pipeline to generate policy evaluation papers. The vast majority of policies around the world are never rigorously evaluated, so it would certainly be useful if we were able to do so in an automated fashion.&lt;/p&gt;
&lt;p&gt;Claude Code is the heart of the project, but other models are used to review the outputs and provide journal-style referee reports. All the coding is done in R (though Python is called in some scripts). Currently, Gemini 3 Flash serves as the judge, comparing the generated papers against published research in top economics journals:&lt;/p&gt;
&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;Blind comparison: An LLM judge compares two papers without knowing which is AI-generated&lt;/li&gt;
&lt;li&gt;Position swapping: Each pair is judged twice with paper order swapped to control for bias&lt;/li&gt;
&lt;li&gt;TrueSkill ratings: Papers accumulate skill ratings that update after each match&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
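&lt;p&gt;Mechanically, the tournament loop looks something like the sketch below. This is my own illustration, not the project&amp;rsquo;s code: the judge is a stub, and a simple Elo-style update stands in for TrueSkill, which additionally tracks a per-paper uncertainty:&lt;/p&gt;

```python
# Sketch of the pairwise tournament described in the quote: a stub judge,
# position swapping, and an Elo-style update standing in for TrueSkill
# (which also tracks a per-paper uncertainty). All values are illustrative.

def judge(first_text, second_text):
    """Stub judge: picks the longer text. The real judge is an LLM call."""
    return "first" if len(first_text) > len(second_text) else "second"

def swapped_score(paper_a, paper_b):
    """Judge each pair twice with order swapped to control for position bias."""
    v1 = judge(paper_a, paper_b)   # paper A shown first
    v2 = judge(paper_b, paper_a)   # paper B shown first
    a_wins = (v1 == "first") + (v2 == "second")
    return a_wins / 2              # paper A's score: 0.0, 0.5, or 1.0

def elo_update(r_a, r_b, score_a, k=32):
    """Simplified rating update; TrueSkill proper also updates uncertainty."""
    expected_a = 1 / (1 + 10 ** ((r_b - r_a) / 400))
    return r_a + k * (score_a - expected_a), r_b + k * (expected_a - score_a)

ratings = {"ai_paper": 1500.0, "human_paper": 1500.0}
score = swapped_score("a short AI draft", "a much longer human-written manuscript")
ratings["ai_paper"], ratings["human_paper"] = elo_update(
    ratings["ai_paper"], ratings["human_paper"], score
)
print(ratings)
```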
&lt;p&gt;The project&amp;rsquo;s home page lists the AI&amp;rsquo;s current &amp;ldquo;win rate&amp;rdquo; at 3.5% in head-to-head matchups against human-written papers.&lt;/p&gt;
&lt;p&gt;Prof. Yanagizawa-Drott says &amp;ldquo;Currently it requires at a minimum some initial human input for each paper,&amp;rdquo; although he does not specify exactly what. If we look at the &lt;a href="https://github.com/SocialCatalystLab/ape-papers/blob/main/apep_0264/v1/initialization.md"&gt;&lt;code&gt;initialization.md&lt;/code&gt;&lt;/a&gt; file found in each paper&amp;rsquo;s directory, we see the following questions with user-provided inputs:&lt;/p&gt;
&lt;blockquote&gt;
&lt;ol&gt;
&lt;li&gt;Policy domain: What policy area interests you?&lt;/li&gt;
&lt;li&gt;Method: Which identification method?&lt;/li&gt;
&lt;li&gt;Data era: Modern or historical data?&lt;/li&gt;
&lt;li&gt;API keys: Did you configure data API keys?&lt;/li&gt;
&lt;li&gt;External review: Include external model reviews?&lt;/li&gt;
&lt;li&gt;Risk appetite: Exploration vs exploitation?&lt;/li&gt;
&lt;li&gt;Other preferences: Any other preferences or constraints?&lt;/li&gt;
&lt;/ol&gt;
&lt;/blockquote&gt;
&lt;p&gt;The code, reviews, manuscript, and even the results of the initial idea generation process are all available on &lt;a href="https://github.com/SocialCatalystLab/ape-papers"&gt;GitHub&lt;/a&gt;. Their immediate goal is to generate a sample of 1,000 papers and run human evaluations on them (at time of posting, there are 264 papers in the GitHub repository).&lt;/p&gt;</description></item></channel></rss>