
Synthetic audiences won't replace your community, and we need to stop pretending they might


Dan Ferguson

Chief Executive Officer

Our recent webinar on synthetic audiences sparked exactly the kind of debate we hoped for. But the more we sit with the conversations that followed, the more concerned we are. Not about the technology itself, but about the narrative forming around it. And look, I get it. The pitch is seductive... "why spend months and thousands of dollars engaging real communities when an AI can simulate what they'd say in seconds?"

We think that framing is dangerous, and it's worth being direct about why.

I wanted to pick up where we left off in the webinar and dive deeper into what the research actually says, where the ethical lines are, and why the biggest risk isn't bad technology. It's giving reluctant organisations a sophisticated excuse not to do the hard work of real engagement.

The research is in, and it's humbling

Let's start with what the science tells us. The foundational study in this space, Argyle et al.'s "Out of One, Many" (2023), showed that LLMs conditioned on demographic backstories could produce responses with "algorithmic fidelity" to certain human subgroups. That finding generated enormous excitement. But the replication studies have been sobering.

Bisbee et al. (2024), the most comprehensive critical replication to date, found that ChatGPT-generated personas produced measurements of partisan polarisation seven times larger than those observed in real human opinion, while capturing only 31% of the variation found among actual respondents. When they ran regressions on synthetic data, 48% of coefficients were significantly different from their real-world equivalents. Among those, the direction of the effect flipped 32% of the time. That's not a rounding error. That's the model confidently telling you the opposite of what's true.

Verasight's white paper series (2025–2026) found that while top-line political questions could be approximated within roughly 4 percentage points, subgroup errors ballooned to around 10 points on average and up to 30 points for the smallest groups. Personal experience questions fared the worst, with error rates above 30%. Their conclusion was blunt: researchers have no way of knowing in advance whether a particular LLM or prompt is increasing or decreasing error.

The bias picture is equally concerning. Multiple studies confirm LLMs over-represent WEIRD* populations (Western, Educated, Industrialised, Rich, Democratic), exhibit left-liberal political leanings amplified by alignment training, and display social desirability bias that actually increases with model size. For engagement professionals working with culturally diverse Australian communities, including First Nations peoples, these aren't abstract limitations. They're fundamental disqualifiers for anything resembling representative input.

*I promise, I did not make that acronym up.

The real risk: a convenient out for organisations afraid to engage

Here's what keeps me up at night, and let's be frank about it: community engagement is already undervalued. Budgets are tight. Timelines are compressed. Plenty of organisations, whether councils, utilities, or developers, already treat engagement as a compliance exercise rather than a genuine conversation. Many are just plain scared of it. Scared of angry residents, of hearing things they don't want to hear, of the messy, time-consuming, uncomfortable reality of sitting across the table from people whose lives are affected by your decisions.

Synthetic audiences offer those organisations something incredibly tempting... the appearance of having listened, without actually having to listen. A neat spreadsheet of "community sentiment" that never raised its voice, never went off-script, and never said something inconvenient that changed the direction of a project. That's not engagement. That's a mirror. Northeastern University's Reboot Democracy initiative puts it best:

Synthetic publics won't fail loudly; they'll fail confidently and persuasively.

A synthetic audience doesn't push back, doesn't share a lived experience that changes everything, doesn't force you to confront your own assumptions. It produces plausible-sounding text that mirrors the patterns in its training data, which, as we mentioned above, overwhelmingly reflect the digitally visible, not the communities most affected by policy decisions.

IAP2 Canada's September 2025 position paper said it plainly:

Public participation is about people, not simulations.

We agree. And I'd like to go further: every time an organisation substitutes a synthetic response for a real conversation, they're not just getting worse data; they're actively eroding the democratic compact that engagement exists to uphold.

There's a narrow, honest role, but let's not kid ourselves about what it is

We're not technophobes (I mean... we were one of the first on the scene with AI), and we're not saying synthetic audiences have zero utility. There is a bounded, legitimate role for this technology, but it sits firmly in the preparation phase, never the participation phase.

  • Stress-test your engagement plan before you go live? Check.

  • Run your draft survey past synthetic personas to spot confusing wording or identify demographics your outreach might miss? Check.

  • Use it to train staff on handling difficult conversations before they face real community members in heated sessions? Check.

  • Pre-test whether your response options capture the full range of likely views? Check.

These are sharpening exercises. They make your real engagement better. And here's the critical distinction: every one of these use cases assumes you're still going to do the real engagement afterwards. The moment synthetic input replaces a single conversation with a real resident, you've crossed the line.

For practitioners wanting to experiment responsibly, anchor your approach in the IAP2 Spectrum. Synthetic audiences can strengthen how you design engagement at the Inform and Consult levels. To be clear, they have no role at Involve, Collaborate, or Empower, where authentic human voice is the entire point.

Validate everything against real responses. Label every output as AI-generated. And never, ever present synthetic findings in a statutory consultation report, a regulatory submission, or a council paper as evidence of what the community thinks.

The bottom line

The technology will get better. The models will get more accurate. The cost argument will get louder. And the temptation for time-poor, budget-constrained organisations to cut corners will only grow.

Our position is simple: synthetic audiences are a sharpening stone, not a shortcut. Used honestly, as a design tool, a stress-test, a rehearsal space, they can make your engagement more considered before a single real community member is asked to participate.

But, used as a substitute for genuine participation, they become what researchers are already calling "participation washing": the appearance of listening without the substance.

I started Communiti Labs because I believe communities deserve to be heard, actually heard, in their own words, with all the mess and complexity that entails. No algorithm replicates that. And any organisation telling themselves otherwise isn't saving money. They're borrowing trust they'll eventually have to repay.

Every resident deserves to be heard, not simulated.
