{"id":1752,"date":"2026-06-24T10:00:46","date_gmt":"2026-06-24T10:00:46","guid":{"rendered":"https:\/\/cms.research.wpp.com\/?post_type=research_feed&#038;p=1752"},"modified":"2026-06-24T13:56:10","modified_gmt":"2026-06-24T13:56:10","slug":"synthetic-audiences-as-proxies-for-human-users-a-systematic-literature-review","status":"publish","type":"research_feed","link":"https:\/\/cms.research.wpp.com\/?research_feed=synthetic-audiences-as-proxies-for-human-users-a-systematic-literature-review","title":{"rendered":"Evaluating Synthetic Audiences as Proxies for Human Users"},"content":{"rendered":"\n<p class=\"wp-block-paragraph\">Imagine running a focus group with ten thousand &#8220;people&#8221; overnight, for roughly the price of a few coffees. No recruiting, no scheduling, no drop-outs. Just ask your questions and get answers by morning.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">That&#8217;s the promise of <strong>synthetic audiences<\/strong>: groups of AI agents, powered by the same large language models (LLMs) behind tools like ChatGPT, configured to stand in for human participants in surveys, experiments, and market research. The idea has exploded across marketing, psychology, economics, and design in just a couple of years.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">In a <a href=\"https:\/\/ieeexplore.ieee.org\/document\/11563566\/\">recently published systematic review of over 100 related studies<\/a>, we looked at what these AI &#8220;participants&#8221; can actually do, where they fall apart, and how to use them in practice. This blog post presents a summary of the main findings. We also list some of the key references at the end of the post.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Why everyone is excited<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">The appeal becomes obvious the moment you look at the numbers. 
Studies that directly compared AI participants with human recruitment found cost differences of <strong>one to two orders of magnitude<\/strong> (that is, 10\u00d7 to 100\u00d7 cheaper per response). In one case, researchers generated tens of thousands of survey responses for roughly the API cost of a single query; in another, AI-generated open-ended answers for a user study cost a tiny fraction of what the equivalent human participants would have charged.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Beyond cost, there are practical superpowers:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n\n<li><strong>\u2022 No logistics headaches.<\/strong> An AI agent always answers, never gets bored, and never abandons your survey halfway through.<\/li>\n\n\n<li><strong>\u2022 Experiments you otherwise couldn&#8217;t run.<\/strong> One team simulated a <em>bank run<\/em> using tens of thousands of AI-generated depositors across hundreds of demographic groups, a scenario you obviously can&#8217;t stage with real customers and real money. Others have built entire simulated societies: sandbox towns of AI &#8220;residents&#8221; that develop their own social dynamics, and large-scale models with <strong>over 10,000 agents<\/strong> interacting in urban and economic environments.<\/li>\n\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">Where they actually shine<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">The honest headline from the research: synthetic audiences are good at capturing <strong>broad, aggregate patterns<\/strong> (the overall direction of an effect, rankings, and average tendencies), even when they miss the fine detail.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The more capable the model, the better this gets. Newer models have reproduced classic results from economics and social psychology, closely matched human behavior in trust games, and broadly agreed with human raters on things like brand perceptions. 
In other words, if your question is &#8220;which option do people lean toward?&#8221; an AI panel can often point you in the right direction, fast.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Where they break<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">The catch is that AI participants <em>look<\/em> convincing, and that&#8217;s exactly the trap. Fluent, human-sounding answers make it easy to assume there&#8217;s human-like thinking behind them. There usually isn&#8217;t. The review found four recurring failure modes, each backed by hard evidence.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>1. They flatten the crowd.<\/strong> Real people disagree, surprise you, and sit at the extremes. AI participants tend to cluster around a bland &#8220;average.&#8221; One large replication study found an earlier model reproduced only <strong>37.5%<\/strong> of a well-known set of psychology effects, versus a <strong>50%<\/strong> rate for human samples. More tellingly, it kept giving near-identical answers to questions that normally produce lots of human variety. Other work documented a &#8220;hyper-accuracy&#8221; quirk, where AI participants give unrealistically precise, noise-free answers no real population would.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>2. They carry hidden biases.<\/strong> Because they learn from internet-scale text, AI participants tend to sound disproportionately Western, English-speaking, wealthy, and politically skewed. One study comparing AI-simulated public opinion against a global survey found it performed far better for Western, English-speaking, high-income countries while systematically misrepresenting attitudes across gender, age, education, and class. Worse, when asked to portray specific groups, models often produce <em>caricatures<\/em>. Research on such groups found AI depictions that actual human participants judged not just inaccurate but actively harmful. 
The uncomfortable implication: synthetic audiences are least reliable precisely where fresh insight is most valuable, namely underrepresented and non-Western populations.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>3. They&#8217;re twitchy.<\/strong> Reword a question slightly, reorder the options, or tweak the persona, and the answers can swing dramatically. Studies showed AI survey responses with unstable estimates and strong sensitivity to wording, persona, and even <em>timing<\/em>. In decision-making tasks, AI agents proved far more sensitive than humans to small &#8220;nudges&#8221;: a tiny change in framing produced outsized changes in behavior. That fragility makes results hard to reproduce.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>4. They can fake reasoning.<\/strong> An answer that looks thoughtful may just be pattern-matching on things the model absorbed during training. Researchers showed that changing one variable in a study (say, a product&#8217;s price) can quietly shift the model&#8217;s <em>unstated assumptions<\/em> about everything else, contaminating any cause-and-effect conclusion. In another case, near-perfect economic &#8220;forecasts&#8221; turned out to be memorized history rather than genuine prediction. Add to this the finding that hallucinations are, mathematically, an unavoidable feature of these models, not a bug that will simply be patched away.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">How researchers are fixing them<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">The field isn&#8217;t standing still. Three families of techniques are making synthetic audiences more trustworthy:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n\n<li><strong>\u2022 Better prompting.<\/strong> Giving the AI richer personas (detailed demographics, life-story backstories, even real interview transcripts) measurably improves how well it matches a target group. 
The gains are real but fragile, since the same prompt-sensitivity problem lurks underneath.<\/li>\n\n\n<li><strong>\u2022 Fine-tuning.<\/strong> Training models on real human responses for a specific population improves accuracy, though it tends to compress diversity unless done carefully.<\/li>\n\n\n<li><strong>\u2022 Human-AI hybrids, the most reliable approach.<\/strong> Instead of replacing people, you collect a <em>small<\/em> human sample and use it to calibrate and correct the AI&#8217;s output. A statistical framework called prediction-powered inference makes this rigorous, and studies report cutting the required number of human participants by <strong>20% to 30%<\/strong> in some settings, while another approach reduced the human data needed by <strong>up to 80%<\/strong>, all without sacrificing statistical validity. You keep much of the speed and cost savings while keeping the results honest.<\/li>\n\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">The promising road ahead<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">What makes this field genuinely exciting isn&#8217;t just today&#8217;s tools; it&#8217;s where the research is heading. The review highlights three directions that could turn synthetic audiences from a clever shortcut into a dependable scientific instrument.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>1. Open, transparent models built for human behavior.<\/strong> Most studies so far rely on closed, proprietary systems whose training data and updates are secret, which makes results hard to audit or reproduce (a &#8220;silent&#8221; update can change behavior between studies). Open models change that. The standout example is <em>Centaur<\/em>, a model fine-tuned on <strong>over 10 million real human decisions across 160 psychology experiments<\/strong>. 
It outperformed traditional models at predicting how people actually behave and generalized to new tasks, offering a glimpse of purpose-built, inspectable &#8220;model populations&#8221; rather than borrowed chatbots.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>2. Standards, benchmarks, and reproducibility.<\/strong> Right now, every team prompts differently, making studies hard to compare. The field is moving toward shared benchmarks and tools, such as standardized environments for running AI-based surveys and experiments, so researchers can tell whether an improvement truly generalizes or just got lucky on one prompt. This is the unglamorous infrastructure that turned modern machine learning into a cumulative science, and synthetic audiences need it too.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>3. Looking inside the black box.<\/strong> The frontier is <em>mechanistic interpretability<\/em>: tools that trace which internal &#8220;circuits&#8221; of a model produce a given behavior. The payoff would be enormous: imagine pinpointing the exact components that trigger a demographic stereotype and dialing them down, or verifying that a persona prompt genuinely activates the right knowledge rather than just changing the writing style. This work is early and mostly aspirational today, but it points toward synthetic audiences we can actually diagnose and correct, rather than merely observe.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Best Practice<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">If you&#8217;re tempted to use AI participants, the research suggests:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n\n<li><strong> \u2022 Evaluate the fit.<\/strong> Research shows promising evidence for early exploration, pilots, brainstorming, broad directional questions, and well-represented (Western, English-speaking) populations. 
Synthetic audiences remain a poor fit for fine-grained individual differences, underrepresented or non-Western groups, and high-stakes domains like policy or healthcare.<\/li>\n\n\n<li><strong> \u2022 Explore with AI, validate with humans. <\/strong> Even a modest dose of real-world data, used to check the AI&#8217;s answers, dramatically improves fidelity.<\/li>\n\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">The bottom line<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Synthetic audiences are one of the most exciting new tools in research, but they&#8217;re a tool, not a replacement for people. They&#8217;re best understood as a fast, cheap way to explore ideas and generate hypotheses, with their results treated as a starting point to be validated, not a final verdict.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Used carefully, they can democratize research and let small teams ask big questions. Used carelessly, they produce confident-sounding answers that quietly reflect an AI&#8217;s blind spots rather than reality. The difference comes down to knowing what they can and can&#8217;t do and, increasingly, to the open models, shared standards, and interpretability tools that will determine how far this technology can really go.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<p class=\"wp-block-paragraph\"><em>This post summarizes the systematic literature review &#8220;Synthetic Audiences as Proxies for Human Users,&#8221; published in IEEE Access, which analyzed 100 studies on LLM-based synthetic audiences following established review protocols. It&#8217;s written for a general audience; see the full paper for methods, evidence, and the complete set of 100 citations.<\/em><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Read the full paper:<\/strong> Lappas &amp; Filippas, &#8220;Synthetic Audiences as Proxies for Human Users: A Systematic Literature Review,&#8221; <em>IEEE Access<\/em>, 2026. 
<a href=\"https:\/\/ieeexplore.ieee.org\/document\/11563566\/\">https:\/\/ieeexplore.ieee.org\/document\/11563566\/<\/a><\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Sources &amp; further reading<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">The specific findings mentioned above come from the studies below (a curated subset of the 100 reviewed). Links are provided where a stable one exists.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Applications, scale, and cost<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n\n<li>Argyle et al. (2023), &#8220;Out of one, many: Using language models to simulate human samples,&#8221; <em>Political Analysis.<\/em> <a href=\"https:\/\/doi.org\/10.1017\/pan.2023.2\">https:\/\/doi.org\/10.1017\/pan.2023.2<\/a><\/li>\n\n\n<li>H\u00e4m\u00e4l\u00e4inen et al. (2023), &#8220;Evaluating large language models in generating synthetic HCI research data,&#8221; <em>CHI.<\/em> <a href=\"https:\/\/doi.org\/10.1145\/3544548.3580688\">https:\/\/doi.org\/10.1145\/3544548.3580688<\/a><\/li>\n\n\n<li>Kazinnik (2024), &#8220;Bank run, interrupted: Modeling deposit withdrawals with generative AI,&#8221; <em>SSRN.<\/em> <a href=\"https:\/\/doi.org\/10.2139\/ssrn.4656722\">https:\/\/doi.org\/10.2139\/ssrn.4656722<\/a><\/li>\n\n\n<li>Park et al. (2023), &#8220;Generative agents: Interactive simulacra of human behavior,&#8221; <em>UIST<\/em> (the &#8220;Smallville&#8221; study). <a href=\"https:\/\/doi.org\/10.1145\/3586183.3606763\">https:\/\/doi.org\/10.1145\/3586183.3606763<\/a><\/li>\n\n\n<li>Piao et al. (2025), &#8220;AgentSociety: Large-scale simulation of LLM-driven generative agents,&#8221; <a href=\"https:\/\/arxiv.org\/abs\/2502.08691\">https:\/\/arxiv.org\/abs\/2502.08691<\/a><\/li>\n\n\n<li>Aher et al. 
(2023), &#8220;Using large language models to simulate multiple humans and replicate human subject studies,&#8221; <em>ICML.<\/em> <a href=\"https:\/\/proceedings.mlr.press\/v202\/aher23a.html\">https:\/\/proceedings.mlr.press\/v202\/aher23a.html<\/a><\/li>\n\n\n<li>Xie et al. (2024), &#8220;Can large language model agents simulate human trust behavior?,&#8221; <em>NeurIPS.<\/em> <a href=\"https:\/\/arxiv.org\/abs\/2402.04559\">https:\/\/arxiv.org\/abs\/2402.04559<\/a><\/li>\n\n\n<li>Li et al. (2024), &#8220;Frontiers: Determining the validity of large language models for automated perceptual analysis,&#8221; <em>Marketing Science.<\/em> <a href=\"https:\/\/doi.org\/10.1287\/mksc.2023.0454\">https:\/\/doi.org\/10.1287\/mksc.2023.0454<\/a><\/li>\n\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Limitations<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n\n<li>Park et al. (2024), &#8220;Diminished diversity-of-thought in a standard large language model,&#8221; <em>Behavior Research Methods<\/em> (the 37.5% vs 50% replication finding). <a href=\"https:\/\/doi.org\/10.3758\/s13428-023-02307-x\">https:\/\/doi.org\/10.3758\/s13428-023-02307-x<\/a><\/li>\n\n\n<li>Qu &amp; Wang (2024), &#8220;Performance and biases of large language models in public opinion simulation,&#8221; <em>Humanities and Social Sciences Communications.<\/em> <a href=\"https:\/\/doi.org\/10.1057\/s41599-024-03609-x\">https:\/\/doi.org\/10.1057\/s41599-024-03609-x<\/a><\/li>\n\n\n<li>Cheng et al. (2023), &#8220;CoMPosT: Characterizing and evaluating caricature in LLM simulations,&#8221; <a href=\"https:\/\/arxiv.org\/abs\/2310.11501\">https:\/\/arxiv.org\/abs\/2310.11501<\/a><\/li>\n\n\n<li>Wang et al. 
(2025), &#8220;Large language models that replace human participants can harmfully misportray and flatten identity groups,&#8221; <em>Nature Machine Intelligence.<\/em> <a href=\"https:\/\/doi.org\/10.1038\/s42256-025-00986-z\">https:\/\/doi.org\/10.1038\/s42256-025-00986-z<\/a><\/li>\n\n\n<li>Gadiraju et al. (2023), &#8220;&#8216;I wouldn&#8217;t say offensive but&#8230;&#8217;: Disability-centered perspectives on large language models,&#8221; <em>FAccT.<\/em> <a href=\"https:\/\/doi.org\/10.1145\/3593013.3593989\">https:\/\/doi.org\/10.1145\/3593013.3593989<\/a><\/li>\n\n\n<li>Bisbee et al. (2024), &#8220;Synthetic replacements for human survey data? The perils of large language models,&#8221; <em>Political Analysis.<\/em> <a href=\"https:\/\/doi.org\/10.1017\/pan.2024.3\">https:\/\/doi.org\/10.1017\/pan.2024.3<\/a><\/li>\n\n\n<li>Cherep et al. (2025), &#8220;LLM agents are hypersensitive to nudges,&#8221; <a href=\"https:\/\/arxiv.org\/abs\/2505.11584\">https:\/\/arxiv.org\/abs\/2505.11584<\/a><\/li>\n\n\n<li>Gui &amp; Toubia (2025), &#8220;The challenge of using LLMs to simulate human behavior: A causal inference perspective,&#8221; <a href=\"https:\/\/arxiv.org\/abs\/2312.15524\">https:\/\/arxiv.org\/abs\/2312.15524<\/a><\/li>\n\n\n<li>Xu et al. (2025), &#8220;Hallucination is inevitable: An innate limitation of large language models,&#8221; <a href=\"https:\/\/arxiv.org\/abs\/2401.11817\">https:\/\/arxiv.org\/abs\/2401.11817<\/a><\/li>\n\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Improvement techniques and human-AI hybrids<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n\n<li>Moon et al. (2024), &#8220;Virtual personas for language models via an anthology of backstories,&#8221; <a href=\"https:\/\/arxiv.org\/abs\/2407.06576\">https:\/\/arxiv.org\/abs\/2407.06576<\/a><\/li>\n\n\n<li>Suh et al. 
(2025), &#8220;Language model fine-tuning on scaled survey data (SubPOP),&#8221; <a href=\"https:\/\/arxiv.org\/abs\/2502.16761\">https:\/\/arxiv.org\/abs\/2502.16761<\/a><\/li>\n\n\n<li>Angelopoulos et al. (2023), &#8220;Prediction-powered inference,&#8221; <em>Science.<\/em> <a href=\"https:\/\/doi.org\/10.1126\/science.adi6000\">https:\/\/doi.org\/10.1126\/science.adi6000<\/a><\/li>\n\n\n<li>De Bartolomeis et al. (2025), &#8220;Efficient randomized experiments using foundation models,&#8221; <a href=\"https:\/\/arxiv.org\/abs\/2502.04262\">https:\/\/arxiv.org\/abs\/2502.04262<\/a> (the 20% to 30% figure).<\/li>\n\n\n<li>Wang et al. (2024), &#8220;Large language models for market research: A data-augmentation approach,&#8221; <a href=\"https:\/\/arxiv.org\/abs\/2412.19363\">https:\/\/arxiv.org\/abs\/2412.19363<\/a> (the up-to-80% figure).<\/li>\n\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Future directions<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n\n<li>Binz et al. (2024), &#8220;Centaur: A foundation model of human cognition,&#8221; <a href=\"https:\/\/arxiv.org\/abs\/2410.20268\">https:\/\/arxiv.org\/abs\/2410.20268<\/a><\/li>\n\n\n<li>Shapira et al. (2024), &#8220;GLEE: A unified framework and benchmark for language-based economic environments,&#8221; <a href=\"https:\/\/arxiv.org\/abs\/2410.05254\">https:\/\/arxiv.org\/abs\/2410.05254<\/a><\/li>\n\n\n<li>Sharkey et al. (2025), &#8220;Open problems in mechanistic interpretability,&#8221; <a href=\"https:\/\/arxiv.org\/abs\/2501.16496\">https:\/\/arxiv.org\/abs\/2501.16496<\/a><\/li>\n\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Imagine running a focus group with ten thousand &#8220;people&#8221; overnight, for roughly the price of a few coffees. No recruiting, no scheduling, no drop-outs. Just ask your questions and get answers by morning. 
That&#8217;s the promise of synthetic audiences: groups of AI agents, powered by the same large language models (LLMs) behind tools like ChatGPT, [&hellip;]<\/p>\n","protected":false},"author":5,"featured_media":1756,"template":"","meta":{"_acf_changed":false},"tags":[],"content_types":[{"id":50,"name":"Blog Post","slug":"article"}],"ppma_author":[{"id":5,"display_name":"Ted Lappas","first_name":"Ted","last_name":"Lappas","nickname":"theodoros.lappas","user_nicename":"theodoros-lappas","user_email":"Theodoros.Lappas@wpp.com","biographical_info":"Ted co-leads WPP Research and serves as Head of Data Science at Satalia, co-founder of Conscium, and Assistant Professor in the Department of Marketing and Communication at the Athens University of Economics and Business. His research spans scalable algorithms for multimodal data, synthetic data generation, simulation-based verification for AI agents, and information diffusion and collective intelligence in expert networks. He publishes regularly in top-tier computer science and business venues.","avatar_url":"https:\/\/cms.research.wpp.com\/wp-content\/uploads\/2026\/04\/pic.png","job_title":"Head of Data Science","is_lead":false,"display_as_researcher":true,"order_priority":1}],"class_list":["post-1752","research_feed","type-research_feed","status-publish","has-post-thumbnail","hentry","content_type-article"],"acf":{"content":"<p><!-- wp:paragraph --><\/p>\n<p>Imagine running a focus group with ten thousand &#8220;people&#8221; overnight, for roughly the price of a few coffees. No recruiting, no scheduling, no drop-outs. Just ask your questions and get answers by morning.<\/p>\n<p><!-- \/wp:paragraph --><\/p>\n<p><!-- wp:paragraph --><\/p>\n<p>That&#8217;s the promise of <strong>synthetic audiences<\/strong>: groups of AI agents, powered by the same large language models (LLMs) behind tools like ChatGPT, configured to stand in for human participants in surveys, experiments, and market research. 
The idea has exploded across marketing, psychology, economics, and design in just a couple of years.<\/p>\n<p><!-- \/wp:paragraph --><\/p>\n<p><!-- wp:paragraph --><\/p>\n<p>In a <a href=\"https:\/\/ieeexplore.ieee.org\/document\/11563566\/\">recently published systematic review of over 100 related studies<\/a>, we looked at what these AI &#8220;participants&#8221; can actually do, where they fall apart, and how to use them in practice. This blog post presents a summary of the main findings. We also list some of the key references at the end of the post.<\/p>\n<p><!-- \/wp:paragraph --><\/p>\n<p><!-- wp:heading --><\/p>\n<h2 class=\"wp-block-heading\">Why everyone is excited<\/h2>\n<p><!-- \/wp:heading --><\/p>\n<p><!-- wp:paragraph --><\/p>\n<p>The appeal becomes obvious the moment you look at the numbers. Studies that directly compared AI participants with human recruitment found cost differences of <strong>one to two orders of magnitude<\/strong> (that is, 10\u00d7 to 100\u00d7 cheaper per response). 
In one case, researchers generated tens of thousands of survey responses for roughly the API cost of a single query; in another, AI-generated open-ended answers for a user study cost a tiny fraction of what the equivalent human participants would have charged.<\/p>\n<p><!-- \/wp:paragraph --><\/p>\n<p><!-- wp:paragraph --><\/p>\n<p>Beyond cost, there are practical superpowers:<\/p>\n<p><!-- \/wp:paragraph --><\/p>\n<p><!-- wp:list --><\/p>\n<ul class=\"wp-block-list\">\n<!-- wp:list-item --><\/p>\n<li><strong>\u2022 No logistics headaches.<\/strong> An AI agent always answers, never gets bored, and never abandons your survey halfway through.<\/li>\n<p><!-- \/wp:list-item --><br \/>\n<!-- wp:list-item --><\/p>\n<li><strong>\u2022 Experiments you otherwise couldn&#8217;t run.<\/strong> One team simulated a <em>bank run<\/em> using tens of thousands of AI-generated depositors across hundreds of demographic groups, a scenario you obviously can&#8217;t stage with real customers and real money. Others have built entire simulated societies: sandbox towns of AI &#8220;residents&#8221; that develop their own social dynamics, and large-scale models with <strong>over 10,000 agents<\/strong> interacting in urban and economic environments.<\/li>\n<p><!-- \/wp:list-item -->\n<\/ul>\n<p><!-- \/wp:list --><\/p>\n<p><!-- wp:heading --><\/p>\n<h2 class=\"wp-block-heading\">Where they actually shine<\/h2>\n<p><!-- \/wp:heading --><\/p>\n<p><!-- wp:paragraph --><\/p>\n<p>The honest headline from the research: synthetic audiences are good at capturing <strong>broad, aggregate patterns<\/strong> (the overall direction of an effect, rankings, and average tendencies), even when they miss the fine detail.<\/p>\n<p><!-- \/wp:paragraph --><\/p>\n<p><!-- wp:paragraph --><\/p>\n<p>The more capable the model, the better this gets. 
Newer models have reproduced classic results from economics and social psychology, closely matched human behavior in trust games, and broadly agreed with human raters on things like brand perceptions. In other words, if your question is &#8220;which option do people lean toward?&#8221; an AI panel can often point you in the right direction, fast.<\/p>\n<p><!-- \/wp:paragraph --><\/p>\n<p><!-- wp:heading --><\/p>\n<h2 class=\"wp-block-heading\">Where they break<\/h2>\n<p><!-- \/wp:heading --><\/p>\n<p><!-- wp:paragraph --><\/p>\n<p>The catch is that AI participants <em>look<\/em> convincing, and that&#8217;s exactly the trap. Fluent, human-sounding answers make it easy to assume there&#8217;s human-like thinking behind them. There usually isn&#8217;t. The review found four recurring failure modes, each backed by hard evidence.<\/p>\n<p><!-- \/wp:paragraph --><\/p>\n<p><!-- wp:paragraph --><\/p>\n<p><strong>1. They flatten the crowd.<\/strong> Real people disagree, surprise you, and sit at the extremes. AI participants tend to cluster around a bland &#8220;average.&#8221; One large replication study found an earlier model reproduced only <strong>37.5%<\/strong> of a well-known set of psychology effects, versus a <strong>50%<\/strong> rate for human samples. More tellingly, it kept giving near-identical answers to questions that normally produce lots of human variety. Other work documented a &#8220;hyper-accuracy&#8221; quirk, where AI participants give unrealistically precise, noise-free answers no real population would.<\/p>\n<p><!-- \/wp:paragraph --><\/p>\n<p><!-- wp:paragraph --><\/p>\n<p><strong>2. They carry hidden biases.<\/strong> Because they learn from internet-scale text, AI participants tend to sound disproportionately Western, English-speaking, wealthy, and politically skewed. 
One study comparing AI-simulated public opinion against a global survey found it performed far better for Western, English-speaking, high-income countries while systematically misrepresenting attitudes across gender, age, education, and class. Worse, when asked to portray specific groups, models often produce <em>caricatures<\/em>. Research on such groups found AI depictions that actual human participants judged not just inaccurate but actively harmful. The uncomfortable implication: synthetic audiences are least reliable precisely where fresh insight is most valuable, namely underrepresented and non-Western populations.<\/p>\n<p><!-- \/wp:paragraph --><\/p>\n<p><!-- wp:paragraph --><\/p>\n<p><strong>3. They&#8217;re twitchy.<\/strong> Reword a question slightly, reorder the options, or tweak the persona, and the answers can swing dramatically. Studies showed AI survey responses with unstable estimates and strong sensitivity to wording, persona, and even <em>timing<\/em>. In decision-making tasks, AI agents proved far more sensitive than humans to small &#8220;nudges&#8221;: a tiny change in framing produced outsized changes in behavior. That fragility makes results hard to reproduce.<\/p>\n<p><!-- \/wp:paragraph --><\/p>\n<p><!-- wp:paragraph --><\/p>\n<p><strong>4. They can fake reasoning.<\/strong> An answer that looks thoughtful may just be pattern-matching on things the model absorbed during training. Researchers showed that changing one variable in a study (say, a product&#8217;s price) can quietly shift the model&#8217;s <em>unstated assumptions<\/em> about everything else, contaminating any cause-and-effect conclusion. In another case, near-perfect economic &#8220;forecasts&#8221; turned out to be memorized history rather than genuine prediction. 
Add to this the finding that hallucinations are, mathematically, an unavoidable feature of these models, not a bug that will simply be patched away.<\/p>\n<p><!-- \/wp:paragraph --><\/p>\n<p><!-- wp:heading --><\/p>\n<h2 class=\"wp-block-heading\">How researchers are fixing them<\/h2>\n<p><!-- \/wp:heading --><\/p>\n<p><!-- wp:paragraph --><\/p>\n<p>The field isn&#8217;t standing still. Three families of techniques are making synthetic audiences more trustworthy:<\/p>\n<p><!-- \/wp:paragraph --><\/p>\n<p><!-- wp:list --><\/p>\n<ul class=\"wp-block-list\">\n<!-- wp:list-item --><\/p>\n<li><strong>\u2022 Better prompting.<\/strong> Giving the AI richer personas (detailed demographics, life-story backstories, even real interview transcripts) measurably improves how well it matches a target group. The gains are real but fragile, since the same prompt-sensitivity problem lurks underneath.<\/li>\n<p><!-- \/wp:list-item --><br \/>\n<!-- wp:list-item --><\/p>\n<li><strong>\u2022 Fine-tuning.<\/strong> Training models on real human responses for a specific population improves accuracy, though it tends to compress diversity unless done carefully.<\/li>\n<p><!-- \/wp:list-item --><br \/>\n<!-- wp:list-item --><\/p>\n<li><strong>\u2022 Human-AI hybrids, the most reliable approach.<\/strong> Instead of replacing people, you collect a <em>small<\/em> human sample and use it to calibrate and correct the AI&#8217;s output. A statistical framework called prediction-powered inference makes this rigorous, and studies report cutting the required number of human participants by <strong>20% to 30%<\/strong> in some settings, while another approach reduced the human data needed by <strong>up to 80%<\/strong>, all without sacrificing statistical validity. 
You keep much of the speed and cost savings while keeping the results honest.<\/li>\n<p><!-- \/wp:list-item -->\n<\/ul>\n<p><!-- \/wp:list --><\/p>\n<p><!-- wp:heading --><\/p>\n<h2 class=\"wp-block-heading\">The promising road ahead<\/h2>\n<p><!-- \/wp:heading --><\/p>\n<p><!-- wp:paragraph --><\/p>\n<p>What makes this field genuinely exciting isn&#8217;t just today&#8217;s tools; it&#8217;s where the research is heading. The review highlights three directions that could turn synthetic audiences from a clever shortcut into a dependable scientific instrument.<\/p>\n<p><!-- \/wp:paragraph --><\/p>\n<p><!-- wp:paragraph --><\/p>\n<p><strong>1. Open, transparent models built for human behavior.<\/strong> Most studies so far rely on closed, proprietary systems whose training data and updates are secret, which makes results hard to audit or reproduce (a &#8220;silent&#8221; update can change behavior between studies). Open models change that. The standout example is <em>Centaur<\/em>, a model fine-tuned on <strong>over 10 million real human decisions across 160 psychology experiments<\/strong>. It outperformed traditional models at predicting how people actually behave and generalized to new tasks, offering a glimpse of purpose-built, inspectable &#8220;model populations&#8221; rather than borrowed chatbots.<\/p>\n<p><!-- \/wp:paragraph --><\/p>\n<p><!-- wp:paragraph --><\/p>\n<p><strong>2. Standards, benchmarks, and reproducibility.<\/strong> Right now, every team prompts differently, making studies hard to compare. The field is moving toward shared benchmarks and tools, such as standardized environments for running AI-based surveys and experiments, so researchers can tell whether an improvement truly generalizes or just got lucky on one prompt. This is the unglamorous infrastructure that turned modern machine learning into a cumulative science, and synthetic audiences need it too.<\/p>\n<p><!-- \/wp:paragraph --><\/p>\n<p><!-- wp:paragraph --><\/p>\n<p><strong>3. 
Looking inside the black box.<\/strong> The frontier is <em>mechanistic interpretability<\/em>: tools that trace which internal &#8220;circuits&#8221; of a model produce a given behavior. The payoff would be enormous: imagine pinpointing the exact components that trigger a demographic stereotype and dialing them down, or verifying that a persona prompt genuinely activates the right knowledge rather than just changing the writing style. This work is early and mostly aspirational today, but it points toward synthetic audiences we can actually diagnose and correct, rather than merely observe.<\/p>\n<p><!-- \/wp:paragraph --><\/p>\n<p><!-- wp:heading --><\/p>\n<h2 class=\"wp-block-heading\">Best Practice<\/h2>\n<p><!-- \/wp:heading --><\/p>\n<p><!-- wp:paragraph --><\/p>\n<p>If you&#8217;re tempted to use AI participants, the research suggests:<\/p>\n<p><!-- \/wp:paragraph --><\/p>\n<p><!-- wp:list --><\/p>\n<ul class=\"wp-block-list\">\n<!-- wp:list-item --><\/p>\n<li><strong> \u2022 Evaluate the fit.<\/strong> The evidence is promising for early exploration, pilots, brainstorming, broad directional questions, and well-represented (Western, English-speaking) populations. Synthetic audiences remain a poor fit for fine-grained individual differences, underrepresented or non-Western groups, and high-stakes domains like policy or healthcare.<\/li>\n<p><!-- \/wp:list-item --><br \/>\n<!-- wp:list-item --><\/p>\n<li><strong> \u2022 Explore with AI, validate with humans. <\/strong> Even a modest dose of real-world data, used to check the AI&#8217;s answers, dramatically improves fidelity.<\/li>\n<p><!-- \/wp:list-item -->\n<\/ul>\n<p><!-- \/wp:list --><\/p>\n<p><!-- wp:heading --><\/p>\n<h2 class=\"wp-block-heading\">The bottom line<\/h2>\n<p><!-- \/wp:heading --><\/p>\n<p><!-- wp:paragraph --><\/p>\n<p>Synthetic audiences are one of the most exciting new tools in research, but they&#8217;re a tool, not a replacement for people. 
They&#8217;re best understood as a fast, cheap way to explore ideas and generate hypotheses, with their results treated as a starting point to be validated, not a final verdict.<\/p>\n<p><!-- \/wp:paragraph --><\/p>\n<p><!-- wp:paragraph --><\/p>\n<p>Used carefully, they can democratize research and let small teams ask big questions. Used carelessly, they produce confident-sounding answers that quietly reflect an AI&#8217;s blind spots rather than reality. The difference comes down to knowing what they can and can&#8217;t do, and, increasingly, to the open models, shared standards, and interpretability tools that will determine how far this technology can really go.<\/p>\n<p><!-- \/wp:paragraph --><\/p>\n<p><!-- wp:separator --><\/p>\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n<!-- \/wp:separator --><\/p>\n<p><!-- wp:paragraph --><\/p>\n<p><em>This post summarizes the systematic literature review &#8220;Synthetic Audiences as Proxies for Human Users,&#8221; published in IEEE Access, which analyzed 100 studies on LLM-based synthetic audiences following established review protocols. It&#8217;s written for a general audience; see the full paper for methods, evidence, and the complete set of 100 citations.<\/em><\/p>\n<p><!-- \/wp:paragraph --><\/p>\n<p><!-- wp:paragraph --><\/p>\n<p><strong>Read the full paper:<\/strong> Lappas &amp; Filippas, &#8220;Synthetic Audiences as Proxies for Human Users: A Systematic Literature Review,&#8221; <em>IEEE Access<\/em>, 2026. <a href=\"https:\/\/ieeexplore.ieee.org\/document\/11563566\/\">https:\/\/ieeexplore.ieee.org\/document\/11563566\/<\/a><\/p>\n<p><!-- \/wp:paragraph --><\/p>\n<p><!-- wp:heading --><\/p>\n<h2 class=\"wp-block-heading\">Sources &amp; further reading<\/h2>\n<p><!-- \/wp:heading --><\/p>\n<p><!-- wp:paragraph --><\/p>\n<p>The specific findings mentioned above come from the studies below (a curated subset of the 100 reviewed). 
Links are provided where a stable one exists.<\/p>\n<p><!-- \/wp:paragraph --><\/p>\n<p><!-- wp:paragraph --><\/p>\n<p><strong>Applications, scale, and cost<\/strong><\/p>\n<p><!-- \/wp:paragraph --><\/p>\n<p><!-- wp:list --><\/p>\n<ul class=\"wp-block-list\">\n<!-- wp:list-item --><\/p>\n<li>Argyle et al. (2023), &#8220;Out of one, many: Using language models to simulate human samples,&#8221; <em>Political Analysis.<\/em> <a href=\"https:\/\/doi.org\/10.1017\/pan.2023.2\">https:\/\/doi.org\/10.1017\/pan.2023.2<\/a><\/li>\n<p><!-- \/wp:list-item --><br \/>\n<!-- wp:list-item --><\/p>\n<li>H\u00e4m\u00e4l\u00e4inen et al. (2023), &#8220;Evaluating large language models in generating synthetic HCI research data,&#8221; <em>CHI.<\/em> <a href=\"https:\/\/doi.org\/10.1145\/3544548.3580688\">https:\/\/doi.org\/10.1145\/3544548.3580688<\/a><\/li>\n<p><!-- \/wp:list-item --><br \/>\n<!-- wp:list-item --><\/p>\n<li>Kazinnik (2024), &#8220;Bank run, interrupted: Modeling deposit withdrawals with generative AI,&#8221; <em>SSRN.<\/em> <a href=\"https:\/\/doi.org\/10.2139\/ssrn.4656722\">https:\/\/doi.org\/10.2139\/ssrn.4656722<\/a><\/li>\n<p><!-- \/wp:list-item --><br \/>\n<!-- wp:list-item --><\/p>\n<li>Park et al. (2023), &#8220;Generative agents: Interactive simulacra of human behavior,&#8221; <em>UIST<\/em> (the &#8220;Smallville&#8221; study). <a href=\"https:\/\/doi.org\/10.1145\/3586183.3606763\">https:\/\/doi.org\/10.1145\/3586183.3606763<\/a><\/li>\n<p><!-- \/wp:list-item --><br \/>\n<!-- wp:list-item --><\/p>\n<li>Piao et al. (2025), &#8220;AgentSociety: Large-scale simulation of LLM-driven generative agents,&#8221; <a href=\"https:\/\/arxiv.org\/abs\/2502.08691\">https:\/\/arxiv.org\/abs\/2502.08691<\/a><\/li>\n<p><!-- \/wp:list-item --><br \/>\n<!-- wp:list-item --><\/p>\n<li>Aher et al. 
(2023), &#8220;Using large language models to simulate multiple humans and replicate human subject studies,&#8221; <em>ICML.<\/em> <a href=\"https:\/\/proceedings.mlr.press\/v202\/aher23a.html\">https:\/\/proceedings.mlr.press\/v202\/aher23a.html<\/a><\/li>\n<p><!-- \/wp:list-item --><br \/>\n<!-- wp:list-item --><\/p>\n<li>Xie et al. (2024), &#8220;Can large language model agents simulate human trust behavior?,&#8221; <em>NeurIPS.<\/em> <a href=\"https:\/\/arxiv.org\/abs\/2402.04559\">https:\/\/arxiv.org\/abs\/2402.04559<\/a><\/li>\n<p><!-- \/wp:list-item --><br \/>\n<!-- wp:list-item --><\/p>\n<li>Li et al. (2024), &#8220;Frontiers: Determining the validity of large language models for automated perceptual analysis,&#8221; <em>Marketing Science.<\/em> <a href=\"https:\/\/doi.org\/10.1287\/mksc.2023.0454\">https:\/\/doi.org\/10.1287\/mksc.2023.0454<\/a><\/li>\n<p><!-- \/wp:list-item -->\n<\/ul>\n<p><!-- \/wp:list --><\/p>\n<p><!-- wp:paragraph --><\/p>\n<p><strong>Limitations<\/strong><\/p>\n<p><!-- \/wp:paragraph --><\/p>\n<p><!-- wp:list --><\/p>\n<ul class=\"wp-block-list\">\n<!-- wp:list-item --><\/p>\n<li>Park et al. (2024), &#8220;Diminished diversity-of-thought in a standard large language model,&#8221; <em>Behavior Research Methods<\/em> (the 37.5% vs 50% replication finding). <a href=\"https:\/\/doi.org\/10.3758\/s13428-023-02307-x\">https:\/\/doi.org\/10.3758\/s13428-023-02307-x<\/a><\/li>\n<p><!-- \/wp:list-item --><br \/>\n<!-- wp:list-item --><\/p>\n<li>Qu &amp; Wang (2024), &#8220;Performance and biases of large language models in public opinion simulation,&#8221; <em>Humanities and Social Sciences Communications.<\/em> <a href=\"https:\/\/doi.org\/10.1057\/s41599-024-03609-x\">https:\/\/doi.org\/10.1057\/s41599-024-03609-x<\/a><\/li>\n<p><!-- \/wp:list-item --><br \/>\n<!-- wp:list-item --><\/p>\n<li>Cheng et al. 
(2023), &#8220;CoMPosT: Characterizing and evaluating caricature in LLM simulations,&#8221; <a href=\"https:\/\/arxiv.org\/abs\/2310.11501\">https:\/\/arxiv.org\/abs\/2310.11501<\/a><\/li>\n<p><!-- \/wp:list-item --><br \/>\n<!-- wp:list-item --><\/p>\n<li>Wang et al. (2025), &#8220;Large language models that replace human participants can harmfully misportray and flatten identity groups,&#8221; <em>Nature Machine Intelligence.<\/em> <a href=\"https:\/\/doi.org\/10.1038\/s42256-025-00986-z\">https:\/\/doi.org\/10.1038\/s42256-025-00986-z<\/a><\/li>\n<p><!-- \/wp:list-item --><br \/>\n<!-- wp:list-item --><\/p>\n<li>Gadiraju et al. (2023), &#8220;&#8216;I wouldn&#8217;t say offensive but&#8230;&#8217;: Disability-centered perspectives on large language models,&#8221; <em>FAccT.<\/em> <a href=\"https:\/\/doi.org\/10.1145\/3593013.3593989\">https:\/\/doi.org\/10.1145\/3593013.3593989<\/a><\/li>\n<p><!-- \/wp:list-item --><br \/>\n<!-- wp:list-item --><\/p>\n<li>Bisbee et al. (2024), &#8220;Synthetic replacements for human survey data? The perils of large language models,&#8221; <em>Political Analysis.<\/em> <a href=\"https:\/\/doi.org\/10.1017\/pan.2024.3\">https:\/\/doi.org\/10.1017\/pan.2024.3<\/a><\/li>\n<p><!-- \/wp:list-item --><br \/>\n<!-- wp:list-item --><\/p>\n<li>Cherep et al. (2025), &#8220;LLM agents are hypersensitive to nudges,&#8221; <a href=\"https:\/\/arxiv.org\/abs\/2505.11584\">https:\/\/arxiv.org\/abs\/2505.11584<\/a><\/li>\n<p><!-- \/wp:list-item --><br \/>\n<!-- wp:list-item --><\/p>\n<li>Gui &amp; Toubia (2025), &#8220;The challenge of using LLMs to simulate human behavior: A causal inference perspective,&#8221; <a href=\"https:\/\/arxiv.org\/abs\/2312.15524\">https:\/\/arxiv.org\/abs\/2312.15524<\/a><\/li>\n<p><!-- \/wp:list-item --><br \/>\n<!-- wp:list-item --><\/p>\n<li>Xu et al. 
(2025), &#8220;Hallucination is inevitable: An innate limitation of large language models,&#8221; <a href=\"https:\/\/arxiv.org\/abs\/2401.11817\">https:\/\/arxiv.org\/abs\/2401.11817<\/a><\/li>\n<p><!-- \/wp:list-item -->\n<\/ul>\n<p><!-- \/wp:list --><\/p>\n<p><!-- wp:paragraph --><\/p>\n<p><strong>Improvement techniques and human-AI hybrids<\/strong><\/p>\n<p><!-- \/wp:paragraph --><\/p>\n<p><!-- wp:list --><\/p>\n<ul class=\"wp-block-list\">\n<!-- wp:list-item --><\/p>\n<li>Moon et al. (2024), &#8220;Virtual personas for language models via an anthology of backstories,&#8221; <a href=\"https:\/\/arxiv.org\/abs\/2407.06576\">https:\/\/arxiv.org\/abs\/2407.06576<\/a><\/li>\n<p><!-- \/wp:list-item --><br \/>\n<!-- wp:list-item --><\/p>\n<li>Suh et al. (2025), &#8220;Language model fine-tuning on scaled survey data (SubPOP),&#8221; <a href=\"https:\/\/arxiv.org\/abs\/2502.16761\">https:\/\/arxiv.org\/abs\/2502.16761<\/a><\/li>\n<p><!-- \/wp:list-item --><br \/>\n<!-- wp:list-item --><\/p>\n<li>Angelopoulos et al. (2023), &#8220;Prediction-powered inference,&#8221; <em>Science.<\/em> <a href=\"https:\/\/doi.org\/10.1126\/science.adi6000\">https:\/\/doi.org\/10.1126\/science.adi6000<\/a><\/li>\n<p><!-- \/wp:list-item --><br \/>\n<!-- wp:list-item --><\/p>\n<li>De Bartolomeis et al. (2025), &#8220;Efficient randomized experiments using foundation models,&#8221; <a href=\"https:\/\/arxiv.org\/abs\/2502.04262\">https:\/\/arxiv.org\/abs\/2502.04262<\/a> (the 20% to 30% figure).<\/li>\n<p><!-- \/wp:list-item --><br \/>\n<!-- wp:list-item --><\/p>\n<li>Wang et al. 
(2024), &#8220;Large language models for market research: A data-augmentation approach,&#8221; <a href=\"https:\/\/arxiv.org\/abs\/2412.19363\">https:\/\/arxiv.org\/abs\/2412.19363<\/a> (the up-to-80% figure).<\/li>\n<p><!-- \/wp:list-item -->\n<\/ul>\n<p><!-- \/wp:list --><\/p>\n<p><!-- wp:paragraph --><\/p>\n<p><strong>Future directions<\/strong><\/p>\n<p><!-- \/wp:paragraph --><\/p>\n<p><!-- wp:list --><\/p>\n<ul class=\"wp-block-list\">\n<!-- wp:list-item --><\/p>\n<li>Binz et al. (2024), &#8220;Centaur: A foundation model of human cognition,&#8221; <a href=\"https:\/\/arxiv.org\/abs\/2410.20268\">https:\/\/arxiv.org\/abs\/2410.20268<\/a><\/li>\n<p><!-- \/wp:list-item --><br \/>\n<!-- wp:list-item --><\/p>\n<li>Shapira et al. (2024), &#8220;GLEE: A unified framework and benchmark for language-based economic environments,&#8221; <a href=\"https:\/\/arxiv.org\/abs\/2410.05254\">https:\/\/arxiv.org\/abs\/2410.05254<\/a><\/li>\n<p><!-- \/wp:list-item --><br \/>\n<!-- wp:list-item --><\/p>\n<li>Sharkey et al. (2025), &#8220;Open problems in mechanistic interpretability,&#8221; <a href=\"https:\/\/arxiv.org\/abs\/2501.16496\">https:\/\/arxiv.org\/abs\/2501.16496<\/a><\/li>\n<p><!-- \/wp:list-item -->\n<\/ul>\n<p><!-- \/wp:list --><\/p>\n","content_quarter":"","related_pods":[1362]},"research_categories":[],"raw_acf":{"content":"<!-- wp:paragraph -->\n<p>Imagine running a focus group with ten thousand \"people\" overnight, for roughly the price of a few coffees. No recruiting, no scheduling, no drop-outs. Just ask your questions and get answers by morning.<\/p>\n<!-- \/wp:paragraph -->\n\n<!-- wp:paragraph -->\n<p>That's the promise of <strong>synthetic audiences<\/strong>: groups of AI agents, powered by the same large language models (LLMs) behind tools like ChatGPT, configured to stand in for human participants in surveys, experiments, and market research. 
The idea has exploded across marketing, psychology, economics, and design in just a couple of years.<\/p>\n<!-- \/wp:paragraph -->\n\n<!-- wp:paragraph -->\n<p>In a <a href=\"https:\/\/ieeexplore.ieee.org\/document\/11563566\/\"> recently published systematic review of over 100 related studies<\/a>, we looked at what these AI \"participants\" can actually do, where they fall apart, and how to use them in practice. This blog post present a summary of the main findings. We also list some of the key references at the end of the post.<\/p>\n<!-- \/wp:paragraph -->\n\n<!-- wp:heading -->\n<h2 class=\"wp-block-heading\">Why everyone is excited<\/h2>\n<!-- \/wp:heading -->\n\n<!-- wp:paragraph -->\n<p>The appeal becomes obvious the moment you look at the numbers. Studies that directly compared AI participants with human recruitment found cost differences of <strong>one to two orders of magnitude<\/strong> (that is, 10\u00d7 to 100\u00d7 cheaper per response). In one case, researchers generated tens of thousands of survey responses for roughly the API cost of a single query; in another, AI-generated open-ended answers for a user study cost a tiny fraction of what the equivalent human participants would have charged.<\/p>\n<!-- \/wp:paragraph -->\n\n<!-- wp:paragraph -->\n<p>Beyond cost, there are practical superpowers:<\/p>\n<!-- \/wp:paragraph -->\n\n<!-- wp:list -->\n<ul class=\"wp-block-list\">\n<!-- wp:list-item -->\n<li><strong>\u2022 No logistics headaches.<\/strong> An AI agent always answers, never gets bored, and never abandons your survey halfway through.<\/li>\n<!-- \/wp:list-item -->\n<!-- wp:list-item -->\n<li><strong>\u2022 Experiments you otherwise couldn't run.<\/strong> One team simulated a <em>bank run<\/em> using tens of thousands of AI-generated depositors across hundreds of demographic groups, a scenario you obviously can't stage with real customers and real money. 
Others have built entire simulated societies: sandbox towns of AI \"residents\" that develop their own social dynamics, and large-scale models with <strong>over 10,000 agents<\/strong> interacting in urban and economic environments.<\/li>\n<!-- \/wp:list-item -->\n<\/ul>\n<!-- \/wp:list -->\n\n<!-- wp:heading -->\n<h2 class=\"wp-block-heading\">Where they actually shine<\/h2>\n<!-- \/wp:heading -->\n\n<!-- wp:paragraph -->\n<p>The honest headline from the research: synthetic audiences are good at capturing <strong>broad, aggregate patterns<\/strong> (the overall direction of an effect, rankings, and average tendencies), even when they miss the fine detail.<\/p>\n<!-- \/wp:paragraph -->\n\n<!-- wp:paragraph -->\n<p>The more capable the model, the better this gets. Newer models have reproduced classic results from economics and social psychology, closely matched human behavior in trust games, and broadly agreed with human raters on things like brand perceptions. In other words, if your question is \"which option do people lean toward?\" an AI panel can often point you in the right direction, fast.<\/p>\n<!-- \/wp:paragraph -->\n\n<!-- wp:heading -->\n<h2 class=\"wp-block-heading\">Where they break<\/h2>\n<!-- \/wp:heading -->\n\n<!-- wp:paragraph -->\n<p>The catch is that AI participants <em>look<\/em> convincing, and that's exactly the trap. Fluent, human-sounding answers make it easy to assume there's human-like thinking behind them. There usually isn't. The review found four recurring failure modes, each backed by hard evidence.<\/p>\n<!-- \/wp:paragraph -->\n\n<!-- wp:paragraph -->\n<p><strong>1. They flatten the crowd.<\/strong> Real people disagree, surprise you, and sit at the extremes. AI participants tend to cluster around a bland \"average.\" One large replication study found an earlier model reproduced only <strong>37.5%<\/strong> of a well-known set of psychology effects, versus a <strong>50%<\/strong> rate for human samples. 
More tellingly, it kept giving near-identical answers to questions that normally produce lots of human variety. Other work documented a \"hyper-accuracy\" quirk, where AI participants give unrealistically precise, noise-free answers no real population would.<\/p>\n<!-- \/wp:paragraph -->\n\n<!-- wp:paragraph -->\n<p><strong>2. They carry hidden biases.<\/strong> Because they learn from internet-scale text, AI participants tend to sound disproportionately Western, English-speaking, wealthy, and politically skewed. One study comparing AI-simulated public opinion against a global survey found it performed far better for Western, English-speaking, high-income countries while systematically misrepresenting attitudes across gender, age, education, and class. Worse, when asked to portray specific groups, models often produce <em>caricatures<\/em>. Research on such groups found AI depictions that actual human participants judged not just inaccurate but actively harmful. The uncomfortable implication: synthetic audiences are least reliable precisely where fresh insight is most valuable, namely underrepresented and non-Western populations.<\/p>\n<!-- \/wp:paragraph -->\n\n<!-- wp:paragraph -->\n<p><strong>3. They're twitchy.<\/strong> Reword a question slightly, reorder the options, or tweak the persona, and the answers can swing dramatically. Studies showed AI survey responses with unstable estimates and strong sensitivity to wording, persona, and even <em>timing<\/em>. In decision-making tasks, AI agents proved far more sensitive than humans to small \"nudges\": a tiny change in framing produced outsized changes in behavior. That fragility makes results hard to reproduce.<\/p>\n<!-- \/wp:paragraph -->\n\n<!-- wp:paragraph -->\n<p><strong>4. They can fake reasoning.<\/strong> An answer that looks thoughtful may just be pattern-matching on things the model absorbed during training. 
Researchers showed that changing one variable in a study (say, a product's price) can quietly shift the model's <em>unstated assumptions<\/em> about everything else, contaminating any cause-and-effect conclusion. In another case, near-perfect economic \"forecasts\" turned out to be memorized history rather than genuine prediction. Add to this the finding that hallucinations are, mathematically, an unavoidable feature of these models, not a bug that will simply be patched away.<\/p>\n<!-- \/wp:paragraph -->\n\n<!-- wp:heading -->\n<h2 class=\"wp-block-heading\">How researchers are fixing them<\/h2>\n<!-- \/wp:heading -->\n\n<!-- wp:paragraph -->\n<p>The field isn't standing still. Three families of techniques are making synthetic audiences more trustworthy:<\/p>\n<!-- \/wp:paragraph -->\n\n<!-- wp:list -->\n<ul class=\"wp-block-list\">\n<!-- wp:list-item -->\n<li><strong>\u2022 Better prompting.<\/strong> Giving the AI richer personas (detailed demographics, life-story backstories, even real interview transcripts) measurably improves how well it matches a target group. The gains are real but fragile, since the same prompt-sensitivity problem lurks underneath.<\/li>\n<!-- \/wp:list-item -->\n<!-- wp:list-item -->\n<li><strong>\u2022 Fine-tuning.<\/strong> Training models on real human responses for a specific population improves accuracy, though it tends to compress diversity unless done carefully.<\/li>\n<!-- \/wp:list-item -->\n<!-- wp:list-item -->\n<li><strong>\u2022 Human-AI hybrids, the most reliable approach.<\/strong> Instead of replacing people, you collect a <em>small<\/em> human sample and use it to calibrate and correct the AI's output. 
A statistical framework called prediction-powered inference makes this rigorous, and studies report cutting the required number of human participants by <strong>20% to 30%<\/strong> in some settings, while another approach reduced the human data needed by <strong>up to 80%<\/strong>, all without sacrificing statistical validity. You keep much of the speed and cost savings while keeping the results honest.<\/li>\n<!-- \/wp:list-item -->\n<\/ul>\n<!-- \/wp:list -->\n\n<!-- wp:heading -->\n<h2 class=\"wp-block-heading\">The promising road ahead<\/h2>\n<!-- \/wp:heading -->\n\n<!-- wp:paragraph -->\n<p>What makes this field genuinely exciting isn't just today's tools; it's where the research is heading. The review highlights three directions that could turn synthetic audiences from a clever shortcut into a dependable scientific instrument.<\/p>\n<!-- \/wp:paragraph -->\n\n<!-- wp:paragraph -->\n<p><strong>1. Open, transparent models built for human behavior.<\/strong> Most studies so far rely on closed, proprietary systems whose training data and updates are secret, which makes results hard to audit or reproduce (a \"silent\" update can change behavior between studies). Open models change that. The standout example is <em>Centaur<\/em>, a model fine-tuned on <strong>over 10 million real human decisions across 160 psychology experiments<\/strong>. It outperformed traditional models at predicting how people actually behave and generalized to new tasks, offering a glimpse of purpose-built, inspectable \"model populations\" rather than borrowed chatbots.<\/p>\n<!-- \/wp:paragraph -->\n\n<!-- wp:paragraph -->\n<p><strong>2. Standards, benchmarks, and reproducibility.<\/strong> Right now, every team prompts differently, making studies hard to compare. 
The field is moving toward shared benchmarks and tools, such as standardized environments for running AI-based surveys and experiments, so researchers can tell whether an improvement truly generalizes or just got lucky on one prompt. This is the unglamorous infrastructure that turned modern machine learning into a cumulative science, and synthetic audiences need it too.<\/p>\n<!-- \/wp:paragraph -->\n\n<!-- wp:paragraph -->\n<p><strong>3. Looking inside the black box.<\/strong> The frontier is <em>mechanistic interpretability<\/em>: tools that trace which internal \"circuits\" of a model produce a given behavior. The payoff would be enormous: imagine pinpointing the exact components that trigger a demographic stereotype and dialing them down, or verifying that a persona prompt genuinely activates the right knowledge rather than just changing the writing style. This work is early and mostly aspirational today, but it points toward synthetic audiences we can actually diagnose and correct, rather than merely observe.<\/p>\n<!-- \/wp:paragraph -->\n\n<!-- wp:heading -->\n<h2 class=\"wp-block-heading\">Best Practice<\/h2>\n<!-- \/wp:heading -->\n\n<!-- wp:paragraph -->\n<p>If you're tempted to use AI participants, the research suggests:<\/p>\n<!-- \/wp:paragraph -->\n\n<!-- wp:list -->\n<ul class=\"wp-block-list\">\n<!-- wp:list-item -->\n<li><strong> \u2022 Evaluate the fit.<\/strong> The evidence is promising for early exploration, pilots, brainstorming, broad directional questions, and well-represented (Western, English-speaking) populations. Synthetic audiences remain a poor fit for fine-grained individual differences, underrepresented or non-Western groups, and high-stakes domains like policy or healthcare.<\/li>\n<!-- \/wp:list-item -->\n<!-- wp:list-item -->\n<li><strong> \u2022 Explore with AI, validate with humans. 
<\/strong> Even a modest dose of real-world data, used to check the AI's answers, dramatically improves fidelity.<\/li>\n<!-- \/wp:list-item -->\n<\/ul>\n<!-- \/wp:list -->\n\n<!-- wp:heading -->\n<h2 class=\"wp-block-heading\">The bottom line<\/h2>\n<!-- \/wp:heading -->\n\n<!-- wp:paragraph -->\n<p>Synthetic audiences are one of the most exciting new tools in research, but they're a tool, not a replacement for people. They're best understood as a fast, cheap way to explore ideas and generate hypotheses, with their results treated as a starting point to be validated, not a final verdict.<\/p>\n<!-- \/wp:paragraph -->\n\n<!-- wp:paragraph -->\n<p>Used carefully, they can democratize research and let small teams ask big questions. Used carelessly, they produce confident-sounding answers that quietly reflect an AI's blind spots rather than reality. The difference comes down to knowing what they can and can't do, and, increasingly, to the open models, shared standards, and interpretability tools that will determine how far this technology can really go.<\/p>\n<!-- \/wp:paragraph -->\n\n<!-- wp:separator -->\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n<!-- \/wp:separator -->\n\n<!-- wp:paragraph -->\n<p><em>This post summarizes the systematic literature review \"Synthetic Audiences as Proxies for Human Users,\" published in IEEE Access, which analyzed 100 studies on LLM-based synthetic audiences following established review protocols. It's written for a general audience; see the full paper for methods, evidence, and the complete set of 100 citations.<\/em><\/p>\n<!-- \/wp:paragraph -->\n\n<!-- wp:paragraph -->\n<p><strong>Read the full paper:<\/strong> Lappas &amp; Filippas, \"Synthetic Audiences as Proxies for Human Users: A Systematic Literature Review,\" <em>IEEE Access<\/em>, 2026. 
<a href=\"https:\/\/ieeexplore.ieee.org\/document\/11563566\/\">https:\/\/ieeexplore.ieee.org\/document\/11563566\/<\/a><\/p>\n<!-- \/wp:paragraph -->\n\n<!-- wp:heading -->\n<h2 class=\"wp-block-heading\">Sources &amp; further reading<\/h2>\n<!-- \/wp:heading -->\n\n<!-- wp:paragraph -->\n<p>The specific findings mentioned above come from the studies below (a curated subset of the 100 reviewed). Links are provided where a stable one exists.<\/p>\n<!-- \/wp:paragraph -->\n\n<!-- wp:paragraph -->\n<p><strong>Applications, scale, and cost<\/strong><\/p>\n<!-- \/wp:paragraph -->\n\n<!-- wp:list -->\n<ul class=\"wp-block-list\">\n<!-- wp:list-item -->\n<li>Argyle et al. (2023), \"Out of one, many: Using language models to simulate human samples,\" <em>Political Analysis.<\/em> <a href=\"https:\/\/doi.org\/10.1017\/pan.2023.2\">https:\/\/doi.org\/10.1017\/pan.2023.2<\/a><\/li>\n<!-- \/wp:list-item -->\n<!-- wp:list-item -->\n<li>H\u00e4m\u00e4l\u00e4inen et al. (2023), \"Evaluating large language models in generating synthetic HCI research data,\" <em>CHI.<\/em> <a href=\"https:\/\/doi.org\/10.1145\/3544548.3580688\">https:\/\/doi.org\/10.1145\/3544548.3580688<\/a><\/li>\n<!-- \/wp:list-item -->\n<!-- wp:list-item -->\n<li>Kazinnik (2024), \"Bank run, interrupted: Modeling deposit withdrawals with generative AI,\" <em>SSRN.<\/em> <a href=\"https:\/\/doi.org\/10.2139\/ssrn.4656722\">https:\/\/doi.org\/10.2139\/ssrn.4656722<\/a><\/li>\n<!-- \/wp:list-item -->\n<!-- wp:list-item -->\n<li>Park et al. (2023), \"Generative agents: Interactive simulacra of human behavior,\" <em>UIST<\/em> (the \"Smallville\" study). <a href=\"https:\/\/doi.org\/10.1145\/3586183.3606763\">https:\/\/doi.org\/10.1145\/3586183.3606763<\/a><\/li>\n<!-- \/wp:list-item -->\n<!-- wp:list-item -->\n<li>Piao et al. 
(2025), \"AgentSociety: Large-scale simulation of LLM-driven generative agents,\" <a href=\"https:\/\/arxiv.org\/abs\/2502.08691\">https:\/\/arxiv.org\/abs\/2502.08691<\/a><\/li>\n<!-- \/wp:list-item -->\n<!-- wp:list-item -->\n<li>Aher et al. (2023), \"Using large language models to simulate multiple humans and replicate human subject studies,\" <em>ICML.<\/em> <a href=\"https:\/\/proceedings.mlr.press\/v202\/aher23a.html\">https:\/\/proceedings.mlr.press\/v202\/aher23a.html<\/a><\/li>\n<!-- \/wp:list-item -->\n<!-- wp:list-item -->\n<li>Xie et al. (2024), \"Can large language model agents simulate human trust behavior?,\" <em>NeurIPS.<\/em> <a href=\"https:\/\/arxiv.org\/abs\/2402.04559\">https:\/\/arxiv.org\/abs\/2402.04559<\/a><\/li>\n<!-- \/wp:list-item -->\n<!-- wp:list-item -->\n<li>Li et al. (2024), \"Frontiers: Determining the validity of large language models for automated perceptual analysis,\" <em>Marketing Science.<\/em> <a href=\"https:\/\/doi.org\/10.1287\/mksc.2023.0454\">https:\/\/doi.org\/10.1287\/mksc.2023.0454<\/a><\/li>\n<!-- \/wp:list-item -->\n<\/ul>\n<!-- \/wp:list -->\n\n<!-- wp:paragraph -->\n<p><strong>Limitations<\/strong><\/p>\n<!-- \/wp:paragraph -->\n\n<!-- wp:list -->\n<ul class=\"wp-block-list\">\n<!-- wp:list-item -->\n<li>Park et al. (2024), \"Diminished diversity-of-thought in a standard large language model,\" <em>Behavior Research Methods<\/em> (the 37.5% vs 50% replication finding). <a href=\"https:\/\/doi.org\/10.3758\/s13428-023-02307-x\">https:\/\/doi.org\/10.3758\/s13428-023-02307-x<\/a><\/li>\n<!-- \/wp:list-item -->\n<!-- wp:list-item -->\n<li>Qu &amp; Wang (2024), \"Performance and biases of large language models in public opinion simulation,\" <em>Humanities and Social Sciences Communications.<\/em> <a href=\"https:\/\/doi.org\/10.1057\/s41599-024-03609-x\">https:\/\/doi.org\/10.1057\/s41599-024-03609-x<\/a><\/li>\n<!-- \/wp:list-item -->\n<!-- wp:list-item -->\n<li>Cheng et al. 
(2023), \"CoMPosT: Characterizing and evaluating caricature in LLM simulations,\" <a href=\"https:\/\/arxiv.org\/abs\/2310.11501\">https:\/\/arxiv.org\/abs\/2310.11501<\/a><\/li>\n<!-- \/wp:list-item -->\n<!-- wp:list-item -->\n<li>Wang et al. (2025), \"Large language models that replace human participants can harmfully misportray and flatten identity groups,\" <em>Nature Machine Intelligence.<\/em> <a href=\"https:\/\/doi.org\/10.1038\/s42256-025-00986-z\">https:\/\/doi.org\/10.1038\/s42256-025-00986-z<\/a><\/li>\n<!-- \/wp:list-item -->\n<!-- wp:list-item -->\n<li>Gadiraju et al. (2023), \"'I wouldn't say offensive but...': Disability-centered perspectives on large language models,\" <em>FAccT.<\/em> <a href=\"https:\/\/doi.org\/10.1145\/3593013.3593989\">https:\/\/doi.org\/10.1145\/3593013.3593989<\/a><\/li>\n<!-- \/wp:list-item -->\n<!-- wp:list-item -->\n<li>Bisbee et al. (2024), \"Synthetic replacements for human survey data? The perils of large language models,\" <em>Political Analysis.<\/em> <a href=\"https:\/\/doi.org\/10.1017\/pan.2024.3\">https:\/\/doi.org\/10.1017\/pan.2024.3<\/a><\/li>\n<!-- \/wp:list-item -->\n<!-- wp:list-item -->\n<li>Cherep et al. (2025), \"LLM agents are hypersensitive to nudges,\" <a href=\"https:\/\/arxiv.org\/abs\/2505.11584\">https:\/\/arxiv.org\/abs\/2505.11584<\/a><\/li>\n<!-- \/wp:list-item -->\n<!-- wp:list-item -->\n<li>Gui &amp; Toubia (2025), \"The challenge of using LLMs to simulate human behavior: A causal inference perspective,\" <a href=\"https:\/\/arxiv.org\/abs\/2312.15524\">https:\/\/arxiv.org\/abs\/2312.15524<\/a><\/li>\n<!-- \/wp:list-item -->\n<!-- wp:list-item -->\n<li>Xu et al. 
(2025), \"Hallucination is inevitable: An innate limitation of large language models,\" <a href=\"https:\/\/arxiv.org\/abs\/2401.11817\">https:\/\/arxiv.org\/abs\/2401.11817<\/a><\/li>\n<!-- \/wp:list-item -->\n<\/ul>\n<!-- \/wp:list -->\n\n<!-- wp:paragraph -->\n<p><strong>Improvement techniques and human-AI hybrids<\/strong><\/p>\n<!-- \/wp:paragraph -->\n\n<!-- wp:list -->\n<ul class=\"wp-block-list\">\n<!-- wp:list-item -->\n<li>Moon et al. (2024), \"Virtual personas for language models via an anthology of backstories,\" <a href=\"https:\/\/arxiv.org\/abs\/2407.06576\">https:\/\/arxiv.org\/abs\/2407.06576<\/a><\/li>\n<!-- \/wp:list-item -->\n<!-- wp:list-item -->\n<li>Suh et al. (2025), \"Language model fine-tuning on scaled survey data (SubPOP),\" <a href=\"https:\/\/arxiv.org\/abs\/2502.16761\">https:\/\/arxiv.org\/abs\/2502.16761<\/a><\/li>\n<!-- \/wp:list-item -->\n<!-- wp:list-item -->\n<li>Angelopoulos et al. (2023), \"Prediction-powered inference,\" <em>Science.<\/em> <a href=\"https:\/\/doi.org\/10.1126\/science.adi6000\">https:\/\/doi.org\/10.1126\/science.adi6000<\/a><\/li>\n<!-- \/wp:list-item -->\n<!-- wp:list-item -->\n<li>De Bartolomeis et al. (2025), \"Efficient randomized experiments using foundation models,\" <a href=\"https:\/\/arxiv.org\/abs\/2502.04262\">https:\/\/arxiv.org\/abs\/2502.04262<\/a> (the 20% to 30% figure).<\/li>\n<!-- \/wp:list-item -->\n<!-- wp:list-item -->\n<li>Wang et al. (2024), \"Large language models for market research: A data-augmentation approach,\" <a href=\"https:\/\/arxiv.org\/abs\/2412.19363\">https:\/\/arxiv.org\/abs\/2412.19363<\/a> (the up-to-80% figure).<\/li>\n<!-- \/wp:list-item -->\n<\/ul>\n<!-- \/wp:list -->\n\n<!-- wp:paragraph -->\n<p><strong>Future directions<\/strong><\/p>\n<!-- \/wp:paragraph -->\n\n<!-- wp:list -->\n<ul class=\"wp-block-list\">\n<!-- wp:list-item -->\n<li>Binz et al. 
(2024), \"Centaur: A foundation model of human cognition,\" <a href=\"https:\/\/arxiv.org\/abs\/2410.20268\">https:\/\/arxiv.org\/abs\/2410.20268<\/a><\/li>\n<!-- \/wp:list-item -->\n<!-- wp:list-item -->\n<li>Shapira et al. (2024), \"GLEE: A unified framework and benchmark for language-based economic environments,\" <a href=\"https:\/\/arxiv.org\/abs\/2410.05254\">https:\/\/arxiv.org\/abs\/2410.05254<\/a><\/li>\n<!-- \/wp:list-item -->\n<!-- wp:list-item -->\n<li>Sharkey et al. (2025), \"Open problems in mechanistic interpretability,\" <a href=\"https:\/\/arxiv.org\/abs\/2501.16496\">https:\/\/arxiv.org\/abs\/2501.16496<\/a><\/li>\n<!-- \/wp:list-item -->\n<\/ul>\n<!-- \/wp:list -->","content_quarter":"","related_pods":["1362"],"featured":"","legacy_perspective_source_id":""},"featured_image_url":"https:\/\/cms.research.wpp.com\/wp-content\/uploads\/2026\/06\/dc0883f4-f916-4361-9cdf-03e2768c2348.jpg","featured_image_sizes":{"thumbnail":"https:\/\/cms.research.wpp.com\/wp-content\/uploads\/2026\/06\/dc0883f4-f916-4361-9cdf-03e2768c2348-150x150.jpg","medium":"https:\/\/cms.research.wpp.com\/wp-content\/uploads\/2026\/06\/dc0883f4-f916-4361-9cdf-03e2768c2348-300x164.jpg","large":"https:\/\/cms.research.wpp.com\/wp-content\/uploads\/2026\/06\/dc0883f4-f916-4361-9cdf-03e2768c2348.jpg","full":"https:\/\/cms.research.wpp.com\/wp-content\/uploads\/2026\/06\/dc0883f4-f916-4361-9cdf-03e2768c2348.jpg"},"_links":{"self":[{"href":"https:\/\/cms.research.wpp.com\/index.php?rest_route=\/wp\/v2\/research_feed\/1752","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/cms.research.wpp.com\/index.php?rest_route=\/wp\/v2\/research_feed"}],"about":[{"href":"https:\/\/cms.research.wpp.com\/index.php?rest_route=\/wp\/v2\/types\/research_feed"}],"author":[{"embeddable":true,"href":"https:\/\/cms.research.wpp.com\/index.php?rest_route=\/wp\/v2\/users\/5"}],"acf:post":[{"embeddable":true,"href":"https:\/\/cms.research.wpp.com\/index.php?rest_route=\/wp\/v2\/research_pods\/13
62"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/cms.research.wpp.com\/index.php?rest_route=\/wp\/v2\/media\/1756"}],"wp:attachment":[{"href":"https:\/\/cms.research.wpp.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1752"}],"wp:term":[{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/cms.research.wpp.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1752"},{"taxonomy":"content_type","embeddable":true,"href":"https:\/\/cms.research.wpp.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcontent_types&post=1752"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/cms.research.wpp.com\/index.php?rest_route=%2Fwp%2Fv2%2Fppma_author&post=1752"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}