Synthetic
Synthetic respondents are AI-generated personas that take a survey end to end. Each persona has a rich profile — demographics, personality, values, behaviors, brand affinities — and answers every question in character. Responses land in the same data store as real responses, tagged with source=synthetic so you can include them, exclude them, or compare them per analysis.
Synthetic panels are account-scoped and reusable. Build a panel once, run it against as many surveys as you want over weeks or months.
What you can do with it
| Use case | What it gets you |
|---|---|
| Validate a hypothesis | Run 30 synthetics against your most controversial hypothesis. If the signal isn't there even directionally, rewrite the question before spending on real respondents. |
| Concept testing | Run three product copy variants past the same panel and compare the relative lift. |
| Pre-flight a study | See the shape of the data before you pay for real respondents. Spot leading wording, routing dead-ends, or missing answer options in minutes. |
| Longitudinal cohorts | One panel, multiple surveys over time. Run a 20-person "EV-curious US drivers" panel against your Q1 study, then your Q2 follow-up — same personas, comparable answers. |
| Calibrate against real | Run the same survey against a synthetic panel and a paid Prolific cohort. Compare distributions to learn where synthetic tracks real for your audience. |
How a persona is built
When you create a synthetic panel, each persona materializes with a full profile. The persona simulator draws from four layers:
Demographics
Age, gender, location, education, income bracket, occupation, marital status, children, ethnicity.
Psychographics
Personality traits, values, interests, lifestyle.
Behaviors
Media consumption habits, shopping behavior, tech adoption patterns, brand loyalty and affinities.
Survey traits
Survey attitude (engaged, skeptical, or rushed) and trust level. These shape how the persona responds — a skeptical persona leaves shorter open-ends and is more likely to pick neutral scale points, mirroring real respondent variance.
You do not author these fields by hand. Describe the audience in plain language — "tech-forward US millennials who own at least one EV" — and the platform fills in the profile. The richer your description, the more coherent the personas.
Grounding in real demographic data
For higher fidelity, ground a panel in real population distributions. Ask for demographic data on any geography — "give me demographics for Austin, TX" — and pass that context into panel creation. Personas are then generated with distributions that track the real population for that location.
This is useful whenever you want a panel to mirror a real place (a US state, a metro area, a country) rather than a hand-described audience segment.
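To make "distributions that track the real population" concrete, here is a minimal sketch of one way a small panel could be split across demographic brackets in proportion to population shares. It uses largest-remainder allocation, which is an assumption chosen for illustration and not a description of how the platform generates personas. The `AGE_SHARES_PCT` numbers and the `allocate_quota` helper are hypothetical.

```python
# Hypothetical age-bracket shares (percent) for some location.
# Illustrative numbers only, not real census data.
AGE_SHARES_PCT = {"18-29": 30, "30-44": 30, "45-64": 25, "65+": 15}


def allocate_quota(shares, n):
    """Split n personas across categories in proportion to their shares.

    Uses largest-remainder rounding so the counts always sum to n.
    """
    total = sum(shares.values())
    exact = {k: n * v / total for k, v in shares.items()}
    counts = {k: int(x) for k, x in exact.items()}
    leftover = n - sum(counts.values())
    # Give the remaining seats to the categories with the largest fractions.
    by_remainder = sorted(exact, key=lambda k: exact[k] - counts[k], reverse=True)
    for k in by_remainder[:leftover]:
        counts[k] += 1
    return counts


panel_quota = allocate_quota(AGE_SHARES_PCT, 20)
print(panel_quota)
```

With a 20-persona panel, each bracket can only move in steps of 5 percentage points, so small panels track a population coarsely. Larger panels follow the source distribution more closely.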
The instant-insight loop
A typical session looks like this:
1. Create a survey (seconds)
2. Create a synthetic panel (seconds -- personas materialize)
3. Have the panel take the survey (minutes -- responses stream in)
4. Analyze as data lands (live -- charts update per response)
5. Iterate (edit the survey or spin up another panel)
Steps 3 and 4 overlap. The analysis surface updates as each persona finishes, so by the time the last persona completes, the charts already reflect the full cohort.
Typical workflow
| Round | What you do |
|---|---|
| 1 | Draft survey, run 20-30 synthetics, look at where responses cluster oddly or break. Fix the survey. |
| 2 | Re-run 50 synthetics with the cleaned survey. Read directional signal across your hypotheses. |
| 3 | Adjust. Optionally create a second panel with a different profile to compare two audiences. |
| 4 | If the signal is strong and you need defensible numbers, launch a paid Prolific or Dynata cohort. Compare against your synthetics to validate. |
Most exploratory work stops at round 2 or 3. Round 4 matters when you need to publish headline numbers to an external audience.
Agent prompts
Drive the whole flow from chat. The agent creates, fields, and analyzes without requiring you to manage IDs or navigate menus.
Create a panel:
Create a synthetic panel of 20 tech-forward US millennials who own EVs.
Create a survey and field it:
Create a 5-question survey on charging-station preferences, then have the panel take it.
Analyze results:
Give me the top 3 takeaways from the synthetic run.
Reuse a panel across surveys:
Have my EV-owners panel take this new survey.
Compare against real respondents:
Now launch 30 Prolific respondents on the same survey and show me NPS by panel.
The agent reuses the panel and survey from context — you do not need to repeat IDs or names.
How synthetic responses are ingested
Synthetic responses are submitted via an internal service-token-authenticated endpoint. Each response carries:
- data: answers keyed by question label (e.g., {"Q1": "1", "Q2": "3"}).
- persona: the full persona profile, stored as respondent_metadata.
- status: COMPLETE or INCOMPLETE.
- panel_id: the synthetic panel UUID, used for segmentation in analysis.
Every row is written with source=synthetic and accounted=False. Because accounted is a bookkeeping flag, synthetic responses never consume real-world quotas. Whether synthetic rows appear in analysis is controlled separately by the analytics toggle, which does not change this flag.
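As a rough illustration of the shape of one submitted response, the sketch below assembles the four fields named above into a dictionary. The field names come from this page. The exact wire format, the persona keys, and the `build_synthetic_response` helper are assumptions for illustration, and the endpoint itself is internal and not shown.

```python
import json
import uuid


def build_synthetic_response(answers, persona, panel_id, complete=True):
    """Assemble one response record with the four documented fields."""
    return {
        "data": answers,  # answers keyed by question label
        "persona": persona,  # stored server-side as respondent_metadata
        "status": "COMPLETE" if complete else "INCOMPLETE",
        "panel_id": str(panel_id),  # synthetic panel UUID
    }


example = build_synthetic_response(
    answers={"Q1": "1", "Q2": "3"},
    persona={"age": 34, "survey_attitude": "skeptical"},  # hypothetical keys
    panel_id=uuid.uuid4(),  # placeholder UUID for the example
)
print(json.dumps(example, indent=2))
```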
Synthetic responses are allowed on draft surveys — the whole point is to validate a survey before it goes live. No rate limiting, no IP capture, no fraud checks (those belong on the real-respondent pipeline).
Comparing synthetic vs. real
Every response carries a panel identifier. Analysis lets you slice and compare:
- Stack panels in one crosstab — "show NPS by panel" returns a column per panel (your synthetic cohort, your Prolific cohort, your email list).
- Filter to one — "show me just the synthetic responses" or "exclude synthetic from this view."
- Validate fidelity — run the same survey against a small paid cohort and a same-size synthetic cohort. Closed-ended distributions typically land within a few points; open-ends read differently in voice but cluster into the same themes.
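If you export responses and want to reproduce an "NPS by panel" view yourself, a minimal sketch looks like the following. It uses the standard NPS definition (percentage of scores 9-10 minus percentage of scores 0-6). The row layout, the `panel` and `score` keys, and the sample rows are hypothetical. Only the source=synthetic tag comes from this page.

```python
from collections import defaultdict


def nps(scores):
    """Net Promoter Score: % promoters (9-10) minus % detractors (0-6)."""
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return 100.0 * (promoters - detractors) / len(scores)


def nps_by_panel(rows, include_synthetic=True):
    """Group scores by panel, optionally dropping rows tagged as synthetic."""
    grouped = defaultdict(list)
    for row in rows:
        if not include_synthetic and row["source"] == "synthetic":
            continue
        grouped[row["panel"]].append(row["score"])
    return {panel: nps(scores) for panel, scores in grouped.items()}


# Illustrative rows only; not real study data.
rows = [
    {"panel": "synthetic-ev", "source": "synthetic", "score": 9},
    {"panel": "synthetic-ev", "source": "synthetic", "score": 6},
    {"panel": "synthetic-ev", "source": "synthetic", "score": 10},
    {"panel": "synthetic-ev", "source": "synthetic", "score": 8},
    {"panel": "prolific", "source": "prolific", "score": 10},
    {"panel": "prolific", "source": "prolific", "score": 3},
]

print(nps_by_panel(rows))
print(nps_by_panel(rows, include_synthetic=False))
```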
The calibration loop
We recommend this the first time a team uses synthetic for a real study:
- Run a small paid cohort (30-50 respondents) alongside a same-size synthetic cohort.
- Compare where the distributions converge and where they diverge.
- Decide whether the divergence matters for your decision.
This builds team-level confidence in when synthetic signal is trustworthy for your domain.
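One simple way to compare the two cohorts on a closed-ended question is to compute each answer option's share in both and look at the gap in percentage points. The sketch below does that; the `share_gaps` helper, the sample answers, and the 5-point threshold are all illustrative assumptions, and the threshold that matters depends on your decision.

```python
from collections import Counter


def shares(answers):
    """Percentage of responses choosing each option."""
    counts = Counter(answers)
    total = len(answers)
    return {option: 100.0 * c / total for option, c in counts.items()}


def share_gaps(synthetic, real):
    """Synthetic share minus real share, in percentage points, per option."""
    s, r = shares(synthetic), shares(real)
    options = sorted(set(s) | set(r))
    return {o: s.get(o, 0.0) - r.get(o, 0.0) for o in options}


# Illustrative answers only; not real study data.
synthetic_answers = ["A"] * 5 + ["B"] * 3 + ["C"] * 2
real_answers = ["A"] * 4 + ["B"] * 4 + ["C"] * 2

gaps = share_gaps(synthetic_answers, real_answers)
THRESHOLD_POINTS = 5.0  # hypothetical cutoff; pick one that fits your decision
flagged = [o for o, g in gaps.items() if abs(g) > THRESHOLD_POINTS]
print(gaps, flagged)
```

With cohorts of 30-50 respondents, a single respondent moves a share by 2-3 points, so small gaps are often noise and not a real divergence.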
When to use synthetic
Synthetic is the right tool for:
- Exploratory research where directional signal is enough.
- Validating hypotheses before committing budget to real respondents.
- Concept tests and A/B copy comparisons.
- Pressure-testing a questionnaire's logic, wording, and routing.
- Longitudinal use of the same persona cohort across multiple studies.
When not to use synthetic
Synthetic is not the right tool for:
- Regulated audiences (clinical research, financial-services compliance) where respondent provenance is required.
- Very narrow niches where the underlying model has limited signal — verify with a small real-cohort calibration first.
- Tracker studies requiring real-respondent statistical consistency over time.
- Headline numbers published to an external audience without context.
The platform always tags synthetic responses. You decide where the line is for your study.
Limits
- Cohort size — panels of 20-50 personas are typical for fast iteration. Larger cohorts are supported and take proportionally longer.
- Survey length — every persona answers the whole survey in one pass. Very long surveys (50+ questions with many open-ends) take longer to field.
- Screen-outs reduce surviving count — synthetic generates one response per persona. If a screener filters out half, half of your panel remains in analysis. Plan panel size with expected qualification rate in mind.
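The screen-out arithmetic can be sketched as follows: divide the number of surviving responses you want by the expected qualification rate and round up. The `panel_size_needed` helper is hypothetical, and the qualification rate is your own estimate.

```python
import math


def panel_size_needed(target_completes, qualification_rate):
    """Personas to generate so that roughly target_completes pass the screener."""
    if not 0 < qualification_rate <= 1:
        raise ValueError("qualification_rate must be in (0, 1]")
    return math.ceil(target_completes / qualification_rate)


# If a screener is expected to pass half of the panel, getting 30 surviving
# responses requires a panel of 60 personas.
print(panel_size_needed(30, 0.5))
```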