How AI Reads Open-Ended Survey Responses — and Why It Changes Research


April 28, 2026

Kinga Barczewska-Pflanz, Head of Tech & Data

When a respondent types “it’s fine I guess” or “bought it because it was on sale, not sure why” into a survey — what do you do with that? For years the answer was straightforward: a researcher reads it, categorizes it, and repeats that a thousand times. Tedious, time-consuming, and prone to inconsistencies between coders.

At In-Pulse, we built two proprietary AI systems that change this process from the ground up. Not to eliminate the researcher — but to give them time for what they do best: interpretation, insight, and recommendations.

Below I describe how it works, what we’ve achieved, and why a hybrid model — human and AI together — delivers better results than either alone.

The Starting Point: A Problem the Industry Underestimates

Start with a number that should stop anyone who works with survey data: according to the Insights Association Data Quality Insights Report (March 2025), as many as 69% of respondents in online surveys may not be real humans. Bots, click farms, automatically generated responses: this isn’t a marginal issue. It’s a structural challenge to data quality.

On top of that, there’s the phenomenon of low-effort responses: short, contentless, random or purely formal answers that technically pass survey validation but carry no analytical value. Detecting and filtering them traditionally fell entirely on the researcher — after the fact, manually, project by project.

At In-Pulse, this was our problem too. Before deploying AI, coding open-ended responses in a straightforward research project took the team up to two full days. As the volume of research grew, that became a bottleneck for the entire process.


Two Tools, One Goal: Higher Quality Insights

We built two complementary systems:

PulseCheck — response quality assessment. The system analyzes each answer for its substantive value: does it address the question, does it contain specifics, does it show signs of being automatically generated? It detects patterns characteristic of professional respondents and flags responses that need a researcher’s attention.

OpenPulse — semantic classification of open-ended responses. The system doesn’t search for keywords — it understands the meaning of an utterance in the context of the full response and the question it was answering. The result is multi-level classification: thematic (what is the respondent talking about), emotional (with what tone), and intentional (what attitude or intent underlies it). Importantly, categories are not pre-defined. The model proposes a coding structure based on the actual data — and evolves across successive research waves.
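
One way to picture the output of such a classification is as a record with three levels, from which the category structure is derived rather than fixed in advance. The sketch below is illustrative only: the `CodedResponse` type, the `codebook` helper, and the example labels are assumptions, not the actual OpenPulse schema.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class CodedResponse:
    """Hypothetical record for one classified open-ended response."""
    text: str
    theme: str   # thematic level: what the respondent is talking about
    tone: str    # emotional level: with what tone
    intent: str  # intentional level: what attitude or intent underlies it


def codebook(coded: list[CodedResponse]) -> dict[str, Counter]:
    """Build the category structure from the labels present in the data."""
    return {
        "theme": Counter(r.theme for r in coded),
        "tone": Counter(r.tone for r in coded),
        "intent": Counter(r.intent for r in coded),
    }


# Labels below are invented for illustration.
coded = [
    CodedResponse("it's fine I guess", "overall impression", "lukewarm", "indifference"),
    CodedResponse("bought it because it was on sale, not sure why",
                  "price promotion", "neutral", "impulse purchase"),
]

for level, counts in codebook(coded).items():
    print(level, dict(counts))
```

Because the codebook is computed from the coded responses, a new theme that appears in a later wave shows up in the structure without any change to the code.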

Both systems operate on a human-in-the-loop model: AI performs the initial classification, the researcher validates a sample, and the system learns from that validation. The more projects it processes, the more precise it becomes — without rebuilding the tool.

75% — reduction in open-ended response coding time while maintaining quality approved by researchers

90% — agreement with expert assessment achieved by AI classification on the validation sample

5h — maximum time from survey close to first results (previously: up to 2 days for straightforward projects)

99% — effectiveness of automatic PII anonymization in open-ended responses


What the System Actually Does — and What It Doesn’t

Before anyone imagines a black box that “just works” — it’s worth being precise about where the innovation lies and where the limits are.

Traditional text analysis relies on keywords and rules: if the word “expensive” appears, the response goes into the “price” category. Fast, but blind to context. “Not expensive for the quality” and “way too expensive” are completely different insights — a dictionary-based approach won’t distinguish them.
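
The limitation is easy to reproduce. The following is a minimal sketch of a dictionary-based coder; the `KEYWORD_CODES` mapping and the `keyword_code` function are hypothetical and exist only to show the failure mode.

```python
# Hypothetical keyword dictionary, for illustration only.
KEYWORD_CODES = {
    "expensive": "price",
    "cheap": "price",
    "taste": "flavor",
}


def keyword_code(response: str) -> set[str]:
    """Return every category whose keyword appears as a word in the response."""
    words = response.lower().replace(",", " ").replace(".", " ").split()
    return {code for keyword, code in KEYWORD_CODES.items() if keyword in words}


positive = keyword_code("Not expensive for the quality")
negative = keyword_code("Way too expensive")

# Both responses receive the same code, so the opposite evaluations are lost.
print(positive, negative, positive == negative)
```

Both calls return the single category "price", which is exactly the blind spot described above: the rule sees the word, not the judgement attached to it.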

OpenPulse works at the semantic level — it understands the meaning of a sentence in the context of the full utterance and the question being answered. The result is multi-level classification that captures not just topic, but tone and intent.

PulseCheck goes further and evaluates not just what was said, but whether it’s worth analyzing at all. The system cross-checks open-ended responses against closed responses in the same survey — and detects contradictions that may signal inattentive completion or automatically generated answers.
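
The cross-check idea can be sketched as follows, under strong simplifying assumptions: a single closed satisfaction question on a 1 to 5 scale, and a sentiment label for the open-ended answer that some upstream classifier has already produced. The `SurveyAnswer` type, its field names, and the thresholds are hypothetical and do not describe PulseCheck's actual rules.

```python
from dataclasses import dataclass


@dataclass
class SurveyAnswer:
    respondent_id: str
    satisfaction: int    # closed question, assumed scale: 1 (very bad) to 5 (very good)
    open_sentiment: str  # assumed upstream label: "positive", "neutral" or "negative"


def is_contradictory(answer: SurveyAnswer) -> bool:
    """Flag answers whose closed rating and open-ended tone point in opposite directions."""
    if answer.satisfaction >= 4 and answer.open_sentiment == "negative":
        return True
    if answer.satisfaction <= 2 and answer.open_sentiment == "positive":
        return True
    return False


answers = [
    SurveyAnswer("r1", 5, "positive"),
    SurveyAnswer("r2", 5, "negative"),
    SurveyAnswer("r3", 1, "positive"),
    SurveyAnswer("r4", 3, "negative"),
]

# Flagged answers go to a researcher for review; they are not dropped automatically.
flagged = [a.respondent_id for a in answers if is_contradictory(a)]
print(flagged)
```

A mid-scale rating such as 3 is deliberately left unflagged here, since a neutral score is compatible with either tone.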

What the system doesn’t do: it doesn’t replace the researcher in interpreting results. Classification is not insight. AI organizes the material — the researcher draws conclusions from it, understands the client’s context and formulates recommendations. This is an intentional division of roles, not a limitation of the technology.

Personal Data and GDPR: A Risk That’s Easy to Miss

One of the underappreciated risks in automated analysis of open-ended responses involves personal data. Respondents — often unintentionally — type their name, address, phone number, or sometimes health or financial data into free-text fields.

Before any response reaches the AI system, it passes through a process of automatic detection and masking of PII (Personally Identifiable Information). We measure the effectiveness of this mechanism rigorously: the target is a minimum 99% correct masking rate on test samples, with zero critical incidents.
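
To make the masking step and the masking-rate metric concrete, here is a minimal sketch that uses two regular expressions. The patterns, the placeholder tokens, and the `masking_rate` helper are assumptions for illustration; a production pipeline would also need to cover names, addresses, and health or financial details, which simple patterns like these do not catch.

```python
import re

# Deliberately simple, illustrative patterns; not a complete PII detector.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE = re.compile(r"\+?\d[\d\s-]{7,}\d")


def mask_pii(text: str) -> str:
    """Replace e-mail addresses and phone numbers with placeholder tokens."""
    text = EMAIL.sub("[EMAIL]", text)  # e-mails first, so digits inside them are not treated as phones
    return PHONE.sub("[PHONE]", text)


def masking_rate(samples: list[tuple[str, list[str]]]) -> float:
    """Share of labeled PII strings that no longer appear after masking."""
    total = masked = 0
    for text, pii_items in samples:
        cleaned = mask_pii(text)
        for item in pii_items:
            total += 1
            masked += item not in cleaned
    return masked / total if total else 1.0


# Invented test samples: (response text, PII strings a human labeled in it).
test_samples = [
    ("Call me at +48 600 123 456 about the refund", ["+48 600 123 456"]),
    ("Write to jan.kowalski@example.com please", ["jan.kowalski@example.com"]),
    ("My name is Jan Kowalski", ["Jan Kowalski"]),  # names are not covered by these patterns
]

print(mask_pii(test_samples[0][0]))
print(f"masking rate: {masking_rate(test_samples):.0%}")
```

The third sample is left unmasked on purpose: measuring against human-labeled test data is what exposes such gaps, which is why the rate is tracked as a metric rather than assumed.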

This is not a technical footnote to the project — it’s a prerequisite for GDPR compliance and the trust of clients who entrust us with their respondents’ data.


What This Means for In-Pulse’s Clients

Quality of the analytical tool directly determines the quality of the insights the client receives. Specifically:

Faster access to results. Compressing the time from survey close to first findings from days to hours means business decisions can be made on fresh data — not data from a week ago.

Higher quality source material. Automatic identification of low-effort and inconsistent responses means analysis is based on what respondents actually said — not on data noise.

Deeper insights from open-ended questions. Multi-level semantic classification extracts more from open-ended responses than keyword counting. You can see not just what respondents say, but how and why.

Comparability across studies. A consistent category taxonomy — applied automatically across all projects — makes it possible to track shifts in opinions and attitudes over time, across segments, and across survey waves.

Why the Hybrid Model Outperforms Full Automation

When designing the system, we faced a dilemma that applies to any AI deployment in expert processes: how much autonomy do you give the algorithm?

Full automation would be faster and cheaper to operate. But it would also mean giving up control over meaning and substance — and that is the core of research work. A “technically correct” code is not the same as a “substantively accurate” code.

The human-in-the-loop model we implemented rests on a simple principle: AI executes, humans validate and teach. The researcher doesn’t review every response — they review a sample, assess classification quality, and correct where AI errs. Those corrections feed back into the model as training data.
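
The validation loop can be sketched in a few lines. Everything named here is hypothetical: the `draw_validation_sample` and `review` functions, the 40% sample share, and the example codes. How corrections are actually fed back into the model is not shown.

```python
import random


def draw_validation_sample(response_ids: list[str], share: float, seed: int = 0) -> list[str]:
    """Pick a reproducible random subset of responses for the researcher to review."""
    size = max(1, round(len(response_ids) * share))
    return random.Random(seed).sample(response_ids, size)


def review(ai_codes: dict[str, str], expert_codes: dict[str, str]) -> tuple[float, dict[str, str]]:
    """Return the agreement rate and the corrections where the expert disagreed with the AI."""
    corrections = {
        rid: expert for rid, expert in expert_codes.items() if ai_codes[rid] != expert
    }
    agreement = 1 - len(corrections) / len(expert_codes)
    return agreement, corrections


# Invented AI output for five responses.
ai_codes = {"r1": "price", "r2": "price", "r3": "taste", "r4": "availability", "r5": "price"}

# Hypothetical researcher judgements; in practice these are entered during review.
researcher_view = {"r1": "price", "r2": "value for money", "r3": "taste",
                   "r4": "availability", "r5": "price"}

sample = draw_validation_sample(list(ai_codes), share=0.4)
expert_codes = {rid: researcher_view[rid] for rid in sample}
agreement, corrections = review(ai_codes, expert_codes)

# The corrections dictionary is what would be passed on as new training examples.
print(f"reviewed {len(sample)} responses, agreement {agreement:.0%}, corrections: {corrections}")
```

The agreement rate computed on the reviewed sample is the same kind of figure as the agreement with expert assessment reported earlier, and the corrections are the raw material for the next round of learning.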

Questions & Answers

What is the difference between PulseCheck and OpenPulse?

PulseCheck assesses the quality of collected responses — it detects low-effort answers, bots, and internal contradictions within a survey. OpenPulse semantically classifies the content of open-ended responses, assigning them to thematic, emotional, and intentional categories. Both systems work together and complement each other in the qualitative analysis process.

What are low-effort responses and why are they a problem?


Low-effort responses are answers that technically meet survey requirements (e.g. minimum character count) but carry no substantive value — random strings of characters, repetitions, responses completely unrelated to the question. Their presence in the data degrades insight quality and can lead to incorrect conclusions. According to the Insights Association Data Quality Insights Report (March 2025), as many as 69% of online survey respondents may not be real humans.
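
As a rough illustration of how the simplest of these patterns can be caught, here is a sketch with three heuristics. The function name, the thresholds, and the vowel check (which assumes English-like text) are assumptions; detecting answers that are unrelated to the question requires semantic comparison and is not covered.

```python
import re


def low_effort_reasons(response: str, min_chars: int = 10) -> list[str]:
    """Return the heuristic flags a response triggers (an empty list means no flag)."""
    text = response.strip().lower()
    words = re.findall(r"[^\W\d_]+", text)  # runs of letters only
    reasons = []
    if len(text) < min_chars:
        reasons.append("too_short")
    # Fewer than half of the words are distinct: the answer mostly repeats itself.
    if words and len(set(words)) / len(words) < 0.5:
        reasons.append("repetitive")
    # Long letter runs without a vowel suggest keyboard mashing.
    if any(len(w) >= 6 and not re.search(r"[aeiouy]", w) for w in words):
        reasons.append("random_characters")
    return reasons


for example in ["ok", "good good good good good", "sdfghjkl qwrtpsdf",
                "Bought it because it was on sale, not sure why"]:
    print(repr(example), low_effort_reasons(example))
```

Heuristics like these only pre-sort the material. A short answer can still be meaningful, so a flag is a reason for a closer look, not an automatic exclusion.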

Discover more about In-Pulse


Frequently Asked Questions

In-Pulse is a research and analytics system that enables marketers to fully understand shopper behaviors and preferences, target with precision, and evaluate the effectiveness of their marketing activities.

This has become possible through a partnership between Żabka Polska and Stagwell Inc., providing access to insights based on transactional data from the Żappka mobile app.

This knowledge makes it possible to understand shopper behavior derived from real transactions — not just declarations.

As a result, In-Pulse measures the “pulse” of Polish consumers in real time, empowering brands to enhance the effectiveness of their marketing efforts.

In-Pulse brings together, in one place, a full range of services that enable marketers to deeply understand who the shoppers within the Żabka network are, how their purchase behaviors are shaped, what influences their decisions, which products attract them most, and which types of marketing communication truly resonate.

Marketers can use this knowledge to reach shoppers at the most relevant time and place — with the confidence that they’re targeting people who actually buy, not just those who claim certain behaviors.

In-Pulse is built on a single-source data architecture, ensuring consistency and comparability of insights. It also enables marketers to measure the effects of their activities in relation to changes in brand perception and sales performance.

All of this allows Polish marketers to quickly access high-quality insights that help them:

  • manage marketing and media budgets more effectively
  • increase return on investment
  • drive sales growth

In-Pulse is an innovative solution that enables an unprecedented level of insight into Polish consumers — their motivations, preferences, and purchase decisions — based on transactional data, not just declarations.

It allows you to build on a solid, reliable foundation: data flowing in from millions of transactions captured daily at thousands of Żabka stores.

With In-Pulse, you can:

  • see how consumers respond to a new product, service, or product variant
  • measure the effectiveness of advertising campaigns by tracking changes in brand perception and product sales
  • understand the motivations and shopping journeys of specific shopper segments
  • identify what drives category volume — shopper segment, purchase mission, or consumption occasion

The In-Pulse solution is designed for a wide range of companies — both within and beyond the FMCG sector.

By combining insights derived from transactional data analysis with the ability to deepen shopper understanding through in-app surveys, In-Pulse enables both endemic and non-endemic brands — across industries such as beauty, pharmaceuticals, finance, insurance, automotive, and telecommunications — to gain fast and precise insights into consumer behaviors, preferences, and motivations.

All it takes is a quick call or a message through the contact form below — and we’ll guide you through the entire process:

  • We’ll start with a short intro meeting, where we’ll walk you through how the In-Pulse system works and its capabilities — from understanding your audience and exploring their preferences and shopping habits, to testing product and creative concepts, reaching the right consumers with the right message, and tracking campaign effectiveness.
  • Next, we’ll ask you to share the challenge or hypothesis you’d like to test. Based on that, our team of experts will prepare a tailored research proposal made up of products that will help you find the answers you’re looking for.
  • Don’t worry — you don’t need to have everything ready from day one. In-Pulse is a single-source system, which means that once your consumer segments are built, you can reuse them in future studies and analyses.
  • As our collaboration develops, you’ll be able to monitor how your target group’s behaviors, needs, and preferences evolve — as well as how they perceive your brand, service, or product.
  • This will allow you to respond in real time, optimize your strategies, and deepen your understanding of the areas that matter most to your business at any given moment.