How AI Reads Open-Ended Survey Responses — and Why It Changes Research
When a respondent types “it’s fine I guess” or “bought it because it was on sale, not sure why” into a survey — what do you do with that? For years the answer was straightforward: a researcher reads it, categorizes it, and repeats that a thousand times. Tedious, time-consuming, and prone to inconsistencies between coders.
At In-Pulse, we built two proprietary AI systems that change this process from the ground up. Not to eliminate the researcher — but to give them time for what they do best: interpretation, insight, and recommendations.
Below I describe how it works, what we’ve achieved, and why a hybrid model — human and AI together — delivers better results than either alone.
The Starting Point: A Problem the Industry Underestimates
Start with a number that should stop anyone who works with survey data: according to the Insights Association Data Quality Insights Report (March 2025), as many as 69% of respondents in online surveys are not real humans. Bots, click farms, automatically generated responses — this isn’t a marginal issue. It’s a structural challenge to data quality.
On top of that, there’s the phenomenon of low-effort responses: short, contentless, random or purely formal answers that technically pass survey validation but carry no analytical value. Detecting and filtering them traditionally fell entirely on the researcher — after the fact, manually, project by project.
At In-Pulse, this was our problem too. Before deploying AI, coding open-ended responses in a straightforward research project took the team up to two full days. As the volume of research grew, that became a bottleneck for the entire process.

Two Tools, One Goal: Higher Quality Insights
We built two complementary systems:
PulseCheck — response quality assessment. The system analyzes each answer for its substantive value: does it address the question, does it contain specifics, does it show signs of being automatically generated? It detects patterns characteristic of professional respondents and flags responses that need a researcher’s attention.
OpenPulse — semantic classification of open-ended responses. The system doesn’t search for keywords — it understands the meaning of an utterance in the context of the full response and the question it was answering. The result is multi-level classification: thematic (what is the respondent talking about), emotional (with what tone), and intentional (what attitude or intent underlies it). Importantly, categories are not pre-defined. The model proposes a coding structure based on the actual data — and evolves across successive research waves.
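To make "multi-level classification" concrete, here is a minimal sketch of what a single coded record could look like. The field names and category values are illustrative assumptions for this article, not OpenPulse's actual schema.

```python
# Illustrative sketch only: one possible shape for a coded open-ended response.
# Field names and category values are hypothetical, not OpenPulse's schema.
from dataclasses import dataclass, field

@dataclass
class CodedResponse:
    respondent_id: str
    question_id: str
    text: str
    themes: list[str] = field(default_factory=list)          # what the respondent talks about
    tone: str = "neutral"                                     # emotional layer
    intent: str = "unspecified"                               # attitude or intent behind it
    quality_flags: list[str] = field(default_factory=list)   # filled by the quality check

example = CodedResponse(
    respondent_id="r_0142",
    question_id="q_open_purchase_reason",
    text="Bought it because it was on sale, not sure why otherwise",
    themes=["price", "promotion"],
    tone="ambivalent",
    intent="impulse purchase",
)
```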
Both systems operate on a human-in-the-loop model: AI performs the initial classification, the researcher validates a sample, and the system learns from that validation. The more projects it processes, the more precise it becomes — without rebuilding the tool.
75% — reduction in open-ended response coding time while maintaining quality approved by researchers
90% — agreement with expert assessment achieved by AI classification on the validation sample
5h — maximum time from survey close to first results (previously: up to 2 days for straightforward projects)
99% — effectiveness of automatic PII anonymization in open-ended responses

What the System Actually Does — and What It Doesn’t
Before anyone imagines a black box that “just works” — it’s worth being precise about where the innovation lies and where the limits are.
Traditional text analysis relies on keywords and rules: if the word “expensive” appears, the response goes into the “price” category. Fast, but blind to context. “Not expensive for the quality” and “way too expensive” are completely different insights — a dictionary-based approach won’t distinguish them.
OpenPulse works at the semantic level — it understands the meaning of a sentence in the context of the full utterance and the question being answered. The result is multi-level classification that captures not just topic, but tone and intent.
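The difference is easy to show in code. The sketch below is purely illustrative: the keyword rule stands in for the traditional approach described above, while the embedding model and hand-written category descriptions are one possible way to score meaning, not a description of OpenPulse's actual stack.

```python
# Why keyword rules conflate opposite opinions that mention the same word.
# Illustrative only; the embedding model is one possible realization.
from sentence_transformers import SentenceTransformer, util

RESPONSES = [
    "Not expensive for the quality",
    "Way too expensive",
]

def keyword_code(text: str) -> str:
    """Dictionary-style coding: any mention of 'expensive' lands in 'price'."""
    return "price" if "expensive" in text.lower() else "other"

print([keyword_code(r) for r in RESPONSES])  # ['price', 'price'] -- the tone is lost

# A semantic approach scores each response against category *descriptions*
# (a hypothetical category set) instead of matching surface words.
model = SentenceTransformer("all-MiniLM-L6-v2")
categories = {
    "price_positive": "the respondent considers the price fair or good value for money",
    "price_negative": "the respondent complains that the product costs too much",
}
names = list(categories)
category_embeddings = model.encode(list(categories.values()), convert_to_tensor=True)

for response in RESPONSES:
    similarities = util.cos_sim(
        model.encode(response, convert_to_tensor=True), category_embeddings
    )[0]
    print(response, "->", names[int(similarities.argmax())])
```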
PulseCheck goes further and evaluates not just what was said, but whether it’s worth analyzing at all. The system cross-checks open-ended responses against closed responses in the same survey — and detects contradictions that may signal inattentive completion or automatically generated answers.
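As a rough illustration of that cross-check, the following sketch flags a respondent whose free-text tone contradicts a closed-ended satisfaction rating. The thresholds and the stub sentiment scorer are assumptions made for the example, not PulseCheck's actual rules.

```python
# Sketch of the cross-check idea, not PulseCheck's actual logic: compare the
# tone of an open-ended answer with the closed-ended rating it accompanies.

def tone_score(text: str) -> int:
    """Stub sentiment scorer (illustrative); a real system would use a proper model."""
    negative = ("disappointed", "broken", "never again", "waste")
    positive = ("great", "love", "recommend", "excellent")
    t = text.lower()
    return sum(w in t for w in positive) - sum(w in t for w in negative)

def contradiction_flag(open_text: str, satisfaction_1_to_10: int) -> bool:
    """Flag when a very high rating comes with clearly negative free text, or vice versa."""
    tone = tone_score(open_text)
    return (satisfaction_1_to_10 >= 9 and tone < 0) or (satisfaction_1_to_10 <= 3 and tone > 0)

print(contradiction_flag("Arrived broken, never again.", 10))   # True: needs a researcher's look
print(contradiction_flag("Great value, would recommend.", 9))   # False
```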
What the system doesn’t do: it doesn’t replace the researcher in interpreting results. Classification is not insight. AI organizes the material — the researcher draws conclusions from it, understands the client’s context and formulates recommendations. This is an intentional division of roles, not a limitation of the technology.
Personal Data and GDPR: A Risk That’s Easy to Miss
One of the underappreciated risks in automated analysis of open-ended responses involves personal data. Respondents — often unintentionally — type their name, address, phone number, or sometimes health or financial data into free-text fields.
Before any response reaches the AI system, it passes through a process of automatic detection and masking of PII (Personally Identifiable Information). We measure the effectiveness of this mechanism rigorously: the target is a minimum 99% correct masking rate on test samples, with zero critical incidents.
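For illustration, a masking step can be as simple as substituting typed placeholders for detected spans. The regexes below are deliberately basic examples, not our detection rules; a production pipeline would normally combine such patterns with NER-based PII detection and human spot checks.

```python
# Illustrative regex-based masking only; real pipelines add NER-based detection.
import re

PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\+?\d[\d\s-]{7,}\d"),
}

def mask_pii(text: str) -> str:
    """Replace detected PII spans with typed placeholders before analysis."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

print(mask_pii("Call me at +48 601 234 567 or write to jan.kowalski@example.com"))
# -> "Call me at [PHONE] or write to [EMAIL]"
```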
This is not a technical footnote to the project — it’s a prerequisite for GDPR compliance and the trust of clients who entrust us with their respondents’ data.

What This Means for In-Pulse’s Clients
The quality of the analytical tool directly determines the quality of the insights the client receives. Specifically:
Faster access to results. Compressing the time from survey close to first findings from days to hours means business decisions can be made on fresh data — not data from a week ago.
Higher quality source material. Automatic identification of low-effort and inconsistent responses means analysis is based on what respondents actually said — not on data noise.
Deeper insights from open-ended questions. Multi-level semantic classification extracts more from open-ended responses than keyword counting. You can see not just what respondents say, but how and why.
Comparability across studies. A consistent category taxonomy — applied automatically across all projects — makes it possible to track shifts in opinions and attitudes over time, across segments, and across survey waves.
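As a minimal illustration of what that comparability enables: once every wave is coded against the same taxonomy, tracking shifts reduces to a simple aggregation. The column names and data below are invented for the example.

```python
# Minimal sketch with illustrative data: consistent coding makes wave-over-wave
# comparison a one-line aggregation.
import pandas as pd

coded = pd.DataFrame({
    "wave":     ["2024-Q4"] * 3 + ["2025-Q1"] * 3,
    "category": ["price", "price", "delivery", "price", "delivery", "delivery"],
})

# Share of each theme within every wave; differences between rows are the trend.
shares = pd.crosstab(coded["wave"], coded["category"], normalize="index")
print(shares.round(2))
```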
Why the Hybrid Model Outperforms Full Automation
When designing the system, we faced a dilemma that applies to any AI deployment in expert processes: how much autonomy do you give the algorithm?
Full automation would be faster and cheaper to operate. But it would also mean giving up control over meaning and substance — and that is the core of research work. A “technically correct” code is not the same as a “substantively accurate” code.
The human-in-the-loop model we implemented rests on a simple principle: AI executes, humans validate and teach. The researcher doesn’t review every response — they review a sample, assess classification quality, and correct where AI errs. Those corrections feed back into the model as training data.
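A rough sketch of that loop, with illustrative parameters rather than the exact mechanism we use: sample the AI's output, record the researcher's corrections, measure agreement, and keep the disagreements as training examples.

```python
# Sketch of the validation loop described above (illustrative parameters).
import random

def draw_validation_sample(coded_responses: list[dict], rate: float = 0.1) -> list[dict]:
    """Random sample of AI-coded responses for researcher review."""
    k = max(1, int(len(coded_responses) * rate))
    return random.sample(coded_responses, k)

def agreement_rate(sample: list[dict], researcher_codes: dict[str, str]) -> float:
    """Share of sampled responses where the researcher kept the AI's category."""
    agreed = sum(1 for r in sample if researcher_codes[r["id"]] == r["ai_category"])
    return agreed / len(sample)

def corrections_as_training_data(sample: list[dict], researcher_codes: dict[str, str]) -> list[dict]:
    """Disagreements become labeled examples for the next model update."""
    return [
        {"text": r["text"], "label": researcher_codes[r["id"]]}
        for r in sample
        if researcher_codes[r["id"]] != r["ai_category"]
    ]
```

An agreement rate computed this way is the kind of figure behind the 90% validation metric quoted above.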
Questions & Answers
What is the difference between PulseCheck and OpenPulse?
PulseCheck assesses the quality of collected responses — it detects low-effort answers, bots, and internal contradictions within a survey. OpenPulse semantically classifies the content of open-ended responses, assigning them to thematic, emotional, and intentional categories. Both systems work together and complement each other in the qualitative analysis process.
What are low-effort responses and why are they a problem?
Low-effort responses are answers that technically meet survey requirements (e.g. minimum character count) but carry no substantive value — random strings of characters, repetitions, responses completely unrelated to the question. Their presence in the data degrades insight quality and can lead to incorrect conclusions. According to the Insights Association Data Quality Insights Report (March 2025), as many as 69% of online survey respondents may not be real humans.