
A host of unique challenges drives a highly specialized AI study to augment Voice-of-Customer capabilities.

Supercharging a VoC Program with AI helps it handle higher volumes and perform quicker analysis, without increasing team size.

Challenge: Understand how to harness AI to improve a highly specialized Voice-of-Customer Program

Our client, a Quality Assurance group’s VoC team, had investigated the “state of the possible” with Artificial Intelligence before the pandemic. Since then, reports and success stories about machine learning (ML) and AI capabilities built on newer technologies and algorithms had proliferated. Even within the client’s organization, the use of AI, and acceptance of its benefits, was becoming more widespread.

The QA team approached McorpCX with a host of unique challenges:

  • The VoC data was gathered through a series of interviews with engineers, conducted by engineers. In other words, the language was highly technical and extremely domain-specific.
  • The interviews were conducted in local languages and then summarized in English by the interviewer, so the quality of the English varied by locale and was not uniform.
  • The number of interviews conducted was fairly small, limiting the corpus of data. Traditionally, using AI to derive sentiment, classify concepts, or discover similarities in a small corpus of highly domain-specific content has yielded poor results.
  • The enterprise taxonomy was expected to have limited applicability, as it was not focused on the terminology used by the QA team.
  • And finally, the client wanted a solution that could operate largely independently of the organization’s AI team, which was already committed to multiple “extremely critical for business” projects and under tremendous capacity pressure.
Approach: Assess the latest advances in Machine Learning-enabled Natural Language Processing (ML NLP) in an intensive, systematic project

Since the purpose of this engagement was to assess the art of the possible and provide insights into the feasibility of different solutions, we designed a systematic study of platform capability with respect to NLP that included: sentiment analysis, categorization into topics, and discovering insights (clustering).

The study leveraged large language models such as BERT (Google) and GPT-3 (OpenAI). We also selected some newer low-code AI platforms being brought to market by a host of startups. And because the client was already using Qualtrics, we evaluated Qualtrics Discover (previously Clarabridge, one of the NLP engines most highly rated by Gartner).

We computed scores at three points in the process: a “Raw Score” when the training data was first processed by the platform; a second score once we had tuned the system; and a third score once test data was processed. We then introduced a fourth “Final Score,” computed after another time-boxed round of tuning, to understand how tunable each platform was in light of the “data drift” inherent in operational circumstances.

Our three areas of analysis included:

  1. Sentiment Analysis: We classified content as positive, negative, or neutral. We computed the precision, recall, and F1 (the harmonic mean of precision and recall) values for each of the sentiments, as well as the overall accuracy scores.
  2. Competitive Intelligence: We identified fragments of content that contained competitor brand, product or service information, and we then computed those precision, recall, F1 and accuracy values.
  3. Improvement Required: This was a two-stage analysis. In stage one we determined topics of interest (clustering). We then added a few of these topics to the taxonomy (category) tree, categorized the content, and computed precision, recall and F1 for each of the categories.
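The evaluation metrics named above can be computed directly. Below is a minimal Python sketch showing how per-class precision, recall, F1, and overall accuracy are derived for a three-class sentiment task; the labels and data are illustrative, not the client’s.

```python
# Illustrative computation of per-class precision, recall, F1, and accuracy
# for a three-class sentiment task. Example data is invented.

def per_class_metrics(y_true, y_pred, label):
    """Precision, recall, and F1 for one class, one-vs-rest."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

def accuracy(y_true, y_pred):
    """Fraction of items classified correctly across all classes."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical gold labels and platform predictions
y_true = ["positive", "negative", "neutral", "negative", "positive"]
y_pred = ["positive", "neutral", "neutral", "negative", "positive"]

for label in ("positive", "negative", "neutral"):
    p, r, f = per_class_metrics(y_true, y_pred, label)
    print(f"{label}: precision={p:.2f} recall={r:.2f} F1={f:.2f}")
print(f"accuracy={accuracy(y_true, y_pred):.2f}")
```

In practice a library such as scikit-learn computes the same quantities, but the hand-rolled version makes clear exactly what each score measures per sentiment class.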
“We’ve seen how well AI can enhance VoC programs when used to augment human capabilities. Just don’t underestimate operational aspects; unlike previous technologies, AI requires SMEs to actively maintain, evaluate and tune the solution.”
– Chirag Gandhi, Chief Technology Officer, McorpCX
Findings: Some platforms are more suited to AI than others, but for most VoC programs the right circumstances exist to deliver benefits and justify the investment

After assessing the capabilities of the three classes of platform, we eliminated both the low-code platforms and the large language models. Their day-to-day operation and maintenance would require more capacity than the client could justify sustaining over time, so they failed the test of being sufficiently independent of the organization’s AI team.

Additionally, integrating these tools into the larger organizational ecosystem and sustaining them would require an ongoing coding capability best served by external skilled resources, which, the client had made clear at the outset, was not their desired outcome.

We also found, unsurprisingly, that the specialized nature of the client’s content and domain language made the ability to fine-tune the out-of-the-box models another important hurdle. By way of example: in everyday speech, “fault” is considered negative; in the quality assurance domain, however, “fault” is used to indicate a count, and therefore cannot be considered negative unless accompanied by a verb such as “increase.” Extremely user-friendly analytical and tuning capabilities therefore became an important marker for any tool.
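To make the “fault” example concrete, here is a hypothetical Python sketch of the kind of domain-specific override such tuning involves. The function name, term list, and logic are illustrative assumptions, not the client’s actual platform configuration.

```python
# Hypothetical domain override for QA language, where "fault" is a neutral
# count term unless paired with an escalating verb such as "increase".
# Terms and logic are illustrative assumptions only.
import re

ESCALATING_VERBS = {"increase", "increased", "rise", "rose", "grow", "grew"}

def adjust_sentiment(text, base_sentiment):
    """Override a generic model's sentiment when 'fault(s)' appears:
    neutral by default, negative only alongside an escalating verb."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    if "fault" in words or "faults" in words:
        if words & ESCALATING_VERBS:
            return "negative"
        return "neutral"
    return base_sentiment  # no domain term present; keep the model's call

# A generic model would likely flag the first sentence as negative;
# the domain rule corrects it to neutral.
print(adjust_sentiment("The fault count stayed flat this quarter.", "negative"))
print(adjust_sentiment("Faults increased after the firmware update.", "neutral"))
```

Commercial NLP platforms express this kind of rule through tuning interfaces rather than code, which is why user-friendly tuning became a selection criterion.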

One benefit of our analysis was that it identified and demonstrated inconsistencies in the client’s current methodology introduced by human classification and judgment. While the data was insufficient to determine whether the inconsistency was due to bias or noise, it was clear from our discussions that the team had not recognized these inconsistencies prior to this review.

Most importantly, our quantification of precision, recall, F1, and accuracy demonstrated that AI has come a long way: it is now possible to obtain results that justify the investment in AI for NLP-based analysis of verbatims in a VoC program.

Recommendations / Results: Ideal for speed, volume and consistency when used to augment human capabilities

NLP implementations are not like typical tool implementations, in which you provide the requirements and the technical team delivers a fully configured tool that needs only occasional break-fixes or enhancements. NLP implementations are highly dependent on the data they process, and that data tends to change over time as the usage of terms shifts or the topics being discussed change.

To evaluate how manageable this would be, a VoC program needs to consider both the work required to continuously tune the platform and handle exceptions, and the capability the program would need in order to perform those tasks. In other words: does the value provided by the automation adequately offset the additional operational burden? Where the answer was “yes,” we found NLP could be a great benefit.

For any VoC leader looking for an edge, AI is here and ready to augment human capabilities. Advances in NLP can augment VoC work along three primary dimensions:

  1. The speed with which analysis results can be delivered;
  2. The volume of content that can be analyzed;
  3. The consistency of the analysis.

While still premature as a fully “hands-off” approach, recent AI-driven NLP advances proved significant. This project effectively demonstrated the applicability of NLP to open-ended VoC questions, providing the operational benefits of improved speed and consistency that, in the right circumstances, can more than offset the personnel lift.


McorpCX provided the client with a VoC operations model, and expanded their view of how their VoC program might incorporate additional AI capabilities, such as transcription, translation and summarization.