Machine Learning Assisted Text Utterance Collection

How to improve the accuracy of your verbal interface or chatbot and give your users the answers they’re really looking for.

In our fast-paced society, Federal and corporate organizations are increasingly turning to verbal interfaces and chatbots to provide 24/7 on-demand services to a variety of users and customers. Verbal interfaces provide immediate response times and huge scalability, increasing the efficiency and cost-effectiveness of operations. But how do you ensure that your verbal interface is accurate and provides the right answers to your users?

If your verbal interface or chatbot isn’t providing consistently accurate responses, chances are it hasn’t been sufficiently trained. These interfaces are only as good as the training data they consume, but generating large amounts of high-quality training data, typically using varied text utterances, can be a difficult and time-consuming process. Until now.

Figure Eight Federal’s new Machine Learning Assisted Text Utterance Collection (MLATUC) capability optimizes utterance collection by using machine learning to filter out unusable utterances as soon as they’re collected. Three smart validators check each submission for duplicates, incoherence, and use of the wrong language.

With poor-quality utterances removed, your staff can focus on generating high-quality, usable verbal interface data that reflects the variety and nuances of how your users ask questions, so your interface is ready to provide the right answer every time.


Here’s how it works:

Every utterance submitted by your contributors is checked for quality by our platform’s three smart validators, which remove duplicates, incoherent statements, and submissions in a language other than the one expected. With these poor-quality utterances filtered out, your team can focus on collecting only high-quality, usable data at scale, so your verbal interface is fully trained to deliver a clear and informative user interaction every time.


Our smart validators use machine learning to improve the quality of collected data through:

  • Duplicate Detection: Ensures only unique utterances are collected
  • Coherence Detection: Ensures coherent input
  • Language Detection: Ensures contributors’ submissions use the specified target language
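
To make the three checks listed above concrete, here is a minimal Python sketch of how such a filter could be chained together. It is an illustration only, not Figure Eight Federal’s implementation: the real validators are described as machine learning based, while this sketch swaps in simple rule-based stand-ins. Every name in it (normalize, is_coherent, is_target_language, validate, ENGLISH_STOPWORDS) and both thresholds are assumptions made for the example.

```python
import re
import string

# Hypothetical stop-word list standing in for a real language-identification model.
ENGLISH_STOPWORDS = {
    "the", "a", "an", "is", "are", "to", "of", "in", "my", "i", "how",
    "what", "can", "do", "you", "for", "on", "it", "and", "where", "when",
}


def normalize(text):
    """Lowercase, drop ASCII punctuation, and collapse whitespace."""
    text = text.lower().translate(str.maketrans("", "", string.punctuation))
    return " ".join(text.split())


def is_coherent(text):
    """Crude stand-in for coherence detection (assumed heuristic):
    at least two words, and at least half of them contain a vowel."""
    words = normalize(text).split()
    if len(words) < 2:
        return False
    with_vowel = sum(1 for w in words if re.search(r"[aeiouy]", w))
    return with_vowel / len(words) >= 0.5


def is_target_language(text):
    """Crude stand-in for language detection (assumed heuristic):
    at least 20% of the words are common English stop words."""
    words = normalize(text).split()
    if not words:
        return False
    hits = sum(1 for w in words if w in ENGLISH_STOPWORDS)
    return hits / len(words) >= 0.2


def validate(utterance, seen):
    """Run the three checks in order and return the outcome as a label.
    `seen` holds the normalized form of every utterance accepted so far."""
    key = normalize(utterance)
    if key in seen:
        return "duplicate"
    if not is_coherent(utterance):
        return "incoherent"
    if not is_target_language(utterance):
        return "wrong_language"
    seen.add(key)
    return "accepted"


if __name__ == "__main__":
    seen = set()
    for utterance in [
        "How do I reset my password?",
        "how do i reset my password",
        "asdf jkl qwrt",
        "¿Cómo restablezco mi contraseña?",
    ]:
        print(utterance, "->", validate(utterance, seen))
```

Running the checks at submission time, as in validate, means a rejected utterance never reaches the collected set, which is the point of filtering "as soon as they’re collected." A production system would replace the two heuristics with trained models and would likely use fuzzy matching rather than exact normalized matches for duplicates.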

Higher Quality

Validation ensures only high-quality, usable data is collected, resulting in up to a 35% reduction in rejected utterances.

Greater Speed and Scale

With Figure Eight Federal’s training data validation, you can feel confident in widening your verbal interface usage to a larger user base, knowing that every utterance is checked for quality on a secure enterprise-grade platform.

Reduced Cost

You can eliminate peer reviews of utterances you can’t use, saving money and time.


Figure Eight Federal can make your machine learning projects a success. Contact us and we’ll get you started.