Appendix A - Model Maintenance
  • 30 Jul 2024

Once a model is active, it starts fielding engagements. Over time, however, the data the model encounters can change, or data unlike anything it saw during training can reach Ushur for inference. Ushur therefore recommends continuously monitoring a model's performance for inaccuracies.
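One way to make this monitoring concrete is to compare the model's predictions against labels confirmed by human review and compute precision and recall for each category. The following is a minimal sketch in plain Python, not an Ushur API: the function names, the `(predicted, true)` record format, the category names, and the 0.8 recall threshold are all assumptions made for illustration.

```python
from collections import Counter


def per_category_metrics(records):
    """Compute precision, recall, and support for each category.

    records: iterable of (predicted_label, true_label) pairs, for example
    model predictions paired with labels confirmed during human review.
    """
    true_positives = Counter()
    predicted_counts = Counter()
    actual_counts = Counter()
    for predicted, actual in records:
        predicted_counts[predicted] += 1
        actual_counts[actual] += 1
        if predicted == actual:
            true_positives[predicted] += 1

    metrics = {}
    for category in set(predicted_counts) | set(actual_counts):
        n_pred = predicted_counts[category]
        n_true = actual_counts[category]
        metrics[category] = {
            # Share of predictions for this category that were correct.
            "precision": true_positives[category] / n_pred if n_pred else 0.0,
            # Share of true samples of this category that the model found.
            "recall": true_positives[category] / n_true if n_true else 0.0,
            # Number of reviewed samples that truly belong to this category.
            "support": n_true,
        }
    return metrics


def flag_underperforming(metrics, min_recall=0.8):
    """Return the categories whose recall falls below an assumed threshold."""
    return sorted(
        category
        for category, m in metrics.items()
        if m["support"] > 0 and m["recall"] < min_recall
    )


# Hypothetical reviewed sample: (predicted, true) pairs.
reviewed = [
    ("billing", "billing"), ("billing", "billing"), ("claims", "billing"),
    ("claims", "claims"), ("claims", "claims"), ("billing", "claims"),
    ("support", "support"),
]
metrics = per_category_metrics(reviewed)
flagged = flag_underperforming(metrics)
```

Running a check like this on a regular schedule over a fresh reviewed sample gives a per-category trend, so a drop in one category shows up even when overall accuracy looks stable.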

The general approaches for improving the performance of underperforming categories are as follows:

  • If the number of training samples for a category is too low, try adding more data.

  • Check for labeling errors. Training data is labeled by humans, and human labeling error rates are typically 3-5%. The Ushur platform also surfaces potential mislabels it has identified, which can be reviewed as part of this exercise.

  • Double-check the data collection process for potential ‘sampling bias’. A biased sample is one that is not representative of the entire population. For example, emails sampled during holidays such as Christmas may have a very different tone from emails sent during the rest of the year. Data collection should be purely random; that is, each data point should have an equal chance of being chosen (for example, by drawing from an entire database archive or by collecting samples across a wide time range).

  • In some scenarios, two or more categories overlap, and it can be hard to distinguish between them based on the textual content of the samples alone. In those cases, consider using the Ushur platform’s intelligent data extraction capability. It helps identify Key Business Indicators (KBIs), such as IDs in the email, that can be used to override the model’s predicted classification.

  • If the number of topics or categories is large, and those topics overlap heavily, start with fewer categories and progressively add more. The initial set of categories can be chosen based on data volume or availability and on business value. The enterprise can also consider process improvements such as merging overlapping categories, because the majority of emails are typically handled in a few category queues.

  • Experiment with the wide variety of built-in model types provided in the Ushur platform, which range from simple supervised learning models to neural network-based deep learning models. Ushur data scientists also offer guidance on the best model to use for the data under consideration.
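
The idea of letting an extracted identifier override the model's prediction for overlapping categories can be sketched as follows. This is an illustration in plain Python and does not show the Ushur platform's data extraction feature or its API: the ID formats (`CLM-` followed by six digits, `INV-` followed by five digits), the category names, and the function name are hypothetical.

```python
import re

# Hypothetical rules: each pairs a pattern for a business identifier with the
# category that identifier implies. Real ID formats will differ.
KBI_RULES = [
    (re.compile(r"\bCLM-\d{6}\b"), "claims"),
    (re.compile(r"\bINV-\d{5}\b"), "billing"),
]


def classify_with_override(text, predicted_category):
    """Return (category, source) for a message.

    If the text contains a known identifier, the category implied by that
    identifier replaces the model's prediction. Otherwise the prediction
    is kept unchanged.
    """
    for pattern, category in KBI_RULES:
        if pattern.search(text):
            return category, "kbi_override"
    return predicted_category, "model"


# The model predicted "billing", but the claim ID indicates a claims email.
overridden = classify_with_override("Please update me on CLM-123456.", "billing")

# No identifier is present, so the model's prediction stands.
unchanged = classify_with_override("When is my payment due?", "billing")
```

Returning the source alongside the category makes it possible to track how often the override fires, which is itself a useful signal that two categories are hard for the model to separate.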

