Friday, April 27, 2012

A More Scientific Process for Boosting Quality Assurance

Current State

Most contact centers employ a Quality Assurance program, which consists of reviewing customer interactions and grading the performance of an agent or self-serve technology. For this brief article, I am going to focus on live rep interactions. In nearly all cases, performance is graded against well-defined behavioral standards on a well-defined scale.

Many companies now supplement this with, or even prioritize, Customer Satisfaction (C-SAT) measurement, which is obtained via customer survey either immediately following or sometime after the interaction.

The Problems
There are significant problems with both methods, especially where measurement of employee performance is concerned.

The two biggest issues are:
1. Low sample size
2. High levels of subjectivity

Quality Assurance Review
A common rule of thumb puts a statistically relevant sample at 5-10% of interactions. Manual interaction review, however, typically covers well below 1%, because thoroughly reviewing a single interaction, often multiple times, is time-expensive. Over time these measurements will begin to show some consistency and dependable trends, but the low frequency and high subjectivity are sore spots not only for the business but also distractions for the employees themselves. The speculation is often, "but why did they pick that call? ... my other calls are so much better," or "the other Quality Specialist gave me good grades but this one doesn't."
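To make that gap concrete, here is a quick back-of-the-envelope calculation using Cochran's sample-size formula with a finite-population correction. The monthly call volume and review count below are illustrative assumptions, not figures from any particular center:

```python
import math

def required_sample(population, z=1.96, margin=0.05, p=0.5):
    """Cochran's sample-size formula with finite-population correction."""
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    return math.ceil(n0 / (1 + (n0 - 1) / population))

# Illustrative: an agent handling ~1,000 calls per month,
# measured at 95% confidence with a 5% margin of error.
calls_per_month = 1000
needed = required_sample(calls_per_month)   # ~278 calls
reviewed = 5                                # a typical manual QA sample
print(f"Needed: {needed} ({needed / calls_per_month:.0%}), "
      f"reviewed: {reviewed} ({reviewed / calls_per_month:.1%})")
```

At the individual agent level, a handful of reviewed calls is nowhere near a defensible sample.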

C-SAT
The sample size for customer surveys is typically between 2 and 6% at best, and lower in some environments. While surveying your customers is generally a great practice, it is also highly subjective, and the results are inconsistent. If you ask customers to rate on a scale from 1 to 10, one customer's 9 is not necessarily another customer's 9. And despite how carefully the questions are framed, customers still show a tendency to punish agents when they are dissatisfied with a company policy (as in cases where an agent is required to say "no"). As with the Quality Assurance efforts, a center will see some useful trends over time and will probably identify highly substandard interactions quickly, but as a dependable employee performance measurement, C-SAT falls well short.

The Solution
We need to increase the sample size to well above the threshold of statistical relevance and remove as much subjectivity as possible.

The way to do this is with Speech or Text Analytics technology ('speech' for voice; 'text' for chat, email, or social media support interactions). Unfortunately, most of today's analytical solutions are missing the most important component - a modeling engine - though the availability of this technology is on the rise.

Step 1 - build the model
Run a large sample of both exemplary and poor interactions through the modeling engine. A good modeling engine will be able to build a model from hundreds if not thousands of statistical attributes.
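As a rough illustration of what happens under the hood, here is a minimal sketch using scikit-learn as a stand-in for a commercial modeling engine. The transcripts and labels are hypothetical placeholders, and a real engine would derive far richer acoustic and linguistic attributes:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical training data: transcripts labeled 1 (exemplary) or 0 (poor).
transcripts = [
    "thanks for calling, I can definitely help you with that today",
    "I don't know, that's not my department, call back later",
    # ... hundreds or thousands more reviewed interactions ...
]
labels = [1, 0]

# TF-IDF n-grams stand in for the "hundreds if not thousands of
# statistical attributes" a real modeling engine would compose.
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), min_df=1),
    LogisticRegression(),
)
model.fit(transcripts, labels)
```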

Step 2 - develop a scoring process against your model
Your modeling engine should be able to analyze a new interaction against the model and assign a score, rewarding matches on positive attributes and penalizing matches on negative ones.
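Continuing the hypothetical sketch above, scoring a new interaction amounts to a probability estimate against the fitted model, rescaled to whatever range you prefer:

```python
# Score a new interaction on a 0-100 scale using the fitted pipeline.
new_call = ["happy to help, let me pull up your account right away"]
machine_score = model.predict_proba(new_call)[0, 1] * 100
print(f"Machine score: {machine_score:.0f}")
```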

Step 3 - calibrate the model
If you plan to use this measurement in concert with your other measurements, the scale should mirror them, i.e. a call the model scores at 90% should align with a personally reviewed interaction scored at 90%.
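One simple way to do this, assuming you have a set of interactions carrying both a machine score and a human QA score, is a monotonic regression that maps one scale onto the other. The paired scores below are placeholders:

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

# Hypothetical paired scores for the same set of interactions.
machine_scores = np.array([35, 50, 62, 70, 81, 88, 95])
human_scores = np.array([40, 55, 60, 75, 80, 90, 93])

# Fit a monotonic mapping from the machine scale to the human scale.
calibrator = IsotonicRegression(out_of_bounds="clip")
calibrator.fit(machine_scores, human_scores)

# A raw machine score of 90 now reads on the human reviewers' scale.
print(calibrator.predict([90]))
```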

Step 4 - apply the model at scale
Once you have a model, use the engine to grade many interactions. Theoretically, you could assess every interaction your center collects (many centers are required by law to record every interaction). For performance measurement, every interaction for each agent can be run against the model to produce an average machine score, which I'll call a "soft score" when communicating the process to employees (sounds a bit more warm and fuzzy, right?).
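At scale this is just batch scoring and aggregation. A sketch with pandas, re-using the `model` pipeline fitted in the Step 1 sketch (the agents and transcripts are placeholders):

```python
import pandas as pd

# Hypothetical table: one row per recorded interaction.
interactions = pd.DataFrame({
    "agent": ["ana", "ana", "ben", "ben", "ben"],
    "transcript": [
        "glad to help, your refund is on its way",
        "thanks for holding, I have the answer for you",
        "that's not something I can do, call back later",
        "I can definitely take care of that right now",
        "happy to walk you through the setup",
    ],
})

# Score every interaction, then average per agent for the soft score.
interactions["soft_score"] = (
    model.predict_proba(interactions["transcript"])[:, 1] * 100
)
soft_scores = interactions.groupby("agent")["soft_score"].mean().round(1)
print(soft_scores)
```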

Step 5 - continually refine the process 
Business climates, customer preferences, and standards change, so it will be important to rebuild your model with some regularity. How you use the soft score is up to you, and there are options. The idea presented here is not necessarily to replace your other measurements but to inject more statistical relevance and objectivity into your ongoing efforts. The soft score is probably best used as a given percentage of an overall measurement that also includes Quality Assurance monitoring and C-SAT. At the very minimum, this process could be used to calibrate your staple efforts and find points of inconsistency - among Quality Specialists on the performance review side, or among products, services, or policies on the business side.
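If the soft score becomes one component of a composite, the blend might look like the sketch below. The weights are purely illustrative; they are a business decision, not a recommendation:

```python
# Illustrative blend: weights are hypothetical and should be set by the business.
WEIGHTS = {"qa_review": 0.40, "csat": 0.30, "soft_score": 0.30}

def overall_score(qa_review, csat, soft_score):
    """Weighted composite of the three measurements, each on a 0-100 scale."""
    return (WEIGHTS["qa_review"] * qa_review
            + WEIGHTS["csat"] * csat
            + WEIGHTS["soft_score"] * soft_score)

print(overall_score(qa_review=92, csat=85, soft_score=88))  # 88.7
```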