• The dashboard card Ratings by Category focuses on the trend across categories for each agent. It looks specifically at the performance per category, not the performance per conversation. Differences between performance per category and performance per conversation can be caused by fail categories or differences in the number of ratings per category. 

The dashboard card 'Scores by conversation' allows you to look at individual conversations per agent.


Team Meow uses the following rating categories, each with its own weight. The critical categories only have a weight of 0.05 because they are rated as either pass or fail.

What this means for the calculation:

The highest possible rating for an agent is: 0.05 + 0.05 + 2 + 2 + 1 + 1 + 1 + 0.5 + 1 + 1 = 9.6, which corresponds to 100%.
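As a rough illustration of that normalization, here is a minimal Python sketch. The weights are the ones from the sum above; the function name `as_percentage` is made up for this example and is not part of the dashboard.

```python
# Weights as listed in the sum above (two critical categories at 0.05 each).
weights = [0.05, 0.05, 2, 2, 1, 1, 1, 0.5, 1, 1]

# Weighted total an agent reaches when every category is rated 100%.
max_points = sum(weights)


def as_percentage(weighted_points):
    """Express a weighted total as a share of the maximum (hypothetical helper)."""
    return weighted_points / max_points * 100


print(round(max_points, 2))
print(round(as_percentage(max_points), 1))
```

The rounding is only there to hide floating-point noise when printing.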

Where does the data on the dashboard come from? 

Ratings by Category

This card compares ratings by category per agent. It does not compare conversation averages for agents, but category averages. 

Each column shows the average of the category ratings given to a specific agent. The score column averages those averages taking into account the weights of each category. 

Note that differences can arise when not all categories received the same number of ratings.

In this example, the average of grammar is 100%.
Grammar has a weight of 0.05 - so this is the proportion to which it will be included in the final score. 

Relation of individual conversation scores to dashboard conversation scores

If you want to find out the exact conversation scores for your agent, use the last dashboard card (filtering for that agent).

If you want to find out the category scores for your agents, independent of fail categories, use the Ratings by Category card.

The main difference is that, in this card, a failed fail category does not fail the entire conversation. To make trends visible even if a fail category is triggered every single time, the calculation in these general cards treats each category as its own entity.

This allows you to analyze those categories across agents, independent of the errors that they might have made in fail categories.

A practical example for Individual Conversation Scores

For the sake of visualization, let’s imagine the following scenario, with these 5 categories: 

Agent 1 has received the following ratings (you can get this data from the VERY LAST CARD on the dashboard):

The conversation score here is calculated as follows:

ticket_score = (cat1_score * cat1_weight + cat2_score * cat2_weight + ...) / (cat1_weight + cat2_weight + ...)

In this case, this means: 

ticket_score = (request_score * 0.05 + clarification_score * 2 + explanation_score * 2 + writing_score * 0.5 + internal-data_score * 1) / (0.05 + 2 + 2 + 0.5 + 1)

Exception: if request_score = 0%, then ticket_score = 0%.

In the dashboard, these numbers will be rounded.

The conversation score of conversation 5 is 0, because the FAIL category automatically puts this to zero. 
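The calculation above could be sketched in Python as follows. This is only an illustration, not the dashboard's actual code: the weights and the fail rule come from the five-category example, while the function name, the dictionary layout, and the choice to count only categories that were actually rated are assumptions.

```python
# Weights from the five-category example above.
WEIGHTS = {
    "request": 0.05,  # fail category
    "clarification": 2,
    "explanation": 2,
    "writing": 0.5,
    "internal-data": 1,
}
FAIL_CATEGORIES = {"request"}


def ticket_score(ratings):
    """Weighted conversation score in percent; ratings maps category -> 0..100."""
    # A 0% rating in a fail category sets the whole conversation to 0%.
    if any(ratings.get(cat) == 0 for cat in FAIL_CATEGORIES):
        return 0.0
    # Assumption: only the categories that were rated count toward the score.
    weighted_sum = sum(ratings[cat] * WEIGHTS[cat] for cat in ratings)
    total_weight = sum(WEIGHTS[cat] for cat in ratings)
    return weighted_sum / total_weight


# Hypothetical ratings: everything is perfect except the fail category.
failed = {"request": 0, "clarification": 100, "explanation": 100,
          "writing": 100, "internal-data": 100}
print(ticket_score(failed))
```

With these made-up ratings the result is 0, which mirrors what happens to conversation 5.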

Taking these same conversations together as category scores gives the following data: 

Agent A

The score here is the weighted average of the category averages, not the average of the individual conversation scores.
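To see how far the two views can drift apart, here is a small Python sketch under stated assumptions. The weights and the fail rule are taken from the five-category example; the two conversations and all names in the code are invented for illustration and do not reflect how the dashboard is implemented.

```python
from statistics import mean

WEIGHTS = {"request": 0.05, "clarification": 2, "explanation": 2,
           "writing": 0.5, "internal-data": 1}
FAIL_CATEGORIES = {"request"}

# Hypothetical ratings (percent) for two conversations of one agent.
conversations = [
    {"request": 100, "clarification": 100, "explanation": 100,
     "writing": 100, "internal-data": 100},
    {"request": 0, "clarification": 100, "explanation": 100,
     "writing": 100, "internal-data": 100},
]


def weighted(scores):
    """Weighted average of category -> score using WEIGHTS."""
    return sum(scores[c] * WEIGHTS[c] for c in scores) / sum(WEIGHTS[c] for c in scores)


def conversation_score(ratings):
    """Per-conversation score: a 0% fail category zeroes the conversation."""
    if any(ratings.get(c) == 0 for c in FAIL_CATEGORIES):
        return 0.0
    return weighted(ratings)


# Conversation view: average of the individual conversation scores.
conversation_view = mean(conversation_score(r) for r in conversations)

# Category view: average each category first, then apply the weights.
# The fail category is just one more category with a weight of 0.05.
category_averages = {c: mean(r[c] for r in conversations if c in r) for c in WEIGHTS}
category_view = weighted(category_averages)

print(round(conversation_view, 1), round(category_view, 1))
```

In this made-up case the conversation view lands at about 50% because one conversation failed outright, while the category view stays close to 100% because the failed category only carries a weight of 0.05.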

Why would we want that differentiation?

The average per category highlights opportunities for improvement far more explicitly than the average per conversation. From there, you can dive into the specifics (in the last dashboard card). It’s like a 10,000 m view of the performance instead of a granular view.

Both calculations have their place and are used by different types of users looking at the dashboard. 
