Using Agent QA to Automatically Evaluate Agent Performance

Agent QA automatically analyzes your support agents’ responses using customizable, AI-powered rubrics. With this tool, support leaders can consistently measure agent performance—like resolution quality, grammar, and empathy—without relying on manual reviews. You’ll learn how to set up rubrics, interpret QA results, and use insights to coach agents at scale.

When to Use This Feature

Use Agent QA when:

  • You want to measure agent performance across all tickets, not just a sample
  • Manual QA is too slow or inconsistent for your team’s needs
  • You’re scaling your support team and want to enforce quality standards

💡 Agent QA is available for Pro and Enterprise plans.

Why It Matters

Automated QA at Scale

Evaluate every support ticket using structured, AI-powered rubrics instead of manually reviewing small samples.

Performance Insights You Can Act On

Agent QA dashboards reveal which agents are excelling and where coaching is needed, with rubrics tailored to your standards.

Custom QA, Your Way

You decide what matters: grammar, empathy, accuracy, or customer satisfaction. Build rubrics that reflect your internal policies and customer expectations.

How It Works

Step 1: Connect Your Help Desk

Before you can use Agent QA, connect your help desk to Forethought:

  • Go to Settings > Integrations
  • Click + Connect new integration (top right)
  • Search for your help desk (e.g., Zendesk, Salesforce)
  • Select it and follow the on-screen instructions

Step 2: Create AI QA Rubrics

Rubrics allow you to consistently score interactions using either agent-focused or user-focused metrics. 

To create a custom rubric:

  1. Go to Discover > Agent QA.
  2. In the upper right corner, click QA rubrics.
  3. On the upper left side, select either the Agent metrics or User metrics tab.
  4. Click + Create New to build your custom metric.

💡 Prebuilt rubrics available:

Forethought provides prebuilt rubrics to help you get started quickly.

Agent-focused rubrics:

  • Grammar
  • Empathy
  • Closing
  • Solution Offered

User-focused rubrics:

  • Starting Sentiment
  • Ending Sentiment

Note: You can create up to 10 custom rubrics. If you want to create additional rubrics, please contact your Customer Success Manager.

Agent Metrics

Agent metrics enable you to perform quality assurance on how effectively your support agents manage tickets. You can measure their performance against your organization’s standards, such as resolution quality, technical accuracy, communication clarity, and adherence to internal processes.

Configure each metric with the following:

  1. Metric Name: Provide a clear and descriptive name for your metric.
  2. Metric Definition: Define the criteria the AI will use to assess agent performance. As a best practice, use structured, specific rules.
  3. Scoring Definition: Describe how the AI should calculate the score based on your defined rules.
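
For illustration, a rubric can be thought of as three pieces of structured text. The sketch below is hypothetical (rubrics are authored in the Agent QA UI, not via code), but it mirrors the three fields described above:

    from dataclasses import dataclass

    # Hypothetical shape of a rubric; Agent QA has no code API for this,
    # so the field names simply mirror the UI fields described above.
    @dataclass
    class Rubric:
        metric_name: str         # a clear, descriptive name
        metric_definition: str   # the criteria the AI uses to assess performance
        scoring_definition: str  # how the AI should calculate the score

    rubric = Rubric(
        metric_name="Resolution Quality",
        metric_definition="Did the agent give a clear, complete, accurate resolution?",
        scoring_definition="Start at 100; deduct 20 points per issue found.",
    )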

Example of a custom Agent metric:

Metric Name: Resolution Quality

Metric Definition:

This metric measures whether the agent provided a clear, complete, and accurate resolution to the customer's issue within their public replies. Focus strictly on the presence or absence of key resolution elements stated in the criteria below. Do not evaluate grammar, tone, or formatting unless it directly impacts clarity.

Criteria for Evaluation:

  1. The agent directly answers or addresses the customer’s main question or issue.
  2. The resolution includes necessary next steps, links, or instructions (if applicable).
  3. The resolution is factually accurate based on available information.
  4. There is no misinformation or irrelevant content that could confuse the customer.
  5. The response avoids excessive technical jargon unless the customer has already demonstrated familiarity.

Scoring Definition:

  • Evaluate only the public agent messages in the thread.
    • If there is no public agent message, assign a score of “N/A.”
  • Start with a score of 100.
  • Deduct 20 points for each of the following resolution issues found (up to 100 total):
    • The agent did not directly address or resolve the issue.
    • The resolution lacks key steps or required follow-up info.
    • The information provided is inaccurate or misleading.
    • The message contains unnecessary or confusing content.
    • The message uses overly technical language not aligned with the customer’s level of understanding.
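
To make the deduction logic concrete, here is a minimal sketch of the arithmetic in Python. The function and its inputs are hypothetical: in Agent QA the AI derives the issue count from the ticket's public agent messages, so this only illustrates the scoring rules above.

    # Hypothetical illustration of the "Resolution Quality" scoring rules.
    # In Agent QA, the AI infers these inputs from the ticket itself.
    def resolution_quality_score(has_public_agent_message: bool, issues_found: int) -> int | None:
        """Start at 100, deduct 20 per issue, floor at 0; None stands in for 'N/A'."""
        if not has_public_agent_message:
            return None  # no public agent message -> score "N/A"
        return max(0, 100 - 20 * issues_found)

    print(resolution_quality_score(True, 2))   # 60
    print(resolution_quality_score(True, 6))   # 0 (deductions cap at 100)
    print(resolution_quality_score(False, 0))  # None, displayed as "N/A"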

Best Practice:

To ensure visual consistency and better chart readability across the platform, we highly recommend using a 0–100 scoring range whenever possible.

User Metrics

User metrics help you perform quality assurance on the customer experience by assessing sentiment, engagement, or behavior in response to a support interaction. These insights help you understand how users perceive the support they received and whether their needs were effectively addressed.

Configure each metric with the following:

  1. Metric Name: Provide a clear and descriptive name for your metric.
  2. Metric Definition: Define how the AI should analyze user behavior or sentiment. As a best practice, use structured, specific rules.
  3. Scoring Definition: Explain how the metric will be scored based on user engagement or sentiment.

Example of a custom User metric:

Metric Name: User Engagement

Metric Definition:

This metric measures whether the user dropped off before receiving sufficient instructions or information. A confirmed resolution in the conversation is not required.

Scoring Definition:

Yes, the user dropped off (Score: 1 – Low Engagement)

Examples:

  • The user stopped responding after being asked for more information to proceed to resolution.
  • The user reopened the ticket with unrelated issues post-resolution, which were not addressed.

No, the user did not drop off (Score: 5 – High Engagement)

Examples:

  • The user confirmed the solution worked or expressed satisfaction (e.g., "Thanks, that fixed it!").
  • The user indicated no further help was needed (e.g., "All good now").
  • The user explicitly closed the loop (e.g., "Thank you" with no outstanding questions or concerns).
  • The user did not respond after receiving sufficient instructions or information, which is interpreted as silent acceptance.

Important:

In this example, we’re using a 1–5 scale for scoring. While you can set custom score ranges in your Scoring Definition, please be aware that non-standard ranges like 1–5 may not display well on radar charts.

Best Practice:

To ensure visual consistency and better chart readability across the platform, we highly recommend using a 0–100 scoring range whenever possible.
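
If you have already written a rubric on a 1–5 scale, a linear rescale maps it onto 0–100 for cleaner charts. This is ordinary arithmetic, not an Agent QA setting; you would fold it into your Scoring Definition:

    # Linearly map a 1-5 score onto 0-100: 1 -> 0, 3 -> 50, 5 -> 100.
    def rescale_1_to_5(score: int) -> int:
        return (score - 1) * 25

    print(rescale_1_to_5(1), rescale_1_to_5(3), rescale_1_to_5(5))  # 0 50 100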

Step 3: Test and Save Your Metric

  • Enter a ticket ID into the test field
  • Click Test to generate a sample evaluation
  • Repeat for at least 5 different tickets for better accuracy
  • Click Save when you're satisfied

⚠️ Rubrics only apply to tickets created after they're activated. Past tickets will not be scored retroactively.

Reviewing QA Insights in the Agent QA Dashboard

The Agent QA dashboard offers valuable insights into your support team's performance using both default and customized rubrics. Here’s how to navigate and understand the dashboard.

Default Metrics

Once you connect your help desk and enable Agent QA, the following metrics will automatically appear at the top of the dashboard:

  • Total Received Tickets – The number of tickets received in your help desk for the selected time period.
  • Tickets Assigned – The number of tickets assigned to agents.
  • Resolved Tickets – The number of tickets marked as resolved. The resolution rate is calculated by dividing resolved tickets by assigned tickets.
  • First Contact Resolution – The number and percentage of tickets resolved in the first response.
  • Average Full Resolution Time – The average time taken to fully resolve a ticket.
  • Average Time to First Response – The average time taken for an agent to send the first response.
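
As a worked example of the resolution-rate arithmetic above (the numbers are made up):

    # Resolution rate = resolved tickets / assigned tickets (made-up numbers).
    assigned_tickets = 480
    resolved_tickets = 408
    print(f"{resolved_tickets / assigned_tickets:.0%}")  # 85%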

Agent Performance Comparison 

AI QA Tab

After adding your custom rubrics, the AI QA tab displays a time series comparison chart:

  • Y-axis: Scores across custom metrics.
  • X-axis: Time (daily, weekly, monthly, or quarterly).
  • Filters: Use filters to adjust the time range.

Click a data point to view detailed AI QA insights.

Ticket Tab

In the Ticket tab, default metrics are plotted over time:

  • Y-axis: Number of tickets or time in seconds.
  • X-axis: Time period.
  • Filters: Select daily, weekly, monthly, or quarterly views.

Click a data point to view ticket details.

Agent Insights and Radar Charts

Agent Cards

Below the dashboard charts, you’ll find agent summary cards that include:

  • Agent name
  • Number of solved tickets
  • Average QA score
  • Worst-performing topic
  • Radar chart with rubric scores

By default, cards are sorted by solved ticket count.

Understanding the Radar Chart

Each radar chart provides a visual breakdown of an agent’s performance across all custom QA rubrics:

  • Each axis represents a different metric.
  • Higher scores extend further toward the edge.
  • A balanced, expansive shape indicates well-rounded performance.

Radar charts help managers quickly identify an agent’s strengths and uncover areas that may benefit from targeted coaching.
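
If you want to reproduce a similar chart outside the dashboard, for example in a coaching deck, a radar (polar) plot is easy to draw. Below is a minimal sketch with matplotlib, using made-up rubric names and scores on the recommended 0–100 scale:

    import numpy as np
    import matplotlib.pyplot as plt

    # Made-up rubric scores for one agent on a 0-100 scale.
    metrics = ["Grammar", "Empathy", "Closing", "Solution Offered", "Resolution Quality"]
    scores = [88, 72, 95, 60, 80]

    # One axis per metric; repeat the first point to close the polygon.
    angles = np.linspace(0, 2 * np.pi, len(metrics), endpoint=False).tolist()
    angles += angles[:1]
    values = scores + scores[:1]

    fig, ax = plt.subplots(subplot_kw={"projection": "polar"})
    ax.plot(angles, values)
    ax.fill(angles, values, alpha=0.25)
    ax.set_xticks(angles[:-1])
    ax.set_xticklabels(metrics)
    ax.set_ylim(0, 100)
    plt.show()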

Viewing QA Insights for a Specific Agent

You can view detailed QA insights for any agent by clicking their card on the Agent QA dashboard. This gives you a full breakdown of how they're performing based on your custom QA rubrics.

Agent Overview

At the top of the page, you'll see:

  • Radar charts that show the agent's performance across your custom rubrics (both agent and user metrics).

  • A card showing the least efficient topics—these are topics with the lowest first contact resolution rates. They help you spot areas where the agent might need extra support or training.

AI QA – Agent Report

Click the AI QA – Agent card to view the agent’s scores across your custom agent rubrics. In addition to the scores, you’ll see:

  • A summary and reasoning behind each score
  • A comparison to the previous time period
  • A date filter that lets you choose the time range for the comparison

This report helps you track how an agent’s core support skills—like grammar, empathy, and solution quality—are improving or declining over time. It allows you to pinpoint where they’ve made progress or where additional coaching may be needed.

AI QA – User Report

Click the AI QA – User card to view the agent’s scores across your custom user rubrics. Similar to the Agent Report, this view includes:

  • A summary and explanation of each score
  • A comparison to the previous period
  • A date filter to adjust the comparison timeframe

This report gives you insight into how users are experiencing support interactions with the agent, such as whether sentiment improved from the start to the end of the conversation. These insights help you identify how the agent’s tone and communication style impact the customer experience, making it easier to coach for improvements in empathy, clarity, and overall customer satisfaction.

Default Metrics

Below the overview cards, you'll find default performance stats, such as:

  • Tickets assigned
  • Tickets solved
  • First contact resolution rate
  • Full resolution time
  • Time to first response

These numbers give you a quick snapshot of the agent’s workload and response times.

Performance Over Time

Next, you'll see a time series comparison chart that tracks how an agent’s QA scores evolve over time, along with the tickets they worked on (based on default metrics). This chart reflects performance across your custom QA rubrics and compares results to the previous time period.

You can use this chart to:

  • Monitor trends in performance
  • Spot improvements or areas that need attention
  • Filter results by day, week, month, or quarter to view short- or long-term progress

This view makes it easy to see how agents are performing over time and identify when additional coaching or recognition might be needed.

Tickets

At the bottom of the page, you'll find a table showing all tickets the agent has worked on. This includes:

  • Timestamp
  • Ticket ID
  • Ticket title
  • Ticket body
  • Agent name
  • Ticket status
  • Support channel
  • Topic
  • QA scores based on your custom rubrics

This section is great for digging into specific examples or reviewing tickets that impacted the agent’s overall score.

Ticket Details

Click any row in the Tickets table to open a side drawer with more information about the interaction. This includes:

  • Scores across all QA rubrics (agent and user rubrics), with indicators for positive, neutral, or negative performance
  • Basic ticket information, including date created, channel, ticket status, etc.
  • A full transcript of the conversation between the agent and the customer

This view provides the context needed to understand why a score was assigned and helps QA reviewers give more informed feedback.

Best Practices

Start with 3–5 key metrics

Focus on a handful of important metrics like resolution quality, grammar, and empathy. Too many metrics can dilute focus and make trends harder to interpret.

Test each rubric before activating

Use sample ticket IDs to verify how your metrics perform across different scenarios. This helps ensure scoring accuracy before rollout.

Use specific, structured scoring rules

Avoid vague criteria. Instead, define clear, measurable standards (e.g., "Agent must confirm resolution in final message").

Review radar charts regularly for coaching trends

Radar charts help visualize agent strengths and weaknesses. Use them to guide 1:1 coaching and team-wide training plans.

💡 Tailor rubrics to your internal QA goals

Align your metrics with your existing coaching or performance standards—don’t feel limited by default KPIs.

Avoid overly subjective metrics

Metrics based on tone or intent should still include clear definitions to ensure consistent scoring.

Don’t reuse rubrics without testing

Each team is different—rubrics should reflect your specific workflows and customer expectations.

Use Cases

Measuring Agent Performance at Scale

A support team of 100 agents connects their help desk to Forethought and builds custom rubrics like resolution quality and grammar accuracy. Agent QA then evaluates 100% of tickets. This gives the team consistent, scalable insights across all agents—without the manual QA burden.

Coaching Agents with AI-Powered Insights

A manager identifies an agent with consistently low grammar scores. Using Agent QA insights, they create a personalized 1:1 coaching plan. After regular feedback sessions, the agent’s scores and customer satisfaction show steady improvement.

Q&A

Can I apply rubrics to past tickets?

No. Rubrics only apply to tickets created after the rubric is activated.

How long before I see QA scores after setup?

QA scores typically appear within 24 hours of rubric activation.

Can I track customer sentiment or behavior?

Yes. You can use User Metrics to evaluate sentiment, tone, and engagement.

What’s the best number of metrics to use?

We recommend starting with 3–5 custom metrics to keep your QA rubric focused and manageable.

What’s the minimum ticket volume needed for Agent QA to work?

There’s no minimum ticket volume required. However, we’re currently testing performance limits at higher volumes.

What must be included in a ticket for Agent QA to run? Are there any channel-specific requirements?

Agent QA requires that a ticket has a Closed or Resolved status. There are no channel-specific limitations.

Which agents will be assessed by Agent QA? Can we exclude certain agents?

All agents are assessed by default. Agent QA reads the conversation between the agent and customer to generate scores. You currently can't exclude individual agents from evaluation.

Can we exclude certain tickets from QA scoring?

No. Agent QA evaluates all closed or resolved tickets. There is no ticket-level filtering at this time.

How many custom QA rubrics can we create?

You can create up to 10 custom QA rubrics. If you want to create additional rubrics, please contact your Customer Success Manager.

Do we need to purchase Discover to use Agent QA?

No. Agent QA is available without Discover. However, some features—like Worst Performing Topics—require a Discover subscription.

Which help desks are supported?

  • Zendesk
  • Salesforce
  • Freshdesk

If a ticket is reopened, will the AI metric be re-run?

Yes. Agent QA will re-run the AI metric after the ticket is closed or resolved again.

Note: Detecting whether a ticket was reopened depends on your help desk setup. Agent QA uses the first resolution time to identify reopened tickets.

If multiple agents respond in the same ticket, how does scoring work?

Agent QA scores at the ticket level, not per agent. All agents involved will receive the same QA score for that ticket.

What content does Agent QA analyze?

Agent QA evaluates the text-based conversation between the agent and customer. Attachments like files, audio, or images are not currently analyzed.

If we add a new AI metric, will past tickets be scored using it?

No. New metrics only apply to future tickets created after the metric is activated.

If an AI metric is deleted, what happens to previous scores?

Scores generated by that metric will be cleared from the tickets.

How often does Agent QA run its evaluations?

Agent QA runs once per day. Scores are usually available within 24 hours of ticket closure.

What happens if the daily QA job fails?

If QA prediction fails (e.g., due to OpenAI or internal issues), the next day’s run will automatically process any missed tickets.
