ChatBees tops RAG quality leaderboard

ChatBees, a serverless LLM platform, achieved a 4.6/5 average RAG quality score in a benchmark published by Tonic.ai, topping the leaderboard ahead of other popular platforms such as Cohere and the OpenAI Assistant.
Even more impressively, we achieved this score without any client-side configuration! Every ChatBees customer has access to this state-of-the-art RAG system out of the box. You can verify our results here.
[Image: The detailed test scores, ChatBees vs. OpenAI]
[Image: The average scores of the top RAG products]

Not Just Benchmarks: Try It for Your Data!

Ultimately, what matters most are the outcomes. While we're thrilled to be at the forefront in accuracy, we're most excited to help businesses leverage LLMs! Sign up now and try it for yourself!

About ChatBees

ChatBees is built by a team of industry experts and leaders with decades of experience building scalable enterprise cloud, data, and search platforms at top-tier tech companies like Snowflake, Google, Microsoft, Uber and EMC.
We built ChatBees to help organizations tackle the challenge of running LLM apps in production. ChatBees offers:
1. A state-of-the-art RAG pipeline out of the box
2. Simple APIs to build your own LLM application in just a few lines of code (a short sketch follows below)
3. No DevOps or tuning required to make it work in production
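
To give a feel for the API, here is a minimal sketch of building a small Q&A app with ChatBees, using the same client calls as the benchmark code below; the API key, account id, collection name, document path, and question are placeholders.

import chatbees as cb

# Placeholder credentials; use your own API key and account id.
cb.init(api_key="my_api_key", account_id="my_account_id")

# Create a collection and index a document (path is illustrative).
col = cb.create_collection(cb.Collection(name="my_docs"))
col.upload_document("./my_handbook.txt")

# Ask a question against the indexed document; ask() returns the answer plus additional data.
answer, _ = col.ask("What is our refund policy?")
print(answer)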

Benchmarking RAG quality with Tonic.ai

The benchmark from Tonic consists of a series of needle-in-a-haystack retrieval challenges designed to evaluate a RAG system’s accuracy and completeness. You can find the complete Jupyter notebook with the full setup and evaluation code here.
We highlight the relevant code snippets for setting up ChatBees below. You can explore the full feature set of ChatBees here.

1. Get or create the test collection

import chatbees as cb
import shortuuid

def get_shared_collection():
    """
    Use the shared collection that already has all essays.
    """
    cb.init(account_id="QZCLEHTW")
    return cb.Collection(name="tonic_rag_test")

def create_collection():
    """
    Creates a collection and uploads all essays
    """
    cb.init(api_key="my_api_key", account_id="my_account_id")

    col = cb.create_collection(cb.Collection(name='tonic_test' + shortuuid.uuid()))
    col.upload_document('./all_essays_in_single_file.txt')
    return col

collection = get_shared_collection()
# If you prefer to create a new collection to try, call create_collection()
# collection = create_collection()
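
The retrieval loop in step 2 below iterates over benchmark.items. In the Tonic notebook, the benchmark is built from the needle-in-a-haystack questions and their reference answers; as a rough sketch with placeholder data, constructing one with Tonic Validate looks like this:

from tonic_validate import Benchmark

# Placeholder data; the Tonic notebook supplies the real questions and reference answers.
questions = ["<question 1>", "<question 2>"]
answers = ["<reference answer 1>", "<reference answer 2>"]

benchmark = Benchmark(questions=questions, answers=answers)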

2. Run retrieval challenges

from tonic_validate import (
    ValidateScorer,
    Benchmark,
    BenchmarkItem,
    LLMResponse,
    Run,
)
from tqdm import tqdm

def get_cb_rag_response(benchmarkItem: BenchmarkItem, collection):
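    # Ask ChatBees the benchmark question; ask() returns the answer plus additional data we don't need here.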
    prompt = benchmarkItem.question
    response, _ = collection.ask(prompt)
    return response

raw_cb_responses = []
for x in tqdm(benchmark.items):
    raw_cb_responses.append(get_cb_rag_response(x, collection))

3. Evaluate ChatBees responses

cb_responses = [
    LLMResponse(
        llm_answer=r, llm_context_list=[], benchmark_item=bi
    ) for r, bi in zip(raw_cb_responses, benchmark.items)
]

from tonic_validate.metrics import AnswerSimilarityMetric

scorer = ValidateScorer([AnswerSimilarityMetric()])
cb_run = scorer.score_run(cb_responses, parallelism=5)
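
After scoring, you can inspect the results. Assuming the Run object exposes overall_scores and per-item run_data, as in the Tonic Validate documentation, a quick summary looks like this:

# Average score per metric across all benchmark items.
print(cb_run.overall_scores)

# Per-question breakdown, useful for spotting individual misses.
for item in cb_run.run_data:
    print(item.reference_question, item.scores)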
