Peter Zhang
Aug 27, 2024 07:48

Together.ai introduces a serverless Rerank API and exclusive access to Salesforce’s LlamaRank model, enhancing enterprise search and Retrieval Augmented Generation (RAG) systems.





Together.ai has launched the Together Rerank API, a new serverless endpoint for enterprise search and Retrieval Augmented Generation (RAG) systems. The release also includes exclusive access to LlamaRank, a reranker model developed by Salesforce AI Research, according to the Together AI blog.

Revolutionizing Enterprise Search

The newly introduced Together Rerank API is a serverless endpoint designed to integrate seamlessly with enterprise applications. This API simplifies the process for developers, enabling the incorporation of supported reranker models with minimal code. Key features of the API include:

  • Flagship support for Salesforce’s LlamaRank model
  • Support for JSON and tabular data
  • Long 8K context per document
  • Low latency for fast search queries
  • Compatibility with Cohere’s Rerank API

Exclusive Access to LlamaRank

According to the announcement, LlamaRank, developed by Salesforce AI Research, outperforms other leading rerank models such as Cohere Rerank v3 and Mistral-7B. Better document ranking improves the accuracy and efficiency of information retrieval in both RAG and traditional search systems. LlamaRank supports documents up to 8,000 tokens in length and is particularly effective for semi-structured data such as JSON, email, tables, and code.
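
A reranker scores text, so semi-structured records usually have to be turned into strings before they are sent for ranking. The sketch below shows one possible way to do that in Python. The helper names `record_to_document` and `row_to_document` and the sample data are hypothetical, and the announcement does not prescribe any particular serialization.

```python
import json

def record_to_document(record: dict) -> str:
    """Serialize a JSON-like record to a stable, human-readable string."""
    # sort_keys keeps the field order deterministic across records
    return json.dumps(record, sort_keys=True, ensure_ascii=False)

def row_to_document(header: list[str], row: list[str]) -> str:
    """Render one table row as 'column: value' pairs so column names travel with the values."""
    return " | ".join(f"{name}: {value}" for name, value in zip(header, row))

# Hypothetical sample data
records = [
    {"ticket": 101, "subject": "Cannot log in", "status": "open"},
    {"ticket": 102, "subject": "Invoice missing", "status": "closed"},
]
header = ["name", "role"]
rows = [["Ada", "admin"], ["Lin", "viewer"]]

json_documents = [record_to_document(r) for r in records]
table_documents = [row_to_document(header, r) for r in rows]
```

Keeping the field or column name next to each value gives the model the same context a human reader would get from the original record.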

What is a Reranker Model?

A reranker is a specialized model that improves search relevancy by reassessing and reordering a set of documents based on their relevance to a given query. For example, in a technical support scenario, when a user asks how to reset a password, the reranker moves the documents that describe the reset procedure above those that only mention passwords in passing.
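
The reordering step can be illustrated with a toy example. The sketch below is not LlamaRank or any real model: it uses a simple term-overlap score as a stand-in for a learned relevance score, and the function names and sample documents are hypothetical. It only shows the shape of the operation, which takes a query and a list of documents and returns the documents reordered by score.

```python
import re

def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of alphanumeric terms."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def overlap_score(query: str, document: str) -> float:
    """Fraction of query terms found in the document (stand-in for a model score)."""
    query_terms = tokenize(query)
    if not query_terms:
        return 0.0
    return len(query_terms & tokenize(document)) / len(query_terms)

def rerank(query: str, documents: list[str]) -> list[tuple[int, float]]:
    """Return (original_index, score) pairs, highest score first."""
    scored = [(index, overlap_score(query, doc)) for index, doc in enumerate(documents)]
    # Sort by descending score; ties keep the original retrieval order
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored

query = "reset account password"
documents = [
    "Billing FAQ: updating the credit card on your account",
    "To reset your password, open Settings and choose Reset password",
    "Password requirements: minimum length and allowed characters",
]
ranking = rerank(query, documents)
best_document = documents[ranking[0][0]]
```

Here the document that describes the reset procedure moves to the top. A real reranker replaces `overlap_score` with a model that reads the query and the document together, which is what lets it rank by meaning rather than by shared words.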

How Reranking Improves Search and RAG

Reranking is a critical component in modern search and RAG systems, acting as a quality filter that reassesses initially retrieved documents. This step enhances the quality of information fed into language models, reducing the likelihood of inaccurate or irrelevant results. Rerankers are particularly valuable in enterprise settings, where large volumes of data in various formats require precise and accurate retrieval for decision-making.
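
As a rough sketch of the quality-filter role described above, the following Python snippet keeps only the highest-scoring candidates and assembles them into a prompt for a language model. The relevance scores are made up for illustration, and `select_context`, `build_prompt`, the threshold, and the prompt wording are all hypothetical choices rather than part of any Together.ai or Salesforce API.

```python
from dataclasses import dataclass

@dataclass
class ScoredDoc:
    text: str
    score: float  # relevance score from a reranker; higher means more relevant

def select_context(candidates: list[ScoredDoc], top_n: int = 3,
                   min_score: float = 0.5) -> list[ScoredDoc]:
    """Drop low-scoring candidates, then keep the top_n best of the rest."""
    kept = [c for c in candidates if c.score >= min_score]
    kept.sort(key=lambda c: c.score, reverse=True)
    return kept[:top_n]

def build_prompt(question: str, context_docs: list[ScoredDoc]) -> str:
    """Number the selected documents and place them ahead of the question."""
    context = "\n\n".join(
        f"[{n}] {doc.text}" for n, doc in enumerate(context_docs, start=1)
    )
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )

# Candidates as a first-stage retriever might return them, with illustrative scores
candidates = [
    ScoredDoc("Quarterly sales summary for the EMEA region", 0.08),
    ScoredDoc("Password reset steps for the customer portal", 0.93),
    ScoredDoc("Password policy: rotation and complexity rules", 0.61),
    ScoredDoc("Office holiday schedule", 0.02),
]
selected = select_context(candidates, top_n=2, min_score=0.5)
prompt = build_prompt("How do I reset my portal password?", selected)
```

Filtering before prompt construction means the language model never sees the unrelated documents, which is how reranking reduces the chance of irrelevant material leaking into the answer.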

Salesforce LlamaRank: A More Accurate Enterprise Reranker Model

Salesforce’s LlamaRank model is a fine-tuned version of Llama3-8B-Instruct, trained using both synthesized data and human-labeled data from Salesforce’s in-house data analysts. The model excels in ranking both general documents and code, making it highly useful for various enterprise applications. Salesforce evaluated LlamaRank on public datasets such as SQuAD, TriviaQA, Neural Code Search, and TrailheadQA, where it demonstrated superior performance.

Together Rerank API

The Together Rerank API is designed to provide a seamless developer experience for building RAG applications. It allows developers to integrate supported reranker models into their enterprise applications easily. The API takes in a query and a set of documents, returning a relevancy score and ordering index for each document. It can also filter responses to show only the most relevant documents.

How to Get Started

To get started, developers can create an API key with Together AI and follow the steps in the quickstart documentation to try Salesforce’s LlamaRank model. For production-scale deployment, enterprises are encouraged to contact Together.ai’s sales team.
