MongoDB Atlas

Compatibility

Only available on Node.js.

LangChain.js supports MongoDB Atlas as a vector store, with both standard similarity search and maximal marginal relevance (MMR) search. MMR first fetches the documents most similar to the input, then reranks them to optimize for diversity.

Setup

Installation

First, add the Node MongoDB SDK to your project:

npm:

npm install -S mongodb

Yarn:

yarn add mongodb

pnpm:

pnpm add mongodb

Initial Cluster Configuration

Next, you'll need to create a MongoDB Atlas cluster. Navigate to the MongoDB Atlas website and create an account if you don't already have one.

Create and name a cluster when prompted, then find it under Database. Select Collections and create either a blank collection or one from the provided sample data.

Note: The cluster must run MongoDB 7.0 or higher. If you are using a pre-7.0 version of MongoDB, you must use langchainjs version 0.0.163 or earlier.

Creating an Index

After configuring your cluster, you'll need to create an index on the collection field you want to search over.

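Go to the Search tab within your cluster, then select Create Search Index. Using the JSON editor option, add an index to the collection you wish to use. The // comment in the definition below is explanatory only; strict JSON does not allow comments, so omit it when pasting.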

{
  "mappings": {
    "fields": {
      // Default value, should match the name of the field within your collection that contains embeddings
      "embedding": [
        {
          "dimensions": 1024,
          "similarity": "euclidean",
          "type": "knnVector"
        }
      ]
    }
  }
}

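The dimensions property should match the dimensionality of the embeddings you are using. For example, the definition above assumes 1024-dimensional Cohere embeddings, while OpenAI's text-embedding-ada-002 produces 1536 dimensions. Check the output size of the specific model you use.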
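If you switch embedding models, the index definition has to change with it. The following is a minimal sketch of generating the same definition for a given field name and embedding size. The helper name buildKnnIndexDefinition and its types are hypothetical and are not part of LangChain.js or the MongoDB driver; the similarity values listed are the ones assumed to be accepted for knnVector fields.

```typescript
// Hypothetical helper (not a LangChain.js or MongoDB driver API):
// builds an index definition shaped like the JSON shown above.
type Similarity = "euclidean" | "cosine" | "dotProduct";

interface KnnVectorField {
  type: "knnVector";
  dimensions: number;
  similarity: Similarity;
}

interface SearchIndexDefinition {
  mappings: { fields: Record<string, KnnVectorField[]> };
}

function buildKnnIndexDefinition(
  embeddingKey: string,
  dimensions: number,
  similarity: Similarity = "euclidean"
): SearchIndexDefinition {
  // Reject sizes that could never match a real embedding vector.
  if (!Number.isInteger(dimensions) || dimensions <= 0) {
    throw new Error(`dimensions must be a positive integer, got ${dimensions}`);
  }
  return {
    mappings: {
      fields: {
        // The key must match the collection field that stores the embeddings.
        [embeddingKey]: [{ dimensions, similarity, type: "knnVector" }],
      },
    },
  };
}

// Example: 1536-dimensional embeddings stored in the default "embedding" field.
const definition = buildKnnIndexDefinition("embedding", 1536);
console.log(JSON.stringify(definition, null, 2));
```

The printed JSON can be pasted into the JSON editor in place of the hand-written definition.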

Note: By default the vector store expects an index name of default, an indexed collection field name of embedding, and a raw text field name of text. You should initialize the vector store with field names matching your collection schema as shown below.

Finally, proceed to build the index.
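To make the default field names concrete, here is a rough sketch of how one stored document might look in the collection. It assumes the vector store writes the raw text under textKey, the vector under embeddingKey, and any metadata as top-level fields next to them. The helper toStoredDocument is hypothetical and only illustrates that assumed layout; inspect a document in your own collection to confirm it.

```typescript
// Field names used for the raw text and the vector; these mirror the documented defaults.
interface FieldKeys {
  textKey: string;
  embeddingKey: string;
}

const defaultKeys: FieldKeys = { textKey: "text", embeddingKey: "embedding" };

// Hypothetical helper: shows the assumed layout of a single stored document.
function toStoredDocument(
  pageContent: string,
  embedding: number[],
  metadata: Record<string, unknown> = {},
  keys: FieldKeys = defaultKeys
): Record<string, unknown> {
  return {
    ...metadata, // e.g. { id: 2 }
    [keys.textKey]: pageContent, // raw text
    [keys.embeddingKey]: embedding, // vector covered by the search index
  };
}

// A toy 3-dimensional vector stands in for a real embedding.
const stored = toStoredDocument("Hello world", [0.1, 0.2, 0.3], { id: 2 });
console.log(stored);
```

If your collection already uses other field names, pass matching textKey and embeddingKey values when constructing the vector store, and use the same embedding field name in the index definition.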

Usage

Ingestion

import { MongoDBAtlasVectorSearch } from "langchain/vectorstores/mongodb_atlas";
import { CohereEmbeddings } from "langchain/embeddings/cohere";
import { MongoClient } from "mongodb";

const client = new MongoClient(process.env.MONGODB_ATLAS_URI || "");
const namespace = "langchain.test";
const [dbName, collectionName] = namespace.split(".");
const collection = client.db(dbName).collection(collectionName);

await MongoDBAtlasVectorSearch.fromTexts(
  ["Hello world", "Bye bye", "What's this?"],
  [{ id: 2 }, { id: 1 }, { id: 3 }],
  new CohereEmbeddings(),
  {
    collection,
    indexName: "default", // The name of the Atlas search index. Defaults to "default"
    textKey: "text", // The name of the collection field containing the raw content. Defaults to "text"
    embeddingKey: "embedding", // The name of the collection field containing the embedded text. Defaults to "embedding"
  }
);

await client.close();

API Reference:

MongoDBAtlasVectorSearch from langchain/vectorstores/mongodb_atlas
CohereEmbeddings from langchain/embeddings/cohere

Search

import { MongoDBAtlasVectorSearch } from "langchain/vectorstores/mongodb_atlas";
import { CohereEmbeddings } from "langchain/embeddings/cohere";
import { MongoClient } from "mongodb";

const client = new MongoClient(process.env.MONGODB_ATLAS_URI || "");
const namespace = "langchain.test";
const [dbName, collectionName] = namespace.split(".");
const collection = client.db(dbName).collection(collectionName);

const vectorStore = new MongoDBAtlasVectorSearch(new CohereEmbeddings(), {
  collection,
  indexName: "default", // The name of the Atlas search index. Defaults to "default"
  textKey: "text", // The name of the collection field containing the raw content. Defaults to "text"
  embeddingKey: "embedding", // The name of the collection field containing the embedded text. Defaults to "embedding"
});

const resultOne = await vectorStore.similaritySearch("Hello world", 1);
console.log(resultOne);

await client.close();

API Reference:

MongoDBAtlasVectorSearch from langchain/vectorstores/mongodb_atlas
CohereEmbeddings from langchain/embeddings/cohere

Maximal marginal relevance

import { MongoDBAtlasVectorSearch } from "langchain/vectorstores/mongodb_atlas";
import { CohereEmbeddings } from "langchain/embeddings/cohere";
import { MongoClient } from "mongodb";

const client = new MongoClient(process.env.MONGODB_ATLAS_URI || "");
const namespace = "langchain.test";
const [dbName, collectionName] = namespace.split(".");
const collection = client.db(dbName).collection(collectionName);

const vectorStore = new MongoDBAtlasVectorSearch(new CohereEmbeddings(), {
  collection,
  indexName: "default", // The name of the Atlas search index. Defaults to "default"
  textKey: "text", // The name of the collection field containing the raw content. Defaults to "text"
  embeddingKey: "embedding", // The name of the collection field containing the embedded text. Defaults to "embedding"
});

const resultOne = await vectorStore.maxMarginalRelevanceSearch("Hello world", {
  k: 4,
  fetchK: 20, // The number of documents to return on initial fetch
});
console.log(resultOne);

// Using MMR in a vector store retriever

const retriever = vectorStore.asRetriever({
  searchType: "mmr",
  searchKwargs: {
    fetchK: 20,
    lambda: 0.1,
  },
});

const retrieverOutput = await retriever.getRelevantDocuments("Hello world");

console.log(retrieverOutput);

await client.close();

API Reference:

MongoDBAtlasVectorSearch from langchain/vectorstores/mongodb_atlas
CohereEmbeddings from langchain/embeddings/cohere
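To illustrate what fetchK and lambda control, the following is a minimal, self-contained sketch of the general MMR reranking idea. It is not LangChain.js's implementation: the function names are hypothetical, cosine similarity is an assumed choice of metric, and the candidate list stands in for the fetchK documents returned by the initial fetch. A lambda near 1 favors relevance to the query, and a lambda near 0 favors diversity among the selected documents.

```typescript
// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Hypothetical sketch of MMR: greedily picks up to k candidates and returns
// their indices in selection order.
function mmrSelect(
  query: number[],
  candidates: number[][],
  k: number,
  lambda: number
): number[] {
  const selected: number[] = [];
  const remaining: number[] = candidates.map((_, i) => i);
  while (selected.length < k && remaining.length > 0) {
    let bestPos = 0;
    let bestScore = -Infinity;
    for (let pos = 0; pos < remaining.length; pos++) {
      const candidate = candidates[remaining[pos]];
      const relevance = cosineSimilarity(query, candidate);
      // Redundancy: highest similarity to anything already selected.
      let redundancy = 0;
      for (const chosen of selected) {
        redundancy = Math.max(redundancy, cosineSimilarity(candidate, candidates[chosen]));
      }
      const score = lambda * relevance - (1 - lambda) * redundancy;
      if (score > bestScore) {
        bestScore = score;
        bestPos = pos;
      }
    }
    selected.push(remaining[bestPos]);
    remaining.splice(bestPos, 1);
  }
  return selected;
}

// Toy data: candidate 1 is a near-duplicate of candidate 0.
const queryVector = [1, 0];
const candidateVectors = [
  [1, 0],
  [0.99, 0.1],
  [0.6, 0.8],
];
console.log(mmrSelect(queryVector, candidateVectors, 2, 0.1)); // diversity-leaning
console.log(mmrSelect(queryVector, candidateVectors, 2, 1)); // relevance only
```

With the low lambda, the near-duplicate is skipped in favor of the more distinct third candidate; with lambda set to 1, the two most similar candidates are returned.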
