MongoDB Atlas
Only available on Node.js.
LangChain.js supports MongoDB Atlas as a vector store, with both standard similarity search and maximal marginal relevance (MMR) search. MMR first fetches the documents most similar to the input, then reranks them to optimize for diversity.
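To make the reranking step concrete, here is a minimal sketch of greedy MMR over raw vectors. It is an illustration only, not the LangChain.js implementation: the names `cosine` and `mmrSelect`, the use of cosine similarity, and the exact scoring formula are assumptions made for this example.

```typescript
// Cosine similarity between two equal-length vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Greedy MMR (hypothetical helper): returns indices into `candidates`
// in the order they were selected.
// lambda = 1 ranks purely by similarity to the query;
// lambda closer to 0 penalizes candidates that resemble earlier picks.
function mmrSelect(
  query: number[],
  candidates: number[][],
  k: number,
  lambda = 0.5
): number[] {
  const selected: number[] = [];
  const remaining = candidates.map((_, i) => i);

  while (selected.length < k && remaining.length > 0) {
    let bestPos = 0;
    let bestScore = -Infinity;

    remaining.forEach((idx, pos) => {
      const relevance = cosine(query, candidates[idx]);
      // Redundancy: highest similarity to anything already selected.
      const redundancy =
        selected.length === 0
          ? 0
          : Math.max(...selected.map((s) => cosine(candidates[idx], candidates[s])));
      const score = lambda * relevance - (1 - lambda) * redundancy;
      if (score > bestScore) {
        bestScore = score;
        bestPos = pos;
      }
    });

    selected.push(remaining.splice(bestPos, 1)[0]);
  }
  return selected;
}
```

In this sketch `candidates` plays the role of the initial fetch and `k` is the number of results finally returned, so a larger initial fetch gives the reranking step more room to find diverse results.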
Setup
Installation
First, add the Node MongoDB SDK to your project:
- npm: npm install -S mongodb
- Yarn: yarn add mongodb
- pnpm: pnpm add mongodb
Initial Cluster Configuration
Next, you'll need to create a MongoDB Atlas cluster. Navigate to the MongoDB Atlas website and create an account if you don't already have one.
Create and name a cluster when prompted, then find it under Database. Select Collections and create either a blank collection or one from the provided sample data.
Note: The cluster created must be MongoDB 7.0 or higher. If you are using a pre-7.0 version of MongoDB, you must use langchainjs version 0.0.163 or earlier.
Creating an Index
After configuring your cluster, you'll need to create an index on the collection field you want to search over.
Go to the Search tab within your cluster, then select Create Search Index. Using the JSON editor option, add an index to the collection you wish to use.
In the definition below, "embedding" is the default field name; it should match the name of the field within your collection that contains embeddings.
{
  "mappings": {
    "fields": {
      "embedding": [
        {
          "dimensions": 1024,
          "similarity": "euclidean",
          "type": "knnVector"
        }
      ]
    }
  }
}
The dimensions property should match the dimensionality of the embeddings you are using. For example, Cohere embeddings have 1024 dimensions, and OpenAI embeddings have 1536.
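A mismatch between the index's dimensions value and the length of your vectors is easy to miss until queries return nothing useful. As a minimal sketch, a guard like the following could be run on vectors before they are written. The helper name `checkDimensions` and the constant `INDEX_DIMENSIONS` are hypothetical and not part of LangChain.js or the MongoDB SDK.

```typescript
// Assumed to mirror the "dimensions" value in your index definition.
const INDEX_DIMENSIONS = 1024;

// Hypothetical guard: throws if any vector's length differs from `expected`.
function checkDimensions(vectors: number[][], expected: number): void {
  vectors.forEach((vector, i) => {
    if (vector.length !== expected) {
      throw new Error(
        `Embedding ${i} has ${vector.length} dimensions, but the index expects ${expected}`
      );
    }
  });
}
```

For example, you might call `checkDimensions(vectors, INDEX_DIMENSIONS)` on the output of your embeddings model once during setup, rather than on every request.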
Note: By default the vector store expects an index name of default, an indexed collection field name of embedding, and a raw text field name of text. You should initialize the vector store with field names matching your collection schema as shown below.
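The way these three defaults combine with your own settings can be sketched as a small merge function. This is an illustration of the behavior described in the note, not library code: `FieldNames`, `DEFAULT_FIELD_NAMES`, and `resolveFieldNames` are hypothetical names.

```typescript
interface FieldNames {
  indexName: string;
  textKey: string;
  embeddingKey: string;
}

// Defaults as stated in the note above.
const DEFAULT_FIELD_NAMES: FieldNames = {
  indexName: "default",
  textKey: "text",
  embeddingKey: "embedding",
};

// Hypothetical helper: any name you do not override falls back to its default.
function resolveFieldNames(overrides: Partial<FieldNames> = {}): FieldNames {
  return {
    indexName: overrides.indexName ?? DEFAULT_FIELD_NAMES.indexName,
    textKey: overrides.textKey ?? DEFAULT_FIELD_NAMES.textKey,
    embeddingKey: overrides.embeddingKey ?? DEFAULT_FIELD_NAMES.embeddingKey,
  };
}
```

If your index maps a field called, say, `vector` instead of `embedding`, only `embeddingKey` needs to be supplied; the other two names stay at their defaults.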
Finally, proceed to build the index.
Usage
Ingestion
import { MongoDBAtlasVectorSearch } from "langchain/vectorstores/mongodb_atlas";
import { CohereEmbeddings } from "langchain/embeddings/cohere";
import { MongoClient } from "mongodb";

const client = new MongoClient(process.env.MONGODB_ATLAS_URI || "");
const namespace = "langchain.test";
const [dbName, collectionName] = namespace.split(".");
const collection = client.db(dbName).collection(collectionName);

await MongoDBAtlasVectorSearch.fromTexts(
  ["Hello world", "Bye bye", "What's this?"],
  [{ id: 2 }, { id: 1 }, { id: 3 }],
  new CohereEmbeddings(),
  {
    collection,
    indexName: "default", // The name of the Atlas search index. Defaults to "default"
    textKey: "text", // The name of the collection field containing the raw content. Defaults to "text"
    embeddingKey: "embedding", // The name of the collection field containing the embedded text. Defaults to "embedding"
  }
);

await client.close();
API Reference:
- MongoDBAtlasVectorSearch from langchain/vectorstores/mongodb_atlas
- CohereEmbeddings from langchain/embeddings/cohere
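It can help to picture what one ingested record might look like, since the search index has to point at the field that holds the vector. The sketch below assumes each text is stored as a single document with the raw text, the vector, and the metadata side by side. That shape is an assumption made for illustration and is not confirmed by this page; `toAtlasDocument` is a hypothetical helper, not a LangChain.js function.

```typescript
type Metadata = Record<string, unknown>;

// Hypothetical helper: builds the record for one text under the assumed shape.
// Metadata is spread first so it cannot overwrite the text or vector fields
// (a choice made for this sketch).
function toAtlasDocument(
  text: string,
  embedding: number[],
  metadata: Metadata,
  textKey = "text",
  embeddingKey = "embedding"
): Record<string, unknown> {
  return {
    ...metadata,
    [textKey]: text,
    [embeddingKey]: embedding,
  };
}

// Illustrative values only; a real vector has as many entries as the index's dimensions.
const example = toAtlasDocument("Hello world", [0.12, -0.03, 0.57], { id: 2 });
console.log(Object.keys(example));
```

Under this assumption, the field named by `embeddingKey` is the one the index definition must map as a vector field.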
Search
import { MongoDBAtlasVectorSearch } from "langchain/vectorstores/mongodb_atlas";
import { CohereEmbeddings } from "langchain/embeddings/cohere";
import { MongoClient } from "mongodb";

const client = new MongoClient(process.env.MONGODB_ATLAS_URI || "");
const namespace = "langchain.test";
const [dbName, collectionName] = namespace.split(".");
const collection = client.db(dbName).collection(collectionName);

const vectorStore = new MongoDBAtlasVectorSearch(new CohereEmbeddings(), {
  collection,
  indexName: "default", // The name of the Atlas search index. Defaults to "default"
  textKey: "text", // The name of the collection field containing the raw content. Defaults to "text"
  embeddingKey: "embedding", // The name of the collection field containing the embedded text. Defaults to "embedding"
});

const resultOne = await vectorStore.similaritySearch("Hello world", 1);
console.log(resultOne);

await client.close();
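The examples on this page derive the database and collection names with `namespace.split(".")`, which works for `langchain.test`. MongoDB database names cannot contain a dot, but collection names can, so a namespace such as `langchain.docs.v2` would lose part of its collection name under that approach. A minimal sketch that splits on the first dot only follows; `parseNamespace` is a hypothetical helper, not part of the MongoDB SDK.

```typescript
// Hypothetical helper: splits "db.collection" on the first dot only,
// so dots inside the collection name are preserved.
function parseNamespace(namespace: string): { dbName: string; collectionName: string } {
  const dot = namespace.indexOf(".");
  if (dot <= 0 || dot === namespace.length - 1) {
    throw new Error(`Expected a namespace of the form "db.collection", got "${namespace}"`);
  }
  return {
    dbName: namespace.slice(0, dot),
    collectionName: namespace.slice(dot + 1),
  };
}
```

The two returned names can then be passed to `client.db(dbName).collection(collectionName)` exactly as in the examples.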
Maximal marginal relevance
import { MongoDBAtlasVectorSearch } from "langchain/vectorstores/mongodb_atlas";
import { CohereEmbeddings } from "langchain/embeddings/cohere";
import { MongoClient } from "mongodb";

const client = new MongoClient(process.env.MONGODB_ATLAS_URI || "");
const namespace = "langchain.test";
const [dbName, collectionName] = namespace.split(".");
const collection = client.db(dbName).collection(collectionName);

const vectorStore = new MongoDBAtlasVectorSearch(new CohereEmbeddings(), {
  collection,
  indexName: "default", // The name of the Atlas search index. Defaults to "default"
  textKey: "text", // The name of the collection field containing the raw content. Defaults to "text"
  embeddingKey: "embedding", // The name of the collection field containing the embedded text. Defaults to "embedding"
});

const resultOne = await vectorStore.maxMarginalRelevanceSearch("Hello world", {
  k: 4, // The number of documents to return after reranking
  fetchK: 20, // The number of documents to return on initial fetch
});
console.log(resultOne);

// Using MMR in a vector store retriever
const retriever = vectorStore.asRetriever({
  searchType: "mmr",
  searchKwargs: {
    fetchK: 20,
    lambda: 0.1, // Between 0 and 1: 0 gives maximum diversity, 1 gives minimum diversity
  },
});

const retrieverOutput = await retriever.getRelevantDocuments("Hello world");
console.log(retrieverOutput);

await client.close();