Atlas Search Local and Testcontainers

Running Search on a Local Machine

Thanks to Andrea Basile for the initial ideas for this research.

Testcontainers is a very popular open-source library for quickly spinning up a database, a message broker, or any other external dependency on a local machine, entirely through code.

Under the hood, this library uses Docker images to run containers.
It has implementations for the most popular programming languages, and additional packages simplify the integration with many specific dependencies. If a dedicated integration is missing, the library can still run any Docker image.

This article uses C# as the reference language for the sample code, but the API structure is shared among the different languages.

As the title states, the use case is the integration with Atlas Search (both text search and vector search).
The Docker image for this purpose is the one published by MongoDB for running Atlas locally:

mongodb/mongodb-atlas-local

The code used can be found at:

https://github.com/menalb/testcontainers-samples/tree/main/atlas-console

Atlas Search is part of the cloud ecosystem built around the MongoDB database.
In particular, among the various services that Atlas provides, there are:

  • full-text search

  • vector search

Both rely on a dedicated index, defined on a MongoDB collection, to provide the additional functionality.

Setup

First, install the package needed to use Testcontainers with MongoDB, the database that Atlas is built on.

dotnet add package Testcontainers.MongoDb --version 4.1.0-beta.12035377103

Once the package is installed, initialize the container in C#:

using Testcontainers.MongoDb;

var username = "mongo_username";
var password = "mongo_password";
MongoDbContainer container = new MongoDbBuilder()
    .WithImage("mongodb/mongodb-atlas-local")
    .WithUsername(username)
    .WithPassword(password)
    .WithEnvironment("MONGODB_INITDB_ROOT_USERNAME", username)
    .WithEnvironment("MONGODB_INITDB_ROOT_PASSWORD", password)
    .Build();

await container.StartAsync();
...
await container.StopAsync();

This code is quite self-explanatory: it tells MongoDbBuilder (the Testcontainers implementation for MongoDB) to download the Atlas local Docker image and initialize it with a specific admin user for the database.
Then it starts the container and, at the end, stops it.

Note that the code sets the credentials twice: through the dedicated fluent API and through environment variables. This is because atlas-local requires these specific variables to set up the user properly.

By running this code, right after the container is started, on Docker Desktop, we’ll see that there are two new running containers:

Running containers

One is testcontainers/ryuk, the helper container that Testcontainers uses to manage the other containers. It takes care of removing them once they are disposed of and, as a last step, removes itself.
The other one is the atlas-local container.

Because this code uses the MongoDB package, the API provides methods to get some basic information, like the database connection string:

var connectionString = container.GetConnectionString();

With this, it is then possible to instantiate MongoClient to connect to the database.
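As a minimal sketch of that step, assuming the MongoDB.Driver NuGet package is installed, the connection string can be passed straight to MongoClient. The class name, the method name, and the collection name movies are hypothetical. The database name test is chosen because mongosh uses test as its default database when none is specified, which matters for the mongosh commands used later in this article.

```csharp
using System.Threading.Tasks;
using MongoDB.Bson;
using MongoDB.Driver;

public static class AtlasLocalConnection
{
    // Hypothetical helper: connects to the container and returns a collection handle.
    public static async Task<IMongoCollection<BsonDocument>> ConnectAsync(string connectionString)
    {
        // connectionString is the value returned by container.GetConnectionString().
        var client = new MongoClient(connectionString);

        // "test" is the default database of mongosh; "movies" is an assumed collection name.
        var database = client.GetDatabase("test");

        // A ping is a cheap round trip that confirms the server accepts commands.
        await database.RunCommandAsync<BsonDocument>(new BsonDocument("ping", 1));

        return database.GetCollection<BsonDocument>("movies");
    }
}
```

The ping is optional; it only makes a connection problem surface here rather than at the first real query.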

Full-Text Search Index

There are a couple of ways to create the index.

With an instance of MongoClient, it is enough to get a reference to the collection and create the index on it in C#:

collection.SearchIndexes.CreateOneAsync(....)
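The call above is left elided in this article. As a rough sketch of what such a call could look like, assuming a MongoDB.Driver version that exposes CreateSearchIndexModel, the example below creates a full-text index with dynamic mappings. The class name, the method name, and the choice of a dynamic mapping are assumptions made for illustration, not the definition used in the sample repository.

```csharp
using System.Threading.Tasks;
using MongoDB.Bson;
using MongoDB.Driver;

public static class SearchIndexSetup
{
    // Hypothetical helper: creates a full-text search index from code.
    public static async Task<string> CreateDynamicTextIndexAsync(
        IMongoCollection<BsonDocument> collection,
        string indexName)
    {
        // Assumed definition: { "mappings": { "dynamic": true } },
        // which lets Atlas Search index every supported field automatically.
        var definition = new BsonDocument(
            "mappings",
            new BsonDocument("dynamic", true));

        var model = new CreateSearchIndexModel(indexName, definition);

        // Returns the name of the index; the index itself is built asynchronously.
        return await collection.SearchIndexes.CreateOneAsync(model);
    }
}
```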

However, it is common to have the structure of the index in a separate JSON file and have it created by a deployment pipeline or by a person who administers the database.

Suppose that the structure of the index is in a file named indexes\text.json.
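For illustration only, such a file could contain a definition like the one below. The field names title and plot are hypothetical and are not taken from the sample repository; the overall shape (a mappings object with static field mappings) follows the Atlas Search index definition format.

```json
{
  "mappings": {
    "dynamic": false,
    "fields": {
      "title": { "type": "string" },
      "plot": { "type": "string" }
    }
  }
}
```

Because JSON is also a valid JavaScript object literal, the content of the file can be embedded as-is into a mongosh script.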

To create the index, it is possible to run mongosh inside the container with the following code:

 var indexScript = await File.ReadAllTextAsync(indexFilePath);
 var name = collection.CollectionNamespace.CollectionName;
 var command = new[]
 {
     "mongosh",
     "--username", _username,
     "--password", _password,
     "--quiet",
     "--eval",
     $"db.{name}.createSearchIndex('{indexName}',{indexScript})"
 };
 var result = await Container.ExecAsync(command);

 Console.WriteLine(result.Stdout);
 Console.WriteLine(result.Stderr);

The code above:

  • Reads the JSON file with the structure of the index

  • Retrieves the name of the collection

  • Builds the mongosh script to execute (using the createSearchIndex command)

  • Runs the command inside the container and prints its output: standard output and standard error

This is quite simple.

Keep in mind that index creation is not instantaneous.
Even if the command outputs the name of the index, that does not mean the index is ready to use. Building it can take several seconds, depending on the host machine.
To monitor the status of this process:

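// List the search indexes that match the name and take the first one.
var indexes = await collection.SearchIndexes.ListAsync(indexName);
var first = await indexes.FirstOrDefaultAsync();
// first is null when no index with that name exists yet.
var status = first is null ? null : TryGetValue<string>(first, "status");
Console.WriteLine(status);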

This code gets the specific index from the available indexes and prints its status.
SearchIndexes.ListAsync returns a cursor of BsonDocument items, one per index. To retrieve the status property from a document, the code uses the TryGetValue helper method:

private static T? TryGetValue<T>(BsonDocument document, string name)
{
    if (!document.TryGetValue(name, out var value))
    {
        return default;
    }

    var result = BsonTypeMapper.MapToDotNetValue(value);
    return (T)result;
}

With this information, it is possible to code a utility method that checks if the index is READY or in any other status.
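One possible shape for such a utility is sketched below. It is deliberately independent of the MongoDB driver: the caller passes a delegate that returns the current status, for example one that wraps the ListAsync and TryGetValue calls shown above. The class name, method name, and parameters are hypothetical, and the sketch only treats READY as the success condition.

```csharp
#nullable enable
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

public static class SearchIndexWaiter
{
    // Hypothetical helper: polls getStatus until it reports "READY"
    // or the timeout elapses. Returns true if the index became ready in time.
    public static async Task<bool> WaitUntilReadyAsync(
        Func<Task<string?>> getStatus,
        TimeSpan timeout,
        TimeSpan pollInterval,
        CancellationToken cancellationToken = default)
    {
        var elapsed = Stopwatch.StartNew();
        while (true)
        {
            var status = await getStatus();
            if (status == "READY")
            {
                return true;
            }

            // Give up once the allotted time is spent.
            if (elapsed.Elapsed >= timeout)
            {
                return false;
            }

            await Task.Delay(pollInterval, cancellationToken);
        }
    }
}
```

The status is always checked at least once, even with a zero timeout. A test would typically call this right after creating the index and fail fast when it returns false, instead of running queries against an index that is still being built.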

Vector Search Index

The steps to create a vector search index and check for its status are the same as the ones used for the full-text search index.

var indexScript = await File.ReadAllTextAsync(indexFilePath);
var name = collection.CollectionNamespace.CollectionName;
var command = new[]
{
    "mongosh",
    "--username", _username,
    "--password", _password,
    "--quiet",
    "--eval",
    $"db.{name}.createSearchIndex('{indexName}','vectorSearch',{indexScript})"
};
var result = await Container.ExecAsync(command);

Console.WriteLine(result.Stdout);
Console.WriteLine(result.Stderr);

The only difference from the full-text search index is that createSearchIndex receives the additional vectorSearch argument, which specifies the index type.
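The JSON file read into indexScript also has a different shape for a vector index. As an illustrative example, with a hypothetical field path embedding and an assumed dimension of 1536 (the real value must match the embedding model that produced the vectors), the definition could look like this:

```json
{
  "fields": [
    {
      "type": "vector",
      "path": "embedding",
      "numDimensions": 1536,
      "similarity": "cosine"
    }
  ]
}
```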

Dispose

Remember to dispose of the resources used in the code.

mongoClient.Dispose();
await container.StopAsync();
await container.DisposeAsync();

Stop the specific container in use and call the DisposeAsync method to let Testcontainers (through the ryuk container) free all the resources.

A couple of seconds after the code finishes executing, the containers should disappear from the Docker Desktop UI.

Considerations

Testcontainers is a useful tool for running integration tests, both on the local machine and in CI.
It can be a handy alternative to using Docker directly for managing the app's specific dependencies.

Its huge ecosystem and support for many programming languages make it easy to adopt.
In addition, it preserves the ability to run commands inside the container, allowing specific customizations that are not built into the library.

One issue that the Testcontainers team will probably address in the future is the lack of support for alternative OCI container management tools, like Podman.
There are some workarounds that make it work almost like having Docker.
However, it is not the same experience. With Podman, the ryuk container might be unable to remove itself after disposing of the other containers. Check this page for help with non-Docker installations:

Code repository for this article:
https://github.com/menalb/testcontainers-samples/tree/main/atlas-console