Full Text Search in Minutes with MongoDB Atlas

Ali Mohamad
Major League Hacking
Jun 4, 2020

Foreword

Every once in a while, I like to learn a new technical skill, like a new Java library, or a new programming language like Golang. As helpful as reading docs or open source tutorials can be, I’ve often found the best way for me to learn new stuff is from books. They’re curated, establish what fundamentals or prerequisite knowledge you should have before delving in (oftentimes explaining it to you as well!), and then consistently build on the established fundamentals. Of course the next question is: which book do I read?

There’s a lot of data about what books are well-received and what they cover, but I can’t possibly comb through all of that by myself. Luckily as a hacker, I can build something that does it for me! Enter this project, which can recommend books for you based on keywords with MongoDB Full-Text Search and MongoShell!

What is MongoDB Atlas Full-Text Search?

Let’s start from the top.

MongoDB is a leading NoSQL database. What this basically means is that MongoDB is a tool for you to organize and store your data in a super convenient document format, which many people find more intuitive than the relational, or table-based, model that SQL databases use. Here’s an example of a collection of MongoDB documents, which we’ll say represent some Major League Hacking Coaches:

{
  "_id": 0,
  "name": "Ali",
  "home_hackathon": "HackRU",
  "graduated": false,
  "favorite_movie": "the social network"
}
{
  "_id": 1,
  "name": "Anuhya",
  "home_hackathon": "HackUMBC",
  "graduated": false
}
{
  "_id": 2,
  "name": "Peter",
  "home_hackathon": "VTHacks",
  "graduated": false
}
{
  "_id": 3,
  "name": "Kat",
  "home_hackathon": "HackPSU",
  "graduated": true
}

As you can see, these are pretty basic objects that share the same properties, but have different values for each document (some documents can even have different properties, like how Ali has a favorite movie!) — this makes them super flexible for any use — if you wanna make a database for any kind of app, you just need to come up with an object (or collection of variables) that keeps track of everything you wanna know about your users!
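To make that flexibility concrete, here is a minimal sketch in plain JavaScript (the language the mongo shell itself uses) that treats the four coach documents as an array and picks out the ones carrying the optional favorite_movie field. The variable names and the "coaches" collection mentioned in the last comment are assumptions for illustration, not part of the dataset used later.

```javascript
// The four coach documents from above, as plain JavaScript objects.
const coaches = [
  { _id: 0, name: "Ali", home_hackathon: "HackRU", graduated: false, favorite_movie: "the social network" },
  { _id: 1, name: "Anuhya", home_hackathon: "HackUMBC", graduated: false },
  { _id: 2, name: "Peter", home_hackathon: "VTHacks", graduated: false },
  { _id: 3, name: "Kat", home_hackathon: "HackPSU", graduated: true },
];

// Documents in one collection do not need identical fields, so check
// for the optional property instead of assuming it is there.
const withFavoriteMovie = coaches.filter((doc) => "favorite_movie" in doc);
const graduatedNames = coaches.filter((doc) => doc.graduated).map((doc) => doc.name);

console.log(withFavoriteMovie.map((doc) => doc.name)); // [ 'Ali' ]
console.log(graduatedNames); // [ 'Kat' ]

// Rough mongo shell equivalent, assuming a hypothetical "coaches" collection:
//   db.coaches.find({ favorite_movie: { $exists: true } })
```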

MongoDB Atlas is MongoDB’s Database-as-a-Service offering: your database is hosted in the cloud, and you can access your data remotely instead of running a database server yourself.

MongoDB Atlas Full-Text Search is a service on top of Atlas. It allows you to find which documents in your collection contain a specific keyword or set of keywords based on a search!
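As a rough mental model only, keyword search boils down to splitting text into words and checking which documents contain the words you asked for. The sketch below does that naively in plain JavaScript. Atlas Search itself is built on Apache Lucene and does far more (indexing, relevance scoring, language analysis), so treat this as an illustration of the idea rather than how Atlas works. The sample documents and the tokenize and naiveSearch functions are made up for this example.

```javascript
// Two made-up documents standing in for a collection.
const sampleDocs = [
  { _id: 1, description: "A practical guide to mobile app development" },
  { _id: 2, description: "Data science from scratch" },
];

// Lowercase the text and split it into words.
function tokenize(text) {
  return text.toLowerCase().split(/\W+/).filter(Boolean);
}

// Return documents whose field contains at least one query word,
// ordered by how many distinct query words they matched.
function naiveSearch(docs, path, query) {
  const wanted = new Set(tokenize(query));
  return docs
    .map((doc) => {
      const words = new Set(tokenize(String(doc[path] || "")));
      const hits = [...wanted].filter((w) => words.has(w)).length;
      return { doc, hits };
    })
    .filter((entry) => entry.hits > 0)
    .sort((a, b) => b.hits - a.hits)
    .map((entry) => entry.doc);
}

console.log(naiveSearch(sampleDocs, "description", "data science").map((d) => d._id));
```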

Table of Contents

This tutorial has two main parts. By the end, you’ll have a working full-text search that you can query from MongoShell to search through our premade book dataset.

  1. Setting Up MongoDB Atlas

  • Creating an Account
  • Creating Your Cluster
  • Creating Your Collection
  • Importing Your Data

  2. Building Your API with MongoShell

  • Installing MongoShell
  • Connecting to your Atlas Cluster
  • Creating a Search Index
  • Running Your Query

Setting Up MongoDB Atlas

Creating an Account

This part is super simple. If you go to atlas.mongodb.com, and click the green “Try Free” in the top, right-hand corner, it’ll take you to a registration page. You can set up your Atlas account there. After you hit the “Get started free” button to register your account, you’ll be able to set up your cluster.

Creating Your Cluster

Here, you’re going to be setting up a cluster for your database — like we said before, this just means that your database will be hosted in the cloud (basically just a server that isn’t yours) and you’ll be able to access it. You should see this page:

Under Starter Clusters, click the “Create a cluster” button. That’ll take you to this page:

For this project I chose Azure as my provider, and Virginia as my region since I’m writing this guide from the east coast, but any provider and location is fine — pick your personal preference. After hitting the “Create Cluster” button on the bottom right, you’ll see your Atlas dashboard!

Creating Your Collection

Your dashboard will look something like this:

When you see this, you’ll want to wait for your cluster to finish being created, then hit the Collections button (on the left-hand side of your screen, underneath where it says “Cluster0”). That’ll let you create a new collection, where you can load in our dataset on books. After you click it, you’ll be sent to this page, where you want to push the “Add My Own Data” button:

This allows you to name your database and collection — we’ll call the database “book-recommendations” and the collection “books”, then hit the create button.

Importing Your Data

After creating your collection, you should notice “book-recommendations” and “books” in the sidebar of the page we just completed — click on “books”. After doing that, it’ll open up an overview of our collection… which should return nothing, because it’s still empty. Let’s fix that by clicking the “Insert Document” button in the right-hand corner of our overview.

That’ll open up an “Insert to Collection” view. From here, the rest is super simple:

  1. Next to the word “View”, click the button that switches to a textbox instead of the more polished view, where you build your document’s structure and then fill in the field for each property. (Note: the polished view is a super useful feature generally, but less so for us since we’re going to import our data all at once. Remember it for other projects!)
  2. Delete everything in the textbox.
  3. Copy / paste the contents of this file into the textbox: books.json
  4. Click the “Insert” button.

And that’s it! You’ll see that our books collection is now populated, and our Atlas cluster is primed and ready for us to use it for some full-text searchin’!
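If you would rather load documents from the shell than paste them into the UI, the shell’s insertMany() method takes an array of documents. The sketch below parses a JSON string and checks that every document has a string longDescription (the field we search later) before inserting. The two sample books and the parseBooks helper are made up, and treating books.json as a single JSON array is an assumption. The insert only runs when a db object exists, meaning inside a connected mongo shell (covered in the second part of this tutorial).

```javascript
// Made-up stand-in for the contents of books.json (assumed to be a JSON array).
const rawJson = `[
  { "title": "Example Book A", "longDescription": "An introduction to web development." },
  { "title": "Example Book B", "longDescription": "Hands-on data science projects." }
]`;

// Parse the text and reject anything that would not be searchable later.
function parseBooks(text) {
  const docs = JSON.parse(text);
  if (!Array.isArray(docs)) {
    throw new Error("expected a JSON array of documents");
  }
  docs.forEach((doc, i) => {
    if (typeof doc.longDescription !== "string") {
      throw new Error("document " + i + " has no longDescription string");
    }
  });
  return docs;
}

const parsedBooks = parseBooks(rawJson);
console.log("parsed " + parsedBooks.length + " documents");

// Inside a connected mongo shell, `db` is defined and the insert runs;
// in plain Node.js this block is skipped.
if (typeof db !== "undefined") {
  db.getSiblingDB("book-recommendations").getCollection("books").insertMany(parsedBooks);
}
```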

Building Your API with MongoShell

Installing MongoShell

To run your queries, you need the mongo shell, which comes with a MongoDB install on your machine! The official MongoDB installation guides walk you through the install process based on your OS.

Connecting to your Atlas Cluster

We want to be able to connect to our Atlas cluster — this means that whenever we run some database commands, the data the commands run on is the data we set up and imported into Atlas. This gives us our endpoint for the dataset we created.

If you go to the front page of your Atlas dashboard, you should see the cluster we created. Click the Connect button all the way to the left:

After that, click the “Add Your Current IP Address” button, come up with a username and password for your access to the database, and click the “Create MongoDB User” button. Whenever we connect to our database, this username and password make sure that whoever is connecting has permission to do so. You’ll see this later when we actually connect.

Next, you want to choose a connection method — pick “Connect with the mongo shell”.

From there, just click “I have the mongo shell installed”, copy the command given to you, and put it somewhere safe for now (we’re going to run it in the terminal later!)
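The command Atlas hands you wraps a connection string, and it helps to know what its parts mean. Here is a small sketch that pulls one apart with the standard URL class. It is meant to run under Node.js rather than in the mongo shell. The hostname is the one from this tutorial’s cluster (yours will differ), and describeConnection is a hypothetical helper name.

```javascript
// Split a MongoDB connection string into its main parts.
function describeConnection(uri) {
  const parsed = new URL(uri);
  return {
    // "mongodb+srv" means the servers are discovered through DNS SRV records.
    scheme: parsed.protocol.replace(/:$/, ""),
    // The address of your cluster.
    host: parsed.hostname,
    // The database the shell starts in (switch later with `use`).
    database: parsed.pathname.replace(/^\//, ""),
  };
}

const info = describeConnection(
  "mongodb+srv://mlh-full-text-search-9u5aw.azure.mongodb.net/test"
);
console.log(info);
```

Notice that the username is passed as a separate flag on the command line and the password is typed at a prompt, so neither has to be stored in the string itself.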

Creating a Search Index

The next step in the process is creating a Search Index. The index lets MongoDB look up matching documents quickly instead of scanning every document in the collection. Go to the Collections tab, open the Search (Beta) tab for our “books” collection, and click the Create Search Index button:

From here, it’ll show you a Default index — that’s good enough for what we need! Just click the Create Index button and then we’re all good.
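For reference, the Default index corresponds to a definition with dynamic mappings, meaning every supported field gets indexed. A narrower alternative, sketched below as plain JavaScript objects, would index only longDescription. Treat both as illustrative: the tutorial only needs the default, the variable names are mine, and whether your Atlas UI lets you edit the definition as JSON may depend on its version.

```javascript
// What the default index amounts to: index every field dynamically.
const defaultIndex = {
  mappings: { dynamic: true },
};

// A narrower sketch: only index the one field this tutorial searches.
const descriptionOnlyIndex = {
  mappings: {
    dynamic: false,
    fields: {
      longDescription: { type: "string" },
    },
  },
};

// Print both definitions as JSON for comparison.
console.log(JSON.stringify(defaultIndex, null, 2));
console.log(JSON.stringify(descriptionOnlyIndex, null, 2));
```

The trade-off is the usual one: a dynamic index needs no upkeep when fields change, while a static one stays smaller because it skips fields you never search.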

Running Your Query

Ok! From here we just need to open up our MongoShell and give it a structured query. In your terminal, run the command that we got from setting up our connection screen:

mongo "mongodb+srv://mlh-full-text-search-9u5aw.azure.mongodb.net/test" --username <your username>

After you run this command and input your password, congrats! You’re in. You can now run queries on your collection. Let’s do a really simple one real quick:

use book-recommendations
db.getCollection("books").find()

This returns some of the documents we added in Atlas to our books collection, but in our own terminal!
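A bare find() dumps whole documents, which gets noisy fast. Here is a hedged sketch of a tidier version that passes a filter and a projection to find() and caps the output with limit(). It assumes the database and collection names chosen when creating the collection (book-recommendations and books), and title is an assumed field name. The query only runs when db exists, so outside a connected mongo shell the snippet just prints the objects it would use.

```javascript
// A filter, a projection, and a limit, kept as plain objects so they are easy to tweak.
const filter = { longDescription: { $exists: true } }; // only books that have a description
const projection = { title: 1, _id: 0 };               // "title" is an assumed field name
const maxResults = 3;

console.log(JSON.stringify({ filter, projection, maxResults }));

// Runs only inside a connected mongo shell, where `db` is defined.
if (typeof db !== "undefined") {
  const booksCollection = db.getSiblingDB("book-recommendations").getCollection("books");
  booksCollection.find(filter, projection).limit(maxResults).forEach((doc) => printjson(doc));
}
```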

Now let’s run some searches. Here is all the code you need to run a search query on this collection:

db.getCollection("books").aggregate([
  {
    $searchBeta: {
      "search": {
        "path": "longDescription",
        "query": "<whatever you want!>"
      }
    }
  }
])

Let’s break this down:

  • The aggregate() (see docs here) function runs a pipeline: an ordered list of stages, where each stage processes the documents handed to it by the previous one. It’s different from find(), which only filters documents with a query and can’t run a search stage like this one.
  • $searchBeta (see docs here) is the pipeline stage that leverages full-text search. The search object we pass to $searchBeta holds the parameters of our search. (Later Atlas releases rename the stage to $search and the search operator to text.)
  • The path property tells the search which field to look in, and the query property tells it what to look for.
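Because aggregate() takes an array of stages, you can append more stages after the search. The sketch below wraps the query in a hypothetical buildSearchPipeline helper and adds $limit and $project stages to trim the output. The title field is an assumed field name, and the database and collection names are the ones chosen when creating the collection. As with any snippet here that touches db, the query itself only runs inside a connected mongo shell.

```javascript
// Build a search pipeline: search first, then trim the results.
function buildSearchPipeline(query, path, maxResults) {
  return [
    // The search stage has to come first in the pipeline.
    { $searchBeta: { search: { path: path, query: query } } },
    // Keep only the first few matches.
    { $limit: maxResults },
    // Return just the fields we care about ("title" is an assumed field).
    { $project: { _id: 0, title: 1, longDescription: 1 } },
  ];
}

const pipeline = buildSearchPipeline("web development", "longDescription", 5);
console.log(JSON.stringify(pipeline, null, 2));

// Runs only inside a connected mongo shell, where `db` is defined.
if (typeof db !== "undefined") {
  db.getSiblingDB("book-recommendations")
    .getCollection("books")
    .aggregate(pipeline)
    .forEach((doc) => printjson(doc));
}
```

Keeping the pipeline in a function means a new search is a one-line change to the arguments rather than another copy of the whole query.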

Some examples:

db.getCollection("books").aggregate([{$searchBeta: {search: {path: "longDescription", query: "mobile app development"}}}]).pretty()

db.getCollection("books").aggregate([{$searchBeta: {search: {path: "longDescription", query: "web development"}}}]).pretty()

db.getCollection("books").aggregate([{$searchBeta: {search: {path: "longDescription", query: "data science"}}}]).pretty()

That’s pretty much it! If you pass literally any programming-related phrase to query as a string, you’ll get a body of relevant books.

Post Script

You’re done! Congrats, you should be immensely proud of yourself. This project is a straightforward application of Full-Text Search, and here’s some other really cool stuff you can use it for:

  • Sentiment analysis on huge bodies of content (like Tweets, YouTube comments, or Reddit threads)
  • Searching through books for key phrases at a ridiculous speed
  • Basically anything that would require you to find something specific in a wall of text lightning-fast!

Good luck and happy hacking!

an advocate for intersectionality in tech activism, and an enthusiast of tech intersecting with the arts.