Querying Data using MongoDB

Blog | June 2, 2022 | By Kartik Nagarajan

Mastering MongoDB: A Guide to Querying Data

Querying Data in MongoDB: Techniques and Best Practices

Unleashing the Power of MongoDB: Effective Data Querying Strategies

What is MongoDB?

MongoDB is a document database designed for ease of application development and scaling. It is one of the widely used Nosql database.

What is the diff between SQL and No Sql database?

SQL is the programming language used to interface with relational databases. (Relational databases model data as records in rows and tables with logical links between them).

NoSQL is a class of DBMs that are non-relational and generally do not use SQL.

Depending on a business use case , it would be advised to use either of them.

Getting Started with MongoDB Querying: Basic CRUD Operations

Advanced Querying Techniques: Unlocking MongoDB’s Full Potential

Query Optimization: Tips for Improving MongoDB Query Performance

Real-World Examples: Applying MongoDB Querying Techniques in Practice

Introduction to MongoDB Query Language: Syntax and Basic Commands

Aggregation Pipeline: Harnessing MongoDB’s Aggregation Framework for Complex Queries

Indexing Strategies: Optimizing Query Performance with MongoDB Indexes

Case Studies: Exploring Use Cases and Solutions for MongoDB Querying Challenges

1. Steps to create and connect to MongoDB

a). Install Mongo shell: The mongo shell is an interactive JavaScript interface to MongoDB. You can use the mongo shell to query and update data as well as perform administrative operations.

Link to Mongo DB documentation: https://www.mongodb.com/docs/mongodb-shell/

Preview

The MongoDB Shell, mongosh, is a fully functional JavaScript and Node.js 14.x REPL environment for interacting with MongoDB deployments. You can use the MongoDB Shell to test queries and operations directly with your database.

b). Mongosh Installationhttps://www.mongodb.com/docs/mongodb-shell/install/#std-label-mdb-shell-install

Preview

c). Install msi from below link: The downloaded mongosh shell exe file needs to be installed now. Please use think link for your reference – https://www.mongodb.com/try/download/shell?jmp=docs

Preview

d). After successful download of msi file, you would need to extract it and add the monosh binary to your path environment by doing the following steps:

  • Open the Control Panel.
  • In the System and Security category, click System.
  • Click Advanced system settings. The System Properties modal displays.
  • Click Environment Variables.
  • In the System variables section, select Path and click Edit. The Edit environment variable modal displays.
  • Click New and add the filepath to your mongosh binary.
  • Click OK to confirm your changes. On each other modal, click OK to confirm your changes.

2. Create a database in Cloud using Mongo DB Atlas on onprem depending on your requirement:

Note: Add your current IP Address to the Access List.

  • Click on browse collections
  • Important Keywords to take note of
    • Cluster: A sharded cluster in MongoDB is a collection of datasets distributed across many shards (servers) in order to achieve horizontal scalability and better performance in read and write operations. Sharding is very useful for collections that have a lot of data and high query rates.
    • Collection : In layman terms collections are equivalent to Tables. Databases are sets of collections.
    • Documents: Collections store records, which are referred to as documents normally called as rows.

3). Import and Upload sample data to the database created

  • We would be using Mongo import to perform the import operation

To run mongoimport, install mongo db tools using below link

https://www.mongodb.com/try/download/database-tools

  • Use the below command to perform the import
mongoimport --uri mongodb+srv://yd2281:<PASSWORD>@cluster0.lx1fw.mongodb.net/<DATABASE> --collection <COLLECTION> --type <FILETYPE> --file <FILENAME>

In the below example I have tried to import another dataset names historical events for your reference

  • We are importing a restaurant json data : following is the preview of our json file
{
  "address": {
    "building": "1839",
    "coord": [
      -73.9482609,
      40.6408271
    ],
    "street": "Nostrand Avenue",
    "zipcode": "11226"
  },
  "borough": "Brooklyn",
  "cuisine": "Ice Cream, Gelato, Yogurt, Ices",
  "grades": [
    {
      "date": {
        "$date": 1405296000000
      },
      "grade": "A",
      "score": 12
    },
    {
      "date": {
        "$date": 1373414400000
      },
      "grade": "A",
      "score": 8
    },
    {
      "date": {
        "$date": 1341964800000
      },
      "grade": "A",
      "score": 5
    },
    {
      "date": {
        "$date": 1329955200000
      },
      "grade": "A",
      "score": 8
    }
  ],
  "name": "Taste The Tropics Ice Cream",
  "restaurant_id": "40356731"
}

4). Connect with mongo db from mongo shell using below command after successful import of data. To get url to connect to the database follow the below steps:

a)

b). Once you click on the command line tools , mongo db would provide url to connect to cluster

c).

mongosh "mongodb+srv://cluster0.lx1fw.mongodb.net/myFirstDatabase" --apiVersion 1 --username xxxx

d). Once connected, view all collections created in the connected Database using show collections command

5). Querying to the created database to fetch different sent of insights using MongoDB query

i). To select documents use command: db.meteorites.find()

ii).To view one document use command: db.meteorites.findOne()

6). Following are the query to solve different questions to database

Q1. Count the number of documents(records) in the collection(table):

Answer: print(db.restaurants.countDocuments())

Q2. Display all the documents in the collection:

Answer: print(db.restaurants.find())

Q3. Display: restaurant_id, name, borough and cuisine for all the documents:

Answer: print(db.restaurants.find({},{“restaurant_id”:1,”name”:1,”borough”:1,”cuisine”:1}))

Q4. Display: restaurant_id, name, borough and cuisine, but exclude field _id, for all the  documents in the collection:

Answer: print(db.restaurants.find({},{“restaurant_id”:1,”name”:1,”borough”:1,”cuisine”:1,”_id”:0}))

Q5. Display: restaurant_id, name, borough and zip code, exclude the field _id for all the documents in the collection:

Answer: print(db.restaurants.find({},{“restaurant_id”:1,”name”:1,”borough”:1,”address.zipcode”:1,”_id”:0}))

Q6. Display all the restaurants in the Bronx:

Answer: print(db.restaurants.find({“borough”:”Bronx”},{}))

Q7. Display the first 5 restaurants in the Bronx:

Answer: print(db.restaurants.find({“borough”:”Bronx”},{}).limit(5))

Q8. Display the second 5 restaurants (skipping the first 5) in the Bronx:

Answer: print(db.restaurants.find({“borough”:”Bronx”},{}).skip(5).limit(5))

Q9. Find the restaurants with a score more than 85:

Answer: print(db.restaurants.find({grades : { $elemMatch:{“score”:{$gt : 85}}}}))

Q10. Find the restaurants that achieved a score, more than 80 but less than 100:

Answer: print(db.restaurants.find({grades : { $elemMatch:{“score”:{$gt : 80 , $lt :100}}}}))

Q11. Find the restaurants which locate in latitude value less than -95.754168:

Answer: print(db.restaurants.find({“address.coord” : {$lt : -95.754168}}))

Q12. Find the restaurants that do not prepare any cuisine of ‘American’ and their grade score more than 70 and latitude less than -65.754168:

/*print(db.restaurants.find(
          {$and:
                [
                  {"cuisine" : {$ne :"American "}},
                  {"grades.score" : {$gt : 70}},
                  {"address.coord" : {$lt : -65.754168}}
                ]
          }
         ))
*/

Q13. Find the restaurants which do not prepare any cuisine of ‘American’ and achieved a score more than 70 and located in the longitude less than – 65.754168. (without using $and operator):

/*print(db.restaurants.find(
             {
                   "cuisine" : {$ne : "American "},
                   "grades.score" :{$gt: 70},
                   "address.coord" : {$lt : -65.754168}
             }
          ))
*/

Q14. Find the restaurants which do not prepare any cuisine of ‘American ‘ and achieved a grade point ‘A’ and not in the borough of Brooklyn, sorted by cuisine in descending order

print(db.restaurants.find( {
                 "cuisine" : {$ne : "American "},
                  "grades.grade" :"A",
                  "borough": {$ne : "Brooklyn"}
            } 
            ).sort({"cuisine":-1}))

Q15. Find the restaurant Id, name, borough and cuisine for those restaurants which contain ‘Wil’ as first three letters for its name:

print(db.restaurants.find({name:{$regex:"^Wil"}},{
       "restaurant_id" : 1,
       "name":1,
       "borough":1,
       "cuisine" :1}
))

Q16. Find the restaurant Id, name, borough and cuisine for those restaurants which contain ‘ces’ as last three letters for its name.

print(db.restaurants.find({name:{$regex:"ces$"}},{
        "restaurant_id" : 1,
        "name":1,
        "borough":1,
        "cuisine" :1}
))

Q17. Find the restaurant Id, name, borough and cuisine for those restaurants which contain ‘Reg’ as three letters somewhere in its name:

print(db.restaurants.find({name:{$regex:"Reg"}},{
       "restaurant_id" : 1,
       "name":1,
       "borough":1,
       "cuisine" :1}
))

Q18. Find the restaurants which belong to the borough Bronx and prepared either American or Chinese dish

print(db.restaurants.find({ 
      "borough": "Bronx", 
      $or : [{ "cuisine" : "American " }, { "cuisine" : "Chinese" }] }
))

Q19. Find the restaurant Id, name, borough and cuisine for those restaurants which belong to the boroughs of Staten Island or Queens or Bronx or Brooklyn.

print(db.restaurants.find(
      {"borough" :{$in :["Staten Island","Queens","Bronx","Brooklyn"]}},
      {
         "restaurant_id" : 1,
         "name":1,
         "borough":1,
         "cuisine" :1
      }
))

Q20. Find the restaurant Id, name, borough and cuisine for those restaurants which are not belonging to the borough Staten Island or Queens or Bronx or Brooklyn

print(db.restaurants.find(
      {"borough" :{$nin :["Staten Island","Queens","Bronx","Brooklyn"]}},
      {
         "restaurant_id" : 1,
         "name":1,
         "borough":1,
         "cuisine" :1
      }
))

Q21. Find the restaurant Id, name, borough and cuisine for those restaurants which achieved a score below 10.

print(db.restaurants.find(
    {"grades.score" : { $not: {$gt : 10}}
},
    {
       "restaurant_id" : 1,
       "name":1,
       "borough":1,
       "cuisine" :1
    }
))

Q22. Find the restaurant Id, name, borough and cuisine for those restaurants which prepared dish except ‘American’ and ‘Chinese’ or restaurant’s name begins with letter ‘Wil’.

print(db.restaurants.find(
{$or: [
  {name: {$regex:"^Wil"}}, 
  {"$and": [
     	{"cuisine" : {$ne :"American "}}, 
      	{"cuisine" : {$ne :"Chinese"}}
   ]}
]},
{"restaurant_id" : 1,"name":1,"borough":1,"cuisine" :1}
))

Q23. Find the restaurant Id, name, and grades for those restaurants which achieved a grade of “A” and scored 11 on an ISODate “2014-08-11T00:00:00Z” among many of survey dates.

print(db.restaurants.find(
    {"grades" : {$elemMatch: {"date": ISODate("2014-08-11T00:00:00Z"),
    "grade":"A", "score":11}}}, 
    {_id:0, restaurant_id:1, name:1, grades:1}
))

Q24. Find the restaurant Id, name and grades for those restaurants where the 2nd element of grades array contains a grade of “A” and score 9 on an ISODate “2014-08-11T00:00:00Z”.

print(db.restaurants.find(
    {$and: [{"grades.1.grade":"A"},
    {"grades.1.score": 9},
    {"grades.1.date": ISODate("2014-08-11T00:00:00Z")}]},
    {_id:0, restaurant_id:1, name:1, grades:1}
))

Q25. Find the restaurant Id, name, address and geographical location for those restaurants where 2nd element of coord array contains a value which is more than 42 and up to 52.

print(db.restaurants.find(
    {$and : [{"address.coord.1": {$gt : 42}},
    {"address.coord.1": {$lte : 52}}]},
    {_id:0, restaurant_id:1, name:1, address:1}
))

Q26. Count the number of events per year.Note, you will need to create a new date field from the string provided.Assume numbers are years if the entry is not a valid date.

print(db.historicalevents.aggregate([
     {"$group" : {_id:"$date", count:{$sum:1}}}
   ]
))

Q27. Count the number of events per year.Note, you will need to create a new date field from the string provided.Assume numbers are years if the entry is not a valid date.

print(db.meteorites.find({
    geolocation:{ 
    $near:{
       $geometry: { type: "Point",  coordinates: [ 6.08333,50.775 ] },
       $minDistance: 0,
       $maxDistance: 8046
    }
   }
}))
About the Author
Certified Tableau and Alteryx engineer who leverages data to find meaningful insights. Extensive experience in data modelling , data transformations , ETL and reporting analytics with an inkling to use cloud services to leverage end customer needs.
Kartik NagarajanSenior BI Analyst | USEReady