Learn Python Series (#20) - PyMongo Part 3
utopian-io·@scipio·
0.000 HBDLearn Python Series (#20) - PyMongo Part 3
# Learn Python Series (#20) - PyMongo Part 3  #### What Will I Learn? - You will learn about how to use the Logical Operators `$and`, `$or`, `$nor` and `$not`; - about some other operators including `$text` and `$search`; - how to address nested fields; - how to create and use an index for textual search; - how to use the `$size` operator; - and how to use the flexible `$where` operator, which will be of special interest to JavaScript-loving aspiring Pythonistas! #### Requirements - A working modern computer running macOS, Windows or Ubuntu; - An installed Python 3(.6) distribution, such as (for example) the Anaconda Distribution; - The ambition to learn Python programming; - A running MongoDB installation on your computer system, as explained in the previous two episodes . #### Difficulty Intermediate #### Curriculum (of the `Learn Python Series`): - [Learn Python Series - Intro](https://utopian.io/utopian-io/@scipio/learn-python-series-intro) - [Learn Python Series (#2) - Handling Strings Part 1](https://utopian.io/utopian-io/@scipio/learn-python-series-2-handling-strings-part-1) - [Learn Python Series (#3) - Handling Strings Part 2](https://utopian.io/utopian-io/@scipio/learn-python-series-3-handling-strings-part-2) - [Learn Python Series (#4) - Round-Up #1](https://utopian.io/utopian-io/@scipio/learn-python-series-4-round-up-1) - [Learn Python Series (#5) - Handling Lists Part 1](https://utopian.io/utopian-io/@scipio/learn-python-series-5-handling-lists-part-1) - [Learn Python Series (#6) - Handling Lists Part 2](https://utopian.io/utopian-io/@scipio/learn-python-series-6-handling-lists-part-2) - [Learn Python Series (#7) - Handling Dictionaries](https://utopian.io/utopian-io/@scipio/learn-python-series-7-handling-dictionaries) - [Learn Python Series (#8) - Handling Tuples](https://utopian.io/utopian-io/@scipio/learn-python-series-8-handling-tuples) - [Learn Python Series (#9) - Using Import](https://utopian.io/utopian-io/@scipio/learn-python-series-9-using-import) - [Learn Python Series (#10) - Matplotlib Part 1](https://utopian.io/utopian-io/@scipio/learn-python-series-10-matplotlib-part-1) - [Learn Python Series (#11) - NumPy Part 1](https://utopian.io/utopian-io/@scipio/learn-python-series-11-numpy-part-1) - [Learn Python Series (#12) - Handling Files](https://utopian.io/utopian-io/@scipio/learn-python-series-12-handling-files) - [Learn Python Series (#13) - Mini Project - Developing a Web Crawler Part 1](https://utopian.io/utopian-io/@scipio/learn-python-series-13-mini-project-developing-a-web-crawler-part-1) - [Learn Python Series (#14) - Mini Project - Developing a Web Crawler Part 2](https://utopian.io/utopian-io/@scipio/learn-python-series-14-mini-project-developing-a-web-crawler-part-2) - [Learn Python Series (#15) - Handling JSON](https://utopian.io/utopian-io/@scipio/learn-python-series-15-handling-json) - [Learn Python Series (#16) - Mini Project - Developing a Web Crawler Part 3](https://utopian.io/utopian-io/@scipio/learn-python-series-16-mini-project-developing-a-web-crawler-part-3) - [Learn Python Series (#17) - Roundup #2 - Combining and analyzing any-to-any multi-currency historical data](https://utopian.io/utopian-io/@scipio/learn-python-series-17-roundup-2-combining-and-analyzing-any-to-any-multi-currency-historical-data) - [Learn Python Series (#18) - PyMongo Part 1](https://utopian.io/utopian-io/@scipio/learn-python-series-18-pymongo-part-1) - [Learn Python Series (#19) - PyMongo Part 2](https://utopian.io/utopian-io/@scipio/learn-python-series-19-pymongo-part-2) # Learn Python Series (#20) - PyMongo Part 3 In the previous `Learn Python Series` episodes about `PyMongo Part 1` and `PyMongo Part 2`, we learned what MongoDB & PyMongo are about, how to do "CRUD" (Create, Read, Update, Delete) operations - and how to use Query Operations for more advanced querying. In this episode, we'll expand our knowledge regarding PyMongo with a number of techniques: we are going to discuss so-called **Logical Operators**. Let's get started! # Adding some (dummy) data to the `test_mongo` dataset Up until now, we've kept the dataset of the `test_mongo` test database pretty simple. We didn't need more or more complex data to explain what was discussed in the PyMongo Parts 1 & 2 tutorial episodes anyway. But since we're going to discuss creating more complex data queries via using so-called **Logical Operators** regarding PyMongo, let's first add some more data to the `test_mongo` dataset. ```python import pymongo from bson.objectid import ObjectId from pprint import pprint client = pymongo.MongoClient('mongodb://localhost:27017') db = client.test_mongo coll = db.accounts scipio_posts = [ { "title": "Learn Python Series (#19) - PyMongo Part 2", "url": "https://steemit.com/utopian-io/@scipio/learn-python-series-19-pymongo-part-2" }, { "title": "Learn Python Series (#18) - PyMongo Part 1", "url": "https://steemit.com/utopian-io/@scipio/learn-python-series-18-pymongo-part-1" }, { "title": "Learn Python Series (#17) - Roundup #2 - Combining and analyzing any-to-any multi-currency historical data", "url": "https://steemit.com/utopian-io/@scipio/learn-python-series-17-roundup-2-combining-and-analyzing-any-to-any-multi-currency-historical-data" } ] result = coll.update_one({"account": "scipio"}, {"$set": {"posts": scipio_posts}}) ``` ```python jedigeiss_posts = [ { "title": "Das Tagebuch der (Steem)Leiden", "url": "https://steemit.com/deutsch/@jedigeiss/das-tagebuch-der-steem-leiden" }, { "title": "D-A-CH Support -- Discord / Steem Bridge -- added Autovoter", "url": "https://steemit.com/utopian-io/@jedigeiss/d-a-ch-support-discord-steem-bridge-added-autovoter" } ] result = coll.update_one({"account": "jedigeiss"}, {"$set": {"posts": jedigeiss_posts}}) ``` Okay, now as a result, 2 of the 4 documents in the `accounts` collection have a `"posts"` field, of which one holds 3 and the other 2 post items, each containing 2 fields (`"title"` and `"url"`); that will do for now.  # Using Logical Operators for more complex data queries PyMongo gives us support for logical query operators `$and`, `$or`, `$nor` and `$not`, with which we're able to construct more complex data queries. Let's go over them one by one to see how they work and what kind of queries we can come up with. ### Using the `$and` operator The `$and` operator can be used to perform logical `AND` operations (where all expressions involved need to be `True`) on a list (array) of 2 or more expressions. As a result, only the documents that satisfy all expressions will be selected. ```python result = list(coll.find( {"$and": [ {"posts": {"$exists": True}}, {"account_id": {"$exists": True}} ]} )) pprint(result) ``` [{'_id': ObjectId('5ae46285dd58330cd666056f'), 'account': 'scipio', 'account_id': 422033, 'active': True, 'posts': [{'title': 'Learn Python Series (#19) - PyMongo Part 2', 'url': 'https://steemit.com/utopian-io/@scipio/learn-python-series-19-pymongo-part-2'}, {'title': 'Learn Python Series (#18) - PyMongo Part 1', 'url': 'https://steemit.com/utopian-io/@scipio/learn-python-series-18-pymongo-part-1'}, {'title': 'Learn Python Series (#17) - Roundup #2 - Combining and ' 'analyzing any-to-any multi-currency historical data', 'url': 'https://steemit.com/utopian-io/@scipio/learn-python-series-17-roundup-2-combining-and-analyzing-any-to-any-multi-currency-historical-data'}], 'slogan': "Does it matter who's right, or who's left?"}] **Nota bene:** As we can see, only 1 account (`scipio`) is returned, being the only account that has both a `"posts"` field as well as an `"account_id"`. The account `jedigeiss` has a `"posts"` field as well but not an `"account_id"`, where the account `fabiyamada` has an `"account_id"` field but not a `"posts"` field. ### Using the `$or` operator The `$or` operator can be used to perform logical `OR` operations (where only one of all expressions involved needsto be `True`) on a list (array) of 2 or more expressions. As a result, the documents that satisfy one or more of all expressions involved will be selected. ```python result = list(coll.find( {"$or": [ {"posts": {"$exists": True}}, {"account_id": {"$exists": True}} ]} )) pprint(result) ``` [{'_id': ObjectId('5ae46285dd58330cd666056f'), 'account': 'scipio', 'account_id': 422033, 'active': True, 'posts': [{'title': 'Learn Python Series (#19) - PyMongo Part 2', 'url': 'https://steemit.com/utopian-io/@scipio/learn-python-series-19-pymongo-part-2'}, {'title': 'Learn Python Series (#18) - PyMongo Part 1', 'url': 'https://steemit.com/utopian-io/@scipio/learn-python-series-18-pymongo-part-1'}, {'title': 'Learn Python Series (#17) - Roundup #2 - Combining and ' 'analyzing any-to-any multi-currency historical data', 'url': 'https://steemit.com/utopian-io/@scipio/learn-python-series-17-roundup-2-combining-and-analyzing-any-to-any-multi-currency-historical-data'}], 'slogan': "Does it matter who's right, or who's left?"}, {'_id': ObjectId('5ae46693dd58330cd6660571'), 'account': 'fabiyamada', 'account_id': 261379, 'active': True}, {'_id': ObjectId('5ae4bc73dd58332709589486'), 'account': 'jedigeiss', 'active': True, 'posts': [{'title': 'Das Tagebuch der (Steem)Leiden', 'url': 'https://steemit.com/deutsch/@jedigeiss/das-tagebuch-der-steem-leiden'}, {'title': 'D-A-CH Support -- Discord / Steem Bridge -- added ' 'Autovoter', 'url': 'https://steemit.com/utopian-io/@jedigeiss/d-a-ch-support-discord-steem-bridge-added-autovoter'}], 'slogan': 'IT Nerd, Risk Specialist, Musician, Cryptocoin Enthusiast, ' 'Banker, Gamer'}] **Nota bene:** The query we now ran using `$or` deliberately is the same, to show you the difference, as the example `$and` query we began with, because right now in total 3 documents were selected: - the account (`scipio`) has both a `"posts"` field as well as an `"account_id"` field, which satisfies the `$or` clause (at least 1 expression evaluates to True, and it doesn't matter both expressions do); - the account `jedigeiss` has a `"posts"` field; - the account `fabiyamada` has an `"account_id"` field. ### Using the `$nor` operator The `$nor` operator performs logical `NOR` operations on a list (array) of 2 or more expressions. As a result, the documents that satisfy NONE of all expressions, ergo when ALL expressions FAIL, will be selected. ```python result = list(coll.find( {"$nor": [ {"posts": {"$exists": True}}, {"account_id": {"$exists": True}} ]} )) pprint(result) ``` [{'_id': ObjectId('5ae46693dd58330cd6660570'), 'account': 'stoodkev', 'active': True}] ### Using the `$not` operator The `$not` operator performs logical `NOT` operations on 1 expression and selects the documents that do **not match** that expression. `$not` affects other operators. ```python result = list(coll.find( {"account_id":{ "$not": {"$exists": True} }} )) pprint(result) ``` [{'_id': ObjectId('5ae46693dd58330cd6660570'), 'account': 'stoodkev', 'active': True}, {'_id': ObjectId('5ae4bc73dd58332709589486'), 'account': 'jedigeiss', 'active': True, 'posts': [{'title': 'Das Tagebuch der (Steem)Leiden', 'url': 'https://steemit.com/deutsch/@jedigeiss/das-tagebuch-der-steem-leiden'}, {'title': 'D-A-CH Support -- Discord / Steem Bridge -- added ' 'Autovoter', 'url': 'https://steemit.com/utopian-io/@jedigeiss/d-a-ch-support-discord-steem-bridge-added-autovoter'}], 'slogan': 'IT Nerd, Risk Specialist, Musician, Cryptocoin Enthusiast, ' 'Banker, Gamer'}] # Other operators ### The `$size` operator If you use the `$size` operator on a list (array) field, it will match documents having that exact number of elements: ```python result = list(coll.find( {"posts": {"$size": 2}} )) pprint(result) ``` [{'_id': ObjectId('5ae4bc73dd58332709589486'), 'account': 'jedigeiss', 'active': True, 'posts': [{'title': 'Das Tagebuch der (Steem)Leiden', 'url': 'https://steemit.com/deutsch/@jedigeiss/das-tagebuch-der-steem-leiden'}, {'title': 'D-A-CH Support -- Discord / Steem Bridge -- added ' 'Autovoter', 'url': 'https://steemit.com/utopian-io/@jedigeiss/d-a-ch-support-discord-steem-bridge-added-autovoter'}], 'slogan': 'IT Nerd, Risk Specialist, Musician, Cryptocoin Enthusiast, ' 'Banker, Gamer'}] **Nota bene:** the `$size` operator expects an integer number as its argument, you cannot use an expression such as: ``` result = list(coll.find( {"posts": {"$size": {"$gte": 2}}} )) ``` ### The `$where` operator For those of you that have experience in JavaScript programming, or for those aspiring Pythonistas feeling adventurous: special attention for the `$where` operator, since it allows for passing an argument string containing - you probably guessed it by now - a JavaScript expression (or even a JavaScript function). However, please note that although the `$where` operator does provide a lot of flexibility, it is quite slow because it requires the MongoDB database to evaluate the JavaScript expression separately for every document in the selected collection. A nice example, in which the `$where` expression shines, is by performing a "greater-than-or-equal-to"-type query for the number of elements in the `"posts"` list (array) field, which doesn't work with the same flexibility as we've just seen with the `$size` example: ```python result = list(coll.find( {"posts": {"$exists": True}, "$where":"this.posts.length >= 2"} )) pprint(result) ``` [{'_id': ObjectId('5ae46285dd58330cd666056f'), 'account': 'scipio', 'account_id': 422033, 'active': True, 'posts': [{'title': 'Learn Python Series (#19) - PyMongo Part 2', 'url': 'https://steemit.com/utopian-io/@scipio/learn-python-series-19-pymongo-part-2'}, {'title': 'Learn Python Series (#18) - PyMongo Part 1', 'url': 'https://steemit.com/utopian-io/@scipio/learn-python-series-18-pymongo-part-1'}, {'title': 'Learn Python Series (#17) - Roundup #2 - Combining and ' 'analyzing any-to-any multi-currency historical data', 'url': 'https://steemit.com/utopian-io/@scipio/learn-python-series-17-roundup-2-combining-and-analyzing-any-to-any-multi-currency-historical-data'}], 'slogan': "Does it matter who's right, or who's left?"}, {'_id': ObjectId('5ae4bc73dd58332709589486'), 'account': 'jedigeiss', 'active': True, 'posts': [{'title': 'Das Tagebuch der (Steem)Leiden', 'url': 'https://steemit.com/deutsch/@jedigeiss/das-tagebuch-der-steem-leiden'}, {'title': 'D-A-CH Support -- Discord / Steem Bridge -- added ' 'Autovoter', 'url': 'https://steemit.com/utopian-io/@jedigeiss/d-a-ch-support-discord-steem-bridge-added-autovoter'}], 'slogan': 'IT Nerd, Risk Specialist, Musician, Cryptocoin Enthusiast, ' 'Banker, Gamer'}] and as you can see, in this case both the accounts of `scipio` as well as `jedigeiss` were selected correctly. ### Perform a textual search using the `$text` & `$search` operators The `$text` operator is pretty cool, in that it allows you to search for words contained within the textual content of specific fields, provided those field are indexed with a **text index**. To do so, first **place an index** (using `create_index()`) where you pass in the field key(s) as a list (array) and TEXT as the **index direction**. Then combine the `$text` & `$search` operators and look for a specific word. Like so: ```python # Since the "title" field is nested within the elements contained, # inside the "posts" field, use the dot-notation `.` to index # the field "posts.title" coll.create_index([("posts.title", pymongo.TEXT)]) result = list(coll.find( {"$text": {"$search": "Discord"}} )) pprint(result) ``` [{'_id': ObjectId('5ae4bc73dd58332709589486'), 'account': 'jedigeiss', 'active': True, 'posts': [{'title': 'Das Tagebuch der (Steem)Leiden', 'url': 'https://steemit.com/deutsch/@jedigeiss/das-tagebuch-der-steem-leiden'}, {'title': 'D-A-CH Support -- Discord / Steem Bridge -- added ' 'Autovoter', 'url': 'https://steemit.com/utopian-io/@jedigeiss/d-a-ch-support-discord-steem-bridge-added-autovoter'}], 'slogan': 'IT Nerd, Risk Specialist, Musician, Cryptocoin Enthusiast, ' 'Banker, Gamer'}] # What did we learn, hopefully? In this episode, we again gained more knowledge on how to query MongoDB via the PyMongo package. We discussed the Logical Query Operators `$and`, `$or`, `$nor` and `$not`, plus we went over the operators `$size`, `$where`, `$text` and `$search`. And because I deliberately kept the queries and underlying dataset simple enough to comprehend (hopefully?) yet rich enough to demonstrate the various querying techniques, I hope it became clear that MongoDB / PyMongo provides us with some pretty powerful mechanisms to "precisely pinpoint" the data we're looking for based on a number of criteria! ### Thank you for your time! <br /><hr/><em>Posted on <a href="https://utopian.io/utopian-io/@scipio/learn-python-series-20-pymongo-part-3">Utopian.io - Rewarding Open Source Contributors</a></em><hr/>
👍 scipio, steemline, analyzer, helo, tuanis, itsmikechu, fabiyamada, parejan, jasonbu, millionfist, espoem, allyouneedtoknow, rdvn, valued-customer, shakibul, clayjohn, reazuliqbal, iqbaladan, panotwo, rishi-sayz, wave.beads, cryptocorgi, zcool, stoodkev, olyup, rezawijaya, loshcat, greenorange, ibez, amosbastian, privacybydesign, techslut, choogirl, onderakcaalan, howo, madarauchiha, yurijkkuig, olegkorrole, malyshenko, ruth-elise, diogogomes, aderemi01, osita4coin, utopian-io, drifter1,