Important Information for KCDC MongoDB Workshop

TL;DR

The single most important thing you need to know is that if you show up with a machine that is running Windows XP or a machine that doesn't have a JDK on it, you're gonna have a bad day.

Prerequisites for your computer

OS: Windows 7 or 8 (preferably 64-bit), Linux, or a recent OS X
Have a Java SDK (JDK) 1.7 installed and configured with an IDE on your machine

Why? Because we will utilize features that were introduced in MongoDB version 2.2.

Windows XP and older versions of Linux/OS X ARE NOT SUPPORTED by MongoDB 2.2+

Prerequisites for you

Review and be familiar with JavaScript Object Notation at http://json.org
Be able to run and use the command prompt / terminal of your OS.
Be able to create code, add a jar/library, compile and run code in your favorite IDE.

Primary Goal

My primary goal is for you to leave the workshop with a functioning MongoDB environment, knowledge of the fundamentals, and the skills to do routine development work.

Course Outline


  • Introduction and Installation of MongoDB
  • Schema (Relational and Document Oriented)
  • Creating, Reading, Updating and Deleting documents (CRUD)
  • Advanced CRUD - sub documents, arrays, sorting, limiting and other operators
  • Backups
  • Performance/Indexes
  • Aggregation Framework
  • GridFS
  • Replication
  • Sharding Overview
  • Open Lab - Time Permitting

MongoDB Schema

One of the major aspects of MongoDB is that it is a document store.  You can put anything you want into a document; it is schema-less.  However, in many cases the documents stored in a collection do consist of the same fields.  So how do you determine the schema of a collection?  There are a few options:
  1. Manually/visually inspect the contents of the collection
  2. Use a utility to examine a single document
  3. Use some sort of utility to examine all of the documents in a collection

Manually inspecting the collection

As you can imagine, this simply means "using" the database and running db.collectionName.findOne().
For thoroughness, you'd probably want to examine more than one document.  This is where db.collectionName.find().pretty() comes in handy.

Use a utility to examine a single document

I created a small Python utility that, given a database name and a collection name, gives you the keys of a document in the collection.  It requires Python 2.7 and pymongo to be installed on your computer.  I put it in my ~/bin directory and ran chmod +x on it.
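
A minimal sketch of what a utility like this might look like follows. It is not the script described above, only an illustration of the idea. It assumes a MongoDB server reachable at mongodb://localhost:27017 and pymongo installed; the function names document_keys and keys_for_collection are hypothetical.

```python
def document_keys(document):
    """Return the sorted top-level field names of one document (a dict)."""
    return sorted(document.keys())


def keys_for_collection(database_name, collection_name,
                        uri="mongodb://localhost:27017"):
    """Fetch one document from the collection and return its keys.

    Assumes a reachable MongoDB server at `uri` (an assumption; adjust
    as needed). pymongo is imported here so that document_keys() stays
    usable on plain dicts without it.
    """
    from pymongo import MongoClient

    client = MongoClient(uri)
    try:
        document = client[database_name][collection_name].find_one()
    finally:
        client.close()
    # find_one() returns None when the collection is empty.
    return document_keys(document) if document is not None else []
```

With a server running, something like print(keys_for_collection("test", "people")) would list the fields of one document; the database name "test" and collection name "people" are placeholders.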

Use a utility to examine all of the documents in a collection

Skratch. has a cool extension to the mongo shell that examines all of the documents in a collection and tells you how many documents use each field.  Fields can vary in type from document to document, so this tool even breaks down the occurrences of each field by type!  It is on GitHub at: https://github.com/skratchdot/mongodb-schema/
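
To make the idea concrete, here is a rough Python sketch of counting field usage by type. It is not how the mongodb-schema extension is implemented; it only looks at top-level fields, and the names field_type_counts and print_report are hypothetical.

```python
from collections import Counter, defaultdict


def field_type_counts(documents):
    """Count, for each top-level field, how many documents use it per type.

    `documents` is any iterable of dicts, for example a pymongo cursor.
    Returns {field_name: Counter({type_name: count})}.
    """
    counts = defaultdict(Counter)
    for document in documents:
        for field, value in document.items():
            counts[field][type(value).__name__] += 1
    return counts


def print_report(counts):
    """Print one line per field with its total and per-type breakdown."""
    for field in sorted(counts):
        by_type = counts[field]
        breakdown = ", ".join(
            "%s: %d" % (name, n) for name, n in sorted(by_type.items())
        )
        print("%s used by %d document(s) (%s)"
              % (field, sum(by_type.values()), breakdown))
```

With pymongo you could pass the cursor returned by find() on a collection as the documents argument. Sub-documents and arrays are simply reported as dict and list here; the sketch does not descend into them.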

MongoDB examples, Replica Set and GridFS

MongoDB Replica Set with Python example

I published a short lab on working with MongoDB replica sets in Python on GitHub: https://github.com/k0emt/mongodb_repset_experiment

One thing to note is that I included all of the replica set nodes in the connection information.  That is because if only the "regular" primary node were listed and it was down at the time of the initial connection, the code would fail.

The nodes in the replica set will figure out who should be primary, and that will happen auto-magically behind the scenes.  However, your client code still needs to handle reconnecting.

Adjust the counters in the lab code if you want the client to stay up and running longer while you experiment.
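
The two points above (list every node, and be ready to reconnect) can be sketched roughly as follows. This is not the code from the lab repository. The helper names replica_set_uri, call_with_retry and read_one are hypothetical, as are the retry count and delay; MongoClient and pymongo.errors.AutoReconnect are the pymongo pieces it relies on.

```python
import time


def replica_set_uri(hosts, replica_set_name):
    """Build a connection string that lists every node as a seed."""
    return "mongodb://%s/?replicaSet=%s" % (",".join(hosts), replica_set_name)


def call_with_retry(operation, retry_on, attempts=5, delay_seconds=1.0):
    """Call operation(); on a `retry_on` error, wait and try again.

    Re-raises the last error once `attempts` calls have failed.
    """
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise
            time.sleep(delay_seconds)


def read_one(uri, database_name, collection_name):
    """Read a single document, retrying while a new primary is elected.

    Assumes pymongo is installed and the replica set is reachable.
    """
    from pymongo import MongoClient
    from pymongo.errors import AutoReconnect

    client = MongoClient(uri)
    collection = client[database_name][collection_name]
    return call_with_retry(collection.find_one, retry_on=AutoReconnect)
```

For a three-node set on one machine, the URI could be built with replica_set_uri(["localhost:27017", "localhost:27018", "localhost:27019"], "rs0"); the ports and the set name "rs0" are placeholders for whatever your replica set uses.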

MongoDB GridFS with Java example

Example code demonstrating GridFS and its metadata field with Java is also published on GitHub: https://github.com/k0emt/gridfs_example_java

By utilizing the metadata field, you can keep your document's metadata stored right alongside it.
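
As a rough illustration of the metadata idea, here is a small sketch in Python using pymongo's gridfs module rather than the Java driver used in the linked example. The helper names store_with_metadata, read_metadata and open_gridfs are hypothetical, and the server address is an assumption.

```python
def store_with_metadata(fs, data, filename, metadata):
    """Store `data` (bytes) in GridFS and attach `metadata` to the file.

    `fs` is expected to be a gridfs.GridFS instance. GridFS.put() saves
    extra keyword arguments as fields on the file document.
    Returns the _id of the stored file.
    """
    return fs.put(data, filename=filename, metadata=metadata)


def read_metadata(fs, file_id):
    """Return the metadata that was saved with the file."""
    return fs.get(file_id).metadata


def open_gridfs(database_name, uri="mongodb://localhost:27017"):
    """Open GridFS on a database; assumes a reachable server at `uri`."""
    import gridfs
    from pymongo import MongoClient

    return gridfs.GridFS(MongoClient(uri)[database_name])
```

With a server running, you might call store_with_metadata(open_gridfs("test"), b"hello", "hello.txt", {"owner": "workshop"}) and later pass the returned id to read_metadata; the database name, file name and metadata values are placeholders.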