Moving A Cluster to MongoDB Enterprise with SSL

WHY SSL

MongoDB’s SSL support allows MongoDB clients to talk to the database using encrypted connections for security. If you try to enable SSL on a regular distribution of MongoDB, it probably will not work, because the free version of MongoDB does not contain support for SSL. To use SSL, you must either build MongoDB yourself or buy MongoDB Enterprise.

This blog post is about how to move a cluster running MongoDB to MongoDB Enterprise with SSL, with a little background on what is going on with the MongoDB servers in the process.

It is more of an outline for getting started with SSL, and it assumes that you have already installed a build of MongoDB that includes SSL support and that your client driver supports SSL.

There are two parts relevant to moving your cluster to SSL: the server side, where the servers communicate with each other, and the client side, which sends queries to the servers. In MongoDB 2.6, there is a new net.ssl.mode parameter that can ease the transition.

MIXED MODE

The net.ssl.mode parameter is new in version 2.6. SSL can operate in four modes. The major difference between them is how the servers communicate with each other. One reason you may want a mixed mode is that not all of your client drivers may be ready for SSL.


disabled: No SSL encrypted connections.
allowSSL: Connections between servers do not use SSL. Incoming connections can be SSL or non-SSL.
preferSSL: Connections between servers use SSL. Incoming connections can be SSL or non-SSL.
requireSSL: Only SSL encrypted connections.
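
To make the four modes concrete, here is a small sketch that encodes the table above as plain JavaScript. It does not talk to MongoDB at all; the object layout and the function name acceptsIncoming are my own, made up for illustration.

```javascript
// The four net.ssl.mode values, encoded from the table above.
// serversUseSsl: do the servers use SSL when talking to each other?
// acceptsSsl / acceptsPlain: which incoming connections are allowed?
const SSL_MODES = {
  disabled:   { serversUseSsl: false, acceptsSsl: false, acceptsPlain: true },
  allowSSL:   { serversUseSsl: false, acceptsSsl: true,  acceptsPlain: true },
  preferSSL:  { serversUseSsl: true,  acceptsSsl: true,  acceptsPlain: true },
  requireSSL: { serversUseSsl: true,  acceptsSsl: true,  acceptsPlain: false },
};

// Would a server in this mode accept an incoming connection?
function acceptsIncoming(mode, clientUsesSsl) {
  const m = SSL_MODES[mode];
  if (!m) throw new Error("unknown mode: " + mode);
  return clientUsesSsl ? m.acceptsSsl : m.acceptsPlain;
}

console.log(acceptsIncoming("preferSSL", false));  // true
console.log(acceptsIncoming("requireSSL", false)); // false
```

The two mixed modes, allowSSL and preferSSL, accept both kinds of client, which is what lets you migrate without downtime.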

 

MongoDB Servers

Servers can validate clients in three ways. The first is SSL encryption mode, where everything is encrypted. The second requires clients to have a certificate from a certificate authority, which rules out self-signed certificates. In the third, the server accepts a client with a valid certificate or with NO certificate. This last mode only fails if the client passes an invalid certificate.

To upgrade a cluster, you go through the three SSL modes. First, start all the server nodes using allowSSL. Then use this command on each node to move the entire cluster to preferSSL:

db.getSiblingDB('admin').runCommand( { setParameter: 1, sslMode: "preferSSL" } )

And finally, make the last move to requireSSL, which blocks any non-SSL connections:

db.getSiblingDB('admin').runCommand( { setParameter: 1, sslMode: "requireSSL" } )

After this, update /etc/mongodb.conf to requireSSL so the setting stays the same after a reboot.
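
As a rough sketch of the sequence above, the helper below works out which setParameter command comes next for a node. The function names are hypothetical, and the code only builds the command document; it assumes the nodes were started with allowSSL, as described above.

```javascript
// Modes the post changes on a running node via setParameter.
const RUNTIME_MODES = ["preferSSL", "requireSSL"];

// Build the admin command document for one step of the upgrade.
function sslModeCommand(mode) {
  if (!RUNTIME_MODES.includes(mode)) {
    throw new Error("mode is not set at runtime in this sketch: " + mode);
  }
  return { setParameter: 1, sslMode: mode };
}

// Given a node's current mode, return the next command to run,
// or null once the node is already at requireSSL.
function nextUpgradeCommand(currentMode) {
  const path = ["allowSSL"].concat(RUNTIME_MODES);
  const i = path.indexOf(currentMode);
  if (i === -1) throw new Error("not on the upgrade path: " + currentMode);
  return i + 1 < path.length ? sslModeCommand(path[i + 1]) : null;
}

console.log(JSON.stringify(nextUpgradeCommand("allowSSL")));
// {"setParameter":1,"sslMode":"preferSSL"}
```

In the mongo shell, the returned document is what you would hand to db.getSiblingDB('admin').runCommand( ... ).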

MongoDB SSL Clients

Now that the servers are running in SSL, let’s look at the MongoDB client. All the mongo tools (mongo, mongodump, mongoexport, mongofiles, mongoimport, mongooplog, mongorestore, mongostat, mongotop) need to have SSL configured in the same way as the shell. Since you will be upgrading your cluster, you need your shell configured first.

As an example with mongo (that is just as valid with the other mongo utilities), you would pass --ssl along with a .pem file:

mongo --ssl --sslPEMKeyFile /etc/ssl/client.pem

If the server only cared about encryption, then passing --ssl would be enough:

mongo --ssl
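
For application code, the encryption-only case usually comes down to the ssl=true option in the standard MongoDB connection string, provided your driver supports it. Here is a minimal sketch that builds such a string. The helper name and the host names are placeholders I made up, and client certificate options are left out because they differ from driver to driver.

```javascript
// Hypothetical helper: build a mongodb:// URI, optionally with SSL switched on.
function buildMongoUri(hosts, dbName, useSsl) {
  if (!Array.isArray(hosts) || hosts.length === 0) {
    throw new Error("at least one host is required");
  }
  const base = "mongodb://" + hosts.join(",") + "/" + encodeURIComponent(dbName);
  return useSsl ? base + "?ssl=true" : base;
}

// Placeholder host names, for illustration only.
console.log(buildMongoUri(["db1.example.com:27017", "db2.example.com:27017"], "app", true));
// mongodb://db1.example.com:27017,db2.example.com:27017/app?ssl=true
```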

Not all client drivers support SSL connections. This is a pain, and another reason why you should use the official driver. I was using a C# driver that did not support SSL, and when the requirement to use SSL came along, a lot of refactoring had to happen to switch to the official driver, which did support SSL.

A Note on FIPS

The United States government defines many (several hundred) “Federal Information Processing Standard” (FIPS) documents. One of the FIPS regulations, FIPS 140, governs the use of encryption and cryptographic services. “FIPS Mode” is really “FIPS 140 Mode”. Both MongoDB Enterprise and MongoDB compiled with --ssl can operate in FIPS 140 Mode.


Building Fast, Scalable Multi Tenant Apps with MongoDB

MULTI TENANT

We are all building apps in the cloud: accounting apps, to-do apps, word processing apps, everything. With clients trusting you with their data, the question becomes how to keep their data safe and separate. Keeping your clients’ data separate is the difference between living in a group commune and being a tenant in an apartment. This is called multi-tenancy.

Multi-tenancy refers to a principle in software architecture where a single instance of the software runs on a server, serving multiple customers (tenants). This is an important feature of cloud computing, because in a multi-tenant environment customers do not share or see each other’s data.

There are three main ways to build multi tenant databases in MongoDB. The first is by putting all tenants in a single database. The second is putting each tenant in their own individual database, and finally, each tenant in its own collection.

PUTTING ALL TENANTS IN A SINGLE DATABASE

This is the most common form of multi-tenancy and where most web apps start. Putting all your tenants together is a lot simpler, and we do not even think about calling it multi-tenancy. Putting all tenants in a single database requires putting the multi-tenancy logic into the application level. Enforcing security at the application level can be as simple as enforcing a user or customer filter on all data queries, e.g. prefixing every database query with a user id.

For a “freemium” business, this will be a better model, since each MongoDB database occupies at least 32MB. Creating hundreds of databases for hundreds of non-paying customers can waste a lot of resources.
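
A minimal sketch of that kind of filter is below. The field name tenantId and the function name scopedQuery are assumptions of mine; use whatever your documents actually store.

```javascript
// Wrap every query so it can only ever match one tenant's documents.
function scopedQuery(tenantId, query) {
  if (tenantId === undefined || tenantId === null || tenantId === "") {
    throw new Error("tenantId is required");
  }
  // Spread the caller's query first, so a tenantId smuggled into
  // the query can never override the one we enforce here.
  return { ...query, tenantId: tenantId };
}

console.log(JSON.stringify(scopedQuery("acme", { status: "open" })));
// {"status":"open","tenantId":"acme"}
```

The point of routing every find, update and remove through one function is that forgetting the filter becomes an error instead of a data leak.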

GIVING EACH TENANT A SEPARATE COLLECTION

This is probably the worst way for MongoDB for a couple of reasons. I won’t go into detail, but this is the method you really want to avoid.

First, collections in the same database share the same database lock. MongoDB concurrency has been steadily improving, but the lock is still there. Second, the default MongoDB nssize setting limits the number of collections in a database to about 24,000 (indexes count against the same limit). You can go up to about 3 million by changing the nssize setting in the configuration.

GIVING EACH TENANT A SEPARATE DATABASE

This may be the best way, depending on your app. Giving each tenant their own database gives you flexibility in managing and optimising your MongoDB setup. With a separate database per customer, things like moving, managing and deleting client databases become trivial. Since each database is separate, you can create different indexes for different clients depending on their needs.

The downside to this is that each client takes space. If your clients are paying customers, this is not a problem. If you have a free service, then each client will use 32 MB of disk space, which is quite a lot if you have a lot of inactive clients.

Even with multi-tenancy, it can be hard to pick a shard key. The hashed shard key in MongoDB can provide good performance even at scale (depending on your application).
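
Here is a rough sketch of both sides of that trade-off: mapping a tenant to its own database name, and estimating the minimum disk the empty databases will take. The tenant_ prefix and the function names are hypothetical, and the 32 MB figure is the one quoted above.

```javascript
const MIN_DB_SIZE_MB = 32; // minimum size per database, as quoted in the post

// Map a tenant id to a database name. Ids are rejected rather than
// cleaned up, so two different tenants can never collapse to one name.
function tenantDatabaseName(tenantId) {
  if (typeof tenantId !== "string" || !/^[a-z0-9_]+$/.test(tenantId)) {
    throw new Error("tenant id must be lowercase letters, digits or underscores");
  }
  const name = "tenant_" + tenantId;
  // MongoDB database names must be shorter than 64 characters.
  if (name.length >= 64) throw new Error("tenant id is too long");
  return name;
}

// Minimum disk used by one database per tenant, in MB.
function minDiskForTenantsMb(tenantCount) {
  return tenantCount * MIN_DB_SIZE_MB;
}

console.log(tenantDatabaseName("acme")); // tenant_acme
console.log(minDiskForTenantsMb(500));   // 16000
```

So 500 free accounts cost about 16 GB of disk before anyone stores a single document.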
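
For reference, a hashed shard key is requested by giving the key field the value "hashed" in the shardCollection admin command. The sketch below only builds that command document. The namespace and field name are placeholders, and whether that field is a good choice depends entirely on your data.

```javascript
// Build the admin command that shards a collection on a hashed key.
// namespace is "database.collection"; field is the shard key field.
function hashedShardCommand(namespace, field) {
  if (!/^[^.]+\..+$/.test(namespace)) {
    throw new Error("namespace must look like database.collection");
  }
  const key = {};
  key[field] = "hashed";
  return { shardCollection: namespace, key: key };
}

// Placeholder names, for illustration only.
console.log(JSON.stringify(hashedShardCommand("app.events", "_id")));
// {"shardCollection":"app.events","key":{"_id":"hashed"}}
```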

WRAP UP

For most things, performance is application specific, especially with MongoDB, and the advice here needs to be seen in the light of your application. As always, your mileage may vary. If you are starting an app, you can certainly use one large database for development while writing your app logic to support one database per tenant. You may not use this initially with your app, but putting multi-database logic in your app code from the start will save you a lot of heartache if you have any sort of success. And one last thing: shard early. Picking a shard key is something that is hard to change later.

 


New MongoDB Class: MongoDB Advanced Deployment and Operations

NEW COURSE

MongoDB is offering an advanced course for MongoDB DBAs. This is a follow-on to their 102 course for MongoDB DBAs. The course has not started as I write this, but from the description, it seems to cover more real-life use cases. If you have been a MongoDB DBA for a while, I am sure it will be boring, but for those of you who have not run MongoDB in production, it is probably a good course given the cost. Also, these courses are a good way to keep up with the changes to MongoDB.

HOW I USE THE COURSES

I have used the courses and I will continue to use the MongoDB classes. I started using MongoDB during the 1.6/1.8 days, when MongoDB was not nearly as polished. To keep up with all the changes, I use the courses as my continuing education. I take a class and head straight to the quizzes and homework. Usually I just breeze through them without spending too much time. When I run into a quiz I cannot answer, it is usually about a new feature that has been added to MongoDB. Then I go through the videos, and “keep my tools sharp”.

Fork Me on Github

I JOINED THE CROWD

I finally have a public Github repo for my stuff. I do not know what I am going to put there exactly, but it will be MongoDB goodness. I have a lot of little tools I have written for various things. Some Javascript, some Python, some C# and some Bash. Even some straight C.

MONGODB ADMIN

Probably the best repo to start with is the MongoDB admin repo. That is probably where I will start adding things.

 

NoSQL Popularity – Comparing MongoDb, Riak, Redis and CouchDb

NoSQL POPULARITY

Starting the new year, I wanted to take a look at the most popular NoSQL databases as rated by searches on Google. The more people search for a database, the more popular it should be, as unscientific as this kind of poll is.

The graph below is a popularity contest. It shows how many times users searched for the term. It does not represent which database is better. I would never do a chart on which database was better; I would not want to start a religious war and spend my day answering comments.

GOOGLE SEARCH POPULARITY

I used Google Zeitgeist to examine four popular NoSQL databases: MongoDb, Riak, Redis and CouchDb.

[Chart: Jan 2014 search popularity of MongoDb, Riak, Redis and CouchDb]

So what can we learn from this chart?

MongoDB – Strong and growing in popularity. Yeah ! Since the name of the blog is The Mongo DBA, you probably know I am biased.

Riak – I thought Riak was more popular, and I was surprised at the result. Que sera sera (which is French for ‘Whatever’).

Redis – Has pretty much flatlined. I think Redis had its day, and we have reached ‘peak Redis’.

CouchDb – CouchDb is declining, which is almost as surprising as the results from Riak.

RESULTS

If you are looking at a popular NoSQL database, look at Redis and MongoDb. They have more ‘mindshare’ and more visibility in the open source community.

This is my humble opinion, so take it with a grain of salt. Riak and CouchDB are written in Erlang, and that may have something to do with the popularity of the databases. It is a lot easier to gain open source traction if hackers can use C or C++, which were used to write Redis and MongoDb.

I did try to search for Cassandra, but ‘Cassandra’ is a popular word, so lots of non-database results appear, over-reporting ‘Cassandra’. If you look for ‘Apache Cassandra’, you miss a lot of posts, under-reporting ‘Apache Cassandra’.

Mongodb.js vs Mongoose, Why we chose Mongodb.js

NODE.JS AND THE API

For the contract I am currently working on, we are building an API using node.js, express.js and mongodb. For node.js to access data in a Mongodb database, you need to have a library. There are several options, including the native mongodb driver, Mongoose, Mongolian Dead Beef and MongoSkin, and I am sure there are tons I did not mention.

USING MONGOOSE

Mongoose brings schemas and models to Node.js / Mongodb. Yes, it brings a schema to a NoSQL database. I think it makes no sense to do that. If you want models and schemas, go ahead and use MySQL. Taking a NoSQL database and putting schemas on top actually defeats the purpose.

One of the complaints I hear is that developers using Mongodb.js have a problem with the joins and the schema. This is a problem with the developer, not with Mongodb or NoSQL databases in general.

DIFFERENCE BETWEEN THE TWO

Mongoose is a higher-level interface to Mongodb and actually uses mongodb.js, the MongoDB driver. The question is not really which one is better or worse. The question for us is:

Do the benefits of an ODM in Mongoose outweigh the drawbacks?

If you’re looking for an object modeling (ODM) tool so that you do not have to learn a lot about the way Mongodb works, then Mongoose is probably for you. If you want a fast driver and really want to get the most out of Mongodb, then use the native driver. We know our way around Mongodb, so Mongoose would have slowed us and our app down.

BOTTOM LINE

If you are having a problem using schemas ( or complaining about joins )  in any NoSQL database, it is time to stop and spend a little bit more time understanding NoSQL databases and why they work the way they do.

Some of the things you get from Mongoose, such as validation, are good uses. I just think there are a lot better ways to do it in javascript than wrapping a NoSQL database in schemas.
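
As one possible illustration, validation can be a plain function that runs before a document is handed to the driver. This is only a sketch; the document shape and rules below are made up, and it is not meant as a replacement for everything Mongoose does.

```javascript
// Hypothetical rules for a "user" document. Returns a list of
// problems; an empty list means the document is fine to insert.
function validateUser(doc) {
  const errors = [];
  if (typeof doc.email !== "string" || !doc.email.includes("@")) {
    errors.push("email must be a string containing @");
  }
  if (typeof doc.name !== "string" || doc.name.trim() === "") {
    errors.push("name is required");
  }
  // age is optional, but if present it must be a non-negative integer
  if (doc.age !== undefined && (!Number.isInteger(doc.age) || doc.age < 0)) {
    errors.push("age must be a non-negative integer");
  }
  return errors;
}

console.log(validateUser({ email: "bob@example.com", name: "Bob" })); // []
console.log(validateUser({ email: "nope", name: "" }).length);        // 2
```

Because it is just a function, different collections, or even different tenants, can have different rules without touching the database layer.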