Real-time multi-tenant migration with Cloudant NoSQL database

Executing data migrations on a multi-tenant, document-based, enterprise-level cloud database system is as difficult to do as it is to say. NoSQL document-based storage has proven to be an excellent fit for agile development, as it is trivial to start storing data. However, many teams struggle with managing that data after it is created.

What happens when changes are needed to some or all documents, such as adding a new field to every document? Perhaps the documents in one collection must be refactored into a different format. Relational SQL-based database systems can run queries to alter a table or update many rows at a time, but Cloudant and other NoSQL databases are schema-less, so a migration script is essential for performing bulk operations. This article provides tips on writing migration scripts and performing real-time tenant data migration without downtime.

Before Writing A Migration Script

There are several factors one must consider when writing a migration script, such as:

  • Will it need to run on every database, or just a subset of them?
  • Which documents require migration? If not all of them, is there an index to filter the documents?
  • On average, how large is each document? Will out-of-memory errors come into play?
  • Is there a need for new design documents or map-reduce views?
  • Will this script be used again in the future?
  • Who will be running this script? A human? A cron job (machine)?

Before doing anything, however, it is strongly advised that backups be made. There is no concept of ‘transactions’ in Cloudant. Any changes made are permanent. That being said, let’s discuss multi-tenancy.

Multi-tenancy means that the collections for every tenant are stored and managed in the same Cloudant account. Each tenant could have a number of collections with different types of documents. Determining which collections contain the documents requiring migration is the first step in writing a migration script. Suppose the list of databases resembles:

tenant_{tenantCode}_config
tenant_{tenantCode}_data
tenant_{anotherTenantCode}_config
tenant_{anotherTenantCode}_data

If the documents requiring changes only reside in the `_config` collections, then the list of collections needs to be filtered before fetching any documents. Additionally, be sure updates are made to the correct documents within a collection, as fetching documents could return design documents or other metadata that do not require migration.
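As a rough illustration, here is a minimal Node.js sketch that filters the account’s database list down to the `_config` collections and skips design documents. It assumes Node 18+ (for the global `fetch`) and a hypothetical `CLOUDANT_URL` environment variable holding the account URL with credentials; the naming convention is the one from the example above.

```js
// Sketch: select only the tenant _config databases and skip design documents.
// Assumes Node 18+ (global fetch) and a hypothetical CLOUDANT_URL such as
// https://user:password@account.cloudant.com
const ACCOUNT_URL = process.env.CLOUDANT_URL;

async function configDatabases() {
  // GET /_all_dbs returns every database name in the account
  const res = await fetch(`${ACCOUNT_URL}/_all_dbs`);
  const allDbs = await res.json();
  // Keep only the tenant config collections per the naming convention above
  return allDbs.filter((name) => /^tenant_.+_config$/.test(name));
}

function needsMigration(row) {
  // _all_docs also returns design documents; skip them
  return !row.id.startsWith('_design/');
}
```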

When fetching documents, take care to observe the application’s memory limit. Scripting runtimes such as NodeJS and PHP enforce strict limits on how much memory a script may use, and fetching all documents in a collection in one request can easily result in an ‘out of memory’ error if the collection contains several thousand documents. Limit the number of documents per request by supplying the `limit` parameter to Cloudant.

An unintended consequence of this approach is that, should the script fail because of a network error, a Cloudant timeout, or an unhandled exception, it would have to start over from the beginning. Additionally, Cloudant’s `skip` parameter (which offsets the result set) performs very slowly for large values. Instead, use the `startkey` or `startkey_docid` parameter to specify where the next batch should begin. The last key (document id) used can be saved in a temporary file for recovery after an error, or simply printed to the console and passed back in as an argument.
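Below is a sketch of this batching pattern, continuing the assumptions above. It pages through `_all_docs` with `limit` and `startkey`, fetching one extra row per batch to use as the next start key, and prints that key after each batch so a failed run can be resumed from the checkpoint.

```js
// Sketch: page through a collection in fixed-size batches, checkpointing
// the last key so an interrupted run can be resumed where it left off.
async function* batches(dbUrl, batchSize = 500, resumeKey = null) {
  let startkey = resumeKey;
  while (true) {
    const params = new URLSearchParams({
      include_docs: 'true',
      limit: String(batchSize + 1), // one extra row becomes the next start key
    });
    if (startkey) params.set('startkey', JSON.stringify(startkey));
    const res = await fetch(`${dbUrl}/_all_docs?${params}`);
    const { rows } = await res.json();
    yield rows.slice(0, batchSize).map((r) => r.doc);
    if (rows.length <= batchSize) return; // no more documents
    startkey = rows[batchSize].key;
    console.log(`Next start key: ${startkey}`); // checkpoint for resuming
  }
}
```

A resumed run may re-process documents from the batch that failed, which is harmless as long as the migration is idempotent (more on that below).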

Running A Migration Script

Before running anything, again, please back up your data.

If the script will be run multiple times, it’s a good idea to wrap it in an easy-to-use command-line interface (CLI). Accepting options on the command line also allows the script to run as a cron job, or to be triggered by an action such as a Jenkins build or a git hook.
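As a minimal sketch of such a wrapper, plain `process.argv` parsing is enough; the flag names here are illustrative rather than any convention:

```js
// Sketch: a tiny CLI layer so the script can be driven by a human,
// a cron job, or a CI hook. Flag names are illustrative.
const args = process.argv.slice(2);

function flagValue(flag) {
  const i = args.indexOf(flag);
  return i >= 0 ? args[i + 1] : null;
}

const options = {
  dryRun: args.includes('--dry-run'),     // report changes without saving them
  resumeKey: flagValue('--resume-from'),  // startkey checkpoint from a prior run
  tenant: flagValue('--tenant'),          // restrict the run to one tenant
};

// Usage: node migrate.js --tenant acme --dry-run
// Cron:  0 2 * * * node /opt/scripts/migrate.js --tenant acme
```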

It is helpful to provide meaningful output about what the script is doing while it runs. If the database has a large number of collections and documents, the script could run for a long time, and running migration scripts (especially on production databases) can be very stressful. Relaying information about what is happening can help ease nerves while waiting for a script to complete over several hours. Something as simple as:

Gathering documents for collection tenant_{tenantCode}_config

Found 245 documents to modify

Updated 244 documents. 1 error.

Gathering documents for collection tenant_{anotherTenantCode}_config

can make a world of difference. This style of logging can also help in debugging issues later on, and console output that describes each step of the script makes it easy to verify it is working as expected. In a service with multiple environments (e.g. development, test, pre-production, production), the output is also useful for tracking and documenting the migration process across each environment. While developing a script, a great way to debug is to initially disable the code that saves changes to Cloudant, and simply confirm the script runs correctly.
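Rather than commenting code in and out, the write can be guarded behind the `--dry-run` flag from the CLI sketch above. The sketch below posts each batch to Cloudant’s `_bulk_docs` endpoint and logs counts in the style shown above:

```js
// Sketch: guard writes behind a dry-run flag instead of commenting them out.
async function saveBatch(dbUrl, docs, dryRun) {
  if (dryRun) {
    console.log(`[dry-run] would update ${docs.length} documents`);
    return { updated: docs.length, errors: 0 };
  }
  // POST /_bulk_docs writes the whole batch in a single request
  const res = await fetch(`${dbUrl}/_bulk_docs`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ docs }),
  });
  const results = await res.json(); // one { ok } or { error } entry per doc
  const errors = results.filter((r) => r.error).length;
  console.log(`Updated ${results.length - errors} documents. ${errors} error(s).`);
  return { updated: results.length - errors, errors };
}
```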

Post-Migration

Remember that massive operations on Cloudant can cause performance issues and even bring down Cloudant nodes. Bulk updates can trigger massive reindexing of views, which results in a large drain on performance. Allowing time for indexing to complete is paramount to success in a real-time, live environment. When working with large numbers of documents, indexing can sometimes take eight hours or more!

When creating new design documents or indexes, it’s important to allow time for those indexes to build as well. A good plan when introducing new indexes is to create them days ahead of deploying any code that will use them. This gives Cloudant ample time to build the new indexes before anything queries them. Updating an existing index is typically a bad idea for large sets of data, as reindexing will then take place while code is still running against the index (which is still being rebuilt). It’s generally wiser to create a brand-new index ahead of time, and then switch to it after indexing is complete.
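Waiting on an index can also be automated: querying a view (without `stale=ok`) triggers the indexer, and the `_active_tasks` endpoint reports indexer progress. A sketch, with hypothetical design document and view names:

```js
// Sketch: trigger a build of a new view and check indexing progress.
// The design document and view names are hypothetical.
async function warmView(dbUrl, ddoc, view) {
  // Any query without stale=ok makes Cloudant build the index before
  // responding; limit=0 avoids transferring any rows.
  await fetch(`${dbUrl}/_design/${ddoc}/_view/${view}?limit=0`);
}

async function indexingTasks(accountUrl) {
  // GET /_active_tasks lists running indexer jobs with a progress percentage
  const res = await fetch(`${accountUrl}/_active_tasks`);
  const tasks = await res.json();
  return tasks.filter((t) => t.type === 'indexer');
}
```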

For the person running the script, a summary of results is crucial to understanding what happened during the run. Reporting the number of documents or collections affected, the actions taken on them, and any errors encountered will help smooth the migration process, especially when errors are occurring.

Finally, after the script has finished running, determine the next steps by answering a few questions. Could new documents have been created during the run that were therefore not migrated? Can the script be run again without affecting documents that were already modified? Writing the script so that it can be run repeatedly, without duplicating operations, is key to migrating data in a production system. The best migration scripts can determine whether a document has already been transformed or modified, and will skip it in subsequent runs.
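One simple way to get this idempotency is to stamp each document with a version as it is migrated, and skip anything already at the target version. The `schemaVersion` field below is an assumption for illustration, not a Cloudant convention:

```js
// Sketch: make the migration idempotent with a version stamp.
// The schemaVersion field name is hypothetical.
const TARGET_VERSION = 2;

function migrateDoc(doc) {
  if ((doc.schemaVersion || 1) >= TARGET_VERSION) {
    return null; // already migrated on a previous run: skip it
  }
  // ... transform the document here ...
  doc.schemaVersion = TARGET_VERSION; // mark it so re-runs leave it alone
  return doc;
}
```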

Putting It All Together

The following is a basic template for the steps needed to perform real-time migration:

  1. Create the new databases, design documents, indexes, and views needed for the migration.
  2. Allow time for indexing operations to complete and for the cluster to return to steady state.
  3. Run the migration script to migrate existing data.
  4. Deploy new code that uses the new databases, design documents, indexes, and views.
  5. Run the migration script again after deployment is complete to catch “in-flight” data.
  6. Analyze the migration output for errors and verify that the migration is complete. If necessary, re-run the migration script.

Documenting and following the same template when performing the migration across the various environments (development, test, etc…) can also help catch environmental issues or problems with the scripts before migrating the production environment.

A Few Last Words

Overall, migration scripts are a powerful method for performing updates across collections in a Cloudant database. While NoSQL document-based storage may lack the built-in bulk-operation functionality of SQL, migration scripts let you go beyond traditional ALTER statements and make multiple, extensive updates with relative ease.

Author’s Note

This article is a collection of lessons from our experience developing and supporting an enterprise, multi-tenant SaaS service using Cloudant on IBM Bluemix. The team members were Tim Burns, James Jong, Matt Margolis, and Jorge Padilla.
