Moving to the cloud is all the rage. According to an IDC Survey Spotlight, Experience in Migrating Databases to the Cloud, 63% of enterprises are actively migrating their databases to the cloud, and another 29% are considering doing so within the next three years.
This article discusses some of the risks customers may unwittingly encounter when moving their database to a database as a service (DBaaS) in the cloud, especially when the DBaaS leverages open source database software such as Apache Cassandra, MariaDB, MySQL, Postgres, or Redis. At EDB, we classify these risks into five categories: support, service, technology stagnation, cost, and lock-in. Moving to the cloud without sufficient diligence and risk mitigation can lead to significant cost overruns and project delays, and more importantly, may mean that enterprises do not get the expected business benefits from cloud migration.
Because EDB focuses on the Postgres database, I will draw the specifics from our experiences with Postgres services, but the conclusions are equally valid for other open source database services.
Support risk. Customers running software for production applications need support, whether they run in the cloud or on premises. Support for enterprise-level software must cover two aspects: expert advice on how to use the product correctly, especially in challenging circumstances, and quickly addressing bugs and defects that impact production or the move to production.
For commercial software, a minimal level of support is bundled with the license. Open source databases don’t come with a commercial license, so no such support is included by default. This opens the door for a cloud database provider to create and operate a database service without investing sufficiently in the open source community to address bugs and provide support.
Customers can evaluate a cloud database provider’s ability to support their cloud migration by checking the open source software release notes and identifying team members who actively participate in the project. For example, for Postgres, the release notes are freely available, and they name every individual who has contributed new features or bug fixes. Other open source communities follow similar practices.
Open source cloud database providers that are not actively involved in the development and bug fixing process cannot provide both aspects of support—advice and rapid response to problems—which presents a significant risk to cloud migration.
Service risk. Databases are complex software products. Many users need expert advice and hands-on assistance to configure databases correctly to achieve optimal performance and high availability, especially when moving from familiar on-premises deployments to the cloud. Cloud database providers that do not offer consultative and expert professional services to facilitate this move introduce risk into the process. Such providers ask the customer to assume the responsibilities of a general contractor and to coordinate between the DBaaS provider and potential professional services providers. Instead of having a single entity to consult for a seamless deployment with the required performance and availability levels, customers get caught in the middle, coordinating between vendors and resolving issues among them.
Customers can reduce this risk by making sure they clearly understand who is responsible for the overall success of their deployment, and that this entity is indeed in a position to execute the entire project successfully.
Technology stagnation risk. The shared responsibility model is a key component of a DBaaS. While the user handles schema definition and query tuning, the cloud database provider applies minor version updates and major version upgrades. Not all providers are committed to upgrading in a timely manner, and some can lag significantly. At the time of this writing, one of the major Postgres DBaaS providers lags the open source community by almost three years in its deployment of Postgres versions. While DBaaS providers can selectively backport security fixes, delayed adoption of new releases can leave customers without new database capabilities, sometimes for years. Customers need to inspect a provider’s historical track record of applying upgrades to assess this exposure.
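One way to make a provider's upgrade track record concrete is to tabulate, for each major version, the gap between the community release date and the date the provider made that version available. The following is a minimal sketch of that calculation, not a definitive tool. The function names are hypothetical, and the dates are placeholders chosen for illustration; real dates would come from the open source project's release history and the provider's own release announcements.

```python
from datetime import date


def version_lag_days(community_release: date, provider_available: date) -> int:
    """Days between a community release and its availability on the DBaaS.

    Returns 0 if the provider offered the version on or before release day.
    """
    return max((provider_available - community_release).days, 0)


def lag_in_years(days: int) -> float:
    """Convert a lag in days to years, rounded to one decimal place."""
    return round(days / 365.25, 1)


# Placeholder history (hypothetical labels and dates, for illustration only):
# (version label, community release date, date offered by the provider)
history = [
    ("major A", date(2020, 1, 1), date(2020, 7, 1)),
    ("major B", date(2021, 1, 1), date(2022, 1, 1)),
]

for label, released, offered in history:
    days = version_lag_days(released, offered)
    print(f"{label}: {days} days (~{lag_in_years(days)} years)")
```

A lag that grows from one major version to the next is a stronger warning sign than a single late release.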
A similar risk is introduced when a proprietary cloud database provider tries to create their own fork or version of well-known open source software. Sometimes this is done to optimize the software for the cloud environment or address license restrictions. Forked versions can deviate significantly from the better-known parent or fall behind the open source version. Well-known examples of such forks or proprietary versions are Aurora Postgres (a Postgres derivative), Amazon DocumentDB (with MongoDB compatibility), and Amazon OpenSearch Service (originally derived from Elasticsearch).
Users need to be careful when adopting cloud-specific versions or forks of open source software. Capabilities can deviate over time, and the cloud database provider may or may not adopt the new capabilities of the open source version.
Cost risk. Leading cloud database services have not experienced meaningful direct price increases. However, there is a growing understanding that the nature of cloud services can drive significant cost risk, especially when self-service and rapid elasticity are combined with an opaque cost model. In on-premises environments, database administrators (DBAs) and developers must optimize code to achieve performance with the available hardware. In the cloud, it can be much more expedient to ask the cloud provider to increase provisioned input/output operations per second (IOPS), compute, or memory to optimize performance. Because each such increase drives up recurring cost, this kind of short-term fix is likely to have long-lasting negative cost impacts.
Users can mitigate the cost risk in two ways: (1) close supervision of the increases of IOPS, CPU, and memory to make sure they are balanced against the cost of application optimization; (2) scrutiny of the cost models of DBaaS providers to identify and avoid vendors with complex and unpredictable cost models.
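The trade-off in point (1) can be sketched as a simple break-even calculation: a resource increase adds a recurring monthly charge, while application optimization is roughly a one-time cost. The sketch below is illustrative only. The function names and all figures are hypothetical, and it assumes a flat monthly charge and an optimization that fully removes the need for the extra resources, which real workloads and real pricing may not satisfy.

```python
import math


def break_even_months(optimization_cost: float, extra_monthly_cost: float) -> float:
    """Months after which paying for extra resources exceeds a one-time optimization.

    Returns infinity when the resource increase adds no recurring cost.
    """
    if extra_monthly_cost <= 0:
        return math.inf
    return optimization_cost / extra_monthly_cost


def cumulative_extra_cost(extra_monthly_cost: float, months: int) -> float:
    """Total added spend from a resource increase over a number of months."""
    return extra_monthly_cost * months


# Hypothetical figures in an arbitrary currency unit, for illustration only.
optimization_cost = 12_000.0   # one-time engineering effort to tune the application
extra_monthly_cost = 1_500.0   # added monthly charge for more IOPS, CPU, or memory

months = break_even_months(optimization_cost, extra_monthly_cost)
print(f"Break-even after {months:.1f} months")
print(f"Added spend over 36 months: {cumulative_extra_cost(extra_monthly_cost, 36):,.0f}")
```

If the application is expected to run longer than the break-even period, the one-time optimization is the cheaper option under these assumptions.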
Lock-in risk. Cloud database services can create a “Hotel California” effect, where data cannot easily leave the cloud again, in several ways. While data egress cost is often mentioned, general data gravity and the integration with other cloud-specific tools for data management and analysis are more impactful. Data gravity is a complex concept that, at a high level, holds that once a business data set is available on a cloud platform, more applications will likely be deployed using the data on that platform, which in turn makes it less likely that the data can be moved elsewhere without significant business impact.
Cloud-specific tools are also a meaningful driver for lock-in. All cloud platforms provide convenient and proprietary data management and analysis tools. While they help derive business value quickly, they also create lock-in.
Users can mitigate the cloud lock-in effect by carefully avoiding the use of proprietary cloud tools and by making sure they only use DBaaS solutions that support efficient data replication to other clouds.
Planning for risk. Moving databases to the cloud is undoubtedly a target for many organizations, but doing so is not risk-free. Businesses need to fully investigate and understand potential weaknesses of cloud database providers in the areas of support, services, technology stagnation, cost, and lock-in. While these risks are not a reason to shy away from the cloud, it’s important to address them up front, and to understand and mitigate them as part of a carefully considered cloud migration strategy.
This content was produced by EDB. It was not written by MIT Technology Review’s editorial staff.