Seven challenges of Oracle to PostgreSQL migration and how to address them

Dan Lindsay
Driven by their move to the cloud, many organisations are choosing to migrate from on-premises Oracle databases to open-source alternatives with the potential to lower Total Cost of Ownership (TCO), such as PostgreSQL. In many cases, this is a complex undertaking with notable delivery risk.
The work involved in performing the migration can be thought of in two parts:
- Modifying the application in such a way that it can run against PostgreSQL (Application Refactor)
- Creating a reliable and auditable migration process to transfer the data (Trials)
In this blog, we will outline the key challenges faced when moving to PostgreSQL and how best to address them. Specifically, we will be exploring migration to AWS-hosted services – Amazon Aurora PostgreSQL or Amazon RDS for PostgreSQL. The key takeaways are summarised below:
- Timelines: Take a DevOps approach, introduce automation, and use a trial-based approach
- Costs: Calculate a Total Cost of Ownership using the AWS pricing calculator, include IOPS and future data egress
- Performance: Use Performance Insights throughout trials, and tune target schema indexes and application queries
- Data Governance: Use production-like data as soon as you are able
- Networking: Improve latency to AWS with Direct Connect, and use latency to plan migration phasing
- Reconciliation: Perform reconciliation throughout trials, and automate execution and reporting
- Skills Gap: Upskill teams early and build a repeatable and observable migration process
1) Timelines

The work needed to upgrade an application to support the PostgreSQL dialect can be significant: a large refactor may take several months, or even more than a year. Significant refactoring is generally necessary if the application has a large database feature footprint or is heavily integrated at the database layer.
Commercial Off-The-Shelf (COTS) applications may have support for PostgreSQL built in, which reduces the work involved. However, be mindful that it may still be necessary to address new bugs or performance issues flushed out by the change of database platform. If PostgreSQL is not supported by the application vendor and this is not on the roadmap, it may be necessary to use RDS Oracle or Oracle on EC2.
A migration trial is a single execution of a 'mock' migration, where data is transferred from the legacy database to the new. It is common for just a handful of trials to be planned at the start of a project, but each migration trial is an opportunity for learning and lowers the risk associated with a go-live event. For this reason, try to plan for multiple trials in each environment and invest in methods that reduce the overhead of each trial up front. There is usually a chain of activities making up each trial such as creating the target schema, data transfer, and reconciliation. If there is a significant amount of data to migrate, it can be a great advantage to let the migration run un-attended. By upholding DevOps principles and automating as much of the trial as possible, the cost of each trial is reduced.
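The chain of trial activities can be scripted so a trial runs unattended from start to finish. The sketch below is a minimal, hypothetical pipeline runner; the step names and commands are placeholders, to be replaced with your own schema-creation DDL, DMS task invocation, and reconciliation tooling.

```python
import subprocess
import time

# Hypothetical trial pipeline: each step name maps to a command.
# The echo commands are placeholders -- substitute your own schema,
# DMS task, and reconciliation tooling.
TRIAL_STEPS = [
    ("create_target_schema", ["echo", "applying DDL to target"]),
    ("transfer_data",        ["echo", "starting DMS task"]),
    ("reconcile",            ["echo", "running reconciliation"]),
]

def run_trial(steps):
    """Run each step in order, timing it and stopping on the first failure."""
    results = []
    for name, cmd in steps:
        start = time.monotonic()
        proc = subprocess.run(cmd, capture_output=True, text=True)
        elapsed = time.monotonic() - start
        results.append((name, proc.returncode, round(elapsed, 2)))
        if proc.returncode != 0:
            break  # fail fast so the trial log shows exactly where it stopped
    return results

if __name__ == "__main__":
    for name, code, secs in run_trial(TRIAL_STEPS):
        print(f"{name}: exit={code} ({secs}s)")
```

Capturing the exit code and duration of every step gives each trial an audit trail for free, which also feeds the go/no-go evidence needed at go-live.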
2) Costs

A significant proportion of RDS cost comes not only from the instance class (CPUs and memory), but from how many IOPS (Input/Output Operations Per Second) will be consumed. Price total resource consumption using the AWS pricing calculator, including any significant data egress traffic such as nightly exports. Remember that multi-AZ deployments in AWS will double the instance cost.
Since on-premises instances must be provisioned for peak load, they tend to be over-provisioned. Establish utilisation by obtaining AWR reports for each environment that cover those periods of peak load. These metrics appear in the 'Load Profile' section of the Report Summary.
AWS do not charge for data ingress, so the cost of a trial is generally low unless moving substantial amounts of data from another cloud provider. If AWS Data Migration Service (DMS) is used for data transfer (also known as heterogeneous data migration), it will be necessary to provision a replication instance which is billed similarly to EC2. Generally, this is a small additional outlay in comparison to RDS costs, but again be aware that the multi-AZ instances recommended for production will cost double.
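To make the cost drivers above concrete, here is a back-of-envelope sketch of a monthly estimate. All the prices and workload figures are hypothetical placeholders; use the AWS pricing calculator for real numbers.

```python
# Rough monthly cost sketch for a provisioned-IOPS RDS deployment.
# All prices below are HYPOTHETICAL placeholders -- always use the
# AWS pricing calculator for real figures.
INSTANCE_PER_HOUR = 2.00   # hypothetical on-demand price for the instance class
IOPS_PER_MONTH    = 0.10   # hypothetical price per provisioned IOPS-month
EGRESS_PER_GB     = 0.09   # hypothetical data-egress price per GB

def monthly_cost(hours=730, provisioned_iops=12000,
                 egress_gb=500, multi_az=True):
    instance = INSTANCE_PER_HOUR * hours
    if multi_az:
        instance *= 2          # multi-AZ doubles the instance cost
    storage_io = IOPS_PER_MONTH * provisioned_iops
    egress = EGRESS_PER_GB * egress_gb   # e.g. nightly exports
    return instance + storage_io + egress

print(f"Estimated monthly cost: ${monthly_cost():,.2f}")
```

Even with made-up prices, the shape of the calculation shows why IOPS and egress deserve a line of their own: at these figures they contribute a sum comparable to the instance itself.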
3) Performance

Like Oracle, PostgreSQL is built for performance, but the underlying architecture is substantially different. As a result, queries that perform well on Oracle may initially perform poorly on PostgreSQL. With Oracle, it is possible to provide hints to the query planner. These are sometimes used to force the query planner to use certain indexes or run with a desired degree of parallelism, but the PostgreSQL community have resisted adopting any analogous concept. If query hints are widely in use, expect a bumpy ride as it will be necessary to remediate each query, either by re-writing it, breaking it down into smaller parts, or by adding additional indexes. The latter should be done with care on any tables where commit latency is important.
The impact of poor query execution is generally resource consumption orders of magnitude higher than optimal execution, so a thorough period of Non-Functional Testing (NFT) is required to flush these out. After NFT completes and performance issues are remediated, it may be possible to downscale cloud instances or reduce provisioned IOPS, which will result in savings. Lower IOPS consumption also uses less energy, improving the sustainability of the target state solution.
AWS Performance Insights provides granular visibility of database load, similar to Oracle Enterprise Manager (OEM). It's free with a seven-day retention and enabled by default. We recommend becoming familiar with this tool, as Performance Insights allows you to split load by SQL Statement, User, Host and Wait, all in near real time. Review Performance Insights after running representative load to spot blocking, bottlenecks, and poorly performing queries. It can even be used for troubleshooting failovers, as the detailed SQL reporting provides a picture of load prior to failover.
If using AWS Data Migration Service (DMS), be aware that primary keys are required on all tables in PostgreSQL where Change Data Capture (CDC) will be applied. These are also required for acceptable performance of data validation tasks when data volumes are large. This requirement can be addressed by using a custom schema for migration, which in turn allows for other performance optimisations such as fewer indexes for improved commit latency.
4) Data governance
All organisations are subject to a continued tightening of their data governance requirements. To be confident in the results of a performance test, a representative variety of queries must have been executed against production-like data in a production-like environment. This should include critical queries that execute infrequently to support end of month processing, for example. Consider which data will be used such that security approvals can be submitted ahead of time.
If use of production data is restricted, it will be necessary to explore alternative approaches such as masked or synthetic data. If the application to be migrated has its own data masking routines, these can be used to create datasets that may be suitable for consumption in other environments.
5) Networking

Moving to the cloud is an opportunity to enhance an organisation’s security posture. Many data centres have few restrictions on 'east-west' traffic, and cloud migration is an opportunity to significantly improve on this through the adoption of zero trust networking rules. However, establishing the correct rules to apply can be a time-consuming process. Use a tool such as AWS Application Discovery Service to gain a complete picture of the networking landscape and surface all integrations currently communicating with the database.
Latencies within data centres and AWS regions typically fall under 1ms, whilst connections between on-premises and AWS can be orders of magnitude higher. The knock-on impact of this can be that application response times are degraded to a similar degree. This kind of effect can often be magnified by chatty integrations. To avoid latency or bottlenecks becoming a problem, consider the suitability of Direct Connect. If this is already in place, perform an early POC to establish the latency and use this figure to inform the phasing of migration. For this reason, consider migrating tightly coupled applications in the same wave.
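The amplifying effect of chatty integrations is easy to see with some simple arithmetic. The figures below are illustrative assumptions, not measurements.

```python
# Back-of-envelope sketch: how sequential round trips amplify latency.
# The call count, latencies, and processing time are illustrative assumptions.
def response_time_ms(round_trips, latency_ms, work_ms=50):
    """Total time for a request that makes N sequential database calls."""
    return work_ms + round_trips * latency_ms

CHATTY_CALLS = 200   # a 'chatty' integration making many small queries

on_prem = response_time_ms(CHATTY_CALLS, latency_ms=0.5)   # same data centre
hybrid  = response_time_ms(CHATTY_CALLS, latency_ms=10.0)  # app on-prem, DB in AWS

print(f"on-prem: {on_prem:.0f} ms, hybrid: {hybrid:.0f} ms")
```

In this hypothetical case, a latency increase from 0.5ms to 10ms per call turns a 150ms response into roughly two seconds, which is why tightly coupled applications are best migrated in the same wave.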
6) Reconciliation

As part of a major go-live event, a team of stakeholders will generally meet to provide a go/no go decision. A key input into this decision will be the success of the data transfer, determined by the results of a reconciliation. Offline migrations are easier to reconcile than online migrations where there is continuous movement of data. For online migrations, consider conducting the reconciliation in two parts – once after the initial data load and then once more at the point of go-live.
Although DMS provides a capability to perform row-level validation, there may be reluctance among stakeholders to allow DMS to assert its own success. There is a fair argument to bring in another tool for this and the best choice is likely to come down to the criticality of the data.
Whichever approach is taken, ensure that the reconciliation process becomes part of the trials. Not only will this surface any data integrity issues early in the project, but it will also ensure that the process of performing the reconciliation is smooth and does not create delays during go live.
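As one illustration of a lightweight, tool-independent check, the sketch below compares a row count and an order-independent content checksum per table. sqlite3 stands in for both databases here purely so the example is self-contained; in practice you would connect to Oracle and PostgreSQL, and for large tables you would push the hashing into SQL rather than fetch every row.

```python
import sqlite3
import hashlib

# Minimal reconciliation sketch: compare row counts plus an order-independent
# checksum per table. sqlite3 stands in for the source and target databases.

def table_fingerprint(conn, table):
    """Row count and a checksum over all rows, independent of row order."""
    rows = conn.execute(f"SELECT * FROM {table}").fetchall()
    digest = hashlib.sha256()
    for row in sorted(repr(r) for r in rows):
        digest.update(row.encode())
    return len(rows), digest.hexdigest()

def reconcile(source, target, tables):
    """Return the list of tables whose fingerprints differ."""
    return [t for t in tables
            if table_fingerprint(source, t) != table_fingerprint(target, t)]

# Demo with two in-memory databases standing in for legacy and target
src, tgt = sqlite3.connect(":memory:"), sqlite3.connect(":memory:")
for db in (src, tgt):
    db.execute("CREATE TABLE accounts (id INTEGER, balance INTEGER)")
src.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 250)])
tgt.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 250)])

print(reconcile(src, tgt, ["accounts"]))  # an empty list means the tables match
```

Automating a check like this, and emitting its result as a report at the end of every trial, means the go-live reconciliation is simply the final run of a well-rehearsed process.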
7) Skills gap
A successful migration to PostgreSQL requires deep technical skills and knowledge of the new tools that will be introduced as part of the migration. As stated previously, Oracle and PostgreSQL have very different core architectures. The first application to be migrated will always incur an additional overhead as teams adjust and upskill. If a separate team will support the application in production, consider how they can be brought in early for knowledge transfer.
It goes without saying that together, the AWS and PostgreSQL documentation provide a wealth of useful knowledge to address most questions or problems. If it becomes necessary to escalate issues to AWS support, having a repeatable and observable migration process should make providing the necessary information straightforward.
As detailed above, the complexity of migrating to PostgreSQL depends on a number of factors unique to your organisation and its applications. By selecting the right tools and approach, you can lay the foundations for success.