It’s time that businesses took a good, hard look at the way they manage their cloned database environments. For years the words “we need a fresh copy of production in QA” were the universal sign of a horrible day. The storage, server resources, and time required to provision a new database environment were often unacceptable, or at the least inconvenient.
Thanks to virtual machines, server resources have become far easier to provision. Within minutes a fully available virtual server can be ready to go, and disk resources can often be provisioned along with the virtual machine. But a clone of the database still has to be made. The clone operation requires a significant amount of storage for the new database, for the intermediary backup taken to create the clone (in some cases), and for the archive logs needed to bring the cloned database up to the proper point in time. Even worse, the clone takes time.
Developers, project managers, even business analysts discuss the requirements for new deployments. Meetings take place with DBA teams and other operational resources; decisions are made about how to proceed with QA and DEV resources. And no matter what brand of fancy snapshotting or cloning technology your storage solution might offer, no matter what kind of virtual machine environment you have, time has to be taken and resources expended to make the clone happen.
This situation happens often at many organizations, sometimes as much as once per week. Disk space and time are wasted, making it more difficult to resolve critical issues or bring new development to market.
Enter Database Virtualization
Database virtualization completely changes the way database clones are created and revolutionizes the way a business perceives new environment provisioning. It does this not by virtualizing servers in the way popular options like VMware ESX and Oracle VM do, but by virtualizing the very data that you are accessing.
An Oracle environment is made up of two parts: the Instance and the Database. The instance is formed of the running processes that make up Oracle’s software stack and segments of data in RAM. The database is nothing more than files on disk. Datafiles, redo logs, and control files are just blocks of data written to disk which serve no real purpose without an Oracle instance to access them.
In a traditional environment, cloning a database requires a new Oracle instance and a complete copy of the source database. This must occur for every clone your application needs. In a typical application stack, you will have databases for production, development, and quality assurance (QA). In some environments you might add user acceptance testing (UAT), regression testing, and reporting as well. Each of these environments will need its own instance and copy of the database.
Cloning with a Virtualization Appliance
The virtualized database, on the other hand, requires only one actual copy of the source database. That copy needs to be taken only once, to the virtualization server via RMAN APIs.
After the database has been cloned a single time into the virtualization appliance, it will be kept up to date with either redo/archive logs or level 1 incremental backups. Redo can be pulled from the source database near real-time to keep the virtualization appliance as close as possible to the source. If a more staggered clone is required, archive logs or level 1 backups can provide incremental updates on whatever schedule you need. We will get back to the change data later.
With database virtualization, that single copy of the source database can be used to provide data for multiple cloned environments.
The disk savings in this configuration alone are tremendous because a new copy of the production database is not required for every single system you create. In the case of a 10TB database with Development, QA, and UAT clones you would save roughly 50% of the total disk space: 40TB originally (production 10TB + QA 10TB + Dev 10TB + UAT 10TB) down to just 20TB (production 10TB + one shared copy 10TB). Beyond this, ZFS or DxFS compression on the appliance shrinks the requirement even further.
In the case of the 10TB database with three clones, the overall size would be somewhere around 13.3TB (production 10TB + one clone 3.3TB). Each instance you provision can attach to the compressed virtualized data via NFS and perform normal read/write operations like a full database copy without the extra overhead.
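The storage figures above can be reproduced with a few lines of arithmetic. This is only a sketch: the function names are made up for illustration, and the 3:1 compression ratio is an assumption inferred from the 10TB to 3.3TB figure in the text, not a guaranteed result. It also ignores the blocks that individual clones change after provisioning.

```python
def traditional_storage_tb(source_tb: float, clones: int) -> float:
    """Production plus one full physical copy per clone."""
    return source_tb * (1 + clones)


def virtualized_storage_tb(source_tb: float, compression_ratio: float = 1.0) -> float:
    """Production plus the single shared copy held on the appliance.

    The number of clones does not appear here because every clone reads
    from the same shared copy (blocks changed by clones are ignored).
    """
    return source_tb + source_tb / compression_ratio


source = 10.0  # TB, the example database from the text
clones = 3     # Development, QA, UAT

full_copies = traditional_storage_tb(source, clones)
shared = virtualized_storage_tb(source)
# Assumed 3:1 compression, inferred from the 3.3TB figure above.
shared_compressed = virtualized_storage_tb(source, compression_ratio=3.0)

print(f"traditional:         {full_copies:.1f} TB")
print(f"shared copy:         {shared:.1f} TB ({1 - shared / full_copies:.0%} saved)")
print(f"shared + compressed: {shared_compressed:.1f} TB ({1 - shared_compressed / full_copies:.0%} saved)")
```

Adding a fourth or fifth clone changes only the traditional total, which is the point of the comparison.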
Even more impressive benefits
While the disk savings alone are tremendous, the real benefits come in the form of time. Since the source data is already copied and all that is required is an instance to mount the virtual database files, you can create a full-size read/write clone in about 5 minutes.
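One way to picture why provisioning copies no data is a copy-on-write model, sketched below. The article does not describe the appliance's internals, so this is an assumption based on how snapshot-capable filesystems such as ZFS generally behave; the class names `SharedCopy` and `VirtualClone` are hypothetical and the "blocks" are toy values.

```python
class SharedCopy:
    """The single copy of the source database held by the appliance (read-only)."""

    def __init__(self, blocks: dict[int, bytes]):
        self._blocks = dict(blocks)

    def read(self, block_id: int) -> bytes:
        return self._blocks[block_id]


class VirtualClone:
    """A read/write clone that stores only the blocks it has changed."""

    def __init__(self, base: SharedCopy):
        self._base = base
        self._changed: dict[int, bytes] = {}

    def read(self, block_id: int) -> bytes:
        # Prefer the clone's private version, fall back to the shared copy.
        if block_id in self._changed:
            return self._changed[block_id]
        return self._base.read(block_id)

    def write(self, block_id: int, data: bytes) -> None:
        # Writes never touch the shared copy, so other clones are unaffected.
        self._changed[block_id] = data

    def private_blocks(self) -> int:
        return len(self._changed)


base = SharedCopy({0: b"header", 1: b"orders", 2: b"customers"})
dev = VirtualClone(base)  # "provisioning" copies no data
qa = VirtualClone(base)

dev.write(1, b"orders-v2")
print(dev.read(1))           # dev sees its own change
print(qa.read(1))            # qa still sees the shared block
print(dev.private_blocks())  # storage used by dev beyond the shared copy
```

In this model the cost of a new clone is proportional to what it changes, not to the size of the source.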
Your target can be any Unix or Linux system, physical or virtual, that is linked to the virtualization appliance and has Oracle installed. With a few mouse clicks on the administration UI a clone can be completely provisioned from start to finish for near-instant access.
As noted in the previous section the virtualization appliance also takes change data in the form of redo/archive logs or level 1 backups from the production environment. Unlike a standby database where this data is consumed in order to have a single clone kept up to date, the change data in the data virtualization environment is kept for a user-specified retention period so clones can be provisioned from any point in time. Each clone can come from a different time, so you can have multiple clones of the same source database from different time windows with no additional overhead.
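The point-in-time idea can be sketched as a small planning step: pick the latest retained snapshot at or before the requested time, then replay change data from there. This is an illustration only. The function `plan_provision`, the daily snapshot schedule, and the 14-day retention period are all assumptions, not the appliance's actual interface.

```python
from bisect import bisect_right
from datetime import datetime, timedelta


def plan_provision(target, snapshots, now, retention):
    """Return (base_snapshot, replay_span) for a clone as of `target`.

    `snapshots` is a sorted list of datetimes at which the appliance holds
    a consistent copy. Raises ValueError if `target` cannot be reached.
    """
    if not (now - retention <= target <= now):
        raise ValueError("target is outside the retention window")
    idx = bisect_right(snapshots, target)
    if idx == 0:
        raise ValueError("no snapshot at or before target")
    base = snapshots[idx - 1]
    # Change data (redo/archive logs) covers the gap from base to target.
    return base, target - base


now = datetime(2024, 1, 31, 12, 0)
retention = timedelta(days=14)  # hypothetical user-specified retention
snapshots = [datetime(2024, 1, 18) + timedelta(days=i) for i in range(14)]

base, replay = plan_provision(datetime(2024, 1, 25, 9, 30), snapshots, now, retention)
print(f"start from {base}, replay {replay} of change data")
```

Two clones requested for different times simply get different plans against the same retained data, which is why no additional copy is needed per time window.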
Lastly, there is a strong performance benefit when using a virtualization appliance thanks to shared block caching between cloned environments. This benefit is described in detail in a joint performance study by Delphix and IBM titled “A High Performance Architecture for Virtual Databases.”
Putting Virtual Databases To Use
Because the virtual database technology so drastically changes the way we think about cloning and provisioning new database environments, a myriad of use cases can also be envisioned that can take advantage of the saved storage and time. Additionally, these use cases fit perfectly into the model for accelerated development and management of a database environment.
Application development is the area where the biggest impact of data virtualization can be felt. Because the development environment is the first and most critical stage of application release, the benefits are widely felt and cascade into future stages of development.
With virtual databases, each developer on a team can have a complete read/write copy of the database from any point in time (software revision) they require. The team as a whole can also share in a single fully deployed clone of production. By allowing developers to work with the entire database on a personal and collaborative level, the development process becomes faster and more accurate. Performance issues arising from inefficient queries can be found and identified much faster and results from the database will be far more realistic.
Virtual databases can be provisioned to any point in time, making them a perfect method of source control during development and testing. Different instances can be created on the same or different machines (physical or virtual) to keep track of the various branches of development. In the event that development moves forward and requires fresh data from production, that data can be refreshed or newly provisioned in a matter of minutes without additional storage requirements.
By implementing a centralized virtualization appliance for development, every developer can have a copy of the production (or QA) database at any stage that they wish. New deployments can be created with a few mouse clicks. Any system on the network running Linux or UNIX is a potential target – even the developer’s PC.
QA, UAT, Regression, and Bug Fix
During project development, QA often gets pushed back or disregarded completely. This practice is dangerous, but is usually the result of inadequate resources to perform proper tests. Modern development methodologies also add new complexity to the mix. Instead of a traditional DEV/QA/PROD configuration, there are now requirements for User Acceptance Testing (UAT), regression testing, stress testing, and bug fix environments in many situations. Time resources that were already lacking become nearly impossible to manage, depending on the business and development requirements.
Being able to instantly clone either production or development databases to a new QA instance with all data intact (from any point in time) changes the game entirely. One or more QA environments can be provisioned at any stage of development or after development is complete. Timing of the creation is no longer an issue since database virtualization allows multiple copies of the source database from different points in time.
More importantly, the effects of QA or stress testing no longer mean the destruction of the QA database environment. With traditional cloning, the QA process was burdensome because of the actual purpose of QA: to break it. Destructive tests can be run on a virtual database without consequence because a new one can be quickly and easily provisioned; alternatively, a new one can be pre-built just as easily.
Running QA tests in parallel will also improve efficiency during the release cycle. In a traditional environment it is often difficult to meet multiple testing requirements at the same time; for instance, a critical bug requiring a fix could completely disrupt or reset regression test efforts. Using a virtual database allows multiple systems to be provisioned as necessary at any stage of the QA process, and on demand in the event of an issue. Adding a new environment does not diminish the abilities of another.
Reporting and Recent Data Retrieval
Another area that is often overlooked because of its size and complexity is reporting. Many organizations end up running reports directly against the production database. This workload usually goes untuned and is highly dangerous to production environments.
If, however, a reporting system can be instantly built and deployed at any time, the issue becomes nonexistent. Reporting environments can be spun up at a moment’s notice, used with any reporting tool compatible with Oracle, and then decommissioned as soon as the reports are completed. If a report was missed or further information is needed, the reporting environment can be instantly spun up again, either to the same point in time as the previous virtual database or with fresh data from the source system. It is like TiVo for databases, but with more options and more parallel capabilities.
In a traditional setup, a reporting environment that is fully read/writeable and can be refreshed with current data within minutes is almost unthinkable. This type of provisioning capability makes end of month, quarter, and year financial reporting a simple task. It also enables new initiatives such as data discovery and data mining, projects that would previously have required massive overhauls to storage and system configurations.
There are also times when deleted or historical data from a recent period must be retrieved, either to satisfy a customer service request or to fix a fat-finger mistake by support or development staff. Instead of using database flashback capabilities, or retrieving an entire export or RMAN backup of the database and then rolling forward to find the right point in time, a new clone can be instantly provisioned, queried, and decommissioned. Since the virtual database is a fully capable Oracle instance, you can even create database links back to the source to repopulate the missing data. These links can also be used for comparison reporting.
Backup and Recovery
Beyond the logical recovery capabilities described in the last section where the virtual database appliance can act as a time machine for your data, physical backup capabilities are also enhanced.
The virtual database environment already acts as a backup with its initial RMAN backup and incremental change collection. The difference is that, with the compression offered in ZFS or DxFS (in the case of Delphix), you can store 50 days of backups in the same space as a single physical backup.
Restore testing becomes incredibly simple with a database virtualization appliance as your backup and restore hub. By backing up multiple source databases into the appliance, you can manage and provision backups from any source system from any point in time to any target environment. That sort of freedom of choice and speed of operation makes for lightning fast crisis resolution. In fact, if an issue is proactively noticed it is possible to have the clone set up before the production system even fails (no matter how urgent the upcoming failure) which means it can be used for near-instant recovery with extremely minimal downtime.
The ability to restore from any point in time within the retention window of the database virtualization appliance also means you can use it to configure staggered clones for pinpointing the point in time of logical corruption issues. It can also be useful for provisioning an alternative to a delayed Data Guard standby. Since the database environment can be refreshed quickly and easily, having a system that is constantly a day behind the production environment is trivial and can be done without overhead.
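Pinpointing when logical corruption first appeared lends itself to a binary search over time. The sketch below assumes the corruption persists once introduced, and uses a stand-in callback `is_corrupt_at` for the real work of provisioning a clone at a given time, running a check query against it, and decommissioning it. Both the function name and the callback are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Callable


def find_corruption_time(
    good: datetime,
    bad: datetime,
    is_corrupt_at: Callable[[datetime], bool],
    resolution: timedelta = timedelta(minutes=1),
) -> datetime:
    """Narrow down when corruption first appeared.

    `good` is a time known to be clean and `bad` a time known to be corrupt.
    Each call to `is_corrupt_at(t)` represents one short-lived clone
    provisioned as of time t.
    """
    while bad - good > resolution:
        mid = good + (bad - good) / 2
        if is_corrupt_at(mid):
            bad = mid   # corruption is already present at mid
        else:
            good = mid  # still clean at mid
    return bad


# Simulated check: in practice this would query a freshly provisioned clone.
actual = datetime(2024, 1, 20, 14, 37)
probes = []


def simulated_check(t: datetime) -> bool:
    probes.append(t)
    return t >= actual


found = find_corruption_time(datetime(2024, 1, 17), datetime(2024, 1, 24), simulated_check)
print(f"corruption appeared no later than {found} ({len(probes)} clones checked)")
```

Because each step halves the window, a seven-day range narrows to one minute in 14 checks, which is only practical when each clone takes minutes rather than hours to create.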
Patching, Performance Testing, and Forensics
Of course, these abilities make it almost irresistible to consider the forensics opportunities available when there are issues on any database environment. In a traditional deployment environment there are many times that issues have to be labeled as minor and lumped together for future analysis simply because the resources to make a clone for every small issue were not available. But if a clone of any database can be created within minutes without additional overhead the entire reason for not resolving issues is turned upside down. Server and time constraints are eliminated as a bottleneck, allowing work to take place whenever it is required.
To take forensics to the next level, imagine a situation where a production database is encountering issues shortly after the release of a new software patch (let’s call it Release 2, naturally following Release 1):
- Create a clone of the production database as-is to instance REL2
- Create a clone of the production database as it was prior to the release on instance REL1
- Create a clone of the production database for testing fixes for the performance issue called REL2-FIX
- Perform comparison tests on all three environments simultaneously to observe differences
- Each virtual database will have its own X$, V$, and DBA_ views with individual metrics for easier comparison
- Speed differences can be seen in real time with parallel execution across the three instances
- System level SQL tracing can be enabled on each system as tests are performed with no overlap in results
- AWR snapshots can be transferred between environments for comparisons with the AWR Diff tool ($ORACLE_HOME/rdbms/admin/awrddrpt.sql)
- Application code can be pointed to each environment in turn to allow better testing of code changes without having to roll forward and back or involve production in any way
- Once the problem is resolved, the virtual databases can be decommissioned and recreated instantly if other problems arise.
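The comparison step in the list above can be sketched as a simple ratio against the pre-release baseline. Everything here is illustrative: the function `compare_to_baseline`, the query names, and the timing values are made-up placeholders rather than measurements, and in practice the numbers would come from each instance's own V$ views, SQL traces, or AWR data.

```python
def compare_to_baseline(timings, baseline="REL1", threshold=1.5):
    """Flag queries that run more than `threshold` times slower than baseline.

    `timings` maps environment name -> {query name: elapsed seconds}.
    Returns a list of (environment, query, ratio) tuples.
    """
    base = timings[baseline]
    regressions = []
    for env, results in timings.items():
        if env == baseline:
            continue
        for query, seconds in results.items():
            if query not in base or base[query] == 0:
                continue  # nothing to compare against
            ratio = seconds / base[query]
            if ratio > threshold:
                regressions.append((env, query, ratio))
    return regressions


# Placeholder numbers for illustration only.
timings = {
    "REL1": {"order_lookup": 0.8, "monthly_rollup": 12.0},
    "REL2": {"order_lookup": 0.9, "monthly_rollup": 48.0},
    "REL2-FIX": {"order_lookup": 0.9, "monthly_rollup": 13.0},
}

for env, query, ratio in compare_to_baseline(timings):
    print(f"{env}: {query} is {ratio:.1f}x slower than REL1")
```

With all three clones available at once, the same test run feeds every column of the comparison, so there is no need to rebuild one environment to measure another.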
The point in time recovery capabilities of virtual databases also make it possible to do easy before/after comparisons in performance without sacrificing the before environment to build the after. This is frequently a problem both for application level changes (queries, table changes, etc.) and for Oracle level changes.
For instance, if the production database is running one Oracle patch set release and a move to a newer one is desired, it is possible to create a virtual database clone and apply the patch on the clone without any effect on the source database environment. Since the clone operation itself only takes a few minutes, the major bottleneck in patch testing is removed. If problems are encountered on the newly patched virtual database, then performance can be compared between it and the source, or a new virtual database from the source can be spun up and tested against. Either operation can be done immediately without increasing your storage footprint.
Every benefit and situation detailed here can be achieved with only a single virtualization appliance. The implications of virtual databases are far reaching yet quite honestly in their infancy. At present, this technology presents amazing new capabilities in a variety of use cases. Just as virtual machines changed the way we consider servers, virtual databases will change the way we provision and deploy database environments in nearly every situation.