Delphix & the EU General Data Protection Regulation (GDPR)


GDPR Observations

Last week I was reading a CMS Wire article by Brian Wallace, titled Who’s Ready for the GDPR? [Infographic], and found a few of the data points cited eye-catching.

On May 25, 2018 the GDPR goes into effect and according to the embedded infographic…

Note: direct quotes from the infographic are cited in italic.
  1. The GDPR requires that all EU Citizen data [i.e. Sensitive & Personal] be protected as stipulated in the final text of the regulation, even if the data lives outside of the EU.
    • Personal data: Name, location, identification numbers, IP address, cookies, RFID info
    • Sensitive personal data: Health data, genetic data, biometric data, racial or ethnic data, political opinions, sexual orientation
  2. 92% of U.S. businesses list GDPR as a top data protection priority
    • 77% of U.S. businesses have started preparing for GDPR, but only 6% are GDPR-ready
      • The low readiness percentage is consistent with my experience working alongside data owners at major U.S. corporations
  3. In addition to protecting EU citizens’ data, there are other services a custodian of that data must provide. Some of these include:
    • EU citizens have the right to access their data as well as information about how it is being used
    • EU citizens can take their data to a different agency upon request
    • EU citizens have the right to data erasure
    • Certain companies and governmental organizations must appoint a Data Protection Officer
    • Companies must implement reasonable data protection measures
    • Companies must assess for threats
  4. Noncompliance with the GDPR will be costly. Top tier fines are set at €20 million or 4 percent of global annual turnover, whichever is greater
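
The "whichever is greater" rule in item 4 is easy to misread, so here is a minimal sketch of the arithmetic. The function name and the sample turnover figures are made up for illustration; only the €20 million and 4 percent thresholds come from the text above.

```python
def top_tier_fine_cap_eur(global_annual_turnover_eur: int) -> float:
    """Return the top-tier cap: the greater of EUR 20 million or 4% of turnover."""
    fixed_cap_eur = 20_000_000
    turnover_based_cap_eur = global_annual_turnover_eur * 4 / 100
    return max(fixed_cap_eur, turnover_based_cap_eur)


# Hypothetical company with EUR 100 million turnover: 4% is only EUR 4 million,
# so the EUR 20 million figure applies.
smaller_company_cap = top_tier_fine_cap_eur(100_000_000)

# Hypothetical company with EUR 2 billion turnover: 4% is EUR 80 million,
# which exceeds EUR 20 million.
larger_company_cap = top_tier_fine_cap_eur(2_000_000_000)

print(smaller_company_cap, larger_company_cap)
```

The two figures cross over at €500 million of turnover; above that, the percentage-based figure is the one that matters.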

What are the challenges specific to a Data Protection Officer?

The challenges are the same ones faced by CIOs and CDOs in major corporations today: to secure sensitive and personal data while delivering copies to developers, testers, and analysts in an effort to compete at the speed required in the Digital Age. The corporate metrics used to measure success also remain the same – increase revenue, reduce costs, and stay compliant. This sounds reasonable until you evaluate the application services your organization provides and realize that your data is heavy: it is the anchor on which every other task in your process workflow waits.

What I consistently hear from clients includes, but is not limited to:

  1. Slow, complex, masking process workflows which require teams of programmers to maintain the code
    • Integration testing (i.e. maintaining referential integrity across multiple, disparate, databases) adds substantial complexity
    • No concept of a masked master data set where copies can be quickly created
  2. More than one masking toolset and/or process workflow, requiring multiple skillsets and teams
  3. Masked data does not have realistic values substituted into the fields
  4. Too few copies of data sets for developers, testers, and analysts means sharing
    • a corruption introduced by one individual stops everyone from working
  5. Too many copies of the data sets requires a significant amount of storage and time to refresh and manage multiple copies
    • physically impossible to accommodate due to limited capital resources
  6. Teams that subset data to deliver copies faster and reduce storage are simply pushing the problem downstream
    • Developers cannot test end-to-end processes
      • Too often issues are only exposed in production
    • Testers are limited to a small set of test cases
      • Too often defects are found later in QA
    • QA, as told to me by every CIO I speak to, bears the brunt of performing the lion’s share of testing, meaning
      • issues and defects which should be found in Dev and Test are found in QA, typically due to providing stale and/or subset data to Dev and Test
      • the goal for these CIOs is to shift left their testing process workflows so QA can focus on a finite set of product quality testing

The Delphix Masking Engine

Addressing #1-3 (above)

Contrary to most of the masking solutions in the industry today, which are complex, require programmers, and are difficult to manage when changes occur to the data sets, Delphix Masking provides a GUI-based software solution. There are three powerful and easy-to-use components which simplify the core capabilities of an enterprise-class masking tool.

  1. Profile – scan the selected data sets, identify sensitive data, and return a report of elements found along with recommended masking algorithms.
  2. Secure (Mask) – Apply the assigned masking algorithms to their respective elements while maintaining referential integrity; no programming required. Elements will be masked with fictitious, but realistic data substitutions. Once the algorithms are assigned the masking will be consistent and repeatable.
  3. Audit – To ease the demands of maintaining compliance, Delphix provides a report that identifies which sensitive data elements have been protected, thus simplifying delivery to auditors. Audit will also alert admins if new data fields are added which introduce new vulnerabilities.
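
To make "consistent and repeatable" masking with referential integrity concrete, here is a minimal sketch of one common technique: a keyed hash (HMAC) that deterministically picks a realistic substitute from a lookup pool. This is not the Delphix algorithm, which the text above does not describe; the function name, key, substitution pool, and sample tables are all assumptions made for illustration.

```python
import hashlib
import hmac

# Hypothetical substitution pool; a real tool would use far larger lookup sets.
FIRST_NAMES = ["Alice", "Brian", "Carmen", "Deepak", "Elena", "Farid", "Grace", "Hiro"]


def mask_value(value: str, secret_key: bytes, pool: list[str]) -> str:
    """Map a real value to a fictitious one; the same input and key always give the same output."""
    normalized = value.strip().lower().encode("utf-8")
    digest = hmac.new(secret_key, normalized, hashlib.sha256).digest()
    index = int.from_bytes(digest[:8], "big") % len(pool)
    return pool[index]


# Illustrative key and two made-up "tables" that share a value.
key = b"example-key-for-illustration-only"
customers = [{"id": 1, "first_name": "Jonathan"}, {"id": 2, "first_name": "Maria"}]
orders = [{"order_id": 10, "customer_first_name": "Jonathan"}]

masked_customers = [
    {**row, "first_name": mask_value(row["first_name"], key, FIRST_NAMES)}
    for row in customers
]
masked_orders = [
    {**row, "customer_first_name": mask_value(row["customer_first_name"], key, FIRST_NAMES)}
    for row in orders
]
```

Because the substitute depends only on the input value and the key, "Jonathan" is replaced by the same fictitious name in both tables, so joins across them still line up. With a pool this small, different real names can collide on the same substitute; a production algorithm would need a larger pool or a collision-free mapping.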


The Delphix Data Virtualization Engine

Addressing #4-6 (above)

Data virtualization is the complementary capability to masking. Protecting data at rest and in use with masking accommodates regulatory requirements but does nothing to enable your business to ‘go faster’. Why? Because data is still heavy and slow. Delphix Data Virtualization addresses the demands of your business by making data lightweight. What if…

  • you could have a full-size, secure (masked), read-writable copy of any size database in minutes?
  • you could have as many copies of that database as you want or need without additional storage costs?
  • you could provide developers, testers, and analysts with self-service access to their database (or files), including the ability to:
    • reset, rewind, or refresh their database w/o opening a ticket
    • bookmark copies of their database for future reference or share with other teams
    • version control data like teams do for source code

Well, those are not ‘what if’ scenarios but real capabilities found in the Delphix Data Virtualization Engine. The three areas that define how Delphix manages the data virtualization workflow are:

  1. Collect – Delphix attaches to data sources (Databases and Applications) using protocols native to the platform.
  2. Control – By maintaining a unique set of common blocks in Delphix, users experience a 90% savings in non-prod storage. Leveraging the TimeFlow retention log, users can provision copies from any point in time. Masked master copies can be created, from which all other copies can be created in minutes, with certainty about where and how the data was protected and distributed.
  3. Consume – Developers, testers, and analysts can refresh, rewind, restore, bookmark, and share their database(s) and application(s) from any point in time in a matter of minutes versus the hours, days, and weeks required today.
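
The idea of many read-writable copies sharing one set of common blocks can be illustrated with a toy copy-on-write model. This is only a conceptual sketch, not how the Delphix engine is implemented; the class names, the bookmark and rewind methods, and the block sizes are all hypothetical.

```python
import hashlib


class BlockStore:
    """Stores each distinct block once, keyed by its content hash."""

    def __init__(self) -> None:
        self.blocks: dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        self.blocks.setdefault(key, data)  # identical content is stored only once
        return key

    def get(self, key: str) -> bytes:
        return self.blocks[key]


class VirtualCopy:
    """A read-writable copy that holds only pointers into the shared store."""

    def __init__(self, store: BlockStore, block_keys) -> None:
        self.store = store
        self.block_keys = list(block_keys)

    def clone(self) -> "VirtualCopy":
        return VirtualCopy(self.store, self.block_keys)  # copies pointers, not data

    def read(self, index: int) -> bytes:
        return self.store.get(self.block_keys[index])

    def write(self, index: int, data: bytes) -> None:
        self.block_keys[index] = self.store.put(data)  # only this copy is repointed

    def bookmark(self) -> tuple:
        return tuple(self.block_keys)  # immutable snapshot of the pointers

    def rewind(self, bookmark: tuple) -> None:
        self.block_keys = list(bookmark)


store = BlockStore()
source = VirtualCopy(store, [store.put(f"block-{i}".encode()) for i in range(1000)])
dev = source.clone()
qa = source.clone()

mark = dev.bookmark()
dev.write(0, b"changed by dev")
dev_after_write = dev.read(0)
qa_after_dev_write = qa.read(0)   # qa is unaffected by dev's change
dev.rewind(mark)
dev_after_rewind = dev.read(0)    # dev is back to the bookmarked state

logical_blocks = 3 * 1000
stored_blocks = len(store.blocks)
print(f"{logical_blocks} logical blocks served from {stored_blocks} stored blocks")
```

Three full-size copies are served from 1,001 stored blocks: the 1,000 shared originals plus the single block one copy changed. The sketch never frees unreferenced blocks, which a real system would have to handle.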

Delphix Virtualization & Masking Engine(s)


GDPR will commence on May 25, 2018, bringing with it hefty penalties for non-compliance. The level of effort and the impact on every organization is massive in scope and a distraction from the day-to-day development and maintenance of your business services. Delphix provides an enterprise-class solution to accommodate the protection of sensitive and personal data through an easy-to-use, but very powerful, masking solution. Combining masking with data virtualization enables businesses to continue to work securely on business services while adopting new workflow processes to address GDPR.

To learn whether Delphix is the right solution for you, please Contact Us.


Data First Strategy

 If data are the jewels of the company, then companies are handcuffed to their treasure.

The industry is long overdue for a disruptive correction to the biggest problem facing companies today – being data constrained. Specifically, the long ‘wait’ times for delivering databases and datasets that are current, consistent, and secure to your application developers, DBAs, testers, and analysts. Data management in the pre-production space can no longer be treated by CIOs as just one of the components in their IT strategy if they expect to remain relevant competing on business agility, customer affinity, and operational excellence. Data must be considered the first priority around which your enterprise IT strategy is built. Today’s brittle IT infrastructures are incapable of handling the current data demands of the business, and this has a tremendous impact on cost. How much cost? In his paper titled Technology Economics: The “Cost of Data”, Howard Rubin writes: “… did you know that overall 92% of the cost of business – the financial services business – is ‘data’?” According to Rubin, “The next breakthroughs in the cost structure of the banking and financial services technology economic will likely come about through a focus on the efficiencies of data.” So it’s not surprising that, according to a recent news report by Gartner, “By 2015, 25% of Large Global Organizations Will Have Appointed Chief Data Officers (CDO).” But how will CDOs address the data constraint head on? The correction required by organizations will be a Data First Strategy.

Root of the problem?

The industry has provided solutions to virtualize and automate everything in the data center. Well, almost everything, except virtualizing the data (data = databases and files). Are you engaged in some of the latest trends like Agile Development, Cloud, or DevOps? Responsible for assuring data governance and compliance? If so, are you able to deliver datasets, regardless of size, securely and in a matter of minutes to these teams? Can you provide unlimited copies for comparison or regression testing? Offer inherent continuous data protection so datasets can be reset or rewound to a previous point in time? The answer is typically no, and that is the reason a Data First Strategy has been impossible, until now.

What’s needed?

The postal service has matured over time from the Pony Express to locomotives, automobiles, and airplanes to expedite package delivery. In much the same way that postal depots were established to optimize delivery routes, organizations need to plan where corporate data is needed, to expedite not only the delivery of data but also the services dependent on this data. A Data First Strategy is a paradigm shift in the way an organization’s services are created, built, and managed to deliver the right data, to the right teams, at the right time. This strategy relies heavily on the ability to deliver full datasets as fast as other virtualization technologies can deliver their service, typically in minutes. There are two tenets that define a Data First Strategy:

  1. Prioritizing your data first in your architectural design (business, hardware, and software)
    1. Focus on consumers, data center location, services, SLAs, security
  2. Prioritizing your data first in value to the company
    1. Focus on monetization, management, governance, compliance, collaboration, and acquisition of Corporate data

The subtlety here is that if you could draw a box in your architecture design to depict immediate delivery and access to your data, then you could remove the ‘wait’ times that impede time to market, provide immediate access to audit data, and give everyone instant access to virtually unlimited copies.

Achieving a Data First Strategy

Possible with Delphix Agile Data Management

Delphix Agile Data Management unlocks your data by virtualizing your Corporate datasets and expediting their delivery to the teams who need them. The datasets are stored in a highly optimized, single set of common blocks, which can be used to securely create, refresh, and rewind copies of any size in a matter of minutes. To the end user the virtualized copies look and respond as full size datasets and can be managed by these teams through an easy-to-use self-service GUI, further eliminating ‘wait’ times. Some of the immediate capabilities that can be realized include, but are not limited to:

  • Immediate access to point-in-time datasets, delivered in minutes versus days/weeks. Consider:
    • audit data access
    • unlimited ‘what if’ testing scenarios
    • root cause analysis of production data issues without impact to production resources
    • unlimited copies for regression testing
    • test data management
    • training facilities
  • Automated data masking to ensure immediate protection of customer data; onshore and offshore
  • Eliminate storage vendor lock-in to manage your pre-production environment
  • Enable seamless data movement across heterogeneous storage arrays
    • a mandatory requirement for Cloud and data center migrations
  • Prepare for Cloud enablement by providing a hardware agnostic medium for transferring data efficiently and securely to and from a cloud service
    • includes public, private, and hybrid cloud environments.
  • Leverage a virtualized data copy for immediate recovery in Disaster Recovery scenarios
  • Continuous Data Protection (CDP) for Source and Virtual Data Sets
    • tracks all changes as they occur

Customers have realized the following benefits: reducing time to market for customer-facing services by 60%, providing a new or refreshed data copy 99% faster, and increasing the number of pre-production data copies while reducing the storage required by 97%.

Closing observation

Data is the constraint in every organization. CIOs and CDOs will need to adopt and adapt a new approach to data delivery within their organization if they expect to achieve the full benefits from initiatives like Agile Development, DevOps, and Cloud. A Data First Strategy is the next step, but it can only be achieved through data virtualization. Without the ability to deliver the right data, to the right teams, at the right time, organizations will continue to struggle with compliance, application and service quality, and escalating project costs.

Why I left Oracle for Delphix

I recently bumped into a colleague of mine from Sun Microsystems in New York City. Although it had been 18 months since I left Oracle, which had acquired Sun in 2010, he was shocked. We had met when I started at Sun in 1997 and, after the financial meltdown in 2008, we assumed roles and trained as Enterprise Architects (EA); think TOGAF, ITIL, OEA, Business Value, Operations Management Capabilities Model, etc. versus the Java definition of EA. We were focused on critical business problems at the executive level. My management was very supportive in providing all necessary resources to ensure success, I was frequently asked to facilitate industry talks (Engineered Systems, Cloud, Data Center consolidation, and Virtualization, to name a few), and the compensation plan was excellent. I wasn’t looking for a new career.

So he asked me, Why the [expletive] would you walk away from a secure job to chase a startup?

So I asked him, What’s the biggest, unsolved problem in IT today? – Being data constrained, right? And isn’t data the ‘jewels’ of every company? It’s great we have all these industry-driven buzzword initiatives like Virtualization, DevOps, Agile Development, Cloud (Public, Private, Hybrid, *aaS), Big Data, and even Engineered Systems. But at the end of the day, no matter how fast you can stand up infrastructure, you still can’t provide the developers, analysts, DBAs, and testers with a copy, or multiple copies, of a gigabyte/terabyte database in minutes for pre-production environments; in most cases it takes days, weeks, etc. to provision or refresh. And, if you can deliver a fast copy, it’s typically a snapshot which begins to go stale the moment the end-user receives it.

Delphix’ Agile Data Management solves this data constraint for databases and application data by providing an optimized copy of the Source. The Delphix Engine efficiently manages all changes from the Source, and those applied to the full read-writable virtual copies, to provide continuous data protection (CDP) from which copies can be created, refreshed, and reset to any point in time in minutes.

So Delphix solves the biggest problem in the industry, which I found not only exciting but also a significant change agent that helps customers realize the potential benefits of Cloud, Big Data, DevOps, etc. by eliminating the ‘wait time’ for data delivery and management. But what ultimately convinced me to leave a position of 16 years was the executive management team and the rock star list of engineers who are at the core of this company. A cool product is one aspect; having a team that can execute makes all the difference. I was offered an opportunity to be part of this team and have no regrets.