For teams responsible for data masking, demands continue to grow. Each month, the tables that need masking seem to get significantly larger. It's now not uncommon for teams to receive requests to mask tables with hundreds of millions, or even billions, of rows. However, while table sizes keep growing, time frames don't. No matter how large the table, teams still typically need to turn around requests within eight to twelve hours.
How do teams scale their masking to accommodate these expanding demands and tight turnaround times? The good news is that Broadcom Test Data Manager is helping customers meet these demands every day. In this post, I’ll offer an introduction to a new feature in Broadcom Test Data Manager called Scalable Masking and outline how you can use the feature most effectively.
Starting with release 4.9, Test Data Manager offers Scalable Masking capabilities. Customers who upgrade to the latest release, 4.10, will be able to leverage these features and more. Users can initiate masking jobs through the TDM portal and centrally run masking jobs across a range of tables and data models.
In these new releases, Test Data Manager is offered as a Docker container. This allows teams to run multiple masking engines, each in its own container. As a result of this container-based approach, when teams have large tables, they can split jobs across multiple masking engines, which provides significant performance and scalability benefits.
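The details of how TDM partitions work are internal to the product, but the idea of splitting one large table across several engines can be illustrated with a simple sketch. The function below is a hypothetical example (not TDM's actual algorithm): it divides a table's rows into contiguous ranges, one per masking engine, handing any remainder rows to the earliest engines.

```python
def split_row_ranges(total_rows, num_engines):
    """Partition a table's rows into contiguous (start, end) ranges,
    one range per masking engine, covering every row exactly once."""
    base, extra = divmod(total_rows, num_engines)
    ranges = []
    start = 0
    for i in range(num_engines):
        # The first `extra` engines each take one additional row.
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size - 1))
        start += size
    return ranges

# A 1-billion-row table split across 4 engines yields four
# 250-million-row ranges, which can then be masked in parallel.
ranges = split_row_ranges(1_000_000_000, 4)
```

Each range can then be submitted as an independent masking job, which is what makes the container-based approach scale: adding engines shrinks the per-engine range.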
While a lot of variables will affect performance, Scalable Masking can mask between four and 15 million cells per minute. (See the table below for examples of differing data source sizes and configurations.)
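Those throughput figures make it easy to do a back-of-envelope sizing check. The sketch below uses the 4-to-15-million-cells-per-minute range quoted above; the 500-million-row, 10-column table is an assumed example, not a benchmark from the post.

```python
def masking_hours(rows, masked_columns, cells_per_minute):
    """Estimate wall-clock hours to mask a table at a given throughput,
    where total cells = rows x masked columns."""
    total_cells = rows * masked_columns
    return total_cells / cells_per_minute / 60

# A 500M-row table with 10 masked columns is 5 billion cells:
# at the high end (15M cells/min) that's roughly 5.6 hours,
# at the low end (4M cells/min) roughly 20.8 hours.
best_case = masking_hours(500_000_000, 10, 15_000_000)
worst_case = masking_hours(500_000_000, 10, 4_000_000)
```

This is why configuration matters: at the high end of the range such a table fits comfortably inside an eight-to-twelve-hour window, while at the low end it does not.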
When requests are made in the portal, they are submitted as RESTful requests to the message bus. Based on the number of engines available and the processing status of the engines, the message bus sends those requests to the appropriate engine.
Each masking engine connects to the target database, reads its assigned table, and performs the masking. The message bus reports progress back to the portal. This reporting enables administrators to track status, and it provides a documented record that can be retained for auditing purposes.
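TDM does not publish the internals of its message bus, but the routing behavior described above (sending each request to an appropriate engine based on how busy the engines are) can be sketched as a simple least-loaded dispatcher. Everything here, including the engine names, is a hypothetical illustration of the pattern, not TDM code.

```python
from dataclasses import dataclass

@dataclass
class Engine:
    """A masking engine as the dispatcher sees it: a name plus a load count."""
    name: str
    active_jobs: int = 0

def dispatch(request, engines):
    """Route a masking request to the engine with the fewest active jobs
    (ties go to the first engine in the list) and return its name."""
    target = min(engines, key=lambda e: e.active_jobs)
    target.active_jobs += 1
    return target.name

# Two idle engines: successive requests alternate between them.
engines = [Engine("engine-1"), Engine("engine-2")]
first = dispatch({"table": "customers"}, engines)   # goes to engine-1
second = dispatch({"table": "orders"}, engines)     # goes to engine-2
```

In a real deployment the load signal would come from the engines' status reports rather than a local counter, but the dispatching principle is the same.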
It is important to note that the Docker masking engine communicates directly with the database instance, and they both reside on the same subnet, which can provide significant benefits in performance and throughput.
To illustrate how masking jobs can be split, here's a hypothetical example:
To get the most out of Scalable Masking, here are some key strategies:
Following are suggested settings for Scalable Masking:
For today’s development teams, the ability to scale data masking continues to grow more critical. By employing Scalable Masking and properly configuring their environments, teams can dramatically scale their masking capacity. To learn more, be sure to read Masking Performance Optimization in CA TDM Portal.