Demands for Increasing Data Masking Scale
For teams responsible for data masking, demands continue to grow. Each month, the tables that need masking seem to get significantly larger. It's now not uncommon for teams to receive requests to mask tables with hundreds of millions, or even billions, of rows. While table sizes keep growing, however, time frames don't. No matter how large the table, teams typically still need to turn around requests within eight to 12 hours.
How do teams scale their masking to accommodate these expanding demands and tight turnaround times? The good news is that Broadcom Test Data Manager is helping customers meet these demands every day. In this post, I’ll offer an introduction to a new feature in Broadcom Test Data Manager called Scalable Masking and outline how you can use the feature most effectively.
Working with Scalable Masking
Starting with release 4.9, Test Data Manager has offered Scalable Masking capabilities. Customers who upgrade to the latest release, 4.10, can leverage these features and more. Users can initiate masking jobs through the TDM portal and centrally run masking jobs across a range of tables and data models.
Container Approach Yields Significant Performance Advantages
In these releases, Test Data Manager is offered as a Docker container, which allows teams to run multiple masking engines as Docker containers. With this container-based approach, teams working with large tables can split jobs across multiple masking engines, which provides significant performance and scalability benefits.
While a lot of variables will affect performance, Scalable Masking can mask between four and 15 million cells per minute. (See the table below for examples of differing data source sizes and configurations.)
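To translate that throughput range into a planning estimate, you can multiply rows by masked columns to get a cell count, then divide by the throughput. The figures below are illustrative only; actual throughput depends on your environment, as noted above.

```python
# Rough estimate of a masking window based on the 4-15M cells/minute
# throughput range above. All figures are illustrative; real throughput
# depends on your environment and configuration.

def masking_window_minutes(rows, masked_columns, cells_per_minute):
    """Return the estimated minutes to mask rows * masked_columns cells."""
    cells = rows * masked_columns
    return cells / cells_per_minute

rows = 500_000_000        # a hypothetical 500M-row table
masked_columns = 5        # five columns in masking scope

# Conservative and optimistic ends of the stated throughput range
worst = masking_window_minutes(rows, masked_columns, 4_000_000)
best = masking_window_minutes(rows, masked_columns, 15_000_000)

print(f"Estimated window: {best / 60:.1f} to {worst / 60:.1f} hours")
# -> Estimated window: 2.8 to 10.4 hours
```

An estimate like this is a quick way to check whether a request fits inside the typical eight-to-12-hour turnaround window before committing to it.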
When requests are made in the portal, they are submitted as RESTful requests to the message bus. Based on the number of engines available and the processing status of the engines, the message bus sends those requests to the appropriate engine.
Masking engines connect to the target table, the table is sent to the engine, and the engine conducts masking. The message bus reports on progress back to the portal. This reporting enables administrators to track status and it provides a documented record that can be retained for auditing purposes.
It is important to note that the Docker masking engine communicates directly with the database instance, and they both reside on the same subnet, which can provide significant benefits in performance and throughput.
Example: How Job Splitting Works
To illustrate how masking jobs can be split, here's a hypothetical example:
- Environment. An organization has the TDM portal, Docker containers, and message bus running, with four containers set up as part of its implementation. By default, each container has four engines.
- Scope. The team wants to run a masking job across 10 tables, each averaging 5M rows, with five columns to be masked in each table.
- Split. The TDM portal submits the request to the message bus, which splits the job.
- Each table gets its own Scalable Masking engine, so all 10 tables can be masked in parallel.
- Two containers, which each have four masking engines, will execute eight of the jobs.
- A third container will handle the last two jobs on two of its engines, while its other two engines, and the fourth container, remain idle.
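The split above can be sketched as a simple one-engine-per-table assignment. This is an illustrative model of the outcome, not TDM's actual scheduling logic; as described earlier, the real message bus also considers each engine's processing status.

```python
# Illustrative model of splitting 10 table-masking jobs across
# 4 containers x 4 engines, matching the example above. This mimics
# the resulting distribution; it is not the actual TDM scheduler.

CONTAINERS = 4
ENGINES_PER_CONTAINER = 4

# Engines are identified as (container, engine) pairs, filled
# container by container.
engines = [(c, e) for c in range(1, CONTAINERS + 1)
                  for e in range(1, ENGINES_PER_CONTAINER + 1)]

tables = [f"table_{i}" for i in range(1, 11)]  # 10 tables, ~5M rows each

# One engine per table, so all 10 tables mask in parallel.
assignment = dict(zip(tables, engines))

for table, (container, engine) in assignment.items():
    print(f"{table} -> container {container}, engine {engine}")

# Containers 1 and 2 run eight jobs; container 3 runs two jobs;
# container 3's remaining engines and all of container 4 stay idle.
```

Running this shows tables 1 through 8 landing on containers 1 and 2, and tables 9 and 10 on container 3, matching the distribution described in the bullets.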
Tips and Best Practices
To get the most out of Scalable Masking, consider the following key strategies:
- Calculate and plan based on table sizes and masking scope. Upfront, it is important to establish a count of the cells to be masked, which is calculated by multiplying the number of columns being masked by the number of rows.
- Allocate adequate space and resources. Masking jobs may fail due to server issues or insufficient memory, so it is important to allocate the required resources. Teams need enough memory to create the necessary tablespace, and enough processors to complete the job in a timely manner. As a rule, the more processors available, the shorter the masking window.
- Validate database instance configuration and performance. Work closely with the DBA to make sure recommended configurations are applied and to ensure masking is working correctly.
- Manage heap size. This setting determines how much RAM is allocated to each instance. If the heap size is too small, teams may see their masking job fail. In general, in both Oracle and SQL Server, about 3 GB of heap size is sufficient to run most jobs properly.
Recommended Settings
Following are suggested settings for Scalable Masking:
- BATCHSIZE=37500
- BLANKSASNULLS=Y
- COMMIT=37500
- EMPTYASNULL=Y
- FETCHSIZE=75000
- GETTABLEROWCOUNTS=N
- ORDERBY=N
- PARALLEL=<Based on the number of CPU cores available>
- LARGETABLESPLITENABLED=Y
- LARGETABLESPLITSIZE=<Your calculation based on the largest table row count>
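For the two environment-specific settings above, one simple heuristic is to tie PARALLEL to the CPU core count and split the largest table evenly across the available engines. This is an illustrative planning assumption, not official TDM guidance; validate any values against your own environment.

```python
# Illustrative heuristic for the environment-specific settings above.
# These formulas are planning assumptions, not official TDM guidance.
import math
import os

largest_table_rows = 1_000_000_000  # hypothetical largest table: 1B rows
available_engines = 16              # e.g., 4 containers x 4 engines

# PARALLEL: commonly tied to the number of CPU cores available
parallel = os.cpu_count() or 4

# LARGETABLESPLITSIZE: split the largest table so each engine gets a chunk
split_size = math.ceil(largest_table_rows / available_engines)

print(f"PARALLEL={parallel}")
print(f"LARGETABLESPLITSIZE={split_size}")
```

With these example inputs, each of the 16 engines would receive a 62.5M-row chunk of the largest table.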
Conclusion
For today’s development teams, the ability to scale data masking is increasingly critical. By employing Scalable Masking and properly configuring their environments, teams can dramatically scale their masking capacity. To learn more, be sure to read Masking Performance Optimization in CA TDM Portal.

Abhijit Mugali
Abhijit Mugali has extensive experience in both technical product ownership and strategic product management. He interacts with clients across geographies for requirement gathering, beta participation, and product launch. He also has expertise interacting with the global sales and pre-sales teams to effectively...