More than ever, organizations need to operate with maximum agility and get better at leveraging data. Many are turning to Azure Data Factory (ADF) because it offers significant advantages in both areas. However, the limitations of its basic scheduling call for an enterprise automation solution, such as AutoSys Workload Automation or Automic Automation, for workload orchestration.
ADF is a fully managed, serverless data integration service. Featuring more than 90 built-in connectors, the service helps organizations simplify and scale their data integration initiatives. The solution helps users run their extract, transform, and load (ETL) processes, without having to do any coding.
As IT organizations continue to expand their use of ADF and other cloud services, the volume of automated workflows continues to grow. In the process, however, they encounter a number of challenges.
ADF features a basic, time-based scheduler that operators can use to automatically run jobs at specified times. The problem is that this scheduler can’t intelligently accommodate dependencies—and ADF workflows typically have multiple upstream and downstream dependencies. Dozens of sources may feed data into ADF, and several downstream applications may rely on ADF outputs for ongoing operation.
To coordinate these processes and their dependencies, administrators who rely on the ADF scheduler are stuck with forced time delays: they schedule each subsequent task to start at a time by which the prior tasks are expected to have completed.
Relying on these hard-wired schedules means that if one task takes longer than the forced time delay allows, the subsequent task will kick off anyway, typically with old, inaccurate, or incomplete data. As a result, the ultimate output of the job sequence will be suboptimal or even unusable.
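The difference between a forced time delay and a dependency-aware trigger can be sketched in a few lines of Python. This is an illustrative model only, not ADF or Automation by Broadcom code; the names `Outcome`, `time_based`, and `dependency_based` are hypothetical, and all durations are in hours.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    start_hour: float     # when the downstream task starts
    upstream_done: bool   # whether upstream had finished at that moment
    idle_hours: float     # time spent waiting after upstream finished


def time_based(upstream_duration: float, fixed_delay: float) -> Outcome:
    """Downstream starts at a fixed offset, regardless of upstream state."""
    done = upstream_duration <= fixed_delay
    idle = max(fixed_delay - upstream_duration, 0.0)
    return Outcome(start_hour=fixed_delay, upstream_done=done, idle_hours=idle)


def dependency_based(upstream_duration: float) -> Outcome:
    """Downstream starts as soon as upstream reports completion."""
    return Outcome(start_hour=upstream_duration, upstream_done=True, idle_hours=0.0)


if __name__ == "__main__":
    for duration in (3.0, 5.0):
        print(duration, time_based(duration, fixed_delay=4.0), dependency_based(duration))
```

With a four-hour delay, `time_based(5.0, 4.0)` starts the downstream task before upstream has finished, while `time_based(3.0, 4.0)` leaves an hour of idle time. A dependency-based trigger avoids both cases by waiting for the completion signal.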
These issues get magnified in many large-scale environments, where dozens of data sources may be used. If one source doesn’t come across in time, workflows may encounter cascading failures.
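To see how a single late source can cascade, consider this minimal sketch, which walks a dependency map and lists every job downstream of the late one. The pipeline layout and job names are invented for illustration, and `affected_downstream` is a hypothetical helper, not part of ADF.

```python
from collections import deque


def affected_downstream(dependents: dict[str, list[str]], late_job: str) -> list[str]:
    """Return every job reachable from late_job, in breadth-first order."""
    seen: set[str] = set()
    order: list[str] = []
    queue = deque(dependents.get(late_job, []))
    while queue:
        job = queue.popleft()
        if job in seen:
            continue
        seen.add(job)
        order.append(job)
        queue.extend(dependents.get(job, []))
    return order


# Hypothetical pipeline: two sources feed a merge step, which feeds two consumers.
PIPELINE = {
    "source_a": ["merge"],
    "source_b": ["merge"],
    "merge": ["report", "export"],
    "report": [],
    "export": [],
}

if __name__ == "__main__":
    print(affected_downstream(PIPELINE, "source_a"))
```

In this example, a delay in `source_a` alone puts `merge`, `report`, and `export` at risk, even though `source_b` arrived on time.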
When running ADF, users rely heavily on a range of dispersed, distributed networks. At any given moment, an automation job can fail simply because an API call returns an error. When this type of failure occurs, scheduled jobs fail, creating issues for subsequent downstream jobs.
To prevent these issues, operators can opt to add buffers. For example, imagine that the longest phase-one activities of a given workflow can take up to 10 hours to complete. A user could then add a buffer of two hours, and schedule all phase-two workloads to start a total of 12 hours after phase-one tasks were kicked off.
While this approach can help minimize failures, it means ADF instances are kept idling for two hours, or longer if jobs complete ahead of schedule. This can be very costly. In many environments, these idle resources can account for thousands of dollars in fees, and the costs recur frequently.
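The buffer arithmetic above can be worked through as a rough sketch. The 10-hour and 2-hour figures come from the example; the hourly rate and run count used in the usage note are placeholders chosen purely for illustration, not actual ADF pricing, and the function names are hypothetical.

```python
def scheduled_offset(longest_phase_one_hours: float, buffer_hours: float) -> float:
    """Hours after phase-one kickoff at which phase two is scheduled to start."""
    return longest_phase_one_hours + buffer_hours


def idle_hours(offset_hours: float, actual_phase_one_hours: float) -> float:
    """Hours resources sit idle between phase-one completion and phase-two start."""
    return max(offset_hours - actual_phase_one_hours, 0.0)


def idle_cost(idle: float, hourly_rate: float, runs: int) -> float:
    """Total idle cost over a number of runs (hourly_rate is an assumed figure)."""
    return idle * hourly_rate * runs


if __name__ == "__main__":
    offset = scheduled_offset(10.0, 2.0)   # 12 hours, as in the example above
    print(idle_hours(offset, 10.0))        # phase one took the full 10 hours
    print(idle_hours(offset, 7.0))         # phase one finished early
```

If phase one takes the full 10 hours, two hours are spent idle; if it finishes in 7 hours, the idle time grows to five. At an assumed rate of 50 per hour across 30 runs, `idle_cost(2.0, 50.0, 30)` comes to 3,000, which shows how quickly a fixed buffer adds up.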
If a failure is discovered while a workstream is underway, the administrator will have to disable the schedule, potentially in multiple products, and manually troubleshoot and address any issues that have arisen.
To avoid some of the challenges outlined above, automation groups have sometimes developed shell scripts for creating automated workflows. However, these approaches require significant up-front investment, are very difficult to support and run over time, and are not scalable. Further, inefficiency and costs continue to mount as the scale of the environment grows.
For as long as automation has been around, so has the potential for costly, brittle islands of automation. As businesses continue to expand their use of ADF, automation managers don’t want to add another siloed automation tool to maintain and support on top of the platforms in which they have already invested time, money, and expertise.
That’s why the use of enterprise automation continues to be so essential. Enterprise automation provides central management of automation workloads across a range of environments and platforms. These solutions enable organizations to adapt to the evolving requirements of cloud-driven workloads, including those running in ADF.
Automation by Broadcom offers robust scheduling that enables users to manage dependencies across pipelines, integrations, applications, and processes. These solutions deliver end-to-end visibility across cloud vendors and on-premises deployments.
Automation by Broadcom offers a wide range of cloud integrations, which are featured in our Automation Marketplace. With the solutions’ broad platform and service coverage, customers can efficiently manage complex, multi-phase automation deployments within ADF, as well as complex pipelines that span platforms and services from a range of vendors, including cloud vendors and on-premises systems.
With Automation by Broadcom for ADF, developers and data scientists can fully leverage the power of ADF in harnessing enterprise data. At the same time, automation teams can continue to employ Automation by Broadcom solutions as their central, unified platform for managing and orchestrating automation workloads across their application landscape.
Automation by Broadcom solutions offer a rich set of capabilities that are invaluable for IT operations organizations. Users can model any process dependencies, establish centralized operational control, and gain 360-degree visibility of all services running in production.
By implementing the ADF integration with Automation by Broadcom solutions (e.g., Automic, AutoSys, and dSeries), organizations can realize a number of benefits, particularly as their usage of ADF and other cloud solutions continues to expand.
For today’s enterprises, extracting maximum value from data is an increasingly critical imperative. To achieve this objective, it is vital to establish seamless automated data pipelines and to have the ability to harness the power of cloud-based data integration services like ADF. With Automation by Broadcom, IT organizations can leverage a unified platform for managing all automation workloads running in ADF and all their other cloud-based services and on-premises platforms.
Visit the Automation Marketplace for details on integration features, links to extended TechDocs, and the option to download the software.