Read this white paper for an overview of how Automic Automation’s capabilities provide the tools you need to orchestrate processes across multiple platforms.
Learning Path
Automic Automation: Delivering Data Pipelines to Production Using Multi-Cloud Orchestration
Learn how to leverage Automic Automation’s orchestration capabilities to integrate on-premises applications with multiple Cloud Integrations.
Delivering Data Pipelines to Production Using Multi-Cloud Orchestration
Introduction
Data Orchestration: How to create agile and efficient data pipelines
Multi-platform Orchestration: Use Case Overview
Installing, Configuring and Starting Cloud Integration Agents
Uploading Local Data to the Cloud: Azure Blob Integration
Transforming the Data: Azure DataBricks Integration
Understanding Authentication Methods: Azure
Uploading Local Data to the Cloud: Google Cloud Storage/AWS S3 Integrations
Triggering Auxiliary Processes: Google Cloud Composer/Airflow Integrations
Transforming the Data: Google BigQuery Integration
Understanding Authentication Methods: Google
Understanding Authentication Methods: AWS S3
Triggering Processes in Clusters: Kubernetes Integration
Orchestrating Cloud Processes and Monitoring the Output
Using Automic Automation to Orchestrate Multi-Cloud Processes
This conceptual, cross-functional learning path breaks down the basic components of a tangible, real-life scenario, bringing together heterogeneous functions (integration packages, Jobs, Connection objects, authentication requirements) deployed in a single Workflow, to meet the needs of a Cloud-based operation.
This learning path consists of a white paper and videos that provide a deep dive into several Cloud integrations (Kubernetes, Azure, Google, AWS) and demonstrate how easily Automic Automation federates and streamlines disparate capabilities through its orchestration mechanisms.
Data Orchestration: How to create agile and efficient data pipelines
White Paper
October 1, 2024
Learn how to create agile and efficient data pipelines by orchestrating disparate sources with Automation by Broadcom. Achieve control, reliability, and repeatability for better data analysis.
Multi-platform Orchestration: Use Case Overview
A Workflow orchestrates the following use case:
- A job generates data from a local application (for example, Oracle or SAP).
- This data should undergo certain transformation processes. To make the local data available to the cloud solutions that will transform it, it is first uploaded to cloud-based storage services, Google Cloud Storage and Azure Blob Storage.
- The data is transformed through Google BigQuery and Azure DataBricks.
- The data is sent to a third-party application in a Kubernetes cluster for inventory calculations.
Installing, Configuring and Starting Cloud Integration Agents
All Automic Automation Agent Integrations operate the same way. Learn how to download, install, configure and start the Automic Cloud Integration Agents.
Uploading Local Data to the Cloud: Azure Blob Integration
Azure Blob Storage is Microsoft Azure’s Cloud storage solution and the standard Cloud storage solution for many Automic Automation users. Customers of all sizes and industries use Azure Blob Storage to store (upload and download) unstructured data such as text and binary data, to move and monitor it, to delete it, and so forth.
In our use case, the data generated by a local application is destined for Azure Databricks. Before it can be transformed there, it must be made available in the cloud. The Workflow contains a Blob Upload Job that does this.
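Inside Automic, the Blob Upload Job encapsulates this step. As a rough sketch of the equivalent operation with the Azure SDK for Python (the connection string, container, and file names below are placeholders, not values from this use case):

```python
from azure.storage.blob import BlobServiceClient

# Connect to the storage account; the connection string is a placeholder that
# would normally come from a Connection object or a secrets store.
service = BlobServiceClient.from_connection_string("<storage-connection-string>")

# Upload the locally generated extract into a container for Databricks to pick up.
blob = service.get_blob_client(container="raw-data", blob="daily/extract.csv")
with open("extract.csv", "rb") as data:
    blob.upload_blob(data, overwrite=True)
```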
Transforming the Data: Azure DataBricks Integration
Azure Databricks is Microsoft Azure’s Cloud data management solution. It turns data lakes (large volumes of raw data) into lakehouses (systems built to exploit those volumes). Databricks is popular in several fields: data science, data engineering, and machine learning.
Databricks has a proprietary job processing tool. Automic can trigger jobs that are already defined in Databricks, or submit job definitions directly as JSON, bypassing the Databricks console.
In our use case, the data needs to undergo transformation before being transferred for inventory analysis.
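To illustrate the JSON-based submission mentioned above, the sketch below posts a one-off run to the Databricks Jobs REST API. The workspace URL, token, notebook path, and cluster sizing are placeholder assumptions, not values from this use case:

```python
import requests

DATABRICKS_HOST = "https://<workspace>.azuredatabricks.net"  # placeholder workspace URL
TOKEN = "<personal-access-token>"                            # placeholder credential

# A one-off run defined entirely in JSON, without referencing a job saved in the console.
payload = {
    "run_name": "transform-daily-extract",
    "tasks": [{
        "task_key": "transform",
        "notebook_task": {"notebook_path": "/Shared/transform_extract"},
        "new_cluster": {
            "spark_version": "13.3.x-scala2.12",
            "node_type_id": "Standard_DS3_v2",
            "num_workers": 2,
        },
    }],
}

resp = requests.post(
    f"{DATABRICKS_HOST}/api/2.1/jobs/runs/submit",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=payload,
    timeout=30,
)
resp.raise_for_status()
print("Run ID:", resp.json()["run_id"])
```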
Understanding Authentication Methods: Azure
One of the major challenges that Automic Automation teams encounter when designing cloud automation processes is how to authenticate to the target cloud systems. This video explains the various methods applicable to each integration, giving special attention to the Azure Service Principal, which applies to all Azure integrations.
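For reference, a minimal sketch of authenticating with an Azure Service Principal using the azure-identity library (the tenant, client, and secret values are placeholders; in Automic they would typically be held in a Connection object):

```python
from azure.identity import ClientSecretCredential
from azure.storage.blob import BlobServiceClient

# The three values below identify the Service Principal; placeholders only.
credential = ClientSecretCredential(
    tenant_id="<tenant-id>",
    client_id="<client-id>",
    client_secret="<client-secret>",
)

# The same credential object can be reused across Azure services, e.g. Blob Storage.
service = BlobServiceClient(
    account_url="https://<storage-account>.blob.core.windows.net",
    credential=credential,
)
```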
Uploading Local Data to the Cloud: Google Cloud Storage / AWS S3 Integrations
Google Cloud Storage and AWS S3 are industry-leading object storage services that are the standard for many Automic Automation users. Customers of all sizes and industries use them to store and protect any amount of data for a range of use cases, such as big data analytics, backup and restore, archive, enterprise applications, and so on.
In our use case, the data needs to be uploaded to the Cloud for Airflow and BigQuery.
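The corresponding upload Jobs wrap calls like the following. This is a hypothetical sketch using the google-cloud-storage and boto3 SDKs, with bucket and object names invented for illustration:

```python
import boto3
from google.cloud import storage

LOCAL_FILE = "extract.csv"

# Upload to Google Cloud Storage (credentials resolved from the environment).
gcs_client = storage.Client()
gcs_client.bucket("example-raw-data").blob("daily/extract.csv").upload_from_filename(LOCAL_FILE)

# Upload the same file to AWS S3.
s3_client = boto3.client("s3")
s3_client.upload_file(LOCAL_FILE, "example-raw-data", "daily/extract.csv")
```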
Triggering Auxiliary Processes: Google Cloud Composer/Airflow Integrations
Airflow is Apache’s open-source, Cloud-based workflow automation solution, which was incorporated into the Google Cloud Platform as Cloud Composer. Both execute Python-based jobs made up of sequential tasks. For Automic Automation, this means triggering DAG jobs, capturing the status of the individual tasks and of the overarching DAG Job, and reporting.
In our use case, the data has been uploaded and an auxiliary process is now triggered in Google Cloud Composer. Airflow and Google Cloud Composer use the same Automic Agent.
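Conceptually, triggering the auxiliary DAG corresponds to a call like the one below against the Airflow stable REST API. The webserver URL, DAG ID, and credentials are placeholders; a Composer environment would normally authenticate with Google credentials rather than basic auth:

```python
import requests

AIRFLOW_URL = "https://<airflow-webserver>"  # placeholder Composer/Airflow endpoint

# Trigger a new run of the DAG and pass the location of the uploaded data as conf.
resp = requests.post(
    f"{AIRFLOW_URL}/api/v1/dags/auxiliary_processing/dagRuns",
    json={"conf": {"source": "gs://example-raw-data/daily/extract.csv"}},
    auth=("<user>", "<password>"),  # placeholder basic-auth credentials
    timeout=30,
)
resp.raise_for_status()
print("DAG run:", resp.json()["dag_run_id"])
```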
Transforming the Data: Google BigQuery Integration
Google BigQuery is a data warehousing service for business intelligence, particularly suitable for AI. BigQuery is a major player in Big Data analytics. It includes native job capabilities: loading, exporting, querying, and copying data. Jobs are coded in a mixture of languages, such as Python and SQL. They are then listed as individual objects in the console, as Transfer Configs.
In our use case, the data needs to undergo a process of transformation before being transferred for inventory analysis.
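As a sketch of what such a transformation amounts to, the snippet below runs a SQL query job with the google-cloud-bigquery client. The dataset and table names are illustrative only:

```python
from google.cloud import bigquery

client = bigquery.Client()

# Run a query job that writes the transformed result to a new table.
job = client.query(
    """
    CREATE OR REPLACE TABLE analytics.inventory_prepared AS
    SELECT sku, SUM(quantity) AS total_quantity
    FROM raw.daily_extract
    GROUP BY sku
    """
)
job.result()  # block until the query job finishes
print("Job state:", job.state)
```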
Understanding Authentication Methods: Google
One of the major challenges that Automic Automation teams encounter when designing cloud automation processes is how to authenticate to the target cloud systems. This video explains the various methods applicable to each integration, giving special attention to the Service Account Key, which is central to authenticating with Google Cloud applications.
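A minimal sketch of using a Service Account Key file with a Google client library (the key file path and scope are placeholders):

```python
from google.oauth2 import service_account
from google.cloud import bigquery

# Load the downloaded JSON key file for the service account.
credentials = service_account.Credentials.from_service_account_file(
    "service-account-key.json",
    scopes=["https://www.googleapis.com/auth/cloud-platform"],
)

# Any Google Cloud client can then be constructed from the same credentials.
client = bigquery.Client(credentials=credentials, project=credentials.project_id)
```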
Understanding Authentication Methods: AWS S3
One of the major challenges that Automic Automation teams encounter when designing cloud automation processes is how to authenticate to the target cloud systems. This video explains the various methods applicable to AWS S3.
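For comparison, a minimal sketch of authenticating to S3 with an access key pair via boto3 (the key values and region are placeholders; credentials can equally come from environment variables or an IAM role):

```python
import boto3

# Explicit access key pair; in practice these placeholders would come from a
# Connection object, environment variables, or an attached IAM role.
session = boto3.session.Session(
    aws_access_key_id="<access-key-id>",
    aws_secret_access_key="<secret-access-key>",
    region_name="eu-central-1",
)
s3 = session.client("s3")
print([b["Name"] for b in s3.list_buckets()["Buckets"]])
```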
Triggering Processes in Clusters: Kubernetes Integration
Kubernetes is an open-source container orchestration system for scaling software applications. It assembles virtual or bare-metal systems into a cluster that can run containerized workloads. Individual software components (processes, databases, web servers, and so on) are isolated in individual pods, which can be scaled up or down to optimize capacity management.
In our use case, this Job builds a gateway into the Kubernetes environment to trigger processes in the cluster.
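As an illustration of triggering such a process, the sketch below creates a Kubernetes Job with the official Python client. The image, namespace, and arguments are hypothetical:

```python
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside the cluster

# A Job that runs the inventory calculation container once to completion.
job = client.V1Job(
    metadata=client.V1ObjectMeta(name="inventory-calculation"),
    spec=client.V1JobSpec(
        template=client.V1PodTemplateSpec(
            spec=client.V1PodSpec(
                restart_policy="Never",
                containers=[
                    client.V1Container(
                        name="inventory-calculation",
                        image="registry.example.com/inventory-calc:latest",
                        args=["--input", "gs://example-raw-data/daily/extract.csv"],
                    )
                ],
            )
        )
    ),
)

client.BatchV1Api().create_namespaced_job(namespace="default", body=job)
```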
Orchestrating Cloud Processes and Monitoring the Output
All the pieces are ready now, so we can execute the Workflow.
In our use case, we execute the Workflow and explain its progress. We look at and explain the reports and Agent logs.