In a recent blog, we talked about why AutoSys is essential for companies moving to the cloud, but when making the move there are different integration methods companies can follow. Each, naturally, has its own pros and cons. For the sake of this discussion, I’ll be focusing on AWS — the technology people ask me about most frequently — but the concepts apply to all cloud vendors.
Here are four ways to integrate AWS processing into AutoSys.
Using the AWS Command Line Interface
The AWS Command Line Interface (CLI) is the most straightforward integration; it provides a simple entry point to all AWS services. The general syntax is ‘aws serviceName action [parameters and sub-verbs]’. For example, to upload a local file to an S3 bucket: ‘aws s3 cp importantFile.csv s3://mybucket/importantfile.csv’.
It is quick, easy to use, and works well for synchronous commands; that is to say, the job performs an action and returns that action’s result.
You need a different approach for asynchronous commands, like Step Functions or Batch. For those, the CLI only tells you whether the request was accepted, not the result of what was requested. To get the result, you need to retrieve the request’s ID and keep polling until the action reaches some form of termination status, then map that termination status to success or failure within AutoSys.
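Here is a rough sketch of that polling pattern for a Step Functions execution. It drives the AWS CLI from a small Python wrapper so the loop and exit-code logic are easy to express; it assumes the CLI is installed and configured on the agent machine, and the state machine ARN and run name below are placeholders.

import subprocess
import sys
import time

# Placeholder values; substitute your own state machine ARN and run name
STATE_MACHINE_ARN = "arn:aws:states:us-east-1:111111111111:stateMachine:MyStateMachine"
RUN_NAME = "my-run-001"

def aws(*args):
    # Run an AWS CLI command and return its stdout, failing the job on a CLI error
    result = subprocess.run(["aws", *args], capture_output=True, text=True)
    if result.returncode != 0:
        print(result.stderr, file=sys.stderr)
        sys.exit(1)
    return result.stdout.strip()

# Start the execution and capture the request's ID (the execution ARN)
execution_arn = aws("stepfunctions", "start-execution",
                    "--state-machine-arn", STATE_MACHINE_ARN,
                    "--name", RUN_NAME,
                    "--query", "executionArn", "--output", "text")

# Poll until the execution reaches a termination status
while True:
    status = aws("stepfunctions", "describe-execution",
                 "--execution-arn", execution_arn,
                 "--query", "status", "--output", "text")
    if status != "RUNNING":
        break
    time.sleep(5)

# Map the termination status to the AutoSys job's exit code
sys.exit(0 if status == "SUCCEEDED" else 1)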
- Pro: Quick and easy to run commands
- Con: While easy to use for invocation of a request, it may not return the final result
- Con: Requires installation of the AWS CLI
- Con: Credentials needed for calls outside of AWS
Using the AWS Software Development Kit
While scripting can be a powerful tool, sometimes the level of effort required to parse the output is much higher than if you had used a language designed for this purpose.
An AWS Software Development Kit (SDK) exists for most popular programming languages, such as Python, Java, Go, Ruby, JavaScript, and Node.js. I have used the Python and Java ones myself and found it relatively easy to build something useful very quickly with them. That is due mainly to the many code examples from AWS and other online sources, not my programming acumen. The response from most requests is in JSON format, and the language’s native JSON parser, together with JSONPath or a regex where needed, is usually the best way to dissect the response.
The typical flow is to invoke the service and obtain the request’s ID, then go into a loop checking for a termination status that tells you the success or failure of the request, and finally set the exit code of the script or program to signify that status.
Another benefit of using the SDK is that if the programming language has a signal handler, you can “kill” the job from AutoSys, catch the signal, and stop the process in AWS. Conversely, if you stop the process from within the AWS console, the status returned to the script changes, and the script exits with a failure exit code.
The SDK integration allows you to retrieve any logs and write them to stdout/stderr of your AutoSys job.
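For example, here is a minimal sketch of echoing a Step Functions execution history to stdout with boto3; the execution ARN is passed as an argument and is assumed to have been captured from start_execution, as in the script later in this post.

import sys
import boto3

# The execution ARN, e.g. captured from start_execution
execute_arn = sys.argv[1]

sfn = boto3.client('stepfunctions')
history = sfn.get_execution_history(executionArn=execute_arn, maxResults=100)
for event in history.get('events', []):
    # Each event carries a timestamp and a type such as ExecutionStarted or TaskFailed
    print(event['timestamp'], event['type'])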
- Pro: Many examples on AWS and other sources
- Pro: SDK supports many popular languages
- Pro: Can be customized to your exact requirements
- Pro: If the agent is on EC2, then an IAM role simplifies security and credential management
- Con: Requires installation of SDK
- Con: Does require programming
- Con: May require maintenance or debugging
- Con: Requires credentials when called outside of AWS
Here is a link to some Java examples provided by AWS:
https://github.com/aws/aws-sdk-java
Python coding examples from AWS:
https://docs.aws.amazon.com/code-samples/latest/catalog/code-catalog-python.html
I’ve included an example of a Python script that uses a signal handler. This was created from a couple of different examples from AWS that showed one specific function call. I modified it into a cohesive script with the additional ability to kill the process in AWS.
#!/usr/bin/env python3
import signal
import sys
import time

import boto3

def switch_state(argument):
    # Map a numeric code to a Step Functions execution status string
    switcher = {
        1: "SUCCEEDED",
        2: "RUNNING",
        3: "FAILED",
        4: "TIMED_OUT",
        5: "ABORTED"
    }
    return switcher.get(argument, "UNKNOWN")

# Set once the execution has started; the handler checks it before stopping anything
executeArn = None

# Define a signal handler to be able to stop a running execution.
# When AutoSys kills the job, the handler also stops the execution in AWS.
def handler(signum, frame):
    print("Good bye")
    if executeArn:
        response = sfn.stop_execution(
            executionArn=executeArn
        )
        print(response)
    sys.exit(1)

signal.signal(signal.SIGINT, handler)   # assign the handler to the signal SIGINT
signal.signal(signal.SIGTERM, handler)  # an AutoSys kill normally arrives as SIGTERM

# The Amazon Resource Name (ARN) of the state machine to execute
STATE_MACHINE_ARN = sys.argv[1]
# The name of the execution
EXECUTION_NAME = sys.argv[2]
# The JSON input data for the execution; in production this would come from a file or job input
INPUT = '{"IsHelloWorldExample": true}'

sfn = boto3.client('stepfunctions')

response = sfn.start_execution(
    stateMachineArn=STATE_MACHINE_ARN,
    name=EXECUTION_NAME,
    input=INPUT
)

# Display the ARN that identifies the execution
executeArn = response.get('executionArn')
print(executeArn)
print(response.get('startDate'))

# Poll until the execution reaches a termination status
while True:
    response = sfn.describe_execution(
        executionArn=executeArn
    )
    statusStep = response.get('status')
    print(statusStep)
    if statusStep == "RUNNING":
        time.sleep(2)
        continue
    if statusStep == "SUCCEEDED":
        rc = 0
    else:
        rc = 1
    break

sys.exit(rc)
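To run it as an AutoSys command job, pass the state machine ARN and a unique execution name as the two arguments, for example ‘python stepfunction_job.py <state machine ARN> $AUTO_JOB_NAME$AUTORUN’. The script name here is just a placeholder, and the run-name convention is covered under Best Practices below.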
Using Web Services
Here we can use straight HTTP requests. Nothing needs to be installed locally, which reduces the footprint. The downside is that you will need to code the necessary authorization signature so that AWS accepts the requests.
Note: both the CLI and SDK use these same web service calls but do all the heavy lifting regarding authorization signatures.
Once you have an authorization signature, the process is the same as using the SDK. Using the web service orchestration capability in AutoSys R12.0, you make a web service call to submit the request and recover its ID from the response, then use that ID to poll (making further web requests) until you receive a termination state, and set the exit code accordingly.
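To give a feel for what the signing involves, here is a minimal sketch that starts a Step Functions execution with a raw signed HTTP request. It borrows botocore purely for the Signature Version 4 step (a truly install-free implementation would compute the signature itself, following the SigV4 process in the AWS documentation), and the region, state machine ARN, and execution name are placeholders.

import json
import requests
from botocore.auth import SigV4Auth
from botocore.awsrequest import AWSRequest
from botocore.session import Session

REGION = "us-east-1"  # placeholder region
ENDPOINT = "https://states." + REGION + ".amazonaws.com/"

# Step Functions speaks a JSON protocol; the target header selects the API action
body = json.dumps({
    "stateMachineArn": "arn:aws:states:us-east-1:111111111111:stateMachine:MyStateMachine",
    "name": "my-run-001",
    "input": "{}"
})
request = AWSRequest(method="POST", url=ENDPOINT, data=body, headers={
    "X-Amz-Target": "AWSStepFunctions.StartExecution",
    "Content-Type": "application/x-amz-json-1.0"
})

# Sign the request with whatever credentials are configured locally
credentials = Session().get_credentials()
SigV4Auth(credentials, "states", REGION).add_auth(request)

response = requests.post(ENDPOINT, data=body, headers=dict(request.headers))
print(response.status_code, response.json().get("executionArn"))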
As with the SDK, if the programming language has a signal handler, you can “kill” the job from AutoSys, catch the signal, and stop the process in AWS. And again, if you stop it from within the AWS console, the status returned to the script changes, and the script exits with a failure exit code. The Web Service integration will produce spool files that contain the output.
The recommendation from AWS, and from much of what I read while researching the HTTP approach, is to use the CLI or the SDK, as the signing process is error-prone and the errors returned are not very clear about what the problem may be.
- Pro: No AWS code required
- Pro: Can use any language that supports REST
- Pro: Some examples from AWS and other resources
- Con: Need to generate a signature for each call - Potentially error-prone
- Con: Harder to debug
- Con: May require additional maintenance and debugging
- Con: Always requires credentials of some sort to be passed
Invoking AutoSys Web Services from AWS
So far we have covered calling AWS from AutoSys. In some cases, you may want to reach out and touch AutoSys from AWS. Several years ago, we introduced web services that cover sendevent and basic autorep type calls. The ability to run autorep and jil as a web service was added in R12.0.
This addition meant that you could invoke a web service call from within an AWS process to set a global variable, start a job, or retrieve the status of a job. For example, you could create a Lambda function that tells AutoSys to set a global variable signifying that a particular process has finished in AWS, and AutoSys can then start the next part of the process or use that variable for an on-premises or other AWS-related job.
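As a rough illustration only: the URL, payload, and authentication below are placeholders, since the exact resource names and request format depend on your AutoSys web services deployment and release. The general shape of such a Lambda function would be:

import json
import os
import urllib.request

# Placeholder endpoint and auth header; substitute the sendevent-style resource exposed
# by your AutoSys web services deployment and whatever authentication it requires
AUTOSYS_EVENT_URL = os.environ["AUTOSYS_EVENT_URL"]
AUTOSYS_AUTH_HEADER = os.environ["AUTOSYS_AUTH_HEADER"]

def lambda_handler(event, context):
    # Tell AutoSys that the AWS side of the process has finished by setting a global variable
    # (the field names below are illustrative, not the documented payload)
    payload = json.dumps({
        "eventType": "SET_GLOBAL",
        "globalName": "AWS_PROCESS_DONE",
        "globalValue": "YES"
    }).encode("utf-8")
    request = urllib.request.Request(
        AUTOSYS_EVENT_URL,
        data=payload,
        headers={"Content-Type": "application/json",
                 "Authorization": AUTOSYS_AUTH_HEADER},
        method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return {"statusCode": response.status}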
It's All About Security
All of these methods require credentials, and you must not use hard-coded values in any scripts or job definitions; they are essentially the keys to your AWS accounts.
When using the CLI or the SDK there are several ways to authenticate the user:
- The most basic is that the keys are kept in a credentials file under the user’s home directory. A single file may contain several users (profiles) and their associated keys. Using a switch on the command or SDK call, you can signify which profile to use (a short sketch follows this list).
- Use environment variables set in a job profile or passed in the job definition. This gives you more flexibility and can incorporate more sophisticated ways to retrieve and set the values.
- Use the credentials file to point to an external script that returns the keys. This is similar to using environment variables in that you don’t have hard-coded keys sitting in a file.
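Here is a minimal sketch of how those options look in practice with the Python SDK; every key value and profile name below is a placeholder.

import boto3

# ~/.aws/credentials can hold several named profiles, for example:
#   [default]
#   aws_access_key_id = AKIA...placeholder
#   aws_secret_access_key = ...placeholder
#   [batch_user]
#   aws_access_key_id = AKIA...placeholder
#   aws_secret_access_key = ...placeholder
#
# The CLI selects a profile with --profile batch_user; the SDK does the same thing:
session = boto3.Session(profile_name="batch_user")
sfn = session.client("stepfunctions")

# Alternatively, set AWS_PROFILE (or AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY)
# in the job profile or job definition; boto3.client("stepfunctions") then picks
# the credentials up from the environment with no profile argument at all.

# The third option corresponds to the credential_process setting in ~/.aws/config,
# which points to an external program that prints the keys on demand.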
You can simplify credentials management for AWS significantly by installing an AutoSys agent on an EC2 instance and using a command job to run the CLI or SDK. You can then assign an AWS Identity and Access Management (IAM) role to the instance, and the command no longer needs keys on the EC2 instance. The role manages security.
Best Practices
Use variables
Use global variables to obscure confidential information, like an ID, or to promote greater reuse without changing job definitions.
Some coding will most likely be required for all but the simplest tasks. Using variables will make the scripts or programs flexible and reusable across multiple applications.
Use job templates
Create job templates in Quick Edit or App Edit in WCC for the different use cases. This helps make the jobs self-documenting and means only specific fields, like an instance name or region, need updating when multiple ones could be used.
For requests that require a unique run name, like Step Functions, use $AUTO_JOB_NAME$AUTORUN, as it associates the run in AWS with the specific job in AutoSys and makes it easy to connect the two from either side.
Conclusion
Integrating AWS with AutoSys is straightforward. There are multiple ways to accomplish the task, and in the end the choice depends mostly on your use cases and personal preference.
The CLI mechanism is quick and easy to implement if you are only running synchronous tasks. If your processing also includes asynchronous tasks, I would suggest using the SDK and, if possible, using an AutoSys agent installed on an EC2 instance. Running on EC2 gives your security team the ability to maintain the permissions, reducing the need to maintain keys and keeping your compliance team happy.