Broadcom Software Academy Blog

Introducing a New, Zero-Touch Way to Manage Your DX NetOps Upgrades

Written by Saurabh Sharma | Sep 3, 2024 8:10:00 PM
Key Takeaways
  • Employ DX NetOps and its zero-touch administration (ZTA) capabilities to streamline upgrades.
  • Plan, test, and upgrade your deployment versions in one session.
  • Discover the key elements and steps required to simplify and speed solution upgrades.

For every customer who has an existing DX NetOps solution deployed, an upgrade can be a daunting task. Even for seasoned administrators, the process of logging into each box, running the pre-checks, and then executing the installers can be tedious.

With the solution’s support for zero-touch administration (ZTA), the effort becomes easier. Now, you can plan, test, and then finally upgrade your deployment versions in one session.

DX NetOps now features zero-touch administration

DX NetOps offers robust, highly scalable operations monitoring. Our commitment to security is unwavering. Every release strengthens not only performance and usability but also our defenses against emerging threats.

However, it can be challenging for customers to keep up with releases because upgrading an existing deployment is a complex, multi-step process. With ZTA (EA), we are taking steps towards making that process much smoother.

In this blog, I will introduce you to netops-deployment-tools (ZTA), which is a big step towards simplifying the planning, management, and execution of a DX NetOps upgrade.

There are several components within the ecosystem of DX NetOps:

  • Spectrum Server
  • Performance Monitoring, which itself has several pieces, such as Data Repository, Data Aggregator, Data Collectors, and so on
  • VNA
  • Kafka and others

To stay close to the latest version, every user has to plan an upgrade, which must be executed in a defined sequence to ensure there are no post-upgrade surprises.

At its core, ZTA is just an Ansible playbook that can do two things for you:

  • Test your existing deployment for upgrade worthiness; we call it “preflight.”
  • Upgrade your deployment to the desired version.

What’s in the package?

ZTA (EA) is presently available for download in the form of a tar archive. A typical package would be in the following form:

netops-deployment-tools-dist-23.3.12-RELEASE.tar.gz 

The digits in the file name (in the above example, 23.3.12) identify the version of DX NetOps this utility supports. Once unpacked, it will have the following:

-rwxrwxrwx rtcbuild/software 2088 2024-07-25 14:49 netops-deployment-tools-dist-23.3.12-RELEASE/run-container.sh
-rwxrwxrwx rtcbuild/software 2134 2024-07-25 14:49 netops-deployment-tools-dist-23.3.12-RELEASE/config/inventory.remote
-rwxrwxrwx rtcbuild/software 3482 2024-07-25 14:49 netops-deployment-tools-dist-23.3.12-RELEASE/config/variables.yml
-rwxrwxrwx rtcbuild/software  822 2024-07-25 14:49 netops-deployment-tools-dist-23.3.12-RELEASE/config/ansible.cfg
-rw------- rtcbuild/software 202385039 2024-07-25 16:42 netops-deployment-tools-dist-23.3.12-RELEASE/netops-deployment-tools-image-23.3.12-RELEASE.tar.gz
-rw-r--r-- rtcbuild/software    108044 2024-07-25 16:41 netops-deployment-tools-dist-23.3.12-RELEASE/netops-deployment-tools-package-23.3.12-RELEASE.tar.gz
drwxrwxrwx rtcbuild/software         0 2024-07-25 16:41 netops-deployment-tools-dist-23.3.12-RELEASE/installers/
drwxrwxrwx rtcbuild/software         0 2024-07-25 16:41 netops-deployment-tools-dist-23.3.12-RELEASE/config/
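
Because the package file name encodes the supported DX NetOps version, a small script can read it back before you unpack anything. The following is a minimal sketch in Python; the pattern is inferred from the single example file name shown above, and the function name supported_version is hypothetical, so other releases may need a different pattern.

```python
import re

# Pattern inferred from the one example file name in this post (assumption);
# other releases may follow a different naming scheme.
PACKAGE_NAME = re.compile(
    r"^netops-deployment-tools-dist-(\d+)\.(\d+)\.(\d+)-RELEASE\.tar\.gz$"
)


def supported_version(file_name: str) -> tuple[int, int, int]:
    """Return the DX NetOps version encoded in a ZTA package file name."""
    match = PACKAGE_NAME.match(file_name.strip())
    if match is None:
        raise ValueError(f"unrecognized package name: {file_name!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return (major, minor, patch)


if __name__ == "__main__":
    name = "netops-deployment-tools-dist-23.3.12-RELEASE.tar.gz"
    print(supported_version(name))
```

Returning a tuple of integers rather than a string makes it easy to compare two package versions numerically.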

Following is a description of each file:

| File | Description |
| --- | --- |
| run-container.sh | Script file that launches a Docker container with all the required binaries. The playbooks have been validated on a specific version of Python and Ansible. To avoid any system conflicts, the easiest way to launch the playbook is via this container. |
| installers/ | This is a placeholder directory that should contain all the tar installer files for DA, Portal, Spectrum, and other components. Users can download these files from the Broadcom support website. |
| config/ | Contains files that are used for configuring the execution. |
| config/inventory.remote | Ansible automates tasks on managed nodes or “hosts”, using a list or group of lists known as inventory. This file contains all the designated machines and IP information for the DX NetOps components. |
| config/variables.yml | This file defines all the variables that control the execution of the playbooks in the ZTA. |
| config/ansible.cfg | This Ansible configuration file defines variables that control the execution environment required. |
| netops-deployment-tools-image-23.3.12-RELEASE.tar.gz, netops-deployment-tools-package-23.3.12-RELEASE.tar.gz | These files contain the container image and playbooks respectively. |

Why the container?

Python and Ansible are closely related, as Ansible is primarily written in Python and relies on Python for much of its functionality.

Why does the version matter?

  1. Ansible is built on Python, so it requires Python to run. The version of Python installed on your system can affect which versions of Ansible you can use.
  2. Different Ansible versions are compatible with specific Python versions. For example, Ansible 2.5 and later require Python 2.7 or Python 3.5+. More recent Ansible versions have dropped support for Python 2.x entirely.
  3. Many Ansible modules are written in Python. The Python version can affect how these modules function and which features are available.
  4. Newer Python versions often include performance improvements, which can benefit Ansible's execution speed.
  5. Using up-to-date Python versions ensures you have the latest security patches, which is crucial for maintaining a secure automation environment.
  6. Newer Ansible versions may leverage features from more recent Python versions, potentially offering improved functionality or performance.
  7. The Python version installed on your control node and managed nodes can impact Ansible's ability to communicate and execute tasks across your infrastructure.

You can still choose to work without a container, but this is not recommended. If you take this approach, you will have to ensure that your Ansible and Python versions match the ones the playbooks were validated with:

ansible [core 2.16.6]
python version = 3.12.3
jinja version = 3.1.4
community.general = 9.2.0
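
If you do run without the container, a quick comparison against the versions listed above can catch a mismatch early. The following is a minimal sketch, not part of ZTA: the helper names find_mismatches and detect_local are hypothetical, and detect_local only inspects the Python interpreter that runs the script, which may differ from the one your ansible command uses. The community.general collection is not a Python package, so its version has to be supplied by hand (for example, from the output of ansible-galaxy collection list).

```python
import platform
from importlib import metadata

# Versions listed in this post for the 23.3.12 package.
VALIDATED = {
    "ansible-core": "2.16.6",
    "python": "3.12.3",
    "jinja2": "3.1.4",
    "community.general": "9.2.0",
}


def find_mismatches(detected: dict[str, str]) -> dict[str, tuple[str, str]]:
    """Map each component to (expected, found) where the versions differ."""
    mismatches = {}
    for name, expected in VALIDATED.items():
        found = detected.get(name, "missing")
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


def detect_local() -> dict[str, str]:
    """Detect what can be read from the current Python environment."""
    detected = {"python": platform.python_version()}
    for name in ("ansible-core", "jinja2"):
        try:
            detected[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            pass  # left out, so find_mismatches reports it as "missing"
    return detected


if __name__ == "__main__":
    for name, (expected, found) in find_mismatches(detect_local()).items():
        print(f"{name}: expected {expected}, found {found}")
```

An exact-match comparison is deliberately strict; you may prefer to relax it to major and minor versions if your environment differs only in patch level.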

Running a container the first time

When you run the container for the first time, you might see something like the following:

> ./run-container.sh

The console output might be different depending on the version you are using.

PWD=
Loading netops-deployment-tools-image into docker
Emulate Docker CLI using podman. Create /etc/containers/nodocker to quiet msg.
Getting image source signatures
Copying blob 94e5f06ff8e3 skipped: already exists  
Copying blob f65114428c0c done   | 
Copying blob fe7bb379c799 done   | 
Copying blob 5f70bf18a086 skipped: already exists  
Copying blob 5f70bf18a086 skipped: already exists  
Copying blob 8beb21be7346 done   | 
Copying blob 717a77b6ad64 done   | 
Copying blob 21e4d0616721 done   | 
Copying blob ee3965cbb1fc done   | 
Copying blob 9c41a3a845cf done   | 
Copying blob 010c4e24bb42 done   | 
Copying blob 5b293ac664ec done   | 
Copying blob 53b88bead7fc done   | 
Copying blob 058764feb1cf done   | 
Copying config 919b97a17b done   | 
Writing manifest to image destination
Loaded image: com.broadcom.netops/netops-deployment-tools:latest
Ansible playbook folder on docker container=netops-deployment-tools-package-23.3.12-RELEASE
Starting container netops-deployment-tools
Emulate Docker CLI using podman. Create /etc/containers/nodocker to quiet msg.
b8ec018ff3db:/#

Once the prompt b8ec018ff3db:/# appears, your container is up and ready for next steps.

Must-have items

  1. Download all the installer files into the installers/ folder beforehand. The container mounts the local installers/ folder before it shows the ready prompt, and this is where the playbook looks for the files before it executes any tasks.
  2. Add entries to inventory.remote so that the playbook knows all the participating machines.
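
Before launching the container, it can help to confirm that the installers/ folder is actually populated. The following is a minimal sketch under stated assumptions: the function name list_installers is hypothetical, and the accepted file extensions are a guess based on the description of the installers as tar files.

```python
from pathlib import Path

# Assumed extensions for "tar installer files"; adjust to what you downloaded.
TAR_SUFFIXES = (".tar", ".tar.gz", ".tgz")


def list_installers(installers_dir: str) -> list[str]:
    """Return the sorted names of tar archives found in installers_dir."""
    directory = Path(installers_dir)
    if not directory.is_dir():
        raise FileNotFoundError(f"not a directory: {installers_dir}")
    return sorted(
        entry.name
        for entry in directory.iterdir()
        if entry.is_file() and entry.name.endswith(TAR_SUFFIXES)
    )


if __name__ == "__main__":
    try:
        found = list_installers("installers")
    except FileNotFoundError as error:
        print(error)
    else:
        print(found if found else "installers/ contains no tar archives")
```

This only checks that archives are present; it does not verify that they belong to the right components or versions.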

Inventory information: inventory.remote

This file contains the machine/host details for each component.

The notable elements are the groups, for example [portal], [portaldb], and so on. These groups help identify the components installed on the respective machines.

| Group | Description |
| --- | --- |
| [portal] | The host where the portal is deployed. |
| [dr] | The host where dr is installed. If it is a cluster, you do not need to specify all hosts. We will determine the other node information from the one you specify. |
| [portaldb] | If the MySQL database is on a separate host, add the details under this group. |
| [daproxy] | If FT DA setup is deployed, specify the host information for proxy. |

Each host specified in any of the groups can have unique or common credentials (username and password or ssh_key) and there are a few ways of configuring these details.

Scenario 1

When each host has the same user and same private key, it can be configured in two ways:

In the ansible.cfg 

[defaults]
remote_user=root
private_key_file=/path/to/file

Or you can configure in the inventory.remote

[all:vars]
ansible_user=root
ansible_ssh_private_key_file=/path/to/file

Scenario 2

If each host has a different user and key, you can specify these values against each host entry in the respective group:

[portal]
10.253.6.64 ansible_user="root" ansible_ssh_private_key_file=/path/to/file

Or you can introduce [group:vars] if you do not want to repeat for each host:

[dr:vars]
ansible_user="root" 
ansible_ssh_private_key_file=/path/to/file

Scenario 3

When each host has the same user and different password:

The user can be configured in the ansible.cfg

[defaults]
remote_user=root

And the password can be configured against each host in a group, or you can use [group:vars]:

[portal]
10.253.6.64 ansible_ssh_pass="mypass"

Or as follows:

[portal:vars]
ansible_ssh_pass="mypass"

Scenario 4

When each host has the same user and same password:

The user and password can be configured in the ansible.cfg

[defaults]
remote_user=root
connection_password_file="/path/to/file"

Or, if the password is to be specified directly, you can set only the user in ansible.cfg and specify the password using ansible_ssh_pass in the inventory file, as in Scenario 3.

[defaults]
remote_user=root

Example

This file captures all the inventory information. If required, you can also choose selectively what you want to upgrade.

For example, if you specify only the Spectrum-related information, the other components are skipped. Ideally you should run for all components, but in preflight mode you can choose components.

[orchestrator]
localhost ansible_connection=local

[portal]
10.253.6.64 ansible_user="root" ansible_ssh_pass="valid-ssh-password"

[portaldb]
# if we have a dedicated MySQL machine add that under this group

# Atleast one DR host out of the DR cluster should be defined. It'll be used as the Primary host to upgrade
[dr]
10.253.7.76

# Remove this daproxy entry for non fault-tolerant DA, this should appear only for Fault tolerant DA (FT DA)
[daproxy]
10.253.7.80

[da]
10.253.6.181
10.253.7.78

[kafka]

10.253.6.88 
10.253.6.79

[spectrum_primary]
10.253.7.77 ss_install="yes" oc_install="yes" spectrum_owner="spectrum"

[spectrum_secondary]
10.253.7.77 ss_install="yes" oc_install="yes"

[vna]
10.253.6.246

[camm]
# remote-vna-host  ansible_user="valid-ssh-user" ansible_ssh_pass=***

[sdc]
# remote-sdc-or-trapx-host trapx_enable="" sdc_trapx_installation_path="" sdc_trapx_install_owner="" ansible_user="valid-ssh-user" ansible_ssh_pass=***

[spectrum:children]
spectrum_primary
spectrum_secondary
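
Since the groups that contain active (uncommented) hosts determine which components are processed, it can be useful to list them before a run. The following is a minimal sketch of a reader for this INI-style layout. It is not Ansible's own inventory parser, the function name active_hosts is hypothetical, and it ignores [group:vars] and [group:children] sections as well as host ranges.

```python
def active_hosts(inventory_text: str) -> dict[str, list[str]]:
    """Return each plain group with the hosts that are not commented out."""
    groups: dict[str, list[str]] = {}
    current = None
    for raw in inventory_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # blank lines and comments carry no hosts
        if line.startswith("[") and line.endswith("]"):
            name = line[1:-1]
            # Sections such as [dr:vars] or [spectrum:children] hold no hosts.
            current = None if ":" in name else name
            if current is not None:
                groups.setdefault(current, [])
            continue
        if current is not None:
            # The host name or IP is the first token; the rest are variables.
            groups[current].append(line.split()[0])
    return groups


SAMPLE = """\
[orchestrator]
localhost ansible_connection=local

[portal]
# remote-portal-host ansible_user="valid-ssh-user"

[spectrum_primary]
10.253.7.77 ss_install="yes" oc_install="yes" spectrum_owner="spectrum"

[spectrum:children]
spectrum_primary
spectrum_secondary
"""

if __name__ == "__main__":
    for group, hosts in active_hosts(SAMPLE).items():
        print(group, hosts if hosts else "(no hosts, will be skipped)")
```

For anything beyond a quick sanity check, the ansible-inventory command gives the view that Ansible itself will use.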

ZTA modes

There are two modes of operation that have been scripted:

  • Preflight: ./preflight-all-remote.sh
  • Upgrade: ./upgrade-all-remote.sh

Preflight. This is a test mode that runs through the complete inventory in a defined sequence, identifies potential failures, and, if failures occur, generates appropriate messages about the system inadequacies. Until these failures are corrected, you should not run the upgrade script. Once preflight succeeds, the chances of a successful upgrade are much higher.

Upgrade. This mode upgrades your deployment to the desired version. It is recommended to run Preflight before you run the actual upgrade.
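
The recommended order (preflight first, upgrade only afterwards) can be enforced with a small wrapper. The following is a minimal sketch that assumes both scripts signal failure through a non-zero exit code; this post does not confirm that behavior, so verify it against your package and read the printed results as well. The function name preflight_then_upgrade is hypothetical.

```python
import subprocess

PREFLIGHT = "./preflight-all-remote.sh"
UPGRADE = "./upgrade-all-remote.sh"


def preflight_then_upgrade(run=subprocess.run) -> bool:
    """Run the upgrade only when preflight exits with status 0.

    ASSUMPTION: a non-zero exit code means failure. The `run` parameter
    exists so the logic can be exercised without the real scripts.
    """
    preflight = run([PREFLIGHT])
    if preflight.returncode != 0:
        print("Preflight failed; fix the reported issues before upgrading.")
        return False
    upgrade = run([UPGRADE])
    return upgrade.returncode == 0
```

To use it from inside the container, call preflight_then_upgrade() from the directory that holds the two scripts.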

Example

For our test run, I configured ansible.cfg as follows.

Key points:

  1. Look at connection_password_file, which holds the password information for the remote_user, sruser.
  2. The privilege_escalation section configures sudo privileges and the host user that the playbook should execute as.

ansible.cfg

[defaults]
inventory = ./inventory.remote
roles_path = roles
stdout_callback = community.general.diy
show_custom_stats = true
host_key_checking = false
remote_user=sruser
connection_password_file=./connection.yml
become_password_file=./connection.yml
remote_tmp=~/.ansible/tmp
allow_world_readable_tmpfiles=True

# Local log file if you want to create
# log_path=mylog.log
# for printing debug info to stdout
# stdout_callback = minimal

[callback_diy]
playbook_on_start_msg="Welcome to NetOps Ansible: "
playbook_on_start_msg_color=yellow
playbook_on_play_start_msg="PLAY: starting play "

[privilege_escalation]
#  If you want to define a become user that will be used for installation
become=True
become_user=root
# become_password = "interOP@123"
# become_allow_same_user=False
# become_ask_pass=False
# become_method=sudo

The inventory.remote contains only the Spectrum host.

inventory.remote

[orchestrator]
localhost ansible_connection=local

[portal]
# remote-portal-host ansible_ssh_user="valid-user" ansible_connection=ssh  ansible_user="valid-ssh-user" ansible_ssh_pass=***

[portaldb]
# if we have a dedicated MySQL machine add that under this group

# Atleast one DR host out of the DR cluster should be defined. It'll be used as the Primary host to upgrade
[dr]
# remote-dr-host-1 ansible_ssh_user="valid-user" ansible_connection=ssh  ansible_user="valid-ssh-user" ansible_ssh_pass=***

# Remove this daproxy entry for non fault-tolerant DA, this should appear only for Fault tolerant DA (FT DA)
[daproxy]
# remote-daproxy-host ansible_ssh_user="valid-user" ansible_connection=ssh  ansible_user="valid-ssh-user" ansible_ssh_pass=***

[da]
# remote-da-host-1 ansible_ssh_user="valid-user" ansible_connection=ssh  ansible_user="valid-ssh-user" ansible_ssh_pass=***
# For FT DA there will be two DA hosts
# remote-da-host-2 ansible_ssh_user="valid-user" ansible_connection=ssh  ansible_user="valid-ssh-user" ansible_ssh_pass=***


[kafka]
# kafka-broker1.netops.broadcom.net ansible_ssh_user="valid-user" ansible_connection=ssh  ansible_user="valid-ssh-user" ansible_ssh_pass=***

[spectrum_primary]
10.253.7.77 ss_install="yes" oc_install="yes" spectrum_owner="spectrum"

[spectrum_secondary]
# remote-spectrum-host ss_install="" oc_install="" spectrum_owner="" exclude_parts="" remote_username="" remote_password="" ansible_connection=ssh  ansible_user="valid-ssh-user" ansible_ssh_pass=***

[vna]
# remote-vna-host ansible_ssh_user="valid-user" ansible_connection=ssh  ansible_user="valid-ssh-user" ansible_ssh_pass=***

[camm]
# remote-vna-host ansible_ssh_user="valid-user" ansible_connection=ssh  ansible_user="valid-ssh-user" ansible_ssh_pass=***

[sdc]
# remote-sdc-or-trapx-host trapx_enable="" sdc_trapx_installation_path="" sdc_trapx_install_owner="" ansible_connection=ssh  ansible_user="valid-ssh-user" ansible_ssh_pass=***

[spectrum:children]
spectrum_primary
spectrum_secondary

Our variables.yml

# Defines all the variables required for each role
---
product_name: Netops
# ansible_user: root
# Version information
minimum_version: 22.2.11

# Turn this ON when admin wants to force same version upgrade
# force_upgrade: true

# Directory information
local_installer_dir: /installers
notMolecule: true # this should always be true for Client environment

# When configured, this variable will be used to set the IATEMPDIR environment variable. It is used for temporary directory configuration during upgrades.
remote_installer_dir: /tmp

# Portal variables
# portal_upgrade_file_name: Portal-SilentInstaller.properties
# my_sql_password:

# this user is required for Portal MySql and default value is "mysql"
remote_portal_db_install_user: mysql

########### Variables required for DAProxy role #########################################################
# da_upgrade_file_name: "DAProxy-SilentInstaller.properties"
daproxy_port: 8500
da_port: 8581

########### Variables required for DA role #########################################################
# da_upgrade_file_name: "DA-SilentInstaller.properties"
# da_rs_password: "Specify the password"
# da_rs_username: admin

########### Variables required for DC role #########################################################
da_secured: false
validate_certificates: true
# da_port: 8581

###########  Spectrum role #########################################################################
# main_loc_serv is the MANDATORY variable to be updated by user. Set the fully qualified host name of the
# Spectrum main location server to main_loc_serv and uncomment it. Do not enter an IP address in place of a hostname.
#
main_loc_serv: "molecule-ss.netops.broadcom.net"
#
# Below variables are used for non-root installation.
# Set non_root_installation to 'yes' and provide group name of the non-root spectrum install user.
# If spectrum_group is not same for all the users, update it in spectrum_primary , spectrum_secondary host entries
non_root_installation: "no"
spectrum_group: "spectrum"
# Properties that are common are defined below, these values will be used when host specific parameters not found.
xtn_install: "yes"
locale: "en_US"
ignore_disk_space: "no"
remove_vnmdb_lock: "yes"
# hostargs and password file path relative to the service directory.
primary_hostargs_file: "../primary_host.txt"
primary_password_file: "../primary_pass.txt"
secondary_hostargs_file: "../secondary_host.txt"
secondary_password_file: "../secondary_pass.txt"

# option to copy installer in remote/ Target node and run installation / run installation from controller node.
remote_to_localcopy_upgrade: "yes"

###########  Variables required for VNA role #########################################################
mysql_superuser_password: admin
vna_ui_password: admin

###########  Variables required for CAMM role #########################################################
user_install_dir: "/opt/CA/CAMM"

###########  SDC role ##############################################################################
# Below variables are used for non-root SDC installation.
# Set sdc_trapx_non_root_installation to 'yes' and provide group name of the non-root sdc install user.
# If sdc_install_owner_group is not same for all the users, update it in sdc host entries.
sdc_trapx_non_root_installation: "no"
sdc_install_owner_group: 

Running preflight with ./preflight-all-remote.sh gives output like the following (sample output):

**************************************************************
Starting Netops Kafka Pre-Flight
**************************************************************
"Welcome to NetOps Ansible: services/netops-kafka.yml"
"PLAY: starting play NetOps Kafka Deployment"
skipping: no hosts matched

PLAY RECAP *********************************************************************


**************************************************************
Starting Netops Portal Pre-Flight
**************************************************************
"Welcome to NetOps Ansible: services/netops-portal.yml"
"PLAY: starting play Portal playbook"
skipping: no hosts matched

PLAY RECAP *********************************************************************


**************************************************************
Starting Netops Data Aggregator Proxy Pre-Flight
**************************************************************
"Welcome to NetOps Ansible: services/netops-daproxy.yml"
"PLAY: starting play DAProxy playbook"
skipping: no hosts matched

PLAY RECAP *********************************************************************


**************************************************************
Starting Netops Data Repository Pre-Flight
**************************************************************
"Welcome to NetOps Ansible: services/netops-dr.yml"
"PLAY: starting play DX Netops DR playbook"
skipping: no hosts matched

PLAY RECAP *********************************************************************


**************************************************************
Starting Netops Data Aggregator Pre-Flight
**************************************************************
"Welcome to NetOps Ansible: services/netops-da.yml"
"PLAY: starting play DA playbook"
skipping: no hosts matched

PLAY RECAP *********************************************************************


**************************************************************
Starting Netops Data Collector Pre-Flight
**************************************************************
"Welcome to NetOps Ansible: services/netops-dc.yml"
"PLAY: starting play DC playbook"
skipping: no hosts matched

PLAY RECAP *********************************************************************


**************************************************************
Starting Netops Spectrum Pre-Flight
**************************************************************
"Welcome to NetOps Ansible: services/netops-spectrum.yml"
"PLAY: starting play Spectrum playbook"

TASK [Gathering Facts] *********************************************************
ok: [10.253.7.77]

TASK [Check if /tmp is a valid directory and configure IATEMPDIR] **************

...........
...........


PLAY RECAP *********************************************************************
10.253.7.77                : ok=61   changed=6    unreachable=0    failed=0    skipped=72   rescued=0    ignored=0   


**************************************************************
Starting Netops VNA Pre-Flight
**************************************************************
"Welcome to NetOps Ansible: services/netops-vna.yml"
"PLAY: starting play VNA playbook"
skipping: no hosts matched

PLAY RECAP *********************************************************************


**************************************************************
Starting Netops CAMM Pre-Flight
**************************************************************
"Welcome to NetOps Ansible: services/netops-camm.yml"
"PLAY: starting play CAMM playbook"
skipping: no hosts matched

PLAY RECAP *********************************************************************


**************************************************************
NetOps Pre-Flight Results
**************************************************************

Result: Netops Kafka
-----------------------------------------------------------------------------
Success

Result: Netops Portal
-----------------------------------------------------------------------------
Success

Result: Netops Data Aggregator Proxy
-----------------------------------------------------------------------------
Success

Result: Netops Data Repository
-----------------------------------------------------------------------------
Success

Result: Netops Data Aggregator
-----------------------------------------------------------------------------
Success

Result: Netops Data Collector
-----------------------------------------------------------------------------
Success

Result: Netops Spectrum
-----------------------------------------------------------------------------
Success

Result: Netops VNA
-----------------------------------------------------------------------------
Success

Result: Netops CAMM
-----------------------------------------------------------------------------
Success

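Since we only had Spectrum configured, preflight runs only for the host defined in the spectrum_primary group, and the rest are skipped. Note that components with no matching hosts are still reported as Success in the results summary.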
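
If you capture the preflight output to a file, the results summary can be checked programmatically. The following is a minimal sketch that assumes the summary always follows the layout of the sample output (a "Result:" line, a separator of dashes, then the status). The function names are hypothetical, and because the wording of a failed status is not shown in this post, anything other than "Success" is treated as a failure.

```python
def parse_results(output: str) -> dict[str, str]:
    """Map each component named on a 'Result:' line to its status line."""
    lines = [line.strip() for line in output.splitlines()]
    results = {}
    for index, line in enumerate(lines):
        if not line.startswith("Result:"):
            continue
        component = line[len("Result:"):].strip()
        status = "unknown"
        # The status is the first non-empty, non-separator line that follows.
        for follower in lines[index + 1:]:
            if follower.startswith("Result:"):
                break  # no status was printed for this component
            if follower and set(follower) != {"-"}:
                status = follower
                break
        results[component] = status
    return results


def failed_components(output: str) -> list[str]:
    """Return components whose status is anything other than 'Success'."""
    return [name for name, status in parse_results(output).items()
            if status != "Success"]


SAMPLE = """\
Result: Netops Kafka
-----------------------------------------------------------------------------
Success

Result: Netops Spectrum
-----------------------------------------------------------------------------
Success
"""

if __name__ == "__main__":
    print(parse_results(SAMPLE))
    print(failed_components(SAMPLE))
```

Keep in mind that, in the sample run, a Success entry for a component with no configured hosts only means that nothing was checked for it.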