Solace Node Health Test

This test auto-discovers the nodes in the target Solace Cluster and reports the current status of each node, thereby helping administrators identify any node that is down. The test also reports the state of ConfigSync on each node, which indicates whether automatic synchronization between the primary and backup nodes is enabled. If ConfigSync is disabled or down, configurations can get out of sync between the nodes, eventually leading to replication issues and message loss. In addition, the test monitors the redundancy health and the status of the Message spool on each node. This way, administrators are promptly alerted to any activity failover, the current redundancy mode, and the internal state of the Guaranteed Messaging Redundancy facility of each node in the cluster.

Target of the test : A Solace Cluster

Agent deploying the test : An external agent

Outputs of the test : One set of results for each node in the target cluster that is to be monitored.

Configurable parameters for the test
Parameter Description

Test Period

How often should the test be executed.

Host

The IP address of the target host for which this test is to be configured.

Port

The port on which the Solace Cluster listens.

UserName, Password and Confirm Password

The eG agent uses the SEMP API to collect metrics from all the nodes in the Solace Cluster. To enable the eG agent to access the SEMP API and collect metrics, a user with read-only privileges must exist on every node in the cluster that requires monitoring. If no such user exists, you have to create one manually with these privileges. For the procedure, refer to: Creating a New User for Monitoring Solace PubSub+ Event Broker.

Specify the credentials of such a user against the User Name and Password parameters. Confirm the Password by retyping it in the Confirm Password text box.

Total Cluster Nodes

In this text box, provide a comma-separated list of both the primary and backup nodes in the cluster that require monitoring. Specify the nodes in the following format: HOSTNAME1:PORT1,HOSTNAME2:PORT2,... . For example, 172.16.8.233#8080,172.16.8.235#8080,....

Primary Nodes

The eG agent needs to connect to the SEMP API on the primary node and run API commands to collect metrics. For this purpose, configure the eG agent with the details of the primary node in this text box. Specify the node details in the following format: HOSTNAME:PORT. For example, 172.16.8.233#8080.
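As an illustration of the node-list format used by the Total Cluster Nodes and Primary Nodes parameters, the following minimal Python sketch splits such a value into host and port pairs. The format is stated with a colon separator while the examples use `#`, so the sketch assumes either separator may appear. `parse_nodes` is a hypothetical helper name and is not part of eG Enterprise.

```python
import re
from typing import List, Tuple

# Assumption: host and port are separated by either ':' or '#'.
_NODE_PATTERN = re.compile(r"([^:#\s,]+)[:#](\d+)")


def parse_nodes(spec: str) -> List[Tuple[str, int]]:
    """Split a comma-separated node list into (host, port) pairs."""
    nodes = []
    for entry in spec.split(","):
        entry = entry.strip()
        if not entry:
            # Ignore empty entries, e.g. after a trailing comma.
            continue
        match = _NODE_PATTERN.fullmatch(entry)
        if match is None:
            raise ValueError(f"Not a HOSTNAME:PORT entry: {entry!r}")
        nodes.append((match.group(1), int(match.group(2))))
    return nodes


if __name__ == "__main__":
    for host, port in parse_nodes("172.16.8.233#8080,172.16.8.235#8080"):
        print(host, port)
```

Rejecting malformed entries with an error, rather than skipping them silently, makes a mistyped node easier to spot before the test is configured.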

SSL

By default, this flag is set to No, indicating that the Solace Cluster is not SSL-enabled. Set this flag to Yes if the Solace Cluster is SSL-enabled.

Detailed Diagnosis

To make diagnosis more efficient and accurate, eG Enterprise embeds an optional detailed diagnostic capability. With this capability, the eG agents can be configured to run detailed, more elaborate tests as and when specific problems are detected. To enable the detailed diagnosis capability of this test for a particular server, choose the On option. To disable the capability, choose the Off option.

The option to selectively enable/disable the detailed diagnosis capability will be available only if the following conditions are fulfilled:

  • The eG manager license should allow the detailed diagnosis capability
  • Both the normal and abnormal frequencies configured for the detailed diagnosis measures should not be 0.
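To show how the Host, Port, SSL, User Name and Password parameters fit together, here is a minimal Python sketch that derives connection settings from them. It assumes the SSL flag simply selects between `http` and `https`, and that the monitoring user is sent using HTTP Basic authentication. The function names are hypothetical, and this is not a description of how the eG agent is implemented.

```python
import base64
from typing import Dict


def base_url(host: str, port: int, ssl: bool = False) -> str:
    """Build the base URL for a node from the Host, Port and SSL parameters."""
    # Assumption: SSL = Yes maps to https, SSL = No maps to http.
    scheme = "https" if ssl else "http"
    return f"{scheme}://{host}:{port}"


def basic_auth_header(username: str, password: str) -> Dict[str, str]:
    """Encode the read-only monitoring user's credentials as a Basic auth header."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(base_url("172.16.8.233", 8080, ssl=False))
    print(sorted(basic_auth_header("monitor", "secret")))
```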
Measurements made by the test

Measurement

Description

Measurement Unit

Interpretation

Node status

Indicates the current status of this node.

The values reported by this measure and their numeric equivalents are mentioned in the table below:

Measure values Numeric values
Down 0
Up 1

Note:

By default, this measure reports the Measure Values listed in the table above to indicate the current status of this node. The graph of this measure, however, is represented using the numeric equivalents only, i.e., 0 or 1.

If the status is Up then this node is operational and available to provide service.

If the status is Down then this node is not operational and is unavailable to provide service.

Redundancy health

Indicates the current redundancy state of this node.

The values reported by this measure and their numeric equivalents are mentioned in the table below:

Measure values Numeric values
Shutdown 0
Degraded 1
Primary-Active 2
Primary-Standby 3
Backup-Active 4
Backup-Standby 5

Note:

By default, this measure reports the Measure Values listed in the table above to indicate the current redundancy state of this node. The graph of this measure, however, is represented using the numeric equivalents only, i.e., 0 to 5.

Use the detailed diagnosis of this measure to find out the details of the node like Active Standby role, Activity status, Redundancy configuration, Redundancy, ADB Link up, and ADB hello up.

Messagespool status

Indicates the current state of the Message spool on this node.

The values reported by this measure and their numeric equivalents are mentioned in the table below:

Measure values Numeric values
Unknown 0
Primary-AD-Active 1
Primary-AD-Standby 2
Backup-AD-Active 3
Backup-AD-Standby 4

Note:

By default, this measure reports the Measure Values listed in the table above to indicate the current status of the Message spool on this node. The graph of this measure, however, is represented using the numeric equivalents only, i.e., 0 to 4.

Use the detailed diagnosis of this measure to find out the Messagespool configuration and Messagespool operational status.

ConfigSync status

Indicates the current state of ConfigSync on this node.

The values reported by this measure and their numeric equivalents are mentioned in the table below:

Measure values Numeric values
Down 0
Disabled 1
Up 2

Note:

By default, this measure reports the Measure Values listed in the table above to indicate the current state of ConfigSync on this node. The graph of this measure, however, is represented using the numeric equivalents only, i.e., 0 to 2.

The Config-Sync feature can be used for Solace PubSub+ software event brokers to automatically synchronize:

  • configuration properties between event brokers in high-availability (HA) pairs.

  • replication‑enabled Message VPNs on one replication site with the corresponding Message VPNs on its mate replication site.

Use the detailed diagnosis of this measure to find out the Admin and operational status.
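The measure values and numeric equivalents listed for the four measures above can be captured in a small lookup, which is handy when reading graphs that show only the numbers. The following Python sketch is illustrative. The dictionary and function names are hypothetical, and `configsync_at_risk` encodes only the statement that a Disabled or Down ConfigSync can let configurations get out of sync.

```python
from typing import Dict

# Measure values and numeric equivalents, as listed in the tables above.
MEASURES: Dict[str, Dict[str, int]] = {
    "Node status": {"Down": 0, "Up": 1},
    "Redundancy health": {
        "Shutdown": 0,
        "Degraded": 1,
        "Primary-Active": 2,
        "Primary-Standby": 3,
        "Backup-Active": 4,
        "Backup-Standby": 5,
    },
    "Messagespool status": {
        "Unknown": 0,
        "Primary-AD-Active": 1,
        "Primary-AD-Standby": 2,
        "Backup-AD-Active": 3,
        "Backup-AD-Standby": 4,
    },
    "ConfigSync status": {"Down": 0, "Disabled": 1, "Up": 2},
}


def numeric_value(measure: str, value: str) -> int:
    """Return the numeric equivalent plotted on the graph for a measure value."""
    return MEASURES[measure][value]


def measure_value(measure: str, number: int) -> str:
    """Translate a number read off a graph back to its measure value."""
    for name, code in MEASURES[measure].items():
        if code == number:
            return name
    raise KeyError(f"{measure} has no value with numeric equivalent {number}")


def configsync_at_risk(value: str) -> bool:
    """True when ConfigSync is Down or Disabled, so configurations may drift."""
    return value in ("Down", "Disabled")


if __name__ == "__main__":
    print(measure_value("Redundancy health", 4))
    print(numeric_value("ConfigSync status", "Up"))
```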

The detailed diagnosis of the Redundancy health measure reveals further details like Active Standby role, Activity status, Redundancy configuration, Redundancy, ADB Link up, and ADB hello up.

Figure 1 : Detailed diagnosis of Redundancy health measure

The detailed diagnosis of the Messagespool status measure reveals further details like Messagespool configuration and Messagespool operational status.

Figure 2 : Detailed diagnosis of Messagespool status measure

The detailed diagnosis of the ConfigSync status measure reveals further details like Admin and operational status.

Figure 3 : Detailed diagnosis of ConfigSync status