Trouble Tickets by Org Test

Alarms are an integral part of the monitoring process. Missed or lost alarms can lead to catastrophic outcomes in the monitored environment and call the efficiency of the monitoring tool into question. It is therefore important to ensure that alarms are delivered on time to the corresponding maintenance personnel. The eG Enterprise system can be seamlessly integrated with an existing trouble ticketing (TT) system in the target environment. eG integrates with most trouble ticket/collaboration systems using their APIs. When alarms are generated, the eG manager makes API calls to the TT system to create, update, or close trouble tickets and route them to the appropriate executive. However, when the eG manager tries to send many alerts to the external TT system simultaneously, bottlenecks can occur. If the eG manager faces latency issues, ticket creation can in turn be delayed or halted, resulting in missed or lost alarms. Hence, it is crucial to monitor the integration of the eG manager with the TT system.

This test monitors every organization configured on the eG manager and reports the number of API calls made to the external TT system and the number of calls that succeeded. In addition, this test keeps track of failed API calls, the number of tickets created/updated/closed, the number of retried API calls, and dead tickets. Using these measures, administrators can detect any slowdowns or latency issues faced by the manager and proactively remediate them before they affect the user experience.

Target of the test : The eG Manager

Agent deploying the test : An internal/remote agent

Outputs of the test : One set of results for each organization configured on the eG manager.

Configurable parameters for the test
Parameter Description

Test period

How often should the test be executed.

Host

The host for which the test is to be configured.

Port

The port number at which the specified host listens.

JMX Remote Port

Here, specify the port at which JMX listens for requests from remote hosts. In the <EG_MANAGER_INSTALL_DIR>\manager directory of the eG manager (on Windows; on Unix, this will be the /opt/egurkha/manager directory), you will find a management.properties file. Set the port defined against the com.sun.management.jmxremote.port parameter of this file as the JMX Remote Port.
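To illustrate the lookup described above, the following minimal sketch extracts the com.sun.management.jmxremote.port value from the text of a management.properties file. The function name read_jmx_port, the sample file content, and the port value 9990 are hypothetical and used only for illustration. The parser handles simple key=value lines only, not the full Java properties syntax.

```python
from typing import Optional


def read_jmx_port(properties_text: str) -> Optional[int]:
    """Return the com.sun.management.jmxremote.port value, or None if absent."""
    key = "com.sun.management.jmxremote.port"
    for raw_line in properties_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comment lines ('#' or '!' in properties files).
        if not line or line.startswith(("#", "!")):
            continue
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            return int(value.strip())
    return None


# Hypothetical file content; a real file would be read from the manager directory.
sample = """
# JMX settings
com.sun.management.jmxremote.port=9990
com.sun.management.jmxremote.authenticate=false
"""
jmx_port = read_jmx_port(sample)
print(jmx_port)
```

The value returned by such a lookup is what would be entered against the JMX Remote Port parameter.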

JMX User and JMX Password

By default, JMX requires no authentication or security. Therefore, the JMX User and JMX Password parameters are set to none by default.

JNDIName

The JNDIName is a lookup name for connecting to the JMX connector. By default, this is jmxrmi. If you have registered the JMX connector in the RMI registry using a different lookup name, then you can change this default value to reflect the same.

Provider

This test uses a JMX Provider to access the MBean attributes of the eG manager and collect metrics. Specify the package name of this JMX Provider here. By default, this is set to com.sun.jmx.remote.protocol.

JMX Bind Address

By default, this flag is set to Local Host. This implies that, by default, JMX binds to the local IP address, i.e., 127.0.0.1. If the flag is set to Other IP, then JMX binds to the IP address specified against the Host parameter.
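The JMX parameters above can be pictured as the pieces of a standard JMX RMI service URL. The sketch below is only an illustration under stated assumptions: the function jmx_service_url is hypothetical, the host and port values are made up, and it is an assumption that the test connects to the same address that JMX binds to. Only the URL layout (service:jmx:rmi:///jndi/rmi://host:port/name) is the standard JMX RMI connector form.

```python
def jmx_service_url(host: str, jmx_remote_port: int,
                    jndi_name: str = "jmxrmi",
                    bind_address: str = "Local Host") -> str:
    """Build a JMX RMI service URL from the test parameters (illustrative only)."""
    if bind_address == "Local Host":
        target = "127.0.0.1"   # JMX bound to the local IP address
    elif bind_address == "Other IP":
        target = host          # JMX bound to the IP given against Host
    else:
        raise ValueError(f"unknown JMX Bind Address: {bind_address!r}")
    return f"service:jmx:rmi:///jndi/rmi://{target}:{jmx_remote_port}/{jndi_name}"


# Hypothetical values for Host and JMX Remote Port.
print(jmx_service_url("192.168.10.5", 9990))
print(jmx_service_url("192.168.10.5", 9990, bind_address="Other IP"))
```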

Timeout

Specify the duration (in seconds) for which this test should wait for a response from the eG manager. If there is no response from the eG manager within the configured duration, the test will time out. By default, this is set to 240 seconds.

DD Frequency

Refers to the frequency with which detailed diagnosis measures are to be generated for this test. The default is 6:1. This indicates that, by default, detailed measures will be generated every sixth time this test runs, and also every time the test detects a problem. You can modify this frequency if you so desire. To disable the detailed diagnosis capability for this test, specify none against DD Frequency.
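A DD Frequency value such as 6:1 can be split into its two parts before it is used. The sketch below is a minimal illustration. It assumes the value takes the form normal:abnormal (one frequency for normal runs, one for problem conditions) and that none disables the capability. The function parse_dd_frequency is hypothetical and not part of eG Enterprise.

```python
from typing import Optional, Tuple


def parse_dd_frequency(value: str) -> Optional[Tuple[int, int]]:
    """Split 'normal:abnormal' into two integers; return None for 'none'."""
    text = value.strip()
    if text.lower() == "none":
        return None  # detailed diagnosis disabled for this test
    normal, sep, abnormal = text.partition(":")
    if not sep:
        raise ValueError(f"expected normal:abnormal or none, got {value!r}")
    return int(normal), int(abnormal)


print(parse_dd_frequency("6:1"))
print(parse_dd_frequency("none"))
```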

Detailed Diagnosis

To make diagnosis more efficient and accurate, eG Enterprise embeds an optional detailed diagnostic capability. With this capability, the eG agents can be configured to run detailed, more elaborate tests as and when specific problems are detected. To enable the detailed diagnosis capability of this test for a particular server, choose the On option. To disable the capability, choose the Off option.

The option to selectively enable/disable the detailed diagnosis capability will be available only if the following conditions are fulfilled:

  • The eG manager license should allow the detailed diagnosis capability.
  • Both the normal and abnormal frequencies configured for the detailed diagnosis measures should not be 0.

Measurements made by the test
Measurement Description Measurement Unit Interpretation

TT Manager status

Indicates the current status of the target manager.

 

The values that this measure can report and their corresponding numeric values have been listed in the table below:

Measure Value    Numeric Value
Not Running      0
Running          1

Note:

By default, the test reports the Measure Values listed in the table above to indicate the current running status of the manager. In the graph of this measure however, the same is represented using the numeric equivalents only.

Total API calls

Indicates the total number of API calls made from the eG manager to the TT system for this organization during the last measurement period.

Number

 

Percent of API calls succeeded

Indicates the percentage of API calls that succeeded for this organization during the last measurement period.

Percent

 

Average response time

Indicates the average time taken to receive a response from the TT system.

Seconds

If the average response time is higher than usual, it indicates latency issues. This can in turn delay the creation, update, or closure of tickets in the TT system, leading to missed alarms.
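One way to judge whether a response time is "higher than usual" is to compare the latest value against a baseline built from earlier measurement periods. The sketch below is only one possible approach and does not describe how eG Enterprise computes its thresholds. The function is_unusually_high, the sample values, and the multiplier k are all hypothetical.

```python
from statistics import mean, pstdev


def is_unusually_high(history, latest, k=3.0):
    """Flag latest if it exceeds the historical mean by more than k standard deviations."""
    if len(history) < 2:
        return False  # not enough data to form a baseline
    baseline = mean(history)
    spread = pstdev(history)
    return latest > baseline + k * spread


# Hypothetical average response times (in seconds) from earlier periods.
past = [0.8, 1.0, 1.2, 0.9, 1.1]
print(is_unusually_high(past, 2.5))
print(is_unusually_high(past, 1.1))
```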

Average time taken to create ticket

Indicates the average time taken by the TT system to create the ticket for this organization.

Seconds

A high value for this measure indicates latency issues in the eG manager that can delay the creation of tickets in the TT system.

Average time taken to update ticket

Indicates the average time taken by the TT system to update the ticket for this organization.

Seconds

A high value for this measure indicates latency issues in the eG manager that can delay the updating of tickets in the TT system.

Average time taken to close ticket

Indicates the average time taken by the TT system to close the ticket for this organization.

Seconds

A high value for this measure indicates latency issues in the eG manager that can delay the closing of tickets in the TT system.

API calls succeeded

Indicates the number of API calls that succeeded for this organization during the last measurement period.

Number

 

API calls failed

Indicates the number of API calls that failed for this organization during the last measurement period.

Number
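The call counts relate to the Percent of API calls succeeded measure in a straightforward way. The following sketch shows the arithmetic, assuming that every API call in a measurement period is counted as either succeeded or failed. That assumption, the function name success_percent, and the sample counts are illustrative only.

```python
from typing import Optional


def success_percent(succeeded: int, failed: int) -> Optional[float]:
    """Return the percentage of successful calls, or None if no calls were made."""
    total = succeeded + failed  # assumed to equal the Total API calls measure
    if total == 0:
        return None  # avoid dividing by zero when the period had no calls
    return 100.0 * succeeded / total


# Hypothetical counts for one organization in one measurement period.
print(success_percent(45, 5))
```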

 

Tickets created

Indicates the number of tickets created for this organization in the last measurement period.

Number

 

Tickets updated

Indicates the number of tickets updated for this organization in the last measurement period.

Number

 

Tickets closed

Indicates the number of tickets closed for this organization in the last measurement period.

Number

 

High priority tickets created

Indicates the number of high priority tickets created for this organization during the last measurement period.

Number

 

Medium priority tickets created

Indicates the number of medium priority tickets created for this organization during the last measurement period.

Number

 

Low priority tickets created

Indicates the number of low priority tickets created for this organization during the last measurement period.

Number

 

High priority tickets updated

Indicates the number of high priority tickets updated for this organization during the last measurement period.

Number

 

Medium priority tickets updated

Indicates the number of medium priority tickets updated for this organization during the last measurement period.

Number

 

Low priority tickets updated

Indicates the number of low priority tickets updated for this organization during the last measurement period.

Number

 

Retried API calls

Indicates the number of retried API calls for this organization during the last measurement period.

Number

 

Retried API calls succeeded

Indicates the number of retried API calls that succeeded for this organization during the last measurement period.

Number

 

Retried API calls failed

Indicates the number of retried API calls that failed for this organization during the last measurement period.

Number

 

Dead tickets

Indicates the number of dead tickets for this organization during the last measurement period.

Number

Dead tickets are retried API calls that failed even after maximum retry attempts.

Use the detailed diagnosis of this measure to learn more about the dead tickets.
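The relationship between retried calls and dead tickets can be sketched as a simple retry loop. This is a minimal illustration and not the eG manager's implementation: the function deliver, the counter names, the max_retries value, and the flaky test helper are all hypothetical, and how the manager actually counts retries is an assumption.

```python
from collections import Counter


def deliver(send, ticket, max_retries, stats):
    """Call send(ticket) once, then retry up to max_retries times on failure."""
    stats["api_calls"] += 1
    if send(ticket):
        stats["succeeded"] += 1
        return True
    stats["failed"] += 1
    for _ in range(max_retries):
        stats["retried"] += 1
        if send(ticket):
            stats["retried_succeeded"] += 1
            return True
        stats["retried_failed"] += 1
    stats["dead"] += 1  # still failing after the maximum retry attempts
    return False


def flaky(failures):
    """Return a fake send function that fails the first `failures` times."""
    remaining = [failures]

    def send(ticket):
        if remaining[0] > 0:
            remaining[0] -= 1
            return False
        return True

    return send


stats = Counter()
deliver(flaky(2), "T-1", 3, stats)   # recovers on the second retry
deliver(flaky(10), "T-2", 3, stats)  # never recovers, so it becomes a dead ticket
print(dict(stats))
```

In this sketch, only the second ticket is counted as dead, because the first one succeeds before the retry limit is reached.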

Total alerts added

Indicates the total number of alerts added to the eG manager from the alarm history for this organization during the last measurement period.

Number

 

Total alerts processed

Indicates the total number of alerts processed for this organization during the last measurement period.

Number

 

Total alerts to TT system

Indicates the total number of alerts sent to the TT system for this organization during the last measurement period.

Number