Trouble Tickets by Org Test
Alarms are an integral part of the monitoring process. Missed or lost alarms can lead to serious failures in the monitored environment and call the effectiveness of the monitoring tool into question. It is therefore important to ensure that alarms are delivered on time to the corresponding maintenance personnel. The eG Enterprise system can be seamlessly integrated with an existing trouble ticket (TT) system in the target environment. eG Enterprise integrates with most trouble ticket and collaboration systems through their APIs. When alarms are generated, the eG manager makes API calls to the TT system to create, update, or close trouble tickets and route them to the appropriate executive.
However, when the eG manager tries to send many alerts to the external TT system simultaneously, bottlenecks can occur. If the eG manager experiences latency, ticket creation can in turn be delayed or halted, resulting in missed or lost alarms. Hence, it is crucial to monitor the integration of the eG manager with the TT system.
This test monitors every organization configured on the eG manager and reports the number of API calls made to the external TT system and the number of calls that succeeded. In addition, this test keeps track of the failed API calls, the number of tickets created/updated/closed, the number of retried API calls, and the number of dead tickets. Using these measures, administrators can detect slowdowns or latency issues faced by the manager and proactively remediate them before they affect the user experience.
Target of the test : The eG Manager
Agent deploying the test : An internal/remote agent
Outputs of the test : One set of results for each organization configured on the eG manager.
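
The retry and dead-ticket counters described above can be pictured with a small simulation. This is only a sketch under stated assumptions, not how the eG manager is implemented: the `OrgCounters` class, the `send_with_retries` function, the default retry limit of 3, and the choice to count each retry as an additional API call are all hypothetical and used purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class OrgCounters:
    """Hypothetical per-organization counters mirroring some measures of this test."""
    total_calls: int = 0
    succeeded: int = 0
    failed: int = 0
    retried: int = 0
    retried_succeeded: int = 0
    retried_failed: int = 0
    dead_tickets: int = 0


def send_with_retries(call: Callable[[], bool],
                      counters: OrgCounters,
                      max_retries: int = 3) -> bool:
    """Make one API call and retry on failure, up to max_retries times.

    `call` stands in for a request to the TT system and returns True on success.
    The retry limit is an assumed value, not a documented eG default.
    """
    counters.total_calls += 1
    if call():
        counters.succeeded += 1
        return True
    counters.failed += 1

    for _ in range(max_retries):
        # Assumption: every retry is counted as another API call.
        counters.total_calls += 1
        counters.retried += 1
        if call():
            counters.succeeded += 1
            counters.retried_succeeded += 1
            return True
        counters.failed += 1
        counters.retried_failed += 1

    # All retries exhausted: the request becomes a dead ticket.
    counters.dead_tickets += 1
    return False
```

In this model, a request that fails on the first attempt and on every retry increments the dead-ticket counter exactly once, which matches the idea that a dead ticket is a call that failed even after the maximum number of retry attempts.
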
Parameter | Description |
---|---|
Test period | How often should the test be executed. |
Host | The host for which the test is to be configured. |
Port | The port number at which the specified host listens. |
JMX Remote Port | Here, specify the port at which JMX listens for requests from remote hosts. In the <EG_MANAGER_INSTALL_DIR>\manager directory (on Windows; on Unix, this will be the /opt/egurkha/manager directory) of the eG manager, you will find a management.properties file. Set the port defined against the com.sun.management.jmxremote.port parameter of the file as the JMX Remote Port. |
JMX User and JMX Password | By default, JMX requires no authentication or security. Therefore, the JMX User and JMX Password parameters are set to none by default. |
JNDIName | The JNDIName is a lookup name for connecting to the JMX connector. By default, this is jmxrmi. If you have registered the JMX connector in the RMI registry using a different lookup name, then you can change this default value to reflect the same. |
Provider | This test uses a JMX Provider to access the MBean attributes of the eG manager and collect metrics. Specify the package name of this JMX Provider here. By default, this is set to com.sun.jmx.remote.protocol. |
JMX Bind Address | By default, this flag is set to Local Host. This implies that, by default, JMX binds to the local IP address, i.e., 127.0.0.1. If the flag is set to Other IP, then JMX binds to the IP address specified against the Host parameter. |
Timeout | Specify the duration (in seconds) for which this test should wait for a response from the eG manager. If there is no response from the eG manager beyond the configured duration, the test will time out. By default, this is set to 240 seconds. |
DD Frequency | Refers to the frequency with which detailed diagnosis measures are to be generated for this test. The default is 6:1. This indicates that, by default, detailed measures will be generated every time this test runs, and also every time the test detects a problem. You can modify this frequency, if you so desire. Also, if you intend to disable the detailed diagnosis capability for this test, you can do so by specifying none against DD Frequency. |
Detailed Diagnosis | To make diagnosis more efficient and accurate, eG Enterprise embeds an optional detailed diagnostic capability. With this capability, the eG agents can be configured to run detailed, more elaborate tests as and when specific problems are detected. To enable the detailed diagnosis capability of this test for a particular server, choose the On option. To disable the capability, click on the Off option. The option to selectively enable/disable the detailed diagnosis capability will be available only if the following conditions are fulfilled: |
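
To make the relationship between the JMX-related parameters more concrete, the sketch below reads the port from the text of a management.properties file and assembles a standard JMX RMI service URL from the Host, JMX Remote Port, JNDIName, and JMX Bind Address values. The function names, the sample port 9990, and the sample IP address are hypothetical. It is also an assumption that the test connects with a URL of exactly this form and that it targets 127.0.0.1 when the bind address is Local Host; the parser handles only simple `key=value` lines.

```python
def read_jmx_port(properties_text: str) -> int:
    """Return the value of com.sun.management.jmxremote.port from properties text."""
    for raw in properties_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines in the properties file.
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip() == "com.sun.management.jmxremote.port":
            return int(value.strip())
    raise KeyError("com.sun.management.jmxremote.port not found")


def jmx_service_url(host: str, port: int,
                    jndi_name: str = "jmxrmi",
                    bind_address: str = "Local Host") -> str:
    """Build a JMX RMI service URL from the test parameters.

    Assumption: 'Local Host' maps to 127.0.0.1, and 'Other IP' maps to
    the address given in the Host parameter.
    """
    target = "127.0.0.1" if bind_address == "Local Host" else host
    return f"service:jmx:rmi:///jndi/rmi://{target}:{port}/{jndi_name}"


# Hypothetical file content, for illustration only.
sample = (
    "# JMX settings\n"
    "com.sun.management.jmxremote.port=9990\n"
    "com.sun.management.jmxremote.ssl=false\n"
)
port = read_jmx_port(sample)
print(jmx_service_url("192.168.10.5", port, bind_address="Other IP"))
```

A URL built this way can be pasted into a JMX client such as JConsole to check that the port and lookup name configured for the test are reachable.
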
Measurement | Description | Measurement Unit | Interpretation |
---|---|---|---|
TT Manager status | Indicates the current status of the target manager. | | The values that this measure can report and their corresponding numeric values have been listed in the table below: Note: By default, the test reports the Measure Values listed in the table above to indicate the current running status of the manager. In the graph of this measure however, the same is represented using the numeric equivalents only. |
Total API calls | Indicates the total number of API calls made from the eG manager to the TT system for this organization during the last measurement period. | Number | |
Percent of API calls succeeded | Indicates the percentage of API calls that succeeded for this organization during the last measurement period. | Percent | |
Average response time | Indicates the average time taken to receive a response from the TT system. | Seconds | If the average response time is higher than usual, it indicates latency issues. This can in turn delay the creation, update, or closure of tickets in the TT system, leading to missed alarms. |
Average time taken to create ticket | Indicates the average time taken by the TT system to create a ticket for this organization. | Seconds | A high value for this measure is a clear indication of latency issues in the eG manager that can delay the creation of tickets in the TT system. |
Average time taken to update ticket | Indicates the average time taken by the TT system to update a ticket for this organization. | Seconds | A high value for this measure is a clear indication of latency issues in the eG manager that can delay the updating of tickets in the TT system. |
Average time taken to close ticket | Indicates the average time taken by the TT system to close a ticket for this organization. | Seconds | A high value for this measure is a clear indication of latency issues in the eG manager that can delay the closing of tickets in the TT system. |
API calls succeeded | Indicates the number of API calls that succeeded for this organization during the last measurement period. | Number | |
API calls failed | Indicates the number of API calls that failed for this organization during the last measurement period. | Number | |
Tickets created | Indicates the number of tickets created for this organization during the last measurement period. | Number | |
Tickets updated | Indicates the number of tickets updated for this organization during the last measurement period. | Number | |
Tickets closed | Indicates the number of tickets closed for this organization during the last measurement period. | Number | |
High priority tickets created | Indicates the number of high priority tickets created for this organization during the last measurement period. | Number | |
Medium priority tickets created | Indicates the number of medium priority tickets created for this organization during the last measurement period. | Number | |
Low priority tickets created | Indicates the number of low priority tickets created for this organization during the last measurement period. | Number | |
High priority tickets updated | Indicates the number of high priority tickets updated for this organization during the last measurement period. | Number | |
Medium priority tickets updated | Indicates the number of medium priority tickets updated for this organization during the last measurement period. | Number | |
Low priority tickets updated | Indicates the number of low priority tickets updated for this organization during the last measurement period. | Number | |
Retried API calls | Indicates the number of retried API calls for this organization during the last measurement period. | Number | |
Retried API calls succeeded | Indicates the number of retried API calls that succeeded for this organization during the last measurement period. | Number | |
Retried API calls failed | Indicates the number of retried API calls that failed for this organization during the last measurement period. | Number | |
Dead tickets | Indicates the number of dead tickets for this organization during the last measurement period. | Number | Dead tickets are retried API calls that failed even after the maximum number of retry attempts. Use the detailed diagnosis of this measure for more information on dead tickets. |
Total alerts added | Indicates the total number of alerts added to the eG manager from Alarm history for this organization during the last measurement period. | Number | |
Total alerts processed | Indicates the total number of alerts processed for this organization during the last measurement period. | Number | |
Total alerts to TT system | Indicates the total number of alerts sent to the TT system for this organization during the last measurement period. | Number | |
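
As a rough illustration of how the measures above might be read together, the following sketch derives a success percentage from the call counts and flags the conditions that the Interpretation column calls out: a low success rate, a high average response time, and the presence of dead tickets. The dictionary keys, the `review` function, and both thresholds are hypothetical choices made for this example, and it is an assumption that the percentage reported by the test is computed as succeeded calls divided by total calls.

```python
from typing import Dict, List, Optional


def success_percent(total_calls: int, succeeded: int) -> Optional[float]:
    """Return succeeded calls as a percentage of total calls, or None if no calls were made."""
    if total_calls <= 0:
        return None
    return 100.0 * succeeded / total_calls


def review(measures: Dict[str, float],
           min_success_percent: float = 99.0,
           max_response_secs: float = 5.0) -> List[str]:
    """Return human-readable findings for one organization's measures.

    Both thresholds are illustrative values, not eG defaults.
    """
    findings: List[str] = []

    pct = success_percent(int(measures["total_api_calls"]),
                          int(measures["api_calls_succeeded"]))
    if pct is not None and pct < min_success_percent:
        findings.append(f"Only {pct:.1f}% of API calls succeeded")

    if measures["avg_response_time_secs"] > max_response_secs:
        findings.append("Average response time is high; ticket operations may be delayed")

    if measures["dead_tickets"] > 0:
        findings.append("Dead tickets present; check the detailed diagnosis")

    return findings


# Hypothetical sample values for one organization.
sample_measures = {
    "total_api_calls": 200,
    "api_calls_succeeded": 150,
    "avg_response_time_secs": 12.5,
    "dead_tickets": 4,
}
for finding in review(sample_measures):
    print(finding)
```
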