NetApp LUNs Test

This test auto-discovers the LUNs configured on the NetApp Unified Storage system, monitors the availability, state, and processing ability of each LUN, and helps answer the following questions:

Which LUNs are currently offline?
Is any LUN experiencing contention for storage space?
Is the I/O load uniformly balanced across all LUNs, or is any LUN overloaded? If so, is the overload causing that LUN to receive an increased number of Queue Full responses?
Are the LUNs able to process the I/O requests quickly? Is any LUN experiencing processing bottlenecks?

Target of the test : A NetApp Unified Storage system

Agent deploying the test : An external/remote agent

Outputs of the test : One set of results for each LUN configured on the NetApp storage system being monitored.

Configurable parameters for the test
Parameters	Description
Test Period	How often should the test be executed.
Host	The host for which the test is to be configured.
Port	In the Port text box, specify the port at which the specified host listens. By default, this is NULL.
User	Specify the name of a user who possesses the following privileges: login-http-admin,api-aggr-check-spare-low,api-aggr-list-info,api-aggr-mediascrub-list-info,api-aggr-scrub-list-info,api-cifs-status,api-clone-list-status,api-disk-list-info,api-fcp-adapter-list-info,api-fcp-adapter-stats-list-info,api-fcp-service-status,api-file-get-file-info,api-file-read-file,api-iscsi-connection-list-info,api-iscsi-initiator-list-info,api-iscsi-service-status,api-iscsi-session-list-info,api-iscsi-stats-list-info,api-lun-config-check-alua-conflicts-info,api-lun-config-check-cfmode-info,api-lun-config-check-info,api-lun-config-check-single-image-info,api-lun-list-info,api-nfs-status,api-perf-object-get-instances-iter,api-perf-object-instance-list-info,api-quota-report-iter,api-snapshot-list-info,api-vfiler-list-info,api-volume-list-info-iter*. If no such user exists, you can create one for this purpose using the steps detailed in Creating a New User with the Privileges Required for Monitoring the NetApp Unified Storage.
Password	Specify the password that corresponds to the above-mentioned User.
Confirm Password	Confirm the Password by retyping it here.
Authentication Mechanism	To collect metrics from the NetApp Unified Storage system, the eG agent connects to the ONTAP management APIs over HTTP or HTTPS. By default, this connection is authenticated using the LOGIN_PASSWORD authentication mechanism, which is why LOGIN_PASSWORD is displayed as the default value of this parameter.
Use SSL	Set the Use SSL flag to Yes if SSL (Secure Sockets Layer) is to be used to connect to the NetApp Unified Storage system, and to No if it is not.
API Port	By default, in most environments, the NetApp Unified Storage system listens on port 80 (if not SSL-enabled) or on port 443 (if SSL-enabled). Accordingly, the API Port parameter is set to the value default, which means: if the Use SSL flag above is set to No, the eG agent connects to the storage system on port 80, and if the Use SSL flag is set to Yes, the agent connects on port 443. In some environments, however, the default ports 80 and 443 might not apply. In such a case, specify against the API Port parameter the exact port at which the NetApp Unified Storage system in your environment listens, so that the eG agent communicates with that port to collect metrics.
vFilerName	A vFiler is a virtual storage system you create using MultiStore, which enables you to partition the storage and network resources of a single storage system so that it appears as multiple storage systems on the network. If the NetApp Unified Storage system is partitioned to accommodate a set of vFilers, specify the name of the vFiler that you wish to monitor in the vFilerName text box. In some environments, the NetApp Unified Storage system may not be partitioned at all. In such a case, the NetApp Unified Storage system is monitored as a single vFiler and hence the default value of none is displayed in this text box.
Timeout	Specify the duration (in seconds) beyond which the test will time out if no response is received from the device. The default is 120 seconds.
DD Frequency	Refers to the frequency with which detailed diagnosis measures are to be generated for this test. The default is 1:1. This indicates that, by default, detailed measures will be generated every time this test runs, and also every time the test detects a problem. You can modify this frequency if required. To disable the detailed diagnosis capability for this test, specify none against DD Frequency.
Detailed Diagnosis	To make diagnosis more efficient and accurate, eG Enterprise embeds an optional detailed diagnostic capability. With this capability, the eG agents can be configured to run detailed, more elaborate tests as and when specific problems are detected. To enable the detailed diagnosis capability of this test for a particular server, choose the On option. To disable the capability, choose the Off option. The option to selectively enable/disable the detailed diagnosis capability will be available only if both of the following conditions are fulfilled: (a) the eG manager license allows the detailed diagnosis capability; (b) neither the normal nor the abnormal frequency configured for the detailed diagnosis measures is 0.
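
The default-port behavior described for the Use SSL and API Port parameters can be summarized in a few lines of Python. This is only an illustrative sketch and not eG agent code; the function name resolve_api_port and its argument types are hypothetical.

```python
def resolve_api_port(use_ssl: bool, api_port: str = "default") -> int:
    """Return the port to connect to, following the rules described above.

    Hypothetical helper: the value 'default' maps to 443 when SSL is
    enabled and to 80 otherwise; any other value is an explicit port.
    """
    if api_port.strip().lower() == "default":
        return 443 if use_ssl else 80
    port = int(api_port)  # raises ValueError for non-numeric input
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return port
```

For example, resolve_api_port(True) yields 443, while resolve_api_port(True, "8443") yields the explicitly configured port 8443.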

Measurements made by the test

Measurement

Description

Measurement Unit

Interpretation

Is LUN online?

Indicates whether/not this LUN is online.

This measure is applicable only to individual LUNs. It reports the value Yes if this LUN is currently online and the value No if it is not.

The numeric equivalents corresponding to the above-mentioned values are listed in the table below:

Measure Value	Numeric Value
No	0
Yes	1

Note:

This measure reports the Measure Values listed in the table above to indicate the current state of a LUN. However, in the graph of this measure, the state is indicated using only the Numeric Values listed in the table above.
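
When reading exported graph data for this measure, the numeric values can be translated back to the Measure Values in the table above. The following is a minimal sketch; the names LUN_ONLINE_VALUES and lun_online_label are hypothetical and are not part of eG Enterprise.

```python
# Mapping taken from the Measure Value / Numeric Value table above.
LUN_ONLINE_VALUES = {0: "No", 1: "Yes"}


def lun_online_label(numeric_value: int) -> str:
    """Translate the graphed numeric value into its Measure Value."""
    try:
        return LUN_ONLINE_VALUES[numeric_value]
    except KeyError:
        raise ValueError(f"unexpected value for 'Is LUN online?': {numeric_value}")
```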

Size

Indicates the size of this LUN in the active file system.

Size used

Indicates the currently used size of this LUN.

A low value is desired for this measure. A value close to the Size of the LUN indicates that the LUN is running out of space.
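
Because the Size used measure is most meaningful relative to the Size measure, it can help to express it as a percentage. The sketch below assumes that both values are reported in the same unit; the function name is hypothetical, and the test itself does not report such a percentage.

```python
def lun_space_used_percent(size: float, size_used: float) -> float:
    """Return the used space as a percentage of the LUN size.

    Assumes 'size' and 'size_used' are expressed in the same unit.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    return 100.0 * size_used / size
```

A LUN with a Size of 200 and a Size used of 150 (in the same unit) is 75% full.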

Read operations

Indicates the rate at which read operations were performed on this LUN.

Ops/Sec

A high value is desired for this measure. A consistent decrease in this value could indicate a processing bottleneck.

Write operations

Indicates the rate at which write operations were performed on this LUN.

Ops/Sec

A high value is desired for this measure. A consistent decrease in this value could indicate a processing bottleneck.

Total operations

Indicates the rate at which operations (including reads and writes) were performed on this LUN.

Ops/Sec

A high value is desired for this measure. A consistent decrease in this value could indicate a processing bottleneck.

Average latency

Indicates the average time taken to execute an operation on this LUN.

Milliseconds

A high value indicates that the LUN is taking too long to process the I/O requests to it.

Compare the value of this measure across LUNs to isolate the slow LUNs.
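
The comparison across LUNs can be sketched as follows. This is an illustration only: the function name, the sample LUN paths used in the usage note, and the threshold are all hypothetical, and a suitable threshold depends on the environment.

```python
from typing import Dict, List, Tuple


def slow_luns(latency_ms: Dict[str, float],
              threshold_ms: float) -> List[Tuple[str, float]]:
    """Return (LUN, average latency) pairs above the threshold, slowest first."""
    offenders = [(lun, ms) for lun, ms in latency_ms.items() if ms > threshold_ms]
    return sorted(offenders, key=lambda item: item[1], reverse=True)
```

Given made-up latencies of 4.0, 25.0, and 12.5 milliseconds for three LUNs and a threshold of 10 milliseconds, the function returns the 25.0 and 12.5 millisecond LUNs, in that order.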

Queue full responses

Indicates the rate at which Queue Full responses were received on this LUN.

Responses/Sec

This measure is a good indicator of sudden or coordinated bursts of I/O from the initiators.

A Queue Full condition signals that the target/storage port is unable to process more I/O requests, and that the initiator therefore needs to throttle I/O to the storage port. Some operating systems, such as AIX, may not handle repeated Queue Full responses gracefully, i.e., they may not throttle the I/O requests appropriately, leading to I/O errors. Such conditions can be alleviated by reducing the LUN queue depth setting appropriately.

Read data

Indicates the rate at which data is read from this LUN.

Bytes/Sec

A high value is desired for this measure.

Write data

Indicates the rate at which data is written to this LUN.

Bytes/Sec

A high value is desired for this measure.
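
The data rates are easier to interpret alongside the operation rates: dividing Read data by Read operations (or Write data by Write operations) gives the average size of a request. The sketch below shows that arithmetic; the function name is hypothetical, and this derived value is not a measure reported by the test.

```python
def average_io_size_bytes(bytes_per_sec: float, ops_per_sec: float) -> float:
    """Return the average request size in bytes (Bytes/Sec divided by Ops/Sec).

    Returns 0.0 when no operations were performed in the interval.
    """
    if ops_per_sec <= 0:
        return 0.0
    return bytes_per_sec / ops_per_sec
```

For instance, 8,388,608 Bytes/Sec at 128 Ops/Sec corresponds to an average request size of 65,536 bytes (64 KiB).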

Queue depth

Indicates the queue depth of this LUN.

Number

Queue depth is the number of outstanding I/O requests that a LUN will issue or hold before it can trigger a Queue Full response, i.e., the number of I/O operations that can run in parallel on the LUN. This measure is useful when compared with the number of Queue Full responses triggered by the LUN. Queue depth is often set too high, and an improperly set queue depth can contribute significantly to latency.
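
One way to relate the queue depth to the other measures of this test is Little's Law, which states that the average number of requests in flight equals the arrival rate multiplied by the average time each request takes. The sketch below applies it to the Total operations and Average latency measures. It is a steady-state approximation offered for illustration, not a calculation performed by the eG agent or by the storage system, and the function names are hypothetical.

```python
def estimated_outstanding_ios(total_ops_per_sec: float,
                              avg_latency_ms: float) -> float:
    """Estimate the average number of in-flight I/Os using Little's Law."""
    return total_ops_per_sec * avg_latency_ms / 1000.0


def queue_utilization(outstanding_ios: float, queue_depth: int) -> float:
    """Return the fraction of the queue depth occupied by in-flight I/Os."""
    if queue_depth <= 0:
        raise ValueError("queue_depth must be positive")
    return outstanding_ios / queue_depth
```

For example, 2000 Ops/Sec at an average latency of 5 milliseconds corresponds to roughly 10 outstanding I/Os, which would occupy about 31% of a queue depth of 32. A ratio that approaches 1 while Queue full responses rises suggests that the queue is saturating.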

Average read latency

Indicates the average time taken to execute a read request in this LUN.

Milliseconds

A low value is desired for this measure. A high value indicates that read requests take too long to execute, which directly affects the performance of the LUN.

Average write latency

Indicates the average time taken to execute a write request in this LUN.

Milliseconds