Azure VM Details Test
This test auto-discovers the virtual machines used by the target Microsoft Azure subscription, and for each VM, it reveals in-depth metrics such as status, memory utilization, CPU utilization, disk I/O measures, etc. In the process, the test points administrators to resource-hungry VMs.
Target of the Test: A Microsoft Azure Subscription
Agent deploying the test: A remote agent
Output of the test: One set of results for each VM in every resource group of the target Azure subscription
Parameters | Description |
---|---|
Test Period |
How often should the test be executed. |
Host |
The host for which the test is to be configured. |
Subscription ID |
Specify the GUID which uniquely identifies the Microsoft Azure Subscription to be monitored. To know the ID that maps to the target subscription, do the following:
|
Tenant ID |
Specify the Directory ID of the Azure AD tenant to which the target subscription belongs. To know how to determine the Directory ID, refer to Configuring the eG Agent to Monitor a Microsoft Azure Subscription Using Azure ARM REST API. |
Client ID, Client Password, and Confirm Password |
To connect to the target subscription, the eG agent requires an Access token in the form of an Application ID and the client secret value. For this purpose, you should register a new application with the Azure AD tenant. To know how to create such an application and determine its Application ID and client secret, refer to Configuring the eG Agent to Monitor a Microsoft Azure Subscription Using Azure ARM REST API. Specify the Application ID of the created Application in the Client ID text box and the client secret value in the Client Password text box. Confirm the Client Password by retyping it in the Confirm Password text box. |
Proxy Host and Proxy Port |
In some environments, all communication with the Azure cloud may have to be routed through a proxy server. In such environments, you should make sure that the eG agent connects to the cloud via the proxy server and collects metrics. To enable metrics collection via a proxy, specify the IP address of the proxy server and the port at which the server listens against the Proxy Host and Proxy Port parameters. By default, these parameters are set to none, indicating that the eG agent is not configured to communicate via a proxy. |
Proxy Username, Proxy Password and Confirm Password |
If the proxy server requires authentication, then, specify a valid proxy user name and password in the Proxy Username and Proxy Password parameters, respectively. Then, confirm the password by retyping it in the Confirm Password text box. |
Diagnostic Measures |
By default, this flag is set to Off. This means that, by default, this test reports only host-level metrics (e.g., CPU usage, disk usage, and network usage) for each VM. For deeper insights into VM performance, you may want to collect guest-level metrics and other diagnostic data using the Azure Diagnostics extension. The Azure Diagnostics extension is an agent in Azure Monitor that collects monitoring data from the guest operating system of Azure compute resources, including virtual machines. To configure this test to use this extension and pull guest-level metrics from VMs, do the following:
|
Detailed Diagnosis |
To make diagnosis more efficient and accurate, the eG Enterprise embeds an optional detailed diagnostic capability. With this capability, the eG agents can be configured to run detailed, more elaborate tests as and when specific problems are detected. To enable the detailed diagnosis capability of this test for a particular server, choose the On option. To disable the capability, click on the Off option. The option to selectively enable/disable the detailed diagnosis capability will be available only if the following conditions are fulfilled:
|
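The way the Subscription ID, Tenant ID, Client ID, Client Password, and proxy parameters fit together can be illustrated with a minimal Python sketch. It assumes the standard Microsoft identity platform client-credentials token endpoint and the Azure Resource Manager "list virtual machines" endpoint; the function names, the `api_version` default, and the sample values are hypothetical, and this is not a description of how the eG agent itself is implemented. The sketch only builds the requests and does not send anything over the network.

```python
import urllib.parse
import urllib.request
import uuid

# Scope for Azure Resource Manager tokens (assumption: public Azure cloud).
ARM_SCOPE = "https://management.azure.com/.default"


def validate_subscription_id(subscription_id: str) -> str:
    """Return the canonical GUID form, or raise ValueError if it is not a GUID."""
    return str(uuid.UUID(subscription_id))


def build_token_request(tenant_id: str, client_id: str,
                        client_password: str) -> urllib.request.Request:
    """Build (but do not send) an OAuth 2.0 client-credentials token request."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,            # Application ID
        "client_secret": client_password,  # client secret value
        "scope": ARM_SCOPE,
    }).encode("ascii")
    # Supplying data makes urllib issue a POST.
    return urllib.request.Request(url, data=body)


def build_vm_list_request(subscription_id: str, access_token: str,
                          api_version: str = "2023-03-01") -> urllib.request.Request:
    """Build a request that lists all VMs in the subscription.

    The api_version default is an assumption; use whichever version
    your environment supports.
    """
    sub = validate_subscription_id(subscription_id)
    url = (f"https://management.azure.com/subscriptions/{sub}"
           f"/providers/Microsoft.Compute/virtualMachines"
           f"?api-version={api_version}")
    return urllib.request.Request(
        url, headers={"Authorization": f"Bearer {access_token}"})


def build_opener(proxy_host: str = "none",
                 proxy_port: str = "none") -> urllib.request.OpenerDirector:
    """Return an opener that routes traffic via the proxy, if one is set.

    'none' mirrors the default of the Proxy Host / Proxy Port parameters;
    in that case no explicit proxy is configured and urllib's defaults apply.
    """
    if proxy_host == "none" or proxy_port == "none":
        return urllib.request.build_opener()
    proxy = f"http://{proxy_host}:{proxy_port}"
    handler = urllib.request.ProxyHandler({"http": proxy, "https": proxy})
    return urllib.request.build_opener(handler)
```

In a real run, the token request would be sent first with `build_opener(...).open(request)`, and the `access_token` field of the JSON response would then be passed to `build_vm_list_request`.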
Measurement | Description | Measurement Unit | Interpretation | ||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Status |
Indicates the current state of this virtual machine. |
|
The values reported by this measure and its numeric equivalents are mentioned in the table below:
Note: By default, this measure reports the Measure Values listed in the table above to indicate the current state of this virtual machine. In the graph of this measure however, the same is represented using the numeric equivalents only. Use the detailed diagnosis of this measure to know the IP, location, type, OS, and size of the VM. |
||||||||||||||||||
Provisioning status |
Indicates the provisioning status of this virtual machine. |
Number |
The values reported by this measure and its numeric equivalents are mentioned in the table below:
Note: By default, this measure reports the Measure Values listed in the table above to indicate the provisioning status of this virtual machine. In the graph of this measure however, the same is represented using the numeric equivalents only. |
||||||||||||||||||
Total cores |
Indicates the total number of cores in this virtual machine. |
Number |
|
||||||||||||||||||
Configured memory |
Indicates the amount of memory that is configured for this VM. |
GB |
|
||||||||||||||||||
Maximum disk size |
Indicates the maximum size of the disk allocated to this VM. |
GB |
|
||||||||||||||||||
Temporary disk size |
Indicates the size of the 'temporary disk' allocated to this VM. |
GB |
|
||||||||||||||||||
Maximum data disks |
Indicates the maximum number of the 'Data disks' attached to this VM. |
Number |
|
||||||||||||||||||
Maximum IOPS |
Indicates the maximum number of I/O operations that are allowed for this VM. |
Number |
|
||||||||||||||||||
CPU utilization |
Indicates the percentage of CPU utilized by this VM. |
Percent |
A value close to 100% is a cause for concern, as it implies a severe contention for CPU resources on the VM. You may want to look deeper into the VM to figure out if any application is hogging its CPU resources. |
||||||||||||||||||
Incoming network traffic |
Indicates the amount of data received by this VM through all network interfaces. |
MB |
In the event of network congestion, compare the values of these measures across VMs to identify which VM is probably causing it. |
||||||||||||||||||
Outgoing network traffic |
Indicates the amount of data sent out through all the network interfaces by this VM. |
MB |
|||||||||||||||||||
Data reads from disk |
Indicates the amount of data read from the disk of this VM during the last measurement period. |
MB |
|
||||||||||||||||||
Data writes to disk |
Indicates the amount of data written to the disk of this VM during the last measurement period. |
MB |
|
||||||||||||||||||
Disk read operations |
Indicates the rate at which data was read from the disk of this VM during the last measurement period. |
Operations/sec |
|
||||||||||||||||||
Disk write operations |
Indicates the rate at which data was written to the disk of this VM during the last measurement period. |
Operations/sec |
|
||||||||||||||||||
Total IOPS |
Indicates the total number of I/O Operations per second on this VM. |
Number |
|
||||||||||||||||||
Interrupt time |
Indicates the percentage of time that the processor of this VM spent receiving and servicing hardware interrupts during the last measurement period. |
Percentage |
This measure appears only if the value of the Is VM diagnostics settings enabled? measure is Yes. The value of this measure is an indirect indicator of the activity of devices that generate interrupts, such as the system clock, the mouse, disk drivers, data communication lines, network interface cards, and other peripheral devices. These devices normally interrupt the processor when they have completed a task or require attention. Normal thread execution is suspended during interrupts. Most system clocks interrupt the processor every 10 milliseconds, creating a background of interrupt activity. |
||||||||||||||||||
Processor time |
Indicates the percentage of time that the processor of this VM is executing application or operating system processes other than Idle threads. |
Percentage |
The value of this measure is a primary indicator of processor activity. It is calculated by measuring the time that the processor spends executing the thread of the Idle process in each sample interval, and subtracting that value from 100%. Each processor has an Idle thread which consumes cycles when no other threads are ready to run. This measure will be reported only if the following conditions are fulfilled:
|
||||||||||||||||||
User time |
Indicates the percentage of non-idle processor time that is spent in user mode by the processor of this VM. |
Percentage |
User mode is a restricted processing mode designed for applications, environment subsystems, and integral subsystems. The alternative, privileged mode, is designed for operating system components and allows direct access to hardware and all memory. The operating system switches application threads to privileged mode to obtain operating system services. This measure will be reported only if the following conditions are fulfilled:
|
||||||||||||||||||
Privileged time |
Indicates the percentage of non-idle processor time spent in privileged mode by the processor of this VM. |
Percentage |
Privileged mode is a processing mode designed for operating system components and hardware-manipulating drivers. It allows direct access to hardware and all memory. The alternative, user mode, is a restricted processing mode designed for applications, environment subsystems, and integral subsystems. The operating system switches application threads to privileged mode to obtain operating system services. % Privileged Time includes time spent servicing interrupts and DPCs. A high rate of privileged time might be attributable to a large number of interrupts generated by a failing device. |
||||||||||||||||||
Processor frequency |
Indicates the frequency at which the processor of this VM operates. |
Number |
This measure will be reported only if the following conditions are fulfilled:
|
||||||||||||||||||
Parking status |
Indicates the number of CPU cores that were parked for the processor of this VM. |
Number |
|
||||||||||||||||||
Total processes page faults rate |
Indicates the rate at which page faults are caused by the threads executing in the processes of this VM. |
Faults/sec |
A page fault occurs when a thread refers to a virtual memory page that is not in its working set in main memory. This does not cause the page to be fetched from disk if it is on the standby list and hence already in main memory, or if it is in use by another process with whom the page is shared. |
||||||||||||||||||
Total processes handle usage |
Indicates the number of handles that are currently utilized by the processes on this VM. |
Number |
A high value of this measure could indicate a memory leak on the VM. This measure will be reported only if the following conditions are fulfilled:
|
||||||||||||||||||
Total processes non-shared data |
Indicates the amount of data that the processes of the processor associated with this VM have allocated and that cannot be shared with other processes. |
MB |
|
||||||||||||||||||
Total processes memory |
Indicates the current number of bytes in the working set of the processes of the processor of this VM. |
Number |
The working set is the set of memory pages touched recently by the threads in the process. If free memory in the VM is above a certain threshold, pages are left in the working set of a process even if they are not in use. When free memory falls below a certain threshold, pages are trimmed from working sets. If they are needed, they are then soft-faulted back into the working set before they leave main memory. This measure will be reported only if the following conditions are fulfilled:
|
||||||||||||||||||
Total processes private memory |
Indicates the number of bytes in the working set that are not shared and cannot be shared by other processes of the processor of this VM. |
Number |
This measure will be reported only if the following conditions are fulfilled:
|
||||||||||||||||||
Processes |
Indicates the number of system processes in this VM at the time of data collection. |
Number |
This measure will be reported only if the following conditions are fulfilled:
|
||||||||||||||||||
Threads |
Indicates the number of system threads in this VM at the time of data collection. |
Number |
|
||||||||||||||||||
Context switches |
Indicates the rate at which context switches occurred on this VM. |
Switches/sec |
A context switch occurs when the kernel switches the processor from one thread to another. A context switch might also occur when a thread with a higher priority than the running thread becomes ready or when a running thread must wait for some reason (such as an I/O operation). The Thread\Context Switches/sec counter value increases when the thread gets or loses the time of the processor. This measure will be reported only if the following conditions are fulfilled:
|
||||||||||||||||||
Free memory |
Indicates the amount of physical memory, in bytes, that is immediately available for allocation to a process or for use by this VM. |
MB |
A low value for this measure implies excessive memory usage by a VM. This measure will be reported only if the following conditions are fulfilled:
|
||||||||||||||||||
Committed memory in use |
Indicates the amount of physical memory that is in use for which space has been reserved in the paging file so that it can be written to disk allocated to this VM. |
MB |
This measure will be reported only if the following conditions are fulfilled:
|
||||||||||||||||||
Cache faults |
Indicates the rate at which faults occur when a page sought in the file system cache is not found and must be retrieved from elsewhere in memory (a soft fault) or from disk (a hard fault) of this VM. |
Faults/sec |
Ideally, the value of this measure should be 0 or very low. |
||||||||||||||||||
Page reads from disk |
Indicates the rate at which the disk of this VM was read to resolve hard page faults. |
Pages/sec |
Hard page faults occur when a process references a page in virtual memory that is not in its working set or elsewhere in physical memory, and must be retrieved from disk. This measure is a primary indicator of the kinds of faults that cause system-wide delays. It includes read operations to satisfy faults in the file system cache (usually requested by applications) and in noncached mapped memory files. Compare the value of Page Reads/sec to the value of Pages Input/sec to find an average of how many pages were read during each read operation. |
||||||||||||||||||
Pages read and written to disk |
Indicates the rate at which pages are read from or written to disk of this VM to resolve hard page faults. |
Pages/sec |
This measure will be reported only if the following conditions are fulfilled:
|
||||||||||||||||||
Memory paged pool size |
Indicates the size of the paged pool which is an area of system memory (physical memory) for objects that can be written to disk of this VM when they are not being used. |
MB |
This measure will be reported only if the following conditions are fulfilled:
|
||||||||||||||||||
Non-paged pool kernel memory size |
Indicates the size of the nonpaged pool which is an area of system memory (physical memory) for objects that cannot be written to disk of this VM, but must remain in physical memory as long as they are allocated. |
MB |
This measure will be reported only if the following conditions are fulfilled:
|
||||||||||||||||||
Committed memory |
Indicates the amount of committed virtual memory allocated to this VM. |
MB |
This measure will be reported only if the following conditions are fulfilled:
|
||||||||||||||||||
Page Faults rate |
Indicates the average number of pages faulted per second. |
Faults/sec |
This measure will be reported only if the following conditions are fulfilled:
|
||||||||||||||||||
Transition faults |
Indicates the rate at which page faults are resolved by recovering pages that were being used by another process sharing the page, or were on the modified page list or the standby list, or were being written to disk of this VM at the time of the page fault. |
Faults/sec |
|
||||||||||||||||||
Disk read bytes |
Indicates the amount of data read from the disk of this VM. |
MB |
These measures will be reported only if the following conditions are fulfilled:
|
||||||||||||||||||
Disk write bytes |
Indicates the amount of data written to the disk of this VM. |
MB |
|||||||||||||||||||
Connection failures |
Indicates the number of times the TCP connection to this VM failed. |
Number |
This measure is calculated based on the number of times TCP connections have made a direct transition to the CLOSED state from the SYN-SENT state or the SYN-RCVD state along with the number of times TCP connections have made a direct transition to the LISTEN state from the SYN-RCVD state. |
||||||||||||||||||
Segments sent |
Indicates the rate at which TCP Segments were sent from this VM. |
Segments/sec |
The value of this measure includes those segments sent from current connections, but excludes those containing only retransmitted bytes. |
||||||||||||||||||
Segments retransmitted |
Indicates the rate at which segments containing one or more previously transmitted bytes were retransmitted by this VM. |
Segments/sec |
|||||||||||||||||||
Connections reset |
Indicates the number of times that TCP connections from this VM have made a direct transition to the CLOSED state from either the ESTABLISHED or CLOSE-WAIT state. |
Number |
|||||||||||||||||||
Segments received |
Indicates the rate at which segments were received by this VM, including those received in error. |
Segments/sec |
The value of this measure includes segments received on currently established connections. |
||||||||||||||||||
Connections established |
Indicates the number of TCP connections established on this VM. |
Number |
|
||||||||||||||||||
Processor idle time |
Indicates the percentage of time for which the CPU of this VM has been idle. |
Percent |
If the CPU utilization measure of a VM reports a value close to 100%, then you may want to compare the values of the Interrupt time, Processor time, and User time measures with these two measures to understand where CPU time was spent: in servicing interrupts, in executing application/OS processes, in user mode, idle, or waiting for I/O.
|
||||||||||||||||||
Processor wait time |
Indicates the percentage of time the processor of this VM was waiting for I/O. |
Percent |
|||||||||||||||||||
Page writes on disk |
Indicates the rate at which this VM writes pages to disk. |
Pages/Sec |
|
||||||||||||||||||
Packet sent error |
Indicates the number of error packets sent by this VM. |
Number |
Ideally, the value of these measures should be 0. |
||||||||||||||||||
Packet received error |
Indicates the number of error packets received by this VM. |
Number |
|||||||||||||||||||
CPU credits consumed |
Indicates the number of CPU credits consumed by this VM. |
Number |
Some VMs, such as web servers, proofs of concept, small databases, and development build environments, may not need the full performance of the CPU continuously. These workloads typically have burstable performance requirements. To support such requirements, Azure provides you with the ability to purchase a VM size with baseline performance that can build up credits when it is using less than its baseline. Every time a VM uses up a portion of the accumulated CPU credits, it means that the VM is using CPU above its baseline. Ideally therefore, the value of the CPU credits consumed measure should be low, and the CPU credits remaining measure should be high. |
||||||||||||||||||
CPU credits remaining |
Indicates the number of CPU credits still unused by this VM. |
Number |
|||||||||||||||||||
VM cached bandwidth consumed |
Indicates the percentage calculated by the total disk throughput completed over the max cached throughput of this VM. |
Percent |
Virtual machines that are enabled for both premium storage and premium storage caching have two different storage bandwidth limits.
If the value of the VM uncached bandwidth consumed is at 100%, it means that the VM has fully utilized its default storage limit. No data can be stored in the VM's disk from this point forward. This can cause the VM to suffer serious and prolonged performance degradations. If the value of the VM cached bandwidth consumed is at 100%, it means that the VM has fully utilized the storage set aside for host caching. You may want to consider allocating more space for caching to improve throughput. |
||||||||||||||||||
VM uncached bandwidth consumed |
Indicates the percentage calculated by the total disk throughput completed over the max uncached throughput of this VM. |
Percent |
|||||||||||||||||||
VM cached IOPS consumed |
Indicates the percentage calculated by the total IOPS completed over the max cached IOPS limit of this VM. |
Percent |
Azure virtual machines have input/output operations per second (IOPS) and throughput performance limits based on the virtual machine type and size. If the value of the VM uncached IOPS consumed measure is 100%, it means the VM performance has been capped. This can happen when the VM requests more IOPS or throughput than what is allotted to the virtual machine or its attached disks. When capped, the VM experiences suboptimal performance, which can lead to negative consequences such as increased latency. To avoid this, you may want to increase the IOPS limit of the VM. Reads served by the cache are not included in the disk IOPS and throughput, and hence are not subject to disk limits. The cache has its own separate IOPS and throughput limit per VM. If the VM cached IOPS consumed measure reports the value 100% for a VM, it means that the VM has exhausted the IOPS limit configured for the cache. This can adversely impact throughput and I/O processing by the VM's cache. To avoid this, you may want to increase the IOPS limit of the cache. |
||||||||||||||||||
VM uncached IOPS consumed |
Indicates the percentage calculated by the total IOPS completed over the max uncached IOPS limit of this VM. |
Percent |
|||||||||||||||||||
Uptime |
Indicates the uptime of this VM. |
Secs |
|
||||||||||||||||||
Used memory |
Indicates the amount of memory used by this VM. |
MB |
|
||||||||||||||||||
Memory utilization |
Indicates the percentage of memory used by this VM. |
Percent |
A value close to 100% indicates excessive memory usage by the VM. A consistent rise in this value could hint at a potential memory shortage on the VM. You may want to allocate more memory to the VM to avoid this. |
||||||||||||||||||
Is VM diagnostics settings enabled? |
Indicates whether/not guest-level monitoring has been enabled for this VM. |
|
The values reported by this measure and their corresponding numeric values are listed in the table below:
Note: By default, this measure reports the Measure Values listed in the table above to indicate whether/not guest-level monitoring is enabled for the VM. In the graph of this measure however, the same is represented using the numeric equivalents only. |
||||||||||||||||||
Azure VM agent status |
Indicates whether/not the Azure Diagnostics Agent installed on this VM is in the Ready state. |
|
The values reported by this measure and their corresponding numeric values are listed in the table below:
Note: By default, this measure reports the Measure Values listed in the table above to indicate whether/not the Diagnostics Agent is ready. In the graph of this measure however, the same is represented using the numeric equivalents only. The detailed diagnosis reported by this measure reveals the version of the Azure VM agent when the value of this measure is Ready. If the value of this measure is Not Ready, then Unknown will be displayed in the Azure VM agent version column. |
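The four "consumed" measures (VM cached/uncached bandwidth consumed and VM cached/uncached IOPS consumed) are each described as completed I/O expressed as a percentage of the corresponding limit. The following Python sketch shows that arithmetic and a simple check for measures that have reached 100%. The function names and sample figures are hypothetical and serve only as an illustration, not as the test's actual implementation.

```python
def consumed_percent(completed: float, limit: float) -> float:
    """Express completed IOPS (or throughput) as a percentage of its limit."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return 100.0 * completed / limit


def capped_measures(measures: dict, threshold: float = 100.0) -> list:
    """Return the names of 'consumed' measures at or above the threshold, sorted."""
    return sorted(name for name, value in measures.items() if value >= threshold)


# Hypothetical sample: 2400 IOPS completed against an uncached limit of 3200.
sample = {
    "VM uncached IOPS consumed": consumed_percent(2400, 3200),
    "VM cached IOPS consumed": consumed_percent(5000, 5000),
}
print(capped_measures(sample))
```

A VM that appears in the returned list has hit the corresponding limit and is a candidate for a larger VM size or a higher limit, as discussed in the interpretation of these measures.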
Use the detailed diagnosis of the Status measure to know the IP, location, type, OS, and size of the VM.
Figure 3 : The detailed diagnosis of the Status measure of the Azure VM Details test
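For the CPU triage suggested under the Processor idle time measure (comparing Interrupt time, Processor time, User time, Processor idle time, and Processor wait time when CPU utilization is close to 100%), a minimal Python sketch is given below. The function name, the 90% threshold used to stand in for "close to 100%", and the sample values are assumptions made for illustration only.

```python
def triage_cpu(cpu_utilization: float, breakdown: dict,
               threshold: float = 90.0) -> list:
    """Rank the CPU time measures, highest first, when utilization is high.

    Returns an empty list when cpu_utilization is below the threshold,
    since no triage is needed in that case.
    """
    if cpu_utilization < threshold:
        return []
    return sorted(breakdown.items(), key=lambda item: item[1], reverse=True)


# Hypothetical values (all in percent) for one VM.
breakdown = {
    "Interrupt time": 2.0,
    "Processor time": 97.0,
    "User time": 80.0,
    "Processor idle time": 3.0,
    "Processor wait time": 1.0,
}
for name, value in triage_cpu(98.0, breakdown):
    print(f"{name}: {value}%")
```

The measure at the top of the ranking indicates where most of the CPU time went, which helps decide whether to investigate applications, interrupt-generating devices, or I/O waits.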