What is Autodiscovery for IT monitoring?
Autodiscovery in the context of IT monitoring refers to the process of automatically identifying and mapping resources, devices, applications, and services within an IT infrastructure. It involves dynamically detecting and adding new components to the monitoring system without manual intervention, which allows for efficient and up-to-date monitoring coverage.
Why is Autodiscovery necessary?
If autodiscovery is not enabled or not possible, the onus is on the IT operations team to manually add every device, system, and application for monitoring. This is a cumbersome, time-consuming process that is also prone to human error (e.g., an IT admin may forget to add a new server for monitoring).
This is why autodiscovery is essential for modern IT infrastructures:
- Autodiscovery helps ensure that all relevant systems and resources are included in the monitoring solution, reducing manual configuration efforts and ensuring comprehensive visibility into the IT environment.
- Autodiscovery is essential in organizations adopting technologies such as microservices, container orchestration (e.g., Kubernetes), and cloud services, where deployment and scaling are automated. Because such infrastructures change so quickly, it may not even be possible to rely on a manual approach for adding components for monitoring.
- Autodiscovery features of modern monitoring and observability tools ensure that they are able to handle systems involving rapidly changing and dynamic infrastructure.
How is Autodiscovery performed by IT monitoring tools?
Autodiscovery in IT monitoring tools is performed through various techniques and mechanisms. The specific approach varies with the tool and the environment being monitored. Some techniques are more effective than others in a given environment, and choosing the best one is often challenging. The most widely used methods include:
- Network Scanning: IT monitoring tools can perform network scanning to discover devices and resources within a network. They may use protocols like ICMP, SNMP, or specialized discovery protocols to identify and collect information about devices, such as IP addresses, hostnames, and network services.
- Agent-Based Discovery: Some monitoring tools deploy lightweight agents on target systems, which actively report back to the monitoring tool. These agents can provide detailed information about the system, its services, and applications, allowing for accurate autodiscovery.
- Service Discovery: In modern architectures, where applications are distributed across multiple instances or containers, service discovery mechanisms can be used. IT monitoring tools can integrate with service discovery platforms like Consul, etcd, or Kubernetes to automatically detect and monitor services as they are deployed or scaled.
- Log Analysis: Monitoring tools that incorporate log analysis capabilities can perform autodiscovery by analyzing log files. They can identify new log sources, extract relevant metadata, and use pattern matching or machine learning algorithms to classify resources based on log data.
- Cloud Infrastructure APIs: Cloud monitoring tools leverage cloud service provider APIs to autodiscover resources within cloud environments. They can query the cloud provider's API to obtain information about virtual machines, storage volumes, load balancers, databases, and other cloud resources.
- Proprietary Vendor APIs: In a similar way to Cloud Infrastructure APIs, many vendors such as Citrix, Cisco, HP, Microsoft and ISVs provide APIs to discover resources. The quality of a monitoring tool is often determined by how much effort has been invested in these integrations and how well they work.
- Configuration File Parsing: Autodiscovery can also be performed by parsing configuration files (or similar configurations – e.g., Microsoft Windows Registry) used by applications or infrastructure components. Monitoring tools can scan configuration files to extract information about resources and services, allowing for automatic detection and monitoring.
- Active Probing: Some monitoring tools use active probing techniques to discover resources. They send requests or probes to specific IP ranges or network segments and TCP ports, analyzing responses to identify active devices and services.
- Integration with Infrastructure Orchestration: Monitoring tools can integrate with infrastructure orchestration tools like Terraform, Ansible, Nerdio or Puppet. This integration enables automatic registration and monitoring of resources as they are provisioned or configured through the orchestration tool.
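As an illustration of one of the techniques above, active probing can be sketched in a few lines of Python using only the standard library. This is a minimal sketch, not how any particular monitoring tool implements discovery: the function names `probe_tcp` and `discover`, the timeout value, and the idea of returning a simple dictionary of open ports are all assumptions made for the example. It should only be run against networks you are authorized to scan.

```python
from __future__ import annotations

import ipaddress
import socket


def probe_tcp(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, or unreachable: treat the port as not discovered.
        return False


def discover(cidr: str, ports: list[int], timeout: float = 0.5) -> dict[str, list[int]]:
    """Probe each address in the CIDR range and return {ip: [open ports]}."""
    network = ipaddress.ip_network(cidr, strict=False)
    # For ranges larger than two addresses, skip the network and broadcast
    # addresses; for very small ranges (e.g. /32), probe every address.
    addresses = network.hosts() if network.num_addresses > 2 else network
    found: dict[str, list[int]] = {}
    for addr in addresses:
        open_ports = [p for p in ports if probe_tcp(str(addr), p, timeout)]
        if open_ports:
            found[str(addr)] = open_ports
    return found
```

A call such as `discover("192.168.1.0/28", [22, 443])` would probe each host address in that range sequentially. A real tool would typically probe in parallel, rate-limit its requests, and follow up with protocol-specific checks to identify what is actually listening on each port.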
Is one Autodiscovery methodology better than another?
There is no one-size-fits-all methodology for autodiscovery. The right methodology depends on the target infrastructure. For example, if ICMP is disabled in the target infrastructure, ICMP-based scanning will not be useful.
On the other hand, if the monitoring tool is focused on applications and systems, SNMP may not be the ideal way for autodiscovery. While SNMP is popular for network devices, it is not widely supported for applications and systems.
As organizations tighten firewall rules, techniques such as network scanning, active probing, etc. are mainly being used within private networks/DMZs.
What role can Universal Agents and Operators play in Autodiscovery for IT monitoring?
Some monitoring platforms, such as eG Enterprise, have embraced Universal Agent technology, among other autodiscovery methods, to make monitoring easy and effective. An eG agent deployed on a server monitors the server hardware, the operating system, and all of the applications running on it. The agent is universal in the sense that the same license can be used to deploy an agent irrespective of the operating system or the applications to be monitored. This gives IT managers deployment flexibility. The universal agent can be included in gold images or pushed remotely and rapidly to target systems using any software deployment tool.
Universal Operators perform a similar role to Universal Agents in Kubernetes and cloud orchestration environments and are often standardized, the best-known example being Red Hat OpenShift’s Universal Operator.
What are some of the ways in which Autodiscovery can be controlled or configured?
A common challenge with autodiscovery is that it can introduce excessive and unnecessary load on the infrastructure it is discovering. This is especially true when active probing is used. IT admins therefore need ways to control how autodiscovery is performed. Licensing is another reason: to stay within the licenses they have, an IT team may want to discover only applications or servers of a specific type. Common controls include:
- IT admins should be able to pick and choose which type of IT components must be autodiscovered. For example, a database admin may be interested only in autodiscovery of database instances, while a network admin is only interested in network device discovery.
- The specific network/IP range to be used for discovery should be configurable. This ensures that the discovery process does not overload the entire IT infrastructure but focuses on just the infrastructure of interest to an admin.
- Autodiscovery is a process that is repeated periodically to ensure that new components are automatically added for monitoring. The period for re-discovery must be configurable.
- The mechanism used for discovery should also be configurable. For example, if ICMP is blocked in the network, autodiscovery based on ICMP alone will not be effective, so an alternative mechanism needs to be selectable.
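The controls listed above can be pictured as a small configuration object. The following is a hypothetical sketch, not the configuration format of any real product: the class name `DiscoveryConfig`, its field names, and the default values are all assumptions chosen to mirror the four bullets (component types, IP ranges, re-discovery period, and discovery mechanisms).

```python
from __future__ import annotations

import ipaddress
from dataclasses import dataclass, field


@dataclass
class DiscoveryConfig:
    # Which kinds of components to discover (hypothetical type names).
    component_types: set[str] = field(default_factory=lambda: {"database"})
    # Which networks discovery is allowed to touch, in CIDR notation.
    ip_ranges: list[str] = field(default_factory=lambda: ["10.0.0.0/24"])
    # How often discovery is repeated, in seconds.
    rediscovery_interval_s: int = 3600
    # Which mechanisms to use, in order of preference (hypothetical names).
    methods: list[str] = field(default_factory=lambda: ["agent", "tcp_probe"])

    def in_scope(self, ip: str, component_type: str) -> bool:
        """Return True if a candidate matches both the type and IP range filters."""
        if component_type not in self.component_types:
            return False
        addr = ipaddress.ip_address(ip)
        return any(
            addr in ipaddress.ip_network(cidr, strict=False)
            for cidr in self.ip_ranges
        )
```

With the defaults shown, `DiscoveryConfig().in_scope("10.0.0.15", "database")` is true, while a web server at the same address or a database outside `10.0.0.0/24` is filtered out. Applying such a filter before any probing happens is what keeps both the load and the license count bounded.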
How can one measure how well Autodiscovery is working?
There are a couple of criteria:
- The accuracy of autodiscovery: Does the autodiscovery mechanism correctly identify the device, application, server, service, etc.? If autodiscovery is not accurate, it will create extra work for the IT operations team. For example, in some environments, applications may run on non-standard TCP ports; a web server may run on port 7079 instead of port 443. If autodiscovery relies only on TCP port scans, it may miss the web server or detect it incorrectly.
- The timeliness of autodiscovery: This is measured by how quickly autodiscovery happens. The goal is to minimize the time between when a system, device, or application is introduced in a network and when it is discovered by the monitoring tool.
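Both criteria can be turned into numbers if a trusted inventory is available to compare against. The sketch below is one possible way to do this, under the assumption that components are keyed by an identifier such as `ip:port` and labeled with a type. The function names, the use of precision and recall as accuracy measures, and the sample data are illustrative choices, not part of any specific tool.

```python
from __future__ import annotations

from datetime import datetime, timedelta


def discovery_accuracy(expected: dict[str, str], discovered: dict[str, str]) -> dict[str, float]:
    """Compare discovered {id: type} against a trusted inventory {id: type}."""
    # A discovery counts as correct only if the component exists in the
    # inventory and was classified with the right type.
    correct = sum(1 for key, kind in discovered.items() if expected.get(key) == kind)
    precision = correct / len(discovered) if discovered else 0.0
    recall = correct / len(expected) if expected else 0.0
    return {"precision": precision, "recall": recall}


def discovery_lag(
    introduced_at: dict[str, datetime], discovered_at: dict[str, datetime]
) -> dict[str, timedelta]:
    """Return the time between introduction and discovery for each discovered component."""
    return {
        key: discovered_at[key] - introduced_at[key]
        for key in introduced_at
        if key in discovered_at
    }


# Hypothetical sample: the web server on a non-standard port is misclassified.
expected = {"10.0.0.5:7079": "web_server", "10.0.0.6:5432": "database"}
discovered = {"10.0.0.5:7079": "unknown", "10.0.0.6:5432": "database"}
scores = discovery_accuracy(expected, discovered)
```

In this sample, one of the two discovered components is classified correctly, so both precision and recall come out at 0.5. Tracking the lag values over time shows whether the re-discovery period is short enough for the environment.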
Is Autodiscovery only performed at the component-level?
No. Continuous discovery also has to be performed while monitoring any component. The network interfaces on a router, the tablespaces on a database server, and the disk drives on a server are some of the attributes that must be autodiscovered during normal monitoring of the different components. To ensure the accuracy and relevance of monitoring, such autodiscovery has to happen in real time.
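One simple way to picture this kind of within-component discovery is to compare the set of sub-components seen on each polling cycle with the set seen on the previous one. The sketch below assumes that some collector (not shown) already returns the current list of interfaces, tablespaces, or drives; the class name `SubcomponentTracker` and its `update` method are hypothetical.

```python
from __future__ import annotations


class SubcomponentTracker:
    """Remember the sub-components last seen for each monitored component."""

    def __init__(self) -> None:
        self._known: dict[str, set[str]] = {}

    def update(self, component: str, current: list[str]) -> tuple[set[str], set[str]]:
        """Record the latest poll and return (added, removed) since the previous poll."""
        current_set = set(current)
        known = self._known.get(component, set())
        self._known[component] = current_set
        return current_set - known, known - current_set


tracker = SubcomponentTracker()
tracker.update("router-1", ["eth0", "eth1"])            # first poll: everything is new
added, removed = tracker.update("router-1", ["eth0", "eth2"])
```

After the second poll, `added` is `{"eth2"}` and `removed` is `{"eth1"}`, so monitoring can start for the new interface and stop for the one that disappeared without anyone editing the configuration by hand.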