Auto-unmanaging or Deleting Kubernetes Nodes

Once eG agents discover the Kubernetes nodes in a cluster and the eG manager manages them as components, eG Enterprise uses agent-based or agentless techniques to monitor each managed component and periodically pulls performance metrics from it. Sometimes, eG agents may stop pulling metrics from a component. The failure may persist if the target node is no longer part of the monitored infrastructure or has been shut down for regular maintenance or upgrades. Administrators would be right to delete, or at least temporarily unmanage, such components to save monitoring resources.

This can, however, prove to be a challenge in large dynamic infrastructures, where components are added and removed frequently. In such environments, administrators can find it cumbersome to identify components that are no longer reporting metrics and to manually delete or unmanage them from the eG Enterprise system. To minimize the administrative effort involved, you can configure the eG manager to automatically unmanage or delete components that have not reported metrics for longer than a configured duration.

To achieve the above, do the following:

  1. Log in to the eG admin interface.
  2. Follow the Infrastructure -> Discovery menu sequence.
  3. In the discovery tree, expand the Agent Discovery node, then the Auto Scaling Environment sub-node, then the Kubernetes Infrastructure sub-node, and select the Auto Delete option. The right panel will change to display the KUBERNETES - AUTO DELETE page (see Figure 1).

    Figure 1 : The KUBERNETES - AUTO DELETE window

  4. To enable the eG manager to automatically unmanage/delete the discovered nodes that are not reporting, set the Enable automatic actions on Kubernetes nodes flag to Yes. Doing so brings up the options to unmanage or delete the nodes, as shown in Figure 2.

    Figure 2 : Configuring auto-unmanage/delete settings for the Kubernetes nodes

  5. Next, indicate the action you want to automate for Kubernetes nodes that are not reporting. From the What action would you like to take? list, select the Delete option to delete such components automatically, or select the Unmanage option to unmanage them.
  6. Then, specify when the chosen action should be triggered. In other words, in the Take action for components that are reporting for text box, specify how long a component must go without reporting metrics before the eG manager auto-deletes or unmanages it. By default, this is set to 240 minutes.
  7. If the Delete option is chosen from the What action would you like to take? list, then by default, eG Enterprise will automatically delete only those components that are in the unmanaged state and have not been reporting metrics for the configured duration. If you want the eG manager to auto-delete managed components as well as unmanaged ones, select the Managed and Unmanaged state option from the Take action for the component types in list.
  8. Finally, click the Update button.
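
The way the settings in steps 4 to 7 combine can be sketched in a few lines of Python. This is only an illustrative model of the rules described above, not eG Enterprise code or an eG API: the names AutoActionSettings and decide_action are hypothetical. The sketch also makes two assumptions that the procedure does not state: that the Unmanage action applies only to components that are currently managed, and that the action triggers once the silent period strictly exceeds the configured duration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical names throughout: these mirror the settings on the
# AUTO DELETE page and are not part of any eG Enterprise API.


@dataclass
class AutoActionSettings:
    action: str                    # "Delete" or "Unmanage" (step 5)
    enabled: bool = True           # "Enable automatic actions on Kubernetes nodes" (step 4)
    threshold_minutes: int = 240   # documented default duration (step 6)
    include_managed: bool = False  # True models "Managed and Unmanaged state" (step 7)


def decide_action(settings: AutoActionSettings, is_managed: bool,
                  last_report: datetime, now: datetime) -> Optional[str]:
    """Return "Delete", "Unmanage", or None for a single component."""
    if not settings.enabled:
        return None

    # Assumption: the action fires only after the silent period
    # strictly exceeds the configured duration.
    if now - last_report <= timedelta(minutes=settings.threshold_minutes):
        return None

    if settings.action == "Delete":
        # By default only unmanaged components are deleted (step 7).
        if is_managed and not settings.include_managed:
            return None
        return "Delete"

    if settings.action == "Unmanage":
        # Assumption: an already unmanaged component needs no action.
        return "Unmanage" if is_managed else None

    raise ValueError(f"Unknown action: {settings.action!r}")


if __name__ == "__main__":
    now = datetime(2024, 1, 1, 12, 0)
    silent_since = now - timedelta(minutes=300)  # silent for 300 minutes
    settings = AutoActionSettings(action="Delete")
    print(decide_action(settings, is_managed=True, last_report=silent_since, now=now))
    print(decide_action(settings, is_managed=False, last_report=silent_since, now=now))
```

With the default settings and the Delete action, a managed node that has been silent for 300 minutes is left alone, while an unmanaged node with the same silence is deleted. Setting include_managed to True in this model corresponds to choosing the Managed and Unmanaged state option.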