CA Unified Infrastructure Management Probes

cdm IM Configuration

Last update November 21, 2018

This article describes the configuration concepts and procedures to set up the CPU, Disk, Memory Performance Monitoring (cdm) probe. You can configure the probe to monitor the CPU, disk, and memory performance of the system on which the probe is deployed. The probe automatically identifies components and allows you to specify QoS parameters and thresholds for alarms.

Important! Use cdm version 5.61 or later with cluster version 3.33 or later to view the cluster disks on the cdm Infrastructure Manager (IM).

This article is for probe versions 5.6 and later.

The following diagram outlines the process to configure the cdm probe.

[Figure: Configuring cdm IM]


Verify Prerequisites

Review the following prerequisites in cdm (CPU, Disk, Memory Performance Monitoring) Release Notes before you configure the probe:

  • Verify that required hardware, software, and related information is available.
  • (Version 5.90) Enable FIPS encryption on the system where the probe is deployed. For more information, see the Enable FIPS Encryption section.

(Optional) Configure General Properties

You can change the default configuration of your probe if the default settings do not meet your needs. After installation, the probe is active and immediately attempts to publish data using the default configuration. You can configure the logging properties, and the global QoS and alarm source and target parameters for the probe.

Follow these steps:

  1. Navigate to the Setup > General tab.
  2. Update the following field information:
    • Log level: specifies the level of details that are written to the log file. You can select the following log levels:

      • 0 - Logs only severe information (default)
      • 1 - Logs error information
      • 2 - Logs warning information
      • 3 - Logs general information
      • 4 - Logs debugging information
      • 5 - Logs tracing/low-level debugging information

      Note: Log as little as possible during normal operation to minimize disk consumption, and increase the amount of detail when debugging.

    • Log size: specifies the maximum size of the log file, in kilobytes. When this size is reached, new log file entries are added and the oldest entries are deleted.
      Default: 100 KB

    • Send alarm on each sample: generates an alarm on each sample, if selected. If not selected, the probe waits for the number of samples (specified in the Samples field of the Control Properties tab) before sending the alarm. For example, if the Interval value is 1 minute and the Samples value is 2 on the Control Properties tab, and this option is:
      • Unchecked: the probe generates the first alarm in 2 minutes and each subsequent alarm at 1-minute intervals.
      • Checked: the probe generates the first alarm in 1 minute and each subsequent alarm at 1-minute intervals.
      Default: Selected

        Note: The sample collected at the start of the probe is considered as the first sample. The sample count is cleared on de-activation of the probe. For more information, see Configure Control Properties.

    • Send short name for QoS source: allows you to send the host name. If not selected, sends the FQDN (Fully Qualified Domain Name).

      Important! If the Set QoS source to robot name option is selected in the controller, robot name is used as target.

    • Allow QoS source as target: allows QoS messages to use the QoS source as the target name. By default, QoS messages can use the host name as their target.

    • (Linux, Solaris, AIX and HP-UX platforms only) Calculate Load Average per Processor: allows you to calculate the load average per processor. For all Unix-based systems, the system load measures the computational work performed. This means that if your system has a load of 4, then four running processes are either using or waiting for the CPU. Load average refers to the average of the computer load over several periods of time. 
      Default: Selected
  3. Click Apply.
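
The timing example for Send alarm on each sample can be sketched as follows. This is a minimal illustration, not the probe's implementation: the function name alarm_times is hypothetical, and generalizing the documented 1-minute, 2-sample case to "interval multiplied by samples" is an assumption.

```python
def alarm_times(interval_min, samples, send_on_each_sample, count=3):
    """Return the minutes after probe start at which the first `count`
    alarms would be sent while a threshold stays breached.

    Hypothetical helper that mirrors the documented example only; the
    probe's internal scheduling is not described in this article.
    """
    # Checked: the first alarm follows the first interval.
    # Unchecked: the probe first waits for `samples` intervals (assumption).
    first = interval_min if send_on_each_sample else interval_min * samples
    # After the first alarm, one alarm follows per interval.
    return [first + i * interval_min for i in range(count)]


print(alarm_times(1, 2, send_on_each_sample=False))  # [2, 3, 4]
print(alarm_times(1, 2, send_on_each_sample=True))   # [1, 2, 3]
```

With Interval set to 1 and Samples set to 2, the output matches the Unchecked and Checked cases described in the Send alarm on each sample field.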

(Optional) Configure Control Properties

You can configure the following parameters:

  • Global monitoring interval and sample count for the probe to calculate the values and determine the alarm conditions
  • Filesystem parameters in disks
  • Timeout (Non-Windows platforms) in disks 
  • QoS targets for CPU and memory

Follow these steps:

  1. Navigate to the Setup > Control Properties tab.
  2. Update the following field information to configure the global interval and sampling details for disk, CPU, and memory monitoring:

    Note: Reduce the Interval value to generate alarms and QoS messages more frequently. A shorter interval can also increase the system load.

    • Interval: specifies the interval (in minutes) after which the monitoring information is obtained.

    • Samples: specifies the number of samples the probe stores. This number is used to calculate values to compare with thresholds. 

      Note: Set the sample value to 0 to use the default sample value for QoS messages for disks.

    • QoS Interval (Multiple of 'Interval'): specifies the monitoring interval between each sample collection for QoS information. For example, if the interval is set to 5 minutes and the number of samples is set to 5, the average CPU utilization for the last 25 minutes is displayed in the QoS message.

      Note: If the value is greater than 1, the probe calculates the average data only for CPU monitoring. For disk, memory, and paging QoS messages, the probe generates the QoS messages with the current value. The probe uses the QoS Interval only to reduce the frequency of QoS messages.

    Important! On Solaris Global Zone platforms, the probe executes the command echo ::memstat | mdb -k to retrieve the buffer and cache memory values. Ensure that the alarm and QoS intervals specified in the probe are greater than the mdb command response time. Otherwise, perform one of the following options in the Raw Configuration interface:

    • Set the value of the mem_buffer_timeout key to a suitable time that is less than the profile execution interval. The default value for this key is 25 seconds.
    • Set the mem_buffer_used key to Yes so that the probe does not execute this command. For more information about this key, see cdm Advanced Configuration.
  3. Update the following field information in the Disk properties section to configure the filesystem parameters:
    • Ignore Filesystems: defines the file system to be excluded from monitoring. For example, specify the regular expression C:\\ to exclude the C drive of the system from monitoring. A red symbol is displayed next to the disk drive which is excluded from monitoring in the Disk usage section of the Status tab.

      Important! On UNIX platforms, use the regular expression (/\) to exclude the root directory (/) from monitoring.

    • Filesystem Type Filter: specifies the type of the filesystem to be monitored using regular expressions. The Filesystem Type Filter does not disable monitoring for file systems already discovered and configured. This filter only prevents the probe from discovering a file system.
      • If you specify the RegEx as *, all filesystems are enabled for monitoring. You can also specify ext* to allow monitoring of filesystems whose type contains "ext", for example, ext3 or ext4.

      • If you specify a negative RegEx, the specified filesystems are excluded from monitoring. For example, if you do not want to monitor ext4 filesystems, use /^(?!ext4)/

      • If this field is blank, no filesystem is enabled for monitoring. (default)

    • (Non-Windows platforms only) Timeout: specifies the maximum time for the probe to collect monitoring information. For example, a timeout prevents the probe from going into a pending state when a disk fails or crashes, or when a filesystem is stale. The default timeout to retrieve the disk statistics is 5 seconds.

      Note: CA recommends a value of 10 seconds when the monitored system has high CPU load.

  4. Select Set QoS Target as 'Total' in the CPU properties section to change the QoS target for Total (Individual as well as Average) from the hostname to Total. The following SQL scripts demonstrate how to update old data in the database to conform to the new target when the Set QoS Target as 'Total' option is changed:
    • QOS_CPU_USAGE
      Execute the following SQL query in the UIM database to view the rows to be updated:

      SELECT *
      FROM dbo.s_qos_data
      WHERE probe LIKE 'cdm'
        AND qos LIKE 'qos_cpu_usage'
        AND target NOT IN ('user', 'system', 'wait', 'idle');

      Execute the following SQL query in the UIM database to update the table for the new target. Target is the new QoS target to be set and Source is the QoS source for which target is changed. You can configure both the values, as applicable.

      DECLARE @Target varchar(100);
      DECLARE @Source varchar(100);
      SELECT @Target = 'Total';
      SELECT @Source = 'tsuse10-32';

      UPDATE dbo.s_qos_data
      SET target = @Target
      WHERE source LIKE @Source
        AND probe LIKE 'cdm'
        AND qos LIKE 'qos_cpu_usage'
        AND target NOT IN ('user', 'system', 'wait', 'idle');

    • QOS_CPU_MULTI_USAGE
      Execute the following SQL query in the UIM database to view the rows to be updated:

      SELECT *
      FROM dbo.s_qos_data
      WHERE probe LIKE 'cdm'
        AND qos LIKE 'qos_cpu_multi_usage'
        AND (target NOT LIKE 'User%'
        AND target NOT LIKE 'System%'
        AND target NOT LIKE 'Wait%'
        AND target NOT LIKE 'Idle%');

      Execute the following SQL query in the UIM database to update the table for the new target. Target is the new QoS target to be set and Source is the QoS source for which target is changed. You can configure both the values, as applicable.

      DECLARE @Target varchar(100);
      DECLARE @Source varchar(100);
      SELECT @Target = 'Total';
      SELECT @Source = 'tsuse10-32';

      UPDATE dbo.s_qos_data
      SET target = @Target + RIGHT(target, 2)
      WHERE source LIKE @Source
        AND probe LIKE 'cdm'
        AND qos IN ('qos_cpu_multi_usage')
        AND (target NOT LIKE 'User%'
        AND target NOT LIKE 'System%'
        AND target NOT LIKE 'Wait%'
        AND target NOT LIKE 'Idle%');

  5. Select Set QoS target as 'Memory' in the Memory & paging properties section to specify the QoS target for memory and paging as Memory. The following SQL scripts demonstrate how to update old data in the database when the Set QoS target as 'Memory' option is changed:
    • Execute the following SQL query in the UIM database to view the rows to be updated:

      SELECT *
      FROM dbo.s_qos_data
      WHERE probe LIKE 'cdm'
        AND (qos LIKE 'QOS_MEMORY_PERC_USAGE'
        OR qos LIKE 'QOS_MEMORY_PAGING_PGPS'
        OR qos LIKE 'QOS_MEMORY_PAGING'
        OR qos LIKE 'QOS_MEMORY_PHYSICAL'
        OR qos LIKE 'QOS_MEMORY_PHYSICAL_PERC'
        OR qos LIKE 'QOS_MEMORY_SWAP'
        OR qos LIKE 'QOS_MEMORY_SWAP_PERC'
        OR qos LIKE 'QOS_PHYSICAL_MEMORY_TOTAL');

    • Execute the following SQL query in the UIM database to update the table for the new target. Target is the new QoS target to be set. You can configure the value.

      DECLARE @Target varchar(100);
      SELECT @Target = 'Memory';

      UPDATE dbo.s_qos_data
      SET target = @Target
      WHERE probe LIKE 'cdm'
        AND (qos LIKE 'QOS_MEMORY_PERC_USAGE'
        OR qos LIKE 'QOS_MEMORY_PAGING_PGPS'
        OR qos LIKE 'QOS_MEMORY_PAGING'
        OR qos LIKE 'QOS_MEMORY_PHYSICAL'
        OR qos LIKE 'QOS_MEMORY_PHYSICAL_PERC'
        OR qos LIKE 'QOS_MEMORY_SWAP'
        OR qos LIKE 'QOS_MEMORY_SWAP_PERC'
        OR qos LIKE 'QOS_PHYSICAL_MEMORY_TOTAL');

  6. Click Apply.
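
The negative regular expression described for the Filesystem Type Filter can be tried out with a short sketch. This example uses Python's re module as a stand-in, which is an assumption: the article does not state which regular expression engine the probe uses, so matching details may differ. The function name matching_types, the sample filesystem list, and the ^ext pattern are illustrative and not taken from the probe.

```python
import re

# Example filesystem types; the list is illustrative only.
FS_TYPES = ["ext3", "ext4", "xfs", "nfs"]


def matching_types(pattern, fs_types=FS_TYPES):
    """Return the filesystem types for which `pattern` matches (re.search)."""
    regex = re.compile(pattern)
    return [fs for fs in fs_types if regex.search(fs)]


# Negative lookahead from the text, written without the surrounding slashes:
# matches every type that does not start with "ext4".
print(matching_types(r"^(?!ext4)"))  # ['ext3', 'xfs', 'nfs']

# A positive filter (hypothetical pattern) that keeps types starting with "ext".
print(matching_types(r"^ext"))       # ['ext3', 'ext4']
```

Testing a pattern against a list of known filesystem types in this way can help confirm what a filter includes before it is entered in the probe.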

Configure Alarm and QoS Properties

You can configure the following alarm and QoS properties of the probe:

Configure Cluster Alarm and QoS Source

The probe automatically detects and displays the list of virtual groups belonging to the cluster.  You can configure the alarm and QoS source for each virtual group as one of the following parameters:

  • <cluster ip>: The IP address of the cluster.
  • <cluster name>: The name of the cluster.
  • <cluster name>.<group name>: A combination of the cluster name and the group name

Note: The Cluster tab is available only when the monitored system is part of a cluster and the cluster probe is deployed on the robot.

Follow these steps:

  1. Navigate to the Setup > Cluster tab.
    The probe displays all the cluster groups that include the monitored host.
  2. Double-click the required cluster from the list to display the Group sources dialog.
    The probe displays the group name in the Virtual group field.
  3. Update the following information to configure the alarm and QoS source:
    • Alarm source: specifies the source to be used for alarms.
      Default: <cluster ip>
    • QoS source: specifies the source to be used for QoS messages.
      Default: <cluster ip> 
  4. Click OK to close the dialog.
  5. Click Apply.

Configure Alarm Thresholds for CPU, Memory, and Paging

You can configure the alarm thresholds for the following parameters:

  • CPU usage
  • Memory usage: total, swap, and physical
  • Paging activity

Follow these steps:

  1. Navigate to the Status tab.
  2. Select the checkbox next to the applicable field to enable the respective threshold.

    Note: CA recommends that you specify a higher severity alarm (Default severity: Error) for the High threshold and a lower severity alarm (Default severity: Warning) for the Low threshold.

  3. Update the following information in the CPU usage section.
    • High: specifies the CPU usage threshold above which the probe generates a higher severity alarm.
    • Low: specifies the CPU usage threshold above which the probe generates a lower severity alarm.
  4. In the Memory usage section, select one of the following memory categories to specify the threshold:
    • M: allows you to specify the threshold values for total memory usage.
    • S: allows you to specify the threshold values for swap memory usage.
    • P: allows you to specify the threshold values for physical memory usage.
  5. Update the following information. The values in these fields depend on the selected memory category in Step 4.
    • High: specifies the memory usage threshold above which the probe generates a higher severity alarm.
    • Low: specifies the memory usage threshold above which the probe generates a lower severity alarm.
  6. (Optional) Repeat Step 4 and Step 5 for required memory categories.
  7. Update the following information in the Paging activity section:
    • High: specifies the number of paging data operations per second above which the probe generates a higher severity alarm.
    • Low: specifies the number of paging data operations per second above which the probe generates a lower severity alarm.
  8. Click Apply.
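
The relationship between the High and Low thresholds can be sketched as a small classification function. This is a simplified illustration under stated assumptions: the function name threshold_severity, the example values 90 and 75, and the strict greater-than comparison are not taken from the probe. Passing None models a threshold whose checkbox is not selected.

```python
def threshold_severity(value, high=None, low=None):
    """Classify a sampled value against the High and Low thresholds.

    Returns "error" when the value exceeds High, "warning" when it exceeds
    Low, and None otherwise. The severity names follow the default
    severities mentioned in the note (Error for High, Warning for Low).
    """
    # The High threshold is checked first so the higher severity wins.
    if high is not None and value > high:
        return "error"
    if low is not None and value > low:
        return "warning"
    return None


print(threshold_severity(95, high=90, low=75))  # error
print(threshold_severity(80, high=90, low=75))  # warning
print(threshold_severity(50, high=90, low=75))  # None
```

Checking the High threshold first reflects the recommendation to give the High threshold the higher severity.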

Configure Alarm Thresholds for Multi-CPU

You can specify the alarm thresholds for multi-CPU systems. A multi-CPU (multi-core processor) is a single computing component with two or more independent processors called cores. Each core individually reads and executes program instructions. A multi-core processor implements multiprocessing in a single physical package. 

Follow these steps:

  1. Navigate to the Multi CPU tab.

    Note: This tab is available only when the probe is monitoring a multi-CPU computer.

  2. Select the checkbox next to the applicable field to enable the respective threshold.
  3. Update the following information in the Alarm on section:
    • Maximum: specifies the maximum usage of any CPU in a multi-CPU system. The probe generates an alarm if the CPU usage is greater than the specified value.
    • Difference: specifies the maximum difference in usage between any two CPUs in a multi-CPU system. The probe generates an alarm if the difference is greater than the specified value.
  4. Click Apply.
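
The Maximum and Difference checks can be illustrated with a short sketch. The function name multi_cpu_alarms and the sample values are hypothetical, and the way the probe evaluates these thresholds internally is not described in this article. The largest difference between any two CPUs equals the highest usage minus the lowest usage, which is what the sketch computes.

```python
def multi_cpu_alarms(cpu_usage, maximum=None, difference=None):
    """Return the names of the multi-CPU checks that a set of samples breaches.

    cpu_usage: per-CPU usage percentages, one entry per CPU.
    maximum / difference: thresholds in percent; None means not enabled.
    """
    alarms = []
    highest, lowest = max(cpu_usage), min(cpu_usage)
    # Maximum: the usage of any single CPU exceeds the threshold.
    if maximum is not None and highest > maximum:
        alarms.append("maximum")
    # Difference: the gap between the busiest and the idlest CPU is too large.
    if difference is not None and highest - lowest > difference:
        alarms.append("difference")
    return alarms


print(multi_cpu_alarms([20.0, 35.0, 97.0, 40.0], maximum=90, difference=50))
# ['maximum', 'difference']
print(multi_cpu_alarms([60.0, 65.0, 70.0, 62.0], maximum=90, difference=50))
# []
```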

Configure Advanced Monitoring Properties

You can configure the probe to generate QoS messages for the following components:

  • CPU: single and multi-CPU systems
  • CPU load average
  • Memory: total, swap, and physical
  • Physical Memory: total physical memory (RAM) available for a system.

You can also configure the following monitoring properties:

  • Alarms on processor queue length and system reboot
  • CPU usage parameters
  • Paging measurement unit

Follow these steps:

  1. Navigate to the Advanced tab. 
  2. Select the applicable QoS parameters from the following fields:
    • (Windows only) Processor Queue Length: enables you to generate QoS messages for the number of queued processes waiting for CPU time, divided by the number of processors in the system.

      Note: This field is available as System Load (Processor Queue Length) on AIX, SGI, Linux, and Solaris platforms.

    • Computer uptime (hourly): enables you to generate hourly QoS messages for the computer uptime (in seconds).
    • (AIX, SGI, Linux, and Solaris) Load Average 1 min: enables you to generate QoS messages for the average system load during the last one minute.
      Default: Not Selected
    • (AIX, SGI, Linux, and Solaris) Load Average 5 min: enables you to generate QoS messages for the average system load during the last five minutes.
      Default: Not Selected
    • (AIX, SGI, Linux, and Solaris) Load Average 15 min: enables you to generate QoS messages for the average system load during the last fifteen minutes.
      Default: Not Selected

      Note: The probe does not use sampling for the three load average QoS messages. The values are calculated for each processor if Calculate Load Average per Processor is selected in the Setup > General tab.

    • Memory Usage: enables you to generate QoS messages for the amount of total available memory (physical and virtual memory) used in Mbytes.
    • Memory in %: enables you to generate QoS messages for the amount of total available memory (physical and virtual memory) used in %.
    • Memory Paging in Kb/s: enables you to generate QoS messages for the amount of paging virtual memory in Kbytes/second.
    • Memory Paging in Pg/s: enables you to generate QoS messages for the amount of paging virtual memory in pages per second.

      Note: If you use probe version 3.70 or earlier, the QoS settings in the GUI are different from those in version 3.72. However, if you have already created QoS entries in the database for kilobytes per second (KB/s), pages per second (Pg/s), or both using probe version 3.70 or earlier, these entries are retained and updated with QoS data from probe version 3.72 and later.

    • Physical Memory Usage: enables you to generate QoS messages for the amount of total available physical memory used in Kbytes.
    • Physical Memory in %: enables you to generate QoS messages for the amount of total available physical memory used in %.
    • Swap Memory Usage: enables you to generate QoS messages for the space on the disk used for the swap file in Kbytes.
    • Swap Memory in %: enables you to generate QoS messages for the space on the disk used for the swap file in %.
    • Physical Memory: enables you to generate QoS messages for the total physical memory (RAM only) available for the system.
    • (From version 6.10 on AIX only, with Active Memory Expansion (AME) enabled) Memory Expansion Factor: enables you to generate QoS messages for the target memory expansion factor that you have configured in your system to increase the total physical memory of a logical partition (LPAR).
    • (From version 6.10 on AIX only, with AME mode enabled) Memory Xphysc: enables you to generate QoS messages for the number of physical processors used for AME.
    • (From version 6.10 on AIX only) Smt: enables you to generate QoS messages for the number of simultaneous multithreads in the partition.
    • (From version 6.10 on AIX only, with AME mode enabled) Memory Dxm: enables you to generate QoS and alarm messages for the memory deficit in megabytes. At times, an LPAR cannot be configured with the provided memory expansion factor as it is too large, and the workload in the LPAR does not compress well. Thus, a memory deficit is created.
    • (From version 6.10 on AIX only) Total Memory: enables you to generate QoS messages for the total memory size of an LPAR in megabytes.
  3. Select the applicable QoS parameters in the Total CPU tab from the following fields:
    • CPU Usage (Total): enables you to generate QoS messages for the total CPU usage. The probe does not include tasks such as input/output, because the CPU usage for these tasks remains 0%.

      Note: The CPU Usage (Total) QoS message includes user and system CPU usage. The probe also includes the CPU wait information in the message if you select CPU Wait is included in CPU Usage (Total) in the CPU Usage options section.

    • CPU User: enables you to generate QoS messages for the CPU usage on user tasks.
    • CPU System: enables you to generate QoS messages for the CPU usage on system tasks.
    • CPU Wait: enables you to generate QoS messages for the time that the CPU waits to access external devices or memory.
    • CPU Idle: enables you to generate QoS messages for the time the CPU runs idle without processing anything.
  4. (Multi-CPU Systems only) Select the applicable QoS parameters in the Individual CPU tab from the following fields:
    • Individual CPU Usage (Total): enables you to generate QoS messages for the total usage of each CPU. The probe does not include tasks such as input/output, because the CPU usage for these tasks remains 0%.

      Note: The Individual CPU Usage (Total) QoS message includes user and system CPU usage. The probe also includes CPU wait information in the message if you select CPU Wait is included in CPU Usage (Total) in the CPU Usage options section.

    • Individual CPU User: enables you to generate QoS messages for the CPU usage on user tasks.
    • Individual CPU System: enables you to generate QoS messages for the CPU usage on system tasks.
    • Individual CPU Wait: enables you to generate QoS messages for the time that the CPU waits to access external devices or memory.
    • Individual CPU Idle: enables you to generate QoS messages for the time the CPU runs idle without processing anything.
  5. Select Alarm on Processor Queue Length to enable alarms on the maximum length of the processor queue.

    Note: This field is available as Alarm on System Load on AIX, SGI, Linux, and Solaris platforms.

  6. Update the following information to configure alarms on the length of processor queue:
    • Max. Queue Length: specifies the maximum length of the processor queue before the probe generates an alarm.

      Note: On multi-CPU systems, the queued processes are distributed across the processors, so the threshold applies per processor. For example, on a system with four processors and the default Max. Queue Length value (4), alarm messages are generated if the number of queued processes exceeds 16.

    • Message id: specifies the alarm message to use when the length of the processor queue is greater than the specified value.
  7. Select Detected reboot in the Alarm on section to generate an alarm when the probe detects a system reboot.
  8. Update the following information in the CPU Usage options section:
    • CPU Wait is included in CPU Usage (Total): enables you to add the CPU wait time to the total CPU usage of all and individual CPUs.
    • (AIX only) CPU stats. against entitled capacity: calculates the CPU usage based on the entitled capacity.
      The formula to calculate CPU usage on an AIX system uses values from the lparstat -i command:

      lparstat -i command
      TotCapacity = (maxVirtualCPU / maxCapacity) * 100;
      CPU User = (CPU User * EntCap) / TotCapacity;
      cpuStats->fSystem = (double)((cpuStats->fSystem * cpuStats->fEntCap)/TotCapacity);
      cpuStats->fWait = (double)((cpuStats->fWait * cpuStats->fEntCap)/TotCapacity);
      cpuStats->fIdle = (double)((cpuStats->fIdle * cpuStats->fEntCap)/TotCapacity);

    • Top CPU consuming processes in alarm: specifies the number of top CPU consuming processes that are included in CPU usage alarms.
      Default: 5
      Consider the following points while using this option:
      • This alarm is generated when the defined total CPU usage is breached. The new alarms generate the process information in the following format:
        [processname[pid]-cpu%]; [processname[pid]-cpu%]
      • The actual CPU value in the alarm may not always match the total percentage of all the top CPU consuming processes shown in the alarm message. It may vary as Total CPU Usage is calculated on the basis of samples. The probe retrieves the raw data at a given time and displays in the alarm.
      • On non-Windows platforms, the probe uses the ps command to retrieve the top CPU consuming processes.
      • Depending on your environment, the reported values (%) can exceed 100%. For example, consider a 56-core system that uses hyper-threading with 2 threads per core. The system then has 112 virtual cores (56 * 2), so the maximum possible value from Top CPU consuming processes is 11200% (112 virtual cores * 100). This is why you may see CPU consumption values over 100%, 1000%, or even 5000%.
  9. Update the following information in the Memory Usage options section:
    • Top Memory consuming processes in alarm: specifies the number of top memory consuming processes that are included in memory usage alarms.
      Default: 5
  10. Select from the following options in the Paging measured in section to configure the unit for paging activity messages:
    • Kilobytes per second: enables the probe to use paging activity as data that is sent to or read from virtual memory in Kbytes/second.
    • Pages per second: enables the probe to use paging activity as the number of paging operations on the virtual memory in pages/second.

    Default: Kilobytes per second

    Note: When changing the paging selection, the header of the Paging graph on the Status tab immediately changes to show the selected unit, but the values in the graph do not change until the next sample is measured.

  11. Click Apply.
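
Two of the calculations mentioned in the steps above can be reproduced with a few lines of arithmetic: the processor queue alarm limit on multi-CPU systems, and the upper bound for values reported by Top CPU consuming processes in alarm. This is a sketch only; the function names are hypothetical and the code simply restates the documented examples.

```python
def queue_alarm_limit(max_queue_length, processors):
    """Total number of queued processes above which the alarm is generated.

    Max. Queue Length applies per processor, so the limit scales with the
    number of processors.
    """
    return max_queue_length * processors


def max_top_process_percent(physical_cores, threads_per_core=1):
    """Upper bound for a process CPU percentage: 100% per virtual core."""
    virtual_cores = physical_cores * threads_per_core
    return virtual_cores * 100


# Four processors with the default Max. Queue Length of 4.
print(queue_alarm_limit(4, 4))         # 16

# 56 cores with hyper-threading (2 threads per core) give 112 virtual cores.
print(max_top_process_percent(56, 2))  # 11200
```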

Configure Alarms and QoS for Disk Usage

You can configure the probe to monitor local, clustered, and shared disks over the network. When monitoring cluster or shared disks (such as NFS mounts) over low-performance or over-utilized lines, the response time can be slow.

Note: The NFS mounts monitored in the cdm probe point to servers. These servers also appear during discovery in USM.

You can generate QoS messages and alarms for the following disk performance categories:

  • Disk usage on local, clustered, and shared disks (in percentage or megabytes)
  • Change in disk usage on local and clustered disks (in megabytes or percentage)

Note: On Linux, the probe uses the mount entries in the /proc/mounts file to display the file system type of devices that are remounted to a different location.

Follow these steps:

  1. Navigate to the Status tab.
  2. (Optional) Right-click a shared or network disk and select Enable space monitoring to monitor availability state and usage of shared and network disks using any robot where the probe is deployed.

    Note: The Enable space monitoring option appears only for the shared drive/folder (using the New Share... option) that is monitored by the cdm probe.

  3. (Optional) Right-click in the Disk usage section and select Modify Default Disk Parameters to open the Fixed Disk Properties dialog and configure the default properties for disk monitoring. Select Active in the Present tab to apply the default parameters to all disks. The probe also generates alarms if a configured disk is missing.
  4. Right-click a disk in the Disk usage section and select Edit to open the Drive name dialog.

  5. Update the following information to configure alarms on disk usage in the Disk usage and thresholds tab. The tab also displays the total, used, and free space on the disk.

    Note: CA recommends that you specify a higher severity alarm (Default severity: Error) for the High threshold and a lower severity alarm (Default severity: Warning) for the Low threshold.


    • Monitor disk using: select from the following options to specify the measuring criteria for disk usage:
      • MB: enables the probe to generate disk usage alarms in megabytes.
      • % : enables the probe to generate disk usage alarms in percentage.
    • High: specifies the minimum free disk space and the applicable higher severity alarm message. The probe generates the alarm if the free disk space is lower than the specified value.
    • Low: specifies the minimum free disk space and the applicable lower severity alarm message. The probe generates the alarm if the free disk space is lower than the specified value.

    You can also view the average free space of the last 4 collected samples.

  6. Select the applicable QoS messages that the probe generates for disk usage in the Quality of Service message section:
    • on Disk usage in Mb: enables the probe to generate QoS messages on disk usage (in megabytes).
    • on Disk usage in %: enables the probe to generate QoS messages on disk usage (in percentage).
    • on Disk Free in Mb: enables the probe to generate QoS messages on free disk space (in megabytes).
    • on Disk Free in %: enables the probe to generate QoS messages on free disk space (in percentage).
  7. Navigate to the Disk usage change and thresholds tab to configure QoS and alarms for change in disk usage.
  8. Select one of the following options to configure disk usage change calculation:
    • Change summarized over all samples: enables the probe to use the difference between the latest sample and the first sample. The number of samples that the probe stores in memory for threshold comparison is set in Samples on the Setup > Control Properties tab.

      Note: There can be discrepancies between the values in QoS messages and the values in alarms when the Change summarized over all samples option is selected. This is because QoS messages are generated on every interval, whereas alarms are based on the change summarized over all stored samples.

    • Change between each sample: enables the probe to calculate the change in disk usage after each sample is collected.
  9. Update the following information to configure alarms for change in disk usage:

    Note: CA recommends that you specify a higher severity alarm (Default severity: Error) for the High threshold and a lower severity alarm (Default severity: Warning) for the Low threshold.


    • Type of change: allows you to specify whether alarms are generated on increase, decrease, or both increase and decrease in disk usage.
    • Monitor disk usage change using: select from the following options to specify the measuring criteria for disk usage change:
      • MB: enables the probe to generate disk usage change alarms in megabytes.
      • % : enables the probe to generate disk usage change alarms in percentage.
    • High: specifies the maximum change in disk usage and the applicable higher severity alarm message. The probe generates the alarm if the change in disk usage is greater than the specified value.
    • Low: specifies the minimum change in disk usage and the applicable lower severity alarm message. The probe generates the alarm if the change in disk usage is greater than the specified value. 
  10. Select the applicable QoS messages that the probe generates for disk usage change in the Quality of Service message section:
    • on Disk Usage change in Mb: enables the probe to generate QoS messages on change in disk usage (in megabytes).
    • on Disk Usage change in %: enables the probe to generate QoS messages on change in disk usage (in percentage).
  11. Click OK to close the dialog.
  12. Click Apply.
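
The two calculation modes and the High and Low thresholds described above can be illustrated with a short Python sketch. This is not the probe's implementation. The function names, the sample values, and the assumption that samples are disk usage readings in MB stored oldest first are all hypothetical.

```python
from typing import List, Optional


def change_over_all_samples(samples: List[float]) -> float:
    """Difference between the latest and the first stored sample."""
    return samples[-1] - samples[0]


def change_between_samples(samples: List[float]) -> float:
    """Difference between the latest sample and the one before it."""
    return samples[-1] - samples[-2]


def classify_change(change: float, high: float, low: float,
                    type_of_change: str = "both") -> Optional[str]:
    """Return 'high', 'low', or None for a change in disk usage.

    type_of_change is 'increase', 'decrease', or 'both'. These are
    hypothetical names that mirror the Type of change field.
    """
    if type_of_change == "increase" and change <= 0:
        return None
    if type_of_change == "decrease" and change >= 0:
        return None
    magnitude = abs(change)
    if magnitude > high:   # High threshold: higher severity alarm
        return "high"
    if magnitude > low:    # Low threshold: lower severity alarm
        return "low"
    return None


usage_mb = [1000.0, 1040.0, 1100.0, 1110.0]  # hypothetical samples, oldest first
print(change_over_all_samples(usage_mb))             # 110.0
print(change_between_samples(usage_mb))              # 10.0
print(classify_change(110.0, high=100.0, low=50.0))  # high
print(classify_change(10.0, high=100.0, low=50.0))   # None
```

With the same samples, the two modes return different changes (110 MB and 10 MB here), which is one way that a value based on a single interval can differ from a value summarized over all samples.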

(UNIX only) Configure Inode Disk Monitoring

You can view the number of total, used, and free inodes on the filesystem. You can configure thresholds for alarms and QoS.

Follow these steps:

  1. Navigate to the Status tab.
  2. (Optional) Right-click a shared disk and select Enable space monitoring to configure disk usage on shared disks.
  3. Right-click a disk in the Disk usage section and select Edit to open the Drive name dialog.

    Note: You can also configure the default properties for inode monitoring. Right-click in the Disk usage section and select Modify Default Disk Parameters to open the Fixed Disk Properties dialog. Select Active in the Present tab to apply the default parameters to all disks. The probe also generates alarms if a configured disk is missing.

  4. Navigate to the Inode usage and thresholds tab.
  5. Update the following information to configure alarms for inode usage:

    Note: CA recommends that you specify a higher severity alarm (Default severity: Error) for the High threshold and a lower severity alarm (Default severity: Warning) for the Low threshold.


    • Monitor inodes using: select from the following options to specify the measuring criteria for inode usage:
      • inodes: enables the probe to generate inode usage alarms in number of inodes.
      • % : enables the probe to generate inode usage alarms in percentage.
    • High: specifies the minimum number of free inodes and the applicable higher severity alarm message. The probe generates the alarm if the number of free inodes is lower than the specified value.
    • Low: specifies the minimum number of free inodes and the applicable lower severity alarm message. The probe generates the alarm if the number of free inodes is lower than the specified value.
  6. Select the applicable QoS messages that the probe generates for inode usage:
    • on Inode usage in inode count: enables the probe to generate QoS messages on inode usage (in number of inodes).
    • on Inode usage in %: enables the probe to generate QoS messages on inode usage (in percentage).
  7. Click OK to close the dialog.
  8. Click Apply.
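
To make the inode thresholds concrete, the following Python sketch reads the inode counts of a filesystem and compares the free inodes against a High and a Low value. It is an illustration only, not the probe's code. The function names, the threshold values, and the "error"/"warning" return values are assumptions, and `os.statvfs` is available only on UNIX-like systems.

```python
import os
from typing import Optional, Tuple


def inode_counts(path: str) -> Tuple[int, int, int]:
    """Return (total, used, free) inode counts for the filesystem holding path."""
    st = os.statvfs(path)  # UNIX-like systems only
    total = st.f_files
    free = st.f_ffree
    return total, total - free, free


def inode_alarm(total: int, free: int, high: float, low: float,
                use_percent: bool = True) -> Optional[str]:
    """Compare free inodes with the thresholds.

    Assumption: both thresholds are minimum free values, so the High
    threshold is the smaller number and is checked first.
    """
    if total == 0:
        return None  # some filesystems do not report inode counts
    value = (free / total) * 100.0 if use_percent else free
    if value < high:
        return "error"
    if value < low:
        return "warning"
    return None


print(inode_alarm(1000, 40, high=5, low=10))   # error (4% free)
print(inode_alarm(1000, 80, high=5, low=10))   # warning (8% free)
print(inode_alarm(1000, 500, high=5, low=10))  # None

if hasattr(os, "statvfs"):
    total, used, free = inode_counts("/")
    print(total, used, free)
```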

Monitor Network Disk Availability

You can monitor the availability of NFS disks and generate alarm and QoS messages. 

Follow these steps:

  1. Navigate to the Status tab.
  2. Right-click a shared or network disk and select Enable space monitoring.
  3. Right-click the shared or network disk in the Disk usage section and select Edit to open the Drive name dialog.

  4. Update the following information to configure the QoS on disk availability.
    • Network Drive: specifies the network drive location.
    • Mountpoint: specifies the server to which the mount point points.
    • Error message: specifies the error message that is generated when the network disk is unavailable.
    • Disk Available Quality of Service message: enables the probe to generate a QoS message when the network disk is not available.

      Note: If you select Disk Available Quality of Service message without enabling space monitoring (right-click the network disk in the Disk usage section and select Enable space monitoring), the probe may generate QoS for disks that are available and mounted but in a stale condition.

  5. Click OK to close the dialog and enable the probe to generate QoS messages for the network disk availability.
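
The difference between an available, an unavailable, and a stale network disk can be sketched in Python as follows. This is a simplified illustration and does not describe how the probe checks availability. The function name and the three state names are assumptions.

```python
import errno
import os


def disk_state(mount_point: str) -> str:
    """Classify a mount point as 'available', 'stale', or 'unavailable'.

    Limitation: if the NFS server does not respond at all, os.stat can
    block for a long time. This sketch does not add a timeout.
    """
    try:
        os.stat(mount_point)
    except OSError as exc:
        # ESTALE is reported for a stale NFS file handle.
        if exc.errno == getattr(errno, "ESTALE", None):
            return "stale"
        return "unavailable"
    # The directory exists, but the disk may not be mounted on it.
    if not os.path.ismount(mount_point):
        return "unavailable"
    return "available"


print(disk_state("/nonexistent/cdm-example"))  # unavailable
```

A path such as `/nonexistent/cdm-example` is a placeholder. Replace it with the mount point of the network disk that you want to check.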

Monitor Shared Disk Availability

You can monitor the availability of shared (network) disks and generate alarm and QoS messages.

Follow these steps:

  1. Navigate to the Status tab.
  2. Right-click in the Disk usage section and select New Share.
  3. Specify the path of the shared disk.
  4. Click OK to open the Disk name dialog.
  5. Update the following fields in the Share Properties section:
    • Share: displays the path of the shared disk.
    • User: specifies the username of the account that can access the shared disk.
    • Password: specifies the password for the specified User.
    • Message: specifies the alarm that the probe generates when connection to the shared disk fails.
  6. Select Folder Availability Quality of Service Message to enable the probe to generate QoS messages for the shared folder availability.
  7. Click OK to close the dialog.
  8. Click Apply.

Tip: How to mount CIFS share for monitoring

The folder to be mounted on Linux must have the "sharing" option enabled. Use the following command to mount a Windows folder on Linux:

mount -t cifs -o username=administrator,password=<password> //<machine_name_to_mount>/data /windows

Once the drive is added, the type appears as “Network”.

(Optional) Configure Active Messages

You can associate alarm messages with the applicable thresholds. You can configure alarms for CPU, disk, memory, computer, and other parameters.

Follow these steps:

  1. Navigate to the Setup > Message definitions > CPU tab.
  2. Drag-and-drop the applicable messages from the Message pool to configure the following sections:
    • CPU usage: allows you to configure the following alarms for total CPU usage. CA recommends that you specify a higher severity alarm (Default severity: Error) for the High threshold and a lower severity alarm (Default severity: Warning) for the Low threshold.
      • High threshold: specifies the (error) alarm message to generate when the total CPU usage is greater than the specified high threshold.
        Default: CpuErrorProcesses
      • Low threshold: specifies the (warning) alarm message to generate when the total CPU usage is greater than the specified low threshold.
        Default: CpuWarningProcesses
    • Individual CPU usage: allows you to configure the following alarms for CPU usage of individual CPUs. The configuration is only applicable for multi-CPU systems.
      • CPU maximum: specifies the alarm message to generate when the individual CPU usage is greater than the specified threshold.
        Default: CpuMultiMaxError 
      • CPU difference: specifies the alarm message to generate when the difference in usage of two CPUs is greater than the specified threshold.
        Default: CpuMultiDiffError
  3. Navigate to the Setup > Message definitions > Disk tab. You can modify the alarm messages for disks through the Status tab.
  4. Navigate to the Setup > Message definitions > Memory tab.
  5. Drag-and-drop the applicable messages from the Message pool to configure the following sections. The probe generates the alarms for total, swap, or physical memory. The alarms are based on the view selected in the Status tab.

    Note: CA recommends that you specify a higher severity alarm (Default severity: Error) for the High threshold and a lower severity alarm (Default severity: Warning) for the Low threshold.


    • Pagefile usage: allows you to configure the following alarms for pagefile usage.
      • High threshold: specifies the (error) alarm message to generate when the pagefile usage is greater than the specified high threshold.
        Default: PagefileError
      • Low threshold: specifies the (warning) alarm message to generate when the pagefile usage is greater than the specified low threshold.
        Default: PagefileWarning
    • Paging activity: allows you to configure the following alarms for paging data transfers per second.
      • High threshold: specifies the (error) alarm message to generate when the number of paging data transfers per second is greater than the specified high threshold.
        Default: PagingError
      • Low threshold: specifies the (warning) alarm message to generate when the number of paging data transfers per second is greater than the specified low threshold.
        Default: PagingWarning
  6. Navigate to the Setup > Message definitions > Computer tab.
  7. Drag-and-drop the applicable message from the Message pool to the Message field to configure the Computer boot alarm section. The probe generates the alarm when a monitored system reboots.
    Default: BootAlarm
  8. Navigate to the Setup > Message definitions > Other tab.
  9. Drag-and-drop the applicable message from the Message pool to the Message field. The probe generates the alarm when it is unable to retrieve monitoring information.
    Default: InternalAlarm
  10. Click Apply.
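
The relationship between thresholds and the associated messages for CPU can be sketched in Python as follows. The message names are the defaults listed above. The function names and the threshold values are hypothetical, and the sketch does not describe the probe's internal logic.

```python
from typing import List, Optional

# Default message names from the CPU tab.
CPU_HIGH_MESSAGE = "CpuErrorProcesses"
CPU_LOW_MESSAGE = "CpuWarningProcesses"
CPU_MAX_MESSAGE = "CpuMultiMaxError"
CPU_DIFF_MESSAGE = "CpuMultiDiffError"


def total_cpu_message(usage: float, high: float, low: float) -> Optional[str]:
    """Pick the message for total CPU usage, checking the High threshold first."""
    if usage > high:
        return CPU_HIGH_MESSAGE
    if usage > low:
        return CPU_LOW_MESSAGE
    return None


def multi_cpu_messages(per_cpu: List[float], max_threshold: float,
                       diff_threshold: float) -> List[str]:
    """Pick the messages for individual CPU usage on a multi-CPU system."""
    messages = []
    if max(per_cpu) > max_threshold:
        messages.append(CPU_MAX_MESSAGE)
    # The largest difference between any two CPUs is max minus min.
    if max(per_cpu) - min(per_cpu) > diff_threshold:
        messages.append(CPU_DIFF_MESSAGE)
    return messages


print(total_cpu_message(95.0, high=90.0, low=75.0))   # CpuErrorProcesses
print(total_cpu_message(80.0, high=90.0, low=75.0))   # CpuWarningProcesses
print(multi_cpu_messages([10.0, 95.0], 90.0, 50.0))   # ['CpuMultiMaxError', 'CpuMultiDiffError']
```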

(Optional) Create Custom Profiles

You can add custom profiles to the probe and define thresholds for each profile based on your requirements. For example, you can create one profile with a threshold based on percentage usage and another profile with a threshold based on space usage. You can also configure custom profiles for CPU, Disk, and Memory resources that are not currently available in the monitored system. For example, you can create a custom profile for a cluster disk that is only temporarily available in the system.

Follow these steps:

  1. Navigate to the Custom tab.
  2. Right-click in the Profiles section and select from the following options:
    • New CPU Profile: allows you to create a custom CPU profile.
    • New Disk Profile: allows you to create a custom disk profile.
    • New Memory Profile: allows you to create a custom memory profile.
  3. Specify a Name and Description for the profile.
  4. (CPU Profile) Specify whether the probe generates alarms for average CPU or individual CPU on multi-CPU systems from the Alarm on drop-down list. Configure the alarm and QoS parameters, as required.
  5. (Disk Profile) Select the type of the disk from the New Disk Profile dialog and click OK. Configure the alarm and QoS parameters, as required.
  6. Update the following information to configure the disk properties:
    • Regular Expression for Mount point: allows you to specify a regular expression to filter through mount points.
    • Mount point: allows you to select a mount point for the disk.
    • Device: specifies the name of the device the disk is connected to.
  7. (Memory Profile) Configure the alarm and QoS parameters, as required.
    • Top Memory consuming processes in alarm: specifies the number of top memory consuming processes that are included in memory usage alarms.
      Default: 5
  8. Click OK to close the dialog and create the profile.
  9. Select the checkbox next to the profile name in the Profiles section to activate the profile.
  10. Click Apply.
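
Before you enter a value in Regular Expression for Mount point, you can try the expression against a list of mount points. The following Python sketch does this with Python's `re` module. It assumes that the probe's regular expression syntax behaves like Python's for simple patterns, which the documentation does not confirm. The function name and the mount points are hypothetical.

```python
import re
from typing import List


def matching_mount_points(pattern: str, mount_points: List[str]) -> List[str]:
    """Return the mount points that the regular expression matches."""
    compiled = re.compile(pattern)
    return [mp for mp in mount_points if compiled.search(mp)]


mounts = ["/", "/data1", "/data2", "/var/log", "/mnt/cluster_disk"]
print(matching_mount_points(r"^/data\d+$", mounts))  # ['/data1', '/data2']
print(matching_mount_points(r"cluster", mounts))     # ['/mnt/cluster_disk']
```

Anchoring the pattern with `^` and `$` keeps it from matching longer paths that only contain the text.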

(Optional) View Current Values in Probe Interface

You can view the current monitored values in the probe interface for the following components:

  • CPU usage
  • Memory usage (total, swap, and physical)
  • Paging activity
  • Disk usage
  • Multi-CPU usage

Follow these steps:

  1. Navigate to the Status tab. You can view the CPU usage, memory usage, and paging activity graphs. The Disk usage section displays the total, used, and free space on the disk.
  2. Right-click the required graph and select from the following graph-view options:
    • As 2D line
    • As 2D bar
    • As 2D area
    • As 3D line
    • As 3D bar
    • As 3D area
  3. (Memory usage) You can also select from the following options to filter through the type of memory data:
    • M: allows you to view the values for total memory usage.
    • S: allows you to view the values for swap memory usage.
    • P: allows you to view the values for physical memory usage.
  4. Navigate to the Multi CPU tab. You can view the CPU usage graph of each individual CPU in the system.
  5. Under Select processors to view, select the required individual CPU to view the graph of only the selected CPU.
  6. Click the Update button to refresh the graphs to the current values. Clicking the Update button displays the current values only in the probe interface. QoS and alarms use monitored values at configured intervals.

(Optional) Configure Alarm Messages

You can create alarm messages, as required. You can then select the message at the applicable field in the probe.

Follow these steps:

  1. Navigate to the Setup > Message definitions tab.
  2. Right-click in the Message pool and select New. You can also click Edit to modify an existing message.
  3. Update the following information to configure the message properties:
    • Message text: specifies the text for the alarm message. You can use a $ sign to select variables from a list.
    • Severity: specifies the severity of the alarm.
      Default: Information 
    • Message token: specifies the category of the alarm message to be identified in USM.
    • Message subsystem: defines the subsystem ID of the message.

      Note: CA recommends that you do not assign the same subsystem ID to different messages.

  4. Click OK to save the message.
  5. Click Apply.

You can use the following variables with a message to create an alarm:

  • $boot_time
  • $check_description
  • $check_name
  • $cpu_multi_diff_err_name
  • $cpu_multi_max_err_name
  • $description
  • $device
  • $directory
  • $disk
  • $drive
  • $error
  • $faverage
  • $fcpu_idle
  • $fcpu_system
  • $fcpu_usage
  • $fcpu_used
  • $fcpu_user
  • $fidle
  • $fidle_average
  • $file
  • $filesys
  • $filesystemtype
  • $fiowait
  • $fiowait_average
  • $fmax_average
  • $fproc_qlength
  • $fsystem_average
  • $fusage_average
  • $fuser_average
  • $hostname
  • $icount
  • $icpu_id
  • $id
  • $isample
  • $isamples
  • $laverage
  • $limit
  • $llast_val
  • $max_min_diff
  • $max_processor_index
  • $min_processor_index
  • $netstatus
  • $robotname
  • $situation
  • $size
  • $size_gb
  • $size_mb
  • $space
  • $type
  • $unit
  • $value
  • $value_last
  • $value_limit
  • $value_number
  • $iostat_name
  • $processes
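
To preview how a message text with $ variables reads once values are filled in, you can use a short Python sketch such as the following. It uses Python's `string.Template`, which happens to share the `$name` syntax. It does not describe how the probe expands variables. The message text, the sample values, and the meaning given to each variable are assumptions that are based only on the variable names.

```python
from string import Template
from typing import Mapping


def expand_message(message_text: str, values: Mapping[str, object]) -> str:
    """Replace $variables with values and leave unknown variables unchanged."""
    return Template(message_text).safe_substitute(values)


text = "Disk $drive on $robotname: $value$unit used, limit is $value_limit$unit"
sample = {
    "drive": "/data",       # hypothetical values
    "robotname": "host01",
    "value": "92",
    "unit": "%",
    "value_limit": "90",
}
print(expand_message(text, sample))
# Disk /data on host01: 92% used, limit is 90%
```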

  1. Baltasar Infante
    2017-05-18 10:17

    Hi, I've had a case where the customer complained about the doc being messy when it comes to the monitoring of NFS drives. I've run some tests and I agree with him:

    1) Basically, until he enabled "Space monitoring", he was not succeeding while monitoring QOS_DISK_AVAILABILITY and he was not being alerted when the NFS drive was down.

    2) The doc mentions the following:

    Monitor Network Disk Availability You can monitor the availability of NFS disks and generate alarm and QoS messages.

    Follow these steps:

    1. Navigate to the Status tab.
    2. Right-click a shared or network disk and select Disable space monitoring.
    3. Right-click the shared or network disk in the Disk usage section and select Edit to open the Drive name dialog.
    4. Update the following information to configure the QoS on disk availability.
      • Network Drive: specifies the network drive location.
      • Mountpoint: the server where the mount point is pointing.
      • Error message: specifies the error message that is generated when the network disk is unavailable.
      • Disk Available Quality of Service message: allows you to generate QoS message when the network disk is not available.

    Note: When the space monitoring option is not enabled (Enable space monitoring option present on right click of network disk in the Disk Usage section) and you enable the option Disk Available Quality of Service message, the probe may generate QoS for the disks that are available and mounted but in Stale condition.

    5. Click OK to close the dialog and enable the probe to generate QoS messages for the network disk availability.

    ---------------------

    Point 2 -> Implies you should disable space monitoring rather than enable it.

    End of point 4 -> I guess there is a typo and instead of "Stale" should be "Stable" or "Static", if it means the QOS_DISK_AVAILABILITY value will be static (or won't change, like what happened to the customer because he did not have the Space monitoring enabled).

    To sum up: to monitor NFS disk availability, you need to enable Space monitoring, if not, the drive does not reflect the values of QOS_DISK_AVAILABILITY when the drive's status changes.

    Thanks.

    Best regards,

    Balta.

    1. Raka Saha
      2017-05-19 03:18

      Thank you Baltasar Infante, we have updated the document to suggest that you need to select Enable space monitoring option to monitor the network disk availability. Please let us know if you have any other queries.

       

      -Documentation Team. 

  2. Daniel Blanco
    2017-07-14 04:13

    Are you sure this is correct?

    In this section:

    Filesystem Type Filter:

    "If this field is blank, no filesystem is enabled for monitoring. (default)"

    That doesn't make sense. By default this is blank so out of the box the probe won't monitor any file systems? All my file systems on my cdm instances are monitored and this field is empty.

    1. Raka Saha
      2017-07-20 05:26

      Hello Daniel Blanco

      This is the expected behavior of the Filesystem Type Filter field. By default when cdm is deployed this field is blank and no filesystem is monitored. If any filter is applied, then cdm will monitor only those filesystem as defined in the filter.

      Since this is not what you are experiencing, we would like to find out the cause. Can you please let us know which interface you are using. Is it VB/AC/ or MCS ? Also please do let us know the cdm version.

      -Documentation Team.

  3. Baltasar Infante
    2017-07-25 04:27

    Hello, from case 00775010, we would need to add some more info on the "Top CPU consuming processes in alarm" section. Basically, we need to show that depending on customer's environment, the values achieved (%) can be over 100%. In fact, it can be over 1000%. For example:

    Imagine a customer who is having a 56-core system, using hyper-threading. In this case, the number of virtual cores, because of hyper-threading (more than 1 thread per core), becomes twice that value (2 threads per core).

    In the above scenario, customer could have up to 112 virtual cores (112 = 56*2).

    So, the maximum possible value (%) from Top CPU consuming processes could be 11200% (112 virtual cores * 100)

    This explains why a customer might see CPU consuming values over 100% or 1000% or 5000%.

    So, could we please have this added to the "Top CPU consuming processes in alarm" section?

    Thanks a lot.

    Best regards,

    Balta.

    1. Raka Saha
      2017-08-01 05:13

      Thanks Baltasar Infante , we have updated the documentation to include this information. 

      Please let us know if you need more information.

      -Documentation Team.

  4. Kumar
    2017-10-05 10:39

    From CDM we can get only disk usage as QOS can it be possible to give the Disk Total & Disk Free as well on alert and QOS both.

    1. Medikonda, Sandeep Samuel
      2017-10-06 05:22

      Kumar Thanks for reaching out, we'll get this confirmed from the team.

      -Documentation Team

    1. Medikonda, Sandeep Samuel
      2017-10-16 01:01

      Hi Kumar, you can use both the Disk total and Disk usage metrics together. You need to note that the QOS_DISK_TOTAL_SIZE shows just the QoS as it is a total disk size and does not change, and hence there is no Alarm.

      If you are asking about disk free space? There is no metric but the disk free space is the difference between QOS_DISK_TOTAL_SIZE (Total size of the disk) - QOS_DISK_USAGE (Aggregated disk usage).

      -Documentation Team

      1. Kumar
        2017-10-24 11:11

        Thanks for the update, i can't see any option to enable QOS_DISK_TOTAL_SIZE so please let me know the option.

        Thanks, Naveen

  5. Richard Little
    2018-11-01 09:46

    Filesystem Type Filter: specifies the type of the filesystem to be monitored using regular expressions. If we specify RegEx as , then all filesystems are enabled for monitoring. Or you can also specify ext to allow monitoring of filesystems with "ext". For example ext4 or ext5.

    can you please add a note that this filter does not disable monitoring for file systems already discovered and configured. this filter only prevents the probe from discovering a file system

    1. Medikonda, Sandeep Samuel
      2018-11-08 02:23

      Thanks Richard Little, the updated description should be available from our next refresh.

      -Documentation Team