iAM:Servers and associated software

At the heart of a DTCS solution during operation is a suite of software and tools that have been carefully integrated, with a keen focus on what matters when keeping a critical system operational. In DTCS, that means running monitoring agents on all nodes in the VMS Cluster and feeding the output from those agents to a pair of monitoring stations used by the operations team. Each station, called an "OMS" (Operations Management Station), independently tracks the output (events) it receives. The OMS systems also collect information by "listening" to the console devices of the VMS systems that are part of the protected environment.

Events can be displayed on the built-in monitor, forwarded to existing in-house systems, sent by email and, with the appropriate external hardware or software, sent to pagers or delivered as SMS messages.

The tools described on this page are the core components of this architecture, namely iAM:Servers and associated products.

iAM:Servers

This is the main ‘rule engine’ at the heart of the monitoring solution.

On OpenVMS it can be configured to monitor or sample a wide range of items, which it calls statistics. In the case of DTCS, the statistics gathered focus on the health of the hardware and the status of the cluster. They also include information on key application processes (e.g. whether processes are missing, looping or in unexpected lock states).

On the Windows OMS systems, statistics are limited to server-critical items, such as hardware failure and memory consumption.

In summary, the responsibilities of iAM:Servers include monitoring the state of all server systems, using both IP reachability status and the DTCS ‘heartbeat’ component.
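As a rough illustration of how two independent signals such as IP reachability and a heartbeat can be combined into a single node state, consider the minimal Python sketch below. It is not the iAM:Servers rule engine or its configuration syntax; the names NodeStatus and classify, the state labels and the 30-second timeout are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class NodeStatus:
    """Hypothetical snapshot of what a monitoring station knows about one node."""
    name: str
    ip_reachable: bool               # result of the last IP reachability check
    seconds_since_heartbeat: float   # age of the last heartbeat received


def classify(node: NodeStatus, heartbeat_timeout: float = 30.0) -> str:
    """Combine both signals into one state label (labels are illustrative)."""
    heartbeat_ok = node.seconds_since_heartbeat <= heartbeat_timeout
    if node.ip_reachable and heartbeat_ok:
        return "UP"
    if node.ip_reachable:
        # The network path works but the heartbeat has stopped.
        return "HEARTBEAT_LOST"
    if heartbeat_ok:
        # The heartbeat still arrives but the IP check fails.
        return "NETWORK_UNREACHABLE"
    return "DOWN"


if __name__ == "__main__":
    for node in (
        NodeStatus("NODE1", True, 5.0),
        NodeStatus("NODE2", True, 120.0),
        NodeStatus("NODE3", False, 300.0),
    ):
        print(node.name, classify(node))
```

Using two signals lets an operator distinguish a node that has stopped responding entirely from one that is still running but has lost one communication path.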

TotalView Monitor

TotalView is the "viewer" component that is supplied as part of the iAMs Suite. In the DTCS environment it is specifically configured to present events in a tiered manner, allowing very rapid detection of events that may be critical to the environment or to application execution.

TotalView acts as the main operator console view for the OMS.

EDA

EDA is the Itheon component responsible for the reliable transportation of events between monitored and monitoring systems, for example events sent from OpenVMS Cluster nodes to the OMS and events sent between OMSs. This component is not visible to the operator and acts in the background to ensure timely and reliable presentation of information to the DTCS event monitor. It can also be used for optional customisations such as paging or forwarding events to external monitoring systems via SNMP traps.

iAM:Consoles

iAM:Consoles™ is used in the DTCS environment to give operators access to all relevant console lines, such as OPA0: for each OpenVMS system and the serial console line of any storage controllers. Optionally, it can be used to provide GUI access to any device that has a ‘serial data’ console facility, such as a UPS or network router. The second responsibility of iAM:Consoles™ is to monitor console output and compare it against a set of predefined text rules. If a match is found, an event is generated for display in the DTCS event viewer.
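The idea of comparing console output against predefined text rules can be sketched in a few lines of Python. This is only an illustration of the general technique, not the iAM:Consoles™ rule format: the Rule and Event types, the match_line function, the severity labels and the patterns themselves are assumptions made for this example. The patterns rely on the standard OpenVMS message layout, %FACILITY-S-IDENT, where the severity letter is S, I, W, E or F.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rule:
    """Hypothetical text rule: a pattern and the severity of the event it raises."""
    name: str
    pattern: "re.Pattern[str]"
    severity: str


@dataclass
class Event:
    """Hypothetical event record passed on to a viewer."""
    source: str
    rule: str
    severity: str
    text: str


# Illustrative rules, ordered from most to least severe.
RULES = [
    Rule("vms-fatal", re.compile(r"%[A-Z0-9_$]+-F-[A-Z0-9_$]+"), "critical"),
    Rule("vms-error", re.compile(r"%[A-Z0-9_$]+-E-[A-Z0-9_$]+"), "major"),
    Rule("vms-warning", re.compile(r"%[A-Z0-9_$]+-W-[A-Z0-9_$]+"), "minor"),
]


def match_line(source: str, line: str) -> Optional[Event]:
    """Return an Event for the first rule that matches the line, else None."""
    for rule in RULES:
        if rule.pattern.search(line):
            return Event(source, rule.name, rule.severity, line.strip())
    return None


if __name__ == "__main__":
    console_output = [
        "%SYSTEM-F-ACCVIO, access violation",
        "Routine operator message",
    ]
    for line in console_output:
        event = match_line("NODE1 OPA0:", line)
        if event is not None:
            print(event)
```

Lines that match no rule produce no event, so routine console traffic stays out of the viewer while known failure signatures are raised immediately.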