Infrastructure Monitoring
Customer
The client is a global bank based in South Africa with operations in more than 20 countries, total assets of more than US$ 100 billion and more than 40,000 employees. Its major business areas are Personal and Business Banking, Corporate and Investment Banking, and Investment Management and Life Insurance.
Business Need
Often, there comes a point in the growth of an organization where the impact of IT services moves from productivity enhancement to critical operations associated with business continuity. In response, the IT management challenge moves from simply troubleshooting problems 'after the event' to proactively monitoring and anticipating problems before they happen.
For Standard Bank, this threshold became obvious when a single location of the organization was running more than 500 servers and many applications, with requirements growing steadily.
Having identified the need for an IT operations management solution at Standard Bank, ITC Infotech provided a solution that is easy to use, low cost and fault tolerant, and that accommodates multiple platforms.
Solution
As part of the monitoring solution implementation, ITC Infotech identified the following modules as a fit for the Standard Bank environment:
- OpManager - to monitor the health of servers and network devices
- AppManager - to monitor the MS SQL, MySQL, Oracle and Sybase databases
- Firewall Analyzer - to monitor the Check Point logs
- NetFlow Analyzer - to monitor network bandwidth usage
- OpStor - to monitor the storage devices
- VQManager - to monitor voice quality
Architecture
ITC Infotech implemented all the modules on two servers located in different data centers, which provides disaster recovery if any module fails. If one server fails, the other continues to monitor the parameters over the network. The architecture diagram is shown below.
The module implementation per site is as follows.
ITC Infotech proposed a non-DR setup for VQManager to avoid a port conflict.
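The failover behaviour described above can be illustrated with a minimal sketch. The module names come from this case study; the site labels `DC-A` and `DC-B`, the `Module` class, and the assumption that every module runs primarily at one site with a standby at the other are hypothetical and not taken from the actual deployment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    name: str
    dr_enabled: bool  # True if a standby instance exists at the second site


# Module names are from the case study; DR flags follow the text
# (only VQManager was proposed as non-DR).
MODULES = [
    Module("OpManager", True),
    Module("AppManager", True),
    Module("Firewall Analyzer", True),
    Module("NetFlow Analyzer", True),
    Module("OpStor", True),
    Module("VQManager", False),  # non-DR, to avoid a port conflict
]


def active_site(module, primary_up, secondary_up,
                primary="DC-A", secondary="DC-B"):
    """Return the site that should run the module, or None if unavailable.

    Site labels are hypothetical placeholders for the two data centers.
    """
    if primary_up:
        return primary
    if module.dr_enabled and secondary_up:
        return secondary
    return None


def placement(primary_up, secondary_up):
    """Map every module name to the site currently serving it."""
    return {m.name: active_site(m, primary_up, secondary_up) for m in MODULES}


if __name__ == "__main__":
    # Simulate loss of the primary site.
    for name, site in placement(primary_up=False, secondary_up=True).items():
        print(f"{name}: {site or 'unavailable'}")
```

In this sketch, losing the primary site moves every DR-enabled module to the second site, while VQManager stays unavailable until the primary returns.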
Parameters being monitored - Module wise
OpManager
| Windows Servers | Unix Servers | Switches | Routers |
|---|---|---|---|
| Availability | Availability | Availability | Availability |
| Health | Health | Health | Health |
| Response Time | Response Time | Response Time | Response Time |
| CPU Utilization | CPU Utilization | CPU Utilization | CPU Utilization |
| Disk Utilization | Disk Utilization | Memory Utilization | Memory Utilization |
| Disk Partition Monitoring | Memory Utilization | Traffic Monitors | Traffic Monitors |
| Memory Utilization | Active Processes | Interface availability | Interface availability |
| Traffic Monitoring | Software Installed | Interface utilization | Interface utilization |
| Network Interface | Traffic Monitoring | Port Configuration | Interface Traffic |
| Active Processes | Asset Details | Dependencies | Interface Errors |
| Software Installed | Slice Monitoring | Interface Configuration | |
| Asset Details | Service Monitoring | Dependencies | |
| Service Monitoring | NIC Traffic Monitoring | IP Routing Table | |
| IP Address Table | |||
AppManager
| MYSQL | Oracle | Sybase |
|---|---|---|
| Out of the Box Solution | Out of the Box Solution | Script Monitoring |
| Health | Health | Availability |
| Version of SQL | Oracle Version | Health |
| ODBC Version | Availability | Response Time |
| Availability | Connection Time | Database Connection Time |
| Memory Usage | User Activity | UserStats Query execution time |
| Buffer Manager Statistics | Database Details | Process Stats Query Execution time |
| Connection Statistics | Database Status | Database Stats Query Execution time |
| Cache Details | Table Space Usage | Table Database Stats |
| Lock Details | Table Space Details | Table Process Stats |
| SQL Statistics | Table Space status | Table User Stats |
| Latch Details | SGA Performance, Details and Status | Table DatabaseStats |
| Access Method Details | Performance of Data Files | Table ProcessStats |
| Database Details | Session Details, wait and Summary | Table UserStats |
| Server Resource Utilisation Snapshot | Rollback Segment | |
| File Systems Space Monitoring | ||
| Process Oracle and Listener Monitoring | ||
| Scripting Monitoring | ||
| Max processes running | ||
| Error Log Reporting | ||
| Table Space Monitoring | ||
Firewall Analyzer
| Monitoring | Reporting |
|---|---|
| Track Bandwidth Usage | TRAFFIC REPORTS |
| Detect Intrusions | PROTOCOL USAGE REPORTS |
| Audit Traffic | WEB USAGE REPORTS |
| Detect Anomalies through network behavioral analysis | MAIL USAGE REPORTS |
| | FTP USAGE REPORTS |
| | TELNET USAGE REPORTS |
| | STREAMING AND CHAT SITE REPORTS |
| | EVENT SUMMARY REPORTS |
| | VPN REPORTS |
| | FIREWALL REPORT RULES |
| | INBOUND AND OUTBOUND TRAFFIC |
| | INTERNET REPORTS |
| | SECURITY REPORTS |
| | VIRUS REPORTS |
| | ATTACK REPORTS |
| | PROTOCOL TREND REPORTS |
| | TRAFFIC TREND REPORTS |
| | EVENT TREND REPORTS |
NetFlow Analyzer
- Threshold-based alerting
- Granular details of bandwidth usage for each WAN link
- Real-time bandwidth monitoring: reports for each NetFlow-enabled interface showing current, average and peak bandwidth usage patterns
- Bandwidth usage statistics for each host, application and conversation across a specific interface
- Historical bandwidth usage trends to help the customer plan for the future
- Application-wise bandwidth distribution reporting
- Consolidated bandwidth usage summary
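To make the current, average and peak figures and the threshold-based alerting concrete, here is a minimal sketch. It does not use any NetFlow Analyzer API. The function name `bandwidth_summary`, the input format and the 80% default threshold are assumptions chosen for illustration.

```python
from statistics import mean


def bandwidth_summary(samples_bps, link_capacity_bps, threshold_pct=80.0):
    """Summarise utilisation samples for one interface.

    samples_bps: bits-per-second readings, oldest first (hypothetical input).
    link_capacity_bps: nominal link speed in bits per second.
    threshold_pct: utilisation level that raises an alert (assumed value).
    """
    if not samples_bps:
        raise ValueError("at least one sample is required")
    if link_capacity_bps <= 0:
        raise ValueError("link capacity must be positive")

    def to_pct(bps):
        return 100.0 * bps / link_capacity_bps

    current = to_pct(samples_bps[-1])  # most recent reading
    return {
        "current_pct": current,
        "average_pct": to_pct(mean(samples_bps)),
        "peak_pct": to_pct(max(samples_bps)),
        "alert": current >= threshold_pct,  # alert on the latest sample only
    }


if __name__ == "__main__":
    # Three readings on a hypothetical 10 Mbit/s WAN link.
    print(bandwidth_summary([2_000_000, 4_000_000, 9_000_000], 10_000_000))
```

Alerting on the latest sample alone is the simplest policy. A real deployment might instead require the threshold to be exceeded for several consecutive samples to avoid noisy alerts.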
OpStor
- Fabric Switches
- RAID (Storage Arrays)
- HBA
- Host Servers
- Tape Library
VQManager
- Monitoring Parameters
- Real-time VoIP monitoring
- Proactive notification
- Problem Diagnosis
- Comprehensive Reporting
Integrated View (Service grouping dashboard)
Currently, the AdventNet tools each have a module-specific dashboard, which requires multiple windows to view the reports for all modules and still does not give service-level information. ITC Infotech recommended and implemented a service grouping view (integrated dashboard) for AppManager, OpManager and OpStor that gives a consolidated service-level view. With this, one can see a drilled-down service-level view of all three modules from the AppManager reporting window. It also provides granular reports for each device, application and storage device.
Salient features
- Read-only view - no conflicts between module owners
- Complete drill-down view of devices, applications and storage devices
- Opens from a link without login credentials - the out-of-the-box dashboards require separate credentials
- Service-level information - not available in the out-of-the-box dashboards
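The idea of service grouping can be sketched as a roll-up of component states from the three modules into one status per service. This is an illustration only: the `Health` levels, the worst-case roll-up rule, the inventory format and the example service and component names are all assumptions, not details of the implemented dashboard.

```python
from enum import IntEnum


class Health(IntEnum):
    # Hypothetical severity levels; higher means worse.
    CLEAR = 0
    WARNING = 1
    CRITICAL = 2


def service_health(components):
    """Worst-case roll-up: a service is as healthy as its worst component."""
    return max((c["health"] for c in components), default=Health.CLEAR)


def service_view(inventory):
    """Group components by service and return one status per service,
    keeping the component names for drill-down."""
    grouped = {}
    for item in inventory:
        grouped.setdefault(item["service"], []).append(item)
    return {
        service: {
            "health": service_health(items).name,
            "components": sorted(i["name"] for i in items),
        }
        for service, items in grouped.items()
    }


if __name__ == "__main__":
    # Hypothetical inventory mixing items from the three modules.
    inventory = [
        {"service": "Payments", "module": "OpManager",
         "name": "app-server-01", "health": Health.CLEAR},
        {"service": "Payments", "module": "AppManager",
         "name": "oracle-db-01", "health": Health.WARNING},
        {"service": "Payments", "module": "OpStor",
         "name": "raid-array-01", "health": Health.CLEAR},
        {"service": "Intranet", "module": "OpManager",
         "name": "web-server-02", "health": Health.CLEAR},
    ]
    for service, view in service_view(inventory).items():
        print(service, view)
```

A worst-case rule is the most conservative choice. A weighted or dependency-aware roll-up would be an alternative if some components are less critical to the service than others.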
The following screenshot represents the current architecture of the solution.
The window shown above is the AppManager dashboard; TierA-A1 and TierA-A2 are the dashboards for OpManager and OpStor.
Some of the highlights of this project include:
- Integrated dashboard for AppManager, OpManager and OpStor showing the current health of the infrastructure
- Notifications and alerts in the event of failures and threshold breaches
- 16X5 and 24X5 monitoring
- Reporting of availability, performance and failures over a given period
- Recommended approach for future state environment envisioned in the project objectives
Business Benefits
ITC Infotech established robust communication processes for delivering the services. By using these services, the client was able to:
- Predict demand, threats and potential failures
- Assess the system health
- Obtain real-time snapshots of the environment
- Get instant notifications on events
The solution provided by ITC Infotech is extensible to cover the client's future requirements for monitoring certain business-critical applications, and scalable to incorporate the client's 20 global sites at a later stage in a cost-effective and efficient way.

