There are several levels of availability, and therefore several levels of availability monitoring:
Is the server up or down?
Is the service up or down?
Is the port answering?
Is the database running?
And so on, down through ever more technical layers.
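To make one of these levels concrete, here is a minimal sketch of the "is the port answering?" check in Python. The function name and the default timeout are my own choices for illustration, not taken from any particular monitoring tool, and a successful TCP connect only shows that something is listening, not that the service behind it is healthy.

```python
import socket


def port_is_answering(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

A scheduler or cron job could call something like `port_is_answering("mail.example.org", 25)` (a placeholder host) and raise an alert when it returns False.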
The IT world started by monitoring each of these items individually, with scripts, logs and so on, so that when one link in the chain breaks you are made aware of it.
The catch is that in real life the full chain is never completely known. Each time a new link (a single point of failure) is discovered, it either gets added to the detailed technical monitoring or it does not.
The IT world has evolved a bit over the last decade, though, and can now monitor applications and the user experience directly.
Doing so checks the actual availability from several locations (LAN, WAN, Internet) and can verify whole business processes.
Do you perform application monitoring in your organization?
If so, please give some examples.
If not, what is the reason? The cost of the tools?
There are several software solutions for application monitoring on the market (Microsoft SMS, for example, and to some extent even SysAid can monitor processes), but I think application monitoring comes close to "observing the user". My position is that every administrator has to respect the users' privacy and should not monitor whether someone is using Open Office right now, so I don't think application monitoring of user applications is either sensible or legal.
On the other hand, I do use SysAid for application monitoring (that is, process monitoring) to check whether our mail service is running.
Funny detail: if the mail service fails, SysAid can no longer alert me. It would be really clever if SysAid could handle more than one outgoing SMTP connection and decide (based on cost factors or whatever) which gateway to use.
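To illustrate the idea of several outgoing SMTP gateways, here is a rough Python sketch. It is not a SysAid feature and has nothing to do with SysAid's internals. The `Gateway` class, the `cost` field and the `send_alert` function are hypothetical names made up for the example. Gateways are tried from cheapest to most expensive, and the first one that accepts the message wins.

```python
import smtplib
from dataclasses import dataclass
from email.message import EmailMessage


@dataclass
class Gateway:
    host: str
    port: int
    cost: int  # hypothetical preference value: lower is tried first


def send_alert(msg, gateways, connect=smtplib.SMTP, timeout=10):
    """Try each gateway in order of cost; return the one that worked, or None."""
    for gw in sorted(gateways, key=lambda g: g.cost):
        try:
            with connect(gw.host, gw.port, timeout=timeout) as smtp:
                smtp.send_message(msg)
            return gw
        except (OSError, smtplib.SMTPException):
            # This gateway is unreachable or rejected the mail; try the next.
            continue
    return None
```

The `connect` parameter exists only so the function can be tried out with a stand-in instead of a real mail server. If the internal mail server is the thing being monitored, at least one gateway in the list should be external, otherwise the alert dies with the service.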
Unfortunately, SysAid's monitoring capabilities are quite limited (and Microsoft-oriented), so I use Cacti to graph my routers and firewalls and Munin for quick-and-dirty graphs of my Linux boxes.
Other fine solutions are Nagios, which I have used previously, and OpenNMS, which I tried before deciding that a combination of Munin, Cacti and SysAid is just right for our company size. The advantage of Nagios and OpenNMS is that they can alert you and that they are highly scalable.
I've also posted a feature request regarding SysAid's monitoring capabilities.
Regarding your point about dependencies: SysAid already implements a CMDB, so it might be able to display the impact of failing services. I know that Nagios can handle large dependency chains and can display the impact if one or more services fail.
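As a small illustration of what "displaying the impact" means, here is a Python sketch that walks a dependency map and lists everything affected by a failure. This is not how Nagios or SysAid do it internally. The function name and the example service names are invented for the illustration.

```python
from collections import defaultdict, deque


def impacted_by(failed, depends_on):
    """Return every service that directly or indirectly depends on a failed item.

    failed     -- list of failed component names
    depends_on -- dict mapping each service to the list of things it needs
    """
    # Invert the map: for each component, who depends on it?
    dependents = defaultdict(set)
    for service, deps in depends_on.items():
        for dep in deps:
            dependents[dep].add(service)

    impacted = set()
    queue = deque(failed)
    while queue:
        node = queue.popleft()
        for service in dependents[node]:
            if service not in impacted:
                impacted.add(service)
                queue.append(service)
    return impacted - set(failed)


# Hypothetical chain: names are examples only.
chain = {
    "webmail": ["mail-service", "web-server"],
    "mail-service": ["mail-host", "dns"],
    "web-server": ["web-host"],
    "alerting": ["mail-service"],
}
print(sorted(impacted_by(["mail-host"], chain)))
```

Note that "alerting" shows up in the impact of a mail host failure here, which is exactly the problem with sending alerts through the service you are monitoring.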
We monitor the standard Exchange services as well as a few web servers, databases and a smattering of processes across all servers. To do this we use Nagios and have hooked just about everything with a data port into it. At present we're running just shy of 1000 checks across the company, with e-mail and SMS alerting as well as historical graphing for looking at trends.