Why do you need a monitoring service such as Watchmouse? (2005-01-31)
There are several reasons, depending on your role in your
organization and what you want to achieve. Each role leads
to a different approach to setting up and using the
service.
Most likely you are either responsible for keeping a
service such as a website online, or you have contracted somebody
else to do that for you. Additionally, you could be a consultant or
technical architect who wants to gain insight into the performance and
uptime characteristics of various solutions and services.
If your
role is to keep things running, you really want to be notified of
problems as soon as possible, before your customers or supervisors
notice. You want meaningful error messages and few false
alarms. When you configure Watchmouse, you will probably want a quick
alert by e-mail or SMS/text message when something fails, with additional
diagnostic information available. In this way, downtime can be kept
to a minimum. It is not only the quality of the systems that counts,
but also the speed with which you can fix problems.
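One common way to keep false alarms down is to alert only after a failure has been confirmed by a second check. The sketch below illustrates that idea in Python. It is not a description of how Watchmouse works internally; the class name FailureConfirmer and the threshold of two consecutive failures are assumptions chosen for illustration.

```python
class FailureConfirmer:
    """Signal an alert only after `threshold` consecutive failed checks."""

    def __init__(self, threshold=2):
        if threshold < 1:
            raise ValueError("threshold must be at least 1")
        self.threshold = threshold
        self.failures = 0      # length of the current run of failures
        self.alerted = False   # True once an alert was raised for this outage

    def record(self, check_ok):
        """Record one check result; return 'alert', 'recovered', or None."""
        if check_ok:
            was_down = self.alerted
            self.failures = 0
            self.alerted = False
            return "recovered" if was_down else None
        self.failures += 1
        if self.failures >= self.threshold and not self.alerted:
            self.alerted = True
            return "alert"
        return None


# A single failed check (second result) is ignored; two in a row raise an alert.
confirmer = FailureConfirmer(threshold=2)
results = [True, False, True, False, False, False, True]
events = [confirmer.record(ok) for ok in results]
print(events)
```

The trade-off is speed against noise: a higher threshold means fewer false alarms but a later first alert, so the threshold should be small when checks are infrequent.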
Your role
could also be in overseeing your service providers, whether they are
internal or outsourced. In that case, you don't want to be
interrupted by every alert, unless the situation becomes serious.
Instead you would rather look at the weekly report and see whether your
service providers are living up to their promises. On the Internet it
is easy to reach 99% uptime, and you should really be doing better than
that. Services that regularly fail to make this grade need
attention, to see whether another approach to provisioning them works
better.
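To put the 99% figure in perspective: a week has 168 hours, so 99% uptime still allows about 1.7 hours of downtime every week. The following Python sketch does that arithmetic and also turns a count of failed checks into an uptime percentage. The function names and the five-minute check interval are assumptions for illustration, not Watchmouse settings.

```python
HOURS_PER_WEEK = 7 * 24  # 168


def allowed_downtime_hours(uptime_percent, period_hours=HOURS_PER_WEEK):
    """Hours of downtime permitted in a period at the given uptime level."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    return (100 - uptime_percent) / 100 * period_hours


def observed_uptime_percent(total_checks, failed_checks):
    """Uptime percentage estimated from periodic check results."""
    if total_checks <= 0:
        raise ValueError("total_checks must be positive")
    if not 0 <= failed_checks <= total_checks:
        raise ValueError("failed_checks must be between 0 and total_checks")
    return 100 * (total_checks - failed_checks) / total_checks


for target in (99.0, 99.9):
    hours = allowed_downtime_hours(target)
    print(f"{target}% uptime allows {hours:.2f} hours of downtime per week")

# Assumed interval: one check every 5 minutes gives 7 * 24 * 12 = 2016 checks a week.
weekly_checks = 7 * 24 * 12
print(f"{observed_uptime_percent(weekly_checks, 20):.2f}% uptime with 20 failed checks")
```

Note that an estimate based on periodic checks is only as fine-grained as the check interval: an outage shorter than the interval may be missed entirely.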
If you are considering technical alternatives for the way
you set up your e-business, you are most likely interested in
typical failure modes. For example, we know from experience that
most website problems are software problems, followed by sizing
problems. Communications problems are fairly rare, and when they occur
they take the form of peering problems: a website cannot be reached
from specific networks, even though every network involved is operational. One
approach, using Watchmouse reports, is to check different aspects with
separate rules. Use one rule to download the homepage, another to
check the DNS, and a third to check connectivity to the hosting
centre. In a future column I'll go into the details of this.
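The three-rule idea can be sketched outside Watchmouse as well. The Python below is a minimal illustration using only the standard library: one function per check, plus a small function that reads the combined result. The function names, timeouts, port 80, and the interpretation in diagnose are assumptions of this sketch, and www.example.com in the usage note is a placeholder host.

```python
import socket
import urllib.error
import urllib.request


def check_dns(hostname):
    """Rule 1: does the hostname resolve at all?"""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        return False


def check_connectivity(host, port=80, timeout=5.0):
    """Rule 2: can a TCP connection to the hosting centre be opened?"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_homepage(url, timeout=10.0):
    """Rule 3: does the homepage download with a successful HTTP status?"""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        return False


def diagnose(dns_ok, tcp_ok, http_ok):
    """Combine the three results into a likely failure area (a rough guide)."""
    if dns_ok and tcp_ok and http_ok:
        return "all checks passed"
    if not dns_ok:
        return "DNS problem"
    if not tcp_ok:
        return "connectivity problem"
    return "server or application problem"


def run_checks(hostname):
    """Run all three checks against one host and return the diagnosis."""
    return diagnose(
        check_dns(hostname),
        check_connectivity(hostname),
        check_homepage(f"http://{hostname}/"),
    )
```

Calling run_checks("www.example.com") performs the three checks in turn. Because each check isolates one layer, a failure of the homepage rule while the DNS and connectivity rules pass points at the software or sizing problems mentioned above rather than at the network. Running the same checks from several networks would be needed to spot peering problems.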
Peter van Eijk is a management consultant specializing in the
management of network infrastructures. He can be reached via his
contact page.