The Five Pillars of Monitoring – Pillar One

This post is the first in a series of five that will be released over the next few weeks.  Today’s post is an introduction to the series and the first pillar of monitoring.

When a website is launched there is one particular component of the system that is deeply important. That component is monitoring.

Monitoring is often swept under the rug, held for last, or poorly done. This is a huge mistake that, in my opinion, is nearly inexcusable. A lack of monitoring of key items in key contexts will lead to poor performance, outages, and bad business decisions. Monitoring is simply a requirement for any production web application that you actually care about.

You might be wondering what you should monitor. The answer is what I call the pillars of monitoring. That "what" is the subject matter for this series.

Note: This article is not about how to monitor these things. There are numerous tools and services, and some are better than others. Some are affordable and some are mind-numbingly expensive. Fitting the right tool to the right pillar and to the site sometimes requires analysis. That is a different discussion and is outside the scope of this series.

In a way, your website or application stands upon these pillars, so make them strong. Today, there are five pillars.

Pillar One: Third Party Outside Monitoring - Your Eye in the Sky (or cloud)

This pillar is all about having a path to agreement if things go wrong, and good metrics that approximate your customer's point of view. As a result, this type of monitoring is usually done from outside of your own data center, and often from numerous points of presence around the globe. It can provide the best approximation of what your client might be experiencing at their geographic location when using your application.

Third party outside monitoring will produce metrics such as uptime, downtime, latency, page component analysis, DNS performance, and many other related items. These metrics can be vital to understanding your application performance at a given point in time and, just as importantly, over time as changes are introduced to the system.
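
To make a couple of those metrics concrete, here is a minimal Python sketch of how raw check results might be rolled up into an uptime percentage and an average latency. The ProbeResult type, its field names, and the summarize function are my own assumptions for illustration, not the output format of any particular monitoring service.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Optional


@dataclass
class ProbeResult:
    """One check from one outside location (hypothetical shape)."""
    location: str                 # label for the point of presence
    ok: bool                      # did the check succeed?
    latency_ms: Optional[float]   # None when the check failed


def summarize(results: List[ProbeResult]) -> dict:
    """Roll a batch of checks up into uptime and average latency."""
    if not results:
        raise ValueError("no probe results to summarize")
    successes = [r for r in results if r.ok]
    uptime_pct = 100.0 * len(successes) / len(results)
    # Only successful checks carry a meaningful latency.
    latencies = [r.latency_ms for r in successes if r.latency_ms is not None]
    avg_latency_ms = mean(latencies) if latencies else None
    return {"uptime_pct": uptime_pct, "avg_latency_ms": avg_latency_ms}
```

Fed one batch of checks per time window, a rollup like this gives you the "over time" view: you can compare the numbers before and after a change goes into the system.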

One very important note is that when an incident occurs, it is this pillar that will provide the final point of arbitration about the performance or uptime of a given web facing service. For example, suppose the client or the client’s customers perceive that the website is offline for some reason, and the systems engineer says that it appears just fine. Rather than arguing, the decision is easier because we have an agreement: if the third party service says the site is up and performing well, then it is, and the tie is broken. Of course, this does make the assumption that the monitors aren’t bogus and that the monitoring service is actually functioning. So, choose and set up your providers carefully. Additionally, this doesn’t always mean everything is perfect, but it can help to focus efforts more quickly in productive directions.
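
The tie-breaking agreement works best when the rule is written down before anything goes wrong. As a rough sketch only, here is one way such a rule could look in Python. The function name, the latency limit, and the "at least half of the locations" threshold are all hypothetical values I picked for illustration. The real numbers are whatever you and your client agree on.

```python
from typing import List, Optional, Tuple

# One entry per outside location: (check succeeded, latency in ms or None).
Check = Tuple[bool, Optional[float]]


def arbitrate(checks: List[Check],
              max_latency_ms: float = 2000.0,
              min_healthy_fraction: float = 0.5) -> str:
    """Apply a pre-agreed rule to outside checks: 'up', 'down', or 'no data'."""
    if not checks:
        # No results may mean the monitoring itself is not functioning,
        # so do not let silence count as "up".
        return "no data"
    healthy = sum(
        1 for ok, latency in checks
        if ok and latency is not None and latency <= max_latency_ms
    )
    if healthy / len(checks) >= min_healthy_fraction:
        return "up"
    return "down"
```

For example, arbitrate([(True, 120.0), (True, 340.0), (False, None)]) counts two healthy locations out of three and settles on "up", while still showing that one location failed and is worth a look.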

Monitoring is a requirement. It is not an afterthought. Stay tuned for the next installment where I will discuss Pillar Two.