Categories
Automation

Uptime and SLAs

This is a bonus post that follows up on some information that is useful if you read Web Operations Dashboards, Monitoring, and Alerting. This article is all about uptime and SLAs. Having helped a number of businesses understand what uptime and SLAs are, and how they work in real life, I have encountered a few […]

Categories
Automation Programming

The Monitor Matrix

This is the last in a series of posts to share some techniques that I wrote about in Web Operations Dashboards, Monitoring, and Alerting. In this final bite-size chunk, I’m going to talk about the Monitor Matrix. Selecting monitors has a gradual evolution. You start off monitoring the things that everyone starts monitoring. You keep […]

Categories
Automation Programming

The Monitor Selection Principles

This is one more article in a series of posts to share some techniques that I wrote about in Web Operations Dashboards, Monitoring, and Alerting. In this article, I’m going to talk about Monitor Selection Principles. While it can be tempting to start off by monitoring everything, and alerting every time something slightly odd happens, […]

Categories
Automation Programming

The Alerting Principles

This is the next in a series of posts to share some techniques that I wrote about in Web Operations Dashboards, Monitoring, and Alerting. In this instalment, I’m going to talk about the Alerting Principles. When it comes to monitoring, alerting is an area you will want to get right. There is a natural tension […]

Categories
Automation Programming

The Incident Causation Principles

This is another in a series of posts to share some techniques that I wrote about in Web Operations Dashboards, Monitoring, and Alerting. In this article, I’m going to talk about incident investigations and the causation principles. When things go wrong, it may be that some internal trigger such as a software release or configuration […]

Categories
Automation Programming

The Three Fs of Event Log Monitoring

This is one of a series of posts to share some techniques that I wrote about in Web Operations Dashboards, Monitoring, and Alerting. In this article, I’m going to talk about the Three Fs of event log monitoring. When you first start collecting event logs, it is likely that you will be inundated with a […]

Categories
Automation

Using Katelyn Crawler to Find All Domain References

You can use the Katelyn Crawler to crawl a website looking for references to a particular domain. I have an example below, which will report back not just each instance of the domain but the full URL that was found. Don’t limit your imagination, though. You could search for any HTTP references that should […]

Categories
Automation

Configure Azure Auto-Healing for your Azure Web Sites

While there are a whole host of great ideas you can apply to monitoring and alerting, one of the key reasons you spend time crafting your operations story is to avoid being interrupted during family time. So the AutoHeal feature for Azure Web Sites is your family-friendly helper that will take care of minor issues without […]

Categories
Automation

Avoiding Complex Octopus Deploy Variables

Octopus Deploy has a very smart system of variable management that allows you to scope variables to machines, environments, steps, roles – and to store variables in projects and in shared library sets. It is so flexible, you could make your life very miserable if you don’t make things as manageable as possible. Scope Octopus […]

Categories
Automation

Database Deployments with Octopus Deploy and a SQL Cluster

So you have a SQL Cluster and you want to run a database upgrade using Octopus Deploy… where do you start? There are actually two strategies you can employ to do this, and you can choose the most appropriate one based on how you have things set up. Octopus Deploy is cluster-agnostic, but you can […]