Tag Archives: monitoring
Datadog Notes
When you think about Datadog, you are very often thinking about charts. Which visualisations will bring forth insights from your metrics? What actionable information can you present on your wallboards? What you might not be thinking about is the humble subject of Datadog notes. You can add notes to your dashboard using the Edit Dashboard…
Log to Datadog From .NET Using Hound’s LogHound Class
If you are using Datadog, you know you can log to Datadog using the Windows event log, or by calling the DogStatsD interface on your local agent. When you are running in a non-machine context, such as an Azure App Service, it is likely you won’t be running an agent, so how do you log…
Uptime and SLAs
This is a bonus post that follows up on some information that is useful if you read Web Operations Dashboards, Monitoring, and Alerting. This article is all about uptime and SLAs. Having helped a number of businesses understand what uptime and SLAs are, and how they work in real life, I have encountered a few…
The Monitor Matrix
This is the last in a series of posts to share some techniques that I wrote about in Web Operations Dashboards, Monitoring, and Alerting. In this final bite-size chunk, I’m going to talk about the Monitor Matrix. Monitor selection evolves gradually. You start off monitoring the things that everyone starts monitoring. You keep…
The Monitor Selection Principles
This is one more article in a series of posts to share some techniques that I wrote about in Web Operations Dashboards, Monitoring, and Alerting. In this article, I’m going to talk about Monitor Selection Principles. While it can be tempting to start off by monitoring everything, and alerting every time something slightly odd happens, …
The Alerting Principles
This is the next in a series of posts to share some techniques that I wrote about in Web Operations Dashboards, Monitoring, and Alerting. In this instalment, I’m going to talk about the Alerting Principles. When it comes to monitoring, alerting is an area you will want to get right. There is a natural tension…
The Incident Causation Principles
This is another in a series of posts to share some techniques that I wrote about in Web Operations Dashboards, Monitoring, and Alerting. In this article, I’m going to talk about incident investigations and the causation principles. When things go wrong, it may be that some internal trigger such as a software release or configuration…
The Three Fs of Event Log Monitoring
This is one of a series of posts to share some techniques that I wrote about in Web Operations Dashboards, Monitoring, and Alerting. In this article, I’m going to talk about the Three Fs of event log monitoring. When you first start collecting event logs, it is likely that you will be inundated with a…
Configure Azure Auto-Healing for your Azure Web Sites
While there is a whole host of great ideas you can apply to monitoring and alerting, one of the key reasons you spend time crafting your operations story is to avoid being interrupted during family time. So the AutoHeal feature for Azure Web Sites is your family-friendly helper that will take care of minor issues without…
DataDog Interactive Monitor Report
There are two competing fundamental needs for web operations… reducing false alarms, and ensuring you never miss a real incident! Here is a useful DataDog feature that you might not be using and that can help out a great deal in finding that magical balance-point between these two competing needs… interactive monitor reports. You can…