Let’s not monkey-patch instrumentation
Modern telemetry libraries make it easy to configure auto-instrumentation, which automatically gathers observability data about frameworks and libraries.
There are two main approaches to architecting those auto-instrumentation libraries: as middlewares/wrappers, or as monkey-patches. I believe middlewares are much better; here’s why.
Understanding Trace Propagation in OpenTelemetry
OpenTelemetry is making observability much easier, especially by providing the
first widely accepted vendor-agnostic telemetry libraries.
The first signal the project implemented is tracing, which is now GA in most
languages.
You most likely don’t need metrics
Ever since we have had to operate hardware and software in production, we have needed to know how they behave. For example, when I brew craft beer, I use an iSpindle to monitor the temperature and the gravity of my wort.
In Search of an Understandable Consensus Algorithm
The goal of a consensus algorithm is to allow multiple machines to work as a
coherent group which can survive the failure of some of its members.
Paxos has been the most commonly used consensus algorithm, yet it is quite
hard to understand and hard to implement.
Meaningful Availability
I actually read this paper for the first time a year ago, but I found it so good that I’ve decided to give it another read, this time with a written summary.
Functional Options in Ruby
In this article, I would like to suggest the use of a very common pattern in Go, Functional Options, but adapted to the Ruby language.
Monarch - Google’s Planet-Scale In-Memory Time Series Database
This is my summary and review of the paper Monarch: Google’s Planet-Scale In-Memory Time Series Database.
Dissecting OpenTelemetry Go Tracing
OpenTelemetry is a fairly new tool meant to provide a standard interface for handling metrics and traces.
It provides libraries in all major languages, and its collector component allows receiving data from any app in any language and transmitting it to any observability platform.
How I broke git push heroku main
Incidents are inevitable. Any platform, large or small, will have them. While resiliency work is definitely an important factor in reducing the number of incidents, hoping to remove all of them (and therefore reach 100% uptime) is not an achievable goal.
We should, however, learn as much as we can from incidents, so we can avoid repeating them.
Book Review - How to Take Smart Notes
This article is my review of the book How to Take Smart Notes.