In this work, we significantly further the understanding of real-world cache workloads by
collecting production traces from 153 in-memory cache clusters at Twitter, sifting through
over 80 TB of data, and sometimes interpreting the workloads in the context of the business
logic behind them.
[Y]ou should assume your data is corrupt from when a write is issued until after a flush or
force unit access write completes. However, most programs use system calls to write data. This
article looks at the guarantees provided by the Linux file APIs. It seems like this should be
simple: a program calls write() and after it completes, the data is durable. However,
write() only copies data from the application into the kernel’s cache in memory. To force
the data to be durable you need to use some additional mechanism.
Traefik, the "Cloud Native Edge Router," is yet another reverse proxy and load balancer.
Omitting all the Cloud Native buzzwords, what really makes Traefik different from Nginx,
HAProxy, and the like is the automatic and dynamic configurability it provides out of the box.
The most prominent part of that is probably its ability to do automatic service discovery.
Which is better, Rust or Go—and does that question even make sense? Which language should you
choose for your next project in 2025, and why? How does Rust compare with Go in areas like
performance, simplicity, safety, features, scale, and concurrency?
Twine is our homegrown cluster management system, which has been running in production for the
past decade. A cluster management system allocates workloads to machines and manages the life
cycle of machines, containers, and workloads. Kubernetes is a prominent example of an open
source cluster management system. Twine has helped convert our infrastructure from a
collection of siloed pools of customized machines dedicated to individual workloads to a
large-scale ubiquitous shared infrastructure in which any machine can run any workload.
The purpose of this document is to describe the path data takes from the application down to
the storage, concentrating on places where data is buffered, and to then provide best
practices for ensuring data is committed to stable storage so it is not lost along the way in
the case of an adverse event. The main focus is on the C programming language, though the
system calls mentioned should translate fairly easily to most other languages.
As one of the underlying engines, Uber Money powers some of the most important aspects of
people's engagement with the Uber experience. A system like this should not only be robust, but
should also be highly available with zero tolerance for downtime, following our success mantra:
"To collect and disburse on-time, accurately and in-compliance".
As we expand to multiple lines of business and strategize what comes next, the engineers
in Uber Money also thrive on building the next generation of the Payments Platform that extends
Uber's growth. In this blog, we introduce you to this platform and provide insights into our
learnings. This includes migrating hundreds of millions of customers between two asynchronous
systems while maintaining data consistency with a goal of zero impact on our users.
Within AWS, a common pattern is to split the system into services that are responsible for
executing customer requests (the data plane), and services that are responsible for managing
and vending customer configuration (the control plane). In this article, I discuss a number
of different ways the data plane and the control plane interact with each other to avoid
system overload. In many of these architectures the larger data plane fleet calls the smaller
control plane fleet, but I also want to share the success we’ve had at Amazon when we put the
smaller fleet in control.
As I’ve lamented previously, the documentation for xperf (Windows Performance Toolkit) is a
bit light. The names of the columns in the summary tables can be exquisitely subtle, and I
have never found any documentation for them. But, I’ve talked to the xperf authors, and I’ve
used xperf a lot, and I’ve done some experiments, and here I share some more results, this
time for the Disk Usage summary table.
In just 20 years, software engineering has shifted from architecting monoliths with a single
database and centralized state to microservices where everything is distributed across
multiple containers, servers, data centers, and even continents. Distributing things solves
scaling concerns, but introduces a whole new world of problems, many of which were previously
solved by monoliths.
FioSynth is a benchmark tool used to automate the execution of storage workload suites and to
parse results. It contains a base set of block level storage workloads, synthesized from
production I/O traces, that simulate a diverse range of Facebook production services. It is
useful for predicting how a storage device will perform in realistic production environments
and for assisting with performance tuning.
Project Teleport removes the cost of download and decompression by SMB-mounting pre-expanded
layers from the Azure Container Registry to Teleport-enabled Azure container hosts.
In July 2020 I went on a color-scheme vision quest. This led to some research on various color
spaces and their utility, some investigation into the styling guidelines outlined by the
base16 project, and the color utilities that ship within the GNU Emacs text editor. This
article will be a whirlwind tour of things you can do to individual colors and, at the end,
how I put these building blocks together.
In this article I will demonstrate that while hardware has changed dramatically over the past
decade, software APIs have not, or at least not enough. Riddled with memory copies, memory
allocations, overly optimistic read-ahead caching, and all sorts of expensive operations,
legacy APIs prevent us from making the most of our modern devices.
[eBPF and io_uring] may look evolutionary, but they are revolutionary in the sense that they
will — we bet — completely change the way applications work with and think about the Linux
Kernel.
I thought it would be helpful to write a guide to dev tools outside of Google for the
ex-Googler, written with an eye toward pragmatism and practicality. No doubt many ex-Googlers
wish they could simply clone the Google internal environment to their new company, but you
can’t boil the ocean. Here is my take on where you should start and a general path I think
ex-Googlers can take to find the tools that will make them - and their new teams - as
productive as possible.
There have been recent attempts to enrich large-scale data stores, such as HBase and BigTable,
with transactional support. Not surprisingly, inspired by traditional database management
systems, serializability is usually compromised for the benefit of efficiency. For example,
Google Percolator implements lock-based snapshot isolation on top of BigTable. We show in
this paper that this compromise is not necessary in lock-free implementations of transactional
support. We introduce write-snapshot isolation, a novel isolation level that has a
performance comparable with that of snapshot isolation, and yet provides serializability.
This thesis presents the first implementation-independent specifications of existing ANSI
isolation levels and a number of levels that are widely used in commercial systems, e.g.,
Cursor Stability, Snapshot Isolation. It also specifies a variety of guarantees for
predicate-based operations in an implementation-independent manner. Two new levels are defined
that provide useful consistency guarantees to application writers; one is the weakest level
that ensures consistent reads, while the other captures some useful consistency properties
provided by pessimistic implementations.
This post is about gaining intuition for Write Skew, and, by extension, Snapshot Isolation.
Snapshot Isolation is billed as a transaction isolation level that offers a good mix between
performance and correctness, but the precise meaning of “correctness” here is often vague. In
this post I want to break down and capture exactly when the thing called “write skew” can
happen.
Here are some observations on how parsers can be constructed in a way that makes it easier to
recover from parse errors, produce multiple diagnostics in one pass, and provide partial
results for further analysis even in the face of errors, providing a better experience for
user-driven command line tools and interactive environments.
Build systems are awesome, terrifying – and unloved. They are used by every developer around
the world, but are rarely the object of study. In this paper, we offer a systematic, and
executable, framework for developing and comparing build systems, viewing them as related
points in a landscape rather than as isolated phenomena. By teasing apart existing build
systems, we can recombine their components, allowing us to prototype new build systems with
desired properties.
In this paper we introduce a new set of codes for erasure coding called Local Reconstruction
Codes (LRC). LRC reduces the number of erasure coding fragments that need to be read when
reconstructing data fragments that are offline, while still keeping the storage overhead
low.
Software dependencies carry with them serious risks that are too often overlooked. The shift
to easy, fine-grained software reuse has happened so quickly that we do not yet understand the
best practices for choosing and using dependencies effectively, or even for deciding when they
are appropriate and when not. My purpose in writing this article is to raise awareness of the
risks and encourage more investigation of solutions.
On Wednesday, web infrastructure provider Cloudflare announced a new feature called “AI
Labyrinth” that aims to combat unauthorized AI data scraping by serving fake AI-generated
content to bots. The tool will attempt to thwart AI companies that crawl websites without
permission to collect training data for large language models that power AI assistants like
ChatGPT.
All of a sudden, without any apparent cause, Google Docs was flooded with errors. How it
took me 2 days and a coworker to solve the hardest bug I ever debugged.
Edera Protect is a suite of offerings bridging the gap between modern cloud native computing
and virtualization-based security techniques. To power this platform, we’ve built our own
container runtime designed to operate as a microservice, allowing it to run containers in a
fully programmatic way—similar to how the Kubernetes Container Runtime Interface (CRI) enables
container management through microservices.
I would like to announce a new high-performance PNG codec, which is much faster than other
available codecs written in C, C++, and other programming languages.
Today, on Pi Day (S3’s 19th birthday), I’m sharing a post from Andy Warfield, VP and
Distinguished Engineer of S3. Andy takes us through S3’s evolution from simple object
store to sophisticated data platform, illustrating how customer feedback has shaped every
aspect of the service. It’s a fascinating look at how we maintain simplicity even as
systems scale to handle hundreds of trillions of objects.
Earlier this month, we announced the general availability of custom instructions in
Visual Studio Code. Custom instructions are how you give Copilot specific context about
your team’s workflow, your particular style preferences, libraries the model may not
know about, etc.
In this post we’ll dive into what custom instructions are, how you can use them today to
drastically improve your results with GitHub Copilot, and a brand-new preview feature
called “prompt files” that you can try today.
In a post on X on Wednesday, OpenAI CEO Sam Altman said that OpenAI will add support for
Anthropic’s Model Context Protocol, or MCP, across its products, including the desktop app
for ChatGPT. MCP is an open source standard that helps AI models produce better, more relevant
responses to certain queries.
Microsoft’s six security agents will be available in preview next month, and are designed to
do things like triage and process phishing and data loss alerts, prioritize critical
incidents, and monitor for vulnerabilities.
In a new paper published Thursday titled “Auditing language models for hidden objectives,”
Anthropic researchers described how custom AI models trained to deliberately conceal certain
“motivations” from evaluators could still inadvertently reveal secrets, due to their ability
to adopt different contextual roles they call “personas.” The researchers were initially
astonished by how effectively some of their interpretability methods seemed to uncover these
hidden training objectives, although the methods are still under research.
After significant research and testing on dozens of actual SNES units, the TASBot team now
thinks that a cheap ceramic resonator used in the system’s Audio Processing Unit (APU) is to
blame for much of this inconsistency. While Nintendo’s own documentation says the APU should
run at a consistent rate of 24.576 MHz (and the associated Digital Signal Processor sample
rate at a flat 32,000 Hz), in practice, that rate can vary just a bit based on heat, system
age, and minor physical variations that develop in different console units over time.
Time for me to write this blog post and prepare everyone for the implementation blitz that
needs to happen to make defer a success for the C programming language.
Solution files have been a part of the .NET and Visual Studio experience for many years now,
and they’ve had the same custom format the whole time. Recently, the Visual Studio solution
team has begun previewing a new, XML-based solution file format called SLNX. Starting in .NET
SDK 9.0.200, the dotnet CLI supports building and interacting with these files in the same
way as it does with existing solution files.
HybridCache is a new .NET 9 library available via the Microsoft.Extensions.Caching.Hybrid
package and is now generally available! HybridCache, named for its ability to leverage both
in-memory and distributed caches like Redis, ensures that data storage and retrieval is
optimized for performance and security, regardless of the scale or complexity of your
application.
Like sorting algorithms, hash table data structures continue to see improvements. In
2017, Sam Benzaquen, Alkis Evlogimenos, Matt Kulukundis, and Roman Perepelitsa at Google
presented a new C++ hash table design, dubbed “Swiss Tables”. In 2018, their
implementation was open sourced in the Abseil C++ library.
Go 1.24 includes a completely new implementation of the built-in map type, based on the Swiss Table design.
We are investigating a critical security incident involving the popular tj-actions/changed-files
GitHub Action. We want to alert you immediately so that you can take prompt action. This post
will be updated as new information becomes available.
A header-only C++ library that offers exceptionless error handling and type-safe enums, bringing
Rust-inspired error propagation with the ? operator and the match operator to C++.