Assorted links for Tuesday, July 9:
- A Behind-the-Scenes Look at How Postman’s Data Team Works
- How We Saved Millions in SSD Costs by Upgrading Our Filesystem
- Forecasting SQL query resource usage with machine learning
- Linux x86 Program Start Up or - How the heck do we get to main()? by Patrick Horgan
- Deploy without credentials with GitHub Actions and OIDC
Assorted links for Wednesday, July 3:
- Error and Transaction Handling in SQL Server: Part One – Jumpstart Error Handling
- Error and Transaction Handling in SQL Server: Part Two – Commands and Mechanisms
- Error and Transaction Handling in SQL Server: Part Three – Implementation
- SLICK: Adopting SLOs for improved reliability
SLICK can help us locate metric and performance data regarding the reliability of a specific service just by knowing its name. It does this by building an index of onboarded services that link to dashboards with standard visualizations to analyze and assess the service reliability. So, with a single click, it becomes possible to know whether a service currently meets or doesn’t meet user expectations. We can then start asking why. (A minimal sketch of the kind of SLO check this automates appears after this list.)
- Using Admission Controllers to Detect Container Drift at Runtime
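The SLICK post stays at the dashboard level, but the per-service check it automates is simple to state: compare a measured SLI against an objective. A minimal sketch, assuming a success-ratio SLI and a made-up 99.9% objective — none of this is SLICK’s actual implementation:

```cpp
#include <iostream>

// Illustrative only: SLICK is an internal Meta system; this sketch just
// shows the basic SLO check it performs for each onboarded service.
struct SloReport {
    double sli;        // measured service-level indicator, e.g. success ratio
    double objective;  // target, e.g. 0.999 for "three nines"
    bool met() const { return sli >= objective; }
};

int main() {
    // Hypothetical week of traffic: 1,000,000 requests, 650 failures.
    double good = 1'000'000 - 650, total = 1'000'000;
    SloReport report{good / total, 0.999};
    std::cout << "SLI = " << report.sli
              << (report.met() ? " (meets SLO)" : " (misses SLO)") << '\n';
}
```

SLICK’s value is doing this continuously and uniformly for every onboarded service, not the arithmetic itself.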
Assorted links for Tuesday, July 2:
- How to Measure DevSecOps Success: Key Metrics Explained
Key DevSecOps metrics:
  - Number of security vulnerabilities over time
  - Compliance with security policies
- “Energy-smart” bricks need less power to make, are better insulation
According to the RMIT researchers, “Brick kilns worldwide consume 375 million tons (~340 million metric tons) of coal in combustion annually, which is equivalent to 675 million tons of CO2 emissions (~612 million metric tons).” This exceeds the combined annual carbon dioxide emissions of 130 million passenger vehicles in the US.
- Researchers upend AI status quo by eliminating matrix multiplication in LLMs
In the new paper, titled “Scalable MatMul-free Language Modeling,” the researchers describe creating a custom 2.7 billion parameter model without using MatMul (matrix multiplication) whose performance is similar to conventional large language models (LLMs). They also demonstrate running a 1.3 billion parameter model at 23.8 tokens per second on a GPU accelerated by a custom-programmed FPGA chip that uses about 13 watts of power (not counting the GPU’s power draw). The implication is that a more efficient FPGA “paves the way for the development of more efficient and hardware-friendly architectures,” they write. (A toy sketch of the ternary-weight trick that makes this possible appears after this list.)
- Enhancing Netflix Reliability with Service-Level Prioritized Load Shedding
We implemented a concurrency limiter within PlayAPI that prioritizes user-initiated requests over prefetch requests without physically sharding the two request handlers. This mechanism uses the partitioning functionality of the open source Netflix/concurrency-limits Java library. (A from-scratch sketch of this prioritization idea appears after this list.)
- Explaining generative language models to (almost) anyone
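What makes “MatMul-free” possible is constraining weights to the ternary values {-1, 0, +1}, so each multiply-accumulate collapses into an addition, a subtraction, or a skip. A toy sketch of one such layer — my illustration of the idea, not the paper’s actual kernels:

```cpp
#include <cstdint>
#include <iostream>
#include <vector>

// Toy sketch of the ternary-weight idea behind "MatMul-free" models:
// with weights restricted to {-1, 0, +1}, y = Wx needs no multiplies,
// only additions and subtractions.
std::vector<float> ternary_linear(const std::vector<int8_t>& W,  // rows*cols, in {-1,0,1}
                                  const std::vector<float>& x,
                                  size_t rows, size_t cols) {
    std::vector<float> y(rows, 0.0f);
    for (size_t r = 0; r < rows; ++r)
        for (size_t c = 0; c < cols; ++c) {
            int8_t w = W[r * cols + c];
            if (w == 1)       y[r] += x[c];   // add instead of multiply
            else if (w == -1) y[r] -= x[c];   // subtract instead of multiply
            // w == 0: skip the element entirely
        }
    return y;
}

int main() {
    std::vector<int8_t> W = {  1, 0, -1,
                              -1, 1,  0 };    // 2x3 ternary weight matrix
    std::vector<float> x = { 0.5f, 2.0f, 1.0f };
    for (float v : ternary_linear(W, x, 2, 3)) std::cout << v << ' ';
    std::cout << '\n';  // prints: -0.5 1.5
}
```

The paper pairs this with other architectural changes, but replacing multiplies with adds is the core of the efficiency claim.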
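The Netflix post relies on the partitioning feature of their concurrency-limits library; rather than guess at that API, here is a from-scratch sketch of the same idea, where prefetch traffic may only occupy a bounded slice of a shared limit so user-initiated requests always keep headroom. The class name and the specific caps are assumptions:

```cpp
#include <mutex>

// From-scratch sketch of prioritized load shedding (not the
// Netflix/concurrency-limits API): one shared limit, but low-priority
// prefetch requests may only hold a fraction of it, so user-initiated
// requests are never starved by speculative traffic.
class PartitionedLimiter {
    std::mutex mu_;
    const int limit_;             // total concurrent requests allowed
    const int prefetch_cap_;      // max slots prefetch traffic may hold
    int in_flight_ = 0;
    int prefetch_in_flight_ = 0;
public:
    PartitionedLimiter(int limit, int prefetch_cap)
        : limit_(limit), prefetch_cap_(prefetch_cap) {}

    // Returns true if the request may proceed; false means "shed it".
    bool try_acquire(bool user_initiated) {
        std::lock_guard<std::mutex> lock(mu_);
        if (in_flight_ >= limit_) return false;            // shed: at capacity
        if (!user_initiated && prefetch_in_flight_ >= prefetch_cap_)
            return false;                                  // shed: prefetch slice full
        ++in_flight_;
        if (!user_initiated) ++prefetch_in_flight_;
        return true;
    }

    void release(bool user_initiated) {
        std::lock_guard<std::mutex> lock(mu_);
        --in_flight_;
        if (!user_initiated) --prefetch_in_flight_;
    }
};
```

A handler would call try_acquire on entry, shed the request when it returns false (a failed prefetch can simply be retried by the client later), and call release when finished.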
Assorted links for Monday, July 1:
- The Danger of Atomic Operations
Most engineers reach for atomic operations in an attempt to produce some lock-free mechanism. Furthermore, programmers enjoy the intellectual puzzle of using atomic operations. Both of these lead to clever implementations which are almost always ill-advised and often incorrect. (A concrete example of such a bug appears after this list.)
- What an SBOM can do for you
- sched_ext: a BPF-extensible scheduler class (Part 1)
sched_ext allows you to write and run your custom process scheduler optimized for your target workloads and hardware architectures using BPF programs.
- sched_ext: scheduler architecture and interfaces (Part 2)
- Leveraging AI for efficient incident response
We’ve streamlined our investigations through a combination of heuristic-based retrieval and large language model (LLM)-based ranking to provide AI-assisted root cause analysis. During backtesting, this system has achieved promising results: 42% accuracy in identifying root causes for investigations at their creation time related to our web monorepo. (A sketch of this pipeline appears after this list, following the atomics example.)
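The atomics warning is easy to make concrete: each operation below is individually atomic, yet the check-then-act sequence still races. This example is mine, not the article’s:

```cpp
#include <atomic>

std::atomic<int> connections{0};
constexpr int kMaxConnections = 100;

// BROKEN: the load and the increment are each atomic, but the sequence
// is not. Two threads can both pass the test at 99 and push the count
// to 101 -- exactly the kind of "clever" bug the article warns about.
bool try_connect_broken() {
    if (connections.load() < kMaxConnections) {
        connections.fetch_add(1);
        return true;
    }
    return false;
}

// Correct: the compare-exchange loop makes the check and the update a
// single atomic step, retrying if another thread raced in between.
bool try_connect() {
    int cur = connections.load();
    while (cur < kMaxConnections) {
        if (connections.compare_exchange_weak(cur, cur + 1))
            return true;  // we won the race; cur was still accurate
        // on failure, cur is reloaded with the current value and the
        // loop re-checks the cap
    }
    return false;
}
```

The corrected version works, but it demands exactly the sort of subtle reasoning the article suggests leaving to mutexes and existing synchronization primitives.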
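Meta describes the shape of the root-cause pipeline rather than its code: heuristics first cut thousands of candidate code changes down to a short list, then an LLM ranks the survivors. A hedged sketch of that retrieve-then-rank shape — every name, heuristic, and the scoring stub below is invented; C++20 for std::erase_if:

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical sketch of the two-stage pipeline described in the post:
// (1) cheap heuristics shrink the candidate set of code changes,
// (2) an LLM ranks the survivors. All details here are invented.
struct CodeChange {
    std::string summary;
    long long landed_at = 0;           // unix time the change shipped
    bool touches_affected_code = false;
};

// Placeholder for stage 2: a real system would prompt an LLM with the
// incident details and the change summary. This stub just prefers newer
// changes so the example compiles and runs.
double llm_rank_score(const CodeChange& c, const std::string& /*incident*/) {
    return static_cast<double>(c.landed_at);
}

std::vector<CodeChange> rank_root_causes(std::vector<CodeChange> changes,
                                         const std::string& incident,
                                         long long incident_start) {
    // Stage 1: heuristic retrieval -- keep recent changes that touch the
    // affected code, dropping everything else before the expensive step.
    std::erase_if(changes, [&](const CodeChange& c) {
        bool recent = c.landed_at <= incident_start &&
                      incident_start - c.landed_at < 24 * 3600;
        return !(recent && c.touches_affected_code);
    });
    // Stage 2: LLM-based ranking of the much smaller candidate list.
    std::sort(changes.begin(), changes.end(),
              [&](const CodeChange& a, const CodeChange& b) {
                  return llm_rank_score(a, incident) > llm_rank_score(b, incident);
              });
    return changes;
}
```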
Assorted links for Friday, June 28:
- IncludeOS
IncludeOS is a minimal unikernel operating system for C++ services running in the cloud and on real hardware. Starting a program with
#include <os>
will include a tiny operating system into your service during link-time. (A minimal service sketch appears after this list.)
- Designing Uber
- Designing Tinder
- Fixing Performance Regressions Before they Happen
- How we used C++20 to eliminate an entire class of runtime bugs
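To give the IncludeOS excerpt above some flavor, here is a minimal service sketch modeled on the project’s documentation; the entry point has varied across releases (older versions used void Service::start() rather than a plain main()), so treat the details as assumptions:

```cpp
// Minimal IncludeOS-style service. The single <os> include links a tiny
// operating system into the binary at build time; the resulting image
// boots directly in a VM with no host OS underneath.
#include <os>
#include <iostream>

int main() {
    std::cout << "Hello! I am a unikernel service.\n";
}
```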