Assorted Links

Monday 2025-01-20 Assorted Links
Published: 2025-01-20

Assorted links for Monday, January 20:

  1. The Boring Option: Migrating Segment Efforts Storage at Strava
  2. How do you test your tests?
  3. How Facebook keeps its large-scale infrastructure hardware up and running
  4. JUring: Experimental IO_uring For Java With Big Performance Gains

    JUring is a high-performance Java library that provides bindings to Linux’s io_uring asynchronous I/O interface using Java’s Foreign Function & Memory API. On random reads, JUring achieves 33% better performance than Java NIO FileChannel operations for local files and 78% better performance for remote files.

  5. How we built the GitHub Skyline CLI extension using GitHub
Thursday 2025-01-16 Assorted Links
Published: 2025-01-16

Assorted links for Thursday, January 16:

  1. Fast commits for ext4

    The Linux 5.10 release included a change that is expected to significantly increase the performance of the ext4 filesystem; it goes by the name “fast commits” and introduces a new, lighter-weight journaling method.

  2. Building Faster AMD64 Memset Routines
  3. How NuGet resolves package dependencies
  4. Maximizing Developer Effectiveness
  5. What’s good about offset pagination; designing parallel cursor-based web APIs
Wednesday 2025-01-15 Assorted Links
Published: 2025-01-15

Assorted links for Wednesday, January 15:

  1. Cloud PUE: Comparing AWS, Azure and GCP Global Regions

    New data reveals how efficiently the major cloud providers run and cool their data centers – from AWS’s and Azure’s tropical struggles to Google’s industry-leading performance.

  2. The No-Order File System

    In this paper, we introduce the No-Order File System (NoFS), a simple, lightweight file system that employs a novel technique called backpointer based consistency to provide crash consistency without ordering writes as they go to disk.

  3. Whose Code is it Anyway?

    In order to measure the engineering effectiveness of Yelp, we need to measure the effectiveness of its organizations and the teams that make up those organizations. But how do we know what a team is responsible for? We needed a way to assign an owner to something (let’s call this an entity) that we want to measure. Once an entity has an owner, we can collect metrics on that entity and derive the health score (i.e., effectiveness) for that owner. These metrics can then be aggregated by team, organization, or even the entire Engineering division, so that we can identify areas that we can collectively improve. And this is how the Ownership microservice was born.

  4. How we ported Linux to the M1
  5. Nix + Bazel = fully reproducible, incremental builds
Tuesday 2025-01-14 Assorted Links
Published: 2025-01-14

Assorted links for Tuesday, January 14:

  1. αcτµαlly pδrταblε εxεcµταblε

    One day, while studying old code, I found out that it’s possible to encode Windows Portable Executable files as a UNIX Sixth Edition shell script, due to the fact that the Thompson Shell didn’t use a shebang line. Once I realized it’s possible to create a synthesis of the binary formats being used by Unix, Windows, and MacOS, I couldn’t resist the temptation of making it a reality, since it means that high-performance native code can be almost as pain-free as web apps.

  2. I spent 4 hours learning how Netflix operates Apache Iceberg at scale
  3. How to secure your GitHub Actions workflows with CodeQL

    To help prevent the introduction of vulnerabilities, identify them in existing workflows, and even fix them using GitHub Copilot Autofix, CodeQL support has been added for GitHub Actions.

  4. How the UK was connected to the Internet for the first time
  5. Highlights from Git 2.48

    The open source Git project just released Git 2.48. Here is GitHub’s look at some of the most interesting features and changes introduced since last time.

Monday 2025-01-13 Assorted Links
Published: 2025-01-13

Assorted links for Monday, January 13:

  1. Simple defer, ready to use

    With this post I will concentrate on the here and now: how to use C’s future lifesaving defer feature with existing tools and compilers.

  2. Cleanup Attribute in C

    In this blog post I explore __attribute__((cleanup(...))). I discuss what it does, how it does it, why use it, performance considerations, and finish by saying it’s absolutely fantastic.

  3. Debanking (and Debunking?): An entertaining and fairly deep explanation of what debanking is and why it occurs. And I learned a useful new word: sontaku.

    Japanese has a beautiful word, sontaku, for the attitude and actions a diligent subordinate would take without his superior’s explicit instruction, believing them to anticipate his boss’ desires. Sontaku is a core skill in the American professional class.

  4. Seeing like a Bank
  5. The Bond villain compliance strategy
Friday 2025-01-10 Assorted Links
Published: 2025-01-10

Assorted links for Friday, January 10:

  1. Colliding with the SHA prefix of Linux’s initial Git commit

    There was a recent discussion about how Linux’s “Fixes” tag, which traditionally uses the 12 character commit SHA prefix, has an ever increasing chance of collisions. There are already 11-character collisions, and Geert wanted to raise the minimum short id to 16 characters.

    Tools like linux-next’s “Fixes tag checker”, the Linux CNA’s commit parser, and my own CVE lifetime analysis scripts do programmatic analysis of the “Fixes” tag and had no support for collisions (even shorter existing collisions).

    So, in an effort to fix these tools, I broke them with commit 1da177e4c3f4 (“docs: git SHA prefixes are for humans”):

  2. FreeBSD Considers Making Use Of Rust Within Its Base System
  3. The Architect’s Guide to Open Table Formats and Object Storage
  4. Alerts Are Fundamentally Messy
  5. Databases in 2024: A Year in Review
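
The collision risk discussed in the first link above can be put in rough numbers with the birthday approximation. The sketch below is illustrative only: it assumes hashes are uniformly distributed, and the object count is a hypothetical round number, not a figure taken from the article or from the Linux repository.

```python
import math


def prefix_collision_probability(num_objects: int, prefix_hex_chars: int) -> float:
    """Birthday-bound estimate of the chance that at least two of
    num_objects uniformly random hashes share the same hex prefix."""
    space = 16 ** prefix_hex_chars  # number of distinct prefixes
    expected_pairs = num_objects * (num_objects - 1) / (2 * space)
    # 1 - exp(-x), computed in a way that stays accurate for tiny x
    return -math.expm1(-expected_pairs)


# Hypothetical object count, chosen only for illustration.
HYPOTHETICAL_OBJECT_COUNT = 10_000_000

if __name__ == "__main__":
    for chars in (11, 12, 16):
        p = prefix_collision_probability(HYPOTHETICAL_OBJECT_COUNT, chars)
        print(f"{chars}-char prefix: estimated P(collision) = {p:.3g}")
```

Each extra hex character multiplies the prefix space by 16, which is why moving from 12 to 16 characters changes the estimate so sharply under these assumptions.
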
Wednesday 2025-01-08 Assorted Links
Published: 2025-01-08

Assorted links for Wednesday, January 8:

  1. Amex’s FaaS Uses WebAssembly Instead of Containers

    A key reason behind Amex’s adoption of WebAssembly is that WebAssembly demonstrated superior performance metrics compared to containers.

  2. Enhance build security and reach SLSA Level 3 with GitHub Artifact Attestations

    The Supply-chain Levels for Software Artifacts (SLSA) framework … provides a comprehensive, step-by-step methodology for building integrity and provenance guarantees into your software supply chain.

  3. Introducing Configurable Metaflow

    Standing on the shoulders of our extensive cloud infrastructure, Metaflow facilitates easy access to data, compute, and production-grade workflow orchestration, as well as built-in best practices for common concerns such as collaboration, versioning, dependency management, and observability, which teams use to set up ML/AI experiments and systems that work for them. As a result, Metaflow users at Netflix have been able to run millions of experiments over the past few years without wasting time on low-level concerns.

  4. The Feds Push WebAssembly for Cloud Native Security

    According to a National Institute of Standards and Technology (NIST) paper, “A Data Protection Approach for Cloud-Native Applications,” released earlier this year, WebAssembly could and should be integrated across the cloud native service mesh sphere in particular to enhance security.

  5. Self-Designing Software

    Exploring ways to include a software system as an active member of its own design team, able to reason about its own design and to synthesize better variants of its own building blocks as it encounters different deployment conditions.

Tuesday 2025-01-07 Assorted Links
Published: 2025-01-07

Assorted links for Tuesday, January 7:

  1. What is Inference Parallelism and how it works

    Inference parallelism aims to distribute the computational workload of AI models, particularly deep learning models, across multiple processing units such as GPUs.

  2. Open Source Innovation Comes to Time-Series Data Compression

    NetApp Instaclustr collaborated with the University of Canberra through the OpenSI initiative to develop the Advanced Time Series Compressor (ATSC) — an open source innovation that fundamentally reimagines high-volume time-series data compression.

    ATSC implements a sophisticated lossy compression approach. Rather than storing complete data sets, it generates mathematical functions that closely approximate the original data patterns, storing only the essential parameters of these functions. This approach is paired with granular configurability — users can precisely tune their desired level of accuracy, balancing storage efficiency with data fidelity based on their specific use cases.

  3. What Do You Lose When You Abandon the Cloud?

    High-profile moves from 37signals (the company behind Basecamp and HEY) and GEICO have sparked a renewed interest in cloud repatriation.

    One sometimes overlooked advantage of moving to the cloud is that it allows you to pay for resources when they are needed, for example, as new customers come online. Spending moves from upfront CAPEX (buying new machines in anticipation of success) to OPEX (paying for additional servers on demand).

    Another thing to weigh up is pace of innovation — both from the cloud provider and from the consumer.

    The Zynga example [of moving from the cloud to on-prem, then back to the cloud] highlights several other trade-offs. One to consider is that if you are running your own data centers, you need to be able to hire the right people and retain them.

    There is another set of trade-offs around security. Keeping servers up to date, and guarding against intrusions, is time-consuming work that big cloud providers are very experienced in.

  4. Why All the Major Cloud Platforms Are the Same

    Each provider brought unique strengths and strategic priorities to the table, creating differentiation initially, but eventually converging on a consistent baseline of functionality.

  5. Indexing code at scale with Glean

    How is Glean different?

    • Glean doesn’t decide for you what data you can store.
    • Glean’s query language is very general.
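
The general idea in the time-series compression link above (store the parameters of an approximating function instead of the raw samples, with a tunable accuracy) can be illustrated with a toy example. This is not ATSC's algorithm: the sketch swaps in a simple greedy piecewise-linear fit, and the function names, the segment layout, and the sample data are all hypothetical.

```python
from typing import List, Tuple

# (start index, number of points, slope, intercept); a hypothetical layout
Segment = Tuple[int, int, float, float]


def compress(values: List[float], max_error: float) -> List[Segment]:
    """Greedily cover the series with line segments whose absolute
    error at every covered point stays within max_error."""
    segments: List[Segment] = []
    n = len(values)
    start = 0
    while start < n:
        end = start + 1  # exclusive; a single point is always exact
        slope = 0.0
        while end < n:
            # Candidate line through the first point and values[end].
            cand = (values[end] - values[start]) / (end - start)
            fits = all(
                abs(values[start] + cand * (i - start) - values[i]) <= max_error
                for i in range(start, end + 1)
            )
            if not fits:
                break
            slope = cand
            end += 1
        segments.append((start, end - start, slope, values[start]))
        start = end
    return segments


def decompress(segments: List[Segment]) -> List[float]:
    """Rebuild an approximation of the series from the stored segments."""
    restored: List[float] = []
    for _start, length, slope, intercept in segments:
        restored.extend(intercept + slope * k for k in range(length))
    return restored


if __name__ == "__main__":
    samples = [0.0, 1.0, 2.1, 2.9, 4.0, 10.0, 10.1, 9.9, 10.0]
    packed = compress(samples, max_error=0.25)
    print(len(samples), "samples stored as", len(packed), "segments")
```

Raising `max_error` lets each segment cover more points, trading fidelity for storage, which mirrors the accuracy tuning described in the excerpt.
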