Links

Tuesday 2024-07-02 Assorted Links
Published: 2024-07-02

Assorted links for Tuesday, July 2:

  1. How to Measure DevSecOps Success: Key Metrics Explained

    Key DevSecOps metrics:

    1. Number of security vulnerabilities over time
    2. Compliance with security policies
  2. “Energy-smart” bricks need less power to make, are better insulation

    According to the RMIT researchers, “Brick kilns worldwide consume 375 million tonnes (~413 million US tons) of coal in combustion annually, which is equivalent to 675 million tonnes of CO2 emission (~744 million US tons).” This exceeds the combined annual carbon dioxide emissions of 130 million passenger vehicles in the US.

  3. Researchers upend AI status quo by eliminating matrix multiplication in LLMs

    In the new paper, titled “Scalable MatMul-free Language Modeling,” the researchers describe creating a custom 2.7 billion parameter model without using MatMul ([matrix multiplication]) that features similar performance to conventional large language models (LLMs). They also demonstrate running a 1.3 billion parameter model at 23.8 tokens per second on a GPU that was accelerated by a custom-programmed FPGA chip that uses about 13 watts of power (not counting the GPU’s power draw). The implication is that a more efficient FPGA “paves the way for the development of more efficient and hardware-friendly architectures,” they write.

    (A rough illustrative sketch of the ternary-weight idea follows this list.)

  4. Enhancing Netflix Reliability with Service-Level Prioritized Load Shedding

    We implemented a concurrency limiter within PlayAPI that prioritizes user-initiated requests over prefetch requests without physically sharding the two request handlers. This mechanism uses the partitioning functionality of the open source Netflix/concurrency-limits Java library.

    (A generic sketch of this kind of partitioned limiter follows this list.)

  5. Explaining generative language models to (almost) anyone
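
The paper’s central trick (item 3) is replacing the weights of dense layers with ternary values in {-1, 0, +1}, so each would-be multiplication collapses into an addition or subtraction. Below is a minimal NumPy sketch of that idea, my own illustration rather than the authors’ code; `quantize_ternary` and `ternary_linear` are invented names:

```python
import numpy as np

def quantize_ternary(w, eps=1e-8):
    """Quantize a float weight matrix to {-1, 0, +1} plus one per-matrix scale,
    roughly in the spirit of the BitNet-style ternary weights the paper builds on."""
    scale = np.mean(np.abs(w)) + eps
    return np.clip(np.round(w / scale), -1, 1), scale

def ternary_linear(x, w_ternary, scale):
    """A 'MatMul-free' dense layer: with weights restricted to -1/0/+1, the dot
    product is just signed sums of inputs. The @ below is only for brevity; an
    optimized kernel would use additions/subtractions instead of multiplies."""
    return (x @ w_ternary) * scale

# Tiny usage example with random data.
rng = np.random.default_rng(0)
x = rng.normal(size=(1, 8))
w = rng.normal(size=(8, 4))
w_t, s = quantize_ternary(w)
print(ternary_linear(x, w_t, s))
```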
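
For the Netflix load-shedding item (item 4), here is a rough, generic illustration of partitioning one shared concurrency limit so that prefetch traffic can never crowd out user-initiated requests. This is not the Netflix/concurrency-limits Java library; the asyncio code, the 20% prefetch share, and all names are invented for the sketch:

```python
import asyncio

class PartitionedLimiter:
    """One shared concurrency limit, with the low-priority (prefetch) class
    capped at a fraction of it so user-initiated requests always have room."""

    def __init__(self, total_limit=100, prefetch_share=0.2):
        self.total_limit = total_limit
        self.prefetch_limit = int(total_limit * prefetch_share)
        self.in_flight = {"user": 0, "prefetch": 0}

    def try_acquire(self, kind):
        if sum(self.in_flight.values()) >= self.total_limit:
            return False  # overall limit reached: shed
        if kind == "prefetch" and self.in_flight["prefetch"] >= self.prefetch_limit:
            return False  # prefetch partition full: shed only prefetch traffic
        self.in_flight[kind] += 1
        return True

    def release(self, kind):
        self.in_flight[kind] -= 1

async def handle(limiter, kind):
    if not limiter.try_acquire(kind):
        return 429  # shed: the client can retry later
    try:
        await asyncio.sleep(0.01)  # stand-in for real request work
        return 200
    finally:
        limiter.release(kind)

async def main():
    limiter = PartitionedLimiter(total_limit=10, prefetch_share=0.2)
    results = await asyncio.gather(
        *[handle(limiter, "prefetch") for _ in range(5)],
        *[handle(limiter, "user") for _ in range(5)],
    )
    print(results)  # prefetch requests beyond their partition are shed with 429

asyncio.run(main())
```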

Monday 2024-07-01 Assorted Links
Published: 2024-07-01

Assorted links for Monday, July 1:

  1. The Danger of Atomic Operations

    Most engineers reach for atomic operations in an attempt to produce some lock-free mechanism. Furthermore, programmers enjoy the intellectual puzzle of using atomic operations. Both of these lead to clever implementations which are almost always ill-advised and often incorrect.

  2. What an SBOM can do for you
  3. sched_ext: a BPF-extensible scheduler class (Part 1)

    sched_ext allows you to write and run your custom process scheduler optimized for your target workloads and hardware architectures using BPF programs.

  4. sched_ext: scheduler architecture and interfaces (Part 2)
  5. Leveraging AI for efficient incident response

    We’ve streamlined our investigations through a combination of heuristic-based retrieval and large language model (LLM)-based ranking to provide AI-assisted root cause analysis. During backtesting, this system has achieved promising results: 42% accuracy in identifying root causes for investigations at their creation time related to our web monorepo.
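
The retrieve-then-rank pattern Meta describes is straightforward to sketch: cheap heuristics narrow the candidate code changes, then an LLM ranks what remains. The code below is a generic illustration, not Meta’s system; `llm_score_relevance` is a placeholder for whatever model call you have available:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class CodeChange:
    id: str
    files: list
    description: str
    landed_at: datetime

def heuristic_candidates(changes, incident_start, affected_paths, window_hours=24):
    """Step 1: cheap filters -- recent changes that touch the affected area."""
    cutoff = incident_start - timedelta(hours=window_hours)
    return [
        c for c in changes
        if c.landed_at >= cutoff
        and any(f.startswith(tuple(affected_paths)) for f in c.files)
    ]

def llm_score_relevance(change, incident_summary):
    """Placeholder: a real system would prompt an LLM with the incident summary
    plus the change's diff/description and parse out a relevance score."""
    raise NotImplementedError

def rank_root_cause_candidates(changes, incident_start, affected_paths,
                               incident_summary, top_k=5):
    """Step 2: LLM-based ranking of the heuristically retrieved candidates."""
    candidates = heuristic_candidates(changes, incident_start, affected_paths)
    scored = [(llm_score_relevance(c, incident_summary), c) for c in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored[:top_k]]
```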

Wednesday 2024-06-26 Assorted Links
Published: 2024-06-26

Assorted links for Wednesday, June 26:

  1. Speed Up Your CI/CD Pipeline with Change-Based Testing in a Yarn-Based Monorepo: I note that only building and testing what changed is one of the core value propositions of Bazel, but adopting Bazel often requires a large investment in engineering and training.
  2. What makes a good REST API?
  3. How to use DORA metrics to improve software delivery
  4. Don’t Get Lost in the Metrics Maze: A Practical Guide to SLOs, SLIs, Error Budgets, and Toil
  5. Static B-Trees

    In this section, we generalize the techniques we developed for binary search to static B-trees and accelerate them further using SIMD instructions. In particular, we develop two new implicit data structures:

    • The first is based on the memory layout of a B-tree, and, depending on the array size, it is up to 8x faster than std::lower_bound while using the same space as the array and only requiring a permutation of its elements.
    • The second is based on the memory layout of a B+ tree, and it is up to 15x faster than std::lower_bound while using just 6-7% more memory — or 6-7% of the memory if we can keep the original sorted array.
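
The article’s speedups come from a cache-friendly implicit layout plus SIMD, but the layout itself is easy to show without C++. Below is a plain-Python sketch of a static B-tree: keys from a sorted array are packed into fixed-size node blocks, children are located by index arithmetic instead of pointers, and lower_bound descends while remembering the best candidate seen so far (illustrative only; no SIMD):

```python
B = 4                      # keys per node; real code picks B to fit a cache line
SENTINEL = float("inf")    # pads the last, partially filled node

def build_static_btree(sorted_keys):
    """Rearrange a sorted list into implicit B-tree node order.
    Node k's children live at indices k*(B+1)+1 ... k*(B+1)+B+1 (no pointers)."""
    n = len(sorted_keys)
    nblocks = (n + B - 1) // B
    tree = [[SENTINEL] * B for _ in range(nblocks)]
    t = 0                                   # next element of the sorted input to place

    def fill(k):
        nonlocal t
        if k < nblocks:
            for i in range(B):
                fill(k * (B + 1) + i + 1)   # everything smaller goes into child i
                if t < n:
                    tree[k][i] = sorted_keys[t]
                    t += 1
            fill(k * (B + 1) + B + 1)       # rightmost child
    fill(0)
    return tree

def lower_bound(tree, x):
    """Smallest key >= x, or SENTINEL if none."""
    nblocks = len(tree)
    k, best = 0, SENTINEL
    while k < nblocks:
        node = tree[k]
        i = 0
        while i < B and node[i] < x:        # the SIMD version computes this rank in parallel
            i += 1
        if i < B:
            best = node[i]                  # candidate; keys still >= x may hide in child i
        k = k * (B + 1) + i + 1             # descend by index arithmetic
    return best

keys = list(range(0, 100, 3))               # 0, 3, 6, ..., 99
tree = build_static_btree(keys)
print(lower_bound(tree, 50))                # -> 51
print(lower_bound(tree, 99))                # -> 99
print(lower_bound(tree, 100))               # -> inf (no key >= 100)
```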

Tuesday 2024-06-25 Assorted Links
Published: 2024-06-25

Assorted links for Tuesday, June 25:

  1. Radioactive drugs strike cancer with precision

    Pluvicto and Lutathera are both built around small protein sequences, known as peptides. These peptides specifically bind to target receptors on cancer cells—PSMA in the case of prostate cancer and somatostatin receptors in the case of Lutathera—and deliver radiation through the decay of unstable lutetium.

    Administered via infusion into the bloodstream, these drugs circulate throughout the body until they firmly attach to the surfaces of tumor cells they encounter. Anchored at these target sites, the lutetium isotope then releases two types of radiation that aid in cancer treatment. The primary emission consists of beta particles, high-energy electrons capable of penetrating tumors and surrounding cells, tearing into DNA and causing damage that ultimately triggers cell death.

  2. Amazon Exploring MM-Local Memory Allocations To Help With Current/Future Speculation Attacks

    Back in 2019 after various speculation-based CPU vulnerabilities began coming to light, Amazon engineers proposed process-local memory allocations for hiding KVM secrets. They were striving for an alternative mitigation for vulnerabilities like L1TF by essentially providing some memory regions for kernel allocations out of view/access from other kernel code. Amazon engineers this week laid out a new proposal after five years of ongoing Linux kernel improvements for MM-local memory allocations for dealing with current and future speculation-based cross-process attacks.

  3. TypeSpec: An API design language that either competes with, or augments, OpenAPI.
  4. Optimize Kubernetes Pods’ Startup Time Using VolumeSnapshots: If your K8S application uses enormous, static data sources, using VolumeSnapshots may speed up its launch time significantly.
  5. Building a GitOps CI/CD Pipeline with GitHub Actions (SOC 2)

Monday 2024-06-24 Assorted Links
Published: 2024-06-24

Assorted links for Monday, June 24:

  1. The time smart quotes prevented the entire Office division from committing code
  2. Video annotator: a framework for efficiently building video classifiers using vision-language models and active learning

    We introduce a novel framework, Video Annotator (VA), which leverages active learning techniques and zero-shot capabilities of large vision-language models to guide users to focus their efforts on progressively harder examples, enhancing the model’s sample efficiency and keeping costs low.

    VA seamlessly integrates model building into the data annotation process, facilitating user validation of the model before deployment, therefore helping with building trust and fostering a sense of ownership. VA also supports a continuous annotation process, allowing users to rapidly deploy models, monitor their quality in production, and swiftly fix any edge cases by annotating a few more examples and deploying a new model version.

  3. PVF: A novel metric for understanding AI systems’ vulnerability against SDCs in model parameters

    Parameter vulnerability factor (PVF) is a novel metric we’ve introduced with the aim to standardize the quantification of AI model vulnerability against parameter corruptions.

  4. Keeping main green in a monorepo
  5. Researchers describe how to tell if ChatGPT is confabulating

    …[T]he researchers focus on what they call semantic entropy. This evaluates all the statistically likely answers evaluated by the LLM and determines how many of them are semantically equivalent. If a large number all have the same meaning, then the LLM is likely uncertain about phrasing but has the right answer. If not, then it is presumably in a situation where it would be prone to confabulation and should be prevented from doing so.
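
A rough sketch of the semantic-entropy idea: sample several answers, group the ones that mean the same thing, and compute entropy over the groups rather than over raw strings. The paper clusters answers via bidirectional entailment and weights clusters by sequence probability; the count-based version below is a simplification, and `means_the_same` is a crude placeholder you would back with an NLI model or another LLM:

```python
import math

def means_the_same(a: str, b: str) -> bool:
    """Placeholder for semantic equivalence (e.g., bidirectional entailment
    judged by an NLI model). This string comparison is only for the demo."""
    return a.strip().lower().rstrip(".") == b.strip().lower().rstrip(".")

def semantic_entropy(answers):
    """Cluster sampled answers by meaning, then take entropy over the clusters.
    Low entropy: the model keeps giving one answer in different words.
    High entropy: many mutually inconsistent answers -> confabulation risk."""
    clusters = []
    for ans in answers:
        for cluster in clusters:
            if means_the_same(ans, cluster[0]):
                cluster.append(ans)
                break
        else:
            clusters.append([ans])
    n = len(answers)
    probs = [len(c) / n for c in clusters]
    return max(0.0, -sum(p * math.log(p) for p in probs))

print(semantic_entropy(["Paris.", "paris", "Paris"]))       # 0.0: one meaning
print(semantic_entropy(["Paris.", "Lyon.", "Marseille."]))  # ~1.10: three meanings
```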

Friday 2024-06-21 Assorted Links
Published: 2024-06-21

Assorted links for Friday, June 21:

  1. MLow: Meta’s low bitrate audio codec

    After nearly two years of active development and testing, we are proud to announce Meta Low Bitrate audio codec, aka MLow, which achieves two-times-better quality than Opus (POLQA MOS 1.89 vs 3.9 @ 6kbps WB). Even more importantly, we are able to achieve this great quality while keeping MLow’s computational complexity 10 percent lower than that of Opus.

  2. Unlocking the power of unstructured data with RAG

    To make the most of their unstructured data, development teams are turning to retrieval-augmented generation, or RAG, a method for customizing large language models (LLMs). They can use RAG to keep LLMs up to date with organizational knowledge and the latest information available on the web. They can also use RAG and LLMs to surface and extract insights from unstructured data.

    (A minimal sketch of the retrieve-then-generate loop follows this list.)

  3. LXC vs. Docker: Which One Should You Use?

    LXC is not typically used for application development but for scenarios requiring full OS functionality or direct hardware integration. Its ability to provide isolated and secure environments with minimal overhead makes it suitable for infrastructure virtualization where traditional VMs might be too resource-intensive.

    Docker’s utility in supporting rapid development cycles and complex architectures makes it a valuable tool for developers aiming to improve efficiency and operational consistency in their projects.

  4. AES-GCM and breaking it on nonce reuse
  5. Next-Level Boilerplate: An Inside Look Into Our .Net Clean Architecture Repo

    Clean architecture is a widely adopted opinionated way to structure your code and to separate the concerns of the application into layers. The main idea is to separate the business logic from the infrastructure and presentation layers.
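
Since the RAG item (item 2) stays high-level, here is a compact sketch of the retrieve-then-generate loop it describes. Purely illustrative: `embed`, `llm_complete`, and the brute-force similarity search are placeholders for whatever embedding model, LLM API, and vector store you actually use:

```python
import math

def embed(text: str) -> list:
    """Placeholder: call your embedding model here and return a vector."""
    raise NotImplementedError

def llm_complete(prompt: str) -> str:
    """Placeholder: call your LLM here."""
    raise NotImplementedError

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def retrieve(question, chunks, k=3):
    """Step 1: pull the k chunks of unstructured text closest to the question."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

def answer_with_rag(question, chunks):
    """Step 2: put the retrieved context into the prompt so the model answers
    from current organizational knowledge rather than only its training data."""
    context = "\n\n".join(retrieve(question, chunks))
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )
    return llm_complete(prompt)
```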

Thursday 2024-06-20 Assorted Links
Published: 2024-06-20

Assorted links for Thursday, June 20:

  1. How we improved push processing on GitHub

    A push triggers a Kafka event, which is fanned out via independent consumers to many isolated jobs that can process the event without worrying about any other consumers.

    (A small consumer-group sketch of this fan-out pattern follows this list.)

  2. Leveraging Rust in High-Performance Web Services

    Rust’s ownership model is a fundamental feature that enhances both speed and safety. Every value in Rust has a unique owner, responsible for its cleanup when it’s no longer needed. This eliminates the need for a garbage collector and ensures efficient memory management. The ownership rules are enforced at compile time, which means there’s no runtime overhead.

  3. systemd 256 Released With run0, systemd-vpick, importctl & Other New Features
  4. Maintaining large-scale AI capacity at Meta

    Outside of special cases, Meta maintains its fleet of clusters using a technique called maintenance trains. This is used for all capacity, including compute and storage capacity. A small number of servers are taken out of production and maintained with all applicable upgrades. Trains provide the guarantee that all capacity minus one maintenance domain is up and running 24/7, thus providing capacity predictability. This is mandatory for all capacity that is used for online and recurring training.

  5. How Meta trains large language models at scale
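
The fan-out described in the GitHub item (item 1) maps naturally onto Kafka consumer groups: each job type uses its own group, so every group independently receives every push event and a slow job never delays the others. A minimal sketch, assuming the confluent-kafka Python client; GitHub’s actual pipeline is internal, and the broker, topic, and group names here are invented:

```python
# pip install confluent-kafka
from confluent_kafka import Consumer

def handle_push(event_bytes):
    """Placeholder for this job's isolated work (e.g., updating pull requests)."""
    print("processing push event:", event_bytes)

# One consumer group per job type: every group gets its own copy of each event.
consumer = Consumer({
    "bootstrap.servers": "localhost:9092",         # assumed local broker
    "group.id": "push-jobs.update-pull-requests",  # invented group name
    "auto.offset.reset": "earliest",
    "enable.auto.commit": False,
})
consumer.subscribe(["pushes"])                     # invented topic name

try:
    while True:
        msg = consumer.poll(1.0)
        if msg is None:
            continue
        if msg.error():
            print("consumer error:", msg.error())
            continue
        handle_push(msg.value())
        consumer.commit(message=msg)               # commit only after success
finally:
    consumer.close()
```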

Wednesday 2024-06-19 Assorted Links
Published: 2024-06-19

Assorted links for Wednesday, June 19:

  1. Arm64 on GitHub Actions: Powering faster, more efficient build systems

    Developers can now take advantage of Arm-based hardware hosted by GitHub to build and deploy their release assets anywhere Arm architecture is used. Best of all, these runners are priced at 37% less than our x64 Linux and Windows runners.

  2. Develop Kubernetes Operators in Java without Breaking a Sweat
  3. The Energy Footprint of Humans and Large Language Models

    Assuming an 8-hour workday and considering 260 workdays per year brings the annual energy cost of one person’s hour of daily work to around 6 kWh[a].

    Now for the energy cost of running an LLM. We have set a target of 250 words in an hour. LLMs generate tokens, parts of words, so if we use the standard ratio (for English) of 0.75 words per token, our target for one hour of work is around 333 tokens. Measurements with Llama 65B reported around 4 Joules per output token [4]. This leads to 1,332 Joules for 333 tokens, about 0.00037 kWh.

    (The short script after this list re-derives these numbers.)

  4. Microsoft is reworking Recall after researchers point out its security problems

    Microsoft’s upcoming Recall feature in Windows 11 has generated a wave of controversy this week following early testing that revealed huge security holes. The initial version of Recall saves screenshots and a large plaintext database tracking everything that users do on their PCs, and in the current version of the feature, it’s trivially easy to steal and view that database and all of those screenshots for any user on a given PC, even if you don’t have administrator access. Recall also does little to nothing to redact sensitive information from its screenshots or that database.

    First and most significantly, the company says that Recall will be opt-in by default, so users will need to decide to turn it on. It may seem like a small change, but many users never touch the defaults on their PCs, and for Recall to be grabbing all of that data by default definitely puts more users at risk of having their data stolen unawares.

    The company also says it’s adding additional protections to Recall to make the data harder to access. You’ll need to enable Windows Hello to use Recall, and you’ll need to authenticate via Windows Hello (whether it’s a face-scanning camera, fingerprint sensor, or PIN) each time you want to open the Recall app to view your data.

  5. Building Generative AI apps with .NET 8
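
The arithmetic in the energy-footprint item (item 3) is easy to reproduce; the short script below simply re-derives the quoted LLM-side numbers (≈333 tokens, ≈1,332 J, ≈0.00037 kWh) from the article’s stated assumptions:

```python
# Re-derive the LLM-side numbers quoted above.
WORDS_PER_HOUR = 250        # the article's one-hour writing target
WORDS_PER_TOKEN = 0.75      # the standard English words-per-token ratio it uses
JOULES_PER_TOKEN = 4        # the Llama 65B measurement it cites as [4]
JOULES_PER_KWH = 3_600_000  # 1 kWh = 3.6 MJ

tokens = WORDS_PER_HOUR / WORDS_PER_TOKEN
joules = tokens * JOULES_PER_TOKEN
print(f"{tokens:.0f} tokens, {joules:.0f} J, {joules / JOULES_PER_KWH:.5f} kWh")
# -> 333 tokens, 1333 J, 0.00037 kWh (matching the article's ~1,332 J figure)
```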