Dear readers, you probably want to subscribe to these link aggregator channels. I have been reading them for a few years and heartily recommend them.

- https://www.tg-me.com/axisofordinary
- https://www.tg-me.com/j_links
The Functional Machine Calculus

Willem Heijltjes

This paper presents the Functional Machine Calculus (FMC) as a simple model of higher-order computation with "reader/writer" effects: higher-order mutable store, input/output, and probabilistic and non-deterministic computation.

The FMC derives from the lambda-calculus by taking the standard operational perspective of a call-by-name stack machine as primary, and introducing two natural generalizations. One, "locations", introduces multiple stacks, which each may represent an effect and so enable effect operators to be encoded into the abstraction and application constructs of the calculus. The second, "sequencing", is known from kappa-calculus and concatenative programming languages, and introduces the imperative notions of "skip" and "sequence". This enables the encoding of reduction strategies, including call-by-value lambda-calculus and monadic constructs.
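
Not from the paper itself, but as a rough illustration of the two generalizations, here is a minimal Haskell sketch of FMC-style terms and one step of a call-by-name machine with a stack per location. The constructor names and the "stack of unevaluated terms" representation are my own simplifications, not the paper's notation.

```haskell
-- Illustrative only: simplified FMC-style terms and a call-by-name step
-- function over one stack per location.
import qualified Data.Map.Strict as M

type Var = String
type Loc = String                 -- a "location": which stack to push to / pop from

data Term
  = Var Var                       -- variable
  | Skip                          -- imperative "skip"
  | Seq Term Term                 -- sequencing  M ; N
  | Push Loc Term Term            -- generalized application: push a term onto a location, continue
  | Pop  Loc Var Term             -- generalized abstraction: pop from a location, bind, continue

type Stacks = M.Map Loc [Term]    -- machine memory: one stack of (unevaluated) terms per location

step :: (Stacks, Term) -> Maybe (Stacks, Term)
step (s, Push l n m) = Just (M.insertWith (++) l [n] s, m)
step (s, Pop l x m)  = case M.lookup l s of
  Just (n : rest) -> Just (M.insert l rest s, subst x n m)
  _               -> Nothing                 -- blocked: nothing to pop at location l
step (s, Seq Skip n) = Just (s, n)
step (s, Seq m n)    = do (s', m') <- step (s, m)
                          Just (s', Seq m' n)
step _               = Nothing               -- Var and Skip are terminal here

-- Naive substitution (not capture-avoiding; adequate when n is closed).
subst :: Var -> Term -> Term -> Term
subst x n t = case t of
  Var y | y == x        -> n
        | otherwise     -> t
  Skip                  -> Skip
  Seq a b               -> Seq (subst x n a) (subst x n b)
  Push l a b            -> Push l (subst x n a) (subst x n b)
  Pop l y b | y == x    -> Pop l y b                     -- x shadowed
            | otherwise -> Pop l y (subst x n b)
```

With a single anonymous location this is just the usual call-by-name stack machine for the lambda-calculus; extra locations are what let effect operators reuse the same push/pop constructs.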

https://arxiv.org/abs/2212.08177
Collapsing Towers of Interpreters

Nada Amin, Tiark Rompf

Given a tower of interpreters, i.e., a sequence of multiple interpreters interpreting one another as input programs, we aim to collapse this tower into a compiler that removes all interpretive overhead and runs in a single pass. In the real world, a use case might be Python code executed by an x86 runtime, on a CPU emulated in a JavaScript VM, running on an ARM CPU. Collapsing such a tower can not only exponentially improve runtime performance, but also enable the use of base-language tools for interpreted programs, e.g., for analysis and verification. In this paper, we lay the foundations in an idealized but realistic setting.
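
As a toy, one-level illustration of the idea (not the paper's multi-level language or its staging machinery): an interpreter pays the cost of walking the AST on every run, while a "compiler" walks it once and returns a residual function with no interpretive overhead left. The sketch below, in Haskell, uses names of my own choosing.

```haskell
import qualified Data.Map.Strict as M

data Expr = Lit Int | VarE String | Add Expr Expr | Mul Expr Expr

type Env = M.Map String Int

-- Interpreter: re-traverses the AST on every call.
eval :: Expr -> Env -> Int
eval (Lit n)   _   = n
eval (VarE x)  env = env M.! x
eval (Add a b) env = eval a env + eval b env
eval (Mul a b) env = eval a env * eval b env

-- "Compiler": traverses the AST once and returns a residual Env -> Int
-- in which no Expr constructors remain to be inspected at run time.
compile :: Expr -> (Env -> Int)
compile (Lit n)   = \_   -> n
compile (VarE x)  = \env -> env M.! x
compile (Add a b) = let fa = compile a; fb = compile b in \env -> fa env + fb env
compile (Mul a b) = let fa = compile a; fb = compile b in \env -> fa env * fb env

main :: IO ()
main = do
  let prog = Add (VarE "x") (Mul (Lit 2) (VarE "y"))
      run  = compile prog                         -- pay the traversal cost once
  print (run (M.fromList [("x", 1), ("y", 20)]))  -- 41
```

Collapsing a whole tower means achieving this through several interpretation layers at once, which is what the paper's multi-level lambda calculus with staging is for; this sketch only shows the single-layer intuition.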

https://dl.acm.org/doi/10.1145/3158140
Specializing Data Access in a Distributed File System (Generative Pearl)
P. Das, A. Xhebraj, T. Rompf

https://dl.acm.org/doi/10.1145/3689484.3690736

https://www.youtube.com/watch?v=YeWCPRXWIuA

Modern distributed file systems partition and replicate data amongst a cluster of participating nodes to enable processing of larger-than-memory datasets and to provide fault tolerance. Typically, these systems maintain designated coordinator nodes that track the state of the cluster and route user operations to the nodes holding the requested data.

One way to improve performance in this scenario is to provide clients with the cluster topology and data distribution information so that they can bypass the designated coordinator and access the data directly. Loading the files from the underlying file system directly, instead of having to go through high-level APIs, can lead to significantly better performance <...>

In this paper, we propose a system, DDLoader, that embeds the information about the data distribution into the language using the Lightweight Modular Staging metaprogramming framework for Scala. Multi-stage programming techniques allow developers to build applications while delegating the task of generating low-level file loading details to the generated code. This hides the implementation details of the underlying distributed file system while retaining the performance of handwritten examples. Further, metaprogramming erases the distinction between the data loading and data processing phases of the applications typically built over distributed file systems, allowing whole program optimization, which was not possible previously.
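
The paper's implementation uses LMS and Scala; purely as a hedged sketch of the underlying idea (all names and types below are mine, not DDLoader's), here is the specialization step in Haskell: when the cluster topology and partition map are known at code-generation time, all coordinator lookups can be resolved then, leaving a residual plan of direct per-node reads.

```haskell
import qualified Data.Map.Strict as M

-- Hypothetical, simplified cluster metadata (not DDLoader's actual model).
data Topology = Topology
  { partitionOf :: FilePath -> Int     -- which partition a file lives in
  , replicasOf  :: M.Map Int [String]  -- partition id -> hosts holding a replica
  }

-- The "plan" is the residual program: a fixed list of (host, path) reads.
type Plan = [(String, FilePath)]

-- Generation-time specialization: every topology lookup happens here, once,
-- so the plan contains no run-time coordinator round-trips.
specializeLoad :: Topology -> [FilePath] -> Plan
specializeLoad topo files =
  [ (head hosts, f)
  | f <- files
  , let pid   = partitionOf topo f
        hosts = M.findWithDefault [] pid (replicasOf topo)
  , not (null hosts)                   -- skip files with no known replica
  ]

-- Run time: just execute the precomputed reads (stubbed out here).
executePlan :: Plan -> IO ()
executePlan =
  mapM_ (\(host, f) -> putStrLn ("read " ++ f ++ " directly from " ++ host))
```

In the staged setting the plan is generated code rather than a data structure, which is what lets the loading phase fuse with the processing phase for whole-program optimization.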
FP²: Fully in-Place Functional Programming

Anton Lorenzen, Daan Leijen, Wouter Swierstra

As functional programmers we always face a dilemma: should we write purely functional code, or sacrifice purity for efficiency and resort to in-place updates? This paper identifies precisely when we can have the best of both worlds: a wide class of purely functional programs can be executed safely using in-place updates without requiring allocation, provided their arguments are not shared elsewhere.

We describe a linear _fully in-place_ (FIP) calculus where we prove that we can always execute such functions in a way that requires no (de)allocation and uses constant stack space. Of course, such a calculus is only relevant if we can express interesting algorithms; we provide numerous examples of in-place functions on data structures such as splay trees or finger trees, together with in-place versions of merge sort and quicksort.

We also show how we can generically derive a map function over _any_ polynomial data type that is fully in-place. Finally, we have implemented the rules of the FIP calculus in the Koka language. Using Perceus reference-counting garbage collection, this implementation dynamically executes FIP functions in place whenever possible.
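
The paper's examples are written in its FIP calculus and in Koka; as a small illustration only, here is the shape of a function that fits the discipline, written as plain Haskell: accumulator-style list reversal, where each step deconstructs one cons cell and constructs exactly one, so a Perceus-style compiler can reuse the freed cell in place when the input is unshared, and the tail call keeps the stack constant. (GHC does not perform this reuse; Koka's Perceus does, which is the point of the paper.)

```haskell
-- Each recursive step matches one (:) and allocates one (:), so under
-- reuse analysis the freed cell can back the new one: no net allocation,
-- and the tail call needs only constant stack.
reverseAcc :: [a] -> [a] -> [a]
reverseAcc acc []       = acc
reverseAcc acc (x : xs) = reverseAcc (x : acc) xs

reverseFip :: [a] -> [a]
reverseFip = reverseAcc []
```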

https://dl.acm.org/doi/abs/10.1145/3607840
Denotational design with type class morphisms

Conal Elliott

Type classes provide a mechanism for varied implementations of standard interfaces. Many of these interfaces are founded in mathematical tradition and so have regularity not only of types but also of properties (laws) that must hold. Types and properties give strong guidance to the library implementor, while leaving freedom as well. Some of this remaining freedom is in how the implementation works, and some is in what it accomplishes.

To give additional guidance to the what, without impinging on the how, this paper proposes a principle of type class morphisms (TCMs), which further refines the compositional style of denotational semantics. The TCM idea is simply that the instance’s meaning is the meaning’s instance. This principle determines the meaning of each type class instance, and hence defines correctness of implementation. It also serves to transfer laws about a type’s semantic model, such as the class laws, to hold for the type itself. In some cases, it also provides a systematic guide to implementation, and in some cases, valuable design feedback.

The paper is illustrated with several examples of types, meanings, and morphisms.
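
Not from the paper, but a minimal Haskell sketch of the principle using a toy finite-map type of my own: the meaning of an association list is a partial function, and the TCM requires the Monoid instance on the representation to agree with the pointwise, left-biased Monoid instance on the meaning.

```haskell
import Data.Monoid (First (..))

-- Representation: a left-biased association list.
newtype AssocMap k v = AssocMap [(k, v)]

-- Semantic model ("meaning"): a partial function with left-biased choice.
meaning :: Eq k => AssocMap k v -> (k -> First v)
meaning (AssocMap kvs) = \k -> First (lookup k kvs)

-- Instances on the representation: left-biased union.
instance Semigroup (AssocMap k v) where
  AssocMap xs <> AssocMap ys = AssocMap (xs ++ ys)
instance Monoid (AssocMap k v) where
  mempty = AssocMap []

-- Type class morphism (the instance's meaning is the meaning's instance):
--   meaning mempty   ≡ mempty                    -- i.e. const (First Nothing)
--   meaning (m <> n) ≡ meaning m <> meaning n    -- pointwise, left-biased
-- Both hold because `lookup k (xs ++ ys)` takes the first hit in xs, else ys.
```

The morphism property is what transfers the Monoid laws from the semantic model (functions into First) to the representation, which is exactly the kind of guidance the paper advocates.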

http://conal.net/papers/type-class-morphisms/
https://www.youtube.com/watch?v=g3qd4zpm1LA

A brief overview of two new features for "OCaml 6.0" that Jane Street is finalizing.

The first one is unboxed types with "levity polymorphism" (more or less) and stack allocation.

Safe stack allocation requires tracking values that don't escape their scope (and scope inference), which largely amounts to linearity. This leads to the second feature: race-free parallel (concurrent) programming.

Apparently, it requires tracking more properties than just linearity, but once you've added kinds and modes to your type system, it's hard to stop half-way... 😁
Harnessing the Universal Geometry of Embeddings

We present the first method to translate text embeddings across different spaces without any paired data or encoders.
Our method, vec2vec, reveals that all encoders—regardless of architecture or training data—learn nearly the same representations!
We demonstrate how to translate between these black-box embeddings without any paired data, maintaining high fidelity.
Using vec2vec, we show that vector databases reveal (almost) as much as their inputs.
Given just vectors (e.g., from a compromised vector database), we show that an adversary can extract sensitive information (e.g., PII) about the underlying text.

Strong Platonic Representation Hypothesis (S-PRH)
We thus strengthen Huh et al.'s PRH to say:
The universal latent structure of text representations can be learned and harnessed to translate representations from one space to another without any paired data or encoders.

https://arxiv.org/abs/2505.12540
Unpublished lecture notes in a secret place. All those moments will be lost in time, like tears in rain. Enjoy!

https://www.cs.cmu.edu/~rwh/courses/atpl/pdfs/
Scalable Pattern Matching in Computation Graphs

Luca Mondada, Pablo Andrés-Martínez

Graph rewriting is a popular tool for the optimisation and modification of graph expressions in domains such as compilers, machine learning and quantum computing. The underlying data structures are often port graphs—graphs with labels at edge endpoints. A prerequisite for graph rewriting is the ability to find graph patterns. We propose a new solution to pattern matching in port graphs. Its novelty lies in the use of a pre-computed data structure that makes the pattern matching runtime complexity independent of the number of patterns. This offers a significant advantage over existing solutions for use cases with large sets of small patterns.

Our approach is particularly well-suited for quantum superoptimisation. We provide an implementation and benchmarks showing that our algorithm offers a 20x speedup over current implementations on a dataset of 10000 real-world patterns describing quantum circuits.
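
The abstract does not spell out the pre-computed data structure, so by analogy only (strings rather than port graphs, and entirely my own code): compiling all patterns into a single trie up front makes the matching work per text position depend on the longest pattern, not on how many patterns were indexed, which is the flavour of the complexity claim above.

```haskell
import qualified Data.Map.Strict as M

-- A trie over all patterns, built once; matching walks the trie and never
-- iterates over the pattern set.
data Trie = Trie { endsHere :: [String], children :: M.Map Char Trie }

emptyTrie :: Trie
emptyTrie = Trie [] M.empty

insertPat :: String -> Trie -> Trie
insertPat p = go p
  where
    go []       (Trie es ch) = Trie (p : es) ch
    go (c : cs) (Trie es ch) =
      Trie es (M.insert c (go cs (M.findWithDefault emptyTrie c ch)) ch)

buildTrie :: [String] -> Trie
buildTrie = foldr insertPat emptyTrie

-- All patterns that occur starting at the given suffix of the text.
matchesAt :: Trie -> String -> [String]
matchesAt (Trie es _)  []         = es
matchesAt (Trie es ch) (c : rest) =
  es ++ maybe [] (\t -> matchesAt t rest) (M.lookup c ch)

-- All (position, pattern) matches in a text.
matchAll :: Trie -> String -> [(Int, String)]
matchAll trie text =
  [ (i, p) | (i, suffix) <- zip [0 ..] (suffixes text), p <- matchesAt trie suffix ]
  where
    suffixes []          = [[]]
    suffixes s@(_ : t)   = s : suffixes t
```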

https://arxiv.org/abs/2402.13065