freshlist finished draft

This commit is contained in:
Nathan McCarty 2025-03-16 19:42:29 -04:00
parent 5209551bb0
commit 6f7be736e9

View file

@ -6,9 +6,10 @@ module DependentNuggets.FreshLists
import System
import Data.SortedSet
import Data.So
import Data.Nat
```
When programming, we quite frequently encounter the for a data structure that can contain at most one of each given element. Typically, we would use a `Set`, which satisfies this constraint, but does so as a runtime invariant that must be taken on trust, and results in ergonomic concerns when used as a component of API design. We can use "Fresh Lists" as an alternative that provide a type-level guarantee of the uniqueness of each element, while also generalizing the notion of uniqueness and providing compile-time checking of the constraint on literals.
When programming, we quite frequently encounter the need for a data structure that can contain at most one of each given element. Typically, we would use a `Set`, which satisfies this constraint, but does so as a runtime invariant that must be taken on trust, and results in ergonomic concerns when used as a component of API design. We can use "Fresh Lists" as an alternative that provide a type-level guarantee of the uniqueness of each element, while also generalizing the notion of uniqueness and providing compile-time checking of the constraint on literals.
As best as I can tell, the concept of fresh lists originated with Catarina Coquand's _A Formalised Proof of the Soundness and Completeness of a Simply Typed Lambda-Calculus with Explicit Substitutions_ paper, and acquired the name 'FreshList' when they were modified for inclusion in [Agda's standard library](https://agda.github.io/agda-stdlib/master/Data.List.Fresh.html). The implementation in this blog post is inspired both by the Agda implementation, and by the implementation in [collie](https://github.com/ohad/collie), which is a command line arguments handling library for Idris.
@ -234,12 +235,230 @@ x ## (y :: xs) = (x `fresh` y) && (x ## xs)
This operator takes our freshness predicate as an implicit argument, named `fresh`, and recursively checks that an element is fresh relative to every element already in a given `FreshList`, in effect "lifting" our runtime freshness check over two elements into a runtime freshness check over an element and an entire list.
While this provides us our whole-list runtime freshness check, we still need to lift this into a type level proof, which the `#` operator, with type `(#) : {fresh : e -> e -> Bool} -> (x : e) -> (xs : FreshList fresh e) -> Type` does by dressing our `##` in `So` clothing:
While this provides us our whole-list runtime freshness check, we still need to lift this into a type level proof, which the `#` operator, with type `(#) : {fresh : e -> e -> Bool} -> (x : e) -> (xs : FreshList fresh e) -> Type`, does by dressing our `##` in `So` clothing:
```idris
x # xs = So (x ## xs)
```
### Some operations over FreshLists
We can define all of our usual list operations over `FreshList`s, but most of them are going to be just about identical to the `List` equivalent, so we'll just cover some interesting ones.
We can't unconditionally append elements to a `FreshList`, they must be fresh after all, and it can be a bit onerous to have to do the freshness check ourself every time, so we'll define a `maybeAppend` function that runs the freshness check, returning a `Just` if the new element is, in fact, fresh, and `Nothing` otherwise:
```idris
maybeAppend : {fresh : e -> e -> Bool}
-> (x : e) -> (xs : FreshList fresh e) -> Maybe (FreshList fresh e)
maybeAppend x xs =
case choose (x ## xs) of
Left fresh => Just (x :: xs)
Right not_fresh => Nothing
```
## Basic Usage of FreshLists
### Like `Set`s, but with checked literals
To make things a little cleaner, we'll define an alias for `/=` restricted to `Nat`s. Using the `/=` operator directly is possible, but requires a little bit of mess, the type of such a `FreshList` needs to be written like `FreshList ((/=) {ty = Nat}) Nat`, specifying the type that `/=` applies to, so to avoid the clutter:
```idris
Neq : Nat -> Nat -> Bool
Neq = (/=)
```
As previously discussed, a `FreshList` with non-equality as the freshness criteria behaves much like a set, containing at most one of each value. Unlike a set, however, this is ensured at the type level.
Lets define a silly lil function that accepts a `FreshList` of unique numbers to use for some examples:
> [!NOTE]
> Notice how this definition does not impose any constraints upon the element type (ignoring that we have explicitly specified it to be `Nat`). While our definition of `Neq` implicitly uses `Nat`'s `Eq` implementation, this is entirely a result of our own choices as a consumer of `FreshList`.
```idris
takesUniqueNumbers : FreshList Neq Nat -> Bool
takesUniqueNumbers xs = True
```
As expected, this function happily accepts a literal `FreshList` containing only unique numbers:
```idris hide
-- @@test takesUniqueNumbers good
takesUniqueNumbersGood : IO Bool
takesUniqueNumbersGood =
pure $
```
<div class="unit-test">
```idris
takesUniqueNumbers [1, 2, 3, 4, 5]
```
</div>
Notice how, thanks to proof search and literal overloading, we don't have to apply any conversion function to the list-style literal here, the literal directly produces a `FreshList` carrying the associated type-level guarantees.
Additionally, as an upgrade to `SortedSet`, at least for this use-case, attempts to define a literal with duplicate elements are rejected by the compiler:
```idris
failing
oopsNotUnique : Bool
oopsNotUnique = takesUniqueNumbers [1, 2, 2, 3, 4, 5]
```
### Like `Set`s, but with other equivalence relations
We can construct another freshness criteria that compares for non-equality modulo some number, a fairly straight forward example of an equivalence class that _isn't_ equality and would be profoundly weird to express as an implementation of the language's `Eq` interface:
> [!NOTE]
> We have to use `modNatNZ` here instead of just `mod` to allow the compiler to proof search the freshness witnesses for us.
```idris
NeqMod10 : Nat -> Nat -> Bool
NeqMod10 k j =
(modNatNZ k 10 ItIsSucc) /= (modNatNZ j 10 ItIsSucc)
```
We can now use this relationship to define a function that, for example, accepts a list of `Nat`s that are non-equal modulo 10, corresponding to the requirement that every number's last digit be different:
```idris
differentLastDigits : FreshList NeqMod10 Nat -> Bool
differentLastDigits xs = True
```
This function happily accepts a list where every element's last digit is different:
```idris hide
-- @@test differentLastDigits good
diffLastsGood : IO Bool
diffLastsGood =
pure $
```
<div class="unit-test">
```idris
differentLastDigits [23, 12, 41, 56]
```
</div>
And also fails with a compile time error if we try to provide it a list where two numbers have the same last digit:
```idris
failing
oopsSameLastDigit : Bool
oopsSameLastDigit = differentLastDigits [23, 11, 41, 56]
```
### With a non-equivalence relation
Our freshness criteria doesn't have to be an equivalence relationship, if we instead use, say, `<` as our freshness criteria, we end up enforcing not only uniqueness, but that the list be in sorted order. As before, we'll define an alias to clean up the following code:
```idris
Lt : Nat -> Nat -> Bool
Lt = (<)
```
We can then define a silly lil boy accepting lists of numbers sorted in ascending order:
```idris
acceptsAscending : FreshList Lt Nat -> Bool
acceptsAscending xs = True
```
As expected, this happily accepts lists of unique numbers defined in ascending order:
```idris hide
-- @@test acceptsAscending good
acceptsAsceGood : IO Bool
acceptsAsceGood =
pure $
```
<div class="unit-test">
```idris
acceptsAscending [1, 2, 3, 4, 5]
```
</div>
But rejects lists containing repeated elements:
```idris
failing
oopsRepeated : Bool
oopsRepeated = acceptsAscending [1, 1, 3, 4, 5]
```
And also rejects lists with elements that are unique, but defined out of order:
```idris
failing
oopsOutOfOrder : Bool
oopsOutOfOrder = acceptsAscending [2, 1, 3, 4, 5]
```
## More Complicated Use of FreshList
While `FreshList`s are useful in many contexts, in my opinion, they really shine as a tool for building "defensive" APIs.
### Color descriptions
Imagine we are designing an API for building a color pallet, for some imaginary digital art application. Such an API might want to accept a list of pairs of color names and descriptions of those colors:
```idris
record ColorDesc where
constructor MkCD
name : String
desc : String
```
Now, it wouldn't make much sense for each color to have more than one description, and we don't want our programmer to accidentally write out two descriptions for the same color, only for one of them to be silently discarded.
Let's build a freshness criteria to encapsulate this expectation. We don't particularly care if the descriptions are the same, but if two colors are the same, we have a problem, so we'll check for non-equality of the `name` field of our `ColorDesc` record:
```idris
FreshColor : ColorDesc -> ColorDesc -> Bool
FreshColor x y = x.name /= y.name
```
We can now define a function that accepts a color pallet based on this freshness criteria:
```idris
acceptPallet : FreshList FreshColor ColorDesc -> Bool
acceptPallet xs = True
```
If the programmer is mindful, and only defines a description for each color once, this function will accept their pallet:
```idris hide
-- @@test different colors
differentColors : IO Bool
differentColors =
pure $
```
<div class="unit-test">
```idris
acceptPallet
[ MkCD {name = "blue", desc = "the color of the sky"}
, MkCD {name = "red", desc = "the color of very hot things"}
, MkCD {name = "green", desc = "the color of leaves"}
, MkCD {name = "brown", desc = "the color of dirt"}
]
```
</div>
However, if our programmer slips up and defines a description for the same color twice, the compiler rightfully doesn't accept their absent minded behavior, and fails with a compile time error:
```idris
failing
oopsSameColorAgain : Bool
oopsSameColorAgain =
[ MkCD {name = "blue", desc = "the color of the sky"}
, MkCD {name = "red", desc = "the color of very hot things"}
, MkCD {name = "green", desc = "the color of leaves"}
, MkCD {name = "brown", desc = "the color of dirt"}
, MkCD {name = "red", desc = "the color of blood"}
]
```
[^1]: https://en.wikipedia.org/wiki/Equivalence_relation
[^2]: The exact conditions under which this will work are beyond the scope of this article