diff --git a/db/posts/5.json b/db/posts/5.json index c3b3ed4..62306f6 100644 --- a/db/posts/5.json +++ b/db/posts/5.json @@ -4,7 +4,7 @@ "hidden": false, "idris": true, "ipkg": "/home/nathan/Projects/Blog/projects/Idris/Idris.ipkg", - "posted-at": "2025-02-09T06:23:37.499533-05:00", + "posted-at": "2025-02-11T18:24:30.059245-05:00", "slugs": [ ], "source": "/home/nathan/Projects/Blog/projects/Idris/src/LessMacrosMoreTypes/Printf.md", diff --git a/projects/Idris/src/LessMacrosMoreTypes/Printf.md b/projects/Idris/src/LessMacrosMoreTypes/Printf.md index 4e05b99..a6dcd5b 100644 --- a/projects/Idris/src/LessMacrosMoreTypes/Printf.md +++ b/projects/Idris/src/LessMacrosMoreTypes/Printf.md @@ -7,51 +7,37 @@ import Data.List import System ``` -While C can provide convenient string formatting by having hideously memory -unsafe variadics, and dynamic languages, like python, can do the same while -being memory safe by not being type safe, many type safe languages, such as -Rust, are forced to provide such functionality through the use of a macro. -Dependently typed languages, like Idris, can provide a printf like formatting -interface, while maintaining both memory and type safety, without the need for -the macro. We will explore this by implementing a simplified version of `printf` -in Idris from scratch. +While C can provide "convenient" string formatting by having hideously unsafe variadics, and dynamic languages, like python, can do the same, many type safe languages, such as Rust, are forced to provide such functionality through the use of a macro. Dependently typed languages, like Idris, can provide a `printf` like formatting interface, while maintaining both memory and type safety, without the need for macros. We will explore this by implementing a simplified version of `printf` in Idris from scratch. -This article is inspired by an exercise from chapter 6 of -[Type Driven Development with Idris](https://www.manning.com/books/type-driven-development-with-idris), -and is written as a literate Idris file, with the source available -[here](https://git.stranger.systems/thatonelutenist/website/src/branch/trunk/projects/Idris/src/LessMacrosMoreTypes/Printf.md). +This article is inspired by an exercise from chapter 6 of [Type Driven Development with Idris](https://www.manning.com/books/type-driven-development-with-idris), and is written as a literate Idris file, with the source available [here](https://git.stranger.systems/thatonelutenist/website/src/branch/trunk/projects/Idris/src/LessMacrosMoreTypes/Printf.md). ## Gameplan -Our goal is to provide a printf function that can be called, much like it's C -equivalent: +Our goal is to provide a `printf` function that can be called much like it's C equivalent: -> [!NOTE] -> As this is a literate Idris document, and we haven't defined our `printf` -> function yet, we have to use a `failing` block to ask the compiler to check -> that this code parses, and syntax highlight it for us, but not attempt to -> actually compile it. +> [!Note] +> As this is a literate Idris document, and we haven't defined our `printf` function yet, we have to use a `failing` block to ask the compiler to check that this code parses, and syntax highlight it for us, but not attempt to actually compile it. ```idris failing - printf "%s %d %2d" "hello" 1 2 + example : String + example = printf "%s %d %2d" "hello" 1 2 ``` -Idris lacks a dedicated facility for variadics, but we can call functions in -type signatures, Idris allows us to manipulate types as first class values, and -we can use the runtime values of previous arguments to the function we are -defining as arguments to our type-level function. +Idris lacks a dedicated facility for variadics, but we can call functions in type signatures, which gives us an in for implementing them. -To get our variadic `printf` function, we can parse our format string into a -data structure, then pass that data structure into a type-level function that -calculates the rest of the type signature of our `printf` function based on its -contents. +Idris allows us to manipulate types as first class values, and we can use the runtime values of previous arguments to the function we are declaring as arguments to later type-level functions. These type-level functions can, furthermore, recurse into themselves to produce type signatures of varying length and content. + +> [!TIP] +> Idris requires that these type level functions be [_total_](https://idris2.readthedocs.io/en/latest/tutorial/theorems.html#totality-checking) for them to be expanded at compile time, meaning they are known to the compiler to return a value in constant time for all possible inputs. + +This allows us to produce variadic functions, so long as the number of arguments and their types can be determined from one of the arguments of the function before the variadic portion. + +To get our variadic `printf` function, we can parse our format string into a data structure, then pass that data structure into a type-level function that calculates the rest of the type signature of our `printf` function based on contents of the format string. ## Parsing a Format String -First, we need a data structure to describe our format string. We define the -`Format` data type, with constructors for each of the format specifiers we will -be supporting, as well as a constructor to hold literal components. +First, we need a data structure to describe our format string. We define the `Format` data type, with constructors for each of the format specifiers we will be supporting, as well as a constructor to hold literal components. ```idris data Format : Type where @@ -68,15 +54,10 @@ data Format : Type where End : Format ``` -We'll need to be able to parse numbers for our `PaddedNumber` constructor, so -we'll write a little helper function to handle that component of the parsing. -Simply keep pulling off characters, converting them to integers by shifting -their ordinal values. +We'll need to be able to parse numbers for our `PaddedNumber` constructor, so we'll write a little helper function to handle that component of parsing. We'll simply keep pulling off characters, converting them to integers by manipulating their ordinal values, until we hit a non-digit. > [!WARNING] -> On Idris 2's chez scheme backend, these are unicode ordinals, where the digit -> characters are in numerical order, but the values are backend dependent and -> this code is not guaranteed to work properly on other backends. +> On Idris 2's chez scheme backend, these are unicode ordinals, where the digit characters are in numerical order, but the values are backend dependent and this code is not guaranteed to work properly on other backends. ```idris parseNumber : (xs : List Char) -> (acc : Nat) -> (Nat, List Char) @@ -88,10 +69,7 @@ parseNumber (x :: xs) acc = else (acc, x :: xs) ``` -We'll also want another one to scoop up a literal into a `List Char`, continuing -until we hit the end of the string or the next `%`. This is optional, we could -have each literal consist of an individual char, but we will go ahead and group -them together. +We'll also want another one to scoop up a literal into a `List Char`, continuing until we hit the end of the string or the next `%`. This is optional, we could have each literal consist of an individual char, but we will go ahead and group them together. ```idris parseLiteral : (xs : List Char) -> (List Char, List Char) @@ -102,10 +80,9 @@ parseLiteral (x :: xs) = in (x :: literal, rest) ``` -Parse our format string into our `Format` data structure. The specifics of the -parsing here aren't really material to the main point of this article, but we -use a basic pattern matching approach, calling into our helper functions as -appropriate. +Next, parse our format string into an instance of the `Format` data structure. The specifics of the parsing aren't really material to the main point of this article, but we use a basic pattern matching approach, calling into our helper functions as appropriate. + +We return a `Maybe Format` to signal failure in the case of an invalid format string, and trigger an error when the programmer attempts to use an invalid format string. ```idris parseFormat : (xs : List Char) -> Maybe Format @@ -143,7 +120,14 @@ parseFormat (x :: xs) = ## Calculating a Type From a Format String -```idris +Since our parser returns a `Maybe Format`, we will define two variants of our `PrintfType` type-level function. One that handles the inner `Format`, and another that handles a `Maybe Format`, returning a type signature that is impossible to satisfy in the `Nothing` case, and calling our first variant in the `Just` case. + +The structure of this type-level function is rather simple, we just pattern match on our `Format` argument, and prepend the corresponding argument type to the rest of the type signature, computed through a recursive call on the `next` component of the `Format`. When we hit the `End`, we return the `String` type, which is the final return value of our function. + +> [!TIP] +> The `->` that appears in the body of the function is actually the exact same `->` that appears in the type signature, in both cases it is the constructor for function types. Idris uses the same language for type signatures and for function bodies, so anything you can use in one place, you can use in the other. + +```idris PrintfType' : Format -> Type PrintfType' (Number next) = (num : Nat) -> PrintfType' next @@ -154,7 +138,9 @@ PrintfType' (Str next) = PrintfType' (Literal literal next) = PrintfType' next PrintfType' End = String +``` +```idris PrintfType : Maybe Format -> Type PrintfType Nothing = Void -> String PrintfType (Just x) = PrintfType' x @@ -173,6 +159,15 @@ left_pad len pad str = else str ``` +Now, we need to be able to produce our formatting function from a `Maybe Format`. We'll use the same two variants trick as above, but put the variant that operates directly on `Format`s in a `where` block, since we won't need to call it from anywhere else. + +We build the function argument by argument, pattern matching on the `Format`, producing a lambda that accepts the argument type demanded by the current specifier. Then recurse into our inner `printfFmt'` function to handle the rest of the specifiers and arguments, applying the appropriate string conversion to the argument and tacking it to the end of the accumulator string. + +The accumulator string is returned verbatim when we hit `End`. + +> [!NOTE] +> Much like Haskell and other functional programming languages, Idris, semantically, only actually has functions of one argument. Multi-argument functions actually accept one argument, returning another function that handles the rest of the arguments. + ```idris printfFmt : (fmt : Maybe Format) -> (acc : String) -> PrintfType fmt printfFmt Nothing acc = @@ -193,13 +188,17 @@ printfFmt (Just x) acc = printfFmt' x acc ### With a Format String +`printf` is easily defined a simple wrapper, converting the provided string to a `Format` and passing it to `printfFmt`. After accepting the format string argument, we parse a `Maybe Format` from it, providing the parsed value to our `PrintfType` function to calculate the rest of our type signature. + +> [!TIP] +> The `_` syntax that appears in this function introduces an anonymous hole. When the value of a hole is forced by type checking (resolved), the compiler fills in its value for you. Holes defined with `_` will result in an "unsolved holes" compile error if unresolved. + ```idris printf : (fmt : String) -> PrintfType (parseFormat (unpack fmt)) printf fmt = printfFmt _ "" ``` -We can call `printf` as expected, with the number of and types of the arguments -being determined by the provided format string: +We can call `printf` as expected, with the number of and types of the arguments being determined by the provided format string: ```idris hide -- @@test printf hello world @@ -208,28 +207,191 @@ helloWorld = do pure $ ``` +