Librarian of Alexandria

2013-10-17

Latka: Tags

Another new Latka feature: effectively, I want the ability to have abstract data types, so I'd like to hide the constructor/destructor of a type and only expose a particular interface. The way I'm doing this is with tags, which are a particular variation on runtime types that have a bit in common with data type constructors in most ML variants.

We speak of tags wrapping values. We declare a new tag either at the top level with

tag t

or in a local scope with

let tag t in (* ... *)

Once we have a tag, we can wrap a value with

x as t

and unwrap it by pattern-matching. This point is slightly unintuitive, and has been my only sticking point—assuming we have a tag t in scope, then

case x of
  y : t -> y

will extract a value that's been wrapped by type t. This is not how pattern bindings usually work—this corresponds to a Haskell pattern like

case x of Tagged y t' | t == t' -> y

and not to the naïve Haskell translation

case x of Tagged y t -> y

So in this case, unification must also be aware of the values in (lexical) scope, which is a bit of an extension. I think I like this better than the alternatives, though.

A wrapped value cannot be used as though it were unwrapped, so the following would result in a runtime error:

let tag n in
  add.1.(2 as n)
  (* error: non-numeric argument to add *)

And if the tag is not in scope, then we can effectively encapsulate our data by providing various accessor functions to work with the tagged data, e.g.

<mkMyNum,getMyNum,addMyNum> := let tag myNum in
  < \ x : num . x as myNum
    \ _       . failure."Non-numeric argument to mkMyNum"
  , \ x : myNum . x
    \ _         . failure."Non-MyNum argument to getMyNum"
  , \ (x : myNum) (y : myNum) . (add.x.y) as myNum
    \ _                       . failure."Non-myNum arguments to addMyNum"
  >
puts getMyNum.(addMyNum.(mkMyNum.2).(mkMyNum.3))
(* prints 5 *)
puts getMyNum.(addMyNum.3.(mkMyNum.3))
(* Failure: Non-MyNum argument to addMyNum *)

One final quirk is that creating a new tag effectively creates a new token that's not visible to the user, so two tags with the same name are still two distinct tags. This fixes a problem with the original, naïve implementation of local type declarations in SML where you could write

let f = (let datatype t = X of int  in fn (X n) => n) in
let x = (let datatype t = X of bool in X true) in
  f x
end
end

which is of course nonsensical and breaks soundness and everything. In SML, this was solved by not allowing locally defined types to escape, which strikes me as unnecessarily draconian; we just need to be intelligent enough to know that the two t's above are distinct because they were declared in distinct locations. In doing so, SML would admit the sensical analogue to the Latka code above:

(* This doesn't work, but it'd be nice if it did *)
val (mkMyNum,getMyNum,addMyNum) =
  let datatype myNum = MyNum of int in
    ( fn n         => MyNum n
    , fn (MyNum n) => n
    , fn (MyNum n) =>
        fn (MyNum m) => MyNum (n + m)
    )
  end

In contrast, the following Latka code would print "Nope!"

f := let tag t in \ _ : t . "Yup!"
                  \ _     . "Nope!"
x := let tag t in 5 as t
puts f.x

i.e. despite both being wrapped with a tag named t, each declaration of a tag will produce a new, distinct tag. As Latka is disgustingly dynamic1, coming across an expression let tag t in ... will likely increment some hidden local integer that gets used as the tag, but that is of course an implementation detail subject to change. (Also, because of the limited scope of this language's use, it's not like I have to worry about, say, sending data between processes or multicore or anything, which makes these decisions much easier.)


  1. I was describing a Latka feature to a coworker, who suddenly stopped and said, "Wait, if Latka is dynamically typed, what's stopping you from having an expression like 5 | True | (\x.x)?" I told him, "...absolutely nothing." He was disgusted at the prospect.