Librarian of Alexandria

2018-11-22

Ižtreyan Cuisine

Ižtreyan meals are communal, taken at large, circular tables, ornately inlaid with brightly colored tiles or stone or metal fragments, that serve as the centerpieces of ižtreyan homes and feature prominently in the back-rooms of ižtreyan workplaces. An ižtreyan meal will involve a handful of main dishes and several smaller side dishes that are shared by everyone, heaped onto individual plates or bowls.

Certain dishes show up as sides at every meal, ubiquitous small plates called talots: these are considered obligatory to the degree that meal size and formality are characterized by the number of talotsa present, with a two-talotsa meal being considered the bare minimum for any meal, while a ten-talotsa meal is a veritable feast. (An ižtrey would never serve a meal with only one talots: in fact, the idiom "a one-talots meal" is used among ižtreya to connote a thing that is completely and unacceptably lacking or unfinished.)

A characteristic talots, known well even outside ižtreyan cities, is rask, which is a variety of tiny, walnut-sized bread, baked quickly in large quantities and served in large bowls with a dusting of salt, usually made of a combination of wheat and buckwheat flour and—less often—chopped nuts. Others include tsalšak, or wafers of dried cucumber softened and served in a tangy yoghurt-based sauce, reželdo, or chopped salted sardines or herring, and gelbrekhi, or fried vegetables in a buckwheat-honey batter.

Among the small bites are the large dishes. A meal with one or two people will likely have one large dish, but when eating as a family, a community, or a workplace, people will often serve anywhere from several to dozens of central dishes. Ižtreyan meals often include meats in flavorful sauces and various baked goods. Centerpiece dishes like these include:

  • žyotsuldo: a roasted savory pie with a buckwheat crust, usually sprinkled with small pieces of cheese shortly before being removed from the oven. A žyotsuldo can be filled with just about anything, but popular choices include beets, marinated beef, or chopped mushrooms. (It's rather uncommon for ižtreyan cities to have stalls for street food or other casual-and-easy-to-acquire foods, since that would run contrary to the camaraderie of a proper five-talotsa meal, an ižtrey might tell you! When such stalls do exist, though, they often sell an easy-to-carry variation on žyotsuldo.)
  • yadash: a roasted, creamed soup of a central flavor (often beets, but sometimes peppers or rhubarb) and a backing, milder flavor (potatoes or yams). This is served with a drizzle of honey and a thick dusting of black pepper, and sometimes a buckwheat flatbread called kyaczut.
  • lyubešku ikhab (or other lyubešk dishes): ikhab is a generic word for red meat, and lyubešk is a style of cooking that involves a slow braise in wine with dried berries and raw grain kernels, usually barley. Over the course of the braise, the grains and berries puff up with the wine and meat juices, and the meat takes on a characteristic pink color. It's possible to lyubešk-cook poultry (lyubešk trabšo) or some vegetables like thicker, meatier mushrooms (lyubešk rabsin), but the lyubešk style is usually associated with red meats.

Tir-Bhahat is a collection of fragments of fantastic world-building. You can read more about it here.

2018-11-02

Quede Names

At first glance, the queder's conventions for naming are remarkably simpler than those of most other peoples. In general, a quede will have two names: in order, a given name chosen by the parents at birth, and a surname taken from one of their parents. Which parent's surname is taken will vary based on local custom: in some places, a quede will take the surname of a parent of the same gender; in others, the surname of a parent of a different gender; in yet others, the surname of the parent in whose ancestral home they live; and in yet others, the surname of the oldest of the parents. The queder rarely change their names, and certainly don't bother changing their names for marriage (although it's not unheard of for a quede to move to another village and adopt a new name to accompany their new life!).

Sadly, the story of quede names is nonetheless complicated by the fact that they have, when compared to most folk, remarkably few given names: the most common two dozen names account for the vast majority of queder. It's not uncommon to walk into a place of work and find four laborers there all named Étun. Indeed, shared names are so common that some parents give the same name to multiple children. The river-town of Elascín was home to a locally famous quede, a book-binder named Pégno Telbasci, who had five sons—all five of them also named Pégno Telbasci!

The queder deal with this remarkable state of affairs by compensating with a truly stunning number of ways of building nick-names. In our hypothetical work-place featuring four queder named Étun, all of them likely have their names registered in local ledgers as Étun but nonetheless are known by some specific variation on the name. There are a number of ways of constructing such variations:

  • Every name has its short forms, usually created by dropping the final syllable: thus Étun might be called Étt, Yanna called Yan, a Pégno called Pénn.
  • Among those names which are longer than two syllables, one could drop the final syllable (as in Adrisc for Adrisci or Demel for Demela), but might also drop the middle vowel (as in Asci or Demma) or sometimes even the initial syllable (as in Drisci or Mela).
  • Adjectives, especially simple adjectives like liga 'little', adora 'big', cilla 'tall', isca 'fat', or reggia 'cheerful', can be combined with the name: a quede named Adrisci may be called Liga-Adrisci or Isca-Adrisci to distinguish her from the other Adriscis. Many of these adjectives no longer carry a strong meaning when used to create nicknames, and certainly none of these are considered negative in any particular way! (That's not to say no quede would refer to another via a pejorative name, but the queder consider it remarkably bad luck to coin a negative nickname that gains any kind of usage!)
  • Various endings can be used to create stock diminutives or pet names, often replacing a final consonant if it exists: -ye, -tta, and -pan are the most common, to the degree that Étuye is sometimes used as a generic name for a given unspecified quede! Still others exist: -en, -an, -gni, -ra, and -qua are all well-attested, and a dozen others might be gathered in any given quede settlement.
  • Some endings, over time, are even added on top of yet other endings: you're as likely to meet an Étuyetta as you are an Étuttaye.
  • Longer compounds are somewhat less common, but by no means unheard of, especially those made with ta 'of'. They may reference an occupation, as in Yanna-ta-Rescar "Yanna-of-Arms" or Yanna-ta-Gieso "Yanna-of-Fish", or they may reference a place of birth or living, as in Yanna-ta-Ciama "Yanna-of-the-Woods" or Yanna-ta-Dar "Yanna-of-the-Bay". These usually appear with a shortened form of the name, as well: indeed, Pénn-ta-Dar is a figure of local legend among the quede towns of the tallgrass plains in the north.

Consequently, despite a wealth of Étuns and Pégnos and Yannas and Adriscis in the official ledgers, a given quede might not know anyone in their home-town by the same name: one of them may be Liga-Étun, another Étunni, another Étuye, and another Étt-ta-Quami, or one of any of dozens of other variations, and no-one would dream of mistaking one Étun for another.

2018-10-29

Reški Cuisine

When you ask someone who has only ever heard second-hand stories of the rešêk about their cities, they might begin by telling you of their reputed skill with stone- and metal-working, of the deep mines of Thabatnûk or the shimmering canals of liquid silver in the workshops of Rustân Phebašerga Kafthesdut the God-Smith. When you ask someone who has spent time with the rešek, however, they will often talk first about the food. The rešêk are amazing farmers, capable of growing hearty vegetables and succulent fruits even in rocky, sandy soil, and they are equally capable of turning those plants into spectacular, mouth-watering dishes.

Rešk food is heavy in vegetables and grains, but also in seafood: the latter may come as a surprise to many who have not visited the subterranean or semi-subterranean reški cities, but many swimming creatures adapt readily to subterranean life and then can be farmed in controlled underground lakes and canals. On the other hand, larger animals like cattle or swine are hard to raise in most of the areas where reški live, and such meats are a rare treat for the rešek. Beef and pork, when they are available, are often preserved through drying or curing and then used in small portions. Such meats rarely serve as a centerpiece dish on their own, and when they do, they are often used in place of goat, a meat which is easier to come by in the rugged environments where the reški often live.

Reški cuisine involves many preservation strategies, including drying, fermenting, and pickling. Many reški meals involve jams and jellies both savory and sweet, pickled pieces of vegetables, fruits, and meats, and pungent fermented mixtures of vegetables and fruits. The most famous reški preserve is called thûrbuk: a mashed and fermented paste made from mushrooms, peppers, and various spices. Thûrbuk is by far the most common condiment in reški meals, and you're likely to see a jar of it on almost every reški table. The reški predilection for fermentation extends to their beverages, which include both high-alcohol distillates like nakhat—a grain alcohol aged in hand-hewn granite vessels, sometimes referred to as 'stone-aged whiskey' by other peoples—as well as everyday drinks like the fizzy fermented fruit juices called suthur, which are also sometimes colloquially called 'small wines' (despite the fact that they are weak enough that they're effectively non-alcoholic!)

Reški cuisine also prominently features a thick, paddle-shaped variety of bread called kegran, which is baked in hot stone ovens and has a crusty exterior with a soft, airy interior. A loaf of kegran is all but guaranteed at almost every meal, regardless of time or setting. Its shape is circular but with a long protrusion called the ašbetik ('pan-handle') which is sometimes used for handling the loaf itself, as it cooks more quickly than the rest of the loaf and is usually positioned pointing towards the baker as the loaf bakes. In some places, a rešk will avoid eating this handle-shaped part: as the ašbetik ends up crunchier than the rest of the loaf, it was seen as less desirable, and only those who could not afford a proper meal would stoop to eating it. Despite this cultural association, some reški still prize the ašbetik for its crunchy savoriness, especially when paired with heaping spoonfuls of thûrbuk.

Some other common reški dishes include:

  • Hethun, which is a salty broth made by a week-long boiling of certain kinds of stones with dustings of moss, which together impart a mineral flavor and a mild saltiness to the resulting liquid. A bottle of hethun is often kept on-hand as a refreshing drink during hard labor, but it also serves as a base for many other dishes.
  • Bâkutand, which is a salad of spiced fermented root vegetables, usually potatoes and radishes, served cold with a drizzle of nut oil.
  • Dushâmpek, which is animal skin (usually chicken or salmon) wrapped around sticks of cucumber (or, more rarely, carrot), fried, and served with a generous drizzle of thûrbuk. You will find at least one seller of dushâmpek at almost every marketplace, if not two or three.
  • Uphasdît, which is marinated sliced fish (often trout, but sometimes salmon) and mushrooms, left overnight with chilis and spices and then sautéed quickly in a hot pan, usually eaten on top of torn chunks of a loaf of kegran.

2018-10-28

Tir-Bhahat

Something I've considered writing for a very long time is a system-agnostic, largely setting-agnostic tabletop sourcebook that's just descriptions of fantastic cultures and peoples, intended as fragments that could inspire new, interesting worlds instead of prescribing a specific concrete world. I've had scattered notes on this for a while, but I've now decided to start putting pieces of it out into the world in the form of blog posts instead of waiting until it's "complete". These posts will all be small pieces of pure fantasy worldbuilding, not tied to any particular story or game, describing some small part of the cultures of fantastic peoples in a fantastic world. My current working title for this project is Tir-Bhahat, so these posts can all be read under the #tir-bhahat tag.

I also want to give the story of why I was inspired to do this, because this is a very old idea of mine: its origins come from my high school days, and arose from the collision of three things:

  • An awkward, poorly-planned, terribly-run Dungeons and Dragons 3E campaign I put together for my high school friends
  • My budding interest in natural languages and high-school habit of buying cheap grammars and phrasebooks for languages I had no intention of ever speaking fluently
  • My lifelong penchant for worldbuilding and constructing languages

One thing I discovered as I was reading the Dungeons and Dragons player's handbook was that I was really dissatisfied with fantasy naming. The Dungeons and Dragons books had a set of suggested names for each kind of fantastic humanoid—elves, dwarves, halflings, and so forth—and these names felt by and large uninterestingly English-like in phonology. They differed from each other mostly in terms of consonant distribution: that is to say, a Dwarvish name would have more k's and g's, while an Elvish one would have more l's and q's, but you'd rarely find other major differences in terms of the kinds of consonants and vowels that would show up, or overall shape of the words. And beyond that, while you could usually distinguish an Elvish name from a Dwarvish one—possibly because they were all Tolkien pastiche, anyway—you'd have a much harder time distinguishing a Gnomish name from a Halfling name, because they all read so similarly.

I immediately set out to rectify this. I sat down with a set of natural-language inspirations for each fantasy language, and then came up with phonologies and tendencies, and then wrote programs so that I could generate names that adhered to those phonologies. My intention was that a player who had played in my games for long enough could, from just the impression and sound of a name, tell what sort of person it belonged to, in the same way that an English-speaker who speaks no other languages can nonetheless often tell a German name from a Hindi name from a Mandarin name.

Many of these phonologies were based loosely on real languages, often chosen somewhat arbitrarily. For example, the language of the gnomes, I decided, had a deeply Slavic flavor to it, resulting in Gnomish names like Ussybneča Kadrey Ažbardzo, while the language of the halflings was loosely based on Romance languages, resulting in Halfling names like Pégno Telbasci. Importantly to me, none of these efforts were intended to be constructed languages like Tolkien's Elvish languages: they didn't feature grammars or vocabularies, but instead were just guidelines about how words should look and sound. I did eventually write up phonologies for every major language mentioned in the Dungeons and Dragons books I had at the time.

Over time, though, the scope of this project got larger as I realized my dissatisfaction about the sounds of fantasy languages extended to other aspects of fantasy cultures such as food. What do fantasy peoples eat? Many of those sourcebooks would tell you that Elves eat fruits and probably a nourishing cracker-like bread, while dwarves eat mushrooms and drink ale, and halflings eat lots of bread. But when you look at real-world cultures, you find cuisines which are much deeper and more varied. You might find a dish of noodles and mushrooms in Italian cuisine as well as in Chinese cuisine, but even then, those dishes will surface with wildly different composition, flavor, and focus. Compared to the richness and variety of foods that appear in the real world, the fantasy standby of "elves eat fruits" feels remarkably simplistic and boring.

Since then, this idea has been on my mind sporadically, so I've accumulated a lot of notes on possibilities for fantastic languages and names and foods and clothing and family structure and cities. However, despite having considered doing so on a number of occasions, I've never taken all these notes—some of the oldest dating back more than a decade—and turned them into a cohesive whole.

Consequently, my plan is now to start polishing them piece-by-piece, with no particular focus or order, and start posting these scattered notes as blog posts instead. The current drafts are no longer deeply tied to Dungeons and Dragons, or any specific fantasy world. They make some cursory efforts to be compatible with existing fantasy cliché—stipulating that dwarves live underground, for example—but also try to avoid the worst and most problematic parts of fantasy cliché, such as the concept of intelligent, sentient beings who are "always chaotic evil". These posts are exercises in world-building for its own sake, vignettes of fantastic cultures that ideally should be simultaneously grounded and yet fantastic.

2016-05-20

Books, Blogs, and Burglars

I recently finished reading A Burglar's Guide to the City by Geoff Manaugh, of BLDGBLOG fame. In general, I liked it: it was a fun and interesting read on a topic that I love, because I am a huge fan of heists and criminal masterminds and whatnot. The thematic center of the book is the idea that the way burglars use architecture can reveal interesting things about architecture and its use, and it explores a lot of related topics.

I think the book suffered in one way, and the way that it suffered is kind of interesting to me: until about halfway through the book, I felt kind of lukewarm about it, and I had some time to think about it before I picked the book up again. My knee-jerk impression was that it was meandering, but on reflection, that didn't make sense, because I like it when books meander. It was meandering in a way that I didn't care for, which is not a common phenomenon.

Manaugh is a talented writer. Maybe one who could benefit from a modicum of restraint—he will occasionally indulge in some florid phrasing in ways that distract from the topic instead of serving it, but that's rare enough, and in general he's good at conveying the detached, speculative, otherworldly sense of place that's pervasive in the kinds of things he writes about—but by and large his writing is talented and effective. I love his writing on BLDGBLOG, where he drifts back and forth between quotation-heavy journalistic prose and adjective-heavy narrative prose, setting up situations and then exploring them in interesting ways.

This prose is still present in the Burglar's Guide, but that was, in its own way, part of the problem. When I came to the book, I came to it as an argument: it had a central thesis, which is explicitly brought up and contrasted with competing theses at times, and a structure, chapter headings with topics and progressions. Except that none of those really exist: the thesis is used more thematically than concretely, the chapter headings and topics are only loosely correlated with their actual contents—with the exception of the chapter on burglary tools, most chapters consist of chunks that could have been rearranged and renamed with little to no effect on the book—and the thesis disappears regularly and reappears whenever a new detail needs to get pinned back to the central thread.

After I had thought about these things, I continued the book but consciously tried to read it while ignoring chapter breaks, headings, and the macro-scale structure, and it became a much better book. I started trying to read it not as a book developing a theme, but as a collection of blog posts under a slightly idiosyncratic and oddly specific topic, a bunch of posts all tagged as Architecture, True Crime. In that frame, it was much more successful.

I think that the Burglar's Guide really should have been presented in this way from the beginning: the chapter structure should have been reworked and tossed, and the sections should have either been headed with their topic or thrown in as chunks in a larger soup of a book, probably with an introduction and conclusion that talk about the larger themes. The macro-scale structure of the book as it exists is largely a fiction, and trying to read the book as adhering to that structure produces a (to me) slightly annoying mismatch when compared to the text itself. A book that didn't make any pretense of having structure would have been, I think, more successful.

That said, the book is otherwise very entertaining, and I would in general recommend it. I wanted to explore this idea as a good but purely textual example of a work being at odds with its medium, even when 'medium' here is defined narrowly: this is a good book that, interestingly, would have shined even more had it been a somewhat different kind of book.

2013-11-18

Extending the Standard Streams

If I write my own shell—which I may very well do at some point—there's a particular process model I'd like to embed in it. To wit: in UNIX right now, each program in a pipeline has a single input stream and two output streams, with files/sockets/&c for other kinds of communication.

A pipeline of functions foo | bar | baz looks kind of like this

keyboard -> stdin\   /stdout -> stdin\   /stdout -> stdin\   /stdout ---+
                  foo                 bar                 baz           |
                     \stderr -+          \stderr -+          \stderr -+ |
                              |                   |                   | |
                              v                   v                   v v
                            [..................terminal...................]

Which works pretty well. You can do some pretty nice things with redirecting stderr to here and stdin from there and so forth, and it enables some nice terse shell invocations.

I'd like that basic system to be preserved, but with the ability to easily create other named streams. For example, imagine a hypothetical version of wc which still outputs the relevant data to stdout, but also has three other streams with these names:

                 / newlines
                 | words
stdin -> wc-s -> | bytes
                 | stdout
                 \ stderr

You can always see the normal output of wc on stdout:

gdsh$ wc-s *
       2       3       6 this.txt
      10      20      30 that.c
     100    1000   10000 whatever.py

But you could also extract an individual stream from that invocation using special redirection operators:

gdsh$ wc-s * stdout>/dev/null bytes>&stdout
6
30
10000

We could also have multiple input channels. I imagine an fmt command which can interpolate named streams, e.g.

gdsh$ printf "1\n2 3\n" | wc-s | fmt "bytes: {bytes}\nwords: {words}\nnewlines: {newlines}\n"
bytes: 6
words: 3
newlines: 2
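
To make the idea of named streams a little more concrete, here is a rough Python model of it. This is only a sketch under assumptions: wc-s and fmt are the hypothetical commands described above, and the functions `wc_s` and `fmt` below model each named stream as a string stored in a dictionary rather than as a real file descriptor.

```python
def wc_s(text: str) -> dict[str, str]:
    """Model of the hypothetical wc-s: every count goes out on its own named stream."""
    newlines = text.count("\n")
    words = len(text.split())
    nbytes = len(text.encode("utf-8"))
    return {
        "newlines": f"{newlines}\n",
        "words": f"{words}\n",
        "bytes": f"{nbytes}\n",
        # stdout keeps the familiar wc layout: lines, words, bytes
        "stdout": f"{newlines:8d}{words:8d}{nbytes:8d}\n",
        "stderr": "",
    }


def fmt(template: str, streams: dict[str, str]) -> str:
    """Model of the hypothetical fmt: replace each {name} with that stream's contents."""
    # Drop the trailing newline of each stream so values can sit inline in the template.
    values = {name: data.rstrip("\n") for name, data in streams.items()}
    return template.format(**values)


streams = wc_s("1\n2 3\n")
print(fmt("bytes: {bytes}\nwords: {words}\nnewlines: {newlines}\n", streams), end="")
```

A real implementation would pass file descriptors between processes instead of strings, but the dictionary captures the shape of the interface: one producer, several independently addressable outputs, and a consumer that picks them out by name.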

We can then have a handful of other utilities and built-in shell operators for manipulating these other streams:

  # the `select` command takes a stream name and outputs it
gdsh$ wc-s * | select words
3
20
1000
  # here we redirect stdout to the stream X and pass it to fmt
gdsh$ cat this.txt stdout>&X | fmt "this is {X}\n"
this is 1
2 3
  # the same, using file redirection operators
gdsh$ fmt "this is {X}\n" X<this.txt
this is 1
2 3
  # the same, using a shorthand for setting up a stream by taking
  # the stdout from some command
gdsh$ !X='cat this.txt' fmt "this is {X}\n"
this is 1
2 3
  # the same, using a shorthand for setting up a stream by just
  # reading and outputting a file
gdsh$ @X=this.txt fmt "this is {X}\n"
this is 1
2 3
  # using a shorthand for filling in a stream with a string directly
gdsh$ ^Y=recidivism fmt "Y is {Y}\n"
Y is recidivism
  # redirecting each output stream to a different file
gdsh$ wc-s * words>words.txt bytes>bytes.txt newlines>newlines.txt
  # using a Smalltalk-like quoting mechanism to apply different shell
  # commands to different streams
gdsh$ wc-s * | split words=[sort >sorted-word-count.txt] bytes=[uniq >uniq-bytes.txt]

This could also enable new idioms for programs and utilities. For example, verbose output, rather than being controlled by a flag to the program, could be always output to a (possibly unused) stream called verbose, so the verbose output could be seen by redirecting the verbose stream (or by logging the verbose output while only seeing the typical stderr messages):

  # here we only see stderr
gdsh$ myprog
myprog: config file not found
  # here we ignore stderr and see only the verbose output
gdsh$ myprog stderr>/dev/null verbose>&stderr
Setting up context
Looking in user dir... NOT FOUND
Looking in global dir... NOT FOUND
myprog: file not found
Tearing down context
Completed
  # here we see stderr but log the verbose output
gdsh$ myprog verbose>errmsgs
myprog: config file not found

Or maybe you could have human-readable error messages on stderr and machine-readable error messages on jsonerr:

  # here is a human-readable error message
gdsh$ thatprog
ERROR: no filename given
  # here is a machine-readable error message
gdsh$ thatprog stderr>/dev/null jsonerr>&stderr
{"error-type":"fatal","error-code":30,"error-msg":"no filename given"}

Or you could have a program which takes in data on one stream and commands on another:

  # someprog takes in raw data on the stream DATA, and commands
  # on the stream CMDS. Here we take the data from a local file
  # and accept commands from the network:
gdsh$ @DATA=file.dat !CMDS='nc -l 8000' someprog
  # ...and here we have a set of commands we run through locally
  # while taking data from the network:
gdsh$ !DATA='nc -l 8001' @CMDS=cmds.txt someprog

There are other considerations I've glossed over here, but here are a few notes, advantages, and interactions:

  • I glossed over the distinction between input/output streams. In practice, the shell has no trouble disambiguating the two, but a given program may wish to consider the distinction between words.in and words.out; to this end, we could rename the existing streams std.in and std.out and err.out (it being an error to read from err.in in most cases[1]).

  • This is obviously not POSIX-compliant, but could be made to work with the existing UNIX process model by e.g. having a standard environment variable for stream-aware programs to look at which maps stream names to file descriptors. That way, programs which don't expect these special streams still use fds 0, 1, and 2 as expected, while programs that do handle these can read STREAM_DESC to find out which 'streams' correspond to which file descriptors. In that case, you can almost use these commands with an existing shell by doing something like

    sh$ echo foo >&3 | STREAM_DESC='foo.in:3' fmt "foo is {foo}\n"
    

    where STREAM_DESC takes the form of streamname:fd pairs separated by spaces.

  • If we do use an existing UNIX system to write this, then we also should integrate libraries for this, and the API for it is unknown. Presumably a C interface would have int get_stream(char* stream_name) that could return -1 on failure. get_stream would look through the STREAM_DESC environment variable to find the relevant stream and return the fd mentioned there, otherwise failing. This does mean that you have to create your streams before you use them, which I think is reasonable.

  • This would interact really interestingly with a semi-graphical shell[2] that could visualize the stream relationships between commands as well as a shell with a higher-level understanding of data types.[3]
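
The STREAM_DESC lookup described in the notes above can be sketched in a few lines. This is a Python stand-in for the C interface, not a settled API: the helper name `parse_stream_desc` is made up for illustration, and the format assumed is the one given above (streamname:fd pairs separated by spaces), with -1 returned when a stream is not listed.

```python
import os


def parse_stream_desc(desc: str) -> dict[str, int]:
    """Parse 'name:fd name:fd ...' into a mapping from stream name to fd."""
    table = {}
    for pair in desc.split():
        name, sep, fd = pair.rpartition(":")
        if not sep or not name or not fd.isdigit():
            continue  # assumption: malformed entries are silently skipped
        table[name] = int(fd)
    return table


def get_stream(stream_name: str) -> int:
    """Return the fd mapped to stream_name in STREAM_DESC, or -1 if there is none."""
    table = parse_stream_desc(os.environ.get("STREAM_DESC", ""))
    return table.get(stream_name, -1)


os.environ["STREAM_DESC"] = "foo.in:3 verbose.out:4"
print(get_stream("foo.in"))       # 3
print(get_stream("verbose.out"))  # 4
print(get_stream("jsonerr.out"))  # -1, because no such stream was set up
```

A stream-aware program would then read or write the returned descriptor (for example with os.fdopen), while a program that never looks at STREAM_DESC keeps using fds 0, 1, and 2 as usual.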

So those are some ideas that have been drifting around in my head for a while. No idea if I'll ever implement any of them, or if they'd even be worth implementing, but I might get around to it at some point. We'll see.


  1. I originally figured that err.in would be a useless stream, but after some thought, I can imagine a use for this. Let's say my programming language of choice, Phosphorus, outputs its error messages in XML format. This is great for an IDE, but now I need to debug my program on a remote server which doesn't have my IDE installed. I could have a program ph-wrapper that passes all streams through unchanged except for err.in, which it parses as XML and then processes to a kind of pretty-printed trace representation and passes it to its own err.out. So

    gdsh$ phosphorus src.ph
    Setting up program...
    <PhosphorusException>
      <ExceptionType>NoSuchIndex</ExceptionType>
      <ExceptionCode>44</ExceptionCode>
      <LineNumber>3</LineNumber>
      <StackTrace>
        ...
    gdsh$ phosphorus src.ph | ph-wrapper
    Setting up program...
    NoSuchIndex exception on line 3:
      x = args[3];
      ...
    

    So yes, I can imagine a class of programs which want to pay attention to err.in.

  2. Don't cringe. Look—the input device with the most information density is the keyboard, right? That's why you use the command line at all. However, graphical systems have more information density than pure-text systems. You can take a pure-text system and extend it with position and color to give it more information, and then with charts and graphs to give it more information, and so forth. What I'm proposing is not drag-and-drop, although that might be useful to some users; it's a keyboard-driven system that displays information in a more dense, information-rich style. I keep thinking of building this myself but for the massive herds of yaks I'd have to shave first. 

  3. PowerShell is the usual example given here, but I confess I haven't used it. Effectively, rather than streams of raw text, think streams of well-formed data types like JSON or s-expressions or some other kind of more elaborate information. wc might instead of outputting tab-separated numbers output lists of a fixed size, then. 

2013-10-17

Latka: Tags

Another new Latka feature: effectively, I want the ability to have abstract data types, so I'd like to hide the constructor/destructor of a type and only expose a particular interface. The way I'm doing this is with tags, which are a particular variation on runtime types that have a bit in common with data type constructors in most ML variants.

We speak of tags wrapping values. We declare a new tag either at the top level with

tag t

or in a local scope with

let tag t in (* ... *)

Once we have a tag, we can wrap a value with

x as t

and unwrap it by pattern-matching. This point is slightly unintuitive, and has been my only sticking point—assuming we have a tag t in scope, then

case x of
  y : t -> y

will extract a value that's been wrapped by tag t. This is not how pattern bindings usually work: it corresponds to a Haskell pattern like

case x of Tagged y t' | t == t' -> y

and not to the naïve Haskell translation

case x of Tagged y t -> y

So in this case, unification must also be aware of the values in (lexical) scope, which is a bit of an extension. I think I like this better than the alternatives, though.

A wrapped value cannot be used as though it were unwrapped, so the following would result in a runtime error:

let tag n in
  add.1.(2 as n)
  (* error: non-numeric argument to add *)

And if the tag is not in scope, then we can effectively encapsulate our data by providing various accessor functions to work with the tagged data, e.g.

<mkMyNum,getMyNum,addMyNum> := let tag myNum in
  < \ x : num . x as myNum
    \ _       . failure."Non-numeric argument to mkMyNum"
  , \ x : myNum . x
    \ _         . failure."Non-MyNum argument to getMyNum"
  , \ (x : myNum) (y : myNum) . (add.x.y) as myNum
    \ _                       . failure."Non-MyNum arguments to addMyNum"
  >
puts getMyNum.(addMyNum.(mkMyNum.2).(mkMyNum.3))
(* prints 5 *)
puts getMyNum.(addMyNum.3.(mkMyNum.3))
(* Failure: Non-MyNum arguments to addMyNum *)

One final quirk is that creating a new tag effectively creates a new token that's not visible to the user, so two tags with the same name are still two distinct tags. This fixes a problem with the original, naïve implementation of local type declarations in SML where you could write

let val f = (let datatype t = X of int  in fn (X n) => n end) in
let val x = (let datatype t = X of bool in X true end) in
  f x
end
end

which is of course nonsensical and breaks soundness and everything. In SML, this was solved by not allowing locally defined types to escape, which strikes me as unnecessarily draconian; we just need to be intelligent enough to know that the two t's above are distinct because they were declared in distinct locations. In doing so, SML would admit the sensical analogue to the Latka code above:

(* This doesn't work, but it'd be nice if it did *)
val (mkMyNum,getMyNum,addMyNum) =
  let datatype myNum = MyNum of int in
    ( fn n         => MyNum n
    , fn (MyNum n) => n
    , fn (MyNum n) =>
        fn (MyNum m) => MyNum (n + m)
    )
  end

In contrast, the following Latka code would print "Nope!"

f := let tag t in \ _ : t . "Yup!"
                  \ _     . "Nope!"
x := let tag t in 5 as t
puts f.x

i.e. despite both being wrapped with a tag named t, each declaration of a tag will produce a new, distinct tag. As Latka is disgustingly dynamic[1], coming across an expression let tag t in ... will likely increment some hidden local integer that gets used as the tag, but that is of course an implementation detail subject to change. (Also, because of the limited scope of this language's use, it's not like I have to worry about, say, sending data between processes or multicore or anything, which makes these decisions much easier.)
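The hidden-counter idea can be sketched in Haskell as follows. This is only an illustration under assumptions: TagSupply, newTagSupply, freshTag and mkF are hypothetical names, and the real interpreter may do something quite different.

```haskell
import Data.IORef (IORef, newIORef, atomicModifyIORef')

-- Hypothetical representation: a tag is just the integer it was minted with.
newtype Tag = Tag Int deriving (Eq, Show)

-- Hypothetical value type: plain numbers and tagged wrappers.
data Value = Num Int | Tagged Value Tag

-- The hidden counter behind `let tag t in ...`.
newtype TagSupply = TagSupply (IORef Int)

newTagSupply :: IO TagSupply
newTagSupply = TagSupply <$> newIORef 0

-- Mint a tag that no earlier call on this supply has returned.
-- The source-level name of the tag plays no part in its identity.
freshTag :: TagSupply -> IO Tag
freshTag (TagSupply ref) = atomicModifyIORef' ref (\n -> (n + 1, Tag n))

-- Stand-in for the function f in the Latka example: it answers "Yup!"
-- only for values wrapped by the tag it closed over.
mkF :: Tag -> Value -> String
mkF t (Tagged _ t') | t == t' = "Yup!"
mkF _ _ = "Nope!"
```

Minting one tag for f and a second one for x from the same supply gives two unequal tags, so applying mkF with the first tag to a value wrapped by the second yields "Nope!", matching the Latka example.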


  1. I was describing a Latka feature to a coworker, who suddenly stopped and said, "Wait, if Latka is dynamically typed, what's stopping you from having an expression like 5 | True | (\x.x)?" I told him, "...absolutely nothing." He was disgusted at the prospect. 

2013-10-01

Matzo Feature Consideration

I'm still working on the Matzo language, and I need to finish the parser (as I can now evaluate more programs than I can parse, and have code written to an as-yet unimplemented module spec). Here are a few features I'm considering but haven't implemented yet, and how they might interact:

Dynamic Default Variables

This opens a small can of worms, so to be clear, dynamic variables will reside in a different namespace than normal variables. I don't know yet whether said namespace will be denoted with a sigil (say #x) or whether you might have to pass a symbol to a function (like get.X) to access their values. The idea is that certain functions will have sensible defaults that one might want to override, and rather than explicitly matching on arity, one can instead pass them in dynamically using a particular syntax. In the examples below, I'll use an explicit syntax for lookup of a dynamically scoped variable; in particular, get.x.y will look up a dynamically scoped variable x (typically a symbol), and if it hasn't been supplied, will use the default value y.

foo :=
  let combine := get.Combine.cat in
    combine."a"."b"
bar := foo with Combine := (\ x y . x ";" y)

puts foo
(* prints "ab", as it uses the default combiner `cat` *)
puts bar
(* prints "a;b" *)

This will continue down dynamically, so

s1 := (get.Pn."e") " stares intently."
s2 := "It is clear that " (get.Pn."e") 
        " is interested in what is happening."
sent := se.<s1,s2>
puts sent
puts sent with fixed Pn := "he" | "she" | "e"

will force Pn to be the chosen value in all subexpressions evaluated underneath the with clause. (Notice that nondeterminism still works in dynamically computed variables, so you must declare them as fixed at the binding site if you want them to be computed exactly once.)
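One way to picture the lookup-with-default behaviour is as an explicit environment threaded through every subexpression. The Haskell sketch below is an assumption-laden model, not Matzo's design: Env, Dyn, getDyn and withDyn are hypothetical names, values are restricted to strings, and neither nondeterminism nor the capitalization done by se is modelled.

```haskell
import Data.Maybe (fromMaybe)

-- Hypothetical dynamic environment: the most recent binding comes first.
type Env = [(String, String)]

-- A computation that may consult dynamically scoped variables.
type Dyn a = Env -> a

-- Like `get.x.y`: look up a dynamic variable, falling back to a default.
getDyn :: String -> String -> Dyn String
getDyn name def env = fromMaybe def (lookup name env)

-- Like `e with X := v`: evaluate `body` with one extra binding in force
-- for everything evaluated underneath it.
withDyn :: String -> String -> Dyn a -> Dyn a
withDyn name val body env = body ((name, val) : env)

s1, s2, sent :: Dyn String
s1 env = getDyn "Pn" "e" env ++ " stares intently."
s2 env = "It is clear that " ++ getDyn "Pn" "e" env
           ++ " is interested in what is happening."
sent env = s1 env ++ " " ++ s2 env
```

Here sent [] uses the default pronoun in both sentences, while withDyn "Pn" "she" sent [] overrides it in both at once, because the binding is visible to every subexpression below the override.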

Text Combinators

I've used one of these above: right now, I have a few planned.

puts wd.<"a","b","c">
(* prints "abc" *)

puts nm.<"a","b","c">
(* prints "Abc" *)

puts se.<"a","b","c">
(* prints "A b c." *)

puts pa.<"a","b","c">
(* prints "A. B. C." *)

So effectively, they function as ways of combining strings in a more intelligent way. I plan for them to do some analysis of the strings so that they don't, say, produce extraneous spaces or punctuation. (The names are of course short for word, name, sentence, and paragraph, respectively, and may very well be aliases for those longer names.)
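As a rough idea of what such combinators could look like, here is a Haskell sketch. The exact rules are assumptions made for illustration: in particular, this version of se trims stray whitespace, capitalizes the first letter, and adds a closing period only when the text does not already end in sentence punctuation. The helper names capitalize and trim are hypothetical.

```haskell
import Data.Char (isSpace, toUpper)
import Data.List (dropWhileEnd)

-- Upper-case the first character, if there is one.
capitalize :: String -> String
capitalize []       = []
capitalize (c : cs) = toUpper c : cs

-- Strip leading and trailing whitespace.
trim :: String -> String
trim = dropWhileEnd isSpace . dropWhile isSpace

-- word: plain concatenation.
wd :: [String] -> String
wd = concat

-- name: concatenation with a leading capital.
nm :: [String] -> String
nm = capitalize . wd

-- sentence: space-separated, capitalized, with exactly one terminator.
se :: [String] -> String
se parts
  | null body              = ""
  | last body `elem` ".!?" = body
  | otherwise              = body ++ "."
  where
    body = capitalize (unwords (filter (not . null) (map trim parts)))

-- paragraph: every element becomes its own sentence.
pa :: [String] -> String
pa = unwords . filter (not . null) . map (se . (: []))
```

Because se trims each piece and checks the final character, se ["foo ", " bar", "baz."] gives "Foo bar baz." rather than doubling up spaces or periods.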

Rebindable Syntax

The conjunction of the previous two suggests that certain bits of syntax should be rebindable. A prominent example is concatenation, e.g.

puts "foo" "bar" "baz"
(* prints "foobarbaz" *)

puts "foo" "bar" "baz" with Cat := se
(* prints "Foo bar baz.", as normal string concatenation
 * has been overloaded by `se` *)

puts "foo" "bar" "baz" with Cat := fold.(\ x y . x ";" y)
(* prints "foo;bar;baz", as normal string concatenation
 * has been overloaded by a custom function *)

This could lead to some pretty phenomenal weirdness, though:

f := \ x y . add.x.y
puts f.<1,2> with App := \ f x . fold.(\ g y . g.y).(append.f.x)
(* prints 3, as we have overloaded function application to
 * automatically uncurry functions *)

...so maybe there should be a limit on it.

Error Handling

Still not sure on this one. I don't particularly want to bring monads into this, mostly because I want the language to be a DSL for strings and not a general-purpose programming language, but at the same time, it might be nice to have a simple exception-like mechanism. One idea I was playing with was to implement a backtracking system so that errors (both those raised by users and those caused by built-in problems) could simply resume at particular points and retry until some retry limit is reached. For example, you could reject certain unlikely combinations:

x ::= a b c d
wd := let result := x x x x
      in if eq.result."aaaa" then raise Retry else result
puts mark Retry in wd

Here, mark exp in wd corresponds roughly to the following imperative pseudocode:

{ retry_count := 0
; while (retry_count < retry_max)
    { try { return wd; }
      catch (exp) { retry_count ++; }
    }
}

It's a much more limited form of exception handling, which may or may not be desirable, but does give you some ability to recover from errors so long as at least some execution of your program will be error-less.
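The same retry loop can be written as a small Haskell function. This is a sketch under assumptions: failure is modelled as Nothing rather than as a raised exception, and retrying and acceptResult are hypothetical names.

```haskell
-- Run `attempt` up to `retryMax` times and return the first success.
-- A `Nothing` result plays the role of `raise Retry`.
retrying :: Monad m => Int -> m (Maybe a) -> m (Maybe a)
retrying retryMax attempt = go 0
  where
    go n
      | n >= retryMax = pure Nothing          -- retry budget exhausted
      | otherwise = do
          r <- attempt
          case r of
            Just x  -> pure (Just x)          -- success: stop retrying
            Nothing -> go (n + 1)             -- failure: count it and retry

-- The check from the Matzo example: reject the unlikely result "aaaa".
acceptResult :: String -> Maybe String
acceptResult r = if r == "aaaa" then Nothing else Just r
```

Given some random generator for the four-letter word (not shown here, and hypothetical), the marked expression would be roughly retrying retryMax (acceptResult <$> generator). Unlike the pseudocode, this version makes the exhausted-budget case explicit by returning Nothing.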

All this is heavily open to change, so we'll see.

2013-09-17

A Petty And Insignificant Complaint About Haskell Records

Haskell records have lots of problems. Here's one that came up for me today.

You are allowed to export record members without exporting the constructor, for example, if you want to ensure some property is true of the constructed values. In the following example, the field isNeg is effectively a function of the field num:

module Foo(Rec, mkRec, num, isNeg) where

data Rec = Rec
  { num   :: Int
  , isNeg :: Bool
  }

mkRec :: Int -> Rec
mkRec n = Rec n (n < 0)

Another module can't use the Rec constructor, but can observe the values using the exported accessors:

module Bar where

import Foo

addRecs :: Rec -> Rec -> Rec
addRecs r1 r2 = mkRec (num r1 + num r2)

Unfortunately, there's a hole here, which is that exporting the accessors allows us to use record update syntax, which means that we can now construct arbitrary values:

constructAnyRec :: Int -> Bool -> Rec
constructAnyRec n b = (mkRec 0) { num = n, isNeg = b }

There is a way around this, namely, by rewriting the original module with manual accessors for num and isNeg:

module Foo2(Rec, mkRec, num, isNeg) where

data Rec = Rec
  { _num   :: Int
  , _isNeg :: Bool
  }

num :: Rec -> Int
num = _num

isNeg :: Rec -> Bool
isNeg = _isNeg

mkRec :: Int -> Rec
mkRec n = Rec n (n < 0)

However, I'd assert that, morally, the correct thing to do would be to disallow record update entirely if the constructor is not in scope. The purpose of hiding the constructor at all is to ensure that a programmer must perform certain computations in order to construct a valid value, e.g. to enforce invariants on constructed data (as I'm doing here), or to avoid the possibility of pattern-matching on data. If a programmer hides a constructor but exports its accessors, then generally I'd assert it's because of the former reason, so it would be sensible to prevent record update, as you could always write your own updates if you so desire.
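For instance, a hand-written update that keeps the invariant might look like the sketch below. The modifyNum function is a hypothetical addition, and the module header is left as a comment so the snippet can be loaded on its own.

```haskell
-- In a real module the header might read (modifyNum is a hypothetical export):
--   module Foo2 (Rec, mkRec, num, isNeg, modifyNum) where

data Rec = Rec
  { _num   :: Int
  , _isNeg :: Bool
  }

-- The only way to build a Rec from outside: isNeg is derived from num.
mkRec :: Int -> Rec
mkRec n = Rec n (n < 0)

-- Plain functions rather than exported fields, so record update syntax
-- is unavailable to importing modules.
num :: Rec -> Int
num = _num

isNeg :: Rec -> Bool
isNeg = _isNeg

-- A hand-written update: it rebuilds the record through mkRec, so the
-- isNeg field can never drift out of sync with num.
modifyNum :: (Int -> Int) -> Rec -> Rec
modifyNum f r = mkRec (f (num r))
```

A client can still change the number, for example with modifyNum (subtract 5), but every route to a new Rec goes through mkRec, so the invariant holds by construction.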

Of course, pointing out this flaw in light of the other problems with the Haskell record system is like complaining about the in-flight movie on a crashing plane, but still.

2013-09-05

The Literary Offenses of George Martin

It's fun to reread Mark Twain's Fenimore Cooper's Literary Offenses and mentally replace "Deerslayer" with "Game of Thrones". Somehow, the essay stays exactly as apt.

There are nineteen rules governing literary art in domain of romantic fiction -- some say twenty-two. In "Game of Thrones," Martin violated eighteen of them. These eighteen require:

  1. That a tale shall accomplish something and arrive somewhere. But the "Game of Thrones" tale accomplishes nothing and arrives in air.

  2. They require that the episodes in a tale shall be necessary parts of the tale, and shall help to develop it. But as the "Game of Thrones" tale is not a tale, and accomplishes nothing and arrives nowhere, the episodes have no rightful place in the work, since there was nothing for them to develop.

  3. They require that the personages in a tale shall be alive, except in the case of corpses, and that always the reader shall be able to tell the corpses from the others. But this detail has often been overlooked in the "Game of Thrones" tale.

  4. They require that the personages in a tale, both dead and alive, shall exhibit a sufficient excuse for being there. But this detail also has been overlooked in the "Game of Thrones" tale.

  5. They require that when the personages of a tale deal in conversation, the talk shall sound like human talk, and be talk such as human beings would be likely to talk in the given circumstances, and have a discoverable meaning, also a discoverable purpose, and a show of relevancy, and remain in the neighborhood of the subject at hand, and be interesting to the reader, and help out the tale, and stop when the people cannot think of anything more to say. But this requirement has been ignored from the beginning of the "Game of Thrones" tale to the end of it.

  6. They require that when the author describes the character of a personage in the tale, the conduct and conversation of that personage shall justify said description. But this law gets little or no attention in the "Game of Thrones" tale, as Daenerys Targaryen's case will amply prove.

  7. They require that when a personage talks like an illustrated, gilt-edged, tree-calf, hand-tooled, seven-dollar Friendship's Offering in the beginning of a paragraph, he shall not talk like a Dothraki stereotype in the end of it. But this rule is flung down and danced upon in the "Game of Thrones" tale.

  8. They require that crass stupidities shall not be played upon the reader as "the craft of the Night's Watch, the delicate art of the forest," by either the author or the people in the tale. But this rule is persistently violated in the "Game of Thrones" tale.

  9. They require that the personages of a tale shall confine themselves to possibilities and let miracles alone; or, if they venture a miracle, the author must so plausibly set it forth as to make it look possible and reasonable. But these rules are not respected in the "Game of Thrones" tale.

  10. They require that the author shall make the reader feel a deep interest in the personages of his tale and in their fate; and that he shall make the reader love the good people in the tale and hate the bad ones. But the reader of the "Game of Thrones" tale dislikes the good people in it, is indifferent to the others, and wishes they would all get drowned together.

  11. They require that the characters in a tale shall be so clearly defined that the reader can tell beforehand what each will do in a given emergency. But in the "Game of Thrones" tale, this rule is vacated.

In addition to these large rules, there are some little ones. These require that the author shall:

  1. Say what he is proposing to say, not merely come near it.

  2. Use the right word, not its second cousin.

  3. Eschew surplusage.

  4. Not omit necessary details.

  5. Avoid slovenliness of form.

  6. Use good grammar.

  7. Employ a simple and straightforward style.

Even these seven are coldly and persistently violated in the "Game of Thrones" tale.

Frankly, I don't dislike Game of Thrones, but as a television show, at least, it has more in common with theme-park rides or pornography or the movie Avatar than it does with serious works of fiction. Game of Thrones exists for the experience of Westeros, which is why every conversation involves ten sentences of fake formalities and faux-medieval dialogue for every relevant fact, while each of the thousand paper-thin characters has a single goal and pursues that goal while evincing their single defining trait (Littlefinger is duplicitous and protects Catelyn; Tyrion is clever and pursues power; Joffrey is a jackass and pursues more jackassery).

Game of Thrones is not meant to have any kind of depth or complexity; it's meant to appear that there is depth and complexity just behind the curtain, so that as you sit with your arms and legs inside the ride, you feel like the characters are people instead of animatronic dummies. Space Mountain can't take you to space, and pornography can't make you feel sexual intimacy, but they can emulate the feeling of it. Game of Thrones emulates the feeling of experiencing a complicated and nuanced story without ever giving you the story. It is, in that sense, effective without ever being good.