yellowpigs.net

Lashell Tips & Tricks

This is a quickstart/reference, and is not intended as a substitute for the longer tutorials.

What is Lashell?

From the Officially Unofficial Lashell Programming Language Manual:

Lashell is an update of shell programming to a modern paradigm
supporting recursive lazy streams, strong typing, map-reduce on
massively parallel clusters, exceptions (alerts), the well-typed
lambda calculus, dynamic RESTful web template frameworks, and
ponies.

Well, not ponies. (Unless invoked with the -p option for little
ponies or the -P option for big ponies. It is recommended to always
use --replicated in conjunction with these options, as the default
behavior --sharded can be quite alarming.)

But everything else!

Data type reference:

Unusual syntax reference:

Syntax should be generally familiar to programmers with experience in Bash, Python, and functional programming languages (e.g., Lisp/Scheme, Haskell, Erlang, Scala), but there are a few innovations unique to Lashell. More detailed explanations follow in subsequent sections of this doc.

Style suggestions:

Use triple space indentation! Absolutely do not use tabs, as tabs will be interpreted as buffer overflows. (For an explanation of why, see the epic discussion on the -devel mailing list.)

Avoid using line numbers and goto, unless a specific algorithm cannot be expressed without them. (See Dijkstra 1968.)

80-character line length is recommended, although pipelines and regular expressions may be as long as 160 and 240 characters respectively without significantly diminishing readability.

See also the official style guide.

Gotchas:

Here are some things that commonly trip up people new to Lashell.

/ is the division operator. // is the filesystem root.

Comments start with --; debug messages start with # and are automatically printed out unless the program is run with the -q (--quiet) option.

/etc/passwd is a file. /etc/ is a persisted ("real") directory. %etc/passwd is just an in-memory directory entry (mostly equivalent to a hash or dictionary entry).

Use builtins:

Don't forget that grep, sort, and cat are built in! The most recent version of Lashell also includes native implementations of awk and cowsay. You do not need to write your own.

If you need stderr, it's in /dev/stderr just like in bash.

Shard/merge notation:

Shard/merge is one of the hardest ideas in the Lashell language.

A pipeline emits a stream. The sharding operator <n| splits a single stream into n shards, or parallel streams, each of which is piped to a separate instance of the next pipeline.

<| creates as many streams (and next pipelines) as there are cores in the system, plus one.

The merge operator |m> merges any number of sharded streams into m arbitrary shards, each of which is piped to a separate instance of the next pipeline.

|> merges down to a single, non-parallel stream.

Important: Whatever is between the shard and merge operators is run in parallel. You probably don't want to write files in between shard and merge, or those files will be written over and over again if your shards happen to run on the same machine.

If you really need to, you can use @# notation to get the shard number:

  %some/dir/* <| thing | stuff | tee /tmp/save@# |> mergesort > $answer
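
For readers who think better in a mainstream language, here is a rough Python analogy of shard, parallel stage, and merge. It is only a sketch: the stage function and the round-robin sharding are my own assumptions for illustration, and nothing here claims to match how Lashell schedules shards.

```python
from concurrent.futures import ThreadPoolExecutor
import heapq
import os

def shard(items, n):
    """Split one sequence into n round-robin shards (loosely, the <n| step)."""
    return [items[i::n] for i in range(n)]

def stage(shard_items):
    """Hypothetical per-shard pipeline: square each value, then sort."""
    return sorted(x * x for x in shard_items)

def shard_and_merge(items, n=None):
    # Loosely mirrors <| : default to the core count plus one.
    n = n or (os.cpu_count() or 1) + 1
    with ThreadPoolExecutor(max_workers=n) as pool:
        results = list(pool.map(stage, shard(items, n)))
    # Loosely mirrors |> mergesort : merge sorted shards into one stream.
    return list(heapq.merge(*results))

answer = shard_and_merge([5, 3, 1, 4, 2], n=3)
```

Note that the stage does no file writes, which sidesteps the overwrite problem described above.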

Streams to scalars and vice versa:

A big confusion between shell and Lashell is the < operator. In shell, this reads from a file on the right. In Lashell, data always flows to the right in pipelines. The < operator takes a scalar on the left and expands it into a stream. Normally this is done by splitting on newlines.

Similarly, the > operator takes a stream on the left and writes it to a scalar. If that scalar is a directory entry, the entry is overwritten with the new value. Otherwise this is variable assignment.
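
If the scalar/stream distinction feels slippery, a loose Python analogy may help. The function names expand and collect are made up for this sketch; they only illustrate splitting a scalar on newlines and folding a stream back into one value.

```python
def expand(scalar):
    """Scalar to stream: yield one item per line."""
    yield from scalar.splitlines()

def collect(stream):
    """Stream to scalar: join the items into one newline-separated string."""
    return "\n".join(stream)

# Data flows left to right: scalar, then stream, then scalar again.
stream = expand("alpha\nbeta\ngamma")
answer = collect(line.upper() for line in stream)
```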

The -| tee operator:

  foo  -|  bar > bozo  ||  wibble > womp

Don't loop twice over the same stream: use a tee. Tees execute each of several pipelines in parallel. The || marker goes between the things to be done in parallel. This is different from the tee pipeline, inherited from shell, which simply writes to a directory value:

  foo | bar | tee %some/where | wibble
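
As a rough Python analogy for not looping twice over the same stream, itertools.tee hands the same items to two consumers. Big caveat: this sketch runs the consumers one after the other and buffers items in between, so it shows only the "one source, several pipelines" shape, not the parallelism. The names foo, bozo, and womp are borrowed from the example above purely for illustration.

```python
import itertools

def foo():
    """Hypothetical source stream that can only be read once."""
    yield from range(1, 6)

# Split the single stream so each branch sees every item.
branch_a, branch_b = itertools.tee(foo(), 2)

bozo = [x * 2 for x in branch_a]  # stands in for: bar > bozo
womp = sum(branch_b)              # stands in for: wibble > womp
```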

Functions and subs:

Try to split code into separate functions before it gets too long. Too long is defined as two screens long (landscape mode monitors), one screen long (portrait mode monitors), or 20 screens long (smartphone).

 
  (fun fibo $x) =
     (-lt $x 0) => alert[] FiboNumberTooSmall
     if (-lt $x 2):
        1
     else:
        (+ fibo(- $x 2) fibo(- $x 1))
     fi
  (nuf)

Pattern matching is an option when it is more readable, although many users prefer to use it only for alerts.

  (fun fibo $x) =
     (-lt $x 0) => alert[] FiboNumberTooSmall
     (-lt $x 2) => 1
     (+ fibo(- $x 2) fibo(- $x 1))
  (nuf)
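
For comparison, the same function might look like this in Python, with an exception standing in for the alert. This is a sketch of the logic only, assuming an alert behaves roughly like a raised exception.

```python
class FiboNumberTooSmall(Exception):
    """Stand-in for the Lashell alert of the same name."""

def fibo(x):
    if x < 0:
        raise FiboNumberTooSmall(x)
    if x < 2:
        return 1
    return fibo(x - 2) + fibo(x - 1)

first_six = [fibo(n) for n in range(6)]
```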

To sum all the values in a directory, and for other similar problems, use a sub (a pipeline stage) and not a function.

  (sub sumdir) =
     $total = 0
     for $val in `read`/*
        (~$val) => alert[] CantReadFromStream
        $total += $val
     rof
     echo $total
  (bus)

This creates sumdir as a pipeline stage that takes a directory name on the input stream and returns a total on the output stream:

  echo "/tmp/numbers" | sumdir
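
A Python generator is a reasonable mental model for a sub: it consumes an input stream and produces an output stream. The sketch below assumes each directory entry is a file holding one integer, and it yields one total per directory name rather than reading a single name; both are my simplifications, not Lashell semantics.

```python
import tempfile
from pathlib import Path

class CantReadFromStream(Exception):
    """Stand-in for the Lashell alert of the same name."""

def sumdir(names):
    """Pipeline stage: consume directory names, yield one total per name."""
    for name in names:
        total = 0
        for entry in sorted(Path(name).iterdir()):
            try:
                total += int(entry.read_text())
            except ValueError as err:
                raise CantReadFromStream(entry.name) from err
        yield total

# Usage: build a throwaway directory holding 1, 2, 3 and sum it.
with tempfile.TemporaryDirectory() as tmp:
    for i in (1, 2, 3):
        Path(tmp, f"n{i}").write_text(str(i))
    totals = list(sumdir([tmp]))
```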

A word about directories:

Don't use persisted directories for temporary variables!

Remember, if you start a directory name with / then it will actually be persisted in the filesystem. For temporary directories (which Lashell uses like Perl uses hashes), always start with % instead of /.

Any directory entry may be stored with double inverse parity by appending the & sign: %mydir/entry&

Either way, using read on a directory entry always gives you a stream. To get a single scalar, use `read`.

If $foo is a scalar naming a directory:

Avoid using ls $foo in place of $foo/*. The difference is that ls will run a separate thread and materialize the directory entries in memory, whereas $foo/* is lazy. (Note that mutt will be included in the next release, and is expected to significantly improve threading.)

Lambdas:

People new to Lashell are often surprised to learn that Lashell does not support lambda forms. The Lashell team plans to support lambdas in the future. It's complicated to get it right, but I've been told repeatedly that they're working on it.

Enjoy!

PS Read this far and wondering what the heck this is about? A job post erroneously mentioned "Lashell". I thought it would be entertaining to post about the fictitious language and see if any recruiters would take the bait. (I also politely notified them about the mistake.)

(Last updated 2011)