forked from Orchid/orchid
Final commit before submission
317
README.md
@@ -1,314 +1,5 @@
Orchid will be a compiled functional language with a powerful macro
language and optimizer.
All you need to run the project is a nightly Rust toolchain. Go to one of the folders within `examples` and run

# Examples

Hello World in Orchid
```orchid
import std::io::(println, out)

main := println out "Hello World!"
```

Basic command line calculator
```orchid
import std::io::(readln, printf, in, out)

main := (
  readln in >>= int |> \a.
  readln in >>= \op.
  readln in >>= int |> \b.
  printf out "the result is {}\n", [match op (
    "+" => a + b,
    "-" => a - b,
    "*" => a * b,
    "/" => a / b
  )]
)
```

Grep
```orchid
import std::io::(readln, println, in, out, getarg)

main := loop \r. (
  readln in >>= \line.
  if (substring (getarg 1) line)
  then (println out line >>= r)
  else r
)
```

Filter through an arbitrary collection
```orchid
filter := @C:Type -> Type. @:Map C. @T. \f:T -> Bool. \coll:C T. (
  coll >> \el. if (f el) then (Some el) else Nil
):(C T)
```

# Explanation

This explanation is not a tutorial. It follows a constructive order,
gradually introducing language features to better demonstrate their
purpose. It also assumes that the reader is familiar with functional
programming.

## Lambda calculus recap

The language is almost entirely based on lambda calculus, so everything
is immutable and evaluation is lazy. The following is an anonymous
function that takes an integer argument and multiplies it by 2:

```orchid
\x:int. imul 2 x
```

Multiple parameters are represented using currying, so the above is
equivalent to

```orchid
imul 2
```

Recursion is accomplished using the Y combinator (called `loop`), which
takes a function as its single parameter and passes that function to
itself as its first argument, so that `loop f` behaves like
`f (loop f)`. A naive implementation of `imul` might look like this:

```orchid
\a:int.\b:int. loop \r. (\i.
  ifthenelse (ieq i 0)
    0
    (iadd b (r (isub i 1)))
) a
```

`ifthenelse` takes a boolean as its first parameter and selects one of the
following two expressions (of identical type) accordingly. `ieq`, `iadd`
and `isub` are self-explanatory.

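As a rough illustration of how `loop` ties the recursive knot, here is a minimal Python sketch. Python is strict rather than lazy, so the recursive reference is wrapped in a lambda to delay it; the names `loop` and `naive_imul` simply mirror the text and are not part of any Orchid API. The base case returns 0 once the counter reaches zero.

```python
def loop(f):
    """Fixed-point combinator: loop(f) behaves like f(loop(f)).

    The lambda delays the recursive call until the result is applied,
    which stands in for Orchid's lazy evaluation.
    """
    return f(lambda *args: loop(f)(*args))


def naive_imul(a, b):
    """Multiply two non-negative integers by adding b to itself a times."""
    step = loop(lambda r: lambda i: 0 if i == 0 else b + r(i - 1))
    return step(a)


print(naive_imul(4, 6))  # 24
```

The function passed to `loop` never refers to itself by name; it only calls the `r` argument it is handed, which is the point of the combinator.
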
## Auto parameters (generics, polymorphism)

Although I didn't specify the type of `i` in the above example, it is
known at compile time because the recursion is applied to `a`, which is an
integer. If I had omitted the second argument, I would have had to
specify `i`'s type as an integer, because for plain lambda
expressions all types have to be statically known at compile time.

Polymorphism is achieved using parametric constructs called auto
parameters. An auto parameter is a placeholder filled in during
compilation, syntactically very similar to a lambda expression:

```orchid
@T. --[ body of expression referencing T ]--
```

Autos have two closely related uses. First, they are used to represent
generic type parameters. If an auto is used as the type of an argument
or some other subexpression that can be trivially deduced from the calling
context, it is filled in.

The second use of autos is for constraints, when they have a type that
references another auto. Because these parameters are filled in by the
compiler, referencing them is equivalent to the statement that a default
value assignable to the specified type exists. Default values are declared
explicitly and identified by their type, where that type itself may be
parametric and may specify its own constraints, which are resolved
recursively. If the referenced default is itself a useful value or
function, you can give it a name and use it as such, but you can also omit
the name, using the default as a hint to the compiler so that it can call
functions that also have defaults of the same types, or possibly other
types whose defaults have implementations based on your defaults.

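The recursive resolution of defaults can be pictured with a small Python sketch. Everything in it is an assumption made for illustration: types are modelled as Python classes or `("list", elem)` tuples, and `ADD_DEFAULTS` and `resolve_add` are hypothetical names, not anything the Orchid compiler exposes.

```python
# Hypothetical registry of defaults, keyed by the type they implement Add for.
ADD_DEFAULTS = {
    int: lambda a, b: a + b,
    str: lambda a, b: a + b,
}


def resolve_add(tp):
    """Find an addition function for `tp`, resolving constraints recursively."""
    if tp in ADD_DEFAULTS:
        return ADD_DEFAULTS[tp]
    if isinstance(tp, tuple) and tp[0] == "list":
        # Parametric default: elementwise Add on lists is only available
        # when a default Add exists for the element type.
        add_elem = resolve_add(tp[1])
        return lambda xs, ys: [add_elem(x, y) for x, y in zip(xs, ys)]
    raise LookupError(f"no default Add for {tp!r}")


add_nested = resolve_add(("list", ("list", int)))
print(add_nested([[1, 2], [3]], [[10, 20], [30]]))  # [[11, 22], [33]]
```

The caller never names the list implementation or the integer implementation; both are found from the requested type alone, which is roughly what an unnamed auto parameter asks the compiler to do.
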
For a demonstration, here's a sample implementation of the Option monad.
```orchid
--[[ The definition of Monad ]]--
define Monad $M:(Type -> Type) as (Pair
  (@T. @U. (T -> M U) -> M T -> M U) -- bind
  (@T. T -> M T) -- return
)

bind := @M:Type -> Type. @monad:Monad M. fst monad
return := @M:Type -> Type. @monad:Monad M. snd monad

--[[ The definition of Option ]]--
define Option $T as @U. U -> (T -> U) -> U
--[ Constructors ]--
export Some := @T. \data:T. categorise @(Option T) ( \default. \map. map data )
export None := @T. categorise @(Option T) ( \default. \map. default )
--[ Implement Monad ]--
impl Monad Option via (makePair
  ( @T. @U. \f:T -> Option U. \opt:Option T. opt None \x. f x ) -- bind
  Some -- return
)
--[ Sample function that works on unknown monad to demonstrate HKTs.
  Turns (Option (M T)) into (M (Option T)), "raising" the unknown monad
  out of the Option ]--
export raise := @M:Type -> Type. @T. @:Monad M. \opt:Option (M T). (
  opt (return None) (\m. bind (\x. return (Some x)) m)
):(M (Option T))
```

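The Option encoding above is a function of two arguments, a default and a mapper. A minimal Python sketch of the same idea follows; it ignores types entirely, and `unwrap_or_none` and `safe_div` are hypothetical helpers added only to make the result visible.

```python
def some(data):
    """An Option holding a value: ignores the default, applies the mapper."""
    return lambda default, mapper: mapper(data)


def none(default, mapper):
    """The empty Option: ignores the mapper, returns the default."""
    return default


def bind(f, opt):
    """Apply f (which itself returns an Option) to the contents, if any."""
    return opt(none, f)


def unwrap_or_none(opt):
    """Convert to a plain Python value for printing (helper for this sketch)."""
    return opt(None, lambda x: x)


def safe_div(a, b):
    return none if b == 0 else some(a / b)


print(unwrap_or_none(bind(lambda x: safe_div(10, x), some(2))))  # 5.0
print(unwrap_or_none(bind(lambda x: safe_div(10, x), some(0))))  # None
print(unwrap_or_none(bind(lambda x: safe_div(10, x), none)))     # None
```

`bind` takes the function first and the Option second, matching the `(T -> M U) -> M T -> M U` signature in the definition of Monad.
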
Typeclasses may be implemented in any module that also defines at least one of
the types in the definition, which includes both the type of the
expression and the types of its auto parameters. They always have a name,
which can be used to override known defaults with which your definition
may overlap. For example, if addition is defined elementwise for all
applicative functors, the author of List might want concatenation to
take precedence in the case where all element types match. Notice how
Add has three arguments: two are the types of the operands and one is
the type of the result:

```orchid
impl @T. Add (List T) (List T) (List T) by concatListAdd over elementwiseAdd via (
  ...
)
```

For completeness' sake, the original definition might look like this:

```orchid
impl
  @C:Type -> Type. @T. @U. @V. -- variables
  @:(Applicative C). @:(Add T U V). -- conditions
  Add (C T) (C U) (C V) -- target
by elementwiseAdd via (
  ...
)
```

With the use of autos, here's what the recursive multiplication
implementation looks like:

```orchid
impl @T. @:(Add T T T). Multiply T int T by iterativeMultiply via (
  \a:int. \b:T. loop \r. (\i.
    ifthenelse (ieq i 1) -- T has no zero value, so stop at a single copy
      b
      (add b (r (isub i 1))) -- notice how iadd is now add
  ) a
)
```

This could then be applied to any type that's closed over addition:

```orchid
aroundTheWorldLyrics := (
  mult 18 (add (mult 4 "Around the World\n") "\n")
)
```

For my notes on the declare/impl system, see [notes/type_system]

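To see why the only requirement is closure over addition, here is a small Python sketch of the same idea. It is written as a loop rather than with a fixed-point combinator, and `multiply` and its `add` parameter are hypothetical names chosen for this illustration.

```python
def multiply(value, count, add=lambda a, b: a + b):
    """Combine `count` copies of `value` using nothing but `add`."""
    if count < 1:
        # No zero element is assumed for the type, so zero copies is undefined.
        raise ValueError("count must be at least 1")
    result = value
    for _ in range(count - 1):
        result = add(result, value)
    return result


verse = multiply("Around the World\n", 4) + "\n"
lyrics = multiply(verse, 18)

print(multiply(7, 3))                    # 21
print(lyrics.count("Around the World"))  # 72
```

The same function handles integers and strings because it never inspects the values; it only needs an `add` that returns the same type it was given.
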
## Preprocessor

The above code samples have one notable difference from the Examples
section above: they're ugly and hard to read. The solution to this is a
powerful preprocessor which is used internally to define all sorts of
syntax sugar, from operators to complex syntax patterns and even pattern
matching, and can also be used to define custom syntax. The preprocessor
reads the source as an S-tree while executing substitution rules which
have a real-numbered priority.

In the following example, seq matches a list of arbitrary tokens and its
parameter is the order of resolution. The order can be used, for example, to
make sure that `if a then b else if c then d else e` becomes
`(ifthenelse a b (ifthenelse c d e))` and not
`(ifthenelse a b if) c then d else e`. It's worth highlighting here that
preprocessing works on the typeless AST and matchers are constructed
using inclusion rather than exclusion, so it would not be possible to
selectively allow the above example without enforcing that if-statements
are searched back-to-front. If order is still a problem, you can always
parenthesize subexpressions at the callsite.

```orchid
(..$pre:2 if ...$cond then ...$true else ...$false) =10=> (
  ..$pre
  (ifthenelse (...$cond) (...$true) (...$false))
)
...$a + ...$b =2=> (add (...$a) (...$b))
...$a = ...$b =5=> (eq (...$a) (...$b))
...$a - ...$b =2=> (sub (...$a) (...$b))
```

The recursive multiplication function now looks like this:

```orchid
impl @T. @:(Add T T T). Multiply T int T by iterativeMultiply via (
  \a:int.\b:T. loop \r. (\i.
    if (i = 1) then b
    else (b + (r (i - 1)))
  ) a
)
```

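The back-to-front search for `if` can be sketched in a few lines of Python. This is only an illustration of the ordering argument, not the preprocessor's algorithm: tokens are plain strings, `rewrite_ifs` is a hypothetical name, and the sketch handles `else if` chains only, assuming every `if` is followed by a matching `then` and `else`.

```python
def rewrite_ifs(tokens):
    """Rewrite flat `if c then t else f` tokens into nested ifthenelse lists.

    The last `if` is rewritten first, so an `else if` chain nests to the right.
    """
    tokens = list(tokens)
    while "if" in tokens:
        start = len(tokens) - 1 - tokens[::-1].index("if")  # last `if`
        then = tokens.index("then", start)
        els = tokens.index("else", then)
        node = [
            "ifthenelse",
            tokens[start + 1:then],  # condition
            tokens[then + 1:els],    # true branch
            tokens[els + 1:],        # false branch: everything to the end
        ]
        tokens[start:] = [node]      # tokens before `start` play the role of ..$pre
    return tokens


print(rewrite_ifs("if a then b else if c then d else e".split()))
# [['ifthenelse', ['a'], ['b'], [['ifthenelse', ['c'], ['d'], ['e']]]]]
```

Starting from the first `if` instead would have to decide where its false branch ends before the inner `if` has been grouped, which is the ambiguity the resolution order avoids.
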
### Traversal using carriages

While it may not be immediately apparent, these substitution rules are
actually Turing complete. They can be used quite intuitively to traverse
the token tree with unique "carriage" symbols that move according to their
environment and can carry structured data payloads.

Here's an example of a carriage being used to turn a square-bracketed
list expression into a lambda expression that matches a conslist. Notice
how the square brackets pair up, as all three variants of brackets
are considered branches in the S-tree rather than individual tokens.

```orchid
-- Initial step, eliminates entry condition (square brackets) and constructs
-- carriage and other working symbols
[...$data:1] =1000.1=> (cons_start ...$data cons_carriage(none))
-- Shortcut with higher priority
[] =1000.5=> none
-- Step
, $item cons_carriage($tail) =1000.1=> cons_carriage((some (cons $item $tail)))
-- End, removes carriage and working symbols and leaves valid source code
cons_start $item cons_carriage($tail) =1000.1=> some (cons $item $tail)
-- Low priority rules should turn leftover symbols into errors.
cons_start =0=> cons_err
cons_carriage($data) =0=> cons_err
cons_err =0=> (macro_error "Malformed conslist expression")
-- macro_error will probably have its own rules for composition and
-- bubbling such that the output for an erroneous expression would be a
-- single macro_error to be decoded by developer tooling
```
(an up-to-date version of this example can be found in the examples
folder)

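The net effect of the carriage rules is a right-to-left fold over the list items. The Python sketch below shows only that end result, not the rewriting process itself; the tuple representation and the name `conslist` are assumptions made for illustration.

```python
def conslist(items):
    """Build nested ("some", ("cons", item, tail)) tuples, right to left."""
    tail = "none"  # the carriage starts at the right edge carrying `none`
    for item in reversed(items):
        # Each step consumes the item to the carriage's left and wraps
        # the payload, like the "Step" rule.
        tail = ("some", ("cons", item, tail))
    return tail


print(conslist([1, 2]))
# ('some', ('cons', 1, ('some', ('cons', 2, 'none'))))
print(conslist([]))  # none
```
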
Another thing to note is that although it may look like `cons_carriage` is
a global string, it's in fact namespaced to whatever file provides the
macro. Symbols can be exported either by prefixing the pattern with
`export` or separately via the following syntax if no single rule is
equipped to dictate the exported token set.

```orchid
export ::(some_name, other_name)
```

# Module system

Files are the smallest unit of namespacing, automatically grouped into
folders and forming a tree whose leaves are the actual symbols. An
exported symbol is a name referenced in an exported substitution rule
or assigned to an exported function. Imported symbols are considered
identical to the same symbol directly imported from the same module for
the purposes of substitution. The module syntax is very similar to
Rust's, and since each token gets its own export, with most rules
comprising several local symbols, the most common import option is
probably `::*` (import all).

# Optimization

This is very far away so I don't want to make promises, but I have some
ideas.

- [ ] early execution of functions on any subset of their arguments where
  it could provide substantial speedup
- [ ] tracking copies of expressions and evaluating them only once
- [ ] Many cases of single recursion converted to loops
  - [ ] tail recursion
  - [ ] 2 distinct loops where the tail doesn't use the arguments
  - [ ] reorder operations to favour this scenario
- [ ] reactive calculation of values that are deemed to be read more often
  than written
- [ ] automatic profiling based on performance metrics generated by debug
  builds

```sh
cargo run -- -p .
```