forked from Orchid/orchid
Bunch of improvements
12
Cargo.lock
generated
@@ -54,6 +54,17 @@ version = "0.2.2"
|
|||||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||||
checksum = "7a81dae078cea95a014a339291cec439d2f232ebe854a9d672b796c6afafa9b7"
|
checksum = "7a81dae078cea95a014a339291cec439d2f232ebe854a9d672b796c6afafa9b7"
|
||||||
|
|
||||||
|
[[package]]
|
||||||
|
name = "derivative"
|
||||||
|
version = "2.2.0"
|
||||||
|
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||||
|
checksum = "fcc3dd5e9e9c0b295d6e1e4d811fb6f157d5ffd784b8d202fc62eac8035a770b"
|
||||||
|
dependencies = [
|
||||||
|
"proc-macro2",
|
||||||
|
"quote",
|
||||||
|
"syn",
|
||||||
|
]
|
||||||
|
|
||||||
[[package]]
|
[[package]]
|
||||||
name = "getrandom"
|
name = "getrandom"
|
||||||
version = "0.2.6"
|
version = "0.2.6"
|
||||||
@@ -82,6 +93,7 @@ name = "orchid"
|
|||||||
version = "0.1.0"
|
version = "0.1.0"
|
||||||
dependencies = [
|
dependencies = [
|
||||||
"chumsky",
|
"chumsky",
|
||||||
|
"derivative",
|
||||||
"thiserror",
|
"thiserror",
|
||||||
]
|
]
|
||||||
|
|
||||||
|
|||||||
@@ -7,4 +7,5 @@ edition = "2021"
|
|||||||
|
|
||||||
[dependencies]
|
[dependencies]
|
||||||
thiserror = "1.0"
|
thiserror = "1.0"
|
||||||
chumsky = "0.8"
|
chumsky = "0.8"
|
||||||
|
derivative = "2.2"
|
||||||
276
README.md
@@ -1,2 +1,274 @@
|
|||||||
Orchid will be a functional language with a powerful macro language and
|
Orchid will be a compiled functional language with a powerful macro
|
||||||
optimizer. Further explanation and demos coming soon!
|
language and optimizer.
|
||||||
|
|
||||||
|
# Examples
|
||||||
|
|
||||||
|
Hello World in Orchid
|
||||||
|
```orchid
|
||||||
|
import std::io::(println, out)
|
||||||
|
|
||||||
|
main = println out "Hello World!"
|
||||||
|
```
|
||||||
|
|
||||||
|
Basic command line calculator
|
||||||
|
```orchid
|
||||||
|
import std::io::(readln, printf, in, out)
|
||||||
|
|
||||||
|
main = (
|
||||||
|
readln in >>= int |> \a.
|
||||||
|
readln in >>= \op.
|
||||||
|
readln in >>= int |> \b.
|
||||||
|
printf out "the result is {}\n", [match op (
|
||||||
|
"+" => a + b,
|
||||||
|
"-" => a - b,
|
||||||
|
"*" => a * b,
|
||||||
|
"/" => a / b
|
||||||
|
)]
|
||||||
|
)
|
||||||
|
```
|
||||||
|
|
||||||
|
Grep
|
||||||
|
```orchid
|
||||||
|
import std::io::(readln, println, in, out, getarg)
|
||||||
|
|
||||||
|
main = loop \r. (
|
||||||
|
readln in >>= \line.
|
||||||
|
if (substring (getarg 1) line)
|
||||||
|
then (println out line >>= r)
|
||||||
|
else r
|
||||||
|
)
|
||||||
|
```
|
||||||
|
|
||||||
|
Filter through an arbitrary collection
|
||||||
|
```orchid
|
||||||
|
filter = @C:Type -> Type. @:Map C. @T. @U. \f:T -> U. \coll:C T. (
|
||||||
|
coll >> \el. if (f el) then (Some el) else Nil
|
||||||
|
):(C U)
|
||||||
|
```
|
||||||
|
|
||||||
|
# Explanation
|
||||||
|
|
||||||
|
This explanation is not a tutorial. It follows a constructive order,
|
||||||
|
gradually introducing language features to better demonstrate their
|
||||||
|
purpose. It also assumes that the reader is familiar with functional
|
||||||
|
programming.
|
||||||
|
|
||||||
|
## Lambda calculus recap
|
||||||
|
|
||||||
|
The language is almost entirely based on lambda calculus, so everything
|
||||||
|
is immutable and evaluation is lazy. The following is an anonymous
|
||||||
|
function that takes an integer argument and multiplies it by 2:
|
||||||
|
|
||||||
|
```orchid
|
||||||
|
\x:int. imul 2 x
|
||||||
|
```
|
||||||
|
|
||||||
|
Multiple parameters are represented using currying, so the above is
|
||||||
|
equivalent to
|
||||||
|
|
||||||
|
```orchid
|
||||||
|
imul 2
|
||||||
|
```
|
||||||
|
|
||||||
|
Recursion is accomplished using the Y combinator (called `loop`), which
|
||||||
|
is a function that takes a function as its single parameter and applies
|
||||||
|
it to itself. A naive implementation of `imul` might look like this:
|
||||||
|
|
||||||
|
```orchid
|
||||||
|
\a:int.\b:int. loop \r. (\i.
|
||||||
|
ifthenelse (ieq i 0)
|
||||||
|
b
|
||||||
|
(iadd b (r (isub i 1)))
|
||||||
|
) a
|
||||||
|
```
|
||||||
|
|
||||||
|
`ifthenelse` takes a boolean as its first parameter and selects one of the
|
||||||
|
following two expressions (of identical type) accordingly. `ieq`, `iadd`
|
||||||
|
and `isub` are self-explanatory.
|
||||||
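As a rough analogue of the `loop` combinator described above, here is a minimal Rust sketch. The names `fix` and `mul` are hypothetical and not part of Orchid or this repository. The closure never refers to itself by name and instead receives a handle `rec` through which it recurses. The sketch assumes a base case of 0, so that `mul(a, b)` equals `a * b`.

```rust
/// Ties the recursive knot: `f` receives a handle to "itself" plus the argument.
/// (Hypothetical helper, standing in for Orchid's `loop`.)
fn fix<A, R>(f: &dyn Fn(&dyn Fn(A) -> R, A) -> R, arg: A) -> R {
    f(&|next| fix(f, next), arg)
}

/// Multiplication by repeated addition, written without naming itself.
fn mul(a: u64, b: u64) -> u64 {
    fix(
        &|rec: &dyn Fn(u64) -> u64, i: u64| {
            if i == 0 {
                0 // assumed base case: adding `b` zero times
            } else {
                b + rec(i - 1)
            }
        },
        a,
    )
}
```

Calling `mul(3, 4)` unfolds to `4 + (4 + (4 + 0))`.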
|
|
||||||
|
## Auto parameters (generics, polymorphism)
|
||||||
|
|
||||||
|
Although I didn't specify the type of `i` in the above example, it is
|
||||||
|
known at compile time because the recursion is applied to `a`, which is an
|
||||||
|
integer. I could have omitted the second argument but then I would have
|
||||||
|
had to specify `i`'s type as an integer, because for plain lambda
|
||||||
|
expressions all types have to be statically known at compile time. To
|
||||||
|
achieve polymorphism, one parametric tool is available, called auto
|
||||||
|
parameters. An auto parameter is a placeholder filled in during
|
||||||
|
compilation, syntactically remarkably similar to lambda expressions:
|
||||||
|
|
||||||
|
```orchid
|
||||||
|
@T. --[ body of expression referencing T ]--
|
||||||
|
```
|
||||||
|
|
||||||
|
Autos have two closely related uses. First, they are used to represent
|
||||||
|
generic type parameters. If an auto is used as the type of an argument
|
||||||
|
or some other subexpression that can be trivially deduced from the calling
|
||||||
|
context, it is filled in.
|
||||||
|
|
||||||
|
The second usage of autos is for constraints, if they have a type that
|
||||||
|
references another auto. Because these parameters are filled in by the
|
||||||
|
compiler, referencing them is equivalent to the statement that a default
|
||||||
|
value assignable to the specified type exists. Default values are declared
|
||||||
|
explicitly and identified by their type, where that type itself may be
|
||||||
|
parametric and may specify its own constraints which are resolved
|
||||||
|
recursively. If the referenced default is itself a useful value or
|
||||||
|
function you can give it a name and use it as such, but you can also omit
|
||||||
|
the name, using the default as a hint to the compiler to be able to call
|
||||||
|
functions that also have defaults of the same types, or possibly other
|
||||||
|
types whose defaults have implementations based on your defaults.
|
||||||
|
|
||||||
|
For a demonstration, here's a sample implementation of the Option monad.
|
||||||
|
```orchid
|
||||||
|
--[[ The definition of Monad ]]--
|
||||||
|
Bind = \M:Type -> Type. @T -> @U -> (T -> M U) -> M T -> M U
|
||||||
|
Return = \M:Type -> Type. @T -> T -> M T
|
||||||
|
Monad = \M:Type -> Type. (
|
||||||
|
@:Bind M.
|
||||||
|
@:Return M.
|
||||||
|
0 --[ Note that empty expressions are forbidden so those that exist
|
||||||
|
purely for their constraints should return a nondescript constant
|
||||||
|
that is likely to raise a type error when used by mistake, such as
|
||||||
|
zero ]--
|
||||||
|
)
|
||||||
|
|
||||||
|
--[[ The definition of Option ]]--
|
||||||
|
export Option = \T:Type. @U -> U -> (T -> U) -> U
|
||||||
|
--[ Constructors ]--
|
||||||
|
export Some = @T. \data:T. ( \default. \map. map data ):(Option T)
|
||||||
|
export None = @T. ( \default. \map. default ):(Option T)
|
||||||
|
--[ Implement Monad ]--
|
||||||
|
default returnOption = Some:(Return Option)
|
||||||
|
default bindOption = ( @T:Type. @U:Type.
|
||||||
|
\f:T -> U. \opt:Option T. opt None f
|
||||||
|
):(Bind Option)
|
||||||
|
--[ Sample function that works on unknown monad to demonstrate HKTs.
|
||||||
|
Turns (Option (M T)) into (M (Option T)), "raising" the unknown monad
|
||||||
|
out of the Option ]--
|
||||||
|
export raise = @M:Type -> Type. @T:Type. @:Monad M. \opt:Option (M T). (
|
||||||
|
opt (return None) (\m. bind m (\x. Some x))
|
||||||
|
):(M (Option T))
|
||||||
|
```
|
||||||
|
|
||||||
|
Defaults may be defined in any module that also defines at least one of
|
||||||
|
the types in the definition, which includes both the type of the
|
||||||
|
expression and the types of its auto parameters. They always have a name,
|
||||||
|
which can be used to override known defaults with which your definition
|
||||||
|
may overlap. For example, if addition is defined elementwise for all
|
||||||
|
applicative functors, the author of List might want for concatenation to
|
||||||
|
take precedence in the case where all element types match. Notice how
|
||||||
|
Add has three arguments, two are the types of the operands and one is
|
||||||
|
the result:
|
||||||
|
|
||||||
|
```orchid
|
||||||
|
default concatListAdd replacing applicativeAdd = @T. (
|
||||||
|
...
|
||||||
|
):(Add (List T) (List T) (List T))
|
||||||
|
```
|
||||||
|
|
||||||
|
For completeness' sake, the original definition might look like this:
|
||||||
|
|
||||||
|
```orchid
|
||||||
|
default elementwiseAdd = @C:Type -> Type. @T. @U. @V. @:(Applicative C). @:(Add T U V). (
|
||||||
|
...
|
||||||
|
):(Add (C T) (C U) (C V))
|
||||||
|
```
|
||||||
|
|
||||||
|
With the use of autos, here's what the recursive multiplication
|
||||||
|
implementation looks like:
|
||||||
|
|
||||||
|
```orchid
|
||||||
|
default iterativeMultiply = @T. @:(Add T T T). (
|
||||||
|
\a:int.\b:T. loop \r. (\i.
|
||||||
|
ifthenelse (ieq i 0)
|
||||||
|
b
|
||||||
|
(add b (r (isub i 1))) -- notice how iadd is now add
|
||||||
|
) a
|
||||||
|
):(Multiply T int T)
|
||||||
|
```
|
||||||
|
|
||||||
|
This could then be applied to any type that's closed over addition:
|
||||||
|
|
||||||
|
```orchid
|
||||||
|
aroundTheWorldLyrics = (
|
||||||
|
mult 18 (add (mult 4 "Around the World\n") "\n")
|
||||||
|
)
|
||||||
|
```
|
||||||
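For readers more familiar with trait-based generics, the idea behind `iterativeMultiply` can be approximated in Rust. This is only a sketch under assumptions: `repeat_add` and `Text` are hypothetical names, and the trait bound `T: Add<Output = T>` stands in for the `@:(Add T T T)` constraint. Because a generic `T` has no zero value, the sketch treats a count of 0 like 1.

```rust
use std::ops::Add;

/// Multiplies by repeated addition for any type closed over `+`.
/// A `times` of 0 is treated like 1, since there is no generic zero to return.
fn repeat_add<T>(times: u32, value: T) -> T
where
    T: Add<Output = T> + Clone,
{
    let mut acc = value.clone();
    for _ in 1..times {
        acc = acc + value.clone();
    }
    acc
}

/// Hypothetical string wrapper whose addition is concatenation.
#[derive(Clone, Debug, PartialEq)]
struct Text(String);

impl Add for Text {
    type Output = Text;
    fn add(self, rhs: Text) -> Text {
        Text(self.0 + &rhs.0)
    }
}
```

The same function then works on numbers and on the concatenating wrapper, much like `mult` is applied to strings in the lyrics example.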
|
|
||||||
|
## Preprocessor
|
||||||
|
|
||||||
|
The above code samples have one notable difference from the Examples
|
||||||
|
section above; they're ugly and hard to read. The solution to this is a
|
||||||
|
powerful preprocessor which is used internally to define all sorts of
|
||||||
|
syntax sugar from operators to complex syntax patterns and even pattern
|
||||||
|
matching, and can also be used to define custom syntax. The preprocessor
|
||||||
|
executes substitution rules on the S-tree which have a real numbered
|
||||||
|
priority and an internal order of resolution.
|
||||||
|
|
||||||
|
In the following example, seq matches a list of arbitrary tokens and its
|
||||||
|
parameter is the order of resolution. The order can be used for example to
|
||||||
|
make sure that `if a then b else if c then d else e` becomes
|
||||||
|
`(ifthenelse a b (ifthenelse c d e))` and not
|
||||||
|
`(ifthenelse a b if) c then d else e`. It's worth highlighting here that
|
||||||
|
preprocessing works on the typeless AST and matchers are constructed
|
||||||
|
using inclusion rather than exclusion, so it would not be possible to
|
||||||
|
selectively allow the above example without enforcing that if-statements
|
||||||
|
are searched back-to-front. If order is still a problem, you can always
|
||||||
|
parenthesize problematic expressions.
|
||||||
|
|
||||||
|
```orchid
|
||||||
|
(...$pre:(seq 2) if $1 then $2 else $3 ...$post:(seq 1)) =2=> (
|
||||||
|
...$pre
|
||||||
|
(ifthenelse $1 $2 $3)
|
||||||
|
...$post
|
||||||
|
)
|
||||||
|
$a + $b =10=> (add $a $b)
|
||||||
|
$a == $b =5=> (eq $a $b)
|
||||||
|
$a - $b =10=> (sub $a $b)
|
||||||
|
```
|
||||||
|
|
||||||
|
The recursive multiplication function now looks like this:
|
||||||
|
|
||||||
|
```orchid
|
||||||
|
default iterativeMultiply = @T. @:(Add T T T). (
|
||||||
|
\a:int.\b:T. loop \r. (\i.
|
||||||
|
if (i == 0) then b
|
||||||
|
else (b + (r (i - 1)))
|
||||||
|
) a
|
||||||
|
):(Multiply T int T)
|
||||||
|
```
|
||||||
|
|
||||||
|
### Traversal using carriages
|
||||||
|
|
||||||
|
While it may not be immediately apparent, these substitution rules are
|
||||||
|
actually Turing complete. They can be used quite intuitively to traverse
|
||||||
|
the token tree with unique "carriage" symbols that move according to their
|
||||||
|
environment and can carry structured data payloads.
|
||||||
|
|
||||||
|
TODO: carriage example
|
||||||
|
|
||||||
|
# Module system
|
||||||
|
|
||||||
|
Files are the smallest unit of namespacing, automatically grouped into
|
||||||
|
folders and forming a tree the leaves of which are the actual symbols. An
|
||||||
|
exported symbol is a name referenced in an exported substitution pattern
|
||||||
|
or assigned to an exported function. Imported symbols are considered
|
||||||
|
identical to the same symbol directly imported from the same module for
|
||||||
|
the purposes of substitution.
|
||||||
|
|
||||||
|
# Optimization
|
||||||
|
|
||||||
|
This is very far away so I don't want to make promises, but I have some
|
||||||
|
ideas.
|
||||||
|
|
||||||
|
[ ] early execution of functions on any subset of their arguments where it
|
||||||
|
could provide substantial speedup
|
||||||
|
[ ] tracking copies of expressions and evaluating them only once
|
||||||
|
[ ] Many cases of single recursion converted to loops
|
||||||
|
[ ] tail recursion
|
||||||
|
[ ] 2 distinct loops where the tail doesn't use the arguments
|
||||||
|
[ ] reorder operations to favour this scenario
|
||||||
|
[ ] reactive calculation of values that are deemed to be read more often
|
||||||
|
than written
|
||||||
|
[ ] automatic profiling based on performance metrics generated by debug
|
||||||
|
builds
|
||||||
@@ -1,6 +1,6 @@
|
|||||||
use std::io::{self, Read};
|
use std::io::{self, Read};
|
||||||
|
|
||||||
use chumsky::Parser;
|
use chumsky::{Parser, prelude::*};
|
||||||
|
|
||||||
mod parse;
|
mod parse;
|
||||||
|
|
||||||
@@ -8,6 +8,7 @@ fn main() {
|
|||||||
let mut input = String::new();
|
let mut input = String::new();
|
||||||
let mut stdin = io::stdin();
|
let mut stdin = io::stdin();
|
||||||
stdin.read_to_string(&mut input).unwrap();
|
stdin.read_to_string(&mut input).unwrap();
|
||||||
let output = parse::parser().parse(input);
|
let ops: Vec<String> = vec!["$", "."].iter().map(|&s| s.to_string()).collect();
|
||||||
|
let output = parse::expression_parser(&ops).then_ignore(end()).parse(input);
|
||||||
println!("\nParsed:\n{:?}", output);
|
println!("\nParsed:\n{:?}", output);
|
||||||
}
|
}
|
||||||
|
|||||||
143
src/parse.rs
@@ -1,143 +0,0 @@
|
|||||||
use std::fmt::Debug;
|
|
||||||
use chumsky::{self, prelude::*, Parser};
|
|
||||||
|
|
||||||
#[derive(Debug)]
|
|
||||||
pub enum Expr {
|
|
||||||
Num(f64),
|
|
||||||
Int(u64),
|
|
||||||
Char(char),
|
|
||||||
Str(String),
|
|
||||||
Name(String),
|
|
||||||
S(Vec<Expr>),
|
|
||||||
Lambda(String, Vec<Expr>)
|
|
||||||
}
|
|
||||||
|
|
||||||
fn uint_parser(base: u32) -> impl Parser<char, u64, Error = Simple<char>> {
|
|
||||||
text::int(base).map(move |s: String| u64::from_str_radix(&s, base).unwrap())
|
|
||||||
}
|
|
||||||
|
|
||||||
fn e_parser() -> impl Parser<char, i32, Error = Simple<char>> {
|
|
||||||
return choice((
|
|
||||||
just('e')
|
|
||||||
.ignore_then(text::int(10))
|
|
||||||
.map(|s: String| s.parse().unwrap()),
|
|
||||||
just("e-")
|
|
||||||
.ignore_then(text::int(10))
|
|
||||||
.map(|s: String| -s.parse::<i32>().unwrap()),
|
|
||||||
empty().map(|()| 0)
|
|
||||||
))
|
|
||||||
}
|
|
||||||
|
|
||||||
fn nat2u(base: u64) -> impl Fn((u64, i32),) -> u64 {
|
|
||||||
return move |(val, exp)| {
|
|
||||||
if exp == 0 {val}
|
|
||||||
else {val * base.checked_pow(exp.try_into().unwrap()).unwrap()}
|
|
||||||
};
|
|
||||||
}
|
|
||||||
|
|
||||||
fn nat2f(base: u64) -> impl Fn((f64, i32),) -> f64 {
|
|
||||||
return move |(val, exp)| {
|
|
||||||
if exp == 0 {val}
|
|
||||||
else {val * (base as f64).powf(exp.try_into().unwrap())}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
fn e_uint_parser(base: u32) -> impl Parser<char, u64, Error = Simple<char>> {
|
|
||||||
if base > 14 {panic!("exponential in base that uses the digit 'e' is ambiguous")}
|
|
||||||
uint_parser(base).then(e_parser()).map(nat2u(base.into()))
|
|
||||||
}
|
|
||||||
|
|
||||||
fn int_parser() -> impl Parser<char, u64, Error = Simple<char>> {
|
|
||||||
choice((
|
|
||||||
just("0b").ignore_then(e_uint_parser(2)),
|
|
||||||
just("0x").ignore_then(uint_parser(16)),
|
|
||||||
just('0').ignore_then(e_uint_parser(8)),
|
|
||||||
e_uint_parser(10), // Dec has no prefix
|
|
||||||
))
|
|
||||||
}
|
|
||||||
|
|
||||||
fn dotted_parser(base: u32) -> impl Parser<char, f64, Error = Simple<char>> {
|
|
||||||
uint_parser(base)
|
|
||||||
.then_ignore(just('.'))
|
|
||||||
.then(text::digits(base))
|
|
||||||
.map(move |(wh, frac)| {
|
|
||||||
let frac_num = u64::from_str_radix(&frac, base).unwrap() as f64;
|
|
||||||
let dexp = base.pow(frac.len().try_into().unwrap());
|
|
||||||
wh as f64 + (frac_num / dexp as f64)
|
|
||||||
})
|
|
||||||
}
|
|
||||||
|
|
||||||
fn e_float_parser(base: u32) -> impl Parser<char, f64, Error = Simple<char>> {
|
|
||||||
if base > 14 {panic!("exponential in base that uses the digit 'e' is ambiguous")}
|
|
||||||
dotted_parser(base).then(e_parser()).map(nat2f(base.into()))
|
|
||||||
}
|
|
||||||
|
|
||||||
fn float_parser() -> impl Parser<char, f64, Error = Simple<char>> {
|
|
||||||
choice((
|
|
||||||
just("0b").ignore_then(e_float_parser(2)),
|
|
||||||
just("0x").ignore_then(dotted_parser(16)),
|
|
||||||
just('0').ignore_then(e_float_parser(8)),
|
|
||||||
e_float_parser(10),
|
|
||||||
))
|
|
||||||
}
|
|
||||||
|
|
||||||
fn text_parser(delim: char) -> impl Parser<char, char, Error = Simple<char>> {
|
|
||||||
let escape = just('\\').ignore_then(
|
|
||||||
just('\\')
|
|
||||||
.or(just('/'))
|
|
||||||
.or(just('"'))
|
|
||||||
.or(just('b').to('\x08'))
|
|
||||||
.or(just('f').to('\x0C'))
|
|
||||||
.or(just('n').to('\n'))
|
|
||||||
.or(just('r').to('\r'))
|
|
||||||
.or(just('t').to('\t'))
|
|
||||||
.or(just('u').ignore_then(
|
|
||||||
filter(|c: &char| c.is_digit(16))
|
|
||||||
.repeated()
|
|
||||||
.exactly(4)
|
|
||||||
.collect::<String>()
|
|
||||||
.validate(|digits, span, emit| {
|
|
||||||
char::from_u32(u32::from_str_radix(&digits, 16).unwrap())
|
|
||||||
.unwrap_or_else(|| {
|
|
||||||
emit(Simple::custom(span, "invalid unicode character"));
|
|
||||||
'\u{FFFD}' // unicode replacement character
|
|
||||||
})
|
|
||||||
}),
|
|
||||||
)),
|
|
||||||
);
|
|
||||||
filter(move |&c| c != '\\' && c != delim).or(escape)
|
|
||||||
}
|
|
||||||
|
|
||||||
fn char_parser() -> impl Parser<char, char, Error = Simple<char>> {
|
|
||||||
just('\'').ignore_then(text_parser('\'')).then_ignore(just('\''))
|
|
||||||
}
|
|
||||||
|
|
||||||
fn str_parser() -> impl Parser<char, String, Error = Simple<char>> {
|
|
||||||
just('"')
|
|
||||||
.ignore_then(text_parser('"').repeated())
|
|
||||||
.then_ignore(just('"'))
|
|
||||||
.collect()
|
|
||||||
}
|
|
||||||
|
|
||||||
pub fn parser() -> impl Parser<char, Expr, Error = Simple<char>> {
|
|
||||||
return recursive(|expr| {
|
|
||||||
let lambda = just('\\')
|
|
||||||
.ignore_then(text::ident())
|
|
||||||
.then_ignore(just('.'))
|
|
||||||
.then(expr.clone().repeated().at_least(1))
|
|
||||||
.map(|(name, body)| Expr::Lambda(name, body));
|
|
||||||
let sexpr = expr.clone()
|
|
||||||
.repeated()
|
|
||||||
.delimited_by(just('('), just(')'))
|
|
||||||
.map(Expr::S);
|
|
||||||
choice((
|
|
||||||
float_parser().map(Expr::Num),
|
|
||||||
int_parser().map(Expr::Int),
|
|
||||||
char_parser().map(Expr::Char),
|
|
||||||
str_parser().map(Expr::Str),
|
|
||||||
text::ident().map(Expr::Name),
|
|
||||||
sexpr,
|
|
||||||
lambda
|
|
||||||
)).padded()
|
|
||||||
}).then_ignore(end())
|
|
||||||
}
|
|
||||||
72
src/parse/expression.rs
Normal file
@@ -0,0 +1,72 @@
|
|||||||
|
use std::{fmt::Debug};
|
||||||
|
use chumsky::{self, prelude::*, Parser};
|
||||||
|
|
||||||
|
use super::string;
|
||||||
|
use super::number;
|
||||||
|
use super::misc;
|
||||||
|
use super::name;
|
||||||
|
|
||||||
|
#[derive(Debug)]
|
||||||
|
pub enum Expr {
|
||||||
|
Num(f64),
|
||||||
|
Int(u64),
|
||||||
|
Char(char),
|
||||||
|
Str(String),
|
||||||
|
Name(String),
|
||||||
|
S(Vec<Expr>),
|
||||||
|
Lambda(String, Option<Box<Expr>>, Vec<Expr>),
|
||||||
|
Auto(Option<String>, Option<Box<Expr>>, Vec<Expr>),
|
||||||
|
Typed(Box<Expr>, Box<Expr>)
|
||||||
|
}
|
||||||
|
|
||||||
|
fn typed_parser<'a>(
|
||||||
|
expr: Recursive<'a, char, Expr, Simple<char>>,
|
||||||
|
ops: &'a [String]
|
||||||
|
) -> impl Parser<char, Expr, Error = Simple<char>> + 'a {
|
||||||
|
just(':').ignore_then(expr)
|
||||||
|
}
|
||||||
|
|
||||||
|
fn untyped_xpr_parser<'a>(
|
||||||
|
expr: Recursive<'a, char, Expr, Simple<char>>,
|
||||||
|
ops: &'a [String]
|
||||||
|
) -> impl Parser<char, Expr, Error = Simple<char>> + 'a {
|
||||||
|
let lambda = just('\\')
|
||||||
|
.ignore_then(name::name_parser(ops))
|
||||||
|
.then(typed_parser(expr.clone(), ops).or_not())
|
||||||
|
.then_ignore(just('.'))
|
||||||
|
.then(expr.clone().repeated().at_least(1))
|
||||||
|
.map(|((name, t), body)| Expr::Lambda(name, t.map(Box::new), body));
|
||||||
|
let auto = just('@')
|
||||||
|
.ignore_then(name::name_parser(ops).or_not())
|
||||||
|
.then(typed_parser(expr.clone(), ops).or_not())
|
||||||
|
.then_ignore(just('.'))
|
||||||
|
.then(expr.clone().repeated().at_least(1))
|
||||||
|
.map(|((name, t), body)| Expr::Auto(name, t.map(Box::new), body));
|
||||||
|
let sexpr = expr.clone()
|
||||||
|
.repeated()
|
||||||
|
.delimited_by(just('('), just(')'))
|
||||||
|
.map(Expr::S);
|
||||||
|
choice((
|
||||||
|
number::float_parser().map(Expr::Num),
|
||||||
|
number::int_parser().map(Expr::Int),
|
||||||
|
string::char_parser().map(Expr::Char),
|
||||||
|
string::str_parser().map(Expr::Str),
|
||||||
|
name::name_parser(ops).map(Expr::Name),
|
||||||
|
sexpr,
|
||||||
|
lambda,
|
||||||
|
auto
|
||||||
|
)).padded()
|
||||||
|
}
|
||||||
|
|
||||||
|
pub fn expression_parser(ops: &[String]) -> impl Parser<char, Expr, Error = Simple<char>> + '_ {
|
||||||
|
return recursive(|expr| {
|
||||||
|
return misc::comment_parser().or_not().ignore_then(
|
||||||
|
untyped_xpr_parser(expr.clone(), &ops)
|
||||||
|
.then(typed_parser(expr, ops).or_not())
|
||||||
|
.map(|(val, t)| match t {
|
||||||
|
Some(typ) => Expr::Typed(Box::new(val), Box::new(typ)),
|
||||||
|
None => val
|
||||||
|
})
|
||||||
|
).then_ignore(misc::comment_parser().or_not())
|
||||||
|
})
|
||||||
|
}
|
||||||
58
src/parse/import.rs
Normal file
@@ -0,0 +1,58 @@
|
|||||||
|
use chumsky::{Parser, prelude::*, text::Character};
|
||||||
|
use super::name;
|
||||||
|
|
||||||
|
enum Import {
|
||||||
|
Name(Vec<String>, String),
|
||||||
|
All(Vec<String>)
|
||||||
|
}
|
||||||
|
|
||||||
|
fn prefix(pre: Vec<String>, im: Import) -> Import {
|
||||||
|
match im {
|
||||||
|
Import::Name(ns, name) => Import::Name(
|
||||||
|
pre.into_iter().chain(ns.into_iter()).collect(),
|
||||||
|
name
|
||||||
|
),
|
||||||
|
Import::All(ns) => Import::All(
|
||||||
|
pre.into_iter().chain(ns.into_iter()).collect()
|
||||||
|
)
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
type BoxedStrIter = Box<dyn Iterator<Item = String>>;
|
||||||
|
type BoxedStrIterIter = Box<dyn Iterator<Item = BoxedStrIter>>;
|
||||||
|
|
||||||
|
fn init_table(name: String) -> BoxedStrIterIter {
|
||||||
|
Box::new(vec![Box::new(vec![name].into_iter()) as BoxedStrIter].into_iter())
|
||||||
|
}
|
||||||
|
|
||||||
|
pub fn import_parser() -> impl Parser<char, Vec<Import>, Error = Simple<char>> {
|
||||||
|
recursive(|expr: Recursive<char, BoxedStrIterIter, Simple<char>>| {
|
||||||
|
name::modname_parser()
|
||||||
|
.padded()
|
||||||
|
.then_ignore(just("::"))
|
||||||
|
.repeated()
|
||||||
|
.then(
|
||||||
|
choice((
|
||||||
|
expr.clone()
|
||||||
|
.separated_by(just(','))
|
||||||
|
.delimited_by(just('('), just(')'))
|
||||||
|
.map(|v| Box::new(v.into_iter().flatten()) as BoxedStrIterIter),
|
||||||
|
just("*").map(|s| init_table(s.to_string())),
|
||||||
|
name::modname_parser().map(init_table)
|
||||||
|
)).padded()
|
||||||
|
).map(|(pre, post)| {
|
||||||
|
Box::new(post.map(move |el| {
|
||||||
|
Box::new(pre.clone().into_iter().chain(el)) as BoxedStrIter
|
||||||
|
})) as BoxedStrIterIter
|
||||||
|
})
|
||||||
|
}).padded().map(|paths| {
|
||||||
|
paths.filter_map(|namespaces| {
|
||||||
|
let mut path: Vec<String> = namespaces.collect();
|
||||||
|
match path.pop()?.as_str() {
|
||||||
|
"*" => Some(Import::All(path)),
|
||||||
|
name => Some(Import::Name(path, name.to_owned()))
|
||||||
|
}
|
||||||
|
}).collect()
|
||||||
|
})
|
||||||
|
}
|
||||||
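The iterator juggling in `import_parser` amounts to prefix expansion: every namespace prefix is prepended to each alternative inside the parentheses. The following standalone sketch shows that step on an already-parsed tree. `ImportTree` and `expand` are hypothetical names introduced for illustration, and the sketch uses plain vectors rather than the boxed iterators from the diff.

```rust
/// Hypothetical tree form of an import such as `std::io::(readln, println)`.
enum ImportTree {
    /// A single name (or `*`) at the end of a path.
    Leaf(String),
    /// A namespace prefix followed by one or more alternatives.
    Branch(Vec<String>, Vec<ImportTree>),
}

/// Flattens the tree into full paths by prepending each prefix to every alternative.
fn expand(tree: &ImportTree) -> Vec<Vec<String>> {
    match tree {
        ImportTree::Leaf(name) => vec![vec![name.clone()]],
        ImportTree::Branch(prefix, children) => children
            .iter()
            .flat_map(expand)
            .map(|tail| prefix.iter().cloned().chain(tail).collect())
            .collect(),
    }
}
```

The last segment of each resulting path is then either `*` or a symbol name, which is what the final `filter_map` in the parser distinguishes.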
7
src/parse/misc.rs
Normal file
@@ -0,0 +1,7 @@
|
|||||||
|
pub use chumsky::{self, prelude::*, Parser};
|
||||||
|
|
||||||
|
pub fn comment_parser() -> impl Parser<char, String, Error = Simple<char>> {
|
||||||
|
any().repeated().delimited_by(just("--["), just("]--")).or(
|
||||||
|
any().repeated().delimited_by(just("--"), just("\n"))
|
||||||
|
).map(|vc| vc.iter().collect()).padded()
|
||||||
|
}
|
||||||
9
src/parse/mod.rs
Normal file
@@ -0,0 +1,9 @@
|
|||||||
|
mod expression;
|
||||||
|
mod string;
|
||||||
|
mod number;
|
||||||
|
mod misc;
|
||||||
|
mod import;
|
||||||
|
mod name;
|
||||||
|
mod substitution;
|
||||||
|
|
||||||
|
pub use expression::expression_parser;
|
||||||
28
src/parse/name.rs
Normal file
@@ -0,0 +1,28 @@
|
|||||||
|
use chumsky::{self, prelude::*, Parser};
|
||||||
|
|
||||||
|
fn op_parser_recur<'a, 'b>(ops: &'a [String]) -> BoxedParser<'b, char, String, Simple<char>> {
|
||||||
|
if ops.len() == 1 { just(ops[0].clone()).boxed() }
|
||||||
|
else { just(ops[0].clone()).or(op_parser_recur(&ops[1..])).boxed() }
|
||||||
|
}
|
||||||
|
|
||||||
|
fn op_parser(ops: &[String]) -> BoxedParser<char, String, Simple<char>> {
|
||||||
|
let mut sorted_ops = ops.to_vec();
|
||||||
|
sorted_ops.sort_by(|a, b| b.len().cmp(&a.len()));
|
||||||
|
op_parser_recur(&sorted_ops)
|
||||||
|
}
|
||||||
|
|
||||||
|
pub fn modname_parser() -> impl Parser<char, String, Error = Simple<char>> {
|
||||||
|
let not_name_char: Vec<char> = vec![':', '\\', '"', '\'', '(', ')', '.'];
|
||||||
|
filter(move |c| !not_name_char.contains(c) && !c.is_whitespace())
|
||||||
|
.repeated().at_least(1)
|
||||||
|
.collect()
|
||||||
|
}
|
||||||
|
|
||||||
|
pub fn name_parser<'a>(ops: &'a [String]) -> impl Parser<char, String, Error = Simple<char>> + 'a {
|
||||||
|
choice((
|
||||||
|
op_parser(ops), // First try to parse a known operator
|
||||||
|
text::ident(), // Failing that, parse plain text
|
||||||
|
// Finally parse everything until the next terminal as a new operator
|
||||||
|
modname_parser()
|
||||||
|
)).padded()
|
||||||
|
}
|
||||||
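`op_parser` sorts the operators longest first before chaining them. The sketch below illustrates why that matters, assuming an ordered choice in which the first matching alternative wins. It uses plain string prefix checks instead of chumsky, and `first_match` and `longest_match` are hypothetical helper names.

```rust
/// Returns the first operator in `ops` that is a prefix of `input`,
/// trying candidates in the order given.
fn first_match<'a>(ops: &'a [String], input: &str) -> Option<&'a str> {
    ops.iter().map(String::as_str).find(|op| input.starts_with(*op))
}

/// Same lookup, but with candidates sorted longest first, as `op_parser` does.
fn longest_match(ops: &[String], input: &str) -> Option<String> {
    let mut sorted = ops.to_vec();
    sorted.sort_by(|a, b| b.len().cmp(&a.len()));
    first_match(&sorted, input).map(str::to_owned)
}
```

With the operators `=` and `=>` registered in that order, the unsorted lookup stops at `=` for the input `=> x`, while the sorted one finds `=>`.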
88
src/parse/number.rs
Normal file
@@ -0,0 +1,88 @@
|
|||||||
|
use chumsky::{self, prelude::*, Parser};
|
||||||
|
|
||||||
|
fn assert_not_digit(base: u32, c: char) {
|
||||||
|
if base > (10 + (c as u32 - 'a' as u32)) {
|
||||||
|
panic!("The character '{}' is a digit in base ({})", c, base)
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
fn separated_digits_parser(base: u32) -> impl Parser<char, String, Error = Simple<char>> {
|
||||||
|
just('_')
|
||||||
|
.ignore_then(text::digits(base))
|
||||||
|
.repeated()
|
||||||
|
.map(|sv| sv.iter().map(|s| s.chars()).flatten().collect())
|
||||||
|
}
|
||||||
|
|
||||||
|
fn uint_parser(base: u32) -> impl Parser<char, u64, Error = Simple<char>> {
|
||||||
|
text::int(base)
|
||||||
|
.then(separated_digits_parser(base))
|
||||||
|
.map(move |(s1, s2): (String, String)| {
|
||||||
|
u64::from_str_radix(&(s1 + &s2), base).unwrap()
|
||||||
|
})
|
||||||
|
}
|
||||||
|
|
||||||
|
fn pow_parser() -> impl Parser<char, i32, Error = Simple<char>> {
|
||||||
|
return choice((
|
||||||
|
just('p')
|
||||||
|
.ignore_then(text::int(10))
|
||||||
|
.map(|s: String| s.parse().unwrap()),
|
||||||
|
just("p-")
|
||||||
|
.ignore_then(text::int(10))
|
||||||
|
.map(|s: String| -s.parse::<i32>().unwrap()),
|
||||||
|
)).or_else(|_| Ok(0))
|
||||||
|
}
|
||||||
|
|
||||||
|
fn nat2u(base: u64) -> impl Fn((u64, i32),) -> u64 {
|
||||||
|
return move |(val, exp)| {
|
||||||
|
if exp == 0 {val}
|
||||||
|
else {val * base.checked_pow(exp.try_into().unwrap()).unwrap()}
|
||||||
|
};
|
||||||
|
}
|
||||||
|
|
||||||
|
fn nat2f(base: u64) -> impl Fn((f64, i32),) -> f64 {
|
||||||
|
return move |(val, exp)| {
|
||||||
|
if exp == 0 {val}
|
||||||
|
else {val * (base as f64).powf(exp.try_into().unwrap())}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
fn pow_uint_parser(base: u32) -> impl Parser<char, u64, Error = Simple<char>> {
|
||||||
|
assert_not_digit(base, 'p');
|
||||||
|
uint_parser(base).then(pow_parser()).map(nat2u(base.into()))
|
||||||
|
}
|
||||||
|
|
||||||
|
pub fn int_parser() -> impl Parser<char, u64, Error = Simple<char>> {
|
||||||
|
choice((
|
||||||
|
just("0b").ignore_then(pow_uint_parser(2)),
|
||||||
|
just("0x").ignore_then(pow_uint_parser(16)),
|
||||||
|
just('0').ignore_then(pow_uint_parser(8)),
|
||||||
|
pow_uint_parser(10), // Dec has no prefix
|
||||||
|
))
|
||||||
|
}
|
||||||
|
|
||||||
|
fn dotted_parser(base: u32) -> impl Parser<char, f64, Error = Simple<char>> {
|
||||||
|
uint_parser(base)
|
||||||
|
.then_ignore(just('.'))
|
||||||
|
.then(
|
||||||
|
text::digits(base).then(separated_digits_parser(base))
|
||||||
|
).map(move |(wh, (frac1, frac2))| {
|
||||||
|
let frac = frac1 + &frac2;
|
||||||
|
let frac_num = u64::from_str_radix(&frac, base).unwrap() as f64;
|
||||||
|
let dexp = base.pow(frac.len().try_into().unwrap());
|
||||||
|
wh as f64 + (frac_num / dexp as f64)
|
||||||
|
})
|
||||||
|
}
|
||||||
|
|
||||||
|
fn pow_float_parser(base: u32) -> impl Parser<char, f64, Error = Simple<char>> {
|
||||||
|
assert_not_digit(base, 'p');
|
||||||
|
dotted_parser(base).then(pow_parser()).map(nat2f(base.into()))
|
||||||
|
}
|
||||||
|
|
||||||
|
pub fn float_parser() -> impl Parser<char, f64, Error = Simple<char>> {
|
||||||
|
choice((
|
||||||
|
just("0b").ignore_then(pow_float_parser(2)),
|
||||||
|
just("0x").ignore_then(pow_float_parser(16)),
|
||||||
|
just('0').ignore_then(pow_float_parser(8)),
|
||||||
|
pow_float_parser(10),
|
||||||
|
))
|
||||||
|
}
|
||||||
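The arithmetic inside `dotted_parser` can be read in isolation: the fractional digits are parsed as an integer and then divided by `base` raised to the number of fractional digits. A minimal sketch of just that computation follows. `positional_value` is a hypothetical name, it takes the two digit strings directly, and it returns `None` instead of panicking on invalid digits.

```rust
/// Converts the digit strings around a radix point into a float:
/// value = whole + frac / base^(number of fractional digits).
fn positional_value(whole: &str, frac: &str, base: u32) -> Option<f64> {
    let whole_num = u64::from_str_radix(whole, base).ok()? as f64;
    let frac_num = u64::from_str_radix(frac, base).ok()? as f64;
    let scale = (base as f64).powi(frac.len() as i32);
    Some(whole_num + frac_num / scale)
}
```

For example, `1.1` in base 2 and `a.8` in base 16 both have a fractional part of one half.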
42
src/parse/string.rs
Normal file
@@ -0,0 +1,42 @@
|
|||||||
|
use chumsky::{self, prelude::*, Parser};
|
||||||
|
|
||||||
|
fn text_parser(delim: char) -> impl Parser<char, char, Error = Simple<char>> {
|
||||||
|
let escape = just('\\').ignore_then(
|
||||||
|
just('\\')
|
||||||
|
.or(just('/'))
|
||||||
|
.or(just('"'))
|
||||||
|
.or(just('b').to('\x08'))
|
||||||
|
.or(just('f').to('\x0C'))
|
||||||
|
.or(just('n').to('\n'))
|
||||||
|
.or(just('r').to('\r'))
|
||||||
|
.or(just('t').to('\t'))
|
||||||
|
.or(just('u').ignore_then(
|
||||||
|
filter(|c: &char| c.is_digit(16))
|
||||||
|
.repeated()
|
||||||
|
.exactly(4)
|
||||||
|
.collect::<String>()
|
||||||
|
.validate(|digits, span, emit| {
|
||||||
|
char::from_u32(u32::from_str_radix(&digits, 16).unwrap())
|
||||||
|
.unwrap_or_else(|| {
|
||||||
|
emit(Simple::custom(span, "invalid unicode character"));
|
||||||
|
'\u{FFFD}' // unicode replacement character
|
||||||
|
})
|
||||||
|
}),
|
||||||
|
)),
|
||||||
|
);
|
||||||
|
filter(move |&c| c != '\\' && c != delim).or(escape)
|
||||||
|
}
|
||||||
|
|
||||||
|
pub fn char_parser() -> impl Parser<char, char, Error = Simple<char>> {
|
||||||
|
just('\'').ignore_then(text_parser('\'')).then_ignore(just('\''))
|
||||||
|
}
|
||||||
|
|
||||||
|
pub fn str_parser() -> impl Parser<char, String, Error = Simple<char>> {
|
||||||
|
just('"')
|
||||||
|
.ignore_then(
|
||||||
|
text_parser('"').map(Some)
|
||||||
|
.or(just("\\\n").map(|_| None))
|
||||||
|
.repeated()
|
||||||
|
).then_ignore(just('"'))
|
||||||
|
.flatten().collect()
|
||||||
|
}
|
||||||
21
src/parse/substitution.rs
Normal file
@@ -0,0 +1,21 @@
|
|||||||
|
use chumsky::{self, prelude::*, Parser};
|
||||||
|
|
||||||
|
use super::{expression, number::float_parser};
|
||||||
|
|
||||||
|
pub struct Substitution {
|
||||||
|
source: expression::Expr,
|
||||||
|
priority: f64,
|
||||||
|
target: expression::Expr
|
||||||
|
}
|
||||||
|
|
||||||
|
pub fn substitution_parser<'a>(
|
||||||
|
ops: &'a [String]
|
||||||
|
) -> impl Parser<char, Substitution, Error = Simple<char>> + 'a {
|
||||||
|
expression::expression_parser(ops)
|
||||||
|
.then_ignore(just('='))
|
||||||
|
.then(
|
||||||
|
float_parser().then_ignore(just("=>"))
|
||||||
|
.or_not().map(|prio| prio.unwrap_or(0.0))
|
||||||
|
).then(expression::expression_parser(ops))
|
||||||
|
.map(|((source, priority), target)| Substitution { source, priority, target })
|
||||||
|
}
|
||||||