forked from Orchid/orchid
136 lines
6.4 KiB
TeX
136 lines
6.4 KiB
TeX
\documentclass{article}
|
|
\usepackage{graphicx}
|
|
\usepackage[margin=2cm]{geometry}
|
|
\usepackage[hidelinks]{hyperref}
|
|
|
|
|
|
\title{Orchid's Type System}
|
|
\author{Lawrence Bethlenfalvy, 6621227}
|
|
\date{12 November 2022}
|
|
|
|
% Why would you toss all this in the template if it just doesn't compile!?
|
|
%\urn{6621227}
|
|
%\degree{Bachelor of Science in Computer Science}
|
|
%\supervisor{Brijesh Dongol}
|
|
|
|
\begin{document}
|
|
\maketitle
|
|
|
|
|
|
\section{Introduction}
|
|
|
|
Originally my final year project was going to be an entire programming language which I started to
|
|
develop around February, however at the start of the year I decided to set a more reasonable goal.
|
|
|
|
Orchid is a functional programming language inspired by $\lambda$-calculus, Haskell and Rust. The
|
|
execution model is exactly $\lambda$-calculus with opaque predicates and functions representing
|
|
foreign data such as numbers, file descriptors and their respeective operations. For the purpose of
|
|
side effects caused by foreign functions, reduction is carried out in normal order just like
|
|
Haskell.
|
|
|
|
There are two metaprogramming systems, one syntax level and one type level, similar to Rust.
|
|
Syntax-level metaprogramming is based on generalized kerning, it is mostly defined and a naiive
|
|
implementation is complete at the time of writing. Type-level metaprogramming resembles Prolog and
|
|
is a major objective of this year's project.
|
|
|
|
The project's home is this repository which, at the time of writing, contains fairly outdated code
|
|
samples: \url{https://github.com/lbfalvy/orchid}
|
|
|
|
\subsection{Aims}
|
|
|
|
My goal for this year is to define a robust, usable type system and write a performant
|
|
implementation.
|
|
|
|
The next phase of development will be a compiler frontend for LLVM. If the type system reaches a
|
|
satisfactory level of completion before the dissertation is complete, I may also write a bit about
|
|
the compilation.
|
|
|
|
If due to some unforeseen circumstances I'm unable to complete enough of the type system to fill
|
|
the dissertation with its details or it ends up too simple, I may also write about the macro system
|
|
which is already in a usable state and only needs some optimizations and minor adjustments
|
|
due to shifts in responsibilities which occured while I was defining the basic properties of the
|
|
type system and experimenting with concrete code examples to get a clearer picture of the
|
|
provisional feature set.
|
|
|
|
\subsection{Objectives}
|
|
|
|
A working type system should have the following parts, which I will implement in roughly this order
|
|
|
|
\begin{itemize}
|
|
\item \textbf{Type inference engine and type checker} This will be an extension of
|
|
the Hindley-Milner algorithm, which simultaneously unifies and completes partial type
|
|
annotations, and recognizes conflicts.
|
|
\item \textbf{Typeclass solver} At the moment this appears to be a relatively simple piece of
|
|
code but I'm not entirely confident that complications won't arise as its responsibilities
|
|
become clearer, so I consider it a separate component
|
|
\item \textbf{Executor} Orchid is a statically typed language so it should eventually be compiled
|
|
with LLVM, but in order to demonstrate the usability of my type system I will have to write
|
|
an experimental interpreter. Since types are already basically expressions of type type,
|
|
parts of the executor will coincide with parts of the type inference engine.
|
|
\end{itemize}
|
|
|
|
\section{Literature Review}
|
|
|
|
The preprocessor can parse arbitrary syntax. Generalized kerning can use "rockets"
|
|
(called carriages in Orchid terminology) to parse token sequences statefully and assume
|
|
the role of an arbitrary parser encoded as a rich Turing machine.\cite{suckerpinch}
|
|
|
|
The type system supports higher-kinded types. I considered translating higher-kinded polymorphism
|
|
into abstract types as demonstrated by Yallop\cite{yallop} which can be translated into
|
|
Prolog and then building the breadth-first executor described by Tubella\cite{tubella}, but
|
|
in the end I decided that since I'm already building my own unification system I might as well
|
|
skip this step. Currently expressions are annotated with common Orchid expressions that evaluate to
|
|
a type. This means that unification is uncomputable in some cases, but the most common cases
|
|
such as halting expressions and recursive types using fixed point combinators can be unified
|
|
fairly easily and this leaves room for extension of the unification engine.
|
|
|
|
\section{Technical Overview}
|
|
|
|
\subsection{Type checker}
|
|
|
|
Type expressions to be unified are collected into a group. For the purpose of unification, types
|
|
are either opaque types with possible parameters which are considered equal if both the type and its
|
|
parameters are equal, or transparent lambda expressions applied to types. Before unification would
|
|
begin, the expressions that refer to equal types are collected in a group. A breadth-first search
|
|
through the network of reduced forms is executed for all expressions in lockstep, and
|
|
syntactic unification is attempted on each pair of reduced forms belonging to different expressions
|
|
in the same group.
|
|
|
|
At a minimum, the following must be valid reduction steps:
|
|
|
|
\begin{itemize}
|
|
\item $\beta$-reduction
|
|
\item fixed point normalization, which simply means identifying that a subexpression has
|
|
reduced to an expression that contains the original. When a fixed point is detected, the
|
|
recursive expression is converted to a form that uses the Y-combinator. This operation
|
|
is ordered before $\beta$-reductions of the expression in the BFS tree but otherwise has
|
|
the same precedence.
|
|
\end{itemize}
|
|
|
|
\subsection{Typeclass solver}
|
|
|
|
This will be relatively simple and strongly resemble Rust's Chalk trait solver, with the exception
|
|
that I would prefer not to enforce the orphan rules on the language level so as not to completely
|
|
stall projects while a third party developer accepts pull requests on what might be legacy code to
|
|
add new impls.
|
|
|
|
\subsection{Executor}
|
|
|
|
A basic version of the executor can technically be produced by initializing the lazy BFS of
|
|
reductions created for the type checker on runtime code, taking the first result, dropping the
|
|
BFS iterator and repeating these two steps ad infinitum, but this will likely be very inefficient
|
|
so time permitting I would like to create a somewhat better version. This stands to show how
|
|
developer effort can be reduced - along with the learning curve of the complex type system - by
|
|
reusing the same language for both. A type system supporting HKTs would have to be uncomputable
|
|
either way.
|
|
|
|
\section{Workplan}
|
|
|
|
TODO
|
|
|
|
\appendix
|
|
|
|
\bibliographystyle{IEEEtran}
|
|
\bibliography{references}
|
|
|
|
\end{document} |