I made a programming language

Today I'd like to introduce you to a programming language I developed called ρs. This language is the result of my master's thesis, and it's based on rewriting semantics. The language is rather academic, but I use it a lot myself in practice (as a calculator replacement). Please take a look and send comments!

In this post I want to make a friendly introduction that is approachable by non computer-science people. But if you are interested in details, here is the link to the thesis paper: rose-lang-thesis.pdf. It contains the complete formal definition of ρs, as well as some proofs of its properties. There is a reference interpreter too: git.vau.place/schrec.git.

Rewriting

As I mentioned before, the language is based on rewriting, or in other words, it is a kind of a rewrite system. A rewrite system is a tool that lets you change expressions using simple rules. These rules are called rewrite rules, and they look like this: "If you see this pattern, replace it with this other pattern". For example "if you see x + 0, you can replace it with x".

To use a rewrite system, you apply its rules to every part of an expression (i.e. to every subexpression). If a rule matches, you replace the matching part with the replacement pattern from the rule. You keep doing this until there are no more matches left.

For instance, let us say you have the expression 3 + (2 + 0). You can apply the rule x + 0 -> x to the subexpression 2 + 0, which matches the left side of the rule. So you replace it with 2, giving you 3 + 2.

Rewrite systems can have multiple rules, and sometimes different rules can match the same part of an expression. In that case, you can choose which rule to apply. For example, if you have the expression (2 + 0) * 0, you can apply the rule x + 0 -> x to the subexpression 2 + 0, giving you 2 * 0. Or you can apply the rule x * 0 -> 0 to the subexpression (2 + 0) * 0, giving you 0. Both results are valid, but they give you different answers.

To see how this is useful in programming, consider the following rewrite rules: A rewrite system made of these rules is nothing but a program for concatenating two lists together.

Fixed rules

In ρs rewrite rules are extended with additional capabilities. By default, a rule applies to every subexpression, but in ρs, some rules can only be applied to a fixed or chosen subexpression.

Consider the input expression 1 + 1, where the first 1 is a subexpression named m. Let's also have a rewrite system with two rules:
  1. 1 -> 1 <> 1
  2. 1 -> 0, but this rule only applies to subexpression m.
After the first application we can get three different results: In each case, we can apply rule 1 further, but rule 2 can only be applied to the second result because its match pattern expects a 1 for subexpression m, but is looking at (1 <> 1) and 0 in the other two results.

By itself, this does not look like a very useful feature to have. That is because it has to be used in conjunction with the second capability of ρs: simultaneous application of multiple rules.

Rules grouping

The idea is that we can group several rules and say that they only apply together. In order to match such a group of rules, every rule in it must match its own subexpression (be it fixed or not) first, and then, if that is successfull, all replacements happen according to these rules' replacement patterns.

For example, if we take our previous rewrite system, but have the two rules grouped together, then after the first application the results will be these: But after this step no further reduction will be possible because rule 2 is not matching its subexpression.

With grouped rules, variables are shared between the rules in the group. For example, take a rewrite system with these two rules:
  1. 1 -> x <> x
  2. x -> 0, but this rule only applies to subexpression m.
And make rule 1 and rule 2 be grouped together. This means that the variable x in rule 1 is the same as the variable x in rule 2. After the first application we get our familiar results: But now all three results can be reduced further. Continuing with the second result we get 0 + ((0 <> 0) <> 1) and 0 + (1 <> (0 <> 0)) as possible answers.

Dynamic rewriting

Up to this point we had a clear separation between "code" (rewrite system) and "data" (input expression). But in order to achieve the full potential of the language, we must allow it to modify itself. So, the next change we make to the standard rewriting is that we encode the rewrite system in the input expression. The particular details of how this is done are arbitrary and not important. Suffices to say that we can somehow identify rewrite rules in an expression that represents the program.

Sharing

Finally, if we allow fixed rules to refer to arbitrary subexpressions, then there will be situations in which two such rules are fixed on the same subexpression. In other words, there will be these rules that refer to the same shared state. To represent such a situation, the regular "expressions" are no good and we must go beyond them and encode our programs as "cyclic expressions". In CS terms, if expression is a tree, then cyclic expression is a graph.

Why?

Existing rewrite systems are ready for use as programming languages, and that is usually with good fun. However, when our problem involves manipulating a state, then this becomes a hassle, especially if the state is shared. This usually requires workarounds, like carrying the state through the whole program. When dealing with state like that, there is a feeling that the language is simply not made for it. But if you look closely at the way that fixed rules work together with rules grouping, then it should become clear that what they do is directly modify the references shared state. My language is my try at adding the state to standard rewrite systems.

In this sense the language is an experiment. And the results of it can be somewhat rigourously analysed. For example, as it turned out, adding the state in the form as I have conceptually described in this post, is enough to make the language being able to express pretty much every feature seen in typical programming languages. Meaning that what is a builtin feature in a typical programming language can be added to ρs by the user, written as a collection of rules. Furthermore, if concurrent rewriting is involved, the state is absolutely required for this result to hold.

return home