Abstract
This tutorial provides an introduction to rule-based programming and
its theoretical foundation: the rewriting calculus. The first class
objects manipulated are rewrite rules which application is controlled
by strategies. The presentation of the compilation techniques of such
languages will be presented. The ELAN language will be used to
illustrate the programming paradigm as well as the compilation
techniques that allow to reach the execution of more than 15 millions
of rewrite per second. Examples from tiny to large will be used to
ease understanding and to become familiar with the rule-based
programming paradigm
Table of Contents
Introduction
Rule-based programming began with artificial intelligence rule-based systems in
the seventies. This paradigm is inherent in Prolog and has been used in
program-manipulation systems like Refine. Indeed, the rewriting
concept appears throughout computer science, from its theoretical
foundations to very practical implementations. Extreme examples
include the mail system in Unix which uses rules to rewrite mail
addresses to canonical forms and the transition rules describing the
behavior of tree automata. Rewriting is used in semantics to describe
the meaning of programming languages, as well as in program
transformations like the re-engineering of Cobol programs. It is used
to compute, implicitly or explicitly, as in Mathematica or OBJ, but
also to perform deduction when using inference rules to describe a
logic, theorem prover or constraint solver. Last, but not least, this
approach is central to systems that raise the notion of rule to an
explicit first class object, like expert systems, programming
languages based on equational logic, algebraic specifications
(e.g. OBJ), functional programming (e.g. ML) and transition systems
(e.g. Murphi).
Rule-based programming is currently experiencing a renewed period of
growth with the emergence of new concepts and systems that allow one
to better understand and better use it. On the theoretical side, after
the in-depth study of rewriting concepts during the eighties, the
nineties saw the emergence of the general concepts of rewriting logic
and of the rewriting calculus. On the practical side, new languages,
like ASM, ASF+SDF, Claire, ELAN and Maude, and also commercial
products, like Ilog Rules, have shown that the concept of rule can be
of major interest as a programming tool. In particular, because it is
now of practical use, new fundamental questions arise, like the
theoretical study of the algorithmic complexity of programs written in
such languages, as well as their optimization. Of course, semantics of
such languages, compilation techniques and methodological studies of
their use should also be explored.
rule-based programming is closely related to both algebraic and functional
programming (when the term rewrite system is confluent) as well as classical
logic programming (when the rewrite system is used for nondeterministic
search). It is therefore deeply related to the main topics of the
ETAPS conferences.
Accordingly, the purpose of this tutorial is to give a presentation of
the main concepts of the field using the ELAN language and system.
Program
The tutorial will be divided into four main parts: introduction, semantics,
compilation and use.
This first section will provide a gentle introduction to the concepts of rewrite rules and of strategies. Numerous small and illustrative examples will be used to provide an in depth understanding of the capabilities of such programming paradigms. We will strongly insist on the distinction between functional evaluation performed by confluent and terminating rewrite systems and non-deterministic transitions performed by strategy controlled rewrite rules. This distinction between computation and deduction is respectively realized in ELAN by un-named (and therefore un-controlled) and named (and thus strategy-controlled) rewrite rules. This introduction will also relate to rule-based systems like ASF+SDF, Cafe-OBJ, Maude, Stratego.
The semantics of such languages can be given in a natural way by the rewriting calculus, a generalization of the lambda calculus allowing first to abstract not only on variables but also on general patterns and second to explicitly handle non-determinism. We will introduce the rewriting calculus and its properties and use it to give a semantics to rewrite based languages and in particular to ELAN.
Because pattern matching is the main atomic step in the evaluation of rewrite based language, its compilation techniques should be made very efficient. Although it is essential to have a good matching algorithm to get an efficient rewriting engine, matching is not the only operation involved in a normalization process. To compute the reduced term, a global consideration of the whole process is crucial: all data structures and sub-algorithms have to be well-designed to cooperate without introducing any bottleneck. In this tutorial we show how to reuse and improve existing techniques, and invent new ones, in order to build a compiler for associative-commutative rewriting in presence of non-determinism strategies. Such techniques have demonstrated their efficiency in the ELAN compiler: it can execute 1 to 2 million (complex conditional) rewrite rules per second on standard Intel or Alpha hardware (from 30,000 to 100,000 pure associative-commutative rewrite steps per second up to 15 millions for simple examples such as Fibonacci numbers). This will enlights how optimized compilation techniques can promote rewrite rule based languages at the level of the best functional or logical language implementations.
To exemplify the concept of rule-based programming, we will present several significant applications implemented using ELAN.
In this direction, we show how to specify, test and prove the Needham-Schroeder authentification protocol. A main interest of this non-trivial and illustrative example is to show how convenient the description of a protocol can be done using the concept of strategy available in ELAN. This will also emphasize the interest of associative and commutative operators to model the non-deterministic communication in networks.
Rule-based programming is particularly well-suited to describe transformation rules over (equivalent) constraints. It has been successfully used to prototype in ELAN rule-based algorithms for solving constraints in equational theories such that Commutativity, Associativity-Commutativity, Booleans, and their combinations.
The rewriting calculus allows us to specify the operational semantics of programming languages, in order to obtain an execution mechanism for programs written in these languages. This will be exemplified with Constraint Logic Programs executed by ELAN. Moreover, we will show how the rewriting engine of ELAN can be used to execute a large class of CASL specifications, where CASL is the Common Algebraic Specification Language developed by the CoFI working group. Currently, the translation from CASL to ELAN is achieved by transforming a CASL abstract syntax tree into an ELAN abstract syntax tree, both represented by Annotated Terms. We will discuss the possibility of performing this translation according to an XML representation and a XSLT tool.
Context
Rewriting is a main theoretical as well as practical concept of computer
science. It is an excellent model of computation as well as programming
paradigm. This tutorial takes place in the context of the events directly
concerned by rewriting: the RTA conferences and the rewriting logic (WRLA) as
well as RULE workshops. The rewriting
page offers detailed informations about these events amongst many
others.
This tutorial, reviewing in depth the conception, implementation and use of
rule-based languages, will cross the main topics of the ETAPS's conferences:
programming concepts and new methodology (concerns FASE and ESOP), semantics
(and therefore FOSSACS), compilation techniques (clearly relevant to CC) and
applications like protocol description (a topics of interest for TACAS).
History of the Tutorial
Under the present proposed form, this will be the first edition of the
tutorial. Several related presentations of the ELAN concepts and system have
been and will be done, in relation to the material we propose to
present here. Let us mention:
Audience
All participants to the ETAPS conferences from students to senior researchers
could benefit of this introduction to the main concepts of rule-based
programming. Some preliminary knowledge on rewrite based techniques
may help but is not mandatory.
Instructors
Materials