IRP programming : an OCaml implementation

From WikiOsp
Jump to: navigation, search

this page is under redaction

Introduction

Intuitive description of a Builder Tree

Let us give an intuitive example of the concepts of builder tree. Suppose we want to built a Triangle ABC from 3 Points A, B and C in the plane, using Point A as the summit and Points B and C as the basis Vector (or bi-point).

Let us define a Point value (noted [P]) as the couple (x, y) of the two floats, coordinates of P. Point A value [A] = (xa, ya), Point B value [B] = (xb, yb) and Point C value [C] = (xc, yc).

Let us define the Vector BC value (noted [BC]) as the couple ([BC] = ([B], [C]) ). Let us define the Triangle ABC value (noted [ABC]) as the couple ([ABC] = ([A], [BC]) ).

To build ABC :

step 1 : we read the coordinates xa, ya which gives the value of Point A.
 a_build : xa -> ya -> [a]  
step 2 : we read xb, yb, xc, yc to build points B and C.
 b_build : xb -> yb -> [b]   
 c_build : xc -> yc -> [c]   
step 3 : from them we build Vector BC
 bc_build : [b] -> [c] -> [bc]   
step 4 : from Point A and Vector BC we build the value of Triangle ABC.
 abc_build : [a] -> [bc] -> [a]   

Builder tree

We can summarize these steps by drawing the Builder tree of Triangle ABC :

             Triangle ABC <--- step 4
               /         \
              /           \     
step 1 --> Point A    Vector BC  <--- step 3
            / \        /      \
           xa ya      /        \  
                   Point B  Point C  <--- step 2
                    / \      / \
                   xb  yb    xc  yc

This tree represents the relation necessary to the building of the value of between a node and its children. Instead of children we shall say builder, because each level gets its values (is built) from the values of its builders : floats xa ya are the builders of Point A, floats xb yb are the builders of Point B and floats xc yc are the builders of Point C. Points B and C are the builders of Vector BC. Vector BC and Point A are the builders of Triangle ABC.

Remark 
the leaves of the Tree are basic data, here floats. But the same kind of tree may have been done with colors instead of coordinates and a specific function to paint a vector from two points and a triangle from a vector and a point.
Remark 
this tree is an expression tree. A value is associated to each subtree.
          This value is obtained via an interpretation i.e. a function from the tree set ->  value domain
Remark 
a calculation is a sequence of steps. Starting from an initial state, going to the final step. At each step a calculation ??? rule is applied. It is a transformation.

Intuitive description of the IRP method

The IRP methods consists in hiding any application of the building functions to access any argument value. Instead we design a series of e_provide functions without any arguments which produce the value [e]. The calculations of the builder tree upper, are now done in reverse order :

We start with step 4 :

   [abc] = abc_provide

the function abc_provide hides the following sequence :

abc_provide =
 (* provision of arguments values *)
   [a] = a_provide
   [bc] = bc_provide
 (* calculation of value with building function *)
   [abc] = abc_build [a] [bc] 
;;

the same kind of hiding is used in

a_provide =
 [xa] = read of xa
 [xb] = read of xb
 [a] = a_build [xa] [xb]
b_provide, c_provide 
and
bc_provide =
 [b] = b_provide
 [c] = c_provide
 [bc] = bc_build [b] [c] 

The program now runs like this

 step 4
 step 1
 step 3
 step 2
Consequence 
the use of provide functions without any argument allow the programmer to be provided with any value of any entity of the program from any part of the program.
Consequence 
there is no accidental coupling between two modules. The only coupling is the harmless built by relation which couples a provide function with those of its builders.
Remark 
this procedure is similar to the Kleene rules (Zohar Manna Mathematical Theory of Computation, Addison Wesley 1974), which consist, to calculate [abc], in launching the calculation of [a] simultaneously with that of [bc], and those of [b] and [c] to obtain [bc].

Assumptions of the IRP method

Building function

We shall define later the concepts of domain, entity and entity value.

Definition : a building function is a function which calculates the value [e] of an instance of an entity e when applied to the values of the builders of e (it is noted e_build).

The IRP method lies on two assumptions on the building functions of an entity value:

Unicity of a building function :
for each entity value [e] there is one and only one building function e_build and to each building function e_build corresponds a unique entity e. In other words, there is a bijection between e and e_build.


Unicity of the builders list :
the arguments of a building function are the values of a well defined list of entities : its builders. By well defined we mean that the function is not partial : all arguments values must be defined when the function is applied.

Some definitions

Domain

Definition 
a domain is a set of entities that we aim at representing by a specific software (a program) using a computer language.

Universe

Definition 
the Universe gathers all domains implemented in a Program.

Entity

Definition 
an entity of a domain is any individual of this domain having some properties to be calculated by a program or being necessary to such a calculation.

Program

Definition 
a program is a software allowing, the description and the management of all information necessary to achieve the calculation of any properties for a given domain using a computer language, at the exclusion of library functions.

Library function

Definition 
a library function is a piece of code independent of any domain that can be used as a library function.
For example 
in what follows we shall use a domain named Figure containing only three entities : a Triangle, a Vector and a Point and an other domain named Elementary containing only four low-level entities : a Coordinate_tuple, a Coordinate, a Unit and a Color.
For example 
algorithms for Fourier transform, matrix diagonalization, determinant calculation, integrals evaluation, tree or list management, sorting and so on are library functions. They are not specific to a domain.

A program is therefore coarse-grained and divided into modules organized according to an architecture

The IRP method is concerned by the definition of the entities of a domain, the design of their relations and the architecture of their modules.

Data structure

Definition 
a data-structure is a collection of any kind of data using at least one cartesian product.
For example 
any tuple, a list are data-structures.

In Caml, any polymorphic type is a data-structure. We shall see later the fundamental distinction that exists between data-structure and descriptive entity.

Two kinds of Entity

There are two kinds of entities used to described a domain, the descriptive entities and the properties entities.

Each entity is described by its own modules.

I Descriptive entity

Definition 
a descriptive entity for a domain is any entity whose value is necessary to calculate (i.e. is a builder of) any other entity value of the domain. It is an abstract description of the individuals of a domain.
For example 
A Point, a Vector, a Triangle of the Figure domain are such entities as shown by the Builder tree described upper.

Descriptive entities have four types : label, tag, container and value, each implemented as a different and homonymous Caml type.

Type Label
Definition 
a label expresses the fact that an entity exists in a Domain.

A label contains nothing. It just is.

Consequence 
the type label of an entity e (noted e_label ) is a pure Caml variant type, excluding any data-structure.
For example 
type point_label = Point of string
type vector_label = Vector of string
type triangle_label = 
 | Triangle_isosceles of triangle_isosceles_label
 | Triangle_scalene of triangle_scalene_label
type triangle_isosceles_label =
 | Triangle_isosceles_equilateral
 | Triangle_isosceles_right
 | Triangle_isosceles_obtuse
 | Triangle_isosceles_acute
type triangle_scalene_label =
 | Triangle_scalene_right
 | Triangle_scalene_obtuse
 | Triangle_scalene_acute
Remark 
instead of using bare Constructors like Triangle_isosceles_equilateral, it would have been possible to use something like :
 | Triangle_isosceles_equilateral of string 

where the basic type string could have been instantiated with the name of the Triangle, which is arbitrary.

Constructor Tree

The type triangle_label hierarchy defines a Tree : the Constructor Tree of the entity Triangle.

                              Triangle_label
                               /        \
             Triangle_isosceles_label  Triangle_scalene_label
              /     |    |      |       |     |    \
      equilateral right acute obtuse   right acute obtuse 


The entities Vector and Point have a Constructor Tree reduced to one Constructor.

It is convenient to define the Constructor Tree of the whole domain Figure by adding the type figure_label in order to have a common figure_label type for any entity of the Figure domain.

type figure_label =
  | Triangle of triangle_label
  | Vector of vector_label
  | Point of point_label
                                               Figure_label
                                        /                           \                   \
                           Triangle_label                        Vector_label       Point_label
                            /        \                               |                  |
          Triangle_isosceles_label  Triangle_scalene_label          vector            point
           /     |    |      |       |     |    \
   equilateral right acute obtuse   right acute obtuse 


Therefore any of the 9 entities labels of the Domain Figure are handled by one common type figure_label.

Type Tag
Definition 
a sibling is a child node of a given node, in a tree.
Definition 
a sibling-index is an integer ranging from 1 to N, defining the relative position of the siblings.

The Builder tree shown upper, can be viewed as the tree of the labels of the entities referred by each node. Label types may not be unique. It is important that any entity of a Program be labelled uniquely. A tree node may be easily indexed uniquely with its tree-index.

Definition 
the tree-index of a node in a tree, is the sequence of its sibling-index followed by of all the sibling-index of its ancestor nodes met along its path to the root of the Tree.
For example 
the indexation of the Builder tree for Triangle ABC gives :
                         [1]  <-----------  ABC (Root)  
            /                        \
           /                          \     
        [1, 1]                      [2, 1]  <---- Vector BC
         / \             /                     \     
[1, 1, 1] [2, 1, 1]   [1, 2, 1]               [2, 2, 1] <--- Point C
                     /          \           /          \
             [1, 1, 2, 1] [2, 2, 2, 1][1, 2, 2, 1] [2, 2, 2, 1]  <--- yc
Definition 
a type tag for an entity e , noted e_tag couples its type label with its tree-index of the Builder-tree of the Domain.
For example 
the tag for Point A is (point_label ("A"), [1, 1])
Type container
Definition 
the type container for an entity e noted e_container is a data-structure collecting the tags of its builders.
For example 
type triangle_container = (point_tag, vector_tag)
type vector_container = (point_tag, point_tag)
type point_container = (float, float)

The relation between containers is defined by the Builder tree.

Type value
Definition 
for a descriptive entity its type value is obtained by substituting recursively the tags of its builders (container) by the elementary values of the corresponding leaves of the Builder tree.
For example 
 vector_value = ( (xa, ya), (xb, yb) ) where xa, ya, xb, yb are floats.

II Property entity

Modules

A module is an element of the program (a Caml file) where an aspect (???) of an entity is defined and its functionalities implemented.

Provider function

Let us consider the calculation of the value a_value of whichever EI, for example, A.
Rule 1 implies that there is only one way to calculate a_value : a call to its unique building function a_build.
While Rule 2 implies that the values of the arguments of a_build (b1_value, b2_value, ..., bn_value) can be calculated by calling the well defined list of n functions b1_build, b2_build, ... ,bn_build, because Rule 1 applies also to B_1, B_2, ... ,B_n (bi_value = bi_build ( ... ) ).

Therefore :

the call to a_build and the provision of the values of its arguments can be encapsulated in an argumentless function ( a_provide ), applying (pseudo-)recursively the same mechanism to the arguments entities.

The argumentless function a_provide will look like :

    a_provide 
      b1_value = b1_provide
      b2_value = b2_provide
      ...
      bn_value = bn_provide
      result = a_build (b1_value, b2_value, ... , bn_value)
    end

and the bi_provide used to calculate the builders entities B_i will look like :

     bi_provide 
      ci1_value = ci1_provide
      ci2_value = ci2_provide
      ...
      cim_value = cim_provide
      result = bi_build (ci1_value, ci2_value, ... , cim_value)
    end
Because a povider function is argumentless, this mechanism garantees that a_value does not depend on the context in which it is called in the program.

There is no more to be worried about how to provide the values of the arguments of a_build : they will be recursively computed until their specific data are reached (see examples below). Therefore, the main difficulty of the standard way[1] of programming is avoided.

It must be emphasized that for using a function bi_provide nothing has to be known about the way a Bi have been computed. The only requirement is that someone has previously implemented the code of bi_provide somewhere in the program.

Rule 1 and Rule 2 ensure that the code will work in any case or fail at link if one of the provide function has not been implemented (Note that in Caml the failure will even occur at compilation).

Provider Summary in Caml

 let provide tag =
   let tag_child_list = formula tag in
   let value_child_list = 
     List.map Child.provide tag_child_list
   in
   build value_child_list

An example for Part I

There are many ways to implement the examples below depending on the language used (see Examples wikifrm:IRP_Programming_Example_OCaml_Triangle) These examples are written in a pseudo-language which does not exit, in the simplest way as possible.

A program to print the surface of a triangle

We want to write a program which, given the 9 floating-point values defining the coordinates of the 3 vertices of a triangle, will calculate its surface and print it.

Design of the program

We proceed from top to bottom, designing each needed EI one after the other.

Design of a Triangle

Design of a Vector

Design of a Point

A possible implementation

The provider for a Point
The provider for a Vector
The provider for a Triangle
Building a Triangle
The main module

The execution of the program

Examples of implementations

The target

The data

The nodes

A path

A sub-tree

A value

Symbols

A symbol names a class of entities having the same nature

The production-tree as a symbolic descriptor

The world of values as a dual of the symbolic world

Formulas

A formula is used to translate the symbol of an EI into the symbols of its childs. It is a data_structure F made off child symbols.

By applying a mapping to a formula F we can obtain the same data structure P filled with the child paths and thus calculate the builders values with the provide function of each child.

Example : a molecule tag formula is a tree of fragment tags, a fragment tag a list of block tags.

In Caml a tag is not a data structure it is a "pure" descriptive type it "contains" nothing, while a formula is a data structure which contains the tags of the child entities.

Conclusion

The basic idea under IRP is very simple get a value wihout any reference to the parameters needed to calculate this value. The consequences on the properties of the resulting code are not and it is possible that not all properties have yet been explored.

Anybody having used IRP is astonished by how easy is programming with it. It woud be even easier if people in charge of writting compilers could implement IRP in their language (i.e. the provide mecanism as part of the language).

An other application could be to write an OS on these principles : the package dependence would then be automatic and the growth of the system would have little effect on its maintainability.

References

  1. Cite error: Invalid <ref> tag; no text was provided for refs named standard_way

Public notes and remarks

Passages to be argumented


Passages to be clarified


Passages to be commented


Passages to be discussed


Passages to be emphasized


Passages to be examplified


Remarks


Specific Questions


General questions on the whole text


  • localization : ~/sources/irp_top/paper/Reduction_of_inter-module_coupling_using_IRP_programming_scheme.mw