Functional Data Model
Table
of Contents
1 Introduction to Functional Data Model....................................................... 2
1.1 Introduction............................................................................................... 2
1.2 Functional
Data Model............................................................................... 2
1.3 Database
Functions................................................................................... 4
1.4 Queries
in the Functional Data Model......................................................... 7
1.5 Data
Manipulation Functions...................................................................... 8
1.6 Definition
of Functions............................................................................. 12
1.7 Recursive
Functions................................................................................. 13
1.8 Conclusion............................................................................................... 19
Generally, any persistent collection of data in computer's memory is called a database. A database describes a particular part of the real world (i.e. some organization, some activity, a market condition or something like this i.e. a database is an informational model of a real world.
An information model consists of two main parts: entities and relationships. Entity is a distinguishable object in the area being modeled, e.g. a person, product, event etc. Relationships are particular logical connections, physical links or some other associations between entities. Each entity is represented in a database by a number of facts (data items) combined into a so-called Data Object which is an addressable unit of a database. I.e. users can store, retrieve or modify data objects as such.
Obviously, an information model must be dynamic. A database changes as soon as the "real world" evolves. In order to maintain such a dynamic information model of the "real world", we need a special software package that provides a convenient way of creating, modifying and accessing the data base. Such software package is normally called a Database Management System (DBMS). In other words, a DBMS is a tool to be applied by the users to build an accurate and useful information model of their organizations.
To build a database the user have to accomplish the following tasks:
1. accurately define a structure of the database (i.e. to define how the information is organized within the database);
2. apply a collection of operators or so-called rules of
inference which are
supported by the DBMS, to retrieve, store, or modify data that are of interest.
Thus, the DBMS supports two different but closely connected languages:
· Data Description Language (DDL) and
· Data Manipulation Language (DML).
The data description language (DDL) is a collection of statements for the description of data structure types. The user must define a database structure in terms of these data structure types. A database structure defined by means of a DDL is called a Data Base Schema or Conceptual Schema.
A data manipulation language (DML) is a collection of operators or rules of inference which can be applied to any valid instance of the data types listed in the data base schema. A database schema contains the description of all types which are of interest to users. A database itself contains instances of the previously defined data types.
A combination of particular Data Manipulation and Data Description Languages is called a Data Model. Generally, the data model specifies rules according to which data are structured and the associated operations that are permitted. It may also be seen as a technique for the formal description of data structures, usage constraints and operations. The facilities available vary from one model to another.
More formally, a data model is a combination of at least three components:
1. A collection of data structure types;
2. A collection of operators or rules of inference which can be applied to any valid instance of the data types listed in (1);
3. A collection of general integrity rules which implicitly or explicitly define the set of consistent database states or changes of states or both.

In conventional database systems, procedures, data structures and actual
content are usually separated. Thus, a conventional database management systems
(DBMS) provides users with a possibility to store, modify or retrieve data that
structured in accordance with a current database schema.
It should be especially noted, that a DBMS retrieves data as they were stored into the database and additional procedures can be applied to such data as an independent level of application programs.
In contrast, the functional data
model provides an unified approach to manipulation both data and procedures.
Main idea of the functional data model is a definition of all components of an
information system in the form of functions. Thus, for example, the functional
data model defines data objects, attributes and relationships as so-called
database functions. Moreover, a Functional Data Manipulation Language is a
number of data manipulation functions which can be applied to database
functions. Finally, users are provided with a special mechanism which is called
Lambda Calculus to define their own functions which can be seamlessly combined
with database and data

manipulation functions mentioned above.
Data Objects are called Database Entities or simply Entities in terminology of the Functional Data Model. All Entities must be declared as special functions that are devoid of parameters.
For example, the database schema dealing with three Entities - Customer, Product and Transaction where each transaction describes a shipment of a product to a customer, can be defined as follows:
CUSTOMER( ) à ENTITY
PRODUCT(
) à ENTITY
TRANSACTION( )
à ENTITY
Note that two or more different entities within a functional database may have duplicate values of all data items (attributes). The Internal Key (IK) is conceptually a data item, whose value is automatically associated with each stored entity in the database. We can think of it as a unique internal identifier used inside a database to distinguish one entity from another. Each entity is assigned an internal key value when it is stored in the database for the first time. An entity retains a value of the internal key (even if the entity is modified) until the entity is finally deleted from the database.
Thus, we can see the function
<entity
name>( ) à ENTITY
as a function which produces a particular value of the internal key for each existing entity.
For instance, the function CUSTOMER( ) à ENTITY might look as follows:
|
Entity |
Internal key |
|
#1, Smith, . . . |
$C1 |
|
#2, Hill, . . . |
$C2 |
|
#8, Johns |
$C3 |
In analogy, all attributes of a particular entity must be also defined in the form of functions. For example, if the entity "CUSTOMER" is described with attributed C# (customer's number), CNAME (customer's name), CITY (city where the customer lives) and PHONE (customer's phone number), then such functions should be declared as follows:
C#(CUSTOMER) à integer
CNAME(CUSTOMER) à string
CITY(CUSTOMER) à string
PHONE(CUSTOMER) à integer
Such function maps a particular value of the internal key into a corresponding value of the attribute.
For instance, the functions:
C#(CUSTOMER)
à integer and
CNAME(CUSTOMER) à string
might look as follows:
|
C#(CUSTOMER) |
integer |
|
C#($C1) |
1 |
|
C#($C2). |
2 |
|
C#($C3) |
8 |
|
CNAME(CUSTOMER) |
string |
|
CNAME($C1) |
Smith |
|
CNAME($C2). |
Hill |
|
CNAME($C3) |
Johns |

Normally, an information model consists of two main parts: entities and relationships. The mechanisms of the functional data model described sofar do not suffice to cover relationships between entities which are, obviously, a very important part of an information model. Consider, for instance, the following relationships in the database being discussed.
The customer Smith ($C1) has bought the product VDU ($P1), this event is represented in the database by the entity TRANSACTION with internal key $T1. The customer Hill ($C2) has also bought the same product (see the entity TRANSACTION with internal key $T2). Such relationships between entities are described as database functions that are applied to entities and reurn entities as a result.
For instance,
CT(TRANSACTION) à CUSTOMER
PT(TRANSACTION) à PRODUCT
Functions of this kind transform an internal key of one entity into corresponding internal key of another one.
Thus, for the previously discussed example, the functions are as follows:
|
CT(TRANSACTION) |
CUSTOMER |
|
CT($T1) |
$C1 |
|
CT($T2). |
$C2 |
|
PT(TRANSACTION) |
PRODUCT |
|
PT($T1) |
$P1 |
|
PT($T2). |
$P1 |
Hence, the functional data model defines a database schema as a set of database functions.
For example:
/* Definitions of entities */
CUSTOMER( ) à ENTITY
PRODUCT( ) à ENTITY
TRANSACTION( ) à ENTITY
/* Definition of
attributes */
C#(CUSTOMER) à INTEGER
CNAME(CUSTOMER) à STRING
CITY(CUSTOMER) à STRING
PHONE(CUSTOMER) à INTEGER
P#(PRODUCT) à INTEGER
PNAME(PRODUCT) à STRING
PRICE(PRODUCT) à INTEGER
DATE(TRANSACTION) à STRING
QNT (TRANSACTION) à INTEGER
TPRICE(TRANSACTION) à INTEGER
/* Definition of relationships */
CT(TRANSACTION) à CUSTOMER
PT(TRANSACTION) à PRODUCT
There exists a useful graphic notation for functional database schemas. This notation is known as a Data Structure Diagram. In accordance with the notation, each entity (i.e., a function which defines an entity) is depicted as a rectangle. The rectangle contains the name of the entity. All other functions are depicted as arrows. The arrow is directed to the resultant entity or data type; the arrow emanates from the symbol of entity which corresponds to parameters of the function.

In the functional data model,
for each function f(s) à t the inverse function Inv_f(t) à s is also available.
For instance, the inverse fuction INV_C#(integer) à CUSTOMER maps values of the attribute C# into internal keys of the entity Customer.
|
INV_C#(integer) |
CUSTOMER |
|
INV_C#(1) |
$C1 |
|
INV_C#(2). |
$C2 |
|
INV_C#(8) |
$C3 |
Analogously, the inverse function INV_CNAME(STRING) à CUSTOMER transforms a particular value of the attribute CNAME into a corresponding internal key of an entity CUSTOMER.
|
INV_CNAME(string) |
CUSTOMER |
|
INV_CNAME(Smith) |
$C1 |
|
INV_CNAME(Hill). |
$C2 |
|
INV_CNAME(Johns) |
$C3 |
There also exists inverse functions which describe relationships between entities. For instance, the inverse function INV_CT(CUSTOMER) à TRANSACTION transforms an internal key of the entity CUSTOMER into a corresponding internal key of the entity TRANSACTION.
|
INV_CNAME(string) |
CUSTOMER |
|
INV_CT($C1) |
$T1 |
|
INV_CT($C2). |
$T2 |
The functional data model supports single-valued and multi-valued functions. All functions discussed sofar are single-valued ones because they produce single value as a result. Multi-valued functions produce results which belong to a so-called bulk data type. The only bulk data type available in the functional data model is a list (i.e., sequential number of elements which can be lists or single values in turn). Hence, multi-valued functions are represented by list-valued functions in the functional data model.
For instance, the inverse function
INV_PT(PRODUCT) àà TRANSACTION
is a list-valued function since it generally, produces a list of values as a result.
INV_PT($P1) àà ($T1, $T2) list of single values of internal keys.
Note that a single-valued function may be applied to a list (i.e., a list of values may be used as a parameter of single-valued function). In this case, the result of the function is also a list of values, because the function is considered to be applied sequentially to each element of the input list.
For instance,
INV_CT(($C1,$C2)) à ($T1,$T2)
list as a parameter list as a result
CNAME(($C1,$C2)) à (Smith, Hill)
The type of a particular database function gives essential information about the semantics of a database.
For instance, if the function INV_C#(integer) à CUSTOMER is defined as a single-valued one, then the attribute C# can be seen as an external key (i.e., duplicate values are not allowed).
In analogy, definition of the function CP(TRANSACTION) à CUSTOMER as a single-valued function ensures that each entity "TRANSACTION" corresponds to exactly only one entity "CUSTOMER", and so on.
Thus, in order to distinguish types of certain functions, single-valued functions are represented by single-headed arrows (à) and multi-valued functions are represented by double-headed arrows (àà). If inverse function is not declared explicitly as a single-valued one, it is considered to be a multi-valued function (default option).
For instance, the functions INV_CNAME, INV_PT, INV_CT etc. are multi-valued ones and, hence, can be defined as follows:
INV_CNAME(STRING) àà CUSTOMER
INV_CT(CUSTOMER) àà TRANSACTION
INV_PT(PRODUCT) àà TRANSACTION
Thus, a functional database system consists of the following
components:
1. Database functions defined by a database administrator (a functional database
schema);
2. Current state of the database functions defined as a collection of pairs
[Parameter] à[Resultant Value]
Queries in the functional data model are also defined using of functions. Such a user-defined function is called a query function or data manipulation functions. In the most simple case, database functions can be just combined in order to build query functions.
For instance, the query
"Get names of customers from Paris" can be defined as the following composite function:
CNAME(INV_CITY("Paris")) à string
Suppose that a database is defined by means of two external database functions:
CITY(CUSTOMER) à string and CNAME(CUSTOMER) à string
Suppose also that the functions looks as follows:
|
CITY(CUSTOMER) |
string |
|
CITY ($C1) |
London |
|
CITY ($C2). |
Paris |
|
CITY ($C3) |
Graz |
|
CITY ($C4) |
Paris |
|
CNAME(CUSTOMER) |
string |
|
CNAME ($C1) |
Smith |
|
CNAME ($C2). |
Hill |
|
CNAME ($C3) |
Johns |
|
CNAME ($C4) |
Maier |
In this particular case, the query is evaluated as follows:
1. INV_CITY("Paris") à ($C2,$C4)
2. CNAME(($C2,$C4)) à ("Hill", "Maier")
Analogously, the query "Get names of customers who bought the product "CPU "" is defined in the form:
CNAME(
CT( INV_PT(INV_PNAME("VDU") ) ) )
à string
Suppose also that the database functions (i.e. the current database state) looks as follows:
|
PNAME(CUSTOMER) |
string |
|
PNAME ($P1) |
CPU |
|
PNAME ($P2). |
VDU |
|
PT(TRANSACTION) |
PRODUCT |
|
PT ($T1) |
$P1 |
|
PT ($T2) |
$P2 |
|
PT ($T3) |
$P1 |
|
PT ($T4) |
$P2 |
Thus, the multi-valued function INV_PT returns the following lists:
INV_PT($P1) à ($T1, $T3)
INV_PT($P2) à ($T2, $T4)
|
CT(TRANSACTION) |
CUSTOMER |
|
CT ($T1) |
$C1 |
|
CT ($T2) |
$C2 |
|
CT ($T3) |
$C3 |
|
PT ($T4) |
$C4 |
In this case, the query is evaluated as follows:
1. INV_PNAME("CPU")à ($P1)
2. INV_PT(($P1)) à ($T1,$T3)
3. CT(($T1,$T3)) à ($C1,$C3)
4. CNAME(($C1,$C3)) à ("Smith", "Johns")
Normally, Database Functions are not sufficient to implement more or less complex queries. Thus, functional DBMS support a number of predefined (i.e. standard) Data Manipulation (DM) Functions which can be applied to the same data types as database functions (i.e., to single values or to lists). Such Data Manipulation Functions are often called a Functional Programming Language. Thus, we can also say that the functional data model extends the number of functions available in a particular functional programming language with database functions which are defined in a current database schema. There is a special mechanism for defining new, purpose-oriented functions as a combination of existing functions. This mechanism is called a Lambda Calculus.
In this paper, we use a notation similar to the programming language LISP [4] as a concrete example of such a functional programming language. Let us introduce some of such functions.
The function SELECT operates on two lists of equal lengths (i.e., which include the same number of elements); The first list consists of values of any type; the second list includes only Boolean values (i.e., "True" or "False"). The function returns the resultant list which consists of values from the first source list positionally corresponding to the "True" values in the second source list.
For example,
L1 = ("Smith", "Hill", "Johns", "Maier")
L2 = (True, False, True, False )
SELECT (L1, L2) à ("Smith", "Johns")
The function MAP has a certain list and another function as parameters. The function returns a list of values which is produced by means of application of the function-parameter to each element of the source list. Suppose the function ADD1(x) returns the result "x+1", then the function MAP operates as follows:
MAP((1,2,3,4), Add1) à (Add1(1), Add1(2), Add1(3), Add1(4)) à (2,3,4,5)
A number of so-called comparison functions is also available in functional programming languages. Such functions compare two values which are parameters, and return a boolean value (i.e., "True" or "False").
Suppose, the comparison functions available are as follows:
|
Function |
Returns “True” if |
|
EQ(N1, N2) |
|
|
GT(N1, N2) |
N1>N2 |
|
LT(N1, N2) |
N1<N2 |
Now more sophisticated queries can be defined by means of combining database functions which are declared in the database schema, and functions which belong to the functional programming language.
For instance, the query "Get names of products which cost more than 1000" can be defined as follows:
SELECT{PNAME(PRODUCT
( ) ), MAP[PRICE (PRODUCT( )),GT(1000)]}
Suppose, the functions PNAME and PRICE are:
|
PNAME (PRODUCT) |
string |
|
PRICE (PRODUCT) |
integer |
|
CPU |
|
PRICE($P1) |
900 |
|
|
PNAME($P2) |
VDU |
|
PRICE($P2) |
1200 |
|
PNAME($P3) |
HD |
|
PRICE($P3) |
1100 |
|
PNAME($P4) |
FD |
|
PRICE($P4) |
400 |
In this particular case, the query is evaluated as follows:
1. PNAME (PRODUCT) à (CPU, VDU, HD, FD)
2. PRICE (PRODUCT) à (900, 1200, 1100, 400)
3. MAP ( (900, 1200,1100, 400), GT(1000) ) à (False,True,True,False)
4. SELECT ( (CPU, VDU, HD, FD), (False,True,True,False) ) à (VDU, HD)
In analogy, the query "Get names of such customers who bought products that cost more than 1000" can be defined as follows:
CNAME
( CT ( INV_PT (
SELECT{ PRODUCT ( ), MAP [ PRICE
(PRODUCT), GT (1000) ] }
)))
For the same database, the query is evaluated as the following:
· PRODUCT ( ) à ($P1,$P2,$P3,$P4)
· PRICE (PRODUCT) à (900, 1200, 1100, 400)
· MAP ( (900, 1200,1100, 400), GT(1000) ) à (False,True,True,False)
· SELECT ( ($P1,$P2,$P3,$P4), (False,True,True,False)) à ($P2,$P3)
Suppose also that the functions INV_PT, CT and CNAME return respectively:
|
INV_PT(PRODUCT) |
TRANSACTION |
|
INV_PT ($P2) |
|
|
INV_PT ($P3) |
($T3) |
|
CT(TRANSACTION) |
CUSTOMER |
|
CT ($T1) |
|
|
CT ($T2) |
$C2 |
|
CT ($T3) |
$C3 |
|
CNAME(CUSTOMER) |
string |
|
CNAME ($C1) |
|
|
CNAME ($C2) |
Johns |

I.e., the current state of a hypermedia database can be depicted as follows:
In this particular case, the query is further evaluated as shown below:
· INV_PT( ($P2,$P3) ) à ($T1,$T2,$T3)
· CT( ($T1,$T2,$T3) ) à ($C1, $C2)
· CNAME( ($C1,$C2) ) à (Smith, Johns) - result of the query
Note that formally the example is not quite correct: in functional programming languages a multi-valued function always returns a list even if the list consists of one or zero single values. For instance, the multi-valued function INV_PT which is applied to the list of values ($P2, $P3) returns a list which consists of lists corresponding to each source value.
INV_PT( ($P2, $P3) ) à ( ($T1), ($T2), ($T3) )
Analogously, a single-valued function which is applied to a list of single values returns a list of the same arity (i.e. list which consists of the same number of elements) despite either the resultant list includes duplicates or not.
Thus, for instance,
CT ( ($T1, $T2, $T3) ) à ($C1, $C2, $C1)
Normally, functional languages include also the function SET which has a list as parameter and returns a resultant list consisting of only single values which can be found in source list; if an element of source list is a list, then the function SET is applied recursively. Note that the function SET does not include duplicate values into a resultant list. For example,
SET (INV_PT ($P2, $P3) ) à ($T1, $T2,$T3)
SET (CT ($T1, $T2, $T3) ) à (Smith, Johns)
In this paper, we consider that the function SET is applied to a result of any database function as a default option.
There also exist so-called single-valued set functions which are normally used to perform calculations in functional programming languages. Such a function accept a list of single values as a parameter, and returns a single value.
Set_function(<list>) à <single_value>
For instance, such well-known arithmetical functions as TOTAL, AVERAGE, MAX, MIN, etc., may be applied to a list of arithmetical values, and return an arithmetical value as a result.
Consider the following query "Get total price of all transactions committed by the customer "Smith""
TOTAL (TPRICE (INV_CT (INV_CNAME ("Smith") ) ) )
The query is evaluated as follows:
1. INV_CNAME ("Smith") à $C1
2. INV_CT($C1) à ($T1, $T3)
3. TPRICE( ($T1, $T3) ) à (2000,3000)
4.
TOTAL(2000, 3000)
à 5000
(result of the query)
Conventional set functions (i.e., UNION, INTERSECTION and DIFFERENCE) are also used in functional programming languages. Let us just recall the definition of these functions:
UNION (S1, S2) = {x:(x S1 OR x S2)}
DIFFERENCE (S1, S2) = {x:(x S1 AND x S2)}
INTERSECTION (S1, S2) = {x:(x S1 AND x S2)}
For instance, the query "How many pieces of the product "VDU" have been bought by the customer "Smith"" can be defined as follows:
TOTAL (QNT (
INTERSECTION ( INV_CT (INV_CNAME ("Smith") ),
INV_PT (INV_PNAME ("VDU") )
) ) )
In this particular case, the function INV_CT (INV_CNAME ("Smith”))à($T1, $T3) is a first parameter of the function INTERSECTION;
The function INV_PT (INV_PNAME ("VDU")) à ($T1, $T2) is the second parameter.
The function INTERSECTION returns list of entities "TRANSACTION" matching both the conditions (i.e., customer "Smith" and product "VDU")
INTERSECTION ( ($T1,$T3), ($T1,$T2)) à ($T1)
Evaluation of the database function QNT and the data manipulation function TOTAL are obvious.
Manual definition of queries in terms of the functional data model is quite tedious, to say the least. The most attractive feature of the functional data model is the possibility to define a so-called user's view of the structure of a database in the form of persistent purpose-oriented functions. The purpose-oriented functions are defined by a Database Administrator by means of so-called Lambda calculus. Thus, a functional database schema usually, includes not only database functions which describe the actual structure of the database, but also a number of purpose-oriented functions which describe a particular application of the database. In other words, the users access a database by means of purpose-oriented functions which reflects the user’s typical applications without any additional knowledge on actual structure of the database.

Forexample, the previously discussed functional database schema:
|
CUSTOMER
( ) à ENTITY PRODUCT
( ) à ENTITY TRANSACTION
( ) à ENTITY CNAME
(CUSTOMER) à STRING PNAME
(PRODUCT) à STRING .
. . . . . CT(TRANSACTION)
à CUSTOMER |
can be extended with the following persistent purpose-oriented functions:
|
PURCHASE(CUSTOMER) àà PRODUCT/* returns list of such products which were bought by a particular customer */ |
|
DEMAND(PRODUCT)
àà CUSTOMER |
|
TOTAL_EXPENSES(CUSTOMER) à <integer> /*returns value of total
expenses of a particular customer for company's products*/ |
Now, from the the user’s point of view, the databases supports these "new" functions. The term "persistent" means that the users can apply these functions at any time without additional definition. In other words, the functions of new functions simply extend a current database schema, and, thus, the functions can be used in the same way as database functions.
For instance, the query "Get names of customers who bought the product "VDU"" is defined as follows:
CNAME (DEMAND
(INV_PNAME ("VDU") ) )
Similarly, the query "Get names of customers who bought products which cost more than 1000" is defined as:
(DEMAND
(SELECT
(PRODUCT ( ), MAP (PRICE (PRODUCT), GT(1000)))
))
Purpose-oriented functions are defined in the form of so-called Lambda-expressions.
<function> = Lambda <parameter_1> [, ... <parameter_n>]
<Body>
where <Body> is a valid combination of previously defined functions.
For instance, the function DEMAND is defined by means of the following Lambda-expression:
|
DEMAND:Lambda x CT(INV_PT(x)) |
Note that the parameter "x" is used within the body definition.
Note also that types of input parameters and type of returned value are implicitly defined by the body of corresponding Lambda expression. Thus, in this particular case, the parameter "x" is used as a parameter of the database function "INV_PT", and, hence, must be a list of internal keys of the entity "PRODUCT".
INV_PT(PRODUCT)
àà TRANSACTION
CT(TRANSACTION)
àà CUSTOMER
Hence, the function DEMAND can be seen as the following:
DEMAND(PRODUCT)
àà CUSTOMER
The functions can be further reused to define new functions. Thus, the functions PURCHASE and TOTAL_EXPENSES can be defined as follows:
|
PURCHASE = Lambda Y |
TOTAL_EXPENSES
= Lambda Z TOTAL(TRANS_PRICE(INV_CT(Z))) |
It should be especially emphasized that a number of purpose-oriented functions may be at any moment extended by means of Lambda expressions. In other words, the users of the functional data model may develop a particular database system evolutionarily (i.e., in step-by-step fashion) by means of defining new purpose-oriented functions as soon as the users learn from the experience gained.
The most powerful mechanism of the functional data model is the so-called recursive definition of functions. In order to demontrate such facilities, consider the following example which is known in database literature as so-called “bill of materials”.
Suppose a database represents, among other things, the inventory of a manufacturing company. In particular, it provides information on how particular parts are manufactured out of other parts. Thus, for each complex part, the database contains a list of subparts which are needed to manufacture the part, a cost of manufacturing the complex part from its subparts, the mass increment or decrement that occurs when the subparts are assembled. Note that such manufactured parts may themselves be subparts in a further manufacturing process. The relationship between parts is therefore hierarchical, but it is a directed acyclic graph rather that a tree; for instance, the part D may be used in the manufacture of parts B and C, which are both used in the manufacture of part A. In addition, the database contains a particular information on the parts themselves: their names and, if they are imported (i.e., manufactured externally), the supplier and purchase cost.
Thus, the database schema can be defined as the following set of database functions.
PART ( ) à ENTITY
USES (PART) àà PART
Note that there exist so-called base parts (i.e., parts which are not assembled from subparts). When such a base part is a parameter of the function USES, the function returns an empty list as a result. The empty list is denoted as "NULL_VALUE".

Consider, for example, the following current state of the database:
|
PART ( ) |
ENTITY |
|
|
USES ( PART) |
PART |
|
A |
$P1 |
|
|
USES ( $P1) |
<$P2, $P3> |
|
B |
$P2 |
|
|
USES ( $P2) |
<$P4> |
|
C |
$P3 |
|
|
USES ( $P3) |
<$P4> |
|
D |
$P4 |
|
|
USES ( $P4) |
< > |
In this particular case,
USES($P2) à <$P4>
USES($P4) à NULL_VALUE = < >
In addition, the functions
MASS
(PART) à integer
and COST (PART) à integer
should be also declared in the database schema. Obviously, these functions may be applied only to base parts; they return a mass of a base part and a cost of a base part respectively. Note that the functions return "NULL_VALUE", if they are applied to a composite part.
For instance,
MASS($P1)
à NULL_VALUE
COST($P3)
à NULL_VALUE
MASS($P4)
à10
COST($P4)
à 5
Finally, the database function
QUANTITY
(PART, PART) à integer
returns a quantity of pieces of a particular subpart which is needed in order to manufacture one piece of a composite part. For instance, if one piece of the part "A" is manufactured out of 5 pieces of the part "B" and 4 pieces of the part "C"; and manufacturing one piece of either part "B" or part "C" requires 2 pieces of the part "D"; then the function QUANTITY can be seen as the following:
|
QUANTITY (PART,PART) |
integer |
|
QUANTITY ($P1,$P2) |
5 |
|
QUANTITY ($P1,$P3) |
4 |
|
QUANTITY ($P2,$P4) |
2 |
|
QUANTITY ($P3,$P4) |
2 |
Note that the function QUANTITY may be applied only if the second parameter is returned by the function USES applied to the first parameter. Otherwise, the function QUANTITY returns NULL_VALUE.
For instance,
QUANTITY($P1, $P4) à NULL_VALUE
QUANTITY($P2, $P3) à NULL_VALUE
A new function QNT_LIST can be defined as
|
QNT_LIST Lambda x QUANTITY (x, USES (x)) |
QNT_LIST
(PART) àà integers
Note that the function QNT_LIST has only one parameter (i.e., a composite part), and returns list of values which correspond to quantity of each subpart.
For this particualr example, since USES($P1) à <$P2, $P3>,
QNT_LIST ($P1) à <5,4>
In addition, a number of purpose-oriented functions can be defined on the basis of the database functions. Thus, the function BASE_PART which returns a list of base parts (i.e., a set of parts which are not manufactured from other subparts), can be defined as follows:
|
BASE_PART Lambda SELECT (PART ( ), EQ (MAP (PART (), USES), NULL_VALUE))) |
Suppose the function BASE_PART is applied to the previously discussed current state of the database.
|
PART ( ) à <$P1, $P2, $P3, $P4> SELECT (PART
( ), EQ (MAP (PART (), USES), NULL_VALUE))) |
|
SELECT
(<$P1, $P2, $P3, $P4>, EQ (MAP (<$P1, $P2, $P3, $P4>, USES),
NULL_VALUE))) SELECT
(<$P1, $P2, $P3, $P4>, EQ (USES($P1, $P2, $P3, $P4), NULL_VALUE))) SELECT
(<$P1, $P2, $P3, $P4>, EQ (USES($P1, $P2, $P3, $P4), NULL_VALUE))) |
|
SELECT
(<$P1, $P2, $P3, $P4>, EQ (<$P2, $P3>, <$P4>, <$P4>, NULL_VALUE), NULL_VALUE)) |
|
SELECT (<$P1, $P2, $P3, $P4>, |
<$P4> /* returned value */
In analogy, the function COMP_PART returning the list of composite parts (i.e., the subset of parts which are manufactured from other parts), is defined:
|
COMP_PART Lambda SELECT (PART ( ), NOTEQ (MAP (PART (), USES), NULL_VALUE))) |
Thus, the users may see these functions in the form:
BASE_PART ( ) àà PART
COMP_PART ( ) àà PART
In order to make recursive definition of functions possible, a number of special programming functions is normally used.
Thus, the function NIL returns the Boolean value "TRUE", if it is applied to an empty list; and the Boolean value "FALSE", otherwise.
For instance,
NIL(<$P1,$P2>) à FALSE
NIL(<>)
à TRUE
The function TRUE always returns the Boolean value "TRUE". Similarly, the function FALSE returns the Boolean value "FALSE".
The function COND is a more complex one. The function operates on the list which consists of a finite number of pairs of elements. Each pair includes a condition and another function. The function which is used as such a parameter, is evaluated, if a corresponding condition (i.e. , logical function) returns the Boolean value "TRUE".
The function COND may be seen in the following form:
|
COND( (<condition_1>,
<function_1>) (<condition_2>,
<function_2>) . . . . . . (<condition_n>, <function_n>) ) |
1. "condition_k" returns the Boolean value "TRUE" ;
2. all previous conditions (i.e., conditions <condition_1>, <condition_2>, ..., <condition_n-1>) return the Boolean value "FALSE".
Consider, for instance, the definition of the function IF_BASE which returns the Boolean value "TRUE", if it is applied to a base part, and returns the value "FALSE", otherwise.
There exist, at least, three different ways of defining this function.
Definition 1:
|
IF_BASE = LAMBDA x, EQ(USES(x), NULL_VALUE) |
Definition 2 (using the COND function):
|
IF_BASE = LAMBDA x, COND
( (EQ (INTERSECTION (x,BASE_PART),
NULL_VALUE), FALSE), (TRUE,TRUE) ) |
Note, the function COND is applied to the list which includes two elements:
1.
(condition_1, FALSE)
2.
(TRUE,TRUE)
where "condition_1" is the logical function
EQ (INTERSECTION (x, BASE_POINT), NULL_VALUE).
The “condition_1” returns "TRUE", if the parameter "x" is not a base part. In this case, the function IF_BASE returns the Boolean value "FALSE" because the function FALSE is evaluated in accordance with the evaluation rules for the function COND. If the function IF_BASE is applied to a base part, then the condition_1 returns the Boolean value "FALSE", and the function FALSE is not evaluated. The function TRUE always returns the value "TRUE", hence the corresponding function ("TRUE" in this particular case) is invoked whenever the previous functions ("FALSE" in this particular case) have not been evaluated.
Definition 3
(recursive):
There exist two special functions CAR and CDR which are commonly used in functional programming languages in order to define recursively functions.
The function CAR has a list as parameter and returns the first element of this list as result.
For example, CAR (<$P1, $P2, $P3>)
à $P1
Analogously, the function CDR returns the source list without the first element. In other words,
CDR
(<list>) = DIFFERENCE (<list>, CAR(<list>))
For instance,
CDR(<$P1,
$P2, $P3>) à <$P2, $P3>
CDR(CDR(<$P1,
$P2, $P3>)) à < $P3>
CAR(CDR(<$P1, $P2, $P3>)) à < $P2>
In order to define the function IF_BASE, we should define the function MEMBER first. The function MEMBER operates on a list and a single value as parameters; it returns:
- Boolean value "FALSE" otherwise.
|
MEMBER
= LAMBDA x, list COND( (NIL(list), FALSE), (EQ(x,CAR(list)), TRUE), (TRUE, MEMBER (x, CDR(list)) ) ) |
Consider the function evaluation:
MEMBER ($P2, <$P1, $P2, $P3>)
|
Step 1: x =
$P2, list = <$P1, $P2, $P3> COND( (NIL
( <$P1, $P2, $P3>))) à FALSE . . . . EQ(
<$P2, $P1>) à FALSE . . . . TRUE
à TRUE, then
the function
MEMBER ($P2, CDR($P1, $P2, $P3)) must be evaluated in turn. |
|
Step 2: x =
$P2, list = < $P2, $P3> COND( (NIL
( <$P1, $P2, $P3>))) à FALSE . . . . EQ( <$P2, $P1>) à TRUE , then the function TRUE must be evaluated. |
Thus, the function MEMBER($P2, <$P1, $P2, $P3>) returns "TRUE".
Now, definition of the function IF_BASE is obvious.
|
IF_BASE Lambda x, MEMBER(x, BASE_PART()) |
From a practical point of view, it seems to be reasonable to provide the users of the "Bill of Material" database with the additional function which returns a list of all base parts which are used in manufacturing of a particular composite part. Note that this function (say, named TR_USES) should take into consideration transitive relationships between parts. Thus, for instance, if the composite part "B" is used in manufacturing of the part "A", and the base parts "C" and "D" are used in manufacturing of the part "B"; then the function TR_USES being applied to the part "A" must return parts "C" and "D" as result. Note that in the same case, the database function "USES" returns the part "B" only.
Definition of the function TR_USES can be made in the following recursive form:
|
TR_USES
Lambda (x) COND ( (NIL(x),NULL_VALUE), (IF_BASE(CAR(x)),
UNION(TR_USES(CDR(x)),CAR(x)) (TRUE, TR_USES (UNION ( USES(CAR(x)),
CDR(x))) ) |
The function TR_USES($P1) is evaluated as follows:
|
Step1:
x=<$P1> NIL(<$P1>)
à FALSE IF_BASE($P1)
à FALSE TRUE à TR_USES (UNION (USES(<$P1>), NULL_VALUE)) |
|
Step2: x=<$P2,
$P3> NIL(<$P2,
$P3>) à FALSE IF_BASE
($P2) à FALSE TRUE à TR_USES (UNION (USES(<$P2>), <$P3>)) |
|
Step3: x=<$P4,
$P3> NIL(<$P4,
$P3>) à FALSE IF_BASE ($P4) à TRUE à UNION ( TR_USES(<$P3>), $P4 ) |
|
Step4:
x=<$P3> NIL(<$P3>)
à FALSE IF_BASE($P3)
à FALSE TRUE à TR_USES (UNION (USES(<$P3>), NULL_VALUE)) |
|
Step5: x=<$P4
> NIL(<$P4
>) à FALSE IF_BASE
($P4) à TRUE à UNION ( TR_USES(< >), $P4 ) à UNION (
NULL_VALUE, $P4 ) à < $P4 > |
Now, the result of the evaluation can be obtained by means of a so-called backtracking procedure which substitutes the recursive function by the result produced by means of evaluation of the next step starting from the last one. Thus, for this particular case, the backtracking procedure produces the following results:
|
Step 3: TR_USES(<$P4, $P3>) à UNION ( TR_USES(<$P3>), $P4 ) |
|
Step 2: TR_USES(<$P2, $P3>) à TR_USES (<$P4, $P3>) à
< $P4 > |
|
Step 1: TR_USES(<$P1>) à TR_USES(<$P2, $P3>) à
< $P4 > |
Thus, a functional database system consists of the following components:
1. Database functions defined by a database administrator (a functional database schema);
2. Current state of the database functions defined as a collection of pairs
[Parameter] à[Resultant Value]
3. A number of predefined DM functions which are reused to define new, purpose oriented functions by means of Lambda Calculus.
End-Users communicate with a functional database system by means of the purpose-oriented functions, and define a particular database query as a combination of database and purpose-oriented functions.
The following features of a Functional Database System distinguish it from Functional Programming Languages:
1. Database Functions are persistent database objects. They exist independently of user sessions and can be shared between users.
2. Database Functions have states that can change over time and may thus return different values to the same parameter at different times (this property is called reactiveness).
Since database functions are independent persistent database objects and purpose-oriented functions may be at any moment modified and/or extended by means of Lambda Calculus, functional databases are developed evolutionarily.