**Unformatted text preview: **See discussions, stats, and author profiles for this publication at: A Survey of Formal Software Development Methods
Article · April 1995
Source: CiteSeer CITATIONS READS 24 295 1 author:
Donald Sannella
The University of Edinburgh
141 PUBLICATIONS 3,201 CITATIONS SEE PROFILE All content following this page was uploaded by Donald Sannella on 26 June 2014.
The user has requested enhancement of the downloaded file. A survey of formal software development methods1
Donald Sannella
Department of Articial Intelligence and Department of Computer Science
University of Edinburgh
July 1988 1 Introduction
1.1 Scope This paper is a survey of the current state of the art of research on methods for formal
software development. The scope of this paper is necessarily restricted so as to avoid
discussion of a great many approaches at a very supercial level. First, although some of
the ideas discussed below could be (and have been) applied to hardware development as
well as to software development, this topic will not be treated here. Second, the special
problems involved in the development of concurrent systems will not be discussed here
although again many of the approaches mentioned below could be applied in this context.
Third, no attempt is made to treat programming methodologies such as Jackson's method
and program development systems such as the MIT Programmer's Apprentice which are
not formally based. Finally, this survey does not claim to be fully exhaustive although
an attempt has been made to cover most of the main approaches. Many of the technical
details of the dierent approaches discussed have been glossed over or simplied for the
purposes of this presentation; full details may be found in the cited references. 1.2 Software development: from requirements to program This section presents a general picture of the process by which a software system may be
developed by formal methods from a specication of the requirements the system must
fulll. This overall picture will be useful in discussing the wide variety of formal program
development approaches available, as the dierent approaches attack dierent aspects of
the problem.
Let SP 0 be a specication of the requirements which the software system is expected to
fulll, expressed in some formal specication language SL. (The process, sometimes known
as requirements engineering, by which such a precise formal specication is obtained starting from the informal and often vague requirements of the customer will not be discussed
here although it is acknowledged that this problem is by no means a trivial one.) This
specication constrains the input/output behaviour of the system in some fashion. It may
This paper was written under contract to GEC Research as a part of the Alvey-sponsored ISF (Integrated Systems Factory) Study.
1 1 also place constraints on the time and space resources available, although most of the
approaches to be discussed are unable to deal with constraints of this kind. The ultimate
objective is a program P written in some given programming language PL which satises
the requirements in SP 0.
The usual way to proceed is to construct P by whatever means are available, making
informal reference to SP 0 in the process, and then verify in some way that P does indeed
satisfy SP 0. The only practical verication method available at present is to test P ,
checking that in certain selected cases the input/output relation it computes satises the
constraints imposed by SP 0. This has the obvious disadvantage that (except for trivial
programs) correctness of P is never guaranteed by this process, even if the correct output
is produced in all test cases. On the other hand, methods for automatically generating
test cases which are likely to expose problems are available, see for example [BCFG 86].
An alternative to testing is a formal proof that the program P is correct with respect
to the specication SP 0. However, after two decades of work on program verication it
now seems to be more or less widely accepted that this will probably never be feasible
for programs of realistic size. At the very least, initial hopes for a system capable of
automatically generating proofs of program correctness are regarded as unrealizable.
Most recent work in this area has focused on methods for developing programs from
specications in such a way that the resulting program is guaranteed to be correct by
construction. The main idea is to develop P from SP 0 via a series of small renement
steps, inspired by the programming discipline of stepwise renement. Each renement step
captures a single design decision, for instance a choice between several algorithms which
implement the same function or between several ways of eciently representing a given
data type. This yields the following diagram:
SP 0 > SP 1 > SP 2 > > P If each individual renement step (SP 0 > SP 1, SP 1 > SP 2 and so on) can be proved
correct, then P itself is guaranteed to be correct. Each of these proofs is orders of magnitude easier than a proof that P itself is correct since each renement step is small. In
principle it would be possible to combine all the individual correctness proofs to yield a
proof of the correctness of P with respect to SP 0, but in practice this would never be
necessary.
When this approach is used to develop large and complex programs, the individual
specications SP n become large and unwieldy. This is particularly true as n increases. As
a consequence the proofs of correctness of renement steps become dicult, even though
the creative leap involved in a single step remains the same. The solution to this problem
adopted by some formal program development approaches is to allow specications to be
decomposed into smaller units during the development process. These smaller specications may then be rened independently of one another. A simple development involving 2 only two decompositions and six renement steps would then give the following diagram:
8
>> SP 1 >
SP 2 > P >>
<
8
>< SP 2 > P
SP 0 > >
>> SP 1 > > : SP > P
:
2
0 0 00 00 0 Here and are intended to denote arbitrary specication-building operations, and P ,
P and P are program modules. The program P (P P ) is guaranteed to be correct
with respect to the specication SP 0 provided each of the individual renement steps can
be proved correct. This assumes that and can be used for combining program modules
as well as specications, or at least that operations corresponding to and exist on the
level of program modules.
It is important to note that the neat and tidy diagrams above are not intended to
suggest that the formal development of realistic programs proceeds without backtracking,
mistakes and iteration. Formal program development approaches do not claim to remove
the possibility of unwise design decisions. But once a program is obtained by means of
some sequence of renement steps then a diagram like the one above which omits all the
blind alleys may be drawn. Then, provided all the required correctness proofs have been
carried out, correctness of the resulting program is guaranteed. Most approaches do not
require the correctness proofs to be carried out when the individual renement steps are rst proposed; the proof obligations can be recorded for later, to be discharged after it
becomes clear if the renement step will lead to (a module of) the nal program.
There are a number of questions which the general view of formal program development
presented above leaves completely open, and which any particular approach which ts into
this mould must answer. These include:
What is the specication language SL?
What specication-building operations such as and above are available in SL,
if any?
What is the programming language PL?
What is the relationship between PL and SL?
What does \renement" mean and under what circumstances is a renement step
correct?
How is the transition between SL and PL made?
What is the relationship between renement and decomposition?
Does renement of programs P > P make sense? What is the relation between
this and renement of specications SP > SP ?
0 00 0 0 0 3 00 Does the renement process itself have the status of a formal object subject to analysis and manipulation (see [SS 83])?
It is very important to the mathematical soundness of the particular program development approach under consideration that some of these questions (at least the rst seven)
be given very precise answers. For example, whatever specication language SL is used,
it is important that it be given a complete formal semantics. Without such a semantics,
specications have no precise meaning and so no formal proofs can be undertaken. The
same holds for the programming language PL. The notion of renement must be given a
precise mathematical denition and it must be shown that renements preserve behaviour
in an appropriate sense.
There are other questions which are important for determining the useability of an
approach:2
What methods are available for proving the correctness of renement steps?
Where do renement steps come from?
What tools are available for assisting with which aspects of the program development
process?
What level of sophistication is required by the user of such tools?
Which aspects of the program development process as sketched above can be fully
automated?
Does the approach require specications to be \complete" in any sense? Does it
provide a way of checking completeness of a specication or identifying areas of
incompleteness?
Does the approach provide ways of deriving programs which are optimal (or at least
adequate) with respect to some performance measure?
Does the approach provide a formal way of comparing pros and cons of dierent
implementations which meet a specication?
What are the complexity properties of the approach; for example, what happens to
the size of proofs of correctness as specications grow in size?
Again, it is essential for soundness that any methods which are established for proving
correctness of renement steps etc. are shown to be sound with respect to the formal
denition of renement.
Even once all these questions have been given precise answers, there are additional
conditions which the answers must satisfy. For example, whatever denition of renement
2 Thanks to Aaron Sloman for pointing out the importance of the last four of these. 4 is adopted, it is essential that renement steps be composable as discussed at the beginning of section 3. It turns out that this in turn imposes constraints on the semantics of
specication-building operations.
Although these are important questions, it is not the purpose of this paper to answer
all of them for all the approaches which will be mentioned below. Some of the answers
may be found in the cited papers. Sometimes an approach has no clear answer to one or
more of the most central questions; in this case I try to point this out. 1.3 Structure of this paper The rest of this paper surveys dierent approaches to formal program development in the
light of the discussion above. Section 2 surveys languages for writing specications and
programs. Section 3 examines the notions of renement adopted in dierent approaches.
Finally, section 4 discusses the issue of where individual renement steps come from. 2 Specication and programming languages
This section survey languages which are used in dierent formal program development approaches to write specications and programs. As suggested by the gradual evolution from
high-level specication to program described above, in many approaches the distinction
between specications and programs is blurred or even non-existent.
Some approaches described below adopt a single so-called wide spectrum language
which can be used to write high-level specications, ecient programs, and everything
which arises in the transition from the former to the latter during the program development process. In these intermediate steps it is natural for specication constructs to be
mixed freely with programming constructs because of the way that high-level specications
are gradually rened to programs. This also avoids various problems which arise when
separate specication and programming languages are used: there is no essential dierence
between renement of programs and renement of specications; the same modularization
constructs can be used to structure specications as well as programs; there is no sudden leap from one notation to another but rather a gradual transition from high-level
specication to ecient program.
For the purposes of this paper, \high-level specications" are descriptions which give
details of what is required. This is contrasted with \programs" which suggest how the
desired result is to be computed (which amounts to an algorithm of some kind). The word
\specication" shall be used to refer to any description of the input/output behaviour of a
system, whether algorithmic or not; thus a program is a specication which is executable.
This is consistent with the terminology used in wide spectrum languages where a program
is a specication which happens to use only the executable subset of the language.
In some approaches it is argued that the initial high-level specication of requirements
must be executable. This is thought to be necessary in order to ensure that this formal
specication accurately re
ects the customer's intentions; with an executable specication this can be ascertained by testing. We agree that it is necessary to start from a
5 specication of requirements which is correct with respect to our intentions, and that
bug-free formal specications are dicult to construct. But requiring the initial specication to be executable means that a major part (the most dicult part, in our view) of
the program development process is not formalized. For one thing, in constructing an
executable specication it is necessary to make many decisions which could be left open
in a non-executable specication. This means that a whole range of perfectly acceptable
alternative implementations is unnecessarily eliminated from consideration from the very
beginning. Writing specications in an executable specication language is just a form of
programming, and developing ecient programs from such specications is just program
optimization. Aiming for an initial specication which is as abstract and non-algorithmic
as possible often leads to useful simplications and generalizations which would not be
discovered otherwise. Such a specication can be \tested" and shown to accurately re
ect
our intentions using a theorem prover; instead of evaluating an expression e and checking
that the resulting value is v as expected, we try to prove that e = v is impled by our
specication.
A number of papers discussing these issues and others can be found in [GMc 86];
unfortunately most of the papers in this collection were originally published in the period
1977-1982 and so recent developments are not discussed. 2.1 VDM VDM (the Vienna Development Method) is a method for rigorous (not formal) program development. The objective is to produce programs by a process similar to the one
sketched in section 1.2 where the individual renement steps are shown to be correct using arguments which are formalizable rather than formal, thus approximating the level of
rigour used in mathematics. This is supposed to yield most of the advantages of formal
program development by ensuring that sloppiness is avoided without the foundational and
notational overhead of full formality. VDM is presented in [Jon 80] and [BJ 82]; a short
introduction to the approach is in [Jon 86]. VDM is the most widely accepted approach
to systematic program development available to date.
VDM uses a model-oriented approach to describing data types. Models are built using
functions, relations and sets. A simple example from [Jon 80] is the following specication
of dates:
Date :: YEAR : Nat MONTH : fJan,Feb,: : : ,Decg DAY : f1 : 31g
This models dates as triples, but does not require that dates be represented as triples in
the nal program. Not all of the values of type Date are valid; the legal ones are characterized by the following data type invariant : 6 inv-date(hy; m; di) =def
(m 2 fJan,Mar,May,Jul,Aug,Oct,Decg ) 1 d 31) ^
(m 2 fApr,Jun,Sep,Novg ) 1 d 30) ^
(m = Feb ^: is-leap-year(y) ) 1 d 28) ^
(m = Feb ^ is-leap-year(y) ) 1 d 29)
A problem with model-oriented specications is that it is easy to overspecify a system,
eliminating certain implementations from consideration from the beginning. In VDM a
precise notion of overspecication has been studied. A model is called biased if it is not
possible to dene an equality test on the data values in the model in terms of the operators
dened. Intuitively, a biased model contains unnecessary redundancy. An unbiased model
is viewed as suciently abstract to be the initial high-level specication of a system. More
concrete models are introduced during the process of renement (section 3.1).
Pre- and post-conditions are used to specify procedures, which may have side eects.
For example, a procedure called TIMEWARP which resets the date (part of the global
state) to a point 100 years in the past provided it is not October may be specied as
follows:
TIMEWARP
states: DATE
pre-TIMEWARP(hy; m; di) =def m 6= Oct
post-TIMEWARP(hy; m; di; r) =def r = hy ? 100; m; di
In the post-condition, r is the state which is produced by the procedure TIMEWARP. It is
necessary to show that TIMEWARP preserves the invariant associated with Date to ensure
that TIMEWARP cannot create an invalid date when given a valid date. Decomposition
during the renement process is a matter of breaking down procedures into individual
statements which can themselves be specied using pre- and post-conditions. When this
process has been carried out to completion the result is a program.
The ESPRIT project RAISE [BDMP 85] is attempting to provide VDM with a formal
foundation and support tools. A similar aim is being pursued by the MetaSoft project at
the Polish Academy of Sciences [Bli 87]. 2.2 Z Z is a specication language based on the principle that programs and data can be described using set theory just as all of mathematics can be built on a set-theoretic basis.
Thus, Z is no more than a formal notation for ordinary naive set theory. The rst version of
Z [ASM 79] used a rather clumsy and verbose notation but the current version [Spi 85,87]
adopts a more concise and elegant notation based on the idea of a scheme which Rgeneralizes the sort of thing behind mathematical notations like fx j x 7g, x:x + 1, 4x3dx,
8x:p(x), 9x:p(x), all of which involve bound variables of some kind.
Data types are modelled in Z using set-theoretic constructions, just as mathematical
\data types" like natural numbers, real numbers, ordered pairs and sequences are dened
7 in set-theoretic terms in mathematics. For example, in specifying a display-oriented text
editor [Suf 82] a document is described as an ordered pair consisting of the text before the
cursor and the text after the cursor:
DOC seq [CH ] seq [CH ] (CH is the set of characters which may appear in documents.) Two DOC -transforming
functions may then be specied as follows:
back : DOC ! DOC
ins : CH ! DOC ! DOC
dom back = fl; r j l 6= hig
(8(l; r) : DOC ; ch : CH )
back (l hch i; r) = (l; hch i r);
ins ch(l; r) = (l hch i; r); (In this specication, hi is the empty sequence; hch i is the sequence containing the single
character ch; and is the append function on sequences.) The function back (move
one character backwards) is partial; it is applicable only to documents having some text
before the cursor. This restriction on its domain is given by the rst axiom in the above
specication. The set of documents whose cursor is positioned at the beginning of a word
can be des...

View
Full Document