©2001 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE. 

Emphasizing
Formal Analysis In a Software Engineering Curriculum
Ann E. Kelley Sobel
Abstract  The integration of formal method application throughout a six course software engineering curriculum is outlined. Formal analysis skills were included in order to increase the complex program solving skills of the student. The five instructionoriented courses presented highlight how formal analysis was introduced in and applied to the corresponding subject material. The materials presented, along with an accounting of the experiment, provide a basis for other academicians to teach formal analysis at their own institutions.
1. Introduction
A project that integrated formal methods into a software engineering curriculum[1] at Miami University’s Computer Science and Systems Analysis Department is outlined. This project endeavored to address the numerous articles in the current literature expressing the belief that the current educational program offered for a computer science degree does not stress the fundamental mathematical and engineering principles that should form the basis of software engineering [5,6,8,10,11]. A survey of computer science faculty attitudes toward formal methods found that faculty believe that formal methods are not only important, due to their disciplined approach to computer science, but are feasible to teach [9]. Among formal methods advocates, a common belief that the use of formal analysis during software development produces programs that have fewer errors is held. Despite this evidence, few undergraduate courses exist that teach the formal analysis of software and virtually no programs exist that attempt to base several standard core curriculum courses on the understanding and application of formal analysis during the construction of software.
It is for these reasons that an undergraduate software engineering curriculum would benefit from the integration of formal analysis. The experimental curriculum included the previously existing core curriculum courses of data structures, objectoriented software design, software engineering, and a senior capstone course that were modified to include formal analysis application. This curriculum also included two new courses designed to teach the skills necessary to apply formal methods; one emphasizing program derivation and the other emphasizing the analysis of concurrent programs. Students were required to have taken a discrete mathematics course at the end of their freshman year before they entered the experimental curriculum.
Two different formal methods were used throughout the experimental curriculum. The formal analysis of strictly sequential code followed Dijkstra’s weakest precondition semantics [3], while the analysis of concurrent programs used a tracebased operational formal method [12]. The latter formal method was chosen since it is relatively easy to learn, due to its operational nature and reliance on only prior knowledge of firstorder logic. These characteristics were particularly important given the student’s lack of software development experience and their limited exposure to only firstorder logic.
This paper provides only a sampling of examples, examination questions, and programming assignments. The full accounting of all materials can be found at http://www.eas.muohio.edu/san/formal. The evaluation of the experimental curriculum provides evidence of the benefits of undergraduate students learning to use formal analysis during software construction. The materials presented along with an accounting of the experiment provide a basis for other academicians to teach formal analysis at their own institutions.
Evidence was gathered throughout the threeyear period in order to establish the goals of the experiment; namely, undergraduate students are able to learn formal analysis techniques, the feasibility of teaching formal analysis, and the use of formal analysis increases the student’s complex problem solving skills. In order to control as many variables as possible, the experimental and standard core courses were taught by the same instructor using a common text. Mentors were given strict instruction to aid students only in the understanding of formal methods and its application. Students in the standard and experimental versions of a particular course experienced the same examinations and programming assignments. The analysis of the evidence establishes that all three goals of the experiment were met.
An overview of the experimental curriculum is presented in section 2 and expanded in sections 3,4,5, and 7. The basics of formal specification and program derivation are outlined in section 3. The use of coupling invariants to establish a valid implementation of an ADT is presented in section 4. Section 5 highlights the application of formal analysis during object oriented development. The introduction of the tracebased specification model used in the subsequent courses appears in section 6. The application of this specification model in the context of concurrent software systems is presented in section 7. Section 8 discusses the senior year software development project. The qualitative and quantitative results that assess the success of meeting the goals of the experimental curriculum are listed in section 9. Lastly, a summary is presented in section 10.
2. Overview of the
Curriculum
The established curriculum for the department that formed the basis from which the experimental curriculum was taught is a blend of computer science, software engineering, and operations research courses. The experimental course track included inserting formal analysis into four of the existing software engineering courses and two new courses that dealt solely with the teaching of formal methods. The sequence of the courses and their status was as follows.
Course Sequence and Status
Course
Title New Existing
Logical Foundations of Programs *
Data Structures *
ObjectOriented Software Design (OOD) *
Formal Analysis of Concurrent Programs *
Software Engineering *
Software Engineering
Senior Project *
The first course offered in the experimental curriculum emphasized program derivation. Students were taught to specify programs using firstorder logic and Cohen’s specification notation [2], and to represent programs using Dijkstra’s guarded command code [3]. Proofs were created manually and used both Hoare triples and the wp predicate transformer [4]. The programs examined in this class were typically quite small, usually fewer than fifteen lines of code.
The Data Structures course taught typical abstract data types (ADTs) which included lists, stack, queues, and trees. Programming assignments were of relatively short length and emphasized the use of a particular ADT as part of the application.
The OOD
course taught a variety of design methodologies (e.g., UML) and gave group
projects that focused on their use. The projects are traditionally
Windowsbased and the size of the graphical interface code (written using
Microsoft Foundation Classes) clearly dominates the total size of the
application code.
The Formal Analysis of Concurrent Programs course centered on the different modes of interaction between concurrent tasks/threads/processes. For the first time, students were faced with implementation code that produced multiple execution sequences coupled with the enforcement of synchronization. Students applied the tracebased operational formal method to specify and verify general problems that relied on multiple processes communicating by means of remote procedure calls, shared variables, and message passing.
As part of the department’s core requirements, a student experiences two courses that cover the entire software engineering life cycle during their senior year. The first course, Software Engineering, addresses feasibility study, prototyping, and design of a software system. The second senior course, Software Engineering Senior Project, assigns group projects with the expectation of a demonstration of a completed prototype of a software application at the end of the course.
Any student who participated in one of the existing experimental curriculum courses had to learn both the material that was commonly taught as well as how to apply formal analysis to any software application in that course.
3. Logical Foundations of
Programs
The basic skills
needed to apply formal analysis were taught in a new course, Logical
Foundations of Programs. This course covers mathematical induction,
wellfoundedness, specification, derivation, and the theory of the correctness
of programs. The text of Cohen, Programming in the 1990s, was chosen
because the presentation of this material was not coupled with an implemented
programming language; the use of the guarded command language [3] gives the
student the freedom to translate this language into a variety of contemporary
programming language constructs. The students were motivated to gain experience
in performing logical deduction by the assignment of Smullyan logic puzzles.
A simple example highlighting the derivation process follows. It originally appeared in an out of class quiz. Derive the guarded command code from the following specification without using the absolute value function.
var
x,z: int
; z : z = x
We must first note that the precondition for this specification is true and the postcondition is stated in the last line. This postcondition notes that only the variable z can be modified and the program terminates with z = x. Using the weakest precondition semantics (wp) [4], a statement, S, must be found that satisfies the following assertion.
true _{}wp.S.z = x
When deriving the statement S, comments that state the justification for each line are connoted using the notation, _{} < comment >. Continuing with the derivation of S, if we choose S to be z := x, we get
wp.z := x.z = x
_{} < definition of := >
x = x
_{} < definition of absolute value >
x _{}0
as the precondition which is not equivalent to true. If we choose S to be z := x, we get
wp. z := x. z = x
_{} < definition of := >
x = x
_{} < definition of absolute value >
x < 0
as the precondition which is not equivalent to true. However, we note that by combining these two assertions, x _{}0 _{}x < 0, we get the equivalent of true. This observation, and the fact that we started with identical postconditions in the two statement analysis above, permits the combination of the two statements into a single if statement:
if x _{}0 _{} z := x [] x < 0 _{} z := x fi
For a more complex example of program derivation, the following specification requires the derivation of a program that contains a loop.
var f:
array of int {F = #f}
; var s:
int
; s: s = (_{}i. 0 _{} i < F. f[i] )
Recognizing that this summation will probably require a loop ({P} do B_{}S od {R}), we must determine a loop invariant (P) and a guard (B). Using the technique of replacing a constant by a fresh variable, the postcondition can be rewritten as follows.
s
= (_{}i. 0 _{} i < n. f[i] ) _{} 0 _{} n _{}F
_{} n = F
The first two clauses will be used for the loop invariant and the modified last clause, n _{} F, will represent the guard. The postcondition is partitioned in this manner so that P _{} _{}B _{} R, which guarantees that the postcondition (R) holds upon termination. Next, the loop invariant must be shown to hold before execution of the loop begins. This requires finding a Q so that
Q
_{} wp. S_{o} . P
Clearly, S_{o }must contain_{ }some assignment of s and n. The weakest precondition semantics can be used to calculate the necessary values.
wp. s,n := E,G. s = (_{}i. 0 _{} i < n. f[i] ) _{} 0 _{} n _{}F
_{} < definition of := >
E = (_{}i. 0 _{} i < G. f[i] ) _{} 0 _{} G _{}F
_{} < choose G=0 to establish an empty range >
E = (_{}i. 0 _{} i < 0. f[i] ) _{} 0 _{} 0 _{}F
_{} < empty range, F = #f from Q, therefore F _{} 0 >
E = 0
S_{o} : s,n := 0,0
The final step
requires the calculation of S. Again, S
must contain some assignment of s and n.
wp. s,n := H,I . s =
(_{}i. 0 _{} i < n. f[i] ) _{} 0 _{} n _{}F
_{} < definition of := >
H = (_{}i. 0 _{}i < I. f[i] ) _{} 0 _{} I _{}F
_{} < choose I = n+1 >
H = (_{}i. 0 _{}i < n+1. f[i] ) _{} 0 _{} n+1 _{}F
_{} < algebra, range split >
H = (_{}i. 0 _{}i < n. f[i] ) + f[n]
_{} 0 _{} n+1 _{}F
_{} < s = (_{}i. 0 _{} i < n. f[i] ) >
H = s + f[n] _{} 0 _{} n+1 _{}F
_{} < n _{} F _{} 0 _{} n _{}F (given in precondition), algebra >
H = s + f[n]
Therefore,
S : s,n := s + f[n], n+1
Putting all the pieces together, the derived program follows.
{ F = #f }
s,n := 0,0;
do n _{} F _{} s,n := s + f[n], n+1 od
{ s = (_{}i. 0 _{} i < F. f[i] ) }
Classroom instruction was supplemented with several in class problem solving sessions that were supported by three undergraduate mentors. Students were evaluated by their performance on both in and out of class quizzes. In class quizzes predominantly tested their ability to perform textual substitutionbased proofs while out of class quizzes extended their experience performing program derivation. A major project in the course centered on the derivation of a guarded command program from a firstorder logic specification. Students had to verify that their program did indeed satisfy the given specification. Their programs were then translated into the C++ language. One student believed he was such a strong C++ programmer that he could generate a “better”' C++ program without using the guarded command program. Ironically, his C++ solution had several errors whereas his guarded command program had none.
4. Data Structures
The Data Structures course applied the formal analysis skills gained in the prior program derivation course to the development of abstract data types (ADT). ADT specifications were written using sequences of operations performed and the corresponding effect of performing those operations on an instance of the ADT. Correspondence between ADT operations and a sequence of implementation type operations were maintained using coupling invariants. Many implementations commonly used by software engineers, e.g., a list implemented using an array, were proven to correspond to the ADT. Both the instructor and the students were surprised to learn that the average length of proof for the correspondence of just one ADT operation was one and one half pages of hand written text. However, after completing such an exercise, there was no question as to the correctness of the implementation of a particular ADT.
Before introducing an example of establishing that a particular implementation satisfies the semantics of an ADT, the necessary sequence operations are given. For these definitions, s is a given sequence of type T (s _{} T*).
Operation 
Definition 
#s 
Length of s 
s ^ x 
x is concatenated to s where x_{}T 
s[k] 
k^{th} element of s (zerobased) 
s/<i,j>

restriction of the elements of s to the i^{th} through j^{th}^{ }element 
s^{} 
#s = 1 _{} #s > 1 _{} s/<0,...,s[#s2]> 
The following example illustrates the steps of establishing the correspondence between the operations for a stack and a list (sequence) based implementation of a stack. The axioms defining the semantics of a stack follow where T is any type.
initialize _{} empty (A1)
x _{} T. push[x] _{} _{} empty _{} top = x (A2)
_{}empty _{} pop (A3)
x _{} T. push[x]; pop _{} skip // no op (A4)
The sequence based definitions for the referenced stack operations in the group of axioms are given as follows.
st _{} T* := < > // st is a sequence of T values that is initialized to < >
push[x] _{} x _{} T _{} st’ = st ^ x
empty _{} # st = 0
top _{} _{} empty _{} st[#st1]
pop _{}
_{} empty _{} st’ = st^{}
Some of these definitions use of the “tick” notation, e.g., st’, to refer to the value of the sequence st after a particular sequence operation has been applied.
To demonstrate that the listbased implementation maintains the semantics of the push axiom given above (A2), one must prove:
st’ = st ^
x _{} #st’ _{} 0 _{} st’[#st’–1] = x
where x _{} T
which is the original push axiom with each stack operation replaced by its list implementation. Suppressing the details of the proof, one must establish the following first and second facts, (1) and (2) respectively, in order to verify the first and second clauses found on the right hand side of the implication.
(1) #st’ = #st + 1 _{} #st _{} N _{} #st’ _{} 0
(2) st’ =
st ^ x _{} st’/<0,…,#st’2> = st _{} st’[#st’1] = x
where x _{} T
Continuing with the pop axiom (A3) and
substituting each stack operation with its corresponding list implementation,
we can demonstrate that the list based implementation maintains the semantics
of the pop axiom by proving:
#st _{} 0 _{} st^{}^{}
which is true since st^{} is defined for #st_{}0.
Lastly, to demonstrate that the list
based implementation maintains the semantics of A4, one must prove:
( st’ = st ^ x; st = st’ ^{}^{ }) _{} st
Establishing the following two facts allow
the conclusion that the left hand side of the equivalence is indeed st.
(1) #st’ _{} 0 _{} st’/<0,…,#st’2> = st _{} st’[#st’1] = x
(2) #st’ _{} 0 _{} st’ ^{ } = st’/<0,…,#st’2>
All material covering the axiomatic
semantics of ADTs and the use of coupling invariants had to be generated by the
lecturer. Unfortunately, the formal methods students experienced less lecture
time on the standard data abstraction material so that time could be spent on
writing specifications. At times, we were forced to meet outside of regularly
scheduled class time. Some students
found this pace grueling and made their frustrations known. Evaluations
consisted of take home quizzes that centered on coupling invariants, in class
programming problems, and several out of class programming assignments that
were based on standard applications of ADTs.
5.
ObjectOriented Software Design
For the first time, students were challenged to
write the formal specification of a mediumsized programming problem before
writing any line of implementation code. Certainly, they already had a full
year of experience doing such a task but never on a sizable problem. Students
found it difficult to logically analyze nontrivial problems before writing a
single line of code. After gaining
more experience, they discovered that if one thoroughly analyzed the problem to
a point that they could write the specification of it, the problem was indeed
well understood by the developer.
A major project that was assigned during the
course was the development of an elevator controller. This assignment was chosen mainly for its familiarity to most
people, but it was also selected for comparison to several existing
specifications of the same problem. It is included here to demonstrate the size
and complexity of the programming problems that the students could formally
analyze. The informal requirements
consisted of a list of expected behaviors of the elevator.
1. Maintain a set of buttons that have been pushed inside the elevator (requested floor number)
2. Maintain a set of buttons that have been pushed outside the elevator (floor number at which passenger is waiting and requested direction)
3. Halt when elevator has no more requests to service
4. Service all outside requests eventually with all floors given equal priority
5. Service all inside requests eventually with all floors being serviced sequentially in the direction of travel.
The specification of the elevator scheduler follows. In words, the elevator’s new direction of movement is based on whether there exists a reason to continue in its current direction. The data structures used to maintain the current state of the elevator are the current floor (CurFloor), current direction of travel (CurDir), the set of requests internal to the elevator (InsideReqSet), and the set of requests external to the elevator (OutsideReqSet). The two predicates, ReasonToGo and PickupDropoff, evaluate whether such a reason exists. ReasonToGo determines whether it is necessary for the elevator to continue its movement in the current direction and PickupDropoff determines the necessary changes to the inside and outside request sets as passengers are picked up and dropped off at different floors throughout the building.
var CurFloor
: int
;var CurDir
: direction {CD = CurDir}
;var InsideReqSet
: set of int {I = InsideReqSet}
;var OutsideReqSet
: set of OutsideReq{O = OutsideReqSet}
;CurDir, InsideReqSet, OutsideReqSet: ( CD = HALT
_{} (ReasonToGo(UP) _{} CurDir=UP _{} PickupDropoff(UP))
_{} (ReasonToGo(DOWN) _{} CurDir=DOWN _{} PickupDropoff(DOWN))
_{} (_{}(ReasonToGo(UP) _{}ReasonToGo(DOWN)) _{}CurDir=HALT) )
_{}( CD_{}HALT
_{} (ReasonToGo(CD)_{}CurDir=CD_{}PickupDropoff(CD))
_{} ( _{}ReasonToGo(CD) _{}ReasonToGo(Rev(CD)) _{}CurDir=Rev(CD)
_{} PickupDropoff(Rev(CD)))
_{}(_{}(ReasonToGo(CD) _{}ReasonToGo(Rev(CD))) _{}CurDir=HALT
_{} PickupDropoff(HALT)) )
where
Direction {UP,DOWN,HALT}
Rev(Dir) UP if
Dir=DOWN
DOWN if Dir=UP
HALT if Dir=HALT
OutsideReq (floor:int, dir:Direction)
ReasonToGo(Dir) (_{}i.0_{}I<#I. (I.i > CurFloor _{} Dir=UP)
_{} (I.i < CurFloor _{} Dir=DOWN))
_{}(_{}o.0_{}o<#O. (O.o.floor > CurFloor_{} Dir=UP)
_{}(O.o.floor < CurFloor_{} Dir=DOWN)
_{}(O.o.floor = CurFloor _{} Dir=O.o.Dir))
PickupDropoff(Dir) ((CurFloor,Dir)_{}O _{} CurFloor_{}I)
_{}(CurFloor_{} I
_{} InsideReqSet=ICurFloor)
_{}((CurFloor,Dir)_{}O
_{}(CurFloor_{}I _{} InsideReqSet=I+GetDests()
_{}OutsideReqSet=O(CurFloor,Dir))
_{}(CurFloor_{} I
_{}InsideReqSet=I+GetDests()CurFloor
_{}OutsideReqSet=O(CurFloor,Dir)))
GetDests() A user interface function. This function
returns the set of buttons just pushed inside the elevator before the elevator
door closes.
The variables used in this specification are noted using the keyword, var. The braced predicates following their declarations represent the precondition that the variables’ initial values must satisfy. In the line after the variable declarations, those variables that are permitted to change are listed before the colon which is the symbol denoting the start of the specification. The details of this specification, as well as the full verification of the guarded command code, can be found in [1].
The original draft of the specification required approximately five hours to write. Revisions of the specification were required as errors were discovered through a process of test case analysis, as well as during the proof of the guarded command code. The errors were generally due to the specification being incomplete. For example, one error resulted from not specifying how the elevator should resume motion after halting. This error was discovered when considering a test case. Another error was discovered when the proof led to the generation of an abort statement. Revising the specification to account for errors required approximately ten additional hours, some of which was used for generating key portions of the proof. All errors in the specification were discovered before implementing code  that is, the specification never required revision due to errors discovered during actual code testing. The guarded command code was translated into C++ and no errors were found in the implementation to date.
6. TraceBased Operational Formal Method
The tracebased operational formal method was first introduced in the OOD course and extensively used in the concurrent programs course so it is introduced here. Only those components of the specification model that are necessary to understand the specification tables that are presented through the remainder of this work are provided in this section. The detailed components of both the specification and verification model can be found in [12].
The specification of a process/thread/task consists of a collection of the externally visible “action” specifications of the process. These externally visible actions constitute the process’ externally visible behavior. An action represents an interaction between this process and the other processes of the concurrent program. The occurrence of each action is recorded on a trace that is associated with the initiating process. The specification of an action describes when that action may occur in terms of the current value of the process trace and the effect of the occurrence of this action on the process trace by the addition of an element(s) to the trace sequence
6.1 Traces Operations and Functions
The variable h is used to represent the process trace which is a zerobased sequence. This sequence is represented as a list of elements delineated by angle brackets. A subscript i, e.g., h_{i}, is used to identify the process with which this trace sequence is associated. Generally, process trace sequences are initialized to the empty sequence _{}.
There are a number of trace functions that simplify the writing of predicates and use descriptive names to identify particular sequence element components instead of the unintuitive sequence component selection syntax, h[i][1]. The trace operations and convenient trace functions are defined below.
Notation 
Definition 
h[k] 
k^{th} element of h (zerobased) 
h’ ^= h 
h’ = h’ ^
h 
h’ _{} h 
h’ is an initial subsequence of h 
h^{r} 
reverse of h 
H/i 
restriction of the elements of h to those elements whose process id component (defined below) is I 
Type(h[i]) 
h[i][0]_{}{ !, ? } // send or receive 
Action(h[i]) 
h[i][1] // action component of the i^{th} element 
TAction(h[i]) 
h[i][0] ^ h[i][1] // type and action component 
PID(h[i]) 
h[i][2]
// process id component of the i^{th} element 
CallR(h[i]) 
h[i][3] //
process id of the receiver of the i^{th} element 
Val(h[i]) 
h[i][4]
// data component 
VSeq(h) 
_{} if h=< > VSeq(h^{})^Val(h^{r}[0]) otherwise 
SVSeq(h,a,b,c) 
_{} if h=< > SVSeq(h^{},a,b,c) ^ h^{r}[0] if Type(h^{r}[i])=a _{} Action(h^{r}[i])=b _{} Val(h^{r}[i])=c SVSeq(h^{},a,b,c) otherwise 
6.2 Process Specifications
Informally, an individual process execution is represented by a sequence of actions that are recorded on that process’ trace sequence. Typically, actions are some form of communication: synchronous message passing or shared variable communication. An action is defined in terms of a change of value of the corresponding process trace. A specification of an action, which includes the current and extended process trace, will be written in two parts: an enabling and an effect. The effect specifies the change in value of the process trace by concatenating element(s) to the trace. The enabling part specifies when this action may occur as a guard to the trace update.
To write the specification of a process, the actions that comprise the behavior of the process are determined. For each action, all possible enabling conditions and their effect on the process trace are listed. Tabular notation is used to represent the process specification. Each table is a visual representation of the potential changes of a process trace. Specification tables have the following form.
P_{i} 
Action 
enable 
Effect 
The enable condition is typically written in terms of the last element of the process trace sequence at any point in time, namely h^{r}_{i}[0]. Therefore, an individual element listed as the enable condition means that the last element of the current sequence is asserted to be of this form h^{r}_{i}[0]=enable_{j}. Similarly, the effect lists the element(s) added to the process trace sequence for a particular action, namely, h_{i }^= effect_{j}. Each process participates in an initial action that initializes a process’s trace sequence. This action will be omitted from the table when the process trace sequence is initialized to _{}.
A sequence element either has the form, <!, A, j, k, X> or <?, A, j, X>, where the first component represents the type of communication (either receive (?) or send (!)), A represents the externally visible action, and j names the process with which this element is associated. The k component names the process that either receives (?) or sends (!) this message, if known. If the receiver/sender is unknown or at a single, universally known location (e.g. a shared variable), then this field is listed as _{}. Lastly, X represents the sequence of data sent or received. If X is a data sequence of length one, then the enclosing angle brackets are omitted. If X is listed as _{}, then it is the empty data sequence. To conserve table space, the j component is omitted from the elements when its value is readily apparent. For example, using the previous specification table, the j component is omitted because it is known to have the value i which is given in the name of the table.
The notation, P_{i} sat T_{i}, indicates that P_{i}’s process trace sequence values satisfy the predicate t_{i }composed of P_{i}’s specification table entries (T_{i}) at any point during the execution of P_{i}. The predicate t_{i} is generally constructed as follows: _{j} h^{r}_{i}[0]=enable_{j} _{} h_{i} ^= effect_{j } where j ranges over the rows of process P_{i}’s specification table, T_{i}.
Given the following specification table for a process P_{i} which receives a value from another process in the program and forwards that value to the process P_{k} :
P_{i} 
Receive 
Send 
h_{i }= _{} _{} h^{r}_{i}[0]=<!,Send, k, X> 
<?, Receive,_{}, X> 

<?, Receive, _{}, X> 

<!, Send, k, _{}, X> 
The corresponding predicate of the specification table is as follows.
(h_{i }=_{} h^{r}_{i}[0]=<!, Send, k, X>) _{} h_{i} _{ }^= <?, Receive, _{},X> _{}
h^{r}_{i}[0] = <?, Receive, _{},X> _{} h_{i} ^= <!, Send, k, X>
7. Formal
Analysis of Concurrent Programs
Throughout the course, students used the
specification model defined in the tracebased operational formal method to
capture the formal specification of standard programming problems. Students wrestled with the complexities of
the order of execution of parallel programs and determining the necessary
conditions to enforce synchronization without causing deadlock. Students already had experience with the
concepts of nondeterminancy and fairness when studying sequential code, but
these concepts were considerably more complex for concurrent programs. It
became blatantly clear that the complexity of specification and especially of
verification had substantially increased.
An entire class period could be spent verifying a small problem,
however, these sessions were still extremely beneficial in that students could
actively participate in the verification process.
Students were evaluated using multiple out of class quizzes; several of which were used to build towards a full problem solution. Due to the complexity of the material, the final examination was also given as a take home assignment. Unfortunately, students were unable to execute many of the problems. However, we were able to execute any remote procedure call assignment using a recently obtained Ada compiler.
The following three subsections outline the application of the specification model to the various forms of synchronization found in concurrent systems: remote procedure call, shared variables, and synchronous message passing.
7.1 Remote Procedure Call
The semantics of a “rendezvous” center on the suspension of the calling process until the call is both accepted and completed. Afterwards, both processes resume their executions. In this situation, each action causes the addition of two elements to the process sequence, representing both the acceptance and the completion of the call.
The example chosen for the introduction of this type of synchronization between processes was a bounded queue. The concurrent program consists of three processes, one that sends data values to the queue (P_{s}), one that represents the queue data structure (P_{q}), and one that receives the data values from the queue (P_{r}). The maximum size of the queue is max. The specification of the program must ensure that VSeq(SSeq(h_{r},!,Get)) _{} VSeq(SSeq(h_{q},?,Put)) _{} VSeq(SSeq(h_{s},?,Put)); the items placed in the queue are removed in the same order from the queue. The specification of the queue process follows.
P_{q} 
Put 
Get 
R 
<?,Put,_{},x> ^ <!,Put,_{},_{}> 

T 

<?,Get,_{},_{}> ^ <!,Get,_{}, z > 
where
x is any datum
z = Val( SSeq(h_{q},?,Put)[#SSeq(h_{q},!,Get)1] )
R _{} #SSeq(h_{q},!,Put)  #SSeq(h_{q},!,Get) <
max
T _{} #SSeq(h_{q},!,Put)  #SSeq(h_{q},!,Get) > 0
The functions defining the value for z determine that the appropriate data item inserted (as defined in the program specification) is returned. In contrast to the classroom data base example, R ensures that room exists before the action Put is accepted and T ensures that there exists at least one element in the queue before the action Get is accepted.
A possible Ada implementation for the queue follows.
task body queue is
q: array(0..max1) of ELEM;
cnt,inq,outq:INTEGER := 0;
begin
loop
select
when cnt < max =>
accept
Put(c:in ELEM) do
q(inq mod max) := c;
end Put;
inq := inq+1;
cnt := cnt+1;
or when cnt > 0 =>
accept
Get(c:out ELEM) do
c := q(outq mod max);
end Get;
outq := outq+1;
cnt := cnt+1;
or
terminate;
end select;
end loop;
end queue;
7.2 Shared Variable Communication
The semantics of shared variable communication relies on process synchronization by means of the values of shared variables. The shared variables are used to record information that all processes need to assess the progress made towards solving a particular problem. Any expression can contain a reference to a shared variable.
This first example outlines the standard readers/writers problem. The informal specification of the problem states that any number of readers may read at one time when there is no writer but only one writer may write when there are no readers. The specification includes the use of two shared variables, <r,w>, which represent the number of readers and writers, respectively, and their access is maintained as critical regions by constructs in the implementation language. In this solution, each process wishing to access the shared resource must first determine how the resource is currently accessed by examining the current values of <r,w> and adjusting those values accordingly if access is allowed. In these trace elements, the symbol _{} is used to represent that no process is associated with the shared variables <r,w>.
P_{i} 
RWR_{i} 
Read_{I} 
Write_{i} 
TAction(h^{r}_{i}[0])={<?,RWR_{i}>, <!,Read_{i}>,
<!,Write_{i}>} _{} h^{r}_{i}[0=0 
<?,RWR_{i},_{},<r,w>> 


h^{r}_{i}[0]=<?, RWR_{i},_{},<r,w>> _{} Val(h^{r}_{i}[0])[1]
= 0 

<!, Read_{i},_{},<r+1,_{}>> ^
<?,Read_{i},_{},<r,w>> ^ <!,
Read_{i},_{},<r1,_{}>> 

h^{r}_{i}[0]=<?, RWR_{i},_{},<r,w>> _{} Val(h^{r}_{i}[0])[1]_{} 0 

_{} 

h^{r}_{i}[0]=<?,RWR_{i},_{},<r,w>> _{} Val(h^{r}_{i}[0])[0]
= 0 _{} Val(h^{r}_{i}[0])[1]
= 0 


<!, Write_{i},_{},<_{},1>> ^ <?,
Write_{i},_{},<r,w>> ^
<!,Write_{i},_{},<_{},0>> 
h^{r}_{i}[0]=<?,RWR_{i},_{},<r,w>> _{} Val(h^{r}_{i}[0])[0]
_{} 0 _{} Val(h^{r}_{i}[0])[1]
_{} 0 


_{} 
As long as no writers exist, this process may read the shared resource (first row). When it is finished reading, then the corresponding update is made to the value of r. If a writer exists, then the process is unable to enter its critical section. The third row states that when no readers or writers currently exist, this process may write to the shared resource. Once completed, the value of w is updated. Lastly, if there exists other readers or writers, then the process must wait.
Another interesting shared variable problem is determining the smallest index of a positive value in an array, c[ ]. In this program, one process will check the values of the even indices and the other checks the odd indices; both terminating once the location of the first positive value is found. Let n = size(c[ ]). The two shared variables will be paired as <et,ot> in the data component of the element; representing the even and odd indices, respectively. The specification for the program is as follows.
_{} et, ot. 0 _{} min(et,ot) < n.
(_{}j. 0 _{} j < min(et,ot). ( c[j] < 0 )_{} min(et,ot) < n ==> c[min(et,ot)] > 0)
P_{et} 
Init_{et} 
Cont_{et} 
Pos_{et} 
_{} 
<!,Init_{et},_{},<n,_{}>> 


R 

<?,Cont_{et},_{},<et,ot>> 

T 


<!,Pos_{et},_{},<i,_{}>> 
where
i = 2 * ( #SSeq(h_{et}, ?, Cont_{et})  1 )
R _{} #h_{et} > 1 _{} Type( h^{r}_{et}[0] ) = ? _{} i < min( Val( h^{r}_{et}[0] ) )
T _{} Type( h^{r}_{et}[0] ) = ? _{} c[i] > 0
P_{ot} 
Init_{ot} 
Cont_{ot} 
Pos_{ot} 
_{} 
<!,Init_{ot},
_{},<_{}, n>> 


R 

<?,Cont_{ot},_{},<et,ot>> 

T 


<!,Pos_{ot},_{},<_{}, j>> 
where
j = 2 * #SSeq(h_{ot}, ?, Cont_{ot})  1
R _{} #h_{ot} > 1 _{} Type( h^{r}_{ot}[0] ) = ? _{} j < min( Val( h^{r}_{ot}[0] ) )
T _{} Type( h^{r}_{ot}[0] ) = ? _{} c[j] > 0
In these specifications, i and j represent the first occurrence of a positive array value from examination of the even and odd indices, respectively. Each process continues searching until either a positive value is found or the end of the array is reached (R). If a positive value is found, the index of the positive value is written and the search is terminated (T). The process specifications can be refined into the following guarded command code.
P_{et} :: et := n; P_{ot} :: ot := n;
i := 0; j := 1;
do i < min(et,ot) _{} do j < min(et,ot) _{}
if c[i] > 0 _{} et := i; if c[j] > 0 _{} ot := j;
[] c[i] <= 0 _{} i := i + 2;  c[j] <= 0 _{} j := j + 2;
fi fi
od od
7.3 Synchronous Message Passing
The presentation of the semantics of synchronous message passing centers on the language communicating sequential processes (CSP) [7]. In CSP, processes that exchange messages name either the sending or receiving processes in their input/output statements. Processes can suspend execution while waiting for their communication partner to participate in the exchange. The semantics of this language is further complicated by the notion of distributed termination; processes can terminate if they are suspended at an I/O guard waiting for a message exchange from a process or processes that has/have already terminated. Therefore, all CSP process specifications will include the action of termination, Term, which is represented in the data component of an element by _{}. Since the action of termination is communicated to all processes within the program, the component representing the identity of the receiver is _{}.
The first example was chosen to carefully avoid the inclusion of distributed termination. Students were asked to write the specification of a process that performs the summation of natural numbers. The sum is reported back to the requesting process upon receipt of a negative number.
P_{sum} 
Receive 
Report 
Term 
h^{r}_{sum}[0]=<?,Receive,P_{j},y>_{}y_{}0 _{} h^{r}_{sum}=0 
<?,Receive,P_{j},x> 


h^{r}_{sum}[0]=<?,Receive,P_{j},y>_{} y<0 _{} h^{r}_{sum}=0 

<!,Report, P_{j},z> 

<!, Report, P_{j}, z> 


<!,Term,_{},_{}> 
where
x and y any data values
z _{}i . 0 _{} i < #h_{sum} . Type( h_{sum}[i] )=?. Val( h_{sum}[i] )
The specification was then used to derive the following CSP code.
P_{sum} :: sum := 0;
P ? x;
do x > 0 _{} sum := sum + x;
P ? x;
od
P ! sum
Another interesting problem centers on the rearrangement of the contents of two sets so that one of the resulting two sets contains the smaller elements and the other contains the larger elements. More formally, the specification of the set partitioning problem is as follows: Given two disjoint sets, S and T,
S_{}T = _{} _{} S’_{}T’ = _{} _{} ( S_{}T = S’_{} T’) _{} max(S’) < min(T’)
Where S’ and T’ represent the two sets after the execution of the set partitioning program terminates. The algorithm consists of two processes exchanging the largest value from S with the smallest value from T until the same value is exchanged in both directions. The specifications of the two processes, P_{s} and P_{t}, follow.
P_{s} 
Send_{s} 
Receive_{s} 
Term_{s} 
h ^{r}_{s}[0] _{}{_{}<?,Receive_{s},P_{t},x>} 
<!,Send_{s},P_{t},mx> 


<!,Send_{s},P_{t},mx> 

<?,Receive_{s},P_{t},x> 

R 


<!,Term_{s},_{},_{}> 
where
x any datum
mx max(g(S,h_{s}))
R #h_{s} > 1 _{} Val( h^{r}_{s}[0] ) _{} mx
g(Y,h) g(Y,h^{}) if #h>0 _{} Val( h^{r}_{s}[0] )= _{}
g(Y
 {Val(h^{r}[1])} + {Val(h^{r}[0])},
h/<0,…,#h3>)
if #h_{}2 _{} Val(h^{r}[0])_{} _{}
Y otherwise
P_{t} 
Receive_{t} 
Send_{t} 
Term_{t} 
h ^{r}_{t}[0] _{}{_{}<!,Send_{t},P_{s},mn>} 
<?,Receive_{t},P_{s},y> 


<?,Receive_{t},P_{s},y> 

<!,Send_{t},P_{s},mn> 

<?,Term_{t},P_{s},_{}> 


<!,Term_{t},_{},_{}> 
where
y any datum
mn min(w(T,h_{t}))
w(X,h) w(X,h^{}) if #h>0 _{} Val( h^{r}_{t}[0] )= _{}
w(X + {Val(h^{r}[1])} – {Val(h^{r}[0])}, h/<0,…,#h3>)
X otherwise
The two functions, g(S,h_{S}) and w(T,h_{t}), are used to determine the current values of the two sets S and T. The process P_{s} continues to send the maximum set value as long as the last value received is smaller than the current maximum set value (R). One should note that the process P_{t} only terminates when it receives the notification that P_{s }has terminated. The refinement of the two specifications results in the following CSP code.
P_{s } :: mx := max(S); P_{t} ! mx; P_{t} :: do P_{s }? y _{} T := T _{} {y};
S := S  {mx}; P_{t} ? x; mn := min(T);
S := S _{} {x}; mx := max(S); P_{s }! mn;
do mx > x _{} P_{t }! mx; T := T  {mn};
S := S 
{mx}; P_{t} ? x; od
S := S _{} {x};
mx := max(S)
od
8. Software Engineering Principles and Their
Application
The formal methods students were
challenged to apply their formal analysis skills to a safetycritical
industrial application in their senior year. Their first senior course
addressed the application of formal analysis during the requirements analysis
and requirements definition phase of the software engineering life cycle and
their second senior course extensively tested their ability to not only write
formal specifications but to verify the resulting design code (written in
guarded command language) of an industrial product.
It was important to choose a final senior project that possessed the requirement of being “fail safe” since it is this category of software that necessitates the type of analysis that a formal method provides. The project chosen was a software simulation of the Return To Launch Site (RTLS) scenario of the Command & Control software subsystem of the Space Shuttle for one of NASA’s Independent Verification & Validation centers. The formal methods students wrote the specifications of the subtasks of this system and derived/verified the resulting guarded command code. The end result of this project consisted of a document that includes over 800 lines of guarded command code, 1200 lines of specification, and 4000 lines of verification proofs. The formal methods students were able to identify numerous instances of inconsistency and incompleteness within the NASA document.
The students began this project at the end of the first senior course by analyzing the requirements specification document for the RTLS. The majority of these specifications were written in English and relied on line indentation to signify groupings of tasks. The first stage of the project centered on writing firstorder logic specifications of the main tasks of the RTLS. It immediately became obvious that the specifications would be exceptionally lengthy and difficult to write. But despite their length, the specifications were still not sufficiently strong to use the weakest precondition semantics to derive the code. The majority of these specifications were of the following form:
if x > 2, set y equal to 1
which would most naturally have as a
specification, “ x > 2 _{} y = 1” which is not sufficiently strong to
derive the appropriate guarded command code.
This is true because the code, y := 1, is a valid S using wp and
this specification as the postcondition; however, the assignment should only be
made if x is indeed greater than 2. To
correct this problem, one must write the specification to include additional
clauses that capture the state when variables do not change as well as when
they do. The above specification must
be written as:
( x > 2 _{} y = 1 ) _{} ( _{}(x >2) _{} y = Y ),
where y = Y is the precondition
For this reason,
the decision was made to abandon program derivation, write the guarded command
code independently of the formal specifications and then use verification
proofs to show correctness.
Once the guarded
command code was generated, the students began to prove this large
program by proving statements individually, and then proving increasingly
larger compositions of statements, continuing this process until the entire
program has been proved. Students began
this approach using the assumption that modularity is desirable: a particular
fragment of code might exist in several places and its proof could simply be
inserted each time it is needed. The
problem with this approach is that it ignores postconditions. It is not enough to prove that program S
satisfies its own postcondition when S is a component of a larger program. S must actually satisfy a larger
postcondition which includes clauses requiring that prior proofs may not be
invalidated. Therefore, the students
found they could reduce the number of proofs by starting with the program’s postcondition
when proving the individual program statements.
9. Empirical Results of the Educational Experiment
The
group of students that completed the experimental curriculum consisted of eight
seniors from an original group of fifteen.
The original group volunteered from a pool of sixtyfive freshmen.
Twentyfive percent of the completing students were women, which is a higher
percentage than for our majors. In the first two years, four students
reluctantly abandoned the experimental curriculum in order to coop or continue
their studies abroad which prevented the student from taking all six of the
required, consecutive courses. Two students found the time commitment of
learning the additional material too great.
Only one student left the program due to a dislike of the material.
These students possessed no more training
or education than our general undergraduate major student, but they did
volunteer to strengthen their complex problem solving skills. A comparison of the group’s average ACT
score with the remaining major’s average ACT score showed that the two
populations were roughly equivalent (difference of only 1%). It was particularly important that these two
groups be equivalent in ability since the experimental group could not be
randomly chosen from a large population.
The formal methods students performed an
average of sixteen percent higher on examinations in the Data Structures
course, an average of eight percent higher in the OOD course, and an
average of four percent higher in the first of the two senior courses. In all cases, the two groups performed the
same on their final examinations.
A
detailed, statistical comparison was made between a programming assignment
given to both the formal methods and our general students during their OOD
class and it appears in [1]. It is, however, worth repeating that all
of the formal methods student’s projects passed all test cases and only
five of the eleven (45.5%) nonformal method student’s projects passed all test
cases.
Clinical interviews attempted to capture
a student’s problem solving process by recording the verbalization of a
student’s attempt at solving a posed problem.
In clinical interviews, most students in both groups were able to give
solutions to known problems. However, the general students tended to give
solutions in terms of code statements and used general terms incorrectly.
Approximately one quarter of the general students were able to solve unknown
problems whereas almost all the formal method students were able to give
general, conceptual solutions to the same set of unknown problems.
A general analytic exam was given to all
senior majors in order to compare the two group’s analytic skills when solving
computer science problems. The questions
were randomly selected from sample test questions in the analytic portion of
the GRE examination. The test was administered anonymously on a single day in
the fourth year classes. Participating
students were cautioned to take the exam only once and to not discuss its
contents. The formal methods students
answered thirtysix percent more questions correctly than the general senior
major.
10. Summary
An overview
of the materials taught in an experimental curriculum that emphasized the
application of formal analysis was presented.
The goals of the experiment were met; namely, it is possible to teach
formal methods to undergraduates and the skill of formal analysis benefits the
student by increasing their complex problem solving ability. The approach, materials, and experiences are
presented here so that other academicians are given a basis for generating
their own courses emphasizing formal analysis.
As the reader may have
already concluded, this selfselected group came to this experiment of their
own free will, which suggests they had some interest in firstorder logic. Even these students who were inherently
motivated had to maintain high levels of dedication to master this material. It became clear after the Data Structures
course that the students needed additional motivation from the instructor to
see why this path was chosen to learn formal analysis skills.
The instructor must
face at least two challenges when teaching this type of material. The first centers on a student’s
eagerness to apply his knowledge to “realworld” problems. It is difficult to
maintain motivation when students are working on specifying their n^{th }relatively
small, independent function. Their
eagerness led to the selection of the Space Shuttle problem, which gave students
a virtually infinite amount of motivation (as evidenced by the creation of a
roughly five hundred page document containing thousands of lines of
analysis).
The other challenge to the instructor is
common to the teaching of any new subject; few, if any, textbooks exist.
Instructors must generate substantial portions of the material presented. This activity not only requires a large time
commitment, but leads to an untried delivery of the material. It is important
to periodically remind students that exploring new and relatively unpublished
subject matter bears the necessary burden of unpolished lectures.
The following is a quotation taken from
surveys given to the formal methods students several months after graduation. It
appears here as motivation and encouragement for other academicians to embrace
the challenge of enriching their student’s educational experience.
“Facts
and figures may never be of use in real life and forgotten quickly, but the
ability to analyze difficult problems will be applicable throughout life and
always be remembered.”
11. References
[1] Clarkson, M.R. and A.E.K. Sobel. “Formal Methods Application: An Empirical Tale of Software Development”, conditionally accepted to IEEE Trans. On Software Engineering, 2000.
[2] Cohen, E., Programming in the 1990s: An Introduction to the Calculation of Programs. SpringerVerlag, 1990.
[3] Dijkstra, E.W., A Discipline of Programming, Prentice Hall, 1976.
[4] Dijkstra, E.W. and C.S. Skholten, Predicate Calculus and Program Semantics, SpringerVerlag, 1990.
[5] Garlan, D., “Making Formal Methods Education Effective for Professional Software Engineers”, Information and Software Technology, Elsevier Science, Vol. 37, No. 56, May 1995, pp. 261268.
[6] Gries, D., “The Need for Education In Useful Formal Logic”, in “An Invitation to Formal Methods”, Computer, Vol. 29, No. 4, April 1996, pp. 2930.
[7] Hoare, C.A.R., “Communicating Sequential Processes”, CACM, Vol. 21, No. 8, 1978.
[8] Lutz, M., “Formal Methods and the Engineering Paradigm”, Proceedings of the Conference on Software Engineering Education”, SpringerVerlag, October 1992, pp. 121130.
[9]
Palmer, T.V. and J.C. Pleasant, "Attitudes Toward the Teaching of
Formal Methods of Software Development in the Undergraduate Computer Science
Curriculum: A Survey", SIGCSE Bulletin , Vol. 27, No. 3,
September 1995, pp. 5359.
[10] Parnas, D.L., “'Formal Methods' Technology Transfer Will Fail”, Journal of Systems and Software, Vol. 40, No. 3, March 1998, pp. 195198.
[11] Saiedian, H., “Towards More Formalism in Software Engineering Education”, SIGCSE Bulletin, Vol. 25, No. 1, March 1993, pp. 193197.
[12] Sobel, A.E.K., “The Model and Methodology of a TracedBased Operational Formal Method”, submitted to IEEE Trans. On Software Engineering, 1999.