Introduction

Tony Hoare introduced Communicating Sequential Processes (CSP) in 1978 as a language to describe interactions between concurrent processes. Historically, software advancement has mainly relied upon improvements in hardware that create enable faster CPUs and larger memory. Hoare recognized that a machine with hardware that is 10 times as fast running a code that consumes 10 times more resources is not an improvement.

While concurrency has many advantages over traditional sequential programming, it has failed to gain a popular audience because of its erroneous nature. With CSP, Hoare introduced a precise theory that can mathematically guarantee programs to be free of the common problems of concurrency. In his book, Learning CSP (the third most quoted book in Computer Science), Hoare uses calculus to show that it is possible to work with deadlocks and nondeterminism as if they were terminal events in ordinary processes. By reducing the errors of CSP, Hoare has enabled computer scientists to fully exploit the capacity of CPUs.

What is concurrency?

The ordinary task of doing your laundry illustrates the basics of concurrency. There are two ways to finish two loads of laundry:

Put 1st load in washing machine → put 1st load in dryer → fold 1st load → put the 2nd load into the washing machine → repeat process.
Put 1st load in washing machine → put 1st load in dryer → put 2nd load in washing machine → etc…

Clearly, the second method is quicker and better utilizes resources. If the first load is in the dryer, the washing machine isn’t being used and should be used by the second load. This is the basics of concurrency.

Historical Information

The earliest ideas for concurrent processing arose naturally in the 1960’s because of scarce resources. At that time, processing power was expensive, and it was wasteful for a processor to have to wait while it communicated with slow peripheral equipment or human (Hoare, Learning CSP).

Ex: The task of performing a simple addition problem shows how concurrency can greatly improve computational efficiency. Adding numbers requires three parts:

Ask user for input numbers
Performing calculation
Printing the answer

Problems of concurrency and Hoare’s solution

While concurrency offers great potential for faster computation, it is not without fault. Unlike parallelism, where n tasks run on n processors, in concurrency, n tasks run on 1 processor. Because the amount of resources doesn’t increase, programs that need to use the same resource must wait.

The amount of time user waits increases linearly
The amount of storage space needed increases with number of jobs
Difficulty verifying program correctness

The difficulty of verifying program correctness has been the primary hindrance in the field. Programmers have shied away from parallel programming because errors that arise in this method of programming are notoriously difficult to track down. Programs are often prone to errors that not obvious and surface under arbitrary and unrepeatable situations. The complex nature of concurrency leaves the programmer in doubt. Programmers often resort to exhaustive search tactics to find errors:

Testing the code rigorously to sort out obvious concurrency issues and hoping that all problems have been resolved
Completely follow design patterns and guidelines for concurrent programming. This method is limited in its applicability.

OR...
Hoare’s approach CSP uses mathematical deductions to prove that a program is error free. Through the application of CSP, programmers no longer search for bugs in written programs, but rather can write programs that are logically guaranteed to be correct.

General terms in CSP

Alphabet- set of events which are considered relevant for a particular object. The alphabet of a process is enclosed within curly brackets. a. ex: vending machine- alphabet {coin, chocolate} b. not valid: {coin, car} Trace: sequence of symbols recording the events in which the process has engaged in within a given time. The trace is enclosed within angle brackets. c. Vending machine:

Nondeterminism

In a concurrent program, two or more processes compete for the same resource. The resolution of this dilemma is not always deterministic. The unpredictable arrival order of messages creates a nondeterministic state (2).

Ex: A change machine can give change for a dollar in many different ways: four quarters, ten dimes, 100 pennies. The type of change given is not dependent on the type of change machine, but rather by some arbitrary or nondeterministic fashion.
Ex2: A passenger waiting for the bus to take him to London. The A, B, D, and F lines all go to London. The passenger can take any f these lines. Which line he gets on is only dependent upon which bus arrives first.

Why nondeterminism? Programmers use nondeterminism to exclude the events that don’t effect the outcome as a technique to simplify the problem. In the case of the change machine, only the sum, not the combination of coins matters. By reducing the number of variables, nondeterminism helps maintain a high level of abstraction when describing complicated systems.

Problems with nondeterminism Although nondeterminism can greatly reduce the complexity of problems, they also introduce their other issues. In a deterministic program, the answer will always be the same given the same inputs. In a nondeterministic program, the same inputs can yield different answers on different cycles or machines. This characteristic makes it difficult to check whether the program works.

CSP and nondeterminism

CSP introduces the notation Π to signify a process which “behaves either like P or like Q, where the selection between them is arbitrarily, without the knowledge of the external environment (1).”

Ex: During lunch, you can choose between an apple or an orange. The choice can be mathematically expressed as: orange Π apple, where the choice between the two is based on a unaccounted external factor.

Algebraic laws governing nondeterministic choices are simple.

Idempotenence: A choice between P and P is empty P Π Q = P
Symmetry:P Π Q= Q Π P The order does not influence the choice
Associative: P Π (Q Π R) = (P Π Q) Π R The choice between three options can be divided into two successive single choices.
Distributive: x → (P Π Q) = (x → P) Π (x → Q) Going from a defined path x to a choice between path P and Q, is the same as choosing between the options of going from path x to P or from path x to Q.

Fairness

In some theories, nondeterminism is obliged to be fair, in the sense that an event that infinitely often may happen eventually must happen (though there is no limit to how long it may be delayed). In Hoare’s theory, the concept of fairness doesn’t exist:

“Because we observe only finite traces of the behaviour of a process, if an event can be postponed indefinitely, we can never tell whether it is going to happen or not. If we want to insist that the event shall happen eventually, we must state that there is a number n such that every trace longer than n contains that event. Then the process must be designed explicitly to satisfy this constraint. For example, in the process P0 defined below, then event a must always occur within n steps of its previous occurrence.”

-Learning CSP

If fairness is required for the program, it must be considered and accounted for separately.

Shared Resources

Laws for reasoning about sequential processes derives from the fact that each variable is updated by one process (learning CSP). If storage is shared, only one process can change the variable. The potential for data corruption makes common variables and communication amongst processes difficult to implement.

Deadlock

Deadlock is the permanent blocking of a set of processes. It is a common problem in concurrency and arises from the conflicting needs of processes for similar resources or when communicating with each other.

Example of deadlock: Intersection of cars(3). The shared resources can be thought of as the lanes. Each car shares the four lanes. The process is the car. Deadlock occurs when each process holds one resource and requests the other. While one car can decide to switch lanes, no car can agree on the proper action to take. As in traffic, deadlock in computer science slows or completely halts a program.

Classical illustration of deadlock

There is a group of five philosophers who do nothing but eat and think all day. The philosophers sit around a round table with a bowl of noodles in the middle. As philosophers get paid less than computer scientists, they can afford only five single chopsticks, which are placed on each side of the philosopher. To eat the noodle, the philosopher needs both chopsticks. The philosopher first picks up the chopstick from his left, and then his right if it is not being used.

Deadlock arrives when all philosophers want to eat at exactly the same time. All philosophers pick up the chopstick to their left. However, none of the philosophers can eat because another philosopher is currently using the other chopstick.

Events that lead to Deadlock

There are several combinations of events can cause deadlock

Mutual exclusion: Only one process can use a resource at a time
Hold- and- wait: A process holds onto its resource until the next resource its needs becomes available
No preemption: No process can be forced to give up its resources
Circular wait: closed chain of processes where each process holds the resource the next process needs to function.

While these situations can lead to deadlock, there are precautions programmers can take to prevent their occurrence.

Mutual exclusion: Restrict the way in which resources can be requested.
Hold- and- wait: Require all processes to provide information about the resources they will need in advance. Use algorithms to insure that all resources are available to the process before it attempts to acquire them.
Circular wait: Establish a priority system that requires processes to request resources and process them in that order, such that a higher priority process will always have access to the resource first. This solution can lead to starvation.

Eliminating the possibility of a deadlock is better than dealing the deadlock during execution. However there will may arise arrive unique combination of situations that lead to deadlock. There are methods of resolving a deadlock when it’s detected, but these solutions are not efficient and resolve in lost data (4).

Preempt the resource from a process. The preempt process can be resumed at a later time.
Return to a point where the process did not need the acquired resource causing the deadlock.
Systematic killing of jobs until deadlock is resolved.

CSP and deadlock solution

In order to prevent data corruption, Hoare purposed the concept of a critical area. Processes cross the critical area to gain access to the shared data. Before entry to the critical area, all other processes must verify and update the value of the shared variable. Upon exit, the processes must again verify that all processes have the same value.

Another technique to maintain data integrity is through the use of mutual exclusion semaphore or a mutex. A mutex is a specific subclass of a semaphore that only allows one process to access the variable at once. A semaphore is a restricted access variable that serves as the classic solution to preventing race hazards in concurrency. Other processes attempting to access the mutex are blocked and must wait until the current process releases the mutex. When the mutex is released, only one of the waiting processes will gain access to the variable, and all others continue to wait.

In the early 1970s, Hoare developed a concept known as a monitor based on the concept of the mutex. According to a tutorial on CSP in the Java programming language written by IBM:

“A monitor is a body of code whose access is guarded by a mutex. Any process wishing to execute this code must acquire the associated mutex at the top of the code block and release it at the bottom. Because only one thread can own a mutex at a given time, this effectively ensures that only the owing thread can execute a monitor block of code.”

Monitors can help prevent data corruption and deadlocks (5)

Possible Solutions to Philosopher Problem

A physical representation of the solution to the deadlock problem can be visualized as the footman. The behavior of the footman allows him to sit only four philosophers at the table simultaneously.

The metaphor of the dining philosophers was thought by the well known computer scientist, Edsger Dijkstra. Carel S. Scholten discovered the footman solution.

Infinite Overtaking

This problem arises when priority is assigned to programs. Some program always takes precedence at the expense of other programs that are delayed forever.
Ex: You are waiting to be seated at a restaurant. You are next in line, and just about to be shown your table when a famous actor walks in. The restaurant, mindful of good publicity, seats the famous actor first. When the next table becomes available, the waiter turns to you, but then sees a famous singer walk in. Again, the waiter seats the singer before you. The weighting of resources leaves you at a disadvantage. If this cycle continues, you could be delayed forever, or at least for an unacceptable period of time.

Overtaking Solution

The task of deciding how to allocate resources to waiting processes is called scheduling. Scheduling is split into two events, which Hoare terms the please and the thankyou:

Please- processes requesting the resource
Thankyou- the allocation of the resource to processes.

The time between the request and granting of the resource is the waiting period. In CSP, there are several techniques that prevent infinite waiting times.

Limiting resource use and increasing availability of resource.
First in first out (FIFO)- allocate resource to the process that has waited the longest.
Bakery algorithm (A more technical explanation of the scheduling algorithm can be found in the reference (6))

Limitations of CSP

In determininistic programs, the result will be the same if the environment is constant. Because concurrency is based on non-determinisim, the environment does not affect the program. Given the paths chosen, the program can run several times and receive different result. To insure the accuracy of concurrent programs, programmers must be able to consider the execution of their program on a holistic level.

However, despite the formal methods that Hoare introduced, there still lacks any proof method to verify correct programs. CSP can only catch problems it knows exists, not unknown problems. While commercial applications based on CSP, such as ConAn, can detect the presence of errors, it can’t detect their absence. While CSP gives you the tools to write a program that can avoid the common concurrency errors, the proof of a correct program remains an unresolved area in CSP.

Future of CSP

CSP has great potential in biology and chemistry to model complex systems in nature. It has not been widely used in industry because of the many existing logical problems facing the industry. At the conference for the 25th anniversary for the development of CSP, Hoare noted that despite the many research projects funded by Microsoft, Bill Gates ignores the issue of when Microsoft will be able to commercialize the work on CSP (7).

Hoare reminds his audience that the area of dynamic procedures still requires much more research. Currently, the computer science community is stuck in the paradigm of sequential thought. With the foundation in formal methods of concurrency established by Hoare, the scientific community is primed to being the next revolution in parallel programming.

References

Hoare, C.A.R. Learning CSP. June 21, 2004.
Haghighi, Hassan and Mirian-Hosseinabadi, Seyyed H. Nondeterminism in Formal Development of Concurrent Programs: A Constructive Approach
Concurrency: Deadlock and Starvation. Presentation Obtained from engr.smu.edu/~kocan/7343/fall05/slides/Chapter06.ppt
Rinard, Martin C. Operating Systems Lecture Notes
Abhijit Belapurkar. CSP for Java Programmers, Part 1
Carnegie Melon. Bakery Algorithm
Numerico, Teresa an Bowen, Jonathon. 25 Years of CSP