Basics of Automata Theory

Introduction

Automata Theory is an exciting, theoretical branch of computer science. It established its roots during the 20th Century, as mathematicians began developing - both theoretically and literally - machines which imitated certain features of man, completing calculations more quickly and reliably. The word automaton itself, closely related to the word "automation", denotes automatic processes carrying out the production of specific processes. Simply stated, automata theory deals with the logic of computation with respect to simple machines, referred to as automata. Through automata, computer scientists are able to understand how machines compute functions and solve problems and more importantly, what it means for a function to be defined as computable or for a question to be described as decidable .

Automatons are abstract models of machines that perform computations on an input by moving through a series of states or configurations. At each state of the computation, a transition function determines the next configuration on the basis of a finite portion of the present configuration. As a result, once the computation reaches an accepting configuration, it accepts that input. The most general and powerful automata is the Turing machine.

The major objective of automata theory is to develop methods by which computer scientists can describe and analyze the dynamic behavior of discrete systems, in which signals are sampled periodically. The behavior of these discrete systems is determined by the way that the system is constructed from storage and combinational elements. Characteristics of such machines include:

Inputs: assumed to be sequences of symbols selected from a finite set I of input signals. Namely, set I is the set {x₁, x,₂, x₃... x_k} where k is the number of inputs.
Outputs: sequences of symbols selected from a finite set Z. Namely, set Z is the set {y₁, y₂, y₃ ... y_m} where m is the number of outputs.
States: finite set Q, whose definition depends on the type of automaton.

There are four major families of automaton :

Finite-state machine
Pushdown automata
Linear-bounded automata
Turing machine

The families of automata above can be interpreted in a hierarchal form, where the finite-state machine is the simplest automata and the Turing machine is the most complex. The focus of this project is on the finite-state machine and the Turing machine. A Turing machine is a finite-state machine yet the inverse is not true.

[top]

Finite State Machines

The exciting history of how finite automata became a branch of computer science illustrates its wide range of applications. The first people to consider the concept of a finite-state machine included a team of biologists, psychologists, mathematicians, engineers and some of the first computer scientists. They all shared a common interest: to model the human thought process, whether in the brain or in a computer. Warren McCulloch and Walter Pitts, two neurophysiologists, were the first to present a description of finite automata in 1943. Their paper, entitled, "A Logical Calculus Immanent in Nervous Activity", made significant contributions to the study of neural network theory, theory of automata, the theory of computation and cybernetics. Later, two computer scientists, G.H. Mealy and E.F. Moore, generalized the theory to much more powerful machines in separate papers, published in 1955-56. The finite-state machines, the Mealy machine and the Moore machine, are named in recognition of their work. While the Mealy machine determines its outputs through the current state and the input, the Moore machine's output is based upon the current state alone.


Warren McCulloch and Walter Pitts (source)

An automaton in which the state set Q contains only a finite number of elements is called a finite-state machine (FSM). FSMs are abstract machines, consisting of a set of states (set Q), set of input events (set I), a set of output events (set Z) and a state transition function. The state transition function takes the current state and an input event and returns the new set of output events and the next state. Therefore, it can be seen as a function which maps an ordered sequence of input events into a corresponding sequence, or set, of output events.

State transition function: I → Z

Finite-state machines are ideal computation models for a small amount of memory, and do not maintain memory. This mathematical model of a machine can only reach a finite number of states and transitions between these states. Its main application is in mathematical problem analysis. Finite-machines are also used for purposes aside from general computations, such as to recognize regular languages.

In order to fully understand conceptually a finite-state machine, consider an analogy to an elevator:

An elevator is a mechanism that does not remember all previous requests for service but the current floor, the direction of motion (up or down) and the collection of not-yet satisfied requests for services. Therefore, at any given moment in time, an elevator in operated would be defined by the following mathematical terms:

States: finite set of states to reflect the past history of the customers' requests.

Inputs: finite set of input, depending on the number of floors the elevator is able to access. We can use the set I, whose size is the number of floors in the building.

Outputs: finite set of output, depending on the need for the elevator to go up or down, according to customers' needs.

A finite-state machine is formally defined as a 5-tuple (Q, I, Z, ∂, W) such that:

Q = finite set of states
I = finite set of input symbols
Z = finite set of output symbols
∂ = mapping of I x Q into Q called the state transition function, i.e. I x Q → Q
W = mapping W of I x Q onto Z, called the output function
A = set of accept states where F is a subset of Q

From the mathematical interpretation above, it can be said that a finite-state machine contains a finite number of states. Each state accepts a finite number of inputs, and each state has rules that describe the action of the machine for ever input, represented in the state transition mapping function. At the same time, an input may cause the machine to change states. For every input symbol, there is exactly one transition out of each state. In addition, any 5-tuple set that is accepted by nondeterministic finite automata is also accepted by deterministic finite automata.

When considering finite-state machines, it is important to keep in mind that the mechanical process inside the automata that leads to the calculation of outputs and change of states is not emphasized or delved into detail; it is instead considered a "black box", as illustrated below:

Having finite, constant amounts of memory, the internal states of an FSM carry no further structure. They can easily be represented using state diagrams, as seen below:

(from Automata, page 7)

The state diagram illustrates the operation of an automaton. States are represented by nodes of graphs, transitions by the arrows or branches, and the corresponding inputs and outputs are denoted by symbols. The arrow entering from the left into q₀ shows that q₀ is the initial state of the machine. Moves that do not involve changes of states are indicated by arrows along the sides of individual nodes. These arrows are known as self-loops .

There exist several types of finite-state machines, which can be divided into three main categories:

acceptors: either accept the input or do not
recognizers: either recognize the input or do not
transducers: generate output from given input

Applications of finite-state machines are found in a variety of subjects. They can operate on languages with a finite number of words (standard case), an infinite number of words (Rabin automata, Bïrche automata), various types of trees, and in hardware circuits, where the input, the state and the output are bit vectors of a fixed size.

[top]

Finite State vs. Turing Machines

The simplest automata used for computation is a finite automaton. It can compute only very primitive functions; therefore, it is not an adequate computation model. In addition, a finite-state machine's inability to generalize computations hinders its power.

The following is an example to illustrate the difference between a finite-state machine and a Turing machine:

Imagine a Modern CPU. Every bit in a machine can only be in two states (0 or 1). Therefore, there are a finite number of possible states. In addition, when considering the parts of a computer a CPU interacts with, there are a finite number of possible inputs from the computer's mouse, keyboard, hard disk, different slot cards, etc. As a result, one can conclude that a CPU can be modeled as a finite-state machine.

Now, consider a computer. Although every bit in a machine can only be in two different states (0 or 1), there are an infinite number of interactions within the computer as a whole. It becomes exceeding difficult to model the workings of a computer within the constraints of a finite-state machine. However, higher-level, infinite and more powerful automata would be capable of carrying out this task.

World-renowned computer scientist Alan Turing conceived the first "infinite" (or unbounded) model of computation: the Turing machine, in 1936, to solve the Entscheindungsproblem. The Turing machine can be thought of as a finite automaton or control unit equipped with an infinite storage (memory). Its "memory" consists of an infinite number of one-dimensional array of cells. Turing's machine is essentially an abstract model of modern-day computer execution and storage, developed in order to provide a precise mathematical definition of an algorithm or mechanical procedure.

Alan Turing (source)

While an automaton is called finite if its model consists of a finite number of states and functions with finite strings of input and output, infinite automata have an "accessory" - either a stack or a tape that can be moved to the right or left, and can meet the same demands made on a machine.

A Turing machine is formally defined by the set [Q, Σ, Γ, δ, q₀, B, F] where

Q = finite set of states, of which one state q₀ is the initial state
Σ = a subset of Γ not including B, is the set of input symbols
Γ = finite set of allowable tape symbols
δ = the next move function , a mapping function from Q x Γ to Q x Γ x {L,R}, where L and R denote the directions left and right respectively
q₀ = in set Q as the start state
B = a symbol of Γ, as the blank
F ⊆ Q the set of final states

Therefore, the major difference between a Turing machine and two-way finite automata (FSM) lies in the fact that the Turing machine is capable of changing symbols on its tape and simulating computer execution and storage. For this reason, it can be said that the Turing Machine has the power to model all computations that can be calculated today through modern computers.

[top]