Automata Theory – Quick Guide Automata Theory Introduction Automata – What is it? The term “Automata” is derived from the Greek word “αὐτόματα” which means “self-acting”. An automaton (Automata in plural) is an abstract self-propelled computing device which follows a predetermined sequence of operations automatically. An automaton with a finite number of states is called a Finite Automaton (FA) or Finite State Machine (FSM). Formal definition of a Finite Automaton An automaton can be represented by a 5-tuple (Q, ∑, δ, q0, F), where − Q is a finite set of states. ∑ is a finite set of symbols, called the alphabet of the automaton. δ is the transition function. q0 is the initial state from where any input is processed (q0 ∈ Q). F is a set of final state/states of Q (F ⊆ Q). Related Terminologies Alphabet Definition − An alphabet is any finite set of symbols. Example − ∑ = {a, b, c, d} is an alphabet set where ‘a’, ‘b’, ‘c’, and ‘d’ are symbols. String Definition − A string is a finite sequence of symbols taken from ∑. Example − ‘cabcad’ is a valid string on the alphabet set ∑ = {a, b, c, d} Length of a String Definition − It is the number of symbols present in a string. (Denoted by |S|). Examples − If S = ‘cabcad’, |S|= 6 If |S|= 0, it is called an empty string (Denoted by λ or ε) Kleene Star Definition − The Kleene star, ∑*, is a unary operator on a set of symbols or strings, ∑, that gives the infinite set of all possible strings of all possible lengths over ∑ including λ. Representation − ∑* = ∑0 ∪ ∑1 ∪ ∑2 ∪……. where ∑p is the set of all possible strings of length p. Example − If ∑ = {a, b}, ∑* = {λ, a, b, aa, ab, ba, bb,………..} Kleene Closure / Plus Definition − The set ∑+ is the infinite set of all possible strings of all possible lengths over ∑ excluding λ. Representation − ∑+ = ∑1 ∪ ∑2 ∪ ∑3 ∪……. ∑+ = ∑* − { λ } Example − If ∑ = { a, b } , ∑+ = { a, b, aa, ab, ba, bb,………..} Language Definition − A language is a subset of ∑* for some alphabet ∑. It can be finite or infinite. Example − If the language takes all possible strings of length 2 over ∑ = {a, b}, then L = { ab, aa, ba, bb } Deterministic Finite Automaton Finite Automaton can be classified into two types − Deterministic Finite Automaton (DFA) Non-deterministic Finite Automaton (NDFA / NFA) Deterministic Finite Automaton (DFA) In DFA, for each input symbol, one can determine the state to which the machine will move. Hence, it is called Deterministic Automaton. As it has a finite number of states, the machine is called Deterministic Finite Machine or Deterministic Finite Automaton. Formal Definition of a DFA A DFA can be represented by a 5-tuple (Q, ∑, δ, q0, F) where − Q is a finite set of states. ∑ is a finite set of symbols called the alphabet. δ is the transition function where δ: Q × ∑ → Q q0 is the initial state from where any input is processed (q0 ∈ Q). F is a set of final state/states of Q (F ⊆ Q). Graphical Representation of a DFA A DFA is represented by digraphs called state diagram. The vertices represent the states. The arcs labeled with an input alphabet show the transitions. The initial state is denoted by an empty single incoming arc. The final state is indicated by double circles. Example Let a deterministic finite automaton be → Q = {a, b, c}, ∑ = {0, 1}, q0 = {a}, F = {c}, and Transition function δ as shown by the following table − Present State Next State for Input 0 Next State for Input 1 a a b b c a c b c Its graphical representation would be as follows − Non-deterministic Finite Automaton In NDFA, for a particular input symbol, the machine can move to any combination of the states in the machine. In other words, the exact state to which the machine moves cannot be determined. Hence, it is called Non-deterministic Automaton. As it has finite number of states, the machine is called Non-deterministic Finite Machine or Non-deterministic Finite Automaton. Formal Definition of an NDFA An NDFA can be represented by a 5-tuple (Q, ∑, δ, q0, F) where − Q is a finite set of states. ∑ is a finite set of symbols called the alphabets. δ is the transition function where δ: Q × ∑ → 2Q (Here the power set of Q (2Q) has been taken because in case of NDFA, from a state, transition can occur to any combination of Q states) q0 is the initial state from where any input is processed (q0 ∈ Q). F is a set of final state/states of Q (F ⊆ Q). Graphical Representation of an NDFA: (same as DFA) An NDFA is represented by digraphs called state diagram. The vertices represent the states. The arcs labeled with an input alphabet show the transitions. The initial state is denoted by an empty single incoming arc. The final state is indicated by double circles. Example Let a non-deterministic finite automaton be → Q = {a, b, c} ∑ = {0, 1} q0 = {a} F = {c} The transition function δ as shown below − Present State Next State for Input 0 Next State for Input 1 a a, b b b c a, c c b, c c Its graphical representation would be as follows − DFA vs NDFA The following table lists the differences between DFA and NDFA. DFA NDFA The transition from a state is to a single particular next state for each input symbol. Hence it is called deterministic. The transition from a state can be to multiple
Category: automata Theory
Discuss Automata Theory Automata Theory is a branch of computer science that deals with designing abstract selfpropelled computing devices that follow a predetermined sequence of operations automatically. An automaton with a finite number of states is called a Finite Automaton. This is a brief and concise tutorial that introduces the fundamental concepts of Finite Automata, Regular Languages, and Pushdown Automata before moving onto Turing machines and Decidability. Learning working make money
Language Decidability A language is called Decidable or Recursive if there is a Turing machine which accepts and halts on every input string w. Every decidable language is Turing-Acceptable. A decision problem P is decidable if the language L of all yes instances to P is decidable. For a decidable language, for each input string, the TM halts either at the accept or the reject state as depicted in the following diagram − Example 1 Find out whether the following problem is decidable or not − Is a number ‘m’ prime? Solution Prime numbers = {2, 3, 5, 7, 11, 13, …………..} Divide the number ‘m’ by all the numbers between ‘2’ and ‘√m’ starting from ‘2’. If any of these numbers produce a remainder zero, then it goes to the “Rejected state”, otherwise it goes to the “Accepted state”. So, here the answer could be made by ‘Yes’ or ‘No’. Hence, it is a decidable problem. Example 2 Given a regular language L and string w, how can we check if w ∈ L? Solution Take the DFA that accepts L and check if w is accepted Some more decidable problems are − Does DFA accept the empty language? Is L1 ∩ L2 = ∅ for regular sets? Note − If a language L is decidable, then its complement L” is also decidable If a language is decidable, then there is an enumerator for it. Learning working make money
Pumping Lemma for CFG Lemma If L is a context-free language, there is a pumping length p such that any string w ∈ L of length ≥ p can be written as w = uvxyz, where vy ≠ ε, |vxy| ≤ p, and for all i ≥ 0, uvixyiz ∈ L. Applications of Pumping Lemma Pumping lemma is used to check whether a grammar is context free or not. Let us take an example and show how it is checked. Problem Find out whether the language L = {xnynzn | n ≥ 1} is context free or not. Solution Let L is context free. Then, L must satisfy pumping lemma. At first, choose a number n of the pumping lemma. Then, take z as 0n1n2n. Break z into uvwxy, where |vwx| ≤ n and vx ≠ ε. Hence vwx cannot involve both 0s and 2s, since the last 0 and the first 2 are at least (n+1) positions apart. There are two cases − Case 1 − vwx has no 2s. Then vx has only 0s and 1s. Then uwy, which would have to be in L, has n 2s, but fewer than n 0s or 1s. Case 2 − vwx has no 0s. Here contradiction occurs. Hence, L is not a context-free language. Learning working make money
Automata Theory Tutorial Job Search Automata Theory is a branch of computer science that deals with designing abstract self-propelled computing devices that follow a predetermined sequence of operations automatically. An automaton with a finite number of states is called a Finite Automaton. This is a brief and concise tutorial that introduces the fundamental concepts of Finite Automata, Regular Languages, and Pushdown Automata before moving onto Turing machines and Decidability. Who Should Learn Automata Theory? This tutorial has been prepared for students pursuing a degree in any information technology or computer science related field. It attempts to help students grasp the essential concepts involved in automata theory. Prerequisites to Learn Automata Theory This tutorial has a good balance between theory and mathematical rigor. The readers are expected to have a basic understanding of discrete mathematical structures. What is Automata Theory? In mathematics and computer science, deals with abstract machines and computational problems. Automatons are self-propelled computing devices that follow predetermined operations automatically. According to John von Neumann, “Automata theory is a logical theory that deals with the study of automata and information, emphasizing the need for a detailed, mathematical, and analytical theory to understand complex automata.” What is an Automaton? An automaton is a finite-sized device with specified inputs and outputs, whose behavior is determined by inputs. In other words, an automaton refers to self-operating mechanical objects that transform information based on predetermined instructions or procedures. An automaton can also refer to a class of electromechanical devices, both theoretical and real, that transform information after being set in motion. What are the Main Types of Automata? The automata could be classified into different types as follows − Finite Automata − A Finite automaton is an automaton with a finite number of states that changes its state based on the input string of symbols. When the desiring symbol is search, then the transition occurs. Pushdown Automata − A pushdown automaton (PDA) is a stack-based automaton used in the theory of computation. It is more capable than finite-state machines to solve little complex problem than FA. Turing Machine − A Turing machine is a theoretical device that uses a table of rules to manipulate symbols on a tape strip, consisting of an infinitely long tape divided into cells with 1”s, 0”s, or empty spaces. Linear Bounded Automata − A linear-bounded automaton (LBA) is a Turing machine that uses only the input tape space, unlike a Turing machine. Its length is a linear function of the input”s length, and only a finite contiguous portion of the tape can be accessed by the read/write head. Cellular Automata − Cellular automata are deterministic systems that evolve in discrete time and space, typically a grid. They consist of locally updated cells that follow a global time scale and recursive rule, updating synchronously across the grid. What is a Finite Automaton? A is a mathematical model of computation that can be in one of a finite number of states at any given time. It can change from one state to another in response to inputs, called a transition. An FSM is defined by its states, initial state, and inputs. There are two types of FSM: deterministic and non-deterministic. For any non-deterministic FSM, an equivalent deterministic one can be constructed. Difference between Deterministic and Non-deterministic Finite Automata The following table highlights the major differences between Deterministic and Non-deterministic Finite Automate − Aspect Deterministic Finite Automaton Nondeterministic Finite Automaton State transitions Each input determines the resulting state uniquely Some inputs may allow a choice of resulting states Predictability Final state is completely determined by input word Several possible final states for a given input word Empty transitions Can change state only after reading an input May change state without reading any input Initial state One initial state May have more than one initial state Word acceptance Word is accepted if the final state is an acceptor state Word is accepted if at least one possible final state is an acceptor state State occupancy In one state at a time Can be thought of as being in multiple states at once What is a Pushdown Automaton? are nondeterministic finite state machines with additional memory in the form of a stack. It is more powerful than a FSM but less than a Turing machine. They accept context-free languages. For example, to ensure a valid code, a programmer can feed the code into a pushdown automaton programmed with transition functions that implement the context-free grammar for the language of balanced parentheses. If the code is valid and all parentheses are matched, the pushdown automaton will accept the code. If unbalanced parentheses are present, the automaton can return the code”s invalidity. What is a Turing Machine? A is an abstract computational model that performs computations by reading and writing to an infinite tape. It provides a powerful computational model for solving computer science problems and testing the limits of computation. Turing machines was invented by Alan Turing in 1936. A Turing Machine consists of an infinite tape, a tape head, and a state transition table. They execute on an input string of bits, with the tape head in a certain state and the table governing its behavior. Turing machines can simulate the complexity of a program and its reactions to different data in memory. What is the Chomsky Hierarchy? The hierarchy is a classification system in computer science and linguistics that consists of four levels − Type 0 (unrestricted) Type 1 (context-sensitive) Type 2 (context-free) Type 3 (regular) Named after Noam Chomsky, it characterizes the expressive power of different types of formal languages and grammars. It is a fundamental concept in the study of formal languages and is used in the development of parsing algorithms and other tools for working with formal languages. What are Context-Free Languages? are generated by context-free grammars, which are identical to the set of languages accepted by pushdown automata. Regular languages are a subset of context-free languages, and inputted languages are accepted by computational models if they end
Non-deterministic Finite Automaton In NDFA, for a particular input symbol, the machine can move to any combination of the states in the machine. In other words, the exact state to which the machine moves cannot be determined. Hence, it is called Non-deterministic Automaton. As it has finite number of states, the machine is called Non-deterministic Finite Machine or Non-deterministic Finite Automaton. Formal Definition of an NDFA An NDFA can be represented by a 5-tuple (Q, ∑, δ, q0, F) where − Q is a finite set of states. ∑ is a finite set of symbols called the alphabets. δ is the transition function where δ: Q × ∑ → 2Q (Here the power set of Q (2Q) has been taken because in case of NDFA, from a state, transition can occur to any combination of Q states) q0 is the initial state from where any input is processed (q0 ∈ Q). F is a set of final state/states of Q (F ⊆ Q). Graphical Representation of an NDFA: (same as DFA) An NDFA is represented by digraphs called state diagram. The vertices represent the states. The arcs labeled with an input alphabet show the transitions. The initial state is denoted by an empty single incoming arc. The final state is indicated by double circles. Example Let a non-deterministic finite automaton be → Q = {a, b, c} ∑ = {0, 1} q0 = {a} F = {c} The transition function δ as shown below − Present State Next State for Input 0 Next State for Input 1 a a, b b b c a, c c b, c c Its graphical representation would be as follows − DFA vs NDFA The following table lists the differences between DFA and NDFA. DFA NDFA The transition from a state is to a single particular next state for each input symbol. Hence it is called deterministic. The transition from a state can be to multiple next states for each input symbol. Hence it is called non-deterministic. Empty string transitions are not seen in DFA. NDFA permits empty string transitions. Backtracking is allowed in DFA In NDFA, backtracking is not always possible. Requires more space. Requires less space. A string is accepted by a DFA, if it transits to a final state. A string is accepted by a NDFA, if at least one of all possible transitions ends in a final state. Acceptors, Classifiers, and Transducers Acceptor (Recognizer) An automaton that computes a Boolean function is called an acceptor. All the states of an acceptor is either accepting or rejecting the inputs given to it. Classifier A classifier has more than two final states and it gives a single output when it terminates. Transducer An automaton that produces outputs based on current input and/or previous state is called a transducer. Transducers can be of two types − Mealy Machine − The output depends both on the current state and the current input. Moore Machine − The output depends only on the current state. Acceptability by DFA and NDFA A string is accepted by a DFA/NDFA iff the DFA/NDFA starting at the initial state ends in an accepting state (any of the final states) after reading the string wholly. A string S is accepted by a DFA/NDFA (Q, ∑, δ, q0, F), iff δ*(q0, S) ∈ F The language L accepted by DFA/NDFA is {S | S ∈ ∑* and δ*(q0, S) ∈ F} A string S′ is not accepted by a DFA/NDFA (Q, ∑, δ, q0, F), iff δ*(q0, S′) ∉ F The language L′ not accepted by DFA/NDFA (Complement of accepted language L) is {S | S ∈ ∑* and δ*(q0, S) ∉ F} Example Let us consider the DFA shown in Figure 1.3. From the DFA, the acceptable strings can be derived. Strings accepted by the above DFA: {0, 00, 11, 010, 101, ………..} Strings not accepted by the above DFA: {1, 011, 111, ……..} Learning working make money
Greibach Normal Form A CFG is in Greibach Normal Form if the Productions are in the following forms − A → b A → bD1…Dn S → ε where A, D1,….,Dn are non-terminals and b is a terminal. Algorithm to Convert a CFG into Greibach Normal Form Step 1 − If the start symbol S occurs on some right side, create a new start symbol S’ and a new production S’ → S. Step 2 − Remove Null productions. (Using the Null production removal algorithm discussed earlier) Step 3 − Remove unit productions. (Using the Unit production removal algorithm discussed earlier) Step 4 − Remove all direct and indirect left-recursion. Step 5 − Do proper substitutions of productions to convert it into the proper form of GNF. Problem Convert the following CFG into CNF S → XY | Xn | p X → mX | m Y → Xn | o Solution Here, S does not appear on the right side of any production and there are no unit or null productions in the production rule set. So, we can skip Step 1 to Step 3. Step 4 Now after replacing X in S → XY | Xo | p with mX | m we obtain S → mXY | mY | mXo | mo | p. And after replacing X in Y → Xn | o with the right side of X → mX | m we obtain Y → mXn | mn | o. Two new productions O → o and P → p are added to the production set and then we came to the final GNF as the following − S → mXY | mY | mXC | mC | p X → mX | m Y → mXD | mD | o O → o P → p Learning working make money
Non-Deterministic Turing Machine In a Non-Deterministic Turing Machine, for every state and symbol, there are a group of actions the TM can have. So, here the transitions are not deterministic. The computation of a non-deterministic Turing Machine is a tree of configurations that can be reached from the start configuration. An input is accepted if there is at least one node of the tree which is an accept configuration, otherwise it is not accepted. If all branches of the computational tree halt on all inputs, the non-deterministic Turing Machine is called a Decider and if for some input, all branches are rejected, the input is also rejected. A non-deterministic Turing machine can be formally defined as a 6-tuple (Q, X, ∑, δ, q0, B, F) where − Q is a finite set of states X is the tape alphabet ∑ is the input alphabet δ is a transition function; δ : Q × X → P(Q × X × {Left_shift, Right_shift}). q0 is the initial state B is the blank symbol F is the set of final states Learning working make money
Linear Bounded Automata A linear bounded automaton is a multi-track non-deterministic Turing machine with a tape of some bounded finite length. Length = function (Length of the initial input string, constant c) Here, Memory information ≤ c × Input information The computation is restricted to the constant bounded area. The input alphabet contains two special symbols which serve as left end markers and right end markers which mean the transitions neither move to the left of the left end marker nor to the right of the right end marker of the tape. A linear bounded automaton can be defined as an 8-tuple (Q, X, ∑, q0, ML, MR, δ, F) where − Q is a finite set of states X is the tape alphabet ∑ is the input alphabet q0 is the initial state ML is the left end marker MR is the right end marker where MR ≠ ML δ is a transition function which maps each pair (state, tape symbol) to (state, tape symbol, Constant ‘c’) where c can be 0 or +1 or -1 F is the set of final states A deterministic linear bounded automaton is always context-sensitive and the linear bounded automaton with empty language is undecidable.. Learning working make money
Post Correspondence Problem The Post Correspondence Problem (PCP), introduced by Emil Post in 1946, is an undecidable decision problem. The PCP problem over an alphabet ∑ is stated as follows − Given the following two lists, M and N of non-empty strings over ∑ − M = (x1, x2, x3,………, xn) N = (y1, y2, y3,………, yn) We can say that there is a Post Correspondence Solution, if for some i1,i2,………… ik, where 1 ≤ ij ≤ n, the condition xi1 …….xik = yi1 …….yik satisfies. Example 1 Find whether the lists M = (abb, aa, aaa) and N = (bba, aaa, aa) have a Post Correspondence Solution? Solution x1 x2 x3 M Abb aa aaa N Bba aaa aa Here, x2x1x3 = ‘aaabbaaa’ and y2y1y3 = ‘aaabbaaa’ We can see that x2x1x3 = y2y1y3 Hence, the solution is i = 2, j = 1, and k = 3. Example 2 Find whether the lists M = (ab, bab, bbaaa) and N = (a, ba, bab) have a Post Correspondence Solution? Solution x1 x2 x3 M ab bab bbaaa N a ba bab In this case, there is no solution because − | x2x1x3 | ≠ | y2y1y3 | (Lengths are not same) Hence, it can be said that this Post Correspondence Problem is undecidable. Learning working make money