Regular Expressions A Regular Expression can be recursively defined as follows − ε is a Regular Expression indicates the language containing an empty string. (L (ε) = {ε}) φ is a Regular Expression denoting an empty language. (L (φ) = { }) x is a Regular Expression where L = {x} If X is a Regular Expression denoting the language L(X) and Y is a Regular Expression denoting the language L(Y), then X + Y is a Regular Expression corresponding to the language L(X) ∪ L(Y) where L(X+Y) = L(X) ∪ L(Y). X . Y is a Regular Expression corresponding to the language L(X) . L(Y) where L(X.Y) = L(X) . L(Y) R* is a Regular Expression corresponding to the language L(R*)where L(R*) = (L(R))* If we apply any of the rules several times from 1 to 5, they are Regular Expressions. Some RE Examples Regular Expressions Regular Set (0 + 10*) L = { 0, 1, 10, 100, 1000, 10000, … } (0*10*) L = {1, 01, 10, 010, 0010, …} (0 + ε)(1 + ε) L = {ε, 0, 1, 01} (a+b)* Set of strings of a’s and b’s of any length including the null string. So L = { ε, a, b, aa , ab , bb , ba, aaa…….} (a+b)*abb Set of strings of a’s and b’s ending with the string abb. So L = {abb, aabb, babb, aaabb, ababb, …………..} (11)* Set consisting of even number of 1’s including empty string, So L= {ε, 11, 1111, 111111, ……….} (aa)*(bb)*b Set of strings consisting of even number of a’s followed by odd number of b’s , so L = {b, aab, aabbb, aabbbbb, aaaab, aaaabbb, …………..} (aa + ab + ba + bb)* String of a’s and b’s of even length can be obtained by concatenating any combination of the strings aa, ab, ba and bb including null, so L = {aa, ab, ba, bb, aaab, aaba, …………..} Learning working make money
Category: automata Theory
Moore and Mealy Machines Finite automata may have outputs corresponding to each transition. There are two types of finite state machines that generate output − Mealy Machine Moore machine Mealy Machine A Mealy Machine is an FSM whose output depends on the present state as well as the present input. It can be described by a 6 tuple (Q, ∑, O, δ, X, q0) where − Q is a finite set of states. ∑ is a finite set of symbols called the input alphabet. O is a finite set of symbols called the output alphabet. δ is the input transition function where δ: Q × ∑ → Q X is the output transition function where X: Q × ∑ → O q0 is the initial state from where any input is processed (q0 ∈ Q). The state table of a Mealy Machine is shown below − Present state Next state input = 0 input = 1 State Output State Output → a b x1 c x1 b b x2 d x3 c d x3 c x1 d d x3 d x2 The state diagram of the above Mealy Machine is − Moore Machine Moore machine is an FSM whose outputs depend on only the present state. A Moore machine can be described by a 6 tuple (Q, ∑, O, δ, X, q0) where − Q is a finite set of states. ∑ is a finite set of symbols called the input alphabet. O is a finite set of symbols called the output alphabet. δ is the input transition function where δ: Q × ∑ → Q X is the output transition function where X: Q → O q0 is the initial state from where any input is processed (q0 ∈ Q). The state table of a Moore Machine is shown below − Present state Next State Output Input = 0 Input = 1 → a b c x2 b b d x1 c c d x2 d d d x3 The state diagram of the above Moore Machine is − Mealy Machine vs. Moore Machine The following table highlights the points that differentiate a Mealy Machine from a Moore Machine. Mealy Machine Moore Machine Output depends both upon the present state and the present input Output depends only upon the present state. Generally, it has fewer states than Moore Machine. Generally, it has more states than Mealy Machine. The value of the output function is a function of the transitions and the changes, when the input logic on the present state is done. The value of the output function is a function of the current state and the changes at the clock edges, whenever state changes occur. Mealy machines react faster to inputs. They generally react in the same clock cycle. In Moore machines, more logic is required to decode the outputs resulting in more circuit delays. They generally react one clock cycle later. Moore Machine to Mealy Machine Algorithm 4 Input − Moore Machine Output − Mealy Machine Step 1 − Take a blank Mealy Machine transition table format. Step 2 − Copy all the Moore Machine transition states into this table format. Step 3 − Check the present states and their corresponding outputs in the Moore Machine state table; if for a state Qi output is m, copy it into the output columns of the Mealy Machine state table wherever Qi appears in the next state. Example Let us consider the following Moore machine − Present State Next State Output a = 0 a = 1 → a d b 1 b a d 0 c c c 0 d b a 1 Now we apply Algorithm 4 to convert it to Mealy Machine. Step 1 & 2 − Present State Next State a = 0 a = 1 State Output State Output → a d b b a d c c c d b a Step 3 − Present State Next State a = 0 a = 1 State Output State Output => a d 1 b 0 b a 1 d 1 c c 0 c 0 d b 0 a 1 Mealy Machine to Moore Machine Algorithm 5 Input − Mealy Machine Output − Moore Machine Step 1 − Calculate the number of different outputs for each state (Qi) that are available in the state table of the Mealy machine. Step 2 − If all the outputs of Qi are same, copy state Qi. If it has n distinct outputs, break Qi into n states as Qin where n = 0, 1, 2……. Step 3 − If the output of the initial state is 1, insert a new initial state at the beginning which gives 0 output. Example Let us consider the following Mealy Machine − Present State Next State a = 0 a = 1 Next State Output Next State Output → a d 0 b 1 b a 1 d 0 c c 1 c 0 d b 0 a 1 Here, states ‘a’ and ‘d’ give only 1 and 0 outputs respectively, so we retain states ‘a’ and ‘d’. But states ‘b’ and ‘c’ produce different outputs (1 and 0). So, we divide b into b0, b1 and c into c0, c1. Present State Next State Output a = 0 a = 1 → a d b1 1 b0 a d 0 b1 a d 1 c0 c1 C0 0 c1 c1 C0 1 d b0 a 0 Learning working make money
Chomsky Classification of Grammars According to Noam Chomosky, there are four types of grammars − Type 0, Type 1, Type 2, and Type 3. The following table shows how they differ from each other − Grammar Type Grammar Accepted Language Accepted Automaton Type 0 Unrestricted grammar Recursively enumerable language Turing Machine Type 1 Context-sensitive grammar Context-sensitive language Linear-bounded automaton Type 2 Context-free grammar Context-free language Pushdown automaton Type 3 Regular grammar Regular language Finite state automaton Take a look at the following illustration. It shows the scope of each type of grammar − Type – 3 Grammar Type-3 grammars generate regular languages. Type-3 grammars must have a single non-terminal on the left-hand side and a right-hand side consisting of a single terminal or single terminal followed by a single non-terminal. The productions must be in the form X → a or X → aY where X, Y ∈ N (Non terminal) and a ∈ T (Terminal) The rule S → ε is allowed if S does not appear on the right side of any rule. Example X → ε X → a | aY Y → b Type – 2 Grammar Type-2 grammars generate context-free languages. The productions must be in the form A → γ where A ∈ N (Non terminal) and γ ∈ (T ∪ N)* (String of terminals and non-terminals). These languages generated by these grammars are be recognized by a non-deterministic pushdown automaton. Example S → X a X → a X → aX X → abc X → ε Type – 1 Grammar Type-1 grammars generate context-sensitive languages. The productions must be in the form α A β → α γ β where A ∈ N (Non-terminal) and α, β, γ ∈ (T ∪ N)* (Strings of terminals and non-terminals) The strings α and β may be empty, but γ must be non-empty. The rule S → ε is allowed if S does not appear on the right side of any rule. The languages generated by these grammars are recognized by a linear bounded automaton. Example AB → AbBc A → bcA B → b Type – 0 Grammar Type-0 grammars generate recursively enumerable languages. The productions have no restrictions. They are any phase structure grammar including all formal grammars. They generate the languages that are recognized by a Turing machine. The productions can be in the form of α → β where α is a string of terminals and nonterminals with at least one non-terminal and α cannot be null. β is a string of terminals and non-terminals. Example S → ACaB Bc → acB CB → DB aD → Db Learning working make money
Language Generated by a Grammar The set of all strings that can be derived from a grammar is said to be the language generated from that grammar. A language generated by a grammar G is a subset formally defined by L(G)={W|W ∈ ∑*, S ⇒G W} If L(G1) = L(G2), the Grammar G1 is equivalent to the Grammar G2. Example If there is a grammar G: N = {S, A, B} T = {a, b} P = {S → AB, A → a, B → b} Here S produces AB, and we can replace A by a, and B by b. Here, the only accepted string is ab, i.e., L(G) = {ab} Example Suppose we have the following grammar − G: N = {S, A, B} T = {a, b} P = {S → AB, A → aA|a, B → bB|b} The language generated by this grammar − L(G) = {ab, a2b, ab2, a2b2, ………} = {am bn | m ≥ 1 and n ≥ 1} Construction of a Grammar Generating a Language We’ll consider some languages and convert it into a grammar G which produces those languages. Example Problem − Suppose, L (G) = {am bn | m ≥ 0 and n > 0}. We have to find out the grammar G which produces L(G). Solution Since L(G) = {am bn | m ≥ 0 and n > 0} the set of strings accepted can be rewritten as − L(G) = {b, ab,bb, aab, abb, …….} Here, the start symbol has to take at least one ‘b’ preceded by any number of ‘a’ including null. To accept the string set {b, ab, bb, aab, abb, …….}, we have taken the productions − S → aS , S → B, B → b and B → bB S → B → b (Accepted) S → B → bB → bb (Accepted) S → aS → aB → ab (Accepted) S → aS → aaS → aaB → aab(Accepted) S → aS → aB → abB → abb (Accepted) Thus, we can prove every single string in L(G) is accepted by the language generated by the production set. Hence the grammar − G: ({S, A, B}, {a, b}, S, { S → aS | B , B → b | bB }) Example Problem − Suppose, L (G) = {am bn | m > 0 and n ≥ 0}. We have to find out the grammar G which produces L(G). Solution − Since L(G) = {am bn | m > 0 and n ≥ 0}, the set of strings accepted can be rewritten as − L(G) = {a, aa, ab, aaa, aab ,abb, …….} Here, the start symbol has to take at least one ‘a’ followed by any number of ‘b’ including null. To accept the string set {a, aa, ab, aaa, aab, abb, …….}, we have taken the productions − S → aA, A → aA , A → B, B → bB ,B → λ S → aA → aB → aλ → a (Accepted) S → aA → aaA → aaB → aaλ → aa (Accepted) S → aA → aB → abB → abλ → ab (Accepted) S → aA → aaA → aaaA → aaaB → aaaλ → aaa (Accepted) S → aA → aaA → aaB → aabB → aabλ → aab (Accepted) S → aA → aB → abB → abbB → abbλ → abb (Accepted) Thus, we can prove every single string in L(G) is accepted by the language generated by the production set. Hence the grammar − G: ({S, A, B}, {a, b}, S, {S → aA, A → aA | B, B → λ | bB }) Learning working make money
Multi-track Turing Machine Multi-track Turing machines, a specific type of Multi-tape Turing machine, contain multiple tracks but just one tape head reads and writes on all tracks. Here, a single tape head reads n symbols from n tracks at one step. It accepts recursively enumerable languages like a normal single-track single-tape Turing Machine accepts. A Multi-track Turing machine can be formally described as a 6-tuple (Q, X, ∑, δ, q0, F) where − Q is a finite set of states X is the tape alphabet ∑ is the input alphabet δ is a relation on states and symbols where δ(Qi, [a1, a2, a3,….]) = (Qj, [b1, b2, b3,….], Left_shift or Right_shift) q0 is the initial state F is the set of final states Note − For every single-track Turing Machine S, there is an equivalent multi-track Turing Machine M such that L(S) = L(M). Learning working make money
DFA Minimization DFA Minimization using Myphill-Nerode Theorem Algorithm Input − DFA Output − Minimized DFA Step 1 − Draw a table for all pairs of states (Qi, Qj) not necessarily connected directly [All are unmarked initially] Step 2 − Consider every state pair (Qi, Qj) in the DFA where Qi ∈ F and Qj ∉ F or vice versa and mark them. [Here F is the set of final states] Step 3 − Repeat this step until we cannot mark anymore states − If there is an unmarked pair (Qi, Qj), mark it if the pair {δ (Qi, A), δ (Qi, A)} is marked for some input alphabet. Step 4 − Combine all the unmarked pair (Qi, Qj) and make them a single state in the reduced DFA. Example Let us use Algorithm 2 to minimize the DFA shown below. Step 1 − We draw a table for all pair of states. a b c d e f a b c d e f Step 2 − We mark the state pairs. a b c d e f a b c ✔ ✔ d ✔ ✔ e ✔ ✔ f ✔ ✔ ✔ Step 3 − We will try to mark the state pairs, with green colored check mark, transitively. If we input 1 to state ‘a’ and ‘f’, it will go to state ‘c’ and ‘f’ respectively. (c, f) is already marked, hence we will mark pair (a, f). Now, we input 1 to state ‘b’ and ‘f’; it will go to state ‘d’ and ‘f’ respectively. (d, f) is already marked, hence we will mark pair (b, f). a b c d e f a b c ✔ ✔ d ✔ ✔ e ✔ ✔ f ✔ ✔ ✔ ✔ ✔ After step 3, we have got state combinations {a, b} {c, d} {c, e} {d, e} that are unmarked. We can recombine {c, d} {c, e} {d, e} into {c, d, e} Hence we got two combined states as − {a, b} and {c, d, e} So the final minimized DFA will contain three states {f}, {a, b} and {c, d, e} DFA Minimization using Equivalence Theorem If X and Y are two states in a DFA, we can combine these two states into {X, Y} if they are not distinguishable. Two states are distinguishable, if there is at least one string S, such that one of δ (X, S) and δ (Y, S) is accepting and another is not accepting. Hence, a DFA is minimal if and only if all the states are distinguishable. Algorithm 3 Step 1 − All the states Q are divided in two partitions − final states and non-final states and are denoted by P0. All the states in a partition are 0th equivalent. Take a counter k and initialize it with 0. Step 2 − Increment k by 1. For each partition in Pk, divide the states in Pk into two partitions if they are k-distinguishable. Two states within this partition X and Y are k-distinguishable if there is an input S such that δ(X, S) and δ(Y, S) are (k-1)-distinguishable. Step 3 − If Pk ≠ Pk-1, repeat Step 2, otherwise go to Step 4. Step 4 − Combine kth equivalent sets and make them the new states of the reduced DFA. Example Let us consider the following DFA − q δ(q,0) δ(q,1) a b c b a d c e f d e f e e f f f f Let us apply the above algorithm to the above DFA − P0 = {(c,d,e), (a,b,f)} P1 = {(c,d,e), (a,b),(f)} P2 = {(c,d,e), (a,b),(f)} Hence, P1 = P2. There are three states in the reduced DFA. The reduced DFA is as follows − Q δ(q,0) δ(q,1) (a, b) (a, b) (c,d,e) (c,d,e) (c,d,e) (f) (f) (f) (f) Learning working make money
Accepted Language & Decided Language A TM accepts a language if it enters into a final state for any input string w. A language is recursively enumerable (generated by Type-0 grammar) if it is accepted by a Turing machine. A TM decides a language if it accepts it and enters into a rejecting state for any input not in the language. A language is recursive if it is decided by a Turing machine. There may be some cases where a TM does not stop. Such TM accepts the language, but it does not decide it. Designing a Turing Machine The basic guidelines of designing a Turing machine have been explained below with the help of a couple of examples. Example 1 Design a TM to recognize all strings consisting of an odd number of α’s. Solution The Turing machine M can be constructed by the following moves − Let q1 be the initial state. If M is in q1; on scanning α, it enters the state q2 and writes B (blank). If M is in q2; on scanning α, it enters the state q1 and writes B (blank). From the above moves, we can see that M enters the state q1 if it scans an even number of α’s, and it enters the state q2 if it scans an odd number of α’s. Hence q2 is the only accepting state. Hence, M = {{q1, q2}, {1}, {1, B}, δ, q1, B, {q2}} where δ is given by − Tape alphabet symbol Present State ‘q1’ Present State ‘q2’ α BRq2 BRq1 Example 2 Design a Turing Machine that reads a string representing a binary number and erases all leading 0’s in the string. However, if the string comprises of only 0’s, it keeps one 0. Solution Let us assume that the input string is terminated by a blank symbol, B, at each end of the string. The Turing Machine, M, can be constructed by the following moves − Let q0 be the initial state. If M is in q0, on reading 0, it moves right, enters the state q1 and erases 0. On reading 1, it enters the state q2 and moves right. If M is in q1, on reading 0, it moves right and erases 0, i.e., it replaces 0’s by B’s. On reaching the leftmost 1, it enters q2 and moves right. If it reaches B, i.e., the string comprises of only 0’s, it moves left and enters the state q3. If M is in q2, on reading either 0 or 1, it moves right. On reaching B, it moves left and enters the state q4. This validates that the string comprises only of 0’s and 1’s. If M is in q3, it replaces B by 0, moves left and reaches the final state qf. If M is in q4, on reading either 0 or 1, it moves left. On reaching the beginning of the string, i.e., when it reads B, it reaches the final state qf. Hence, M = {{q0, q1, q2, q3, q4, qf}, {0,1, B}, {1, B}, δ, q0, B, {qf}} where δ is given by − Tape alphabet symbol Present State ‘q0’ Present State ‘q1’ Present State ‘q2’ Present State ‘q3’ Present State ‘q4’ 0 BRq1 BRq1 ORq2 – OLq4 1 1Rq2 1Rq2 1Rq2 – 1Lq4 B BRq1 BLq3 BLq4 OLqf BRqf Learning working make money
Automata Theory Introduction Automata – What is it? The term “Automata” is derived from the Greek word “αὐτόματα” which means “self-acting”. An automaton (Automata in plural) is an abstract self-propelled computing device which follows a predetermined sequence of operations automatically. An automaton with a finite number of states is called a Finite Automaton (FA) or Finite State Machine (FSM). Formal definition of a Finite Automaton An automaton can be represented by a 5-tuple (Q, ∑, δ, q0, F), where − Q is a finite set of states. ∑ is a finite set of symbols, called the alphabet of the automaton. δ is the transition function. q0 is the initial state from where any input is processed (q0 ∈ Q). F is a set of final state/states of Q (F ⊆ Q). Related Terminologies Alphabet Definition − An alphabet is any finite set of symbols. Example − ∑ = {a, b, c, d} is an alphabet set where ‘a’, ‘b’, ‘c’, and ‘d’ are symbols. String Definition − A string is a finite sequence of symbols taken from ∑. Example − ‘cabcad’ is a valid string on the alphabet set ∑ = {a, b, c, d} Length of a String Definition − It is the number of symbols present in a string. (Denoted by |S|). Examples − If S = ‘cabcad’, |S|= 6 If |S|= 0, it is called an empty string (Denoted by λ or ε) Kleene Star Definition − The Kleene star, ∑*, is a unary operator on a set of symbols or strings, ∑, that gives the infinite set of all possible strings of all possible lengths over ∑ including λ. Representation − ∑* = ∑0 ∪ ∑1 ∪ ∑2 ∪……. where ∑p is the set of all possible strings of length p. Example − If ∑ = {a, b}, ∑* = {λ, a, b, aa, ab, ba, bb,………..} Kleene Closure / Plus Definition − The set ∑+ is the infinite set of all possible strings of all possible lengths over ∑ excluding λ. Representation − ∑+ = ∑1 ∪ ∑2 ∪ ∑3 ∪……. ∑+ = ∑* − { λ } Example − If ∑ = { a, b } , ∑+ = { a, b, aa, ab, ba, bb,………..} Language Definition − A language is a subset of ∑* for some alphabet ∑. It can be finite or infinite. Example − If the language takes all possible strings of length 2 over ∑ = {a, b}, then L = { ab, aa, ba, bb } Learning working make money
Turing Machine Halting Problem Input − A Turing machine and an input string w. Problem − Does the Turing machine finish computing of the string w in a finite number of steps? The answer must be either yes or no. Proof − At first, we will assume that such a Turing machine exists to solve this problem and then we will show it is contradicting itself. We will call this Turing machine as a Halting machine that produces a ‘yes’ or ‘no’ in a finite amount of time. If the halting machine finishes in a finite amount of time, the output comes as ‘yes’, otherwise as ‘no’. The following is the block diagram of a Halting machine − Now we will design an inverted halting machine (HM)’ as − If H returns YES, then loop forever. If H returns NO, then halt. The following is the block diagram of an ‘Inverted halting machine’ − Further, a machine (HM)2 which input itself is constructed as follows − If (HM)2 halts on input, loop forever. Else, halt. Here, we have got a contradiction. Hence, the halting problem is undecidable. Learning working make money
Pushdown Automata & Parsing Parsing is used to derive a string using the production rules of a grammar. It is used to check the acceptability of a string. Compiler is used to check whether or not a string is syntactically correct. A parser takes the inputs and builds a parse tree. A parser can be of two types − Top-Down Parser − Top-down parsing starts from the top with the start-symbol and derives a string using a parse tree. Bottom-Up Parser − Bottom-up parsing starts from the bottom with the string and comes to the start symbol using a parse tree. Design of Top-Down Parser For top-down parsing, a PDA has the following four types of transitions − Pop the non-terminal on the left hand side of the production at the top of the stack and push its right-hand side string. If the top symbol of the stack matches with the input symbol being read, pop it. Push the start symbol ‘S’ into the stack. If the input string is fully read and the stack is empty, go to the final state ‘F’. Example Design a top-down parser for the expression “x+y*z” for the grammar G with the following production rules − P: S → S+X | X, X → X*Y | Y, Y → (S) | id Solution If the PDA is (Q, ∑, S, δ, q0, I, F), then the top-down parsing is − (x+y*z, I) ⊢(x +y*z, SI) ⊢ (x+y*z, S+XI) ⊢(x+y*z, X+XI) ⊢(x+y*z, Y+X I) ⊢(x+y*z, x+XI) ⊢(+y*z, +XI) ⊢ (y*z, XI) ⊢(y*z, X*YI) ⊢(y*z, y*YI) ⊢(*z,*YI) ⊢(z, YI) ⊢(z, zI) ⊢(ε, I) Design of a Bottom-Up Parser For bottom-up parsing, a PDA has the following four types of transitions − Push the current input symbol into the stack. Replace the right-hand side of a production at the top of the stack with its left-hand side. If the top of the stack element matches with the current input symbol, pop it. If the input string is fully read and only if the start symbol ‘S’ remains in the stack, pop it and go to the final state ‘F’. Example Design a top-down parser for the expression “x+y*z” for the grammar G with the following production rules − P: S → S+X | X, X → X*Y | Y, Y → (S) | id Solution If the PDA is (Q, ∑, S, δ, q0, I, F), then the bottom-up parsing is − (x+y*z, I) ⊢ (+y*z, xI) ⊢ (+y*z, YI) ⊢ (+y*z, XI) ⊢ (+y*z, SI) ⊢(y*z, +SI) ⊢ (*z, y+SI) ⊢ (*z, Y+SI) ⊢ (*z, X+SI) ⊢ (z, *X+SI) ⊢ (ε, z*X+SI) ⊢ (ε, Y*X+SI) ⊢ (ε, X+SI) ⊢ (ε, SI) Learning working make money