Language Generated by a Grammar The set of all strings that can be derived from a grammar is said to be the language generated from that grammar. A language generated by a grammar G is a subset formally defined by L(G)={W|W ∈ ∑*, S ⇒G W} If L(G1) = L(G2), the Grammar G1 is equivalent to the Grammar G2. Example If there is a grammar G: N = {S, A, B} T = {a, b} P = {S → AB, A → a, B → b} Here S produces AB, and we can replace A by a, and B by b. Here, the only accepted string is ab, i.e., L(G) = {ab} Example Suppose we have the following grammar − G: N = {S, A, B} T = {a, b} P = {S → AB, A → aA|a, B → bB|b} The language generated by this grammar − L(G) = {ab, a2b, ab2, a2b2, ………} = {am bn | m ≥ 1 and n ≥ 1} Construction of a Grammar Generating a Language We’ll consider some languages and convert it into a grammar G which produces those languages. Example Problem − Suppose, L (G) = {am bn | m ≥ 0 and n > 0}. We have to find out the grammar G which produces L(G). Solution Since L(G) = {am bn | m ≥ 0 and n > 0} the set of strings accepted can be rewritten as − L(G) = {b, ab,bb, aab, abb, …….} Here, the start symbol has to take at least one ‘b’ preceded by any number of ‘a’ including null. To accept the string set {b, ab, bb, aab, abb, …….}, we have taken the productions − S → aS , S → B, B → b and B → bB S → B → b (Accepted) S → B → bB → bb (Accepted) S → aS → aB → ab (Accepted) S → aS → aaS → aaB → aab(Accepted) S → aS → aB → abB → abb (Accepted) Thus, we can prove every single string in L(G) is accepted by the language generated by the production set. Hence the grammar − G: ({S, A, B}, {a, b}, S, { S → aS | B , B → b | bB }) Example Problem − Suppose, L (G) = {am bn | m > 0 and n ≥ 0}. We have to find out the grammar G which produces L(G). Solution − Since L(G) = {am bn | m > 0 and n ≥ 0}, the set of strings accepted can be rewritten as − L(G) = {a, aa, ab, aaa, aab ,abb, …….} Here, the start symbol has to take at least one ‘a’ followed by any number of ‘b’ including null. To accept the string set {a, aa, ab, aaa, aab, abb, …….}, we have taken the productions − S → aA, A → aA , A → B, B → bB ,B → λ S → aA → aB → aλ → a (Accepted) S → aA → aaA → aaB → aaλ → aa (Accepted) S → aA → aB → abB → abλ → ab (Accepted) S → aA → aaA → aaaA → aaaB → aaaλ → aaa (Accepted) S → aA → aaA → aaB → aabB → aabλ → aab (Accepted) S → aA → aB → abB → abbB → abbλ → abb (Accepted) Thus, we can prove every single string in L(G) is accepted by the language generated by the production set. Hence the grammar − G: ({S, A, B}, {a, b}, S, {S → aA, A → aA | B, B → λ | bB }) Learning working make money
Category: automata Theory
Context-Free Grammar Introduction Definition − A context-free grammar (CFG) consisting of a finite set of grammar rules is a quadruple (N, T, P, S) where N is a set of non-terminal symbols. T is a set of terminals where N ∩ T = NULL. P is a set of rules, P: N → (N ∪ T)*, i.e., the left-hand side of the production rule P does have any right context or left context. S is the start symbol. Example The grammar ({A}, {a, b, c}, P, A), P : A → aA, A → abc. The grammar ({S, a, b}, {a, b}, P, S), P: S → aSa, S → bSb, S → ε The grammar ({S, F}, {0, 1}, P, S), P: S → 00S | 11F, F → 00F | ε Generation of Derivation Tree A derivation tree or parse tree is an ordered rooted tree that graphically represents the semantic information a string derived from a context-free grammar. Representation Technique Root vertex − Must be labeled by the start symbol. Vertex − Labeled by a non-terminal symbol. Leaves − Labeled by a terminal symbol or ε. If S → x1x2 …… xn is a production rule in a CFG, then the parse tree / derivation tree will be as follows − There are two different approaches to draw a derivation tree − Top-down Approach − Starts with the starting symbol S Goes down to tree leaves using productions Bottom-up Approach − Starts from tree leaves Proceeds upward to the root which is the starting symbol S Derivation or Yield of a Tree The derivation or the yield of a parse tree is the final string obtained by concatenating the labels of the leaves of the tree from left to right, ignoring the Nulls. However, if all the leaves are Null, derivation is Null. Example Let a CFG {N,T,P,S} be N = {S}, T = {a, b}, Starting symbol = S, P = S → SS | aSb | ε One derivation from the above CFG is “abaabb” S → SS → aSbS → abS → abaSb → abaaSbb → abaabb Sentential Form and Partial Derivation Tree A partial derivation tree is a sub-tree of a derivation tree/parse tree such that either all of its children are in the sub-tree or none of them are in the sub-tree. Example If in any CFG the productions are − S → AB, A → aaA | ε, B → Bb| ε the partial derivation tree can be the following − If a partial derivation tree contains the root S, it is called a sentential form. The above sub-tree is also in sentential form. Leftmost and Rightmost Derivation of a String Leftmost derivation − A leftmost derivation is obtained by applying production to the leftmost variable in each step. Rightmost derivation − A rightmost derivation is obtained by applying production to the rightmost variable in each step. Example Let any set of production rules in a CFG be X → X+X | X*X |X| a over an alphabet {a}. The leftmost derivation for the string “a+a*a” may be − X → X+X → a+X → a + X*X → a+a*X → a+a*a The stepwise derivation of the above string is shown as below − The rightmost derivation for the above string “a+a*a” may be − X → X*X → X*a → X+X*a → X+a*a → a+a*a The stepwise derivation of the above string is shown as below − Left and Right Recursive Grammars In a context-free grammar G, if there is a production in the form X → Xa where X is a non-terminal and ‘a’ is a string of terminals, it is called a left recursive production. The grammar having a left recursive production is called a left recursive grammar. And if in a context-free grammar G, if there is a production is in the form X → aX where X is a non-terminal and ‘a’ is a string of terminals, it is called a right recursive production. The grammar having a right recursive production is called a right recursive grammar. Learning working make money
Arden”s Theorem In order to find out a regular expression of a Finite Automaton, we use Arden’s Theorem along with the properties of regular expressions. Statement − Let P and Q be two regular expressions. If P does not contain null string, then R = Q + RP has a unique solution that is R = QP* Proof − R = Q + (Q + RP)P [After putting the value R = Q + RP] = Q + QP + RPP When we put the value of R recursively again and again, we get the following equation − R = Q + QP + QP2 + QP3….. R = Q (ε + P + P2 + P3 + …. ) R = QP* [As P* represents (ε + P + P2 + P3 + ….) ] Hence, proved. Assumptions for Applying Arden’s Theorem The transition diagram must not have NULL transitions It must have only one initial state Method Step 1 − Create equations as the following form for all the states of the DFA having n states with initial state q1. q1 = q1R11 + q2R21 + … + qnRn1 + ε q2 = q1R12 + q2R22 + … + qnRn2 ………………………… ………………………… ………………………… ………………………… qn = q1R1n + q2R2n + … + qnRnn Rij represents the set of labels of edges from qi to qj, if no such edge exists, then Rij = ∅ Step 2 − Solve these equations to get the equation for the final state in terms of Rij Problem Construct a regular expression corresponding to the automata given below − Solution − Here the initial state and final state is q1. The equations for the three states q1, q2, and q3 are as follows − q1 = q1a + q3a + ε (ε move is because q1 is the initial state0 q2 = q1b + q2b + q3b q3 = q2a Now, we will solve these three equations − q2 = q1b + q2b + q3b = q1b + q2b + (q2a)b (Substituting value of q3) = q1b + q2(b + ab) = q1b (b + ab)* (Applying Arden’s Theorem) q1 = q1a + q3a + ε = q1a + q2aa + ε (Substituting value of q3) = q1a + q1b(b + ab*)aa + ε (Substituting value of q2) = q1(a + b(b + ab)*aa) + ε = ε (a+ b(b + ab)*aa)* = (a + b(b + ab)*aa)* Hence, the regular expression is (a + b(b + ab)*aa)*. Problem Construct a regular expression corresponding to the automata given below − Solution − Here the initial state is q1 and the final state is q2 Now we write down the equations − q1 = q10 + ε q2 = q11 + q20 q3 = q21 + q30 + q31 Now, we will solve these three equations − q1 = ε0* [As, εR = R] So, q1 = 0* q2 = 0*1 + q20 So, q2 = 0*1(0)* [By Arden’s theorem] Hence, the regular expression is 0*10*. Learning working make money
Deterministic Finite Automaton Finite Automaton can be classified into two types − Deterministic Finite Automaton (DFA) Non-deterministic Finite Automaton (NDFA / NFA) Deterministic Finite Automaton (DFA) In DFA, for each input symbol, one can determine the state to which the machine will move. Hence, it is called Deterministic Automaton. As it has a finite number of states, the machine is called Deterministic Finite Machine or Deterministic Finite Automaton. Formal Definition of a DFA A DFA can be represented by a 5-tuple (Q, ∑, δ, q0, F) where − Q is a finite set of states. ∑ is a finite set of symbols called the alphabet. δ is the transition function where δ: Q × ∑ → Q q0 is the initial state from where any input is processed (q0 ∈ Q). F is a set of final state/states of Q (F ⊆ Q). Graphical Representation of a DFA A DFA is represented by digraphs called state diagram. The vertices represent the states. The arcs labeled with an input alphabet show the transitions. The initial state is denoted by an empty single incoming arc. The final state is indicated by double circles. Example Let a deterministic finite automaton be → Q = {a, b, c}, ∑ = {0, 1}, q0 = {a}, F = {c}, and Transition function δ as shown by the following table − Present State Next State for Input 0 Next State for Input 1 a a b b c a c b c Its graphical representation would be as follows − Learning working make money
Automata Theory – Useful Resources The following resources contain additional information on Automata Theory. Please use them to get more in-depth knowledge on this. Useful Video Courses 101 Lectures 9.5 hours 169 Lectures 13.5 hours 99 Lectures 10 hours 9 Lectures 1 hours 33 Lectures 6.5 hours 57 Lectures 9.5 hours Learning working make money
Multi-track Turing Machine Multi-track Turing machines, a specific type of Multi-tape Turing machine, contain multiple tracks but just one tape head reads and writes on all tracks. Here, a single tape head reads n symbols from n tracks at one step. It accepts recursively enumerable languages like a normal single-track single-tape Turing Machine accepts. A Multi-track Turing machine can be formally described as a 6-tuple (Q, X, ∑, δ, q0, F) where − Q is a finite set of states X is the tape alphabet ∑ is the input alphabet δ is a relation on states and symbols where δ(Qi, [a1, a2, a3,….]) = (Qj, [b1, b2, b3,….], Left_shift or Right_shift) q0 is the initial state F is the set of final states Note − For every single-track Turing Machine S, there is an equivalent multi-track Turing Machine M such that L(S) = L(M). Learning working make money
DFA Minimization DFA Minimization using Myphill-Nerode Theorem Algorithm Input − DFA Output − Minimized DFA Step 1 − Draw a table for all pairs of states (Qi, Qj) not necessarily connected directly [All are unmarked initially] Step 2 − Consider every state pair (Qi, Qj) in the DFA where Qi ∈ F and Qj ∉ F or vice versa and mark them. [Here F is the set of final states] Step 3 − Repeat this step until we cannot mark anymore states − If there is an unmarked pair (Qi, Qj), mark it if the pair {δ (Qi, A), δ (Qi, A)} is marked for some input alphabet. Step 4 − Combine all the unmarked pair (Qi, Qj) and make them a single state in the reduced DFA. Example Let us use Algorithm 2 to minimize the DFA shown below. Step 1 − We draw a table for all pair of states. a b c d e f a b c d e f Step 2 − We mark the state pairs. a b c d e f a b c ✔ ✔ d ✔ ✔ e ✔ ✔ f ✔ ✔ ✔ Step 3 − We will try to mark the state pairs, with green colored check mark, transitively. If we input 1 to state ‘a’ and ‘f’, it will go to state ‘c’ and ‘f’ respectively. (c, f) is already marked, hence we will mark pair (a, f). Now, we input 1 to state ‘b’ and ‘f’; it will go to state ‘d’ and ‘f’ respectively. (d, f) is already marked, hence we will mark pair (b, f). a b c d e f a b c ✔ ✔ d ✔ ✔ e ✔ ✔ f ✔ ✔ ✔ ✔ ✔ After step 3, we have got state combinations {a, b} {c, d} {c, e} {d, e} that are unmarked. We can recombine {c, d} {c, e} {d, e} into {c, d, e} Hence we got two combined states as − {a, b} and {c, d, e} So the final minimized DFA will contain three states {f}, {a, b} and {c, d, e} DFA Minimization using Equivalence Theorem If X and Y are two states in a DFA, we can combine these two states into {X, Y} if they are not distinguishable. Two states are distinguishable, if there is at least one string S, such that one of δ (X, S) and δ (Y, S) is accepting and another is not accepting. Hence, a DFA is minimal if and only if all the states are distinguishable. Algorithm 3 Step 1 − All the states Q are divided in two partitions − final states and non-final states and are denoted by P0. All the states in a partition are 0th equivalent. Take a counter k and initialize it with 0. Step 2 − Increment k by 1. For each partition in Pk, divide the states in Pk into two partitions if they are k-distinguishable. Two states within this partition X and Y are k-distinguishable if there is an input S such that δ(X, S) and δ(Y, S) are (k-1)-distinguishable. Step 3 − If Pk ≠ Pk-1, repeat Step 2, otherwise go to Step 4. Step 4 − Combine kth equivalent sets and make them the new states of the reduced DFA. Example Let us consider the following DFA − q δ(q,0) δ(q,1) a b c b a d c e f d e f e e f f f f Let us apply the above algorithm to the above DFA − P0 = {(c,d,e), (a,b,f)} P1 = {(c,d,e), (a,b),(f)} P2 = {(c,d,e), (a,b),(f)} Hence, P1 = P2. There are three states in the reduced DFA. The reduced DFA is as follows − Q δ(q,0) δ(q,1) (a, b) (a, b) (c,d,e) (c,d,e) (c,d,e) (f) (f) (f) (f) Learning working make money
Accepted Language & Decided Language A TM accepts a language if it enters into a final state for any input string w. A language is recursively enumerable (generated by Type-0 grammar) if it is accepted by a Turing machine. A TM decides a language if it accepts it and enters into a rejecting state for any input not in the language. A language is recursive if it is decided by a Turing machine. There may be some cases where a TM does not stop. Such TM accepts the language, but it does not decide it. Designing a Turing Machine The basic guidelines of designing a Turing machine have been explained below with the help of a couple of examples. Example 1 Design a TM to recognize all strings consisting of an odd number of α’s. Solution The Turing machine M can be constructed by the following moves − Let q1 be the initial state. If M is in q1; on scanning α, it enters the state q2 and writes B (blank). If M is in q2; on scanning α, it enters the state q1 and writes B (blank). From the above moves, we can see that M enters the state q1 if it scans an even number of α’s, and it enters the state q2 if it scans an odd number of α’s. Hence q2 is the only accepting state. Hence, M = {{q1, q2}, {1}, {1, B}, δ, q1, B, {q2}} where δ is given by − Tape alphabet symbol Present State ‘q1’ Present State ‘q2’ α BRq2 BRq1 Example 2 Design a Turing Machine that reads a string representing a binary number and erases all leading 0’s in the string. However, if the string comprises of only 0’s, it keeps one 0. Solution Let us assume that the input string is terminated by a blank symbol, B, at each end of the string. The Turing Machine, M, can be constructed by the following moves − Let q0 be the initial state. If M is in q0, on reading 0, it moves right, enters the state q1 and erases 0. On reading 1, it enters the state q2 and moves right. If M is in q1, on reading 0, it moves right and erases 0, i.e., it replaces 0’s by B’s. On reaching the leftmost 1, it enters q2 and moves right. If it reaches B, i.e., the string comprises of only 0’s, it moves left and enters the state q3. If M is in q2, on reading either 0 or 1, it moves right. On reaching B, it moves left and enters the state q4. This validates that the string comprises only of 0’s and 1’s. If M is in q3, it replaces B by 0, moves left and reaches the final state qf. If M is in q4, on reading either 0 or 1, it moves left. On reaching the beginning of the string, i.e., when it reads B, it reaches the final state qf. Hence, M = {{q0, q1, q2, q3, q4, qf}, {0,1, B}, {1, B}, δ, q0, B, {qf}} where δ is given by − Tape alphabet symbol Present State ‘q0’ Present State ‘q1’ Present State ‘q2’ Present State ‘q3’ Present State ‘q4’ 0 BRq1 BRq1 ORq2 – OLq4 1 1Rq2 1Rq2 1Rq2 – 1Lq4 B BRq1 BLq3 BLq4 OLqf BRqf Learning working make money
Automata Theory Introduction Automata – What is it? The term “Automata” is derived from the Greek word “αὐτόματα” which means “self-acting”. An automaton (Automata in plural) is an abstract self-propelled computing device which follows a predetermined sequence of operations automatically. An automaton with a finite number of states is called a Finite Automaton (FA) or Finite State Machine (FSM). Formal definition of a Finite Automaton An automaton can be represented by a 5-tuple (Q, ∑, δ, q0, F), where − Q is a finite set of states. ∑ is a finite set of symbols, called the alphabet of the automaton. δ is the transition function. q0 is the initial state from where any input is processed (q0 ∈ Q). F is a set of final state/states of Q (F ⊆ Q). Related Terminologies Alphabet Definition − An alphabet is any finite set of symbols. Example − ∑ = {a, b, c, d} is an alphabet set where ‘a’, ‘b’, ‘c’, and ‘d’ are symbols. String Definition − A string is a finite sequence of symbols taken from ∑. Example − ‘cabcad’ is a valid string on the alphabet set ∑ = {a, b, c, d} Length of a String Definition − It is the number of symbols present in a string. (Denoted by |S|). Examples − If S = ‘cabcad’, |S|= 6 If |S|= 0, it is called an empty string (Denoted by λ or ε) Kleene Star Definition − The Kleene star, ∑*, is a unary operator on a set of symbols or strings, ∑, that gives the infinite set of all possible strings of all possible lengths over ∑ including λ. Representation − ∑* = ∑0 ∪ ∑1 ∪ ∑2 ∪……. where ∑p is the set of all possible strings of length p. Example − If ∑ = {a, b}, ∑* = {λ, a, b, aa, ab, ba, bb,………..} Kleene Closure / Plus Definition − The set ∑+ is the infinite set of all possible strings of all possible lengths over ∑ excluding λ. Representation − ∑+ = ∑1 ∪ ∑2 ∪ ∑3 ∪……. ∑+ = ∑* − { λ } Example − If ∑ = { a, b } , ∑+ = { a, b, aa, ab, ba, bb,………..} Language Definition − A language is a subset of ∑* for some alphabet ∑. It can be finite or infinite. Example − If the language takes all possible strings of length 2 over ∑ = {a, b}, then L = { ab, aa, ba, bb } Learning working make money
Turing Machine Halting Problem Input − A Turing machine and an input string w. Problem − Does the Turing machine finish computing of the string w in a finite number of steps? The answer must be either yes or no. Proof − At first, we will assume that such a Turing machine exists to solve this problem and then we will show it is contradicting itself. We will call this Turing machine as a Halting machine that produces a ‘yes’ or ‘no’ in a finite amount of time. If the halting machine finishes in a finite amount of time, the output comes as ‘yes’, otherwise as ‘no’. The following is the block diagram of a Halting machine − Now we will design an inverted halting machine (HM)’ as − If H returns YES, then loop forever. If H returns NO, then halt. The following is the block diagram of an ‘Inverted halting machine’ − Further, a machine (HM)2 which input itself is constructed as follows − If (HM)2 halts on input, loop forever. Else, halt. Here, we have got a contradiction. Hence, the halting problem is undecidable. Learning working make money