Theory of Automata
Course: Theory of Automata
Topic: Intro and Regular Languages
Instructor: Mr. Muhammad Arif
[Week#01,02] - Intro to TOA & Regular Expressions

Course Assessment Criteria
- Final Exam: 50%
- Midterm: 25%
- Quizzes (4 best quizzes out of 6): 10%
- Assignments: 10%
- Presentation + Report: 05%
- Total: 100%

Literature
- Lecture slides: soft copies (.pdf), hard copies
- Research papers: research papers from magazines/internet

Course contents in brief
Finite State Models:
- Language definitions preliminaries, regular expressions/regular languages, finite automata (FAs), transition graphs (TGs), NFAs, Kleene's theorem, transducers (automata with output), pumping lemma and non-regular languages
Grammars and PDA:
- Context-free grammars, derivations, derivation trees and ambiguity, simplifying CFLs, normal form grammars and parsing, push-down automata, pumping lemma and non-context-free languages, decidability, Chomsky's hierarchy of grammars
Turing Machines Theory:
- Turing machines, Post machine, variations on TM, TM encoding, universal Turing machine, context-sensitive grammars, defining computers by TMs

Purpose of Course
In this course our concern is not with actual hardware and software. We are more interested in the capability of computers: specifically, what can and what cannot be done by any existing computer or any computer ever built in the future. We will study different types of theoretical machines that are mathematical models for actual physical processes.
By considering the possible inputs on which these machines can work, we can analyze their various strengths and weaknesses. We can then develop what we may believe to be the most powerful machine possible. Surprisingly, it will not be able to perform every task.

In particular, we shall study computers by building mathematical models, called machines, and then studying their limitations by analyzing the types of inputs on which they can operate successfully. The collection of these successful inputs is called the language of the machine.

Every time we introduce a new machine, we will learn its language; and every time we develop a new language, we will try to find a machine that corresponds to it.

Recommended Books
- Introduction to Computer Theory, Daniel I. A. Cohen, John Wiley & Sons, Inc.
- Theory of Automata, C.J. Martin
- Introduction to Automata Theory, Languages & Computation, J. E. Hopcroft, J. D. Ullman
- Languages & Machines: An Introduction to the Theory of Computer Science, 2/e, Thomas A. Sudkamp, Addison Wesley

Important Issues
- Attendance policy and latecomers
- Assignments policy
- All assignments must be typed
- Title page: registration number, course title, assignment name, assignment number, submission date
- Font size of the headings should be 12, bold, and may be underlined
- Text font size should be 12
- Font style should be Times, Arial or Book Antiqua
- Page numbers
- Default page settings
- Table of contents for large assignments
  (applicable for more than 5 pages)
- Single spacing
- Justified
- No color except black and blue
- References of the source material used (no copied material will be accepted)

Chapter 1: Introduction to Theory of Automata and Regular Expressions

What Does Automata Mean?
The word automata is of Greek origin and is related to automation, which means designing machines or replacing human beings with machines. It is the plural of automaton, and it means something that works automatically.

Different Kinds of Automata
Automata are distinguished by their temporary memory:
- Finite Automata: no temporary memory
- Pushdown Automata: stack
- Turing Machines: random access memory

[Figures: Finite Automaton; Pushdown Automaton; Turing Machine; Power of Automata]

Languages
Letters, Words, Sentences
- Letters join to form words
- Words combine to form sentences
- Sentences combine to form paragraphs, and so on
- However, not every collection of letters forms a valid word, and not every collection of words forms a valid sentence.

Languages
How can you tell whether a given sentence belongs to a particular language?
- Black is cat the
- The tea is hot
- I like chocolates two much
Rules give a clue to forming as well as validating sentences.
There are two types of languages:
- Formal languages (syntactic languages)
- Informal languages (semantic languages)

Formal vs. Informal Rules
Informal language -> abstract languages
- Incoherent strings are understandable
- Slang, idiom, dialect, etc.
But they raise ambiguity:
- Interpretation varies with region, e.g. "I am through" (BrE/AmE)
- The same words have multiple meanings, e.g. like, light, base, etc.

Informal languages
- Natural languages are generally defined informally
- The human brain is capable of understanding incoherent, even invalid, sentences:
  - You mangoes like
  - We school daily go to
- It can rectify grammatical errors, etc.
- It can resolve ambiguity
- It interprets according to context
- It uses supporting aids such as facial expressions and body language, etc.

How to Communicate with Machines?
- We need a language: of what sort?
- Machines don't have a human mind, though they may have a partial imitation of it
- They would fail on incorrect or ambiguous input
- Some recovery or input correction may be proposed, but again it is very limited.
- Thus we need a precise, explicit and universal definition of the communication language

Summary of Languages
Three aspects/specifications:
- Lexical: defines the valid words/units of a language
- Syntactic: defines rules for combining the units to form valid sentences (computer programs in the context of machines)
- Semantic: concerned with the interpretation or meaning of a sentence (what output to produce in the context of machines); affected by ambiguity the most

Formal Languages
- The word "formal" refers to the fact that all the rules for the language are explicitly stated in terms of what strings of symbols can occur
- No ambiguities
- Universally uniform understanding
- Lets the machine:
  - interpret an input uniformly every time, i.e. always produce the same output for a particular input
  - avoid crashes because of ambiguity
  - explicitly reject invalid input
- Needs a precise, uniformly understandable notation

Representations
Alphabet
- Represents a finite set of fundamental units of a language, e.g. for English Σ = {a, b, …, z, A, …, Z, …}
- Denoted by Σ
- Σ = {0, 1}
- Σ = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
- A certain specified set of strings of characters from the alphabet is called the language (set of words)

List of words
- The set of all valid words of a given language, e.g. a language English_Words that contains all valid words of English would be {all entries of the dictionary + punctuation marks and blank space}
- It is a finite or an infinite set.

Strings
- A concatenation of finitely many symbols from the alphabet is called a string; that is, a string is a finite sequence of symbols chosen from the alphabet.
- Example: if Σ = {a, b} then a, abab, aaab, ababababa are strings.

Empty String or Null String
- The empty string is a string that does not contain any letter. It is not the same as the empty set.
- It is denoted by the capital Greek letter lambda, Λ.

Words
- In spoken languages not all strings are words. Example: in English the string abcd does not form any word.
- Words are strings belonging to some language. Example: if Σ = {x} then a language L can be defined as L = {x^n : n = 1, 2, 3, …} or L = {x, xx, xxx, xxxx, …}. Here x, xx, xxx, … are the words of L.
- Note: not all strings are words, but all words are strings.

Valid/Invalid Alphabets
- An alphabet may contain letters that consist of a group of symbols. For example, consider the two alphabets Σ1 = {B, aB, bab, d} and Σ2 = {B, Ba, bab, d}, and the string BababB.
- Over Σ2 this string may be tokenized in two different ways:
  (Ba), (bab), (B)
  (B), (abab), (B)
- In the second grouping, (abab) cannot be identified as a letter of Σ2, so a scanner that reads B as the first letter gets stuck. The tokenization is therefore ambiguous.

Note
- While defining an alphabet whose letters consist of more than one symbol, no letter should start with another letter of the same alphabet, i.e. one letter should not be the prefix of another.
- However, a letter may end with another letter of the same alphabet, i.e. one letter may be the suffix of another.
- Therefore Σ1 is a valid alphabet and Σ2 is an invalid alphabet.
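
The prefix rule can be checked mechanically. The following is a minimal Python sketch, not part of the original slides; the function names is_valid_alphabet and tokenize are hypothetical, and the alphabets are the two from the slide, written as lists of strings.

```python
def is_valid_alphabet(letters):
    """Return True if no letter is a prefix of another letter."""
    for u in letters:
        for v in letters:
            if u != v and v.startswith(u):
                return False
    return True


def tokenize(s, letters):
    """Split s into letters of a valid (prefix-free) alphabet, left to right.

    Returns None if s is not a string over the alphabet. With a prefix-free
    alphabet at most one letter can match at each position, so the result
    is unique.
    """
    tokens = []
    i = 0
    while i < len(s):
        match = next((x for x in letters if s.startswith(x, i)), None)
        if match is None:
            return None
        tokens.append(match)
        i += len(match)
    return tokens


sigma1 = ["B", "aB", "bab", "d"]
sigma2 = ["B", "Ba", "bab", "d"]
print(is_valid_alphabet(sigma1))     # True
print(is_valid_alphabet(sigma2))     # False: B is a prefix of Ba
print(tokenize("BaBbabBd", sigma1))  # ['B', 'aB', 'bab', 'B', 'd']
```

The check is case-sensitive, so B and b are treated as different symbols, as in the slide.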

String Variable
- A letter used for denoting a string. The author uses w, x, y and z as string variables.
- For example w = 0111100, x = 123045, z = abbbcdeg

String Length
- The number of positions for symbols in the string. For simplicity we can say that it is the number of symbols in the string.
- For example |w| = 7, |x| = ?, |z| = ?

Reverse of a string
- The reverse of a string s, denoted by Rev(s), is obtained by writing the letters of s in reverse order.
- Example 1: if s = abc is a string defined over Σ = {a, b, c} then Rev(s) = cba
- Example 2: if s = BaBbabBd is a string defined over Σ = {B, aB, bab, d} then Rev(s) = dBbabaBB

Defining Languages
A language can be defined in different ways, such as
- Descriptive definition
- Recursive definition
- Using regular expressions (RE)
- Using finite automaton (FA), etc.

Defining Languages
- Define the alphabet set Σ
- Define rules for forming valid words and sequences of words from Σ; this is called a grammar
- The rules can be descriptive (with the limitations of informalism)
- The rules can be mathematical
- Supporting functions can also be defined, e.g. length(x), reverse(x)

Defining languages: Example
- Σ = {a, b, …, z}
- L = {all words formed only of an odd number of x's}
- L = {x^n | n is odd}
- L = {all words of length less than or equal to 4}
- PALINDROME = {Λ, all strings x such that reverse(x) = x}

Finite vs. Infinite Languages
Finite languages
- Finite set of words
- Can be defined by rigorously listing the words in the language
- E.g. English_Words
Infinite languages
- Infinite set of valid words
- Can't be listed completely
- E.g. English_Sentences

Infinite Languages
- Most languages are infinite
- How can you check whether a word belongs to a language?
  - If the language is finite: check for its entry in the list of words
  - If the language is infinite: validate it against the rules

bilawalsheikh333.blogspot.com Theory of Automata (BSCS)-4A Fall 2012, BU Islamabad

Defining Languages
Language-defining rules can be of two kinds:
1. They can tell us how to test a string of alphabet letters that we might be presented with, to see if it is a valid word, or
2. They can tell us how to construct all the words in the language by some clear procedures (discussed later)

Defining Languages: Example
- Let us discuss a simple example of a language. We start with an alphabet having only one letter, the letter x: Σ = {x}
- We can define a language by saying that any nonempty string of alphabet characters is a word:
  L = {x, xx, xxx, xxxx, …}
  L = {x^n for n = 1, 2, 3, …}
- Because of the way we have defined it, this language does not include the null string (Λ)
- We can define the operation of concatenation: x^n concatenated with x^m is the new word x^(n+m)
- We can define a language that also contains the null string:
  L = {Λ, x, xx, xxx, xxxx, …} = {x^n for n = 0, 1, 2, 3, …}
- Here x^0 = Λ, and not x^0 = 1

Defining Languages: Descriptive definition
The language is defined by describing the conditions imposed on its words.
- Example 1: the language L of strings of odd length, defined over Σ = {a}, can be written as L = {a, aaa, aaaaa, …}
- Example 2: the language L of strings that do not start with a, defined over Σ = {a, b, c}, can be written as L = {b, c, ba, bb, bc, ca, cb, cc, …}
- Example 3: the language L of strings of length 2, defined over Σ = {0, 1, 2}, can be written as L = {00, 01, 02, 10, 11, 12, 20, 21, 22}
- Example 4: the language L of strings ending in 0, defined over Σ = {0, 1}, can be written as L = {0, 00, 10, 000, 010, 100, 110, …}
- Example 5: the language EQUAL, of strings with the number of a's equal to the number of b's, defined over Σ = {a, b}, can be written as L = {Λ, ab, ba, aabb, abab, baba, abba, …}
- Example 6: the language EVEN-EVEN, of strings with an even number of a's and an even number of b's, defined over Σ = {a, b}, can be written as L = {Λ, aa, bb, aaaa, aabb, abab, abba, baab, baba, bbaa, bbbb, …}
- Example 7: the language INTEGER, of strings defined over Σ = {-, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9}, can be written as INTEGER = {…, -2, -1, 0, 1, 2, …}
- Example 8: the language EVEN, of strings defined over Σ = {-, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9}, can be written as EVEN = {…, -4, -2, 0, 2, 4, …}
- Example 9: the language {a^n b^n}, of strings defined over Σ = {a, b}, as {a^n b^n : n = 1, 2, 3, …}, can be written as {ab, aabb, aaabbb, …}
- Example 10: the language {a^n b^n a^n}, of strings defined over Σ = {a, b}, as {a^n b^n a^n : n = 1, 2, 3, …}, can be written as {aba, aabbaa, aaabbbaaa, …}
- Example 11: the language PRIME, of strings defined over Σ = {a}, as {a^p : p is prime}, can be written as {aa, aaa, aaaaa, …}
- PALINDROME: the language consisting of Λ and the strings s defined over Σ such that Rev(s) = s. Example: for Σ = {a, b}, PALINDROME = {Λ, a, b, aa, bb, aaa, aba, bab, bbb, …}
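
A descriptive definition is a condition on words, so it can be read as a test that accepts or rejects a given string. The sketch below, which is an illustration and not part of the original slides, writes a few of the examples above as Python predicates and lists the accepted words in size order. All function names are hypothetical, the empty Python string stands for Λ, and every letter is assumed to be a single character.

```python
from itertools import product


def words(sigma, max_len):
    """Yield all strings over sigma of length 0..max_len, shortest first."""
    for n in range(max_len + 1):
        for letters in product(sigma, repeat=n):
            yield "".join(letters)


def is_equal(w):
    """EQUAL: as many a's as b's."""
    return w.count("a") == w.count("b")


def is_even_even(w):
    """EVEN-EVEN: even number of a's and even number of b's."""
    return w.count("a") % 2 == 0 and w.count("b") % 2 == 0


def is_palindrome(w):
    """PALINDROME: Rev(w) = w (letters are single characters here)."""
    return w == w[::-1]


def is_prime_length(w):
    """PRIME over {a}: the length of w is a prime number."""
    n = len(w)
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))


print([w for w in words("ab", 3) if is_palindrome(w)])
# ['', 'a', 'b', 'aa', 'bb', 'aaa', 'aba', 'bab', 'bbb']
print([w for w in words("a", 6) if is_prime_length(w)])
# ['aa', 'aaa', 'aaaaa']
```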

Kleene Closure
- Kleene closure (applied to Σ) is also called set closure
- Given an alphabet Σ, we wish to define a language in which any string of letters from Σ is a word, even the null string
- This language is called the closure of the alphabet
- Denoted by Σ*
- Also called Kleene star
Examples
- If Σ = {x} then Σ* = {Λ, x, xx, xxx, …}
- If Σ = {0, 1} then Σ* = {Λ, 0, 1, 00, 01, 10, 11, 000, 001, …}
- If Σ = {a, b, c} then Σ* = {Λ, a, b, c, aa, ab, ac, ba, bb, bc, ca, cb, cc, …}

Kleene Star
- Kleene star is an operation that makes an infinite language of strings out of an alphabet
- An infinite language means infinitely many words, each of finite length
- We write the words of the language in size order (shortest words first, and words of the same length alphabetically); we usually follow this method of sequencing a language
- This ordering is called lexicographic order

PLUS Operation (+)
- The PLUS operation is the same as Kleene star closure except that it does not generate the null string automatically.
- Example: if Σ = {0, 1} then Σ+ = {0, 1, 00, 01, 10, 11, 000, 001, …}

Σ* and Σ+
- Σ*: the set of all strings over an alphabet Σ, called the Kleene star closure of the alphabet. So we have Σ* = Σ^0 ∪ Σ^1 ∪ Σ^2 ∪ Σ^3 ∪ …
- Σ+: the set of all strings over an alphabet Σ excluding the empty string Λ, called the plus operation. So we have Σ+ = Σ^1 ∪ Σ^2 ∪ Σ^3 ∪ …
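
The union of powers can be enumerated one length at a time, which gives the size ordering used in the examples. This is a minimal sketch under the assumption that each letter is a single character; sigma_power, kleene_star and kleene_plus are hypothetical names, and the empty Python string stands for the null string.

```python
from itertools import count, islice, product


def sigma_power(sigma, k):
    """All strings of exactly k letters over sigma."""
    return ["".join(letters) for letters in product(sigma, repeat=k)]


def kleene_star(sigma):
    """Yield the words of the Kleene star closure, shortest first."""
    for k in count(0):
        yield from sigma_power(sigma, k)


def kleene_plus(sigma):
    """Same as kleene_star but starting from length 1 (no null string)."""
    for k in count(1):
        yield from sigma_power(sigma, k)


print(list(islice(kleene_star("01"), 9)))
# ['', '0', '1', '00', '01', '10', '11', '000', '001']
print(list(islice(kleene_plus("01"), 8)))
# ['0', '1', '00', '01', '10', '11', '000', '001']
```

Both generators are infinite for a nonempty alphabet, so they are consumed with islice rather than list.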

Some observations
- Λ represents the empty string (it is not a letter of the alphabet and thus not a part of Σ)
- Other symbols (for example ε) are also used to represent the same string
- Λ is not equivalent to the empty set ∅
- If Σ = ∅ then Σ* = {Λ}
- Is S* = (S*)*? And so on

Recursive Language Definition
Recursion
- When an entity is referred to within its own definition
- Recursive functions: a function calls itself within its definition/body
Principles of recursion
- Define a base case
  - for termination (in case of top-down)
  - for a starting point (in case of bottom-up)
- Define the recursive part in terms of the base case

A recursive definition is characteristically a three-step process:
- First, we specify some basic objects in the set
- Second, we give rules for constructing more objects in the set from the ones we already know
- Third, we declare that no objects except those constructed in this way are allowed in the set

Example 1: the language Even, where Σ = {1, 2, 3, 4, …}
- Informal definition: the language of all words x such that x is divisible by 2
- Rule 1: 2 is in Even
- Rule 2: if x is in Even, then so is x+2
- Rule 3: the only elements in the set Even are those that can be produced from the two rules above

Example 2: define the language Positive of all positive natural numbers
- Rule 1: 1 is in Positive
- Rule 2: if x and y are in Positive, then so are x+y and x*y (x/y stays in Positive only when y divides x exactly)
- Rule 3: the only elements in the set Positive are those that can be produced from the two rules above

Example 3: define the language {a^n b^n}, n = 1, 2, 3, …, of strings defined over Σ = {a, b}
- Rule 1: ab is in {a^n b^n}
- Rule 2: if x is in {a^n b^n} then axb is in {a^n b^n}
- Rule 3: no strings except those constructed above are allowed to be in {a^n b^n}

Example 4: define the language L of strings ending in a, defined over Σ = {a, b}
- Rule 1: a is in L
- Rule 2: if x is in L then s(x) is also in L, where s belongs to Σ*
- Rule 3: no strings except those constructed above are allowed to be in L
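
The three rules of a recursive definition map directly onto code: rule 1 is the base case, rule 2 is the recursive step, and rule 3 is the final rejection. The sketch below is an illustration rather than the course's own material. It assumes rule 2 of the a^n b^n example reads "if x is in the language then axb is in the language", and the names anbn, in_anbn and even_numbers are hypothetical.

```python
def anbn(how_many):
    """Bottom-up: start from the base word and keep applying rule 2."""
    word = "ab"                   # Rule 1: ab is in the language
    for _ in range(how_many):
        yield word
        word = "a" + word + "b"   # Rule 2: x -> a x b


def in_anbn(w):
    """Top-down: peel one a and one b until the base case is reached."""
    if w == "ab":                 # Rule 1
        return True
    if len(w) >= 4 and w[0] == "a" and w[-1] == "b":
        return in_anbn(w[1:-1])   # Rule 2, applied in reverse
    return False                  # Rule 3: nothing else is in the language


def even_numbers(how_many):
    """Language Even: 2 is in Even; if x is in Even then so is x + 2."""
    x = 2
    for _ in range(how_many):
        yield x
        x += 2


print(list(anbn(3)))                       # ['ab', 'aabb', 'aaabbb']
print(in_anbn("aaabbb"), in_anbn("abab"))  # True False
print(list(even_numbers(4)))               # [2, 4, 6, 8]
```

The generators show the bottom-up reading (a starting point plus a construction rule), while in_anbn shows the top-down reading (the base case terminates the recursion).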

Regular Expressions
- We discuss a specific class of languages called regular languages.
- We will also see the machine way of looking at a regular language: given a regular language, we can always construct a finite automaton, deterministic or nondeterministic, that accepts all the words of the language.
- One way of looking at a language is through regular expressions.
- Regular expressions consist of atomic expressions and some specific operators that operate on those atomic expressions to build or generate all the words of a given language.
- So a language can be viewed in three different ways. Grammar is nothing but the set of rules.
- As discussed earlier, a* generates Λ, a, aa, aaa, aaaa, aaaaa, … and a+ generates a, aa, aaa, aaaa, aaaaa, …, so the languages L1 = {Λ, a, aa, aaa, aaaa, aaaaa, …} and L2 = {a, aa, aaa, aaaa, aaaaa, …} can simply be expressed by a* and a+ respectively.
- a* and a+ are called regular expressions (RE) for L1 and L2 respectively.

With a small set of operators we build all regular expression patterns:
- a* means 0 or more occurrences of a
- a+ means 1 or more occurrences of a
- a? means 0 or 1 occurrence of a
- [a-z] => a/b/c/…/z

The language L = {x, xx, xxx, …} can be defined by any of the expressions below:
  xx*, x+, xx*x*, x+x*
Further examples:
- ab*a => an a, then any number of b's, then an a: aa, aba, abba, …
- (ab)* => Λ, ab, abab, ababab, …
- a*b* => any number of a's followed by any number of b's: Λ, a, b, aa, ab, bb, …
- a*b* is not equal to (ab)*
- sign? (0/[1-9] digit*)
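
These operators exist with the same meaning in most regex libraries, so the claims above can be tried out directly. The sketch below uses Python's standard re module; the helper name generates is hypothetical. The last pattern assumes that "sign" means an optional + or - and that "digit" means [0-9], and it writes the choice as | where the slide writes /.

```python
import re


def generates(pattern, word):
    """True if the pattern matches the whole word."""
    return re.fullmatch(pattern, word) is not None


# x+ and xx* describe the same words: one or more x's.
for w in ["", "x", "xxx"]:
    print(repr(w), generates(r"x+", w), generates(r"xx*", w))

# a*b* and (ab)* are different languages.
print(generates(r"a*b*", "aabb"), generates(r"(ab)*", "aabb"))  # True False
print(generates(r"a*b*", "abab"), generates(r"(ab)*", "abab"))  # False True

# sign? (0/[1-9] digit*), reading sign as [+-] and digit as [0-9] (assumption).
INTEGER = r"[+-]?(0|[1-9][0-9]*)"
print(generates(INTEGER, "-120"), generates(INTEGER, "007"))    # True False
```

re.fullmatch is used instead of re.search so that a word counts only if the expression generates all of it, not just a part.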

- We now introduce another use of the plus sign: by x + y, where x and y are strings of characters from an alphabet, we mean either x or y.
- Example 1: consider the language T defined over the alphabet Σ = {a, b, c}:
  T = {a, c, ab, cb, abb, cbb, abbb, cbbb, abbbb, cbbbb, …}
  All the words begin with an a or a c and are then followed by some number of b's, so we may write
  T = language((a+c)b*)
- Example 2: consider a finite language L that contains all the strings of a's and b's of length exactly three:
  L = {aaa, aab, aba, abb, baa, bab, bba, bbb}
  The first letter of each word in L is either an a or a b, and the same is the case with the other two letters. So we may write
  L = language((a+b)(a+b)(a+b)) or L = language((a+b)^3)
- If we want to define the set of all seven-letter strings of a's and b's, we could write L = language((a+b)^7)
- If we want to refer to the set of all possible strings of a's and b's of any length, we could write L = language((a+b)*)
- We can describe all the words that begin with the letter a as a(a+b)*
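
As a quick check on the fixed-length expressions, each of the n factors (a+b) contributes an independent choice between two letters, so the number of words can be worked out as follows (the bars denote the number of words in a language, a notation not used in the slides):

```latex
\[
  \bigl|\,\mathrm{language}\bigl((a+b)^n\bigr)\bigr| = 2^n,
  \qquad
  \bigl|\,\mathrm{language}\bigl((a+b)^3\bigr)\bigr| = 2^3 = 8,
  \qquad
  \bigl|\,\mathrm{language}\bigl((a+b)^7\bigr)\bigr| = 2^7 = 128 .
\]
```

The value 8 agrees with the eight words listed for the length-three language.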
- Similarly, we can describe all the words that begin with the letter a and end with the letter b simply as a(a+b)*b
- Regular expressions remove ambiguity altogether
- They are a formal way to define the lexical specifications of a language

Regular expressions are called expressions on account of their similarity with arithmetic expressions. They use *, + and ():
- * shows repetition
- + presents choice or disjunction
- () is used for grouping

Given Σ = {a, b}:
- a* = {Λ, a, aa, aaa, aaaa, aaaaa, …}
- ab* = {a, ab, abb, abbb, abbbb, …}
- a+b = {a, b}
- (ab)* = {Λ, ab, abab, ababab, …}
- (a+b)* = {Λ, any string of a's and b's}

The symbols that appear in regular expressions are:
- the letters of the alphabet Σ,
- the symbol for the null string, Λ,
- parentheses (),
- the star operator *, and
- the plus sign +

The set of regular expressions is defined by the following rules:
1. Every letter of Σ, and Λ, is a regular expression
2. If r1 and r2 are regular expressions, then so are
   (r1), r1 r2, r1 + r2, r1*
3. Nothing else is a regular expression

Are the following REs? If so, what languages do they generate?
- bb(a+b)
- (a+b)(a+b)(a+b)
- (a+b)*ba
- (a+b)*a(a+b)*
- (a+b)*aa(a+b)*
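
Because regular expressions are defined recursively, the language of an expression can be computed by recursion on the same rules. The sketch below is one possible illustration, not the course's method: regular expressions are encoded as nested Python tuples (a hypothetical encoding chosen here), the empty string stands for the null string, and only words up to a chosen length are produced so that starred expressions stay finite.

```python
# Hypothetical encoding of regular expressions for this sketch:
#   "a"                a letter ("" stands for the null string)
#   ("cat", r1, r2)    r1 r2
#   ("alt", r1, r2)    r1 + r2
#   ("star", r1)       r1*


def language(r, max_len):
    """Set of words of length <= max_len generated by r."""
    if isinstance(r, str):                      # rule 1: letters and null string
        return {r} if len(r) <= max_len else set()
    op = r[0]
    if op == "alt":                             # r1 + r2: union
        return language(r[1], max_len) | language(r[2], max_len)
    if op == "cat":                             # r1 r2: product
        left = language(r[1], max_len)
        right = language(r[2], max_len)
        return {u + v for u in left for v in right if len(u + v) <= max_len}
    if op == "star":                            # r1*: repeated product
        base = language(r[1], max_len)
        result = {""}
        frontier = {""}
        while frontier:                         # stop when no new word appears
            frontier = {u + v for u in frontier for v in base
                        if len(u + v) <= max_len} - result
            result |= frontier
        return result
    raise ValueError(f"unknown operator: {op!r}")


def in_size_order(ws):
    """Sort words by length, then alphabetically."""
    return sorted(ws, key=lambda w: (len(w), w))


a_or_b = ("alt", "a", "b")
bb_then_letter = ("cat", ("cat", "b", "b"), a_or_b)           # bb(a+b)
ends_in_ba = ("cat", ("star", a_or_b), ("cat", "b", "a"))     # (a+b)*ba

print(in_size_order(language(bb_then_letter, 3)))  # ['bba', 'bbb']
print(in_size_order(language(ends_in_ba, 3)))      # ['ba', 'aba', 'bba']
```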

Write REs for the following languages over Σ = {a, b}:
- All words ending with b: (a+b)*b
- All words that start with a: a(a+b)*
- All words that start with a double letter: (aa+bb)(a+b)*
- All words that contain at least one double letter: (a+b)*(aa+bb)(a+b)*
- All words that start and end with a double letter: (aa+bb)(a+b)*(aa+bb) (this form covers words of length 4 or more; the short words aa, bb, aaa and bbb also qualify and can be added with +)
- All words of length >= 3: (a+b)(a+b)(a+b)(a+b)*
- All words that contain exactly one a or exactly one b: b*ab* + a*ba*
- All words that don't end in ba: (a+b)*(aa+ab+bb) (the words Λ, a and b also do not end in ba and can be added with +)
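
One way to gain confidence in an answer to such an exercise is to compare the expression with the plain-language condition on every short word. The sketch below does this with Python's re module and is only a bounded check, not a proof. The names to_python and mismatches are hypothetical, and the translation assumes that + in the textbook expression always means choice and that the expression does not use the null-string symbol.

```python
import re
from itertools import product


def to_python(regex):
    """Translate textbook notation to Python re syntax: '+' (choice) -> '|'."""
    return regex.replace(" ", "").replace("+", "|")


def mismatches(regex, predicate, max_len=5, sigma="ab"):
    """Words up to max_len on which the RE and the description disagree."""
    pattern = re.compile(to_python(regex))
    bad = []
    for n in range(max_len + 1):
        for letters in product(sigma, repeat=n):
            w = "".join(letters)
            if (pattern.fullmatch(w) is not None) != predicate(w):
                bad.append(w)
    return bad


print(mismatches("(a+b)*b", lambda w: w.endswith("b")))
# []
print(mismatches("b*ab* + a*ba*",
                 lambda w: w.count("a") == 1 or w.count("b") == 1))
# []
print(mismatches("(a+b)*(aa+ab+bb)", lambda w: not w.endswith("ba")))
# ['', 'a', 'b']
```

An empty list means the expression and the description agree on all words up to the chosen length; a nonempty list shows exactly which short words need separate treatment.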

Write REs for the following languages over Σ = {a, b}:
- Language of all words that have at least two a's: (a+b)*a(a+b)*a(a+b)*
- Language of all words that have at least one a and at least one b: (a+b)*a(a+b)*b(a+b)* (this form requires some a to come before some b; words such as ba, in which every b precedes every a, are covered by adding + (a+b)*b(a+b)*a(a+b)*)
- RE for the language L of strings of even length, defined over Σ = {a, b}: ((a+b)(a+b))*
- RE for the language L of strings of odd length, defined over Σ = {a, b}: ((a+b)(a+b))*(a+b)

EVEN-EVEN (Σ = {a, b})
- Language of all words having an even number of a's and an even number of b's
- Partitions/sets:
  - Even a's, even b's (valid)
  - Even a's, odd b's (need to adjust b's)
  - Odd a's, odd b's (need to adjust a's and b's)
  - Odd a's, even b's (need to adjust a's)
- i.e. EVEN-EVEN = {Λ, aa, bb, aaaa, aabb, abab, abba, baab, baba, bbaa, bbbb, …}
- RE sets:
  - (aa+bb)*
  - ((ab+ba)(ab+ba))*
  - (aa + bb + (ab+ba)(aa+bb)*(ab+ba))*

Note:
- If r1 = (aa+bb) and r2 = (a+b) then
  - r1 + r2 = (aa+bb) + (a+b)
  - r1 r2 = (aa+bb)(a+b) = (aaa + aab + bba + bbb)
  - r1* = (aa+bb)*
- A two-way relation is important when associating an RE with a language:
  - All possible strings of the language can be generated from the RE
  - All strings generated by the RE should be part of the language
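
The two-way relation can be illustrated on EVEN-EVEN by comparing the last expression above with a direct count of letters. The sketch below is a bounded check on short words only, written with Python's re module (| in place of the textbook +); the names EVEN_EVEN, even_even_by_counting and agree_up_to are hypothetical.

```python
import re
from itertools import product

# (aa + bb + (ab+ba)(aa+bb)*(ab+ba))* in Python re syntax
EVEN_EVEN = re.compile(r"(aa|bb|(ab|ba)(aa|bb)*(ab|ba))*")


def even_even_by_counting(w):
    """Descriptive definition: even number of a's and even number of b's."""
    return w.count("a") % 2 == 0 and w.count("b") % 2 == 0


def agree_up_to(max_len):
    """True if the RE and the counting test agree on all words up to max_len."""
    for n in range(max_len + 1):
        for letters in product("ab", repeat=n):
            w = "".join(letters)
            if (EVEN_EVEN.fullmatch(w) is not None) != even_even_by_counting(w):
                return False
    return True


print(agree_up_to(8))  # True
```

Agreement in both directions means that every counted EVEN-EVEN word is generated by the expression and every generated word passes the count, at least for the lengths tested.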

Equivalent regular expressions
- Two regular expressions are said to be equivalent if they generate the same language.
- Example: r1 = (a+b)*(aa+bb) and r2 = (a+b)*aa + (a+b)*bb. Both REs define the language of strings ending in aa or bb.

Regular languages
- The languages defined by regular expressions are called regular languages; or alternatively, any language that can be represented by a regular expression is a regular language.
- Note that a language may be expressed by more than one regular expression, but a given RE generates a unique language.

Language (Set) Operations
If L1 and L2 are two languages (sets of words):
- L1L2 is the product set that contains all combinations of a string from L1 concatenated with a string from L2
- L1+L2 is the union set (equivalently L1 ∪ L2) containing all words of L1 and L2
Examples
- If S = {a, aa, aaa} and T = {bb, bbb} then
  ST = {abb, abbb, aabb, aabbb, aaabb, aaabbb}
  S+T = {a, aa, aaa, bb, bbb}
- If S = {a, bb, bab} and T = {a, ab} then
  ST = {aa, aab, bba, bbab, baba, babab}

Languages Associated with REs
If r1 is a regular expression associated with the language L1 and r2 is a regular expression associated with the language L2, then
- Language(r1 r2) = L1L2
- Language(r1 + r2) = L1 + L2 = L1 ∪ L2
- Language(r1*) = L1* (Kleene closure of L1)

Regular Languages
How to tell whether a language is regular:
- Define an RE for it; if it is possible to define one, the language is regular, otherwise it is non-regular
- We must define a precise checking mechanism for RLs (to be discussed later)

Finite Languages are Regular
- If L is a finite language (with only finitely many words), then L can be defined by a regular expression
- All finite languages are regular
- Example: consider a language L1, defined over Σ = {a, b}, of strings of length 2 starting with a. Then L1 = {aa, ab} may be expressed by the RE aa+ab. Hence L1, by definition, is a regular language.
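
The product and union of finite languages, and the RE of a finite language, can be computed directly with Python sets. This is a small sketch with hypothetical function names; it assumes the words of the finite language are nonempty, since the null string would need its own symbol in the resulting expression.

```python
def product_set(l1, l2):
    """L1L2: every word of l1 concatenated with every word of l2."""
    return {u + v for u in l1 for v in l2}


def union(l1, l2):
    """L1 + L2: all words of l1 and all words of l2."""
    return l1 | l2


def in_size_order(ws):
    """Sort words by length, then alphabetically."""
    return sorted(ws, key=lambda w: (len(w), w))


def finite_language_to_re(ws):
    """RE of a finite language: its words joined by + (choice)."""
    return "+".join(in_size_order(ws))


S = {"a", "aa", "aaa"}
T = {"bb", "bbb"}
print(in_size_order(product_set(S, T)))
# ['abb', 'aabb', 'abbb', 'aaabb', 'aabbb', 'aaabbb']
print(in_size_order(union(S, T)))
# ['a', 'aa', 'bb', 'aaa', 'bbb']
print(finite_language_to_re({"aa", "ab"}))
# aa+ab
```

Joining the words with + always yields a valid RE for a finite language, which is the reason every finite language is regular.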