INTRODUCTION TO LANGUAGE
The language for which this compiler was built is named “M#”. M# is a general-purpose programming language that provides only basic functionality. It focuses on mathematical calculations, which are performed using arithmetic operators and a few relational operators. M# is a case-sensitive language.
M# has five language constructs: keywords, identifiers, comments, a statement terminator, and operators.
Keywords are the predefined words or commands of the language. All keywords in this language are written in upper-case to distinguish them from other words. M# has five basic keywords: START, END, INTEGER, IF and DISPLAY.
A brief description of each keyword is given below:
START: Tells the compiler that the program starts here, like main( ) in C/C++.
END: Tells the compiler that the program ends here.
INTEGER: Tells the compiler that the data type of the identifier is a number.
IF: Introduces a conditional statement, which executes certain code if the expression is true.
DISPLAY: Displays or prints output on the screen.
An identifier works like a variable: it can hold different values. Identifiers in M# cannot contain digits, and every character in an identifier must be a lower-case letter.
A comment in M# is any sequence of characters (letters, digits, operators or spaces) enclosed between two hash ‘#’ symbols. Comments are written as follows: # ….. #
A period or dot (.) terminates statements in this language. Each statement must end with a terminator.
The table below shows the valid arithmetic and relational operators for the language.
Arithmetic Operators      Relational Operators
1. + (Add)                1. = (Equal)
2. - (Subtract)           2. < (Less Than)
3. * (Multiply)           3. > (Greater Than)
4. / (Divide)
Lexical analysis is the first phase of compiler construction. In this phase a token is generated for every lexeme in the source code. The lexemes and their tokens are stored in the symbol table. Tokens are assigned to lexemes according to patterns or rules.
PATTERNS / REGULAR EXPRESSIONS
The patterns or regular expressions for the M# language constructs are given below.
Keyword → START | END | INTEGER | IF | DISPLAY
This means that no other word, even one written in upper-case letters, is treated as a keyword of this language.
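The keyword rule can be checked mechanically. The following is a minimal Python sketch, not part of the original compiler; the names KEYWORD_RE and is_keyword are hypothetical and chosen only for illustration.

```python
import re

# Assumed encoding of the Keyword pattern above.
# KEYWORD_RE and is_keyword are illustrative names, not taken from the M# compiler.
KEYWORD_RE = re.compile(r"START|END|INTEGER|IF|DISPLAY")

def is_keyword(lexeme: str) -> bool:
    """Return True only if the whole lexeme is one of the five M# keywords."""
    return KEYWORD_RE.fullmatch(lexeme) is not None

print(is_keyword("DISPLAY"))  # True
print(is_keyword("PRINT"))    # False: upper-case, but not one of the five keywords
print(is_keyword("display"))  # False: M# is case sensitive
```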
letter → a | b | … | z
Identifier → letter (letter)*
This means that an identifier must contain at least one letter. It may contain any number of letters, but all of them must be lower-case.
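As a rough illustration, the pattern letter (letter)* corresponds to the regular expression [a-z]+. The Python sketch below assumes that translation; IDENTIFIER_RE and is_identifier are hypothetical names.

```python
import re

# letter (letter)* is one or more lower-case letters, written here as [a-z]+.
# IDENTIFIER_RE and is_identifier are illustrative names for this sketch.
IDENTIFIER_RE = re.compile(r"[a-z]+")

def is_identifier(lexeme: str) -> bool:
    """Return True if the whole lexeme consists only of lower-case letters."""
    return IDENTIFIER_RE.fullmatch(lexeme) is not None

print(is_identifier("total"))  # True
print(is_identifier("x1"))     # False: digits are not allowed
print(is_identifier("Total"))  # False: upper-case letters are not allowed
```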
Comment → # (string)* #
string → letter | digit | operator | space etc
letter → A | B | C | … | Z | a | b | … | z
digit → 0 | 1 | 2 | … | 9
operator → + | * | - | / | > | < | = etc.
A comment can contain any number of letters, digits or operators between the two hash symbols. It may also be empty, with no characters between the # symbols.
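One possible reading of the comment pattern is "a hash, then any characters other than a hash, then a closing hash". The Python sketch below assumes that reading; COMMENT_RE and find_comments are hypothetical names.

```python
import re

# Assumption: a comment body may hold any character except '#',
# which covers the letters, digits, operators and spaces listed above.
COMMENT_RE = re.compile(r"#[^#]*#")

def find_comments(source: str) -> list:
    """Return every comment found in the source text, in order of appearance."""
    return COMMENT_RE.findall(source)

sample = "x + y. # add 2 values # DISPLAY x. ##"
print(find_comments(sample))  # ['# add 2 values #', '##']
```

The second match shows that an empty comment, written as two adjacent hash symbols, is accepted.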
ArthOp → + | - | * | /
RelOp → = | < | >
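Taken together, the patterns above are enough to sketch a small lexical analyser. The Python code below is only an illustration under several assumptions: the token names, the names TOKEN_SPEC, MASTER_RE and tokenize, and the use of a list of (lexeme, token) pairs as the symbol table are all hypothetical. Numeric literals are not handled, because no pattern for them is given above.

```python
import re

# Token names and the order of the alternatives are assumptions made for this sketch.
TOKEN_SPEC = [
    ("COMMENT",    r"#[^#]*#"),
    # The lookahead rejects words that merely begin with a keyword, e.g. STARTX.
    ("KEYWORD",    r"(?:START|END|INTEGER|IF|DISPLAY)(?![A-Za-z])"),
    ("IDENTIFIER", r"[a-z]+"),
    ("ARTHOP",     r"[+\-*/]"),
    ("RELOP",      r"[=<>]"),
    ("TERMINATOR", r"\."),
    ("SKIP",       r"\s+"),
]
MASTER_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(source: str) -> list:
    """Return (lexeme, token) pairs; the list stands in for the symbol table."""
    symbol_table = []
    pos = 0
    while pos < len(source):
        match = MASTER_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unrecognised character {source[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":  # whitespace produces no token
            symbol_table.append((match.group(), match.lastgroup))
        pos = match.end()
    return symbol_table

# A made-up stream of lexemes, not necessarily a valid M# program.
for lexeme, token in tokenize("IF total > limit. # show it # DISPLAY total."):
    print(f"{lexeme!r:15} {token}")
```

Listing COMMENT and KEYWORD before IDENTIFIER keeps the alternatives unambiguous. Any character that matches no pattern, such as a digit outside a comment, raises an error in this sketch.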
Transition diagrams show the states a lexeme passes through as it is recognised. A lexeme is accepted if it reaches a final state. Transition diagrams for the language constructs are drawn below.
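To illustrate how such a diagram is traversed, the Python sketch below encodes a two-state diagram for the Identifier pattern as a transition table. The state numbers, the table layout and the names classify and accepts are assumptions made for this example and are not taken from the original diagrams.

```python
# Assumed diagram for Identifier -> letter (letter)*:
#   state 0 (start) --letter--> state 1 (final) --letter--> state 1
START_STATE = 0
FINAL_STATES = {1}
TRANSITIONS = {
    (0, "letter"): 1,
    (1, "letter"): 1,
}

def classify(ch: str) -> str:
    """Map a character to the edge label used in the diagram."""
    return "letter" if "a" <= ch <= "z" else "other"

def accepts(lexeme: str) -> bool:
    """Walk the diagram one character at a time; accept only in a final state."""
    state = START_STATE
    for ch in lexeme:
        state = TRANSITIONS.get((state, classify(ch)))
        if state is None:  # no edge for this character: the lexeme is rejected
            return False
    return state in FINAL_STATES

print(accepts("sum"))   # True
print(accepts("sum1"))  # False: no edge labelled with a digit
print(accepts(""))      # False: the start state is not a final state
```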