INTRODUCTION
- The study of programming languages can be divided into the examination of syntax and semantics
- Syntax – is the form of expressions, statements, and program units
- Semantics – is the meaning of those expressions, statements, and program units
- In a well-designed programming language, semantics should follow directly from syntax
- Describing syntax is easier than describing semantics
THE GENERAL PROBLEM OF DESCRIBING SYNTAX
- Lexemes – the lowest level of syntactic unit
- The lexemes of a programming language include its identifiers, literals, operators and special words
- Token of a language is a category of its lexemes
LANGUAGE RECOGNIZERS
- Languages can be defined in two ways: by recognition and by generation
- A language generator is a device that can be used to generate the sentences of a language
FORMAL METHODS OF DESCRIBING SYNTAX
- John Backus and Noam Chomsky invented a notation that is most widely used for describing programming language syntax
CONTEXT FREE GRAMMERS
- Chomsky described 4 classes of grammars that define 4 classes of languages. Two of these grammar classes, context-free and regular turned out to be useful for describing the syntax of programming languages
- The tokens of programming languages can be described by regular grammars
ORIGINS OF BACKUS -NAUR FORM (BNF)
- BNF is a very natural notation for describing syntax
- Chomsky’s context-free languages is almost same as BNF’s context-free grammars
- meta-language is a language that is used to describe another language
- BNF is meta-language for programming languages
- The abstractions in BNF, or grammar are called non-terminals
- The lexemes and tokens of the rules are called terminals
- A BNF description, or grammar, is simply a collection of rules
PARSE TREES
- Most attractive feature of grammars is that they describe the hierarchical syntactic structure of sentences of the languages they define. These hierarchical structures are called parse trees
- A grammar that generates a sentence for which there are two or more distinct parse trees is said to be ambiguous
- Syntactic ambiguity of language structures is a problem because compilers often base the semantics of those structures on their syntactic form
SYNTAX GRAPHS
- A graph is a collection of nodes, some of which are connected by lines, called edges
- A directed graph is one in which the edges are directional; they have arrowheads on one end to indicate a direction
- The information in BNF rules can be represented in a directed graph, such graphs are called syntax graphs. These graphs use rectangles for non-terminals and circles for terminals
GRAMMARS AND RECOGNIZERS
- One of the most widely used of the syntax analyzer generators is named yacc – yet another compiler compiler
- Syntax analyzers for programming languages, which are often called parsers, construct parse trees for given programs
- The 2 broad classes of parsers are top-down, in which the tree is built from the root downward to the leaves, and bottom-up, in which the parse tree is built from the leaves upward to the root.
RECURSIVE DECENT PARSING
- Context-free grammar can serve as the basis for the syntax analyzer, or parser, of a compiler
- A simple kind of grammar-based top-down parser is named recursive decent
- Parsing is the process of tracing a parse tree for a given input string
- The basic idea of a recursive decent parser is that there is a subprogram for each non-terminal in the grammar
ATTRIBUTE GRAMMARS
- An attribute grammar is a device used to describe more of the structure of a programming language than is possible with a context-free grammar
- An attribute grammar is an extension to a context-free grammar
STATIC SEMANTICS
- The static semantics of a language is only indirectly related to the meaning of programs execution; rather, it has to do with the legal form of programs (syntax rather than semantics)
BASIC CONCEPTS
- Attributes which are associated with grammar symbols, are similar to variables in the sense that they can have values assigned to them
- Attribute computation functions are associated with grammar rules to specify how attribute values are computed
- Predicate functions which state some of the syntax and static semantic rules of the language, are associated with grammar rules
- Intrinsic attributes are synthesized attributes of leaf nodes whose values are determined outside the parse tree
DESCRIBING THE MEANING OF PROGRAMS: DYNAMIC SYMANTICS
- Dynamic semantics are the meaning of the expressions, statements and program units
- programmers need to know precisely what statements of a language do
- Compile writers determine the semantics of a language for which they are writing compilers from English descriptions
OPERATIONAL SEMANTICS
- Operational semantics the idea is to describe the meaning of a program by executing its statements on a machine, either real or simulated
- Operational semantics provides an effective means of describing semantics for language users and language implementers, as long as the descriptions are kept simple and informal
- Operational semantics depends on algorithms, not mathematics
AXIOMATIC SEMANTICS
- Axiomatic semantics defined in conjunction with the development of a method to prove the correctness of a program
- Axiomatic semantics is based on mathematical logic. The logical expressions are called predicates, or assertions. An assertion immediately following a statement describes a new constraints on those variables after execution of the statement. These assertions are called the precondition and post-condition. Developing an axiomatic description or proof of a given program requires that every statement in the program have both a precondition and a post-condition.
DENOTATIONAL SESMANTICS
- Denotation semantics is the most rigorous widely known method for describing the meaning of programs. It is based on recursive function theory.
- The fundamental concept of denotation semantics is to define for each language entity both a mathematical object and a function that maps instances of that entity onto instances of the mathematical object.
2 Responses to session 3(describing syntax and semantics)