Thursday, March 31, 2011

How to determine whether a grammar is LL(1), LR(0), SLR(1)

Is there a simple way to determine whether a grammar is LL(1), LR(0), SLR(1)... just from looking at the grammar, without doing any complex analysis?

For instance: to decide whether a BNF grammar is LL(1) you have to calculate the FIRST and FOLLOW sets first, which can be very time-consuming in some cases.

Has anybody got an idea how to do this faster? Any help would really be appreciated!

From stackoverflow
  • First off, a bit of pedantry. You cannot determine whether a language is LL(1) from inspecting a grammar for it; you can only make statements about the grammar itself. It is perfectly possible to write non-LL(1) grammars for languages for which an LL(1) grammar exists.

    With that out of the way:

    • You could write a parser for the grammar and have a program calculate FIRST and FOLLOW sets and other properties for you. After all, that's the big advantage of BNF grammars: they are machine comprehensible.

    • Inspect the grammar and look for violations of the constraints of various grammar types. For instance: LL(1) allows right recursion but not left recursion, so a grammar that contains left recursion is not LL(1). For other grammar properties you're going to have to spend some quality time with the definitions, because I can't remember anything else off the top of my head right now :)
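
To make the "have a program calculate it" option concrete, here is a minimal Python sketch that computes FIRST and FOLLOW sets by fixed-point iteration and reports LL(1) conflicts. It is an illustration under assumptions, not a reference implementation: the grammar encoding (a dict mapping each nonterminal to a list of alternatives, each alternative a list of symbols, with an empty list meaning epsilon) and the names `first_of_seq`, `first_follow` and `ll1_conflicts` are all made up for this example.

```python
EPS = ""  # sentinel for epsilon; assumes no terminal is the empty string

def first_of_seq(seq, first, nonterminals):
    """FIRST set of a symbol sequence, given the FIRST sets known so far."""
    out = set()
    for sym in seq:
        if sym not in nonterminals:       # a terminal starts the sequence
            out.add(sym)
            return out
        out |= first[sym] - {EPS}
        if EPS not in first[sym]:         # this nonterminal cannot vanish
            return out
    out.add(EPS)                          # every symbol could vanish
    return out

def first_follow(grammar, start):
    """Iterate until the FIRST and FOLLOW sets stop growing."""
    nts = set(grammar)
    first = {a: set() for a in nts}
    follow = {a: set() for a in nts}
    follow[start].add("$")                # "$" marks end of input
    changed = True
    while changed:
        changed = False
        for a, alts in grammar.items():
            for alt in alts:
                f = first_of_seq(alt, first, nts)
                if not f <= first[a]:
                    first[a] |= f
                    changed = True
                for i, sym in enumerate(alt):
                    if sym not in nts:
                        continue
                    rest = first_of_seq(alt[i + 1:], first, nts)
                    new = rest - {EPS}
                    if EPS in rest:       # what follows `a` can follow `sym`
                        new |= follow[a]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return first, follow

def ll1_conflicts(grammar, start):
    """Return sorted (nonterminal, lookahead) pairs claimed by two alternatives."""
    nts = set(grammar)
    first, follow = first_follow(grammar, start)
    conflicts = set()
    for a, alts in grammar.items():
        seen = set()
        for alt in alts:
            f = first_of_seq(alt, first, nts)
            predict = f - {EPS}
            if EPS in f:                  # nullable alternative: use FOLLOW
                predict |= follow[a]
            conflicts |= {(a, t) for t in predict & seen}
            seen |= predict
    return sorted(conflicts)

print(ll1_conflicts({"A": [["A", "+", "A"], ["a"]]}, "A"))  # [('A', 'a')]
print(ll1_conflicts({"A": [["a"], ["b"]]}, "A"))            # []
```

An empty result means no two alternatives of the same nonterminal compete for the same lookahead token, which is the LL(1) condition this sketch checks.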

    Bruce Alderman : Good point about the distinction between language and grammar. If a grammar is not LL(1), it may still be possible to construct an LL(1) grammar for the language.
    Jason Orendorff : +1 for pedantry. It's a key distinction, and the fact that it's usually glossed over is an obstacle to understanding.
  • One aspect, "is the language/grammar ambiguous", is known to be undecidable, like the Post correspondence problem and the halting problem.

  • In answer to your main question: For a very simple grammar, it may be possible to determine whether it is LL(1) without constructing FIRST and FOLLOW sets, e.g.

    A → A + A | a

    is not LL(1), while

    A → a | b

    is.

    But when you get more complex than that, you'll need to do some analysis.

    A → B | a
    B → A + A

    This is not LL(1), but it may not be immediately obvious: the left recursion is indirect, because A can start with B and B starts with A.
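
Indirect left recursion like this is easy to miss by eye but simple to detect mechanically. The following is a rough Python sketch, assuming the grammar has no epsilon productions (with nullable symbols the leftmost-symbol relation would need extending). The dict encoding and the function name `left_recursive` are hypothetical, chosen only for this example.

```python
def left_recursive(grammar):
    """Return the nonterminals that can derive a string starting with themselves.

    `grammar` maps each nonterminal to a list of alternatives, each a
    non-empty list of symbols (assumption: no epsilon productions).
    """
    # Direct "left corners": the first symbol of each alternative,
    # kept only when it is itself a nonterminal.
    corners = {
        a: {alt[0] for alt in alts if alt and alt[0] in grammar}
        for a, alts in grammar.items()
    }
    # Transitive closure: if A can start with B and B can start with C,
    # then A can start with C.
    changed = True
    while changed:
        changed = False
        for a in corners:
            reach = set().union(*(corners[b] for b in corners[a]))
            if not reach <= corners[a]:
                corners[a] |= reach
                changed = True
    # A nonterminal is left-recursive if it is among its own left corners.
    return sorted(a for a in corners if a in corners[a])

grammar = {
    "A": [["B"], ["a"]],
    "B": [["A", "+", "A"]],
}
print(left_recursive(grammar))  # ['A', 'B']
```

Any nonterminal reported here rules out LL(1) for the grammar as written; an empty list only means this particular obstacle is absent, not that the grammar is LL(1).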

    The grammar rules for arithmetic quickly get very complex:

    expr → term { '+' term }
    term → factor { '*' factor }
    factor → number | '(' expr ')'

    This grammar handles only multiplication and addition, and already it's not immediately clear whether the grammar is LL(1). It's still possible to evaluate it by looking through the grammar, but as the grammar grows it becomes less feasible. If we're defining a grammar for an entire programming language, it's almost certainly going to take some complex analysis.

    That said, there are a few obvious telltale signs that the grammar is not LL(1) — like the A → A + A above — and if you can find any of these in your grammar, you'll know it needs to be rewritten if you're writing a recursive descent parser. But there's no shortcut to verify that the grammar is LL(1).
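
For the small arithmetic grammar above, one informal sanity check is to write the recursive descent parser and watch whether every decision can be made from the single next token. The Python sketch below does that. It is not a proof that the grammar is LL(1), and the tokenizer, the `Parser` class, the `evaluate` helper and the choice to compute a value while parsing are all assumptions added for illustration.

```python
import re

TOKEN = re.compile(r"\s*(\d+|[+*()])")

def tokenize(text):
    """Split the input into numbers and the symbols + * ( ), then append '$'."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    tokens.append("$")                    # end-of-input marker
    return tokens

class Parser:
    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos]      # the one token of lookahead

    def eat(self, expected=None):
        tok = self.peek()
        if expected is not None and tok != expected:
            raise SyntaxError(f"expected {expected!r}, got {tok!r}")
        self.pos += 1
        return tok

    def expr(self):                       # expr → term { '+' term }
        value = self.term()
        while self.peek() == "+":         # repeat or stop: decided by one token
            self.eat("+")
            value += self.term()
        return value

    def term(self):                       # term → factor { '*' factor }
        value = self.factor()
        while self.peek() == "*":
            self.eat("*")
            value *= self.factor()
        return value

    def factor(self):                     # factor → number | '(' expr ')'
        tok = self.peek()
        if tok == "(":
            self.eat("(")
            value = self.expr()
            self.eat(")")
            return value
        if tok.isdigit():
            return int(self.eat())
        raise SyntaxError(f"unexpected token {tok!r}")

def evaluate(text):
    parser = Parser(text)
    value = parser.expr()
    parser.eat("$")                       # reject trailing input
    return value

print(evaluate("2 + 3 * (4 + 1)"))  # 17
```

Each `while` and `if` inspects only `peek()`, so no choice in this parser needs more than one token of lookahead. If some rule had forced you to look two tokens ahead or to backtrack, that would be a hint that the grammar needs rewriting before it suits this style of parser.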

    jpalecek : BTW, you don't need to compute FOLLOW for LL(1), since it's only defined in terms of FIRST.
  • aardvark,

    thanks a lot for the clear explanation of LL(1) grammar. it has helped me a lot.

    zaza
