On the one hand, you're right. Python takes exactly this approach: it disambiguates using "for" and "if":

    LISTCOMP ::= [ EXPR STMT* ]
    STMT ::= "if" EXPR | "for" PAT "in" EXPR
This is LL(1) and trivial to parse recursively.
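To see how trivial it is, here is a minimal recursive-descent sketch of that grammar in Python. The function names and token conventions are mine, not from any real implementation; patterns and expressions are reduced to single atoms for brevity. One token of lookahead ("if", "for", or "]") is all the parser ever needs:

```python
# LISTCOMP ::= "[" EXPR STMT* "]"
# STMT     ::= "if" EXPR | "for" PAT "in" EXPR
def parse_listcomp(toks):
    expect(toks, "[")
    expr = parse_expr(toks)
    stmts = []
    while toks[0] in ("if", "for"):   # one token of lookahead decides
        stmts.append(parse_stmt(toks))
    expect(toks, "]")
    return ("listcomp", expr, stmts)

def parse_stmt(toks):
    if toks[0] == "if":
        toks.pop(0)
        return ("if", parse_expr(toks))
    toks.pop(0)                        # consume "for"
    pat = toks.pop(0)                  # pattern: a single name, for brevity
    expect(toks, "in")
    return ("for", pat, parse_expr(toks))

def parse_expr(toks):
    return toks.pop(0)                 # expressions: atoms only, for brevity

def expect(toks, tok):
    assert toks.pop(0) == tok, f"expected {tok}"

parse_listcomp("[ x for x in xs if x ]".split())
# -> ("listcomp", "x", [("for", "x", "xs"), ("if", "x")])
```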

On the other hand, Haskell list comprehensions really aren't hard to parse... for humans! Most people I've talked to about this have no idea that Haskell comprehensions are formally so much harder to parse than Python list comprehensions. The difficult cases just don't arise in practice.

And there are plenty of places where we accept problems or algorithms which are ridiculously inefficient in theory, but fine in practice. Damas-Milner type inference, for example, takes time O(2^2^PROGSIZE) in the worst case - doubly exponential! Far worse (in theory) than the O(n^3) of general CFG parsing algorithms (which can be quite simple to implement!).
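The classic bad case for Damas-Milner is a chain of let-bindings where each one pairs the previous variable with itself, so the inferred type doubles in size at every step. A small sketch (a symbolic representation of my own, not real type inference) showing the blow-up:

```python
# Worst-case shape for Damas-Milner inference:
#   let x1 = (x0, x0) in let x2 = (x1, x1) in ... xn
# Each binding pairs the previous one with itself, so the
# inferred type of xn doubles in size at every step.
def pair_type(t):
    return ("pair", t, t)

def type_size(t):
    if isinstance(t, tuple):
        return 1 + sum(type_size(child) for child in t[1:])
    return 1

t = "a"                 # type variable for x0
for _ in range(10):     # ten nested lets
    t = pair_type(t)
print(type_size(t))     # 2047, i.e. 2**11 - 1: exponential in the nesting depth
```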

And this problem doesn't just come up in Haskell list comprehensions, but also with monadic do-notation. I'd much rather write:

    do (x,y) <- foo
       z <- bar x
       return (x + y + z)
than:

    do for (x,y) <- foo
       for z <- bar x
       return (x + y + z)
The extra "for" (or whatever keyword you want to put there) is just syntactic noise to a human, however much easier it makes the computer's job of parsing.

So I'm of two minds, really.



I think humans have an easy time because it's easy for them to look ahead (literally!) to see if there's a <- token. So if I were going to tackle this problem, I would do it in two passes. In the first pass I'd just tokenize everything and set a flag if I saw "<-". Then I'd use the value of that flag to inform the second pass about what it was parsing.
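That two-pass idea can be sketched like this (a toy with hypothetical names and naive whitespace tokenization, nothing like a real Haskell lexer):

```python
# Pass 1: tokenize the bracketed region and flag whether it contains
# a generator arrow "<-". Pass 2 would then dispatch on that flag,
# parsing the contents either as comprehension qualifiers or as a
# plain list expression.
def classify_brackets(src):
    toks = (src.replace("[", " [ ")
               .replace("]", " ] ")
               .replace(",", " , ")
               .split())
    has_arrow = "<-" in toks            # the flag set during pass 1
    return "comprehension" if has_arrow else "list"

classify_brackets("[ x*2 | x <- xs ]")  # -> "comprehension"
classify_brackets("[ 1, 2, 3 ]")        # -> "list"
```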



