I don't necessarily consider this a fight at all, I think each has their strengt...

barrkel · on June 3, 2011

The C++ method doesn't give you a trivial way of capturing variables rather than values, where those variables go on to have a life that matches the lifetime of the closure created. You have to work rather hard to get that, by wrapping things up in little holders which are captured by value, but don't do a deep copy when they themselves are copied. It reminds me a little of how arrays are sometimes used to capture final variables in a mutable way in Java anonymous inner classes.
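The "little holders" workaround can be sketched roughly as follows. This is only an illustration, assuming `std::shared_ptr` as the holder; `Counter` and `make_counter` are hypothetical names, not anything from the standard or the article.

```cpp
#include <functional>
#include <memory>

// Hypothetical pair of closures that share one captured "variable".
struct Counter {
    std::function<void()> increment;
    std::function<int()> read;
};

Counter make_counter() {
    // The holder is captured by value, but copying a shared_ptr is a
    // shallow copy, so both lambdas refer to the same int on the heap.
    auto count = std::make_shared<int>(0);
    return Counter{
        [count] { ++*count; },
        [count] { return *count; },
    };
}
```

Copying a `Counter` copies the holders, not the int, so every copy keeps observing the same variable for as long as any closure is alive.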

What this means is that whenever you pass off a lambda to a function, and that lambda captures a variable, you need to be aware of whether or not the function keeps a reference to the lambda someplace, so that you'll know if it's safe to capture by reference or not. If you capture by reference but the function keeps a reference to the lambda, and it gets called after the captured variable has gone out of scope, you'll get into some nasty trouble.
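A minimal sketch of the situation, assuming a callee that stores the lambda for later; `CallbackRegistry` and `register_greeting` are made-up names for illustration.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical callee that keeps callbacks after the registering call returns.
class CallbackRegistry {
public:
    void add(std::function<std::string()> cb) {
        callbacks_.push_back(std::move(cb));
    }

    std::vector<std::string> run_all() const {
        std::vector<std::string> out;
        for (const auto& cb : callbacks_) out.push_back(cb());
        return out;
    }

private:
    std::vector<std::function<std::string()>> callbacks_;
};

void register_greeting(CallbackRegistry& registry) {
    std::string name = "world";
    // Safe: [name] copies the string into the closure.
    registry.add([name] { return "hello, " + name; });
    // Unsafe alternative: [&name] would leave the stored closure holding a
    // reference to a local that is destroyed when this function returns.
}
```

Nothing in the signature of `add` tells the caller that the closure outlives the call, which is the global knowledge being complained about.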

And this is why I don't like the C++ spec as it stands. It requires a kind of global knowledge to work with correctly in local contexts, in such a way that the compiler can't really help you either (it may have been possible to annotate types to indicate closure lifetime, but it would be painful without more powerful type inference than C++ has).

bitwize · on June 3, 2011

You simply can't do upward funargs in a C-family language without breaking the memory and activation-record model of the language. For example, let's say C++ had upward funargs. Any time you return a function value, or otherwise keep it around for longer than the lifetime of the function activation in which it was created, any free local variables in the closure which are captured from the containing environment must refer to locations on the heap. (Copying them by-value breaks the semantics of closures.) This conflicts with the assumption in C++ that auto variables local to a function activation are part of the function's activation record on the stack.

You could get around this problem with spaghetti stacks or something, but you'd need to find a way to free the activation records that are floating around after their enclosing scopes have expired -- enter garbage collection which you do NOT want to require for C++.

That's the problem with Lisp, it's almost all-or-nothing. If you want to correctly include some of the benefits of the Lisp execution model -- like lambdas -- you need to accept the whole thing, lock, stock, and barrel. Including heap-allocated local vars and the garbage collector. (And yes -- Python, Ruby, Haskell, and Standard ML have a "Lisp execution model" in this sense.)

So we get compromises and hulking abominations like C++0x lambdas or -- worse yet -- their predecessor, Boost Lambdas.

tl;dr: Upward funargs are hard, let's go shopping.

ori_b · on June 3, 2011

You can do manual closure management.

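    typedef int (^intfn)(void);

    /* dupclosure/freeclosure are the proposed (hypothetical) primitives */
    intfn f(void) {
        int x = 42;
        return dupclosure(^(){ return x + 10; });
    }

    int g(void)
    {
        intfn fn;

        fn = f();
        int result = fn();
        freeclosure(fn);
        return result;
    }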

It's not quite as pretty, but it's workable, and in line with a C-family language's semantics.

seabee · on June 4, 2011

However it does require severe adjustments to the lifetime rules for automatic variables, and you have to be mindful of e.g. custom allocators.

ori_b · on June 4, 2011

Custom allocators are relatively easy to handle with an API something like:

    void* closureheap(void (^)());
    size_t closuresize(void (^)());

Or, well, anything else that allows you to separate the step of allocating memory from its initialization. (There are probably more representation-independent APIs that would be better, but this was just off the top of my head.)
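For comparison, here is roughly how the same separation looks with C++ lambdas, where `sizeof` plays the role of a size query and placement new does the initialization. This is a sketch under those assumptions; `place_closure`, `destroy_closure` and `run_in_buffer` are hypothetical helpers, and the stack buffer merely stands in for memory from a custom allocator.

```cpp
#include <new>

// Hypothetical helper: copy-construct a closure into caller-provided storage.
template <typename F>
F* place_closure(void* storage, const F& fn) {
    return ::new (storage) F(fn);
}

// Hypothetical helper: run the closure's destructor without freeing storage.
template <typename F>
void destroy_closure(F* fn) {
    fn->~F();
}

int run_in_buffer() {
    int x = 42;
    auto fn = [x] { return x + 10; };
    using Fn = decltype(fn);

    // Allocation step: storage sized and aligned for the closure type.
    alignas(Fn) unsigned char buffer[sizeof(Fn)];

    // Initialization step: construct the copy inside that storage.
    Fn* placed = place_closure(buffer, fn);
    int result = (*placed)();
    destroy_closure(placed);
    return result;
}
```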

And, yes, mutable captured variables will not be shared across multiple duplications. Somewhat unconventional, but given the mental model of copying closures, I don't think it's a surprising behavior.
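C++ by-value captures behave the same way, which may make the point concrete. A small sketch, with `make_ticker` as a hypothetical name:

```cpp
#include <functional>

// Each copy of the returned closure carries its own copy of n.
std::function<int()> make_ticker() {
    int n = 0;
    return [n]() mutable { return ++n; };
}
```

If `a` is a ticker that has been called twice and `b` is then copied from it, both continue from 2 independently: the next call on either one returns 3, and neither sees the other's increments.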

barrkel · on June 3, 2011

I'm aware of all the issues; I implemented the feature in a commercial language that's semantically very similar to C (Delphi, a variant of Pascal). My point is that there are compromises, like reference counting, that make the whole thing much easier to use. Yes, reference counting is a form of GC, but it's also deterministic and localized, and with careful selection of implementation primitives in the runtime library, potentially open to user fiddling too (as C++ users are wont to do).

comex · on June 4, 2011

> [...] any free local variables in the closure which are captured by the containing environment must refer to locations on the heap.

Objective-C blocks automatically copy those variables to the heap. (Which is not to say they're not a bunch of compromises.)

calloc · on June 3, 2011

The issue you mention of not knowing the lifetime is something that happens with pointers and memory allocations as well in C++. Unless it is documented it can be a pain trying to figure out who is ultimately responsible for free'ing the memory that was allocated ...

Ultimately Garbage Collection would help there, but I am not sure we are going to see that in C++ anytime soon.

barrkel · on June 3, 2011

Yes, but the solution landscape to this problem domain is different. Automatic variables have well-defined lifetimes; they die when they go out of scope. This means there's greater scope for the compiler to take more initiative about lifetime of captured variables (which will all be automatic one way or another, i.e. implicit 'this', a local or a parameter; assume reference parameters etc. cannot be captured).

When I designed and implemented the same feature, anonymous methods in Delphi, I used reference counting to keep alive a heap-allocated activation record containing all captured variables. This works well for most scenarios; it can get into knots in more obscure situations where you have recursive lambdas that call themselves via a captured variable, but those are usually pretty rare.
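In C++ terms the scheme looks roughly like the sketch below. This is not the Delphi implementation, just an approximation that assumes `std::shared_ptr` as the reference count; `Frame`, `Accumulator` and `make_accumulator` are hypothetical names.

```cpp
#include <functional>
#include <memory>

// Hypothetical stand-in for a compiler-generated activation record that
// holds every captured variable.
struct Frame {
    int total = 0;
};

struct Accumulator {
    std::function<void(int)> add;
    std::function<int()> total;
    std::weak_ptr<Frame> frame;  // observer only, for inspecting lifetime
};

Accumulator make_accumulator() {
    auto frame = std::make_shared<Frame>();
    return Accumulator{
        // Each closure holds one reference to the shared frame.
        [frame](int v) { frame->total += v; },
        [frame] { return frame->total; },
        frame,
    };
}
```

The frame is freed deterministically when the last closure referring to it is destroyed. A closure stored inside the frame that also captures the frame would form a reference cycle, which is the recursive-lambda knot mentioned above.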

You're right that GC is a help. The biggest thing GC gives you is freedom from having to worry about who controls the lifetime of parameters and function return values, in most cases. In the presence of GC, you can get more clever about your algorithms and data structures; you can cache and memoize, without paranoid concern for things disappearing behind your back. Consider a querying API that takes in closures for sorting and selection functions; I can see it building up temporary results and caching them, or streaming results in a multi-threaded fashion, but it can only do that if it can reliably hang on to closures after the select/sort/etc. function has returned.

zwieback · on June 4, 2011

That's what I was thinking when reading the article. It almost seems like capturing stack variables by reference should not be allowed at all but that would be too restrictive.

Ultimately the C++ problems have to be resolved by conventions and idioms, just like we've all learned to be careful when passing a pointer to a local variable.

stephen_g · on June 4, 2011

What would garbage collection do that C++0x's shared_ptr smart pointer (which is reference counted) doesn't?

I'm not really sure how GC works but using normal memory allocation when you can and reference counted pointers when somebody else is responsible for deallocating objects seems to be adequate and is still high performance...

shin_lao · on June 4, 2011

You seem to want to use a C++ 1x lambda where you should be using a future, am I wrong?

lloeki · on June 3, 2011

I'd go as far as saying that (Obj)C blocks feel right at home in pure C (where I actually used them more than in ObjC) whereas C++ lambdas fit in, well... C++.

I just wish the current passive-aggressive fight between the FSF and Apple would resolve and blocks could finally make it into the upstream GCC C compiler.

calloc · on June 3, 2011

There is no one at Apple who can sign the copyright for the blocks code over to the FSF, and as such it will never happen. At least that's what was said on the LLVM/Clang mailing list.

I understand why the FSF wants copyright assignment, but it makes the process a lot longer and more complicated.

lloeki · on June 3, 2011

Probably Jobs can?

ben_straub · on June 3, 2011

Not likely. Apple's moved to a clang-based toolchain, haven't they?

lloeki · on June 3, 2011

They're still transitioning from pure gcc first to llvm-gcc then to llvm+clang, and blocks are available in all of them.

XCode 3.2 defaults to GCC 4.2 with LLVM optionally available and IIRC XCode 4 too (can't check as I downgraded for various reasons).

The blocks patchset against pure GCC exists, and the problem mostly lies in upstream GCC refusing patches whose copyright has not been assigned/transferred to the FSF (see https://lwn.net/Articles/405417/). The rationale is that a critical component such as GCC should not be at the mercy of multiple (possibly hundreds of) conflicting parties, and that having a single copyright holder eases a possible relicensing process to ensure its protection.

While I understand the rationale behind this, my opinion is that it feels bureaucratic to the point of hampering notable innovative contributions while favoring local forks which will inevitably end up dying, as maintaining a fork (whatever the patchset size) against the march of a behemoth like GCC is essentially hopeless.

_tggb · on June 3, 2011

> XCode 3.2 defaults to GCC 4.2 with LLVM optionally available and IIRC XCode 4 too (can't check as I downgraded for various reasons).

Xcode 4 defaults to llvm-gcc, not gcc-4.2.

lloeki · on June 4, 2011

I wasn't quite sure about my memory of it when I wrote it, thanks for the correction. That's still not clang though.

bonch · on June 3, 2011

It's not a fight; Mike Ash was just using a tongue-in-cheek headline.