ARC’s Fast Autorelease

ARC has a trick that keeps returned objects out of autorelease pools if both the caller and callee are ARC.

But how does that work? One of ARC’s features is that code compiled before ARC (MRC code) can call ARC code and vice versa. But if ARC code doesn’t put a returned object into the autorelease pool that an MRC caller is expecting, then nothing ever releases the object and it just leaks.

So clang, when compiling ARC code, emits this function call wherever an object is returned: objc_autoreleaseReturnValue(id).

If you look at objc_autoreleaseReturnValue’s implementation, it calls callerAcceptsFastAutorelease(). Even if you don’t read x86_64 or ARM assembly, the code’s comment is straightforward:

/*
  Fast handling of returned autoreleased values.
  The caller and callee cooperate to keep the returned object 
  out of the autorelease pool.

  Caller:
    ret = callee();
    objc_retainAutoreleasedReturnValue(ret);
    // use ret here

  Callee:
    // compute ret
    [ret retain];
    return objc_autoreleaseReturnValue(ret);

  objc_autoreleaseReturnValue() examines the caller's instructions following
  the return. If the caller's instructions immediately call
  objc_autoreleaseReturnValue, then the callee omits the -autorelease and saves
  the result in thread-local storage. If the caller does not look like it
  cooperates, then the callee calls -autorelease as usual.

  objc_autoreleaseReturnValue checks if the returned value is the same as the
  one in thread-local storage. If it is, the value is used directly. If not,
  the value is assumed to be truly autoreleased and is retained again.  In
  either case, the caller now has a retained reference to the value.

  Tagged pointer objects do participate in the fast autorelease scheme, 
  because it saves message sends. They are not entered in the autorelease 
  pool in the slow case.
*/

(The comment’s third and fourth paragraphs name objc_autoreleaseReturnValue where the caller-side function, objc_retainAutoreleasedReturnValue, is evidently meant: it is the call the callee looks for, and the one that checks thread-local storage.)

To summarize, callerAcceptsFastAutorelease() inspects the caller’s instructions at runtime and uses them to decide whether the returned object actually needs to go into the autorelease pool, or whether the caller is on the same ARC team and the pool can be skipped (speeding things up).

Clever girl.

※ ※ ※

What follows is the tweet stream that led to this post. Many thanks to David Smith for the education.

rentzsch / @rentzsch:

part of me still wants to write [NSMutableArray array] instead of [NSMutableArray new] since maayybbeee this code will be MRC one day again

David Smith / @Catfish_Man:

@rentzsch should be just as efficient under arc, so go for it :)

rentzsch / @rentzsch:

@Catfish_Man I think +array still hits the autorelease pool under ARC

David Smith / @Catfish_Man:

@rentzsch I’ll try to remember to check today and file a bug if it does

rentzsch / @rentzsch:

@Catfish_Man I love it but messaging sending is a serious conceptual barrier toward optimization

David Smith / @Catfish_Man:

@rentzsch aye. I got [NSDate date] fixed to not hit the ar pool though :)

rentzsch / @rentzsch:

@Catfish_Man very interesting! I’m not sure how that’s possible unless you’re playing some sort of cacheing magic

David Smith / @Catfish_Man:

@rentzsch ah didn’t realize you weren’t familiar with the general mechanism here. ARC can elide most autoreleased returns. Lemme find a link

David Smith / @Catfish_Man:

@rentzsch grep for callerAcceptsFastAutorelease here: opensource.apple.com/source/objc4/o…

mrc arc autorelease Jan 30 2014