C# Inline Methods and Optimization

Tuesday, August 19, 2008 – 2:10 pm

Last night I was playing around with a vector class that’s part of some scientific computation code I’ve been working on. It’s a long story (more on that later), but it means I’ve found some time of my own to write code in. Anyway… The Vector class uses the automatically implemented properties feature of C# 3.0:

  public class Vector
  {
    public double X { get; set; }
    public double Y { get; set; }
    public double Z { get; set; }

    // ...
  }

I was concerned that the property getters and setters were adding overhead to my class. Normally this would be a non-issue, but in this case the Vector class is used inside a series of tight loops. According to the profiler, over 90% of the application’s time is spent inside these loops, and the property getters and setters get called hundreds of thousands of times as the loops work through some vector math. That’s a lot of method calls, and I wanted to make sure that they were really getting inlined. As Eric Gunnerson points out, there is no inline keyword in C#.

Looking at the generated code

So the best way to figure this out is to look at the code that gets generated. I created a class to investigate this:

  public class MyClass
  {
    public int A { get; set; }
    public int C;
  }

I’m using a property getter here, but the discussion applies to any small method you hope the compiler will inline. The calling code looks like this:

  public void TestInlineMethods()
  {
    MyClass target = new MyClass();

    int a = target.A;
    Console.WriteLine("A = {0}", a);

    int c = target.C;
    Console.WriteLine("C = {0}", c);
  }

Running this from the debugger and looking at the disassembly shows the following code. Clearly the call to the property A getter did not get inlined: the code calls MyClass.get_A(), and you can step through it in the debugger.

             int a = target.A;
  0000003e  mov         ecx,edi
  00000040  cmp         dword ptr [ecx],ecx
  00000042  call        dword ptr ds:[05FA29A8h]
  00000048  mov         esi,eax
  0000004a  mov         dword ptr [esp+4],esi
             int c = target.C;
  00000098  mov         edi,dword ptr [edi+4]

MyClass.get_A() looks like this:

  00000000  push        esi
  00000001  mov         esi,ecx
  00000003  cmp         dword ptr ds:[03B701DCh],0
  0000000a  je          00000011
  0000000c  call        76BA6BA7
  00000011  mov         eax,dword ptr [esi+0Ch]
  00000014  pop         esi
  00000015  ret 

This seemed a little odd, as property getters seem like an ideal candidate for inlining. The inline version (the field read for C above) is one instruction, compared to thirteen for the method call: five at the call site plus eight in get_A().

What’s going on here?

Further investigation and conversations with p&p’s Chris Tavares pointed me in the right direction. There are two compilers at work here.

The C# compiler takes the code and turns it into IL at compile time, and the JIT compiler takes the IL and generates native machine code at runtime based on the target processor architecture. The JIT compiler is responsible for deciding what to inline. This is because only the JIT compiler knows enough about the processor architecture to decide whether inlining a method is appropriate, as it’s a tradeoff between instruction pipelining and cache size.

If you start your application from Visual Studio with the debugger attached (F5), then all the JIT optimizations will be disabled, even if optimization is enabled in the build settings. If you want to see the optimized code, you need to compile the application with optimization enabled (in the Build tab of the project properties), start it without the debugger (Ctrl-F5) and then attach the debugger to the running process. This is covered in “Writing High-Performance Managed Applications : A Primer”, in the “Why do I not see these optimizations in Visual Studio?” section about halfway down.
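One way to check from inside the program which situation you are in is sketched below. It is a minimal example and not part of the original investigation: `Debugger.IsAttached` reports whether a debugger is currently attached, and `DebuggableAttribute.IsJITOptimizerDisabled` reports whether the assembly was compiled with JIT optimization turned off (typically a Debug build). The `JitDiagnostics` class and its method names are made up for illustration. It cannot tell whether Visual Studio is suppressing optimization for an optimized build, other than by noticing that a debugger is attached.

```csharp
using System;
using System.Diagnostics;
using System.Reflection;

// Hypothetical helper, named for illustration only.
public static class JitDiagnostics
{
    // True when the assembly was built with JIT optimization disabled
    // (typically a Debug build). False when no DebuggableAttribute is present.
    public static bool IsOptimizerDisabled(Assembly assembly)
    {
        object[] attributes = assembly.GetCustomAttributes(typeof(DebuggableAttribute), false);
        foreach (DebuggableAttribute attribute in attributes)
        {
            if (attribute.IsJITOptimizerDisabled)
                return true;
        }
        return false;
    }

    // One-line summary suitable for logging at startup.
    public static string Describe(Assembly assembly)
    {
        return string.Format(
            "Optimizer disabled at build: {0}, debugger attached: {1}",
            IsOptimizerDisabled(assembly),
            Debugger.IsAttached);
    }
}
```

Logging `JitDiagnostics.Describe(typeof(JitDiagnostics).Assembly)` at startup gives a quick reminder of whether the disassembly you are about to look at is likely to be the optimized one.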

Attaching the debugger to an optimized executable and showing the disassembly for the A property getter shows that it is now inlined:

              int a = target.A;
  00000024  mov         ebx,dword ptr [edi+0Ch]

You’ll see the same behavior for a conventional property where the get and set methods are explicitly defined in the source code.
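As a rough way to see what inlining is worth, the sketch below puts an automatically implemented property next to an explicitly written one, and adds a getter that is prevented from being inlined with `[MethodImpl(MethodImplOptions.NoInlining)]`. The `Sample` and `InlineTiming` names are illustrative assumptions, not code from the Vector class. The timings only mean something in an optimized build run without the debugger, and no particular result is claimed here.

```csharp
using System;
using System.Diagnostics;
using System.Runtime.CompilerServices;

// Hypothetical class, for illustration only.
public class Sample
{
    private int b;

    // Automatically implemented property (C# 3.0).
    public int A { get; set; }

    // Conventional property with an explicit backing field.
    public int B
    {
        get { return b; }
        set { b = value; }
    }

    // Same work as the getter for A, but the JIT is told not to inline it.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public int GetANoInline()
    {
        return A;
    }
}

public static class InlineTiming
{
    // Sums the auto property in a tight loop and returns the elapsed ticks.
    public static long TimeProperty(Sample s, int iterations, out long sum)
    {
        sum = 0;
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            sum += s.A;
        watch.Stop();
        return watch.ElapsedTicks;
    }

    // Same loop, but every read goes through a real method call.
    public static long TimeNoInline(Sample s, int iterations, out long sum)
    {
        sum = 0;
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            sum += s.GetANoInline();
        watch.Stop();
        return watch.ElapsedTicks;
    }
}
```

Calling both methods with the same `Sample` and a large iteration count, then comparing the returned tick counts, gives a feel for the cost of the call. The sums are returned through `out` parameters so the loops have an observable result.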

What gets inlined and when?

How do you determine what gets inlined? The short answer is that you can’t. The MSDN article “Writing High-Performance Managed Applications : A Primer” gives the following guidance:

  • Methods that are greater than 32 bytes of IL will not be inlined.
  • Virtual functions are not inlined.
  • Methods that have complex flow control will not be in-lined. Complex flow control is any flow control other than if/then/else; in this case, switch or while.
  • Methods that contain exception-handling blocks are not inlined, though methods that throw exceptions are still candidates for inlining.
  • If any of the method’s formal arguments are structs, the method will not be inlined.
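To make the guidance above concrete, here is a hypothetical class with one member per bullet. The `Point3` and `InlineCandidates` names are invented for illustration, and the comments only restate the quoted heuristics. Whether a given method is actually inlined depends on the JIT version and can only be confirmed by looking at the disassembly of an optimized build.

```csharp
using System;

// Hypothetical types, for illustration only.
public struct Point3
{
    public double X, Y, Z;
}

public class InlineCandidates
{
    private int count;

    // Small, non-virtual, no branching: the kind of method the guidance treats as a candidate.
    public int Count { get { return count; } }

    public void Increment() { count++; }

    // Virtual: not inlined according to the guidance.
    public virtual int VirtualCount() { return count; }

    // Contains a while loop (complex flow control): not inlined according to the guidance.
    public int SumTo(int n)
    {
        int total = 0;
        int i = 1;
        while (i <= n)
        {
            total += i;
            i++;
        }
        return total;
    }

    // Contains an exception-handling block: not inlined according to the guidance.
    public int ParseOrZero(string text)
    {
        try { return int.Parse(text); }
        catch (FormatException) { return 0; }
    }

    // Only throws, with no handler: still a candidate according to the guidance.
    public int CheckedCount()
    {
        if (count < 0) throw new InvalidOperationException("count is negative");
        return count;
    }

    // Takes a struct as a formal argument: not inlined according to the guidance.
    public double LengthSquared(Point3 p)
    {
        return p.X * p.X + p.Y * p.Y + p.Z * p.Z;
    }
}
```

Each of these is small enough that the 32-byte IL limit is unlikely to be the deciding factor, so the example isolates the other bullets.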

But there are no hard and fast rules, and in most cases you shouldn’t have to worry about this; just let the compilers do their thing, and certainly don’t try to outguess them. If you do want to see what your code looks like, you’ll have to attach the debugger. There’s also further discussion of compiler optimization in MSDN Magazine.

  1. 9 Responses to “C# Inline Methods and Optimization”

  2. There are slight changes in .NET 3.5 SP1 with regard to the x86 JIT’s inlining heuristics. Vance Morrison covers this well:

    http://blogs.msdn.com/vancem/archive/2008/08/19/to-inline-or-not-to-inline-that-is-the-question.aspx

    By Sasha Goldshtein on Aug 20, 2008

  3. Sasha,

    Thanks for pointing this out! It’s a really good post and explains the tradeoffs involved in when to inline and when not to.

    Seems like simple property accessors will almost always be inlined, as the inline code (one instruction) is smaller than the call (five instructions) even with a multiplier of one:

    If InlineSize <= NonInlineSize * Multiplier do the inlining

    Thanks

    By Ade Miller on Aug 20, 2008

  4. I think a vector like this is a great place to use a struct.

    By Eric Gunnerson on Aug 20, 2008

  5. Hi Eric,

    I guess my ill-thought-out (manager trying to code) concern when I wrote the class was around memcpy overhead. Thinking about it further, the Vector is only three doubles, so it should be a struct. It “behaves” like an int, so it should really be a struct. Time to rewrite it.

    In general though it’s interesting to figure out what the compiler(s) are doing under the hood. It might even make me a better programmer one day… maybe.

    Thanks

    By Ade Miller on Aug 20, 2008

    I think instead of using Ctrl-F5 to get JIT optimization, unchecking “Suppress JIT optimization on module load” should also give you optimized code:

    http://blogs.msdn.com/saraford/archive/2008/08/13/did-you-know-how-to-optimize-your-code-for-a-build-290.aspx

    Regards,

    By abhishek on Aug 20, 2008

    Not suppressing JIT optimization will work too, thanks for pointing it out. This gives you the attach-debugger behavior all the time, which might or might not be what you want.

    Thanks!

    By Ade Miller on Aug 21, 2008

