Wow THAT was a long break

Sunday, August 1, 2021 – 6:50 AM

So blogging sort of dropped off the map while I was building an analytics platform at Tier3/CenturyLink Cloud and now a live video transcoding engine at iStreamPlanet. For the most part I spend my time writing C++, so there just might be some more posts on that. I’ve also been keeping myself entertained writing some trivial C++ games using Cinder. You can find them here: CinderExamples.

C++ AMP Algorithms Library Released

Thursday, November 13, 2014 – 6:01 PM

Finally there’s a new release of the C++ AMP Algorithms Library!

This release contains some important bug fixes to 0.9.4.

  • Fixed array initialization bug in radix_sort.
  • count_if no longer uses an excessively large number of threads, even for small problem sizes.
  • Removed code to initialize tests for code coverage.
  • Improved unit test stability on different hardware.


What’s up next for the library? A port to Linux, more STL algorithm implementations and bug fixes. Right now I’m busy moving the tests to Google Test, and there might be a follow-up post on that. So far it’s been a pretty good experience: more test cases with less code and a much faster code/build/test turnaround…

C++ AMP Algorithms Library 0.9.4 Released

Thursday, October 2, 2014 – 8:39 PM

Finally there’s a new release of the C++ AMP Algorithms Library! It’s taken a while, largely due to other things, like CppCon taking up my time.

This release contains the following:

  • New C++ AMP features:
    • AMP and STL algorithms no longer depend on DirectX scan implementation.
    • New implementation of amp_algorithms::scan that does not have a direct dependency on the ID3DX11Scan and ID3DX11SegmentedScan interfaces.
    • The amp_stl_algorithms::copy_if and remove_if algorithms now use the new scan implementation for improved performance.
    • Implementation of radix sort amp_algorithms::radix_sort.
    • New utility functions: log2, is_power_of_two, count_bits, padded_read, padded_write, pack_byte and unpack_byte.
    • New namespace added for DirectX dependent features, amp_algorithms::direct3d. All DirectX code is now in a separate header file, amp_algorithms_direct3d.h.
  • New C++ AMP STL features:
    • inner_product
    • minmax
    • pair<T1, T2>
    • rotate_copy
  • New SAXPY example.
  • Reorganized unit tests, consistent names and test categories.
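
For readers wondering why copy_if and remove_if benefit from a scan, here is a rough CPU-only sketch of the usual flag/scan/scatter pattern. It is not the library's implementation: exclusive_scan_sketch and copy_if_sketch are hypothetical names made up for this illustration, and the loops run sequentially where a GPU version would run them in parallel.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Exclusive prefix sum: out[i] = in[0] + ... + in[i-1], with out[0] = 0.
// Sequential here; on a GPU this is the step that runs as a parallel scan.
std::vector<std::size_t> exclusive_scan_sketch(const std::vector<std::size_t>& in)
{
    std::vector<std::size_t> out(in.size());
    std::size_t running = 0;
    for (std::size_t i = 0; i < in.size(); ++i)
    {
        out[i] = running;
        running += in[i];
    }
    return out;
}

// copy_if via flag / scan / scatter (hypothetical helper, not the library's API).
template <typename T, typename Predicate>
std::vector<T> copy_if_sketch(const std::vector<T>& src, Predicate pred)
{
    // 1. Flag each element: 1 if it should be kept, 0 otherwise.
    std::vector<std::size_t> flags(src.size());
    for (std::size_t i = 0; i < src.size(); ++i)
        flags[i] = pred(src[i]) ? 1 : 0;

    // 2. The exclusive scan of the flags gives each kept element its output index.
    const std::vector<std::size_t> offsets = exclusive_scan_sketch(flags);

    // 3. Scatter: each kept element writes to its own slot, so no two
    //    iterations ever touch the same output location.
    const std::size_t count = src.empty() ? 0 : offsets.back() + flags.back();
    std::vector<T> dest(count);
    for (std::size_t i = 0; i < src.size(); ++i)
    {
        if (flags[i])
        {
            assert(offsets[i] < count);
            dest[offsets[i]] = src[i];
        }
    }
    return dest;
}
```

Because every output index is known before any element is written, steps 1 and 3 have no dependencies between iterations, which is what makes the pattern a good fit for a GPU.
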


What’s up next for the library? A port to Linux and more STL algorithm implementations and bug fixes.

Test Categories for Visual C++

Wednesday, April 30, 2014 – 9:52 PM

Recently I’ve been using the Visual Studio unit test runner for C++ and came across a trick for adding categories to C++ tests. While the test framework supports traits, a little work makes them much more usable.

        BEGIN_TEST_METHOD_ATTRIBUTE(testMethod)
            TEST_METHOD_ATTRIBUTE(L"TestCategory", L"my_category")
        END_TEST_METHOD_ATTRIBUTE()
        TEST_METHOD(testMethod)  // ...

That’s a lot of code just to add a category to each test.

You can add a specific macro to shorten all of this up, especially if you’re adding categories to all of your tests.

    #define TEST_METHOD_CATEGORY(methodName, category)          \
        BEGIN_TEST_METHOD_ATTRIBUTE(methodName)                 \
            TEST_METHOD_ATTRIBUTE(L"TestCategory", L#category)  \
        END_TEST_METHOD_ATTRIBUTE()                             \
        TEST_METHOD(methodName)

Now you can write tests like this:

    TEST_METHOD_CATEGORY(testMethod, "my_category")
        // ...
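
To see what the stringizing in a macro like this does outside Visual Studio, here is a small portable sketch. Everything in it is a stand-in invented for illustration: category_table, category_registrar, SKETCH_WIDE_STRINGIZE and SKETCH_TEST_METHOD_CATEGORY are hypothetical and have nothing to do with how the real test framework stores its attributes. It also uses L"" #x rather than L#x, on the assumption that the latter relies on Visual C++ preprocessor behavior.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for the test framework's attribute storage:
// maps a test method name to its category.
inline std::map<std::wstring, std::wstring>& category_table()
{
    static std::map<std::wstring, std::wstring> table;
    return table;
}

// Registers one (method, category) pair during static initialization.
struct category_registrar
{
    category_registrar(const wchar_t* method, const wchar_t* category)
    {
        category_table()[method] = category;
    }
};

// Portable wide stringizing: #x produces a narrow literal, and concatenating
// it with the empty wide literal L"" yields a wide literal (C++11 and later).
#define SKETCH_WIDE_STRINGIZE(x) L"" #x

// Declares a test function and records its category in one step.
#define SKETCH_TEST_METHOD_CATEGORY(methodName, category)      \
    static category_registrar methodName##_registrar(          \
        SKETCH_WIDE_STRINGIZE(methodName),                     \
        SKETCH_WIDE_STRINGIZE(category));                      \
    void methodName()

SKETCH_TEST_METHOD_CATEGORY(testMethod, my_category)
{
    // Test body goes here.
}
```

One detail worth knowing: in standard C++ the # operator keeps the quotes when its argument is already a string literal, so SKETCH_WIDE_STRINGIZE("quoted") produces a string that contains the quote characters. Whether that makes any visible difference in the Visual Studio test runner is not something this sketch checks.
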

Better still, you can do the same thing for classes. In this case the BEGIN_TEST_CLASS_ATTRIBUTE macro can be used within another macro.

    #define TEST_CLASS_CATEGORY(className, category)                \
        TEST_CLASS(className)                                       \
        {                                                           \
            BEGIN_TEST_CLASS_ATTRIBUTE()                            \
                TEST_CLASS_ATTRIBUTE(L"TestCategory", L#category)   \
            END_TEST_CLASS_ATTRIBUTE()

Note that the MSDN page is incorrect in its documentation of this attribute. It is used inside the class and does not take any arguments.

Now you can add a category to a class and, if needed, override the category for individual methods.

    TEST_CLASS_CATEGORY(the_tests, "foo")
    // No '{' required, part of the macro.
        TEST_METHOD_CATEGORY(test, "bar")
        {
            // Test in category 'bar'.
        }

        TEST_METHOD(other_test) // hypothetical name for a test with no override
        {
            // Test in category 'foo'.
        }
    };

I’ve used this to organize the tests in the C++ AMP Algorithms Library, so you can look at what I did there if you want more examples.

Using C++ Classes with C++ AMP

Friday, January 31, 2014 – 6:25 PM

You can use classes with C++ AMP but you have to understand the limitations the CPU/GPU hardware model places on how and where you can use them. The C++ AMP Book says the following:

References and pointers (to a compatible type) may be used locally but cannot be captured by a lambda. Function pointers, pointer-to-pointer, and the like are not allowed; neither are static or global variables.

Classes must meet more rules if you wish to use instances of them. They must have no virtual functions or virtual inheritance. Constructors, destructors, and other non-virtual functions are allowed. The member variables must all be of compatible types, which could of course include instances of other classes as long as those classes meet the same rules. The actual code in your amp-compatible function is not running on a CPU and therefore can’t do certain things that you might be used to doing:

  • recursion
  • pointer casting
  • use of virtual functions
  • new or delete
  • RTTI or dynamic casting
  • goto
  • throw, try, or catch
  • access to globals or statics
  • inline assembler

In some ways this is a little bit misleading as it refers to classes that can be used within a C++ AMP kernel. Here’s an example of a class that meets these requirements.

    class stuff
    {
    public:
        int a;

        stuff() : a(0) {}
        stuff(int v) restrict(amp, cpu) : a(v) { }
    };

Note that stuff also has a constructor marked with restrict(amp, cpu). This allows your code to create stuff instances in both CPU and GPU code (line 8 in the code below).

You can also create classes that use C++ AMP within a C++ class. In the example below test_case::test_amp() executes an AMP kernel. This works pretty much how you’d expect.

The gotcha comes when you declare a method on the test_case class that you want to call from within the AMP kernel. There is no way to share the class’ this pointer between the CPU and the GPU so methods that are declared as restrict(amp) must also be declared static. So here amp_method is static.

   1: class test_case
   2: {
   3: public:
   4:      test_case() { }
   5: 
   6:      static stuff amp_method(stuff s) restrict(amp, cpu)
   7:      {
   8:          return stuff(s.a * s.a);
   9:      }
  10: 
  11:      void test_amp()
  12:      {
  13:          concurrency::array_view<stuff, 1> data(100);
  14:          concurrency::parallel_for_each(data.extent,
  15:              [data](concurrency::index<1> idx) restrict(amp)
  16:          {
  17:              data[idx] = amp_method(data[idx]);
  18:          });
  19:          data.synchronize();
  20:      }
  21: 
  22:      void test_cpu()
  23:      {
  24:          std::vector<stuff> data(100, 0);
  25:          for (auto& d : data)
  26:          {
  27:              d = amp_method(d);
  28:          }
  29:      }
  30: };
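
The static requirement can be illustrated in plain C++ without a GPU. The sketch below is a CPU-only analogy, not C++ AMP code: stuff_cpu and test_case_cpu are hypothetical stand-ins with the restrict specifiers removed. It relies on the fact that a lambda with no captures converts to an ordinary function pointer, which only works here because the method it calls is static and so needs no this pointer.

```cpp
#include <vector>

// CPU-only stand-in for the stuff class (no restrict specifiers).
struct stuff_cpu
{
    int a;
    stuff_cpu() : a(0) {}
    stuff_cpu(int v) : a(v) {}
};

class test_case_cpu
{
public:
    // Static: works only on its argument and never touches `this`.
    static stuff_cpu amp_method(stuff_cpu s)
    {
        return stuff_cpu(s.a * s.a);
    }

    std::vector<stuff_cpu> run(std::vector<stuff_cpu> data) const
    {
        // The lambda captures nothing, yet it can still call the static member.
        auto kernel = [](stuff_cpu s) { return amp_method(s); };

        // A captureless lambda converts to a plain function pointer, which shows
        // that no object pointer travels with it. If amp_method were non-static,
        // the lambda would have to capture `this` and this line would not compile.
        stuff_cpu (*kernel_fn)(stuff_cpu) = kernel;

        for (auto& d : data)
        {
            d = kernel_fn(d);
        }
        return data;
    }
};
```

Calling test_case_cpu().run({stuff_cpu(2), stuff_cpu(3)}) squares each element, the same work the AMP kernel does per index.
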

Hopefully this clears up any confusion. This is the first of several blog posts I have lined up on C++ AMP and some of the things I’m working on for the C++ AMP Algorithms Library.