Using C++ Classes with C++ AMP
Friday, January 31, 2014 – 6:25 PM
You can use classes with C++ AMP, but you have to understand the limitations that the CPU/GPU hardware model places on how and where you can use them. The C++ AMP Book says the following:
References and pointers (to a compatible type) may be used locally but cannot be captured by a lambda. Function pointers, pointer-to-pointer, and the like are not allowed; neither are static or global variables.
Classes must meet more rules if you wish to use instances of them. They must have no virtual functions or virtual inheritance. Constructors, destructors, and other non-virtual functions are allowed. The member variables must all be of compatible types, which could of course include instances of other classes as long as those classes meet the same rules. The actual code in your amp-compatible function is not running on a CPU and therefore can’t do certain things that you might be used to doing:
- recursion
- pointer casting
- use of virtual functions
- new or delete
- RTTI or dynamic casting
- goto
- throw, try, or catch
- access to globals or statics
- inline assembler
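To make a couple of these rules concrete, here is a minimal sketch rather than anything taken from the book. The names point, triangular, AMP_RESTRICT and USE_CPP_AMP are all hypothetical. The macro is an assumption added so the snippet also builds with a compiler that lacks the restrict extension. The static_assert is only a partial check: it catches virtual functions, not every rule listed above.

```cpp
#include <type_traits>

// Hypothetical opt-in switch (not part of C++ AMP): define USE_CPP_AMP when
// building with a compiler that supports the restrict(...) extension.
#ifdef USE_CPP_AMP
#include <amp.h>
#define AMP_RESTRICT restrict(amp, cpu)
#else
#define AMP_RESTRICT
#endif

// Plain data members, no virtual functions, no pointers.
struct point
{
    int x;
    int y;

    point(int px, int py) AMP_RESTRICT : x(px), y(py) {}
};

// Partial compile-time check: rejects classes with virtual functions.
static_assert(!std::is_polymorphic<point>::value,
              "point must not have virtual functions");

// Recursion is not allowed, so sum 1 + 2 + ... + n with a loop instead.
int triangular(int n) AMP_RESTRICT
{
    int total = 0;
    for (int i = 1; i <= n; ++i)
    {
        total += i;
    }
    return total;
}
```

Without USE_CPP_AMP defined this is ordinary C++, which makes it easy to unit test the logic on the CPU before running it in a kernel.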
In some ways this quote is a little misleading, as it only describes classes that can be used within a C++ AMP kernel. Here’s an example of a class that meets these requirements.
 1: #include <amp.h>
 2: #include <vector>
 3:
 4: class stuff
 5: {
 6: public:
 7:     int a;
 8:
 9:     stuff() : a(0) {}                            // CPU only
10:     stuff(int v) restrict(amp, cpu) : a(v) { }   // CPU and GPU
11: };
Note that stuff also has a constructor marked with restrict(amp, cpu). This allows your code to create stuff instances in both CPU and GPU code (line 8 in the code below).
You can also write C++ classes that use C++ AMP internally. In the example below, test_case::test_amp() executes an AMP kernel. This works pretty much how you’d expect.
The gotcha comes when you declare a method on the test_case class that you want to call from within the AMP kernel. There is no way to share the class’s this pointer between the CPU and the GPU, so a method of the enclosing class that is declared as restrict(amp) and called from the kernel must also be declared static. So here amp_method is static.
 1: class test_case
 2: {
 3: public:
 4:     test_case() { }
 5:
 6:     static stuff amp_method(stuff s) restrict(amp, cpu)
 7:     {
 8:         return stuff(s.a * s.a);
 9:     }
10:
11:     void test_amp()
12:     {
13:         concurrency::array_view<stuff, 1> data(100);
14:         concurrency::parallel_for_each(data.extent,
15:             [data](concurrency::index<1> idx) restrict(amp)
16:         {
17:             data[idx] = amp_method(data[idx]);
18:         });
19:         data.synchronize();
20:     }
21:
22:     void test_cpu()
23:     {
24:         std::vector<stuff> data(100, 0);
25:         for (auto& d : data)
26:         {
27:             d = amp_method(d);
28:         }
29:     }
30: };
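The role of static can be seen even in plain C++, without a GPU. The sketch below is a CPU-only stand-in, not AMP code. The restrict specifiers are left out so it builds with any C++11 compiler, and square, square_member and run are hypothetical names. It shows that a lambda with an empty capture list can call a static member, while a non-static member would force the lambda to capture this.

```cpp
#include <algorithm>
#include <vector>

// Simplified copy of stuff, without restrict specifiers.
struct stuff
{
    int a;

    stuff() : a(0) {}
    stuff(int v) : a(v) {}
};

class test_case
{
public:
    // Static: needs no object, so a lambda can call it without capturing this.
    static stuff square(stuff s) { return stuff(s.a * s.a); }

    // Non-static: calling this from a lambda requires capturing this.
    stuff square_member(stuff s) const { return stuff(s.a * s.a); }

    void run(std::vector<stuff>& data) const
    {
        // The empty capture list compiles only because square is static.
        // Swapping in square_member here would fail to compile unless
        // the lambda captured this.
        std::for_each(data.begin(), data.end(),
                      [](stuff& d) { d = square(d); });
    }
};
```

In an AMP kernel the option of capturing this is not available, since pointers cannot be captured by the lambda, which leaves the static form.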
Hopefully this clears up any confusion. This is the first of several blog posts I have lined up on C++ AMP and some of the things I’m working on for the C++ AMP Algorithms Library.