I was working on setting up some new CUDA projects as I’m doing some spiking (prototyping, for the not-so-agile crowd) to figure out how best to use CUDA 4.0. I’ve turned it into a quick tutorial on how to write a simple application that uses both CUDA and the latest C++0x features in Visual Studio 2010.
Because the current CUDA SDK requires projects to compile using the v90 toolset (Visual Studio 2008), the solution requires two projects: a DLL project containing the CUDA code and targeting v90, and an application project targeting v100 (VS 2010) containing the C++ code.
Make sure you have the following installed.
- Visual Studio 2010 and 2008 SP1 (required by CUDA).
- Parallel NSight 1.51
- CUDA 4.0 RC, or CUDA 3.2 and Thrust if you don’t have 4.0. I built this walkthrough using the 4.0 RC but it should work with 3.2.
Setting up the solution
Create a solution containing two projects. One targets the v100 (VS 2010) compiler to allow access to the latest C++0x language features; the other targets the v90 (VS 2008) compiler because this is required by CUDA.
1) Create a Win32 console application called HelloWorld. Select the defaults for the remaining pages in the wizard. This project will contain the main entry point to your application and any Windows-specific code, like the Parallel Patterns Library (PPL) code used for managing threads.
2) Create a second Win32 project called HelloWorldCuda. This is the DLL that will contain your CUDA code. In the application settings screen select DLL for the application type and check the empty project box.
Configure the CUDA project
There are a number of settings that need to be configured on the HelloWorldCuda project.
3) Configure the HelloWorldCuda project.
3.1) Select the Project | Build Customizations… menu item. In the dialog select the CUDA 4.0 item. This adds support for CUDA C/C++ files but there needs to be a .CU file in the project before the build settings appear in the project properties. If you don’t have CUDA 4.0 then use the 3.2 rules.
3.2) Add two new items to the project: a C++ file (.cpp) and a header file (.h) called Hello.cpp and Hello.h. Then rename the .cpp file to Hello.cu. Your solution should look like this:
3.3) Select the Hello.cu file and open its property pages. In the General tab change the Item Type to “CUDA C/C++”.
3.4) Select the project and open its properties (ALT-Enter). In the General tab set the Platform Toolset field to v90. If you are not able to do this then you probably don’t have VS 2008 installed, which is required by CUDA.
3.5) Open the Linker | General properties page and add “$(CUDA_PATH_V4_0)\lib\$(Platform);” to the Additional Library Directories field.
Note that the CUDA C/C++ properties tab is now visible.
3.6) Open the Linker | Input properties page and add “cudart.lib;” to the Additional Dependencies field.
3.7) Make sure that your projects will always build in the correct order. Right click on the HelloWorld project and select Project Dependencies. Check the box next to HelloWorldCuda. This will force the HelloWorldCuda project to build before HelloWorld.
4) Build the solution. At this point the solution should build without any warnings or errors. It doesn’t do anything yet but all the pieces are in place.
Adding some CUDA/Thrust code
Now it’s time to add some code. We need to write some CUDA code in the HelloWorldCuda DLL and export it so that the HelloWorld application can execute it.
5) Configure the HelloWorld project. It needs to link against the HelloWorldCuda library and also have access to the appropriate header files.
5.1) Open the Linker | General properties page and add “..\$(Configuration);$(CUDA_PATH_V4_0)\lib\$(Platform);” to the Additional Library Directories field.
5.2) Open the Linker | Input properties page and add “cudart.lib;HelloWorldCuda.lib;” to the Additional Dependencies field.
5.3) Open the C/C++ | General properties page and add “..\HelloWorldCuda\; $(CUDA_PATH_V4_0)\Include;” to the Additional Include Directories field.
5.4) Open the Project | Project Dependencies menu item and check the HelloWorldCuda box to make the CUDA project a dependency of the main Win32 application project.
6) Now it’s time to write some code. CUDA 4.0 now comes with Thrust so we’re going to use Thrust in our example. If you’re not using 4.0 then you need to download the latest Thrust library (link below) and copy it into a Thrust folder inside the CUDA SDK include folder %CUDA_PATH%\include\thrust.
This is a Hello World application so the code is very simple. It’s a variation of the first example on the Thrust project homepage.
Add the following class declaration to Hello.h. Most of the code is there to fix up compilation warnings. Really all this does is declare a class that is constructed with a host_vector<unsigned long> and then has some methods that execute CUDA code and return results.
Hello.cu defines the constructor and the Sum and Max methods. The constructor copies the data onto the device, while the Sum and Max methods call Thrust algorithms to carry out calculations on the GPU.
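As a rough illustration of the class just described, here is a minimal sketch. It is not the code from the original sample and it makes several assumptions so that it builds with any C++ compiler: the macro names HELLO_API, HELLOWORLDCUDA_EXPORTS and HELLOWORLDCUDA_IMPORTS are hypothetical, std::vector stands in for thrust::host_vector and thrust::device_vector, and std::accumulate and std::max_element stand in for thrust::reduce and thrust::max_element. In a real Hello.cu the data member would live in device memory and the two methods would call the Thrust algorithms instead.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// ---- Hello.h (sketch) ----
// Hypothetical macros: the DLL project would define HELLOWORLDCUDA_EXPORTS and
// the application HELLOWORLDCUDA_IMPORTS. With neither defined the class is an
// ordinary one, which lets this sketch build as a single file.
#if defined(_WIN32) && defined(HELLOWORLDCUDA_EXPORTS)
#define HELLO_API __declspec(dllexport)
#elif defined(_WIN32) && defined(HELLOWORLDCUDA_IMPORTS)
#define HELLO_API __declspec(dllimport)
#else
#define HELLO_API
#endif

class HELLO_API Hello
{
public:
    // Assumption: std::vector stands in for thrust::host_vector<unsigned long>.
    explicit Hello(const std::vector<unsigned long>& data);
    unsigned long Sum() const;
    unsigned long Max() const;   // requires non-empty data

private:
    // Stand-in for a thrust::device_vector<unsigned long> holding the device copy.
    std::vector<unsigned long> m_data;
};

// ---- Hello.cu (sketch) ----
// The copy made here stands in for the host-to-device transfer.
Hello::Hello(const std::vector<unsigned long>& data) : m_data(data) {}

// Stand-in for thrust::reduce over the device data.
unsigned long Hello::Sum() const
{
    return std::accumulate(m_data.begin(), m_data.end(), 0UL);
}

// Stand-in for thrust::max_element; like it, this yields an iterator
// that must not be dereferenced for an empty range.
unsigned long Hello::Max() const
{
    assert(!m_data.empty());
    return *std::max_element(m_data.begin(), m_data.end());
}
```

Exporting a class that has a template data member typically makes MSVC emit warning C4251, which may be one of the warnings the real header has to work around.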
Finally HelloWorld.cpp contains the application’s entry point and executes the CUDA/Thrust code. It also calculates the answers on the host’s CPU so that you can check for correctness.
Run the application and you should see the following output:
You may see lots of “Cannot tell what pointer points to…” warnings; see Resolving Thrust/CUDA warnings “Cannot tell what pointer points to…”. This appears to be a known issue. The warnings only appear when the NVCC compiler’s -G0 flag is set and/or the project is compiling against arch sm_10.
Making use of the Parallel Patterns Library and C++ lambdas
So now we have a Win32 application that runs CUDA code using the Thrust template library. So far we could have done this with a single project that targeted the v90 toolset; the separate v100 project pays off once the host code uses VS 2010 features such as the PPL and lambdas. Update the HelloWorld.cpp file to use the parallel_invoke algorithm to run the host and device code in parallel.
Notice how the output ordering has changed. The call to parallel_invoke takes two lambda expressions containing code that is now run in parallel.
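A minimal sketch of that pattern follows; it is not the original listing. The helper run_both, the Results struct and sum_on_both are hypothetical names, and the "device" lambda is a host stand-in for the call into the CUDA DLL. With Visual C++ it calls Concurrency::parallel_invoke from ppl.h; with other compilers it falls back to std::thread, purely so that the sketch builds outside Visual Studio.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>
#ifdef _MSC_VER
#include <ppl.h>     // Parallel Patterns Library, ships with VS 2010 and later
#else
#include <thread>    // stand-in so the sketch also builds without the PPL
#endif

// Runs two callables concurrently and returns once both have finished.
template <typename F, typename G>
void run_both(const F& f, const G& g)
{
#ifdef _MSC_VER
    Concurrency::parallel_invoke(f, g);
#else
    std::thread worker(f);   // f runs on a second thread
    g();                     // g runs on the calling thread
    worker.join();
#endif
}

struct Results
{
    unsigned long device_sum;
    unsigned long host_sum;
};

// Each lambda writes to its own member of r and only reads data,
// so no locking is needed.
Results sum_on_both(const std::vector<unsigned long>& data)
{
    Results r = { 0, 0 };
    run_both(
        // Stand-in for the call into the CUDA DLL (the Sum method).
        [&]() { r.device_sum = std::accumulate(data.begin(), data.end(), 0UL); },
        // Host-side reference calculation.
        [&]() {
            for (std::size_t i = 0; i < data.size(); ++i)
                r.host_sum += data[i];
        });
    return r;
}
```

Because the two lambdas run at the same time, anything they print can appear in either order, which is why results are collected first and reported after both have finished.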
The complete code for this sample is available here.
Thrust (Project homepage on Google Code)
Lambda expressions in C++ Visual Studio 2010