Visual Studio 2010 and CUDA
Sunday, May 8, 2011 – 3:58 PM
So I finally got around to getting the CUDA 4.0 RC2 SDK up and running, in between talks at ALT.NET Seattle.
I’m really hoping this is the last tutorial. It’s gotten a lot simpler to build CUDA on Windows in the last couple of releases.
Update June 19th 2011: I’ve updated the code and text for the RTM release of the CUDA 4.0 toolkit.
Prerequisites
Make sure you have the right stack.
- Visual Studio 2010 SP1
- CUDA 4.0 RC2 SDK and drivers
- NSight 2.0
Note that you don’t need Visual Studio 2008 any more. The dependency on the VC 9.0 compiler has been removed. This makes things much easier.
Create the project
- Create a Win32 Console Application called CudaHelloWorld. On the application settings page in the wizard check “Empty project” in the additional options menu. For the other settings use the defaults.
- Add a C++ class file called “HelloWorld.cpp”.
- Add another C++ file called “Hello.cu” and a header file, “Hello.h”.
Thanks to the improvements in RC2, those are all the files you need. No more linking separate DLLs or installing the Visual Studio 2008 compiler. Your project should now look like this:
To make things easy we’re going to configure the project to support x64 now. This will save time in configuring settings later.
- Select the Build | Configuration Manager menu item.
- In the dialog select the Platform dropdown for the CudaHelloWorld project and pick “<New…>”.
- In the new platform dialog create a new x64 platform and copy the settings from the existing Win32 project.
Now you have a project that targets both Win32 (x86) and x64.
Configure CUDA
Now configure the project to compile .cu files with the CUDA compiler.
Select the project in the solution explorer and then select the Project | Build Customizations… menu. In the dialog check the CUDA 4.0 targets.
Now right click on the Hello.cu file and select Properties. Set the Configuration dropdown to “All Configurations” and the Platform dropdown to “All Platforms” so that the settings are applied to all builds. In the Item Type field select “CUDA C/C++”.
Now make sure that the x64 platform sets the NVCC CUDA compiler to also target x64.
- Right click on the CudaHelloWorld project and select Properties.
- Open the Configuration Properties | CUDA C/C++ tree item.
- Select “All Configurations” on the configuration dropdown and “x64” for the platform dropdown.
- Select “64-bit (--machine 64)” for the Target Machine Platform.
Now set the platform dropdown to “All Platforms” and configure the linker options.
Open the Configuration Properties | Linker | Input tree item and add cudart.lib to the list in the Additional Dependencies field.
This will link the CUDA runtime library. For simplicity we’re linking the release libraries for both release and debug builds.
Add the code
Create the CUDA code. First declare the Hello class in Hello.h:
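#pragma once

// Thrust triggers warning C4996 in Visual C++; silence it for these headers only.
#pragma warning(push)
#pragma warning(disable:4996)
#include "thrust/host_vector.h"
#include "thrust/device_vector.h"
#pragma warning(pop)

class Hello
{
private:
    // Copy of the input data held in device (GPU) memory.
    thrust::device_vector<unsigned long> m_data;

public:
    Hello(const thrust::host_vector<unsigned long>& data);
    unsigned long Sum();
    unsigned long Max();
};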
Add the corresponding definition in the Hello.cu file:
#include "thrust/host_vector.h"
#include "thrust/device_vector.h"
#include "thrust/reduce.h"
#include "thrust/functional.h"
#include "thrust/extrema.h"
#include "Hello.h"

using namespace ::thrust;

// Copies the host data into a vector in device memory.
Hello::Hello(const thrust::host_vector<unsigned long>& data)
    : m_data(data.cbegin(), data.cend())
{
}

// Sums the elements on the GPU.
unsigned long Hello::Sum()
{
    return reduce(m_data.cbegin(), m_data.cend(),
        (unsigned long)0,
        plus<unsigned long>());
}

// Finds the largest element on the GPU.
unsigned long Hello::Max()
{
    return *max_element(m_data.cbegin(), m_data.cend());
}
This shows how to use Thrust to calculate the sum and maximum of an array of numbers. The Hello constructor copies a vector of numbers stored on the host into a vector on the device (GPU). The Sum and Max methods call the Thrust library to execute these calculations on the GPU.
You need to add the code to call this in the HelloWorld.cpp file:
#ifdef _WIN32
#  define WIN32_LEAN_AND_MEAN
#  define NOMINMAX
#  include <windows.h>
#endif
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <math.h>
#include <ppl.h>
#include "thrust/generate.h"
#include "thrust/reduce.h"
#include "thrust/extrema.h"
#include "thrust/functional.h"
#include "Hello.h"

using namespace ::Concurrency;

int main(int argc, char** argv)
{
    printf("Generating data...\n");
    thrust::host_vector<unsigned long> host_data(100000);
    thrust::generate(host_data.begin(), host_data.end(), rand);
    printf("generated %lu numbers\n", (unsigned long)host_data.size());

    // parallel_invoke returns only after both lambdas have finished, so
    // capturing host_data by reference is safe and avoids copying it twice.
    parallel_invoke(
        [&host_data]()
        {
            printf("\nRunning host code...\n");
            // The initial value is 0UL so the sum is accumulated as unsigned long.
            unsigned long host_result = thrust::reduce(host_data.cbegin(),
                host_data.cend(), 0UL, thrust::plus<unsigned long>());
            printf("The sum is %lu\n", host_result);
            host_result = *thrust::max_element(host_data.cbegin(),
                host_data.cend(), thrust::less<unsigned long>());
            printf("The max is %lu\n", host_result);
        },
        [&host_data]()
        {
            printf("\nCopying data to device...\n");
            Hello hello(host_data);

            printf("\nRunning CUDA device code...\n");
            unsigned long device_result = hello.Sum();
            printf("The sum is %lu\n", device_result);

            printf("\nRunning CUDA device code...\n");
            device_result = hello.Max();
            printf("The max is %lu\n", device_result);
        }
    );
    return 0;
}
The output shows the same calculation executed on both the host CPU and the GPU on different threads. It uses the Parallel Patterns Library to create threads and then executes CUDA code from one of them. The final output looks like this:
Update for RTM: the note below no longer applies. Nvidia has fixed the bug. The printf code above still works, but C++ std::cout now works fine too, and I’ve updated the sample code in the link below to use it.
This code differs from code I showed previously in that it uses printf rather than the STL stream libraries. A bug in the RC2 CUDA release causes a lot of linker errors when the stream libraries are used. It looks like this will be fixed for RTM, and I’ll update the sample then.
You can download the full sample code from:
https://bitbucket.org/ademiller/blog/src/ff789bb7a938/CudaHelloWorld_4_RC2/
35 Responses to “Visual Studio 2010 and CUDA”
Quick question, for those who do not have dual GPUs and want to use VS 2010 with CUDA, should the steps in this tutorial work?
By John on May 19, 2011
What I mean by my previous comment is, I’m using a normal GeForce 560Ti, can i have Nsight?
By John on May 19, 2011
John,
Yes this should work fine for anyone with a CUDA enabled GPU.
Ade
By Ade Miller on May 19, 2011
Hmm. I’m getting the following compile errors:
error C2039: ‘parallel_invoke’ : is not a member of ‘Concurrency’
error C3861: ‘parallel_invoke’: identifier not found
I’ve checked and I have all the includes right and the namespace/identifiers are there. They show up in IntelliSense and everything :/
By Mark on May 20, 2011
Ade,
Thanks for the quick response, it’s very much appreciated.
I’m unsure as to whether I need to uninstall everything already on my system: CUDA 3.2, VS 2005, 2008 etc. But I’ll try first with them all still on. If it doesn’t work, I’ll uninstall and try it with what you’ve stated as requirements.
Once again thanks for the detailed walk through.
By John on May 22, 2011
Brilliant stuff Ade, really helpful.
By JC on May 22, 2011
John,
You should be able to run side-by side with VS 2008 and CUDA 3.2 at least. I’ve been doing that.
Ade
By Ade Miller on May 22, 2011
Mark,
Have you tried running the completed version of the app by downloading it from here?
https://bitbucket.org/ademiller/blog/src/ff789bb7a938/CudaHelloWorld_4_RC2/
Ade
By Ade Miller on May 22, 2011
Hello, could anyone tell me which version of Visual Studio 2010 was used in this tutorial please?
Do you use premium, professional or ultimate? Also is it the same as VS2008, where you get package but you use the VS C++ program and not just VS itself?
By Pete on May 23, 2011
I used Ultimate but Pro or Premium with C++ should work OK. 2008 is not the same as 2010.
By Ade Miller on May 23, 2011
Wow ! It works !!!!
Really really really, really thank you !
I tried a long time, it’s a relief !
Just a little question : is it possible to save a “CUDA project” type, in order not to have to re-do all these configurations ?
Thank you again.
By RyuKa on May 24, 2011
I have many warnings about memory, is it the same for you ?
Now i can run my own cuda code, but i can’t run sdk samples by opening BlackScholes_vs2010.vcxproj for example.
“1> nvcc fatal : Value '2010' is not defined for option 'cl-version'
1>C:\Program Files (x86)\MSBuild\Microsoft.Cpp\v4.0\BuildCustomizations\CUDA 4.0.targets(276,3): error MSB3721: The command ""C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v3.2\bin\nvcc.exe" -gencode=arch=compute_10,code=\"sm_10,compute_10\" -gencode=arch=compute_20,code=\"sm_20,compute_20\" --use-local-env --cl-version 2010 -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\bin" -I"./" -I"../../common/inc" -I"../../../shared/inc" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v3.2\include" -G0 --keep-dir "Debug" -maxrregcount=32 --machine 32 --compile -D_NEXUS_DEBUG -g -Xcompiler "/EHsc /nologo /Od /Zi /MDd " -o "Win32/Debug/BlackScholes.cu.obj" "C:\ProgramData\NVIDIA Corporation\NVIDIA GPU Computing SDK 4.0\C\src\BlackScholes\BlackScholes.cu"" exited with code -1.
1>
1>Build FAILED.”
Any idea ?
By RyuKa on May 24, 2011
Hi RyuKa,
This is one of the Nvidia samples, which I don’t support (I don’t work for Nvidia). The best place to get support for these is on the Nvidia forums. However, it looks like the sample is picking up the 3.2 toolkit on the path, which isn’t going to help. My experience is that many of the samples do not yet work with the 4.0 RC.
Sorry I can’t help more.
Thanks,
Ade
By Ade Miller on May 24, 2011
Thank you for making this, I feel I am nearing the end of a very long and tiring search. But now the $799.00 question: will it work with VS2010 Express? I am almost sure the answer is no. So a followup: is there any way to compile any CUDA program in windows without paying many hundreds of dollars?
thanks again,
-n
By Nuun on May 24, 2011
Okay, thank you anyway !
And do you know a way to create a project with these “options” already set ? It’s for people I work with …
I also have one more question, do we really need Nsight 1.5 ?
By RyuKa on May 25, 2011
If you are getting the error:
The result “” of evaluating the value “$(CudaBuildTasksPath)” of the “AssemblyFile” attribute in element is not valid. C:\Program Files (x86)\MSBuild\Microsoft.Cpp\v4.0\BuildCustomizations\CUDA 4.0.targets
See this StackOverflow entry for a resolution:
http://stackoverflow.com/questions/6156037/issue-with-production-release-of-cuda-toolkit-4-0-and-nsight-2-0
By John Gietzen on May 29, 2011
I’ve had some issues with normal VS C++ 2010. The problem is in the first instructions only: I don’t have a drop down for 64, it’s just 32 bit. I have created files with exactly the same names as you. I’m using 4.0 and I’ve got Nsight. So I’m downloading and installing a trial version of VS Ultimate, don’t know if that’ll help.
By John on Jun 4, 2011
Well Ultimate version of VS was much better, got all of your stuff done, compiled and built nicely, no errors.
I’m trying to build the SDK samples that comes with 4.0 and I keep getting the error to do with the cutil64d.
LINK : fatal error LNK1104: cannot open file ‘cutil64D.lib’
Any ideas on how to overcome this, I am following the same steps as for your cudaworld example.
By John on Jun 4, 2011
Thanks a lot!
This was really helpful.
I was not able to do it myself (even following all your steps) because I couldn’t open the property page of the Hello.cu file. Also, The property page of my helloworld project hasn’t all these options (cuda/c++). Any ideas?
By ehubo on Jun 8, 2011
John,
I don’t support Nvidia’s samples. However I think this is caused by not building the common folder in the main samples folder. This contains a utils project upon which the other samples depend.
Ade
By Ade Miller on Jun 9, 2011
Sounds like something isn’t installed properly. Which version of the SDK are you running, and can you get my sample to work if you download the finished version and compile that?
By Ade Miller on Jun 12, 2011
Hi, Ade Miller. The project above works. But it has no GUI. Can you create a simple MFC project, so we can work with GUI. I just create one, but can only work in Release Mode, not work in Debug mode. Thank you.
By Yubao Wu on Jul 14, 2011
I found I can add a WinForm to your current project. But Can I call the GUI in the main function of this project? MFC maybe too old to be used to design GUI. WinForm is more recently.
By Yubao Wu on Jul 14, 2011
The latest version of the CUDA SDK actually includes several samples, like N-body, that have an OpenGL GUI and use VS 2010 in a similar fashion to the example here. It may be possible to use managed code WinForms from C++ using the C++/CLI managed technologies but I’ve never done this. I’m not a great fan of MFC so you will not see me using this here.
Ade
By Ade Miller on Jul 17, 2011
Ade, Thanks so much for sharing this. This has been a huge help — there is so much old info out there about having to use VS 2008 when in fact, you don’t. Thanks again for providing such a clear and concise solution. A huge time saver!!
By Bill on Aug 3, 2011
Hi Abe i’ve added this link to my blog, let the people know how to setup cuda, this blog should be linked by nvidia main page :)
best regards!
By Maciej on Aug 15, 2011
I am porting a Cuda program that I am familiar with but did not write from Linux to Windows 7.
I found the article online by Ade Miller and it was very helpful. I installed •Visual Studio 2010 SP1 •CUDA 4.0 SDK and drivers •NSight 2.0 as instructed and the Hello World program he has in his blog article worked just fine, thank you.
From there I proceeded to add .cu and .h files and encountered only minor port problems which I quickly overcame. Two problems persist which I do not know how to address. I’m wondering if any of you might have a clue as to what I’m missing or did wrong?
First off, blockIdx and threadIdx and blockDim and gridDim are undefined. The error list shows the errors as follows:
3 IntelliSense: identifier “blockDim” is undefined c:\solutions\cudahelloworld\cudahelloworld\hello.cu 32 46 CudaHelloWorld
1 IntelliSense: identifier “blockIdx” is undefined c:\solutions\cudahelloworld\cudahelloworld\hello.cu 32 13 CudaHelloWorld
2 IntelliSense: identifier “gridDim” is undefined c:\solutions\cudahelloworld\cudahelloworld\hello.cu 32 24 CudaHelloWorld
4 IntelliSense: identifier “threadIdx” is undefined c:\solutions\cudahelloworld\cudahelloworld\hello.cu 32 57 CudaHelloWorld
The code looks like this:
__global__
void cu_doit(
float *cuda_data
)
{
int idx =(blockIdx.y*gridDim.x+blockIdx.x)*blockDim.x+threadIdx.x;
cuda_data[idx]=255.0f;
}
Secondly, the identifier glTexImage3D and associated GL_TEXTURE_3D constant is undefined.
Any ideas how to fix this?
By Jack on Aug 24, 2011
dear all,
I have installed the CUDA 4.0 RC2 SDK and drivers, NSight 2.0 and VS 2010 (and all were fine), but in the “Create the project” step of this tutorial I could not open the CUDA file (Hello.cu). Could someone please tell me how to do this?
regards,
By Benaoumeur Bakhti on Aug 29, 2011
Hi,
Part 1 and 2 x64 works fine!
I would like to find a tutorial that addressed CUDA+VS2010+Windows Form application
Thanks for the post!
Walmor
By Walmor on Sep 20, 2011
You have to create the Hello.cu file. It’s not created for you.
Ade
By Ade Miller on Sep 20, 2011
Are you just seeing IntelliSense errors or does the code not compile at all?
Ade
By Ade Miller on Nov 5, 2011