CUDA Compilation and Execution in Visual Studio
Dr Paul Richmond
Compiling a CUDA program
CUDA C code is compiled using nvcc, e.g.
    nvcc -o example example.cu
This will compile host AND device code to produce an executable.
We will be using Visual Studio to build our CUDA code, so we will not need to compile at the command line.
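As a rough sketch of what a file such as example.cu might contain, the program below keeps host and device code in one source file, which nvcc separates at compile time. The kernel name scale, the helper scale_on_device and the array size are illustrative assumptions, not taken from the lecture.

```cuda
// example.cu: minimal illustrative program (all names are hypothetical).
// Command-line build: nvcc -o example example.cu
#include <cstdio>
#include <cuda_runtime.h>

// Device code: each thread doubles one element of the array.
__global__ void scale(int *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2;
}

// Host code: copy the array to the GPU, run the kernel, copy it back.
// Assumes n fits in a single thread block (n <= 1024 on current hardware).
cudaError_t scale_on_device(int *host, int n)
{
    int *dev = NULL;
    size_t bytes = n * sizeof(int);

    cudaError_t status = cudaMalloc((void **)&dev, bytes);
    if (status != cudaSuccess) return status;

    status = cudaMemcpy(dev, host, bytes, cudaMemcpyHostToDevice);
    if (status == cudaSuccess) {
        scale<<<1, n>>>(dev, n);
        status = cudaGetLastError();   // reports launch failures
    }
    if (status == cudaSuccess) {
        // This blocking copy also waits for the kernel to finish.
        status = cudaMemcpy(host, dev, bytes, cudaMemcpyDeviceToHost);
    }
    cudaFree(dev);
    return status;
}

int main()
{
    int values[8];
    for (int i = 0; i < 8; ++i) values[i] = i;

    cudaError_t status = scale_on_device(values, 8);
    if (status != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(status));
        return 1;
    }
    for (int i = 0; i < 8; ++i) printf("%d ", values[i]);
    printf("\n");
    return 0;
}
```

The same file can be added unchanged to a Visual Studio CUDA project, where the build customisation invokes nvcc on it.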
Creating a CUDA Project
Create a new CUDA project: select NVIDIA -> CUDA 7.0. This is the preferred method!
This will create a project with a default kernels.cu file containing a basic vector addition example.
Adding a CUDA source file
Alternatively, add a CUDA source file to an existing application. If you do this you must modify the project properties to include the CUDA build customisations.
See section 3.4 of http://developer.download.nvidia.com/compute/cuda/6_5/rel/docs/CUDA_Getting_Started_Windows.pdf
Compilation
CUDA source files (*.cu) are compiled by nvcc.
An existing cuda.rules file creates a property page for CUDA source files. This configures nvcc in the same way as configuring the C compiler. Options such as optimisation and include directories can be inherited from the project defaults.
C and C++ files are compiled with cl (the MSVS compiler).
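[Figure: build flow. .c / .cpp files -> cl.exe -> host .obj. .cu files -> nvcc.exe, which splits the source into host functions (-> cl.exe -> host .obj) and CUDA kernels (-> cudaacc -> CUDA .obj). All .obj files -> linker -> executable.]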
Device Versions
Different generations of NVIDIA hardware have different compatibility. In the last lecture we saw product families and chip variants. These are classified by CUDA compute versions.
Compilation normally builds for CUDA compute version 2. See Project Properties, CUDA C/C++ -> Device -> Code Generation. The default value is "compute_20,sm_20".
Any hardware with a compute version equal to or greater than the compiled version can execute the code (backwards compatibility).
You can build for multiple versions using a ";" separator, e.g. "compute_20,sm_20;compute_30,sm_30;compute_35,sm_35", which covers all Diamond and Lewin Lab GPUs.
This will increase build time and executable file size. The runtime will select the best version for your hardware.
See https://en.wikipedia.org/wiki/CUDA#Supported_GPUs
Device Versions of Available GPUs
Diamond High Spec Lab (lab machines): Quadro 5200, "compute_35,sm_35"
Lewin Lab Main Room: GeForce GT 520, "compute_20,sm_20"
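For Maxwell you can use Iceberg: Tesla K40, listed as "compute_50,sm_50". Note that the Tesla K40 itself is a Kepler-generation card with compute capability 3.5, so check the capability the device reports before building for compute_50.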
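To confirm which compute version a given machine supports, the runtime can be queried directly. The sketch below is illustrative only: the helper format_target is a hypothetical name, and the output format simply mirrors the Code Generation property string.

```cuda
// Illustrative sketch: print each CUDA device with a matching
// Code Generation string (helper names are hypothetical).
#include <cstdio>
#include <cuda_runtime.h>

// Builds a string such as "compute_35,sm_35" from a compute capability.
int format_target(int major, int minor, char *buf, size_t len)
{
    return snprintf(buf, len, "compute_%d%d,sm_%d%d",
                    major, minor, major, minor);
}

int main()
{
    int count = 0;
    cudaError_t status = cudaGetDeviceCount(&count);
    if (status != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceCount failed: %s\n",
                cudaGetErrorString(status));
        return 1;
    }

    for (int d = 0; d < count; ++d) {
        cudaDeviceProp prop;
        status = cudaGetDeviceProperties(&prop, d);
        if (status != cudaSuccess) {
            fprintf(stderr, "Device %d: %s\n", d, cudaGetErrorString(status));
            continue;
        }
        // prop.major and prop.minor hold the compute capability.
        char target[32];
        format_target(prop.major, prop.minor, target, sizeof(target));
        printf("Device %d: %s -> %s\n", d, prop.name, target);
    }
    return 0;
}
```

Whether the installed toolkit can still build for the reported target depends on the CUDA version, so treat the printed string as a starting point for the project property rather than a guaranteed setting.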
CUDA Properties
Debugging
NSIGHT is a GPU debugger for debugging GPU kernel code. It does not stop at breakpoints in host code.
To launch, insert a breakpoint and select NSIGHT -> Start CUDA Debugging. You must be in the Debug build configuration.
When stepping, all warps except the one in the debugger focus will be paused.
Use conditional breakpoints to focus on specific threads: right click on the breakpoint and select Condition.
Error Checking
cudaError_t: enumerator for runtime errors.
It can be converted to an error string (const char *) using cudaGetErrorString(cudaError_t).
Many host functions (e.g. cudaMalloc, cudaMemcpy) return a cudaError_t which can be used to handle errors gracefully:
    cudaError_t cudaStatus;
    cudaStatus = cudaMemcpy(dev_a, a, size * sizeof(int), cudaMemcpyHostToDevice);
    if (cudaStatus != cudaSuccess) {
        // handle error
    }
Kernels do not return an error, but if one is raised it can be queried using the cudaGetLastError() function:
    addKernel<<<1, size>>>(dev_c, dev_a, dev_b);
    cudaStatus = cudaGetLastError();
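The two checks above can be combined into one small program. This is a minimal sketch under stated assumptions, not the lecture's own code: the helper check and the wrapper addWithCuda are hypothetical names, and the cudaDeviceSynchronize() call is an addition here so that errors raised while the kernel runs are also reported.

```cuda
// Illustrative sketch of checking every runtime call and the kernel launch.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void addKernel(int *c, const int *a, const int *b)
{
    int i = threadIdx.x;
    c[i] = a[i] + b[i];
}

// Hypothetical helper: prints a readable message and returns false on error.
static bool check(cudaError_t status, const char *what)
{
    if (status == cudaSuccess) return true;
    fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(status));
    return false;
}

// Hypothetical wrapper: adds two host arrays on the GPU.
// Assumes size fits in a single thread block.
bool addWithCuda(int *c, const int *a, const int *b, int size)
{
    int *dev_a = NULL, *dev_b = NULL, *dev_c = NULL;
    size_t bytes = size * sizeof(int);

    // Short-circuit evaluation stops at the first failing call.
    bool good =
        check(cudaMalloc((void **)&dev_a, bytes), "cudaMalloc dev_a") &&
        check(cudaMalloc((void **)&dev_b, bytes), "cudaMalloc dev_b") &&
        check(cudaMalloc((void **)&dev_c, bytes), "cudaMalloc dev_c") &&
        check(cudaMemcpy(dev_a, a, bytes, cudaMemcpyHostToDevice), "copy a") &&
        check(cudaMemcpy(dev_b, b, bytes, cudaMemcpyHostToDevice), "copy b");

    if (good) {
        addKernel<<<1, size>>>(dev_c, dev_a, dev_b);
        good =
            check(cudaGetLastError(), "addKernel launch") &&         // launch errors
            check(cudaDeviceSynchronize(), "addKernel execution") && // runtime errors
            check(cudaMemcpy(c, dev_c, bytes, cudaMemcpyDeviceToHost), "copy c");
    }

    // cudaFree on a NULL pointer performs no operation, so this is safe
    // even when an allocation failed.
    cudaFree(dev_a);
    cudaFree(dev_b);
    cudaFree(dev_c);
    return good;
}

int main()
{
    const int size = 5;
    const int a[size] = {1, 2, 3, 4, 5};
    const int b[size] = {10, 20, 30, 40, 50};
    int c[size] = {0};

    if (!addWithCuda(c, a, b, size)) return 1;

    for (int i = 0; i < size; ++i) printf("%d ", c[i]);
    printf("\n");
    return 0;
}
```

Returning a flag rather than exiting inside the helper lets the caller release device memory before deciding how to recover.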