gpgpu-computing2.blogspot.com
Confessions of a Speed Junkie (Resources): CUDA Zone
http://gpgpu-computing2.blogspot.com/2009/08/cuda-zone.html
Confessions of a Speed Junkie (Resources). Tuesday, August 11, 2009. Nvidia has put together a web portal for CUDA developers called the CUDA Zone. This site contains documentation, software, examples of what others are doing with CUDA, tutorials, and the official CUDA forums. http://www.nvidia.com/object/cuda_home.html. Posted by Dr Zaius.
gpgpu-computing4.blogspot.com
Confessions of a Speed Junkie (Code Examples): Matrix Multiplication 2 (OpenCL)
http://gpgpu-computing4.blogspot.com/2009/09/matrix-multiplication-2-opencl.html
Confessions of a Speed Junkie (Code Examples). Monday, September 28, 2009. Matrix Multiplication 2 (OpenCL). I will assume that you have gone through the CUDA Matrix Multiplication 2 example and understand the conceptual changes that we will be making to our OpenCL kernel. All we really need to do is express our kernel from CUDA Matrix Multiplication 2 in terms of OpenCL and slightly modify our main program from OpenCL Matrix Multiplication 1. Main program (Listing 1). Multiply two matrices A * B = C.
gpgpu-computing1.blogspot.com
Confessions of a Speed Junkie (Products): Tesla S1070
http://gpgpu-computing1.blogspot.com/2009/08/tesla-s1070.html
Confessions of a Speed Junkie (Products). Tuesday, August 11, 2009. With the world’s first teraflop many-core processor, the NVIDIA Tesla™ S1070 computing system speeds the transition to energy-efficient parallel computing. With 960 processor cores and a standard C compiler that simplifies application development, Tesla S1070 scales to solve the world’s most important computing challenges, more quickly and accurately. For more information, see http://www.nvidia.com/object/product_tesla_s1070_us.html.
gpgpu-computing4.blogspot.com
Confessions of a Speed Junkie (Code Examples): CUDA Program Structure
http://gpgpu-computing4.blogspot.com/2009/08/cuda-program-structure.html
Confessions of a Speed Junkie (Code Examples). Tuesday, August 25, 2009. CUDA’s parallel programming model is designed to overcome the many challenges of parallel programming while providing a quick learning curve for programmers familiar with C. At its core are three abstractions: a hierarchy of thread groups, shared memory, and thread synchronization. These abstractions are exposed to the programmer via a small set of language extensions. C for CUDA provides a minimal set of extensions to the C languag...
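The three abstractions named in this snippet can be illustrated without a GPU. What follows is a rough sequential sketch in plain C, not CUDA code and not the post's own listing: each "block" copies its slice of the input into a local array that stands in for shared memory, then reduces it to one sum. The names `BLOCK_DIM`, `blockSum`, and `sumPerBlock` are hypothetical, and the comments mark where a real kernel would call `__syncthreads()`.

```c
#include <assert.h>
#include <stddef.h>

#define BLOCK_DIM 4  /* threads per block; assumed to be a power of two */

/* Emulates, on the CPU, the work of one thread block:
   sum the BLOCK_DIM inputs that belong to block `blockIdx`. */
float blockSum(const float* in, int blockIdx)
{
    float shared[BLOCK_DIM];  /* stands in for per-block shared memory */

    /* Phase 1: each "thread" loads the element at its global index,
       blockIdx * BLOCK_DIM + threadIdx. */
    for (int threadIdx = 0; threadIdx < BLOCK_DIM; ++threadIdx)
        shared[threadIdx] = in[blockIdx * BLOCK_DIM + threadIdx];
    /* A real kernel needs __syncthreads() here so every load is
       finished before any thread reads a neighbour's slot. */

    /* Phase 2: tree reduction, halving the active threads each step. */
    for (int stride = BLOCK_DIM / 2; stride > 0; stride /= 2) {
        for (int threadIdx = 0; threadIdx < stride; ++threadIdx)
            shared[threadIdx] += shared[threadIdx + stride];
        /* ...and another __syncthreads() after each step. */
    }
    return shared[0];
}

/* Emulates the grid: one partial sum per block. */
void sumPerBlock(const float* in, float* out, int numBlocks)
{
    assert(in != NULL && out != NULL && numBlocks >= 0);
    for (int b = 0; b < numBlocks; ++b)
        out[b] = blockSum(in, b);
}
```

On a GPU the two inner loops over `threadIdx` disappear, because every thread runs the loop body once in parallel; the end of each loop here is exactly where the barrier would sit.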
gpgpu-computing1.blogspot.com
Confessions of a Speed Junkie (Products): SciComp
http://gpgpu-computing1.blogspot.com/2009/08/scicomp.html
Confessions of a Speed Junkie (Products). Monday, August 10, 2009. SciComp is a leading provider of scientific computing solutions to the financial markets. Driven by customer needs in a constantly evolving marketplace, SciComp provides unique software solutions, products, expertise, and support services for the pricing and risk management of derivative instruments. For more information, see http://www.scicomp.com/parallel_computing/GPU_OpenMP. Posted by Dr Zaius.
gpgpu-computing.blogspot.com
Confessions of a Speed Junkie (Overview): Change... It's a good thing
http://gpgpu-computing.blogspot.com/2009/08/change-its-good-thing.html
Confessions of a Speed Junkie (Overview). Friday, August 14, 2009. Change... It's a good thing. So now you know what GPGPU is, what it is not, the magnitude of the performance gains you can expect to get now and in the near future, and a bit of history as to how we got here. But how can you effect the necessary change in your organization to benefit from this new technology? With all of these impediments to change, how do you get the ball rolling? Live Free or Die.
gpgpu-computing1.blogspot.com
Confessions of a Speed Junkie (Products): AccelerEyes (Jacket)
http://gpgpu-computing1.blogspot.com/2009/08/accelereyes-jacket.html
Confessions of a Speed Junkie (Products). Monday, August 10, 2009. AccelerEyes' first product, Jacket, is used by customers across all major HPC industries, such as the automotive, financial, medical, and seismic industries. Further, Jacket's Graphics Toolbox enables true Visual Computing, seamlessly merging the compute power of CUDA with OpenGL visualizations. AccelerEyes plans to adapt and expand Jacket for other hardware and software platforms. Jacket is not another GPU API, nor is it simply a collect...
gpgpu-computing4.blogspot.com
Confessions of a Speed Junkie (Code Examples): Matrix Multiplication 1 (CUDA)
http://gpgpu-computing4.blogspot.com/2009/08/matrix-multiplication-1.html
Confessions of a Speed Junkie (Code Examples). Thursday, August 27, 2009. Matrix Multiplication 1 (CUDA). I will assume that you have already downloaded and installed the appropriate CUDA driver, toolkit, and SDK from Nvidia. I am using version 2.2, but what we are covering should work with any version. If you don’t already have the software and driver installed, go to http://www.nvidia.com/object/cuda_get.html. Main program (Listing 1). Multiply two matrices A * B = C. void randomInit(float* data, int size).
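The snippet cuts off at the `randomInit` signature, so the post's actual listing is not shown here. As a minimal sketch of what such a main program typically relies on, the C below fills a buffer with pseudo-random floats and computes a CPU reference product to check a GPU result against. Only the `randomInit` signature comes from the snippet; its body, the `matrixMulCPU` helper, and the row-major layout are assumptions.

```c
#include <assert.h>
#include <stdlib.h>

/* Fill `data` with `size` pseudo-random floats in [0, 1].
   Signature from the snippet; the body is an assumed implementation. */
void randomInit(float* data, int size)
{
    assert(data != NULL && size >= 0);
    for (int i = 0; i < size; ++i)
        data[i] = rand() / (float)RAND_MAX;
}

/* Hypothetical CPU reference: C = A * B with row-major storage.
   A is hA x wA, B is wA x wB, C is hA x wB. */
void matrixMulCPU(float* C, const float* A, const float* B,
                  int hA, int wA, int wB)
{
    assert(C != NULL && A != NULL && B != NULL);
    for (int row = 0; row < hA; ++row) {
        for (int col = 0; col < wB; ++col) {
            float sum = 0.0f;
            /* Dot product of row `row` of A with column `col` of B. */
            for (int k = 0; k < wA; ++k)
                sum += A[row * wA + k] * B[k * wB + col];
            C[row * wB + col] = sum;
        }
    }
}
```

A GPU kernel for the same product usually assigns one output element, that is one (row, col) pair of the two outer loops, to each thread and keeps only the inner loop over `k`.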
gpgpu-computing2.blogspot.com
Confessions of a Speed Junkie (Resources): OpenCL Tutorials
http://gpgpu-computing2.blogspot.com/2009/08/opencl-tutorials.html
Confessions of a Speed Junkie (Resources). Thursday, August 13, 2009. AMD has put together a very basic OpenCL tutorial using the proposed C bindings to OpenCL. This tutorial can be found at http://developer.amd.com/gpu/ATIStreamSDK/pages/TutorialOpenCL.aspx. AMD is working with Mindshare to develop a 3-day class on OpenCL programming. The class is not yet available but should be soon. For more information, see http://www.mindshare.com/learn/? Posted by Dr Zaius.
gpgpu-computing4.blogspot.com
Confessions of a Speed Junkie (Code Examples): OpenCL Program Structure
http://gpgpu-computing4.blogspot.com/2009/09/opencl-program-structure.html
Confessions of a Speed Junkie (Code Examples). Thursday, September 17, 2009. As of today there have been no OpenCL implementations released. AMD and Nvidia are both working on implementations for their processors, but they are both still in beta. All of the following examples have been built and tested against Nvidia’s beta OpenCL. So how much more complicated is OpenCL? With OpenCL we have an NDRange (N-Dimensional Range) of work groups that contain multiple work items. Posted by Dr Zaius.
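The NDRange arithmetic can be made concrete with a little host-side C. This is an illustrative sketch, not OpenCL API code: `numGroups`, `globalId`, and `walkNDRange2D` are hypothetical helpers that mirror what `get_num_groups`, `get_global_id`, and a 2-D kernel launch compute. It assumes a zero global offset and a global size that divides evenly by the work-group size.

```c
#include <assert.h>
#include <stddef.h>

/* Number of work groups along one dimension of an NDRange.
   Assumes the global size is a multiple of the local size. */
size_t numGroups(size_t globalSize, size_t localSize)
{
    assert(localSize > 0 && globalSize % localSize == 0);
    return globalSize / localSize;
}

/* Global ID of a work item along one dimension (zero global offset):
   the value get_global_id() would return inside a kernel. */
size_t globalId(size_t groupId, size_t localSize, size_t localId)
{
    assert(localId < localSize);
    return groupId * localSize + localId;
}

/* Visit every work item of a 2-D NDRange once, group by group.
   visits[y * globalX + x] is incremented for each item (x, y).
   Returns the total number of work items visited. */
size_t walkNDRange2D(size_t globalX, size_t globalY,
                     size_t localX, size_t localY,
                     unsigned char* visits)
{
    size_t groupsX = numGroups(globalX, localX);
    size_t groupsY = numGroups(globalY, localY);
    size_t count = 0;

    for (size_t gy = 0; gy < groupsY; ++gy)
        for (size_t gx = 0; gx < groupsX; ++gx)
            for (size_t ly = 0; ly < localY; ++ly)
                for (size_t lx = 0; lx < localX; ++lx) {
                    size_t x = globalId(gx, localX, lx);
                    size_t y = globalId(gy, localY, ly);
                    visits[y * globalX + x]++;
                    ++count;
                }
    return count;
}
```

For readers coming from CUDA, a work group plays the role of a thread block and a work item the role of a thread, so `globalId` is the same `blockIdx * blockDim + threadIdx` calculation under different names.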