Application

Here you can find information regarding this year's application, including source, installation instructions and background information.

Application background

Congratulations! If you are reading this it means you have struggled your way through the CGRA assignment and haven't gone completely mad just yet. We realize the ECA assignments can be quite stressful for our students, so to give you a bit of a break the GPU assignment this year is all about becoming Zen!

The assignment is inspired by the amazing robotfindskitten Zen simulation. As the name suggests, the game involves both a robot and a kitten, and the goal is for robot to find kitten. The player assumes the role of robot and has to filter through many non-kitten items to find kitten.

Actual gameplay footage

Image of actual gameplay

Undoubtedly the idea of the creator of the game back in 1997 was to relax and reach a state of Zen while you wander around trying to find kitten. However, the year is 2018 now and "ain't nobody got time for that!". As a matter of fact, personally I get mildly infuriated trying to make my way through all the non-kitten items, when I came here just to watch kittens and relax. Luckily, thanks to recent advancements in Artificial Intelligence, we now have the technology to automate the tedious task of filtering through non-kitten items! This is where you come in.

For this assignment you are provided with a C implementation of a state-of-the-art convolutional neural network. This network is trained to distinguish 1000 different types of objects in images, so it can help us automatically separate kittens from non-kittens! However, the current C implementation is quite slow, so it still takes some time to go through a bunch of non-kitten images and "ain't nobody got time for that!". Therefore, you are tasked with harnessing the power of GPUs to speed up this process, so we can find more kitten more faster.

More background on the application can be found in the Introduction Slides.

Side Mission

The provided code can scan images for kittens, but the images still have to be supplied to the code manually. If you have a creative idea for a tool that can crawl the web for kittens (there should be plenty of kittens out there on the interwebs) and connect it to the provided code for a truly automated robotfindskitten application, feel free to implement it and send us the code as part of your code submission. No points can be awarded for this, but you will make your assignment reviewer more Zen :)

Source for the application

You can access the source directly from GitLab. Be sure to also check out the README at the bottom of the page, which has useful information on usage.

Installation and usage

To obtain the kitten-finding code, execute the following command:

git clone https://gitlab.com/ecatue/gpu2018_code.git

After the cloning process completes you should end up with a directory named "gpu2018_code". To compile and execute the code, change into this directory and issue the make command:

make

To test on more images you can execute:

make check_all

For more information on how to use the provided code, please refer to the README file in the directory (or on GitLab for the nicely rendered version).

From here on you can modify the code as you please. Running the make command again will rebuild the required files and automatically check the classification against an expected value.

If you want to extend the code with either OpenCL or CUDA, you can simply add these files to the directory (.cu for CUDA, and .cl for OpenCL) and the Makefile should detect them automatically, without further modification. In the provided code, a preprocessing function has already been ported to the GPU using CUDA as an example. The CUDA code for this function can be found in "preprocessing.cu". Note that although the code is correct, it is in no way optimized yet.

As a starting point we suggest you have a look into this code (preprocessing.cu) to see how memory is allocated on the GPU, how data can be transferred between CPU and GPU, and how you can invoke a kernel on the GPU.
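For orientation, the three steps mentioned above (allocate, transfer, launch) can be sketched as follows. This is a minimal stand-alone illustration and is not taken from preprocessing.cu: the kernel `scale`, the helper `scale_on_gpu`, the `CUDA_CHECK` macro and the block size of 256 are all hypothetical names and choices made for this example only.

```cuda
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Abort with a readable message if a CUDA runtime call fails.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "CUDA error '%s' at %s:%d\n",             \
                    cudaGetErrorString(err_), __FILE__, __LINE__);    \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Hypothetical kernel: each thread scales exactly one element.
__global__ void scale(const float *in, float *out, float factor, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)  // guard: the last block may be only partially used
        out[i] = in[i] * factor;
}

// Hypothetical host wrapper: allocate, copy in, launch, copy out, free.
void scale_on_gpu(const float *h_in, float *h_out, float factor, int n)
{
    const size_t bytes = (size_t)n * sizeof(float);
    float *d_in = NULL;
    float *d_out = NULL;

    // 1. Allocate memory on the GPU (device).
    CUDA_CHECK(cudaMalloc((void **)&d_in, bytes));
    CUDA_CHECK(cudaMalloc((void **)&d_out, bytes));

    // 2. Transfer the input from CPU (host) to GPU (device).
    CUDA_CHECK(cudaMemcpy(d_in, h_in, bytes, cudaMemcpyHostToDevice));

    // 3. Invoke the kernel with enough blocks to cover all n elements.
    const int threads = 256;  // assumed block size, not tuned
    const int blocks = (n + threads - 1) / threads;
    scale<<<blocks, threads>>>(d_in, d_out, factor, n);
    CUDA_CHECK(cudaGetLastError());  // catches launch configuration errors

    // 4. Transfer the result back; this copy waits for the kernel to finish.
    CUDA_CHECK(cudaMemcpy(h_out, d_out, bytes, cudaMemcpyDeviceToHost));

    CUDA_CHECK(cudaFree(d_in));
    CUDA_CHECK(cudaFree(d_out));
}

int main(void)
{
    const int n = 1000;
    float h_in[n];
    float h_out[n];
    for (int i = 0; i < n; ++i)
        h_in[i] = (float)i;

    scale_on_gpu(h_in, h_out, 0.5f, n);

    // Verify the round trip on the host.
    for (int i = 0; i < n; ++i)
        assert(h_out[i] == h_in[i] * 0.5f);
    printf("all %d elements verified\n", n);
    return 0;
}
```

Checking the return value of every runtime call, as the macro does here, is a design choice that tends to save debugging time, because a failed allocation or copy otherwise only shows up later as wrong results.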

The ported preprocessing function, combined with the cookbooks and of course the lecture on GPUs, should be sufficient to get you started with this assignment. If you encounter any problems, we encourage you to first try searching the provided resources. If you are still stuck after this, or you have a question specific to this assignment, please check the oncourse forum.

Notes on profiling GPU code:

  • To profile your GPU code you can use nvprof, NVIDIA's command-line profiler (the counterpart of the Visual Profiler), which is also installed on our servers.
  • The first function call to the GPU will take about 0.6 seconds, and acts as a kind of warm-up for the driver. If you do profiling within your application, please take this into account. If the first call occurs within a loop, the first iteration of this loop may take significantly longer than the others. We suggest making a 'dummy' call to the GPU at the beginning of the application to avoid measuring this effect.
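The warm-up advice can be sketched as below. This is one possible approach and not part of the provided code: `warm_up_gpu`, `empty_kernel` and `time_launch_ms` are hypothetical names, and using `cudaFree(0)` followed by an empty kernel launch as the 'dummy' call is a common idiom that we assume is sufficient on the course servers.

```cuda
#include <cassert>
#include <chrono>
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel that does nothing; used only to trigger and time launches.
__global__ void empty_kernel(void) {}

// 'Dummy' call: force driver/context initialisation before any timing starts.
// Assumption: cudaFree(0) plus one empty launch absorbs the one-time start-up cost.
cudaError_t warm_up_gpu(void)
{
    cudaError_t err = cudaFree(0);
    if (err != cudaSuccess)
        return err;
    empty_kernel<<<1, 1>>>();
    return cudaDeviceSynchronize();
}

// Wall-clock time in milliseconds of one empty launch, including synchronisation.
double time_launch_ms(void)
{
    auto start = std::chrono::steady_clock::now();
    empty_kernel<<<1, 1>>>();
    cudaDeviceSynchronize();  // launches are asynchronous, so wait before stopping the clock
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

int main(void)
{
    // Time the warm-up separately so it does not pollute later measurements.
    auto start = std::chrono::steady_clock::now();
    cudaError_t err = warm_up_gpu();
    auto stop = std::chrono::steady_clock::now();
    assert(err == cudaSuccess);
    printf("warm-up: %.3f ms\n",
           std::chrono::duration<double, std::milli>(stop - start).count());

    // These iterations should now take roughly equal time.
    for (int i = 0; i < 3; ++i)
        printf("launch %d: %.3f ms\n", i, time_launch_ms());
    return 0;
}
```

When profiling with nvprof instead of timing inside the application, the start-up cost is still incurred, but it is reported separately from your own kernels; a typical invocation is nvprof followed by the path to your executable.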