Why is offloading work to the GPU a good idea?

6

I follow a few forums on the internet, and I've noticed that lately there has been a lot of talk about offloading tasks to the GPU.

Mozilla has implemented a new browser engine called Servo. At its core, it offloads everything it can to the GPU.

As far as I can tell, the GPU is nothing more than an additional processor dedicated to rendering graphics.

  • How can the GPU help with processing?
  • This offloading usually happens through libraries such as OpenMP. What exactly do these libraries help with?
  • Is there any overhead in transferring this processing to the GPU? Is that overhead worth paying in order to process on the GPU?
asked by anonymous 23.02.2017 / 20:57

1 answer

7

Basically: the more processing power, the better. It's that simple. You do not stop using the CPU in order to use the GPU; you use both. A number of historical and market factors have contributed to an incredible advance in the processing power of GPUs and to growing demand for the kinds of tasks GPUs were designed for.

  

As far as I can tell, the GPU is nothing more than an additional processor dedicated to rendering graphics.

No. GPUs were designed with graphics in mind, because the intricate mathematical functions associated with perspective and texture interpolation were very demanding for CPUs. Over time, each of these operations became simple relative to the computational power of CPUs, but the number of operations grew a lot (especially with the increasing resolution of games), so GPUs evolved into large parallel processing arrays (in a game it is common to apply the same operation to the millions of pixels on the screen hundreds of times per second). This massively parallel processing began to attract attention for other tasks, and some hacks began to be used to simulate physics problems (for example) as texture pixels. Companies noticed and embraced general-purpose processing (with the introduction of programmable shaders as a major milestone), and today the GPU is really a large set of cores that can also be used for graphics.

  

How can the GPU help with processing?

The GPU has hundreds or even thousands of cores (the GTX 960 has 1024 processing cores), far more than the CPU of any home PC. These cores are highly specialized for unconditional, straight-line operations, that is, for processing data without branches in the execution flow, especially when acting on large memory regions (where the parallelism of the cores can be better exploited), which is the common scenario when handling large volumes of media data.

  

This offloading usually happens through libraries such as OpenMP. What exactly do these libraries help with?

Although GPUs have great computational power, they are still CPU-controlled peripherals. These libraries provide communication routines for sending commands and data to the computer's peripherals (including the GPU). They link the program to the drivers and expose the functions needed for general-purpose processing, just as graphics libraries expose OpenGL functionality, for example.

  

Is there any overhead in transferring this processing to the GPU? Is that overhead worth paying in order to process on the GPU?

Yes and yes (in well-designed programs). All communication between peripherals counts as overhead; even communication between cores of the same CPU has overhead. The data needs to be formatted to the expected type and protocol and then queued for transfer. Recent advances in bus and RAM technology have reduced the associated latency, making real-time use possible, and have also increased the amount of data that can be transmitted, so the preparation overhead is outweighed by the gain in results (the technical terms for the preparation delay and the amount of results per unit of time are latency and throughput).
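To make the latency/throughput distinction concrete, here is a minimal sketch of a simple transfer-time model: a fixed delay plus the size divided by the throughput. The function name and the link figures (10 microseconds, 10 GB/s) are hypothetical values chosen for illustration, not measurements of any real bus.

```python
def transfer_time(n_bytes, latency_s, throughput_bytes_per_s):
    """Time to move n_bytes: a fixed setup delay plus the time spent moving data."""
    return latency_s + n_bytes / throughput_bytes_per_s


# Hypothetical link (assumed numbers, not a real bus specification).
LATENCY_S = 10e-6     # 10 microseconds of fixed delay per transfer
THROUGHPUT = 10e9     # 10 GB/s sustained

small = transfer_time(4 * 1024, LATENCY_S, THROUGHPUT)           # 4 KiB
large = transfer_time(400 * 1024 * 1024, LATENCY_S, THROUGHPUT)  # 400 MiB

# Fraction of each transfer that is pure latency.
small_latency_share = LATENCY_S / small
large_latency_share = LATENCY_S / large

print(f"4 KiB:   {small_latency_share:.1%} of the time is latency")
print(f"400 MiB: {large_latency_share:.3%} of the time is latency")
```

Under these assumptions, a small transfer is almost entirely latency, while a large one is almost entirely limited by throughput, which is why sending data in large batches tends to pay off.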

An illustrative example (taking some liberties):

Let's say you have an array of 1024 numbers, and you want to perform a series of operations on them, such as adding a constant.

With an 8-core CPU, each core is responsible for processing 128 numbers, whereas on the GPU, with its 1024 cores, each core is responsible for only one number. Assuming the GPU runs at half the clock frequency and each operation takes 1 clock cycle, the CPU needs 128 cycles and the GPU needs 1 cycle at half the speed, which is a 64-fold advantage in processing speed.

The data has to be transmitted to the GPU, so the speedup may not pay off for such a simple operation. But suppose you want to add a number, multiply, compare with another array, take the cosine... a whole series of commands. Once the data is on the GPU, the CPU only needs to send the necessary commands (as with the shaders used in games) and is then free for other tasks the GPU is not specialized for, such as checking user input, communicating with other peripherals, and handling other programs, before requesting the result.
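The arithmetic behind that figure can be checked with a short sketch. It uses the same idealized assumptions as the example (one clock per operation, perfect work division, no transfer cost); the 3 GHz CPU clock is a hypothetical value, and only the 2:1 clock ratio matters for the result.

```python
import math


def time_for_one_pass(n_items, n_cores, clock_hz, cycles_per_op=1):
    """Idealized time to apply one operation to every item.

    Each core handles ceil(n_items / n_cores) items, one after another.
    """
    steps_per_core = math.ceil(n_items / n_cores)
    return steps_per_core * cycles_per_op / clock_hz


CPU_CLOCK_HZ = 3.0e9              # hypothetical CPU clock
GPU_CLOCK_HZ = CPU_CLOCK_HZ / 2   # "half the frequency", as in the example

cpu_time = time_for_one_pass(1024, 8, CPU_CLOCK_HZ)     # 128 steps per core
gpu_time = time_for_one_pass(1024, 1024, GPU_CLOCK_HZ)  # 1 step per core

speedup = cpu_time / gpu_time  # 128 / 2 = 64, up to floating-point rounding
print(f"speedup: {speedup:.0f}x")
```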
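The trade-off described here can be sketched as a break-even calculation: the transfer cost is paid once, and each additional operation on the GPU recovers part of it. All names and numbers below are hypothetical; times are in arbitrary units, with the 64:1 per-operation ratio borrowed from the example and a transfer cost of 500 units picked purely for illustration.

```python
def cpu_total(n_ops, cpu_time_per_op):
    """Total time if the CPU runs every operation itself."""
    return n_ops * cpu_time_per_op


def gpu_total(n_ops, gpu_time_per_op, transfer_cost):
    """Total time on the GPU: data movement is paid once, then n_ops fast passes."""
    return transfer_cost + n_ops * gpu_time_per_op


def break_even_ops(cpu_time_per_op, gpu_time_per_op, transfer_cost):
    """Smallest number of operations for which the GPU is strictly faster."""
    if gpu_time_per_op >= cpu_time_per_op:
        raise ValueError("the GPU never catches up if it is not faster per operation")
    n_ops = 1
    while gpu_total(n_ops, gpu_time_per_op, transfer_cost) >= cpu_total(n_ops, cpu_time_per_op):
        n_ops += 1
    return n_ops


# Assumed costs in arbitrary time units (not measurements).
CPU_PER_OP = 64
GPU_PER_OP = 1
TRANSFER_COST = 500

print(break_even_ops(CPU_PER_OP, GPU_PER_OP, TRANSFER_COST))
```

With these assumed costs, a single operation is not worth the transfer, but a chain of several operations on the same data is. The model ignores the fact that the CPU can do other work while the GPU is busy, which tilts things further in the GPU's favor.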

    
24.02.2017 / 02:47