More than just "holding temporary values", the purpose of a buffer is to group into larger batches data that would be too small to handle efficiently one piece at a time. When a process (usually some form of communication) has a fixed overhead for each unit of data it handles, the larger each unit is, the less that overhead weighs against the useful work the process does.
The best examples I can give involve communication, but since we are talking about C I will use the example of reading a file from a hard drive. Let's say that for some reason you want to read a file character by character and perform some relatively time-consuming operation on each one. What happens when you ask the computer to read a character from the file?
The hard disk, if it is stopped, begins to rotate. Most of the time, it will already be running;
Depending on where the data is in relation to the read head, the disk may have to rotate anywhere from nothing to an almost complete revolution before it reaches the character you want. Once it gets there, it copies the character to memory;
Once in memory, your program can access it. It does something with it, and asks for the second character of the file;
It turns out that the disk did not "brake" on the specific character you read: it kept spinning, because it would be impractical to stop at the exact point of the last read (first, because it might be physically impossible given its speed; second, because even if it were possible it would waste a lot of energy and/or wear out the disk's materials; and third, because other programs running on the computer may also be waiting to access data on the disk);
The result is that the character you want to read now, the one right after what you have just read, is already "behind": the disk would have to go almost full circle before the read head returned to the point you want to read, and your program would run only as fast as your hard drive spins (and not at a speed closer to your processor's clock).
What is the solution? Instead of reading a single character from the file, you read several at a time and save them in a temporary area in memory, so that they are already there the next time you want another character. So while the hard drive is spinning it copies several characters at once into memory - in a single fraction of a spin - and your program consumes that data from memory with the latency of memory (and its caches), not with hard disk latency.
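To make this concrete, here is a minimal sketch in C of reading "several at a time". The names `CharReader`, `reader_init`, `reader_next` and the 4096-byte size are my own choices for illustration, not anything standard. Keep in mind that C's `FILE` streams already do something like this internally, so in real code `fgetc` is usually enough; the sketch only makes the mechanism visible.

```c
#include <assert.h>
#include <stdio.h>

#define BUF_SIZE 4096  /* arbitrary size, chosen only for this sketch */

/* Hypothetical helper type: a file plus a chunk of its data kept in memory. */
typedef struct {
    FILE *fp;
    unsigned char data[BUF_SIZE];
    size_t len;             /* number of bytes currently held in data */
    size_t pos;             /* index of the next byte to hand out */
    unsigned long refills;  /* how many times we had to go back to the file */
} CharReader;

void reader_init(CharReader *r, FILE *fp) {
    assert(r != NULL && fp != NULL);
    r->fp = fp;
    r->len = 0;
    r->pos = 0;
    r->refills = 0;
}

/* Returns the next character as an int, or EOF when nothing is left. */
int reader_next(CharReader *r) {
    if (r->pos == r->len) {
        /* Buffer is empty: one "expensive" request fetches up to BUF_SIZE bytes. */
        r->len = fread(r->data, 1, BUF_SIZE, r->fp);
        r->pos = 0;
        if (r->len == 0) {
            return EOF;
        }
        r->refills++;
    }
    /* Common case: the character is already in memory. */
    return r->data[r->pos++];
}
```

Calling `reader_next` in a loop until it returns `EOF` gives you the file one character at a time, but the file itself is only touched once per 4096 characters; every other call is served straight from memory.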
Whether the buffer lives on the stack or the heap is irrelevant [for this purpose], as explained by bigown. What matters is that you took data from somewhere - the disk, an external storage device, a socket - and "queued" that data in memory so you can consume it at the speed best suited to your particular use, regardless of how that data is best handled at its source. Similarly, you can use a buffer to hold the data that you produce, and only send it to its final destination once there is enough of it to be handled efficiently by that destination.
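The same idea in the output direction could look roughly like the sketch below. Again, `ByteWriter`, `writer_init`, `writer_put`, `writer_flush` and the 4096-byte size are hypothetical names and values I picked for illustration, and `FILE` streams already buffer writes on their own.

```c
#include <assert.h>
#include <stdio.h>

#define OUT_SIZE 4096  /* arbitrary size, chosen only for this sketch */

/* Hypothetical helper type: bytes waiting in memory to be sent to a file. */
typedef struct {
    FILE *fp;
    unsigned char data[OUT_SIZE];
    size_t len;             /* number of bytes waiting in data */
    unsigned long flushes;  /* how many times data was actually sent on */
} ByteWriter;

void writer_init(ByteWriter *w, FILE *fp) {
    assert(w != NULL && fp != NULL);
    w->fp = fp;
    w->len = 0;
    w->flushes = 0;
}

/* Sends everything accumulated so far. Returns 0 on success, -1 on error. */
int writer_flush(ByteWriter *w) {
    if (w->len == 0) {
        return 0;  /* nothing waiting, nothing to do */
    }
    if (fwrite(w->data, 1, w->len, w->fp) != w->len) {
        return -1;
    }
    w->len = 0;
    w->flushes++;
    return 0;
}

/* Queues one byte; only goes to the destination when the buffer is full. */
int writer_put(ByteWriter *w, unsigned char c) {
    if (w->len == OUT_SIZE && writer_flush(w) != 0) {
        return -1;
    }
    w->data[w->len++] = c;
    return 0;
}
```

The one thing to remember with a buffer like this is to call `writer_flush` once more when you are done producing data, otherwise the last partial batch never leaves memory.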