The approach you described is in fact the simplest. When rendering a scene, you can use one of two lighting models: local and global.
Local model
In the local model, each object is rendered independently of the others. This allows each of them - as well as each of their faces, their vertices, etc. - to be rendered in parallel, which modern GPUs do massively (thousands of floating-point operations are performed simultaneously). This allows very complex scenes to be rendered in (approximately) real time, which is necessary when interactivity is required (games, for example).
The price paid is that each object is "ignorant" of the others. There is no interaction between them, whether reflection (the light hits the first object, then the second, and ends up at the camera), refraction (the light hits the first, is deflected by the second, and ends up at the camera), shadows (the light would hit the object, but is blocked by the second, and does not reach the camera), etc. This creates unrealistic scenes, and it ends up being necessary to use "tricks" to improve their appearance.
One of these tricks is to remove the reflective object from the scene, render the scene from its position in a variety of directions (forming a cube known as the "environment map" or sometimes "skybox") and use the rendering result as a texture for the reflective object. So when this object is rendered in the final pass (the one the user will actually see), it will show a reflection of the rest of the scene, even though at that moment it is being rendered without any "knowledge" of the other objects in the scene.
This technique is known as "render to texture". If you only have one reflective object in the scene, things are fairly simple, and these additional renderings do not cost that much either (in large part because the rendering result stays as a texture in the GPU itself, avoiding the overhead of transmitting it to and from main memory). That is why it is so widely used in racing games, for example, where the player's car is the one most in focus and deserves a better rendering than the rest of the scene (including the other cars). But if you have more than one reflective object, things get complicated, and you end up having to use an approximate model that looks okay at first glance, but if you look carefully you can see the inaccuracies (example: http://threejs.org/examples/#webgl_materials_cubemap_dynamic2 ; requires a modern GPU; it does not work on mine).
And if there are too many objects, there is no way out but to have them ignore each other and only reflect the skybox in a generic way. In this example you can see that the balls only reflect the scenery around them, not each other, which is far from a faithful representation of reality...
Global model
In the global model, each object takes into account all the others in determining its appearance, resulting in a much more realistic scene. The main technique used is raytracing, which roughly follows each ray of light that leaves each light source, observes which object(s) it hits, whether it is reflected or not, refracted or not (and in more advanced implementations, scattered or not), etc., until it finally arrives at the camera, contributing to the color of that pixel on the screen.
(This is a simplification; in practice reverse raytracing is used, where one starts from the camera and tries to reach some light source, but this is a bit harder to explain.)
The price you pay is a much larger rendering time (example), plus the difficulty of taking advantage of GPU resources (which are more optimized for real-time rendering). That is why this technique is most used when time is not critical, for example in static images (widely used in architecture) or in films (it can take hours to render a single frame, which afterwards only needs to be played back).
Note: I'm sorry I do not have full OpenGL and C examples, but as you commented they would be a bit extensive, and you said in the question that you have already found the relevant code in tutorials, so I am just giving you an overview here.