The user @karlphillip already provided an excellent answer, but I wanted to complement it with some comments and suggestions, and the comment field was too small. So I decided to add my own answer.
The idea of using thresholding makes perfect sense, especially for examples like those given in Karl's answer. However, the person who asked the question (@user6357) did not provide even a single image of their problem domain, so it is difficult to offer really concrete suggestions. So, unfortunately, I agree with Karl that the question is quite poorly posed. But regardless of the quality of the question, I think the knowledge gathered here can help the community, so I preferred to contribute rather than simply flag the question.
If the images captured by the camera do not have a background as distinct from red as the examples in the cited answer, thresholding alone may not give satisfactory results.
With that in mind, I see two alternatives:
1. Add to the thresholding (or any other segmentation method) a preprocessing step that considers only the areas of the image where there is movement.
If you have control over how the images are captured from the camera (this is also something you did not mention in the question) and you are able to take two images in sequence, you can compare them to find the differences: a mere subtraction of the pixel values between the two images will produce a new image whose pixels are nonzero in the regions where motion occurs. In your question you mention "fires", and the flicker of a fire naturally causes movement. If this is the case, it should be possible to pre-segment the image so that only the areas where movement occurs are kept, and then apply the pipeline suggested by Karl to those areas.
There are other solutions, with more complex classifiers, that also use movement as well as texture analysis to detect fires. This article is a great example that you may find useful.
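The subtraction described above can be sketched in plain C++ as follows. This is only a minimal illustration under some assumptions of mine: the `GrayFrame` struct, the `motionMask` function and the `minDifference` parameter are hypothetical names, the frames are assumed to be 8-bit grayscale with identical dimensions, and a small tolerance is used instead of a strict "nonzero" test because sensor noise rarely leaves two frames exactly equal. With OpenCV, the same idea is usually expressed with `cv::absdiff` followed by `cv::threshold`.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <stdexcept>
#include <vector>

// Hypothetical container for an 8-bit grayscale frame stored row by row.
struct GrayFrame {
    std::size_t width;
    std::size_t height;
    std::vector<std::uint8_t> pixels;  // expected size: width * height
};

// Builds a binary mask: 255 where the two frames differ by at least
// minDifference gray levels (assumed motion), 0 everywhere else.
GrayFrame motionMask(const GrayFrame& previous, const GrayFrame& current,
                     std::uint8_t minDifference) {
    if (previous.width != current.width || previous.height != current.height ||
        previous.pixels.size() != current.pixels.size()) {
        throw std::invalid_argument("frames must have identical dimensions");
    }

    GrayFrame mask{current.width, current.height,
                   std::vector<std::uint8_t>(current.pixels.size(), 0)};

    for (std::size_t i = 0; i < current.pixels.size(); ++i) {
        // Work in int so the subtraction cannot wrap around.
        const int diff = std::abs(static_cast<int>(current.pixels[i]) -
                                  static_cast<int>(previous.pixels[i]));
        if (diff >= static_cast<int>(minDifference)) {
            mask.pixels[i] = 255;
        }
    }
    return mask;
}
```

The resulting mask can then be used to restrict the thresholding to the pixels marked 255, so that static red objects in the background are ignored.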
2. Use a more complex classifier.
Suppose you have no image sequence, i.e. only a single image on which to perform the detection, and that thresholding is insufficient because of background variations. In this case, you can try more robust classifiers.
Since you're using C++ (and I've already suggested OpenCV, which is a fantastic library), I'd suggest using the Cascade Classifier. You will need to train your own classifier with positive example images (where there are fire outbreaks) and negative ones (where there are no fire outbreaks), and this tutorial is really good for learning how to do that (it detects bananas in images, but the principle is the same - just use the correct examples, hehehe).
This classifier works quite cleverly: during training it "learns" from the example images the light intensity values for different types of features that indicate when the object of interest is likely to be present in a certain area of the image. Because these features are easy to scale, it is easy to search the image for the same object at different sizes (scales), and in practice that is what the algorithm does: it scans the image with windows of gradually larger size until the classifier reports an object in one of the search windows.
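To give a feel for why such features scale so easily, here is a rough sketch of one two-rectangle intensity feature computed over an integral image (summed-area table). It is not OpenCV's implementation, and it does not include the training or the cascade itself; the names `IntegralImage`, `buildIntegral` and `edgeFeature` are hypothetical, and the image is assumed to be 8-bit grayscale stored row by row.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Summed-area table with one extra row and column of zeros, so that
// rectangle sums need no special handling at the image borders.
struct IntegralImage {
    std::size_t width;
    std::size_t height;
    std::vector<long long> sums;  // size: (width + 1) * (height + 1)

    long long at(std::size_t x, std::size_t y) const {
        return sums[y * (width + 1) + x];
    }

    // Sum of the pixels in the rectangle [x, x + w) x [y, y + h).
    long long rectSum(std::size_t x, std::size_t y,
                      std::size_t w, std::size_t h) const {
        return at(x + w, y + h) - at(x, y + h) - at(x + w, y) + at(x, y);
    }
};

IntegralImage buildIntegral(const std::vector<std::uint8_t>& pixels,
                            std::size_t width, std::size_t height) {
    IntegralImage ii{width, height,
                     std::vector<long long>((width + 1) * (height + 1), 0)};
    for (std::size_t y = 0; y < height; ++y) {
        long long rowSum = 0;  // running sum of the current row
        for (std::size_t x = 0; x < width; ++x) {
            rowSum += pixels[y * width + x];
            ii.sums[(y + 1) * (width + 1) + (x + 1)] =
                ii.sums[y * (width + 1) + (x + 1)] + rowSum;
        }
    }
    return ii;
}

// Two-rectangle "edge" feature inside a square window of the given size
// whose top-left corner is (x, y): left half minus right half.
// The caller must keep the window inside the image.
long long edgeFeature(const IntegralImage& ii, std::size_t x, std::size_t y,
                      std::size_t size) {
    const std::size_t half = size / 2;
    return ii.rectSum(x, y, half, size) - ii.rectSum(x + half, y, half, size);
}
```

Each rectangle sum costs four table lookups regardless of the window size, so evaluating the same feature at a larger scale only means calling `edgeFeature` with a larger `size`. That is what makes the search over gradually larger windows affordable.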
The features that OpenCV uses in existing implementations are those of the following image, which I believe should be sufficient for your type of problem.
The description on the OpenCV site itself is pretty good, but you do not have to fully understand the idea behind this classifier to use it. The quality of the detection depends mostly on the quality of the samples you provide. In the tutorial I cited, the example bananas are all horizontal, so the resulting classifier may not be very robust for images in which the bananas appear vertically. Similar variations can influence your results, so just be aware that you may have to retrain with more examples.