This is very much a job for "actors", as found in Erlang, Clojure (through its agents, which are related but not identical), Scala, or Akka (which became the standard actor library for Scala and also offers a Java API).
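To make the actor idea concrete, here is a minimal sketch in plain Java, assuming nothing beyond the standard library. It is not Akka or Erlang code: the class `CounterActorSketch`, the `STOP` sentinel and the `runDemo` helper are hypothetical names invented for this illustration. The point it tries to show is that only one thread ever touches the state, and everyone else just sends messages to a mailbox.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical minimal "actor": one mailbox, one thread, private state.
class CounterActorSketch {
    // Sentinel message that tells the actor to stop (an assumption of this sketch).
    private static final int STOP = Integer.MIN_VALUE;

    private final BlockingQueue<Integer> mailbox = new LinkedBlockingQueue<>();
    private final Thread worker = new Thread(this::processMailbox);
    private long total = 0; // written only by the actor thread

    void start() { worker.start(); }

    // Senders never touch 'total'; they only enqueue messages.
    void send(int amount) { mailbox.add(amount); }

    // Asks the actor to stop, waits for it, then reads the final state.
    long stopAndGetTotal() {
        mailbox.add(STOP);
        joinQuietly(worker); // join() makes the actor's writes visible here
        return total;
    }

    private void processMailbox() {
        try {
            while (true) {
                int message = mailbox.take();
                if (message == STOP) return;
                total += message; // no lock needed: single consumer
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    private static void joinQuietly(Thread t) {
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Four sender threads each send 1000 messages of value 1.
    static long runDemo() {
        CounterActorSketch actor = new CounterActorSketch();
        actor.start();
        Thread[] senders = new Thread[4];
        for (int i = 0; i < senders.length; i++) {
            senders[i] = new Thread(() -> {
                for (int n = 0; n < 1000; n++) actor.send(1);
            });
            senders[i].start();
        }
        for (Thread t : senders) joinQuietly(t);
        return actor.stopAndGetTotal();
    }

    public static void main(String[] args) {
        System.out.println(runDemo());
    }
}
```

A real actor library adds supervision, routing and distribution on top of this, which is exactly the accidental complexity you would rather not write yourself.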
There are other concurrency models too. One is Apache Storm: I have an application that uses it and it is very fast. With Storm you control the number of spouts (data generators) feeding the bolts, or you can keep a single spout and choose the number of bolts for each "customer" you want. In my case I have 10 bolts in the last layer that write to the database. Storm's possibilities are endless, but it is recommended for clusters. I run it on a single machine and have a problem with its ZooKeeper data clogging my tmpfs.
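The "one spout, N bolts" shape can be imitated on a single machine with a plain Java thread pool. The sketch below is only an analogy and does not use the Storm API: `SpoutBoltSketch`, `runDemo` and the in-memory `fakeDatabase` are hypothetical stand-ins, and the fixed pool size plays the role of the number of bolts.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Plain-Java analogy of one spout feeding a fixed number of bolts.
class SpoutBoltSketch {

    // Emits tupleCount items and lets boltCount workers "persist" them.
    // Returns how many rows ended up in the stand-in database.
    static int runDemo(int boltCount, int tupleCount) {
        // Stand-in for the real database (assumption of this sketch).
        Queue<String> fakeDatabase = new ConcurrentLinkedQueue<>();
        ExecutorService bolts = Executors.newFixedThreadPool(boltCount);

        // The "spout": a single loop that emits tuples to the bolts.
        for (int i = 0; i < tupleCount; i++) {
            final int tuple = i;
            bolts.execute(() -> fakeDatabase.add("row-" + tuple));
        }

        // Stop accepting work and wait for the bolts to drain their queue.
        bolts.shutdown();
        try {
            if (!bolts.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("bolts did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return fakeDatabase.size();
    }

    public static void main(String[] args) {
        System.out.println(runDemo(10, 1000));
    }
}
```

Changing the first argument of `runDemo` is the rough equivalent of tuning how many bolts you run; Storm itself adds distribution, acknowledgement and replay on top of that.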
If you don't care how many threads are used and your application runs on a single server, then consider the LMAX Disruptor. In practice, the Disruptor tries to keep the data warm, i.e. in the processor's cache. Incidentally, it will be the next evolution of the application I mentioned.
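The core idea behind keeping data warm can be sketched with a preallocated ring buffer shared by one producer and one consumer. This is not the LMAX Disruptor API, only a simplified illustration of the technique: `RingBufferSketch`, `publish`, `take` and `runDemo` are hypothetical names, and the sketch assumes exactly one producer thread and one consumer thread.

```java
import java.util.concurrent.atomic.AtomicLong;

// Simplified single-producer, single-consumer ring buffer.
class RingBufferSketch {
    private static final int SIZE = 1024;     // power of two, so masking works
    private static final int MASK = SIZE - 1;

    // Preallocated once and reused, so no garbage is created per message.
    private final long[] slots = new long[SIZE];
    private final AtomicLong published = new AtomicLong(-1); // last sequence written
    private final AtomicLong consumed = new AtomicLong(-1);  // last sequence read

    // Must be called by the single producer thread only.
    void publish(long value) {
        long next = published.get() + 1;
        // Wait while the slot we want still holds an unread value.
        while (next - consumed.get() > SIZE) Thread.yield();
        slots[(int) (next & MASK)] = value;
        published.set(next); // volatile write makes the slot visible
    }

    // Must be called by the single consumer thread only.
    long take() {
        long next = consumed.get() + 1;
        // Wait until the producer has published this sequence.
        while (published.get() < next) Thread.yield();
        long value = slots[(int) (next & MASK)];
        consumed.set(next); // frees the slot for reuse
        return value;
    }

    // Producer publishes 1..count; the calling thread consumes and sums them.
    static long runDemo(long count) {
        RingBufferSketch ring = new RingBufferSketch();
        Thread producer = new Thread(() -> {
            for (long v = 1; v <= count; v++) ring.publish(v);
        });
        producer.start();
        long sum = 0;
        for (long i = 0; i < count; i++) sum += ring.take();
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(runDemo(100_000));
    }
}
```

The real Disruptor goes much further (cache-line padding, batching, several wait strategies, multiple consumers), so treat this only as a way to see why a fixed array plus sequence counters avoids locks and allocation.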
Study all the options I mentioned thoroughly, simply because they carry less accidental complexity than manipulating threads by hand. There is nothing wrong with working with threads directly, but the less we have to drop to the lowest level, the better the cost-benefit ratio.
Take your time weighing the options. Keep your application running just the way it is, patch it only enough to address your fear of overloading the server, and then move to a higher-level solution, "maybe" even a new language if that is the case... The best solution is the one that is best for you and the business.