A 96kHz sample rate (fs = 96kS/s) is enough to reproduce frequencies up to 48kHz (fs/2), which is well beyond the audible range.
TL;DR: In general I believe that, all other things being equal, there will be a sampling frequency beyond which no benefit will be heard. That sampling frequency will almost certainly be less than 96kS/s. I would personally think that it is far closer to 44.1kS/s for the vast majority of us.
In general, for a given sampling rate (fs), a digital to analogue conversion will re-create the original analogue (audio) signal for frequencies up to fs/2, but it will also add reconstruction artifacts at frequencies above fs/2. These reconstruction artifacts will, if allowed to propagate through your amplification stages, do your speakers or headphones no good whatsoever (but you won't be able to hear anything until your speakers fail). As a consequence, after the actual digital to analogue conversion, there is always an analogue low pass filter present in a DAC to remove these unwanted artifacts.
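To make the "artifacts above fs/2" point concrete, here is a minimal sketch that lists where the spectral images of a single tone land after ideal sampling. The function name `image_frequencies` and the 20kHz test tone are my own choices for illustration; it assumes ideal sampling, where images of a tone at f appear at k·fs − f and k·fs + f.

```python
def image_frequencies(tone_hz, fs_hz, max_multiple=2):
    """Return the images of a tone created by ideal sampling at fs_hz.

    Images sit at k*fs - tone and k*fs + tone for k = 1, 2, ...
    (illustrative helper, not taken from any library).
    """
    images = []
    for k in range(1, max_multiple + 1):
        images.append(k * fs_hz - tone_hz)
        images.append(k * fs_hz + tone_hz)
    return images


# A 20 kHz tone: compare where its first images land for CD and 96 kS/s.
cd_images = image_frequencies(20_000, 44_100)
hires_images = image_frequencies(20_000, 96_000)

print("44.1 kS/s images (Hz):", cd_images)
print("96 kS/s images (Hz):  ", hires_images)
```

At 44.1kS/s the lowest image of that tone sits only a few kHz above the tone itself, whereas at 96kS/s it is far away, which is what the rest of the argument relies on.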
For Red Book CD (fs = 44.1kS/s), this analogue low pass filter has to have an exceptionally sharp cutoff in order to preserve all of the audible signal. If we accept that the limit of human hearing is 20kHz (although that is too high for most of us), a perfect filter for Red Book would have to go from full pass at 20kHz to 100% rejection in the region above 22.05kHz. This sounds like it might be quite a lot of room (2.05kHz), but in actuality it is slightly less than one whole tone of the western musical scale (roughly the difference between an A and a B, or between the notes produced by two white keys on a piano keyboard that are separated by a single black key), which makes this a very demanding filter requirement indeed.
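As a quick check on the musical-interval comparison, the gap between 20kHz and 22.05kHz can be expressed in equal-tempered semitones (a whole tone is 2 semitones). This is just arithmetic; the helper name `interval_in_semitones` is mine.

```python
import math


def interval_in_semitones(f_low_hz, f_high_hz):
    """Size of the interval between two frequencies in equal-tempered semitones."""
    return 12 * math.log2(f_high_hz / f_low_hz)


# Transition band available to a Red Book reconstruction filter:
# pass at 20 kHz, reject above fs/2 = 22.05 kHz.
transition = interval_in_semitones(20_000, 22_050)
print(f"20 kHz -> 22.05 kHz spans {transition:.2f} semitones")
```

The result comes out a little under 2 semitones, i.e. just short of a whole tone.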
In practice, the filter used will not be perfect and will, itself, introduce one or more of the following artifacts:
1. Start cutting off frequencies that are in the (extreme high) audible range - at least for those of us with very good and very young ears.
2. Allow some reconstruction artifacts (frequencies higher than fs/2) to propagate.
3. Have a non-flat frequency response in the audible range.
4. Have a non-linear phase response (meaning that different signal frequencies get delayed by different amounts).
It is a matter of debate as to whether all of these filter artifacts will be audible. It is certainly possible to imagine (1) and (3) being audible for some people. I think the consensus is that (2) would not be audible. The audibility of (4) is very uncertain - my knowledge in this area is a bit thin but I believe, in general, human hearing is not sensitive to phase, although there may be some secondary effects related to the presence of multiple frequencies that could conceivably be audible.
However, if you use fs = 96kS/s, then the reconstruction artifacts (which start from fs/2 = 48kHz) are a long way (> 1 octave) away from any audible frequency. Thus an analogue low pass filter with a much gentler cutoff can be used, and this allows hardware designers to choose filters with much better characteristics in the audible range (more appropriate cutoff frequency, better [flatter] frequency response and more linear phase response).
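To put a rough number on "much gentler", here is a sketch that estimates the minimum order of an analogue Butterworth low pass filter for each sampling rate, using the standard Butterworth order formula. The Butterworth choice and the spec (at most 1dB loss at 20kHz, at least 60dB rejection at fs/2) are my own assumptions for illustration; real DACs typically combine oversampling digital filters with a simpler analogue stage, so treat the absolute numbers as indicative only.

```python
import math


def butterworth_min_order(f_pass_hz, f_stop_hz, pass_loss_db, stop_atten_db):
    """Minimum Butterworth order meeting the passband/stopband spec.

    Standard formula:
      n >= log10((10^(As/10) - 1) / (10^(Ap/10) - 1)) / (2 * log10(f_stop / f_pass))
    """
    numerator = math.log10(
        (10 ** (stop_atten_db / 10) - 1) / (10 ** (pass_loss_db / 10) - 1)
    )
    denominator = 2 * math.log10(f_stop_hz / f_pass_hz)
    return math.ceil(numerator / denominator)


# Assumed (hypothetical) spec: <= 1 dB loss at 20 kHz, >= 60 dB rejection at fs/2.
orders = {}
for fs in (44_100, 48_000, 96_000):
    orders[fs] = butterworth_min_order(20_000, fs / 2, 1.0, 60.0)
    print(f"fs = {fs} S/s -> minimum Butterworth order {orders[fs]}")
```

Under these assumptions the required order drops steeply as fs rises, and most of the drop has already happened by 96kS/s.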
In principle, still higher sampling rates (fs = 192kS/s and beyond) give even more latitude to the hardware designer to design a very good analogue filter, but as with everything there are diminishing returns, both in terms of the hardware designer's ability to design better filters and the listener's ability to hear the difference.
As a consequence of this, I can believe that it may be possible for some people to tell the difference between Red Book CD and 96kS/s sampling rates. However, even 48kS/s gives more room for filter design, so it is quite possible that those who can hear the difference between 44.1kS/s and 96kS/s could not so easily tell the difference between 48kS/s and 96kS/s.
Edit: Repositioned TLDR
Edit 2: Removed references to quantisation noise/artifacts and replaced with ‘reconstruction artifacts’ to address the point made by @WiWavelength below.