I haven’t gone down this particular path myself so I can’t make any comments on available software or methodologies. I do want to interject that there’s a reality to digital processing that needs to be taken into account with respect to resolution.
None of what I’m about to say here has anything to do with the question of whether or not one can hear the difference between X number of bits and Y sampling frequency vs 16/44. That’s up to the individual listener to decide. This is all focused on the unfortunate side effects of the analog to digital to analog conversion process.
Without going into a treatise on sampling theory it’s easier just to say that the sampling frequency (the 44.1, 48, 96, etc part) must be at least 2x the maximum audio frequency you’re trying to reproduce. 22.05 kHz is 1/2 of 44.1 kHz so all is well, right? Well… no. Due to the way in which sampling works “ghost” images (or aliases) of the original signal will end up getting “recorded” (in the A to D step) or “reproduced” (in the D to A step), mirrored around multiples of the sampling frequency.
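To make the “ghost image” idea concrete, here is a minimal sketch of where an out-of-band tone lands after sampling. The function name `alias_frequency` and the test tones are my own illustrative choices, and it assumes ideal sampling with no anti-aliasing filter in front of the converter.

```python
def alias_frequency(f_hz: float, fs_hz: float) -> float:
    """Frequency at which a tone at f_hz shows up after ideal sampling at fs_hz."""
    f_mod = f_hz % fs_hz               # fold into the range [0, fs)
    return min(f_mod, fs_hz - f_mod)   # reflect into the range [0, fs/2]

fs = 44_100.0
for tone in (10_000.0, 25_000.0, 30_000.0):
    landed = alias_frequency(tone, fs)
    print(f"{tone:.0f} Hz sampled at {fs:.0f} Hz shows up at {landed:.0f} Hz")
```

A 25 kHz tone, which nobody can hear, folds back to 19.1 kHz, which sits inside the audio band. That is why the filtering step is not optional.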
Aliases must be removed and this is a standard part of the A to D process (anti-aliasing filter) or the D to A process (reconstruction filter). The problem here is that you want to reproduce up to 20 kHz in order to get the complete audio band, but at 44.1 kHz you CANNOT have any data that’s at a frequency above 22.05 kHz (or really bad things happen). That gives you roughly 2 kHz in which to build a filter that has to attenuate by 90 to 140 dB. That’s a STEEP filter!!
Simple enough, right? Just cut out all of that extraneous information! Sadly filters don’t work that way. The steeper you make them the more they mess with the adjacent frequencies and the more they have the ability to mess with transient response. In other words the more you try to make a filter that deals with the mathematical realities of digital to analog conversion the more you run the risk of really messing up high frequencies in the audio band. This is one of the many reasons why “early” digital sounded so harsh.
An easy and cheap solution is to bump up the sample rate to a much higher value. At 96 kHz you have a much easier task as your filter can operate in a more relaxed manner far away from the audio band and (hopefully) do no damage to the audible frequencies.
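For a rough feel of how much the higher sample rate relaxes the filter, here is a sketch using Kaiser’s rule-of-thumb estimate for the length of a digital FIR lowpass. This is only a stand-in: real converters use oversampling and a mix of analog and digital stages, and the 20 kHz passband edge, the stopband at half the sample rate, and the 100 dB attenuation target are assumptions I picked for illustration.

```python
import math

def kaiser_fir_taps(fs_hz: float, passband_edge_hz: float,
                    stopband_edge_hz: float, atten_db: float) -> int:
    """Kaiser's approximate FIR length for a lowpass with the given transition band."""
    # Transition width expressed in radians per sample
    delta_omega = 2 * math.pi * (stopband_edge_hz - passband_edge_hz) / fs_hz
    return math.ceil((atten_db - 7.95) / (2.285 * delta_omega)) + 1

for fs in (44_100, 96_000):
    nyquist = fs / 2
    taps = kaiser_fir_taps(fs, 20_000, nyquist, atten_db=100)
    print(f"fs = {fs} Hz: transition band {nyquist - 20_000:.0f} Hz, about {taps} taps")
```

Under these assumptions the 44.1 kHz case needs on the order of six times as many taps as the 96 kHz case, which is one way of putting a number on “steep” versus “relaxed”.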
I’ll be the first one to say that 16/44 digital has the potential for superior dynamic range to vinyl. The key here is “potential.” The reality is that unless you’re using extremely high-quality A to D converters your effective dynamic range is somewhat reduced. In reality, for this particular use case that’s probably not a big deal as the noise floor on vinyl is extremely variable.
A problem arises when you want to do any sort of processing of the digital data. All that you can do here is throw data away. You might make it subjectively better, but that’s going to be at a cost of information. If you’re going to do any sort of pop reduction, volume leveling, equalization (including applying the RIAA curve in the digital domain) then you are going to be shedding (or interpolating) bits. You’re also going to be filtering the digital signal AGAIN which can be problematic with a limited sample frequency.
The difference in dynamic range (and therefore in the amount of actual information) between 16 bit and 24 bit is absolutely HUGE. There are 65,536 possible representations of analog “levels” in 16 bit and 16,777,216 possible “levels” in 24 bit, which works out to a theoretical dynamic range of roughly 96 dB versus 144 dB. Creating a complete audio system that can reproduce that range and having the perfect hearing that’s able to process it is pretty much impossible. The real benefit of 24 bit is having the flexibility in the digital domain to do processing (which is going to effectively shave bits) and not worry about getting into the range of what a typical system and listener can discern. In other words, you have way more headroom in a 24 bit sample in which to do mathematical operations.
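The arithmetic behind those numbers is short enough to sketch. The helper names are mine, the dynamic range is the ideal figure of about 6.02 dB per bit (real converters fall short of it), and the “bits lost” function is a simplification that only looks at plain gain reduction.

```python
import math

DB_PER_BIT = 20 * math.log10(2)  # about 6.02 dB per bit

def quantization_levels(bits: int) -> int:
    """Number of distinct sample values available at a given bit depth."""
    return 2 ** bits

def dynamic_range_db(bits: int) -> float:
    """Ideal ratio between the largest and smallest step, in dB."""
    return bits * DB_PER_BIT

def bits_lost_to_attenuation(gain_db: float) -> float:
    """Approximate bits of resolution given up when a signal is turned down."""
    return max(0.0, -gain_db / DB_PER_BIT)

for bits in (16, 24):
    print(f"{bits} bit: {quantization_levels(bits):,} levels, "
          f"{dynamic_range_db(bits):.1f} dB ideal dynamic range")

print(f"A 12 dB cut costs about {bits_lost_to_attenuation(-12.0):.1f} bits")
```

Turning a 16 bit file down by 12 dB leaves you with roughly 14 bits of resolution. The same cut on a 24 bit file leaves about 22, still well beyond what ends up on a CD.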
There’s a reason why studios don’t record in 16/44. There’s just not enough headroom in which to do post-processing and mixing. Ripping an LP should be looked at as being no different than recording microphone feeds because, ultimately, you’re doing the same thing.
Storage is cheap (and getting cheaper). There’s no difference in time required to record at 16/44 vs 24/96 (or 192, or 384). Even the space requirements are reasonable. Given equal lengths a 24/96 file is only about 3.3x larger than the 16/44 file, but the potential benefits are far greater than that.
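Here is a quick sketch of the storage arithmetic for uncompressed stereo PCM. The 20 minute LP side is just an assumed round number, and `pcm_bytes` is a hypothetical helper that ignores file headers and any lossless compression such as FLAC.

```python
def pcm_bytes(bits: int, sample_rate_hz: int, seconds: int, channels: int = 2) -> int:
    """Raw size in bytes of uncompressed PCM audio (no container overhead)."""
    return (bits // 8) * sample_rate_hz * channels * seconds

side_seconds = 20 * 60  # assumed length of one LP side

cd_quality = pcm_bytes(16, 44_100, side_seconds)
hi_res = pcm_bytes(24, 96_000, side_seconds)

print(f"16/44.1: {cd_quality / 1e6:.0f} MB per side")
print(f"24/96:   {hi_res / 1e6:.0f} MB per side")
print(f"Ratio:   {hi_res / cd_quality:.2f}x")
```

The ratio depends only on bit depth and sample rate, so it holds for any recording length.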
Finally, while the noise reduction capabilities available today are fairly crude I imagine that with advancements in machine learning they are going to get very powerful in the near future. They’re also going to be mathematically intensive and extreme headroom is going to be a requirement.
I would do the conversion at the highest possible resolution that you can afford even if you end up decimating those files down to a more manageable size for immediate use. 10 years from now you may end up kicking yourself otherwise.