I think it is realistic to expect that GSX can do this, since others are already doing it, e.g. Majestic Dash. Majestic not only allows you to choose the output device for each separate stream of sounds but also its 3D placement.
This only requires an additional output device setting to be provided. This would allow the voice sounds to be sent to the selected device.
Since the voice sounds are in only two directories, no file restructuring would be required, only associating the correct directories with each audio control.
Two audio controls:
1. For Non-voice sounds (excludes the voice sound directories)
2. For Voice sounds (only has the voice sound directories)
Initially, both controls should use the default sound device, with the option to change either one (as above) in the GSX settings.
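To illustrate how little logic the split would need, here is a rough Python sketch of the routing described above. It is not GSX code: the directory names `voice_a` and `voice_b`, the `AudioSettings` class, and the `device_for` function are all made up for illustration, and it assumes the two voice directories sit at the top level of the sound folder.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath

# Placeholder name standing in for the system default output device.
DEFAULT_DEVICE = "default"

# Hypothetical names for the two voice sound directories (assumption:
# the real GSX directory names are not given in this post).
VOICE_DIRS = frozenset({"voice_a", "voice_b"})


@dataclass
class AudioSettings:
    """The two audio controls; both start on the default sound device."""
    non_voice_device: str = DEFAULT_DEVICE
    voice_device: str = DEFAULT_DEVICE


def is_voice_sound(relative_path: str) -> bool:
    """True if the file's top-level directory is one of the voice directories."""
    parts = PurePosixPath(relative_path.replace("\\", "/")).parts
    return bool(parts) and parts[0].lower() in VOICE_DIRS


def device_for(relative_path: str, settings: AudioSettings) -> str:
    """Pick the output device for a sound file based on its directory."""
    if is_voice_sound(relative_path):
        return settings.voice_device
    return settings.non_voice_device


if __name__ == "__main__":
    settings = AudioSettings()
    settings.voice_device = "Headset"  # user changes only the voice control
    print(device_for("voice_a/greeting.wav", settings))
    print(device_for("engine/pushback.wav", settings))
```

With both controls left untouched, every sound goes to the default device, so existing setups would behave exactly as they do now.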