I'll do my best to explain; please correct me if I am wrong.
Basically, a video call has two logical channels: a video channel and a voice channel. In this case you're using Cisco VTA with a Cisco IP phone, and I am going to assume that your video speed is configured at 384k and that you're using G.711 for voice. If both video-enabled phones are running G.711 for voice, you will not need a transcoder. A transcoder is only needed when the endpoints in your environment are running different codecs, and it has nothing to do with video, so it does not affect video performance. In fact, the Cisco IP phone has its own DSP, so it is able to negotiate different codecs.
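To make the transcoder rule concrete, here is a rough sketch in Python of the decision as I understand it: a transcoder only comes into play when the two ends of the call have no voice codec in common. The function name and the codec lists are made up for illustration. This is not how the call manager actually implements it, and real codec selection also depends on how your system is configured.

```python
def needs_transcoder(codecs_a, codecs_b):
    """Return True when the two endpoints share no voice codec.

    Simplified illustration only: a common codec means the call can be
    set up directly, so no transcoder is involved.
    """
    return not (set(codecs_a) & set(codecs_b))


# Hypothetical capability lists, for illustration only.
phone_a = ["G.711", "G.729"]
phone_b = ["G.711"]
g729_only_device = ["G.729"]

print(needs_transcoder(phone_a, phone_b))           # both support G.711
print(needs_transcoder(phone_b, g729_only_device))  # no codec in common
```

With both phones on G.711, as assumed above, the first check reports that no transcoder is needed. Video is not part of the check at all, which is the point: the transcoder question is about the voice channel only.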
Regarding video over VPN: I have tested VTA with IP Communicator over VPN and it worked perfectly fine.
Hope that helps!
D