Yes. The point is that VerySharp upscales the images before alignment. Assuming the shifts between the images are evenly distributed at the subpixel level, the resulting image stack on the upscaled subpixel grid can be interpreted as a high-resolution image convolved with the combination of a box kernel the size of the upscaling factor and the upscaling operator. Proper deconvolution therefore recovers that high-resolution image to some extent, and VerySharp performs this step as well.
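To make the convolution argument concrete, here is a minimal 1D sketch in plain Python. It is not VerySharp's code. It assumes that each sensor pixel averages `factor` neighbouring high-resolution samples, that upscaling is nearest-neighbour, that the shifts are known exactly and cover every subpixel position once, and that the signal is periodic. All function names are made up for illustration. Under these assumptions the two box kernels combine into a triangle kernel, and the averaged stack matches the high-resolution signal blurred with that kernel.

```python
import random


def box_sample(high, factor, shift):
    """One simulated low-res frame: each pixel averages `factor` fine samples."""
    n = len(high)
    return [
        sum(high[(i * factor + shift + j) % n] for j in range(factor)) / factor
        for i in range(n // factor)
    ]


def upscale_and_align(low, factor, shift):
    """Nearest-neighbour upscale, then undo the known shift on the fine grid."""
    n = len(low) * factor
    return [low[((k - shift) % n) // factor] for k in range(n)]


def stack_average(high, factor):
    """Average the upscaled, aligned frames for shifts 0 .. factor-1."""
    frames = [
        upscale_and_align(box_sample(high, factor, s), factor, s)
        for s in range(factor)  # evenly distributed subpixel shifts
    ]
    return [sum(column) / factor for column in zip(*frames)]


def triangle_blur(high, factor):
    """Blur with box * box, i.e. a triangle kernel of width 2 * factor - 1."""
    n = len(high)
    return [
        sum((factor - abs(d)) * high[(k + d) % n]
            for d in range(-factor + 1, factor)) / factor ** 2
        for k in range(n)
    ]


if __name__ == "__main__":
    random.seed(0)
    signal = [random.random() for _ in range(32)]  # length divisible by factor
    stacked = stack_average(signal, 4)
    blurred = triangle_blur(signal, 4)
    print(max(abs(a - b) for a, b in zip(stacked, blurred)))
```

Since the stack equals the sharp signal blurred with a known kernel, deconvolving with that kernel can bring back part of the detail, limited by noise and by how evenly the real shifts are spread.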
Also, as far as I know, panotools uses a feature-based approach for image alignment. That is the best solution for what panotools is meant to do, but it most probably does not provide the accuracy needed here, which is why the slower iterative ECC algorithm provided by OpenCV is used instead.
That is an interesting idea.
However, rotating the camera (shifting would not work because of perspective distortion) could not be done with the required precision using normal tools, and the lens projection prevents a homogeneous shift of the image over the whole sensor area during rotation.
Controlled sub-pixel shifts while shooting are not even necessary, because random sampling (shooting handheld) and reconstructing the shifts with sub-pixel alignment does the job quite well. Of course there is always room for improvement, but I doubt that the extensive effort would pay off in terms of resolution gain.