This is so close to what will be a singularity for mapping and for crowd-sourced/community image and video sites. These sites already offer geo-tagging, but once the maps we use actually understand the 3D composition of locations, images will start being matched up and reconciled with the current model.
Bing Maps seems to be at the stage (courtesy of the PhotoSynth tech) where it has a sparse point-cloud representation of New York, against which it can match Flickr images. Sadly, the experience demoed in the presentation is far removed from my own exploration of Bing Maps 3D: the 3D models are untextured and hand-created, and there is as yet no street-view facility.