Converting PhotoSynths to Dense Point-Clouds

I’ve discussed (with examples of a reconstruction from Newcastle) some of the technologies for doing 3D reconstruction. PhotoSynth is by far the easiest way to get into this, but it is not as powerful as some other options.

For example, PMVS2 reconstructs a detailed (dense) 3D point-cloud. As input it takes a collection of images, plus the position, orientation and intrinsics of the camera for each image.
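To make "position and intrinsics" concrete: as far as I understand the PMVS2 format, each image gets a small text file that starts with the line CONTOUR followed by a 3x4 projection matrix P = K [R | t]. The sketch below builds such a matrix. The function names, the principal point at the image centre and the square pixels are my assumptions, and Bundler or PhotoSynth data may need axis flips before it fits this convention.

```python
def projection_matrix(focal, width, height, R, t):
    """Return the 3x4 matrix P = K [R | t] as nested lists.

    Assumes square pixels and a principal point at the image centre
    (both are simplifying assumptions, not read from any file).
    """
    K = [[focal, 0.0, width / 2.0],
         [0.0, focal, height / 2.0],
         [0.0, 0.0, 1.0]]
    # Append the translation as a fourth column to the rotation.
    Rt = [list(R[i]) + [t[i]] for i in range(3)]
    return [[sum(K[i][k] * Rt[k][j] for k in range(3)) for j in range(4)]
            for i in range(3)]


def pmvs_camera_text(P):
    """Format P as a camera file body: 'CONTOUR' then three rows of four numbers."""
    lines = ["CONTOUR"]
    for row in P:
        lines.append(" ".join("%.10g" % v for v in row))
    return "\n".join(lines) + "\n"
```

Calling pmvs_camera_text(projection_matrix(f, w, h, R, t)) once per image would give the text for that image's camera file. The options file that PMVS2 also needs is not covered here.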

Helpfully, the latest version of SynthExport allows you to export the camera parameters from your PhotoSynth, for use in the more powerful reconstruction. I have written a JavaScript form that generates the two input files needed for the PMVS2 pipeline.

Edit: The focal length is now being calculated correctly, but there is still a problem with the data: it is not processing correctly. The rotation matrix calculations come from a great matrix calculator and seem to be working fine, and the translations are simply brought over directly from PhotoSynth. I don’t have a good way to plot these points myself, but I might ask a colleague to give it a try for me. Obviously I’m keen to get this up and running ASAP, so please comment with feedback and revision suggestions!
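Short of plotting, one cheap sanity check is to test whether each 3x3 matrix really is a rotation: its rows should be orthonormal and its determinant should be +1 (a determinant of -1 would mean a mirror has crept in). This is only a sketch; the function names and the tolerance are my own choices.

```python
def determinant(R):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (R[0][0] * (R[1][1] * R[2][2] - R[1][2] * R[2][1])
            - R[0][1] * (R[1][0] * R[2][2] - R[1][2] * R[2][0])
            + R[0][2] * (R[1][0] * R[2][1] - R[1][1] * R[2][0]))


def is_rotation(R, tol=1e-6):
    """True if R * R^T is the identity and det(R) is +1, within tol."""
    for i in range(3):
        for j in range(3):
            dot = sum(R[i][k] * R[j][k] for k in range(3))
            expected = 1.0 if i == j else 0.0
            if abs(dot - expected) > tol:
                return False
    return abs(determinant(R) - 1.0) <= tol
```

Running is_rotation over the three rotation rows of every camera would flag any matrix that has been scaled, sheared or mirrored along the way.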

Edit: I am working on getting the correct format for piping PhotoSynths to PMVS.


Here are the side-by-side comparisons between the PhotoSynth output for the Newcastle Freeze Mob Synth, and my offline Bundler processing. PhotoSynth did better at finding camera positions, so there are more cameras in the view. (PhotoSynth is related to Bundler – they are built on the same code.) These models were produced using 3D Studio Max and two script generators; one that worked from the PhotoSynth camera parameters file, and the other that processed the Bundler output. The processing for Bundler output is guided by the Bundler Documentation, though something is obviously awry.

If you review the Synth you will see the rough path that the camera takes through the scene, and the PhotoSynth output is a perfect representation of this. Note that the camera does not really rotate as it passes through; it does a smooth dolly-shot across the scene.

In the image below, you can see that the cameras twist as they move, which is not accurate.

The Bundler output looks at least partly right because the translation is stored separately from the rotation matrix in the bundle.out file, so the camera positions at least trace out the correct shape, although the axes are switched.


Here’s a comparison of results from the Kermit example images for Bundler (view PhotoSynth):

The camera positions embedded in the Bundler point-cloud above are pretty much identical to those from the PhotoSynth output below.

But I can’t get the camera positions from the Bundler output file into the correct format (and thus get PhotoSynth output into the same format). Here’s what I get: for a start the cameras are on a different axis, and the positions are mirrored.
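One thing worth checking here: the Bundler documentation says that R and t map world points into camera coordinates, so t is not the camera position. It gives the position as -R' * t and the viewing direction as R' * [0 0 -1]'. Using t (or R' * t) directly could account for mirrored positions, though I have not confirmed that this is the whole story. A minimal sketch, with function names of my own and the first camera block of the bundle.out file as test data:

```python
def camera_center(R, t):
    """World-space camera position, C = -R^T t."""
    return [-sum(R[i][j] * t[i] for i in range(3)) for j in range(3)]


def view_direction(R):
    """World-space viewing direction, R^T [0, 0, -1], i.e. minus the third row of R."""
    return [-R[2][j] for j in range(3)]


# First camera block from the bundle.out file.
R0 = [[9.9137151689e-001, -1.1993357464e-001, 5.2900408063e-002],
      [9.8726398766e-002, 9.4864141643e-001, 3.0055375763e-001],
      [-8.6230004559e-002, -2.9273776783e-001, 9.5229668990e-001]]
t0 = [2.0104152055e-001, 1.0006100568e+000, -5.3611738918e-001]

if __name__ == "__main__":
    print(camera_center(R0, t0))
    print(view_direction(R0))
```

A quick self-check is that R * C + t comes out as (nearly) zero, since the camera centre sits at the origin of its own coordinate frame.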

And here is the file output:

PhotoSynth camera positions

Camera0 = freecamera()
myTransform = Camera0.transform
myTransform.row1 = [0.244536772038, 0.969616206400956, 0.00679554308750407]
myTransform.row2 = [0.303616062795044, -0.0699121872420001, -0.950226063885787]
myTransform.row3 = [-0.920879500007504, 0.234428450405787, -0.311487155604]
myTransform.row4 = [0.272829, -0.460082, -0.025248]
Camera0.transform = myTransform

Camera1 = freecamera()
myTransform = Camera1.transform
myTransform.row1 = [-0.0533696273639999, 0.998427890386384, 0.0171298153411824]
myTransform.row2 = [0.376080835861616, 0.03598815695, -0.925887713201212]
myTransform.row3 = [-0.925048586709182, -0.0429720869627876, -0.37741027009]
myTransform.row4 = [0.194019, -0.0691434, 0.0326086]
Camera1.transform = myTransform

Camera2 = freecamera()
myTransform = Camera2.transform
myTransform.row1 = [-0.44396760373, 0.894615938164381, 0.0505478982798027]
myTransform.row2 = [0.271716917599619, 0.18817091587, -0.943801686325473]
myTransform.row3 = [-0.853851675367803, -0.405282653962527, -0.326624075764]
myTransform.row4 = [0.254576, 0.451339, 0.0373162]
Camera2.transform = myTransform

Camera3 = freecamera()
myTransform = Camera3.transform
myTransform.row1 = [-0.74130608605, 0.671167067552163, -0.000232849008482583]
myTransform.row2 = [0.286824737087837, 0.316485108662, -0.904195081931823]
myTransform.row3 = [-0.606792268391517, -0.670352104068177, -0.427119654888]
myTransform.row4 = [-0.169543, 0.753825, 0.0722654]
Camera3.transform = myTransform

Camera4 = freecamera()
myTransform = Camera4.transform
myTransform.row1 = [-0.923441055146, 0.383009276380708, 0.0236751320412161]
myTransform.row2 = [0.133154528755292, 0.377678021014, -0.91631281990099]
myTransform.row3 = [-0.359897887105216, -0.843008426203009, -0.399762809936]
myTransform.row4 = [-0.563822, 0.729699, -0.0342284]
Camera4.transform = myTransform

Camera5 = freecamera()
myTransform = Camera5.transform
myTransform.row1 = [0.525824342342, 0.850497633694891, 0.0127489638792393]
myTransform.row2 = [0.300989461965109, -0.17202784705, -0.937982816273996]
myTransform.row3 = [-0.795558988879239, 0.497051501273996, -0.346447254708]
myTransform.row4 = [-0.0535178, -0.85111, -0.0541857]
Camera5.transform = myTransform

Camera7 = freecamera()
myTransform = Camera7.transform
myTransform.row1 = [0.103343164028, 0.993844386470724, 0.0399189920878644]
myTransform.row2 = [0.582432980529276, -0.0279322709220002, -0.812398677641049]
myTransform.row3 = [-0.806282837247865, 0.107205987341049, -0.58173435745]
myTransform.row4 = [0.0764454, -0.254899, 0.408282]
Camera7.transform = myTransform

Camera8 = freecamera()
myTransform = Camera8.transform
myTransform.row1 = [0.00111216005600002, 0.999005160074132, 0.0445808618721977]
myTransform.row2 = [0.854213952293868, 0.02222945228, -0.519446219697305]
myTransform.row3 = [-0.519920462000198, 0.0386593015533045, -0.85333942344]
myTransform.row4 = [-0.352342, -0.224872, 0.73212]
Camera8.transform = myTransform

Camera9 = freecamera()
myTransform = Camera9.transform
myTransform.row1 = [-0.200889882868, 0.970968367563747, -0.12986024854386]
myTransform.row2 = [-0.0644564208597473, -0.145376882578, -0.987274496693728]
myTransform.row3 = [-0.97749098449614, -0.189963131166272, 0.091789890654]
myTransform.row4 = [0.170197, 0.118426, -0.569607]
Camera9.transform = myTransform

Camera10 = freecamera()
myTransform = Camera10.transform
myTransform.row1 = [-0.058238125416, 0.997169687196283, -0.0475492974175039]
myTransform.row2 = [-0.0431093717242826, -0.0500977325700001, -0.997813509259562]
myTransform.row3 = [-0.997371496894496, -0.0560609679564378, 0.04590495675]
myTransform.row4 = [0.171159, -0.193182, -0.599323]
Camera10.transform = myTransform

Bundle.out file

# Bundle file v0.3
11 623
6.7483561379e+002 -9.2909176483e-002 -7.1452728557e-004
9.9137151689e-001 -1.1993357464e-001 5.2900408063e-002
9.8726398766e-002 9.4864141643e-001 3.0055375763e-001
-8.6230004559e-002 -2.9273776783e-001 9.5229668990e-001
2.0104152055e-001 1.0006100568e+000 -5.3611738918e-001
6.7665492312e+002 -8.2788477805e-002 -1.5495688596e-002
9.8107969862e-001 6.4153247757e-002 -1.8266632352e-001
-1.7897830105e-002 9.6951650853e-001 2.4437145365e-001
1.9277523862e-001 -2.3647854127e-001 9.5232116793e-001
-4.8824954210e-001 8.3078302635e-001 -3.2765590435e-001
6.9594685805e+002 1.6075856973e-002 -6.1758264844e-003
8.2472475473e-001 3.1837521750e-001 -4.6740378670e-001
-1.5041243203e-001 9.2019899304e-001 3.6139993567e-001
5.4516527699e-001 -2.2775213301e-001 8.0679847960e-001
-1.2503867966e+000 1.1229738934e+000 -1.2577283457e+000
6.8393864087e+002 -8.7992965606e-003 7.6634490784e-003
5.5296099706e-001 4.4786856695e-001 -7.0260079879e-001
-2.7796451946e-001 8.9409871323e-001 3.5117405502e-001
7.8547429089e-001 1.1125377983e-003 6.1889328693e-001
-1.9044153757e+000 1.2161906946e+000 -1.3300891219e+000
0 0 0
0 0 0
0 0 0
0 0 0
0 0 0
6.8447535552e+002 -6.4255104279e-002 -8.1737040619e-003
9.1291554876e-001 -2.8328096147e-001 2.9383175067e-001
1.9935278303e-001 9.3767914176e-001 2.8463361539e-001
-3.5615118802e-001 -2.0127027595e-001 9.1249471631e-001
1.1266368937e+000 1.0317378118e+000 -4.2850805998e-001
0 0 0
0 0 0
0 0 0
0 0 0
0 0 0
6.8222578863e+002 -9.8611796890e-002 -6.3262065149e-003
9.9916713027e-001 -9.2759089965e-003 -3.9736674407e-002
9.6681255192e-003 9.9990631492e-001 9.6896193167e-003
3.9643071647e-002 -1.0065728282e-002 9.9916320388e-001
-1.1945327215e-001 -1.6066502534e-002 -4.0000672191e-001
6.8556802108e+002 -8.5817439178e-002 -1.9029040798e-002
9.9071140322e-001 5.7798475103e-002 -1.2308635910e-001
-1.0008727000e-001 9.2269230682e-001 -3.7231901016e-001
9.2051365578e-002 3.8118006666e-001 9.1990668161e-001
-2.0419030672e-001 -1.1309509170e+000 -4.9814597685e-001
6.8340278196e+002 -1.0275312214e-001 -1.8491494029e-002
9.2296024395e-001 2.7121191052e-002 -3.8393857463e-001
2.2870682478e-001 7.6366983583e-001 6.0373965427e-001
3.0957644677e-001 -6.4503707090e-001 6.9863409648e-001
-1.1194835731e+000 1.8679491761e+000 -8.9878098319e-001
6.7312430685e+002 -1.1220939686e-001 2.6068044663e-003
9.7094006942e-001 1.7350651792e-002 -2.3869297533e-001
1.3528752594e-001 7.8292714291e-001 6.0722514295e-001
1.9741496122e-001 -6.2187140454e-001 7.5782800773e-001
-3.4669200651e-001 2.1011837748e+000 -6.5191141116e-001
-4.3599130124e-001 -3.4406986976e-001 -2.6258599380e+000
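For reference, the layout above follows the Bundler documentation: a comment header, a line with the camera and point counts, then five lines per camera (focal length with two radial distortion terms, three rows of rotation, and one row of translation), followed by the point data. The all-zero blocks are presumably cameras that Bundler could not register. Here is a rough parser for the camera section only; the function name and the dictionary keys are my own.

```python
def parse_bundle_cameras(text):
    """Parse the camera section of a bundle.out file.

    Returns (cameras, num_points). Each camera is a dict with keys
    'focal', 'k1', 'k2', 'R' (3x3), 't' (3) and 'registered'.
    """
    # Drop blank lines and the '# Bundle file ...' comment header.
    lines = [ln for ln in text.splitlines()
             if ln.strip() and not ln.lstrip().startswith("#")]
    num_cameras, num_points = (int(v) for v in lines[0].split())
    cameras = []
    for i in range(num_cameras):
        start = 1 + 5 * i  # each camera occupies five lines
        block = [[float(v) for v in ln.split()]
                 for ln in lines[start:start + 5]]
        focal, k1, k2 = block[0]
        cameras.append({
            "focal": focal, "k1": k1, "k2": k2,
            "R": block[1:4], "t": block[4],
            # Assumption: a zero focal length marks an unregistered camera.
            "registered": focal != 0.0,
        })
    return cameras, num_points
```

Filtering on the registered flag before generating any scene script avoids creating cameras with an all-zero transform.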

which I bring into transformation matrices as

Camera0 = freecamera ()
myTranslation = matrix3 [1,0,0] [0,1,0] [0,0,1] [0,0,0]
myRotation = matrix3 [1,0,0] [0,1,0] [0,0,1] [0,0,0]
myRotation.row1 = [9.9137151689e-001, -1.1993357464e-001, 5.2900408063e-002]
myRotation.row2 = [9.8726398766e-002, 9.4864141643e-001, 3.0055375763e-001]
myRotation.row3 = [-8.6230004559e-002, -2.9273776783e-001, 9.5229668990e-001]
myTranslation.row4 = [2.0104152055e-001, 1.0006100568e+000, -5.3611738918e-001]
Camera0.transform = myTranslation * myRotation

Camera1 = freecamera ()
myTranslation = matrix3 [1,0,0] [0,1,0] [0,0,1] [0,0,0]
myRotation = matrix3 [1,0,0] [0,1,0] [0,0,1] [0,0,0]
myRotation.row1 = [9.8107969862e-001, 6.4153247757e-002, -1.8266632352e-001]
myRotation.row2 = [-1.7897830105e-002, 9.6951650853e-001, 2.4437145365e-001]
myRotation.row3 = [1.9277523862e-001, -2.3647854127e-001, 9.5232116793e-001]
myTranslation.row4 = [-4.8824954210e-001, 8.3078302635e-001, -3.2765590435e-001]
Camera1.transform = myTranslation * myRotation

Camera2 = freecamera ()
myTranslation = matrix3 [1,0,0] [0,1,0] [0,0,1] [0,0,0]
myRotation = matrix3 [1,0,0] [0,1,0] [0,0,1] [0,0,0]
myRotation.row1 = [8.2472475473e-001, 3.1837521750e-001, -4.6740378670e-001]
myRotation.row2 = [-1.5041243203e-001, 9.2019899304e-001, 3.6139993567e-001]
myRotation.row3 = [5.4516527699e-001, -2.2775213301e-001, 8.0679847960e-001]
myTranslation.row4 = [-1.2503867966e+000, 1.1229738934e+000, -1.2577283457e+000]
Camera2.transform = myTranslation * myRotation

Camera3 = freecamera ()
myTranslation = matrix3 [1,0,0] [0,1,0] [0,0,1] [0,0,0]
myRotation = matrix3 [1,0,0] [0,1,0] [0,0,1] [0,0,0]
myRotation.row1 = [5.5296099706e-001, 4.4786856695e-001, -7.0260079879e-001]
myRotation.row2 = [-2.7796451946e-001, 8.9409871323e-001, 3.5117405502e-001]
myRotation.row3 = [7.8547429089e-001, 1.1125377983e-003, 6.1889328693e-001]
myTranslation.row4 = [-1.9044153757e+000, 1.2161906946e+000, -1.3300891219e+000]
Camera3.transform = myTranslation * myRotation

Camera4 = freecamera ()
myTranslation = matrix3 [1,0,0] [0,1,0] [0,0,1] [0,0,0]
myRotation = matrix3 [1,0,0] [0,1,0] [0,0,1] [0,0,0]
myRotation.row1 = [0, 0, 0]
myRotation.row2 = [0, 0, 0]
myRotation.row3 = [0, 0, 0]
myTranslation.row4 = [0, 0, 0]
Camera4.transform = myTranslation * myRotation

Camera5 = freecamera ()
myTranslation = matrix3 [1,0,0] [0,1,0] [0,0,1] [0,0,0]
myRotation = matrix3 [1,0,0] [0,1,0] [0,0,1] [0,0,0]
myRotation.row1 = [9.1291554876e-001, -2.8328096147e-001, 2.9383175067e-001]
myRotation.row2 = [1.9935278303e-001, 9.3767914176e-001, 2.8463361539e-001]
myRotation.row3 = [-3.5615118802e-001, -2.0127027595e-001, 9.1249471631e-001]
myTranslation.row4 = [1.1266368937e+000, 1.0317378118e+000, -4.2850805998e-001]
Camera5.transform = myTranslation * myRotation

Camera6 = freecamera ()
myTranslation = matrix3 [1,0,0] [0,1,0] [0,0,1] [0,0,0]
myRotation = matrix3 [1,0,0] [0,1,0] [0,0,1] [0,0,0]
myRotation.row1 = [0, 0, 0]
myRotation.row2 = [0, 0, 0]
myRotation.row3 = [0, 0, 0]
myTranslation.row4 = [0, 0, 0]
Camera6.transform = myTranslation * myRotation

Camera7 = freecamera ()
myTranslation = matrix3 [1,0,0] [0,1,0] [0,0,1] [0,0,0]
myRotation = matrix3 [1,0,0] [0,1,0] [0,0,1] [0,0,0]
myRotation.row1 = [9.9916713027e-001, -9.2759089965e-003, -3.9736674407e-002]
myRotation.row2 = [9.6681255192e-003, 9.9990631492e-001, 9.6896193167e-003]
myRotation.row3 = [3.9643071647e-002, -1.0065728282e-002, 9.9916320388e-001]
myTranslation.row4 = [-1.1945327215e-001, -1.6066502534e-002, -4.0000672191e-001]
Camera7.transform = myTranslation * myRotation

Camera8 = freecamera ()
myTranslation = matrix3 [1,0,0] [0,1,0] [0,0,1] [0,0,0]
myRotation = matrix3 [1,0,0] [0,1,0] [0,0,1] [0,0,0]
myRotation.row1 = [9.9071140322e-001, 5.7798475103e-002, -1.2308635910e-001]
myRotation.row2 = [-1.0008727000e-001, 9.2269230682e-001, -3.7231901016e-001]
myRotation.row3 = [9.2051365578e-002, 3.8118006666e-001, 9.1990668161e-001]
myTranslation.row4 = [-2.0419030672e-001, -1.1309509170e+000, -4.9814597685e-001]
Camera8.transform = myTranslation * myRotation

Camera9 = freecamera ()
myTranslation = matrix3 [1,0,0] [0,1,0] [0,0,1] [0,0,0]
myRotation = matrix3 [1,0,0] [0,1,0] [0,0,1] [0,0,0]
myRotation.row1 = [9.2296024395e-001, 2.7121191052e-002, -3.8393857463e-001]
myRotation.row2 = [2.2870682478e-001, 7.6366983583e-001, 6.0373965427e-001]
myRotation.row3 = [3.0957644677e-001, -6.4503707090e-001, 6.9863409648e-001]
myTranslation.row4 = [-1.1194835731e+000, 1.8679491761e+000, -8.9878098319e-001]
Camera9.transform = myTranslation * myRotation

Camera10 = freecamera ()
myTranslation = matrix3 [1,0,0] [0,1,0] [0,0,1] [0,0,0]
myRotation = matrix3 [1,0,0] [0,1,0] [0,0,1] [0,0,0]
myRotation.row1 = [9.7094006942e-001, 1.7350651792e-002, -2.3869297533e-001]
myRotation.row2 = [1.3528752594e-001, 7.8292714291e-001, 6.0722514295e-001]
myRotation.row3 = [1.9741496122e-001, -6.2187140454e-001, 7.5782800773e-001]
myTranslation.row4 = [-3.4669200651e-001, 2.1011837748e+000, -6.5191141116e-001]
Camera10.transform = myTranslation * myRotation
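A possible variation on the script generator, which I have not tested inside 3D Studio Max: put the camera centre -R' * t (the position formula from the Bundler documentation) in row 4 instead of the raw translation, and skip the all-zero camera blocks. It keeps the rotation rows exactly as above and only uses the matrix3 and freecamera calls already shown. The Python function name is hypothetical, and the difference in up-axis between Bundler and Max is not handled.

```python
def maxscript_for_camera(index, R, t):
    """Return MaxScript text that creates one camera, or '' for an all-zero block."""
    if all(v == 0 for row in R for v in row):
        return ""  # assumed to be a camera Bundler did not register

    # Camera centre in world space, C = -R^T t.
    center = [-sum(R[i][j] * t[i] for i in range(3)) for j in range(3)]

    def vec(values):
        # Format three numbers as a MaxScript point3 literal, e.g. [1.0, 0.0, 0.0].
        return "[" + ", ".join(repr(float(x)) for x in values) + "]"

    rows = " ".join(vec(row) for row in R)
    return ("Camera%d = freecamera ()\n"
            "Camera%d.transform = matrix3 %s %s\n") % (index, index, rows, vec(center))
```

Joining the results for every camera index gives one script to paste into the MaxScript listener. Whether the orientations then match the PhotoSynth version would still need checking by eye.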