Single Image 3D Reconstruction using MCC, SAM, and ZoeDepth


By combining Meta AI's Segment Anything Model (SAM) and Multiview Compressive Coding (MCC), we can reconstruct a 3D object from a single image.

The basic idea is to use SAM to create a generic object mask so we can exclude the background.

The next step is to generate a depth image. Here we use the awesome ZoeDepth to get realistic depth from the color image.

With depth, color, and an object mask, we have everything needed to create a colored point cloud of the object from a single view.
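
As a rough illustration of this step, the sketch below back-projects the masked depth pixels into camera-space points using a simple pinhole camera model. It is not the code used in this project: the function name masked_point_cloud and the intrinsics fx, fy, cx, cy are hypothetical, and it assumes the depth values are in metric units with 0 marking invalid pixels.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]
Color = Tuple[int, int, int]


def masked_point_cloud(
    depth: Sequence[Sequence[float]],
    color: Sequence[Sequence[Color]],
    mask: Sequence[Sequence[bool]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Tuple[Point, Color]]:
    """Back-project masked pixels into camera-space 3D points with colors.

    Assumes a pinhole camera (hypothetical intrinsics):
    x = (u - cx) * z / fx, y = (v - cy) * z / fy.
    """
    points = []
    for v, row in enumerate(depth):        # v = pixel row
        for u, z in enumerate(row):        # u = pixel column
            if not mask[v][u] or z <= 0:   # skip background and invalid depth
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append(((x, y, z), color[v][u]))
    return points


# Tiny 2x2 example; all values here are made up for illustration.
depth = [[2.0, 2.0], [4.0, 0.0]]
color = [[(255, 0, 0), (0, 255, 0)], [(0, 0, 255), (9, 9, 9)]]
mask = [[True, False], [True, True]]
cloud = masked_point_cloud(depth, color, mask, fx=2.0, fy=2.0, cx=0.5, cy=0.5)
```

In practice the mask would come from SAM and the depth from ZoeDepth, and a vectorized array implementation would be much faster; the loops are kept here only to make the geometry explicit.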

MCC encodes the colored points and then creates a reconstruction by sweeping through the volume, querying the network for occupancy and color at each point.
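
The sweep can be pictured with the following minimal sketch. It does not call MCC: query_fn is a hypothetical stand-in for the network's decoder, here replaced by a toy function describing a red sphere, and the grid bounds, resolution, and 0.5 threshold are assumptions chosen only for the example.

```python
import itertools
from typing import Callable, List, Tuple

Point = Tuple[float, float, float]
Color = Tuple[int, int, int]


def sweep_volume(
    query_fn: Callable[[Point], Tuple[float, Color]],
    lo: float,
    hi: float,
    steps: int,
    threshold: float = 0.5,
) -> List[Tuple[Point, Color]]:
    """Query a regular grid in the cube [lo, hi]^3 and keep occupied points."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    # Evenly spaced coordinates along one axis, endpoints included.
    axis = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    occupied = []
    for point in itertools.product(axis, repeat=3):
        occupancy, color = query_fn(point)
        if occupancy > threshold:
            occupied.append((point, color))
    return occupied


def toy_sphere(point: Point) -> Tuple[float, Color]:
    """Stand-in for the network: a red sphere of radius 1 at the origin."""
    x, y, z = point
    inside = x * x + y * y + z * z <= 1.0
    return (1.0 if inside else 0.0), (255, 0, 0)


# A coarse 3x3x3 grid over [-1, 1]^3; a real sweep would use far more samples.
result = sweep_volume(toy_sphere, lo=-1.0, hi=1.0, steps=3)
```

The points that survive the threshold, together with their predicted colors, form the reconstructed object. With a real network the queries would be batched rather than issued one point at a time.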

This is a really great example of how a lot of cool solutions are built these days: by stringing together more targeted pre-trained models. The details of the three building blocks can be found in the respective papers: