Graduation Semester and Year

2023

Language

English

Document Type

Dissertation

Degree Name

Doctor of Philosophy in Computer Science

Department

Computer Science and Engineering

First Advisor

Farhad Kamangar

Abstract

In the field of computer vision, learning representations of images is an important task. This dissertation introduces deep generative sculpting models (DGSMs), deep learning models that learn 3D representations of objects from 2D images. DGSMs combine convolutional networks with a differentiable renderer to "sculpt" a base 3D mesh, such as a sphere, so that it faithfully represents an object in the scene, and then render the mesh to reconstruct the input image. The core methodology is to encode the input image into latent variables, which are decoded into interpretable scene parameters describing the object's translation, rotation, scale, texture, and "sculpting parameters". These parameters are used to build a scene, which is then rendered with a differentiable renderer. Because DGSMs use a differentiable renderer, each latent variable describing an image maps directly to a parameter a human can understand, such as a scale factor, translation vector, rotation angle, or adjustment to a vertex of a triangle mesh. In this dissertation we investigate two models: the additive model, in which each vertex undergoes an independent adjustment, and the warping model, which applies a single-shot transformation using Gaussian radial basis functions (RBFs). We perform experiments on synthetic data rendered from 3D models, focusing on datasets that contain a single class of object. Our synthetic data consists of three datasets: faces, cars, and airplanes.
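
The difference between the two sculpting models named in the abstract can be illustrated with a small sketch. The abstract does not give the exact parameterization, so everything below is an assumption made for illustration: the function names `additive_sculpt` and `rbf_warp`, the use of control centers with one displacement each, the shared width `sigma`, and the octahedron standing in for the base sphere. The encoder, decoder, and differentiable renderer are omitted entirely.

```python
import math

def additive_sculpt(vertices, offsets):
    """Additive model (illustrative): every vertex receives its own offset."""
    return [tuple(v[i] + d[i] for i in range(3))
            for v, d in zip(vertices, offsets)]

def rbf_warp(vertices, centers, displacements, sigma):
    """Warping model (illustrative): all vertices are moved in one pass by a
    Gaussian-weighted sum of a few control displacements.
    centers, displacements, and sigma are hypothetical parameters."""
    warped = []
    for v in vertices:
        shift = [0.0, 0.0, 0.0]
        for c, d in zip(centers, displacements):
            sq_dist = sum((v[i] - c[i]) ** 2 for i in range(3))
            # Gaussian RBF weight: 1 at the center, decaying with distance.
            weight = math.exp(-sq_dist / (2.0 * sigma ** 2))
            for i in range(3):
                shift[i] += weight * d[i]
        warped.append(tuple(v[i] + shift[i] for i in range(3)))
    return warped

# Six octahedron vertices as a coarse stand-in for a base sphere mesh.
base = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0),
        (0.0, 1.0, 0.0), (0.0, -1.0, 0.0),
        (0.0, 0.0, 1.0), (0.0, 0.0, -1.0)]

# Additive: one offset per vertex (here only the first vertex moves).
offsets = [(0.2, 0.0, 0.0)] + [(0.0, 0.0, 0.0)] * 5
added = additive_sculpt(base, offsets)

# Warping: a single control point pushes nearby vertices outward along +x.
warped = rbf_warp(base, centers=[(1.0, 0.0, 0.0)],
                  displacements=[(0.5, 0.0, 0.0)], sigma=0.5)
```

In this sketch the additive variant needs one offset per vertex, while the warping variant needs only a handful of control parameters whose influence falls off smoothly with distance, so neighboring vertices move together.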

Keywords

Deep learning, 3D reconstruction, Differentiable rendering

Disciplines

Computer Sciences | Physical Sciences and Mathematics

Comments

Degree granted by The University of Texas at Arlington
