University of California, Berkeley
Electrical Engineering and Computer Sciences Department

EE290T, Fall 2019 - Advanced Topics in Signal Processing:

3D Image Processing and Computer Vision

Wednesday: 11:00 am to 2:00 pm @ 540 Cory Hall

Prerequisites: Knowledge of linear algebra, understanding of signals and systems at the level of EE120 or equivalent.

Course Announcement
Academic Dishonesty Policy

  • 10% class participation
  • 50% paper presentation and/or homework
  • 40% class project; proposals due October 15th.

Lecturer: Professor Avideh Zakhor
Office: 507 Cory Hall
Phone: (510) 643-677
Office Hours: Wednesday, 2:00pm - 3:00pm @ 507 Cory Hall

Course Reader: Ilya Chugunov
Office Hours: Monday, 1:00pm - 2:00pm @ 504 Cory Hall

Class Piazza:


  1. Computer Vision: Algorithms and Applications by Richard Szeliski; Springer PDF
  2. Computer Vision: A Modern Approach by Ponce and Forsyth PDF
  3. Multiple View Geometry in Computer Vision by Hartley and Zisserman PDF Tutorial Presentation
  4. Deep Learning by Ian Goodfellow, Yoshua Bengio and Aaron Courville Book
  5. Probabilistic Robotics by Sebastian Thrun, Wolfram Burgard, and Dieter Fox PDF


Date: Lecture Notes: Homework:
8/28/2019 lecture1-3d-recon.pdf

Read and Review:

  1. "PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation," by Charles R. Qi, Hao Su, Kaichun Mo, and Leonidas J. Guibas, 2017
  2. "PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space," by Charles R. Qi, Li Yi, Hao Su, and Leonidas J. Guibas, 2017
  3. "Frustum PointNets for 3D Object Detection from RGB-D Data," by Charles R. Qi, Wei Liu, Chenxia Wu, Hao Su, and Leonidas J. Guibas, 2018

9/04/2019 lecture2-deep-learning.pdf

Read and Review:

  1. "VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection" by Yin Zhou and Oncel Tuzel
  2. "VoxNet: A 3D Convolutional Neural Network for Real-Time Object Recognition," by Daniel Maturana and Sebastian Scherer, 2015

9/11/2019 lecture3-3d-acquisition.pdf

Read and Review:

  1. Depth from Motion for Smartphone AR, Julien Valentin et al., 2018
  2. UltraStereo: Efficient Learning-based Matching for Active Stereo Systems, S. R. Fanello et al., 2017



Read and Review:

  1. The Need 4 Speed in Real-Time Dense Visual Tracking, Adarsh Kowdle et al., 2018
  2. ADVIO: An Authentic Dataset for Visual-Inertial Odometry, Santiago Cortés, Arno Solin, Esa Rahtu, and Juho Kannala, 2018

10/02/2019 lecture6-single-view.pdf

Read and Review:

  1. Volumetric Capture of Humans with a Single RGBD Camera via Semi-Parametric Learning, Rohit Pandey, Anastasia Tkach, Shuoran Yang, Pavel Pidlypenskyi, Jonathan Taylor, Ricardo Martin-Brualla, Andrea Tagliasacchi, George Papandreou, Philip Davidson, Cem Keskin, Shahram Izadi, Sean Fanello, 2019
  2. Fusion4D: Real-time Performance Capture of Challenging Scenes, Mingsong Dou, Sameh Khamis, Yury Degtyarev, Philip Davidson, Sean Ryan Fanello, Adarsh Kowdle, Sergio Orts Escolano, Christoph Rhemann, David Kim, Jonathan Taylor, Pushmeet Kohli, Vladimir Tankovich, Shahram Izadi
  3. Montage4D: Real-time Seamless Fusion and Stylization of Multiview Video Textures, Ruofei Du, Ming Chuang, Wayne Chang, Hugues Hoppe, Amitabh Varshney



Read and Review:

  1.  Single View Pose Estimation of Mobile Devices in Urban Environments, A. Hallquist and A. Zakhor





Read and Review:

  1.  Deep Depth Completion of a Single RGB-D Image, Yinda Zhang and Thomas Funkhouser





Read and Review:

  1. Unsupervised Learning of Depth and Ego-Motion From Video, Tinghui Zhou, Matthew Brown, Noah Snavely, David G. Lowe



Read and Review:

  1. Bayesian DeNet: Monocular Depth Prediction and Frame-Wise Fusion With Synchronized Uncertainty, Xin Yang, Yang Gao, Hongcheng Luo, Chunyuan Liao, Kwang-Ting Cheng



Read and Review:

  1. FlowNet3D: Learning Scene Flow in 3D Point Clouds, Xingyu Liu, Charles R. Qi, Leonidas J. Guibas
  2. Supervised Fitting of Geometric Primitives to 3D Point Clouds, Lingxiao Li, Minhyuk Sung, Anastasia Dubrovina, Li Yi, Leonidas Guibas



Read and Review:

  1. Incrementally-deployable Indoor Navigation with Automatic Trace Generation, Yuanchao Shu, Zhuqi Li, Börje Karlsson, Yiyong Lin, Thomas Moscibroda, Kang Shin
  2. Cognitive Mapping and Planning for Visual Navigation, Saurabh Gupta, Varun Tolani, James Davidson, Sergey Levine, Rahul Sukthankar, Jitendra Malik



CVPR 2019:

  1. Scene Memory Transformer for Embodied Agents in Long-Horizon Tasks [pdf]
  2. SDRSAC: Semidefinite-Based Randomized Approach for Robust Point Cloud Registration Without Correspondences [pdf]
  3. TopNet: Structural Point Cloud Decoder [pdf]
  4. Spherical Fractal Convolutional Neural Networks for Point Cloud Recognition [pdf]
  5. FlowNet3D: Learning Scene Flow in 3D Point Clouds [pdf]
  6. PointRCNN: 3D Object Proposal Generation and Detection From Point Cloud [pdf]
  7. Structural Relational Reasoning of Point Clouds [pdf]
  8. Modeling Local Geometric Structure of 3D Point Clouds Using Geo-CNN [pdf]
  9. Supervised Fitting of Geometric Primitives to 3D Point Clouds [pdf]
  10. HPLFlowNet: Hierarchical Permutohedral Lattice FlowNet for Scene Flow Estimation on Large-Scale Point Clouds [pdf]
  11. Modeling Point Clouds With Self-Attention and Gumbel Subset Sampling [pdf]
  12. GSPN: Generative Shape Proposal Network for 3D Instance Segmentation in Point Cloud [pdf]
  13. Associatively Segmenting Instances and Semantics in Point Clouds [pdf]
  14. ClusterNet: Deep Hierarchical Cluster Network With Rigorously Rotation-Invariant Representation for Point Cloud Analysis [pdf]
  15. The Perfect Match: 3D Point Cloud Matching With Smoothed Densities [pdf]
  16. PointWeb: Enhancing Local Neighborhood Features for Point Cloud Processing [pdf]
  17. RL-GAN-Net: A Reinforcement Learning Agent Controlled GAN Network for Real-Time Point Cloud Shape Completion [pdf]
  18. Embodied Question Answering in Photorealistic Environments With Point Cloud Perception [pdf]
  19. GeoNet: Deep Geodesic Networks for Point Cloud Analysis [pdf]
  20. PointNetLK: Robust & Efficient Point Cloud Registration Using PointNet [pdf]
  21. A-CNN: Annularly Convolutional Neural Networks on Point Clouds [pdf]
  22. Point Cloud Oversegmentation With Graph-Structured Deep Metric Learning [pdf]
  23. PointFlowNet: Learning Representations for Rigid Motion Estimation From Point Clouds [pdf]
  24. DeepMapping: Unsupervised Map Estimation From Multiple Point Clouds [pdf]
  25. JSIS3D: Joint Semantic-Instance Segmentation of 3D Point Clouds With Multi-Task Pointwise Networks and Multi-Value Conditional Random Fields [pdf]
  26. Relation-Shape Convolutional Neural Network for Point Cloud Analysis [pdf]
  27. Generating 3D Adversarial Point Clouds [pdf]
  28. PointConv: Deep Convolutional Networks on 3D Point Clouds [pdf]
  29. Octree Guided CNN With Spherical Kernels for 3D Point Clouds [pdf]
  30. Robust Point Cloud Based Reconstruction of Large-Scale Outdoor Scenes [pdf]
  31. Nesti-Net: Normal Estimation for Unstructured 3D Point Clouds Using Convolutional Neural Networks [pdf]
  32. Graph Attention Convolution for Point Cloud Semantic Segmentation [pdf]
  33. LBS Autoencoder: Self-Supervised Fitting of Articulated Meshes to Point Clouds [pdf]
  34. PCAN: 3D Attention Map Learning Using Contextual Information for Point Cloud Based Retrieval [pdf]
  35. PointPillars: Fast Encoders for Object Detection From Point Clouds [pdf]
  36. Deep Reinforcement Learning of Volume-Guided Progressive View Inpainting for 3D Point Scene Completion From a Single Depth Image [pdf]
  37. Deep Sketch-Shape Hashing With Segmented 3D Stochastic Viewing [pdf]
  38. PartNet: A Large-Scale Benchmark for Fine-Grained and Hierarchical Part-Level 3D Object Understanding [pdf]
  39. Unsupervised Learning of Consensus Maximization for 3D Vision Problems [pdf]
  40. MVF-Net: Multi-View 3D Face Morphable Model Regression [pdf]
  41. Photometric Mesh Optimization for Video-Aligned 3D Object Reconstruction [pdf]
  42. 3D Point Capsule Networks [pdf]
  43. GS3D: An Efficient 3D Object Detection Framework for Autonomous Driving [pdf]
  44. Single-Image Piece-Wise Planar 3D Reconstruction via Associative Embedding [pdf]
  45. 3DN: 3D Deformation Network [pdf]
  46. Deep Fitting Degree Scoring Network for Monocular 3D Object Detection [pdf]
  47. Pushing the Envelope for RGB-Based Dense 3D Hand Pose Estimation via Neural Rendering [pdf]
  48. Self-Supervised Learning of 3D Human Pose Using Multi-View Geometry [pdf]
  49. Dense 3D Face Decoding Over 2500FPS: Joint Texture & Shape Convolutional Mesh Decoders [pdf]
  50. Towards High-Fidelity Nonlinear 3D Face Morphable Model [pdf]
  51. GANFIT: Generative Adversarial Network Fitting for High Fidelity 3D Face Reconstruction [pdf]
  52. Leveraging Shape Completion for 3D Siamese Tracking [pdf]
  53. Hierarchy Denoising Recursive Autoencoders for 3D Scene Layout Prediction [pdf]
  54. Elastic Boundary Projection for 3D Medical Image Segmentation [pdf]
  55. DeepVoxels: Learning Persistent 3D Feature Embeddings [pdf]
  56. 3D Local Features for Direct Pairwise Registration [pdf]
  57. Learning With Batch-Wise Optimal Transport Loss for 3D Shape Recognition [pdf]
  58. DenseFusion: 6D Object Pose Estimation by Iterative Dense Fusion [pdf]
  59. Exploiting Temporal Context for 3D Human Pose Estimation in the Wild [pdf]
  60. What Do Single-View 3D Reconstruction Networks Learn? [pdf]
  61. Semantic Graph Convolutional Networks for 3D Human Pose Regression [pdf]
  62. 3D-SIS: 3D Semantic Instance Segmentation of RGB-D Scans [pdf]
  63. PlaneRCNN: 3D Plane Detection and Reconstruction From a Single Image [pdf]
  64. Occupancy Networks: Learning 3D Reconstruction in Function Space [pdf]
  65. 3D Shape Reconstruction From Images in the Frequency Domain [pdf]
  66. H+O: Unified Egocentric Recognition of 3D Hand-Object Poses and Interactions [pdf]
  67. GFrames: Gradient-Based Local Reference Frame for 3D Shape Matching [pdf]
  68. LP-3DCNN: Unveiling Local Phase in 3D Convolutional Neural Networks [pdf]
  69. ApolloCar3D: A Large 3D Car Instance Understanding Benchmark for Autonomous Driving [pdf]
  70. Synthesizing 3D Shapes From Silhouette Image Collections Using Multi-Projection Generative Adversarial Networks [pdf]
  71. Scan2Mesh: From Unstructured Range Scans to 3D Meshes [pdf]
  72. Learning 3D Human Dynamics From Video [pdf]
  73. Unsupervised 3D Pose Estimation With Geometric Self-Supervision [pdf]
  74. Led3D: A Lightweight and Efficient Deep Approach to Recognizing Low-Quality 3D Faces [pdf]
  75. Photo Wake-Up: 3D Character Animation From a Single Photo [pdf]
  76. Patch-Based Progressive 3D Point Set Upsampling [pdf]
  77. Photon-Flooded Single-Photon 3D Cameras [pdf]
  78. Occlusion-Net: 2D/3D Occluded Keypoint Localization Using Graph Networks [pdf]
  79. Multi-Task Multi-Sensor Fusion for 3D Object Detection [pdf]
  80. Triangulation Learning Network: From Monocular to Stereo 3D Object Detection [pdf]
  81. Stereo R-CNN Based 3D Object Detection for Autonomous Driving [pdf]
  82. 3D Motion Decomposition for RGBD Future Dynamic Scene Synthesis [pdf]
  83. RGBD Based Dimensional Decomposition Residual Network for 3D Semantic Scene Completion [pdf]
  84. 3D Human Pose Estimation in Video With Temporal Convolutions and Semi-Supervised Training [pdf]
  85. Learning to Regress 3D Face Shape and Expression From an Image Without 3D Supervision [pdf]
  86. RepNet: Weakly Supervised Training of an Adversarial Reprojection Network for 3D Human Pose Estimation [pdf]
  87. Fast and Robust Multi-Person 3D Pose Estimation From Multiple Views [pdf]
  88. PA3D: Pose-Action 3D Machine for Video Recognition [pdf]
  89. Pseudo-LiDAR From Visual Depth Estimation: Bridging the Gap in 3D Object Detection for Autonomous Driving [pdf]
  90. Estimating 3D Motion and Forces of Person-Object Interactions From Monocular Video [pdf]
  91. Argoverse: 3D Tracking and Forecasting With Rich Maps [pdf]
  92. Shape2Motion: Joint Analysis of Motion Parts and Attributes From 3D Shapes [pdf]
  93. Minimal Solvers for Mini-Loop Closures in 3D Multi-Scan Alignment [pdf]
  94. 3D Appearance Super-Resolution With Deep Learning [pdf]
  95. Unsupervised Primitive Discovery for Improved 3D Generative Modeling [pdf]
  96. Spherical Regression: Learning Viewpoints, Surface Normals and 3D Rotations on N-Spheres [pdf]
  97. Learning View Priors for Single-View 3D Reconstruction [pdf]
  98. SIGNet: Semantic Instance Aided Unsupervised 3D Geometry Perception [pdf]
  99. Generating Multiple Hypotheses for 3D Human Pose Estimation With Mixture Density Network [pdf]
  100. 3D Guided Fine-Grained Face Manipulation [pdf]
  101. Capture, Learning, and Synthesis of 3D Speaking Styles [pdf]
  102. Superquadrics Revisited: Learning 3D Shape Parsing Beyond Cuboids [pdf]
  103. 3D Hand Shape and Pose Estimation From a Single RGB Image [pdf]
  104. 3D Hand Shape and Pose From Images in the Wild [pdf]
  105. Self-Supervised 3D Hand Pose Estimation Through Training by Fitting [pdf]
  106. HoloPose: Holistic 3D Human Reconstruction In-The-Wild [pdf]
  107. Weakly-Supervised Discovery of Geometry-Aware Representation for 3D Human Pose Estimation [pdf]
  108. In the Wild Human Pose Estimation Using Explicit 2D Features and Intermediate 3D Representations [pdf]
  109. Combining 3D Morphable Models: A Large Scale Face-And-Head Model [pdf]
  110. Boosting Local Shape Matching for Dense 3D Face Correspondence [pdf]
  111. Expressive Body Capture: 3D Hands, Face, and Body From a Single Image [pdf]
  112. Sim-Real Joint Reinforcement Transfer for 3D Indoor Navigation [pdf]
  113. Robustness of 3D Deep Learning in an Adversarial Setting [pdf]
  114. Monocular 3D Object Detection Leveraging Accurate Proposals and Shape Reconstruction [pdf]
  115. Disentangled Representation Learning for 3D Face Shape [pdf]
  116. An Efficient Schmidt-EKF for 3D Visual-Inertial SLAM [pdf]
  117. Putting Humans in a Scene: Learning Affordance in 3D Indoor Environments [pdf]
  118. Multiview 2D/3D Rigid Registration via a Point-Of-Interest Network for Tracking and Triangulation [pdf]
  119. LaserNet: An Efficient Probabilistic 3D Object Detector for Autonomous Driving [pdf]
  120. Machine Vision Guided 3D Medical Image Compression for Efficient Transmission and Accurate Segmentation in the Clouds [pdf]


  1. "Large-scale Point Cloud Semantic Segmentation with Superpoint Graphs," by Loic Landrieu and Martin Simonovsky, 2018
  2. "Im2Pano3D: Extrapolating 360 Structure and Semantics Beyond the Field of View," by Shuran Song, Andy Zeng, Angel X. Chang, Manolis Savva, Silvio Savarese, and Thomas Funkhouser, 2018
  3. "Tangent Convolutions for Dense Prediction in 3D," by Maxim Tatarchenko, Jaesik Park, Vladlen Koltun, and Qian-Yi Zhou, 2018
  4. "SGPN: Similarity Group Proposal Network for 3D Point Cloud Instance Segmentation," by Weiyue Wang, Ronald Yu, Qiangui Huang, and Ulrich Neumann, 2018
  5. "Automatic 3D Indoor Scene Modeling from Single Panorama," by Yang Yang, Shi Jin, Ruiyang Liu, Sing Bing Kang, and Jingyi Yu, 2018
  6. "Deep Depth Completion of a Single RGB-D Image," by Yinda Zhang and Thomas Funkhouser, 2018
  7. "LayoutNet: Reconstructing the 3D Room Layout from a Single RGB Image," by Chuhang Zou, Alex Colburn, Qi Shan, and Derek Hoiem, 2018
  8. "Frustum PointNets for 3D Object Detection from RGB-D Data," by Charles R. Qi, Wei Liu, Chenxia Wu, Hao Su, and Leonidas J. Guibas, 2018
  9. "KinectFusion: Real-Time Dense Surface Mapping and Tracking," by Richard A. Newcombe, Andrew J. Davison, Shahram Izadi, Pushmeet Kohli, Otmar Hilliges, Jamie Shotton, David Molyneaux, Steve Hodges, David Kim, and Andrew Fitzgibbon, 2011
  10. "VoxNet: A 3D Convolutional Neural Network for Real-Time Object Recognition," by Daniel Maturana and Sebastian Scherer, 2015
  11. "ElasticFusion: Dense SLAM Without A Pose Graph," by Thomas Whelan, Stefan Leutenegger, Renato F. Salas-Moreno, Ben Glocker, and Andrew J. Davison, 2015
  12. "3D Semantic Parsing of Large-Scale Indoor Spaces," by Iro Armeni, Ozan Sener, Amir R. Zamir, Helen Jiang, Ioannis Brilakis, Martin Fischer, and Silvio Savarese, 2016
  13. "Fine-To-Coarse Global Registration of RGB-D Scans," by Maciej Halber and Thomas Funkhouser, 2016
  14. "SemanticFusion: Dense 3D Semantic Mapping with Convolutional Neural Networks," by John McCormac, Ankur Handa, Andrew Davison, and Stefan Leutenegger, 2016
  15. "Semantic Scene Completion from a Single Depth Image," by Shuran Song, Fisher Yu, Andy Zeng, Angel X. Chang, Manolis Savva, and Thomas Funkhouser, 2016
  16. "Matterport3D: Learning from RGB-D Data in Indoor Environments," by Angel Chang, Angela Dai, Thomas Funkhouser, Maciej Halber, Matthias Nießner, Manolis Savva, Shuran Song, Andy Zeng, and Yinda Zhang, 2017
  17. "3DLite: Towards Commodity 3D Scanning for Content Creation," by Jingwei Huang, Angela Dai, Leonidas Guibas, and Matthias Nießner, 2017
  18. "BundleFusion: Real-time Globally Consistent 3D Reconstruction using On-the-fly Surface Re-integration," by Angela Dai, Matthias Nießner, Michael Zollhöfer, Shahram Izadi, and Christian Theobalt, 2017
  19. "Joint 2D-3D-Semantic Data for Indoor Scene Understanding," by Iro Armeni, Alexander Sax, Amir R. Zamir, and Silvio Savarese, 2017
  20. "PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation," by Charles R. Qi, Hao Su, Kaichun Mo, and Leonidas J. Guibas, 2017
  21. "PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space," by Charles R. Qi, Li Yi, Hao Su, and Leonidas J. Guibas, 2017
  22. "ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes," by Angela Dai, Angel X. Chang, Manolis Savva, Maciej Halber, Thomas Funkhouser, and Matthias Nießner, 2017
  23. "Predicting Complete 3D Models of Indoor Scenes," by Ruiqi Guo, Chuhang Zou, and Derek Hoiem, 2017
  24. "2D-Driven 3D Object Detection in RGB-D Images," by Jean Lahoud and Bernard Ghanem, 2017
  25. "3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation," by Angela Dai and Matthias Nießner, 2018
  26. "ScanComplete: Large-Scale Scene Completion and Semantic Segmentation for 3D Scans," by Angela Dai, Daniel Ritchie, Martin Bokeloh, Scott Reed, Jürgen Sturm, and Matthias Nießner, 2018
  27. "Fusion++: Volumetric Object-Level SLAM," by John McCormac, Ronald Clark, Michael Bloesch, Andrew J. Davison, and Stefan Leutenegger, 2018
  28. "VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection," by Yin Zhou and Oncel Tuzel
  29. "3D object detection: Learning 3D bounding boxes from scaled down 2D bounding boxes in RGB-D images," by Mohammad Muntasir Rahman, Yanhao Tan, Jian Xue, Ling Shao, and Ke Lu, 2019
  30. "D-SSD: Learning Hierarchical Features from RGB-D Images for Amodal 3D Object Detection," by Qianhui Luo, Huifang Ma, Yue Wang, Li Tang, and Rong Xiong, 2018
  31. "Amodal Detection of 3D Objects: Inferring 3D Bounding Boxes from 2D Ones in RGB-Depth Images," by Zhuo Deng and Longin Jan Latecki, 2017


Last updated 2019 by Ilya