
Fixing Malfunctional Objects With Learned Physical Simulation and Functional Prediction
Yining Hong,
Kaichun Mo,
Li Yi,
Leonidas J. Guibas,
Antonio Torralba,
Joshua Tenenbaum and
Chuang Gan
CVPR 2022
This paper studies the problem of fixing malfunctional 3D objects.
Given a malfunctional object, humans can perform mental simulations to reason about its functionality and figure out how to fix it.
We propose FixIt, a dataset containing around 5k poorly designed 3D physical objects, each paired with candidate choices for fixing it.
We present FixNet, a novel framework that seamlessly incorporates perception and physical dynamics.
Specifically, FixNet consists of a perception module to extract the structured representation from the 3D point cloud, a physical dynamics prediction module to simulate the results of interactions on 3D objects, and a functionality prediction module to evaluate the functionality and choose the correct fix.
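To make the three-module design concrete, here is a hypothetical sketch of such a perception / dynamics / functionality pipeline in PyTorch; the module internals, feature sizes, and interaction encoding are illustrative assumptions, not the released FixNet code.

```python
import torch
import torch.nn as nn

class PerceptionModule(nn.Module):
    """Encodes a point cloud (B, N, 3) into a small set of part-level feature tokens."""
    def __init__(self, num_parts=8, feat_dim=128):
        super().__init__()
        self.point_mlp = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, feat_dim))
        self.part_queries = nn.Parameter(torch.randn(num_parts, feat_dim))
        self.attn = nn.MultiheadAttention(feat_dim, num_heads=4, batch_first=True)

    def forward(self, points):                                # (B, N, 3)
        feats = self.point_mlp(points)                        # (B, N, D)
        queries = self.part_queries.expand(points.size(0), -1, -1)
        part_feats, _ = self.attn(queries, feats, feats)      # (B, P, D)
        return part_feats

class DynamicsModule(nn.Module):
    """Rolls part features forward under an interaction (e.g., a push), step by step."""
    def __init__(self, feat_dim=128, action_dim=6):
        super().__init__()
        self.step = nn.GRUCell(feat_dim + action_dim, feat_dim)

    def forward(self, part_feats, action, num_steps=5):       # action: (B, action_dim)
        B, P, D = part_feats.shape
        h = part_feats.reshape(B * P, D)
        a = action.unsqueeze(1).expand(B, P, -1).reshape(B * P, -1)
        for _ in range(num_steps):
            h = self.step(torch.cat([h, a], dim=-1), h)
        return h.reshape(B, P, D)

class FunctionalityModule(nn.Module):
    """Scores whether the simulated outcome is functional (one logit per candidate fix)."""
    def __init__(self, feat_dim=128):
        super().__init__()
        self.head = nn.Sequential(nn.Linear(feat_dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, rolled_feats):                          # (B, P, D)
        return self.head(rolled_feats.mean(dim=1))            # (B, 1)

# Toy forward pass: score one candidate fix for a batch of two objects.
points = torch.randn(2, 1024, 3)
action = torch.randn(2, 6)
perceive, simulate, judge = PerceptionModule(), DynamicsModule(), FunctionalityModule()
score = judge(simulate(perceive(points), action))
print(score.shape)                                            # torch.Size([2, 1])
```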
[Paper]
[Project]
[Video]
[Bibtex]

IFR-Explore: Learning Inter-object Functional Relationships in 3D Indoor Scenes
Qi Li*,
Kaichun Mo*,
Yanchao Yang,
Hang Zhao and
Leonidas J. Guibas
ICLR 2022
We take the first step in building an AI system that learns inter-object functional relationships in 3D indoor environments (e.g., a switch on the wall turns the light on or off, a remote control operates the TV).
The key technical contributions are modeling prior knowledge by training over large-scale scenes and designing interactive policies for effectively exploring the training scenes and quickly adapting to novel test scenes.
[Paper]
[Bibtex]

Object Pursuit: Building a Space of Objects via Discriminative Weight Generation
Chuanyu Pan*,
Yanchao Yang*,
Kaichun Mo,
Yueqi Duan and
Leonidas J. Guibas
ICLR 2022
We propose a framework to continuously learn object-centric representations for visual learning and understanding.
Our method leverages interactions to effectively sample diverse variations of an object and the corresponding training signals while learning the object-centric representations.
Throughout learning, objects are streamed one by one in random order with unknown identities, and are associated with latent codes that can synthesize discriminative weights for each object through a convolutional hypernetwork.
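A minimal sketch of the latent-code-to-weights idea, assuming a toy convolutional hypernetwork and a shared feature backbone (both illustrative placeholders, not the paper's implementation):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvHyperNetwork(nn.Module):
    """Maps an object latent code to the weights of a small conv head
    (an illustrative stand-in for the discriminative weights in the paper)."""
    def __init__(self, code_dim=64, in_ch=32, out_ch=1, k=3):
        super().__init__()
        self.in_ch, self.out_ch, self.k = in_ch, out_ch, k
        n_weights = out_ch * in_ch * k * k + out_ch            # kernel + bias
        self.gen = nn.Sequential(nn.Linear(code_dim, 256), nn.ReLU(),
                                 nn.Linear(256, n_weights))

    def forward(self, code, feat_map):                          # code: (code_dim,), feat_map: (B, in_ch, H, W)
        w = self.gen(code)
        kernel = w[:-self.out_ch].view(self.out_ch, self.in_ch, self.k, self.k)
        bias = w[-self.out_ch:]
        return F.conv2d(feat_map, kernel, bias, padding=self.k // 2)  # object-specific logits

# Each streamed object keeps only a latent code; the hypernetwork turns it into weights.
hyper = ConvHyperNetwork()
object_code = torch.randn(64, requires_grad=True)               # learned per object
features = torch.randn(4, 32, 64, 64)                           # from a shared backbone (assumed)
logits = hyper(object_code, features)
print(logits.shape)                                             # torch.Size([4, 1, 64, 64])
```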
[Paper]
[Bibtex]

VAT-Mart: Learning Visual Action Trajectory Proposals for Manipulating 3D ARTiculated Objects
Ruihai Wu*,
Yan Zhao*,
Kaichun Mo*,
Zizheng Guo,
Yian Wang,
Tianhao Wu,
Qingnan Fan,
Xuelin Chen,
Leonidas J. Guibas and
Hao Dong
ICLR 2022
We propose object-centric actionable visual priors as a novel perception-interaction handshaking point: the perception system outputs more actionable guidance than kinematic structure estimation by predicting dense geometry-aware, interaction-aware, and task-aware visual action affordances and trajectory proposals.
We design an interaction-for-perception framework, VAT-Mart, that learns such actionable visual representations by simultaneously training a curiosity-driven reinforcement learning policy that explores diverse interaction trajectories and a perception module that summarizes and generalizes the explored knowledge into pointwise predictions across diverse shapes.
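As a rough sketch of what dense, per-point actionable priors can look like as a network output, consider the hypothetical head below; the backbone, feature sizes, and waypoint trajectory parameterization are assumptions for illustration only.

```python
import torch
import torch.nn as nn

class ActionabilityHead(nn.Module):
    """Hypothetical per-point head: an affordance score and a short open-loop
    waypoint trajectory for every point, given point-wise features from any
    point-cloud backbone (e.g., a PointNet++-style encoder, not included here)."""
    def __init__(self, feat_dim=128, num_waypoints=5):
        super().__init__()
        self.afford = nn.Sequential(nn.Linear(feat_dim, 64), nn.ReLU(), nn.Linear(64, 1))
        self.traj = nn.Sequential(nn.Linear(feat_dim, 128), nn.ReLU(),
                                  nn.Linear(128, num_waypoints * 3))
        self.num_waypoints = num_waypoints

    def forward(self, point_feats):                           # (B, N, D)
        score = torch.sigmoid(self.afford(point_feats))       # (B, N, 1), affordance in [0, 1]
        waypoints = self.traj(point_feats)                     # (B, N, W*3)
        waypoints = waypoints.view(*point_feats.shape[:2], self.num_waypoints, 3)
        return score, waypoints                                # per-point affordance + proposal

head = ActionabilityHead()
feats = torch.randn(2, 2048, 128)                              # stand-in for backbone features
score, waypoints = head(feats)
print(score.shape, waypoints.shape)                            # (2, 2048, 1) (2, 2048, 5, 3)
```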
[Paper]
[Project]
[Video]
[Bibtex]

DSG-Net: Learning Disentangled Structure and Geometry for 3D Shape Generation
Jie Yang*,
Kaichun Mo*,
Yu-Kun Lai,
Leonidas J. Guibas and
Lin Gao
ACM Transactions on Graphics (TOG) 2022, to be presented at SIGGRAPH 2022
We introduce DSG-Net, a deep neural network that learns a disentangled structured mesh representation for 3D shapes, where two key aspects of shapes, geometry and structure, are encoded in a synergistic manner to ensure plausibility of the generated shapes, while also being disentangled as much as possible. This supports a range of novel shape generation applications with intuitive control, such as interpolation of structure (geometry) while keeping geometry (structure) unchanged.
Our method not only supports controllable generation applications, but also produces high-quality synthesized shapes.
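To make the intuitive-control claim concrete, here is a small, hypothetical sketch of disentangled structure/geometry interpolation; the encoders are omitted and the decoder is a toy placeholder, not DSG-Net's actual mesh decoder.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the shape decoder; only the interpolation logic below
# reflects the disentangled-code idea.
feat_dim = 128
decode_shape = nn.Sequential(nn.Linear(2 * feat_dim, 256), nn.ReLU(),
                             nn.Linear(256, 1024 * 3))          # toy decoder -> 1024 points

def interpolate_structure(z_struct_a, z_struct_b, z_geom, steps=5):
    """Walk from shape A's structure code to shape B's while reusing one geometry code."""
    shapes = []
    for t in torch.linspace(0.0, 1.0, steps):
        z_struct = (1 - t) * z_struct_a + t * z_struct_b        # structure varies
        z = torch.cat([z_struct, z_geom], dim=-1)                # geometry stays fixed
        shapes.append(decode_shape(z).view(1024, 3))
    return shapes

z_struct_a, z_struct_b = torch.randn(feat_dim), torch.randn(feat_dim)
z_geom = torch.randn(feat_dim)
print(len(interpolate_structure(z_struct_a, z_struct_b, z_geom)))  # 5 interpolated shapes
```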
[Paper]
[Project]
[Video]
[Bibtex]

O2O-Afford: Annotation-Free Large-Scale Object-Object Affordance Learning
Kaichun Mo,
Yuzhe Qin,
Fanbo Xiang,
Hao Su and
Leonidas J. Guibas
CoRL 2021
In contrast to the vast literature on modeling, perceiving, and understanding agent-object interactions in computer vision and robotics, very few past works have studied object-object interaction, which also plays an important role in robotic manipulation and planning tasks.
There is a rich space of object-object interaction scenarios in our daily life, such as placing an object on a messy tabletop, fitting an object inside a drawer, pushing an object using a tool, etc.
In this paper, we propose a large-scale object-object affordance learning framework that requires no human annotation or demonstration.
[Paper]
[Project]
[Video]
[Bibtex]

Learning to Regrasp by Learning to Place
Shuo Cheng,
Kaichun Mo and
Lin Shao
CoRL 2021
Regrasping is needed whenever a robot's current grasp pose fails to perform desired manipulation tasks.
In this paper, we propose a system that takes partial point clouds of an object and its supporting environment as inputs and outputs a sequence of pick-and-place operations to transform an initial object grasp pose into the desired object grasp poses.
The key techniques are a neural stable-placement predictor and a regrasp-graph-based solver that leverages and changes the surrounding environment.
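To illustrate the regrasp-graph idea, here is a toy example in plain Python: nodes are grasp poses, edges are feasible pick-and-place transitions (which the paper obtains from its learned placement predictor, not the hard-coded edges used here), and planning reduces to a shortest-path search.

```python
from collections import deque

# Toy regrasp graph: an edge (g1, g2) means the object can be placed somewhere stable
# while held with grasp g1 and then re-picked with grasp g2. The edges below are
# hard-coded purely for illustration.
regrasp_edges = {
    "grasp_top":        ["grasp_side_left", "grasp_side_right"],
    "grasp_side_left":  ["grasp_top", "grasp_bottom"],
    "grasp_side_right": ["grasp_top"],
    "grasp_bottom":     ["grasp_side_left"],
}

def plan_regrasp(start, goal, edges):
    """Breadth-first search for the shortest pick-and-place sequence of grasps."""
    queue, parent = deque([start]), {start: None}
    while queue:
        g = queue.popleft()
        if g == goal:
            path = []
            while g is not None:
                path.append(g)
                g = parent[g]
            return path[::-1]
        for nxt in edges.get(g, []):
            if nxt not in parent:
                parent[nxt] = g
                queue.append(nxt)
    return None                                       # no regrasp sequence found

print(plan_regrasp("grasp_top", "grasp_bottom", regrasp_edges))
# ['grasp_top', 'grasp_side_left', 'grasp_bottom']
```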
[Paper]
[Project]
[Bibtex]

Where2Act: From Pixels to Actions for Articulated 3D Objects
Kaichun Mo,
Leonidas J. Guibas,
Mustafa Mukadam,
Abhinav Gupta and
Shubham Tulsiani
ICCV 2021
One of the fundamental goals of visual perception is to allow agents to meaningfully interact with their environment.
In this paper, we take a step towards that long-term goal -- we extract highly localized actionable information related to elementary actions such as pushing or pulling for articulated objects with movable parts.
We propose, discuss, and evaluate novel network architectures that, given image and depth data, predict the set of actions possible at each pixel and the regions over articulated parts that are likely to move under the applied force. We propose a learning-from-interaction framework with an online data-sampling strategy that allows us to train the network in simulation (SAPIEN) and generalize across categories.
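A hypothetical per-pixel actionability predictor might look like the sketch below; the encoder-decoder, channel sizes, and action set are illustrative stand-ins rather than the paper's architecture.

```python
import torch
import torch.nn as nn

class PerPixelActionability(nn.Module):
    """Given an RGB-D image, predict a per-pixel 'actionability' score for each of
    K elementary actions (e.g., push, pull). Toy encoder-decoder for illustration."""
    def __init__(self, num_actions=2):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(4, 32, 3, stride=2, padding=1), nn.ReLU(),        # RGB + depth = 4 channels
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU())
        self.dec = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, num_actions, 4, stride=2, padding=1))

    def forward(self, rgbd):                                 # (B, 4, H, W)
        return torch.sigmoid(self.dec(self.enc(rgbd)))       # (B, K, H, W), scores in [0, 1]

net = PerPixelActionability()
rgbd = torch.cat([torch.rand(1, 3, 128, 128), torch.rand(1, 1, 128, 128)], dim=1)
print(net(rgbd).shape)                                       # torch.Size([1, 2, 128, 128])
```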
[Paper]
[Project]
[Video]
[Bibtex]

Rethinking Sampling in 3D Point Cloud Generative Adversarial Networks
He Wang*,
Zetian Jiang*,
Li Yi,
Kaichun Mo,
Hao Su and
Leonidas J. Guibas
CVPR 2021 Workshop "Learning to Generate 3D Shapes and Scenes"
We show that sampling-insensitive discriminators (e.g., PointNet-Max) produce shape point clouds with point clustering artifacts, while sampling-oversensitive discriminators (e.g., PointNet++, DGCNN) fail to guide valid shape generation.
We propose the concept of a sampling spectrum to characterize the different sampling sensitivities of discriminators.
We point out that, although recent research has focused on generator design, the main bottleneck of point cloud GANs actually lies in the discriminator design.
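For reference, a PointNet-Max style discriminator (shared per-point MLP followed by a global max-pool) looks like the sketch below; it is an illustrative example of the sampling-insensitive end of the spectrum, not the exact architecture used in the experiments.

```python
import torch
import torch.nn as nn

class PointNetMaxDiscriminator(nn.Module):
    """Illustrative PointNet-Max style discriminator: a shared per-point MLP followed
    by a global max-pool, so the real/fake score depends only weakly on how the
    surface was sampled (the property discussed in the paper)."""
    def __init__(self, feat_dim=128):
        super().__init__()
        self.point_mlp = nn.Sequential(nn.Linear(3, 64), nn.ReLU(),
                                       nn.Linear(64, feat_dim), nn.ReLU())
        self.cls = nn.Linear(feat_dim, 1)

    def forward(self, points):                   # (B, N, 3)
        feats = self.point_mlp(points)            # (B, N, D)
        global_feat = feats.max(dim=1).values     # max-pool over points -> (B, D)
        return self.cls(global_feat)              # real/fake logit, (B, 1)

disc = PointNetMaxDiscriminator()
cloud_a = torch.rand(1, 2048, 3)
cloud_b = cloud_a[:, torch.randperm(2048)[:1024]]   # a different sampling of the same shape
print(disc(cloud_a), disc(cloud_b))                 # scores for the two samplings
```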
[Paper]
[Bibtex]

Generative 3D Part Assembly via Dynamic Graph Learning
Jialei Huang*,
Guanqi Zhan*,
Qingnan Fan,
Kaichun Mo,
Lin Shao,
Baoquan Chen,
Leonidas J. Guibas and
Hao Dong
NeurIPS 2020
Analogous to assembling a piece of IKEA furniture, given a set of 3D part point clouds, we predict 6-DoF part poses to assemble a 3D shape.
To tackle this problem, we propose an assembly-oriented dynamic graph learning framework that leverages an iterative graph neural network as a backbone.
It explicitly conducts sequential part-assembly refinements in a coarse-to-fine manner and exploits a pair of modules, part-relation reasoning and part aggregation, to dynamically adjust both the part features and their relations in the part graph.
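As a rough illustration of the iterative pose-prediction idea (not the paper's exact architecture), here is a toy PyTorch sketch in which parts exchange messages over a fully connected part graph and re-predict their 6-DoF poses each round; the relation model, feature sizes, and aggregation are simplifying assumptions.

```python
import torch
import torch.nn as nn

class IterativePoseGNN(nn.Module):
    """Toy iterative graph network for part assembly: each round, parts exchange
    messages and re-predict a 6-DoF pose (3-D translation + unit quaternion)."""
    def __init__(self, feat_dim=128, iters=3):
        super().__init__()
        self.iters = iters
        self.encode = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, feat_dim))
        self.message = nn.Sequential(nn.Linear(2 * feat_dim, feat_dim), nn.ReLU())
        self.update = nn.GRUCell(feat_dim, feat_dim)
        self.pose_head = nn.Linear(feat_dim, 7)                   # translation (3) + quaternion (4)

    def forward(self, part_points):                               # (P, N, 3) point cloud per part
        h = self.encode(part_points).max(dim=1).values            # (P, D) per-part feature
        poses = []
        for _ in range(self.iters):
            # all-pairs messages, aggregated by a mean over the other parts
            pairs = torch.cat([h.unsqueeze(1).expand(-1, h.size(0), -1),
                               h.unsqueeze(0).expand(h.size(0), -1, -1)], dim=-1)
            msg = self.message(pairs).mean(dim=1)                 # (P, D)
            h = self.update(msg, h)
            pose = self.pose_head(h)
            quat = nn.functional.normalize(pose[:, 3:], dim=-1)   # keep quaternion unit-length
            poses.append(torch.cat([pose[:, :3], quat], dim=-1))
        return poses                                              # coarse-to-fine pose estimates

model = IterativePoseGNN()
parts = torch.randn(6, 512, 3)                                    # 6 part point clouds
print(model(parts)[-1].shape)                                     # torch.Size([6, 7])
```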
[Paper]
[Project]
[Bibtex]

Learning 3D Part Assembly from a Single Image
Yichen Li*,
Kaichun Mo*,
Lin Shao,
Minhyuk Sung and
Leonidas J. Guibas
ECCV 2020
(Also presented at the Holistic Scene Structures for 3D Vision workshop)
We introduce a novel problem, single-image-guided 3D part assembly, that assembles 3D shapes from parts given a complete set of part point cloud scans and a single 2D image depicting the object.
The task is motivated by the robotic assembly setting, where the estimated per-part poses serve as a vision-based initialization for the downstream planning and control components.
[Paper]
[Project]
[Video (1-min)]
[Video (7-min)]
[Bibtex]

PT2PC: Learning to Generate 3D Point Cloud Shapes from Part Tree Conditions
Kaichun Mo,
He Wang,
Xinchen Yan and
Leonidas J. Guibas
ECCV 2020
This paper investigates the novel problem of generating 3D shape point cloud geometry from a symbolic part tree representation.
In order to learn such a conditional shape generation procedure in an end-to-end fashion, we propose a conditional GAN "part tree"-to-"point cloud" model (PT2PC) that disentangles the structural and geometric factors.
[Paper]
[Project]
[Video (1-min)]
[Video (10-min)]
[Bibtex]

SAPIEN: A SimulAted Part-based Interactive ENvironment
Fanbo Xiang,
Yuzhe Qin,
Kaichun Mo,
Yikuan Xia,
Hao Zhu,
Fangchen Liu,
Minghua Liu,
Hanxiao Jiang,
Yifu Yuan,
He Wang,
Li Yi,
Angel X. Chang,
Leonidas J. Guibas and
Hao Su
CVPR 2020, Oral Presentation
We propose SAPIEN, a realistic and physics-rich simulation environment hosting a large-scale set of 3D articulated objects from ShapeNet and PartNet.
Our PartNet-Mobility dataset contains 14,068 articulated parts with part motion information for 2,346 object models from 46 common indoor object categories.
SAPIEN enables various robotic vision and interaction tasks that require detailed part-level understanding.
[Paper]
[Project Page]
[Code]
[Demo]
[Video]
[Bibtex]

StructEdit: Learning Structural Shape Variations
Kaichun Mo*,
Paul Guerrero*,
Li Yi,
Hao Su,
Peter Wonka,
Niloy Mitra and
Leonidas J. Guibas
CVPR 2020
Featured in: CVPR Daily (Tue)
Video: CVPR Workshop on Learning 3D Generative Models (Invited Talk by Paul Guerrero)
We learn a space of local shape edits (shape deltas) that captures both discrete structural changes and continuous variations.
Our approach is based on a conditional variational autoencoder (cVAE) that encodes and decodes shape deltas, conditioned on a source shape.
On the PartNet dataset, the learned shape-delta space supports shape edit suggestions, shape analogy, and shape edit transfer substantially better than StructureNet.
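Below is a minimal sketch of a conditional VAE over shape deltas in PyTorch; the flat vector encodings and dimensions are illustrative assumptions, since StructEdit actually operates on structured shape representations.

```python
import torch
import torch.nn as nn

class ShapeDeltaCVAE(nn.Module):
    """Minimal conditional VAE over shape edits: encode a delta conditioned on the
    source shape's code, then decode it back. Vector sizes are illustrative."""
    def __init__(self, shape_dim=256, delta_dim=256, z_dim=64):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(shape_dim + delta_dim, 256), nn.ReLU(),
                                 nn.Linear(256, 2 * z_dim))        # -> mean and log-variance
        self.dec = nn.Sequential(nn.Linear(shape_dim + z_dim, 256), nn.ReLU(),
                                 nn.Linear(256, delta_dim))
        self.z_dim = z_dim

    def forward(self, source_code, delta):
        stats = self.enc(torch.cat([source_code, delta], dim=-1))
        mu, logvar = stats[..., :self.z_dim], stats[..., self.z_dim:]
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)     # reparameterization trick
        recon = self.dec(torch.cat([source_code, z], dim=-1))
        return recon, mu, logvar

    def suggest_edits(self, source_code, num=4):
        """Sample edit suggestions for a source shape by drawing z from the prior."""
        z = torch.randn(num, self.z_dim)
        src = source_code.expand(num, -1)
        return self.dec(torch.cat([src, z], dim=-1))

model = ShapeDeltaCVAE()
recon, mu, logvar = model(torch.randn(8, 256), torch.randn(8, 256))
print(recon.shape, model.suggest_edits(torch.randn(1, 256)).shape)  # (8, 256) (4, 256)
```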
[Paper]
[Project]
[Video]
[Bibtex]

Learning to Group: A Bottom-Up Framework for 3D Part Discovery in Unseen Categories
Tiange Luo,
Kaichun Mo,
Zhiao Huang,
Jiarui Xu,
Siyu Hu,
Liwei Wang and
Hao Su
ICLR 2020
We address the problem of learning to discover 3D parts for objects in unseen categories under the zero-shot learning setting.
We propose a learning-based iterative grouping framework that learns a grouping policy to progressively merge small part proposals into larger ones in a bottom-up fashion.
On PartNet, we demonstrate that our method can transfer knowledge of parts learned from 3 training categories to 21 unseen categories.
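A toy sketch of the bottom-up merging loop is given below; the pairwise scorer, mean-point features, and greedy selection are illustrative placeholders, whereas the actual framework learns the grouping policy from data.

```python
import torch
import torch.nn as nn

# A learned pairwise scorer decides whether two part proposals should merge; here it
# operates on toy mean-point features, and the loop greedily merges the highest-scoring
# pair until no pair clears the threshold.
merge_scorer = nn.Sequential(nn.Linear(6, 32), nn.ReLU(), nn.Linear(32, 1))

def group_parts(proposals, threshold=0.0, max_iters=100):
    """proposals: list of (N_i, 3) point sets for small part proposals."""
    parts = [p for p in proposals]
    for _ in range(max_iters):
        best_score, best_pair = threshold, None
        for i in range(len(parts)):
            for j in range(i + 1, len(parts)):
                feat = torch.cat([parts[i].mean(0), parts[j].mean(0)])   # toy pair feature
                score = merge_scorer(feat).item()
                if score > best_score:
                    best_score, best_pair = score, (i, j)
        if best_pair is None:
            break                                    # no pair is confident enough to merge
        i, j = best_pair
        merged = torch.cat([parts[i], parts[j]], dim=0)
        parts = [p for k, p in enumerate(parts) if k not in (i, j)] + [merged]
    return parts

proposals = [torch.randn(64, 3) for _ in range(10)]   # 10 small over-segmented proposals
print(len(group_parts(proposals)))                     # number of discovered parts
```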
[Paper]
[Project]
[Video]
[Bibtex]