Shao-Hua Sun (孫紹華)
Assistant Professor at National Taiwan University
My research interests span the fields of Deep Learning, Computer Vision, Reinforcement Learning, Meta-learning, and Robotics.


Bio

I am an Assistant Professor at National Taiwan University (NTU) with a joint appointment in the Department of Electrical Engineering and the Graduate Institute of Communication Engineering. Prior to joining NTU, I completed my Ph.D. in Computer Science at the University of Southern California, where I worked in the Cognitive Learning for Vision and Robotics Lab (CLVR) with Professor Joseph J. Lim. Before that, I received my B.S. degree in Electrical Engineering from NTU. My research interests span Robot Learning, Reinforcement Learning, Robotics, and Program Synthesis.

Hiring news: I am looking for students interested in robot learning, reinforcement learning, robotics, program synthesis, or related areas. Specifically, I am hiring M.S. and Ph.D. students admitted to the Data Science and Smart Networking Group at the Graduate Institute of Communication Engineering (電信所丙組/資料科學與智慧網路組) or the Data Science Degree Program (資料科學學位學程) at NTU. I am also seeking undergraduate students (project info session slides: 專題說明會投影片), research assistants, and visiting students/scholars at all experience levels. If interested, please contact me at shaohuas@ntu.edu.tw with your CV and transcript.

If you are interested in joining my group, you are very welcome to visit my lab at MK-518 (學新館518) to talk to my graduate and undergraduate students to learn about their research topics, my advising style, etc. Note that weekday afternoons usually work best, especially Thursday and Friday afternoons.

Ph.D. Dissertation

Title: Program-Guided Framework for Interpreting and Acquiring Complex Skills with Learning Robots

Abstract: My research focuses on developing a robot learning framework that enables robots to acquire long-horizon and complex skills with hierarchical structures, such as furniture assembly and cooking. Specifically, I aim to devise a robot learning framework which is: (1) interpretable: by decoupling interpreting skill specifications (e.g., demonstrations, reward functions) from executing skills, (2) programmatic: by generalizing from simple instances to complex instances without additional learning, (3) hierarchical: by operating on a proper level of abstraction that enables human users to interpret robots' high-level plans and allows for composing primitive skills to solve long-horizon tasks, and (4) modular: by being equipped with collaborating modules specialized in different functions (e.g., perception, action), allowing for better generalization. This dissertation discusses a series of projects toward building such an interpretable, programmatic, hierarchical, and modular robot learning framework.

Dissertation committee: Professor Joseph J. Lim (Chair), Professor Gaurav Sukhatme, Professor Stefanos Nikolaidis, and Professor Quan Nguyen

Publications

Skill-based Meta-Reinforcement Learning
in International Conference on Learning Representations (ICLR) 2022 and
Meta-Learning Workshop at Neural Information Processing Systems (NeurIPS) 2021 and
Deep RL Workshop at Neural Information Processing Systems (NeurIPS) 2021

We devise a method that enables meta-learning on long-horizon, sparse-reward tasks, allowing us to solve unseen target tasks with orders of magnitude fewer environment interactions. Specifically, we propose to (1) extract reusable skills and a skill prior from offline datasets, (2) meta-train a high-level policy that learns to efficiently compose learned skills into long-horizon behaviors, and (3) rapidly adapt the meta-trained policy to solve an unseen target task.

Learning to Synthesize Programs as Interpretable and Generalizable Policies
in Neural Information Processing Systems (NeurIPS) 2021

We present a framework that learns to synthesize a program, detailing the procedure to solve a task in a flexible and expressive manner, solely from reward signals. To alleviate the difficulty of learning to compose programs to induce the desired agent behavior from scratch, we propose to learn a program embedding space that continuously parameterizes diverse behaviors in an unsupervised manner and then search over the learned program embedding space to yield a program that maximizes the return for a given task.
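The search over the learned program embedding space can be done with a simple black-box optimizer. Below is a minimal NumPy sketch of a cross-entropy-method-style search, where `reward_fn` is a hypothetical stand-in for decoding an embedding into a program and executing it in the task environment; the names, hyperparameters, and the choice of CEM are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def cem_search(reward_fn, dim, iters=50, pop=64, elite=8, seed=0):
    # Search a learned program embedding space for an embedding that
    # maximizes task return. reward_fn(z) is assumed to decode the
    # embedding z into a program, run it, and return the total reward.
    rng = np.random.default_rng(seed)
    mu, sigma = np.zeros(dim), np.ones(dim)
    for _ in range(iters):
        # Sample candidate embeddings around the current distribution
        z = rng.normal(mu, sigma, size=(pop, dim))
        scores = np.array([reward_fn(x) for x in z])
        # Refit the distribution to the top-scoring (elite) candidates
        elites = z[np.argsort(scores)[-elite:]]
        mu, sigma = elites.mean(0), elites.std(0) + 1e-3
    return mu
```

For instance, with a toy reward peaked at a particular embedding, the search mean converges toward that optimum within a few dozen iterations.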

Generalizable Imitation Learning from Observation via Inferring Goal Proximity
in Neural Information Processing Systems (NeurIPS) 2021

Task progress provides intuitive and readily available information that can guide an agent toward the desired goal, and a progress estimator can generalize to new situations. From this intuition, we propose a simple yet effective imitation-learning-from-observation method for goal-directed tasks that uses a learned goal proximity function as a task progress estimator, yielding better generalization to unseen states and goals. We obtain this goal proximity function from expert demonstrations and online agent experience, and then use the learned goal proximity as a dense reward for policy training.
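The core idea can be sketched in a few lines. In this illustrative NumPy snippet (function names and the discounted labeling scheme are assumptions for exposition, not the paper's exact formulation), demonstration states are labeled with a proximity that grows toward the goal, and the dense reward is the increase in predicted proximity:

```python
import numpy as np

def proximity_targets(demo_len, gamma=0.95):
    # In a demonstration of length T, a state t steps before the goal
    # gets proximity gamma**t, so proximity increases toward the goal.
    return np.array([gamma ** (demo_len - 1 - t) for t in range(demo_len)])

def proximity_reward(prox_fn, s, s_next):
    # Dense reward: the change in predicted goal proximity after a step,
    # where prox_fn is a (learned) goal proximity estimator.
    return prox_fn(s_next) - prox_fn(s)
```

Transitions that move the agent closer to the goal then receive positive reward, giving the policy a dense learning signal even when the environment reward is sparse.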

Program Guided Agent
in International Conference on Learning Representations (ICLR) 2020   (Spotlight)

We propose to utilize programs, structured in a formal language, as a precise and expressive way to specify tasks, instead of natural language, which can often be ambiguous. We then devise a modular framework that learns to perform a task specified by a program – as different circumstances give rise to diverse ways to accomplish the task, our framework can perceive which circumstance it is currently in and instruct a multitask policy accordingly to fulfill each subtask of the overall task.

Multimodal Model-Agnostic Meta-Learning via Task-Aware Modulation
in Neural Information Processing Systems (NeurIPS) 2019   (Spotlight)

Model-agnostic meta-learners aim to acquire meta-prior parameters from a distribution of tasks and adapt to novel tasks with few gradient updates. Yet, seeking a common initialization shared across the entire task distribution substantially limits the diversity of the task distributions that they are able to learn from. We propose a multimodal MAML (MMAML) framework, which is able to modulate its meta-learned prior according to the identified mode, allowing more efficient fast adaptation.
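To make "modulating a meta-learned prior" concrete, here is a minimal FiLM-style sketch in NumPy. The task encoder that produces the scale/shift vectors is omitted, and all names and values are illustrative assumptions rather than MMAML's exact architecture:

```python
import numpy as np

def modulate(theta, tau):
    # theta: meta-learned prior parameters for one layer
    # tau:   (scale, shift) modulation vectors produced by a
    #        hypothetical task encoder that identifies the task mode
    scale, shift = tau
    return theta * scale + shift

theta = np.ones(3)  # shared meta-learned prior
# Different identified task modes yield different modulations,
# hence different effective priors before gradient-based adaptation.
mode_a = (np.array([1.0, 1.0, 1.0]), np.zeros(3))
mode_b = (np.array([0.5, 2.0, 1.0]), np.array([0.1, 0.0, -0.1]))
```

Gradient-based fast adaptation then starts from the mode-specific parameters `modulate(theta, tau)` instead of a single shared initialization.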

Feedback Adversarial Learning: Spatial Feedback for Improving Generative Adversarial Networks
in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019

We propose a feedback adversarial learning (FAL) framework that improves existing generative adversarial networks by leveraging spatial feedback from the discriminator. We formulate the generation task as a recurrent process, in which the generator conditions on the discriminator's spatial output response and on its own previous generation to improve generation quality over time, allowing the generator to attend to and fix its previous mistakes.

Composing Complex Skills by Learning Transition Policies
in International Conference on Learning Representations (ICLR) 2019

Humans acquire complex skills by exploiting previously learned skills and making transitions between them. To empower machines with this ability, we propose a method that can learn transition policies which effectively connect primitive skills to perform sequential tasks without handcrafted rewards. To efficiently train our transition policies, we introduce proximity predictors which induce rewards gauging proximity to suitable initial states for the next skill.

Toward Multimodal Model-Agnostic Meta-Learning
in Meta-Learning Workshop at Neural Information Processing Systems (NeurIPS) 2018

Model-agnostic meta-learners aim to acquire meta-prior parameters from a distribution of tasks and adapt to novel tasks with few gradient updates. Yet, seeking a common initialization shared across the entire task distribution substantially limits the diversity of the task distributions that they are able to learn from. We propose a multimodal MAML (MMAML) framework, which is able to modulate its meta-learned prior according to the identified mode, allowing more efficient fast adaptation.

Multi-view to Novel View: Synthesizing Novel Views with Self-Learned Confidence
in European Conference on Computer Vision (ECCV) 2018

We aim to synthesize a target image with an arbitrary camera pose from multiple given source images. We propose an end-to-end trainable framework which consists of a flow prediction module and a pixel generation module to directly leverage information present in the source views as well as hallucinate missing pixels from statistical priors. We introduce a self-learned confidence aggregation mechanism to merge the predictions produced by the two modules given multi-view source images.

Neural Program Synthesis from Diverse Demonstration Videos
in International Conference on Machine Learning (ICML) 2018

Interpreting decision making logic in demonstration videos is key to collaborating with and mimicking humans. To empower machines with this ability, we propose a framework that is able to explicitly synthesize underlying programs from behaviorally diverse and visually complicated demonstration videos. We introduce a summarizer module to improve the network’s ability to integrate multiple demonstrations and employ a multi-task objective to encourage the model to learn meaningful intermediate representations.

Professional Activity

Conference reviewer: NeurIPS, ICML, ICLR, CVPR, ICCV, ECCV, CoRL, AAAI, HRI, WACV, ICIP, BMVC
Journal reviewer: Transactions on Machine Learning Research, IEEE Transactions on Image Processing