HyperTransformer: Model Generation for Supervised and Semi-Supervised Few-Shot Learning

📖 arXiv: 2201.04182

Motivation

  1. One approach to few-shot learning is metric-based: a task-independent embedding is learned once, and query samples are classified by comparing their embeddings to those of the support set.

💡 The main idea is to use a transformer model that, given a few-shot task episode, generates an entire inference model by producing all of its weights in a single pass.
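A minimal sketch of this idea (assuming PyTorch; names such as `WeightGenerator`, `support_tokens`, and the layer sizes are illustrative, not taken from the paper): a transformer reads the embedded support set together with learnable placeholder tokens and emits the weights of a conv layer, which is then applied directly to the query images.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class WeightGenerator(nn.Module):
    """Turns an embedded support set into the weights of one conv layer."""

    def __init__(self, token_dim=64, out_ch=16, in_ch=3, k=3):
        super().__init__()
        self.shape = (out_ch, in_ch, k, k)
        # Learnable placeholder tokens, one per output channel; the transformer
        # converts each into that channel's filter and bias.
        self.weight_tokens = nn.Parameter(torch.randn(1, out_ch, token_dim))
        layer = nn.TransformerEncoderLayer(d_model=token_dim, nhead=4, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=2)
        self.to_params = nn.Linear(token_dim, in_ch * k * k + 1)

    def forward(self, support_tokens):
        # support_tokens: (1, N*K, token_dim) -- embedded support images + labels.
        n_slots = self.weight_tokens.shape[1]
        tokens = torch.cat([support_tokens, self.weight_tokens], dim=1)
        slots = self.transformer(tokens)[:, -n_slots:]      # weight placeholders only
        params = self.to_params(slots).squeeze(0)           # (out_ch, in_ch*k*k + 1)
        weight = params[:, :-1].reshape(self.shape)
        bias = params[:, -1]
        return weight, bias


def classify_queries(support_tokens, query_images, generator):
    # One pass of the generator yields a task-specific conv layer,
    # which is then applied to the query images.
    weight, bias = generator(support_tokens)
    return F.conv2d(query_images, weight, bias, padding=1)
```

The full model generates weights for the whole target CNN this way; the single-layer version above only keeps the episode-to-weights flow visible.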

Contribution

  1. Small CNN Architectures: this method is more effective than training a universal task-independent embedding.

  2. Large CNN Architectures:

    We develop a novel replay buffer consistent with the architecture and training protocol of ODT.

Methodology

  1. This paper combines DT with SAC, adopting a maximum-entropy objective to encourage exploration during fine-tuning.
  2. Minor changes:
    1. Change the replay buffer from storing transitions to storing whole trajectories (see the sketch after this list).
    2. Utilize HER to improve sample efficiency in sparse-reward settings.
    3. Sampling strategy.
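A minimal sketch of these changes (Python; `Trajectory`, `TrajectoryReplayBuffer`, and `relabel_with_hindsight` are illustrative names, and the return-weighted sampling is just one possible strategy, not necessarily the one used here): the buffer stores whole trajectories for the sequence model, and HER-style relabeling replaces a trajectory's goal with a state it actually reached so that sparse rewards become informative.

```python
import random
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Trajectory:
    states: List[list]    # length T + 1
    actions: List[list]   # length T
    rewards: List[float]  # length T
    goal: list


class TrajectoryReplayBuffer:
    """Stores full trajectories (not single transitions) for sequence models like DT."""

    def __init__(self, capacity: int = 1000):
        self.capacity = capacity
        self.trajectories: List[Trajectory] = []

    def add(self, traj: Trajectory):
        self.trajectories.append(traj)
        if len(self.trajectories) > self.capacity:
            self.trajectories.pop(0)  # evict the oldest trajectory

    def sample(self, batch_size: int) -> List[Trajectory]:
        # One possible sampling strategy: replay high-return trajectories more often.
        returns = [sum(t.rewards) for t in self.trajectories]
        low = min(returns)
        weights = [r - low + 1e-6 for r in returns]  # keep weights non-negative
        return random.choices(self.trajectories, weights=weights, k=batch_size)


def relabel_with_hindsight(traj: Trajectory,
                           reward_fn: Callable[[list, list], float]) -> Trajectory:
    """HER-style relabeling: pretend the final achieved state was the intended goal."""
    new_goal = traj.states[-1]
    new_rewards = [reward_fn(s, new_goal) for s in traj.states[1:]]
    return Trajectory(traj.states, traj.actions, new_rewards, new_goal)
```

Relabeled copies can be stored alongside the original trajectories, which is what makes HER useful when rewards are sparse.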

References

  1. Levine, Sergey. "Reinforcement Learning and Control as Probabilistic Inference: Tutorial and Review." arXiv:1805.00909, 2018.
  2. Andrychowicz, Marcin, Filip Wolski, Alex Ray, Jonas Schneider, Rachel Fong, Peter Welinder, Bob McGrew, Josh Tobin, Pieter Abbeel, and Wojciech Zaremba. "Hindsight Experience Replay." In Advances in Neural Information Processing Systems, 2017.