gradient-based meta-learning consistently leads to learning strategies that generalize more widely compared to those represented by recurrent models.

In studying the universality of MAML, we find that, for a sufficiently deep learner model, MAML has the same theoretical representational power as recurrent meta-learners. We therefore conclude that, when using deep, expressive function approximators, there is no theoretical disadvantage in terms of representational power to using MAML over a black-box meta-learner represented, for example, by a recurrent network.
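
As a concrete reference point for the gradient-based meta-learning discussed above, here is a minimal MAML-style inner/outer loop. The sine-regression task family, network size, and learning rates are illustrative assumptions, not the paper's setup.

<code python>
# Minimal MAML-style sketch on toy sine regression (illustrative assumptions throughout).
import torch

torch.manual_seed(0)

def sample_task():
    """Hypothetical task family: sine waves with a random phase."""
    phase = torch.rand(1).item() * 3.14
    def sample(n=10):
        x = torch.rand(n, 1) * 10 - 5
        return x, torch.sin(x + phase)
    return sample

def mlp(params, x):
    w1, b1, w2, b2 = params
    return torch.relu(x @ w1 + b1) @ w2 + b2

params = [(torch.randn(1, 40) * 0.1).requires_grad_(), torch.zeros(40, requires_grad=True),
          (torch.randn(40, 1) * 0.1).requires_grad_(), torch.zeros(1, requires_grad=True)]
meta_opt = torch.optim.Adam(params, lr=1e-3)
inner_lr = 0.01

for step in range(1000):
    meta_loss = 0.0
    for _ in range(4):                                  # meta-batch of tasks
        sample = sample_task()
        x_s, y_s = sample()                             # support set
        x_q, y_q = sample()                             # query set from the same task
        # Inner loop: one gradient step on the support set (create_graph keeps 2nd-order terms).
        loss_s = ((mlp(params, x_s) - y_s) ** 2).mean()
        grads = torch.autograd.grad(loss_s, params, create_graph=True)
        adapted = [p - inner_lr * g for p, g in zip(params, grads)]
        # Outer objective: loss of the adapted parameters on the query set.
        meta_loss = meta_loss + ((mlp(adapted, x_q) - y_q) ** 2).mean()
    meta_opt.zero_grad()
    meta_loss.backward()
    meta_opt.step()
</code>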

https://arxiv.org/abs/1802.07245v1 Meta-Reinforcement Learning of Structured Exploration Strategies

We introduce a novel gradient-based fast adaptation algorithm -- model agnostic exploration with structured noise (MAESN) -- to learn exploration strategies from prior experience. The prior experience is used both to initialize a policy and to acquire a latent exploration space that can inject structured stochasticity into a policy, producing exploration strategies that are informed by prior knowledge and are more effective than random action-space noise.
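
To illustrate the structured-stochasticity idea (not the full MAESN training procedure), the sketch below contrasts a per-episode latent noise variable fed to the policy with ordinary per-step action-space noise; the dimensions and policy architecture are placeholders.

<code python>
# Structured (latent) exploration noise vs. per-step action noise; a conceptual sketch only.
import torch
import torch.nn as nn

obs_dim, act_dim, latent_dim = 8, 2, 3
policy = nn.Sequential(nn.Linear(obs_dim + latent_dim, 64), nn.Tanh(), nn.Linear(64, act_dim))

# Per-task latent distribution; in MAESN its parameters are meta-learned and adapted per task.
mu = torch.zeros(latent_dim, requires_grad=True)
log_std = torch.zeros(latent_dim, requires_grad=True)

def act_structured(obs, z):
    # The same z conditions every step of an episode, so exploration is temporally coherent.
    return policy(torch.cat([obs, z], dim=-1))

def act_with_action_noise(obs):
    # Baseline: independent Gaussian noise added to each action.
    return policy(torch.cat([obs, torch.zeros(latent_dim)], dim=-1)) + 0.1 * torch.randn(act_dim)

z = mu + log_std.exp() * torch.randn(latent_dim)        # reparameterized sample, drawn once per episode
obs = torch.randn(obs_dim)
print(act_structured(obs, z), act_with_action_noise(obs))
</code>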

https://arxiv.org/abs/1711.08105v1 Visual Question Answering as a Meta Learning Task

We propose instead to approach VQA as a meta learning task, thus separating the question answering method from the information required. At test time, the method is provided with a support set of example questions/answers, over which it reasons to resolve the given question. The support set is not fixed and can be extended without retraining, thereby expanding the capabilities of the model. To exploit this dynamically provided information, we adapt a state-of-the-art VQA model with two techniques from the recent meta learning literature, namely prototypical networks and meta networks.
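
Of the two techniques named, prototypical networks are the simpler component; a minimal sketch of that part (with made-up embedding sizes and pre-computed embeddings rather than a real VQA model) looks like this:

<code python>
# Prototypical-network step: classify queries by distance to class prototypes
# computed from a dynamically provided support set (toy embeddings, not a VQA model).
import torch

def proto_classify(support_emb, support_labels, query_emb, n_classes):
    # Prototype = mean embedding of the support examples of each class.
    prototypes = torch.stack([support_emb[support_labels == c].mean(0) for c in range(n_classes)])
    # Negative squared Euclidean distance to each prototype serves as the class logit.
    logits = -((query_emb.unsqueeze(1) - prototypes.unsqueeze(0)) ** 2).sum(-1)
    return logits.softmax(-1)

emb_dim, n_classes = 16, 3
support_emb = torch.randn(12, emb_dim)               # 4 support examples per class, already embedded
support_labels = torch.arange(n_classes).repeat(4)
query_emb = torch.randn(5, emb_dim)
print(proto_classify(support_emb, support_labels, query_emb, n_classes).shape)  # torch.Size([5, 3])
</code>

Extending the support set then just adds rows to support_emb and support_labels; nothing is retrained.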

https://arxiv.org/pdf/1802.08969v1.pdf Meta Multi-Task Learning for Sequence Modeling

Semantic composition functions have been playing a pivotal role in neural representation learning of text sequences. In spite of their success, most existing models suffer from an underfitting problem: they use the same shared compositional function at all positions in the sequence, and thus lack the expressive power to capture the richness of compositionality. Besides, the composition functions of different tasks are independent and learned from scratch. In this paper, we propose a new scheme for sharing the composition function across multiple tasks. Specifically, we use a shared meta-network to capture the meta-knowledge of semantic composition and generate the parameters of the task-specific semantic composition models.
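
A minimal sketch of this sharing scheme: a shared meta-network emits the weights of a task-specific linear composition layer from a task embedding. All sizes are assumptions, and the paper applies this to sequence composition functions rather than a single linear layer.

<code python>
# Shared meta-network that generates the parameters of a task-specific composition layer.
import torch
import torch.nn as nn

d_in, d_out, d_task = 32, 32, 8

class MetaNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.gen = nn.Linear(d_task, d_in * d_out + d_out)   # emits a weight matrix and a bias

    def forward(self, task_emb, x):
        flat = self.gen(task_emb)
        W = flat[: d_in * d_out].view(d_out, d_in)
        b = flat[d_in * d_out:]
        return x @ W.t() + b          # task-specific composition applied to the shared representation

meta = MetaNetwork()
task_emb = torch.randn(d_task)        # one learned embedding per task in the multi-task setting
x = torch.randn(4, d_in)              # a batch of position/token representations
print(meta(task_emb, x).shape)        # torch.Size([4, 32])
</code>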

https://arxiv.org/pdf/1712.06283v1.pdf A Bridge Between Hyperparameter Optimization and Learning-to-learn

We consider a class of nested optimization problems involving inner and outer objectives. We observe that by explicitly taking into account the optimization dynamics of the inner objective, it is possible to derive a general framework that unifies gradient-based hyperparameter optimization and meta-learning (or learning-to-learn). Depending on the specific setting, the variables of the outer objective take the meaning either of hyperparameters in a supervised learning problem or of parameters of a meta-learner. We show that some recently proposed methods in the latter setting can be instantiated in our framework and tackled with the same gradient-based algorithms. Finally, we discuss possible design patterns for learning-to-learn and present encouraging preliminary experiments for few-shot learning.
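
A minimal sketch of the inner/outer structure described here: unroll a few inner gradient steps on training data and backpropagate the validation (outer) loss to a hyperparameter, in this case an L2 regularization strength. The data, step counts, and learning rates are placeholders.

<code python>
# Gradient-based hyperparameter optimization by differentiating through the inner dynamics.
import torch

torch.manual_seed(0)
X_tr, y_tr = torch.randn(50, 5), torch.randn(50)          # inner (training) data
X_val, y_val = torch.randn(50, 5), torch.randn(50)         # outer (validation) data

log_lam = torch.zeros(1, requires_grad=True)                # outer variable: log of the L2 strength
outer_opt = torch.optim.Adam([log_lam], lr=0.05)

for outer_step in range(100):
    w = torch.zeros(5, requires_grad=True)                  # inner variable, reset every outer step
    for _ in range(20):                                      # inner dynamics: plain gradient descent, unrolled
        inner_loss = ((X_tr @ w - y_tr) ** 2).mean() + log_lam.exp().squeeze() * (w ** 2).sum()
        g, = torch.autograd.grad(inner_loss, w, create_graph=True)
        w = w - 0.05 * g
    outer_loss = ((X_val @ w - y_val) ** 2).mean()          # outer objective evaluated at the inner solution
    outer_opt.zero_grad()
    outer_loss.backward()
    outer_opt.step()
</code>

Replacing the single hyperparameter with a shared initialization of w recovers a MAML-like meta-learner within the same nested framework.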

https://arxiv.org/abs/1806.04640 Unsupervised Meta-Learning for Reinforcement Learning

We describe a general recipe for unsupervised meta-reinforcement learning and an effective instantiation of this approach based on a recently proposed unsupervised exploration technique and model-agnostic meta-learning. We also discuss practical and conceptual considerations for developing unsupervised meta-learning methods.

https://arxiv.org/abs/1806.07917v1 Meta Learning by the Baldwin Effect

https://arxiv.org/abs/1807.05960 Meta-Learning with Latent Embedding Optimization

We introduced Latent Embedding Optimization (LEO), a gradient-based meta-learning technique which uses a parameter generative model in order to capture the diverse range of parameters useful for a distribution over tasks, paving the way for a new state-of-the-art result on the challenging 5-way 1-shot miniImageNet classification problem. LEO is able to achieve this by reducing the effective number of adapted parameters by one order of magnitude, while still making use of large models with millions of parameters for feature extraction. This approach leads to a computationally inexpensive optimization-based meta-learner with best-in-class generalization performance.
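
A minimal sketch of that idea: adapt a small latent code that is decoded into the (much larger) classifier weights instead of adapting those weights directly. The encoder, relation network, and stochasticity of the real method are omitted, and all sizes are assumptions.

<code python>
# Adaptation in a low-dimensional latent space that is decoded into classifier parameters.
import torch
import torch.nn as nn

feat_dim, n_way, latent_dim = 64, 5, 16

decoder = nn.Linear(latent_dim, n_way * feat_dim)       # latent code -> classifier weight matrix
z = torch.zeros(latent_dim, requires_grad=True)         # in LEO an encoder would produce z from the support set

def classify(z, feats):
    W = decoder(z).view(n_way, feat_dim)
    return feats @ W.t()

support_feats = torch.randn(25, feat_dim)                # 5-way 5-shot support features
support_labels = torch.arange(n_way).repeat(5)

for _ in range(5):                                        # inner adaptation happens in latent space only
    loss = nn.functional.cross_entropy(classify(z, support_feats), support_labels)
    g, = torch.autograd.grad(loss, z)
    z = z - 0.1 * g

query_feats = torch.randn(10, feat_dim)
print(classify(z, query_feats).argmax(-1))
</code>

Here 16 adapted latent dimensions stand in for 5 x 64 decoded weights, the kind of order-of-magnitude reduction in adapted parameters the excerpt mentions.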

https://openreview.net/forum?id=HJGven05Y7 How to train your MAML

MAML is great, but it has many problems. We solve many of those problems and, as a result, learn most hyperparameters end to end, speed up training and inference, and set a new SOTA in few-shot learning.

https://arxiv.org/pdf/1810.02334.pdf Unsupervised Learning via Meta-Learning

We construct tasks from unlabeled data in an automatic way and run meta-learning over the constructed tasks. Surprisingly, we find that, when integrated with meta-learning, relatively simple task construction mechanisms, such as clustering unsupervised representations, lead to good performance on a variety of downstream tasks. Our experiments across four image datasets indicate that our unsupervised meta-learning approach acquires a learning algorithm, without any labeled data, that is applicable to a wide range of downstream classification tasks, improving upon the representations learned by four prior unsupervised learning methods.
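
A minimal sketch of one such task-construction mechanism: cluster unsupervised embeddings and treat cluster ids as pseudo-labels when sampling N-way K-shot tasks. Random embeddings and k-means stand in for the real pipeline.

<code python>
# Building few-shot tasks from unlabeled data by clustering unsupervised embeddings.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(1000, 32))                  # stand-in for embeddings from any unsupervised method
pseudo_labels = KMeans(n_clusters=20, n_init=10, random_state=0).fit_predict(embeddings)

def sample_task(n_way=5, k_shot=1, k_query=5):
    counts = np.bincount(pseudo_labels)
    valid = np.where(counts >= k_shot + k_query)[0]       # clusters large enough to supply the task
    classes = rng.choice(valid, size=n_way, replace=False)
    support, query = [], []
    for task_label, c in enumerate(classes):
        idx = rng.choice(np.where(pseudo_labels == c)[0], size=k_shot + k_query, replace=False)
        support += [(int(j), task_label) for j in idx[:k_shot]]
        query += [(int(j), task_label) for j in idx[k_shot:]]
    return support, query                                  # (example index, task-local label) pairs

print(sample_task()[0])
</code>

A standard meta-learner such as MAML can then be trained on these tasks exactly as if the pseudo-labels were real labels.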

https://arxiv.org/abs/1810.03548v1 Meta-Learning: A Survey

In this chapter, we provide an overview of the state of the art in this fascinating and continuously evolving field.

https://arxiv.org/pdf/1810.03642v1.pdf CAML: Fast Context Adaptation via Meta-Learning

We propose CAML, a meta-learning method for fast adaptation that partitions the model parameters into two parts: context parameters that serve as additional input to the model and are adapted on individual tasks, and shared parameters that are meta-trained and shared across tasks.
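
A minimal sketch of that split: only a small context vector, fed as an extra input, is updated in the inner loop, while the network weights stay shared. The regression task and sizes are assumptions.

<code python>
# Context parameters as additional input; per-task adaptation touches only the context vector.
import torch
import torch.nn as nn

ctx_dim = 4
net = nn.Sequential(nn.Linear(1 + ctx_dim, 40), nn.ReLU(), nn.Linear(40, 1))   # shared, meta-trained parameters

def forward(x, ctx):
    return net(torch.cat([x, ctx.expand(x.shape[0], -1)], dim=-1))

x_s = torch.rand(10, 1) * 6 - 3
y_s = torch.sin(x_s)                                      # stand-in support set for one task
ctx = torch.zeros(1, ctx_dim, requires_grad=True)         # context parameters, reset per task
for _ in range(5):                                         # inner loop: gradient steps on ctx only
    loss = ((forward(x_s, ctx) - y_s) ** 2).mean()
    g, = torch.autograd.grad(loss, ctx)
    ctx = ctx - 0.5 * g
# net.parameters() would be updated only in the outer loop, across many tasks.
</code>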

https://arxiv.org/pdf/1810.08178.pdf Gradient Agreement as an Optimization Objective for Meta-Learning

Our approach is based on pushing the parameters of the model in a direction on which the tasks agree. If the gradients of a task agree with the parameter update vector, then their inner product will be a large positive value. As a result, given a batch of tasks to be optimized for, we associate a positive (negative) weight with the loss function of a task if the inner product between its gradients and the average of the gradients of all tasks in the batch is a positive (negative) value.
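
That weighting rule is concrete enough to sketch directly. Below, each task's weight is the inner product of its gradient with the batch-average gradient, normalized by the sum of absolute inner products; the normalization is one plausible reading of the description, and the gradients are placeholders.

<code python>
# Per-task loss weights from the agreement between task gradients and their average.
import torch

def agreement_weights(task_grads):
    G = torch.stack(task_grads)                            # (n_tasks, n_params), flattened per-task gradients
    g_avg = G.mean(0)
    inner = G @ g_avg                                       # positive when a task agrees with the average direction
    return inner / inner.abs().sum().clamp_min(1e-8)        # normalized weights; disagreeing tasks get negative weight

task_grads = [torch.randn(100) for _ in range(8)]           # placeholder gradients
print(agreement_weights(task_grads))
</code>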

https://openreview.net/pdf?id=HkxStoC5F7 Meta-Learning Probabilistic Inference for Prediction

1) We develop ML-PIP, a general framework for Meta-Learning approximate Probabilistic Inference for Prediction. ML-PIP extends existing probabilistic interpretations of meta-learning to cover a broad class of methods. 2) We introduce VERSA, an instance of the framework employing a flexible and versatile amortization network that takes few-shot learning datasets as inputs, with arbitrary numbers of shots, and outputs a distribution over task-specific parameters in a single forward pass. VERSA substitutes optimization at test time with forward passes through inference networks, amortizing the cost of inference and relieving the need for second derivatives during training.
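
A minimal sketch of the amortization idea: a network consumes the support set and emits the mean and variance of the task-specific classifier weights in a single forward pass, so no optimization happens at test time. The sizes and the per-class mean pooling are assumptions, not VERSA's exact architecture.

<code python>
# Amortized inference: support set in, distribution over task-specific classifier weights out.
import torch
import torch.nn as nn

feat_dim, n_way = 64, 5

class AmortizationNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.head = nn.Linear(feat_dim, 2 * feat_dim)      # per-class mean and log-variance

    def forward(self, support_feats, support_labels):
        stats = []
        for c in range(n_way):
            # Pool the support features of each class, then map to a weight distribution.
            stats.append(self.head(support_feats[support_labels == c].mean(0)))
        stats = torch.stack(stats)                           # (n_way, 2 * feat_dim)
        return stats[:, :feat_dim], stats[:, feat_dim:]      # mean and log-variance of the class weights

net = AmortizationNet()
support_feats = torch.randn(25, feat_dim)                    # pooling handles arbitrary numbers of shots
support_labels = torch.arange(n_way).repeat(5)
mean, logvar = net(support_feats, support_labels)
W = mean + (0.5 * logvar).exp() * torch.randn_like(mean)     # one sampled classifier; no test-time optimization
query_logits = torch.randn(10, feat_dim) @ W.t()
</code>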

https://arxiv.org/pdf/1810.06784.pdf ProMP: Proximal Meta-Policy Search

This paper provides a theoretical analysis of credit assignment in gradient-based Meta-RL. Building on the gained insights, we develop a novel meta-learning algorithm that overcomes both the issue of poor credit assignment and previous difficulties in estimating meta-policy gradients. By controlling the statistical distance of both pre-adaptation and adapted policies during meta-policy search, the proposed algorithm achieves efficient and stable meta-learning. Our approach leads to superior pre-adaptation policy behavior and consistently outperforms previous Meta-RL algorithms in sample efficiency, wall-clock time, and asymptotic performance. Our code is available at github.com/jonasrothfuss/promp
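
The statistical-distance control is proximal in the PPO sense; below is a generic sketch of a clipped surrogate of that kind, with placeholder tensors rather than ProMP's full meta-gradient estimator.

<code python>
# PPO-style clipped surrogate that keeps an updated policy close to the behavior policy.
import torch

def clipped_surrogate(log_prob_new, log_prob_old, advantages, eps=0.2):
    ratio = (log_prob_new - log_prob_old).exp()              # likelihood ratio between the two policies
    unclipped = ratio * advantages
    clipped = ratio.clamp(1 - eps, 1 + eps) * advantages     # bounds how far the ratio can move the objective
    return torch.min(unclipped, clipped).mean()

log_prob_new = torch.randn(64, requires_grad=True)           # placeholders for per-action log-probabilities
log_prob_old = torch.randn(64)
advantages = torch.randn(64)
(-clipped_surrogate(log_prob_new, log_prob_old, advantages)).backward()
</code>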

https://pdfs.semanticscholar.org/0b00/3bb28f25627f715b0fd53b443fabfcf5a817.pdf Meta-Learning with Latent Embedding Optimization

The resulting approach, latent embedding optimization (LEO), decouples the gradient-based adaptation procedure from the underlying high-dimensional space of model parameters. Our evaluation shows that LEO can achieve state-of-the-art performance on the competitive miniImageNet and tieredImageNet few-shot classification tasks.

https://arxiv.org/pdf/1611.03537.pdf Linear predictors for nonlinear dynamical systems: Koopman operator meets model predictive control
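
A minimal sketch of the core construction: lift the state through a dictionary of nonlinear observables and fit a linear predictor z' ≈ A z + B u by least squares (EDMD-style). The toy dynamics and the dictionary here are illustrative choices, not the paper's benchmarks.

<code python>
# Linear predictor for a nonlinear system via a lifted (Koopman-style) state, fit by least squares.
import numpy as np

rng = np.random.default_rng(0)

def f(x, u):
    return np.array([x[1], -np.sin(x[0]) + u])               # toy pendulum-like dynamics

def lift(x):
    return np.array([x[0], x[1], np.sin(x[0]), np.cos(x[0])])  # dictionary of observables

X, U, Y = [], [], []
x = np.array([0.1, 0.0])
for _ in range(2000):
    u = rng.uniform(-1, 1)
    x_next = x + 0.05 * f(x, u)                                # Euler-discretized transition
    X.append(lift(x)); U.append([u]); Y.append(lift(x_next))
    x = x_next if np.all(np.abs(x_next) < 10) else rng.uniform(-1, 1, size=2)

Z, Zp, Uc = np.array(X).T, np.array(Y).T, np.array(U).T        # columns are samples
AB = Zp @ np.linalg.pinv(np.vstack([Z, Uc]))                    # least-squares fit of [A B]
A, B = AB[:, :Z.shape[0]], AB[:, Z.shape[0]:]
print(A.shape, B.shape)                                          # (4, 4) and (4, 1)
</code>

The lifted linear model can then be dropped into a standard linear MPC formulation, which is the point of the paper.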