original sentence after the primal and dual translations), we can iteratively update the two models until convergence (e.g., using policy gradient methods). We call the corresponding approach to neural machine translation dual-NMT.
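
A minimal sketch of this dual learning loop, with stub objects standing in for the real seq2seq models and target-side language model (StubTranslator and StubLM are hypothetical names, not the paper's code):

<code python>
# Toy sketch of one dual-NMT update on a monolingual source sentence.
# Real systems use seq2seq networks and REINFORCE-style gradients.

LAMBDA = 0.5  # mixes language-model reward with reconstruction reward

class StubTranslator:
    """Placeholder for a neural translation model P(y | x)."""
    def sample(self, sentence):
        return sentence[::-1]                    # fake decoding
    def log_prob(self, sentence, given):
        return -abs(len(sentence) - len(given))  # fake log-likelihood
    def policy_gradient_step(self, reward, pair, lr):
        pass  # real systems scale the gradient by the reward here

class StubLM:
    """Placeholder for a target-side language model."""
    def log_prob(self, sentence):
        return -float(len(sentence))

def dual_update(model_st, model_ts, lm_tgt, x, lr=0.01):
    y = model_st.sample(x)                 # primal: translate source -> target
    r_lm = lm_tgt.log_prob(y)              # fluency of the translation
    r_rec = model_ts.log_prob(x, given=y)  # dual: reconstruct the source
    reward = LAMBDA * r_lm + (1 - LAMBDA) * r_rec
    model_st.policy_gradient_step(reward, (x, y), lr)  # REINFORCE-style
    model_ts.policy_gradient_step(1.0, (y, x), lr)     # likelihood-style
    return reward

print(dual_update(StubTranslator(), StubTranslator(), StubLM(), "hello world"))
</code>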

https://arxiv.org/abs/1606.04596 Semi-Supervised Learning for Neural Machine Translation

While end-to-end neural machine translation (NMT) has made remarkable progress recently, NMT systems rely only on parallel corpora for parameter estimation. Since parallel corpora are usually limited in quantity, quality, and coverage, especially for low-resource languages, it is appealing to exploit monolingual corpora to improve NMT. We propose a semi-supervised approach for training NMT models on the concatenation of labeled (parallel corpora) and unlabeled (monolingual corpora) data. The central idea is to reconstruct the monolingual corpora using an autoencoder, in which the source-to-target and target-to-source translation models serve as the encoder and decoder, respectively. Our approach can exploit the monolingual corpora not only of the target language but also of the source language. Experiments on the Chinese-English dataset show that our approach achieves significant improvements over state-of-the-art SMT and NMT systems.
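
Roughly, the training objective combines the usual supervised likelihood on parallel pairs with an autoencoder reconstruction term on monolingual data. A self-contained toy sketch (the Stub class is a hypothetical stand-in for a trained seq2seq model; the paper approximates the sum over latent translations by sampling):

<code python>
class Stub:
    """Hypothetical stand-in for a trained seq2seq translation model."""
    def sample(self, s):
        return s[::-1]                      # fake decoding
    def log_prob(self, y, given):
        return -abs(len(y) - len(given))    # fake conditional log-likelihood

def semi_supervised_loss(model_st, model_ts, parallel, mono_target):
    loss = 0.0
    for x, y in parallel:                   # supervised term: -log P(y | x)
        loss -= model_st.log_prob(y, given=x)
    for y in mono_target:                   # autoencoder term on monolingual data
        x_latent = model_ts.sample(y)       # "encoder": target -> source
        loss -= model_st.log_prob(y, given=x_latent)  # "decoder": source -> target
    return loss

print(semi_supervised_loss(Stub(), Stub(),
                           parallel=[("ni hao", "hello")],
                           mono_target=["good morning"]))
</code>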

https://arxiv.org/abs/1804.09170v1 Realistic Evaluation of Deep Semi-Supervised Learning Algorithms

We argue that these benchmarks fail to address many issues that these algorithms would face in real-world applications. After creating a unified reimplementation of various widely-used SSL techniques, we test them in a suite of experiments designed to address these issues. We find that the performance of simple baselines which do not use unlabeled data is often underreported, that SSL methods differ in sensitivity to the amount of labeled and unlabeled data, and that performance can degrade substantially when the unlabeled dataset contains out-of-class examples. To help guide SSL research towards real-world applicability, we make our unified reimplementation and evaluation platform publicly available.

https://arxiv.org/abs/1808.08485v1 Deep Probabilistic Logic: A Unifying Framework for Indirect Supervision

In this paper, we propose deep probabilistic logic (DPL) as a general framework for indirect supervision, by composing probabilistic logic with deep learning. DPL models label decisions as latent variables, represents prior knowledge on their relations using weighted first-order logical formulas, and alternates between learning a deep neural network for the end task and refining uncertain formula weights for indirect supervision, using variational EM. This framework subsumes prior indirect supervision methods as special cases and enables novel combinations via infusion of rich domain and linguistic knowledge. http://hanover.azurewebsites.net/
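
A heavily simplified sketch of that variational-EM alternation, with binary labels, a linear "network", and a single per-example vote standing in for a weighted logic formula (all toy assumptions; nothing here is the actual DPL implementation):

<code python>
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                  # unlabeled examples
true_w = rng.normal(size=5)
votes = np.sign(X @ true_w + rng.normal(scale=2.0, size=200))  # noisy "formula"

nn_w = np.zeros(5)        # toy linear "network"
formula_weight = 1.0      # uncertain formula weight, to be refined

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for step in range(50):
    # E-step: posterior over latent labels from network + weighted formula.
    q = sigmoid(X @ nn_w + formula_weight * votes)

    # M-step, network: one gradient step toward the soft labels q.
    pred = sigmoid(X @ nn_w)
    nn_w += 0.1 * X.T @ (q - pred) / len(X)

    # M-step, formula weight: log-odds of agreement with the posterior.
    agree = np.mean(np.where(votes > 0, q, 1 - q))
    formula_weight = np.log(agree / (1 - agree + 1e-9))

print("refined formula weight:", round(float(formula_weight), 3))
</code>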

https://openreview.net/forum?id=r1g7y2RqYX Label Propagation Networks
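
The entry above is link-only; as background, classical graph-based label propagation (the idea such networks build on) can be run off the shelf with scikit-learn. This is the classic algorithm, not the paper's learned method:

<code python>
import numpy as np
from sklearn.datasets import make_moons
from sklearn.semi_supervised import LabelSpreading

X, y = make_moons(n_samples=200, noise=0.1, random_state=0)
labels = np.full(200, -1)                  # -1 marks unlabeled points
labels[:5] = y[:5]                         # keep only five labels per class
labels[-5:] = y[-5:]

model = LabelSpreading(kernel="knn", n_neighbors=7)
model.fit(X, labels)                       # propagate labels over the graph
print("transductive accuracy:", (model.transduction_ == y).mean())
</code>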

https://arxiv.org/abs/1810.02840 Training Complex Models with Multi-Task Weak Supervision

We show that by solving a matrix completion-style problem, we can recover the accuracies of these multi-task sources given their dependency structure, but without any labeled data, leading to higher-quality supervision for training an end model. Theoretically, we show that the generalization error of models trained with this approach improves with the number of unlabeled data points, and characterize the scaling with respect to the task and dependency structures. On three fine-grained classification problems, we show that our approach leads to average gains of 20.2 points in accuracy over a traditional supervised approach, 6.8 points over a majority vote baseline, and 4.1 points over a previously proposed weak supervision method that models tasks separately.
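
The matrix completion formulation itself is involved; a toy way to see the core idea (recovering source accuracies from agreement statistics alone, with no ground-truth labels) is the classical moment identity for conditionally independent binary sources, E[l_i l_j] = (2 acc_i - 1)(2 acc_j - 1). This is a related moment-based estimator, not the paper's algorithm:

<code python>
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
y = rng.choice([-1, 1], size=n)                # hidden ground truth
accs = [0.85, 0.7, 0.6]                        # unknown source accuracies
L = np.stack([np.where(rng.random(n) < a, y, -y) for a in accs])

# Pairwise agreement moments E[l_i * l_j] use no ground-truth labels.
m01 = (L[0] * L[1]).mean()
m02 = (L[0] * L[2]).mean()
m12 = (L[1] * L[2]).mean()

# Triplet identity: (2*acc_i - 1)^2 = m_ij * m_ik / m_jk.
est = [np.sqrt(m01 * m02 / m12),
       np.sqrt(m01 * m12 / m02),
       np.sqrt(m02 * m12 / m01)]
print([round((1 + e) / 2, 3) for e in est])    # close to [0.85, 0.7, 0.6]
</code>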

https://colinraffel.com/publications/nips2018realistic.pdf Realistic Evaluation of Deep Semi-Supervised Learning Algorithms. Code: https://github.com/brain-research/realistic-ssl-evaluation