Learning Stochastic Feedforward Neural Networks
Yichuan Tang, Ruslan Salakhutdinov

Auditing: Active Learning with Outcome-Dependent Query Costs
Sivan Sabato, Anand D. Sarwate, Nati Srebro

A message-passing algorithm for multi-agent trajectory planning
Jose Bento, Nate Derbinsky, Javier Alonso-Mora, Jonathan S. Yedidia

Inferring neural population dynamics from multiple partial recordings of the same neural circuit
Srini Turaga, Lars Buesing, Adam M. Packer, Henry Dalgleish, Noah Pettit, Michael Hausser, Jakob Macke

Multi-Prediction Deep Boltzmann Machines
Ian Goodfellow, Mehdi Mirza, Aaron Courville, Yoshua Bengio

Adaptive Submodular Maximization in Bandit Setting
Victor Gabillon, Branislav Kveton, Zheng Wen, Brian Eriksson, S. Muthukrishnan

Approximate Inference in Continuous Determinantal Processes
Raja Hafiz Affandi, Emily Fox, Ben Taskar

Multiclass Total Variation Clustering
Xavier Bresson, Thomas Laurent, David Uminsky, James von Brecht

Reservoir Boosting: Between Online and Offline Ensemble Learning
Leonidas Lefakis, François Fleuret

Matrix factorization with binary components
Martin Slawski, Matthias Hein, Pavlo Lutsik

Learning to Pass Expectation Propagation Messages
Nicolas Heess, Daniel Tarlow, John Winn

Latent Structured Active Learning
Wenjie Luo, Alex Schwing, Raquel Urtasun

Two-Target Algorithms for Infinite-Armed Bandits with Bernoulli Rewards
Thomas Bonald, Alexandre Proutiere

RNADE: The real-valued neural autoregressive density-estimator
Benigno Uria, Iain Murray, Hugo Larochelle

Multiscale Dictionary Learning for Estimating Conditional Distributions
Francesca Petralia, Joshua T. Vogelstein, David Dunson

Regularized M-estimators with nonconvexity: Statistical and algorithmic theory for local optima
Po-Ling Loh, Martin J. Wainwright

Rapid Distance-Based Outlier Detection via Sampling
Mahito Sugiyama, Karsten Borgwardt

Better Approximation and Faster Algorithm Using the Proximal Average
Yao-Liang Yu

Dynamic Clustering via Asymptotics of the Dependent Dirichlet Process Mixture
Trevor Campbell, Miao Liu, Brian Kulis, Jonathan P. How, Lawrence Carin

Stochastic Convex Optimization with Multiple Objectives
Mehrdad Mahdavi, Tianbao Yang, Rong Jin

Error-Minimizing Estimates and Universal Entry-Wise Error Bounds for Low-Rank Matrix Completion
Franz Kiraly, Louis Theran

Message Passing Inference with Chemical Reaction Networks
Nils E. Napp, Ryan P. Adams

A Kernel Test for Three-Variable Interactions
Dino Sejdinovic, Arthur Gretton, Wicher Bergsma

Contrastive Learning Using Spectral Methods
James Y. Zou, Daniel Hsu, David C. Parkes, Ryan P. Adams

Lexical and Hierarchical Topic Regression
Viet-An Nguyen, Jordan Boyd-Graber, Philip Resnik

How to Hedge an Option Against an Adversary: Black-Scholes Pricing is Minimax Optimal
Jacob Abernethy, Peter Bartlett, Rafael Frongillo, Andre Wibisono

Discovering Hidden Variables in Noisy-Or Networks using Quartet Tests
Yacine Jernite, Yonatan Halpern, David Sontag

Least Informative Dimensions
Fabian Sinz, Anna Stockl, Jan Grewe, Jan Benda

A Scalable Approach to Probabilistic Latent Space Inference of Large-Scale Networks
Junming Yin, Qirong Ho, Eric Xing

Adaptive dropout for training deep neural networks
Jimmy Ba, Brendan Frey

Efficient Online Inference for Bayesian Nonparametric Relational Models
Dae Il Kim, Prem Gopalan, David Blei, Erik Sudderth

Adaptivity to Local Smoothness and Dimension in Kernel Regression
Samory Kpotufe, Vikas Garg

Optimization, Learning, and Games with Predictable Sequences
Sasha Rakhlin, Karthik Sridharan

Approximate inference in latent Gaussian-Markov models from continuous time observations
Botond Cseke, Manfred Opper, Guido Sanguinetti

Linear Convergence with Condition Number Independent Access of Full Gradients
Lijun Zhang, Mehrdad Mahdavi, Rong Jin

Multi-Task Bayesian Optimization
Kevin Swersky, Jasper Snoek, Ryan P. Adams

Multilinear Dynamical Systems for Tensor Time Series
Mark Rogers, Lei Li, Stuart Russell

Restricting exchangeable nonparametric distributions
Sinead A. Williamson, Steve N. MacEachern, Eric Xing

Forgetful Bayes and myopic planning: Human learning and decision-making in a bandit setting
Shunan Zhang, Angela J. Yu

Probabilistic Movement Primitives
Alexandros Paraschos, Christian Daniel, Jan Peters, Gerhard Neumann

Policy Shaping: Integrating Human Feedback with Reinforcement Learning
Shane Griffith, Kaushik Subramanian, Jonathan Scholz, Charles Isbell, Andrea L. Thomaz

Linear decision rule as aspiration for simple decision heuristics
Ozgur Simsek

Online PCA for Contaminated Data
Jiashi Feng, Huan Xu, Shie Mannor, Shuicheng Yan

Deep content-based music recommendation
Aaron van den Oord, Sander Dieleman, Benjamin Schrauwen

PAC-Bayes-Empirical-Bernstein Inequality
Ilya O. Tolstikhin, Yevgeny Seldin

Point Based Value Iteration with Optimal Belief Compression for Dec-POMDPs
Liam C. MacDermed, Charles Isbell

Matrix Completion From any Given Set of Observations
Troy Lee, Adi Shraibman

Online Learning with Costly Features and Labels
Navid Zolghadr, Gabor Bartok, Russell Greiner, András György, Csaba Szepesvari

Sparse nonnegative deconvolution for compressive calcium imaging: algorithms and phase transitions
Eftychios A. Pnevmatikakis, Liam Paninski

A Stability-based Validation Procedure for Differentially Private Machine Learning
Kamalika Chaudhuri, Staal A. Vinterbo

A Novel Two-Step Method for Cross Language Representation Learning
Min Xiao, Yuhong Guo

Capacity of strong attractor patterns to model behavioural and cognitive prototypes
Abbas Edalat

B-test: A Non-parametric, Low Variance Kernel Two-sample Test
Wojciech Zaremba, Arthur Gretton, Matthew Blaschko

q-OCSVM: A q-Quantile Estimator for High-Dimensional Distributions
Assaf Glazer, Michael Lindenbaum, Shaul Markovitch

Mid-level Visual Element Discovery as Discriminative Mode Seeking
Carl Doersch, Abhinav Gupta, Alexei A. Efros

Non-Linear Domain Adaptation with Boosting
Carlos J. Becker, Christos M. Christoudias, Pascal Fua

Exact and Stable Recovery of Pairwise Interaction Tensors
Shouyuan Chen, Michael R. Lyu, Irwin King, Zenglin Xu

On Decomposing the Proximal Map
Yao-Liang Yu

Polar Operators for Structured Sparse Estimation
Xinhua Zhang, Yao-Liang Yu, Dale Schuurmans

A Gang of Bandits
Nicolò Cesa-Bianchi, Claudio Gentile, Giovanni Zappella

Inverse Density as an Inverse Problem: the Fredholm Equation Approach
Qichao Que, Mikhail Belkin

Robust Image Denoising with Multi-Column Deep Neural Networks
Forest Agostinelli, Michael R. Anderson, Honglak Lee

EDML for Learning Parameters in Directed and Undirected Graphical Models
Khaled Refaat, Arthur Choi, Adnan Darwiche

A Comparative Framework for Preconditioned Lasso Algorithms
Fabian L. Wauthier, Nebojsa Jojic, Michael Jordan

A memory frontier for complex synapses
Subhaneil Lahiri, Surya Ganguli

First-order Decomposition Trees
Nima Taghipour, Jesse Davis, Hendrik Blockeel

Marginals-to-Models Reducibility
Tim Roughgarden, Michael Kearns

Accelerating Stochastic Gradient Descent using Predictive Variance Reduction
Rie Johnson, Tong Zhang

Online Variational Approximations to non-Exponential Family Change Point Models: With Application to Radar Tracking
Ryan D. Turner, Steven Bottone, Clay J. Stanek

Optimal Neural Population Codes for High-dimensional Stimulus Variables
Zhuo Wang, Alan Stocker, Daniel Lee

Fast Algorithms for Gaussian Noise Invariant Independent Component Analysis
James R. Voss, Luis Rademacher, Mikhail Belkin

Minimax Theory for High-dimensional Gaussian Mixtures with Sparse Mean Separation
Martin Azizyan, Aarti Singh, Larry Wasserman

Predicting Parameters in Deep Learning
Misha Denil, Babak Shakibi, Laurent Dinh, Marc'Aurelio Ranzato, Nando de Freitas

Estimating the Unseen: Improved Estimators for Entropy and other Properties
Paul Valiant, Gregory Valiant

What do row and column marginals reveal about your dataset?
Behzad Golshan, John Byers, Evimaria Terzi

One-shot learning by inverting a compositional causal process
Brenden M. Lake, Ruslan Salakhutdinov, Josh Tenenbaum

Statistical analysis of coupled time series with Kernel Cross-Spectral Density operators
Michel Besserve, Nikos K. Logothetis, Bernhard Schölkopf

Optimizing Instructional Policies
Robert Lindsey, Michael Mozer, William J. Huggins, Harold Pashler

Integrated Non-Factorized Variational Inference
Shaobo Han, Xuejun Liao, Lawrence Carin

More data speeds up training time in learning halfspaces over sparse vectors
Amit Daniely, Nati Linial, Shai Shalev-Shwartz

(Nearly) Optimal Algorithms for Private Online Learning in Full-information and Bandit Settings
Abhradeep Guha Thakurta, Adam Smith

Curvature and Optimal Algorithms for Learning and Minimizing Submodular Functions
Rishabh K. Iyer, Stefanie Jegelka, Jeff A. Bilmes

Σ-Optimality for Active Learning on Gaussian Random Fields
Yifei Ma, Roman Garnett, Jeff Schneider

Learning Kernels Using Local Rademacher Complexity
Corinna Cortes, Marius Kloft, Mehryar Mohri

Causal Inference on Time Series using Restricted Structural Equation Models
Jonas Peters, Dominik Janzing, Bernhard Schölkopf

Symbolic Opportunistic Policy Iteration for Factored-Action MDPs
Aswin Raghavan, Roni Khardon, Alan Fern, Prasad Tadepalli

Real-Time Inference for a Gamma Process Model of Neural Spiking
David Carlson, Vinayak Rao, Joshua T. Vogelstein, Lawrence Carin

Phase Retrieval using Alternating Minimization
Praneeth Netrapalli, Prateek Jain, Sujay Sanghavi

Translating Embeddings for Modeling Multi-relational Data
Antoine Bordes, Nicolas Usunier, Alberto Garcia-Duran, Jason Weston, Oksana Yakhnenko

Generalizing Analytic Shrinkage for Arbitrary Covariance Structures
Daniel Bartz, Klaus-Robert Müller

Flexible sampling of discrete data correlations without the marginal distributions
Alfredo Kalaitzis, Ricardo Silva

Adaptive Anonymity via $b$-Matching
Krzysztof M. Choromanski, Tony Jebara, Kui Tang

Convex Tensor Decomposition via Structured Schatten Norm Regularization
Ryota Tomioka, Taiji Suzuki

Unsupervised Structure Learning of Stochastic And-Or Grammars
Kewei Tu, Maria Pavlovskaia, Song-Chun Zhu

Modeling Overlapping Communities with Node Popularities
Prem Gopalan, Chong Wang, David Blei

Learning from Limited Demonstrations
Beomjoon Kim, Amir-massoud Farahmand, Joelle Pineau, Doina Precup

On the Complexity and Approximation of Binary Evidence in Lifted Inference
Guy van den Broeck, Adnan Darwiche

Non-strongly-convex smooth stochastic approximation with convergence rate O(1/n)
Francis Bach, Eric Moulines

Robust Multimodal Graph Matching: Sparse Coding Meets Graph Matching
Marcelo Fiori, Pablo Sprechmann, Joshua T. Vogelstein, Pablo Muse, Guillermo Sapiro

Transportability from Multiple Environments with Limited Experiments
Elias Bareinboim, Sanghack Lee, Vasant Honavar, Judea Pearl

Supervised Sparse Analysis and Synthesis Operators
Pablo Sprechmann, Roee Litman, Tal Ben Yakar, Alexander M. Bronstein, Guillermo Sapiro

Generalized Denoising Auto-Encoders as Generative Models
Yoshua Bengio, Li Yao, Guillaume Alain, Pascal Vincent

Documents as multiple overlapping windows into grids of counts
Alessandro Perina, Nebojsa Jojic, Manuele Bicego, Andrzej Turski

Solving the multi-way matching problem by permutation synchronization
Deepti Pachauri, Risi Kondor, Vikas Singh

Robust Bloom Filters for Large MultiLabel Classification Tasks
Moustapha M. Cisse, Nicolas Usunier, Thierry Artières, Patrick Gallinari

Visual Concept Learning: Combining Machine Vision and Bayesian Generalization on Concept Hierarchies
Yangqing Jia, Joshua T. Abbott, Joseph Austerweil, Thomas Griffiths, Trevor Darrell

Dirty Statistical Models
Eunho Yang, Pradeep Ravikumar

Locally Adaptive Bayesian Multivariate Time Series
Daniele Durante, Bruno Scarpa, David Dunson

Optimistic Concurrency Control for Distributed Unsupervised Learning
Xinghao Pan, Joseph E. Gonzalez, Stefanie Jegelka, Tamara Broderick, Michael Jordan

Auxiliary-variable Exact Hamiltonian Monte Carlo Samplers for Binary Distributions
Ari Pakman, Liam Paninski

Adaptive Step-Size for Policy Gradient Methods
Matteo Pirotta, Marcello Restelli, Luca Bascetta

Probabilistic Principal Geodesic Analysis
Miaomiao Zhang, P.T. Fletcher

Mixed Optimization for Smooth Functions
Mehrdad Mahdavi, Lijun Zhang, Rong Jin

Projecting Ising Model Parameters for Fast Mixing
Justin Domke, Xianghang Liu

Which Space Partitioning Tree to Use for Search?
Parikshit Ram, Alexander Gray

Conditional Random Fields via Univariate Exponential Families
Eunho Yang, Pradeep Ravikumar, Genevera I. Allen, Zhandong Liu

Provable Subspace Clustering: When LRR meets SSC
Yu-Xiang Wang, Huan Xu, Chenlei Leng

Stochastic blockmodel approximation of a graphon: Theory and consistent estimation
Edoardo M. Airoldi, Thiago B. Costa, Stanley H. Chan

Generalized Random Utility Models with Multiple Types
Hossein Azari Soufiani, Hansheng Diao, Zhenyu Lai, David C. Parkes

Bayesian Hierarchical Community Discovery
Charles Blundell, Yee Whye Teh

Optimistic policy iteration and natural actor-critic: A unifying view and a non-optimality result
Paul Wagner

Predictive PAC Learning and Process Decompositions
Cosma Shalizi, Aryeh Kontorovitch

From Bandits to Experts: A Tale of Domination and Independence
Noga Alon, Nicolò Cesa-Bianchi, Claudio Gentile, Yishay Mansour

Action is in the Eye of the Beholder: Eye-gaze Driven Model for Spatio-Temporal Action Localization
Nataliya Shapovalova, Michalis Raptis, Leonid Sigal, Greg Mori

Efficient Optimization for Sparse Gaussian Process Regression
Yanshuai Cao, Marcus A. Brubaker, David Fleet, Aaron Hertzmann

Aggregating Optimistic Planning Trees for Solving Markov Decision Processes
Gunnar Kedenburg, Raphael Fonteneau, Remi Munos

Learning the Local Statistics of Optical Flow
Dan Rosenbaum, Daniel Zoran, Yair Weiss

Estimation Bias in Multi-Armed Bandit Algorithms for Search Advertising
Min Xu, Tao Qin, Tie-Yan Liu

Robust learning of low-dimensional dynamics from large neural ensembles
David Pfau, Eftychios A. Pnevmatikakis, Liam Paninski

On model selection consistency of penalized M-estimators: a geometric theory
Jason Lee, Yuekai Sun, Jonathan E. Taylor

New Subsampling Algorithms for Fast Least Squares Regression
Paramveer Dhillon, Yichao Lu, Dean P. Foster, Lyle Ungar

Dropout Training as Adaptive Regularization
Stefan Wager, Sida Wang, Percy Liang

Using multiple samples to learn mixture models
Jason Lee, Ran Gilad-Bachrach, Rich Caruana

Annealing between distributions by averaging moments
Roger B. Grosse, Chris J. Maddison, Ruslan Salakhutdinov

Learning Hidden Markov Models from Non-sequence Data via Tensor Decomposition
Tzu-Kuo Huang, Jeff Schneider

Accelerated Mini-Batch Stochastic Dual Coordinate Ascent
Shai Shalev-Shwartz, Tong Zhang

Faster Ridge Regression via the Subsampled Randomized Hadamard Transform
Yichao Lu, Paramveer Dhillon, Dean P. Foster, Lyle Ungar

Recurrent linear models of simultaneously-recorded neural populations
Marius Pachitariu, Biljana Petreska, Maneesh Sahani

Learning Adaptive Value of Information for Structured Prediction
David J. Weiss, Ben Taskar

When are Overcomplete Topic Models Identifiable? Uniqueness of Tensor Tucker Decompositions with Structured Sparsity
Anima Anandkumar, Daniel Hsu, Majid Janzamin, Sham M. Kakade

Estimating LASSO Risk and Noise Level
Mohsen Bayati, Murat A. Erdogdu, Andrea Montanari

Relevance Topic Model for Unstructured Social Group Activity Recognition
Fang Zhao, Yongzhen Huang, Liang Wang, Tieniu Tan

Context-sensitive active sensing in humans
Sheeraz Ahmad, He Huang, Angela J. Yu

Confidence Intervals and Hypothesis Testing for High-Dimensional Statistical Models
Adel Javanmard, Andrea Montanari

Factorized Asymptotic Bayesian Inference for Latent Feature Models
Kohei Hayashi, Ryohei Fujimaki

Tracking Time-varying Graphical Structure
Erich Kummerfeld, David Danks

Compressive Feature Learning
Hristo S. Paskov, Robert West, John C. Mitchell, Trevor Hastie

Moment-based Uniform Deviation Bounds for $k$-means and Friends
Matus Telgarsky, Sanjoy Dasgupta

