I am a Principal Researcher at Microsoft Research Amsterdam, where I work on the intersection of deep learning and computational chemistry and physics for molecular simulation. I will discuss how spectral graph theory yields vertex representations and a generalized convolution that shares weights beyond symmetries. Finally, we show preliminary results suggesting that our model yields a nested spatial hierarchy of increasingly abstract categories, analogous to observations from the human ventral temporal cortex. We study the calibration of L2D systems, investigating if the probabilities they output are sound. Variational autoencoders (VAEs) optimize an objective that comprises a reconstruction loss (the distortion) and a KL term (the rate). My research centers around causal inference and graphical modelling. Experimental results demonstrate that FANS-RL outperforms existing approaches in terms of return, compactness of the latent state representation, and robustness to varying degrees of non-stationarity. The learning to defer (L2D) framework has the potential to make AI systems safer. We compare our model with related supervised approaches, namely the TDANN, and discuss both theoretical and empirical similarities. Both institutes join forces in the development of AI algorithms to improve cancer treatment. In this talk, we show a third way to compute off-policy gradients that exhibit a fair bias/variance tradeoff using a closed-form solution of a proposed non-parametric Bellman equation. In these works, we mainly focus on multi-modal scenarios that naturally occur in the real world that depict common concepts, such as image-caption, photo-sketch, video-audio etc. Deep learning is a form of machine learning with neural networks, loosely inspired by how neurons process information in the brain. He was also program chair of AISTATS in 2009 and ECCV in 2016 and general chair of MIDL 2018. 
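The rate and distortion terms of the VAE objective mentioned above can be written out explicitly. This is the standard negative ELBO; the symbols $q_\phi$ (encoder), $p_\theta$ (decoder), and $p(z)$ (prior) are notation assumed here for illustration and do not appear in the original text.

```latex
\mathcal{L}(x)
  = \underbrace{\mathbb{E}_{q_\phi(z \mid x)}\bigl[-\log p_\theta(x \mid z)\bigr]}_{\text{distortion } D}
  + \underbrace{\mathrm{KL}\bigl(q_\phi(z \mid x) \,\|\, p(z)\bigr)}_{\text{rate } R}
```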
A collaboration between CWI, KNAW HuC, KB, Rijksmuseum, Netherlands Institute for Sound and Vision, TNO, the University of Amsterdam, and the VU University Amsterdam. Through our general framework, we can consider general non-stationary scenarios with different function types and changing frequency, including changes across episodes and within episodes. Our experiments verify that not only is our system calibrated, but this benefit comes at no cost to accuracy. We show experimentally that such models are remarkably stable and optimize to similar data likelihood values as their exact gradient counterparts, while training more quickly and surpassing the performance of functionally constrained counterparts. Moreover, it is not even guaranteed to produce valid probabilities due to its parameterization being degenerate for this purpose. Microsoft to open research lab in Amsterdam. A collaboration between Ahold Delhaize and the University of Amsterdam. We support the theoretical analysis with experiments on image classification tasks performed with multi-layer, fully-connected neural networks. Our model's accuracy is always comparable (and often superior) to that of Mozannar & Sontag's (2020) models in tasks ranging from hate speech detection to galaxy classification to diagnosis of skin lesions. To accomplish this, we introduce the Topographic VAE: a novel method for efficiently training deep generative models with topographically organized latent variables.
Geometric and Physical Quantities improve E(3) Equivariant Message Passing. Brandstetter, Johannes; Hesselink, Rob; Pol, Elise; Bekkers, Erik; and Welling, Max.
Self-Supervised Inference in State-Space Models.
Detecting dispersed radio transients in real time using convolutional neural networks. Ruhe, David; Kuiack, Mark; Rowlinson, Antonia; Wijers, Ralph; and Forré, Patrick.
Pol, Elise; Hoof, Herke; Oliehoek, Frans; and Welling, Max.
Deep Policy Dynamic Programming for Vehicle Routing Problems. Kool, Wouter; Hoof, Herke; Gromicho, Joaquim; and Welling, Max.
Fast and Data Efficient Reinforcement Learning from Pixels via Non-Parametric Value Approximation. Wöhlke, Jan; Schmitt, Felix; and Hoof, Herke.
Leveraging class abstraction for commonsense reinforcement learning via residual policy gradient methods. Höpner, Niklas; Tiddi, Ilaria; and Hoof, Herke.
Modeling Category-Selective Cortical Regions with Topographic Variational Autoencoders. Keller, T.
Category-selectivity in the brain describes the observation that certain spatially localized areas of the cerebral cortex tend to respond robustly and selectively to stimuli from specific limited categories. I'm a PhD student at the University of Amsterdam, under the supervision of Joris Mooij. The Amsterdam Machine Learning Lab (AMLab) conducts research in machine learning, artificial intelligence, and its applications to large scale data domains in science and industry. Equivariance is verified quantitatively by measuring the approximate commutativity of the inference network and the sequence transformations. We demonstrate the flexibility of this framework by implementing advanced variational methods based on amortized Gibbs sampling and annealing. Dr. Max Welling is a research chair in Machine Learning at the University of Amsterdam and a Distinguished Scientist at Microsoft Research (MSR).
He is a fellow at the Canadian Institute for Advanced Research (CIFAR) and the European Lab for Learning and Intelligent Systems (ELLIS) where he also serves on the founding board. In this work, we leverage the newly introduced Topographic Variational Autoencoder to model the emergence of such localized category-selectivity in an unsupervised manner. Sim(2)-equivariance further improves performance on all tasks considered. Over the next five years, seven PhD researchers will work in the lab on projects that will focus, among other things, on achieving a quicker diagnosis of Alzheimer's disease, on modelling cardiac rhythms, and on generating automatic reports based on X-ray images. However, most NP variants place a strong emphasis on a global latent variable. Furthermore, through topographic organization over time (i.e. This includes the development of deep generative models, methods for approximate inference, probabilistic programming, Bayesian deep learning, causal inference, reinforcement learning, graph neural networks, and geometric deep learning. My research has spanned a range of topics, from generative modeling, variational inference, source compression, and graph-structured learning to condensed matter physics. We argue that causal concepts can be used to explain the success of data augmentation by describing how they can weaken the spurious correlation between the observed domains and the task labels. Combining our estimator with REINFORCE, we obtain a policy gradient estimator and we reduce its variance using a built-in control variate which is obtained without additional model evaluations. Remarkably, this curation process can be used to understand three very different areas in deep learning: semi-supervised learning, out-of-distribution detection and the cold posterior effect. My interests are: causal inference, graphical models, structure learning.
We develop operators for construction of proposals in probabilistic programs, which we refer to as inference combinators. Specifically, on a synthetic dataset, we show that standard baselines are substantially improved upon through the use of APC, yielding the greatest gains in the combined setting of high missingness and severe class imbalance. Before this he did a post-doc in applied differential geometry at the dept. Much real-world data is sampled at irregular intervals, but most time series models require regularly-sampled data. He directs the Amsterdam Machine Learning Lab (AMLab) and co-directs the Qualcomm-UvA deep learning lab (QUVA) and the Bosch-UvA Deep Learning lab. We evaluate our framework's ability to learn disentangled representations, both by qualitative exploration of its generative capacity, and quantitative evaluation of its discriminative ability on a variety of models and datasets. We propose a two-level hierarchical objective to control the relative degree of statistical independence between blocks of variables and individual variables within blocks. We demonstrate the effectiveness of our method on several tasks in computational physics and chemistry and provide extensive ablation studies. I was a teaching assistant for the Master AI Reinforcement Learning 2019 and 2020 course at the University of Amsterdam. In both cases, G-CNN architectures outperform their classical 2D counterparts and the added value of atrous and localized group convolutions is studied in detail. A number of recent efforts have focused on learning representations that disentangle statistically independent axes of variation by introducing modifications to the standard objective function. In order to obtain equivariance to arbitrary affine Lie groups we provide a continuous parameterisation of separable convolution kernels.
Researchers at UvA will collaborate with Bosch researchers on topics including generative models, causal learning, geometric deep learning, uncertainty quantification in deep learning, human-in-the-loop methods, outlier detection, scene reconstruction, image decomposition, and semantic segmentation. Variational autoencoders (VAEs) learn representations of data by jointly training a probabilistic encoder and decoder network. Herke van Hoof, Patrick Forré, Eric Nalisnick, Erik Bekkers, Christian Naesseth, and Sara Magliacane serve as tenure-track faculty. Moreover, AI algorithms have the potential to guide medical interventions accurately to the location of the tumor without damaging surrounding healthy tissue. Our algorithm can be applied offline on human-demonstrated data, providing a safe scheme that avoids dangerous interaction with the real robot. The lab investigates how technology can deal with biases in data, account for multiple perspectives and subjective interpretations and bridge cultural differences. While most current approaches model the changes as a single shared embedding vector, we leverage insights from the recent causality literature to model non-stationarity in terms of individual latent change factors, and causal graphs across different environments. He finished his PhD in theoretical high energy physics under supervision of Nobel laureate prof. Gerard 't Hooft. High levels of missing data and strong class imbalance are ubiquitous challenges that are often presented simultaneously in real-world time series data. Yet even though neural network models see increasing use in the physical sciences, they struggle to learn these symmetries. Our experiments demonstrate that SENs facilitate the application of equivariant networks to data with complex symmetry representations.
To solve this, we perform probabilistic reasoning over the depth of neural networks. The usual parametrization of robotic movements allows high expressivity but is usually inefficient, as it covers movements not relevant to the task. One of the most well-known examples of category-selectivity is the Fusiform Face Area (FFA), an area of the inferior temporal cortex in primates which responds preferentially to images of faces when compared with objects or other generic stimuli. We develop and use machine learning techniques to discover patterns in data streams produced by experiments in a wide variety of scientific fields, ranging. Furthermore, our loss function is also a consistent surrogate for multiclass L2D, like that of Mozannar & Sontag (2020). In this work, we leverage the newly introduced Topographic Variational Autoencoder to model the emergence of such localized category-selectivity in an unsupervised manner. This enables us to train a single, amortized model that infers causal relations across samples with different underlying causal graphs, and thus leverages the shared dynamics information. Our experiments demonstrate that conjugate EBMs achieve competitive results in terms of image modelling, predictive power of latent space, and out-of-domain detection on a variety of datasets. Most proposed flow models therefore either restrict to a function class with easy evaluation of the Jacobian determinant, or an efficient estimator thereof. These approaches generally assume a simple diagonal Gaussian prior and as a result are not able to reliably disentangle discrete factors of variation. Our E(3) Equivariant Diffusion Model (EDM) learns to denoise a diffusion process with an equivariant network that jointly operates on both continuous (atom coordinates) and categorical features (atom types). We introduce the SE(3)-Transformer, a variant of the self-attention module for 3D point clouds and graphs, which is equivariant under continuous 3D roto-translations.
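To make the notion of equivariance under roto-translations concrete, the following minimal sketch checks numerically that a map f commutes with a transformation g, i.e. that f(g·x) equals g·f(x). It is not the SE(3)-Transformer or any model from the text: the map is a toy stand-in (the centroid of a point cloud), rotations are restricted to the z-axis for brevity, and all function names are hypothetical.

```python
import math

def rotate_z(point, angle):
    """Rotate a 3D point about the z-axis by `angle` radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y, z)

def transform(point, angle, shift):
    """Apply a roto-translation: rotate about z, then translate by `shift`."""
    rx, ry, rz = rotate_z(point, angle)
    return (rx + shift[0], ry + shift[1], rz + shift[2])

def centroid(points):
    """Toy equivariant map (an assumption for illustration): the cloud's mean."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

def equivariance_error(points, angle, shift):
    """Largest coordinate gap between f(g.x) and g.f(x); near zero if f is equivariant."""
    mapped_after = centroid([transform(p, angle, shift) for p in points])
    mapped_before = transform(centroid(points), angle, shift)
    return max(abs(a - b) for a, b in zip(mapped_after, mapped_before))

cloud = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0), (1.0, 1.0, 1.0)]
error = equivariance_error(cloud, angle=0.7, shift=(0.5, -1.0, 2.0))
print(f"equivariance error: {error:.2e}")
```

For a learned network the same check would typically yield a small but nonzero error, which is one way to quantify approximate commutativity between a model and a family of transformations.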
In scientific applications, domain knowledge can give a linear approximation of the latent transition maps, which we can easily incorporate into our model. David also co-founded Invenia, an energy forecasting and trading company. Usage of such domain knowledge is reflected in excellent results (despite our model's simplicity) on the chaotic Lorenz system compared to fully supervised and variational inference methods. Machine learning is marking a revolution in the world. This work introduces a diffusion model for molecule generation in 3D that is equivariant to Euclidean transformations. Empirical results demonstrate MoE-NP's strong generalization capability to unseen tasks in these benchmarks. The AI4Science Lab is also connected to AMLab, the Amsterdam Machine Learning Lab. Academics in turn gain a better understanding of how AI is used to innovate research platforms to solve real-world societal problems. The researchers in the lab work on techniques across the full breadth of AI. The research projects cover fundamental research topics, including model-based exploration, parallel model-based reinforcement learning, methods for combined online and offline evaluation, prediction methods that correct for undesired feedback loops and selection bias, domain generalization and domain adaptation, and novel language processing models for better generalization. A collaboration between City of Amsterdam, the University of Amsterdam, and the VU University Amsterdam. His previous appointments include VP at Qualcomm Technologies, professor at UC Irvine, postdoc at U. Toronto and UCL under supervision of prof. Geoffrey Hinton, and postdoc at Caltech under supervision of prof. Pietro Perona.
In addition, we also proposed 4 criteria (with evaluation metrics) that multi-modal deep generative models should satisfy; in the second work, we designed a contrastive-ELBO objective for multi-modal VAEs that greatly reduced the amount of paired data needed to train such models. This enables us to train a single, amortized model that infers causal relations across samples with different underlying causal graphs, and thus makes use of the information that is shared.