# Seminars

The Intelligent Systems Group periodically organises academic research seminars, which usually take place in the Queens Building on Wednesday afternoons. (We also organise Problem Workshops with companies and other interested parties; see here for more information.)

For UoB members: subscribe to our Outlook group if you want to receive calendar events.

For many of the upcoming ISL research seminars, there will also be opportunities for booking individual meetings with the visiting scholars. Contact us if you are interested.

Enquiries: For any enquiries related to ISL seminars, please contact the organiser, Song Liu (song.liu (at) bristol.ac.uk).

Note that the time and location of the seminars may vary from week to week.

# UPCOMING SEMINARS

Dr. Luigi Acerbi, 8th May, 2-3pm, 1.58 Queens Building. This is an ISL/CS/CNU joint seminar.

Title: Variational Bayesian Monte Carlo

Abstract: Many probabilistic models in computational neuroscience and machine learning have black-box, expensive likelihoods that prevent the application of standard techniques for approximate Bayesian inference, such as MCMC, which would require access to the gradient or a large number of likelihood evaluations.
In this talk, I introduce a novel sample-efficient inference framework, Variational Bayesian Monte Carlo (VBMC) — think of Bayesian optimization, but with the goal of obtaining a full Bayesian posterior instead of a single point estimate.
VBMC combines variational inference with Gaussian-process based, active-sampling Bayesian quadrature, using the latter to efficiently approximate the intractable integral in the variational objective. Our method produces both a nonparametric approximation of the posterior distribution and an approximate lower bound of the model evidence, useful for model selection. Across a number of tested problems and dimensions (up to D = 10), including a neuronal model with real data, VBMC performs consistently well in reconstructing the ground-truth posterior and model evidence with a limited budget of likelihood evaluations, showing promise as a general tool for inference with black-box, expensive likelihoods.

Paper: https://arxiv.org/abs/1810.05558
MATLAB toolbox: https://github.com/lacerbi/vbmc
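The "intractable integral in the variational objective" mentioned in the abstract is handled by Bayesian quadrature: fit a GP to the integrand and integrate the GP posterior mean against a Gaussian in closed form. A minimal one-dimensional sketch of that building block (a toy illustration in Python/NumPy, not the VBMC toolbox; the nodes, length scale and test function are all invented here):

```python
import numpy as np

# Toy Bayesian quadrature: estimate Z = E_{x~N(0,1)}[f(x)] from a few
# evaluations of f, using a GP surrogate with an RBF kernel.  The kernel
# integrals against the Gaussian measure are available in closed form,
# which is the trick exploited inside the variational objective.

def rbf(a, b, ell):
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ell ** 2)

def bq_estimate(f, nodes, ell=1.0, jitter=1e-8):
    K = rbf(nodes, nodes, ell) + jitter * np.eye(len(nodes))
    # z_i = \int k(x, x_i) N(x; 0, 1) dx  (closed form for the RBF kernel)
    z = np.sqrt(ell**2 / (ell**2 + 1.0)) * np.exp(-0.5 * nodes**2 / (ell**2 + 1.0))
    return z @ np.linalg.solve(K, f(nodes))

nodes = np.linspace(-3.0, 3.0, 15)          # 15 "likelihood evaluations"
Z_hat = bq_estimate(lambda x: x**2, nodes)  # true value: E[x^2] = 1
```

With only 15 evaluations the estimate is already close to the true value 1, which is the sample efficiency the abstract refers to.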

# PAST SEMINARS

Prof. David Saad, Aston University. Date: 2-3pm, Wednesday 27 March 2019, Senate House 5.10.

Abstract: The modern world comprises interlinked networks of individuals, computing devices and social groups, where information and opinions propagate through edges in a probabilistic or deterministic manner via interactions between individual constituents (exchanging political views, gossiping or transmitting computer viruses). Winners are those who maximise the impact by deploying resource to the most influential available nodes at the right time. We developed an analytical framework, based on statistical physics tools, for impact maximisation in probabilistic information propagation on networks. It is based on Dynamical Message Passing, which calculates the propagation probability, combined with a global variational optimisation process within a finite time horizon. We address the following questions: 1) given a budget and a propagation/infection process, which nodes are best to infect in order to maximise the spreading? 2) how to maximise the impact on a subset of particular nodes at given times, while having restricted access? 3) how to identify the most appropriate vaccination targets to contain an epidemic? 4) how to optimally deploy resource in the presence of competitive/collaborative processes? We also point to potential applications.

Lokhov A.Y. and Saad D., Optimal Deployment of Resources for Maximizing Impact in Spreading Processes, PNAS 114 (39), E8138 (2017).

Heishiro Kanagawa, UCL. Title: Informative Features for Model Comparison. Date: 3.30-4.30pm, Thursday 21st March, SM1, Maths Building.
Abstract: Given two candidate models and a set of target observations, we address the problem of measuring the relative goodness of fit of the two models. We propose two new statistical tests which are nonparametric, computationally efficient (runtime complexity is linear in the sample size), and interpretable. As a unique advantage, our tests can produce a set of examples (informative features) indicating the regions in the data domain where one model fits significantly better than the other. In a real-world problem of comparing GAN models, the test power of our new test matches that of the state-of-the-art test of relative goodness of fit, while being one order of magnitude faster.
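The core idea of relative goodness-of-fit testing is to compare two models by their kernel distance to the data. The sketch below uses the quadratic-time MMD baseline the abstract benchmarks against, not the authors' linear-time feature-based test; the distributions, kernel bandwidth and sample sizes are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def mmd2(x, y, sigma=1.0):
    # Biased quadratic-time estimate of the squared MMD with a Gaussian kernel.
    def k(a, b):
        return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / sigma**2)
    return k(x, x).mean() + k(y, y).mean() - 2.0 * k(x, y).mean()

data = rng.normal(0.0, 1.0, 500)   # observations
xp = rng.normal(0.0, 1.0, 500)     # samples from candidate model P (well specified)
xq = rng.normal(1.0, 1.0, 500)     # samples from candidate model Q (mean shifted)

# Model P fits the data better when its MMD to the data is smaller.
mmd_p, mmd_q = mmd2(xp, data), mmd2(xq, data)
```

A real relative test would also calibrate the difference mmd_p - mmd_q against its sampling distribution before declaring a winner.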

Prof. Trevor Martin, University of Bristol. Date: 1-2pm, Wednesday 13 March 2019, Room 2.17, 35 Berkeley Square (HWB).

Title: Collaborative Intelligence – a Matter of Degree?

Abstract: In contrast to the knowledge-driven nature of earlier AI, most current systems are data-driven; however, in both “new” and “old” AI we can see autonomous and collaborative intelligent systems. In autonomous AI, a system performs tasks without significant human input. Examples include product recommendation, game-playing, control of appliances and driverless vehicles. In contrast, collaborative intelligent systems aim to use the complementary strengths of humans and computers in partnership for tasks such as assisted driving, computer-aided diagnosis and complex data analysis. Situation awareness – where multiple heterogeneous sources of data must be integrated – is ideally suited to a collaborative intelligence approach. Human analysts provide insight and interpretation, while machines perform data collection, repetitive processing and visualisation. An important aspect of collaborative intelligence is the common definition of terms used by humans and machines to identify and categorise the entities, relations and events under consideration. In this talk we will argue that graded concepts (based on fuzzy set theory) are a natural framework for the interaction and exchange of information between analysts and machines. We will describe a new approach to fuzzy categorisation, and outline examples where this assists collaborative intelligence.

Wenkai Xu, UCL Gatsby Unit, 20 Feb, 1.58 Queens Building.

Title: Community and Relational Detection via Structured Non-negative Factorisation

Abstract: We propose a new method for community detection in directed networks. The proposed method identifies the communities based on directed interactions between them. Simultaneously, the method summarises these interactions by the directed edges between communities. The community assignment criteria are based on maximizing the net flow of (weighted) directed edges from one community to another. We show that, in the absence and presence of noise, positive values in decomposed vectors of adjacency matrices are useful for identifying communities, and they motivate our structured non-negative models. We present the multiplicative update algorithm for our model and show that normalisation facilitates better convergence. In addition, we extend our model to a tensor version to tackle the multiple-network problem, in which we identify communities with common structure, and the interactions between such communities, from multiple networks observed over the same set of vertices. For instance, in a social network we observe different relationships, such as “like”, “hatred” and “respect”, between users.
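The multiplicative update algorithm referred to in the abstract is, in its plain form, the standard Lee-Seung scheme for non-negative matrix factorisation. A minimal sketch on a toy directed adjacency matrix (plain NMF only; the speaker's structured and normalised variant is not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(1)

def nmf(A, r, iters=200, eps=1e-9):
    # Lee-Seung multiplicative updates for A ~= W @ H with W, H >= 0.
    # Each update multiplies by a non-negative ratio, so non-negativity
    # is preserved and the Frobenius error is non-increasing.
    n, m = A.shape
    W = rng.random((n, r)) + 0.1
    H = rng.random((r, m)) + 0.1
    for _ in range(iters):
        H *= (W.T @ A) / (W.T @ W @ H + eps)
        W *= (A @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy directed network: community {0,1,2} sends edges to community {3,4,5}.
A = np.zeros((6, 6))
A[:3, 3:] = 1.0
W, H = nmf(A, r=2)
err = np.linalg.norm(A - W @ H)
```

Large entries in a column of W indicate which nodes a community contains; the corresponding row of H summarises where that community's edges point.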

Jeffrey Bowers, University of Bristol, 13 Feb, Room 2.17, 35 Berkeley Square (HWB).

Title: Looking for grandmother cells in neural networks and some other recent work regarding visual generalization in neural networks

Abstract: I will describe some work in which we look for selective “grandmother cell” representations in neural networks. We find some in recurrent networks, but not in feedforward networks, including AlexNet. We hypothesize why networks learn selective representations in some conditions and not others. I will also briefly summarize some work in my lab that investigates how CNNs categorize images. We show how they operate very differently from the human visual system: CNNs do not have a shape bias when categorizing images, and, if time permits, I'll show that they do not encode relations when categorizing images.

Dr Nathan Lepora, University of Bristol, 16th Jan, Room 2.17, 35 Berkeley Square (HWB).

Title: Deep learning for robot hands

Abstract: Deep learning has the potential to have the impact on robot touch that it has had on computer vision, where it is currently dominating recent progress in vision-based robotics. I describe our work on applying deep learning to an optical biomimetic tactile sensor, the TacTip, which images an array of papillae (pins) inside its sensing surface analogous to structures within human skin. Our main result is that the application of a deep CNN can give a robust policy for planning contact points to move when interacting with objects. These results rely on using techniques to encourage generalization to tasks beyond which the model was trained, and hold promise to endow robotic hands with the capability to robustly and dexterously manipulate held objects.

This work is being continued in a joint project with Google DeepMind.

Dr Patrick Rubin-Delanchy, University of Bristol, 9th Jan, 1-2pm, 1.58, Queens Building.

Title: Statistical modelling of networks

Abstract: Many modern statistical problems involve making sense of large networks. Inference is complicated because principled statistical approaches often cannot reasonably be implemented at operational scales, while tractable analysis techniques often have unsatisfactory theoretical justifications. In this talk I will outline mathematical results, spanning the last decade, that arguably provide the theoretical underpinnings of spectral embedding and clustering. A benefit of this theoretical perspective is to identify obvious and easily corrected inefficiencies in the standard implementations. Empirical improvements in link prediction as well as the potential to uncover much richer hidden community structure are demonstrated in a cyber-security application.
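Adjacency spectral embedding followed by clustering, as discussed in the abstract, can be sketched in a few lines: embed each node via the top eigenpairs of the adjacency matrix, then cluster the embedded points. A toy two-block stochastic block model example (illustrative only; the SBM parameters and the sign-based clustering shortcut are our own simplifications, not the speaker's pipeline):

```python
import numpy as np

rng = np.random.default_rng(2)

def spectral_embed(A, d):
    # Adjacency spectral embedding: rows of U_d |S_d|^{1/2}, taking the
    # top-d eigenpairs of the symmetric adjacency matrix by magnitude.
    vals, vecs = np.linalg.eigh(A)
    idx = np.argsort(-np.abs(vals))[:d]
    return vecs[:, idx] * np.sqrt(np.abs(vals[idx]))

# Two-block stochastic block model: dense within blocks, sparse across.
n = 40
labels = np.repeat([0, 1], n // 2)
P = np.where(labels[:, None] == labels[None, :], 0.7, 0.1)
A = (rng.random((n, n)) < P).astype(float)
A = np.triu(A, 1)
A = A + A.T                                   # symmetric, no self-loops

X = spectral_embed(A, d=2)
# For an assortative two-block model the sign of the second embedding
# coordinate separates the communities (up to relabelling).
guess = (X[:, 1] > 0).astype(int)
accuracy = max((guess == labels).mean(), ((1 - guess) == labels).mean())
```

In practice one would run k-means or a Gaussian mixture on the embedded points rather than reading off a sign, but the geometry is the same.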

Mr Ángel Miguel García-Vico, University of Jaén (Spain). 26th November, 15.30-16.30. Brandon Room, Maggs House.

Title: Challenges of Emerging Pattern Mining

Abstract: Supervised descriptive rule discovery is a data mining framework whose main aim is the extraction of rules that describe insights related to the underlying phenomena in data with respect to a variable of interest for the expert. In particular, emerging pattern mining (EPM) is a data mining task that finds patterns or rules describing emerging behaviour throughout time, or the differentiating characteristics with respect to a variable of interest. However, the extraction of these patterns is challenging in today's scenarios, where massive amounts of data are stored. In addition, this extraction becomes more complex in scenarios where data is continuously arriving, since constraints such as processing time and memory are tighter. In this talk, some proposed solutions, together with applications to real-world data, are presented.

Bio: Ángel Miguel García-Vico is currently working toward the Ph.D. degree in the Intelligent Systems and Data Mining (SiMiDat) research group at the University of Jaén (Spain). He received his B.Sc. degree in Computer Science from the University of Jaén in 2015 and his M.Sc. degree in Data Science and Computer Engineering from the University of Granada, Spain, in 2016. He has been a visiting researcher at Northumbria University, Newcastle (UK) and De Montfort University, Leicester (UK). His research interests include emerging pattern mining, subgroup discovery, evolutionary fuzzy systems, big data analysis and data stream mining, where he has published four papers in top-impact journals such as Soft Computing, Cognitive Computation and IEEE Transactions on Fuzzy Systems. He also received the best paper award within the Big Data and Scalable Data Analysis track of the XVIII Conference of the Spanish Association for Artificial Intelligence in 2018.

Title: Univariate Mean Change Point Detection: Penalization, CUSUM and Optimality. 1.58 Queens Building, Wednesday 5th Dec.

Abstract: The problem of univariate mean change point detection and localization based on a sequence of $n$ independent observations with piecewise constant means has been intensively studied for more than half a century, and serves as a blueprint for change point problems in more complex settings. We provide a complete characterization of this classical problem in a general framework in which the upper bound $\sigma^2$ on the noise variance, the minimal spacing $\Delta$ between two consecutive change points and the minimal magnitude $\kappa$ of the changes are allowed to vary with $n$. We first show that consistent localization of the change points is impossible when the signal-to-noise ratio $\frac{\kappa \sqrt{\Delta}}{\sigma} < \sqrt{\log(n)}$. In contrast, when $\frac{\kappa \sqrt{\Delta}}{\sigma}$ diverges with $n$ at a rate of at least $\sqrt{\log(n)}$, we demonstrate that two computationally-efficient change point estimators, one based on the solution to an $\ell_0$-penalized least squares problem and the other on the popular wild binary segmentation algorithm, are both consistent and achieve a localization rate of the order $\frac{\sigma^2}{\kappa^2} \log(n)$. We further show that such a rate is minimax optimal, up to a $\log(n)$ term.
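For a single change, the CUSUM statistic in the title has a simple closed form; its argmax over split points is the change point estimate. A toy sketch (this illustrates plain CUSUM only, not the $\ell_0$-penalized or wild binary segmentation estimators analysed in the talk; the signal and noise levels are invented here):

```python
import numpy as np

rng = np.random.default_rng(3)

def cusum_changepoint(x):
    # CUSUM statistic C(t) = sqrt(t(n-t)/n) * |mean(x[:t]) - mean(x[t:])|;
    # the estimated change point is the argmax of C over t.
    n = len(x)
    t = np.arange(1, n)
    csum = np.cumsum(x)[:-1]
    stat = np.sqrt(t * (n - t) / n) * np.abs(csum / t - (x.sum() - csum) / (n - t))
    return int(t[np.argmax(stat)])

# Piecewise constant mean with a single change at t = 50.
x = np.concatenate([rng.normal(0.0, 0.5, 50), rng.normal(2.0, 0.5, 50)])
cp = cusum_changepoint(x)
```

Here $\kappa\sqrt{\Delta}/\sigma \approx 28 \gg \sqrt{\log n}$, so per the abstract's phase transition the change point is localized accurately.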

Professor Fabio Cuzzolin, Oxford Brookes University, 21 Nov, 1-2pm, 1.58 QB

Title: Machine theory of mind at the crossroad of artificial intelligence and neuroscience

Abstract:  Artificial intelligence is becoming part of our lives. Smart cars will engage our roads in less than ten years’ time; shops with no checkout, which automatically recognise customers and what they purchase, are already open for business. But to enable machines to deal with uncertainty, we must fundamentally change the way machines learn from the data they observe so that they will be able to cope with situations they have never encountered in the safest possible way. Interacting naturally with human beings and their complex environments will only be possible if machines are able to put themselves in people’s shoes: to guess their goals, beliefs and intentions – in other words, to read our minds.

Professor Steven Schockaert, Cardiff University, 7th Nov, 1-2pm, 1.58 QB

Title: Distributional Relation Vectors

Abstract: Word embeddings implicitly encode a rich amount of semantic knowledge. The extent to which they can capture relational information, however, is inherently limited. To address this limitation, we propose to learn relation vectors, describing how two words are related based on the distribution of words in sentences where these two words co-occur. In this way, we can capture aspects of word meaning that are complementary to what is captured by word embeddings. For example, by examining clusters of relation vectors, we observe that relational similarities can be identified at a more abstract level than with traditional word vector differences. These relation vectors can be used, among others, to enrich the input to neural text classification models. From a network of relation vectors, we can also learn relational word vectors. These are vector representations of word meaning which, unlike standard word vectors, capture relational properties rather than similarity. On a range of different tasks, we find that combining these relational word vectors with standard word vectors leads to improved results.

Frank Hopfgartner, Senior Lecturer in Data Science, University of Sheffield. Wednesday 31st October, 1-2pm. Room TBD.

Title: News Retrieval and Recommendation Throughout the Years

Abstract: While traditionally news was delivered via newspapers, radio and television, the Web has opened up new opportunities for both distributing and accessing news from different sources. Thanks to the low distribution costs of online news, we are bombarded with what appears to be an endless supply of news and other reports. In this talk, I will give an overview of research and evaluation initiatives that focus on news retrieval and recommendation.

Short Bio: Frank Hopfgartner is Senior Lecturer in Data Science and Head of the Information Retrieval Research Group at the University of Sheffield. His research to date can be placed at the intersection of information access, document analysis and data science. He has (co-)authored over 150 publications in the above-mentioned research fields, including a book on smart information systems, various book chapters, and papers in peer-reviewed journals, conferences and workshops.

Cheng Zhang, Microsoft Research Cambridge. Title: Data-Efficient Machine Learning. Time: 12.00 – 13.00, 15th Oct (Monday). Venue: QB 1.69

Abstract: Making decisions using machine learning requires information relevant to the task at hand. In many real-life applications, the datasets available are often not ideal, and missing data is typical. In this talk, I will present machine learning methods that can utilize datasets with missing data entries and efficiently acquire information in a cost-saving manner. In particular, I will focus on two projects. The first is EDDI, for Efficient Dynamic Discovery of high-value Information. In EDDI, we propose a novel partial variational autoencoder (Partial VAE) to efficiently handle missing data over varying subsets of known information. Based on Bayesian experimental design, EDDI combines this Partial VAE with an acquisition function that maximizes expected information gain on a set of target variables. EDDI is efficient and demonstrates that dynamic discovery of high-value information is possible. Secondly, I will present work on active mini-batch sampling using point processes, which simultaneously balances a dataset with selection bias and reduces the variance of stochastic gradient methods. I will conclude the talk with my general research agenda and the research interests of the Machine Intelligence and Perception group at Microsoft Research, Cambridge.

Short Bio: Cheng Zhang is a researcher in the Machine Intelligence and Perception group at Microsoft Research Cambridge. Before joining Microsoft, she was at the Statistical Machine Learning group at Disney Research Pittsburgh, located at Carnegie Mellon University. She received her PhD from the Department of Robotics, Perception and Learning (RPL, former CVAP), KTH Royal Institute of Technology, Stockholm. She is interested in machine learning theory, including variational inference, deep generative models and causality, as well as various machine learning applications with social impact.

Machine Learning Approach to Topological Data Analysis, Prof Kenji Fukumizu, The Institute of Statistical Mathematics, Japan. Date: 3pm-4pm 18th Sep 2018, Place: F.101a in Queen’s Building.

Abstract: Topological data analysis (TDA) is a recent methodology for extracting topological and geometrical features from data with complex geometric structures. Persistent homology, a mathematical notion proposed by Edelsbrunner (2002), provides a multiscale descriptor for the topology of data, and has recently been applied to a variety of data analyses. In this talk I will introduce a machine learning framework for TDA, combining persistent homology and kernel methods. As an expression of persistent homology, persistence diagrams are widely used to express the lifetimes of generators of homology groups. While they serve as a compact representation of data, it is not straightforward to apply standard data analysis to persistence diagrams, since they consist of a set of points in 2D space expressing the lifetimes. We introduce a method of kernel embedding of the persistence diagrams to obtain their vector representation, which enables one to apply any kernel method in topological data analysis, and propose a persistence weighted Gaussian kernel as a suitable kernel for vectorization of persistence diagrams. Some theoretical properties, including Lipschitz continuity of the embedding, are also discussed. I will also present applications to change point detection and time series analysis in the fields of materials science and biochemistry.
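A persistence weighted Gaussian kernel of the kind described can be evaluated directly from two diagrams. A minimal sketch (the arctan weight and all parameter values below are one common choice for this kernel family, not necessarily the speaker's exact settings):

```python
import numpy as np

def pwg_kernel(D, E, sigma=1.0, C=1.0, p=1.0):
    # Persistence weighted Gaussian kernel between two persistence diagrams,
    # given as arrays of (birth, death) pairs.  Longer-lived features get
    # larger weights w(x) = arctan(C * (death - birth)^p), so topological
    # noise near the diagonal contributes little.
    wD = np.arctan(C * (D[:, 1] - D[:, 0]) ** p)
    wE = np.arctan(C * (E[:, 1] - E[:, 0]) ** p)
    sq = ((D[:, None, :] - E[None, :, :]) ** 2).sum(axis=-1)
    return float(wD @ np.exp(-0.5 * sq / sigma**2) @ wE)

D1 = np.array([[0.0, 1.0], [0.2, 0.3]])   # one prominent, one short-lived feature
D2 = np.array([[0.0, 0.9]])
k12 = pwg_kernel(D1, D2)
k11 = pwg_kernel(D1, D1)
k22 = pwg_kernel(D2, D2)
```

Because the kernel is an inner product of weighted embeddings, any kernel method (SVM, kernel PCA, two-sample tests) can then be run on diagrams.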

“Approximate Kernel Embeddings and Learning on Aggregates”. Dino Sejdinovic, University of Oxford. 12.00-13.00, Monday 18th June, 1.68 Queen’s Building.
Abstract: Kernel embeddings of distributions and the Maximum Mean Discrepancy (MMD), the resulting probability metric, are useful tools for fully nonparametric hypothesis testing and for learning on distributional inputs, i.e. where outputs are only observed at an aggregate level of inputs. I will give an overview of this framework and describe the use of large-scale approximations to kernel embeddings in the context of Bayesian approaches to learning on distributions and in the context of distributional covariate shift, e.g. where measurement noise on the training inputs differs from that on the testing inputs.
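One standard large-scale approximation to kernel embeddings is random Fourier features: the embedding of a distribution becomes a finite-dimensional feature average, and the MMD becomes a Euclidean distance between two vectors. A sketch (all parameters and distributions invented for illustration; this is the generic technique, not the speaker's specific models):

```python
import numpy as np

rng = np.random.default_rng(4)

def rff(x, omega, b):
    # Random Fourier feature map: phi(x) . phi(y) approximates a Gaussian
    # kernel k(x, y) = exp(-(x - y)^2 / (2 sigma^2)).
    return np.sqrt(2.0 / len(omega)) * np.cos(np.outer(x, omega) + b)

D, sigma = 2000, 1.0
omega = rng.normal(0.0, 1.0 / sigma, D)   # frequencies ~ kernel's spectral density
b = rng.uniform(0.0, 2 * np.pi, D)

# Sanity check: the feature inner product approximates the exact kernel.
k01 = (rff(np.array([0.0]), omega, b) @ rff(np.array([1.0]), omega, b).T).item()

# Kernel mean embeddings become feature averages, so the squared MMD between
# two samples is just a squared distance between two D-dimensional vectors.
x = rng.normal(0.0, 1.0, 300)
y = rng.normal(0.5, 1.0, 300)
mu_x = rff(x, omega, b).mean(axis=0)
mu_y = rff(y, omega, b).mean(axis=0)
mmd2_approx = ((mu_x - mu_y) ** 2).sum()
```

Replacing the kernel matrix with D-dimensional features reduces the cost of embedding n points from O(n^2) to O(nD), which is what makes the Bayesian approaches in the talk feasible at scale.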
“Cryptocurrency Risk Modeling from Blockchain Graph Analysis”. Matthew Dixon, IIT Stuart School of Business (Chicago, IL, US). 21st May 2018, 12-1pm. Room: QB 0.35 ARR.

Abstract: In contrast to financial exchanges, Blockchain based crypto-currencies expose the entire transaction history to the public. By processing all transactions, we model the network with a high fidelity graph so that it is possible to characterize how the flow of information in the network evolves over time. We demonstrate how this data representation permits a new form of financial modeling — with the emphasis on the topological network structures to study the role of users, entities and their interactions in formation and dynamics of crypto-currency investment risk. In particular, we identify certain sub-graphs (‘chainlets’) that exhibit predictive influence on Bitcoin price and volatility and characterize the types of chainlets that signify bitcoin losses.  This is joint work with Cuneyt Akcora, Yulia Gel and Murat Kantarcioglu.

Bio: Matthew Dixon holds a Ph.D. in Applied Mathematics from Imperial College (2007) and an M.Sc. in Scientific Computing from Reading University (2002). He began his research career as a visiting research fellow at the Center for Nonlinear Studies (LANL) in 2005 and 2006. This was followed by postdoctoral appointments at the Institute for Computational and Mathematical Engineering, Stanford University, and UC Davis, where he focused increasingly on the computational problems arising in large-scale predictive simulations. This led him to work with Silicon Valley and finance firms, with an interest in the theory and practical applications of machine learning and computational statistics. Matthew joined the Illinois Institute of Technology in 2015 as a tenure-track assistant professor, where he teaches computational finance and Bayesian modeling in the Mathematics and Finance Departments. His research in fintech is funded by Intel.

“On Monte-Carlo Tree Search and Reinforcement Learning”. Spyros Samothrakis, Institute for Analytics and Data Science, University of Essex. 23rd April 2018, 12:00 – 13:00, Room TBD.


Abstract: Fuelled by successes in Computer Go, Monte Carlo tree search (MCTS) has achieved widespread adoption within the games community. Its links to traditional reinforcement learning (RL) methods have been outlined in the past; however, the use of RL semantics within tree search has not been thoroughly studied yet. In this talk we re-examine in depth this close relation between the two fields; we show that a straightforward adaptation of RL semantics within tree search can lead to a wealth of new algorithms, for which the traditional MCTS is only one of the variants. We confirm that planning methods inspired by RL in conjunction with online search demonstrate encouraging results on several classic board games and in arcade video game competitions, where our algorithm recently ranked first. Our study promotes a unified view of learning, planning, and search.

Bio: Spyros Samothrakis is a Lecturer and Assistant Director in the Institute for Analytics and Data Science (IADS) at the University of Essex. Prior to his current role, he was a Senior Research Officer within the School of Computer Science and Electronic Engineering (CSEE). His research interests include Reinforcement Learning, Neural Networks and Causality. He obtained his PhD from the University of Essex (2014). He has won prizes in causality competitions (June 2013) and has published papers in all the areas above, in the relevant journals and conferences. He has reviewed for a number of journals (IEEE Transactions on Evolutionary Computation, IEEE Transactions on Computational Intelligence and AI in Games) and has served as a programme committee member for a number of international conferences, including FDG/DIGRA, IEEE CIG and GECCO. He has applied his knowledge of Reinforcement Learning and Machine Learning in widely different fields (e.g., natural language processing, game playing) with the explicit aim of showing that the underlying fundamental principles and methods remain the same, irrespective of the application domain. He is the academic supervisor in a number of KTP partnerships the University of Essex holds with local businesses and has recently won the Best KTP Academic award from the University for his work.

“Using Machine Learning for Cyber Security”. Harsha Kumara Kalutarage, Centre for Secure Information Technologies, Queen’s University Belfast. 16th April 2018, 12:00 – 13:00, Room TBD.


Abstract: With the recent advancements in Machine Learning (ML), systems built on ML can be found in every domain. In spite of extensive academic research, however, such systems are not yet widely used in practice for Cybersecurity. This is because of some fundamental differences between Cybersecurity problems and other problems where ML usually finds much more success. This talk begins with an understanding of the behaviours of intruders, with a particular focus on computer networks, and then presents our recent work in this research area. It includes carefully deployed empirical analyses with a number of attack scenarios on computer and controller area networks. Finally, the talk concludes with a discussion of research challenges and provides suggestions on how to move forward in this research line.

Bio: Harsha Kumara Kalutarage is currently a Senior Research Engineer in Security Data Analytics at the Centre for Secure Information Technologies, Queen’s University Belfast, UK. His research interest is in improving Cybersecurity through the advancement and application of Data Science and Machine Learning. He wants to leverage his applied computer science research background to develop and evaluate new technologies in Cybersecurity. Harsha particularly enjoys tackling real-world security problems, not only for academic interest but also for generating useful tools to improve everyday life. To date, the impact of Harsha’s research in this area includes over 20 publications, a patent (pending) and technology transfer. Harsha holds a Ph.D. in Computing (Cybersecurity), an M.Phil. in Computer Science (Speech Synthesis) and a B.Sc. Special degree (Statistics and Computing).

Title: Teaching and learning in uncertainty. Varun Jog, ECE Department, UW-Madison. Queens Building, room 1.6, 13.00-14.00, Friday 9th March 2018.
Abstract: We investigate a simple model for social learning with two characters: a teacher and a student. The teacher's goal is to teach the student the state of the world $\Theta$; however, the teacher herself is not certain about $\Theta$ and needs to simultaneously learn it and teach it. We examine several natural strategies the teacher may employ to make the student learn as fast as possible. Our primary technical contribution is analyzing the exact learning rates for these strategies by studying the large deviation properties of the sign of a transient random walk on $\mathbb{Z}$.

“Theoretical support of machine learning debugging via weighted M-estimation”. Xiaomin Zhang, University of Wisconsin-Madison. 1st March 2018, 12:00 – 13:00, Room 0.3 MVB. *Cancelled due to extreme weather*.

Abstract: We study a linear regression formulation of machine learning debugging, where data are obtained from two distinct pools of “clean” and “contaminated” data. The goal is to correctly identify the subset of buggy data contained in the contaminated data pool. We propose a novel weighted $M$-estimator that applies a Huber loss to the contaminated data and a squared error loss to the clean data, and derive rigorous statistical properties of the estimator. Our results reveal the dependence among the proper choice of relative weights, the sample sizes of the clean and contaminated data sets, and the ratio between the noise variances of the two datasets. Simulation studies demonstrate the success of our method when applied to debugging tasks involving synthetic and real datasets.
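A toy version of the weighted M-estimation idea (squared loss on the clean pool, Huber loss on the contaminated pool) can be written in a few lines. This is a one-dimensional gradient-descent sketch with all constants invented here, not the paper's estimator or its theory:

```python
import numpy as np

rng = np.random.default_rng(5)

def huber_grad(r, delta=1.0):
    # Derivative of the Huber loss: linear tails bound the influence of bugs.
    return np.clip(r, -delta, delta)

# One-dimensional regression y = 2x + noise, with a clean pool and a
# contaminated pool whose first 10 responses are "buggy" (offset by 8).
x_cl = rng.normal(size=100); y_cl = 2.0 * x_cl + 0.1 * rng.normal(size=100)
x_ct = rng.normal(size=50);  y_ct = 2.0 * x_ct + 0.1 * rng.normal(size=50)
y_ct[:10] += 8.0

# Minimise  sum_clean (y - wx)^2 + lam * sum_cont huber(y - wx)  over w.
w, lr, lam = 0.0, 1e-3, 1.0
for _ in range(500):
    g = -2.0 * x_cl @ (y_cl - w * x_cl)               # squared loss, clean pool
    g -= lam * x_ct @ huber_grad(y_ct - w * x_ct)     # Huber loss, contaminated
    w -= lr * g

# Flag contaminated points with large residuals as buggy.
buggy = np.flatnonzero(np.abs(y_ct - w * x_ct) > 4.0)
```

The Huber loss caps each buggy point's pull on the fit, so the residuals of the buggy subset stay large and identifiable.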

Bio: I am a third-year graduate student in the CS Department at UW-Madison. My advisor is Po-Ling Loh. I am interested in the intersection of statistics, machine learning and optimization. Currently my research focuses on high-dimensional statistics.

“Modeling disease propagation in networks: source-finding and influence maximization”. Po-Ling Loh, University of Wisconsin-Madison. 29th January 2018, 12:00 – 13:00, QB F101c (Queen’s Building).


Abstract: We present several recent results concerning stochastic modeling of disease propagation over a network. In the first setting, nodes are infected one at a time, starting from a single infected individual, and the goal is to infer the source of the infection based on a snapshot of infected individuals. We show that if the underlying graph is a tree and possesses a certain regular structure, it is possible to construct confidence sets for the diffusion source with size independent of the number of infected nodes. Furthermore, the confidence sets we construct possess an attractive property of “persistence,” meaning they eventually settle down as the disease spreads over the network. In the second setting, nodes are infected in waves according to linear threshold or independent cascade models. We establish upper and lower bounds for the influence of a subset of nodes in the network, where the influence is defined as the expected number of infected nodes at the conclusion of the epidemic. We quantify the gap between our upper and lower bounds in the case of the linear threshold model and illustrate the gains of our upper bounds for independent cascade models in relation to existing results. Importantly, our lower bounds are monotonic and submodular, implying that a greedy algorithm for influence maximization is guaranteed to produce a maximizer within a 1-1/e factor of the truth. This is joint work with Justin Khim and Varun Jog.
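The greedy algorithm behind the $1-1/e$ guarantee mentioned at the end can be sketched with a Monte Carlo independent cascade simulation (the star graph, infection probability and simulation budget are invented for illustration; the talk's bounds avoid this brute-force simulation):

```python
import numpy as np

rng = np.random.default_rng(6)

def ic_spread(adj, seeds, p, n_sims=200):
    # Monte Carlo estimate of the expected number of infected nodes under
    # the independent cascade model, starting from the given seed set.
    total = 0
    for _ in range(n_sims):
        active = set(seeds)
        frontier = list(seeds)
        while frontier:
            new = []
            for u in frontier:
                for v in np.flatnonzero(adj[u]):
                    if v not in active and rng.random() < p:
                        active.add(v)
                        new.append(v)
            frontier = new
        total += len(active)
    return total / n_sims

def greedy_seeds(adj, k, p):
    # Greedy maximisation: influence is monotone and submodular, so the
    # greedy seed set is within a (1 - 1/e) factor of the optimum.
    seeds = []
    for _ in range(k):
        gains = [ic_spread(adj, seeds + [v], p) if v not in seeds else -1.0
                 for v in range(adj.shape[0])]
        seeds.append(int(np.argmax(gains)))
    return seeds

# Toy star graph: the hub (node 0) is the obvious first seed.
n = 9
adj = np.zeros((n, n), dtype=int)
adj[0, 1:] = 1
adj[1:, 0] = 1
seeds = greedy_seeds(adj, k=1, p=0.5)
```

Monotonicity and submodularity of the spread function are exactly the properties the abstract's lower bounds establish, and they are what justify this greedy loop.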

Bio: Po-Ling grew up in the lovely town of Madison, Wisconsin. After graduating from Caltech with a BS in math and a minor in English, she moved to UC Berkeley, where she subsequently earned an MS in computer science and a PhD in statistics. From 2014 to 2016, she was an assistant professor in the Department of Statistics at the Wharton School of the University of Pennsylvania. She moved back to Madison in the summer of 2016.

“Event reasoning for transport video surveillance”. Huiyu Zhou, University of Leicester. 23rd January 2018, 13:00 – 14:00, SCEEM 1.11 (MVB).

Abstract: The aim of transport video surveillance is to provide robust security camera solutions for mass transit systems, ports, subways, city buses and train stations. As is well known, numerous security threats exist within the transportation sector, including crime, harassment, liability suits and vandalism. Possible solutions have been directed at insulating transportation systems from security threats and making them safer for passengers. In this talk, I will introduce our solutions to several challenges in transport, in particular city buses. I will structure this talk into the following four sections: (1) the techniques that we have developed for automatically extracting and selecting features from face images for robust age recognition, (2) an effective combination of facial and full body measurements for gender classification, (3) human tracking and trajectory clustering approaches to handle challenging circumstances such as occlusions and pose variations, and (4) event reasoning in smart transport video surveillance.

Bio: Dr. Huiyu Zhou obtained a Bachelor of Engineering degree in Radio Technology from Huazhong University of Science and Technology, China, and a Master of Science degree in Biomedical Engineering from the University of Dundee, United Kingdom. He was then awarded a Doctor of Philosophy degree in Computer Vision from Heriot-Watt University, Edinburgh, United Kingdom. Dr. Zhou is presently a Reader at the Department of Informatics, University of Leicester, United Kingdom. He has published widely in the field. He was the recipient of the “CVIU 2012 Most Cited Paper Award” and the “ICPRAM 2016 Best Paper Award”, and was shortlisted for the “ICPRAM 2017 Best Student Paper Award” and the “MBEC 2006 Nightingale Prize”. Dr. Zhou serves as Editor-in-Chief of “Recent Advances in Electrical & Electronic Engineering” and Associate Editor of “IEEE Transactions on Human-Machine Systems”, and is on the editorial boards of several refereed journals. He is a member of the Technical Committee on “Information Assurance & Intelligent Multimedia-Mobile Communication” in the IEEE SMC Society, and of the “Robotics Task Force” and “Biometrics Task Force” of the Intelligent Systems Applications Technical Committee, IEEE Computational Intelligence Society. He has given over 50 invited talks at international conferences, industry and universities, and has served as a chair for 30 international conferences and workshops. His research has been supported by UK EPSRC, EU ICT, MRC, Innovate UK, the Leverhulme Trust, Invest NI and industry.

“A Linear-Time Kernel Goodness-of-Fit Test*”, Dr. Wittawat Jitkrittum (Gatsby Unit, UCL), 18th December 2017, Queens Building, Small Lecture Theatre.

*Best Paper Award at NIPS 2017
Abstract: We propose a novel adaptive test of goodness of fit, with computational cost linear in the number of samples. We learn the test features that best indicate the differences between observed samples and a reference model, by minimizing the false negative rate. These features are interpretable, indicating where the model does not fit the samples well. The features are constructed via Stein’s method, meaning that it is not necessary to compute the normalising constant of the model. We analyse the asymptotic Bahadur efficiency of the new test, and prove that under a mean-shift alternative, our test always has greater relative efficiency than a previous linear-time kernel test, regardless of the choice of parameters for that test. In experiments, the performance of our method exceeds that of the earlier linear-time test, and matches or exceeds the power of a quadratic-time kernel test. In high dimensions and where model structure may be exploited, our goodness of fit test performs far better than a quadratic-time two-sample test based on the Maximum Mean Discrepancy, with samples drawn from the model.
Joint work with Wenkai Xu, Zoltan Szabo, Kenji Fukumizu, Arthur Gretton.
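As a rough illustration of the Stein-feature construction described above (a sketch only: the function names, the Gaussian kernel choice and the plug-in statistic are assumptions, not the authors' released code), the key point is that the features need only the model's score function, so the normalising constant never appears:

```python
import numpy as np

def gauss_kernel(X, V, sigma=1.0):
    # k(x, v) = exp(-||x - v||^2 / (2 sigma^2))
    d2 = ((X[:, None, :] - V[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def fssd2(X, V, score, sigma=1.0):
    """Plug-in estimate of a Stein-feature statistic: average the features
    xi(x, v) = score(x) k(x, v) + grad_x k(x, v) over samples X at test
    locations V; the expectation is zero when the samples match the model."""
    n, d = X.shape
    J = V.shape[0]
    K = gauss_kernel(X, V, sigma)                                  # (n, J)
    S = score(X)                                                   # (n, d)
    grad_k = -(X[:, None, :] - V[None, :, :]) / sigma ** 2 * K[:, :, None]
    Xi = S[:, None, :] * K[:, :, None] + grad_k                    # (n, J, d)
    tau = Xi.mean(axis=0)                                          # (J, d)
    return (tau ** 2).sum() / (d * J)

# Model: standard normal, so score(x) = grad log p(x) = -x
rng = np.random.default_rng(0)
score = lambda X: -X
V = np.array([[0.0], [1.0]])                                       # test locations
stat_null = fssd2(rng.normal(0.0, 1.0, (500, 1)), V, score)        # data from model
stat_alt = fssd2(rng.normal(1.0, 1.0, (500, 1)), V, score)         # mean-shifted data
```

Under the model the statistic concentrates near zero, while the mean-shift alternative (the case analysed in the paper) pushes it up.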

# Manifolds of Shape via Gaussian Process Latent Variable Models, Dr. Neill Campbell, University of Bath, 2nd of February, 15:00-16:00, MVB 1.06

Abstract: In this talk we will look at Gaussian Processes and Latent Variable Models, in particular focusing on how they may be used to learn generative, probabilistic models of shape. As well as looking at some of the theory behind the models, I will show a number of real-world applications of such models within the domains of computer vision and graphics. I will also provide details of the challenges in this area and some early results of new work.
Bio: Neill Campbell is a lecturer in Computer Vision, Graphics and Machine Learning in the Department of Computer Science at the University of Bath. He also holds an Honorary Lecturer position in the Virtual Environments and Computer Graphics Group in the Department of Computer Science at University College London, where he was formerly a Research Associate working with Jan Kautz and Simon Prince on synthesizing and editing photorealistic visual objects, funded by the EPSRC. Prior to this, Neill was a Research Associate in the Computer Vision Group of the Machine Intelligence Laboratory, in the Department of Engineering at the University of Cambridge, working on the EU Hydrosys Project led by Ed Rosten. Neill completed his PhD in the Computer Vision Group at the University of Cambridge, under the supervision of Roberto Cipolla and the guidance of George Vogiatzis and Carlos Hernández.

# Prof. Andrea Sgarro, University of Trieste, 9th of February, 14:00-15:00, MVB 1.06

Abstract: Back in 1967 the Croatian linguist Muljacic used a fuzzy generalization of the Hamming distance between binary strings to classify Romance languages. In 1956 C. Shannon had introduced the notion of codeword distinguishability in zero-error information theory. Distance and distinguishability are subtly different notions, even if, with distances such as those usually met in coding theory (with the exception of zero-error information theory, which is definitely non-metric), the need for string distinguishabilities evaporates, since the distinguishability turns out to be an obvious and trivial function of the distance. Fuzzy Hamming distinguishabilities derived from Muljacic distances, instead, are not that trivial, and must be considered explicitly. They are quite easy to compute, however, and we show how they could be applied in coding theory to channels with erasures and blurs. The new tool of fuzzy Hamming distinguishability appears quite promising for extending Muljacic's approach from linguistic classification to linguistic evolution.
Bio: Andrea Sgarro is full professor of Theoretical Computer Science at the University of Trieste. His research interests include information theory and codes, cryptography, bioinformatics, soft computing, management of incomplete knowledge and computational linguistics. He is responsible for the scientific section of the Circolo della Cultura e delle Arti of Trieste, and is quite active in scientific communication: his books Secret Codes (Mondadori) and Cryptography (Muzzio) were the first to introduce cryptology to an Italian-speaking audience. In his free time he enjoys languages, of which he speaks a dozen with varying degrees of competence, and plays the one-keyed transverse baroque flute.

# CANCELLED Prof. Mark Girolami, University College London, 23rd of February, 14:00-15:00, MVB 1.06

Abstract: Ambitious mathematical models of highly complex natural phenomena are challenging to analyse, and more and more computationally expensive to evaluate. This is a particularly acute problem for many tasks of interest: numerical methods tend to be slow, due to the complexity of the models, and can lead to sub-optimal solutions with high levels of uncertainty, which need to be accounted for and subsequently propagated in the statistical reasoning process. This talk will introduce our contributions to an emerging area of research at the nexus of applied mathematics, statistical science and computer science, called “probabilistic numerics”. The aim is to consider numerical problems from a statistical viewpoint, and as such provide numerical methods for which numerical error can be quantified and controlled in a probabilistic manner. This philosophy will be illustrated on problems ranging from predictive policing via crime modelling to computer vision, where probabilistic numerical methods provide a rich and essential quantification of the uncertainty associated with such models and their computation.
Bio: Mark Girolami is Professor of Statistics in the Department of Statistical Science at Imperial College London. Prior to joining Imperial College, Mark held Chairs in Computing and Inferential Science at the University of Glasgow, in Statistics at UCL and subsequently at Warwick University. In 2011 he was elected to the Fellowship of the Royal Society of Edinburgh, when he was also awarded a Royal Society Wolfson Research Merit Award. He was one of the founding Executive Directors of the Alan Turing Institute for Data Science from 2015 to 2016. He is an EPSRC Established Career Research Fellow and Director of the Lloyd's Register Foundation–Turing Programme on Data Centric Engineering at The Alan Turing Institute. He is currently an Associate Editor for J. R. Statist. Soc. C, Journal of Computational and Graphical Statistics, and Statistics & Computing, and Area Editor for Pattern Recognition Letters. He is a member of the Research Section of the Royal Statistical Society.
# Problem workshop with Piccadilly Group, 23rd of March, 15:00-16:00, MVB 1.06

Abstract: In this session, we'll hear from Dan Hooper (CEO of Piccadilly Group) and Adam Smith (CTO), who will outline the underlying issues and challenges in the management of software testing and technology delivery within banking, and how they see AI addressing many of these challenges.

Problem Statement: The group discussion will focus on the practical challenges of developing artificial intelligence and machine learning for use in this space.


# Indian Buffet process for model selection in convolved multiple-output Gaussian processes, Dr Mauricio Álvarez, University of Sheffield, 4th of May, 15:00-16:00, MVB 1.06

Abstract: Multi-output Gaussian processes have received increasing attention during the last few years as a natural mechanism to extend the powerful flexibility of Gaussian processes to the setup of multiple output variables. The key point here is the ability to design kernel functions that allow exploiting the correlations between the outputs while fulfilling the positive definiteness requisite for the covariance function. Alternatives to construct these covariance functions are the linear model of coregionalization and process convolutions. Each of these methods demands the specification of the number of latent Gaussian processes used to build the covariance function for the outputs. We propose the use of an Indian Buffet process as a way to perform model selection over the number of latent Gaussian processes. This type of model is particularly important in the context of latent force models, where the latent forces are associated with physical quantities like protein profiles or latent forces in mechanical systems. We use variational inference to estimate posterior distributions over the variables involved and show examples of the model performance over artificial data and several real-world datasets.
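To give a concrete feel for how latent processes induce correlated outputs, here is a loose sketch of the simpler linear-model-of-coregionalization construction mentioned above (illustrative only, not the speaker's convolution/IBP model; all names and parameter values are invented):

```python
import numpy as np

def rbf(x, ell):
    # Squared-exponential kernel matrix on 1-D inputs
    return np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / ell ** 2)

def lmc_covariance(x, W, ells):
    """Covariance over D outputs at shared inputs x under the linear
    model of coregionalization with Q latent GPs:
      cov(f_i(x), f_j(x')) = sum_q W[i, q] W[j, q] k_q(x, x')."""
    D, Q = W.shape
    K = np.zeros((D * len(x), D * len(x)))
    for q in range(Q):
        B_q = np.outer(W[:, q], W[:, q])     # rank-1 coregionalization matrix
        K += np.kron(B_q, rbf(x, ells[q]))   # couple outputs via latent GP q
    return K

x = np.linspace(0.0, 1.0, 20)
W = np.array([[1.0, 0.5], [0.3, 1.2], [0.8, -0.4]])  # D=3 outputs, Q=2 latent GPs
K = lmc_covariance(x, W, ells=[0.2, 0.6])            # (60, 60) covariance matrix
```

Each term in the sum is a Kronecker product of two positive semi-definite matrices, so the combined covariance fulfils the positive-definiteness requisite; the number Q of latent processes is exactly the quantity the talk proposes to select with an Indian Buffet process.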
Bio: Dr. Álvarez received a degree in Electronics Engineering (B. Eng.) with Honours from Universidad Nacional de Colombia in 2004, a master's degree in Electrical Engineering (M. Eng.) from Universidad Tecnológica de Pereira, Colombia, in 2006, and a Ph.D. degree in Computer Science from The University of Manchester, UK, in 2011. After finishing his Ph.D., Dr. Álvarez joined the Department of Electrical Engineering at Universidad Tecnológica de Pereira, Colombia, where he served as a faculty member until December 2016. In January 2017, Dr. Álvarez was appointed Lecturer in Machine Learning at the Department of Computer Science of the University of Sheffield, UK.

Dr. Álvarez is interested in machine learning in general, its interplay with mathematics and statistics, and its applications. In particular, his research interests include probabilistic models, kernel methods, and stochastic processes. He works on the development of new approaches and the application of Machine Learning in areas that include applied neuroscience, systems biology, and humanoid robotics.

# Probabilistic and Bayesian deep learning, Dr Andreas Damianou, Amazon Research, 15th of May, 14:00-15:00, MVB 1.06

Abstract: In this talk I will first motivate the need for introducing a probabilistic and Bayesian flavour into “traditional” deep learning approaches. For example, Bayesian treatment of neural network parameters is an elegant way of avoiding overfitting and “heuristics” in optimization, while providing a solid mathematical grounding. Moreover, introducing ideas from Bayesian uncertainty treatment and probabilistic graphical models allows for a higher level of reasoning, which is needed for solving non-perceptual tasks such as transfer/unsupervised learning and decision making. In the talk I will highlight the deep Gaussian process family of approaches, which can be seen as non-parametric Bayesian neural networks. Unfortunately, combining deep nets with probabilistic reasoning is challenging, because uncertainty needs to be propagated across the neural network during inference. This comes in addition to the (easier) propagation of gradients (e.g. back-propagation). Therefore, as part of my talk I will discuss approximation methods that tackle the aforementioned computational issue, such as variational, amortized and black-box inference.
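The uncertainty-propagation difficulty described in the abstract can be sketched with a single layer under a factorised Gaussian weight posterior (a minimal Monte Carlo illustration with invented names, not the speaker's method): sampling weights and averaging forward passes yields a predictive mean together with an uncertainty estimate.

```python
import numpy as np

def bayesian_layer_mc(x, w_mean, w_std, n_samples=500, seed=0):
    """Monte Carlo uncertainty propagation through one Bayesian layer:
    draw weight samples from a factorised Gaussian posterior
    N(w_mean, w_std^2), push the input through tanh(x @ W), and summarise."""
    rng = np.random.default_rng(seed)
    outs = []
    for _ in range(n_samples):
        W = rng.normal(w_mean, w_std)            # one posterior weight sample
        outs.append(np.tanh(x @ W))              # forward pass with that sample
    outs = np.stack(outs)
    return outs.mean(axis=0), outs.std(axis=0)   # predictive mean and spread

x = np.array([[1.0, -0.5]])                      # a single 2-D input
w_mean = np.array([[0.3], [0.7]])                # posterior mean for a 2 -> 1 layer
mean_tight, std_tight = bayesian_layer_mc(x, w_mean, w_std=0.01)
mean_loose, std_loose = bayesian_layer_mc(x, w_mean, w_std=0.5)
```

A wider weight posterior produces a wider predictive spread; the variational and amortized schemes mentioned in the talk aim to get this effect without paying for many forward passes per prediction.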
Bio: Andreas Damianou completed his PhD under Neil Lawrence in Sheffield, and subsequently pursued a post-doc at the intersection of machine learning and bio-inspired robotics. He has now moved to industry as a machine learning scientist, based in Cambridge, UK. His areas of interest are machine learning, and more specifically Bayesian non-parametrics (focusing on both data efficiency and scalability), representation learning, uncertainty quantification, and big data. In recent work he seeks to bridge the gap between representation learning and decision-making, with applications in robotics and data science pipelines.

# Deep probabilistic models for weakly supervised structured prediction, Diane Bouchacourt, University of Oxford, 8th of June, 15:00-16:00, MVB 1.06

Abstract: Structured prediction refers to the prediction of a structured, complex output given an input value. This task is challenging as there is often uncertainty on the output. In this setting, deep probabilistic networks are powerful tools to learn the distribution of the structure to predict. Such models parametrise the distribution of the data with a neural network. This allows reasoning under uncertainty and decision making, according to the task at hand. However, while we can easily gather a large amount of data observations, retrieving ground-truth values of the output to predict is costly, if not infeasible. In this talk, I will present how to employ deep probabilistic models to perform structured prediction for computer vision tasks, both in the supervised setting and in the weakly supervised setting, when only part of the ground-truth labeling is available.

Bio: Diane Bouchacourt is a PhD student in the Optimization for Vision and Learning (OVAL Group) at the Department of Engineering Science at University of Oxford. She works under the co-supervision of M Pawan Kumar at the University of Oxford and Sebastian Nowozin at Microsoft Research Cambridge. Her research focuses on developing novel optimization algorithms and deep probabilistic models for structured output prediction. She is currently focusing on unsupervised and supervised learning of generative models based on neural networks.

# ISL Problem Workshops

ISL also organises problem workshops with companies and other interested parties. These are talks by industrialists, for example companies in finance, healthcare, and many other areas, who have an application that would involve machine learning or computational statistics. They are keen to establish a collaborative link with ISL members and have typically indicated that they wish to co-invest in support of this objective. Because these are not our regular academic seminars, they can be much shorter than the usual 50 minutes, and typically consist of a presentation of the topic of interest followed by a discussion of the data available. The presentation is informal. Given the nature of these talks, no abstract is given and the title may be omitted. ISL members, affiliates and UoB academic staff from other faculties are welcome to attend, and we are always keen to facilitate developing contacts.