Examples of research at the Intelligent Systems Laboratory:
- Cancer Bioinformatics
- Detecting Events in Textual Streams on the Web
- Found in Translation
- Global Happiness
- Mining for Interesting Patterns in Structured Data
- Submission Sifting
- Other Research Collaborations
The algorithms group in ISL looks at various aspects of the theory and practice of algorithms, in particular in the context of pattern matching. The goal of our research is both to provide scalable solutions to existing problems and to understand the limits of what is possible. The quantity of data available in digital form continues to increase at an exponential rate, so the need for faster and more accurate algorithms is greater than ever. We also want to understand where improvements are impossible by establishing provable lower bounds, in terms of both space and time. Finally, we study quantum algorithms and computation, as well as other aspects of quantum information theory.
“People who analyze algorithms have double happiness. First of all they experience the sheer beauty of elegant mathematical patterns that surround elegant computational procedures. Then they receive a practical payoff when their theories make it possible to get other jobs done more quickly and more economically.” — Donald Knuth
Decision Support and Recommender Systems
This activity is headed by Colin Campbell, whose main interests are in machine learning, including probabilistic graphical models, metric learning and kernel-based methods. He is also interested in applying machine learning techniques to bioinformatics, cancer bioinformatics and medical informatics. His research is currently supported by the EPSRC, Cancer Research UK and PASCAL2. As an example of his work, he is organising an EPSRC-supported workshop on cancer bioinformatics in Cambridge in September 2010, with an accompanying edited volume planned with Cambridge University Press. Machine learning techniques are well suited to the analysis of biomedical datasets, particularly those from modern genomics. In turn, data analysis in this context stimulates the development of new methods. He is therefore particularly interested in data analysis methods that can handle disparate types of data within the same model. Cancer research currently has some of the largest and most challenging datasets within biomedical research and is thus a major interest.
Detecting Events in Textual Streams on the Web
Various projects at the ISL involve the automatic extraction of useful information from online text.
One of them, Flu Detector, aims to track the levels of flu-like illness by reading the contents of Twitter. For a non-technical description of this work, please read this press release.
This work has been featured in the media and has also been selected by EPSRC as a research highlight for the year.
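To make the idea of tracking flu-like illness from Twitter text concrete, here is a minimal sketch in Python. It is not the method used by Flu Detector, which is not described here. It simply assumes that the daily fraction of tweets mentioning a flu-related term can serve as a rough signal. The term list `FLU_TERMS` and the function `daily_flu_scores` are hypothetical names introduced for illustration only.

```python
import re
from collections import defaultdict

# Hypothetical list of flu-related terms; a real system would need a
# carefully chosen (or learned) vocabulary.
FLU_TERMS = ("flu", "fever", "cough", "sore throat")


def daily_flu_scores(tweets, terms=FLU_TERMS):
    """Return {day: fraction of that day's tweets mentioning any term}.

    `tweets` is an iterable of (day, text) pairs. Terms are matched as
    whole words or phrases, case-insensitively.
    """
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(t) for t in terms) + r")\b",
        re.IGNORECASE,
    )
    totals = defaultdict(int)  # number of tweets seen per day
    hits = defaultdict(int)    # number of tweets matching a term per day
    for day, text in tweets:
        totals[day] += 1
        if pattern.search(text):
            hits[day] += 1
    return {day: hits[day] / totals[day] for day in totals}


sample = [
    ("2010-01-01", "I have the flu and a fever"),
    ("2010-01-01", "Lovely weather today"),
    ("2010-01-02", "Under the influence of coffee"),
]
scores = daily_flu_scores(sample)
```

In this toy sample, one of the two tweets on the first day mentions a flu term, and the word "influence" on the second day does not count as a match because terms are matched on word boundaries.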
Found in Translation
This research with the School of Journalism at the University of Cardiff involves extracting patterns of content from multilingual news. More than 200 European newspapers in 23 languages are monitored constantly, and the choice of stories is automatically analyzed by the system, producing hypotheses about various biases present in the EU media sphere.
Every day on the Found in Translation website one can find an overview of statistical patterns found in the news appearing in the leading European news outlets.
Global Happiness
Happiness economics is an active area of research, and the aim of this project was to contribute to it by using a machine learning approach to explain and predict global happiness.
At present, countries mainly use GDP to determine progress, but this has come under increasing criticism as measuring
“… everything, in short, except that which makes life worthwhile.” — Robert Kennedy, 1968
A broad range of variables that affect happiness was investigated in order to discover more appropriate measures of progress. The focus was on variables that we can control and change in order to improve quality of life, and on those that governmental policy can directly affect, such as crime, education and health. Environmental variables such as weather were also considered, as these are affected by issues such as climate change, on which government policy has an impact.
Mining for Interesting Patterns in Structured Data
Exploratory data mining (EDM) is a branch of data mining concerned with the general exploration of data. Sounds vague?
In this project, our goal is to put EDM on a solid basis. We do this by formalizing the data mining process as an interaction between the data and the data miner, with the algorithm as an intelligent interface between both, presenting to the data miner only what is of real interest. We tackle this problem for simple data types as well as for highly structured and interdependent data as stored in relational databases.
Submission Sifting
Peer review of written works is an essential pillar of the academic research process, providing the central quality control and feedback mechanism for submissions to conferences, journals and funding bodies across a wide range of disciplines. Identifying the most appropriate reviewers for a given submission is a non-trivial and time-consuming task for conference chairs, journal editors and funding managers. This problem motivated the development of ISL’s SubSift “submission sifting” application, which matches submitted conference or journal papers to potential peer reviewers based on their similarity to the published works of prospective reviewers in online bibliographic databases, such as Google Scholar. The software has already been used to support several major data mining conferences, and other interesting applications are now emerging, such as expert finding for the press and media, organisational profiling, and suggesting potential interdisciplinary research partners.
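As a rough illustration of matching a submission to reviewers by textual similarity, the sketch below ranks reviewers by the cosine similarity between TF-IDF vectors of the submission and of each reviewer's publication text. The choice of TF-IDF with cosine similarity is an assumption made for this example, since the similarity measure is not specified above, and the function names (`rank_reviewers` and its helpers) are hypothetical rather than part of SubSift.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Map each document name to a sparse {term: tf-idf weight} vector."""
    counts = {name: Counter(tokenize(text)) for name, text in docs.items()}
    n_docs = len(counts)
    doc_freq = Counter()
    for term_counts in counts.values():
        doc_freq.update(term_counts.keys())
    vectors = {}
    for name, term_counts in counts.items():
        total = sum(term_counts.values())
        vectors[name] = {
            # Smoothed inverse document frequency keeps weights positive.
            term: (freq / total)
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, freq in term_counts.items()
        }
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors (0.0 if either is empty)."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_reviewers(submission, reviewers):
    """Return (reviewer, score) pairs sorted from most to least similar.

    `reviewers` maps a reviewer name to the text of their published work.
    """
    key = "__submission__"  # reserved name, assumed not to be a reviewer
    vectors = tfidf_vectors({**reviewers, key: submission})
    scores = [(name, cosine(vectors[key], vectors[name])) for name in reviewers]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


reviewers = {
    "alice": "kernel methods support vector machines classification",
    "bob": "suffix trees string pattern matching algorithms",
}
ranking = rank_reviewers(
    "a fast algorithm for approximate string pattern matching", reviewers
)
```

Here the submission shares the terms "string", "pattern" and "matching" with the second reviewer and no terms with the first, so the second reviewer is ranked first. A practical system would also need stemming, stop-word handling and a much larger body of text per reviewer.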
Other Research Collaborations