Knowledge Discovery, Management and Visualisation - Project Details
Advanced Clustering Techniques
Soft computation techniques are investigated, improved and applied to various
clustering algorithms.
These advanced clustering algorithms can be applied to data mining, image compression,
texture segmentation, speech recognition and digital watermarking.
This project aims both to develop efficient and effective soft computation techniques
and to apply them to advanced clustering in real applications of this kind.
Brain Computer Interface
This project uses advanced learning techniques to analyze electroencephalographic and other
electrophysiological signals (EEG, EOG, ECG, EMG) and uses them to:
- predict behaviours
- assess the extent of skill learning
- control computers or physical devices (such as a wheelchair)
This research has been applied in a defense context.
Ethics in Data Mining
Knowledge discovery allows considerable insight into data.
This insight brings with it the inherent risk that what may be inferred may be private or
ethically sensitive.
The validity of rules generated from a mining operation becomes an ethical issue when the
results are used in decision making processes that affect people, while mining customer
data may unwittingly compromise the privacy of those customers.
Significantly, the sensitivity of a rule may not be apparent to the miner, particularly
since the volume and diversity of rules can often be large.
This project investigates these issues and aims to develop a system that detects and
highlights the ethical and privacy sensitivity of rules.
Evolution and Fusion of Learners
The metaphor of evolution allows for the development and evaluation of populations of
solutions, recombining and mutating the best algorithms to produce new, potentially
better solutions.
Mutation provides the potential to explore the full space of possibilities and overcome
local optima, whilst recombination provides the opportunity to develop hybrids reflecting
the best points of two solutions.
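As a minimal sketch of the mutation and recombination steps described above, the following toy example evolves bit strings. The bit-string encoding, the truncation selection, the parameter values and the "count the 1 bits" fitness function are illustrative assumptions, not the project's actual learners.

```python
import random

def evolve(fitness, length=20, pop_size=30, generations=60,
           mutation_rate=0.05, seed=0):
    """Evolve bit strings toward higher fitness (illustrative sketch only)."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the better half of the population as parents.
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Recombination: a hybrid taking the head of one parent
            # and the tail of the other (one-point crossover).
            cut = rng.randrange(1, length)
            child = a[:cut] + b[cut:]
            # Mutation: flip each bit with a small probability, which lets
            # the search reach points no parent currently covers.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        population = parents + children
    return max(population, key=fitness)

# Toy fitness (an assumption for the example): the number of 1 bits.
best = evolve(sum)
```

Because the parents survive into the next generation, the best fitness in the population never decreases from one generation to the next.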
There is a continuum between traditional 'hard' machine learning techniques, modern 'soft'
learning techniques like neural networks, and evolutionary techniques such as those we
explore here.
The focus of our research is to break down the barriers and evolve better learning
programs for visual, communicative and cooperative tasks, including evolving neural
networks for visual pattern recognition and evolving language learning programs.
There is a specific interest in the application of evolutionary techniques to data fusion
where different data sources or learning mechanisms provide different levels of accuracy
for different subproblems, and the object of data fusion is to produce a new solution that
does better than any one approach.
In typical data fusion some kind of voting takes place, but this often leads
to catastrophic fusion in which worse results are achieved.
Evolutionary fusion combines the best elements of the individual approaches, generally
without including the whole of any original learner.
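The way voting can lose to its best member can be seen in a small made-up example. The labels and the three learners below are invented purely for illustration, and simple majority voting is assumed as the fusion rule.

```python
from collections import Counter

def majority_vote(predictions):
    """Fuse per-learner label lists by taking the most common label per item."""
    return [Counter(item).most_common(1)[0][0] for item in zip(*predictions)]

def accuracy(predicted, truth):
    """Fraction of items labelled correctly."""
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)

# Invented toy data: one strong learner and two weak learners whose
# errors fall on the same items.
truth  = [1, 1, 1, 1, 0, 0, 0, 0]
strong = [1, 1, 1, 1, 0, 0, 0, 1]
weak_a = [1, 1, 0, 0, 0, 0, 1, 1]
weak_b = [1, 0, 0, 0, 0, 1, 1, 1]

fused = majority_vote([strong, weak_a, weak_b])
print(accuracy(strong, truth))  # 0.875
print(accuracy(fused, truth))   # 0.5
```

Here the two weak learners outvote the strong one wherever they agree with each other, so the fused result is worse than the strong learner on its own.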
Information Retrieval, Ranking, Clustering and Visualization
Search engines are the best known information retrieval interfaces and reflect a state
of the art that is still largely linear and text based.
We have developed techniques for extending web search from conventional 'static' HTML
pages to 'dynamic' pages generated from a backend database.
This is the basis of the technology on which the university's start-up company
YourAmigo is based.
The increasing quantities of information available as the web grows and more databases
become accessible exacerbate another problem - that of dealing with thousands,
even millions, of search results.
We have developed and employed advanced clustering and visualization techniques to allow
more intuitive and faster organization and sifting of a mass of retrieved data and
continue to evaluate and extend the techniques.
The basic idea is to present information in such a way that our highly efficient
subconscious sensory processing relieves our conscious minds of considering items or
clusters individually, and lets us navigate in dimensions appropriate to the data.
Mining Population Health Data
This project looks at applying existing techniques to large scale population data to
determine whether, and which, data mining techniques can be of use in an epidemiological
context.
Robot Baby and Intelligent Room
This project has two complementary aims.
The first is to develop multimodal learning and data fusion technology to combine
information from multiple sensory-motor and linguistic data sources - examples include
tracking and homing in on a speaker in a room, or improving speech recognition by lip
reading.
The second is to develop databases of psycholinguistic sensory-motor data from the
perspective of a baby learning language interactively, using this to test and model
theories of the way we learn about the world and evolve a suitable language for our
environment as well as for diagnostic, teaching and remedial purposes.
There are also two kinds of sensory focus, outward (the baby) and inward (the room).
In addition, two major learning technologies are being developed - techniques for
unsupervised learning of syntactic and semantic relationships, and techniques for
effectively fusing data by comparing the significance and errors of predictions
from different sources, preprocessors or learning techniques.
Spelling and Grammar Checking, Segmentation and Transcription of Text,
Phonetics and Speech
We have developed powerful new techniques for using syntactic and semantic context to
predict confused words, homonyms and multirole words.
Conventional spelling checks cannot pick up legal words that are confused with the
intended word, including words that sound the same.
Asian languages like Chinese have an even worse problem since a typical phonetic syllable
has around 20 different written representations (characters) and even more meanings.
Typing in Chinese is typically done in phonetics using a menu to select between up to
40 or more individual characters.
Transcribing speech in any language has the same problem to some degree, as many words
sound the same or similar.
Working with speech, phonemes, or Chinese characters has the additional problem that the
division into words is not obvious, and segmentation into words and phrases
is an important first step in many applications.
Subjective Data Mining
Knowledge Discovery requires a symbiosis of computer and user to be effective.
To date this association has pervaded all but the actual data mining stage, which is still
largely computer autonomous, with the mining being directed by static measures of interest.
The problem is that knowledge is ultimately user centric and by excluding the user from
the data mining stage the level of derived knowledge is diminished.
Our hypothesis is that by incorporating the user within the data mining stage of processing,
through the inclusion of subjective measures of interest, the efficiency of the knowledge
discovery process will improve.
Skin Colour Analysis for Detection of Facial Features, Aging Characteristics and
Surgical Success
This project merges two threads of research.
The first is an outgrowth of the robot baby project in which new techniques for
discriminating facial features from background skin colour were developed by Powers
and Lewis.
The second is the development by Nield of standardization methods for the assessment of
features such as scars that arise in surgery.
Recently we have joined forces to apply these techniques more widely and have been
approached by an overseas cosmetic company to apply them to the analysis of aging.
Temporal Sequence Mining
The task of temporal sequence mining is to discover long frequent episodes from a
codified series of actions or events that have a time order associated with them.
The goals of this project are to incorporate the temporal semantics, as described by
Allen and expanded on by Freksa, into the mining activity and thus be able to describe
the interaction between episodes and provide a useful description of their behaviour.
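For readers unfamiliar with Allen's temporal semantics, the sketch below classifies the relation between two intervals into one of his thirteen relations. Representing an episode as a `(start, end)` pair with `start < end` is an assumption made for the example; Freksa's semi-interval extensions are not covered.

```python
def allen_relation(a, b):
    """Return the Allen relation of interval a to interval b.

    Each interval is a (start, end) pair with start < end.
    """
    (a1, a2), (b1, b2) = a, b
    if a2 < b1:
        return "before"
    if a2 == b1:
        return "meets"
    if b2 < a1:
        return "after"
    if b2 == a1:
        return "met-by"
    if a1 == b1 and a2 == b2:
        return "equals"
    if a1 == b1:
        return "starts" if a2 < b2 else "started-by"
    if a2 == b2:
        return "finishes" if a1 > b1 else "finished-by"
    if b1 < a1 and a2 < b2:
        return "during"
    if a1 < b1 and b2 < a2:
        return "contains"
    # Only partial overlap remains once the cases above are excluded.
    return "overlaps" if a1 < b1 else "overlapped-by"

print(allen_relation((1, 4), (3, 6)))  # overlaps
print(allen_relation((2, 3), (1, 5)))  # during
```

Labelling each pair of mined episodes in this way gives a vocabulary for describing how the episodes interact, rather than only the order in which they occur.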
Temporal Web Mining
With the huge amounts of data now available on the web, the automated discovery and
analysis of useful information from the web becomes a necessity.
Web mining has been used to perform such tasks for several years.
However, most research overlooks the temporal nature of the data being studied.
This project looks into how to incorporate temporal semantics into the processes
currently being used for web mining.
Multi-Perspective Rendering
Single-perspective rendering techniques draw 3D data sets as if viewed from a single
point in space. This project is developing techniques that can efficiently render views
where different parts of the image are rendered from different viewpoints.
The project builds on earlier work in the area of distortion-oriented displays, which help
solve the problem of how to visualise both detail and context in a complex data set.
In addition to such visualisations, our technique can also create displays that could not be
represented by data distortion, such as viewing an object from both sides at the same time.
The work has resulted in a new rendering algorithm that has applications in both data
visualisation (by drawing pictures of complex data that smoothly merge multiple viewpoints)
and computer graphics (by producing accurate and efficient drawings of reflections from
curved surfaces).
Visualising Evolutionary Trees
Bioinformaticists interested in the evolution of species use algorithms that compare
gene sequence data to infer evolutionary changes that might have occurred in past ages.
The algorithms generate a family of tree-of-life diagrams that suggest how individuals
may have shared a common ancestor.
The trees are often large and complex, which makes visual comparison tedious and
detailed analysis impossible.
Currently, the only way to understand the family of trees is to compute a
lowest-common-denominator ("consensus") tree, which shows only the common elements and
discards the potentially important information about the differences.
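The loss of information can be illustrated with a small sketch. It assumes a strict consensus (keep only the groupings found in every tree), which is one common way of building a consensus tree, and it represents each tree as nested tuples of made-up leaf names.

```python
def clades(tree):
    """Return (leaf set, set of clades) for a tree of nested tuples.

    A clade is the set of leaves below one internal node; single leaves
    are not counted as clades.
    """
    if isinstance(tree, str):
        return frozenset([tree]), set()
    leaves, found = frozenset(), set()
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found |= child_clades
    found.add(leaves)
    return leaves, found

def strict_consensus_clades(trees):
    """Clades shared by every tree in the family."""
    return set.intersection(*(clades(t)[1] for t in trees))

# Two hypothetical trees that disagree only about A, B and C.
t1 = ((("A", "B"), "C"), ("D", "E"))
t2 = ((("A", "C"), "B"), ("D", "E"))

shared = strict_consensus_clades([t1, t2])
discarded = clades(t1)[1] ^ clades(t2)[1]
```

In this example `shared` keeps the groupings {A, B, C}, {D, E} and the full leaf set, while `discarded` holds {A, B} and {A, C}, which is exactly the disagreement a consensus tree hides.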
This project is investigating ways to visualise families of related trees.
We are devising techniques to compare and quantify trees, and using them to develop
pictures of tree-structured data so as to emphasise both similarities and differences.
In addition to its use in bioinformatics, the work also has application for other domains
that deal with complex hierarchical data structures.
More Information
Further information about this research program is available from
Prof John Roddick.
For further information about the School's research programs, the opportunities for higher degree study and scholarship information click here.