1 x 2-hour lecture weekly
1 x 1-hour workshop weekly
1 x 52-hour project work per semester
Enrolment not permitted
If 1 of COMP4716 or COMP7008 has been successfully completed
Assumed knowledge
A good knowledge of programming in Java (such as obtained from COMP2008 Computer Programming 2A) or C (such as obtained from COMP2007 Systems Programming) is desirable. An understanding of basic data structures and algorithms or database methods, such as obtained from COMP2001 Computer Programming 2B, could also be helpful.
Topic description
The topic explores the techniques used for indexing and retrieving textual information, with a particular focus on internet and intranet search engines. The basic framework covers different models of information indexing, retrieval and evaluation, together with a number of advanced topics including parallelism, visualisation and multimedia. Text processing is dealt with in an interdisciplinary way, with input from linguistics and psychology. Students study and critique a number of research papers to deepen their understanding of specified subjects. A major and very practical focus is on the fundamental research methods that ensure appropriate evaluation of any system, including the setup of data sets, validation and testing, and the calculation of accuracy, significance and confidence.
Educational aims
To prepare students for commercial and research environments in which the internet and the web are ubiquitous, and in which web search and information retrieval are core pillars of the information age. Companies are under strong pressure to have a web presence, to employ an intranet search engine, and to achieve a good profile on internet search engines, yet search engine technology, especially evaluation and visualisation, is still in its infancy and remains a focus of current research and technology transfer.

In addition, the topic aims to train students to research the literature, recognize the advantages and disadvantages of proposals, and synthesize solutions to problems. In particular, students will learn to recognize and employ the interdisciplinary techniques and research that bear on the field.
Expected learning outcomes
At the completion of the topic, students are expected to be able to:
  1. Evaluate and install a search engine
  2. Develop/customize a search engine
  3. Understand the state of the art in search engine research and development
  4. Recognize the issues with current information retrieval technology
  5. Write a balanced critique of a proposal or paper
  6. Understand the nature and needs of research in information/text retrieval