Ganesha project specification





Problem

As more and more information is published on the Internet (especially on the World Wide Web), it is becoming increasingly important to be able to find specific and relevant data reliably and effectively. Furthermore, it would be desirable to have tools that not only aid us in finding information but also help us to analyse or evaluate it.

To achieve this, computers must be able to interpret information on the Internet so that they can help us work with it. Current search engines already do a fairly good job of 'understanding' web pages written in HTML or similar formats in order to index them, but there is much room for improvement. When faced with non-textual data such as images, sound or video, search engines are generally at a loss.

Fortunately, the World Wide Web Consortium (W3C) has created the Resource Description Framework (RDF), which provides a standard way of publishing descriptive metadata (i.e. data about data) for virtually anything. RDF is a key technology in the W3C's efforts to establish the so-called Semantic Web - an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation (Tim Berners-Lee, James Hendler, Ora Lassila, The Semantic Web, Scientific American, May 2001).

Although RDF has existed for several years now, it is not yet widely used. I believe this is partly because RDF (especially when encoded in XML) can appear quite complex and daunting. Average computer users do not want to be concerned with the gritty details of RDF and XML. Professional web designers and companies running large sites can afford to create custom tools to generate RDF data or to teach their staff the necessary know-how, but there appears to be no easy-to-use solution for individuals or maintainers of small to medium-sized websites.

Objectives

I intend to make RDF accessible to the average computer user, just as WYSIWYG HTML editors have made web page creation easy and popular. The aim of my project is therefore to create a simple GUI-based RDF editor program. The program will be named Ganesha, after the Hindu god of knowledge and destroyer of obstacles! :)

XML seems to be the most convenient encoding of RDF for this purpose, so Ganesha will be limited to creating RDF/XML. It will be able to create stand-alone RDF files as well as to embed RDF in existing XML documents.
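
To give a rough idea of what creating a stand-alone RDF file involves, here is a minimal sketch in Java. It is not Ganesha's design: the choice of the JAXP DOM API, the class and method names (RdfSketch, buildRdf, serialize), the example.org URI and the use of the Dublin Core title property are all assumptions made purely for illustration.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical illustration only; none of these names come from the Ganesha project.
public class RdfSketch {
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
    static final String DC_NS = "http://purl.org/dc/elements/1.1/";
    static final String XMLNS_NS = "http://www.w3.org/2000/xmlns/";

    // Builds an RDF/XML document that gives one resource a Dublin Core title.
    static Document buildRdf(String resourceUri, String title) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder().newDocument();

        // Root element <rdf:RDF> with explicit namespace declarations.
        Element root = doc.createElementNS(RDF_NS, "rdf:RDF");
        root.setAttributeNS(XMLNS_NS, "xmlns:rdf", RDF_NS);
        root.setAttributeNS(XMLNS_NS, "xmlns:dc", DC_NS);
        doc.appendChild(root);

        // <rdf:Description rdf:about="..."> names the resource being described.
        Element description = doc.createElementNS(RDF_NS, "rdf:Description");
        description.setAttributeNS(RDF_NS, "rdf:about", resourceUri);
        root.appendChild(description);

        // <dc:title> is a single property of that resource.
        Element dcTitle = doc.createElementNS(DC_NS, "dc:title");
        dcTitle.setTextContent(title);
        description.appendChild(dcTitle);
        return doc;
    }

    // Serialises the DOM tree to a string using an identity transformation.
    static String serialize(Document doc) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        Document doc = buildRdf("http://example.org/index.html", "Example page");
        System.out.println(serialize(doc));
    }
}
```

Even a description this small requires two namespaces and three nested elements, which is the kind of detail a GUI editor would be expected to hide from the user.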

Once the project has been completed I intend to release the source code under an open-source license so that Ganesha may be further developed and improved by any interested participants.

Methods

The first step will be to complete my background reading and to set up a project website so that any interested observers may follow the project's progress. I will be using XHTML for this, so that I can test Ganesha's ability to augment the XHTML pages with RDF metadata when I reach the testing phase.
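
As an illustration of what augmenting an XHTML page with RDF metadata might look like, the following Java sketch parses a page and appends an RDF description to it. It is only a sketch under several assumptions: the JAXP DOM API, the names EmbedSketch and embed, the Dublin Core creator property and the decision to place the RDF inside the page's head element are all hypothetical choices, and the question of whether the result validates against the XHTML DTD is left open.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical illustration only; none of these names come from the Ganesha project.
public class EmbedSketch {
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
    static final String DC_NS = "http://purl.org/dc/elements/1.1/";
    static final String XHTML_NS = "http://www.w3.org/1999/xhtml";
    static final String XMLNS_NS = "http://www.w3.org/2000/xmlns/";

    // Parses an XHTML page and appends an RDF description of it to <head>.
    static Document embed(String xhtml, String pageUri, String creator) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xhtml)));

        // Assumed placement: the first <head> element in the XHTML namespace.
        Node head = doc.getElementsByTagNameNS(XHTML_NS, "head").item(0);
        if (head == null) {
            throw new IllegalArgumentException("no XHTML head element found");
        }

        Element rdf = doc.createElementNS(RDF_NS, "rdf:RDF");
        rdf.setAttributeNS(XMLNS_NS, "xmlns:rdf", RDF_NS);
        rdf.setAttributeNS(XMLNS_NS, "xmlns:dc", DC_NS);

        // Describe the page itself and record who created it.
        Element description = doc.createElementNS(RDF_NS, "rdf:Description");
        description.setAttributeNS(RDF_NS, "rdf:about", pageUri);
        Element dcCreator = doc.createElementNS(DC_NS, "dc:creator");
        dcCreator.setTextContent(creator);
        description.appendChild(dcCreator);
        rdf.appendChild(description);

        head.appendChild(rdf);
        return doc;
    }

    public static void main(String[] args) throws Exception {
        String page = "<html xmlns=\"http://www.w3.org/1999/xhtml\">"
                + "<head><title>Project page</title></head><body/></html>";
        Document doc = embed(page, "http://example.org/project.html", "A. Student");
        int count = doc.getElementsByTagNameNS(RDF_NS, "Description").getLength();
        System.out.println("Embedded descriptions: " + count);
    }
}
```

The sample page deliberately omits a DOCTYPE declaration so that the parser does not try to fetch a DTD; real project pages would need that case handled as well.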

Then I will begin designing the program: deciding whether or not to follow a formal software development model, which features the program will have and how the user interface will be laid out, choosing a suitable API for working with the XML files, and structuring the program's code into classes, methods etc.

After that the actual coding may begin, and upon its completion there will be a testing phase to eliminate any bugs in the software. I intend to use the Java 2 programming language to write Ganesha because of its platform independence and availability on the DCS machines.

Once the software is complete, the presentation will be prepared, and then the final report will be written and submitted.

Timetable

[Gantt chart of the project timetable]
Weeks 3 - 4: Make project website
Weeks 3 - 6: Complete background reading
Weeks 5 - 8: Plan features and GUI layout
Week 9 - End of Christmas holidays: Choose development method & design software
Christmas holidays - Week 15: Coding
Weeks 14 - 18: Testing and bug fixing
Weeks 18 - 20: Preparation of presentation
Easter holidays - Week 22: Writing of project report

(may be subject to minor changes)

Resources

I will be writing the program on the DCS machines and on my own computer at home (which is running Linux). Any work I do at home will be backed up regularly onto the DCS computers.

The presentation of Ganesha will be given on the DCS machines using OpenOffice, so no non-standard software or hardware is required.