DAS Writeback: A collaborative annotation system for proteins
Thesis
We designed and developed a collaborative annotation system for proteins called DAS Writeback, which extends the Distributed Annotation System (DAS) with the ability to add, edit and delete annotations. A great deal of effort has gone into gathering information about proteins over the last few years. By June 2009, UniProtKB/Swiss-Prot, a curated database, contained over four hundred thousand sequence entries and UniProtKB/TrEMBL, a database with automated annotation, contained over eight million sequence entries. Every protein is annotated with relevant information, which needs to be efficiently captured and made available to other research groups. These annotations include information about the structure, the function or the biochemical residues. Several research groups have taken on the task of making this information accessible to the community; however, information flow in the opposite direction has not been extensively explored. Users are currently passive consumers of one or several sources of protein annotations, and they have no immediate way to provide feedback to the source if, for example, they detect a mistake or want to add information. Any change has to be made by the owner of the database. This project tackles the current inability to feed information back to a database. The solution consists of an extension of the DAS protocol that defines the communication rules between the client and the writeback server following the Uniform Interface of the RESTful architecture. The protocol extension was proposed to the DAS community, and implementations of both server and client were created in order to have a fully functional system. For the server, writing functionality was added to MyDAS, a widely used DAS server. The writeback client is an extended version of the web-based protein client Dasty2.
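To make the idea of a uniform interface concrete, the following minimal sketch shows how the three writeback operations (plus reading) could map onto HTTP methods. It is illustrative only: the abstract does not state the exact mapping used by DAS Writeback, so the conventional REST assignment is assumed, and the names WritebackRequest and build_request, the example URL and the XML payload are all hypothetical. No network call is made; the function only describes the request a client would send.

```python
from dataclasses import dataclass
from typing import Optional

# Conventional REST mapping of operations to HTTP methods.
# Assumption: the abstract does not spell out the mapping DAS Writeback uses.
OPERATION_TO_METHOD = {
    "read": "GET",
    "add": "POST",
    "edit": "PUT",
    "delete": "DELETE",
}


@dataclass(frozen=True)
class WritebackRequest:
    """Hypothetical description of one request to a writeback server."""
    method: str
    url: str
    body: Optional[str] = None


def build_request(operation: str, base_url: str, segment_id: str,
                  payload: Optional[str] = None) -> WritebackRequest:
    """Describe the HTTP request for one annotation operation (no network I/O)."""
    method = OPERATION_TO_METHOD[operation]  # KeyError for unknown operations
    url = base_url.rstrip("/") + "/features"
    if method in ("GET", "DELETE"):
        # Reads and deletions carry no document; the target goes in the URL.
        return WritebackRequest(method, url + "?segment=" + segment_id)
    if payload is None:
        raise ValueError(operation + " requires an annotation document")
    # Additions and edits send the annotation document in the request body.
    return WritebackRequest(method, url, payload)
```

A client would build one such request per user action, for example build_request("add", "http://example.org/das/writeback", "P05067", "<DASGFF/>"), and hand it to whatever HTTP library it uses. Because every operation addresses the same resource and differs only in the method, the server needs no operation-specific endpoints.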
The involvement of the DAS community and other potential users was a fundamental component of this project. The architecture was designed with input from the specialized DAS forum; a prototype was then created and presented at the DAS workshop 2009. The feedback from the forum and the workshop was used to redefine the architecture and implement the system. A usability experiment was performed in which potential users of the system carried out an emulated real annotation task. It demonstrated that DAS Writeback is effective and usable, and that it will provide the appropriate environment for the creation and evolution of a protein annotation community. Although the scope of this research is limited to protein annotations, the specification was defined in a general way. It can, therefore, be used for other types of information supported by DAS, implying that the server is versatile enough to be used in other scenarios without major modifications.