Motivation
The rapidly increasing number of sequenced genomes contains a wealth of information that can be exploited to understand the biological
functions of the gene products. Annotation derived from protein structure is especially important for the following reasons:
- A protein's function is a property of its 3D structure, dependent on the placement and interaction of functional residues. Identifying these residues reveals not just the general function but may also indicate the mechanism by which it is performed.
- Evolutionary relationships that are hidden at the sequence level may be revealed at the structural level, thus permitting functional assignments from more distant relatives.
- From a wider perspective, these annotations lead to advances in fundamental biological understanding, which in turn provide a route for the discovery and refinement of novel strategies for pharmaceutical intervention, i.e. drug discovery.
Bioinformatics research has benefited from the availability of low-cost but powerful desktop computers, which have been used as building
blocks to create large computational farms. However, the physical demands for space, power and cooling place a natural limit on the size of farms that any one institution can
support. In addition, private resources are inefficient and unreliable: they must be specified to cope with peak demand and so will idle for significant periods of time, and they also
present a single point of failure that can severely disrupt all services. The obvious solution is to share the available resources across different sites. Until now, the lack of
proper management and security tools has hindered the development of resource sharing. Accordingly, one objective of the e-Protein project is to develop Grid-based facilities to
pool the computing resources of the different centres involved in the project.
To provide a comprehensive resource for structure-based annotation within the UK, it is necessary to pool the resources and expertise from
different academic groups. The deliverable from e-Protein for the biological community is structure-based proteome annotations, accessible via a single interface combining
information stored at the different sites. Specifically, a major objective of e-Protein is to develop a version of the Distributed Annotation System (DAS) that will link protein
annotation databases at the e-Protein sites, and allow annotation to be easily combined and compared with that from other sources.
To summarise, the key motivations of e-Protein are to provide a valuable bioinformatics resource and to give a timely, novel and practical demonstration of the use of
e-science technology within bioinformatics.