Delivering accurate structural bioinformatics to the yeast community with the HHyeast data

Welcome to the HHyeast website.

This website presents the results of remote homology searches across the entire genome of the model budding yeast Saccharomyces cerevisiae. The searches were carried out with the HHsearch package developed by Johannes Söding and colleagues [1], a tool also available online as "HHpred" through the MPI Bioinformatics Toolkit server.

The results show visualisations of the strongest homologies for all 6,713 verified yeast open reading frames (ORFs) in three databases:

PDB (solved structures)
Pfam (curated protein domain families from the European Bioinformatics Institute)
the yeast proteome itself (self-hits are excluded from the visualisations)

For each ORF, a summary is shown for every database that has at least one significant hit. Links at the bottom of that page let you change parameters in one database (for example, setting the pSS threshold to zero), so you can investigate hits, adjust parameters, and then apply those parameters to the summary views for all three databases. Note that only hits matching 30 or more columns are reported.

Gaps between identified domains

HHyeast has discovered some additional hits in the gaps between the domains displayed on this website [2]. Although these are not included in the visualisations, the data can be obtained from the HHyeast repository, together with all other data for our yeast genome searches. Please email Tim Levine (tim levine @ucl.ac.uk) for data requests.

Downloaded data format

Images of the domain displays can be saved through the controls in each window. In addition, the individual ORF files from which the visualisation data were extracted can be downloaded when selecting the ORF on this page. These files indicate which residues and which (predicted) secondary structural elements have been aligned. They also include hits shorter than 30 matched columns.

The single downloaded data file containing an ORF's search results for all three databases is a text file with the suffix ".hhr". It can be opened in any text editor (e.g. Mac OS X "TextEdit").
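Because the downloaded file also contains hits shorter than 30 matched columns, it can be useful to filter it the same way the website does. The following Python code is a minimal sketch only, not the code used by HHyeast. It assumes that the hit-table rows follow the usual HHsearch summary layout (No, Hit, Prob, E-value, P-value, Score, SS, Cols, Query HMM, Template HMM) and that the columns are separated by whitespace. The names Hit, parse_hits and filter_hits, the ORF name and the sample rows are all made up for illustration.

```python
import re
from typing import List, NamedTuple


class Hit(NamedTuple):
    rank: int
    name: str
    prob: float
    evalue: float
    cols: int
    query_start: int
    query_end: int


# Assumed row layout of the summary table in an .hhr file:
#  No Hit   Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
# The hit name may contain spaces, so the numeric tail is anchored to the
# end of the line and the name takes whatever precedes it.
HIT_ROW = re.compile(
    r"^\s*(?P<rank>\d+)\s+(?P<name>.+?)\s+"
    r"(?P<prob>\d+(?:\.\d+)?)\s+(?P<evalue>\S+)\s+(?P<pvalue>\S+)\s+"
    r"(?P<score>-?\d+(?:\.\d+)?)\s+(?P<ss>-?\d+(?:\.\d+)?)\s+(?P<cols>\d+)\s+"
    r"(?P<qstart>\d+)-(?P<qend>\d+)\s+"
    r"(?P<tstart>\d+)-(?P<tend>\d+)\s*\((?P<tlen>\d+)\)\s*$"
)


def parse_hits(text: str) -> List[Hit]:
    """Collect every line that looks like a hit-table row."""
    hits = []
    for line in text.splitlines():
        m = HIT_ROW.match(line)
        if m:
            hits.append(Hit(
                rank=int(m["rank"]),
                name=m["name"],
                prob=float(m["prob"]),
                evalue=float(m["evalue"]),
                cols=int(m["cols"]),
                query_start=int(m["qstart"]),
                query_end=int(m["qend"]),
            ))
    return hits


def filter_hits(hits: List[Hit], min_cols: int = 30,
                min_prob: float = 0.0) -> List[Hit]:
    """Keep hits with at least min_cols matched columns and min_prob probability."""
    return [h for h in hits if h.cols >= min_cols and h.prob >= min_prob]


# Invented sample data, used only to exercise the parser.
SAMPLE = """\
Query         YXX000W hypothetical example ORF
Match_columns 300

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 1ABC_A Example hydrolase; fold   99.9 1.2E-30 3.4E-35  210.5   0.0  150   12-170     3-160 (200)
  2 PF00000.1 Example domain         85.2 0.0012  4.5E-08   45.1   0.0   25  200-226     1-27  (90)

No 1
>1ABC_A Example hydrolase; fold
"""

for hit in filter_hits(parse_hits(SAMPLE)):
    print(hit.rank, hit.name, hit.prob, hit.cols)
```

To process a real download, read the file contents (for example with open(path).read()) and pass them to parse_hits. Rows whose columns run together without whitespace would be skipped by this pattern, so the row count is worth checking against the file.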

Job submission

In the box at the top of this page, type a gene identifier, either the systematic name or the standard name if available. The server will offer valid options. Alternatively, you can simply add names as suffixes to the URL. When the "Display plot" button is pressed, three panels are typically displayed, one for each of the databases (PDB, Pfam, Yeast). However, a panel is not displayed if no hits reach the default matching probability threshold for display. Each panel can then be examined in detail, and its display thresholds can be re-set for the other two panels as well.

REFERENCES

[1] "Protein homology detection by HMM-HMM comparison", Söding J., Bioinformatics, 2005, updated most recently in "A Completely Reimplemented MPI Bioinformatics Toolkit with a New HHpred Server at its Core.", Zimmermann et al., J Mol Biol., 2018.

[2] "HHyeast reveals hundreds of new domains in the yeast proteome", Christidi et al., manuscript in preparation.