
HAMAP annotation rules: User Manual for the Web View


The HAMAP annotation rules are written in the UniRule format, which is used by the UniProt Knowledgebase (UniProtKB) automated annotation projects to annotate protein records in the UniProtKB format. The rules can be displayed in a user-friendly Web View, which consists of the following three main sections and their associated sub-sections.


General rule information

Accession
This section indicates the accession number of the rule, in the form MF_xxxxx.
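As a minimal sketch, and assuming that each "x" in MF_xxxxx stands for one decimal digit (for example MF_00001), a rule accession could be checked as follows. The helper name is hypothetical and is not part of HAMAP.

```python
import re

# Assumption: "MF_xxxxx" means the literal prefix "MF_" followed by five digits.
_HAMAP_ACCESSION = re.compile(r"MF_[0-9]{5}")


def is_hamap_accession(text: str) -> bool:
    """Return True if text has the MF_xxxxx shape (hypothetical helper)."""
    return _HAMAP_ACCESSION.fullmatch(text) is not None
```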

Dates
This section is composed of two lines. The first line indicates the rule creation date; the second corresponds to the last rule revision date.

Name
This section provides the name of the protein.

Scope
This section indicates the taxonomic range covered by the rule.

Template(s)
This section lists the UniProtKB accession numbers of the entries from which the rule's annotation was inferred. The template entries are usually experimentally characterized. "Template: None" indicates that no characterization paper exists for any protein of the family. This is the case for UPFs (Uncharacterized Protein Families), for example.

Triggered by
This section indicates the profile identifier(s) used to trigger the application of the rule. The trigger can be either:


Propagated annotation

Identifier, protein and gene names
This section contains:

Comments
This section contains all applicable comment lines of a UniProtKB entry (see the General annotation (Comments) section of the UniProtKB User Manual).

Keywords
This section contains all applicable keywords of a UniProtKB entry (see the Keywords section of the UniProtKB User Manual).

Gene Ontology
This section contains all applicable GO terms and the corresponding cross-references to the Gene Ontology database.

Cross-references
This section indicates cross-references to domain databases within a UniProtKB entry, currently PROSITE, Pfam, PRINTS, TIGRFAMs and PIRSF (see the Cross-references section of the UniProtKB User Manual).
The format is:
Database Name identifier1; identifier2; number of expected hits;

        Pfam     PF02033; RBFA; 1;
        TIGRFAMs TIGR00082; rbfA; 1;
        PROSITE  PS01319; RBFA; 1;
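The line format above can be read mechanically. The following Python sketch is illustrative only: the `CrossReference` type and `parse_cross_reference` function are hypothetical names, and it assumes that the database name contains no whitespace and that the three fields are always separated by semicolons.

```python
from typing import NamedTuple


class CrossReference(NamedTuple):
    database: str
    identifier1: str
    identifier2: str
    expected_hits: int


def parse_cross_reference(line: str) -> CrossReference:
    """Parse 'Database identifier1; identifier2; hits;' (hypothetical helper)."""
    # Assumption: the database name is the first whitespace-delimited token.
    parts = line.split(None, 1)
    if len(parts) != 2:
        raise ValueError(f"unexpected cross-reference line: {line!r}")
    database, rest = parts
    # Drop the empty field that follows the trailing semicolon.
    fields = [field.strip() for field in rest.split(";") if field.strip()]
    if len(fields) != 3:
        raise ValueError(f"unexpected cross-reference line: {line!r}")
    return CrossReference(database, fields[0], fields[1], int(fields[2]))
```

For instance, `parse_cross_reference("Pfam     PF02033; RBFA; 1;")` yields a record whose `database` is `"Pfam"` and whose `expected_hits` is the integer 1.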



Computed features
This section indicates which other rule(s) must be applied to completely annotate the protein.
Two main cases can be distinguished:

Features
This section contains:
  1. Template feature line(s), which define the template for all the subsequent Feature lines.
    The format is:
    From: template name
    where template name is the identifier (ID and AC) of a sequence in the seed alignment.
    (e.g. From: ACP_ECOLI (P02901))
  2. Applicable feature lines that may be applied to UniProtKB entries (e.g. ACT_SITE, METAL; see the Sequence annotation (Features) section of the UniProtKB User Manual).
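As an illustration of the template line format, the Python sketch below splits a "From:" line into its entry name (ID) and accession (AC). The function name is hypothetical, and the sketch assumes that the accession is always enclosed in parentheses after the ID.

```python
import re
from typing import Tuple

# Assumption: the line reads "From: <ID> (<AC>)", as in "From: ACP_ECOLI (P02901)".
_TEMPLATE_LINE = re.compile(r"From:\s*(\S+)\s*\(([^)\s]+)\)")


def parse_template_line(line: str) -> Tuple[str, str]:
    """Return (entry_name, accession) from a template line (hypothetical helper)."""
    match = _TEMPLATE_LINE.fullmatch(line.strip())
    if match is None:
        raise ValueError(f"not a template feature line: {line!r}")
    return match.group(1), match.group(2)
```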

Conditions may be used in feature lines. They usually correspond to pattern constraints, or to the presence of a specific amino acid.
e.g.

Key         From    To     Description     Condition
DISULFID      60    80                     C-x*-C
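How such a condition might be evaluated can be sketched in Python. This is not the HAMAP implementation. It assumes a PROSITE-like syntax (elements separated by "-", "x" for any residue, "x*" for any number of residues, square brackets for a set of allowed residues) and assumes that the condition is tested against the residues spanning From to To in the target sequence. Both function names are hypothetical.

```python
import re


def condition_to_regex(condition: str) -> re.Pattern:
    """Translate a condition such as 'C-x*-C' or '[RQ]' into a regular expression.

    Assumption: PROSITE-like syntax, which this manual does not spell out.
    """
    parts = []
    for element in condition.split("-"):
        if element == "x*":
            parts.append(".*")          # any number of residues
        elif element == "x":
            parts.append(".")           # exactly one residue of any kind
        elif element.startswith("[") and element.endswith("]"):
            parts.append("[" + re.escape(element[1:-1]) + "]")  # one of a set
        else:
            parts.append(re.escape(element))                    # literal residue(s)
    return re.compile("".join(parts))


def feature_condition_holds(sequence: str, start: int, end: int, condition: str) -> bool:
    """Check the condition against residues start..end (1-based, inclusive)."""
    region = sequence[start - 1:end]
    return condition_to_regex(condition).fullmatch(region) is not None
```

Under these assumptions, a DISULFID feature from 60 to 80 with condition C-x*-C holds only if positions 60 and 80 of the target are both cysteines.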

An "Optional" label can be used to flag a feature that is not mandatory in the matched sequences.
e.g.

Key                   From    To     Description     Condition
BINDING (Optional)     153   153     ATP             [RQ]

Multiple feature (FT) lines that must be applied either all together or not at all are grouped within an "FTGroup", which enforces the joint presence of all sites.
e.g.

Key         From    To     Description             Condition     FTGroup
ACT_SITE      42    42     Charge relay system     H             1
ACT_SITE      91    91     Charge relay system     D             1
ACT_SITE     186   186     Charge relay system     S             1

The group can then be referenced by case statements in any other annotation section, so that the enclosed annotation is propagated only when the group applies.
For instance:

case <FTGroup:1>
  COFACTOR:
  Name=Zn(2+); Xref=ChEBI:CHEBI:29105;
  Note=Binds 1 zinc ion per subunit.;
end case
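The all-or-nothing behaviour of an FTGroup, and its use as a case condition, can be sketched in Python. This is a simplified illustration, not the HAMAP implementation: the `FeatureLine` type and `applicable_groups` function are hypothetical, each condition is reduced to a single expected residue, and positions are assumed to be already mapped from the template onto the target sequence.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class FeatureLine:
    key: str
    position: int                 # 1-based; From == To in this sketch
    condition: str                # single expected residue (simplification)
    group: Optional[int] = None   # FTGroup number, if any


def applicable_groups(sequence: str, features: List[FeatureLine]) -> Dict[int, bool]:
    """Map each FTGroup number to True only if every member's condition holds."""
    result: Dict[int, bool] = {}
    for ft in features:
        if ft.group is None:
            continue
        in_range = 1 <= ft.position <= len(sequence)
        matches = in_range and sequence[ft.position - 1] == ft.condition
        # One failing member invalidates the whole group.
        result[ft.group] = result.get(ft.group, True) and matches
    return result
```

With the three ACT_SITE lines above placed in group 1, a caller would propagate the COFACTOR annotation only when `applicable_groups(sequence, features).get(1)` is true.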


Additional information