
Substrate and Binding Annotation of Carbohydrate-Active Enzymes: Interest and Lessons for Post-genomics
Pedro M. Coutinho, CNRS Aix-Marseille University, France
Hosts: Angelina Palma and Ana Luísa Carvalho, UCIBIO, NOVA
Abstract:
The Carbohydrate-Active Enzymes Database (CAZy; www.cazy.org) provides access to nearly 500 families of enzymes and of their ancillary carbohydrate-binding modules involved in the degradation, biosynthesis and modification of complex carbohydrates. Thanks to dedicated curation efforts, this classification links protein sequences and structures to the enzymes’ functional and bibliographic data, and provides associated taxonomic and genomic descriptors [1,2,3]. In the post-genomic era, high-throughput enzyme characterization is expanding and systematizing a wide variety of activity assays. However, the often-limited scope of biochemical characterizations (both in the diversity of the substrates and in the nature of the kinetic or activity assays) may affect the functional annotation of CAZymes. In order to remain factual, information from the peer-reviewed literature on the “significant” substrates has been associated with individual enzymes. Depending on the depth of characterization, the substrate data either complement the EC annotation by providing details on the evidence and on enzyme specificity, or constitute a functional annotation in the absence of more in-depth characterization. Directly or indirectly, these data can also help identify concerted enzyme actions in the degradation of complex saccharides. Carbohydrates are known for their enormous diversity [4,5], and extensive efforts to describe them are underway [6]. Although the diversity of complex glycans may have its limits [7], much has yet to be unveiled. For our ongoing curation efforts, over 1,300 substrates have been grouped into broad categories. This categorization is important for revealing aspects of enzyme specificity, as substrate comparisons are usually relevant only within “equivalent” category levels.
Furthermore, carbohydrate-active enzymes often rely on carbohydrate-binding modules (CBMs) to better target their catalytic action. Initially described as substrate-specific binding families, the CBM families had to evolve to follow the same principles used for enzyme family classification. Unlike enzymes, whose catalytic properties are described through the EC classification, the “characterized CBMs” had no reference scheme against which their binding properties could be annotated. A brief presentation will be given of the binding classification used internally in CAZy, describing the different binding levels that allow a more complete curation of enzyme properties.
Current functional knowledge and known enzyme functional diversity are direct consequences of the biological or biotechnological context of their characterization. Awareness of the potential limitations of past approaches is necessary to ensure that future enzyme and CBM discovery and characterization efforts involve meaningful substrate collections for a greater impact.
References
[1] B.L. Cantarel, P.M. Coutinho, et al. Nucleic Acids Res. 2009, 37, D233-D238.
[2] V. Lombard, H. Golaconda Ramulu, et al. Nucleic Acids Res. 2014, 42, D490-D495.
[3] E. Drula, M.-L. Garron, et al. Nucleic Acids Res. 2022, 50, D571-D577.
[4] R.A. Laine. Glycobiology 1994, 4, 759-767.
[5] A. Varki, et al. (eds.). Essentials of Glycobiology, 4th Edition, Cold Spring Harbor Laboratory Press, New York, 2022.
[6] P.V. Toukach, K.S. Egorova. Nucleic Acids Res. 2016, 44, D1229-D1236.
[7] P. Lapebie, V. Lombard, et al. Nat. Commun. 2019, 10, 2043.