Fabrega, S, Durand, P, Mornon, J P and Lehn, P (2002), "[The active site of human glucocerebrosidase: structural predictions and experimental validations]", J Soc Biol, 196, 2: 151-60.
Abstract: Gaucher disease is a lysosomal storage disorder caused by a deficiency in glucocerebrosidase,
which cleaves the beta-glucosidic linkage of glucosylceramide, a
normal intermediate in glycolipid metabolism. Glucocerebrosidase
belongs to the clan GH-A of glycoside hydrolases, a large group of
enzymes which function with retention of the anomeric configuration
at the hydrolysis site. Accurate three-dimensional (3D) structure
data for glucocerebrosidase should help to better understand the
molecular bases of Gaucher disease. Because such 3D structure data were not
available, we used the two-dimensional hydrophobic cluster analysis
(HCA) method to make structure predictions for the catalytic domains
of clan GH-A glycoside hydrolases. We found that all the enzymes
of clan GH-A may share a similar catalytic domain consisting of an
(alpha/beta)8 barrel with the critical acid/base and nucleophile
residues located at the C-terminal ends of strands beta 4 and beta 7,
respectively. In the case of glucocerebrosidase, Glu 235 was predicted
to be the acid/base catalyst and Glu 340 the nucleophile.
Next, in order to obtain experimental evidence
supporting these HCA-based predictions, we used retroviral vectors
to express, in murine null cells, E235A and E340A mutant proteins,
in which alanine residues unable to participate in the enzymatic
reaction replace the presumed critical glutamic acid residues. Both
mutants were found to be catalytically inactive although they were
correctly folded/processed and sorted to the lysosome. Thus, Glu 235
and Glu 340 do indeed play key roles in the active site of human
glucocerebrosidase, as predicted by HCA. In a broader
perspective, our work shows that bioinformatics approaches may be
highly useful for generating structure-function predictions based on
sequence-structure interrelationships, especially in the context of a rapid
increase in protein sequence information through genome sequencing.