P450 structure
There is currently no crystal structure of an arthropod P450, but the overall structure of P450s is quite conserved, so this page provides a general background.
Overview
The overall P450 fold as described by Munro et al. (2013) is shown here:
A clear illustration of P450 structure as it is related to the protein sequence was provided by Di Nardo and Gilardi (2020) and is shown below:
Below is a different view showing the structure of human CYP3A4 (pdb 1TQN, Yano et al., 2004) with helices labeled.
Conserved motifs
The sequence identity of distantly related P450 proteins can be as low as that predicted from the random assortment of two sets of 500 or so amino acids. This is because there are very few absolutely conserved amino acids; in fact, in arthropods none is absolutely conserved.
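To give a rough sense of what "random assortment" implies, the sketch below computes the identity expected by chance between two unrelated sequences. It assumes an ungapped, position-by-position comparison with residues drawn independently; the function names are illustrative only. With all 20 amino acids equally frequent the expectation is 5%, and it is somewhat higher for real, skewed compositions or when an alignment program is allowed to insert gaps.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def composition(sequence):
    """Residue frequencies of a protein sequence (one-letter code)."""
    counts = Counter(sequence.upper())
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}

def expected_random_identity(frequencies):
    """Probability that two independently drawn residues are identical.

    Assumes both sequences share the same residue frequencies.
    """
    return sum(p * p for p in frequencies.values())

# Uniform composition: each of the 20 amino acids has frequency 1/20.
uniform = {aa: 1 / 20 for aa in AMINO_ACIDS}
print(round(expected_random_identity(uniform), 3))  # 0.05
```

Passing `composition(seq)` for a real P450 sequence instead of `uniform` gives the chance level for that particular amino acid composition.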
The highly conserved motifs are the WxxxR motif, the GxE/DTT/S motif, the ExLR motif, the PxxFxPE/DRF motif and the signature PFxxGxRxCxG/A motif around the heme-coordinating Cys. They are shown below (Fig. 4 from Feyereisen, 2012).
The description of the structure essentially follows the nomenclature of the P450cam protein (CYP101), the camphor hydroxylase of Pseudomonas putida. This was the first crystal structure of a P450 protein (Poulos et al., 1985, 1987) (figure from Hamdane et al., 2008).
The first motif, WxxxR, is located in the C-helix, and the Arg is thought to form a charge pair with the propionate of the heme. This motif is easily discernible in multiple alignments.
The second conserved motif GxE/DTT/S surrounds a conserved Thr in the middle of the long helix I that runs on top of the plane of the heme, over pyrrole ring B.
The third conserved motif ExLR is located in helix K. It is thought to stabilize the overall structure through a set of salt bridge interactions (E-R-R) with the fourth conserved motif PxxFxPE/DRF (often PERF, but R is sometimes replaced by H or N) that is located after the K' helix in the “meander” facing the ExLR motif.
The fifth conserved motif PFxxGxRxCxG/A precedes helix L and carries the cysteine (thiolate) ligand to the heme iron on the opposite side of helix I. The cysteine ligand is responsible for the typical 450 nm (hence P450) absorption of the Fe(II)-CO complex of P450. This heme-binding loop is the most conserved portion of the protein and is often considered the “signature” of P450 proteins. Only in rare cases do P450 sequences lack the “invariant” Cys.
The term P420 is given to P450 proteins in which the heme is not liganded and the Cys is reduced, so that the enzyme is inactivated (not always irreversibly).
Deviations from the consensus sequences of these five motifs deserve special attention. For instance, the conserved CYP301A1 proteins of several species have an unusual Y instead of F in the canonical PFxxGxRxCxG/A motif before the Cys axial ligand to the heme.
Several insect P450 sequences, notably CYP307, CYP321 and some CYP9 sequences, lack the conserved threonine in helix I. Examples of non-arthropod P450s that lack this threonine are CYP74 (allene oxide synthase) and CYP5 (thromboxane synthase), which do not depend on molecular oxygen for activity, and CYP107A1 and CYP158A1, which do, but substitute a hydroxyl group of the substrate for that of the missing threonine.
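The five consensus motifs can be written as regular expressions to scan a sequence and flag deviations such as those described above. The following is a minimal sketch, not a validated annotation tool: the patterns are a literal transcription of the consensus strings in the text ("x" becomes ".", alternatives such as E/D become a character class), the dictionary labels and the toy sequence are made up for illustration, and short patterns such as WxxxR or ExLR will also match by chance elsewhere in a real protein, so hits still need to be checked against their expected position in an alignment.

```python
import re

# Strict consensus patterns transcribed from the text.
MOTIFS = {
    "WxxxR (helix C)": r"W...R",
    "GxE/DTT/S (helix I)": r"G.[ED]T[TS]",
    "ExLR (helix K)": r"E.LR",
    "PxxFxPE/DRF (meander)": r"P..F.P[ED]RF",
    "PFxxGxRxCxG/A (heme loop)": r"PF..G.R.C.[GA]",
}

def find_motifs(sequence):
    """Return {motif label: [(1-based start, matched text), ...]}."""
    seq = sequence.upper()
    hits = {}
    for label, pattern in MOTIFS.items():
        # The lookahead allows overlapping occurrences to be reported.
        regex = re.compile(f"(?=({pattern}))")
        hits[label] = [(m.start() + 1, m.group(1)) for m in regex.finditer(seq)]
    return hits

# Synthetic toy sequence containing each motif once (not a real P450).
toy = "MAWAAARAAGAETTAAEALRAAPAAFAPERFAAPFAAGARACAGAA"
for label, found in find_motifs(toy).items():
    print(label, found)

# A Phe-to-Tyr change in the heme loop, as reported for CYP301A1,
# is no longer matched by the strict pattern and is thereby flagged.
variant = toy.replace("PFAAG", "PYAAG")
print(find_motifs(variant)["PFxxGxRxCxG/A (heme loop)"])  # []
```

An empty hit list for a motif is a prompt to inspect the alignment by eye rather than proof that the motif is absent; relaxing a pattern (for example `[RHN]` in place of `R` in the meander motif) is one way to accommodate the known substitutions.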
Membrane anchoring
P450 proteins are characterized by their N-terminal sequence. Those targeted to the endoplasmic reticulum have a stretch of about 20 hydrophobic amino acids preceding one or two charged residues that serve as a halt-transfer signal, followed by a short motif with several prolines and glycines.
This “PGPP” motif serves as a hinge that slaps the globular domain of the protein onto the surface of the membrane while the N-terminus is anchored through it. The presence of the PGPP hinge is necessary for proper heme incorporation and assembly of functional P450 in the cell (Yamazaki et al., 1993; Chen et al., 1998).
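The N-terminal features described above (a hydrophobic stretch followed by a Pro/Gly-rich hinge) can be looked for with a simple hydropathy scan. The sketch below uses the Kyte-Doolittle scale with a 19-residue window and a mean hydropathy cutoff of 1.6, a conventional heuristic for membrane-spanning segments. The hinge rule (at least three Pro or Gly in a four-residue window shortly after the anchor), the function names and the toy sequence are assumptions made for illustration, and dedicated transmembrane predictors will be more reliable on real sequences.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def mean_hydropathy(segment):
    """Average Kyte-Doolittle score (KeyError on non-standard residues)."""
    return sum(KD[aa] for aa in segment) / len(segment)

def find_anchor(sequence, window=19, threshold=1.6, search_limit=60):
    """Return (1-based start, mean score) of the most hydrophobic window
    within the first `search_limit` residues, or None if below threshold."""
    seq = sequence.upper()
    best = None
    last_start = min(search_limit, len(seq)) - window
    for start in range(0, last_start + 1):
        score = mean_hydropathy(seq[start:start + window])
        if best is None or score > best[1]:
            best = (start + 1, score)
    if best is None or best[1] < threshold:
        return None
    return best

def find_hinge(sequence, anchor_start, window=19, span=15, size=4, min_pg=3):
    """Return the 1-based start of the first `size`-residue window holding
    at least `min_pg` Pro/Gly within `span` residues after the anchor."""
    seq = sequence.upper()
    begin = anchor_start - 1 + window
    for i in range(begin, min(begin + span, len(seq) - size + 1)):
        if sum(aa in "PG" for aa in seq[i:i + size]) >= min_pg:
            return i + 1
    return None

# Synthetic N-terminus: hydrophobic stretch, two charged residues, PGPP-like hinge.
toy = "MS" + "L" * 19 + "KRLPPGPPSAEDKTE"
anchor = find_anchor(toy)
print(anchor[0], find_hinge(toy, anchor[0]))  # 3 24
```

A sequence for which `find_anchor` returns `None` is not necessarily soluble; it may simply be truncated at the N-terminus or targeted elsewhere in the cell.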
A hydrophobic region between helices F and G is thought to penetrate the lipid bilayer, thus increasing the contact of the P450 with the hydrophobic environment from which many substrates can enter the active site (Williams et al., 2000).
The structure of membrane-bound yeast CYP51 (see figure below from Pochapsky, 2014) shows how the transmembrane helix (TMH) anchors the globular structure of the P450, with the tips of helices F and G interacting with the lipid bilayer. Here an N-terminal amphipathic helix (AH) resides on the inner side of the ER membrane.
Substrate Recognition Sites (SRS)
Interspersed throughout the globular domain of the P450 proteins are six regions with a low degree of sequence similarity, covering about 16% of the total length of the protein. Initially recognized in CYP2 proteins by Gotoh (1992), these are called SRS for Substrate Recognition Sites and this designation has been generically extended to other P450s. However, they do not have precise sizes and boundaries across CYP families. A later analysis of CYP2 sequences in vertebrates did not support Gotoh's earlier conclusions (Kirischian et al., 2011).
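The coverage figure quoted above is simply the fraction of residues that fall inside any SRS. A minimal sketch of that arithmetic follows; the region boundaries are hypothetical values chosen for a 500-residue protein and are not Gotoh's published coordinates.

```python
def coverage_fraction(regions, protein_length):
    """Fraction of residues covered by 1-based inclusive (start, end) regions.

    Overlapping regions are counted only once.
    """
    covered = set()
    for start, end in regions:
        covered.update(range(start, end + 1))
    return len(covered) / protein_length

# Hypothetical SRS-1 to SRS-6 boundaries (illustration only).
hypothetical_srs = [(100, 120), (200, 208), (236, 246),
                    (290, 305), (365, 375), (475, 486)]
print(f"{coverage_fraction(hypothetical_srs, 500):.0%}")  # 16%
```

Applying the same function to SRS boundaries mapped onto an alignment makes it easy to compare how much of the sequence different SRS definitions claim.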
For P450s with highly conserved function, the SRS would be expected to have a high degree of sequence similarity. A comparative study of the sterol 14-demethylase CYP51 (Podust et al., 2001) indicates that while SRS-4, 5 and 6 would contribute to the substrate binding site, SRS-2 and 3 “likely do not exist” and SRS-1 in the B-C loop would only contact the sterol under certain conformations.
Despite these limitations, the concept of SRS is still widely used as a general guide to regions more likely to interact with substrates than others, as Figure 1B (above) indeed suggests. Zawaira et al. (2011) took advantage of about 50 crystal structures of mammalian P450s, compared with just the bacterial P450cam structure available to Gotoh in 1992. They revisited the SRS concept by combining what they called “X-ray structures SRS maps” and substrate “docking SRS maps” for a number of P450s from four families. From this much larger sample than Gotoh's original study, Zawaira et al. (2011) modified and expanded the SRS zones to 33% of the mammalian P450 sequences. It would seem therefore that, by covering a third of the primary structure of P450s, such a new SRS definition can only be less relevant than Gotoh's landmark SRSs. Instead, given the increasingly accurate and family-specific structural information that is now available, defining regions of the P450 structure that interact with substrates (and inhibitors) can be done more directly than by relying on an SRS template. However, mammalian structures from the CYP2, 3 or 4 clans so far remain the best, albeit only distant, templates for arthropod P450s.