The CYP4G subfamily
Overview
Insect P450 enzymes of the CYP4G subfamily catalyze the synthesis of cuticular hydrocarbons that serve multiple functions from desiccation resistance to chemical communication. These functions are essential for survival.
Hydrocarbon biosynthesis from very long chain fatty acyl thioesters. CoA: Coenzyme A. The reaction of CYP4G is an oxidative decarbonylation, with production of CO2. It is not yet clear whether the fatty aldehyde substrate is the product of a fatty acyl-CoA reductase (FAR), or whether the substrate of CYP4G is an alcohol produced by a FAR, which CYP4G first converts to the aldehyde as an intermediate in the oxidative decarbonylation.
The genomes of most species of Neoptera carry at least two CYP4G genes that are paralogs of the two Drosophila CYP4G genes (CYP4G1 and CYP4G15). The duplication of the original CYP4G is basal to Neoptera, and no CYP4G is found in Paleoptera or outside the class Insecta. See below: Evolution and distribution.
The sequences of CYP4G enzymes, and particularly their active sites, have been highly conserved over 400 MY. All CYP4G sequences are characterized by an insertion of about 44 residues between the G and H helices, which protrudes from the globular structure of the enzyme, distal to the membrane anchor.
To determine whether a sequence is a CYP4G1-type or a CYP4G15-type, BLAST it against the database and check which of the two types its best hits belong to.
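The best-hit rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: BLASTP is assumed to have been run with tabular output (`-outfmt 6`) against a local set of already classified reference sequences, and the subject identifiers are assumed to start with a hypothetical `CYP4G1_` or `CYP4G15_` prefix. The function name and the example hits are invented for the sketch.

```python
import csv
import io

# The 12 default columns of BLAST tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def classify_cyp4g(blast_tsv: str) -> dict:
    """Map each query to 'CYP4G1', 'CYP4G15' or 'unknown' from its best hit."""
    best = {}  # query id -> (bitscore, subject id) of the top-scoring hit
    for row in csv.reader(io.StringIO(blast_tsv), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(COLUMNS, row))
        score = float(hit["bitscore"])
        query = hit["qseqid"]
        if query not in best or score > best[query][0]:
            best[query] = (score, hit["sseqid"])

    calls = {}
    for query, (_, subject) in best.items():
        # Hypothetical naming convention for the reference sequences.
        if subject.startswith("CYP4G15_"):
            calls[query] = "CYP4G15"
        elif subject.startswith("CYP4G1_"):
            calls[query] = "CYP4G1"
        else:
            calls[query] = "unknown"
    return calls


# Invented hits, for illustration only (not real BLAST results).
EXAMPLE = "\n".join([
    "\t".join(["query1", "CYP4G1_refA", "61.2", "540", "200", "4",
               "1", "540", "3", "545", "1e-180", "650"]),
    "\t".join(["query1", "CYP4G15_refB", "55.0", "538", "230", "6",
               "1", "538", "5", "547", "1e-150", "560"]),
    "\t".join(["query2", "CYP4G15_refB", "70.3", "552", "160", "2",
               "1", "552", "1", "552", "0.0", "780"]),
])
print(classify_cyp4g(EXAMPLE))
```

A single best hit is a quick heuristic. When the two best scores are close, a phylogenetic placement against both paralog groups is likely to be more reliable.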
Unique insertion in the CYP4G subfamily. The CYP4G sequences (D. melanogaster CYP4G1 and a consensus from 44 insect CYP4G sequences) are aligned with four CYP sequences of known crystal structure, and the portion of the alignment from helix F to helix I is shown with helices marked in color. The CYP4G-specific insertion is marked in blue. (Fig. S4 from Qiu et al., 2012)
Function
The function of the CYP4G genes was first elucidated by Qiu et al. (2012), who showed that survivors of CYP4G1 RNAi are deficient in cuticular hydrocarbons (CHC), highly sensitive to desiccation stress, and impaired in their pheromone-mediated courtship behavior. Furthermore, the recombinant Musca domestica CYP4G2-P450 reductase fusion protein was shown to catalyze the last step in CHC biosynthesis, the oxidative decarbonylation of long chain fatty aldehydes (Qiu et al., 2012).
The role of CYP4G enzymes in the synthesis of alkanes and alkenes which serve as waterproofing agents on the insect epicuticle and in many pheromonal functions as well (Howard and Blomquist, 2005; Ferveur, 2005) has been confirmed in subsequent studies.
CYP4G activity and phenotypes of CYP4G suppression by RNAi or CRISPR/Cas9 are summarized in this table. Thus, direct and indirect evidence from phylogenetically distant species strongly suggests that CYP4G enzymes share a common biochemical function as oxidative decarbonylases essential in hydrocarbon biosynthesis, whether the hydrocarbons have a structural function (as CHC) or a signalling one (as pheromones).
Indirect function in insecticide resistance
Constitutive overexpression of CYP4G genes in insecticide-resistant strains also provides indirect evidence for a role of CYP4G genes in resistance. Initial observations were only correlative, i.e. high CYP4G expression and resistance (Pittendrigh et al., 1997; Pridgeon et al., 2003; Muller et al., 2008; Jones et al., 2013). RNAi of CYP4G19 in Blattella germanica and of CYP4G14 in Tribolium castaneum increases toxicity of pyrethroids (Guo et al., 2010; Chen et al., 2019; Kalsi and Palli, 2017). It appears that the control of CHC production by CYP4G enzymes affects insecticide penetration, and hence contributes to resistance (Balabanidou et al., 2016; 2018; Wang et al., 2019b). In the laboratory-selected resistant strain of Drosophila melanogaster called 91-R, CYP4G1 is one of several constitutively overexpressed genes, leading to an increase in CHC content. RNAi experiments suggest that this contributes to DDT resistance in this strain (Kim et al., 2018). The CYP4G enzymes are therefore not only essential for the synthesis of CHC and their multiple structural and communication roles, but they can also play a role in a poorly understood toxicokinetic aspect of insecticide resistance, i.e. resistance to penetration through the cuticle.
Structure
Comparison of the alignments of 148 CYP4G1 and 210 CYP4G15 sequences does not reveal consistent and obvious differences. Residues that are conserved are mostly conserved in both types of sequences. In CYP4G1 and CYP4G15 sequences, 67 and 68 residues, respectively, are 100% conserved. Of those, 51 are 100% conserved in all CYP4G sequences analyzed. Over a hundred further positions show conservative substitutions. Although the insertion between helices G and H is longer in CYP4G15 than in CYP4G1, this is particular to Drosophila, and the mean length of the insertion is essentially the same (44 and 43 amino acids, respectively) in the two types of CYP4G sequences. The amino acid composition of the insertion shows an enrichment in acidic residues (Asp 15.6%, Glu 9.4%, average from 204 sequences). This is about double the natural abundance of acidic residues in proteins.
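Both kinds of figures quoted above (fully conserved positions and acidic residue content) are simple to recompute from an alignment or from a set of insertion sequences. The following is a minimal sketch, assuming sequences are given as plain one-letter strings and alignment rows have equal length with `-` for gaps. The function names, the toy sequence and the toy alignment are invented for illustration and are not real CYP4G data.

```python
from collections import Counter


def residue_percentages(seq: str) -> dict:
    """Percentage of each amino acid in seq, ignoring gaps and whitespace."""
    residues = [aa for aa in seq.upper() if aa.isalpha()]
    total = len(residues)
    return {aa: 100.0 * n / total for aa, n in Counter(residues).items()}


def acidic_percentage(seq: str) -> float:
    """Combined percentage of Asp (D) and Glu (E) in seq."""
    pct = residue_percentages(seq)
    return pct.get("D", 0.0) + pct.get("E", 0.0)


def fully_conserved_columns(alignment: list) -> list:
    """0-based indices of columns where every row has the same residue."""
    return [i for i, column in enumerate(zip(*alignment))
            if len(set(column)) == 1 and column[0] != "-"]


# Toy inputs, invented for illustration only.
toy_insertion = "DDEEDKAGSD"
toy_alignment = ["MKDE-L",
                 "MRDEAL",
                 "MKDEGL"]

print(acidic_percentage(toy_insertion))
print(fully_conserved_columns(toy_alignment))
```

Applied to real data, `acidic_percentage` would be averaged over the insertion of each sequence, and `fully_conserved_columns` would be run separately on the CYP4G1 alignment, the CYP4G15 alignment and the combined alignment.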
Comparison of the structure of rabbit CYP4B1 and the model of Anopheles CYP4G16. The CYP4G16 model obtained by I-TASSER is in the right panel and CYP4B1 with bound heme and octane substrate (PDB 5t6q) is in the left panel. Helices are in red and sheets in green. The view is through the I helix, with the N-terminal transmembrane helix and the tips of the F and G helices at the bottom. The CYP4G insertion is clearly visible on top of the model, after the G helix. The approximate position of the membrane surface is shown as a stippled line. (Fig. 8 from Feyereisen, 2020)
The figure shows a comparison of the CYP4B1 structure with the I-TASSER model for CYP4G16. The insertion protrudes from the globular structure of the P450, on the cytoplasmic side, distal from the membrane surface in which the N-terminus is anchored and into which the loop between helices F and G dips. The variable sequences of the CYP4G insertions result in varying predictions of secondary structure, from random coil over the entire length, to a short helix between helices G and H, to a lengthened helix G followed by random coil. It appears then that the acidic nature of the insertion is the only common feature. The insertion does not interfere with the interaction site of the P450 with NADPH cytochrome P450 reductase (CPR), the obligatory electron donor that is highly enriched in oenocytes along with CYP4G enzymes (Lycett et al., 2006; Qiu et al., 2012). This interaction site of the FMN domain of CPR with P450s is thought to be located near residues of the N-terminus of the I helix and of the B’ helix (Estrada et al., 2015), i.e. in front of the view shown in the figure. However, CPR is a larger multi-domain enzyme which undergoes conformational changes during catalysis (Laursen et al., 2011), and it is possible that the CYP4G insertion has ionic interactions that stabilize the P450-CPR complex. Alternatively, the insertion might serve to tether as yet unidentified proteins to form a metabolon with CYP4G and CPR. Candidates would be fatty acyl-CoA reductases (FAR), the enzymes providing CYP4G enzymes with their substrates.
The consensus from 358 CYP4G sequences (Feyereisen, 2020) shows that the regions close to the active site of the CYP4G enzymes have been highly conserved for 400 MY. Furthermore, of the 17 amino acid residues lining the active site of CYP4B1 (Hsu et al., 2017), eight are identical and four are conservative substitutions. This suggests that the ancestral CYP4C may have been a fatty acid omega hydroxylase and that there are structural constraints to maintain the regioselectivity towards the omega position. The broader phylogeny of CYP4 genes (Kirischian and Wilson, 2012) indicates that insect CYP4 sequences are monophyletic with the vertebrate CYP4V subfamily, known to encode omega hydroxylases.
Evolution and distribution
Although it is generally considered that genes with highly conserved sequence and function are evolutionarily “stable”, the evidence from the CYP4G subfamily shows that since their initial duplication over 400 MYA, these genes have experienced many gene births and deaths (Feyereisen, 2020).
Phylogeny of neopteran orders and fate of the CYP4G1 and CYP4G15 paralogs. The insect phylogeny is adapted from Misof et al. (2014). An ancestral CYP4G resulted from the neofunctionalization of a duplicated CYP4C gene. This ancestral CYP4G then duplicated, resulting in a family of CYP4G1 and CYP4G15 paralogs found in 24 extant insect orders. D indicates gene loss within the lineage, not excluding independent losses. B indicates gene duplication and BB indicates more than a single duplication within the lineage. Lineages shown as stippled lines indicate gene loss in the entire lineage. (Fig. 6 from Feyereisen, 2020)
The figure above shows that the CYP4G1 homolog has been lost several times, and is missing in five orders of insects. These losses are both ancient, as in all Hemiptera and Thysanoptera, and more recent, as in honey bees. Serial duplications leading to CYP4G gene clusters have also been observed, as in house flies and in fireflies. The detailed evolutionary history of CYP4G genes does not support the “stability” of these essential genes, but rather a “revolving door” pattern where their essential function is maintained despite an apparently random birth and death process. The dual function of cuticular hydrocarbons, in desiccation resistance (achieved mainly by the quantity of hydrocarbons produced) and in chemical communication (achieved by the blend of hydrocarbons produced), may explain the apparently paradoxical evolution of CYP4G genes.
FASTA files are available for 207 CYP4G1-type sequences and 262 CYP4G15-type sequences.
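For readers who want to check such files before further analysis, a minimal FASTA reader is enough to count the sequences and inspect their lengths. The sketch below uses only the Python standard library. The record names and sequences in the example are placeholders, not entries from the actual files.

```python
def read_fasta(text: str) -> dict:
    """Parse FASTA-formatted text into {record_name: sequence}."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            # Use the first word of the header; fall back to a counter.
            name = header[0] if header else f"seq{len(records) + 1}"
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


# Placeholder records, invented for illustration only.
EXAMPLE_FASTA = """>seqA hypothetical CYP4G1-type entry
MKDELLAA
GSTDDE
>seqB hypothetical CYP4G15-type entry
MRDEALKK
"""

records = read_fasta(EXAMPLE_FASTA)
print(len(records), {name: len(seq) for name, seq in records.items()})
```

To use it on a downloaded file, read the file contents into a string first (for example with `pathlib.Path(...).read_text()`) and pass that string to `read_fasta`.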