These two CYP2 clan P450s involved in ecdysteroid metabolism are closely related in sequence and result from a duplication event, forming a strongly supported monophyletic clade. They are also mostly found in close synteny, with the two genes head to head in Drosophila and the honey bee (Niwa et al., 2004; Claudianos et al., 2006) as well as in Daphnia pulex (Rewitz and Gilbert, 2008). This arrangement is also found in Coleoptera, Hemiptera, Isoptera, Calopteryx splendens, and the collembolan Holacanthella duospinosa, but the CYP306 and CYP18 genes are head to tail in another collembolan, Orchesella cincta, and in the amphipods Hyalella azteca and Parhyale hawaiensis. They are also head to tail in the millipede Trigoniulus corallinus, but tail to tail in another millipede, Helicorthomorpha holstii. In Lepidoptera, CYP18 is duplicated, and the two genes have different tissue expression patterns (Li et al., 2014). The synteny of CYP306 (25-hydroxylase, Niwa et al., 2004; phm in Drosophila, Warren et al., 2004) and CYP18 (26-hydroxylase/oxidase, Guittard et al., 2011), maintained over 400 MY, is remarkable, given that the functions of the two genes are thought to be opposite (biosynthesis vs. inactivation, at least in Drosophila). CYP18A1 has been lost in Anopheles gambiae (Feyereisen, 2006), but this loss is restricted to the An. gambiae complex, as the gene is found in the closest species, An. christyi, and beyond (Neafsey et al., 2015). Similarly, we could not find CYP18 in the genome of Blattella germanica, nor in the TSA of any other related species in the “Blattellinae” (sensu Evangelista et al., 2019), although it is readily found in Blaberidae and other Blattodea.
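The terms head to head, head to tail and tail to tail used above follow directly from the strand and position of two neighbouring genes. The following minimal sketch shows one way to derive them from annotation coordinates. The `Gene` record, the function name and the coordinates in the example are hypothetical illustrations and are not taken from any of the genomes discussed; the sketch also assumes that the two genes lie on the same scaffold and do not overlap.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    start: int   # leftmost coordinate on the scaffold
    end: int     # rightmost coordinate on the scaffold
    strand: str  # "+" or "-", as in a GFF annotation


def relative_orientation(a: Gene, b: Gene) -> str:
    """Classify two non-overlapping genes on the same scaffold."""
    for gene in (a, b):
        if gene.strand not in ("+", "-"):
            raise ValueError(f"unknown strand for {gene.name}: {gene.strand!r}")

    # Order the pair by position so the result does not depend on argument order.
    left, right = sorted((a, b), key=lambda g: g.start)

    if left.strand == right.strand:
        # Same strand: the 3' end of one gene faces the 5' end of the other.
        return "head to tail"
    if left.strand == "-" and right.strand == "+":
        # Divergent transcription: the two 5' ends are adjacent.
        return "head to head"
    # Left gene on "+", right gene on "-": the two 3' ends are adjacent.
    return "tail to tail"


if __name__ == "__main__":
    # Hypothetical coordinates, for illustration only.
    gene_a = Gene("CYP306", 1000, 3000, "-")
    gene_b = Gene("CYP18", 3500, 6000, "+")
    print(relative_orientation(gene_a, gene_b))
```

Sorting the pair by start coordinate first keeps the classification symmetric, so the same answer is returned whichever gene is passed first.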
The Manduca sexta midgut P450 C26-hydroxylase activity is found in both microsomal and mitochondrial compartments (Williams et al., 2000). The lepidopteran midgut C26-hydroxylase would presumably be encoded by CYP18B1, as in B. mori (Li et al., 2014). Indeed, Manduca sexta CYP18B1 is predicted by DeepLoc 1.0 to have a mitochondrial location. While this is unusual for a CYP2 clan P450, it would explain the biochemical data (Williams et al., 2000), yet it would also suggest either the existence of a microsomal P450 with C26-hydroxylase activity or a cryptic microsomal targeting sequence in CYP18B1.
In Tetranychus urticae, both CYP18 and CYP306 are absent, and indeed ponasterone A (25-deoxy-20-hydroxyecdysone) has been identified in this mite (Grbic et al., 2011). These two genes are also missing in Varroa destructor, Metaseiulus occidentalis, and Neoseiulus cucumeris. In the latter case the two genes were misidentified (Fig. S3 in Zhang et al., 2019), a problem that can be avoided by studying all P450s, not just a few. CYP18 is present in Dermanyssus pteronyssinus, Sarcoptes scabiei, Psoroptes ovis, Aculops lycopersici (as a probable pseudogene), Ixodes scapularis, the common house spider and the wolf spider Pardosa pseudoannulata. We found two copies in scorpions and horseshoe crabs.
Thus, no chelicerate has a CYP306 gene, and the CYP18/CYP306 duplication may have occurred at the root of the Mandibulata (Myriapoda + Pancrustacea).
In millipede genomes (Trigoniulus corallinus and Helicorthomorpha holstii) the CYP18 and CYP306 pair of genes has generated close paralogs, with three transcripts also seen in Chamberlinius hualienensis. In the salmon louse (a copepod) we confirm that there is no CYP306 (Humble et al., 2019), so the origin of the C25 hydroxyl group of E and 20E found in this species (Sandlund et al., 2018) is unclear. The two other copepod species that we studied do carry a CYP306 gene. Similarly, we did not find CYP306 in the TSA or genome of the collembolan Folsomia candida, but there is a CYP306 gene in other collembolans: Sinella curviseta, Holacanthella duospinosa and Orchesella cincta.
CYP307 are enigmatic P450s of the CYP2 clan. Namiki et al. (2005) first showed that the Bombyx mori gene and its Drosophila melanogaster homolog (aka spook) were involved in ecdysone biosynthesis. Drosophila melanogaster has two CYP307 genes, CYP307A1 (spook) and CYP307A2. The latter was found as a putative pseudogene in the initial genome release (Tijet et al., 2001), but was later obtained from a difficult-to-sequence heterochromatic region (3R-47.1). The function of the CYP307 enzymes is still unknown 15 years after being characterized as a “key area in studies of ecdysteroid biosynthesis” (Namiki et al., 2005), but it is thought that CYP307 are involved in the “black box” steps downstream of 7-dehydrocholesterol. CYP307 genes are known to be “unstable”, with multiple instances of birth and death (Sztal et al., 2007; Rewitz and Gilbert, 2008; Sezutsu et al., 2013). Dermauw et al. (2020) expanded upon these earlier results and confirmed the presence of two orthologous groups in Pancrustacea, the CYP307A and CYP307B genes. These cover Hexapoda, Branchiopoda (Daphnia pulex) and Copepoda within “Multicrustacea” (sensu Schwentner et al., 2017). Depending on the lineage, these two groups are present either together or singly, with no phylogenetic pattern.
Rewitz and Gilbert (2008) noticed the tail to tail location of CYP307A2 and the neverland (nvd) gene in Daphnia pulex, Anopheles gambiae and Drosophila willistoni. The nvd gene encodes a cholesterol 7-dehydrogenase, i.e. an enzyme upstream of CYP307 in ecdysteroid biosynthesis (Yoshiyama et al., 2011). Dermauw et al. (2020) found that in Drosophila melanogaster the two genes are only 0.1 cM apart. The synteny is also maintained in Aedes aegypti, at least four lepidopteran species, as well as Nasonia vitripennis, Cephus cinctus, Neodiprion lecontei, Athalia rosae (tail to tail), Myzus persicae, Frankliniella occidentalis, Pediculus humanus and Zootermopsis nevadensis. In Aphis gossypii there are four CYP307 genes: one CYP307A (XP_027846257) located tail to tail with an nvd pseudogene (XP_027846250) and nvd (XP_027846249), two more in a tandem array (XP_027848023 and XP_027848010), and one CYP307B gene (XP_027854219). CYP307A2 and neverland therefore form a functional cluster.
More distant from the CYP307A and CYP307B sequences are clades of CYP307 in Malacostraca, Cirripedia and Diplura, whose relationships to the CYP307A/B clades are unclear. There are two more clades: one (100/100/99) includes Strigamia CYP307 and most Chelicerata, and another (100/100/100) is specific to Acariformes. In none of these genomes did we detect synteny of CYP307 with nvd. While the TSA of the millipede Chamberlinius hualienensis revealed no CYP307, its presence in millipedes can be shown in the genomes of H. holstii and Trigoniulus corallinus. In the bark scorpion, CYP307M was recently duplicated, with the two genes differing by just four nucleotides but having different neighboring genes, while the horseshoe crab has three CYP307 genes.
All CYP307 sequences share unusual structural features. In place of the conserved C-helix motif GxxWxEQRR of the CYP2 clan, CYP307 has an AxCDWSxxQxxRR motif. The I helix of CYP307 lacks the conserved Thr, and has a LEDxxGGHSAvvN consensus where the CYP2 clan has LxDLFxAGx(E/D)TTS. The conserved ExxR and PERF motifs are present, and the Cys pocket motif has the consensus FxPFxxGxRxCxG. We believe that these structural features will prove determinant in explaining the complex reaction(s) catalyzed by CYP307 enzymes. Until now, no clear evidence has been presented for any intermediate between 7-dehydrocholesterol and a putative “Δ4-diketol” precursor, which would then be reduced to the most likely product of the “black box”, the “5β-diketol”, and, depending on the species, further to the “ketodiol” (2,22,25-trideoxyecdysone) (Lafont et al., 2012). Furthermore, there is no evidence that the introduction of the 14α-hydroxyl group is independent of the formation of the conjugated 7-ene-6-one moiety. That the complex and poorly understood reactions of this famed Black Box should be catalyzed by the product of an “unstable” gene such as CYP307 is somewhat paradoxical. The high degree of CYP307 “instability” contrasts with the stability and high conservation of the other P450 genes of ecdysteroid biosynthesis. It is possible that different CYP307 have different substrates but a similar product, so that comparative biochemistry may resolve the paradox, and/or that CYP307 duplications allow different timing and sites of expression, as in Drosophila melanogaster and Nilaparvata lugens (Ono et al., 2006; Zhou et al., 2020).
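Consensus motifs written in this notation can be turned into search patterns for screening candidate sequences. The following is a minimal sketch, not a validated annotation tool. It assumes that "x" stands for any residue, that "(E/D)" denotes alternative residues at one position, and that other lowercase letters (as in LEDxxGGHSAvvN) can be treated as the literal residue, which is a simplification. The function names and the toy peptide are hypothetical and do not correspond to a real CYP307 sequence.

```python
import re


def consensus_to_regex(consensus: str) -> re.Pattern:
    """Translate a consensus such as 'LxDLFxAGx(E/D)TTS' into a compiled regex."""
    parts = []
    i = 0
    while i < len(consensus):
        ch = consensus[i]
        if ch == "(":
            # '(E/D)' lists alternative residues at a single position.
            close = consensus.index(")", i)
            alternatives = consensus[i + 1:close].split("/")
            parts.append("[" + "".join(alternatives) + "]")
            i = close + 1
        elif ch == "x":
            # 'x' is taken to mean any residue (assumption).
            parts.append("[A-Z]")
            i += 1
        else:
            # Any other letter is treated as a literal residue (assumption).
            parts.append(ch.upper())
            i += 1
    return re.compile("".join(parts))


def find_motif(consensus: str, protein: str) -> list[tuple[int, str]]:
    """Return (0-based position, matched residues) for non-overlapping hits."""
    pattern = consensus_to_regex(consensus)
    return [(m.start(), m.group()) for m in pattern.finditer(protein.upper())]


if __name__ == "__main__":
    # Toy peptide made up for illustration; it is not a real P450 sequence.
    toy_peptide = "MAAFLPFSAGPRMCIGKK"
    print(find_motif("FxPFxxGxRxCxG", toy_peptide))
```

A stricter screen would also check that the motifs occur in the expected order along the protein (C-helix, I helix, ExxR, PERF, Cys pocket), which this sketch does not attempt.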
Shi et al. (2022) reported a modest O-demethylation activity of Helicoverpa armigera CYP307A2 towards 7-benzyloxymethoxy resorufin (BOMR).
Next to CYP15A/C in the phylogeny is the CYP305 clade. Synteny relationships indicate that CYP15 and CYP305 are neighbouring genes and that this synteny is recognizable from termites to mosquitoes.
Phylogeny of the CYP15/CYP305 locus (Fig. S5 from Dermauw et al., 2020).
The function of the CYP305 genes is still unknown, but they are found in most Neoptera as a single gene.
In gregarious locusts, one of the paralogs, CYP305M2, controls biosynthesis of the defense compound phenylacetonitrile (Wei et al., 2019), suggesting that it plays a regulatory role in phase determination.
CYP303A1 is a strongly supported clade (100/100/100) with generally a single gene for each species, but it is duplicated in the Argentine ant, Linepithema humile, and in the carpenter ant, Camponotus floridanus, where the two genes are in a tandem array. It is also duplicated in the damselfly Calopteryx splendens. CYP303 was not found in genomes or TSA beyond winged insects. The CYP303A1 sequences are highly conserved, with a length of 498 ± 4 amino acids, yet the hymenopteran orthologs are much longer (Apis mellifera 562 aa, Nasonia vitripennis 587 aa). The difference is a single long insertion, confirmed by the TSA of a variety of species. This insertion is predicted to be located between helices D and E, thus on the outside of the globular P450 structure and away from the ER membrane surface. Apart from the exceptions noted above, and possibly others, CYP303A1 is mostly a single-copy gene, “stable” in insects, yet there is a bloom of CYP303 genes in fireflies (Fallon et al., 2018). The genome of Photinus pyralis carries 11 genes and two pseudogenes that are all paralogs of CYP303A1. This appears to be related to the biosynthesis of defensive compounds (lucibufagins), because related fireflies that do not make lucibufagins have only a single CYP303 gene (Fallon et al., 2018). Making these polyhydroxylated sterols from cholesterol may require 6-7 P450 reactions. While the enzymatic details of lucibufagin biosynthesis are unknown, the firefly case suggests that the original CYP303A1 is already able to metabolize a sterol or terpenoid structure. Its endogenous substrate is still unknown, but the conserved regulatory function of the gene in Drosophila and locusts also points to a signal molecule, possibly a hormone (Wu et al., 2019; 2020b).
Shi et al. (2022) reported a modest O-demethylation activity of Helicoverpa armigera CYP303A1 towards 7-benzyloxymethoxy resorufin (BOMR), and significant metabolism of 2-tridecanone, although the metabolic product of the reaction was not identified.