CYP genes, P450 enzymes
CYP genes encoding P450 enzymes are consistently found among the largest gene families in plants, animals and fungi. These enzymes are at the interface of environmental responses, metabolism and endocrine regulation: they catalyze the transformation of a myriad of exogenous and endogenous substrates by hydroxylation, epoxidation, dealkylation and a great variety of other reactions. This wiki focuses on arthropod P450s.
The name cytochrome P450 dates back to the original paper of Omura and Sato (1962) characterizing the rabbit liver pigment with a prominent peak of its Fe(II)-CO complex at 450 nm as a hemoprotein (Omura and Sato, 1964a,b). This pigment had first been observed in rat and pig liver microsomes by Klingenberg (1958) and Garfinkel (1958). For a historical context see Omura, 2011. Elements of a chronology of arthropod P450 research have been collected here.
It turned out that cytochromes P450 are not cytochromes in the strict biochemical sense, but heme-thiolate proteins (Mansuy, 1998). The term cytochrome has remained, although it is now easier to just refer to “P450 enzymes”. Many papers refer to cytochrome P450 monooxygenases, even though the reaction P450 enzymes catalyze is not always a monooxygenation. Also, P450s should not be confused with flavin monooxygenases (FMOs).
With gene cloning and later genome sequencing, the number of sequences increased rapidly and the CYP nomenclature was adopted in the late 1980s for genes encoding P450 enzymes. Soon, CYP and P450 became almost interchangeable, and many papers now refer to “CYPs” rather than “P450s”. It doesn't really matter, as long as the designation is precise. Here, we prefer to use P450 in general, and to use CYP when referring to a specific gene/protein.
P450 nomenclature
A CYP prefix followed by an Arabic numeral designates the family (all members nominally share >40% sequence identity), a capital letter designates the subfamily (all members nominally >55% identical) and a final Arabic numeral designates the individual gene (all italics) or transcript and protein (no italics).
(A termite P450 claimed the welcoming designation of CYP4U2 AF046011).
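The naming scheme above can be illustrated with a minimal Python sketch. The function names `parse_cyp_name` and `nominal_rank` are hypothetical and are not part of any official tool. The pattern assumes that a subfamily may carry more than one capital letter, as in CYP6CM1. The identity thresholds are only the nominal ones stated above; official names are assigned by curation, not by a fixed cutoff.

```python
import re
from typing import NamedTuple


class CypName(NamedTuple):
    family: int      # Arabic numeral after the CYP prefix
    subfamily: str   # capital letter(s) designating the subfamily
    gene: int        # numeral designating the individual gene


# Assumption: one or more capital letters for the subfamily (e.g. CYP6CM1).
_CYP_RE = re.compile(r"^CYP(\d+)([A-Z]+)(\d+)$")


def parse_cyp_name(name: str) -> CypName:
    """Split a plain CYP name (no pseudogene or allele suffix) into its parts."""
    match = _CYP_RE.match(name)
    if match is None:
        raise ValueError(f"not a plain CYP name: {name!r}")
    family, subfamily, gene = match.groups()
    return CypName(int(family), subfamily, int(gene))


def nominal_rank(identity_percent: float) -> str:
    """Apply the nominal >40% (family) and >55% (subfamily) thresholds."""
    if identity_percent > 55:
        return "same subfamily"
    if identity_percent > 40:
        return "same family"
    return "different family"
```

For example, `parse_cyp_name("CYP4U2")` gives family 4, subfamily U, gene 2. The rank function is only a first approximation and does not replace a request for an official name.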
CYP names are given by the “P450 nomenclature committee”, which currently means Dr. David R. Nelson (University of Tennessee) (email), to whom requests for official CYP names should be directed. As of April 17, 2023, David Nelson has named 22,877 insect P450s in 1043 CYP families.
Publishing about a P450 without an official CYP name, or worse, with an approximate, wrong or invented name, only causes confusion in the literature. It leads to errors in the interpretation of results; these errors are then compounded in later publications and waste research time.
WARNING: NCBI is an excellent resource, but a P450 (CYP) result in a BLAST search at NCBI should only be considered a starting point (for sequence) and indication of relatedness to known P450s:
(1) CYP name: A CYP name as found at NCBI is in most cases NOT an official CYP name. In fact, when NCBI names a P450 as “cytochrome P450 nNm-like” or “probable cytochrome P450 nNm”, this indicates that the CYPnNm name should not be used in publications, as doing so leads to confusion. The correct name should be found by BLAST on this site or by asking Dr. Nelson (email).
“The beginning of wisdom is to call things by their right name” (名正才能言順, Confucius, The Analects, Zi Lu 3).
(2) Sequence: Sequences as found at NCBI are often correct, but in too many cases they are not. Problems with sequences at NCBI are discussed here.
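Point (1) of the warning can be turned into a simple screening step. The sketch below flags descriptions that follow the two forms quoted above, “cytochrome P450 nNm-like” and “probable cytochrome P450 nNm”. The function name `is_provisional_ncbi_name` and the example strings are hypothetical, and the pattern is an assumption based only on those two forms, not a description of how NCBI generates its names.

```python
import re

# Assumption: the label nNm is digits, letter(s), digits, optionally
# preceded by "probable" and/or followed by "-like".
_PROVISIONAL_RE = re.compile(
    r"(?P<probable>probable\s+)?cytochrome P450 "
    r"(?P<label>\d+[A-Za-z]+\d+)(?P<like>-like)?",
    re.IGNORECASE,
)


def is_provisional_ncbi_name(description: str) -> bool:
    """Return True if the description carries a 'probable' or '-like' hedge."""
    match = _PROVISIONAL_RE.search(description)
    if match is None:
        return False
    return match.group("probable") is not None or match.group("like") is not None


# Hypothetical description strings, for illustration only.
examples = [
    "cytochrome P450 4C1-like",
    "probable cytochrome P450 6a13",
]
flagged = [d for d in examples if is_provisional_ncbi_name(d)]
```

A description that is not flagged is not thereby an official CYP name; the check only identifies names that are explicitly hedged.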
P450 sequences
Collections of manually annotated arthropod P450 sequences with their official CYP names are provided here.
There are currently very few confirmed examples of alternatively spliced arthropod P450 transcripts, although cases are well described for human P450s. Similarly, gene conversion events have not been studied systematically in arthropods, despite the many gene clusters that would be the “breeding grounds” for such events. Notable examples are listed here. Copy number variation (CNV) is increasingly appreciated as an important phenomenon at the intraspecific level (e.g. Lucas et al., 2019), which is mirrored at the level of interspecific comparisons by gene duplications. Currently there is no standard way of naming copy number variants of P450 genes.
CYPome size, or number of CYP genes in a genome, is a number commonly referred to in the literature, but it is not an absolute or fixed number. It is affected by the number of pseudogenes, the number of alternatively spliced genes, and by intraspecific copy number variation. Moreover, it is also dependent on the quality of the genome assembly (with the not infrequent inclusion of two alleles of the same gene), and the quality of the annotation.
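To make concrete why CYPome size is not a fixed number, the sketch below counts the same toy annotation set under different conventions. The `Annotation` record, its fields and the toy list are hypothetical; they only reuse gene names mentioned on this page and do not describe a real genome.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Annotation:
    name: str
    is_pseudogene: bool = False
    # Set when the entry is a second allele of a gene that the assembly
    # included as if it were a separate locus.
    duplicate_allele_of: Optional[str] = None


def cypome_size(
    annotations: List[Annotation],
    include_pseudogenes: bool = False,
    collapse_alleles: bool = True,
) -> int:
    """Count CYP entries under the chosen counting convention."""
    count = 0
    for entry in annotations:
        if entry.is_pseudogene and not include_pseudogenes:
            continue
        if entry.duplicate_allele_of is not None and collapse_alleles:
            continue
        count += 1
    return count


# Toy data, for illustration only.
toy = [
    Annotation("CYP9E2"),
    Annotation("CYP9E2P1", is_pseudogene=True),
    Annotation("CYP6B1"),
    Annotation("CYP6B1_allele2", duplicate_allele_of="CYP6B1"),
    Annotation("CYP4G2"),
]

strict = cypome_size(toy)
with_pseudogenes = cypome_size(toy, include_pseudogenes=True)
uncurated = cypome_size(toy, include_pseudogenes=True, collapse_alleles=False)
```

The same five entries give 3, 4 or 5 depending on the convention, which is why a reported CYPome size is only meaningful together with the counting rules used.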
Higher order nomenclature: CYP clans
CYP clans constitute a higher order of nomenclature, regrouping CYP families. Until recently, insect CYPomes were thought to be composed of sequences from just four clans, the CYP2, CYP3, CYP4 and mitochondrial clans (Feyereisen 2006, 2012). Recent work established the presence of P450s from additional clans, the CYP16 and CYP20 clans, and pointed to the intriguing presence of possible CYP19 clan sequences, as well as a CYP53 clan sequence in the fungus gnat Bradysia (acquired from fungi by horizontal gene transfer).
More about CYP clans
Pseudogenes and gene fragments
Pseudogenes are denoted by the suffix P. This suffix is added to the name of the closest paralog that is an active gene, e.g. CYP9E2 and CYP9E2P1 in Blattella germanica (Wen et al., 2001). However, this is not always done, as the closest paralog is sometimes not easily recognized. In that case, the pseudogene has its own gene root number.
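The P suffix can be recognized mechanically, as in the following sketch. The function name `split_pseudogene` is hypothetical, and the pattern assumes the form seen in CYP9E2P1: a full CYP name, then P, then an optional number. The function cannot tell whether the root is the name of an active paralog or the pseudogene's own gene number.

```python
import re
from typing import Optional, Tuple

# The P must follow the gene number, so a subfamily letter P (as in a
# name of the form CYP6P1) is not mistaken for the pseudogene suffix.
_PSEUDO_RE = re.compile(r"^(CYP\d+[A-Z]+\d+)P(\d*)$")


def split_pseudogene(name: str) -> Optional[Tuple[str, Optional[int]]]:
    """Return (root name, pseudogene number or None), or None if no P suffix."""
    match = _PSEUDO_RE.match(name)
    if match is None:
        return None
    root, index = match.groups()
    return root, (int(index) if index else None)
```

With this pattern, `split_pseudogene("CYP9E2P1")` returns `("CYP9E2", 1)`, while `split_pseudogene("CYP9E2")` returns `None`.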
Pseudogenes can differ by a single nucleotide (substitution or indel) leading to a premature stop or a frameshift (“young” pseudogenes). They can also be so degraded as to be hardly recognized as P450 genes (“old” pseudogenes).
Sequencing of different populations or strains of the same species can reveal that a pseudogene in one population is an active gene in another, and vice versa (e.g. CNV study in Tetranychus urticae in prep.).
A nomenclature for loose exons (solo exons, detritus exons), or internal duplicated or partial exons has been proposed (Nelson et al., 2004) but these rules are too cumbersome and are not in common use.
Alleles
Alleles of a gene are named with the subscripts v1, v2 (e.g. CYP6B1v2, Cohen et al., 1992). In a practical shortcut, CYP6CM1vQ and CYP6CM1vB designate the alleles of this gene found in the Q and B biotypes (now MED and MEAM1, respectively) of Bemisia tabaci (Karunker et al., 2008). In this case, the alleles differ in their capacity to metabolize neonicotinoids.
v1, v2 subscripts are not very common, and while they are found in the literature, they should be treated with caution. They may be associated with genes cloned before genome projects, so they may represent different genes that are very close in sequence, rather than (as intended) alleles of a single gene. Copy number variation (CNV) also results in (often duplicated) genes that are close in sequence.
Allelic variants of human P450s can be responsible for interindividual variation in drug metabolism and there is a dedicated website to document them and support research in pharmacogenetics.
P450 common names
Following the tradition that predates the CYP nomenclature, P450 enzymes can be named with a small suffix, such as P450cam, the camphor hydroxylase of Pseudomonas putida later named CYP101; P450BM3, the fatty acid hydroxylase of Bacillus megaterium (CYP102); or P450scc, the cholesterol side-chain cleavage enzyme of vertebrates (CYP11A1). In insects, few P450 enzymes have been named in this way. P450hyd (Reed et al., 1994) is the P450 forming hydrocarbons, later identified as CYP4G2 in the house fly (Qiu et al., 2012). P450Lpr is the predominant P450 in the pyrethroid-resistant strain Learn-Pyr of the house fly, later identified as CYP6D1 (Tomita and Scott, 1995). In the Drosophila gene nomenclature only the initial letter is capitalized, hence CYP6A1 in the house fly but Cyp6a2 in Drosophila melanogaster.
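The capitalization difference between the general CYP style and the Drosophila style can be handled with a small conversion, sketched below. The function names `to_drosophila_style` and `to_general_style` are hypothetical. The sketch covers only plain gene names and does not attempt to map common names such as P450cam.

```python
import re

# Accept either capitalization on input.
_CYP_ANY_CASE_RE = re.compile(r"^CYP(\d+)([A-Z]+)(\d+)$", re.IGNORECASE)


def _split(name: str):
    match = _CYP_ANY_CASE_RE.match(name)
    if match is None:
        raise ValueError(f"not a plain CYP name: {name!r}")
    return match.groups()


def to_drosophila_style(name: str) -> str:
    """Only the initial letter is capitalized, e.g. Cyp6a2."""
    family, subfamily, gene = _split(name)
    return f"Cyp{family}{subfamily.lower()}{gene}"


def to_general_style(name: str) -> str:
    """All letters capitalized, e.g. CYP6A1."""
    family, subfamily, gene = _split(name)
    return f"CYP{family}{subfamily.upper()}{gene}"
```

Normalizing to one style before comparing names avoids treating Cyp6a2 and CYP6A2 as different labels when they differ only in capitalization.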
Several of the so-called “Halloween genes”, originally from the Drosophila literature, are particular cases where the use of common (actually rather uncommon) names obscures the identity of the genes as P450s in the molting hormone biosynthetic pathway.