Alternative splicing of arthropod P450s
Manual curation of P450 sequences has probably eliminated most obvious errors in gene structure, but only deep transcriptomes and high-coverage genomes will further improve annotation quality and shed more light on alternative splicing of P450 transcripts.
Alternative splicing of human P450s has been well documented (Annalora et al., 2017).
The nomenclature rules for alternatively spliced P450s follow Nelson et al. (2004): the locus is named, and the alternatively spliced exons are named with the _v#a and _v#b suffixes.
Thus in Drosophila melanogaster, alternative splice forms of CYP4D1 are known (Tijet et al., 2001; Chung et al., 2009; Good et al., 2014). They differ by their first exon and should be named CYP4D1_v1a and CYP4D1_v1b.
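As an illustration of the suffix convention, the following minimal Python sketch assembles an isoform name from a locus name and the labels of its alternative exons. The function name isoform_name and its arguments are hypothetical and chosen only for this example; the sketch assumes the name is simply the locus, the "_v" separator, and the exon labels joined in order.

```python
def isoform_name(locus, alt_exons):
    """Build an isoform name such as CYP4D1_v1a.

    locus     -- locus name, e.g. "CYP4D1"
    alt_exons -- ordered labels of the alternative exons, e.g. ["1a"]
    (Hypothetical helper; assumes name = locus + "_v" + joined labels.)
    """
    if not alt_exons:
        raise ValueError("at least one alternative exon label is required")
    return f"{locus}_v{''.join(alt_exons)}"


# The two CYP4D1 splice forms differ only in their first exon.
for first_exon in ("1a", "1b"):
    print(isoform_name("CYP4D1", [first_exon]))
```

Passing several labels, for example ["1a", "2a"], gives names of the form locus_v1a2a for genes in which more than one exon is alternatively spliced.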
A remarkable example is found in Spodoptera species. In S. litura, CYP6AE50 has five alternatively spliced forms, with a common second (last) exon and different first exons. These five isoforms were named before it became apparent that they were five forms of the same gene, so they do not follow the nomenclature rules; for convenience, these names are kept. The five isoforms are called CYP6AE138 (most distal first exon), CYP6AE139, CYP6AE70, CYP6AE47 and CYP6AE50 (most proximal first exon). All are found as transcripts. Similarly, there are three isoforms in S. exigua and four in S. littoralis.
The figure shows the common second exon in dark blue, and the first exons in four Spodoptera species. The color code indicates orthology relationships. The exons marked by * have not yet been identified as transcripts. Note that two strains of S. frugiperda differ in their number of first exons.
Other examples of alternatively spliced forms are found in Bicyclus anynana (CYP4CG34 and CYP6AN59), Ectropis grisescens (CYP6AN36), Pieris rapae (CYP4L40) and Nezara viridula (3 genes).
In Bradysia coprophila, deep TSA coverage has confirmed two cases of alternative splicing in CYP4 family genes: CYP4ZL1_v1a2a has the distal exons 1a and 2a, while CYP4ZL1_v1b2b uses the two proximal exons 1b and 2b.
CYP4ZL3_v1a2a has the distal exons 1a and 2a, and CYP4ZL3_v1b1'b2b has the proximal exons 1b-1'b (corresponding to exon 1a split by an intron) and 2b. In both cases the isoforms share common last exons.
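Names of this kind can also be read back into their component exon labels. The sketch below is a hypothetical helper, not part of any published tool. It assumes that each label is a number, an optional prime mark ('), and a single lowercase letter, which covers the Bradysia names given here but may not cover every name in use.

```python
import re

# Assumed label shape: digits, optional prime mark(s), one lowercase letter,
# e.g. "1a", "2b", "1'b".
EXON_LABEL = re.compile(r"\d+'*[a-z]")


def parse_isoform(name):
    """Split an isoform name into (locus, [alternative exon labels])."""
    locus, sep, suffix = name.partition("_v")
    if not sep or not suffix:
        raise ValueError(f"no _v suffix in {name!r}")
    labels = EXON_LABEL.findall(suffix)
    # Reject suffixes containing anything other than well-formed labels.
    if "".join(labels) != suffix:
        raise ValueError(f"unrecognized exon labels in {name!r}")
    return locus, labels


for name in ("CYP4ZL1_v1a2a", "CYP4ZL1_v1b2b", "CYP4ZL3_v1b1'b2b"):
    print(parse_isoform(name))
```

Comparing the label lists of two isoforms of the same locus then shows directly which alternative exons distinguish them.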
In the absence of transcript evidence, it is difficult to distinguish partial genes or loose exons from alternatively spliced transcripts.
Partial genes or loose exons are a common feature of CYPomes. An interesting case is the CYP3367D3 structure found in the Centruroides sculpturatus CYPome. Exon-by-exon BLASTn searches revealed overlapping sets of 24 exons but only a single transcript of 11 exons, with an intricate genomic structure.
Here, only alternative splicing variants that differ in their open reading frame (ORF) are considered. Variants that affect only the 5' UTR are known, and in some cases are responsible for variations in expression patterns. Examples include the CYP4CE1 gene in Nilaparvata lugens, which has two alternative splice forms of the first exon (Liu et al., 2021), and the CYP6ER1 gene in the same species (Liang et al., 2018).