P450 genes/sequences at NCBI

These are of two kinds:

  • User submitted sequences that may or may not be correct.
  • Gene models predicted by automatic computational analysis. These RefSeq models (XP_nnnn) are extremely useful as a starting point, but their sequence can be incorrect, and their CYP name is usually wrong, especially for lesser-known, non-model species.

Genes annotated as P450s at NCBI are often shown with several isoforms under different accession numbers. This is almost always incorrect: just one of the “isoforms” corresponds to the correct sequence, while the others are either missing one or more exons or join two (or three) adjacent genes into one. Only in a few documented cases does alternative splicing give rise to bona fide P450 isoforms. Manual annotation of P450 genes from many arthropod species showed a 27% error rate in RefSeq P450 sequences (Dermauw et al., 2020).

CYP9P2 of the honeybee is an example of an intronless gene that is cut by RefSeq (XP_006562368.1): the N-terminal side is spliced to another gene, while the C-terminal side is not called. Yet there is full transcript support (HP578228) for the manually curated sequence. The RefSeq name (CYP9E2) is also incorrect, because CYP9E2 is a gene from Blattella germanica.

Another example, not uncommon at NCBI, is the fusion of two adjacent P450 genes, such as CYP9S1 and CYP9R1 in the honeybee (XP_026300962.1). This model can be split into its two genes, both fully supported by transcripts.
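
A crude first screen for truncated or fused models is protein length, since P450 proteins are typically around 500 amino acids long. The sketch below is only an illustration and not the curation procedure used for this page: the function name `flag_by_length`, both cutoffs (400 and 650 residues), and the example entries are assumptions chosen for the example.

```python
def flag_by_length(length_aa, min_len=400, max_len=650):
    """Return a rough label for a predicted P450 protein based on its length.

    The cutoffs are illustrative assumptions, not validated thresholds.
    """
    if length_aa < min_len:
        return "possibly truncated (missing exons?)"
    if length_aa > max_len:
        return "possibly fused (adjacent genes joined?)"
    return "length plausible"


# Hypothetical model names and lengths, for illustration only.
models = {"model_A": 512, "model_B": 1019, "model_C": 287}
for name, length in models.items():
    print(name, length, flag_by_length(length))
```

A plausible length does not prove that a model is correct: a gene cut and spliced to a neighbour, as in the CYP9P2 case, can still fall within the expected range, so any model still needs checking against transcript evidence.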

Searching NCBI for “P450” or “CYP” does not retrieve all the sequences, so a BLAST search is needed as well. Some P450s are not recognized as such but can be found as “low quality protein” or “uncharacterized”, or they carry a common name that is not a CYP name, e.g. a Drosophila gene name (“shade”) for a non-Drosophila gene, or the common name of a related vertebrate P450 such as “thromboxane synthase” or “vitamin D 25-hydroxylase-like”.
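
The naming pitfalls above can be turned into a simple triage of BLAST hit titles. The following is a minimal sketch under stated assumptions: the function `triage_title`, the keyword lists, and the example titles are hypothetical and only mirror the cases mentioned in the text. It works on title strings you already have and does not query NCBI.

```python
import re

# Matches "P450" anywhere, or a CYP-style name such as CYP9E2 (assumed pattern).
CYP_PATTERN = re.compile(r"p450|\bcyp\d", re.IGNORECASE)

# Labels and names taken from the text above (illustrative, not exhaustive).
GENERIC_LABELS = ("low quality protein", "uncharacterized")
NON_CYP_NAMES = ("shade", "thromboxane synthase", "vitamin d 25-hydroxylase")


def triage_title(title):
    """Return notes on why a hit title deserves a closer look (empty if none)."""
    lowered = title.lower()
    notes = []
    if not CYP_PATTERN.search(title):
        notes.append("missed by P450/CYP text search")
    if any(label in lowered for label in GENERIC_LABELS):
        notes.append("generic label")
    if any(name in lowered for name in NON_CYP_NAMES):
        notes.append("non-CYP common name")
    return notes


# Hypothetical hit titles, for illustration only.
for title in ("cytochrome P450 9e2", "protein shade", "uncharacterized protein LOC0000"):
    print(title, "->", triage_title(title) or ["found by text search"])
```

An empty result only means that a keyword search would have found the entry; it says nothing about whether the sequence or the CYP name is correct.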

The screen shots below illustrate the problems:

problems_at_ncbi.txt · Last modified: 2024/05/27 12:05 by renefeyereisen