Chromosome 4 open reading frame 50

Chromosome 4 open reading frame 50 is a protein that in humans is encoded by the C4orf50 gene.[1] The protein localizes to the nucleus.[1] C4orf50 has orthologs in vertebrates but not in invertebrates.[2]

Gene

Chromosome 4 with gene C4orf50 marked

The C4orf50 gene is located on the minus strand of chromosome 4 at position 4p16.2.[1][3] The gene's longest isoform consists of 11 exons, a coding sequence of 6370 nucleotides, and an upstream in-frame stop codon.[4] Neighboring genes include CRMP1 and JAKMIP1.[1]

Predicted tertiary structure of C4orf50 from Phyre2

Protein

C4orf50 is 1508 amino acids long and has a calculated molecular weight of approximately 166 kDa.[1] Its isoelectric point is approximately pH 5.6.[5] The protein has higher than normal amounts of glutamic acid and arginine, and lower than normal amounts of phenylalanine and tyrosine.[6]
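
As a rough sanity check on a figure of this kind, a protein's molecular weight can be estimated from its length alone. The sketch below assumes an average residue mass of about 110 Da, a common rule of thumb that does not come from the cited sources. The function name is hypothetical, and tools such as ExPASy Compute pI/Mw sum exact residue masses from the actual sequence instead.

```python
# Rough estimate of a protein's molecular weight from its length.
# Assumption: an average residue mass of about 110 Da (rule of thumb);
# sequence-based tools sum the exact mass of each residue instead.
AVERAGE_RESIDUE_MASS_DA = 110.0


def estimate_molecular_weight_kda(length_aa: int) -> float:
    """Return an approximate molecular weight in kDa for a protein of length_aa residues."""
    if length_aa <= 0:
        raise ValueError("length_aa must be a positive number of residues")
    return length_aa * AVERAGE_RESIDUE_MASS_DA / 1000.0


if __name__ == "__main__":
    # 1508 residues is the length reported for human C4orf50.
    print(f"{estimate_molecular_weight_kda(1508):.0f} kDa")
```

For a 1508-residue protein this estimate comes to roughly 166 kDa. The exact value depends on amino acid composition, so it should be read as an order-of-magnitude check only.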

Tertiary structure

I-TASSER and Phyre2 predict that C4orf50 has a tertiary structure rich in alpha helices, concentrated near the N-terminus and C-terminus.[7][8]

Predicted tertiary structure of C4orf50 from I-TASSER

Gene level regulation

Expression

C4orf50 RNA is expressed at low levels in most tissue types, with much higher expression in the brain, testis, adrenal gland, and prostate.[3] Within the brain, it is expressed in specific regions including the hippocampus and striatum.[3] Moderate expression is also seen in the frontal lobe, parietal lobe, and amygdala.[3] All available RNA-sequencing data show C4orf50 expression in the brain.

Protein level regulation

Modification

C4orf50 is predicted to have 21 phosphorylation sites, one sulfonation site, one N-glycosylation site, and several O-glycosylation sites.[9]

Immunohistochemical staining of C4orf50 in testis tissue, from Thermo Fisher

Subcellular localization

The primary subcellular location is the nucleus.[1] Immunofluorescent staining with C4orf50 antibodies shows that the protein is present in the nucleus, but the reason for this localization remains unknown.[10] C4orf50 is less abundant than most human proteins.[10]

Evolution

Orthologs

Corrected Sequence Divergence vs Estimated Date of Divergence. Blue indicates C4orf50. Red indicates Cytochrome C. Yellow indicates Fibrinogen alpha.

C4orf50 in Homo sapiens is poorly conserved. It is found in vertebrates but not invertebrates, with orthologs in mammals, reptiles, birds, amphibians, and fish.[11] Table 1 below lists representative orthologs from these groups. C4orf50 is evolving relatively quickly compared with the reference proteins cytochrome c and fibrinogen alpha, as shown in the accompanying plot of corrected sequence divergence against estimated date of divergence.
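
The article does not state how the corrected sequence divergence was computed. A minimal sketch, assuming the commonly used Poisson correction m = -ln(1 - n) × 100, where n is the fraction of differing residues, is given below. The function name and the identity values are hypothetical and are not taken from the ortholog table.

```python
import math


def corrected_divergence(identity_percent: float) -> float:
    """Poisson-corrected divergence (per 100 residues) from percent identity.

    Assumption: m = -ln(1 - n) * 100, with n the fraction of differing
    residues, so 1 - n is the identity fraction. The source does not
    confirm that this particular correction was used.
    """
    if not 0.0 < identity_percent <= 100.0:
        raise ValueError("identity_percent must be in the range (0, 100]")
    if identity_percent == 100.0:
        return 0.0  # identical sequences: no divergence
    return -math.log(identity_percent / 100.0) * 100.0


if __name__ == "__main__":
    # Hypothetical identities, chosen only to show how the correction grows.
    for identity in (100.0, 75.0, 50.0, 25.0):
        print(f"{identity:5.1f}% identity -> m = {corrected_divergence(identity):.1f}")
```

The correction grows faster than the raw percent difference because it accounts for multiple substitutions at the same site. Plotting m against divergence date for each protein gives a line whose slope reflects its relative rate of change.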

Table 1. Orthologs of human C4orf50.

Genus and Species | Common Name | Taxonomic Group | Median Date of Divergence (MYA*) | Accession # | Sequence Length (aa) | Sequence Identity to Human Protein (%) | Sequence Similarity to Human Protein (%)
Homo sapiens | Human | Primate | 0 | XP_047271622 | 1508 | 100 | 100
Tupaia chinensis | Chinese Tree Shrew | Tupaiidae | 85 | XP_027622007 | 1448 | 93 | 53.2
Mus musculus | House Mouse | Rodentia | 87 | XP_006504299 | 1238 | 90 | 41.9
Talpa occidentalis | Iberian Mole | Talpidae | 94 | XP_037386436 | 1364 | 79 | 44.3
Mauremys mutica | Yellow Pond Turtle | Testudines | 319 | XP_044874448 | 1954 | 62 | 30.5
Alligator mississippiensis | American Alligator | Crocodilia | 319 | XP_019333198 | 1893 | 37 | 28.3
Apteryx rowi | Okarito Kiwi | Apterygiformes | 319 | XP_025910622 | 1459 | 8 | 47.2
Aquila chrysaetos chrysaetos | Golden Eagle | Accipitriformes | 319 | XP_040979081 | 1611 | 10 | 38.3
Gallus gallus | Chicken | Galliformes | 319 | XP_046772670 | 1627 | 7 | 44.6
Anser cygnoides | Swan Goose | Anseriformes | 319 | XP_047902118 | 1596 | 18 | 31.7
Falco cherrug | Saker Falcon | Falconiformes | 319 | XP_027669980 | 1518 | 8 | 50.4
Strigops habroptila | Kakapo | Psittaciformes | 319 | XP_030347251 | 1497 | 8 | 50.4
Geotrypetes seraphini | Gaboon Caecilian | Dermophiidae | 353 | XP_033815404 | 1897 | 11 | 37.8
Halichoerus grypus | Grey Seal | Phocidae | 94 | XP_035960566 | 1536 | 85 | 51
Amblyraja radiata | Thorny Skate | Rajiformes | 464 | XP_032876992 | 2434 | 74 | 50.8

*MYA = Million Years Ago

References

  1. "C4orf50 Gene - GeneCards | CD050 Protein | CD050 Antibody". www.genecards.org. Retrieved 2022-07-29.
  2. "Home - Protein - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2022-07-29.
  3. "C4orf50 chromosome 4 open reading frame 50 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2022-07-29.
  4. "PREDICTED: Homo sapiens chromosome 4 open reading frame 50 (C4orf50), transcript variant X2, mRNA". 2022-04-05.
  5. "ExPASy - Compute pI/Mw tool". web.expasy.org. Retrieved 2022-07-29.
  6. "SAPS < Sequence Statistics < EMBL-EBI". www.ebi.ac.uk. Retrieved 2022-07-29.
  7. "Phyre2". www.sbg.bio.ic.ac.uk. http://www.sbg.bio.ic.ac.uk/~phyre2/html/. Retrieved 2022-07-29.
  8. "I-TASSER results". seq2fun.dcmb.med.umich.edu. Retrieved 2022-07-29.
  9. "Services". www.healthtech.dtu.dk. Retrieved 2022-07-29.
  10. "C4orf50 Antibody (PA5-63550)". www.thermofisher.com. Retrieved 2022-07-29.
  11. "BLAST: Basic Local Alignment Search Tool". blast.ncbi.nlm.nih.gov. Retrieved 2022-07-29.