RetrogeneDB ID: retro_hsap_3220

Retrocopy location
Organism:Human (Homo sapiens)
Coordinates:5:96704479..96705548(+)
Located in intron of:ENSG00000251513
Retrocopy information
Ensembl ID:ENSG00000251389
Aliases:None
Status:KNOWN_PSEUDOGENE
Parental gene information
Parental gene summary:
Parental gene symbol:YTHDF1
Ensembl ID:ENSG00000149658
Aliases:YTHDF1, C20orf21
Description:YTH domain family, member 1 [Source:HGNC Symbol;Acc:15867]


Retrocopy-Parental alignment summary:






>retro_hsap_3220
AGCATGGACACTTAGAGAACAAAAGGACAAGATAATAAAGTACACAATGATTCTTTGCATCAGAAGGATGTGGTTCATGA
CAATGACTTTCAGCCCTACCTTTCTGGAGAGTCAAATTAGAGTAACAATTACCCTTCAGTGAGTGACCTCTACCTGTCCG
GCTATTACCCACCATCCACTGGAGTTCCTTAATCCCTCAATGAGGCTCTGTGGTCCACTGCAAGGGACCCTTTGATTCTA
CACCTCATCACCTATGAACAGCTCAGTAACAGAGATCATAATTTTATGCACAATTCTATTTTGGGGTGGCCTTGGGGTCC
GGAGAACAAAATCTATCAGCACAGGTTTAATTTTTACCCCCAAACCCTGTATTCTTGGCATGGAGGACAAGTGGGTCTCA
AGGGTAGCAGATTCAGAGCTCTACATATGGGAGCAGCTACACCTACCCCCAAGCTCCCTGGGCCACGCAGTGGTTGATGG
GCAGATGGGCTTTCACAGCGACACCCTTAGCAAGGCCCCTGGGGTGAACAGCCTGAAGCAGGGCATAGTCAGCCTGAAAA
TTGGGGATGTCATCTCCTCTGCAGTCAAGACAGTGGGCTCTGTCATCAGCAGTGTGGCACTGATTGGCATACTTTCTGGC
AATGATGGGACATATGTGAACATGGCAGTTTCAAAGCTGACCTCATGGGCCGCCATTGCCAGCAAGCCTGCAAAACCACG
GCCTAAAATGGAAACAAGGAGTGGGCCTGTCATAGGGGGGTGCATTGCCCCCTCCATCTATAAAGCATAACATGGGTATT
GACACCCAGGATAACAAGGGGCCTGTGCTGAAGACCCCGGCCCCCCAGGAGACATGATCCTCACAGGCTGTCCTACAACC
CCAGAAGGTGCTTCAGCTTCTCCCAGTGAAGCTTCCTGCTTTGGCTCAACCACAGTATCAGAGCCCTCAGCAGCCATTCC
AGACCTGCTGGATCATTCCTCACGTGGAAATATGGCATTTGGGCAGAGCAGAGGGGCTGACAGTGGTAGTAACTCTCCTG
GAAATACCAAGACTAATTCTGCCCCAAGT

ORF - retro_hsap_3220 Open Reading Frame is not conserved.
Retrocopy - Parental Gene Alignment summary:
Percent Identity: 72.58 %
Parental protein coverage: 63.86 %
Number of stop codons detected: 5
Number of frameshifts detected: 4


Retrocopy - Parental Gene Alignment:

Parental   SVDTQRTKGQDNKVQNGSLHQKDTVHDNDFEPYLTGQSNQSNSYPSMSDPYLSSYYPPSIGFPYSLNEAP
           S.DT.RTKGQDNKV.N.SLHQKD.VHDNDF.PYL.G.SN.SN.YPS.SD.YLS.YYPPS.G.P.SLNEA.
Retrocopy  SMDT*RTKGQDNKVHNDSLHQKDVVHDNDFQPYLSGESN*SNNYPSVSDLYLSGYYPPSTGVP*SLNEAL

Parental   WSTAGDPPIPYLTTYGQLSNGDHHFMHDAVFGQPGGLGNNIYQHRFNFFPE-NPAFSAWGTSGSQGQQTQ
           WSTA.DP.I..L.TY.QLSN.DH.FMH....G.P.G..N.IYQHRFNF.P..NP.F.AW.TSGSQG.Q.Q
Retrocopy  WSTARDPLILHLITYEQLSNRDHNFMHNSILGWPWGPENKIYQHRFNFYPQ<NPVFLAWRTSGSQG*QIQ

Parental   SSAYGSSYTY-PPSSLGGTVVDGQPGFHSDTLSKAPGMNSLEQGMVGLKIGDVSSSAVKTVGSVVSSVAL
           SS.YGSSYTY.PPSSLG..VVDGQ.GFHSDTLSKAPG.NSL.QG.V.LKIGDV.SSAVKTVGSV.SSVAL
Retrocopy  SSTYGSSYTY<PPSSLGHAVVDGQMGFHSDTLSKAPGVNSLKQGIVSLKIGDVISSAVKTVGSVISSVAL

Parental   TGVLSGNGGTNVNMPVSKPTSWAAIASKPAKPQPKMKTKSGPVM-GGGLPPPPIKHNMDIGTWDNKGPVP
           .G.LSGN.GT.VNM.VSK.TSWAAIASKPAKP.PKM.T.SGPV..GG.LPPP.IKHNM.I.T.DNKGPV.
Retrocopy  IGILSGNDGTYVNMAVSKLTSWAAIASKPAKPRPKMETRSGPVI>GGALPPPSIKHNMGIDTQDNKGPVL

Parental   KAPVPQQAPSPQAAPQPQQVAQPLPAQPPALAQPQYQSPQQPPQTRWVAPR-NRNAAFGQSGGAGSDSNS
           K.P.PQ...S.QA..QPQ.V.Q.LP...PALAQPQYQSPQQP.QT.W..P....N.AFGQS.GA.S.SNS
Retrocopy  KTPAPQET*SSQAVLQPQKVLQLLPVKLPALAQPQYQSPQQPFQTCWIIPH<RGNMAFGQSRGADSGSNS

Parental   PGNVQPNSAPS
           PGN...NSAPS
Retrocopy  PGNTKTNSAPS

Legend:
*Stop codon
>Forward frameshift by one nucleotide
<Reverse frameshift by one nucleotide
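
The stop-codon and frameshift totals in the alignment summary can be recounted from the retrocopy rows of the alignment using the legend symbols. The Python snippet below is a minimal sketch of that check; the name `count_disruptions` is hypothetical and is not part of RetrogeneDB, and the rows are copied from the alignment above with the row labels removed.

```python
# Retrocopy rows copied from the alignment above (row labels removed).
RETROCOPY_ROWS = [
    "SMDT*RTKGQDNKVHNDSLHQKDVVHDNDFQPYLSGESN*SNNYPSVSDLYLSGYYPPSTGVP*SLNEAL",
    "WSTARDPLILHLITYEQLSNRDHNFMHNSILGWPWGPENKIYQHRFNFYPQ<NPVFLAWRTSGSQG*QIQ",
    "SSTYGSSYTY<PPSSLGHAVVDGQMGFHSDTLSKAPGVNSLKQGIVSLKIGDVISSAVKTVGSVISSVAL",
    "IGILSGNDGTYVNMAVSKLTSWAAIASKPAKPRPKMETRSGPVI>GGALPPPSIKHNMGIDTQDNKGPVL",
    "KTPAPQET*SSQAVLQPQKVLQLLPVKLPALAQPQYQSPQQPFQTCWIIPH<RGNMAFGQSRGADSGSNS",
    "PGNTKTNSAPS",
]


def count_disruptions(rows):
    """Count legend symbols: '*' (stop codon), '>' and '<' (frameshifts)."""
    seq = "".join(rows)
    return {
        "stop_codons": seq.count("*"),
        "forward_frameshifts": seq.count(">"),
        "reverse_frameshifts": seq.count("<"),
    }


counts = count_disruptions(RETROCOPY_ROWS)
total_frameshifts = counts["forward_frameshifts"] + counts["reverse_frameshifts"]
print(counts, total_frameshifts)
```

Counting symbols on the joined rows works here because the legend characters never occur as amino-acid codes.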







Expression validation based on RNA-Seq data:
Library                      Retrocopy expression   Parental gene expression
bodymap2_adipose             0.02 RPM               61.96 RPM
bodymap2_adrenal             0.12 RPM               80.90 RPM
bodymap2_brain               0.05 RPM               87.34 RPM
bodymap2_breast              0.00 RPM               57.28 RPM
bodymap2_colon               0.00 RPM               37.45 RPM
bodymap2_heart               0.00 RPM               30.24 RPM
bodymap2_kidney              0.00 RPM               54.45 RPM
bodymap2_liver               0.00 RPM               36.23 RPM
bodymap2_lung                0.00 RPM               56.65 RPM
bodymap2_lymph_node          0.00 RPM               64.56 RPM
bodymap2_ovary               0.04 RPM               72.92 RPM
bodymap2_prostate            0.00 RPM               42.71 RPM
bodymap2_skeletal_muscle     0.00 RPM               66.15 RPM
bodymap2_testis              0.00 RPM               77.86 RPM
bodymap2_thyroid             0.09 RPM               80.39 RPM
bodymap2_white_blood_cells   0.00 RPM               65.60 RPM
RNA Polymerase II activity near the 5' end of retro_hsap_3220 was not detected.
No EST(s) were mapped for retro_hsap_3220 retrocopy.
No TSS is located near the 5' end of the retro_hsap_3220 retrocopy.
retro_hsap_3220 was not experimentally validated.
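
To pick out the libraries in which the retrocopy shows any signal, the RNA-Seq table can be parsed with a short script. The sketch below is illustrative only: `parse_expression` is a hypothetical helper, the table text is pasted in by hand, and the regular expression also tolerates a stray space before the decimal point, which can appear in text extracted from the page.

```python
import re

# RNA-Seq table rows: library, retrocopy RPM, parental gene RPM.
TABLE = """\
bodymap2_adipose 0.02 RPM 61.96 RPM
bodymap2_adrenal 0.12 RPM 80.90 RPM
bodymap2_brain 0.05 RPM 87.34 RPM
bodymap2_breast 0.00 RPM 57.28 RPM
bodymap2_colon 0.00 RPM 37.45 RPM
bodymap2_heart 0.00 RPM 30.24 RPM
bodymap2_kidney 0.00 RPM 54.45 RPM
bodymap2_liver 0.00 RPM 36.23 RPM
bodymap2_lung 0.00 RPM 56.65 RPM
bodymap2_lymph_node 0.00 RPM 64.56 RPM
bodymap2_ovary 0.04 RPM 72.92 RPM
bodymap2_prostate 0.00 RPM 42.71 RPM
bodymap2_skeletal_muscle 0.00 RPM 66.15 RPM
bodymap2_testis 0.00 RPM 77.86 RPM
bodymap2_thyroid 0.09 RPM 80.39 RPM
bodymap2_white_blood_cells 0.00 RPM 65.60 RPM
"""

# Accepts both "0.02 RPM" and "0 .02 RPM".
ROW = re.compile(
    r"^(\S+)\s+(\d+)\s*\.(\d+)\s*RPM\s+(\d+)\s*\.(\d+)\s*RPM$"
)


def parse_expression(text):
    """Return {library: (retrocopy_rpm, parental_rpm)} for matching rows."""
    rows = {}
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if m:
            lib, r_int, r_frac, p_int, p_frac = m.groups()
            rows[lib] = (float(f"{r_int}.{r_frac}"), float(f"{p_int}.{p_frac}"))
    return rows


expression = parse_expression(TABLE)
# Libraries where the retrocopy has non-zero expression.
expressed = sorted(lib for lib, (retro, _) in expression.items() if retro > 0)
print(expressed)
```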

Retrocopy orthology:
Retrocopy retro_hsap_3220 has 2 orthologous retrocopies within the Eutheria group.

Species RetrogeneDB ID
Pan troglodytes retro_ptro_2170
Gorilla gorilla retro_ggor_2186

Parental genes homology:
Parental gene homology involves 4 parental genes and 4 retrocopies.

Species               Parental gene accession   Number of retrocopies
Homo sapiens          ENSG00000149658           1 retrocopy (retro_hsap_3220)
Gorilla gorilla       ENSGGOG00000008340        1 retrocopy
Nomascus leucogenys   ENSNLEG00000009975        1 retrocopy
Pan troglodytes       ENSPTRG00000013730        1 retrocopy

Expression level across human populations:
[Plot: retro_hsap_3220 expression per individual in five populations: CEU (Utah Residents (CEPH) with Northern and Western European Ancestry), FIN (Finnish in Finland), GBR (British in England and Scotland), TSI (Toscani in Italia), YRI (Yoruba in Ibadan, Nigeria). Legend: no expression (= 0 RPM); > 0 RPM; = 0.14 RPM.]


Library       Retrogene expression
CEU_NA11831   0.04 RPM
CEU_NA11843   0.03 RPM
CEU_NA11930   0.03 RPM
CEU_NA12004   0.00 RPM
CEU_NA12400   0.04 RPM
CEU_NA12751   0.10 RPM
CEU_NA12760   0.00 RPM
CEU_NA12827   0.14 RPM
CEU_NA12872   0.03 RPM
CEU_NA12873   0.00 RPM
FIN_HG00183   0.05 RPM
FIN_HG00277   0.00 RPM
FIN_HG00315   0.00 RPM
FIN_HG00321   0.00 RPM
FIN_HG00328   0.02 RPM
FIN_HG00338   0.04 RPM
FIN_HG00349   0.06 RPM
FIN_HG00375   0.00 RPM
FIN_HG00377   0.03 RPM
FIN_HG00378   0.00 RPM
GBR_HG00099   0.00 RPM
GBR_HG00111   0.02 RPM
GBR_HG00114   0.00 RPM
GBR_HG00119   0.02 RPM
GBR_HG00131   0.06 RPM
GBR_HG00133   0.00 RPM
GBR_HG00134   0.07 RPM
GBR_HG00137   0.00 RPM
GBR_HG00142   0.03 RPM
GBR_HG00143   0.00 RPM
TSI_NA20512   0.00 RPM
TSI_NA20513   0.05 RPM
TSI_NA20518   0.00 RPM
TSI_NA20532   0.03 RPM
TSI_NA20538   0.00 RPM
TSI_NA20756   0.03 RPM
TSI_NA20765   0.00 RPM
TSI_NA20771   0.03 RPM
TSI_NA20786   0.00 RPM
TSI_NA20798   0.00 RPM
YRI_NA18870   0.10 RPM
YRI_NA18907   0.00 RPM
YRI_NA18916   0.00 RPM
YRI_NA19093   0.00 RPM
YRI_NA19099   0.00 RPM
YRI_NA19114   0.00 RPM
YRI_NA19118   0.00 RPM
YRI_NA19213   0.05 RPM
YRI_NA19214   0.00 RPM
YRI_NA19223   0.04 RPM


Indel association:

The presence of retro_hsap_3220 across human populations is associated with 1 indel. The percentage values indicate the frequencies of retro_hsap_3220 presence in various populations. Based on Kabza et al. 2015 (PubMed).


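Population groups: AFR, African (ASW, YRI, LWK); AMR, Admixed American (MXL, PUR, CLM); EUR, European (CEU, IBS, GBR, FIN, TSI); EAS, East Asian (JPT, CHB, CHS). Values are percentages.

#    Indel coordinates      ASW   YRI   LWK     MXL   PUR   CLM   CEU   IBS   GBR   FIN   TSI   JPT   CHB   CHS
1.   5:96704395..96705200   100   100   98.97   100   100   100   100   100   100   100   100   100   100   100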


Indel #1, located at the genomic coordinates 5:96704395..96705200.
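
The indel interval and the retrocopy interval listed on this page overlap only partially, which is easy to check by hand or in code. The following sketch parses the `chrom:start..end` notation used here and computes the shared span; `parse_region` and `overlap_length` are hypothetical helper names, and the calculation assumes 1-based inclusive coordinates, which the page does not state explicitly.

```python
import re


def parse_region(text):
    """Parse 'chrom:start..end' with an optional '(+)' or '(-)' strand suffix."""
    m = re.fullmatch(r"(\w+):(\d+)\.\.(\d+)(?:\(([+-])\))?", text)
    if m is None:
        raise ValueError(f"unrecognised region: {text!r}")
    chrom, start, end, strand = m.groups()
    return chrom, int(start), int(end), strand


def overlap_length(a, b):
    """Overlap in bp between two regions (assumes inclusive coordinates)."""
    if a[0] != b[0]:
        return 0  # different chromosomes never overlap
    start = max(a[1], b[1])
    end = min(a[2], b[2])
    return max(0, end - start + 1)


retrocopy = parse_region("5:96704479..96705548(+)")
indel = parse_region("5:96704395..96705200")
print(overlap_length(retrocopy, indel))
```

Under the inclusive-coordinate assumption, the indel starts upstream of the retrocopy and ends inside it, so the overlap covers only the 5' portion of the retrocopy.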

[Map: frequency of retro_hsap_3220 presence by population.
Americas: MXL (Mexican Ancestry from Los Angeles, USA) 100 %; PUR (Puerto Ricans from Puerto Rico) 100 %; CLM (Colombians from Medellin, Colombia) 100 %.
Africa: ASW (Americans of African Ancestry in SW USA) 100 %; YRI (Yoruba in Ibadan, Nigeria) 100 %; LWK (Luhya in Webuye, Kenya) 98.97 %.
Europe: CEU (Utah Residents (CEPH) with Northern and Western European Ancestry) 100 %; IBS (Iberian Population in Spain) 100 %; GBR (British in England and Scotland) 100 %; FIN (Finnish in Finland) 100 %; TSI (Toscani in Italia) 100 %.
East Asia: CHB (Han Chinese in Beijing, China) 100 %; JPT (Japanese in Tokyo, Japan) 100 %; CHS (Southern Han Chinese) 100 %.]





Copyright © RetrogeneDB 2014-2017