CpG Islands Analysis

CpG islands (CGIs) are genomic regions with a high density of CpG dinucleotides (CpG sites). A CpG site is a position in the DNA where, reading the linear sequence of bases in the 5' -> 3' direction, a cytosine nucleotide is immediately followed by a guanine nucleotide. Sequence ranges of at least 200 bp with an Obs/Exp value greater than 0.6 and a GC content greater than 50% are referred to as CpG islands. Obs/Exp is the ratio of the observed number of CpG dimers in a window to the expected number. The expected number of CpG dimers in the window is calculated by multiplying the number of 'C's in the window by the number of 'G's in the window and dividing by the window length. A Python-based script was used to find CpG island sites in a given nucleotide sequence, using a window of at least 200 bp that moves at 1 bp intervals along the sequence.
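The criteria above can be sketched in a few lines of Python. This is a minimal illustration and not the script used in the study: the function names (cpg_stats, find_cpg_windows) and parameter names are hypothetical, the window is fixed at 200 bp, and the way overlapping windows are merged into a single island is not described in the text, so the sketch reports qualifying windows only.

```python
def cpg_stats(window):
    """Return (GC fraction, Obs/Exp ratio) for one window of DNA."""
    window = window.upper()
    n = len(window)
    c = window.count("C")
    g = window.count("G")
    # Observed CpG dimers: a "C" immediately followed by a "G" (5' -> 3').
    observed = window.count("CG")
    # Expected CpG dimers: (number of C * number of G) / window length.
    expected = (c * g) / n if n else 0.0
    gc_fraction = (c + g) / n if n else 0.0
    obs_exp = observed / expected if expected else 0.0
    return gc_fraction, obs_exp


def find_cpg_windows(seq, window=200, step=1, min_gc=0.5, min_obs_exp=0.6):
    """Return (start, end, gc_fraction, obs_exp) for every qualifying window.

    Coordinates are 1-based and inclusive. Thresholds follow the text:
    GC content > 50% and Obs/Exp > 0.6.
    """
    hits = []
    for start in range(0, len(seq) - window + 1, step):
        gc, ratio = cpg_stats(seq[start:start + window])
        if gc > min_gc and ratio > min_obs_exp:
            hits.append((start + 1, start + window, gc, ratio))
    return hits
```

Calling find_cpg_windows on a sequence shorter than the window returns an empty list, since no full window fits.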

We found 4714 CpG island locations in 1230 complete nucleotide sequences of APV9 (as shown in the table below).

Further analysis showed that, of these 4714 CpG island locations, only 663 were unique. The longest CpG island was 378 bp, the average GC content was 50%, and the maximum Obs/Exp ratio was 0.90.
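Summary figures of this kind could be derived along the following lines. This is only a sketch under stated assumptions: the record layout (keys accession, start, end, gc_percent, obs_exp) is hypothetical, and "unique" is assumed to mean distinct (start, end) coordinates across all sequences, which the text does not specify.

```python
def summarize(records):
    """Summarize CpG island records (assumes at least one record).

    Each record is a dict with the hypothetical keys:
    accession, start, end (1-based, inclusive), gc_percent, obs_exp.
    """
    records = list(records)
    # Assumption: two islands are "the same location" if they share
    # start and end coordinates, regardless of accession number.
    unique_locations = {(r["start"], r["end"]) for r in records}
    return {
        "total": len(records),
        "unique": len(unique_locations),
        "max_length": max(r["end"] - r["start"] + 1 for r in records),
        "mean_gc_percent": sum(r["gc_percent"] for r in records) / len(records),
        "max_obs_exp": max(r["obs_exp"] for r in records),
    }
```

The function takes any iterable of such records, for example rows parsed from the results table.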

Accession Number | Location | Length | G | C | G+C | GC Content | Obs(CpG) | Exp(CpG) | Obs/Exp