vcf_to_circos

CircosVCF10X converts 10X Genomics Long Ranger WGS VCF files into formats that can be used in Circos plots.

Package: CircosVCF10x https://github.com/sxdxs/CircosVCF10X
Type: Package
Title: Convert 10X Genomics VCF to common Circos formats
Version: 0.1.0
Author: Tim Chappell
Maintainer: The package maintainer timchappell@uga.edu
Description: For use with RCircos or other Circos plotting programs. Translocations() prompts for a .csv file (the Long Ranger VCF output saved as CSV) and writes the linked segments to a .csv file. By default, low-quality breakend points (FILTER containing LOWQ) are removed.

  Other functions will be coming in version 0.1.1.

License: open to add & expand
Encoding: UTF-8
LazyData: true

Translocations <- function() {
  # Ask for the input file: VCF output from 10X Long Ranger, saved as CSV
  filename <- readline(prompt = "Enter document name (saved VCF output from 10X LONGRANGER as CSV): ")

  # skip = 53 assumes 53 metadata lines come before the column header line;
  # adjust it if your file has a different number of header lines
  mydata <- read.csv(filename, header = TRUE, skip = 53)

  # Open the raw table in the data viewer (interactive sessions only)
  View(mydata)

  # Read candidate coordinates from INFO as 4- to 8-character substrings that
  # start at character 5 or at character 37; substrings that are not purely
  # numeric become NA (with a coercion warning)
  endcomp <- data.frame(
    "eight"   = as.integer(substr(mydata$INFO, 5, 8)),
    "nine"    = as.integer(substr(mydata$INFO, 5, 9)),
    "ten"     = as.integer(substr(mydata$INFO, 5, 10)),
    "eleven"  = as.integer(substr(mydata$INFO, 5, 11)),
    "twelve"  = as.integer(substr(mydata$INFO, 5, 12)),
    "digit40" = as.integer(substr(mydata$INFO, 37, 40)),
    "digit41" = as.integer(substr(mydata$INFO, 37, 41)),
    "digit42" = as.integer(substr(mydata$INFO, 37, 42)),
    "digit43" = as.integer(substr(mydata$INFO, 37, 43)),
    "digit44" = as.integer(substr(mydata$INFO, 37, 44))
  )

  endcomp[is.na(endcomp)] <- 0

  # Keep the largest candidate per row, i.e. the longest fully numeric substring
  endmax <- pmax(endcomp$eight, endcomp$nine, endcomp$ten, endcomp$eleven, endcomp$twelve,
                 endcomp$digit40, endcomp$digit41, endcomp$digit42, endcomp$digit43, endcomp$digit44)

  mydata$temp <- endmax

  # Drop duplications, inversions, unknown calls and deletions, leaving breakends
  mydata <- mydata[!grepl('DUP', mydata$ALT),]
  mydata <- mydata[!grepl('INV', mydata$ALT),]
  mydata <- mydata[!grepl('UNK', mydata$ALT),]
  mydata <- mydata[!grepl('DEL', mydata$ALT),]
  mydata <- mydata[!grepl('INV', mydata$INFO),]
  mydata <- mydata[!grepl('DEL', mydata$INFO),]
  mydata <- mydata[!grepl('DUP', mydata$INFO),]

  # Drop low-quality calls
  mydata <- mydata[!grepl('LOWQ', mydata$FILTER),]

  # Sort by ID so that the two records of each call sit on adjacent rows
  mydata <- mydata[order(mydata$ID),]

  # Odd rows supply one end of each link, even rows the other end
  callOdd <- mydata[ c(TRUE,FALSE),]

  callEven <- mydata[ !c(TRUE,FALSE),]

  circosData <- data.frame("Chromosome" = callOdd$X.CHROM, "chromStart" = callOdd$temp, "chromEnd" = callOdd$POS,
                           "Chromosome.1" = callEven$X.CHROM, "chromStart.1" = callEven$temp, "chromEnd.1" = callEven$POS)

  # Prefix chromosome names with "chr"
  circosData$Chromosome <- sub("^", "chr", circosData$Chromosome)
  circosData$Chromosome.1 <- sub("^", "chr", circosData$Chromosome.1)

  write.csv(circosData, file = "TranslocationsOutput.csv", row.names = FALSE)

  View(circosData)
}
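
The fixed character offsets used in Translocations() only work when the coordinate sits at exactly those positions in INFO. A possible alternative is to search for the key by name. The sketch below assumes the number being read is the value of an END= key, which the starting offset of 5 suggests but this page does not state. extract_end is a hypothetical helper, not part of CircosVCF10X, and the INFO strings are made up for illustration.

```r
# Hypothetical helper (not part of CircosVCF10X): return the integer that
# follows "END=" in each INFO string, wherever the key appears in the field.
extract_end <- function(info) {
  info <- as.character(info)
  # First "END=<digits>" that starts the field or follows a ";"
  m <- regexpr("(^|;)END=[0-9]+", info)
  out <- rep(NA_integer_, length(info))
  hit <- m != -1
  # regmatches() returns only the matched pieces, in input order
  matched <- regmatches(info, m)
  out[hit] <- as.integer(sub("^;?END=", "", matched))
  out
}

# Made-up INFO strings: key first, key later in the field, key absent
info <- c("END=17031753;SVTYPE=BND", "SVTYPE=BND;END=8383617", "SVTYPE=BND")
ends <- extract_end(info)
print(ends)
```

Rows without the key come back as NA here, whereas Translocations() sets them to 0; replace the NA values afterwards if the same behaviour is wanted.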

Example head(circosData) output:

Chromosome  chromStart  chromEnd  Chromosome.1  chromStart.1  chromEnd.1
chr10       17031753    17196578  chr13         8383617       8430614
chr10       14372049    14397444  chr10         16515031      16561290
chr10       49169974    49315301  chr20         36292648      36295759
chr5        36003040    36254618  chr8          9999477       10006092
chr12       36186160    36286837  chr13         39578635      39765430
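
Each row of this table joins two records from the input: after sorting by ID, Translocations() takes odd rows for the first three columns and even rows for the last three. The toy example below shows that pairing on its own. The ID values and the two-column layout are made up for illustration, and the row-count check is an added safeguard that the original function does not perform.

```r
# Toy breakend table, already sorted by ID (all values are illustrative)
calls <- data.frame(
  ID      = c("call_1_1", "call_1_2", "call_2_1", "call_2_2"),
  X.CHROM = c("10", "13", "5", "8"),
  POS     = c(17196578, 8430614, 36254618, 10006092),
  stringsAsFactors = FALSE
)

# Pairing by position only makes sense when every record has a partner
if (nrow(calls) %% 2 != 0) {
  stop("odd number of records: at least one breakend has lost its mate")
}

# The logical index is recycled along the rows: TRUE, FALSE, TRUE, FALSE, ...
odd  <- calls[c(TRUE, FALSE), ]
even <- calls[c(FALSE, TRUE), ]

links <- data.frame(
  Chromosome   = paste0("chr", odd$X.CHROM),
  chromEnd     = odd$POS,
  Chromosome.1 = paste0("chr", even$X.CHROM),
  chromEnd.1   = even$POS,
  stringsAsFactors = FALSE
)
print(links)
```

If a filter removes only one record of a pair, every later row shifts by one and the links become mismatched, so a check like this is worth running before the odd/even split.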

This code can be modified to extract and reformat calls for your own translocation types and filter settings. The resulting CSV can be fed into a Shiny app described in this paper: https://www.ncbi.nlm.nih.gov/pubmed/29186362
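
One way to make the filter settings adjustable is to turn the hard-coded grepl() calls into arguments. The sketch below is an assumption about how that could look, not code from the package: filter_calls and matches_any are hypothetical names, and the sample rows are invented. Unlike Translocations(), it checks every listed type in both ALT and INFO (the original checks UNK in ALT only).

```r
# Hypothetical helper: TRUE where x contains any of the given words
matches_any <- function(x, words) {
  # An empty word list must match nothing; grepl("", x) would match everything
  if (length(words) == 0) return(rep(FALSE, length(x)))
  grepl(paste(words, collapse = "|"), x)
}

# Hypothetical helper: drop rows by SV type (ALT or INFO) and by FILTER flag
filter_calls <- function(calls,
                         drop_types = c("DUP", "INV", "UNK", "DEL"),
                         drop_filters = "LOWQ") {
  bad_type <- matches_any(calls$ALT, drop_types) | matches_any(calls$INFO, drop_types)
  bad_filter <- matches_any(calls$FILTER, drop_filters)
  calls[!(bad_type | bad_filter), , drop = FALSE]
}

# Invented rows for illustration
calls <- data.frame(
  ID     = c("a_1", "a_2", "b_1", "c_1"),
  ALT    = c("N[chr13:8430614[", "]chr10:17196578]N", "<DEL>", "N[chr2:500["),
  INFO   = c("END=1;SVTYPE=BND", "END=2;SVTYPE=BND", "END=3;SVTYPE=DEL", "END=4;SVTYPE=BND"),
  FILTER = c("PASS", "PASS", "PASS", "LOWQ"),
  stringsAsFactors = FALSE
)

kept <- filter_calls(calls)                                   # defaults
relaxed <- filter_calls(calls, drop_filters = character(0))   # keep low-quality calls
print(kept$ID)
print(relaxed$ID)
```

With the defaults the deletion and the low-quality call are removed; passing an empty drop_filters keeps the low-quality call.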

The shiny app link: http://shinycircos.ncpgr.cn/

Example output

vcf_to_circos.txt · Last modified: 2019/05/09 15:41 by sxd