Retrieve an Entire Genome & Retrieval of SARS-CoV-2 Viral Genome

Transcription

Introduction: 

In bioinformatics, a genome browser is a graphical interface for displaying information from a biological database of genomic data. A genome browser can also be used to retrieve biological data. Here, we will retrieve the SARS-CoV-2 viral genome.

Steps: 

  • Go to the ‘Genome Browser’ which you can access here

  • Search for ‘SARS-CoV-2’ in the search bar.

  • The information page for SARS-CoV-2 will be displayed.

  • Scroll down this page; you will see many options for downloading the entire sequence and annotation data.

  • From those options, first click on ‘Data use Condition and Restriction’, which opens the terms and conditions page, and give it a read.

       

   Retrieve through an Editor 

  • Now, go back and select ‘Using FTP’; it will open the directory, which contains folders (bigzips, chromosomes, database).

  • If you go to ‘BigZips’, you can access files such as the masked chromosomes.

  • There’s a file named ‘ReadMe’, a text file containing information about the directory you are in.

  • Go back to the ‘Parent Directory’.

  • Click on the ‘Chromosomes’ folder, which contains a ‘.gz’ file named after the NCBI genome accession number (it holds the entire genome of SARS-CoV-2).

  • Click on this ‘.gz’ file and download it.

  • Unzip the downloaded file and drag the extracted file into any editor, e.g. Visual Studio Code.

  • You can view the entire genome sequence in the editor and use it for further analysis.
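As one small example of further analysis, the sketch below counts the bases in the unzipped file from a Bash terminal. It assumes the file is in FASTA format (header lines starting with ‘>’, followed by sequence lines), which the steps above do not state, and genome_length is a made-up helper name used only for illustration.

```shell
#!/usr/bin/env bash

# genome_length FILE: count the sequence characters in a FASTA file.
# Assumption: header lines start with '>' and everything else is sequence.
genome_length() {
    # Drop header lines, remove line breaks, then count what is left.
    # The final tr strips the padding some versions of wc add.
    grep -v '^>' "$1" | tr -d '\r\n' | wc -c | tr -d ' '
}

# Example usage (replace the placeholder with the name of your unzipped file):
# genome_length <unzipped file>
```

The number printed is the total count of sequence characters, so any ambiguous bases in the file are counted as well.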

   Retrieve through BASH       

  • Copy the link of the ‘.gz’ file. 

  • Open a Bash terminal if you have a Linux computer.

  • First, to get the file, type ‘wget’ followed by the link to the file.

  • It will retrieve the file. 

  • Now, to unzip the file, type ‘gunzip NC*’.

  • On the next line, type ‘ls’.

  • It will list the contents of the directory, including the name of the unzipped file.

  • Now, type ‘cat NC* | head -n 50’; it will display the first 50 lines of the SARS-CoV-2 genome sequence.
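The Bash steps above can be sketched as a small script. This is only a sketch under a few assumptions: the function names fetch_genome and preview_genome are made up for illustration, the link is a placeholder for the one you copied, and the preview uses ‘gunzip -c’ (a variation on the steps above) so the ‘.gz’ file stays on disk.

```shell
#!/usr/bin/env bash

# fetch_genome URL: download the compressed genome into the current directory.
fetch_genome() {
    wget --quiet "$1"
}

# preview_genome FILE.gz [LINES]: print the first LINES lines (default 50)
# of a gzip-compressed file. 'gunzip -c' writes the unzipped data to the
# screen instead of replacing the .gz file.
preview_genome() {
    gunzip -c "$1" | head -n "${2:-50}"
}

# Example usage (the link below is a placeholder, not a real address):
# fetch_genome "<link copied from the chromosomes folder>"
# preview_genome NC*.gz
```

If you would rather follow the steps exactly, ‘gunzip NC*’ followed by ‘cat NC* | head -n 50’ shows the same lines, but it replaces the ‘.gz’ file with the unzipped one.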

   Note: On Windows, make sure a tool that can extract ‘.gz’ files, such as WinRAR, is installed on your computer, so the file can be unzipped.

 

Summary: 

In this video, we saw how to retrieve an entire genome, using SARS-CoV-2 as an example. We saw how to retrieve the genome on two different operating systems: Windows and Linux.
