Retrieve an Entire Genome & Retrieval of SARS-CoV-2 Viral Genome
Transcription
Introduction:
In bioinformatics, a genome browser is a graphical interface for displaying information from a biological database of genomic data. A genome browser can also be used to retrieve biological data. Here, we will retrieve the SARS-CoV-2 viral genome.
Steps:
-
Go to the ‘Genome Browser’ which you can access here.
-
Search for ‘SARS-CoV-2’ in the search bar.
-
The information page for SARS-CoV-2 will be displayed.
-
Scroll down this page and you will see many options for downloading the entire sequence and annotation data.
-
From those options, first click on ‘Data use Condition and Restriction’, which opens the terms and conditions page, and give it a read.
Retrieve through an Editor
-
Now, go back and select ‘Using FTP’. It will open the directory, which contains the folders bigZips, chromosomes, and database.
-
If you go to ‘BigZips’, you can access individual chromosomes, such as the masked chromosomes.
-
There is a text file named ‘ReadMe’, which describes the contents of the directory you are in.
-
Go back to the ‘Parent Directory’.
-
Click on the ‘Chromosomes’ folder, which contains a ‘.gz’ file named after NCBI’s genome accession number (it holds the entire genome of SARS-CoV-2).
-
Click on this ‘.gz’ file and download it.
-
Unzip the downloaded file and drag it into any editor, e.g. Visual Studio Code.
-
You can view the entire genome sequence in the editor and use it for further analysis.
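As a small example of such further analysis, the sketch below prints the FASTA header and counts the bases in the sequence. It assumes the file has already been unzipped and that a Bash shell is available; the function name `genome_summary` is hypothetical and only used for illustration.

```shell
#!/usr/bin/env bash
# genome_summary FILE
# Hypothetical helper: prints the FASTA header line(s), then the number of
# bases in the sequence. Assumes FILE is an unzipped FASTA file.
genome_summary() {
    local fasta="$1"
    # Header lines in a FASTA file start with '>'.
    grep '^>' "$fasta"
    # Drop the header lines and all line breaks, then count what is left.
    grep -v '^>' "$fasta" | tr -d '\r\n' | wc -c
}
```

It could be called as `genome_summary NC*.fa`, assuming the unzipped file ends in `.fa` and is the only match in the current directory.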
Retrieve through BASH
-
Copy the link of the ‘.gz’ file.
-
If you have a Linux computer, open a Bash terminal.
-
First, to get the file, type ‘wget’ followed by the link to the file.
-
It will retrieve the file.
-
Now, to unzip the file, type ‘gunzip NC*’
-
On the next line, type ‘ls’.
-
It will list the files in the directory, including the name of the unzipped file.
-
Now, type ‘cat NC* | head -n 50’; it will display the first 50 lines of the SARS-CoV-2 genome sequence.
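The Bash steps above can be sketched as two small helper functions. This is only a sketch: the function names `fetch_genome` and `preview_genome` are hypothetical, and it assumes the downloaded file is the only one in the current directory whose name starts with ‘NC’.

```shell
#!/usr/bin/env bash
# Sketch of the Bash steps above. The function names are hypothetical;
# the commands inside them are the ones used in the steps.

# fetch_genome URL
# Downloads the '.gz' file from the link copied in the browser.
fetch_genome() {
    local url="$1"
    wget "$url"
}

# preview_genome [LINES]
# Unzips the downloaded file, lists the directory, and prints the first
# LINES lines of the sequence (50 by default).
preview_genome() {
    local lines="${1:-50}"
    gunzip NC*.gz               # replaces the .gz file with the unzipped file
    ls                          # confirm the unzipped file is there
    cat NC* | head -n "$lines"  # show the start of the genome sequence
}
```

To use it, call `fetch_genome` with the link you copied for the ‘.gz’ file, then run `preview_genome` in the same directory.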
Note: If you are using Windows, make sure you have WinRAR installed on your computer so that the file can be unzipped.
Summary:
In this video, we saw how to retrieve an entire genome, using SARS-CoV-2 as an example. We saw how to retrieve the genome on two different operating systems: Windows and Linux.