Make Anchor Files
Two approaches are commonly used for detecting anchors. The first approach detects homologous gene pairs between two genomes, and treats those gene pairs as anchors. The second approach detects short genomic sequences conserved among evolutionarily related genomes, and treats those homologous sequences as anchors.
The OSfinder program requires an input file that lists the genomic locations of the anchors. You can create the anchor files in any of the following ways.
- Detect homologous gene pairs from BLAST hits, and generate the anchor files with the Perl scripts in the OSfinder distribution. For details, see "Detect Homologous Gene Pairs".
- Detect homologous sequences with the Murasaki software, and generate the anchor files with the Perl script included in Murasaki. For details, see "Detect Homologous Sequences".
- Detect anchors based on your own strategy, and create anchor files in the appropriate format. For details, see "Detect Anchors Based On Your Own Strategy".
- Download pre-computed anchor files from our Web site. For details, see "Download Mammalian Data".
Detect Homologous Gene Pairs
The OSfinder distribution includes Perl scripts that detect homologous gene pairs with the BLASTP program and generate the input files of OSfinder. For detailed descriptions, please see the following pages.
From GenBank Data
-- This page explains how to automatically download genome sequence files from the NCBI GenBank database and how to parse those files, which are in the GBK format.
From Ensembl Data
-- This page explains how to automatically download protein sequence files from the Ensembl genome browser and how to parse the downloaded files.
Execute BLASTP
-- This page explains how to run an all-against-all comparison of the protein sequences encoded in two genomes, how to parse the files output by the BLASTP program, and how to generate the input files of OSfinder.
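To illustrate the parsing step in general terms, here is a minimal Python sketch that extracts candidate homologous gene pairs from BLASTP output. It is not the Perl script shipped with OSfinder. It assumes BLASTP was run with the BLAST+ tabular format (`-outfmt 6`, 12 default columns); the function name `homologous_pairs`, the sample data, and the E-value cutoff of 1e-5 are all hypothetical choices made for this example. It stops at the gene pairs and does not write the OSfinder input format, which is described on the "File Format" page.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
BLAST_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def homologous_pairs(handle, max_evalue=1e-5):
    """Return sorted, de-duplicated (query, subject) pairs whose E-value
    is at most max_evalue. The cutoff is an illustrative assumption."""
    pairs = set()
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and comment lines (e.g. from -outfmt 7).
        if not row or row[0].startswith("#"):
            continue
        record = dict(zip(BLAST_COLUMNS, row))
        if float(record["evalue"]) <= max_evalue:
            pairs.add((record["qseqid"], record["sseqid"]))
    return sorted(pairs)


# Made-up BLASTP hits: genome A proteins (queries) vs genome B proteins.
SAMPLE = (
    "geneA1\tgeneB1\t98.5\t200\t3\t0\t1\t200\t1\t200\t1e-120\t400\n"
    "geneA1\tgeneB7\t30.0\t40\t28\t0\t5\t44\t9\t48\t0.5\t30\n"
    "geneA2\tgeneB2\t75.0\t180\t45\t0\t1\t180\t1\t180\t3e-40\t150\n"
    "geneA2\tgeneB2\t60.0\t50\t20\t0\t190\t239\t200\t249\t1e-10\t60\n"
)

if __name__ == "__main__":
    for query, subject in homologous_pairs(io.StringIO(SAMPLE)):
        print(query, subject)
```

With a real result file, the same function can be called as `homologous_pairs(open("a_vs_b.blastp.tsv"))`, where the file name is again only a placeholder. Each resulting pair would still need the genomic locations of both genes before it could serve as an anchor.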
Detect Homologous Sequences
Murasaki detects short homologous sequences conserved among multiple genomes. Its output includes an ".anchors" file that lists the genomic locations of the anchors. The Murasaki software includes a Perl script that converts the ".anchors" file into the input file of OSfinder. This approach can be used to detect anchors among multiple genomes as well as between two genomes. The latest version of the Murasaki software is available for download from the Murasaki Web site.
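For readers who want to inspect an ".anchors" file before converting it, the following Python sketch reads one anchor line. It rests on an assumption about the layout that should be checked against the Murasaki documentation: each line is taken to hold one anchor, with three tab-separated fields per genome (start, stop, strand), and negative coordinates marking the reverse strand. The function name `parse_anchor_line` is hypothetical, and this is not the conversion script shipped with Murasaki.

```python
def parse_anchor_line(line):
    """Split one ".anchors" line into a list of (start, stop, strand)
    tuples, one per genome.

    ASSUMED layout (verify against the Murasaki documentation):
    three whitespace-separated fields per genome, i.e. start, stop, strand,
    with negative coordinates used for the reverse strand.
    """
    fields = line.split()
    if not fields or len(fields) % 3 != 0:
        raise ValueError("expected a multiple of 3 fields, got %d" % len(fields))
    regions = []
    for i in range(0, len(fields), 3):
        first, second, strand = int(fields[i]), int(fields[i + 1]), fields[i + 2]
        if strand not in ("+", "-"):
            raise ValueError("unexpected strand symbol: %r" % strand)
        # Report positive coordinates with start <= stop on either strand.
        start, stop = sorted((abs(first), abs(second)))
        regions.append((start, stop, strand))
    return regions


if __name__ == "__main__":
    # Made-up anchor shared by two genomes.
    print(parse_anchor_line("1\t18\t+\t-30\t-13\t-"))
```

The number of tuples returned equals the number of genomes compared, so the same function applies to pairwise and multiple-genome runs under the stated assumption.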
Detect Anchors Based On Your Own Strategy
If you detect anchors with your own strategy, your anchor files must follow the format that the OSfinder program accepts. OSfinder accepts any anchor file that conforms to the format described on the "File Format" page.
Download Mammalian Data
The computational cost of calculating anchors between mammalian genomes can be high enough to hold up further analyses. We have therefore pre-computed anchor files between mammalian genomes, which can be downloaded from the "Mammalian Data" page.