Anchor File Format (Pairwise)
An anchor file accepted by the OSfinder program consists of lines, each of which describes one anchor in the following format.
chrom_1 start_1 end_1 sign_1 chrom_2 start_2 end_2 sign_2 [chrom_i start_i end_i sign_i] ...
- "chrom_1", "chrom_2", and "chrom_i" are the chromosome IDs in the first, second, and i-th genome, respectively. Chromosome IDs must be integers, so chromosome names such as "X", "Y", or "NC_007436" must be replaced by integers.
- "start_1" and "end_1" are the start and end positions of the anchor on the first genome. "start_2", "end_2", "start_i", and "end_i" are the corresponding positions on the second and i-th genomes.
- "sign_1" indicates which strand of the first genome's chromosome the anchor lies on: the forward strand ("+") or the reverse strand ("-"). "sign_2" and "sign_i" do the same for the second and i-th genomes. Each sign must therefore be either "+" or "-".
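Because chromosome IDs must be integers, names such as "X" or "NC_007436" need to be renumbered before the anchor file is written. The following is a minimal Python sketch of one way to do that. The helper name build_chrom_ids and the numbering scheme (consecutive integers starting at 1, in first-seen order) are assumptions made for illustration; they are not part of OSfinder.

```python
def build_chrom_ids(names):
    """Assign a distinct integer to each chromosome name, in first-seen order.

    Hypothetical helper: the numbering starts at 1 by assumption.
    """
    ids = {}
    for name in names:
        if name not in ids:
            ids[name] = len(ids) + 1
    return ids


# Build one table per genome and keep it, so that integer IDs in the
# results can be translated back to the original chromosome names.
chrom_ids = build_chrom_ids(["1", "2", "X", "Y", "NC_007436"])
print(chrom_ids["X"])
```

Keeping a separate table for each genome makes it possible to map the integer IDs back to the original names afterwards.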
For example, an anchor file for pairwise genome comparison consists of the following lines.
1 8040363 8040427 + 1 168782735 168782799 +
1 8847983 8848048 + 1 43731182 43731247 -
1 8849080 8849144 + 1 43730882 43730946 -
1 8876198 8876273 + 1 108267190 108267265 +
...
An anchor file for the comparison of three genomes consists of the following lines.
20 55496602 55496695 + 10 31263560 31263653 + 1 16798400 16798493 +
20 55496602 55496695 + 10 31263560 31263653 + 1 192688819 192688912 +
20 55496602 55496695 + 10 31263560 31263653 + 1 228589428 228589521 +
20 55496602 55496694 + 10 31263560 31263652 + 1 166624290 166624382 -
...
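
The rules above can be checked before a file is passed to OSfinder. The following Python sketch parses one anchor line into per-genome segments and rejects lines that break the format. The names Segment and parse_anchor_line are hypothetical, and the sketch assumes that fields are separated by whitespace and that every anchor covers at least two genomes; OSfinder itself may apply further checks.

```python
from typing import List, NamedTuple


class Segment(NamedTuple):
    """Position of one anchor on one genome (hypothetical type)."""
    chrom: int
    start: int
    end: int
    sign: str


def parse_anchor_line(line: str) -> List[Segment]:
    """Parse one anchor line into a list of segments, one per genome."""
    fields = line.split()  # assumption: whitespace-separated fields
    # Each genome contributes four fields: chrom, start, end, sign.
    if len(fields) < 8 or len(fields) % 4 != 0:
        raise ValueError(f"expected 4 fields per genome, got {len(fields)} fields")
    segments = []
    for i in range(0, len(fields), 4):
        chrom, start, end, sign = fields[i:i + 4]
        if sign not in ("+", "-"):
            raise ValueError(f"sign must be '+' or '-', got {sign!r}")
        # int() raises ValueError for non-integer IDs such as "X".
        segments.append(Segment(int(chrom), int(start), int(end), sign))
    return segments


segments = parse_anchor_line("1 8847983 8848048 + 1 43731182 43731247 -")
print(len(segments), segments[1].sign)
```

Run over every line of a file, such a check reports a non-integer chromosome ID or an invalid sign at the line where it occurs.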