Tandem Repeats Finder Program written by:

                 Gary Benson
      Program in Bioinformatics
          Boston University

Version 4.09

Sequence: scaffold_23 ID=scaffold_23-JGI_221_v2.0

Parameters: 2 7 7 80 10 50 1000

Pmatch=0.80,Pindel=0.10
tuple sizes 0,4,5,7
tuple distances 0, 29, 159, 1000

Length: 70483
ACGTcount: A:0.33, C:0.16, G:0.18, T:0.32

Warning! 609 characters in sequence are not A, C, G, or T


Found at i:7809 original size:21 final size:19

Alignment explanation

Indices: 7775--7813  Score: 60
Period size: 21  Copynumber: 1.9  Consensus size: 19

       7765 ATAATCCTCC

       7775 GGAAAATGGCTTAGATAGA
          1 GGAAAATGGCTTAGATAGA

       7794 GGAAACATGGTCTTAGATAG
          1 GGAAA-ATGG-CTTAGATAG

       7814 GGAGGCAAAT


Statistics
Matches: 18,  Mismatches: 0, Indels: 2
        0.90            0.00        0.10

Matches are distributed among these distances:
   19        5       0.28
   20        4       0.22
   21        9       0.50

ACGTcount: A:0.38, C:0.08, G:0.31, T:0.23

Consensus pattern (19 bp):
GGAAAATGGCTTAGATAGA


Found at i:11871 original size:20 final size:21

Alignment explanation

Indices: 11848--11890  Score: 61
Period size: 21  Copynumber: 2.1  Consensus size: 21

      11838 TTCAAGATTT

      11848 AGGTTCAGGG-TTTGAGTTCA
          1 AGGTTCAGGGTTTTGAGTTCA

                  *        *
      11868 AGGTTCGGGGTTTTGGGTTCA
          1 AGGTTCAGGGTTTTGAGTTCA

      11889 AG
          1 AG

      11891 ATAAAATAAT


Statistics
Matches: 20,  Mismatches: 2, Indels: 1
        0.87            0.09        0.04

Matches are distributed among these distances:
   20        9       0.45
   21       11       0.55

ACGTcount: A:0.16, C:0.09, G:0.40, T:0.35

Consensus pattern (21 bp):
AGGTTCAGGGTTTTGAGTTCA


Found at i:18647 original size:37 final size:36

Alignment explanation

Indices: 18606--18714  Score: 146
Period size: 37  Copynumber: 2.9  Consensus size: 36

      18596 ATTAATCATC

                                           *
      18606 AATAGAATGAGAAAACAAGCAAGAAAGGTATGTTCAA
          1 AATAGAATGAGAAAACAA-CAAGAAAGGTATATTCAA

                                                *
      18643 AATAGAATGAGAAAACTAACAAGAAAGGTATATTCAT
          1 AATAGAATGAGAAAAC-AACAAGAAAGGTATATTCAA

               *         *             *
      18680 AATCGAATGAGAATACAAACAAGAAAGCTATATTC
          1 AATAGAATGAGAAAAC-AACAAGAAAGGTATATTC

      18715 CTATGAATTA


Statistics
Matches: 65,  Mismatches: 6, Indels: 2
        0.89            0.08        0.03

Matches are distributed among these distances:
   37       63       0.97
   38        2       0.03

ACGTcount: A:0.53, C:0.10, G:0.17, T:0.19

Consensus pattern (36 bp):
AATAGAATGAGAAAACAACAAGAAAGGTATATTCAA


Found at i:18851 original size:25 final size:25

Alignment explanation

Indices: 18797--18852  Score: 78
Period size: 25  Copynumber: 2.2  Consensus size: 25

      18787 TTCACTTATA

                 *
      18797 AATTTGTATATATAATTTCATTTGT
          1 AATTTATATATATAATTTCATTTGT

            *
      18822 TATTTATATATATAATTTCATCTT-T
          1 AATTTATATATATAATTTCAT-TTGT

      18847 AATTTA
          1 AATTTA

      18853 AAAGAATTTT


Statistics
Matches: 27,  Mismatches: 3, Indels: 2
        0.84            0.09        0.06

Matches are distributed among these distances:
   25       25       0.93
   26        2       0.07

ACGTcount: A:0.34, C:0.05, G:0.04, T:0.57

Consensus pattern (25 bp):
AATTTATATATATAATTTCATTTGT


Found at i:27766 original size:23 final size:23

Alignment explanation

Indices: 27740--27834  Score: 127
Period size: 23  Copynumber: 4.1  Consensus size: 23

      27730 AAGTTAGAAG

      27740 AGTGAAGAGACTTGAATATGAAA
          1 AGTGAAGAGACTTGAATATGAAA

                   * *             *
      27763 AGTGAAAAAAACTTGAATATGAAT
          1 AGTG-AAGAGACTTGAATATGAAA

                            *
      27787 AGTGAAGAGACTTGAAGATGAAA
          1 AGTGAAGAGACTTGAATATGAAA

               *     *
      27810 AGTTAAGAGGCTTGAATATGAAA
          1 AGTGAAGAGACTTGAATATGAAA

      27833 AG
          1 AG

      27835 ATACACACAT


Statistics
Matches: 61,  Mismatches: 10, Indels: 2
        0.84            0.14        0.03

Matches are distributed among these distances:
   23       41       0.67
   24       20       0.33

ACGTcount: A:0.48, C:0.04, G:0.25, T:0.22

Consensus pattern (23 bp):
AGTGAAGAGACTTGAATATGAAA


Found at i:32340 original size:23 final size:23

Alignment explanation

Indices: 32314--32408  Score: 127
Period size: 23  Copynumber: 4.1  Consensus size: 23

      32304 AAGTTAGAAG

      32314 AGTGAAGAGACTTGAATATGAAA
          1 AGTGAAGAGACTTGAATATGAAA

                   * *             *
      32337 AGTGAAAAAAACTTGAATATGAAT
          1 AGTG-AAGAGACTTGAATATGAAA

                            *
      32361 AGTGAAGAGACTTGAAGATGAAA
          1 AGTGAAGAGACTTGAATATGAAA

               *     *
      32384 AGTTAAGAGGCTTGAATATGAAA
          1 AGTGAAGAGACTTGAATATGAAA

      32407 AG
          1 AG

      32409 ATACACACAT


Statistics
Matches: 61,  Mismatches: 10, Indels: 2
        0.84            0.14        0.03

Matches are distributed among these distances:
   23       41       0.67
   24       20       0.33

ACGTcount: A:0.48, C:0.04, G:0.25, T:0.22

Consensus pattern (23 bp):
AGTGAAGAGACTTGAATATGAAA


Found at i:37021 original size:190 final size:190

Alignment explanation

Indices: 36692--37065  Score: 581
Period size: 190  Copynumber: 2.0  Consensus size: 190

      36682 AGCCACAGTA

                 *
      36692 TTGGAGTATGATAGAAGGAACATCTTTGACTATGTAGAAATCTTCTCCTCATTTCTCCAATCTTT
          1 TTGGAATATGATAGAAGGAACATCTTTGACTATGTAGAAATCTTCTCCTCATTTCTCCAATCTTT

                      * *                                        * *
      36757 CTTTGAATGCCTTTGGCATTGATTCAATCTTCAGTTCTTTGAGGGTTGTAATGGAGCTCAATCCC
         66 CTTTGAATGCATCTGGCATTGATTCAATCTTCAGTTCTTTGAGGGTTGTAATGAACCTCAATCCC

                               * *      *
      36822 TCTGGAAGCATCTTCAAGCTTGGGCAATGCC-TGATATCCAATCGCTGAAGAGAACATGCG
        131 TCTGGAAGCATCTTCAAGCCTAGGCAA-CCCTTGATATCCAATCGCTGAAGAGAACATGCG

                                   *       *                   *
      36882 TTGGAATATGATAGAAGGAACATGTTTGACTTTGTAGAAATCTTCTCCTC-CTTCTTCCAATCTT
          1 TTGGAATATGATAGAAGGAACATCTTTGACTATGTAGAAATCTTCTCCTCATTTC-TCCAATCTT

             *       *
      36946 TTTTTGAATTCATCTGGCATTGATTCAATCTTCAGTTCTTTGAGGGTTGTAATGAACCTCAATCC
         65 TCTTTGAATGCATCTGGCATTGATTCAATCTTCAGTTCTTTGAGGGTTGTAATGAACCTCAATCC

                              *                 *
      37011 CTCTGGAAGCATCTTCAATCCTAGGCAACCCTTGATCTCCAATCGCTGAAGAGAA
        130 CTCTGGAAGCATCTTCAAGCCTAGGCAACCCTTGATATCCAATCGCTGAAGAGAA

      37066 GGCATGGCTC


Statistics
Matches: 167,  Mismatches: 15, Indels: 4
        0.90            0.08        0.02

Matches are distributed among these distances:
  189        5       0.03
  190      162       0.97

ACGTcount: A:0.26, C:0.21, G:0.19, T:0.34

Consensus pattern (190 bp):
TTGGAATATGATAGAAGGAACATCTTTGACTATGTAGAAATCTTCTCCTCATTTCTCCAATCTTT
CTTTGAATGCATCTGGCATTGATTCAATCTTCAGTTCTTTGAGGGTTGTAATGAACCTCAATCCC
TCTGGAAGCATCTTCAAGCCTAGGCAACCCTTGATATCCAATCGCTGAAGAGAACATGCG


Found at i:38489 original size:21 final size:19

Alignment explanation

Indices: 38455--38495  Score: 64
Period size: 21  Copynumber: 2.1  Consensus size: 19

      38445 ATAATCCTCT

      38455 GGAAAATGGCTTAGATAGA
          1 GGAAAATGGCTTAGATAGA

      38474 GGAAACATGGTCTTAGATAGA
          1 GGAAA-ATGG-CTTAGATAGA

      38495 G
          1 G

      38496 AGGCAAATCA


Statistics
Matches: 20,  Mismatches: 0, Indels: 2
        0.91            0.00        0.09

Matches are distributed among these distances:
   19        5       0.25
   20        4       0.20
   21       11       0.55

ACGTcount: A:0.39, C:0.07, G:0.32, T:0.22

Consensus pattern (19 bp):
GGAAAATGGCTTAGATAGA


Found at i:48893 original size:37 final size:37

Alignment explanation

Indices: 48801--48953  Score: 198
Period size: 37  Copynumber: 4.1  Consensus size: 37

      48791 GCAACATTAA

                         *
      48801 TCATCAATAGAATAAGAATACAAACAAGAAAGGTATATAT
          1 TCAT-AATAGAATGAGAATACAAACAAGAAAGG--TATAT

                             *    *            *
      48841 TCATAATAGAATGAGAAAACAAGCAAGAAAGGTATGT
          1 TCATAATAGAATGAGAATACAAACAAGAAAGGTATAT

               *                *
      48878 TCAAAATAGAATGAGAATACTAACAAGAAAGGTATAT
          1 TCATAATAGAATGAGAATACAAACAAGAAAGGTATAT

                   *           *           *
      48915 TCATAATCGAATGAGAATAGAAACAAGAAAGCTATAT
          1 TCATAATAGAATGAGAATACAAACAAGAAAGGTATAT

      48952 TC
          1 TC

      48954 CTATGAATTA


Statistics
Matches: 99,  Mismatches: 14, Indels: 3
        0.85            0.12        0.03

Matches are distributed among these distances:
   37       70       0.71
   39       25       0.25
   40        4       0.04

ACGTcount: A:0.52, C:0.10, G:0.16, T:0.22

Consensus pattern (37 bp):
TCATAATAGAATGAGAATACAAACAAGAAAGGTATAT


Found at i:49057 original size:25 final size:25

Alignment explanation

Indices: 49027--49080  Score: 81
Period size: 25  Copynumber: 2.1  Consensus size: 25

      49017 TATAAATTTG

                               *
      49027 TATATATAATTAATTTCATTTTTAATT
          1 TATATAT-A-TAATTTCATCTTTAATT

      49054 TATATATATAATTTCATCTTTAATT
          1 TATATATATAATTTCATCTTTAATT

      49079 TA
          1 TA

      49081 AAAGAATTTT


Statistics
Matches: 26,  Mismatches: 1, Indels: 2
        0.90            0.03        0.07

Matches are distributed among these distances:
   25       18       0.69
   26        1       0.04
   27        7       0.27

ACGTcount: A:0.37, C:0.06, G:0.00, T:0.57

Consensus pattern (25 bp):
TATATATATAATTTCATCTTTAATT


Found at i:50044 original size:2 final size:2

Alignment explanation

Indices: 50039--50072  Score: 68
Period size: 2  Copynumber: 17.0  Consensus size: 2

      50029 GTCTCTCTTA

      50039 AT AT AT AT AT AT AT AT AT AT AT AT AT AT AT AT AT
          1 AT AT AT AT AT AT AT AT AT AT AT AT AT AT AT AT AT

      50073 TTAGTAGGTT


Statistics
Matches: 32,  Mismatches: 0, Indels: 0
        1.00            0.00        0.00

Matches are distributed among these distances:
    2       32       1.00

ACGTcount: A:0.50, C:0.00, G:0.00, T:0.50

Consensus pattern (2 bp):
AT


Found at i:50853 original size:156 final size:154

Alignment explanation

Indices: 50291--50832  Score: 611
Period size: 156  Copynumber: 3.5  Consensus size: 154

      50281 CAGCAACGGT

                            *   * *                         *
      50291 AAAAAAGGA-ATAAAATGTAAGTCTTTGTTTATGGTTTCGGCTTGCTTATGTAGCAAGAAAAGTG
          1 AAAAAAGGAGA-AAAAAGTATGCCTTTGTTTATGGTTTCGGCTTGCTTGTGTAGCAAGAAAAGTG

                                                   *          *
      50355 AATATAAACACATATAT-T-TAGT-AT-GGTTTCTTGAACTTGAATTGAACGAAGAATTTAAAAA
         65 AATATAAACACATATATATATA-TAATAGGTTTCTTGAAATTGAATTGAATGAAGAATTT-AAAA

                               *
      50416 ACAACAAAGAAACAAGAAAGCCATTT-
        128 ACAACAAAGAAACAAGAAAACCATTTA

                           *          *           *                  *    **
      50442 AAAAAAGGAG-AAAATGTATGCCTTTATTTATGGTTTCAGCTTGCTTGTGTAGCAAAAAAAAACG
          1 AAAAAAGGAGAAAAAAGTATGCCTTTGTTTATGGTTTCGGCTTGCTTGTGTAGC-AAGAAAAGTG

                    * *
      50506 AATATAAATATATATATATATATAATATGGTTTCTT-AAACTTGAATTGAATGAAGAATTTTAAA
         65 AATATAAACACATATATATATATAATA-GGTTTCTTGAAA-TTGAATTGAATGAAGAA-TTTAAA

                 *      *
      50570 AACAATAAAGAATCAAGAAAACCA-TTA
        127 AACAACAAAGAAACAAGAAAACCATTTA

                  *  *      **    *              *               *         *
      50597 AAAAAAAGAAAAAAAAACATGCTTTTGTTTATGGTTTTGGCTTGCTTGTGTAGTAAGAAAAGTAA
          1 AAAAAAGGAGAAAAAAGTATGCCTTTGTTTATGGTTTCGGCTTGCTTGTGTAGCAAGAAAAGTGA

                     *          *
      50662 ATATAAACATATATATATATTTAAATATGGTTTCTTGAAATTGAATTGAATGAAGAATTTAAAAA
         66 ATATAAACACATATATATATAT-AATA-GGTTTCTTGAAATTGAATTGAATGAAGAATTTAAAAA

               *
      50727 GCGGA-AAAGAAACAAGAAAACCATTTA
        129 -C-AACAAAGAAACAAGAAAACCATTTA

                    *           *                   *         *
      50754 AAAAAAGGCGAAAAAAGTATTCCTTTGTTTATGGTTTCGGATTGCTTGTGGAGCAA-AAAAGTGA
          1 AAAAAAGGAGAAAAAAGTATGCCTTTGTTTATGGTTTCGGCTTGCTTGTGTAGCAAGAAAAGTGA

      50818 ATATAAACAGCATAT
         66 ATATAAACA-CATAT

      50833 GTGATTCCTC


Statistics
Matches: 328,  Mismatches: 46, Indels: 28
        0.82            0.11        0.07

Matches are distributed among these distances:
  150       38       0.12
  151       31       0.09
  152        2       0.01
  153        4       0.01
  154        4       0.01
  155       91       0.28
  156      102       0.31
  157       56       0.17

ACGTcount: A:0.44, C:0.08, G:0.16, T:0.31

Consensus pattern (154 bp):
AAAAAAGGAGAAAAAAGTATGCCTTTGTTTATGGTTTCGGCTTGCTTGTGTAGCAAGAAAAGTGA
ATATAAACACATATATATATATAATAGGTTTCTTGAAATTGAATTGAATGAAGAATTTAAAAACA
ACAAAGAAACAAGAAAACCATTTA


Found at i:62567 original size:21 final size:21

Alignment explanation

Indices: 62524--62569  Score: 58
Period size: 21  Copynumber: 2.2  Consensus size: 21

      62514 TAATTTTTTT

                           *  *
      62524 ACCCTGAATCCAGAATCTTAA
          1 ACCCTGAATCCAGAAACTAAA

      62545 ACCCTGAATCCA-AAACATAAA
          1 ACCCTGAATCCAGAAAC-TAAA

      62566 ACCC
          1 ACCC

      62570 CAAACTCGCA


Statistics
Matches: 22,  Mismatches: 2, Indels: 2
        0.85            0.08        0.08

Matches are distributed among these distances:
   20        3       0.14
   21       19       0.86

ACGTcount: A:0.43, C:0.33, G:0.07, T:0.17

Consensus pattern (21 bp):
ACCCTGAATCCAGAAACTAAA


Found at i:62671 original size:20 final size:19

Alignment explanation

Indices: 62633--62671  Score: 51
Period size: 20  Copynumber: 2.0  Consensus size: 19

      62623 ACCCTTGAAC

                     *
      62633 TTTAAACTTGAACCTCGAA
          1 TTTAAACTTAAACCTCGAA

                         *
      62652 TTTAAACTTTAAATCTCGAA
          1 TTTAAAC-TTAAACCTCGAA

      62672 ACCAAACCTT


Statistics
Matches: 17,  Mismatches: 2, Indels: 1
        0.85            0.10        0.05

Matches are distributed among these distances:
   19        7       0.41
   20       10       0.59

ACGTcount: A:0.38, C:0.18, G:0.08, T:0.36

Consensus pattern (19 bp):
TTTAAACTTAAACCTCGAA


Found at i:62932 original size:17 final size:18

Alignment explanation

Indices: 62910--62943  Score: 52
Period size: 18  Copynumber: 1.9  Consensus size: 18

      62900 TTGTTTGTAA

      62910 TTTAAA-AAATTAATTAT
          1 TTTAAATAAATTAATTAT

                     *
      62927 TTTAAATAATTTAATTA
          1 TTTAAATAAATTAATTA

      62944 AAGTTAAATT


Statistics
Matches: 15,  Mismatches: 1, Indels: 1
        0.88            0.06        0.06

Matches are distributed among these distances:
   17        6       0.40
   18        9       0.60

ACGTcount: A:0.50, C:0.00, G:0.00, T:0.50

Consensus pattern (18 bp):
TTTAAATAAATTAATTAT


Found at i:65467 original size:21 final size:21

Alignment explanation

Indices: 65443--65495  Score: 97
Period size: 21  Copynumber: 2.5  Consensus size: 21

      65433 CAGTAAATTA

      65443 CGTCCACGTCAGCACTATTTG
          1 CGTCCACGTCAGCACTATTTG

      65464 CGTCCACGTCAGCACTATTTG
          1 CGTCCACGTCAGCACTATTTG

            *
      65485 TGTCCACGTCA
          1 CGTCCACGTCA

      65496 ACATAGCGCA


Statistics
Matches: 31,  Mismatches: 1, Indels: 0
        0.97            0.03        0.00

Matches are distributed among these distances:
   21       31       1.00

ACGTcount: A:0.19, C:0.34, G:0.19, T:0.28

Consensus pattern (21 bp):
CGTCCACGTCAGCACTATTTG


Found at i:66041 original size:58 final size:58

Alignment explanation

Indices: 65951--66063  Score: 199
Period size: 58  Copynumber: 1.9  Consensus size: 58

      65941 GTTCCTTGAG

                            *
      65951 GGAATTCTATTGAATCTTTTATCGTCGCAAGTTAAGCTGACTTAGGCGTTTTTGGAGT
          1 GGAATTCTATTGAATCCTTTATCGTCGCAAGTTAAGCTGACTTAGGCGTTTTTGGAGT

                    *                         *
      66009 GGAATTCTGTTGAATCCTTTATCGTCGCAAGTTAGGCTGACTTAGGCGTTTTTGG
          1 GGAATTCTATTGAATCCTTTATCGTCGCAAGTTAAGCTGACTTAGGCGTTTTTGG

      66064 GAGCAAAGAA


Statistics
Matches: 52,  Mismatches: 3, Indels: 0
        0.95            0.05        0.00

Matches are distributed among these distances:
   58       52       1.00

ACGTcount: A:0.20, C:0.15, G:0.26, T:0.39

Consensus pattern (58 bp):
GGAATTCTATTGAATCCTTTATCGTCGCAAGTTAAGCTGACTTAGGCGTTTTTGGAGT


Done.