Tandem Repeats Finder Program written by:

                 Gary Benson
      Program in Bioinformatics
          Boston University

Version 4.09

Sequence: AWWV01008450.1 Corchorus capsularis cultivar CVL-1 contig08471, whole genome shotgun sequence

Parameters: 2 7 7 80 10 50 1000

Pmatch=0.80,Pindel=0.10
tuple sizes 0,4,5,7
tuple distances 0, 29, 159, 1000

Length: 43480
ACGTcount: A:0.34, C:0.17, G:0.17, T:0.32

Warning! 1 characters in sequence are not A, C, G, or T


Found at i:54 original size:19 final size:20

Alignment explanation

Indices: 14--55  Score: 52
Period size: 20  Copynumber: 2.1  Consensus size: 20

         4 AAAGAAGGTC

                         *
        14 AAAAAATGGATGTAGAGATA
         1 AAAAAATGGATGTAAAGATA

        34 AAAAAATGGTAT-TAAAG-TA
         1 AAAAAATGG-ATGTAAAGATA

        53 AAA
         1 AAA

        56 GAGTAAAGAA

Statistics
Matches: 20,  Mismatches: 1, Indels: 3
        0.83            0.04        0.12

Matches are distributed among these distances:
   19        5  0.25
   20       13  0.65
   21        2  0.10

ACGTcount: A:0.60, C:0.00, G:0.19, T:0.21

Consensus pattern (20 bp):
AAAAAATGGATGTAAAGATA

Found at i:185 original size:15 final size:15

Alignment explanation

Indices: 165--195  Score: 53
Period size: 15  Copynumber: 2.1  Consensus size: 15

       155 AAAGAGTAAG

                  *
       165 AAAAATGGTAAAAGT
         1 AAAAATGATAAAAGT

       180 AAAAATGATAAAAGT
         1 AAAAATGATAAAAGT

       195 A
         1 A

       196 GCAAAAGTAA

Statistics
Matches: 15,  Mismatches: 1, Indels: 0
        0.94            0.06        0.00

Matches are distributed among these distances:
   15       15  1.00

ACGTcount: A:0.65, C:0.00, G:0.16, T:0.19

Consensus pattern (15 bp):
AAAAATGATAAAAGT

Found at i:822 original size:19 final size:18

Alignment explanation

Indices: 789--824  Score: 54
Period size: 19  Copynumber: 1.9  Consensus size: 18

       779 TTGAAATAAT

       789 TCTTCAATGATCTTCAAA
         1 TCTTCAATGATCTTCAAA

                    *
       807 TCTTCAAATTATCTTCAA
         1 TCTTC-AATGATCTTCAA

       825 GAAATCTTCA

Statistics
Matches: 16,  Mismatches: 1, Indels: 1
        0.89            0.06        0.06

Matches are distributed among these distances:
   18        5  0.31
   19       11  0.69

ACGTcount: A:0.33, C:0.22, G:0.03, T:0.42

Consensus pattern (18 bp):
TCTTCAATGATCTTCAAA

Found at i:2454 original size:21 final size:20

Alignment explanation

Indices: 2414--2456  Score: 52
Period size: 21  Copynumber: 2.1  Consensus size: 20

      2404 ACAATTAATC

                          *
      2414 TAAATTAAAGCAAACTAAAT
         1 TAAATTAAAGCAAACGAAAT

      2434 TAAATCTAAAGCTAAA-GAAAT
         1 TAAAT-TAAAGC-AAACGAAAT

      2455 TA
         1 TA

      2457 CTTAAGAAAA

Statistics
Matches: 20,  Mismatches: 1, Indels: 3
        0.83            0.04        0.12

Matches are distributed among these distances:
   20        5  0.25
   21       12  0.60
   22        3  0.15

ACGTcount: A:0.58, C:0.09, G:0.07, T:0.26

Consensus pattern (20 bp):
TAAATTAAAGCAAACGAAAT

Found at i:5109 original size:20 final size:20

Alignment explanation

Indices: 5071--5110  Score: 55
Period size: 20  Copynumber: 2.0  Consensus size: 20

      5061 ATCACACATA

      5071 AAAATACCAAAAAGCATAGG
         1 AAAATACCAAAAAGCATAGG

                     *
      5091 AAAATGACCATAAAG-ATAGG
         1 AAAAT-ACCAAAAAGCATAGG

      5111 GTTAATTTTG

Statistics
Matches: 18,  Mismatches: 1, Indels: 2
        0.86            0.05        0.10

Matches are distributed among these distances:
   20       10  0.56
   21        8  0.44

ACGTcount: A:0.57, C:0.12, G:0.17, T:0.12

Consensus pattern (20 bp):
AAAATACCAAAAAGCATAGG

Found at i:5412 original size:17 final size:16

Alignment explanation

Indices: 5390--5434  Score: 54
Period size: 16  Copynumber: 2.8  Consensus size: 16

      5380 ACTGTGTTAG

                          *
      5390 ATTTTAAAACAAAACTA
         1 ATTTT-AAACAAAACAA

                 *
      5407 ATTTTAGACAAAACAA
         1 ATTTTAAACAAAACAA

                *
      5423 ATTTTGAACAAA
         1 ATTTTAAACAAA

      5435 TCTAGGTTTA

Statistics
Matches: 24,  Mismatches: 4, Indels: 1
        0.83            0.14        0.03

Matches are distributed among these distances:
   16       19  0.79
   17        5  0.21

ACGTcount: A:0.56, C:0.11, G:0.04, T:0.29

Consensus pattern (16 bp):
ATTTTAAACAAAACAA

Found at i:9080 original size:5 final size:5

Alignment explanation

Indices: 9070--9102  Score: 57
Period size: 5  Copynumber: 6.6  Consensus size: 5

      9060 AATAGGCAAT

                                            *
      9070 CAAAG CAAAG CAAAG CAAAG CAAAG CAATG CAA
         1 CAAAG CAAAG CAAAG CAAAG CAAAG CAAAG CAA

      9103 GAAAAATAAA

Statistics
Matches: 27,  Mismatches: 1, Indels: 0
        0.96            0.04        0.00

Matches are distributed among these distances:
    5       27  1.00

ACGTcount: A:0.58, C:0.21, G:0.18, T:0.03

Consensus pattern (5 bp):
CAAAG

Found at i:9655 original size:12 final size:11

Alignment explanation

Indices: 9632--9662  Score: 53
Period size: 11  Copynumber: 2.7  Consensus size: 11

      9622 TCAGAAAATT

      9632 AAAAAATTAAA
         1 AAAAAATTAAA

      9643 AAAAAATTAAA
         1 AAAAAATTAAA

      9654 AAATAAATT
         1 AAA-AAATT

      9663 TATGAAAAAA

Statistics
Matches: 19,  Mismatches: 0, Indels: 1
        0.95            0.00        0.05

Matches are distributed among these distances:
   11       14  0.74
   12        5  0.26

ACGTcount: A:0.77, C:0.00, G:0.00, T:0.23

Consensus pattern (11 bp):
AAAAAATTAAA

Found at i:9796 original size:14 final size:14

Alignment explanation

Indices: 9777--9805  Score: 58
Period size: 14  Copynumber: 2.1  Consensus size: 14

      9767 CCACTTGTAA

      9777 TCATCAAATTGATG
         1 TCATCAAATTGATG

      9791 TCATCAAATTGATG
         1 TCATCAAATTGATG

      9805 T
         1 T

      9806 AATCTTTACT

Statistics
Matches: 15,  Mismatches: 0, Indels: 0
        1.00            0.00        0.00

Matches are distributed among these distances:
   14       15  1.00

ACGTcount: A:0.34, C:0.14, G:0.14, T:0.38

Consensus pattern (14 bp):
TCATCAAATTGATG

Found at i:19990 original size:13 final size:13

Alignment explanation

Indices: 19972--19997  Score: 52
Period size: 13  Copynumber: 2.0  Consensus size: 13

     19962 CAACGATACC

     19972 TCGATATATCCAT
         1 TCGATATATCCAT

     19985 TCGATATATCCAT
         1 TCGATATATCCAT

     19998 GGACACATGT

Statistics
Matches: 13,  Mismatches: 0, Indels: 0
        1.00            0.00        0.00

Matches are distributed among these distances:
   13       13  1.00

ACGTcount: A:0.31, C:0.23, G:0.08, T:0.38

Consensus pattern (13 bp):
TCGATATATCCAT

Found at i:20285 original size:46 final size:44

Alignment explanation

Indices: 20211--20302  Score: 130
Period size: 46  Copynumber: 2.0  Consensus size: 44

     20201 ATCCATATTA

                      *
     20211 AATTAAATATTTTTTTTCATTTTCACATCTAGGAATAAAAATAT
         1 AATTAAATATTCTTTTTCATTTTCACATCTAGGAATAAAAATAT

                                    *       *  *
     20255 AATTAAATACGTTCTTTTTCATTTTTACATCTATGATTAAAAATAT
         1 AATTAAATA--TTCTTTTTCATTTTCACATCTAGGAATAAAAATAT

     20301 AA
         1 AA

     20303 GCGACATTTT

Statistics
Matches: 42,  Mismatches: 4, Indels: 2
        0.88            0.08        0.04

Matches are distributed among these distances:
   44        9  0.21
   46       33  0.79

ACGTcount: A:0.40, C:0.10, G:0.04, T:0.46

Consensus pattern (44 bp):
AATTAAATATTCTTTTTCATTTTCACATCTAGGAATAAAAATAT

Found at i:21535 original size:5 final size:5

Alignment explanation

Indices: 21525--21580  Score: 60
Period size: 5  Copynumber: 10.4  Consensus size: 5

     21515 TATATGTAGT

     21525 ATATA ATATA ATATA ATATAAA ATATA ATAT- ATATA ATAATA GTATATA
         1 ATATA ATATA ATATA ATAT--A ATATA ATATA ATATA AT-ATA --ATATA

     21574 ATATA AT
         1 ATATA AT

     21581 GTAGTATTTG

Statistics
Matches: 45,  Mismatches: 0, Indels: 12
        0.79            0.00        0.21

Matches are distributed among these distances:
    4        4  0.09
    5       28  0.62
    6        3  0.07
    7        8  0.18
    8        2  0.04

ACGTcount: A:0.59, C:0.00, G:0.02, T:0.39

Consensus pattern (5 bp):
ATATA

Found at i:21553 original size:22 final size:23

Alignment explanation

Indices: 21525--21580  Score: 80
Period size: 22  Copynumber: 2.5  Consensus size: 23

     21515 TATATGTAGT

     21525 ATATAATATAATATAAT-ATAAA
         1 ATATAATATAATATAATAATAAA

                                **
     21547 ATATAATAT-ATATAATAATAGT
         1 ATATAATATAATATAATAATAAA

     21569 ATATAATATAAT
         1 ATATAATATAAT

     21581 GTAGTATTTG

Statistics
Matches: 30,  Mismatches: 2, Indels: 3
        0.86            0.06        0.09

Matches are distributed among these distances:
   21        7  0.23
   22       21  0.70
   23        2  0.07

ACGTcount: A:0.59, C:0.00, G:0.02, T:0.39

Consensus pattern (23 bp):
ATATAATATAATATAATAATAAA

Found at i:21585 original size:17 final size:16

Alignment explanation

Indices: 21521--21587  Score: 63
Period size: 17  Copynumber: 4.4  Consensus size: 16

     21511 TGAATATATG

     21521 TAGTATATAATATAAT
         1 TAGTATATAATATAAT

              *    *
     21537 ATAATATAAAATATAA-
         1 -TAGTATATAATATAAT

     21553 TA-TATAT-A-ATAA-
         1 TAGTATATAATATAAT

     21565 TAGTATATAATATAAT
         1 TAGTATATAATATAAT

     21581 GTAGTAT
         1 -TAGTAT

     21588 TTGGTATAAA

Statistics
Matches: 42,  Mismatches: 3, Indels: 10
        0.76            0.05        0.18

Matches are distributed among these distances:
   12        6  0.14
   13        6  0.14
   14        5  0.12
   15        6  0.14
   17       19  0.45

ACGTcount: A:0.54, C:0.00, G:0.06, T:0.40

Consensus pattern (16 bp):
TAGTATATAATATAAT

Found at i:30929 original size:6 final size:6

Alignment explanation

Indices: 30918--31023  Score: 54
Period size: 6  Copynumber: 16.2  Consensus size: 6

     30908 TATCGAAAAT

                                 **    *                *     *
     30918 GAACCC GAACCC G-ACCC GGGCCC AAACCC GAACCC G-ATCC GAGCCC
         1 GAACCC GAACCC GAACCC GAACCC GAACCC GAACCC GAACCC GAACCC

     30964 GAAAATACCC GAACCC GAAATACCC GAACCC GAAAATACCC GAACCC GAACCC
         1 G---A-ACCC GAACCC G--A-ACCC GAACCC G---A-ACCC GAACCC GAACCC

     31017 GAACCC G
         1 GAACCC G

     31024 TCCAATTGCC

Statistics
Matches: 78,  Mismatches: 9, Indels: 26
        0.69            0.08        0.23

Matches are distributed among these distances:
    5        9  0.12
    6       49  0.63
    7        3  0.04
    8        1  0.01
    9        7  0.09
   10        9  0.12

ACGTcount: A:0.34, C:0.44, G:0.18, T:0.04

Consensus pattern (6 bp):
GAACCC

Found at i:30949 original size:23 final size:24

Alignment explanation

Indices: 30919--30981  Score: 74
Period size: 23  Copynumber: 2.5  Consensus size: 24

     30909 ATCGAAAATG

                            *
     30919 AACCCGAACCCGACCCGGGCCC-A
         1 AACCCGAACCCGACCCGAGCCCAA

                        *
     30942 AACCCGAACCCGATCCGAGCCCGAAA
         1 AACCCGAACCCGACCCGAGCCC--AA

     30968 ATACCCGAACCCGA
         1 A-ACCCGAACCCGA

     30982 AATACCCGAA

Statistics
Matches: 34,  Mismatches: 2, Indels: 4
        0.85            0.05        0.10

Matches are distributed among these distances:
   23       20  0.59
   26        2  0.06
   27       12  0.35

ACGTcount: A:0.32, C:0.46, G:0.19, T:0.03

Consensus pattern (24 bp):
AACCCGAACCCGACCCGAGCCCAA

Found at i:30986 original size:15 final size:16

Alignment explanation

Indices: 30956--31013  Score: 100
Period size: 16  Copynumber: 3.7  Consensus size: 16

     30946 CGAACCCGAT

               *
     30956 CCGAGCCCGAAAATAC
         1 CCGAACCCGAAAATAC

     30972 CCGAACCCG-AAATAC
         1 CCGAACCCGAAAATAC

     30987 CCGAACCCGAAAATAC
         1 CCGAACCCGAAAATAC

     31003 CCGAACCCGAA
         1 CCGAACCCGAA

     31014 CCCGAACCCG

Statistics
Matches: 40,  Mismatches: 1, Indels: 2
        0.93            0.02        0.05

Matches are distributed among these distances:
   15       15  0.38
   16       25  0.62

ACGTcount: A:0.40, C:0.40, G:0.16, T:0.05

Consensus pattern (16 bp):
CCGAACCCGAAAATAC

Found at i:32109 original size:18 final size:19

Alignment explanation

Indices: 32077--32116  Score: 64
Period size: 18  Copynumber: 2.1  Consensus size: 19

     32067 GAGTGTCTAA

     32077 TTAAAAAAATTTCAATTAAT
         1 TTAAAAAAA-TTCAATTAAT

     32097 TTAAAAAAA-TCAATTAAT
         1 TTAAAAAAATTCAATTAAT

     32115 TT
         1 TT

     32117 GAAATTTGAT

Statistics
Matches: 20,  Mismatches: 0, Indels: 2
        0.91            0.00        0.09

Matches are distributed among these distances:
   18       11  0.55
   20        9  0.45

ACGTcount: A:0.55, C:0.05, G:0.00, T:0.40

Consensus pattern (19 bp):
TTAAAAAAATTCAATTAAT

Found at i:41363 original size:16 final size:16

Alignment explanation

Indices: 41344--41453  Score: 84
Period size: 16  Copynumber: 6.9  Consensus size: 16

     41334 GACCTGACAA

                *
     41344 ACCCGTGACCCGAATG
         1 ACCCGAGACCCGAATG

                 *
     41360 ACCCGACACCC-AGATG
         1 ACCCGAGACCCGA-ATG

     41376 ACCCGAGACCCGAATG
         1 ACCCGAGACCCGAATG

              *      *      *
     41392 ACCTGTA-ACTC-AGATA
         1 ACCCG-AGACCCGA-ATG

                 *
     41408 ACCCGAAACCCGAATG
         1 ACCCGAGACCCGAATG

               *
     41424 ACCCAAGACCC-ATATG
         1 ACCCGAGACCCGA-ATG

                 *
     41440 ACCCGAAACCCGAA
         1 ACCCGAGACCCGAA

     41454 AAACCCAAGA

Statistics
Matches: 73,  Mismatches: 13, Indels: 16
        0.72            0.13        0.16

Matches are distributed among these distances:
   15        4  0.05
   16       65  0.89
   17        4  0.05

ACGTcount: A:0.35, C:0.37, G:0.18, T:0.10

Consensus pattern (16 bp):
ACCCGAGACCCGAATG

Found at i:41380 original size:32 final size:31

Alignment explanation

Indices: 41351--41453  Score: 125
Period size: 32  Copynumber: 3.2  Consensus size: 31

     41341 CAAACCCGTG

                                          *
     41351 ACCCGAATGACCCGACACCCAGATGACCCGAG
         1 ACCCGAATGACCCGA-ACCCAGATGACCCGAA

                       *     *     *
     41383 ACCCGAATGACCTGTAACTCAGATAACCCGAA
         1 ACCCGAATGACCCG-AACCCAGATGACCCGAA

                        *       *
     41415 ACCCGAATGACCCAAGACCCATATGACCCGAA
         1 ACCCGAATGACCCGA-ACCCAGATGACCCGAA

     41447 ACCCGAA
         1 ACCCGAA

     41454 AAACCCAAGA

Statistics
Matches: 60,  Mismatches: 9, Indels: 4
        0.82            0.12        0.05

Matches are distributed among these distances:
   31        1  0.02
   32       58  0.97
   33        1  0.02

ACGTcount: A:0.36, C:0.37, G:0.17, T:0.10

Consensus pattern (31 bp):
ACCCGAATGACCCGAACCCAGATGACCCGAA

Found at i:42782 original size:42 final size:42

Alignment explanation

Indices: 42723--42803  Score: 153
Period size: 42  Copynumber: 1.9  Consensus size: 42

     42713 GTTGAGACAG

     42723 ACCCCACCTGATAATTAATTATGTATTTAATATTCAAAACTT
         1 ACCCCACCTGATAATTAATTATGTATTTAATATTCAAAACTT

                               *
     42765 ACCCCACCTGATAATTAATTTTGTATTTAATATTCAAAA
         1 ACCCCACCTGATAATTAATTATGTATTTAATATTCAAAA

     42804 TTAATATCAA

Statistics
Matches: 38,  Mismatches: 1, Indels: 0
        0.97            0.03        0.00

Matches are distributed among these distances:
   42       38  1.00

ACGTcount: A:0.38, C:0.19, G:0.05, T:0.38

Consensus pattern (42 bp):
ACCCCACCTGATAATTAATTATGTATTTAATATTCAAAACTT

Found at i:42994 original size:16 final size:15

Alignment explanation

Indices: 42975--43144  Score: 123
Period size: 16  Copynumber: 10.9  Consensus size: 15

     42965 TTCAATGCTG

     42975 ACCCAATTGACCCGAA
         1 ACCCAA-TGACCCGAA

                 *
     42991 ACCCGAGTGACCCG-A
         1 ACCC-AATGACCCGAA

            *
     43006 AGCCAA-GACCC-AA
         1 ACCCAATGACCCGAA

                 ***      *
     43019 ACCCACCCAACCCGAG
         1 ACCCA-ATGACCCGAA

                         *
     43035 ACCCGAATGACCCGGA
         1 ACCC-AATGACCCGAA

                          *
     43051 ACCCGAATGACCCGAG
         1 ACCC-AATGACCCGAA

     43067 ACCCGAATGACCCGAA
         1 ACCC-AATGACCCGAA

                *         *
     43083 ACCCGTATGACCCGAG
         1 ACCC-AATGACCCGAA

                   *
     43099 ACCCGAATAACCC-AA
         1 ACCC-AATGACCCGAA

     43114 ACCCAGATGACCCGAA
         1 ACCCA-ATGACCCGAA

     43130 ACCCGAATGACCCGA
         1 ACCC-AATGACCCGA

     43145 GAAAACTACC

Statistics
Matches: 124,  Mismatches: 21, Indels: 18
        0.76            0.13        0.11

Matches are distributed among these distances:
   13       10  0.08
   14        2  0.02
   15       19  0.15
   16       90  0.73
   17        3  0.02

ACGTcount: A:0.35, C:0.40, G:0.19, T:0.06

Consensus pattern (15 bp):
ACCCAATGACCCGAA

Found at i:43064 original size:32 final size:32

Alignment explanation

Indices: 42974--43144  Score: 178
Period size: 32  Copynumber: 5.5  Consensus size: 32

     42964 ATTCAATGCT

                                   *
     42974 GACCC-AATTGACCCGAAACCCGAGTGACCCGA
         1 GACCCGAA-TGACCCGAAACCCGAATGACCCGA

             *                     ***
     43006 -AGCC-AA-GACCC-AAACCC-ACCCAACCCGA
         1 GACCCGAATGACCCGAAACCCGA-ATGACCCGA

                          *
     43034 GACCCGAATGACCCGGAACCCGAATGACCCGA
         1 GACCCGAATGACCCGAAACCCGAATGACCCGA

                                 *
     43066 GACCCGAATGACCCGAAACCCGTATGACCCGA
         1 GACCCGAATGACCCGAAACCCGAATGACCCGA

                    *
     43098 GACCCGAATAACCC-AAACCC-AGATGACCCGA
         1 GACCCGAATGACCCGAAACCCGA-ATGACCCGA

           *
     43129 AACCCGAATGACCCGA
         1 GACCCGAATGACCCGA

     43145 GAAAACTACC

Statistics
Matches: 116,  Mismatches: 15, Indels: 16
        0.79            0.10        0.11

Matches are distributed among these distances:
   27        1  0.01
   28       12  0.10
   29        8  0.07
   30        2  0.02
   31       37  0.32
   32       55  0.47
   33        1  0.01

ACGTcount: A:0.35, C:0.40, G:0.19, T:0.06

Consensus pattern (32 bp):
GACCCGAATGACCCGAAACCCGAATGACCCGA

Found at i:43140 original size:47 final size:47

Alignment explanation

Indices: 42982--43146  Score: 178
Period size: 48  Copynumber: 3.6  Consensus size: 47

     42972 CTGACCCAAT

                          *          *                  **
     42982 TGACCCGAAACCCGAGTGACCCGA-AGCC-AA-GACCCAAACCCACC
         1 TGACCCGAAACCCGAATGACCCGAGACCCGAATGACCCAAACCCAGA

           **      *                                *
     43026 CAACCCGAGACCCGAATGACCCG-GAACCCGAATGACCCGAGACCC-GAA
         1 TGACCCGAAACCCGAATGACCCGAG-ACCCGAATGACCC-AAACCCAG-A

                         *                  *
     43074 TGACCCGAAACCCGTATGACCCGAGACCCGAATAACCCAAACCCAGA
         1 TGACCCGAAACCCGAATGACCCGAGACCCGAATGACCCAAACCCAGA

     43121 TGACCCGAAACCCGAATGACCCGAGA
         1 TGACCCGAAACCCGAATGACCCGAGA

     43147 AAACTACCTG

Statistics
Matches: 98,  Mismatches: 15, Indels: 13
        0.78            0.12        0.10

Matches are distributed among these distances:
   44       19  0.19
   45        3  0.03
   46        2  0.02
   47       36  0.37
   48       37  0.38
   49        1  0.01

ACGTcount: A:0.35, C:0.39, G:0.20, T:0.06

Consensus pattern (47 bp):
TGACCCGAAACCCGAATGACCCGAGACCCGAATGACCCAAACCCAGA

Done.