Compiled by Sherwood Casjens, Daniel Haft, Jeremy Peterson, Brian Stevenson and Claire Fraser
Last modified on Feb. 3, 2000.
This document is also available as Macintosh Microsoft WORD 5.1 and OFFICE 98 (WORD98) documents from Sherwood Casjens.
Please send corrections, additions, comments, etc. to sherwood.casjens@hci.utah.edu.
Table of Contents
The Pseudo-, Questionable, and Short Genes on the B31 Plasmids
Ambiguous Nucleotides in the B. burgdorferi B31 Genome Sequence
PURPOSE
This document contains a number of tables which cross-annotate the current
knowledge of the B. burgdorferi B31 genome in various ways. We hope that
this cross-referencing will allow readers to browse through the information
profitably, and that it will allow them to become familiar with what is not
known as well as what is known about this genome. Major conclusions from this
analysis are published in Fraser et al. (1997) and Casjens et al. (2000).
ORGANIZATION
In each section of this document, plasmids are listed in the following order: circular plasmids in ascending order of number (approximate size), followed by linear plasmids in ascending order of number:
cp9, cp26, cp32-1, cp32-3, cp32-4, cp32-6, cp32-7, cp32-8, cp32-9,
lp5, lp17, lp21, lp25, lp28-1, lp28-2, lp28-3, lp28-4, lp36, lp38, lp54,
lp56
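The ordering rule above can be sketched as a sort key. This is only an illustration: the function name `plasmid_sort_key` and the regular expression are assumptions made for the example, not part of the original annotation pipeline.

```python
import re

# Assumed name pattern: "cp" or "lp", a nominal size in kb, and an optional "-N" suffix.
_NAME = re.compile(r"^(cp|lp)(\d+)(?:-(\d+))?$")

def plasmid_sort_key(name):
    """Sort circular (cp) before linear (lp), then by size number, then by suffix."""
    match = _NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognized plasmid name: {name!r}")
    topology, size, suffix = match.groups()
    return (0 if topology == "cp" else 1, int(size), int(suffix or 0))

names = ["lp56", "cp32-9", "lp5", "cp9", "lp28-2", "cp26", "lp17", "cp32-1"]
ordered = sorted(names, key=plasmid_sort_key)
print(ordered)
```

A plain alphabetical sort would place cp26 before cp9 and lp17 before lp5, which is why the size is compared as an integer.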
OPEN READING FRAMES and PREDICTED GENES
Throughout this document we use the words "gene"
and "protein" advisedly to mean a putative gene and a putative
protein that have been predicted from the nucleotide sequence. Since little
molecular biology has been done with these organisms, nearly all of the "genes"
in this document are currently only identified as open reading frames.
GENBANK ACCESSION NUMBERS and GENE NAME PREFIXES
The B. burgdorferi B31 chromosome and plasmid sequences are available at the TIGR Borrelia web site or from GENBANK. The accession numbers from GENBANK and gene name prefixes are as follows (as reported in Fraser et al. (1997) and Casjens et al. (2000)):
| Replicon | Accession # | Gene name prefix |
| Chromosome | AE000783 | BB0 (BBzero) |
| cp9 | AE000791 | BBC |
| cp26 | AE000792 | BBB |
| cp32-1 | AE001575 | BBP |
| cp32-3 | AE001576 | BBS |
| cp32-4 | AE001577 | BBR |
| cp32-6 | AE001578 | BBM |
| cp32-7 | AE001579 | BBO (BB"oh") |
| cp32-8 | AE001580 | BBL |
| cp32-9 | AE001581 | BBN |
| lp5 | AE001583 | BBT |
| lp17 | AE000793 | BBD |
| lp21 | AE001582 | BBU |
| lp25 | AE000785 | BBE |
| lp28-1 | AE000794 | BBF |
| lp28-2 | AE000786 | BBG |
| lp28-3 | AE000784 | BBH |
| lp28-4 | AE000789 | BBI |
| lp36 | AE000788 | BBK |
| lp38 | AE000787 | BBJ |
| lp54 | AE000790 | BBA |
| lp56 | AE001584 | BBQ |
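As a reading aid, the prefix table can be turned into a small lookup that resolves a gene name to its replicon. This is a minimal sketch: the helper `replicon_for` is hypothetical, and the chromosomal example name `BB0001` is assumed here only to illustrate the BB0 (digit zero) prefix as opposed to BBO (letter O, cp32-7).

```python
# Plasmid gene-name prefixes taken from the table above.
PREFIX_TO_REPLICON = {
    "BBC": "cp9", "BBB": "cp26", "BBP": "cp32-1", "BBS": "cp32-3",
    "BBR": "cp32-4", "BBM": "cp32-6", "BBO": "cp32-7", "BBL": "cp32-8",
    "BBN": "cp32-9", "BBT": "lp5", "BBD": "lp17", "BBU": "lp21",
    "BBE": "lp25", "BBF": "lp28-1", "BBG": "lp28-2", "BBH": "lp28-3",
    "BBI": "lp28-4", "BBK": "lp36", "BBJ": "lp38", "BBA": "lp54",
    "BBQ": "lp56",
}

def replicon_for(gene_name):
    """Return the replicon for a gene name such as 'BBP23' (hypothetical helper)."""
    name = gene_name.strip("()")  # withdrawn entries are listed in parentheses
    prefix = name[:3]
    if prefix == "BB0":  # digit zero marks the chromosome
        return "chromosome"
    try:
        return PREFIX_TO_REPLICON[prefix]
    except KeyError:
        raise ValueError(f"unknown gene name prefix in {gene_name!r}") from None

print(replicon_for("BBO23"), replicon_for("BB0001"))
```

Keeping the zero and the letter O as distinct three-character prefixes avoids the most likely transcription error when gene names are typed by hand.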
B31 Plasmid Open Reading Frame Summary
Sherwood Casjens - 1999
ALL B31 PLASMIDS
898 total gene-like entities. Among these gene-like entities are the following:
836 genes (which are not "questionable") + pseudogenes
167 pseudogenes (+ about 10 others that have marginal similarity to "intact" genes)
62 "questionable" genes (29 in-frame fragments of larger pseudogenes; 33 ≤300 bp genes inside a larger pseudogene in another frame and short genes that were not called in paralogous sequence elsewhere on the plasmids).
669 "intact" genes (which are not "questionable")
39 convincing similarity hits to genes of known function outside of Borrelia among plasmid genes
16 convincing similarity hits to genes of unknown function outside of Borrelia among plasmid genes
535 "intact" genes >300 bp (which are not "questionable")
134 "intact" genes ≤300 bp (which are not "questionable")
472 genes (which are not "questionable") have a paralog (it may not be intact)
197 genes (which are not "questionable") have no paralog (63 of these are >300 bp and 134 are ≤300 bp)
98 plasmid gene-like entities that encode potential lipoproteins
90 intact plasmid genes that encode potential lipoproteins
7 gene-like entities that we defined as pseudogenes have translation start codons that could possibly lead to expression of lipoproteins that are truncated relative to their paralogs
32 intact plasmid genes that are below but close to our lipidation cutoff
162 paralogous gene families, 107 of which have plasmid-borne members
9 paralogous gene families encode only predicted lipoproteins
17 paralogous gene families are heterogeneous in that they contain at least one potential lipoprotein (LP) and at least one non-LP
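The counts in this summary nest inside one another, and the relationships can be checked mechanically. The sketch below copies the figures from the list above; the dictionary keys and the helper `check_partitions` are names invented for this illustration.

```python
# Counts copied from the "ALL B31 PLASMIDS" summary.
counts = {
    "gene_like_entities": 898,
    "genes_plus_pseudogenes": 836,
    "questionable": 62,
    "questionable_in_frame_fragments": 29,
    "questionable_other": 33,
    "pseudogenes": 167,
    "intact": 669,
    "intact_over_300bp": 535,
    "intact_300bp_or_less": 134,
    "with_paralog": 472,
    "without_paralog": 197,
    "without_paralog_over_300bp": 63,
    "without_paralog_300bp_or_less": 134,
}

# Each entry pairs a stated total with the parts expected to add up to it.
partitions = [
    ("gene_like_entities", ["genes_plus_pseudogenes", "questionable"]),
    ("questionable", ["questionable_in_frame_fragments", "questionable_other"]),
    ("genes_plus_pseudogenes", ["intact", "pseudogenes"]),
    ("intact", ["intact_over_300bp", "intact_300bp_or_less"]),
    ("intact", ["with_paralog", "without_paralog"]),
    ("without_paralog", ["without_paralog_over_300bp", "without_paralog_300bp_or_less"]),
]

def check_partitions(counts, partitions):
    """Return the totals whose parts do not sum to the stated value."""
    return [total for total, parts in partitions
            if counts[total] != sum(counts[part] for part in parts)]

mismatches = check_partitions(counts, partitions)
print(mismatches or "all partitions are consistent")
```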
THE "LOW PSEUDOGENE" or "WELL BEHAVED" B31 PLASMIDS
These plasmids are: cp9, cp26, all seven of the cp32s, lp28-2, lp54 and the cp32-like portion of lp56
498 gene-like entities on the "well behaved" plasmids on which apparent protein-encoding genes occupy >70% of the DNA.
9 "questionable" genes (all are ≤300 bp genes inside a larger pseudogene in another frame or short genes that were not called in paralogous sequence elsewhere on the plasmids).
489 genes (which are not "questionable") + pseudogenes
22 pseudogenes
467 genes (which are not "questionable")
420 genes >300 bp (which are not "questionable")
47 genes ≤300 bp (which are not "questionable")
54 genes that encode potential lipoproteins
12 genes that are below but close to our lipidation cutoff
23 convincing matches to genes of known function outside of Borrelia among plasmid genes (which are not "questionable")
13 convincing matches to genes of unknown function outside of Borrelia among plasmid genes (which are not "questionable")
THE "HIGH PSEUDOGENE" or "NOT YET AMELIORATED" B31 PLASMIDS
These plasmids are: lp5, lp17, lp21, lp25, lp28-1, lp28-3, lp28-4, lp36, lp38 and the non-cp32-like portion of lp56
400 gene-like entities on the "bad" plasmids on which apparent protein-encoding genes occupy <75% of the DNA.
53 "questionable" genes (29 in-frame fragments of larger pseudogenes; 24 ≤300 bp genes inside a larger pseudogene in another frame and short genes that were not called in paralogous sequence elsewhere on the plasmids).
347 genes (which are not "questionable") + pseudogenes
145 pseudogenes
202 genes (which are not "questionable")
115 genes >300 bp (which are not "questionable")
87 genes ≤300 bp (which are not "questionable")
37 genes that encode potential lipoproteins
5 genes that are below but close to our lipidation cutoff
16 convincing matches to genes of known function outside of Borrelia among plasmid genes (which are not "questionable")
3 convincing matches to genes of unknown function outside of Borrelia among plasmid genes (which are not "questionable")
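The two plasmid groups can be compared with the all-plasmid totals, and the pseudogene fraction of each group computed, as in the sketch below. The row labels are invented for this example. The lipoprotein rows are left out because the group figures listed above (54 + 37 and 12 + 5) do not add up to the all-plasmid figures (90 and 32), and the summary does not say how those categories were counted per group.

```python
ROWS = (
    "gene_like_entities", "questionable", "genes_plus_pseudogenes",
    "pseudogenes", "intact", "over_300bp", "300bp_or_less",
    "hits_known_function", "hits_unknown_function",
)

# Figures copied from the three summary lists, in the order of ROWS.
well_behaved = dict(zip(ROWS, (498, 9, 489, 22, 467, 420, 47, 23, 13)))
high_pseudogene = dict(zip(ROWS, (400, 53, 347, 145, 202, 115, 87, 16, 3)))
all_plasmids = dict(zip(ROWS, (898, 62, 836, 167, 669, 535, 134, 39, 16)))

# The two groups together should reproduce the all-plasmid totals.
combined = {row: well_behaved[row] + high_pseudogene[row] for row in ROWS}

def pseudogene_fraction(group):
    """Pseudogenes as a fraction of genes + pseudogenes (questionable calls excluded)."""
    return group["pseudogenes"] / group["genes_plus_pseudogenes"]

low = pseudogene_fraction(well_behaved)
high = pseudogene_fraction(high_pseudogene)
print(combined == all_plasmids, f"{low:.1%}", f"{high:.1%}")
```

The fractions make the group names concrete: under these figures well under a tenth of the non-questionable entities on the "well behaved" plasmids are pseudogenes, against roughly four in ten on the others.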
Annotated B. burgdorferi B31 Plasmid Gene List
Compiled by Sherwood Casjens, Dan Haft and Jeremy Peterson - April 1999
Definitions for Gene List
Note that these definitions are NOT necessarily absolutely identical to those used in the other published gene lists and maps for B. burgdorferi or on the TIGR WEB site. In particular we have an expanded definition of "pseudogene" that includes truncated members of paralogous gene families.
Putative genes and gene names column lists all the putative "gene-like entities" - genes and pseudogenes - currently recognized in the twenty-one B. burgdorferi B31 plasmids. We tentatively interpret those genes not indicated to be pseudogenes to be intact and potentially functional, but since the functionality of most Borrelia genes is unknown this may not be true. The gene and plasmid names used here are those used in Fraser et al. (1997) and Casjens et al. (2000). Of course any given putative pseudo-, questionable, short, fragmented or frameshifted genes could in principle have an important function, but it seems likely that a substantial fraction of them are not functional.
Daggers mark computer-recognized ORFs that are in-frame and part of a larger pseudogene entity. To avoid counting the entity twice, these were ignored when compiling gene and pseudogene numbers in Casjens et al. (2000).
Coordinates - these columns list the positions of the 5′ and 3′ ends of the gene or pseudogene on the sequence of the relevant plasmid.
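A gene's length and orientation follow from the two coordinates. The sketch below assumes the coordinates are 1-based and inclusive, and that a 5′ coordinate larger than the 3′ coordinate means the gene is read from the complementary strand; both are assumptions of this example, though the listed coordinates give lengths divisible by three under them. The function name `gene_span` is hypothetical.

```python
def gene_span(five_prime, three_prime):
    """Return (length_bp, strand) for a gene from its 5' and 3' coordinates.

    Assumes 1-based inclusive coordinates; "-" marks a gene whose 5' end
    has the larger coordinate on the plasmid sequence.
    """
    length = abs(three_prime - five_prime) + 1
    strand = "+" if five_prime <= three_prime else "-"
    return length, strand

# Coordinates taken from the cp9 entries in the gene list.
print(gene_span(163, 1269))   # BBC01: 1107 bp, forward
print(gene_span(2700, 2593))  # BBC04: 108 bp, reverse
```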
Database hit outside Borrelia indicates all similarities to non-Borrelia sequences in the extant database as of January 1999. The criteria for inclusion in the list are those of the TIGR protocol, which uses BLAST (Altschul et al., 1997), and alignments can be found on the TIGR Borrelia WEB page. A search using EMOTIF (Nevill-Manning et al., 1998) did not find any additional convincing B31 plasmid gene similarities to previously known genes.
Common name column gives gene names previously used in the literature. If it was previously named in a strain other than B31, the Borrelia strain is given in parentheses. In addition, we and others have suggested more specific, clarifying common names for genes currently under study in the following paralogous families: mlp [family 113], bdr [80], rev [63] and erp [162/163/164] genes.
Paralog family column indicates the family of paralogous genes (homologs within B. burgdorferi B31) to which individual genes belong. A complete list of genes and pseudogenes in each of these paralogous gene families can be found in PART II of this document.
Comments Column
N-terminal lipidation consensus refers to genes whose products are most likely to be lipoproteins.
Near-consensus N-terminal lipidation signal refers to genes whose products may be lipoproteins, but whose N-terminal amino acid sequences did not quite meet the arbitrary cutoff that we set for inclusion in the "probable lipoproteins" category.
See PART III of this document for a discussion of the strategies used to identify possible lipoprotein-encoding genes.
Authentic frameshift genes contain one or a few simple frameshifts relative to their paralogs. It is unlikely that these are actually expressed by programmed frameshifting mechanisms, since they usually do not contain the expected translationally "slippery" sequences. The TIGR computer uses this term for damaged genes (hence it currently replaces "pseudogene" in some parts of the TIGR Borrelia web page). These are considered to be pseudogenes in this analysis (Casjens et al., 2000).
Authentic point mutation gene has an in-frame stop codon relative to its paralogs. These are considered to be pseudogenes in this analysis.
Gene fragments or truncated genes are substantially shorter than other members of their paralogous families. Some of these could be expressed and have a function, although they are included in the pseudogene category for ease of discussion in this analysis and to point out that they are truncated.
Pseudogenes are regions of DNA that are similar in sequence to a paralogous Borrelia gene or to a gene from another organism, but which are obviously truncated and/or do not have full open reading frames relative to those homologs. These mostly appear to be mutationally damaged genes - they include "authentic frameshift", "authentic point mutation", fused and truncated genes. These pseudogenes often contain multiple frameshifts, deletions, insertions and inversions (see Casjens et al., 2000).
Exceptions to this definition of a pseudogene are the 15 silent vlsE cassettes on lp28-1; these are not damaged and are apparently "designed" to be a reservoir of antigenic variation for the vlsE protein. They are pseudogenes in that they are incomplete relative to the expressed vlsE gene and are probably not expressed themselves.
Of course the gene fragments whose reading frames are intact, which we include in this category for ease of discussion, could in fact be expressed and if so could perform a function. Nonetheless such fragments are very unusual in prokaryotes, and given the other evidence for many rearrangements in the B31 plasmids (Casjens et al., 2000) it seems likely that many, if not all, of these fragments no longer have a biological function.
See PART IV of this document for a complete list of pseudogenes and the reasons why each is so classified.
"Questionable genes" were called by TIGR's standard gene recognition protocol, but there is reason to suspect they may be spurious calls. Examples include "computer-called genes" that lie inside another gene or pseudogene, and small genes that were not called in paralogous sequence elsewhere in the Borrelia sequence. Those marked with daggers (†) lie inside larger pseudogenes but were nonetheless called as genes by the TIGR protocol.
See PART IV of this document for a complete list of questionable genes and the reasons why each is so classified.
Short genes are <300 bp in length but ARE NOT in the "questionable" or "pseudogene" categories. The Borrelia plasmids have an inordinately large fraction of called genes that are <300 bp in length. These are often not tightly packed and fall into regions that contain no larger genes. Of course any given putative short gene could in principle be functional, but it seems likely that a substantial fraction of them are not functional.
See PART IV of this document for a complete list of short plasmid "genes".
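The way the "short gene" label interacts with the other categories can be sketched as a simple rule. This is an illustration under stated assumptions, not the procedure used to build the list: the function `classify` and its flags are hypothetical, the order in which "pseudogene" and "questionable" are tested is arbitrary, coordinates are assumed to be 1-based and inclusive, and the cutoff follows the "<300 bp" wording of the definition above (the plasmid summary uses "≤300 bp").

```python
SHORT_GENE_CUTOFF_BP = 300  # "<300 bp" in the definition above

def classify(five_prime, three_prime, is_pseudogene=False, is_questionable=False):
    """Label an entry; 'short gene' applies only outside the other two categories."""
    if is_pseudogene:
        return "pseudogene"
    if is_questionable:
        return "questionable gene"
    length = abs(three_prime - five_prime) + 1  # assumes inclusive coordinates
    return "short gene" if length < SHORT_GENE_CUTOFF_BP else "gene"

# Coordinates and flags taken from entries in the gene list.
print(classify(2700, 2593))                          # BBC04
print(classify(16, 321))                             # BBB01
print(classify(18829, 18737, is_questionable=True))  # BBR30
print(classify(25865, 25966, is_pseudogene=True))    # BBR40 (erpH)
```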
Putative functions were deduced in most cases from homologies to genes of known function.
WE EMPHASIZE ONE MORE TIME! Any given putative pseudo-, questionable, short, fragmented or frameshifted gene (as we have defined them) could in principle be functional. But it seems likely that a substantial fraction of them are not functional. We use the above pseudogene definitions only as terms to describe relevant features of the B31 plasmid genes, not to imply functionality in any specific cases.
A Complete B. burgdorferi B31 Plasmid Gene List
|
Putative Gene |
5′ end |
3′ end |
Database hit outside Borrelia {organism of best database hit} |
Common Name |
Paralog Family |
Comments/References |
|
cp9 |
A homolog of cp9, called cp8.3, from B. garinii strain Ip21 was completely sequenced by Dunn et al. (1994) |
|||||
|
BBC01 |
163 |
1269 |
57 |
|||
|
BBC02 |
1282 |
1836 |
50 |
|||
|
BBC03 |
1892 |
2449 |
49 |
|||
|
BBC04 |
2700 |
2593 |
short gene |
|||
|
BBC05 |
2804 |
3709 |
161 |
|||
|
BBC06 |
4377 |
3856 |
eppA |
95 |
exported protein (Champion et al., 1994) |
|
|
BBC07 |
4788 |
4507 |
short gene |
|||
|
BBC08 |
5534 |
5977 |
55 |
|||
|
(BBC09) |
Does not exist; erroneously present in original gene list and map in figure 2 of Fraser et al. (1997) |
|||||
|
BBC10 |
6808 |
6284 |
63 |
N-terminal lipidation consensus |
||
|
BBC11 |
6974 |
7768 |
96 |
|||
|
BBC12 |
9203 |
7914 |
165 |
|||
|
cp26 |
Homolog of cp26 present in essentially all isolates (e.g., Tilly et al., 1997) |
|||||
|
BBB01 |
16 |
321 |
conserved hypothetical protein {Escherichia coli} |
weak similarity to acylphosphatase |
||
|
BBB02 |
751 |
311 |
||||
|
BBB03 |
2186 |
840 |
weak (PSI-BLAST) similarity to phage N15 gene 29 |
The protein encoded by this gene has weak similarity to the putative "protelomerase" encoded by gene 29 of phage N15 (Ravin et al., in preparation). Circumstantial evidence suggests this N15 protein is responsible for hairpin end formation in the N15 prophage plasmid. |
||
|
BBB04 |
3807 |
2479 |
PTS system, cellobiose-specific IIC component (celB) {Bacillus stearothermophilus} |
possible chitobiose transporter (Fraser et al., 1997) |
||
|
BBB05 |
4084 |
4428 |
PTS system, cellobiose-specific IIA component (celC) {Bacillus subtilis} |
possible chitobiose transporter (Fraser et al., 1997) |
||
|
BBB06 |
4440 |
4754 |
PTS system, cellobiose-specific IIB component (celA) {Bacillus subtilis} |
possible chitobiose transporter (Fraser et al., 1997) |
||
|
BBB07 |
4769 |
5863 |
||||
|
BBB08 |
6517 |
5891 |
N-terminal lipidation consensus |
|||
|
BBB09 |
6677 |
7711 |
N-terminal lipidation consensus |
|||
|
BBB10 |
7836 |
8762 |
62 |
|||
|
BBB11 |
8781 |
9296 |
50 |
|||
|
BBB12 |
9275 |
10033 |
plasmid partition protein {Bacillus subtilis} |
32 |
putative plasmid partition function |
|
|
BBB13 |
10104 |
10649 |
49 |
|||
|
BBB14 |
11417 |
10923 |
N-terminal lipidation consensus |
|||
|
BBB15 |
11636 |
11737 |
short gene |
|||
|
BBB16 |
12014 |
13603 |
oligopeptide ABC transporter, periplasmic oligopeptide-binding protein {Escherichia coli} |
oppAIV |
37 |
N-terminal lipidation consensus, not surface exposed, and not essential in culture (Bono et al., 1998) |
|
BBB17 |
15107 |
13896 |
IMP dehydrogenase {Haemophilus influenzae} |
guaB |
IMP dehydrogenase (Margolis et al., 1994b; Zhou et al., 1997) |
|
|
BBB18 |
16718 |
15135 |
GMP synthase {Haemophilus influenzae} |
guaA |
putative GMP synthase (Margolis et al., 1994b); erroneous duplication in cp26 between BBB18 and BBB19 corrected in current gene list; affected originally released gene coordinates to right of BBB18 |
|
|
BBB19 |
16903 |
17532 |
ospC |
surface localized (Wilske et al., 1993), N-terminal lipidation consensus (Fuchs et al., 1992; Jauris-Heipke et al., 1993; Jauris-Heipke et al., 1995; Marconi et al., 1993c; Margolis et al., 1994a; Margolis et al., 1994b; Masuzawa et al., 1997; Stevenson and Barthold, 1994; Stevenson et al., 1994; Tilly et al., 1997; Wang et al., 1999; Wilske et al., 1996a; Wilske et al., 1996b); transcription start site (Marconi et al., 1993b); temperature regulation (Schwan et al., 1995; Stevenson et al., 1995) |
||
|
BBB20 |
17733 |
17626 |
short gene |
|||
|
BBB21 |
17750 |
17842 |
short gene |
|||
|
BBB22 |
19321 |
17969 |
conserved hypothetical protein MJ0326 {Methanococcus jannaschii} |
94 |
12 putative membrane spanning regions; homologs in E. coli |
|
|
BBB23 |
20822 |
19434 |
conserved hypothetical protein MJ0326 {Methanococcus jannaschii} |
94 |
12 putative membrane spanning regions; homologs in E. coli |
|
|
BBB24 |
21364 |
20861 |
near-consensus N-terminal lipidation signal |
|||
|
BBB25 |
21851 |
21342 |
N-terminal lipidation consensus |
|||
|
BBB26 |
21898 |
22590 |
||||
|
BBB27 |
23154 |
22606 |
N-terminal lipidation consensus |
|||
|
BBB28 |
23255 |
24496 |
||||
|
BBB29 |
24825 |
26450 |
PTS system, maltose and glucose-specific IIABC component (malX) {Escherichia coli} |
16 |
putative sugar transport |
|
|
cp32-1 |
||||||
|
BBP01 |
66 |
1286 |
146 |
|||
|
BBP02 |
1306 |
1995 |
147 |
|||
|
BBP03 |
2011 |
2565 |
148 |
|||
|
BBP04 |
2575 |
3336 |
148 |
|||
|
BBP05 |
3369 |
3938 |
148 |
|||
|
BBP06 |
3948 |
4919 |
149 |
(Casjens et al., 1997) |
||
|
BBP07 |
4936 |
5394 |
150 |
|||
|
BBP08 |
5379 |
5777 |
107 |
|||
|
BBP09 |
5768 |
6154 |
108 |
|||
|
BBP10 |
6154 |
6717 |
151 |
|||
|
BBP11 |
6701 |
7810 |
152 |
|||
|
BBP12 |
7828 |
8253 |
153 |
|||
|
BBP13 |
8272 |
8724 |
154 |
|||
|
BBP14 |
8724 |
8957 |
155 |
short gene |
||
|
BBP15 |
8968 |
10239 |
156 |
|||
|
BBP16 |
10265 |
10945 |
157 |
|||
|
BBP17 |
10952 |
11899 |
159 |
|||
|
BBP18 |
11920 |
12462 |
160 |
|||
|
BBP19 |
12495 |
12824 |
139 |
|||
|
BBP20 |
12824 |
13696 |
140 |
|||
|
BBP21 |
13709 |
14311 |
141 |
|||
|
BBP22 |
14324 |
15136 |
142 |
|||
|
BBP23 |
15215 |
15415 |
orfA-1; blyA-1 |
109 |
putative hemolysin; short gene; sequenced for homologous plasmids in strain 297 by Porcella et al. (1996) |
|
|
BBP24 |
15422 |
15766 |
orfB; blyB-1 |
111 |
putative hemolysin; sequenced for homologous plasmids in strain 297 by Porcella et al. (1996) |
|
|
BBP25 |
15759 |
16091 |
orfC |
112 |
(Gilmore and Mbow, 1998); sequenced in homologous plasmids of strain 297 by Porcella et al. (1996) |
|
|
BBP26 |
16081 |
16437 |
orfD |
143 |
(Gilmore and Mbow, 1998); sequenced in homologous plasmids of strain 297 by Porcella et al. (1996); near-consensus N-terminal lipidation signal but strain 297 homolog was not lipidated in E. coli. |
|
|
BBP27 |
17060 |
16581 |
rev-1 |
63 |
N-terminal lipidation consensus (Gilmore and Mbow, 1998); sequenced in homologous plasmids of strain 297 by Porcella et al. (1996) |
|
|
BBP28 |
17232 |
17675 |
mlpA |
113 |
N-terminal lipidation consensus (Gilmore and Mbow, 1998); sequenced in several homologous plasmids of strain 297 by Porcella et al. (1996); lipidated in E. coli (Porcella et al., 1996); paralog lipidated in B. afzelii (Theisen, 1996) |
|
|
BBP29 |
18728 |
17718 |
orf4-1 |
161 |
(Gilmore and Mbow, 1998) |
|
|
BBP30 |
19114 |
20211 |
orf1-1 |
57 |
(Zuckert and Meyer, 1996) |
|
|
BBP31 |
20224 |
20787 |
orf2-1 |
50 |
(Zuckert and Meyer, 1996) |
|
|
BBP32 |
20766 |
21503 |
plasmid partition protein {Bacillus subtilis} |
orfC-1 |
32 |
putative plasmid partition function (Zuckert and Meyer, 1996) |
|
BBP33 |
21510 |
22115 |
orf3-1 |
49 |
(Zuckert and Meyer, 1996) |
|
|
BBP34 |
22131 |
22760 |
bdrA |
80 |
contains 4.7 repeats of a 54 bp sequence; all "bdr" genes contain direct, tandem repeats (Casjens et al., 1999; Zuckert and Meyer, 1996) |
|
|
BBP35 |
23231 |
24553 |
orf8/7-1 |
165 |
(Casjens et al., 1997; Zuckert and Meyer, 1996) |
|
|
BBP36 |
24609 |
25031 |
orf10-1 |
144 |
(Casjens et al., 1997) |
|
|
BBP37 |
25816 |
25043 |
orf6-1 |
96 |
(Casjens et al., 1997) |
|
|
BBP38 |
26235 |
26765 |
erpA |
162 |
surface exposed (Lam et al., 1994); N-terminal lipidation consensus (Stevenson et al., 1996); lipidated in E. coli (Akins et al., 1995b; Wallich et al., 1995); erp-like genes have been sequenced from several other strains (Akins et al., 1999; Lam et al., 1994; Marconi et al., 1996b; Stevenson et al., 1997; Suk et al., 1995) |
|
|
BBP39 |
26796 |
27929 |
erpB |
163 |
N-terminal lipidation consensus (Stevenson et al., 1996) |
|
|
BBP40 |
28074 |
28652 |
114 |
|||
|
BBP41 |
28835 |
29398 |
115 |
|||
|
BBP42 |
29398 |
30747 |
conserved hypothetical protein Orf26 of phage φO1205 {Streptococcus thermophilus} |
145 |
(Amouriaux et al., 1993; Casjens et al., 1997); phage φO1205 Orf26 homology; Orf26 is a possible phage structural protein |
|
|
cp32-3 |
||||||
|
BBS01 |
66 |
1286 |
146 |
|||
|
BBS02 |
1306 |
1995 |
147 |
|||
|
BBS03 |
2011 |
2565 |
148 |
|||
|
BBS04 |
2575 |
3336 |
148 |
|||
|
BBS05 |
3369 |
3938 |
148 |
|||
|
BBS06 |
3963 |
4919 |
149 |
(Casjens et al., 1997) |
||
|
BBS07 |
4936 |
5394 |
150 |
|||
|
BBS08 |
5379 |
5777 |
107 |
|||
|
BBS09 |
5768 |
6154 |
108 |
|||
|
BBS10 |
6154 |
6717 |
151 |
|||
|
BBS11 |
6701 |
7810 |
152 |
|||
|
BBS12 |
7828 |
8253 |
153 |
|||
|
BBS13 |
8272 |
8724 |
154 |
|||
|
BBS14 |
8724 |
8957 |
155 |
short gene |
||
|
BBS15 |
8968 |
10239 |
156 |
|||
|
BBS16 |
10265 |
10945 |
157 |
|||
|
BBS17 |
10952 |
11899 |
159 |
|||
|
BBS18 |
11920 |
12462 |
160 |
|||
|
BBS19 |
12495 |
12824 |
139 |
|||
|
BBS20 |
12824 |
13696 |
140 |
|||
|
BBS21 |
13709 |
14311 |
141 |
|||
|
BBS22 |
14324 |
15133 |
142 |
|||
|
BBS23 |
15212 |
15412 |
blyA-3 |
109 |
putative hemolysin; short gene |
|
|
BBS24 |
15419 |
15763 |
blyB-3 |
111 |
putative hemolysin |
|
|
BBS25 |
15756 |
16088 |
112 |
|||
|
BBS26 |
16078 |
16434 |
143 |
near-consensus N-terminal lipidation signal |
||
|
BBS27 |
16586 |
16900 |
||||
|
BBS28 |
16915 |
17046 |
short gene |
|||
|
BBS29 |
17068 |
17694 |
bdrF |
80 |
contains 3.6 repeats of a 33 bp sequence |
|
|
BBS30 |
17803 |
18246 |
mlpC |
113 |
N-terminal lipidation consensus |
|
|
BBS31 |
19159 |
18290 |
orf4-3 |
161 |
(Zuckert and Meyer, 1996) |
|
|
BBS32 |
19198 |
19392 |
conserved hypothetical protein {Chlorella vulgaris} (similarity poor) |
questionable gene; gene not called in paralogous sequence on other cp32s |
||
|
BBS33 |
19605 |
20702 |
orf1-3 |
57 |
(Zuckert and Meyer, 1996) |
|
|
BBS34 |
20715 |
21278 |
50 |
|||
|
BBS35 |
21257 |
21994 |
plasmid partition protein {Bacillus subtilis} |
orfC-3 |
32 |
putative plasmid partition function; (Stevenson et al., 1998b) |
|
BBS36 |
22038 |
22577 |
orf3-3 |
49 |
(Stevenson et al., 1998b) |
|
|
BBS37 |
22593 |
23180 |
bdrE |
80 |
contains 4.1 repeats of a 54 bp sequence |
|
|
BBS38 |
23649 |
25013 |
orf8/7-3 |
165 |
(Casjens et al., 1997) |
|
|
BBS39 |
25069 |
25491 |
orf10-3 |
144 |
(Casjens et al., 1997) |
|
|
BBS40 |
26276 |
25503 |
orf6-3 |
96 |
(Casjens et al., 1997) |
|
|
BBS41 |
26708 |
27295 |
erpG; pG |
164 |
N-terminal lipidation consensus; (Stevenson et al., 1996; Wallich et al., 1995) |
|
|
BBS42 |
27410 |
27916 |
bapA |
95 |
(Stevenson et al., 1996; Wallich et al., 1995) |
|
|
BBS43 |
28067 |
28246 |
short gene |
|||
|
BBS44 |
28236 |
28871 |
115 |
|||
|
BBS45 |
28871 |
30220 |
conserved hypothetical protein Orf26 of phage φO1205 {Streptococcus thermophilus} |
145 |
(Amouriaux et al., 1993; Casjens et al., 1997); phage φO1205 Orf26 homology; Orf26 is a possible phage structural protein |
|
|
cp32-4 |
||||||
|
BBR01 |
66 |
1286 |
146 |
|||
|
BBR02 |
1306 |
1998 |
147 |
pseudogene; authentic frameshift |
||
|
BBR03 |
2013 |
2573 |
148 |
|||
|
BBR04 |
2580 |
3344 |
148 |
|||
|
BBR05 |
3340 |
3948 |
orfI |
148 |
(Casjens et al., 1997) |
|
|
BBR06 |
3958 |
4929 |
orfII |
149 |
(Casjens et al., 1997) |
|
|
BBR07 |
4952 |
5404 |
orfIII |
150 |
(Casjens et al., 1997) |
|
|
BBR08 |
5389 |
5787 |
orfIV |
107 |
(Casjens et al., 1997) |
|
|
BBR09 |
5778 |
6164 |
orfV |
108 |
(Casjens et al., 1997) |
|
|
BBR10 |
6164 |
6727 |
151 |
|||
|
BBR11 |
6711 |
7820 |
152 |
|||
|
BBR12 |
7838 |
8263 |
153 |
|||
|
BBR13 |
8282 |
8734 |
154 |
|||
|
BBR14 |
8734 |
8967 |
155 |
short gene |
||
|
BBR15 |
8978 |
10270 |
156 |
|||
|
BBR16 |
10296 |
10889 |
157 |
|||
|
BBR17 |
10896 |
11843 |
159 |
|||
|
BBR18 |
11864 |
12415 |
160 |
|||
|
BBR19 |
12448 |
12777 |
139 |
|||
|
BBR20 |
12777 |
13649 |
140 |
|||
|
BBR21 |
13662 |
14264 |
141 |
|||
|
BBR22 |
14277 |
15089 |
142 |
|||
|
BBR23 |
15167 |
15367 |
blyA-4 |
109 |
putative hemolysin; short gene |
|
|
BBR24 |
15374 |
15718 |
blyB-4 |
111 |
putative hemolysin |
|
|
BBR25 |
15711 |
16043 |
112 |
|||
|
BBR26 |
16033 |
16389 |
143 |
near-consensus N-terminal lipidation signal |
||
|
BBR27 |
16467 |
16994 |
bdrH |
80 |
sequenced in homologous plasmids of strain 297 by Porcella et al. (1996) and in B. afzelii by Theisen (1996) |
|
|
BBR28 |
17103 |
17522 |
mlpD |
113 |
N-terminal lipidation consensus |
|
|
BBR29 |
18664 |
17576 |
161 |
|||
|
BBR30 |
18829 |
18737 |
questionable gene; gene not called in paralogous sequence on other cp32s |
|||
|
BBR31 |
18960 |
20054 |
57 |
|||
|
BBR32 |
20067 |
20630 |
50 |
|||
|
BBR33 |
20609 |
21361 |
plasmid partition protein {Bacillus subtilis} |
orfC-4 |
32 |
putative plasmid partition function (Stevenson et al., 1998b) |
|
BBR34 |
21415 |
21957 |
orf3-4 |
49 |
(Stevenson et al., 1998b) |
|
|
BBR35 |
21974 |
22249 |
bdrG |
80 |
authentic point mutation; has an in-frame stop codon |
|
|
BBR36 |
22831 |
24153 |
165 |
|||
|
BBR37 |
24210 |
24632 |
orf10-4 |
144 |
(Casjens et al., 1997) |
|
|
BBR38 |
25435 |
24644 |
orf6-4 |
96 |
(Casjens et al., 1997); sequence from strain N40 - accession # AF011453 |
|
|
BBR39 |
25636 |
25538 |
questionable gene; gene not called in paralogous sequence on other cp32s |
|||
|
BBR40 |
25865 |
25966 |
erpH |
162 |
pseudogene; severely truncated relative to other erps; N-terminal lipidation consensus (Stevenson et al., 1996) |
|
|
BBR41 |
26077 |
26817 |
161/162 |
pseudogene; this is a "fusion" gene - a family [161] gene is fused to a family [162] erp gene |
||
|
BBR42 |
26853 |
27524 |
erpY |
164 |
N-terminal lipidation consensus |
|
|
BBR43 |
27634 |
28200 |
114 |
|||
|
BBR44 |
28384 |
28947 |
115 |
|||
|
BBR45 |
28947 |
30296 |
conserved hypothetical protein Orf26 of phage φO1205 {Streptococcus thermophilus} |
145 |
homolog of Streptococcus thermophilus phage φO1205 gene orf26, which is likely to encode a phage structural protein (Amouriaux et al., 1993; Casjens et al., 1997) |
|
|
cp32-6 |
||||||
|
BBM01 |
66 |
1286 |
146 |
|||
|
BBM02 |
1306 |
1995 |
147 |
|||
|
BBM03 |
2010 |
2570 |
148 |
|||
|
BBM04 |
2577 |
3341 |
148 |
|||
|
BBM05 |
3337 |
3945 |
148 |
|||
|
BBM06 |
3955 |
4926 |
149 |
|||
|
BBM07 |
4949 |
5401 |
150 |
|||
|
BBM08 |
5386 |
5784 |
107 |
|||
|
BBM09 |
5775 |
6161 |
108 |
|||
|
BBM10 |
6161 |
6727 |
151 |
|||
|
BBM11 |
6711 |
7820 |
152 |
|||
|
BBM12 |
7838 |
8263 |
153 |
|||
|
BBM13 |
8282 |
8734 |
154 |
|||
|
BBM14 |
8734 |
8967 |
155 |
short gene |
||
|
BBM15 |
8978 |
10249 |
156 |
|||
|
BBM16 |
10275 |
10955 |
157 |
|||
|
BBM17 |
10962 |
11909 |
159 |
|||
|
BBM18 |
11930 |
12481 |
160 |
|||
|
BBM19 |
12514 |
12843 |
139 |
|||
|
BBM20 |
12843 |
13715 |
140 |
|||
|
BBM21 |
13728 |
14330 |
141 |
|||
|
BBM22 |
14343 |
15152 |
142 |
|||
|
BBM23 |
15231 |
15431 |
blyA-6 |
109 |
putative hemolysin; short gene |
|
|
BBM24 |
15438 |
15782 |
blyB-6 |
111 |
putative hemolysin |
|
|
BBM25 |
15775 |
16107 |
112 |
|||
|
BBM26 |
16097 |
16453 |
143 |
near-consensus N-terminal lipidation signal |
||
|
BBM27 |
17075 |
16596 |
rev-6 |
63 |
N-terminal lipidation consensus |
|
|
BBM28 |
17247 |
17693 |
mlpF |
113 |
N-terminal lipidation consensus |
|
|
BBM29 |
18680 |
17736 |
161 |
|||
|
BBM30 |
19069 |
20166 |
57 |
|||
|
BBM31 |
20179 |
20742 |
50 |
|||
|
BBM32 |
20721 |
21467 |
plasmid partition protein {Bacillus subtilis} |
orfC-6 |
32 |
putative plasmid partition (Stevenson et al., 1998b) |
|
BBM33 |
21520 |
22095 |
orf3-6 |
49 |
(Stevenson et al., 1998b) |
|
|
BBM34 |
22102 |
22767 |
bdrK |
80 |
||
|
BBM35 |
23241 |
24563 |
165 |
|||
|
BBM36 |
24619 |
25041 |
144 |
|||
|
BBM37 |
25820 |
25053 |
96 |
only [96] member with signal sequence |
||
|
BBM38 |
26245 |
27012 |
erpK |
164 |
N-terminal lipidation consensus; (Casjens et al., 1997) |
|
|
BBM39 |
27745 |
27080 |
||||
|
BBM40 |
27731 |
27850 |
questionable gene; gene not called in paralogous sequence on other cp32s |
|||
|
BBM41 |
27923 |
28486 |
115 |
|||
|
BBM42 |
28486 |
29835 |
conserved hypothetical protein Orf26 of phage φO1205 {Streptococcus thermophilus} |
145 |
phage φO1205 Orf26 homology; Orf26 is a possible phage structural protein (Amouriaux et al., 1993; Casjens et al., 1997) |
|
|
cp32-7 |
||||||
|
BBO01 |
65 |
1285 |
146 |
|||
|
BBO02 |
1305 |
1994 |
147 |
|||
|
BBO03 |
2010 |
2564 |
148 |
|||
|
BBO04 |
2574 |
3335 |
148 |
|||
|
BBO05 |
3368 |
3937 |
148 |
|||
|
BBO06 |
3962 |
4918 |
149 |
|||
|
BBO07 |
4935 |
5393 |
150 |
|||
|
BBO08 |
5378 |
5776 |
107 |
|||
|
BBO09 |
5767 |
6153 |
108 |
|||
|
BBO10 |
6153 |
6719 |
151 |
|||
|
BBO11 |
6703 |
7812 |
152 |
|||
|
BBO12 |
7830 |
8255 |
153 |
|||
|
BBO13 |
8274 |
8726 |
154 |
|||
|
BBO14 |
8726 |
8959 |
155 |
short gene |
||
|
BBO15 |
8970 |
10301 |
156 |
|||
|
BBO16 |
10317 |
10955 |
157 |
|||
|
BBO17 |
10962 |
11900 |
159 |
|||
|
BBO18 |
11904 |
12470 |
160 |
|||
|
BBO19 |
12503 |
12832 |
139 |
|||
|
BBO20 |
12832 |
13707 |
140 |
|||
|
BBO21 |
13716 |
14318 |
141 |
|||
|
BBO22 |
14331 |
15143 |
142 |
|||
|
BBO23 |
15222 |
15422 |
blyA-7 |
109 |
putative hemolysin; short gene |
|
|
BBO24 |
15429 |
15782 |
blyB-7 |
111 |
putative hemolysin |
|
|
BBO25 |
15766 |
16098 |
112 |
|||
|
BBO26 |
16088 |
16444 |
143 |
near-consensus N-terminal lipidation signal |
||
|
BBO27 |
16522 |
17136 |
bdrN |
80 |
||
|
BBO28 |
17245 |
17664 |
mlpG |
113 |
N-terminal lipidation consensus |
|
|
BBO29 |
18770 |
17715 |
161 |
|||
|
BBO30 |
19117 |
20211 |
57 |
|||
|
BBO31 |
20224 |
20787 |
50 |
|||
|
BBO32 |
20766 |
21512 |
plasmid partition protein {Bacillus subtilis} |
orfC-7 |
32 |
putative plasmid partition function (Stevenson et al., 1998b) |
|
BBO33 |
21522 |
22073 |
orf3-7 |
49 |
(Stevenson et al., 1998b) |
|
|
BBO34 |
22088 |
22657 |
bdrM |
80 |
||
|
BBO35 |
22755 |
22630 |
questionable gene; gene not called in paralogous sequence in other cp32s |
|||
|
BBO36 |
23093 |
24457 |
165 |
|||
|
BBO37 |
24513 |
24935 |
144 |
|||
|
BBO38 |
25720 |
24947 |
96 |
|||
|
BBO39 |
26152 |
26838 |
erpL |
164 |
N-terminal lipidation consensus (Casjens et al., 1997) |
|
|
BBO40 |
26893 |
27981 |
erpM |
163 |
N-terminal lipidation consensus (Casjens et al., 1997) |
|
|
BBO41 |
28117 |
28007 |
116 |
questionable gene; gene not called in paralogous sequence in other cp32s |
||
|
BBO42 |
28134 |
28700 |
114 |
|||
|
BBO43 |
28885 |
29448 |
115 |
|||
|
BBO44 |
29448 |
30797 |
conserved hypothetical protein Orf26 of phage φO1205 {Streptococcus thermophilus} |
145 |
(Amouriaux et al., 1993; Casjens et al., 1997); phage φO1205 Orf26 homology; Orf26 is a possible phage structural protein |
|
|
cp32-8 |
||||||
|
BBL01 |
66 |
1286 |
146 |
|||
|
BBL02 |
1306 |
1995 |
147 |
|||
|
BBL03 |
2011 |
2565 |
148 |
|||
|
BBL04 |
2575 |
3336 |
148 |
|||
|
BBL05 |
3369 |
3938 |
148 |
|||
|
BBL06 |
3948 |
4919 |
149 |
|||
|
BBL07 |
4936 |
5394 |
150 |
|||
|
BBL08 |
5379 |
5777 |
107 |
|||
|
BBL09 |
5768 |
6154 |
108 |
|||
|
BBL10 |
6154 |
6717 |
151 |
|||
|
BBL11 |
6701 |
7810 |
152 |
|||
|
BBL12 |
7828 |
8253 |
153 |
|||
|
BBL13 |
8272 |
8724 |
154 |
|||
|
BBL14 |
8724 |
8957 |
155 |
short gene |
||
|
BBL15 |
8968 |
10239 |
156 |
|||
|
BBL16 |
10265 |
10945 |
157 |
|||
|
BBL17 |
10952 |
11899 |
159 |
|||
|
BBL18 |
11920 |
12462 |
160 |
|||
|
BBL19 |
12495 |
12824 |
139 |
|||
|
BBL20 |
12824 |
13696 |
140 |
|||
|
BBL21 |
13709 |
14311 |
141 |
|||
|
BBL22 |
14324 |
15136 |
142 |
|||
|
BBL23 |
15215 |
15415 |
blyA-8 |
109 |
putative hemolysin; short gene |
|
|
BBL24 |
15422 |
15766 |
blyB-8 |
111 |
putative hemolysin |
|
|
BBL25 |
15759 |
16091 |
112 |
|||
|
BBL26 |
16081 |
16437 |
143 |
near-consensus N-terminal lipidation signal |
||
|
BBL27 |
16515 |
17096 |
bdrO |
80 |
||
|
BBL28 |
17205 |
17648 |
mlpH |
113 |
N-terminal lipidation consensus |
|
|
BBL29 |
18761 |
17691 |
161 |
|||
|
BBL30 |
19091 |
20185 |
57 |
|||
|
BBL31 |
20198 |
20761 |
50 |
|||
|
BBL32 |
20740 |
21477 |
plasmid partition protein {Bacillus subtilis} |
32 |
putative plasmid partition function |
|
|
BBL33 |
21467 |
21556 |
short gene |
|||
|
BBL34 |
21540 |
22097 |
49 |
|||
|
BBL35 |
22113 |
22688 |
bdrP |
80 |
||
|
BBL36 |
23306 |
24688 |
165 |
|||
|
BBL37 |
24744 |
25166 |
144 |
|||
|
BBL38 |
25951 |
25178 |
96 |
|||
|
BBL39 |
26370 |
26900 |
erpN |
162 |
N-terminal lipidation consensus |
|
|
BBL40 |
26931 |
28064 |
erpO |
163 |
N-terminal lipidation consensus |
|
|
BBL41 |
28209 |
28787 |
114 |
|||
|
BBL42 |
28970 |
29533 |
115 |
|||
|
BBL43 |
29533 |
30882 |
conserved hypothetical protein Orf26 of phage fO1205 {Streptococcus thermophilus} |
145 |
phage fO1205 Orf26 homology; Orf26 is a possible phage structural protein |
|
|
cp32-9 |
||||||
|
BBN01 |
66 |
1289 |
146 |
|||
|
BBN02 |
1309 |
1998 |
147 |
|||
|
BBN03 |
2013 |
2576 |
148 |
|||
|
BBN04 |
2583 |
3347 |
148 |
|||
|
BBN05 |
3343 |
3950 |
148 |
pseudogene; authentic frameshift |
||
|
BBN06 |
3960 |
4935 |
149 |
pseudogene; authentic frameshift |
||
|
BBN07 |
4958 |
5410 |
150 |
|||
|
BBN08 |
5395 |
5793 |
107 |
|||
|
BBN09 |
5784 |
6170 |
108 |
|||
|
BBN10 |
6170 |
6733 |
151 |
|||
|
BBN11 |
6717 |
7802 |
152 |
|||
|
BBN12 |
7845 |
8270 |
153 |
|||
|
BBN13 |
8289 |
8742 |
154 |
pseudogene; authentic frameshift |
||
|
BBN14 |
8742 |
8975 |
155 |
short gene |
||
|
BBN15 |
8986 |
10257 |
156 |
|||
|
BBN16 |
10283 |
11034 |
157 |
pseudogene; authentic frameshift |
||
|
BBN17 |
11041 |
11988 |
159 |
|||
|
BBN18 |
12009 |
12560 |
160 |
pseudogene; authentic frameshift |
||
|
BBN19 |
12593 |
12922 |
139 |
|||
|
BBN20 |
12922 |
13794 |
140 |
|||
|
BBN21 |
13807 |
14410 |
141 |
pseudogene; authentic frameshift |
||
|
BBN22 |
14423 |
15230 |
orfX-9 |
142 |
pseudogene; authentic frameshift (Guina and Oliver, 1997) |
|
|
BBN23 |
15312 |
15512 |
blyA-9 |
109 |
pore-forming hemolysin; short gene (Guina and Oliver, 1997) |
|
|
BBN24 |
15519 |
15863 |
blyB-9 |
111 |
hemolysin accessory protein (Guina and Oliver, 1997) |
|
|
BBN25 |
15856 |
16188 |
orfC-9 |
112 |
(Guina and Oliver, 1997) |
|
|
BBN26 |
16178 |
16534 |
orfD-9 |
143 |
(Guina and Oliver, 1997); near-consensus N-terminal lipidation signal |
|
|
BBN27 |
16612 |
17193 |
bdrR |
80 |
(Guina and Oliver, 1997) |
|
|
BBN28 |
17302 |
17727 |
mlpI |
113 |
N-terminal lipidation consensus |
|
|
BBN29 |
18784 |
17776 |
161 |
pseudogene; authentic frameshift |
||
|
BBN30 |
19164 |
20261 |
57 |
|||
|
BBN31 |
20275 |
20838 |
50 |
|||
|
BBN32 |
20817 |
21569 |
plasmid partition protein {Bacillus subtilis} |
32 |
putative plasmid partition function |
|
|
BBN33 |
21614 |
22171 |
49 |
|||
|
BBN34 |
22184 |
22720 |
bdrQ |
80 |
||
|
BBN35 |
23194 |
24516 |
165 |
|||
|
BBN36 |
24572 |
24994 |
144 |
|||
|
BBN37 |
25779 |
25006 |
96 |
pseudogene; authentic frameshift |
||
|
BBN38 |
26198 |
26767 |
erpP |
162 |
N-terminal lipidation consensus |
|
|
BBN39 |
26798 |
27826 |
erpQ |
163 |
N-terminal lipidation consensus |
|
|
BBN40 |
27991 |
27884 |
116 |
questionable gene; gene not called in paralogous sequence on other cp32s |
||
|
BBN41 |
27984 |
28541 |
114 |
|||
|
BBN42 |
28736 |
29299 |
115 |
|||
|
BBN43 |
29299 |
30648 |
conserved hypothetical protein Orf26 of phage fO1205 {Streptococcus thermophilus} |
145 |
phage fO1205 Orf26 homology; Orf26 is a possible phage structural protein |
|
|
lp5 |
||||||
|
BBT01 |
195 |
635 |
57 |
pseudogene |
||
|
BBT02 |
744 |
1094 |
57 |
pseudogene |
||
|
BBT03 |
1208 |
1573 |
84 |
(pseudogene; also [57] related?) |
||
|
BBT04 |
2148 |
3251 |
57 |
|||
|
BBT05 |
3200 |
3350 |
57 |
pseudogene |
||
|
BBT06 |
3340 |
4329 |
conserved hypothetical protein ywlC {Bacillus subtilis} |
137 |
family of genes that includes yeast SUA gene |
|
|
BBT07 |
4388 |
4816 |
52 |
pseudogene |
||
|
lp17 |
lp17 from B31 was independently sequenced by Barbour et al. (1996). Hinnebusch et al. (1990) determined the sequences of the two telomeres; those sequences contain 29 bp and 78 bp that are not present in the TIGR sequence |
|||||
|
BBD001 |
214 |
405 |
orfA |
166 |
short gene; N-terminal lipidation consensus |
|
|
BBD01 |
332 |
802 |
orfB |
76 |
||
|
BBD02 |
873 |
1019 |
57/77 |
pseudogene |
||
|
BBD03 |
1117 |
1309 |
57 |
pseudogene |
||
|
BBD04 |
1412 |
1765 |
orfC |
57 |
pseudogene; different translation start called by Barbour et al. (1996) |
|
|
BBD05 |
2389 |
2541 |
orfD |
84 |
(pseudogene [57]?) |
|
|
BBD05.1 |
3018 |
3604 |
57 |
pseudogene |
||
|
BBD06 |
3143 |
3604 |
orfE |
57 |
in-frame and inside pseudogene BBD05.1; questionable gene |
|
|
BBD07 |
4373 |
4260 |
82 |
short gene |
||
|
BBD08 |
4707 |
4802 |
transposase-like protein {Anabaena} |
82 |
pseudogene |
|
|
BBD09 |
5058 |
5738 |
orfF |
|||
|
BBD10 |
6454 |
5879 |
orfG |
N-terminal lipidation consensus |
||
|
BBD11 |
6681 |
7631 |
orfH |
sequence difference caused Barbour et al. (1996) orfH to end at ~7556 |
||
|
BBD12 |
7752 |
7624 |
short gene |
|||
|
BBD13 |
7787 |
8110 |
orfI |
|||
|
BBD14 |
8269 |
9378 |
orfJ |
62 |
||
|
BBD15 |
10015 |
9596 |
orfK |
85 |
different start called by Barbour et al. (1996); N-terminal lipidation consensus |
|
|
BBD15.01 |
10000 |
9940 |
175 |
pseudogene |
||
|
BBD15.1 |
10152 |
10326 |
transposase-like protein {Anabaena} |
82 |
pseudogene |
|
|
BBD16 |
10520 |
10428 |
short gene |
|||
|
BBD17 |
10591 |
10683 |
short gene |
|||
|
BBD18 |
11648 |
10989 |
orfL |
|||
|
BBD19 |
12057 |
12167 |
short gene |
|||
|
BBD20 |
12250 |
12975 |
transposase-like protein {Anabaena} |
82 |
pseudogene |
|
|
21 bp repeat |
13154 |
13329 |
8.3 tandem, direct repeats of TAATTAATATGTGATATAAAA; not in a gene |
|||
|
BBD21 |
13341 |
14078 |
plasmid partition protein {Bacillus subtilis} |
orfM |
32 |
putative plasmid partition function |
|
BBD22 |
14072 |
14338 |
orfN |
short gene |
||
|
BBD23 |
14781 |
15725 |
transposase-like protein {Anabaena} |
82 |
pseudogene |
|
|
BBD24 |
16121 |
15894 |
orfO |
N-terminal lipidation consensus; short gene |
||
|
BBD25 |
16212 |
16367 |
short gene |
|||
|
lp21 |
||||||
|
BBU01 |
184 |
615 |
57 |
pseudogene |
||
|
BBU02 |
746 |
1111 |
84 |
(pseudogene [57]?) |
||
|
BBU03 |
1357 |
1241 |
172 |
short gene |
||
|
BBU04 |
1486 |
2625 |
57 |
|||
|
BBU05 |
2868 |
3653 |
plasmid partition protein {Bacillus subtilis} |
32 |
putative plasmid partition function |
|
|
63 bp repeat (not a gene) |
3618 |
14636 |
tandem array of about 176 repeats of 63 bp sequence; not in ORF, has stop codons in all 6 frames (Casjens et al., 2000) |
|||
|
BBU06 |
14633 |
15232 |
49 |
|||
|
BBU07 |
15349 |
15810 |
57 |
pseudogene |
||
|
BBU08 |
15791 |
16081 |
137 |
pseudogene; short gene |
||
|
BBU09 |
16548 |
16231 |
55 |
pseudogene? |
||
|
BBU10 |
16603 |
16797 |
57 |
pseudogene; short gene |
||
|
BBU11 |
16886 |
17875 |
protein Y02L_MYCTU {Mycobacterium tuberculosis} |
137 |
family of genes that includes B. subtilis ywlC and yeast SUA genes |
|
|
BBU12 |
17918 |
18362 |
52 |
pseudogene; authentic frameshift |
||
|
lp25 |
||||||
|
BBE01 |
255 |
157 |
short gene |
|||
|
BBE02 |
4156 |
326 |
1 |
|||
|
BBE03 |
4613 |
4422 |
98 |
short gene |
||
|
BBE04 |
4719 |
4856 |
54 |
pseudogene; near-consensus N-terminal lipidation signal |
||
|
BBE04.1 |
5377 |
5734 |
44 |
pseudogene |
||
|
BBE05 |
5377 |
5526 |
44 |
inside and in-frame with BBE04.1 |
||
|
BBE06 |
5757 |
5903 |
N-terminal lipidation consensus; short gene |
|||
|
BBE07 |
6401 |
6185 |
pfs-I protein {Escherichia coli} |
26 |
pseudogene |
|
|
BBE08 |
6701 |
6558 |
N-terminal lipidation consensus; short gene |
|||
|
BBE09 |
6898 |
7758 |
44 |
N-terminal lipidation consensus |
||
|
BBE10 |
7972 |
7877 |
short gene |
|||
|
BBE11 |
8446 |
8315 |
short gene |
|||
|
BBE12 |
8646 |
8524 |
short gene |
|||
|
BBE13 |
8863 |
8955 |
short gene |
|||
|
BBE14 |
9163 |
9375 |
short gene |
|||
|
BBE15 |
9490 |
9356 |
short gene |
|||
|
BBE16 |
10187 |
9570 |
99 |
near-consensus N-terminal lipidation signal (but short signal sequence?) |
||
|
BBE17 |
10709 |
10203 |
||||
|
BBE18 |
12079 |
11501 |
49 |
|||
|
BBE19 |
12854 |
12099 |
plasmid partition protein {Bacillus subtilis} |
32 |
putative plasmid partition function |
|
|
BBE20 |
13393 |
12833 |
50 |
|||
|
BBE21 |
14530 |
13406 |
57 |
|||
|
BBE21.1 |
14767 |
14893 |
transposase-like protein {Anabaena} |
82 |
pseudogene |
|
|
BBE22 |
15578 |
15045 |
pyrazinamidase/nicotinamidase (pncA) {Mycobacterium tuberculosis} |
putative pyrazinamidase/nicotinamidase (pncA) |
||
|
BBE23 |
15973 |
16155 |
short gene |
|||
|
BBE23.1 |
16459 |
16540 |
57 |
pseudogene |
||
|
BBE23.2 |
16540 |
16721 |
plasmid partition protein {Bacillus subtilis} |
32 |
pseudogene |
|
|
BBE24 |
16740 |
17300 |
49 |
pseudogene |
||
|
BBE24.1 |
17902 |
18303 |
49 |
pseudogene |
||
|
BBE25 |
18606 |
18505 |
short gene |
|||
|
BBE26 |
18586 |
18711 |
short gene; near-consensus N-terminal lipidation signal |
|||
|
BBE27 |
19055 |
19195 |
short gene |
|||
|
BBE28 |
19489 |
19340 |
near-consensus N-terminal lipidation signal; short gene |
|||
|
BBE29 |
19697 |
20883 |
adenine specific DNA methyltransferase {Helicobacter pylori} |
167 |
pseudogene; adenine specific DNA methyltransferase |
|
|
BBE29.1 |
21110 |
21476 |
102 |
pseudogene |
||
|
BBE30 |
21558 |
21701 |
49 |
pseudogene |
||
|
BBE31 |
22677 |
21949 |
60 |
N-terminal lipidation consensus |
||
|
BBE32 |
23723 |
23418 |
57 |
pseudogene |
||
|
BBE33 |
24100 |
23850 |
169 |
pseudogene; authentic frameshift |
||
|
lp28-1 |
Zhang et al. (1997) determined the sequence of the right telomere of lp28-1. |
|||||
|
BBF001 |
1 |
163 |
88 |
pseudogene; near-consensus N-terminal lipidation signal |
||
|
BBF001.1 |
200 |
380 |
80 |
pseudogene |
||
|
BBF01 |
467 |
1462 |
erpT (N40) |
163 |
small patch of similarity to erp genes of family 163; near-consensus N-terminal lipidation signal and affinity to lipoprotein families [60] and [163]; (Fikrig et al., 1999) |
|
|
BBF02 |
1720 |
2073 |
orf105 {Plasmodium falciparum} fairly poor match |
88 |
pseudogene |
|
|
BBF03 |
2619 |
2101 |
bdrS |
80 |
pseudogene - N-terminal truncation; contains about 3 repeats each of 2 different 33 bp sequences |
|
|
BBF04 |
2658 |
2804 |
57/77 |
pseudogene in [57] |
||
|
BBF05 |
2777 |
3073 |
57 |
pseudogene in [57] |
||
|
BBF06 |
3201 |
3377 |
57 |
pseudogene in [57] (actually a "fusion" gene) |
||
|
BBF07 |
3529 |
3413 |
100 |
short gene |
||
|
BBF08 |
3849 |
3685 |
72 |
pseudogene; paralog of fragment of BBK43 |
||
|
BBF09 |
4027 |
4179 |
71 |
pseudogene; paralog of C-term of BBK42 |
||
|
BBF10 |
4488 |
4982 |
70 |
|||
|
BBF11 |
5435 |
5539 |
questionable gene; backwards inside BBF11.1 |
|||
|
BBF11.1 |
5620 |
5412 |
32 |
pseudogene |
||
|
BBF12 |
6540 |
5956 |
49 |
pseudogene (?) in [49] - patchy similarity to other [49] genes |
||
|
BBF13 |
7381 |
6635 |
plasmid partition protein {Bacillus subtilis} |
32 |
putative plasmid partition function |
|
|
BBF14 |
7911 |
7357 |
50 |
|||
|
BBF14.1 |
8197 |
8367 |
65 |
pseudogene |
||
|
BBF16 |
8389 |
8571 |
64 |
pseudogene; paralog of N-term of K34 |
||
|
BBF17 |
8772 |
9026 |
68 |
pseudogene; paralog of N-term of K35 |
||
|
BBF18 |
9561 |
10049 |
transposase-like protein {Anabaena} |
82 |
pseudogene |
|
|
BBF19 |
10559 |
10036 |
transposase-like protein {Anabaena} |
82 |
questionable gene; BBF18 and F19 are almost certainly inverted parts of one complex pseudogene |
|
|
BBF19.1 |
10916 |
11200 |
175 |
pseudogene |
||
|
BBF20 |
10991 |
10701 |
85 |
N-terminal lipidation consensus; pseudogene |
||
|
BBF21 |
11550 |
11449 |
66 |
short gene |
||
|
BBF22 |
12018 |
11794 |
44 |
pseudogene |
||
|
BBF23 |
12992 |
12444 |
49 |
|||
|
BBF24 |
13793 |
13032 |
plasmid partition protein {Bacillus subtilis} |
32 |
putative plasmid partition function |
|
|
BBF25 |
14329 |
13772 |
50 |
|||
|
BBF26 |
15451 |
14354 |
57 |
|||
|
BBF26.1 |
15663 |
16209 |
101 |
badly deleted pseudogene |
||
|
BBF27 |
15925 |
15758 |
101 |
questionable gene; in-frame and inside BBF26.1 |
||
|
BBF28 |
16129 |
16001 |
questionable gene; out-of-frame (?) and inside BBF26.1 |
|||
|
BBF29 |
16825 |
16457 |
49 |
pseudogene |
||
|
BBF30 |
17415 |
17170 |
short gene |
|||
|
BBF31 |
17805 |
17394 |
50 |
pseudogene |
||
|
BBF31.1 |
17920 |
18050 |
57 |
pseudogene |
||
|
BBF32 |
26698 |
18430 |
170 |
15 tandem pseudogenes; N-terminal lipidation consensus; unexpressed reservoir of diversity for vlsE expression site; no frame disruptions (Zhang et al., 1997; Zhang and Norris, 1998a; Zhang and Norris, 1998b) |
||
|
vlsE |
27097 |
28170 |
vlsE |
170 |
surface exposed; N-terminal lipidation consensus; this is the vlsE expression site; it is beyond the end of the TIGR lp28-1 sequence; there is a 100 bp gap (apparently unclonable sequence) between the TIGR and vlsE sequences (Zhang et al., 1997; Zhang and Norris, 1998a; Zhang and Norris, 1998b) |
|
|
lp28-2 |
||||||
|
BBG01 |
116 |
1006 |
12 |
N-terminal lipidation consensus |
||
|
BBG02 |
1047 |
1925 |
HP1353 gene {Helicobacter pylori} |
102 |
N-terminal lipidation consensus; rather good similarity to HP1353 across the C-terminal 3/4 of the gene - HP1353 has an "FLSTC" sequence that is about 30 aas from the N-terminus, not a good lipidation consensus and not a particularly good signal sequence; HP1352 & HP1354 are putative adenine methylases! |
|
|
BBG03 |
2104 |
2492 |
48 |
pseudogene; authentic frameshift |
||
|
BBG04 |
2857 |
2753 |
short gene |
|||
|
BBG05 |
4056 |
2894 |
transposase-like protein {Anabaena} |
82 |
pseudogene; one authentic frameshift; BBG05 is the most intact member of this family, which is homologous throughout its length to a putative transposase gene family originally found in Anabaena, Saccharopolyspora, Salmonella and thermophilic bacterium PS3 (Bancroft and Wolk, 1989; Donadio and Staver, 1993; Gulig et al., 1992; Krause et al., 1991; Murai et al., 1995); BBG05 was first characterized by Barbour and Carter (1997) |
|
|
BBG06 |
4208 |
5365 |
57 |
|||
|
BBG07 |
5378 |
5947 |
50 |
|||
|
BBG08 |
5911 |
6675 |
plasmid partition protein {Bacillus subtilis} |
32 |
putative plasmid partition function |
|
|
BBG09 |
6737 |
7285 |
49 |
|||
|
BBG10 |
10779 |
7486 |
101 |
weak similarity to phage TM4 tail tape measure protein in a psi-BLAST search |
||
|
BBG11 |
11015 |
10779 |
short gene |
|||
|
BBG12 |
11491 |
11066 |
||||
|
BBG13 |
12355 |
11504 |
||||
|
BBG14 |
12752 |
12312 |
||||
|
BBG15 |
12681 |
13166 |
||||
|
BBG16 |
13495 |
13160 |
||||
|
BBG17 |
14341 |
13511 |
||||
|
BBG18 |
14885 |
14379 |
||||
|
BBG19 |
15431 |
14889 |
117 |
|||
|
BBG20 |
16482 |
15460 |
103 |
|||
|
BBG21 |
17684 |
16497 |
||||
|
BBG22 |
18827 |
18033 |
86 |
|||
|
BBG23 |
19619 |
18840 |
86 |
|||
|
BBG24 |
22310 |
19623 |
104 |
|||
|
BBG25 |
22659 |
22276 |
143 |
N-terminal lipidation consensus |
||
|
BBG26 |
23033 |
22662 |
||||
|
BBG27 |
23725 |
23036 |
||||
|
BBG28 |
24108 |
23725 |
||||
|
BBG29 |
24489 |
25952 |
62 |
|||
|
BBG30 |
25962 |
26387 |
||||
|
BBG31 |
26567 |
27082 |
50 |
pseudogene; N-terminus missing relative to paralogs |
||
|
BBG32 |
27113 |
27937 |
replicative DNA helicase, putative {Bacillus subtilis} |
46 |
putative DNA helicase |
|
|
BBG33 |
28031 |
28828 |
bdrT |
80 |
contains 3 repeats of 87 bp sequence and 4 repeats of a 33 bp sequence |
|
|
BBG34 |
29618 |
28857 |
88 |
|||
|
lp28-3 |
||||||
|
BBH01 |
273 |
464 |
166 |
N-terminal lipidation consensus; short gene |
||
|
BBH02 |
391 |
855 |
76 |
|||
|
BBH03 |
926 |
1072 |
57/77 |
pseudogene |
||
|
BBH04 |
1045 |
1365 |
57 |
pseudogene |
||
|
BBH05 |
1498 |
1677 |
57 |
pseudogene |
||
|
BBH06 |
2970 |
2263 |
near-consensus N-terminal lipidation signal |
|||
|
BBH07 |
3514 |
3086 |
50 |
pseudogene |
||
|
BBH08 |
3730 |
3593 |
short gene |
|||
|
BBH09 |
7728 |
3895 |
1 |
|||
|
BBH09.1 |
8091 |
7810 |
95 |
pseudogene |
||
|
BBH10 |
8203 |
8003 |
questionable gene (overlaps BBH09.1) |
|||
|
BBH10.1 |
8240 |
8310 |
transposase-like protein {Anabaena} |
82 |
pseudogene |
|
|
BBH11 |
8796 |
8704 |
questionable gene; backwards inside BBH11.1 |
|||
|
BBH11.1 |
8320 |
8850 |
1 |
pseudogene (in part) |
||
|
BBH12 |
9455 |
9589 |
short gene |
|||
|
BBH13 |
10516 |
9851 |
bdrU |
80 |
contains 5.6 repeats of a 54 bp sequence |
|
|
BBH14 |
10934 |
10821 |
short gene |
|||
|
BBH15 |
11005 |
10913 |
short gene |
|||
|
BBH16 |
11068 |
11187 |
short gene |
|||
|
BBH17 |
11837 |
12025 |
short gene |
|||
|
BBH18 |
12105 |
13217 |
69 |
N-terminal lipidation consensus |
||
|
BBH18.1 |
13571 |
13693 |
65 |
pseudogene; N-terminus inverted |
||
|
BBH19 |
13590 |
13709 |
questionable gene; overlaps BBH18.1 partly in-frame |
|||
|
BBH20 |
14596 |
13840 |
171 |
pseudogene |
||
|
BBH20.1 |
14750 |
15300 |
104 |
pseudogene |
||
|
BBH21 |
-- |
-- |
No longer considered to be a realistic potential gene. |
|||
|
BBH22 |
14870 |
14766 |
questionable gene; inside BBH20.1 backwards |
|||
|
BBH23 |
15051 |
15158 |
104 |
questionable gene; inside of and in-frame with pseudogene BBH20.1 |
||
|
BBH24 |
15136 |
15342 |
104 |
questionable gene; mostly inside of and in-frame with pseudogene BBH20.1 |
||
|
BBH24.1 |
15354 |
15750 |
86 |
pseudogene |
||
|
BBH25 |
15810 |
15568 |
questionable gene; inside BBH24.1 backwards |
|||
|
BBH26 |
16519 |
17412 |
62 |
|||
|
BBH27 |
17408 |
17971 |
50 |
|||
|
BBH28 |
17947 |
18699 |
plasmid partition protein {Bacillus subtilis} |
32 |
putative plasmid partition function |
|
|
BBH29 |
18798 |
19424 |
49 |
|||
|
BBH30 |
20871 |
20415 |
96 |
pseudogene |
||
|
BBH31 |
20997 |
21104 |
short gene |
|||
|
BBH32 |
21470 |
22216 |
60 |
N-terminal lipidation consensus |
||
|
BBH33 |
22678 |
22950 |
61 |
pseudogene |
||
|
BBH34 |
23383 |
23192 |
62 |
pseudogene |
||
|
BBH35 |
23447 |
23560 |
short gene |
|||
|
BBH36 |
24180 |
24031 |
44 |
questionable gene; in-frame, inside of BBH36.1 |
||
|
BBH36.1 |
24223 |
24041 |
44 |
pseudogene |
||
|
BBH36.2 |
24751 |
25112 |
102 |
pseudogene |
||
|
BBH37 |
26371 |
25436 |
12 |
N-terminal lipidation consensus |
||
|
BBH38 |
26614 |
26498 |
short gene |
|||
|
BBH39 |
26754 |
26855 |
short gene |
|||
|
BBH40 |
27445 |
26981 |
transposase-like protein {Anabaena} |
82 |
pseudogene |
|
|
BBH41 |
28197 |
27628 |
48 |
|||
|
lp28-4 |
||||||
|
BBI01 |
174 |
605 |
57 |
pseudogene |
||
|
BBI02 |
736 |
1101 |
84 |
(pseudogene [57]?) |
||
|
BBI02.1 |
1416 |
1346 |
172 |
pseudogene |
||
|
BBI02.2 |
1617 |
1862 |
57 |
pseudogene |
||
|
BBI03 |
1972 |
1850 |
short gene |
|||
|
BBI04 |
2219 |
2127 |
short gene |
|||
|
BBI05 |
2310 |
2191 |
short gene |
|||
|
BBI06 |
2536 |
3348 |
pfs-I protein {Escherichia coli} |
26 |
putative 5'-methylthioadenosine/S-adenosylhomocysteine nucleosidase |
|
|
BBI07 |
3441 |
3346 |
short gene |
|||
|
BBI08 |
3576 |
3752 |
questionable gene; overlaps BBI08.1 |
|||
|
BBI08.1 |
3674 |
4314 |
59 |
pseudogene |
||
|
BBI09 |
3745 |
3879 |
59 |
questionable gene; in-frame inside of pseudogene BBI08.1 |
||
|
BBI10 |
3911 |
4312 |
59 |
questionable gene; in-frame inside of pseudogene BBI08.1 |
||
|
BBI11 |
4721 |
4626 |
short gene |
|||
|
BBI12 |
5128 |
5343 |
short gene |
|||
|
BBI13 |
5609 |
5704 |
IS9016 (V-4) orf1 {Haemophilus influenzae} (very small similarity) |
short gene |
||
|
BBI14 |
6159 |
6269 |
60 |
N-terminal lipidation consensus; pseudogene |
||
|
BBI15 |
6603 |
6830 |
60 |
pseudogene |
||
|
BBI16 |
7183 |
8535 |
60 |
N-terminal lipidation consensus; contains 22 internal tandem repeats of 27 bp sequence that is not in other members of family [60] |
||
|
BBI17 |
8967 |
8824 |
short gene |
|||
|
BBI18 |
10647 |
10498 |
short gene |
|||
|
BBI19 |
10749 |
11924 |
57 |
|||
|
BBI20 |
11924 |
12475 |
50 |
|||
|
BBI21 |
12454 |
13203 |
plasmid partition protein {Bacillus subtilis} |
32 |
putative plasmid partition function |
|
|
BBI22 |
13265 |
13834 |
49 |
|||
|
BBI23 |
13989 |
14090 |
short gene |
|||
|
BBI24 |
14334 |
14438 |
short gene |
|||
|
BBI25 |
15211 |
15339 |
short gene |
|||
|
BBI26 |
15352 |
16527 |
multidrug-efflux transporter tetA(B) {Helicobacter pylori} |
105 |
putative multidrug-efflux transporter |
|
|
BBI27 |
17272 |
17096 |
60 |
pseudogene |
||
|
BBI28 |
17874 |
17305 |
60 |
N-terminal lipidation consensus |
||
|
BBI29 |
19183 |
18521 |
60 |
N-terminal lipidation consensus |
||
|
BBI30 |
19403 |
19507 |
168 |
|||
|
BBI31 |
20127 |
19618 |
48 |
N-terminal lipidation consensus |
||
|
BBI31.1 |
20240 |
20340 |
98 |
pseudogene |
||
|
BBI32 |
20479 |
20273 |
N-terminal lipidation consensus; questionable gene; backwards inside BBI31.1 |
|||
|
BBI33 |
20482 |
20589 |
transposase-like protein {Anabaena} |
82 |
pseudogene |
|
|
BBI34 |
21562 |
20774 |
60 |
N-terminal lipidation consensus |
||
|
BBI35 |
21992 |
22090 |
93 |
questionable gene; gene not called in paralogous sequence |
||
|
BBI36 |
22931 |
22098 |
54 |
N-terminal lipidation consensus |
||
|
BBI37 |
23056 |
23154 |
93 |
questionable gene; gene not called in paralogous sequence |
||
|
BBI38 |
23995 |
23162 |
54 |
N-terminal lipidation consensus |
||
|
BBI39 |
25089 |
24226 |
54 |
N-terminal lipidation consensus |
||
|
BBI40 |
25320 |
25802 |
49 |
pseudogene |
||
|
BBI41 |
26036 |
25797 |
transposase-like protein {Anabaena} |
82 |
pseudogene |
|
|
BBI42 |
26360 |
26911 |
52 |
pseudogene; N-terminal lipidation consensus |
||
|
BBI43 |
27069 |
26884 |
55 |
pseudogene?; short gene |
||
|
lp36 |
||||||
|
BBK001 |
86 |
14 |
transposase-like protein {Anabaena} |
82 |
pseudogene |
|
|
BBK01 |
188 |
1078 |
12 |
N-terminal lipidation consensus |
||
|
BBK02 |
2478 |
1213 |
1 |
questionable gene; in-frame inside of BBK02.1 |
||
|
BBK02.1 |
3770 |
1213 |
1 |
pseudogene |
||
|
BBK03 |
3222 |
2821 |
1 |
questionable gene; in-frame inside of BBK02.1 |
||
|
BBK04 |
3595 |
3419 |
1 |
questionable gene; in-frame inside of BBK02.1; near-consensus N-terminal lipidation signal |
||
|
BBK05 |
5096 |
4905 |
short gene |
|||
|
BBK06 |
5126 |
5233 |
short gene |
|||
|
BBK07 |
6040 |
5291 |
59 |
N-terminal lipidation consensus |
||
|
BBK08 |
6281 |
6180 |
short gene |
|||
|
BBK09 |
6366 |
6647 |
short gene |
|||
|
BBK10 |
6983 |
6807 |
1 |
pseudogene |
||
|
BBK11 |
6956 |
7060 |
short gene |
|||
|
BBK12 |
7335 |
8030 |
59 |
N-terminal lipidation consensus |
||
|
BBK13 |
8880 |
8167 |
protein slr1258 {Synechocystis PCC6803} |
40 |
||
|
BBK14 |
8921 |
9013 |
short gene |
|||
|
BBK15 |
9373 |
9996 |
60 |
|
||
|
BBK16 |
10223 |
10101 |
plasmid partition protein {Bacillus subtilis} |
32 |
pseudogene |
|
|
BBK17 |
10301 |
11944 |
adenine deaminase {Bacillus subtilis} |
61 |
putative adenine deaminase |
|
|
BBK18 |
12143 |
12054 |
hypothetical protein {Cyanidium caldarium} (very small similarity) |
short gene |
||
|
BBK19 |
12602 |
13234 |
N-terminal lipidation consensus |
|||
|
BBK20 |
13212 |
13301 |
short gene |
|||
|
BBK21 |
14326 |
13580 |
plasmid partition protein {Bacillus subtilis} |
32 |
putative plasmid partition function |
|
|
BBK22 |
14841 |
14305 |
50 |
|||
|
BBK23 |
15760 |
14837 |
62 |
|||
|
BBK24 |
16275 |
16883 |
49 |
|||
|
BBK24.1 |
17380 |
17580 |
questionable gene; backwards inside BBK25 |
|||
|
BBK25 |
17565 |
16969 |
transposase-like protein {Anabaena} |
82 |
pseudogene |
|
|
BBK25.1 |
17880 |
20033 |
1 |
pseudogene |
||
|
BBK26 |
18346 |
18462 |
1 |
questionable gene; in-frame inside of BBK25.1 |
||
|
BBK27 |
19094 |
18807 |
questionable gene; backwards inside BBK25.1 |
|||
|
BBK28 |
19232 |
19348 |
1 |
questionable gene; in-frame inside of BBK25.1 |
||
|
BBK29 |
19807 |
19718 |
1 |
questionable gene; in-frame inside of BBK25.1 |
||
|
BBK30 |
19935 |
20033 |
1 |
questionable gene; in-frame inside of BBK25.1 |
||
|
BBK31 |
20026 |
20166 |
short gene |
|||
|
BBK32 |
20389 |
21450 |
P47 (P35 in N40) |
fibronectin-binding protein; surface localized (Probert and Johnson, 1998); upregulated in stationary phase in N40 (Fikrig et al., 1997; Indest et al., 1997); near-consensus N-terminal lipidation signal |
||
|
BBK33 |
21720 |
21890 |
65 |
pseudogene |
||
|
BBK34 |
21912 |
22130 |
64 |
short gene |
||
|
BBK35 |
22294 |
22464 |
68 |
short gene |
||
|
BBK36 |
22545 |
22646 |
66 |
short gene |
||
|
BBK37 |
23953 |
22913 |
75 & 175 |
pseudogene |
||
|
BBK38 |
24146 |
23922 |
short gene; near-consensus N-terminal lipidation signal |
|||
|
BBK39 |
24293 |
24667 |
59 |
pseudogene |
||
|
BBK40 |
25103 |
25654 |
bdrX |
58/80 |
has family 80-like repeats |
|
|
BBK41 |
26379 |
25816 |
70 |
|||
|
BBK42 |
26803 |
26588 |
71 |
short gene |
||
|
BBK42.1 |
26916 |
27078 |
72 |
short gene |
||
|
BBK43 |
26937 |
27041 |
72 |
questionable gene; largely overlaps BBK42.1 in-frame |
||
|
BBK44 |
27234 |
27350 |
100 |
short gene |
||
|
BBK45 |
28315 |
27386 |
75 |
near-consensus N-terminal lipidation signal |
||
|
BBK46 |
29212 |
28394 |
75 |
pseudogene; authentic frameshifts |
||
|
BBK47 |
30463 |
29480 |
69 |
N-terminal lipidation consensus |
||
|
BBK48 |
31585 |
30722 |
75 |
N-terminal lipidation consensus |
||
|
BBK49 |
32822 |
31830 |
69 |
N-terminal lipidation consensus |
||
|
BBK50 |
34079 |
33084 |
P37 (N40) |
75 |
N-terminal lipidation consensus; (Fikrig et al., 1997) |
|
|
BBK51 |
34232 |
34327 |
short gene; near-consensus N-terminal lipidation signal |
|||
|
BBK52 |
35443 |
34598 |
P23 (297) |
44 |
N-terminal lipidation consensus; previously only sequenced in strain 297 (Akins et al., 1994) |
|
|
BBK52.1 |
35722 |
35811 |
174 |
authentic frameshift |
||
|
BBK53 |
35868 |
36419 |
52 |
N-terminal lipidation consensus |
||
|
BBK54 |
36577 |
36392 |
55 |
pseudogene? |
||
|
lp38 |
||||||
|
BBJ001 |
482 |
1208 |
60 |
N-terminal lipidation consensus; pseudogene |
||
|
BBJ01 |
482 |
664 |
60 |
questionable gene; in-frame and inside of BBJ001 pseudogene; N-terminal lipidation consensus |
||
|
BBJ02 |
927 |
1208 |
60 |
questionable gene; in-frame and inside of BBJ001 pseudogene |
||
|
BBJ02.1 |
1475 |
2367 |
48 |
pseudogene |
||
|
BBJ03 |
1593 |
1742 |
48 |
questionable gene; in-frame and inside of BBJ02.1 pseudogene |
||
|
BBJ04 |
2381 |
2271 |
N-terminal lipidation consensus; questionable gene; backwards inside BBJ02.1 |
|||
|
BBJ05 |
3828 |
2768 |
transposase-like protein {Anabaena} |
82 |
pseudogene |
|
|
BBJ06 |
3486 |
3629 |
questionable gene; backwards inside BBJ05; near-consensus N-terminal lipidation signal |
|||
|
BBJ07 |
4307 |
4167 |
98 |
questionable gene; in-frame and inside of BBJ07.1 |
||
|
BBJ07.1 |
4409 |
4167 |
98 |
pseudogene |
||
|
BBJ08 |
4576 |
5493 |
12 |
near-consensus N-terminal lipidation signal |
||
|
17 bp repeat |
5938 |
6063 |
7.6 repeats of the 17 bp sequence AATTGATATTAAAATAT; not in a gene |
|||
|
BBJ09 |
6089 |
6859 |
ospD |
outer surface protein; N-terminal lipidation consensus; rather extensive short direct repeat upstream of gene (Marconi et al., 1994; Norris et al., 1992) |
||
|
BBJ10 |
7270 |
7473 |
bdrY |
58 |
pseudogene |
|
|
BBJ11 |
7965 |
7783 |
short gene |
|||
|
(BBJ11.1) |
8070 |
8260 |
171 |
possible pseudogene, but too weak a match to be in the TIGR master gene list |
||
|
BBJ12 |
8725 |
8636 |
86 |
questionable gene; inside BBJ12.1 |
||
|
BBJ12.1 |
8782 |
8593 |
86 |
pseudogene |
||
|
BBJ13 |
9155 |
8880 |
69 |
pseudogene |
||
|
BBJ14 |
9125 |
9283 |
protein urf (51aa) {Thermoproteus tenax virus} (poor match) |
questionable gene; overlaps BBJ13 backwards |
||
|
BBJ15 |
10168 |
10043 |
questionable gene; backwards inside BBJ15.1 |
|||
|
BBJ15.1 |
9450 |
10150 |
multidrug-efflux transporter tetA(B) {Helicobacter pylori} |
105 |
pseudogene |
|
|
7 bp repeat |
10287 |
10373 |
12.3 repeats of the 7 bp sequence TAATAGT; not in a gene |
|||
|
BBJ16 |
11105 |
10521 |
49 |
|||
|
BBJ17 |
11889 |
11155 |
plasmid partition protein {Bacillus subtilis} |
32 |
putative plasmid partition function |
|
|
BBJ18 |
12452 |
11865 |
50 |
|||
|
BBJ19 |
13440 |
12448 |
62 |
|||
|
BBJ20 |
13936 |
13775 |
167 |
pseudogene |
||
|
BBJ21 |
15514 |
15657 |
questionable gene; backwards inside BBJ21.1 |
|||
|
BBJ21.1 |
15976 |
15484 |
138 |
pseudogene |
||
|
BBJ22 |
16003 |
15905 |
questionable gene; overlaps BBJ21.1 |
|||
|
BBJ23 |
16505 |
17326 |
106 |
near-consensus N-terminal lipidation signal |
||
|
BBJ24 |
17388 |
18167 |
106 |
|||
|
BBJ25 |
18202 |
19251 |
||||
|
BBJ26 |
19313 |
20005 |
ABC transporter, ATP-binding protein {Methanococcus jannaschii} |
4 |
putative ABC transporter, ATP-binding subunit |
|
|
BBJ27 |
19995 |
21227 |
||||
|
BBJ28 |
21220 |
21945 |
||||
|
BBJ29 |
21971 |
23005 |
90 |
|||
|
BBJ30 |
23018 |
23119 |
91 |
short gene |
||
|
BBJ31 |
23366 |
24085 |
59 |
|||
|
BBJ32 |
24517 |
24389 |
173 |
short gene |
||
|
BBJ33 |
24681 |
24791 |
short gene |
|||
|
BBJ34 |
26407 |
25340 |
92 |
N-terminal lipidation consensus |
||
|
BBJ35 |
26858 |
26959 |
short gene |
|||
|
BBJ36 |
28053 |
26998 |
92 |
N-terminal lipidation consensus |
||
|
BBJ37 |
28442 |
28281 |
short gene |
|||
|
BBJ38 |
28703 |
28608 |
short gene |
|||
|
BBJ39 |
29401 |
29267 |
short gene |
|||
|
BBJ39.1 |
29800 |
29600 |
54 |
pseudogene |
||
|
BBJ40 |
29827 |
29919 |
questionable gene; no gene called in paralogous sequence |
|||
|
BBJ41 |
30771 |
29908 |
54 |
N-terminal lipidation consensus |
||
|
BBJ42 |
31133 |
30945 |
(54?) |
pseudogene if weak similarity to family 54 is real; not in TIGR master paralog list |
||
|
BBJ43 |
31220 |
32137 |
90 |
|||
|
BBJ44 |
32150 |
32251 |
91 |
short gene |
||
|
BBJ45 |
32498 |
33217 |
59 |
|||
|
BBJ45.1 |
33652 |
33522 |
173 |
pseudogene in [J32] |
||
|
BBJ46 |
34668 |
34375 |
short gene |
|||
|
BBJ47 |
35591 |
34908 |
99 |
near-consensus N-terminal lipidation signal |
||
|
BBJ48 |
36272 |
35637 |
||||
|
BBJ49 |
36455 |
36315 |
92 |
pseudogene |
||
|
BBJ50 |
36502 |
37203 |
BbK2.5-6 |
BBJ50 has "authentic frameshifts" relative to outer membrane protein gene BbK2.5-6 which was sequenced from strain 297 (accession #L31615) |
||
|
BBJ51 |
37917 |
37418 |
171 |
pseudogene |
||
|
lp54 |
lp54-like plasmids have been found in almost all Bb (sensu lato) isolates analyzed (e.g., Casjens et al., 1995; Marconi et al., 1996a; Mathiesen et al., 1997; Samuels et al., 1993) |
|||||
|
BBA01 |
588 |
1070 |
p11/S3 (N40) |
48 |
S3 does not correspond perfectly to BBA01 (Feng et al., 1996) |
|
|
BBA02 |
1238 |
1122 |
short gene; Feng et al. (1996) recognized a different small ORF called S4 (or p5) in this region in strain N40 |
|||
|
BBA03 |
1397 |
1903 |
BbK2.14 (297) |
near-consensus N-terminal lipidation signal; not labeled with palmitate (Akins et al., 1995a) |
||
|
BBA04 |
2829 |
1984 |
S2 (N40) |
44 |
N-terminal lipidation consensus |
|
|
BBA05 |
4192 |
2942 |
S1 (N40) |
N-terminal lipidation consensus (Feng et al., 1995) |
||
|
BBA06 |
4342 |
4226 |
short gene |
|||
|
BBA07 |
5091 |
4606 |
chpAI protein {Escherichia coli} (chpAI similar only to central region of BBA07; best guess is that this is false hit) |
N-terminal lipidation consensus; patchy homolog chpAI does not have a lipidation consensus. |
||
|
BBA08 |
5250 |
5582 |
139 |
has cp32 paralog |
||
|
BBA09 |
5582 |
6451 |
140 |
has cp32 paralog |
||
|
BBA10 |
6457 |
7080 |
141 |
has cp32 paralog |
||
|
BBA11 |
7145 |
8176 |
142 |
has cp32 paralog |
||
|
BBA12 |
8202 |
8378 |
short gene |
|||
|
BBA13 |
8378 |
8800 |
||||
|
BBA14 |
8793 |
9155 |
143 |
N-terminal lipidation consensus; has cp32 paralogs but they do not have lipidation consensus |
||
|
BBA15 |
9393 |
10211 |
ospA |
53 |
outer surface protein (Barbour et al., 1983); N-terminal lipidation consensus and lipidated in Bb (Bergstrom et al., 1989; Brandt et al., 1990); transcription start site mapped (Jonsson et al., 1992); sequenced in numerous other strains (Bunikis et al., 1996; Caporale and Kocher, 1994; Jonsson et al., 1992; Marconi et al., 1993a; Rosa et al., 1992; Wallich et al., 1992; Wallich et al., 1989; Wang et al., 1997a; Wang et al., 1997b; Wang et al., 1997c; Will et al., 1995; Wilske et al., 1996a; Wilske et al., 1996b; Wilske et al., 1992; Zumstein et al., 1992); atomic resolution structure (Li et al., 1997); in vitro mutagenesis (McGrath et al., 1995) |
|
|
BBA16 |
10224 |
11111 |
ospB |
53 |
outer surface protein (Barbour et al., 1984); N-terminal lipidation consensus (Bergstrom et al., 1989) |
|
|
BBA17 |
11390 |
11301 |
short gene |
|||
|
BBA18 |
11687 |
12880 |
57 |
has cp32 paralog |
||
|
BBA19 |
12931 |
13512 |
50 |
has cp32 paralog |
||
|
BBA20 |
13491 |
14240 |
plasmid partition protein {Bacillus subtilis} |
32 |
putative plasmid partition function; has cp32 paralog |
|
|
BBA21 |
14274 |
14816 |
49 |
has cp32 paralog |
||
|
BBA22 |
15084 |
15239 |
short gene |
|||
|
BBA23 |
15294 |
15734 |
144 |
has cp32 paralog |
||
|
BBA24 |
16512 |
15940 |
dbpB (297) |
74 |
N-terminal lipidation consensus; binds decorin (Guo et al., 1998) |
|
|
BBA25 |
17195 |
16635 |
dbpA (297) |
74 |
N-terminal lipidation consensus, surface exposed on outer membrane, binds decorin (Guo et al., 1998; Hagman et al., 1998; Hanson et al., 1998) |
|
|
BBA26 |
17386 |
17514 |
short gene |
|||
|
BBA27 |
17563 |
17679 |
short gene |
|||
|
BBA28 |
17906 |
17757 |
short gene |
|||
|
BBA29 |
18019 |
17897 |
short gene |
|||
|
BBA30 |
18064 |
18654 |
||||
|
BBA31 |
18661 |
20010 |
protein Orf26 of phage fO1205 {Streptococcus thermophilus} |
145 |
homolog of phage Streptococcus thermophilus fO1205 gene orf26 that is likely to be a terminase subunit |
|
|
11 bp repeat |
20138 |
20216 |
7.1 repeats of TAAATCAATAT; not in a gene |
|||
|
BBA32 |
20654 |
20845 |
short gene; near-consensus N-terminal lipidation signal |
|||
|
BBA33 |
21023 |
21559 |
N-terminal lipidation consensus |
|||
|
BBA34 |
23209 |
21623 |
oligopeptide ABC transporter, periplasmic oligopeptide-binding protein {Escherichia coli} |
oppAV |
37 |
N-terminal lipidation consensus (Bono et al., 1998) |
|
BBA35 |
23391 |
23284 |
short gene |
|||
|
BBA36 |
23710 |
24348 |
N-terminal lipidation consensus; has weak similarity to family 113 |
|||
|
BBA37 |
24434 |
25036 |
||||
|
BBA38 |
25389 |
26627 |
146 |
has cp32 paralog |
||
|
BBA39 |
26642 |
27226 |
147 |
has cp32 paralog |
||
|
BBA40 |
27326 |
27931 |
148 |
has cp32 paralog |
||
|
BBA41 |
27950 |
28864 |
149 |
has cp32 paralog |
||
|
BBA42 |
28871 |
29320 |
150 |
has cp32 paralog |
||
|
BBA43 |
29320 |
29691 |
107 |
has cp32 paralog |
||
|
BBA44 |
29688 |
30005 |
||||
|
BBA45 |
30030 |
30566 |
151 |
has cp32 paralog |
||
|
BBA46 |
30646 |
31713 |
152 |
has cp32 paralog |
||
|
BBA47 |
31716 |
32135 |
153 |
has cp32 paralog |
||
|
BBA48 |
32197 |
32682 |
154 |
has cp32 paralog |
||
|
BBA49 |
32678 |
32893 |
155 |
short gene; has cp32 paralog |
||
|
BBA50 |
32905 |
34299 |
||||
|
BBA51 |
34321 |
34881 |
157 |
has cp32 paralog |
||
|
BBA52 |
34924 |
35766 |
BK2.1 (297) |
(Akins et al., 1993) |
||
|
BBA53 |
35890 |
36162 |
158 |
short gene |
||
|
BBA54 |
36192 |
36467 |
158 |
short gene; near-consensus N-terminal lipidation signal |
||
|
BBA55 |
36558 |
37484 |
159 |
has cp32 paralog |
||
|
BBA56 |
37477 |
38043 |
160 |
has cp32 paralog |
||
|
BBA57 |
39341 |
38100 |
N-terminal lipidation consensus |
|||
|
BBA58 |
39566 |
39766 |
short gene |
|||
|
BBA59 |
40048 |
39812 |
12 kd lipoprotein |
N-terminal lipidation consensus; short gene, but it is real and expressed (McGrath et al., 1997) |
||
|
BBA60 |
40981 |
40151 |
P27 (B29) |
N-terminal lipidation consensus, lipidated in Bb, surface exposed (Reindl et al., 1993) |
||
|
BBA61 |
41955 |
41335 |
D6 (B. garinii VS102) |
(Balmelli et al., 1996) |
||
|
BBA62 |
42203 |
42406 |
7.5 kd; 6.6 kd (297) |
N-terminal lipidation consensus; lipidated in Bb, possible surface exposure, short gene (Katona et al., 1992; Lahdenne et al., 1997); transcription start (Indest et al., 1997) |
||
|
BBA63 |
42576 |
42454 |
short gene |
|||
|
BBA64 |
43483 |
42563 |
P35 antigen |
54 |
N-terminal lipidation consensus (Gilmore et al., 1997); cell density-dependent expression and transcription start (Indest et al., 1997) |
|
|
BBA65 |
44469 |
43624 |
54 |
N-terminal lipidation consensus |
||
|
BBA66 |
45883 |
44651 |
54 |
N-terminal lipidation consensus |
||
|
BBA67 |
46021 |
46197 |
short gene |
|||
|
BBA68 |
47164 |
46412 |
54 |
N-terminal lipidation consensus |
||
|
BBA69 |
48203 |
47415 |
54 |
N-terminal lipidation consensus |
||
|
BBA70 |
49158 |
48523 |
54 |
pseudogene |
||
|
BBA71 |
49796 |
49386 |
54 |
pseudogene |
||
|
BBA72 |
50031 |
49792 |
short gene; near-consensus N-terminal lipidation signal |
|||
|
BBA73 |
51112 |
50225 |
54 |
N-terminal lipidation consensus |
||
|
BBA74 |
51642 |
52412 |
oms28 |
171 |
outer membrane protein with porin activity (Skare et al., 1996) |
|
|
BBA75 |
52591 |
52496 |
short gene |
|||
|
BBA76 |
52642 |
53436 |
thymidylate synthase-complementing protein (thy1) {Dictyostelium discoideum} |
65 |
one patch of similarity to thy1 |
|
|
lp56 |
Hinnebusch et al. (1990) determined the sequence of what turned out to be the right telomere of lp56 - this sequence (called TL49 in that paper) adds 25 bp to the right end of the TIGR sequence |
|||||
|
BBQ01 |
279 |
545 |
55 |
pseudogene? |
||
|
BBQ02 |
710 |
799 |
174 |
questionable gene; paralog not called in similar sequence elsewhere |
||
|
BBQ03 |
856 |
1404 |
52 |
N-terminal lipidation consensus |
||
|
BBQ04 |
1404 |
2265 |
44 |
pseudogene; authentic frameshift; N-terminal lipidation consensus |
||
|
BBQ05 |
2744 |
3493 |
60 |
N-terminal lipidation consensus |
||
|
BBQ06 |
3623 |
4105 |
48 |
|||
|
BBQ07 |
4986 |
4252 |
49 |
|||
|
BBQ08 |
5830 |
5072 |
plasmid partition protein {Bacillus subtilis} |
32 |
putative plasmid partition function |
|
|
BBQ09 |
6339 |
5806 |
50 |
|||
|
BBQ10 |
6674 |
6339 |
62 |
pseudogene; fusion protein resulting from integration of a cp32-like plasmid into the linear lp56 precursor |
||
|
BBQ11 |
6585 |
6800 |
148 |
pseudogene; result of integration of a cp32-like plasmid into the linear lp56 precursor |
||
|
BBQ12 |
6800 |
7408 |
148 |
|||
|
BBQ13 |
7418 |
8389 |
149 |
|||
|
BBQ14 |
8406 |
8864 |
150 |
|||
|
BBQ15 |
8849 |
9247 |
107 |
|||
|
BBQ16 |
9238 |
9624 |
108 |
pseudogene; authentic frameshift |
||
|
BBQ17 |
9624 |
10187 |
151 |
|||
|
BBQ18 |
10171 |
11280 |
152 |
|||
|
BBQ19 |
11276 |
11722 |
153 |
|||
|
BBQ20 |
11741 |
12193 |
154 |
|||
|
BBQ21 |
12193 |
12426 |
155 |
short gene |
||
|
BBQ22 |
12437 |
13729 |
156 |
|||
|
BBQ23 |
13755 |
14435 |
157 |
|||
|
BBQ24 |
14442 |
15389 |
159 |
|||
|
BBQ25 |
15410 |
15961 |
160 |
|||
|
BBQ26 |
15994 |
16323 |
139 |
|||
|
BBQ27 |
16323 |
17195 |
140 |
|||
|
BBQ28 |
17208 |
17810 |
141 |
|||
|
BBQ29 |
17823 |
18635 |
142 |
|||
|
BBQ30 |
18713 |
18913 |
blyA-56 |
109 |
putative hemolysin; short gene |
|
|
BBQ31 |
18920 |
19264 |
blyB-56 |
111 |
putative hemolysin |
|
|
BBQ32 |
19257 |
19589 |
112 |
|||
|
BBQ33 |
19579 |
19935 |
143 |
near-consensus N-terminal lipidation signal |
||
|
BBQ34 |
20022 |
20735 |
bdrW |
80 |
contains ~7 repeats of a 33 bp sequence |
|
|
BBQ35 |
20844 |
21452 |
mlpJ |
113 |
N-terminal lipidation consensus |
|
|
BBQ36 |
21424 |
21513 |
questionable gene; gene not called in paralogous sequence on other cp32s |
|||
|
BBQ37 |
22479 |
21535 |
orf4-56 |
161 |
(Zuckert and Meyer, 1996) |
|
|
BBQ38 |
22869 |
23963 |
orf1-56 |
57 |
(Zuckert and Meyer, 1996) |
|
|
BBQ39 |
23976 |
24539 |
orf2-56 |
50 |
(Zuckert and Meyer, 1996) |
|
|
BBQ40 |
24518 |
25270 |
plasmid partition protein {Bacillus subtilis} |
orfC-56 |
32 |
putative plasmid partition function (Zuckert and Meyer, 1996) |
|
BBQ41 |
25317 |
25874 |
orf3-56 |
49 |
(Zuckert and Meyer, 1996) |
|
|
BBQ42 |
25890 |
26423 |
bdrV |
80 |
(Zuckert and Meyer, 1996) |
|
|
BBQ43 |
26879 |
28243 |
orf8/7-56 |
165 |
(Zuckert and Meyer, 1996) |
|
|
BBQ44 |
28299 |
28721 |
144 |
|||
|
BBQ45 |
29506 |
28733 |
96 |
|||
|
BBQ46 |
29559 |
29651 |
questionable gene; gene not called in paralogous sequence on other cp32s; near consensus lipidation sequence |
|||
|
BBQ47 |
29895 |
30971 |
erpX |
163 |
N-terminal lipidation consensus (Stevenson et al., 1998a) |
|
|
BBQ48 |
31117 |
31707 |
114 |
|||
|
BBQ49 |
31892 |
32455 |
115 |
|||
|
BBQ50 |
32455 |
33804 |
hypothetical protein Orf26 of phage fO1205 gene {Streptococcus thermophilus} |
145 |
phage fO1205 Orf26 homology; Orf26 is a possible phage structural protein |
|
|
BBQ51 |
33873 |
35095 |
146 |
pseudogene; authentic frameshift |
||
|
BBQ52 |
35115 |
35804 |
147 |
|||
|
BBQ53 |
35819 |
36382 |
148 |
|||
|
BBQ54 |
36388 |
37132 |
148 |
pseudogene; result of integration of a cp32-like plasmid into the linear lp56 precursor |
||
|
BBQ55 |
37533 |
36933 |
62 |
pseudogene; result of integration of a cp32-like plasmid into the linear lp56 precursor |
||
|
BBQ56 |
37809 |
37672 |
near-consensus N-terminal lipidation signal; short gene |
|||
|
BBQ57 |
38223 |
37819 |
101 |
questionable gene; in-frame and inside of BBQ60 |
||
|
BBQ58 |
38514 |
38419 |
101 |
questionable gene; in-frame and inside of BBQ60 |
||
|
BBQ59 |
39077 |
38568 |
101 |
questionable gene; in-frame and inside of BBQ60 |
||
|
BBQ60 |
39360 |
37817 |
101 |
pseudogene |
||
|
BBQ61 |
39482 |
39360 |
117 |
questionable gene; in-frame and inside of BBQ63 |
||
|
BBQ62 |
39531 |
39902 |
questionable gene; backwards in BBQ63 |
|||
|
BBQ63 |
39934 |
39400 |
117 |
pseudogene |
||
|
BBQ64 |
40186 |
39962 |
103 |
questionable gene; in-frame and inside of BBQ65 |
||
|
BBQ65 |
40218 |
39961 |
103 |
pseudogene |
||
|
BBQ66 |
40317 |
40409 |
short gene |
|||
|
BBQ67 |
43732 |
40439 |
C-terminal portion adenine specific DNA methyltransferase {Helicobacter pylori} N-terminal portion hits gene HP1353 {Helicobacter pylori} |
102/ 167 |
probable pseudogene or fusion gene; N-terminal portion is good match to adenine specific DNA methyltransferase; C-term is good match to BBG02 which matches H. pylori HP1353 and HP1352 & HP1354 are putative adenine methylases! |
|
|
BBQ68 |
44232 |
43918 |
138 |
questionable gene; in-frame and part of BBQ69 |
||
|
BBQ69 |
44581 |
43769 |
138 |
pseudogene |
||
|
BBQ70 |
44612 |
44511 |
questionable gene; in-frame and overlaps Q69 (in part) |
|||
|
BBQ71 |
44582 |
45263 |
multidrug-efflux transporter tetA(B) {Helicobacter pylori} |
105 |
pseudogene |
|
|
BBQ72 |
45314 |
45427 |
short gene |
|||
|
BBQ73 |
45630 |
45530 |
60 |
pseudogene |
||
|
BBQ74 |
46462 |
45804 |
60 |
pseudogene |
||
|
BBQ75 |
46671 |
46781 |
168 |
pseudogene |
||
|
BBQ76 |
46982 |
47077 |
short gene |
|||
|
BBQ77 |
47163 |
47279 |
transposase-like protein {Anabaena} |
82 |
pseudogene contains a fragment of [82] in middle; probably actually part of BBQ80 pseudogene |
|
|
BBQ78 |
47295 |
47393 |
questionable gene; out of frame inside BBQ81 |
|||
|
BBQ79 |
47273 |
47569 |
transposase-like protein {Anabaena} |
82 |
pseudogene (see BBQ77 comment) |
|
|
BBQ80 |
48626 |
47787 |
60 |
pseudogene |
||
|
BBQ81 |
49246 |
49047 |
48 |
pseudogene |
||
|
BBQ82 |
49538 |
49347 |
76 |
pseudogene |
||
|
BBQ83 |
49868 |
49755 |
short gene |
|||
|
BBQ84 |
50550 |
50398 |
84 |
short gene (pseudogene [57]) |
||
|
BBQ84.1 |
50900 |
50700 |
57 |
pseudogene |
||
|
BBQ85 |
51528 |
51175 |
57 |
pseudogene |
||
|
BBQ86 |
51823 |
51722 |
57 |
pseudogene |
||
|
BBQ87 |
52067 |
51921 |
57/77 |
pseudogene |
||
|
BBQ88 |
52608 |
52138 |
76 |
|||
|
BBQ89 |
52726 |
52535 |
166 |
short gene; N-terminal lipidation consensus |
||
|
Right 7.2 kbp of B31 chrm |
Note that "genes" BB0850 and BB0851 in this region have been removed from the published (Fraser et al., 1997) gene list due to improvements in the TIGR gene-calling protocol. |
The rightmost 7.2 kbp of the B31 chromosome contains largely plasmid-like sequences, many of which are pseudogenes. This is not true of the remainder or "constant portion" of the chromosome, including sequences near the left chromosomal end. |
||||
|
BB0843.1 |
903255 |
903415 |
32 |
pseudogene |
||
|
BB0844 |
904900 |
903932 |
12 |
|||
|
BB0845 |
905120 |
905224 |
questionable gene; inside BB0845.1 and backwards |
|||
|
BB0845.1 |
905255 |
905025 |
76 |
pseudogene |
||
|
BB0845.11 |
905395 |
905295 |
166 |
possible pseudogene; this is a poor homolog and is not included in the TIGR analysis. |
||
|
BB0845.2 |
905475 |
905775 |
105 |
pseudogene |
||
|
BB0846 |
905865 |
905755 |
questionable gene; overlaps BB0845.2 backwards |
|||
|
BB0847 |
905839 |
905943 |
short gene |
|||
|
BB0848 |
905928 |
906029 |
short gene |
|||
|
BB0848.1 |
906075 |
906275 |
82 |
pseudogene |
||
|
BB0849 |
906162 |
906260 |
questionable gene; inside BB0848.1 |
|||
|
BB0849.1 |
906725 |
906275 |
57 |
pseudogene |
||
|
BB0849.2 |
907225 |
908225 |
1 |
pseudogene |
||
|
BB0850 |
|
|
No longer considered to be a realistic potential gene. |
|||
|
BB0851 |
|
|
No longer considered to be a realistic potential gene. |
|||
|
BB0852 |
908407 |
909588 |
138 |
|||
|
BB0853 |
910175 |
909845 |
57 |
pseudogene |
||
|
BB0853.1 |
910555 |
910375 |
57 |
pseudogene |
Paralogous Gene Families in B. burgdorferi B31
Compiled by Daniel Haft, Owen White and Sherwood Casjens - April 1999
Procedure for generation of the paralogous gene families.
1. Any pair of B31 proteins whose comparison scored better than 0.02 probability by FASTA3 was clustered. Any additional protein similar to (with better than 0.02 probability) any member of an existing cluster joined that cluster. Therefore each protein is a member of AT MOST a single cluster (no cluster is linked to any other cluster by a score better than 0.02). However, clusters may fail to be closed under transitivity; that is, if protein A scores better than 0.02 with B, and B scores better than 0.02 with C, A does not necessarily score better than 0.02 with C.
2. This preliminary clustering was followed by manual curation of amino acid sequence alignments. Final clusters were generated by parsing the approved alignments. Curation was as follows: For each initial cluster, multiple sequence alignments were generated by CLUSTALW and by a TIGR program called MSA (G. Sutton, unpublished; not to be confused with another MSA program) that runs on the MASPAR. The better of the two alignments was selected by inspection of both. Sometimes manual editing was performed to improve the alignment further.
3. Adjustments to the paralogous families were achieved by splitting clusters, not joining them or adding proteins to clusters (with a single exception, in which one additional GTP-binding protein was added to the cluster of all other GTP-binding proteins). The standard for splitting versus not splitting was as follows: Alignments were viewed in BELVU, which could be used to generate a UPGMA difference tree (which is NOT a phylogenetic tree). In some cases, a domain could be recognized that was responsible for the initial clustering. If large portions of the resulting alignment contained protein sequence that clearly was not homologous, although aligned, these were split. This happened most often for plasmid proteins that shared similar amino-terminal domains (signal and lipidation sequences), but that otherwise were easily resolvable into several different classes. Two notable cases, genes BBR41 and BBQ67, appear to be gene fusions that join large, easily recognizable parts of two genes that fall into two different paralogous families; thus, even though BBR41, for example, bridges two families by virtue of its two different domains, these two families were not joined since they have no similarity to one another.
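The initial clustering in step 1 amounts to single-linkage clustering, which can be sketched as below. This is only an illustration, not the TIGR procedure itself: the function name, the protein labels, and the scores are hypothetical, and the sketch assumes that "better than 0.02 probability" means a reported probability below 0.02.

```python
def single_linkage_clusters(proteins, scores, threshold=0.02):
    """Group proteins so that any pair scoring better than `threshold`
    lands in the same cluster (assumption: lower probability = better)."""
    parent = {p: p for p in proteins}

    def find(p):
        # Follow parent links to the cluster representative,
        # shortening the path as we go.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for (a, b), prob in scores.items():
        if prob < threshold:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a  # merge the two clusters

    clusters = {}
    for p in proteins:
        clusters.setdefault(find(p), []).append(p)
    # Only groups with two or more members count as paralogous clusters.
    return [sorted(members) for members in clusters.values() if len(members) > 1]


# Hypothetical scores: A-B and B-C pass the threshold, A-C does not.
example_proteins = ["A", "B", "C", "D", "E"]
example_scores = {
    ("A", "B"): 0.001,
    ("B", "C"): 0.01,
    ("A", "C"): 0.5,
    ("D", "E"): 0.3,
}
example_clusters = single_linkage_clusters(example_proteins, example_scores)
```

In this made-up example A, B and C end up in one cluster even though A and C do not score better than 0.02 with each other, which is the lack of transitivity noted in step 1 and the reason the manual curation in steps 2 and 3 splits some clusters.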
Why are some numbers not used in the current set of paralogous family names?
Why are some published paralogous family names no longer used?
The methods TIGR uses to classify paralogous gene family members have changed somewhat since Fraser et al. (1997) was published. Where possible, the gene family names in that paper are used here. In cases where two or more families have fused since that publication, we chose one of the previous family names and have not re-used the others. Hence, not all numbers are currently used in the list of family names; 161 of the numbers between 1 and 175 are currently used as family names (the numbers 5, 7, 17, 24, 27, 28, 51, 67, 73, 79, 81, 83, 87, 169, etc., are not used).
Mini-summary of Some Paralogous Relationships
There are 161 paralogous families of B. burgdorferi B31 genes, 107 of which have plasmid-borne members. The family sizes vary from 2 members to 41 members. Family 57 has 41 members, of which only 16 appear to be full-length, intact genes. Some families have noticeable subgroups that are not delineated here; for example, family 148 (26 members, of which 24 are intact) has three clear subgroups, each of which has 8 members (these subgroups coincide with three contiguous related genes on each of the cp32s and on lp56).
Some of the largest paralogous families are as follows:
| family | total genes | pseudogenes | apparently intact genes |
| 32 | 29 | 4 | 25 |
| 49 | 26 | 6 | 20 |
| 50 | 23 | 3 | 20 |
| 54 | 14 | 4 | 10 |
| 57 | 41 | 25 | 16 |
| 60 | 15 | 6 | 9 |
| 62 | 12 | 3 | 6 |
| 80 | 18 | 1 | 17 |
| 82 | 17 | 17 | 0 |
It is curious to note that four of the above families (32, 49, 50 and 57) are members of the so-called "partition gene cluster" (Zuckert & Meyer, 1996; Casjens et al., 1999), intact variants of which are found on all or most of the B31 plasmids.
Most plasmid genes are members of paralogous families. Only 63 of the 535 plasmid non-pseudogenes >300 bp have no paralogs, whereas 93 of the 134 non-pseudogenes ≤300 bp have no paralogs.
The 63 >300 bp plasmid genes with no paralogs are the following:
BBB01, BBB02, BBB03, BBB04, BBB05, BBB06, BBB07, BBB08, BBB09, BBB14, BBB17, BBB18, BBB19, BBB24, BBB25, BBB26, BBB27, BBB28, BBS27, BBM39, BBD09, BBD10, BBD11, BBD13, BBD18, BBE17, BBE22, BBG12, BBG13, BBG14, BBG15, BBG16, BBG17, BBG18, BBG21, BBG26, BBG27, BBG28, BBG30, BBH06, BBK19, BBK32, BBJ09, BBJ25, BBJ27, BBJ28, BBJ48, BBA03, BBA05, BBA07, BBA13, BBA30, BBA33, BBA36, BBA37, BBA44, BBA50, BBA52, BBA57, BBA59, BBA60, BBA61, BBA62.
Paralogous Gene Families in B. burgdorferi B31
Definitions used in the following table:
* - Indicates pseudogenes as defined in PARTs I and IV.
LP - Indicates that the gene contains a "perfect" lipoprotein consensus (see PART III).
LP? - Indicates that the gene contains an "imperfect" but near-consensus lipidation sequence (see PART III). Cross referencing of this feature into gene families without plasmid members was not performed.
- Indicates families with only chromosomal member genes.
† - Daggers indicate genes inside of a larger pseudogene (i.e., they are part of a larger entity that is also in the gene list).
(...) - Gene names in parentheses are not in this location in the TIGR paralog list - see comments column in those cases.
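For readers who want to process the member lists in the following table programmatically, the notation defined above can be parsed roughly as follows. This is a hedged sketch: the function name and the returned field names are hypothetical, and it assumes that a bracketed star such as "(*)" or a "*?" marks an uncertain pseudogene call, which the definitions above do not state explicitly.

```python
import re

# One member token, e.g. "BBK01*LP", "BBJ08LP?", "BBI43(*)" or "(BBJ42*?)".
MEMBER_TOKEN = re.compile(
    r"^(?P<paren>\()?"                     # "(" = listed elsewhere in the TIGR list
    r"(?P<gene>BB[A-Z]?\d+(?:\.\d+)?)"     # gene name such as BB0844 or BBQ84.1
    r"(?P<pseudo>\*\??|\(\*\??\))?"        # "*", "*?", "(*)" or "(*?)"
    r"(?P<lp>LP\??)?"                      # "LP" (perfect) or "LP?" (near-consensus)
    r"\)?$"
)


def parse_member(token):
    """Split one table token into its gene name and annotation flags."""
    match = MEMBER_TOKEN.match(token)
    if match is None:
        raise ValueError(f"unrecognized member token: {token!r}")
    pseudo = match.group("pseudo")
    return {
        "gene": match.group("gene"),
        "pseudogene": pseudo is not None,
        # Assumption: "?" or a bracketed star marks an uncertain call.
        "uncertain": pseudo is not None and ("?" in pseudo or pseudo.startswith("(")),
        "lipoprotein": match.group("lp"),
        "relocated": match.group("paren") is not None and token.endswith(")"),
    }


def parse_members(cell):
    """Parse a whole whitespace-separated member cell from the table."""
    return [parse_member(token) for token in cell.split()]
```

For example, `parse_members("BB0844LP BBG01LP BBH37LP BBJ08LP? BBK01*LP")` yields five records, the last of which is flagged both as a pseudogene and as carrying a perfect lipoprotein consensus.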
|
Paralogous Family Name |
Member Genes and Pseudogenes* |
Comments |
|
1 family |
BB0849.2* BBE02 BBH09 BBH11.1* BBK02.1* BBK02* BBK03* BBK04* BBK10* BBK25.1* BBK26* BBK28* BBK29* BBK30* |
7 members |
|
2 family |
BB0246 BB0255 BB0262 BB0761 |
4 members |
|
3 family |
BB0611 BB0757 |
2 members |
|
4 family |
BB0080 BB0146 BB0218 BB0318 BB0334 BB0335 BB0466 BB0573 BB0642 BB0677 BB0742 BB0754 BBJ26 |
13 members |
|
6 family |
BB0020 BB0727 |
2 members |
|
8 family |
BB0302 BB0719 |
2 members |
|
9 family |
BB0264 BB0518 BB0715 |
3 members |
|
10 family |
BB0076 BB0270 BB0694 |
3 members |
|
11 family |
BB0088 BB0540 BB0691 |
3 members |
|
12 family |
BB0844LP BBG01LP BBH37LP BBJ08LP? BBK01*LP |
5 members; All are fairly near a telomere, transcribed towards center of plasmid. |
|
13 family |
BB0578 BB0596 BB0597 BB0680 BB0681 |
5 members |
|
14 family |
BB0419 BB0420 BB0551 BB0570 BB0672 BB0763 |
6 members |
|
15 family |
BB0517 BB0655 |
2 members |
|
16 family |
BB0116 BB0645 BBB29 |
3 members |
|
18 family |
BB0344 BB0607 |
2 members |
|
19 family |
BB0408 BB0629 |
2 members |
|
20 family |
BB0581 BB0623 |
2 members |
|
21 family |
BB0002 BB0620 |
2 members |
|
22 family |
BB0253 BB0613 |
2 members |
|
23 family |
BB0369 BB0834 |
2 members |
|
25 family |
BB0137 BB0593 |
2 members |
|
26 family |
BB0375 BB0588 BBE07* BBI06 |
4 members |
|
29 family |
BB0451 BB0452 |
2 members |
|
30 family |
BB0036 BB0436 |
2 members |
|
31 family |
BB0035 BB0435 |
2 members |
|
32 family |
BB0269 BB0361 BB0431 BB0726 BB0843.1* BBA20 BBB12 BBD21 BBE19 BBE23.2* BBF11.1* BBF13 BBF24 BBG08 BBH28 BBI21 BBJ17 BBK16* BBK21 BBL32 BBM32 BBN32 BBO32 BBP32 BBQ08 BBQ40 BBR33 BBS35 BBU05 |
29 members; Previously called ORF-C (Zuckert and Meyer, 1996). Homology to parA genes in other bacterial systems suggests these genes function in plasmid partitioning. BBD21 is a fairly distant member. |
|
33 family |
BB0040 BB0312 BB0414 BB0565 BB0670 |
5 members |
|
34 family |
BB0738 BB0833 |
2 members |
|
35 family |
BB0405 BB0406 BB0562 BB0563 BB0564 |
5 members |
|
36 family |
BB0382 BB0383 BB0384 BB0385 |
4 members |
|
37 family |
BB0328LP BB0329LP? BB0330LP? BBA34LP BBB16LP |
5 members; oppA homologous genes (oligopeptide ABC transporter) |
|
38 family |
BB0221 BB0290 |
2 members |
|
39 family |
BB0093 BB0094 BB0288 |
3 members |
|
40 family |
BB0223 BB0224 BBK13 |
3 members |
|
41 family |
BB0145 BB0216 BB0217 BB0332 BB0333 BB0640 BB0641 BB0746 BB0747 |
9 members |
|
42 family |
BB0059 BB0202 |
2 members |
|
43 family |
BB0074 BB0196 |
2 members |
|
44 family |
BB0158LP BB0159 BBA04LP BBE04.1*LP BBE05* BBE09LP BBF22* BBH36* BBH36.1* BBK52LP BBQ04*LP |
9 members; BBA04 protein is "S2 antigen" |
|
45 family |
BB0018 BB0815 |
2 members |
|
46 family |
BB0111 BBG32 |
2 members; These proteins are homologs to helicases; lipidation seems unlikely |
|
47 family |
BB0050 BB0051 |
2 members |
|
48 family |
BB0034 BBA01 BBG03* BBH41 BBI31 BBJ02.1* BBJ03* BBQ06 BBQ81* |
9 members; BBG03* has patchy similarity to family and is slightly truncated at N-terminus |
|
49 family |
BBA21 BBB13 BBC03 BBE18 BBE24* BBE24.1* BBE30* BBF12(*?) BBF23 BBF29* BBG09 BBH29 BBI22 BBI40* BBJ16 BBK24 BBL34 BBM33 BBN33 BBO33 BBP33 BBQ07 BBQ41 BBR34 BBS36 BBU06 |
26 members Previously called ORF-3 (Dunn et al., 1994; Zuckert and Meyer, 1996); BBF12 is very patchy paralog (fusion pseudogene?); BBE24.1 is a fragment of BBF12 that is not very related to the rest of the family; BBF29 is patchy paralog (fusion pseudogene?) |
|
50 family |
BBA19 BBB11 BBC02 BBE20 BBF14 BBF25 BBF31* BBG07 BBG31* BBH07* BBH27 BBI20 BBJ18 BBK22 BBL31 BBM31 BBN31 BBO31 BBP31 BBQ09 BBQ39 BBR32 BBS34 |
23 members Previously called ORF-2 (Dunn et al., 1994; Zuckert and Meyer, 1996). |
|
51 |
no longer exists - merged into family 62 |
|
|
52 family |
BBI42LP BBJ50* BBK53LP BBQ03LP BBT07* BBU12* |
6 members; BBJ50* is fairly distant member of family |
|
53 family |
BBA15LP BBA16LP |
2 members; ospA & ospB genes |
|
54 family |
BBA64LP BBA65LP BBA66LP BBA68LP BBA69LP BBA70* BBA71* BBA73LP BBE04*LP? BBI36LP BBI38LP BBI39LP BBJ39.1* BBJ41LP (BBJ42*?) |
14 members; Only BBA64, BBA65, BBA66 and BBA73 have a "standard" lipidation consensus; BBA68, BBA69, BBI36, BBI38, BBI39 and BBJ41 have sequences that fit a slightly relaxed consensus; Includes "old family 87"; BBJ42* has very weak homology to family 54; it is not included in the TIGR computer's paralog list. |
|
55 family |
BBC08 BBI43(*) BBK54(*) BBQ01(*) BBU09(*) |
5 members; A confusing family - it demonstrates the problems involved in attempting to make gene/pseudogene decisions on novel DNA sequence for which there is no information beyond sequence: relative sizes are BBC08 > BBU09>BBK54=BBI43=BBQ01; I called the shortest 4 pseudogenes, but obviously no one really knows; it is possible that BBC08 is unusually large for an as yet unknown reason. |
|
56 family |
BB0473 BB0583 BB0584 |
3 members |
|
57 family |
BB0849.1* BB0853* BB0853.1* BBA18 BBC01 (BBD02*) BBD03* BBD04* BBD05.1* BBD06* BBE21 BBE23.1* BBE32* BBE33* (BBF04*) BBF05* BBF06* BBF26 BBF31.1* BBG06 (BBH03*) BBH04* BBH05* BBI01* BBI02.2* BBI19 BBL30 BBM30 BBN30 BBO30 BBP30 BBQ38 BBQ84.1* BBQ85* BBQ86* (BBQ87*) BBR31 BBS33 BBT01* BBT02* BBT04 BBT05* BBU01* BBU04 BBU07* BBU10* |
45 members - family 57 is a distant relative of family 62. Previously called ORF-1 (Dunn et al., 1994; Zuckert and Meyer, 1996). BBD02, BBF04, BBH03 & BBQ87 (family 77) appear to be fusions of part of a family 57 gene and something else; BBD05, BBI02, BBQ84, BBT03, and BBU02 (family 84) also have weak similarities to family 57. Many of the family 57 members are in highly recombined telomeric regions and are perhaps not functional (?) (Casjens et al., 2000). |
|
58 family |
BBJ10* BBK40 |
2 members BBK40 contains repeats that are somewhat similar to BBG33 in family 80 |
|
59 family |
BBI08.1* BBI09* BBI10* BBJ31 BBJ45LP? BBK07LP BBK12LP BBK39* |
6 members only BBK07 and BBK39 have good lipidation consensus |
|
60 family |
BBE31LP BBH32LP BBI14*LP BBI15* BBI16LP BBI27* BBI28LP BBI29LP BBI34LP BBJ001*LP BBJ01LP BBJ02 BBK15 BBQ05LP BBQ73* BBQ74* BBQ80* |
15 members; F01 is a fairly poor paralog; K15 is missing the LP consensus - is it a pseudogene? |
|
61 family |
BBH33* BBK17 |
2 members |
|
62 family |
BBB10 BBD14 BBG29 BBH26 BBH34* BBJ19 BBK23 BBQ10* BBQ55* |
9 members - family 62 is related to family 57 Published families 51 and 62 now merged |
|
63 family |
BBC10LP BBM27LP BBP27LP |
3 members BBP27 is rev gene of (Gilmore et al., 1997) |
|
64 family |
BBF16* BBK34 |
2 members |
|
65 family |
BBA76 BBF14.1* BBH18.1* BBK33* |
4 members |
|
66 family |
BBF21 BBK36 |
2 members |
|
68 family |
BBF17* BBK35 |
2 members |
|
69 family |
BBH18LP BBJ13* BBK47LP BBK49LP |
4 members published families 69 & 81 merged |
|
70 family |
BBF10 BBK41 |
2 members |
|
71 family |
BBF09* BBK42 |
2 members |
|
72 family |
BBF08* BBK42.1* BBK43* |
2 members complex situation regarding pseudogenes |
|
74 family |
BBA24LP BBA25LP |
2 members dbpA & dbpB (decorin binding proteins) (Guo et al., 1998) |
|
75 family |
BBK37* BBK45LP? BBK46* BBK48LP BBK50LP |
5 members; BBK45 does not have the lipidation consensus as the gene was originally listed, but it has a near-consensus lipidation signal at an alternate translation start (see Part III). BBK37 is truncated and fused to other sequences C-terminal to its family 75 similarity; thus it is also placed in family 175 by virtue of similarity to that family at its C-terminus; it looks as though it could be expressed. |
|
76 family |
BB0845.1* BBD01 BBH02 BBQ82* BBQ88* |
5 members |
|
77 family |
BBD02 BBF04 BBH03 BBQ87 |
4 members These appear to be fusions between family 57 and something else (that is common to family 77 genes); see also comments under family 57 |
|
78 family |
BB0283 BB0293 BB0774 BB0775 |
4 members |
|
80 family |
BBF001.1* BBF03* BBG33 BBH13 (BBK40) BBL27 BBL35 BBM34 BBN27 BBN34 BBO27 BBO34 BBP34 BBQ34 BBQ42 BBR27 BBR35 BBS29 BBS37 |
19 members; Two subfamilies previously typified by ORF-E (Zuckert and Meyer, 1996; Zuckert et al., 1999) and rep (Porcella et al., 1996). Now these are called the bdr genes, for Borrelia direct repeat containing genes (Zuckert et al., 1999). Most cp32s carry two homologs in this family, one at or near gene position 28 (rep) and the other at or near gene position 34 (ORF-E). BBK40 (family 58) contains repeats that are somewhat similar to BBG33, but is otherwise not very similar to this family. |
|
82 family |
BB0848.1* BBD08* BBD15.1* BBD20* BBD23* BBE21.1* BBF18* BBF19* BBG05* BBH10.1* BBH40* BBI33* BBI41* BBJ05* BBK001* BBK25* BBQ77* BBQ79* |
17 members BBG05 is "best" family member; it has one frameshift relative to homologous putative transposases in some other bacteria. It does not appear that BBG05 could be expressed by programmed translational frameshifting, and so it appears that strain B31 no longer has an intact version of this gene. |
|
84 family |
BBD05* BBI02(*) BBQ84* BBT03(*) BBU02(*) |
5 members; This group may in fact be part of family 57 since BBI02, BBU02 & BBT03 have weak similarity to the N-terminal region of [57] members; if so then all the genes in family 84 would be pseudogenes. These are all in the highly recombined telomeric regions which may carry no "real" genes (Casjens et al., 2000). |
|
85 family |
BBD15LP BBF20*LP |
2 members |
|
86 family |
BBG22 BBG23 BBH24.1* BBJ12* BBJ12.1* |
4 members |
|
87 family |
none |
published family 87 (Fraser et al., 1997) merged into family 54 |
|
88 family |
BBF001* BBF02 BBG34 |
3 members |
|
89 family |
BB0712 BB0771 |
2 members |
|
90 family |
BBJ29 BBJ43 |
2 members |
|
91 family |
BBJ30 BBJ44 |
2 members |
|
92 family |
BBJ34LP BBJ36LP BBJ49* |
3 members |
|
93 family |
BBI35 BBI37 |
2 members |
|
94 family |
BBB22 BBB23 |
2 members |
|
95 family |
BBC06 BBH09.1* BBS42 |
3 members; BBC06 and BBS42 previously called eppA and bapA, respectively (Champion et al., 1994; Wallich et al., 1995) |
|
96 family |
BBC11 BBH30* BBL38 BBM37 BBN37* BBO38 BBP37 BBQ45 BBR38 BBS40 |
10 members; Previously called ORF-6 (Dunn et al., 1994; Zuckert and Meyer, 1996). |
|
97 family |
BB0068 BB0421 |
2 members |
|
98 family |
BBE03 BBI31.1* BBJ07* BBJ07.1* |
3 members |
|
99 family |
BBE16LP? BBJ47LP? |
2 members |
|
100 family |
BBF07 BBK44 |
2 members |
|
101 family |
BBF26.1* BBF27* BBG10 BBQ57 BBQ58 BBQ59 BBQ60* |
3 members |
|
102 family |
BBE29.1* BBG02LP BBH36.2* (BBQ67*) |
4 members The C-terminal portion of BBQ67 is similar to BBG02; the N-terminal part is similar to adenine methylases (see family 167) |
|
103 family |
BBG20 BBQ64 BBQ65* |
2 members |
|
104 family |
BBG24 BBH20.1* BBH23* BBH24* |
2 members |
|
105 family |
BB0845.2* BBI26 BBJ15.1* BBQ71* |
4 members |
|
106 family |
BBJ23 BBJ24 |
2 members |
|
107 family |
BBA43 BBL08 BBM08 BBN08 BBO08 BBP08 BBQ15 BBR08 BBS08 |
9 members BBA43 is the most distant paralog of this family |
|
108 family |
BBL09 BBM09 BBN09 BBO09 BBP09 BBQ16* BBR09 BBS09 |
8 members |
|
109 family |
BBL23 BBM23 BBN23 BBO23 BBP23 BBQ30 BBR23 BBS23 |
8 members |
|
110 family |
BB0079 BB0081 |
2 members |
|
111 family |
BBL24 BBM24 BBN24 BBO24 BBP24 BBQ31 BBR24 BBS24 |
8 members |
|
112 family |
BBL25 BBM25 BBN25 BBO25 BBP25 BBQ32 BBR25 BBS25 |
8 members |
|
113 family |
BBL28LP BBM28LP BBN28LP BBO28LP BBP28LP BBQ35LP BBR28LP BBS30LP |
8 members cp32 mlp genes |
|
114 family |
BBL41 BBN41 BBO42 BBP40 BBQ48 BBR43 |
7 members |
|
115 family |
BBL42 BBM41 BBN42 BBO43 BBP41 BBQ49 BBR44 BBS44 |
8 members |
|
116 family |
BBN40 BBO41 |
2 members |
|
117 family |
BBG19 BBQ61 BBQ63* |
2 members |
|
118 family |
BB0098 BB0797 |
2 members |
|
119 family |
BB0136 BB0718 |
2 members |
|
120 family |
BB0147 BB0182 |
2 members |
|
121 family |
BB0172 BB0173 |
2 members |
|
122 family |
BB0179 BB0508 BB0643 |
3 members |
|
123 family |
BB0058 BB0195 |
2 members |
|
124 family |
BB0225 BB0737 |
2 members |
|
125 family |
BB0231 BB0245 BB0538 |
3 members |
|
126 family |
BB0251 BB0587 |
2 members |
|
127 family |
BB0295 BB0612 |
2 members |
|
128 family |
BB0304 BB0817 |
2 members |
|
129 family |
BB0316 BB0317 |
2 members |
|
130 family |
BB0678 BB0679 |
2 members |
|
131 family |
BB0366 BB0627 |
2 members |
|
132 family |
BB0415 BB0568 |
2 members |
|
133 family |
BB0471 BB0505 |
2 members |
|
134 family |
BB0567 BB0669 |
2 members |
|
135 family |
BB0638 BB0637 |
2 members |
|
136 family |
BB0652 BB0653 |
2 members |
|
137 family |
BB0734 BBT06 BBU08* BBU11 |
4 members |
|
138 family |
BB0852 BBJ21.1* BBQ68 BBQ69* |
3 members |
|
139 family |
BBA08 BBL19 BBM19 BBN19 BBO19 BBP19 BBQ26 BBR19 BBS19 |
9 members |
|
140 family |
BBA09 BBL20 BBM20 BBN20* BBO20 BBP20 BBQ27 BBR20 BBS20 |
9 members |
|
141 family |
BBA10 BBL21 BBM21 BBN21* BBO21 BBP21 BBQ28 BBR21 BBS21 |
9 members |
|
142 family |
BBA11 BBL22 BBM22 BBN22 BBO22 BBP22 BBQ29 BBR22 BBS22 |
9 members |
|
143 family |
BBA14LP BBG25LP BBL26LP? BBM26LP? BBN26LP? BBO26LP? BBP26LP? BBQ33LP? BBR26LP? BBS26LP? |
10 members only BBA14 and BBG25 have lipidation consensus; others are one off from the current consensus (see section III below) - they are the most distant members of the family; Porcella et al. (1996) showed that a homolog (from strain 297) of the cp32 members of this family is not lipidated in E. coli. |
|
144 family |
BBA23 BBL37 BBM36 BBN36 BBO37 BBP36 BBQ44 BBR37 BBS39 |
9 members Previously called ORF-10 (Dunn et al., 1994; Zuckert and Meyer, 1996). BBG27 is a weak paralog of family 144 |
|
145 family |
BBA31 BBL43 BBM42 BBN43 BBO44 BBP42 BBQ50 BBR45 BBS45 |
9 members phage φO1205 Orf26 homology; Orf26 is a possible phage structural protein |
|
146 family |
BBA38 BBL01 BBM01 BBN01 BBO01 BBP01 BBQ51* BBR01 BBS01 |
9 members |
|
147 family |
BBA39 BBL02 BBM02 BBN02 BBO02 BBP02 BBQ52 BBR02 BBS02 |
9 members |
|
148 family |
BBA40 BBL03 BBL04 BBL05 BBM03 BBM04 BBM05 BBN03 BBN04 BBN05* BBO03 BBO04 BBO05 BBP03 BBP04 BBP05 BBQ11* BBQ12 BBQ53 BBQ54* BBR03 BBR04 BBR05 BBS03 BBS04 BBS05 |
26 members There are three subfamilies in 148; each subfamily has one member from each cp32; the three subfamily genes lie in a contiguous cluster on each cp32. |
|
149 family |
BBA41 BBL06 BBM06 BBN06* BBO06 BBP06 BBQ13 BBR06 BBS06 |
9 members |
|
150 family |
BBA42 BBL07 BBM07 BBN07 BBO07 BBP07 BBQ14 BBR07 BBS07 |
9 members |
|
151 family |
BBA45 BBL10 BBM10 BBN10 BBO10 BBP10 BBQ17 BBR10 BBS10 |
9 members |
|
152 family |
BBA46 BBL11 BBM11 BBN11 BBO11 BBP11 BBQ18 BBR11 BBS11 |
9 members |
|
153 family |
BBA47 BBL12 BBM12 BBN12 BBO12 BBP12 BBQ19 BBR12 BBS12 |
9 members |
|
154 family |
BBA48 BBL13 BBM13 BBN13* BBO13 BBP13 BBQ20 BBR13 BBS13 |
9 members |
|
155 family |
BBA49 BBL14 BBM14 BBN14 BBO14 BBP14 BBQ21 BBR14 BBS14 |
9 members |
|
156 family |
BBL15 BBM15 BBN15 BBO15 BBP15 BBQ22 BBR15 BBS15 |
9 members |
|
157 family |
BBA51 BBL16 BBN16* BBM16 BBO16 BBP16 BBQ23 BBR16 BBS16 |
9 members BBN16 authentic frameshift |
|
158 family |
BBA53 BBA54 |
2 members |
|
159 family |
BBA55 BBL17 BBM17 BBN17 BBO17 BBP17 BBQ24 BBR17 BBS17 |
9 members |
|
160 family |
BBA56 BBL18 BBM18 BBN18* BBO18 BBP18 BBQ25 BBR18 BBS18 |
9 members |
|
161 family |
BBC05 BBL29 BBM29 BBN29* BBO29 BBP29 BBQ37 BBR29 BBR41* BBS31 |
10 members Previously called ORF-4 (Dunn et al., 1994; Zuckert and Meyer, 1996). BBR41* is a fusion of family 161 and 162 genes. |
|
162 family |
BBL39LP BBN38LP BBP38LP BBR40*LP BBR41* |
5 members erp genes; BBR41* is an apparent fusion of family 161 and 162 genes. |
|
163 family |
BBF01LP? BBL40LP BBN39LP BBO40LP BBP39LP BBQ47LP |
6 members erp genes; BBF01 has small patch of similarity to this gene family and an N-terminal sequence that is one off from our "stringent consensus" lipidation sequence. |
|
164 family |
BBM38LP BBO39LP BBR42LP BBS41LP |
4 members erp genes |
|
165 family |
BBC12 BBL36 BBM35 BBN35 BBO36 BBP35 BBQ43 BBR36 BBS38 |
9 members Previously called ORF-8/7 (Dunn et al., 1994; Zuckert and Meyer, 1996). |
|
166 family |
(BB0845.11*) BBD001LP BBH01LP BBQ89LP |
4 members; Could well all be pseudogenes since they are in highly recombined telomeric regions (Casjens et al., 2000). BB0845.11 is a possible pseudogene; this is a poor homolog and is not included in the TIGR master gene list |
|
167 family |
BBE29* BBQ67(*) BBJ20* |
3 members; The N-terminal portion of BBQ67 is paralogous to BBE29 and its C-terminal portion is similar to BBG02 (family 102); The N-terminal region of BBQ67 is a good match to the full length of an adenine methylase and could be functional in spite of this apparent fusion to BBG02-like sequences? BBJ20 is similar to the N-terminal portion of BBQ67 but not BBE29. |
|
168 family |
BBI30 BBQ75* |
2 members |
|
170 family |
BBF32* vlsELP BBJ51* |
3 members; BBF32 contains 15 tandem, direct repeats of vlsE-like sequences (in effect it is 15 pseudogenes, even though they are in-frame with one another); the vlsE gene is the only known B31 gene that is absent from the TIGR sequence; this is due to its terminal location on lp28-1 and the presence of an unclonable sequence near it - see Zhang et al. (1997) and Casjens et al. (1999). |
|
171 family |
BBA74 BBH20* (BBJ11.1*) |
3 members; BBA74 is osm28 of Skare et al. (1996); the BBJ11.1 similarity to BBA74 is short and not strong, and it is not in the TIGR master gene list |
|
172 family |
BBI02.1* BBU03 |
2 members |
|
173 family |
BBJ32 BBJ45.1* |
2 members |
|
174 family |
BBK52.1* BBQ02 |
2 members |
|
175 family |
BBD15.01* BBF19.1* BBK37* |
3 members BBK37 is a fusion gene that also contains sequences similar to family 75 members; although BBK37 is truncated relative to intact family 75 members it appears that it might be expressed. |
Borrelia burgdorferi B31 Plasmid Lipoprotein Genes
Compiled by Sherwood Casjens, Dan Haft & Claire Fraser - April, 1999
Find below in this section
(i) Background on lipidation consensus sequence
(ii) Numbers of possible lipoprotein genes on B31 plasmids
(iii) Cross-referenced table of the possible lipoprotein genes on B31 plasmids
(iv) Summary of lipoprotein analysis
(v) List of sequences of N-terminal 60 amino acids of the consensus and near-consensus lipoprotein genes on B31 plasmids
Most authors use an "N-terminal lipidation consensus sequence" in which the proteins must
(i) have a Cys between positions 10 and 30
(ii) have a credible hydrophobic N-terminal signal sequence
(iii) have a positively charged amino acid very near the N-terminus
(iv) contain the following consensus relative to the above Cys (defined as position +1)
[L,A,V,I,F,T,M] [L,A,V,I,F,S] X [G,A,S,N] C
-4 -3 -2 -1 +1 position relative to C
We refer to this as the "stringent consensus" in the following discussion (for Gram positive bacterial consensus, see Sutcliffe and Russell, 1995).
B31 proteins that contain this consensus may be the most likely to be lipoproteins in reality, but lipoprotein gene identification in this way is not completely certain, especially in Borrelia, so the following analysis should be understood as a current best guess, and NOT as the final word in the identification of Borrelia burgdorferi B31 lipoprotein encoding genes.
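As an illustration only, criteria (i) and (iv) of the "stringent consensus" can be written as a short pattern search. The sketch below is in Python; the function name `find_stringent_cys` and its parameters are hypothetical and are not part of the original analysis. It does not test criteria (ii) and (iii) (signal sequence hydrophobicity and the N-terminal positive charge), so a hit here is a necessary but not sufficient condition.

```python
import re

# Stringent consensus from the text, positions -4, -3, -2, -1, +1:
# [L,A,V,I,F,T,M] [L,A,V,I,F,S] X [G,A,S,N] C
MOTIF = r"[LAVIFTM][LAVIFS].[GASN]C"

# A lookahead is used so that overlapping candidate sites are all reported.
MOTIF_SCAN = re.compile(r"(?=(" + MOTIF + r"))")


def find_stringent_cys(seq, lo=10, hi=30):
    """Return 1-based positions of Cys residues that lie in the stringent
    consensus and fall between positions lo and hi (criterion i)."""
    hits = []
    for match in MOTIF_SCAN.finditer(seq):
        cys_pos = match.start() + 5  # the Cys is the 5th residue of the motif
        if lo <= cys_pos <= hi:
            hits.append(cys_pos)
    return hits


# BBB19 (ospC) carries the consensus; BBP26 is "one off" and gives no hit.
bbb19 = "MKKNTLSAILMTLFLFISCNNSGKDGNTSANSADESVKGPNLTEISKKITDSNAVLLAVK"
bbp26 = "MKMLKRLHCLLIVLLLCCTTIANLPEEPKPPIIPTLKSLAKYETQLSEYVMYLVTFLAKT"
print(find_stringent_cys(bbb19), find_stringent_cys(bbp26))
```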
Interestingly, the sequence "LTXIC" is present in conjunction with a decent signal sequence in A68, A69, I36, I38, I39 and J41. These genes are all in paralogous family [54], which contains a number of other consensus lipoprotein genes; could this mean that this "non-consensus" sequence is actually lipidated? Similarly the erpL gene (BBO39) has an M at position -3 but is likely to be a lipoprotein, since it is a member of a family of genes in which all other members have the above "stringent" lipidation consensus. Perhaps a better(?) "relaxed consensus" for potential Borrelia lipoproteins can be guessed at from "near-consensus" genes which are known surface proteins or which fall into paralogous families with "consensus gene" members. Most proteins of this type have conservative differences from the above consensus - in particular, S, G or M in position -4, T or M in position -3, and I, T or L in position -1; these could be included in a "relaxed consensus". In part because of these uncertainties, an analysis (described below) that was more complex than a simple consensus search was performed on all of the putative B31 encoded proteins in order to identify the potential lipoproteins.
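The "relaxed consensus" suggested above can be sketched in the same way. The following Python fragment is a hypothetical illustration, not the procedure actually used for this analysis: it takes the stringent residue sets and adds the extra residues named in the paragraph (S, G or M at -4; T or M at -3; I, T or L at -1). The function name `classify_cys` is an assumption made for this example.

```python
# For each constrained position: (stringent residues, extra "relaxed" residues).
# Position -2 is unconstrained (X) in both versions of the consensus.
POSITIONS = {
    -4: ("LAVIFTM", "SGM"),
    -3: ("LAVIFS", "TM"),
    -1: ("GASN", "ITL"),
}


def classify_cys(seq, cys_pos):
    """Classify the Cys at 1-based position cys_pos as 'stringent',
    'relaxed', or None (neither consensus, or not a usable Cys)."""
    i = cys_pos - 1
    if i < 4 or i >= len(seq) or seq[i] != "C":
        return None
    used_relaxed = False
    for offset, (strict, extra) in POSITIONS.items():
        residue = seq[i + offset]
        if residue in strict:
            continue
        if residue in extra:
            used_relaxed = True
        else:
            return None
    return "relaxed" if used_relaxed else "stringent"


# erpL (BBO39) has M at -3, so it only meets the relaxed version.
print(classify_cys("MNKKMKMFIICAVFALMISCKNYA", 20))
```

Because the relaxed sets are broad, a pattern like this will also accept sites that are unlikely to be lipidated, which is one reason a simple consensus search was not relied on by itself.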
Prediction of Lipoproteins
Putative B31 proteins were included in the predicted lipoprotein list by the following procedure:
1. A preliminary B31 potential lipoprotein list was generated by rules derived from other species (positive charge near N-terminus followed by a hydrophobic stretch of amino acids that is in turn followed immediately by a lipidation consensus).
2. This list was then manually curated using the Borrelia lipoprotein literature and by inspection of the N-terminal sequences. At this point a few genes were added and/or removed. Nearly all of the remaining proteins met the "stringent" consensus (above) and the remaining few met a somewhat more relaxed consensus with only conservative amino acid additions.
3. This list was then used to build a multiple sequence alignment of the putative lipoprotein N-terminal regions that included the presumptive lipoprotein signal and some additional sequence. The alignment was edited, poorly aligned sequences removed, and the alignment was trimmed at the N-terminus to include Met-1 of most but not all sequences, and at the C-terminus to include the predicted modified Cys residue and 4 residues beyond it (the analysis showed that these last 4 residues had little effect on the final results).
4. From this alignment, a Hidden Markov Model (HMM) for the N-terminal region of lipoproteins was constructed using the HMMER 1.8.4 package.
5. The HMM analysis provided a set of all B31 proteins with potential lipidation sequences, listed according to descending HMM score. This list was used in a final manual curation of the potential lipoprotein list as follows:
(i) Any high-scoring genes (HMM scores >25 with a lipidation consensus at position ≤40) that were previously missed were added to the list.
(ii) In some cases new start codon assignment allowed high scoring genes to be added to the list.
(iii) Genes that are members of "lipoprotein families" were added if their HMM score was >19.5. The HMM scores dropped off rather quickly; only 16 plasmid proteins not in the final potential lipoprotein list have the required N-terminal positive charge and hydrophobic signal sequence features and HMM scores above 11.
6. The resulting "Potential B31 Lipoprotein List" includes 141 genes given below.
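The two numerical rules in step 5 can be summarized in a few lines of Python. This is a sketch under stated assumptions: the `Candidate` record and its field names are hypothetical, the example values are invented for illustration, and the position limit is read as "at or before position 40". The real list also depended on manual curation and revised start codons (steps 2 and 5(ii)), which no score filter reproduces.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    gene: str                    # hypothetical identifier
    hmm_score: float             # score from the N-terminal lipoprotein HMM
    cys_position: int            # 1-based position of the consensus Cys
    in_lipoprotein_family: bool  # member of a family with predicted lipoproteins


def passes_score_rules(c):
    """Apply rule 5(i) (score >25, consensus by position 40) and
    rule 5(iii) (family member with score >19.5)."""
    if c.hmm_score > 25 and c.cys_position <= 40:
        return True
    if c.in_lipoprotein_family and c.hmm_score > 19.5:
        return True
    return False


# Invented example records, not taken from the B31 tables.
examples = [
    Candidate("geneA", 26.0, 20, False),
    Candidate("geneB", 20.0, 20, True),
    Candidate("geneC", 20.0, 20, False),
    Candidate("geneD", 30.0, 45, False),
]
for c in examples:
    print(c.gene, passes_score_rules(c))
```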
The HMM analysis also identified a number of additional proteins that have N-terminal regions that are similar to those predicted to be lipoproteins, but which do not quite meet the above criteria. We conclude that there may be as many as 20-50 lipoproteins in Borrelia beyond those we specifically predicted.
Note: Bona fide type I signal sequences on non-lipidated proteins share some of the above characteristics, including basic residues near the N-terminus followed by a hydrophobic region. Lipoproteins differ in having the modified Cys itself and the lipidation consensus region. The HMM model was not specifically designed to discriminate between type I and lipidated signal sequences, so many of the weakly predicted lipoproteins are likely to be membrane proteins even if they are not in fact lipidated.
Summary of Lipoproteins Encoded by the B31 Genome
Predicted B31 Lipoproteins
136 putative B31 genes encode proteins that are the most likely to be lipoproteins.
38 of these putative lipoproteins are chromosomally encoded (all by "intact" genes).
98 of these are plasmid encoded.
90 putative lipoproteins are plasmid encoded by "intact" genes.
7 putative plasmid encoded lipoproteins (BBF20, BBF32, BBI14, BBJ001[BBJ01], BBK01, BBQ04, BBR40) come from ORFs currently classified as pseudogenes that may have intact translation starts with the lipidation consensus sequence.
1 putative plasmid lipoprotein (BBI32) is translated from a "questionable" gene (see discussion of these above).
9 paralogous gene families (families 36, 53, 63, 74, 85, 113, 164, 166, 170) encode only predicted lipoproteins, and several others (e.g., family 12) contain genes that encode proteins with only a near-consensus lipidation sequence.
17 paralogous gene families (families 12, 21, 37, 40, 44, 52, 54, 59, 60, 69, 75, 92, 102, 136, 143, 162, 163) are heterogeneous in that at least 1 potential LP and at least one non-LP is found in the family. A substantial fraction of the proteins from these families that are not predicted to be lipidated by the above protocol have "near-consensus" sequences, suggesting that they might in fact also be lipoproteins.
The 38 chromosomal genes listed as potentially encoding lipoproteins are:
BB0028, BB0038, BB0071, BB0141, BB0144, BB0155, BB0213, BB0215, BB0224, BB0227, BB0298, BB0321, BB0324, BB0352, BB0328, BB0365, BB0382, BB0383, BB0384, BB0385, BB0398, BB0458, BB0464, BB0475, BB0536, BB0542, BB0553, BB0620, BB0628, BB0652, BB0664, BB0689, BB0758, BB0806, BB0823, BB0832, BB0840, BB0844
The 98 plasmid genes and pseudogenes that are potential lipoproteins are as follows:
(* indicates ORFs currently classified as pseudogenes; indicates "questionable" gene calls; # pseudogenes not listed as LPs on the TIGR WEB site for technical reasons):
BBA04, BBA05, BBA07, BBA14, BBA15, BBA16, BBA24, BBA25, BBA33, BBA34, BBA36, BBA57, BBA59, BBA60, BBA62, BBA64, BBA65, BBA66, BBA68, BBA69, BBA73, BBB08, BBB09, BBB14, BBB16, BBB19, BBB25, BBB27, BBC10, BBD001, BBD10, BBD15, BBE06, BBE08, BBE09, BBE31, BBF20*, BBF32*, vlsE#, BBG01, BBG02, BBG25, BBH01, BBH18, BBH32, BBH37, BBI14*, BBI16, BBI28, BBI29, BBI32, BBI34, BBI36, BBI38, BBI39, BBI42, BBJ001#*(BBJ01*), BBJ09, BBJ34, BBJ36, BBJ41, BBK01*, BBK07, BBK12, BBK19, BBK47, BBK48, BBK49, BBK50, BBK52, BBK53, BBL28, BBL39, BBL40, BBM27, BBM28, BBM38, BBN28, BBN38, BBN39, BBO28, BBO39, BBO40, BBP27, BBP28, BBP38, BBP39, BBQ03, BBQ04*, BBQ05, BBQ35, BBQ47, BBQ89, BBR28, BBR40*, BBR42, BBS30, BBS41
The following is a list of the scores of the potential lipoprotein gene products obtained from the HMM analysis (see above; Dan Haft, unpublished). These scores were taken into account but were not the sole criterion for inclusion of genes as potential lipoprotein encoding genes.
|
GENE/aa |
HMM SCORE |
PREDICTED N-TERMINAL AA SEQUENCE |
|
BBR40/1-24 |
36.67 |
MNKKMKNLIICAVFVLIISCKNNT |
|
BBS41/1-24 |
36.54 |
MNKKMKNLIICAVFVLIISCKIDA |
|
BBR42/1-24 |
36.17 |
MNKKIKMFIICAIFMLISSCKNDV |
|
BBK52/1-24 |
35.38 |
MKKNIYILNIFLYIPLFYSCFLTP |
|
BBQ47/1-24 |
35.27 |
MNKKMKIFIICAVFVLISSCKIDA |
|
BBM28/1-22 |
34.68 |
MKIINIL--FCLFLLLLNSCNSND |
|
BBP28/1-22 |
34.68 |
MKIINIL--FCLFLLLLNSCNSND |
|
BBQ35/1-22 |
33.69 |
MKIINIL--FCISLLLLNSCNSND |
|
BBC10/1-26 |
33.66 |
MQKINIAKLIFILIFSLFVISCELFI |
|
BB0664/12-35 |
33.62 |
MNINTLFYGMIIIIFALISCNHKN |
|
BBN28/1-22 |
33.44 |
MKIINIL--FCLFLLMLNSCNSND |
|
BBO39/1-24 |
33.34 |
MNKKMKMFIICAVFALMISCKNYA |
|
BBB08/1-22 |
32.71 |
MKKKFNF--IFPFIIFLFSCNISV |
|
BBJ09/1-24 |
32.58 |
MKKLIKILLLSLFLLLSISCVHDK |
|
BBA14/1-24 |
32.51 |
MQIKNFPFLFLLNSLIIFSCSTIA |
|
BBL39/5-28 |
32.12 |
MNKKMKMFIICAVFILIGACKIHT |
|
BBP38/1-24 |
32.12 |
MNKKMKMFIICAVFILIGACKIHT |
|
BB0398/1-24 |
31.93 |
MNLLVKIAKFILILFLFTSCNQKQ |
|
BBO40/1-22 |
31.65 |
MNKKILI--IFAVFALIISCKNYA |
|
BBM27/1-24 |
31.33 |
MRNKNIFKLFFASMLFVMACKAYV |
|
BBL28/1-22 |
30.97 |
MKIINIL--FCLFLLMLNGCNSND |
|
BBO28/1-22 |
30.97 |
MKIINIL--FCLFLLMLNGCNSND |
|
BBR28/1-22 |
30.97 |
MKIINIL--FCLFLLMLNGCNSND |
|
BBS30/1-22 |
30.97 |
MKIINIL--FCLFLLMLNGCNSND |
|
BBK48/1-24 |
30.91 |
MNLINKLFILTILFSSVISCKLYK |
|
BBB16/1-26 |
30.43 |
MKILIKKLKVVLFLNLILLISCVNES |
|
BBB19/1-23 |
30.38 |
MKKNTLSAI-LMTLFLFISCNNSG |
|
BBE09/1-24 |
30.32 |
MQKDIYISNIFLYIPLFYSCFLTP |
|
BBM38/5-26 |
30.32 |
MNKKMFI--ICAIFALIVSCKNYA |
|
BBP27/1-24 |
30.23 |
MRNKNIFKLFFAAMLFVMACKAYV |
|
BBI14/1-22 |
29.88 |
MKNKIIL--CMCVFSLLNSCNFDN |
|
BBA65/1-24 |
29.85 |
MNKIKLSILITLGITTFFSCDLNN |
|
BBK19/1-21 |
29.73 |
MKKYIIN---LSLCLLLLSCNLFS |
|
BBD15/1-26 |
29.66 |
MNRKFVISLLFIILTFLLILGCDLSI |
|
BBJ36/1-24 |
29.59 |
MKRKSNICISLLVTILFVSCKFFG |
|
BBE31/1-22 |
29.58 |
MKYHIIV--SIFIFLFLNACNPDS |
|
BBA07/1-24 |
29.55 |
MCGRRMKNILLFVILLFFSCKEFN |
|
BBG25/1-27 |
29.44 |
MKNLKTKINFLGIFWLLLLFLSCESIP |
|
BBN38/1-24 |
29.36 |
MNKKMKMFIVCAVFILIGACKIHT |
|
BBD001/1-19 |
29.08 |
MNKL-----LIFIILLVFSCNLSN |
|
BBH01/1-19 |
29.08 |
MNKL-----LIFIILLVFSCNLSN |
|
BBQ89/1-19 |
29.08 |
MNKL-----LIFIILLVFSCNLSN |
|
BB0155/1-23 |
28.94 |
MKKHYKALI-LSLLFAIISCNTKT |
|
BBA69/6-29 |
28.90 |
LNIIKINIITMILTLICISCAPFN |
|
BBK50/1-24 |
28.89 |
MNLIIKVMLISSLFSSFISCKLYE |
|
BBJ01/1-22 |
28.70 |
MKYHIIV--SIFVFLFLNACNPDS |
|
BBA73/1-25 |
28.62 |
MKRNKIWKTLKLFQITLLFSCSFYS |
|
BBL40/1-22 |
28.55 |
MNKKTII--ICAVFALILSCKNYA |
|
BBP39/1-22 |
28.55 |
MNKKTII--ICAVFALILSCKNYA |
|
BBA59/1-21 |
28.15 |
MVKKII---FISFSIFIVSCSAIG |
|
BBD10/1-23 |
28.14 |
MNSKFILK-YFILAFFLVSCQTYQ |
|
BB0844/1-23 |
28.03 |
MKKKNLSI-YMIMLISLLSCNTSD |
|
BB0365/1-26 |
27.84 |
MYKNGFFKNYLSLFLIFLVIACTSKD |
|
BB0840/1-25 |
27.76 |
MKNINRLILLILTTHTLLFSCALIA |
|
BBI29/1-22 |
27.73 |
MKNNIIL--CMCVFLLLNSCTANH |
|
BB0806/1-26 |
27.55 |
MKFVLNNLFKGCLICFFLFFSCLTTD |
|
BBN39/1-22 |
27.47 |
MNKKTLI--ICAVFALIISCKNFA |
|
BB0328/1-22 |
27.39 |
MKYIKIA--LMLIIFSLIACISNA |
|
BBA05/1-22 |
27.28 |
MNKIGIA--FIISFLLFVNCKGKS |
|
BB0385/1-21 |
26.57 |
MLKKVY---YFLIFLFIVACSSSD |
|
BBH32/1-22 |
26.56 |
MKYNTII--SIFVCLFLTACNPDF |
|
BBE08/1-21 |
26.49 |
MQKNVY---CFIIFVLISSCNNYA |
|
BB0141/2-30 |
26.48 |
MNLIFNINLYLKKYFLVLFLVLVACVGDN |
|
BBH37/1-22 |
26.46 |
MKKKMFL--YTLLTIGLMSCNLNS |
|
BBF20/1-26 |
26.35 |
MNKKFSISLLSTILAFLLVLGCDLSS |
|
BB0352/1-26 |
26.27 |
MNNFMRIKNLILIAILLISPSCSTNK |
|
BB0536/1-27 |
26.26 |
MNYQRIKNYCKFTSVFLFFLFSCVSNE |
|
BB0383/1-22 |
25.89 |
MNKILLL--ILLESIVFLSCSGKG |
|
BB0324/1-18 |
25.77 |
MKK------LILLNLIFISCYTIN |
|
BB0652/1-22 |
25.61 |
MKKGSKL--ILILLVTFFACLLIF |
|
BBA57/1-27 |
25.60 |
MNGKLRKALKIAIFTTLLLVISCNANM |
|
BBA60/1-21 |
25.22 |
MSKKVI---LILLEILILSCDLSI |
|
BB0215/1-24 |
25.14 |
MMKKVIILIFMLSTSLLYNCKNQD |
|
BBA33/1-21 |
24.90 |
MKRYIY---VYIISVAVISCYLND |
|
BBA68/6-29 |
24.71 |
LNIIKINIIAMILTLICTSCAPFS |
|
BBK12/1-20 |
24.64 |
MSKLI----LAISILLIISCKWYV |
|
BBK47/6-29 |
24.44 |
IKIFIIPNLVFSSLFLFESCSGFL |
|
BBK49/6-29 |
24.44 |
IKIFIIPNLVFSSLFLFESCSGFL |
|
BBJ34/1-23 |
24.32 |
MIKGNTFI-LILVTTMFVSCKFYG |
|
BB0628/1-21 |
24.21 |
MRKCFV---SLSLLLIFFACSSNV |
|
BBE06/1-21 |
23.97 |
MRFLNI---IKSLEFFVMSCNDIF |
|
BB0028/6-29 |
23.83 |
ENYFKKRLILNLLIFLLLACSSES |
|
BBA64/6-32 |
23.81 |
LKNNKLIAIFLLHVLTVLILISCSLEV |
|
BBB09/1-20 |
23.76 |
MKYLKN----ISLFLLILGCKSIP |
|
BBB14/2-26 |
23.7 |
ILYQNQLKFLKLLVFFLLISCTSLN |
|
BB0224/3-26 |
23.4 |
GRKDLFFLILFLSFSIIISCRVKG |
|
BBK07/1-20 |
23.4 |
MSKLI----LAISILLIISCKWHV |
|
BBQ05/1-20 |
23.35 |
MKYYI----CVCVFLLLNACNSDF |
|
BBA66/1-23 |
23.24 |
MKIKPLIQ-LKLLGLFLFSCTIDA |
|
BBI28/1-21 |
23.22 |
MKCHIIA---TIFVFLFLACSTDF |
|
BBI16/1-21 |
23.19 |
MKYHII---TTIFVFLFLACRPDF |
|
BB0689/1-20 |
22.97 |
MKKLI----IIFTLFLSQACNLST |
|
BBA25/1-25 |
22.81 |
MKIGKLNSIVIALFFKLLVACSIGL |
|
BB0542/12-35 |
22.79 |
DLSKFFMYKLLFIIVFVLSCSSIF |
|
BBB27/1-20 |
22.66 |
MKKFL----ISVYFLLFYGCSTIS |
|
BB0823/1-20 |
22.36 |
MNTKT----LYLISLILLACNKNN |
|
BBI34/1-22 |
22.36 |
MKHYIIV--HIFVFLFLNACYPVA |
|
BBB25/1-21 |
22.29 |
MKYCFS---LILMVFICSSCKILN |
|
BB0298/1-20 |
21.70 |
MKILW----LIILVNLFLSCGNES |
|
BBA62/1-22 |
21.51 |
MTKLMYA--IFLSAILFVACETTR |
|
BBA15/1-21 |
21.43 |
MKKYLLG---IGLILALIACKQNV |
|
BB0144/1-20 |
21.38 |
MYKLF----LFFIIFMFLSCDEKK |
|
BB0384/1-21 |
21.35 |
MFKRFI---FITLSLLVFACFKSN |
|
BB0758/10-36 |
21.32 |
LKLLRQSINLKSLFPLSVLFFSCNVVD |
|
BB0038/18-41 |
21.17 |
ELIPFYKFLFLFFFFTLLACSKVS |
|
BB0620/1-20 |
21.03 |
MKRNFY----LIVLFIANNCFSID |
|
BB0321/5-29 |
20.81 |
MVYSLKIQIEEEINIFVFISCLKLL |
|
BB0382/1-19 |
20.76 |
MRIV-----IFIFGILLTSCFSRN |
|
BBA36/1-21 |
20.70 |
MMQRIS---ILLMLLAVFSCKQFG |
|
BBJ41/1-29 |
20.45 |
MKNLKLNIIKLNVITAILTSICISCAPFG |
|
BB0464/1-19 |
20.42 |
MKILR-----LCLLFLFFACTFDY |
|
BB0475/7-31 |
20.35 |
FKLKLLPILVISGILIVFMSCMKTS |
|
BBG02/1-27 |
20.16 |
MEINLQSKLNNKNNNKLIFFISCSLVL |
|
BB0553/1-21 |
20.07 |
MNKTKNR---SLTYFIILSCISLF |
|
BBI39/1-29 |
19.84 |
MKNFKLNIIKLNVITAILTSICISCAPFG |
|
BB0213/1-22 |
19.76 |
MQSGLKI--KLILFFCCFACSCDI |
|
BBI36/1-29 |
19.62 |
MKNFKLNTIKLNVITAILTLICISCAPFG |
|
BBI38/1-29 |
19.62 |
MKNFKLNTIKLNVITAILTLICISCAPFG |
|
BBG01/1-22 |
19.51 |
MRKSLFL--YTLLMGGLMSCNLDS |
|
BB0227/1-27 |
19.50 |
MLNPRTIKTTFMLISTLMIFNGCTKKL |
|
BBA24/5-29 |
19.22 |
NNKTFNNLLKLTILVNLLISCGLTG |
|
BB0071/1-24 |
19.19 |
MVRFLGFLYLITTIPLIKSCDAAQ |
|
BBA34/1-23 |
19.11 |
MIIKKRGL-LILGIATVISCSAMS |
|
BBA04/1-19 |
19.06 |
MKRV-----IVSFVVLILGCNLDD |
|
BBK01/1-22 |
18.94 |
MRKSLFL--YALLMGGLMSCNLDS |
|
BBH18/6-29 |
17.94 |
KRVGNKIFYISVVLILIVGCDWGT |
|
BBI32/3-29 |
17.61 |
LNFRVIFLISHTQYMYSPILKSCEFIN |
|
BB0832/1-22 |
17.38 |
MLKTLTK--IITISCLIVGCASLP |
|
BB0458/16-39 |
17.09 |
KRSKMRLILMLLLXFLCFSTLLSQ |
|
BBQ03/1-22 |
16.09 |
MRILVGV--FIIAALALLGCYLPD |
|
BBK53/1-22 |
14.77 |
MRILVGV--CIIAALALLGCYLPD |
|
BBI42/1-21 |
14.70 |
MRILVG---VCIIALALLGCYLPD |
|
BBA16/1-20 |
13.07 |
MRLLIG----FALALALIGCAQKG |
Below this line are listed the HMM scores of "near cutoff" genes that did not
make it into the list of potential lipoprotein genes. Some of them could in fact
be lipoproteins. See description of lipoprotein prediction criteria above.
|
BBK45/26-48 |
24.00 |
MNLMIKVLI..........FSLFLSFISCKLYE |
|
BB0460/1-21 |
18.27 |
MKTRII............IFLSILSILSCSKSV |
|
BBK32/1-24 |
18.13 |
MKKVKSKYL.........ALGLLFGFISCDLFI |
|
BBQ46/1-17 |
17.71 |
MIKNVIYILp kt...kkFSQIVWLLIACKIWI |
|
BB0323/1-25 |
17.40 |
MNIKNKLISl........LIVVAISFIACKTPP |
|
BB0171/1-25 |
17.30 |
MGRNFLAILy........FCFLFLGFLSCSNVK |
|
BB0329/1-23 |
16.13 |
MKLQRSLF..........LIIFFLTFLCCNNKE |
|
BB0259/16-48 |
15.96 |
MKKINMFNRsscvlqnflFLFLFLSLVSCFAKK |
|
BB0330/7-32 |
15.60 |
KKIGKKIKIv.......tLLMLAVSLIACNNNS |
|
BBB24/18-41 |
15.59 |
LKIFKKYLL.........LFYLLFLTLSCSTIY |
|
BBA32/1-20 |
15.14 |
MRITG.............LLFFLICLLSCSSFN |
|
BB0639/1-20 |
14.81 |
MKKIF.............ILIVILTTFACTNKD |
|
BBA03/1-21 |
13.90 |
MKKTII............VFIILAFMLNCKNKS |
|
BBH06/1-23 |
12.76 |
MKKSFLSI..........YMLISISLLSCDVSR |
|
BBS26/1-22 |
12.19 |
MKMLKRL...........HCLLIVLLLCCTTIA |
|
BBP26/1-22 |
12.19 |
MKMLKRL...........HCLLIVLLLCCTTIA |
|
BBO26/1-22 |
12.19 |
MKMLKRL...........HCLLIVLLLCCTTIA |
|
BBL26/1-22 |
12.19 |
MKMLKRL...........HCLLIVLLLCCTTIA |
|
BBR26/1-22 |
10.70 |
MKMLKRL...........HCLLIALLLCCTTIA |
|
BBQ33/1-22 |
10.70 |
MKMLKRL...........HCLLIALLLCCTTIA |
|
BBN26/1-22 |
10.70 |
MKMLKRL...........HCLLIALLLCCTTIA |
|
BBM26/1-22 |
10.70 |
MKMLKRL...........HCLLIALLLCCTTIA |
|
BB0560/1-23 |
10.68 |
MILIFYFK..........QIALFIIFRLCYIIK |
|
BB0729/7-30 |
10.60 |
LYTLINIII.........MLILISIVYLCKRKN |
|
BB0305/1-28 |
10.31 |
MNSISKIEFkv.....ycILVLILTVILCFNIY |
|
BB0332/1-27 |
10.22 |
MLKFTLKKIlg......iIPTLLVIIFLCFFVM |
|
BB0204/1-26 |
16.01 |
MKFIINLLLs.......tIKIITFTVIVCLTIL |
|
BB0752/1-27 |
12.73 |
MKNKENEVLnl......tLNLTIIFLIFCNISI |
|
BBF01/1-23 |
12.48 |
MKLLKIFM..........CAFLLLNLVNCKFDS |
|
BB0315/1-22 |
11.64 |
MRKITIM...........ILFYGLIINVCPTTT |
|
BB0353/1-21 |
11.45 |
MKREIYA............FLSNFIIFMCFFLG |
|
BBJ47/5-28 |
11.16 |
SNCIKYIIL.........TMLIGLLIFCCATFV |
Of this latter group (near, but below cutoff) the following are members of families with putative lipoprotein members:
BB0328, BB0329, BB0330, BBF01, BBK45, BBL26, BBM26, BBN26, BBO26, BBO39, BBP26, BBQ33, BBR26 & BBS26.
BBK32 has been shown to be an outer surface protein.
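For readers who want to work with the HMM score lists above programmatically, the following Python sketch parses records laid out as three cells (gene/residue range, score, aligned N-terminal sequence) separated by "|". The function names and the dictionary keys are assumptions made for this example; gap characters ("-" and ".") are simply removed when an ungapped sequence is needed.

```python
import re

# A gene cell looks like "BBR40/1-24": gene name, then the aligned residue range.
GENE_CELL = re.compile(r"([A-Za-z0-9.]+)/(\d+)-(\d+)")


def parse_hmm_records(text):
    """Parse '|'-delimited cells into a list of record dictionaries."""
    cells = [c.strip() for c in text.split("|") if c.strip()]
    records = []
    for i, cell in enumerate(cells[:-2]):
        match = GENE_CELL.fullmatch(cell)
        if match:
            records.append({
                "gene": match.group(1),
                "start": int(match.group(2)),
                "end": int(match.group(3)),
                "score": float(cells[i + 1]),
                "aligned": cells[i + 2],
            })
    return records


def ungapped(aligned):
    """Remove alignment gap characters from an aligned sequence."""
    return aligned.replace("-", "").replace(".", "")


sample = """|
BBR40/1-24 |
36.67 |
MNKKMKNLIICAVFVLIISCKNNT |
|
BBM28/1-22 |
34.68 |
MKIINIL--FCLFLLLLNSCNSND |
"""
for rec in parse_hmm_records(sample):
    print(rec["gene"], rec["score"], ungapped(rec["aligned"]))
```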
B31 Plasmid-encoded Putative B. burgdorferi Lipoproteins
Footnotes to following table
1. Previously described gene or gene family name.
2. BB gene names are the open reading frames identified in the nucleotide sequence of the genome of isolate B31.
* Putative pseudogenes may not always be included in this table, but when they are they are indicated by an asterisk.
# Has a "near-cutoff" lipidation signal, but was not included in our list of potential lipoproteins.
member of paralogous family but has NO lipidation signal.
3. vlsE, the rightmost gene on lp28-1 is not in the reported genomic sequence, apparently due to an unclonable sequence nearby (Zhang et al., 1997).
4. References to the complete B31 genome sequence (Casjens et al., 1999; Fraser et al., 1997) are not given in the table. Sequences from Borrelia isolates other than B31 are indicated in French brackets {....}.
|
Common gene name1 |
BB gene name2 |
Paralog Family |
Location |
Comments |
References4 |
|
7.5 kDa lipoprotein |
BBA62 |
|
lp54 |
transcription start (Indest et al., 1997); also called 6.6 kDa lipoprotein |
possible surface exposure (Katona et al., 1992; Lahdenne et al., 1997) {297} |
|
12 kDa lipoprotein |
BBA59 |
|
lp54 |
short gene but it is expressed (McGrath et al., 1997) |
|
|
mlpA mlpC mlpD mlpF mlpG mlpH mlpI mlpJ |
BBP28 BBS30 BBR28 BBM28 BBO28 BBL28 BBN28 BBQ35 |
113 |
cp32-1 cp32-3 cp32-4 cp32-6 cp32-7 cp32-8 cp32-9 lp56 |
also called "LP" and "nlpH" |
(Gilmore and Mbow, 1998; Porcella et al., 1996) {297}; lipidated in E. coli (Porcella et al., 1996); lipidated in B. afzelii DK1 (Theisen, 1996) {297, DK1} |
|
dbpA dbpB |
BBA25 BBA26 |
74 |
lp54 |
binds decorin |
surface exposed on outer membrane; binds decorin (Guo et al., 1998; Hagman et al., 1998; Hanson et al., 1998){N40} |
|
erpA & B erpG erpH & Y
erpK
erpL & M
erpN & O
erpP & Q
erpX |
BBP38, P39 BBS41 BBR40*, R42 BBM38 BBO39#, O40
BBL39, L40
BBN38, N39
BBQ47 |
162, 163 and 164 |
cp32-1 cp32-3 cp32-4 cp32-6 cp32-7 cp32-8 cp32-9 lp56 |
these genes fall into at least 3 paralogous classes [162,163,164] (Akins et al., 1999; Casjens et al., 1999; Stevenson et al., 1998b) also called "p21" and "ospE & ospF" |
surface exposed (Lam et al., 1994){N40}; lipidated in E. coli (Akins et al., 1995b; Wallich et al., 1995); erp-like genes have been sequenced from several strains (Akins et al., 1999; Lam et al., 1994; Marconi et al., 1996b; Stevenson et al., 1997; Stevenson et al., 1996; Suk et al., 1995){297, ZS7, N40} |
|
fibronectin binding protein |
BBK32# |
|
lp36 |
fibronectin binding; called P35 in strain N40; upregulated in stationary phase in N40 (Fikrig et al., 1997; Indest et al., 1997) |
one off from "stringent lipidation consensus" (G in position -4); surface localized (Probert and Johnson, 1998) |
|
oppAIV
oppAV |
BBA34 BBB16 |
37 |
cp26 lp54 |
ABC oligopeptide transporter |
B16 not surface exposed; not essential in culture (Bono et al., 1998) |
|
ospA ospB |
BBA15 BBA16 |
53 |
lp54 lp54 |
transcription start site (Jonsson et al., 1992); OspA atomic resolution structure (Li et al., 1997); in vitro mutagenesis (McGrath et al., 1995) |
outer surface protein (Barbour et al., 1983); lipidated in Bb (Bergstrom et al., 1989; Brandt et al., 1990); sequenced in numerous other strains (Bunikis et al., 1996; Caporale and Kocher, 1994; Jonsson et al., 1992; Marconi et al., 1993a; Rosa et al., 1992; Wallich et al., 1992; Wallich et al., 1989; Wang et al., 1997a; Wang et al., 1997b; Wang et al., 1997c; Will et al., 1995; Wilske et al., 1996a; Wilske et al., 1996b; Wilske et al., 1992; Zumstein et al., 1992) |
|
ospC |
BBB19 |
|
cp26 |
temperature regulated (Schwan et al., 1995; Stevenson et al., 1995) transcription start site (Marconi et al., 1993b) |
surface localized (Wilske et al., 1993); sequenced in {many strains} (e.g., Fuchs et al., 1992; Jauris-Heipke et al., 1993; Jauris-Heipke et al., 1995; Marconi et al., 1993c; Margolis et al., 1994a; Margolis et al., 1994b; Masuzawa et al., 1997; Stevenson and Barthold, 1994; Stevenson et al., 1994; Tilly et al., 1997; Wang et al., 1999; Wilske et al., 1996a; Wilske et al., 1996b) |
|
ospD |
BBJ09 |
|
lp38 |
short direct repeat upstream of gene |
outer surface protein (Marconi et al., 1994; Norris et al., 1992){many strains} |
|
P27 |
BBA60 |
|
lp54 |
(Reindl et al., 1993) {B29} |
|
|
P35 antigen |
BBA64 BBA65 BBA66 BBA68 BBA69 BBA70* BBA71*
BBA73
BBI36
BBI38
BBI39
BBJ41 |
54 |
lp54 lp54 lp54 lp54 lp54 lp54 lp54 lp28-4 lp28-4 lp28-4 lp38 |
BBA64 cell density-dependent expression and transcription start (Indest et al., 1997)
|
(Gilmore et al., 1997)
|
|
rev |
BBP27 BBC10 BBM27 |
63 |
cp32-1 cp9 cp32-4 |
(Gilmore and Mbow, 1998; Porcella et al., 1996) |
|
|
S1 |
BBA05 |
|
lp54 |
(Feng et al., 1995){N40} |
|
|
S2 |
BBA04 BBE09 BBK04* BBK52 BBQ04* |
44 |
lp54 lp25 lp36 lp36 lp56 |
(Feng et al., 1995){N40} |
|
|
vlsE7 |
vlsE BBF32* BBJ52* |
170 |
lp28-1 lp28-1 lp36 |
F32 is 15 tandem unexpressed versions of part of the vlsE gene |
cassette surface antigenicity variation mechanism (Zhang et al., 1997; Zhang and Norris, 1998a; Zhang and Norris, 1998b) |
|
Below are plasmid open reading frames with LP "stringent" consensus that have not been previously given common names |
|||||
|
BBA07 |
|
lp54 |
|||
|
BBA14 BBG25 |
143 |
lp54 lp28-2 |
other cp32 members of family do not have lipidation consensus |
||
|
BBA33 |
|
lp54 |
|||
|
BBA36 |
|
lp54 |
|||
|
BBA57 |
|
lp54 |
|||
|
BBB08 |
|
cp26 |
|||
|
BBB09 |
|
cp26 |
|||
|
BBB14 |
|
cp26 |
|||
|
BBB25 |
|
cp26 |
|||
|
BBB27 |
|
cp26 |
|||
|
BBE31 BBH32 BBI14* BBI16 BBI28 BBI29 BBI34 BBJ001* BBK15# BBQ05 |
60 |
lp25 lp28-2 lp28-4 lp28-4 lp28-4 lp28-4 lp28-4 lp38 lp36 lp56 |
|||
|
BBD001 BBH01 BBQ89 |
166 |
lp17 lp28-3 lp56 |
≤300 bp ≤300 bp ≤300 bp |
||
|
BBD10 |
|
lp17 |
|||
|
BBD15 BBF20* |
85 |
lp17 lp28-1 |
|||
|
BBE06 |
|
lp25 |
≤300 bp |
||
|
BBE08 |
|
lp25 |
≤300 bp |
||
|
BBG01 BBH37 BBK01* BBJ08 BB0844 |
12 |
lp28-2 lp28-3 lp36 lp38 chrm |
|||
|
BBG02 |
102 |
lp28-2 |
|||
|
BBI32 |
|
lp28-4 |
questionable gene |
||
|
BBJ34 BBJ36 |
92 |
lp38 lp38 |
|||
|
BBK07 BBK12 |
59 |
lp36 lp36 |
|||
|
BBK19 |
|
lp36 |
|||
|
BBK45# BBK46* BBK48 BBK50 |
75 |
lp36 lp36 lp36 lp36 |
|||
|
BBK47 BBK49 BBH18 |
69 |
lp36 lp36 lp28-3 |
|||
|
BBK53 BBI42 BBQ03 |
52 |
lp36 lp28-4 lp56 |
|||
N-terminal Sequences of Potential B31 Plasmid-Encoded Lipoproteins
The table below shows the first 60 amino acids of all the B31 plasmid genes that contain a putative N-terminal lipidation consensus. It also includes those B31 plasmid genes that are not included in our "potential lipoprotein list" but which have an HMM score (above) of >11, as well as some added by manual inspection. If fewer than 60 amino acids are shown in the table below, the entire predicted amino acid sequence of that gene's translation is shown.
Notes on the following table:
* Asterisks mark the putative proteins which contain the following "stringent" consensus
[L,A,V,I,F,T,M] [L,A,V,I,F,S] X [G,A,S,N] C
-4 -3 -2 -1 +1 position relative to C
Non-stringent consensus proteins that we consider to be potential lipoproteins (see rules above)
# Indicates a below cut-off gene that might be worth considering when dealing with the potential lipoprotein genes in this genome, since it is particularly close to the above "stringent" consensus.
Underlined amino acids are non-stringent consensus amino acids within the consensus region.
Alternate, better(?) translation starts are double underlined in the following table.
Cs are red, positively charged K, R and Hs are blue and negatively charged D and Es are green.
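The residue classes named in the last note (Cys; positively charged K, R and H; negatively charged D and E) can be recomputed directly from any sequence in the table. The Python sketch below is illustrative only; the function name `residue_classes` and the dictionary keys are assumptions for this example.

```python
def residue_classes(seq):
    """Return 1-based positions of Cys, positively charged (K, R, H)
    and negatively charged (D, E) residues in a protein sequence."""
    classes = {"cys": [], "positive": [], "negative": []}
    for pos, residue in enumerate(seq, start=1):
        if residue == "C":
            classes["cys"].append(pos)
        elif residue in "KRH":
            classes["positive"].append(pos)
        elif residue in "DE":
            classes["negative"].append(pos)
    return classes


# First 20 residues of BBB27 as listed in the table below.
print(residue_classes("MKKFLISVYFLLFYGCSTIS"))
```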
cp9
BBC10 *
MQKINIAKLIFILIFSLFVISCELFIIKRRATITETTTIEKKRINWLIMSVSGLNDEADE
cp26
BBB08 *
MKKKFNFIFPFIIFLFSCNISVSSIFIRPLDEVIKSEIALYESLGDGKFKTGIHAKNYFD
BBB09 *
MKYLKNISLFLLILGCKSIPNGNFNLHDTNHKLGKLKFQEDSIISRNYDNKISIVGVYNP
BBB14 *
MILYQNQLKFLKLLVFFLLISCTSLNVEHDQFGKTFRIYQSLNKNAELKGIFNYKTGITK
BBB16 * (oppAIV)
MKILIKKLKVVLFLNLILLISCVNESNRNKLVFKLNIGSEPATLDAQLINDTVGSGIVSQ
BBB19 * (ospC) (surface exposed, Wilske et al., 1993)
MKKNTLSAILMTLFLFISCNNSGKDGNTSANSADESVKGPNLTEISKKITDSNAVLLAVK
BBB24 #
MHLKTKFYKKTYILWTFLKIFKKYLLLFYLLFLTLSCSTIYFDGIPELKKDSKYIKLIQE
BBB25
MKYCFSLILMVFICSSCKILNIAEDLEKNFEKIERADYFLYFYPDSQIYIKKDKSSNKFS
BBB27 *
MKKFLISVYFLLFYGCSTISLVKIPEKDKINLTVLSSLMNYPDLKISNFKIKDYEHLHYS
cp32-1
BBP26 # (one member of this family [from strain 297] was not lipidated in E. coli, Porcella et al., 1996)
MKMLKRLHCLLIVLLLCCTTIANLPEEPKPPIIPTLKSLAKYETQLSEYVMYLVTFLAKT
BBP27 * (rev-1)
MRNKNIFKLFFAAMLFVMACKAYVEEKKEIDSLMEDVLALVNDSSGGKFKDYKDKINELK
BBP28 * (mlpA) (lipidated in E. coli, Porcella et al., 1996), (lipidated in Bb, Theisen, 1996)
MKIINILFCLFLLLLNSCNSNDNDTLKNNAQQTKSRGKRDLTQKEATPEKPKSKEELLRE
BBP38 * (erpA) (Erp homologs surface exposed in other Bb strains and lipidated in E. coli, Akins et al., 1995b; Lam et al., 1994; Wallich et al., 1995)
MEKFMNKKMKMFIICAVFILIGACKIHTSYDEQSNGEVKVKKIEFSEFTVKIKNKNNSNN
BBP39 * (erpB2)
MNKKTIIICAVFALILSCKNYAIKDLEQNAKGKIKGFIDKALDPAKDKITSSSSKVDELA
cp32-3
BBS26 # (one member of this family [from strain 297] was not lipidated in E. coli, Porcella et al., 1996)
MKMLKRLHCLLIVLLLCCTTIANLPEEPKPPIIQTLKSLAKYETQLSEYVMYLVTFLAKT
BBS30 * (mlpC)
MKIINILFCLFLLMLNGCNSNDNDTLKNNAQQTKRRGKRDLTQKETTQEKPKSKEELLRE
BBS41 * (erpG)
MNKKMKNLIICAVFVLIISCKIDASSEDLKQNVKEKVEGFLDKELMQGDDPNNSLFNPPP
cp32-4
BBR26 # (one member of this family [from strain 297] was not lipidated in E. coli, Porcella et al., 1996)
MKMLKRLHCLLIALLLCCTTIANLPEEPKPPIIPTLKSLAKYETQLSEHVMYLVTFLAKT
BBR28 * (mlpD)
MKIINILFCLFLLMLNGCNSNDTNNSQTKSRQKRDLTQKEATQEKPKSKEELLREKLNDN
BBR31 #
MKDFLITTKNPTCHNKHQHKLIYLTSTVDFLNKKDKKYTQQNILYYYNKNLKRNGLAPTT
BBR39 #
MYLSCVPPLKIASSVYPTCSTAQILHAMRTHKI
BBR40 * (erpH) truncated pseudogene
MNKKMKNLIICAVFVLIISCKNNTLSLYDEQSIG
BBR42 * (erpY)
MNKKIKMFIICAIFMLISSCKNDVTSKDLEGAVKDLESSEQNVKKTEQEIKKQVEGFLEI
cp32-6
BBM26 # (one member of this family [from strain 297] was not lipidated in E. coli, Porcella et al., 1996)
MKMLKRLHCLLIALLLCCTTIANLPEEPKPPIIQTLKSLAKYETQLSEYVMYLVTFLAKT
BBM27 * (rev-6)
MRNKNIFKLFFASMLFVMACKAYVEEKKEIDSLMEDVLALVNDSSGGKFKDYKDKINELK
BBM28 * (mlpF)
MKIINILFCLFLLLLNSCNSNDNDTLKNNAQQTKSRGKRDLTQKEATPEKPKSKEELLRE
BBM38 * (erpK)
MEQLMNKKMFIICAIFALIVSCKNYASGEDVKKSLEQDLKGKVKGFLDTKKEEFFGDFKK
cp32-7
BBO26 # (one member of this family [from strain 297] was not lipidated in E. coli, Porcella et al., 1996)
MKMLKRLHCLLIVLLLCCTTIANLPEEPKPPIIPTLKSLAKYETQLSEYVMYLVTFLAKT
BBO28 * (mlpG)
MKIINILFCLFLLMLNGCNSNDTNTKQTKSRQKRDLTQKEATQEKPKSKSKEDLLREKLS
BBO39 (erpL)
MNKKMKMFIICAVFALMISCKNYASGENLKNSEQNLESSEQNVKKTEQEIKKQVEGFLEI
BBO40 * (erpM)
MNKKILIIFAVFALIISCKNYATGKDIKQNAKGKIKGFLDKVLDPAKDKITSSSSKVDEL
cp32-8
BBL26 # (one member of this family [from strain 297] was not lipidated in E. coli, Porcella et al., 1996)
MKMLKRLHCLLIVLLLCCTTIANLPEEPKPPIIPTLKSLAKYETQLSEYVMYLVTFLAKT
BBL28 * (mlpH)
MKIINILFCLFLLMLNGCNSNDNDTLKNNAQQTKSRRKRDLTQKEVTQEKPKSKEELLRE
BBL39 * (erpN)
MEKFMNKKMKMFIICAVFILIGACKIHTSYDEQSNGEVKVKKIEFSEFTVKIKNKNNSNN
BBL40 * (erpO)
MNKKTIIICAVFALILSCKNYAIKDLEQNAKGKIKGFIDKALDPAKDKITSSSSKVDELA
cp32-9
BBN26 # (one member of this family [from strain 297] was not lipidated in E. coli, Porcella et al., 1996)
MKMLKRLHCLLIALLLCCTTIANLPEEPNPPIIPTLKSLAKYETQLSEYVIYLVTFLAKT
BBN28 * (mlpI)
MKIINILFCLFLLMLNSCNSNDTNTSQTKSRQKRDLTQKEATQEKPKSKEDLLREKLSED
BBN38 * (erpP)
MEKFMNKKMKMFIVCAVFILIGACKIHTSYDEQSSGEINHTLYDEQSNGELKLKKIEFSK
BBN39 * (erpQ)
MNKKTLIICAVFALIISCKNFATGKDIKQNSEGKIKGFVNKILDPVKDKIASSGTKVDEV
lp5
BBT04 # (not a good looking signal sequence?)
MNSKTTNKTNRNCYNKVQHKLIVLISTICYLNKTHKKYTQKTILYYFNENLRKNGQTIST
lp17
BBD001 *
MNKLLIFIILLVFSCNLSNSDQNNPLNMSNKEKISEYQINESSNKYSIFKRNSSVKRYTF
BBD10 *
MNSKFILKYFILAFFLVSCQTYQIAYDRFSQVLDSQYDIGVNYSRDGIFKSVISIKYDKL
BBD15
MKNKLSVYTTIMLNFKFLKCVYLCFMVFVRLILIIKFRGKKFMNRKFVISLLFIILTFLL
lp21
none
lp25
BBE04 #
MKNPKSNKSKLNIITAILASIYISCAPIGKVNTKPNSDTNPENNQN
BBE06 * (E in signal seq)
MRFLNIIKSLEFFVMSCNDIFTKKGTLSNLKLSAVERCILDDMEIVIMN
BBE08 *
MQKNVYCFIIFVLISSCNNYANDKGLKRVKEYLEKEAKVFLCLSNFVL
BBE09 * (D in signal seq)
MQKDIYISNIFLYIPLFYSCFLTPPKSLKINSIKTEVFDFKIIEEGDITKYNKNPIKESN
BBE16 #
MGKILFFGLLLICIFLGFFFYKQKENNVIYNKIVEKFDDNVFVDETYTYLFKDSNLKELV
BBE25 #
MQTIKIQDIPTLFNKVGIIFCNINFESIIKINIY
BBE26 # (but no N-terminal + charge)
MYFYCLHLIVFIVICFGDFGICALGGVVFLGFIFLLYSVQCN
BBE28 #
MIFKEIKMMPQKLLIIKNCYSCQKLLKKNSKICCVVYRTRNKYPKTLITS
BBE31 *
MKYHIIVSIFIFLFLNACNPDSNTNQNNSKKGLLKIEKIPNKQIKNKLLDDLKNLIETAN
lp28-1
BBF01 #
MKLLKIFMCAFLLLNLVNCKFDSLNLSTKSVDDKNNSIAKLLQHLSKSEDQANKTSTSED
BBF11 # (but signal sequence too short?)
MVFYNRNIIFFSLCLVIPLIILIKILKLSIDHISD
BBF20 * (pseudogene)
MNKKFSISLLSTILAFLLVLGCDLSSNNAENKMDDIFNLEKKYMDNSNYKCLSKNEAIVK
BBF32 * (pseudogene cassette)
MFKTIIKQKNMKKISSAILLTTFFVFINCKSQVADKASVTGIAKGIKEIVEAAGGSEKLK
vlsE * (surface exposed, Zhang et al., 1997)
MNMKKISSASLLTTFFVFINCKSQVADKDDPTNKFYQSVIQLGNGFLDVFTSFGGLVAEA
lp28-2
BBG01
MRKSLFLYTLLMGGLMSCNLDSKLSSNKEQKNNNNVKEVSNSVQEDGLNDLYSNQEKQKS
BBG02 *
MEINLQSKLNNKNNNKLIFFISCSLVLVSTRPFDNRFTYYSKNRGVIIRPGYKIMKHILE
BBG06 #
MKKVFTFLKKLCIIYNINPIRSSTMINNSKKPNCHNKLQQKLIVLLSTLAYVNSKYNKYT
BBG07 #
MQNMAKSIQLVKPIVRCSNKKDLFIKIEKDNDKTIYHTKIMMDIYKFGLNKKKNKYRISL
BBG25 *
MKNLKTKINFLGIFWLLLLFLSCESIPSLPQKPTLTNKEDIENLMLDEAELFRYSTALNV
BBG32 # (bad signal sequence)
MKVASLIRSTCENENLILRSGFRDLDAIIQGFRESNFVVIGARPSVGKTAFALNIAHNIC
lp28-3
BBH01 *
MNKLLIFIILLVFSCNLSNSDQNNPLNMSNKEKISEYQINESSNKYSIFKRNSSVKRYTF
BBH06 #
MKKSFLSIYMLISISLLSCDVSRLNQRNINELKIFVEKAKYYSIKLDAIYNECTGAYNDI
BBH08 #
MVGVFIKAKTLEIKSFTSLHKRSSGCHIGSILLTYIAKEDDLAINM
BBH18 *
MKMKEKRVGNKIFYISVVLILIVGCDWGTIKDKSTEISKLLRTDKDKTKNQDRIELGEDN
BBH32 *
MKYNTIISIFVCLFLTACNPDFNTNKKRTLSKGIISNQDADSDKIIKNKLLDDLINLIEK
BBH37
MIKGKESIFMKKKMFLYTLLTIGLMSCNLNSKLSGNKEEQKNNNDIKEALNGVQENAINN
lp28-4
BBI14 * (pseudogene)
MKNKIILCMCVFSLLNSCNFDNDAEAATKKHADKIKN
BBI16 *
MKYHIITTIFVFLFLACRPDFNIDQKDIKYPPTEKSRPKTESSKQKESKPKTEEELKKKQ
BBI25 #
MLLILFFTLTMNMKKFFILNKEIGIGNCNLLFYLYFLKNINKI
BBI28 *
MKCHIIATIFVFLFLACSTDFNTDQKGIKYPPTEKSKPKTEDSKQKELKPKTEKELKKKQ
BBI29 *
MKNNIILCMCVFLLLNSCTANHEAEAKIKKHVDKTKNEYINEIKNLIATTKEIIEKRKLL
BBI32 *
MSLNFRVIFLISHTQYMYSPILKSCEFINNLKTVSSRLIKNILFIWRGINENFIFGIEVI
BBI34 *
MKHYIIVHIFVFLFLNACYPVASNKIELKPKTETSLNQEEVPNQEANYKEEKEAKEEGIN
BBI36
MKNFKLNTIKLNVITAILTLICISCAPFGNVNPNKLKNPITSKNLKKTKRSNHSRNLKKT
BBI38
MKNFKLNTIKLNVITAILTLICISCAPFGNVNPNKLKNPITSKNLKKPKRSNHSRNLKKT
BBI39
MKNFKLNIIKLNVITAILTSICISCAPFGNVNPNEPKNPTTSKSLKKTKRSNNSRNLKNT
BBI42 *
MRILVGVCIIALALLGCYLPDNQEQAVQTFFENSESSDMGSDEIVTEGIFSSLKLYASEH
lp36
BBK01 (pseudogene)
MRKSLFLYALLMGGLMSCNLDSKLSSNKEQKNNNNVKEVSDSVQEDGLNDLYNNQEKQKS
BBK04 # (pseudogene; consensus at 2nd C - poor signal sequence)
MTEFMVVSSIEERLKTKSPLDIKIIDNSCGSGNFLISCLDYLTEKVWYELDKFEDVKKN
BBK07 *
MSKLILAISILLIISCKWHVDNPIDEATAESKSALTSVDQVLDEISEATGLSSEKITKLT
BBK12 *
MSKLILAISILLIISCKWYVDNTIDEATVESKSALTSIDQVLDEISEATGLSSEKITKLT
BBK19 *
MKKYIINLSLCLLLLSCNLFSKDSRSRQKYNFKVPAKSVSNPINKENIDTEKGTNTTLCI
BBK30 #
MVALFKFAIFQLSNTKTCTSSFKAKFMIQGNDN
BBK32 # (Fibronectin-binding protein, surface localized, Probert and Johnson, 1998)
MKKVKSKYLALGLLFGFISCDLFIRYEMKEESPGLFDKGNSILETSEESIKKPMNKKGKG
BBK38 # (but no N-terminal positive charge)
MLLPPFVMLSSTLTLVSLATSFTSCAIFSSPSLPNALLSYLDYLPTPFSLTTARSKAGSA
BBK45 #
MITNNKCNIMILYYNNTLFLHKVSTMNLMIKVLIFSLFLSFISCKLYEAVDKSLIKDNKR
BBK47 *
MRLCLIKIFIIPNLVFSSLFLFESCSGFLSKKSIEQFALALKDHQENKNTTNTSVDKNSK
BBK48
MNLINKLFILTILFSSVISCKLYKKITYNADQVIDKLKSNNGSFNTLKSNDDSKRSGRKP
BBK49
MRLCLIKIFIIPNLVFSSLFLFESCSGFLSKKSIEQFALALKDHQENKNTTNTSVDKNSK
BBK50
MNLIIKVMLISSLFSSFISCKLYEKLTNKSQQALAKAFVYDKDIADNKSTNSTSKLDNSS
BBK51 #
MNLRINKFILILNSILELCIAESISKIFILEK
BBK52 *
MKKNIYILNIFLYIPLFYSCFLTPPKSSKINSIKTEVLDFKIIEEGNIIKYDKKPIEERN
BBK53 *
MRILVGVCIIAALALLGCYLPDNQEQAVQTFFENSESSDMGSDEIVTEGIFSSLKLYASE
lp38
BBJ01 (BBJ001) * (N terminal end of pseudogene BBJ001)
MKYHIIVSIFVFLFLNACNPDSNTNQNNSKKELKTGRIPNKQIKNALLDDLKNLIETASA
BBJ04 # (bad signal sequence)
MLLLYKKTFSMIGFCFRALSENEKHFLMLLFSAGTIF
BBJ06 #
MIISSKSFIIFLLSYLENYCVLFPFLIYLKNYYMQSLSQAGHYKELNY
BBJ08 # (no N-terminal positive charge)
MFLYTLLTIGLISCNLDSKLPNKEQKNNNDIKETLGSSVQENALNNLYGNQEEKKDFKNF
BBJ09 * (ospD) (lipidated and surface exposed, Norris et al., 1992)
MKKLIKILLLSLFLLLSISCVHDKQELSSKSNLNNQKGYLDNEGANSNYESKKQSILSEL
BBJ23 #
MNGVIMREISCCFLLLTFSVVCVYSFDVSSRKFYGILEGYYSGKIEELSKKNDEDVYIYR
BBJ25 #
MKVFMKKFLFLILPCFGVFANELNDELGDFVIRGVDFEFRLDYLSVPNNFENNFDFILNI
BBJ28 #
MMLKIFVIVFNFCVLNLLNAGDGKSLIKEFENLYYPQLKNGIYAFKMNFKINVKNNLEES
BBJ34 *
MIKGNTFILILVTTMFVSCKFYGSDDTNKKNTSLNGDTREIDNIGSVILEQDGNKKGDTT
BBJ36 *
MKRKSNICISLLVTILFVSCKFFGNKSASKEKEETSFSDTASKISKSGTAASSDKQEKNT
BBJ41
MKNLKLNIIKLNVITAILTSICISCAPFGNVNPNEPKNPTTSKSLKKTKRSNNSRNLKNT
BBJ46 #
MDKFLTSNHPPIIIFTIGALCATVLICLIIIFIIHGIINPILIKKFKSINNSLQKITKEF
BBJ47 #
MRNISNCIKYIILTMLIGLLIFCCATFVWLIGIFYSNNFKEERNYSISPIDSVIMRKCYF
lp54
BBA03 #
MKKTIIVFIILAFMLNCKNKSNDAEPNNDLDEKSQAKSNLVDEDRIEFSKATPLEKLVSR
BBA04 * (S2)
MKRVIVSFVVLILGCNLDDNSKMERKGSNKLIRESGSDRRGQENRALGAMNFGLFSGDSG
BBA05 * (S1) (not labeled with palmitate, Feng et al., 1995)
MNKIGIAFIISFLLFVNCKGKSLEEDLKSTTSNNKQNLISNEKKSLNSKNNRLKDSRLSN
BBA07 *
MCGRRMKNILLFVILLFFSCKEFNYSDLRRRPSKVLNASNGASNKELKISFVDSLNDDQK
BBA14 *
MQIKNFPFLFLLNSLIIFSCSTIASLPEEPSSPQESTLKALSLYEAHLSSYIMYLQTFLV
BBA15 * (ospA) (lipidated and surface exposed, Barbour et al., 1983; Brandt et al., 1990)
MKKYLLGIGLILALIACKQNVSSLDEKNSVSVDLPGEMKVLVSKEKNKDGKYDLIATVDK
BBA16 * (ospB) (lipidated and surface exposed, Barbour et al., 1983; Brandt et al., 1990)
MRLLIGFALALALIGCAQKGAESIGSQKENDLNLEDSSKKSHQNAKQDLPAVTEDSVSLF
BBA24 * (dbpB) (surface exposed, protein lipidated in E. coli, Hagman et al., 1998; Hanson et al., 1998)
MIKCNNKTFNNLLKLTILVNLLISCGLTGATKIRLERSAKDITDEIDAIKKDAALKGVNF
BBA25 * (dbpA)
MKIGKLNSIVIALFFKLLVACSIGLVERTNAALESSSKDLKNKILKIKKEATGKGVLFEA
BBA26 # (but no N-terminal positive charge)
MNLFYANIAVTLFGITSLLCNLACDFKTCFQFAKIFQIGYFAN
BBA32 #
MRITGLLFFLICLLSCSSFNKSSNKSLLAKNKQKASDYNREYYQKNREKLKLRARERYRR
BBA33 *
MKRYIYVYIISVAVISCYLNDFSGMKENNCNKYDLSFFELSLAERENAILKIQRKFKSLT
BBA34 * (oppA IV) (not surface exposed/probably periplasmic, Bono et al., 1998)
MIIKKRGLLILGIATVISCSAMSKPKDDIVFGVGIGNEPTSLDPQFCSDRLGNLIINELF
BBA36 *
MMQRISILLMLLAVFSCKQFGDVKSLTEIDSGNGIPLVVSDVVKDLIPKEISLTPEEAEK
BBA54 #
MVAFVRKVCTIFILFFCLSFNLHSYAVENGVVIKVKIFNFKLNQQRSFEELERDLRLFIQ
BBA57 *
MNGKLRKALKIAIFTTLLLVISCNANMDTNDKNKALNEYKLKNISEVIKNSLQLESDPKL
BBA59 * (12 kd)
MVKKIIFISFSIFIVSCSAIGRGILIDSILNNVHKELEQEKKDEKKKNPQSKASIEENAD
BBA60 * (P27) (lipidated and surface exposed in strain B29, Reindl et al., 1993)
MSKKVILILLEILILSCDLSINKEQKTKEKTSEKQESEKQNIEKQEPEKQKQNAAKIIPT
BBA62 * (7.5 kd) (lipidated, in outer membrane, possible surface exposure, Katona et al., 1992; Lahdenne et al., 1997)
MTKLMYAIFLSAILFVACETTRISDEMENTSDEDSKVTAPMTDKDMMKSMPDKNTKSMKQ
BBA64 *
MKDNILKNNKLIAIFLLHVLTVLILISCSLEVKDSNESKKHKKEKRKGKVENLLVAINNL
BBA65 *
MNKIKLSILITLGITTFFSCDLNNKDNKDKVASFTETKYNELSPQKGTKTQDQRSTKNLK
BBA66 *
MKIKPLIQLKLLGLFLFSCTIDANLNEDYKNKVKGILNKAADDQETTSADTNSNAAKNIP
BBA68
MKKAKLNIIKINIIAMILTLICTSCAPFSKIDPKANANTKPKKITNPGENTQNFEDKSGD
BBA69
MKKAKLNIIKINIITMILTLICISCAPFNKINPKANENTKLKKNTRLKKPANPGENIQNF
BBA72 #
MHKESVLTKNKLNIIATILTLIGTSCAVNPIGPKVKSRTDIKESNQKSGNPESLNQKYQE
BBA73 *
MKRNKIWKTLKLFQITLLFSCSFYSKSNNTEAISELQSSPIKLGKIKVLQKTEKIVSTQN
lp56
BBQ03 *
MRILVGVFIIAALALLGCYLPDNQEQAVQTFFENSESIDMGSDEIVTEGIFSSLKLYASE
BBQ04 * (pseudogene)
MKKNICILNIFLYIPLFYSCFLTTPKSSKINSIKTEVLDFKIIEEGNIIKYDKKPIEESN
BBQ05 *
MKYYICVCVFLLLNACNSDFSTNQEDIKYPSDKEKSKSNMEASSKEEDPNKKIKNTLLND
BBQ33 # (one member of this family [from strain 297] was not lipidated in E. coli, Porcella et al., 1996)
MKMLKRLHCLLIALLLCCTTIANLPEEPKPPIIPTLKSLAKYETQLSEYVMYLVTFLAKT
BBQ35 * (mlpJ)
MKIINILFCISLLLLNSCNSNDNDTLKNNAQQTKSRKKRDLSQEELPQQEKITLTSDEEK
BBQ46 # (questionable signal sequence)
MIKNVIYILPKTKKFSQIVWLLIACKIWIVG
BBQ47 * (erpX)
MFGVVVNLRLMEWIMNKKMKIFIICAVFVLISSCKIDATGKDATGKDATGKDATGKDATG
BBQ56 #
MKIFFIFLDILYFLAYNICIDSYIINYELSIPCKRPLAKANGLLLY
BBQ89 *
MNKLLIFIILLVFSCNLSNSDQNNPLNMSNKEKISEYQINESSNKYSIFKRNSSVKRYTF
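The listing above alternates a header line (gene name, an optional `*` or `#` marker, and optional parenthesized notes) with a line of N-terminal amino acid sequence, grouped under plasmid names. The following is a minimal sketch of reading that layout into records. The names `Entry` and `parse_listing` are hypothetical and are not part of the original annotation; the parser only records the markers and does not interpret them.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Plasmid headings such as "cp26" or "lp28-1".
PLASMID = re.compile(r"^(?:cp|lp)\d+(?:-\d+)?$")
# Gene headers such as "BBB19 * (ospC) ..." or "vlsE * (...)".
HEADER = re.compile(r"^(?P<name>BB[A-Z]\d+(?:\.\d+)?|vlsE)\b\s*(?P<rest>.*)$")
# Sequence lines consist of uppercase one-letter residue codes only.
SEQUENCE = re.compile(r"^[A-Z]+$")


@dataclass
class Entry:
    plasmid: str
    name: str
    marker: Optional[str]  # "*", "#", or None when the header carries no marker
    notes: str             # remaining header text, e.g. gene names and citations
    sequence: str = ""


def parse_listing(text: str) -> List[Entry]:
    """Parse the plasmid / header / sequence layout into Entry records."""
    entries: List[Entry] = []
    plasmid = ""
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line == "none":  # "none" marks a plasmid with no entries
            continue
        if PLASMID.match(line):
            plasmid = line
            continue
        header = HEADER.match(line)
        if header:
            tokens = header.group("rest").split()
            marker = next((t for t in tokens if t in ("*", "#")), None)
            notes = " ".join(t for t in tokens if t not in ("*", "#"))
            entries.append(Entry(plasmid, header.group("name"), marker, notes))
        elif SEQUENCE.match(line) and entries and not entries[-1].sequence:
            entries[-1].sequence = line
    return entries


SAMPLE = """\
cp26
BBB19 * (ospC) (surface exposed, Wilske et al., 1993)
MKKNTLSAILMTLFLFISCNNSGKDGNTSANSADESVKGPNLTEISKKITDSNAVLLAVK
BBB25
MKYCFSLILMVFICSSCKILNIAEDLEKNFEKIERADYFLYFYPDSQIYIKKDKSSNKFS
lp21
none
lp38
BBJ01 (BBJ001) * (N terminal end of pseudogene BBJ001)
MKYHIIVSIFVFLFLNACNPDSNTNQNNSKKELKTGRIPNKQIKNALLDDLKNLIETASA
"""

if __name__ == "__main__":
    for entry in parse_listing(SAMPLE):
        print(entry.plasmid, entry.name, entry.marker, len(entry.sequence))
```

The marker is searched for anywhere in the header rather than only directly after the gene name, because some headers place an alias between the name and the marker.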
Pseudo-, Questionable and Short B. burgdorferi B31 Plasmid Genes
by Sherwood Casjens, Granger Sutton, Jeremy Peterson & Dan Haft - completed March 1999.
The purpose of the following table is to call attention to the B31 plasmid genes which may not have a biological function. It lists (i) all the putative pseudogenes, (ii) all those computer (GLIMMER)-recognized genes that are "questionable" for some reason, and (iii) all the short (<300 bp) genes on the B. burgdorferi B31 plasmids. The reason for categorizing each gene as a "questionable gene" or "pseudogene" is given in the "COMMENTS" column. It is of course very possible that any given "short" or "questionable" gene or truncated "pseudogene" could be expressed and could have a biological function. These categories are no more than our best current guess at functionality.
DEFINITIONS
PSEUDOGENE - a region of DNA that is similar in sequence to a paralogous Borrelia gene or a gene from another organism, but which is truncated relative to other members of the gene family and/or does not have a full open reading frame.
QUESTIONABLE GENE - an ORF that is called a putative gene by the TIGR gene-recognition protocol, but may be a false recognition because it either (i) lies within another gene or pseudogene, or (ii) was not called in highly similar, paralogous sequence elsewhere on the B31 plasmids.
-"Daggers" in column 1 indicate computer-recognized open reading frames that are in-frame inside of larger pseudogenes (they are questionable gene calls that are part of a larger pseudogene).
SHORT GENE - a <300 bp long ORF that was recognized as a putative gene by the TIGR gene-recognition protocol that is NOT in the "questionable" or "pseudogene" categories.
Genes <300 bp are rather highly over-represented on some of the B31 plasmids, so it is possible, or even perhaps likely, that some of these are in fact not real genes, but are spurious gene calls in "junk DNA" on these plasmids. Such short genes are often not closely packed like the Borrelia chromosomal genes and most other prokaryotic genes. About 9% (80/844) of the putative Bb chromosomal genes are <300 bp (and some of these are homologs of small genes of known function, pointing out again that particular short plasmid genes may well have a function). The fraction of "short" genes is considerably higher on the B31 plasmids which contain putative decaying DNA regions.
A "*" in the SHORT GENE column of the table below indicates a <300 bp gene that is either a "questionable gene" or "pseudogene" and so does not fit into the "short gene" category as we define it.
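The definitions above can be illustrated with a small sketch that works from the 5-end and 3-end coordinates given in the table. It rests on assumptions that the text does not state: coordinates are taken as inclusive, so length is the absolute difference plus one, and a 5-end greater than the 3-end is read as a gene on the reverse strand. The names `Orf`, `short_gene_column` and `lies_inside` are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple

SHORT_LIMIT_BP = 300  # threshold for a "short" gene, as stated in the text


@dataclass(frozen=True)
class Orf:
    name: str
    five_end: int
    three_end: int

    @property
    def strand(self) -> str:
        # Assumption: 5-end > 3-end means the ORF runs on the reverse strand.
        return "+" if self.five_end <= self.three_end else "-"

    @property
    def span(self) -> Tuple[int, int]:
        return (min(self.five_end, self.three_end),
                max(self.five_end, self.three_end))

    @property
    def length_bp(self) -> int:
        # Assumption: both end coordinates are inclusive.
        return abs(self.three_end - self.five_end) + 1


def short_gene_column(orf: Orf, questionable_or_pseudo: bool) -> str:
    """Value for the SHORT GENE column under the definitions above."""
    if orf.length_bp >= SHORT_LIMIT_BP:
        return ""
    # Under 300 bp: "*" if already questionable/pseudo, otherwise "SHORT".
    return "*" if questionable_or_pseudo else "SHORT"


def lies_inside(inner: Orf, outer: Orf) -> bool:
    """True when inner's coordinates fall entirely within outer's (any strand)."""
    (a, b), (c, d) = inner.span, outer.span
    return c <= a and b <= d


if __name__ == "__main__":
    bbc04 = Orf("BBC04", 2700, 2593)
    bbd06 = Orf("BBD06", 3143, 3604)
    bbd051 = Orf("BBD05.1", 3018, 3604)
    print(bbc04.name, bbc04.length_bp, bbc04.strand,
          short_gene_column(bbc04, questionable_or_pseudo=False))
    print(bbd06.name, "inside", bbd051.name, lies_inside(bbd06, bbd051))
```

With these assumptions BBC04 (2700 to 2593) comes out at 108 bp on the reverse strand and falls in the SHORT category, while BBD06 (3143 to 3604, 462 bp) lies within the coordinates of BBD05.1 and is not short. The sketch tests containment only and does not check reading frame.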
Pseudo-, Questionable and Short B. burgdorferi B31 Plasmid Genes
|
GENE NAME |
5-end |
3-end |
SHORT GENE |
QUESTIONABLE GENE |
PSEUDO- GENE |
COMMENTS (paralogous family names in [square brackets]) |
|
cp9 |
||||||
|
BBC04 |
2700 |
2593 |
SHORT |
no homolog outside of Borrelia |
||
|
BBC07 |
4788 |
4507 |
SHORT |
no homolog outside of Borrelia |
||
|
cp26 |
||||||
|
BBB15 |
11636 |
11737 |
SHORT |
no homolog outside of Borrelia |
||
|
BBB20 |
17733 |
17626 |
SHORT |
no homolog outside of Borrelia |
||
|
BBB21 |
17750 |
17842 |
SHORT |
no homolog outside of Borrelia |
||
|
cp32-1 |
||||||
|
BBP14 |
8724 |
8957 |
SHORT |
probably a real gene; has paralogs on other cp32s |
||
|
BBP23 |
15215 |
15415 |
SHORT |
probably a real gene; has paralogs on other cp32s |
||
|
cp32-3 |
||||||
|
BBS14 |
8724 |
8957 |
SHORT |
probably a real gene; has paralogs on other cp32s |
||
|
BBS23 |
15212 |
15412 |
SHORT |
probably a real gene; has paralogs on other cp32s |
||
|
BBS28 |
16915 |
17046 |
SHORT |
no homolog |
||
|
BBS32 |
19198 |
19392 |
* |
QUESTIONABLE |
paralog not called in paralogous sequence elsewhere |
|
|
BBS43 |
28067 |
28246 |
SHORT |
no homolog |
||
|
cp32-4 |
||||||
|
BBR02 |
1306 |
1998 |
PSEUDO |
authentic frameshift |
||
|
BBR14 |
8734 |
8967 |
SHORT |
probably a real gene; has paralogs on other cp32s |
||
|
BBR23 |
15167 |
15367 |
SHORT |
probably a real gene; has paralogs on other cp32s |
||
|
BBR30 |
18829 |
18737 |
* |
QUESTIONABLE |
paralog not called in paralogous sequence elsewhere |
|
|
BBR35 |
21974 |
22357 |
PSEUDO |
authentic point mutation |
||
|
BBR39 |
25636 |
25538 |
* |
QUESTIONABLE |
paralog not called in paralogous sequence elsewhere |
|
|
BBR40 |
25865 |
25966 |
* |
PSEUDO |
N-terminal erp gene [162] fragment (named erpH) |
|
|
BBR41 |
26077 |
26817 |
PSEUDO |
fusion of erp gene [162] and family [161] gene |
||
|
cp32-6 |
||||||
|
BBM14 |
8734 |
8967 |
SHORT |
probably a real gene; has paralogs on other cp32s |
||
|
BBM23 |
15231 |
15431 |
SHORT |
probably a real gene; has paralogs on other cp32s |
||
|
BBM40 |
27731 |
27850 |
* |
QUESTIONABLE |
overlaps BBM39 |
|
|
cp32-7 |
||||||
|
BBO14 |
8726 |
8959 |
SHORT |
probably a real gene; has paralogs on other cp32s |
||
|
BBO23 |
15222 |
15422 |
SHORT |
probably a real gene; has paralogs on other cp32s |
||
|
BBO35 |
22755 |
22630 |
* |
QUESTIONABLE |
paralog not called in paralogous sequence elsewhere |
|
|
BBO41 |
28117 |
28007 |
* |
QUESTIONABLE |
paralog not called in paralogous sequence elsewhere |
|
|
cp32-8 |
||||||
|
BBL14 |
8724 |
8957 |
SHORT |
probably a real gene; has paralogs on other cp32s |
||
|
BBL23 |
15215 |
15415 |
SHORT |
probably a real gene; has paralogs on other cp32s |
||
|
BBL33 |
21467 |
21556 |
SHORT |
no homolog |
||
|
cp32-9 |
||||||
|
BBN05 |
3343 |
3950 |
PSEUDO |
authentic frameshift |
||
|
BBN06 |
3960 |
4935 |
PSEUDO |
authentic frameshift |
||
|
BBN13 |
8289 |
8742 |
PSEUDO |
authentic frameshift |
||
|
BBN14 |
8742 |
8975 |
SHORT |
probably a real gene; has paralogs on other cp32s |
||
|
BBN16 |
10283 |
11034 |
PSEUDO |
authentic frameshift |
||
|
BBN18 |
12009 |
12560 |
PSEUDO |
authentic point mutation |
||
|
BBN21 |
13807 |
14410 |
PSEUDO |
authentic frameshift |
||
|
BBN22 |
14423 |
15320 |
PSEUDO |
authentic frameshift |
||
|
BBN23 |
15312 |
15512 |
SHORT |
probably a real gene; has paralogs on other cp32s |
||
|
BBN29 |
18783 |
17776 |
PSEUDO |
authentic point mutation |
||
|
BBN37 |
25779 |
25006 |
PSEUDO |
authentic point mutation |
||
|
BBN40 |
27991 |
27884 |
* |
QUESTIONABLE |
gene not called in paralogous sequence elsewhere |
|
|
lp5 |
||||||
|
BBT01 |
195 |
635 |
PSEUDO |
[57] fragment |
||
|
BBT02 |
744 |
1094 |
PSEUDO |
[57] fragment |
||
|
BBT03 |
1208 |
1573 |
? |
[84] ([57] fragment?) |
||
|
BBT05 |
3200 |
3350 |
* |
PSEUDO |
[57] fragment |
|
|
BBT07 |
4388 |
4816 |
PSEUDO |
[52] C-terminal fragment |
||
|
lp17 |
||||||
|
BBD001 |
214 |
405 |
SHORT |
no homolog outside of Borrelia |
||
|
BBD02 |
873 |
1019 |
* |
PSEUDO |
[77] ([57] fragment?) |
|
|
BBD03 |
1117 |
1309 |
* |
PSEUDO |
[57] fragment |
|
|
BBD04 |
1412 |
1765 |
PSEUDO |
[57] fragment |
||
|
BBD05 |
2389 |
2541 |
SHORT |
? |
[84] ([57] fragment?) |
|
|
BBD05.1 |
3018 |
3604 |
PSEUDO |
[57] fragment |
||
|
BBD06 |
3143 |
3604 |
QUESTIONABLE |
in-frame inside of D05.1 |
||
|
BBD07 |
4373 |
4260 |
SHORT |
no homolog outside of Borrelia |
||
|
BBD08 |
4707 |
4802 |
* |
PSEUDO |
[82] fragment |
|
|
BBD12 |
7752 |
7624 |
SHORT |
no homolog outside of Borrelia |
||
|
BBD15.01 |
10000 |
9940 |
* |
PSEUDO |
[175] fragment |
|
|
BBD15.1 |
10152 |
10326 |
* |
PSEUDO |
[82] fragment |
|
|
BBD16 |
10520 |
10428 |
SHORT |
no homolog outside of Borrelia |
||
|
BBD17 |
10591 |
10683 |
SHORT |
no homolog outside of Borrelia |
||
|
BBD19 |
12057 |
12167 |
SHORT |
no homolog outside of Borrelia |
||
|
BBD20 |
12250 |
12975 |
PSEUDO |
[82] fragment |
||
|
BBD22 |
14072 |
14338 |
SHORT |
no homolog outside of Borrelia |
||
|
BBD23 |
14781 |
15725 |
PSEUDO |
[82] fragment |
||
|
BBD24 |
16121 |
15894 |
SHORT |
no homolog outside of Borrelia |
||
|
BBD25 |
16212 |
16367 |
SHORT |
no homolog outside of Borrelia |
||
|
lp21 |
||||||
|
BBU01 |
184 |
615 |
PSEUDO |
[57] fragment |
||
|
BBU02 |
746 |
1111 |
? |
[84] ([57] fragment?) |
||
|
BBU03 |
1357 |
1241 |
SHORT |
no homolog outside of Borrelia |
||
|
BBU07 |
15349 |
15810 |
PSEUDO |
[57] fragment |
||
|
BBU08 |
15791 |
16081 |
* |
PSEUDO |
[137] fragment |
|
|
BBU09 |
26548 |
16231 |
PSEUDO |
[55] fragment |
||
|
BBU10 |
16603 |
16797 |
* |
PSEUDO |
[57] fragment |
|
|
BBU12 |
17918 |
18362 |
PSEUDO |
[52] authentic frameshift |
||
|
lp25 |
||||||
|
BBE01 |
255 |
157 |
SHORT |
no homolog outside of Borrelia |
||
|
BBE03 |
4613 |
4422 |
SHORT |
no homolog outside of Borrelia |
||
|
BBE04 |
4719 |
4856 |
PSEUDO |
family [54] fragment |
||
|
BBE04.1 |
5377 |
5734 |
PSEUDO |
family [44] fragment |
||
|
BBE05 |
5377 |
5526 |
* |
QUESTIONABLE |
E05 inside E04.1 and out of frame |
|
|
BBE06 |
5757 |
5903 |
SHORT |
no homolog outside of Borrelia |
||
|
BBE07 |
6401 |
6185 |
* |
PSEUDO |
[26] fragment |
|
|
BBE08 |
6701 |
6558 |
SHORT |
no homolog outside of Borrelia |
||
|
BBE10 |
7972 |
7877 |
SHORT |
no homolog outside of Borrelia |
||
|
BBE11 |
8446 |
8315 |
SHORT |
no homolog outside of Borrelia |
||
|
BBE12 |
8646 |
8524 |
SHORT |
no homolog outside of Borrelia |
||
|
BBE13 |
8863 |
8955 |
SHORT |
no homolog outside of Borrelia |
||
|
BBE14 |
9163 |
9375 |
SHORT |
no homolog outside of Borrelia |
||
|
BBE15 |
9490 |
9356 |
SHORT |
no homolog outside of Borrelia |
||
|
BBE21.1 |
14767 |
14893 |
* |
PSEUDO |
[82] fragment |
|
|
BBE23 |
15973 |
16155 |
SHORT |
no homolog outside of Borrelia |
||
|
BBE23.1 |
16459 |
16540 |
* |
PSEUDO |
[57] fragment |
|
|
BBE23.2 |
16540 |
16721 |
* |
PSEUDO |
[32] fragment |
|
|
BBE24 |
16915 |
17169 |
* |
PSEUDO |
[49] fragment |
|
|
BBE24.1 |
17902 |
18303 |
PSEUDO |
[49?] fragment - see paralog list |
||
|
BBE25 |
18606 |
18505 |
SHORT |
no homolog outside of Borrelia |
||
|
BBE26 |
18586 |
18711 |
SHORT |
no homolog outside of Borrelia |
||
|
BBE27 |
19055 |
19195 |
SHORT |
no homolog outside of Borrelia |
||
|
BBE28 |
19489 |
19340 |
SHORT |
no homolog outside of Borrelia |
||
|
BBE29 |
19697 |
20883 |
PSEUDO |
DNA methyltransferase pseudogene |
||
|
BBE29.1 |
21110 |
21476 |
PSEUDO |
[102] fragment |
||
|
BBE30 |
21558 |
21701 |
* |
PSEUDO |
[49] fragment |
|
|
BBE32 |
23723 |
23418 |
PSEUDO |
[57] fragment |
||
|
BBE33 |
24100 |
23850 |
* |
PSEUDO |
[57] fragment |
|
|
lp28-1 |
||||||
|
BBF001 |
1 |
163 |
* |
PSEUDO |
[88] fragment |
|
|
BBF001.1 |
200 |
300 |
* |
PSEUDO |
[80] fragment |
|
|
BBF02 |
1720 |
2073 |
PSEUDO |
[88] fragment |
||
|
BBF03 |
2619 |
2101 |
PSEUDO |
[80] fragment |
||
|
BBF04 |
2658 |
2804 |
* |
PSEUDO |
[77] ([57] fragment?) |
|
|
BBF05 |
2777 |
3073 |
* |
PSEUDO |
[57] fragment |
|
|
BBF06 |
3201 |
3377 |
* |
PSEUDO |
[57] fragment / overlaps K45 homology |
|
|
BBF07 |
3529 |
3413 |
SHORT |
no homolog outside of Borrelia |
||
|
BBF08 |
3849 |
3685 |
* |
PSEUDO |
[72] authentic frameshift |
|
|
BBF09 |
3962 |
4179 |
* |
PSEUDO |
[71] authentic frameshift |
|
|
BBF11 |
5435 |
5539 |
* |
QUESTIONABLE |
inside BBF11.1 and backwards |
|
|
BBF11.1 |
5620 |
5412 |
* |
PSEUDO |
[32] fragment |
|
|
BBF12 |
6540 |
5956 |
? |
patches of [49] homology; fusion pseudogene? |
||
|
BBF14.1 |
8197 |
8367 |
* |
PSEUDO |
[65] authentic frameshift |
|
|
BBF16 |
8389 |
8607 |
* |
PSEUDO |
[64] authentic point mutation |
|
|
BBF17 |
8772 |
9026 |
* |
PSEUDO |
[68] authentic frameshift |
|
|
BBF18 |
9561 |
10049 |
PSEUDO |
[82] pseudogene |
||
|
BBF19 |
10559 |
10036 |
QUESTIONABLE |
(BBF19 is an inverted part of BBF18) |
||
|
BBF19.1 |
10916 |
11200 |
* |
PSEUDO |
[175] fragment |
|
|
BBF20 |
10991 |
10701 |
* |
PSEUDO |
[85] fragment |
|
|
BBF21 |
11550 |
11449 |
SHORT |
no homolog outside of Borrelia |
||
|
BBF22 |
12018 |
11794 |
* |
PSEUDO |
[44] fragment |
|
|
BBF26.1 |
16209 |
15663 |
PSEUDO |
badly deleted [101] pseudogene |
||
|
BBF27 |
15925 |
15758 |
* |
QUESTIONABLE |
in frame inside F26.1 |
|
|
BBF28 |
16129 |
16001 |
* |
QUESTIONABLE |
out of frame (?) inside F26.1 |
|
|
BBF29 |
16825 |
16457 |
PSEUDO |
[49] fragment (patchy similarity) |
||
|
BBF30 |
17415 |
17170 |
SHORT |
no homolog outside of Borrelia |
||
|
BBF31 |
17805 |
17394 |
* |
PSEUDO |
[50] fragment |
|
|
BBF31.1 |
17920 |
18050 |
PSEUDO |
[57] fragment |
||
|
BBF32 |
26698 |
18430 |
PSEUDO |
vlsE recombination cassette; apparently functional but not transcribed; In the wider sense BBF32 contains 15 unexpressed pseudogenes (Zhang et al., 1997). |
||
|
lp28-2 |
||||||
|
BBG03 |
2104 |
2492 |
PSEUDO |
[48] fragment |
||
|
BBG04 |
2857 |
2753 |
SHORT |
no homolog outside of Borrelia |
||
|
BBG05 |
4056 |
2894 |
PSEUDO |
one frameshift relative to homologous transposases of this type |
||
|
BBG11 |
11015 |
10779 |
SHORT |
no homolog outside of Borrelia |
||
|
BBG31 |
26567 |
27082 |
PSEUDO |
N-terminal truncation of family [50] |
||
|
lp28-3 |
||||||
|
BBH01 |
273 |
464 |
SHORT |
no homolog outside of Borrelia |
||
|
BBH03 |
926 |
1072 |
* |
PSEUDO |
[77] ([57] fragment?) |
|
|
BBH04 |
1045 |
1365 |
PSEUDO |
[57] fragment |
||
|
BBH05 |
1498 |
1677 |
* |
PSEUDO |
[57] fragment |
|
|
BBH07 |
3514 |
3086 |
PSEUDO |
[50] fragment |
||
|
BBH08 |
3730 |
3593 |
SHORT |
no homolog outside of Borrelia |
||
|
BBH09.1 |
8091 |
7810 |
* |
PSEUDO |
[92] fragment |
|
|
BBH10 |
8203 |
8003 |
* |
QUESTIONABLE |
partially overlaps BBH09.1 backwards |
|
|
BBH10.1 |
8240 |
8310 |
* |
PSEUDO |
[82] fragment |
|
|
BBH11 |
8796 |
8704 |
* |
QUESTIONABLE |
backwards inside BBH11.1 |
|
|
BBH11.1 |
8320 |
8850 |
PSEUDO |
[1] fragment (in part) |
||
|
BBH12 |
9455 |
9589 |
SHORT |
no homolog outside of Borrelia |
||
|
BBH14 |
10934 |
10821 |
SHORT |
no homolog outside of Borrelia |
||
|
BBH15 |
11005 |
10913 |
SHORT |
no homolog outside of Borrelia |
||
|
BBH16 |
11068 |
11187 |
SHORT |
no homolog outside of Borrelia |
||
|
BBH17 |
11837 |
12025 |
SHORT |
no homolog outside of Borrelia |
||
|
BBH18.1 |
13571 |
13693 |
* |
PSEUDO |
[65] fragment |
|
|
BBH19 |
13590 |
13709 |
* |
QUESTIONABLE |
overlaps BBH18.1 partly in-frame |
|
|
BBH20 |
14596 |
13840 |
PSEUDO |
[171] fragment |
||
|
BBH20.1 |
14750 |
15300 |
PSEUDO |
[104] fragment |
||
|
BBH22 |
14870 |
14766 |
* |
QUESTIONABLE |
backwards inside BBH20.1 |
|
|
BBH23 |
15051 |
15158 |
* |
QUESTIONABLE |
in-frame inside BBH20.1 |
|
|
BBH24 |
15136 |
15342 |
* |
QUESTIONABLE |
mostly within BBH20.1 |
|
|
BBH24.1 |
15354 |
15750 |
PSEUDO |
[86] fragment |
||
|
BBH25 |
15810 |
15568 |
* |
QUESTIONABLE |
backwards inside BBH24.1 |
|
|
BBH30 |
20871 |
20415 |
PSEUDO |
[96] fragment |
||
|
BBH31 |
20997 |
21104 |
SHORT |
no homolog outside of Borrelia |
||
|
BBH33 |
22678 |
22950 |
* |
PSEUDO |
[61] fragment |
|
|
BBH34 |
23383 |
23192 |
* |
PSEUDO |
[62] fragment |
|
|
BBH35 |
23447 |
23560 |
SHORT |
no homolog outside of Borrelia |
||
|
BBH36 |
24180 |
24031 |
* |
QUESTIONABLE |
in frame inside H36.1 |
|
|
BBH36.1 |
24223 |
24041 |
* |
PSEUDO |
[44] fragment |
|
|
BBH36.2 |
24751 |
25112 |
PSEUDO |
[102] fragment |
||
|
BBH38 |
26614 |
26498 |
SHORT |
no homolog outside of Borrelia |
||
|
BBH39 |
26754 |
26855 |
SHORT |
no homolog outside of Borrelia |
||
|
BBH40 |
27445 |
26981 |
PSEUDO |
[82] fragment |
||
|
lp28-4 |
||||||
|
BBI01 |
174 |
605 |
PSEUDO |
[57] fragment |
||
|
BBI02 |
736 |
1101 |
? |
[84] (fragment of [57]?) |
||
|
BBI02.1 |
1416 |
1346 |
* |
PSEUDO |
[172] fragment |
|
|
BBI02.2 |
1617 |
1862 |
* |
PSEUDO |
[57] fragment |
|
|
BBI03 |
1972 |
1850 |
SHORT |
no homolog outside of Borrelia |
||
|
BBI04 |
2219 |
2127 |
SHORT |
no homolog outside of Borrelia |
||
|
BBI05 |
2310 |
2191 |
SHORT |
no homolog outside of Borrelia |
||
|
BBI07 |
3441 |
3346 |
SHORT |
no homolog outside of Borrelia |
||
|
BBI08 |
3576 |
3752 |
* |
QUESTIONABLE |
overlaps BBI08.1 at C-terminus |
|
|
BBI08.1 |
3674 |
4314 |
PSEUDO |
[59] fragment |
||
|
BBI09 |
3745 |
3879 |
* |
QUESTIONABLE |
in-frame inside I08.1 |
|
|
BBI10 |
3911 |
4312 |
QUESTIONABLE |
in-frame inside I08.1 |
||
|
BBI11 |
4721 |
4626 |
SHORT |
no homolog outside of Borrelia |
||
|
BBI12 |
5128 |
5343 |
SHORT |
no homolog outside of Borrelia |
||
|
BBI13 |
5609 |
5704 |
SHORT |
no homolog outside of Borrelia |
||
|
BBI14 |
6159 |
6269 |
* |
PSEUDO |
[60] fragment |
|
|
BBI15 |
6603 |
6830 |
* |
PSEUDO |
[60] fragment |
|
|
BBI17 |
8967 |
8824 |
SHORT |
no homolog outside of Borrelia |
||
|
BBI18 |
10647 |
10498 |
SHORT |
no homolog outside of Borrelia |
||
|
BBI23 |
13989 |
14090 |
SHORT |
no homolog outside of Borrelia |
||
|
BBI24 |
14334 |
14438 |
SHORT |
no homolog outside of Borrelia |
||
|
BBI25 |
15211 |
15339 |
SHORT |
no homolog outside of Borrelia |
||
|
BBI27 |
17272 |
17096 |
* |
PSEUDO |
[60] fragment |
|
|
BBI30 |
19403 |
19507 |
SHORT |
no homolog outside of Borrelia |
||
|
BBI31.1 |
20240 |
20340 |
* |
PSEUDO |
[98] fragment |
|
|
BBI32 |
20479 |
20273 |
* |
QUESTIONABLE |
backwards inside of BBI31.1 |
|
|
BBI33 |
20482 |
20589 |
* |
PSEUDO |
[82] fragment |
|
|
BBI35 |
21992 |
22090 |
* |
QUESTIONABLE |
paralog not called in paralogous sequence elsewhere |
|
|
BBI37 |
23056 |
23154 |
* |
QUESTIONABLE |
paralog not called in paralogous sequence elsewhere |
|
|
BBI40 |
25320 |
25802 |
* |
PSEUDO |
[49] fragment |
|
|
BBI41 |
26036 |
25797 |
PSEUDO |
[82] fragment |
||
|
BBI43 |
27069 |
26884 |
* |
PSEUDO |
[55] fragment - see paralog list |
|
|
lp36 |
||||||
|
BBK001 |
86 |
14 |
* |
PSEUDO |
[82] fragment |
|
|
BBK02 |
2478 |
1213 |
QUESTIONABLE |
in frame inside K02.1 |
||
|
BBK02.1 |
3770 |
1213 |
PSEUDO |
pseudogene in [1] |
||
|
BBK03 |
3222 |
2821 |
QUESTIONABLE |
in frame inside K02.1 |
||
|
BBK04 |
3595 |
3419 |
* |
QUESTIONABLE |
in frame inside K02.1 |
|
|
BBK05 |
5096 |
4905 |
SHORT |
no homolog outside of Borrelia |
||
|
BBK06 |
5126 |
5233 |
SHORT |
no homolog outside of Borrelia |
||
|
BBK08 |
6281 |
6180 |
SHORT |
no homolog outside of Borrelia |
||
|
BBK09 |
6366 |
6647 |
SHORT |
no homolog outside of Borrelia |
||
|
BBK10 |
6983 |
6807 |
* |
PSEUDO |
[1] fragment |
|
|
BBK11 |
6956 |
7060 |
SHORT |
no homolog outside of Borrelia |
||
|
BBK14 |
8921 |
9013 |
SHORT |
no homolog outside of Borrelia |
||
|
BBK16 |
10223 |
10101 |
* |
PSEUDO |
[32] fragment |
|
|
BBK18 |
12143 |
12054 |
SHORT |
conserved hypothetical protein |
||
|
BBK20 |
13212 |
13301 |
SHORT |
no homolog outside of Borrelia |
||
|
BBK24.1 |
17380 |
17580 |
* |
QUESTIONABLE |
inside BBK25 and backwards |
|
|
BBK25 |
17565 |
16969 |
PSEUDO |
[82] fragment |
||
|
BBK25.1 |
17880 |
20033 |
PSEUDO |
[1] fragment |
||
|
BBK26 |
18346 |
18462 |
* |
QUESTIONABLE |
in frame inside of K25.1 |
|
|
BBK27 |
19094 |
18807 |
* |
QUESTIONABLE |
backwards inside of BBK25.1 |
|
|
BBK28 |
19232 |
19348 |
* |
QUESTIONABLE |
in frame inside of K25.1 |
|
|
BBK29 |
19807 |
19718 |
* |
QUESTIONABLE |
in frame inside of K25.1 |
|
|
BBK30 |
19935 |
20033 |
* |
QUESTIONABLE |
in frame inside of K25.1 |
|
|
BBK31 |
20026 |
20166 |
SHORT |
no homolog outside of Borrelia |
||
|
BBK33 |
21720 |
21890 |
* |
PSEUDO |
[65] fragment |
|
|
BBK34 |
21912 |
22130 |
SHORT |
no homolog outside of Borrelia |
||
|
BBK35 |
22294 |
22464 |
SHORT |
no homolog outside of Borrelia |
||
|
BBK36 |
22545 |
22646 |
SHORT |
no homolog outside of Borrelia |
||
|
BBK37 |
23953 |
22913 |
PSEUDO |
[175 & 75] pseudogene |
||
|
BBK38 |
24146 |
23922 |
SHORT |
no homolog outside of Borrelia |
||
|
BBK39 |
24293 |
24667 |
PSEUDO |
[59] fragment |
||
|
BBK42 |
26803 |
26588 |
SHORT |
no homolog outside of Borrelia |
||
|
BBK42.1 |
26916 |
27078 |
SHORT |
no homolog outside of Borrelia |
||
|
BBK43 |
26937 |
27041 |
QUESTIONABLE |
largely in-frame inside of BBK42.1 |
||
|
BBK44 |
27234 |
27350 |
SHORT |
no homolog outside of Borrelia |
||
|
BBK46 |
29212 |
28394 |
PSEUDO |
[75] authentic frameshifts |
||
|
BBK51 |
34232 |
34327 |
SHORT |
no homolog outside of Borrelia |
||
|
BBK52.1 |
35722 |
35811 |
* |
PSEUDO |
[174] authentic frameshift |
|
|
BBK54 |
36577 |
36392 |
SHORT |
[55] fragment? - see paralog list |
||
|
lp38 |
||||||
|
BBJ001 |
482 |
1208 |
PSEUDO |
[60] fragment |
||
|
BBJ01 |
482 |
664 |
* |
QUESTIONABLE |
in frame inside J001 |
|
|
BBJ02 |
927 |
1208 |
* |
QUESTIONABLE |
in frame inside J001 |
|
|
BBJ02.1 |
1475 |
2367 |
PSEUDO |
[48] fragment |
||
|
BBJ03 |
1593 |
1742 |
* |
QUESTIONABLE |
in frame inside of BBJ02.1 |
|
|
BBJ04 |
2381 |
2271 |
* |
QUESTIONABLE |
backwards inside of J02.1 |
|
|
BBJ05 |
3828 |
2768 |
PSEUDO |
[82] fragment |
||
|
BBJ06 |
3486 |
3629 |
* |
QUESTIONABLE |
backwards inside of BBJ05 |
|
|
BBJ07 |
4307 |
4167 |
* |
QUESTIONABLE |
inside J07.1 |
|
|
BBJ07.1 |
4409 |
4167 |
* |
PSEUDO |
[98] fragment |
|
|
BBJ10 |
7270 |
7473 |
* |
PSEUDO |
[58] fragment |
|
|
BBJ11 |
7965 |
7783 |
SHORT |
no homolog outside of Borrelia |
||
|
BBJ11.1 |
8070 |
8260 |
* |
? |
[171] fragment (weak - not in TIGR gene list) |
|
|
BBJ12 |
8725 |
8636 |
* |
QUESTIONABLE |
inside of and out-of-frame with BBJ12.1 |
|
|
BBJ12.1 |
8782 |
8593 |
* |
PSEUDO |
[86] fragment |
|
|
BBJ13 |
9155 |
8880 |
* |
PSEUDO |
[69] fragment |
|
|
BBJ14 |
9125 |
9283 |
* |
QUESTIONABLE |
overlaps BBJ13 backwards |
|
|
BBJ15 |
10168 |
10043 |
* |
QUESTIONABLE |
inside BBJ15.1 and backwards |
|
|
BBJ15.1 |
9450 |
10150 |
PSEUDO |
[105] fragment |
||
|
BBJ20 |
13936 |
13775 |
* |
PSEUDO |
[167] pseudogene |
|
|
BBJ21 |
15514 |
15657 |
* |
QUESTIONABLE |
backwards inside BBJ21.1 |
|
|
BBJ21.1 |
15976 |
15484 |
* |
PSEUDO |
[138] fragment |
|
|
BBJ22 |
16003 |
15905 |
QUESTIONABLE |
overlaps BBJ21.1 in part |
||
|
BBJ30 |
23018 |
23119 |
SHORT |
no homolog outside of Borrelia |
||
|
BBJ32 |
24517 |
24389 |
SHORT |
no homolog outside of Borrelia |
||
|
BBJ33 |
24681 |
24791 |
SHORT |
no homolog outside of Borrelia |
||
|
BBJ35 |
26858 |
26959 |
SHORT |
no homolog outside of Borrelia |
||
|
BBJ37 |
28442 |
28281 |
SHORT |
no homolog outside of Borrelia |
||
|
BBJ38 |
28703 |
28608 |
SHORT |
no homolog outside of Borrelia |
||
|
BBJ39 |
29401 |
29267 |
SHORT |
no homolog outside of Borrelia |
||
|
BBJ39.1 |
29680 |
29600 |
PSEUDO |
[54] fragment |
||
|
BBJ40 |
29827 |
29919 |
* |
QUESTIONABLE |
paralog not called between BBJ41 and BBJ42 |
|
|
BBJ42 |
31133 |
30945 |
SHORT |
? |
possible [54] fragment (similarity not great; not in TIGR paralog list) |
|
|
BBJ44 |
32150 |
32251 |
SHORT |
no homolog outside of Borrelia |
||
|
BBJ45.1 |
33652 |
33522 |
* |
PSEUDO |
[173] authentic frameshift |
|
|
BBJ46 |
34668 |
34375 |
SHORT |
no homolog outside of Borrelia |
||
|
BBJ49 |
36455 |
36315 |
* |
PSEUDO |
[92] fragment |
|
|
BBJ50 |
36502 |
37203 |
PSEUDO |
authentic point mutation (see annotated gene list) |
||
|
BBJ51 |
38522 |
37382 |
PSEUDO |
[170] fragment/authentic frameshift |
||
|
lp54 |
||||||
|
BBA02 |
1238 |
1122 |
SHORT |
no homolog outside of Borrelia |
||
|
BBA06 |
4342 |
4226 |
SHORT |
no homolog outside of Borrelia |
||
|
BBA12 |
8202 |
8378 |
SHORT |
no homolog outside of Borrelia |
||
|
BBA17 |
11390 |
11301 |
SHORT |
no homolog outside of Borrelia |
||
|
BBA22 |
15084 |
15239 |
SHORT |
no homolog outside of Borrelia |
||
|
BBA26 |
17386 |
17514 |
SHORT |
no homolog outside of Borrelia |
||
|
BBA27 |
17563 |
17679 |
SHORT |
no homolog outside of Borrelia |
||
|
BBA28 |
17906 |
17757 |
SHORT |
no homolog outside of Borrelia |
||
|
BBA29 |
18019 |
17897 |
SHORT |
no homolog outside of Borrelia |
||
|
BBA32 |
20654 |
20845 |
SHORT |
no homolog outside of Borrelia |
||
|
BBA35 |
23391 |
23284 |
SHORT |
no homolog outside of Borrelia |
||
|
BBA49 |
32678 |
32893 |
SHORT |
no homolog outside of Borrelia |
||
|
BBA53 |
35890 |
36162 |
SHORT |
no homolog outside of Borrelia |
||
|
BBA54 |
36192 |
36467 |
SHORT |
no homolog outside of Borrelia |
||
|
BBA58 |
39566 |
39766 |
SHORT |
no homolog outside of Borrelia |
||
|
BBA59 |
40048 |
39812 |
SHORT |
12 kd lipoprotein |
||
|
BBA62 |
42203 |
42406 |
SHORT |
7.5 kd lipoprotein |
||
|
BBA63 |
42576 |
42454 |
SHORT |
no homolog outside of Borrelia |
||
|
BBA67 |
46021 |
46197 |
SHORT |
no homolog outside of Borrelia |
||
|
BBA70 |
49158 |
48523 |
PSEUDO |
[54] fragment |
||
|
BBA71 |
49796 |
49386 |
PSEUDO |
[54] fragment |
||
|
BBA72 |
50031 |
49792 |
SHORT |
no homolog outside of Borrelia |
||
|
BBA75 |
52591 |
52496 |
SHORT |
no homolog outside of Borrelia |
||
|
lp56 |
||||||
|
BBQ01 |
279 |
545 |
* |
PSEUDO |
[55] fragment? see paralog list |
|
|
BBQ02 |
710 |
799 |
* |
QUESTIONABLE |
paralog not called in right end of lp28-4 |
|
|
BBQ04 |
1404 |
2265 |
PSEUDO |
[44] authentic frameshift |
||
|
BBQ10 |
6674 |
6339 |
PSEUDO |
[51] fragment; result of cp32 integration |
||
|
BBQ11 |
6586 |
6800 |
PSEUDO |
[148] fragment; result of cp32 integration |
||
|
BBQ16 |
9238 |
9624 |
PSEUDO |
[108] authentic frameshift |
||
|
BBQ21 |
12193 |
12426 |
SHORT |
probably a real gene; has paralogs on other cp32s |
||
|
BBQ30 |
18713 |
18913 |
SHORT |
probably a real gene; has paralogs on other cp32s |
||
|
BBQ36 |
21424 |
21513 |
* |
QUESTIONABLE |
paralog not called in paralogous sequence elsewhere |
|
|
BBQ46 |
29559 |
29651 |
* |
QUESTIONABLE |
paralog not called in paralogous sequence elsewhere |
|
|
BBQ51 |
33873 |
35095 |
PSEUDO |
[146] authentic frameshift |
||
|
BBQ54 |
36389 |
37132 |
PSEUDO |
[148] fragment; result of cp32 integration |
||
|
BBQ55 |
37533 |
36933 |
PSEUDO |
[51] fragment; result of cp32 integration |
||
|
BBQ56 |
37809 |
37672 |
SHORT |
no homolog outside of Borrelia |
||
|
BBQ57 |
38223 |
37819 |
QUESTIONABLE |
in frame and inside of BBQ60 |
||
|
BBQ58 |
38514 |
38419 |
* |
QUESTIONABLE |
in frame and inside of BBQ60 |
|
|
BBQ59 |
39077 |
38568 |
QUESTIONABLE |
in frame and inside of BBQ60 |
||
|
BBQ60 |
39360 |
37817 |
PSEUDO |
[101] fragment |
||
|
BBQ61 |
39482 |
39360 |
* |
QUESTIONABLE |
in frame and inside of BBQ63 |
|
|
BBQ62 |
39531 |
39902 |
QUESTIONABLE |
backwards and inside of BBQ63 |
||
|
BBQ63 |
39934 |
39400 |
PSEUDO |
[117] fragment |
||
|
BBQ64 |
40186 |
39962 |
* |
QUESTIONABLE |
inside of and in-frame with BBQ65 |
|
|
BBQ65 |
40218 |
39961 |
* |
PSEUDO |
[103] fragment |
|
|
BBQ66 |
40317 |
40409 |
SHORT |
no homolog outside of Borrelia |
||
|
BBQ67 |
43732 |
40439 |
PSEUDO |
putative adenine specific DNA methyltransferase (e.g., BBE29 [167]) fused to [102] - like sequences |
||
|
BBQ68 |
44232 |
43918 |
QUESTIONABLE |
in-frame and inside of BBQ69 |
||
|
BBQ69 |
44581 |
43769 |
PSEUDO |
[138] authentic frameshift/fragment |
||
|
BBQ70 |
44612 |
44511 |
* |
QUESTIONABLE |
in-frame in BBQ69 in part |
|
|
BBQ71 |
44582 |
45263 |
PSEUDO |
[105] fragment |
||
|
BBQ72 |
45314 |
45427 |
SHORT |
no homolog outside of Borrelia |
||
|
BBQ73 |
45630 |
45530 |
* |
PSEUDO |
[60] fragment |
|
|
BBQ74 |
46462 |
45804 |
PSEUDO |
[60] fragment |
||
|
BBQ75 |
46671 |
46781 |
* |
PSEUDO |
[168] fragment |
|
|
BBQ76 |
46982 |
47077 |
SHORT |
no homolog outside of Borrelia |
||
|
BBQ77 |
47163 |
47279 |
* |
PSEUDO |
contains a [82] fragment which may be part of one deleted pseudogene that includes BBQ79 |
|
|
BBQ78 |
47295 |
47393 |
* |
QUESTIONABLE |
out of frame inside BBQ81 |
|
|
BBQ79 |
47273 |
47596 |
PSEUDO |
[82] fragment |
||
|
BBQ80 |
48626 |
47787 |
* |
PSEUDO |
[60] fragment |
|
|
BBQ81 |
49246 |
49047 |
* |
PSEUDO |
[48] fragment |
|
|
BBQ82 |
49538 |
49347 |
* |
PSEUDO |
[76] fragment |
|
|
BBQ83 |
49868 |
49755 |
SHORT |
no homolog outside of Borrelia |
||
|
BBQ84 |
50550 |
50398 |
* |
PSEUDO |
[84] fragment (also [57] fragment?) |
|
|
BBQ84.1 |
50900 |
50700 |
* |
PSEUDO |
[169] authentic frameshift |
|
|
BBQ85 |
51528 |
51175 |
PSEUDO |
[57] fragment |
||
|
BBQ86 |
51823 |
51722 |
* |
PSEUDO |
[57] fragment |
|
|
BBQ87 |
52067 |
51921 |
* |
PSEUDO |
[77] ([57] fragment?) |
|
|
BBQ89 |
52726 |
52535 |
SHORT |
no homolog outside of Borrelia |
||
|
Large Chromosome pseudogenes (and short and questionable genes in rightmost 7.2 kbp) |
||||||
|
BB0119 |
117763 |
116828 |
PSEUDO |
single authentic frameshift |
||
|
BB0843.1 |
903255 |
903415 |
* |
PSEUDO |
[32] fragment |
|
|
BB0845 |
905120 |
905224 |
* |
QUESTIONABLE |
inside BB0845.1 and backwards |
|
|
BB0845.1 |
905255 |
905025 |
* |
PSEUDO |
[76] pseudogene |
|
|
BB0845.11 |
905395 |
905295 |
* |
PSEUDO? |
[166] fragment; possible pseudogene; this is a relatively poor homolog and is not included in the TIGR analysis |
|
|
BB0845.2 |
905475 |
905775 |
* |
PSEUDO |
[105] fragment |
|
|
BB0846 |
905865 |
905755 |
* |
QUESTIONABLE |
overlaps BB845.3 backwards |
|
|
BB0847 |
905839 |
905943 |
SHORT |
|||
|
BB0848 |
905928 |
906029 |
SHORT |
|||
|
BB0848.1 |
906075 |
906275 |
* |
PSEUDO |
[82] fragment |
|
|
BB0849 |
906162 |
906260 |
* |
QUESTIONABLE |
inside BB0848.1 |
|
|
BB0849.1 |
906725 |
906275 |
PSEUDO |
[57] fragment |
||
|
BB0849.2 |
907225 |
908225 |
PSEUDO |
[1] fragment |
||
|
BB0853 |
910175 |
909845 |
PSEUDO |
[57] fragment |
||
|
BB0853.1 |
910555 |
910375 |
* |
PSEUDO |
[57] fragment |
|
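The coordinate pairs in the table above lend themselves to simple programmatic checks of notes such as "inside of" and "backwards". The following is a minimal Python sketch, assuming (this is not stated in this section) that the two numbers after each gene name are its start and end coordinates, so that a start greater than the end means the gene lies on the reverse strand. The names `Orf`, `is_inside` and `is_backwards` are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Orf:
    name: str
    start: int  # assumed: coordinate of the first base of the ORF
    end: int    # assumed: coordinate of the last base of the ORF

    @property
    def strand(self) -> str:
        # Assumption: start > end means the ORF is on the reverse strand.
        return "+" if self.start <= self.end else "-"

    @property
    def span(self) -> Tuple[int, int]:
        # Coordinates ordered low to high, regardless of strand.
        return (min(self.start, self.end), max(self.start, self.end))

    @property
    def length(self) -> int:
        low, high = self.span
        return high - low + 1


def is_inside(inner: Orf, outer: Orf) -> bool:
    """True if inner lies entirely within the span of outer."""
    inner_low, inner_high = inner.span
    outer_low, outer_high = outer.span
    return outer_low <= inner_low and inner_high <= outer_high


def is_backwards(a: Orf, b: Orf) -> bool:
    """True if the two ORFs are on opposite strands."""
    return a.strand != b.strand


# Four lp56 entries copied from the table above.
bbq57 = Orf("BBQ57", 38223, 37819)
bbq60 = Orf("BBQ60", 39360, 37817)
bbq62 = Orf("BBQ62", 39531, 39902)
bbq63 = Orf("BBQ63", 39934, 39400)

print(is_inside(bbq57, bbq60), is_backwards(bbq57, bbq60))
print(is_inside(bbq62, bbq63), is_backwards(bbq62, bbq63))
```

Under these assumptions BBQ57 falls inside BBQ60 on the same strand, and BBQ62 falls inside BBQ63 on the opposite strand, in agreement with the notes in the table. Not every note can be reproduced this way from the listed coordinates alone, so the sketch is a reading aid, not a validation tool.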
Direct, Tandem Repeat Arrays of "Short" Sequences and "Long" Inverted Repeats in the B. burgdorferi B31 Plasmids
Sherwood Casjens, Granger Sutton and Brian Stevenson - February, 1999
This compilation was not done in a completely rigorous fashion, so there may be additional short tandem repeat arrays on the B31 plasmids. The roles of these repeats are unknown with the exception of the vlsE cassettes on lp28-1 (Zhang et al., 1997).
|
Plasmid |
Unit Length (bp) |
# of repeats |
bp in array |
in gene? (location) |
Paralogous gene family |
Typical repeat unit sequence |
|
lp17 |
21 |
8.3 |
175 |
no (13154) |
-- |
TAATTAATATGTGATATAAAA |
|
lp21 |
63 |
176.2 |
11,004 |
no (3618) |
-- |
ATAAATCATATAAATAAATATTTCATAAATAAT AAGTAAAAGTGGTTTAGTTTTGGAGTGTAT (see below) |
|
lp28-1 |
33(A) |
2.5 |
83 |
BBF03 |
[80] |
tactactaagattgatactgttaaaagcgaact |
|
lp28-1 |
33(B) |
3.1 |
101 |
BBF03 |
[80] |
ggaatccaataacaaagttcttttggaaaagct |
|
lp28-1 |
~570 |
15 |
8800 |
BBF32 |
[170] |
vlsE pseudogene cassettes |
|
lp28-2 |
87(A) |
3.1 |
272 |
BBG33 |
[80] |
CAATCTTGTTACTAAGATTGATACTGTTAAAAG TGAACTTACTACTAAGATTGATAATGTAGAAAA GAATTTACAAAAGGATATATC |
|
lp28-2 |
33(B) |
4.4 |
145 |
BBG33 |
[80] |
AAAGCTGGAAGCCAATAACAAACTTCTTTTGGA |
|
lp28-3 |
54(A) |
5.6 |
304 |
BBH13 |
[80] |
tagaaaagaatttacaaaaagacatatttaatt tagatgctaagatagattctg |
|
lp28-4 |
27 |
21.5 |
661 |
BBI16 |
[60] |
aacagaagaagagcttaagaaaaaaca |
|
lp38 |
17 |
7.6 |
125 |
no (5938) |
-- |
AATTGATATTAAAATAT |
|
lp38 |
7 |
12.3 |
86 |
no (10287) |
-- |
TAATAGT |
|
lp54 |
11 |
7.1 |
78 |
no (20128) |
-- |
TAAATCAATAT |
|
cp32-3 |
54 |
3.6 |
194 |
BBS29 |
[80] |
ATACCAAGATAGATAATGTAGAAAAGAATTTAC AAAAAGATATATCTAATTTAG |
|
cp32-3 |
87 |
4.1 |
354 |
BBS37 |
[80] |
TTTAAACACTAAAATAGACAATGTTGAAAAGAA TTTAAATCTAAAAATAGATAATTTAGACTCTAA AATAGATACTGTAGAAAAGAA
(note that this complex 87 bp repeat is actually made up of two parts, a 54 bp part and a 33 bp part that is a fragment of the 54 bp sequence). |
Each of the cp32s has an "ortho-paralog" of both BBS29 and BBS37 (except cp32-1 and cp32-6, which have only an ortho-paralog of BBS37); these are not listed here. All of them contain similar repeats. W. Zuckert has performed a more detailed analysis of the family 80 repeats and named these genes "bdr" for Borrelia direct repeat (Zuckert, Meyer & Barbour, 1999).
The 63 bp repeat tract on plasmid lp21
The lp21 "63 bp repeat tract" is a tandem array of 176 near repeats of a 63 bp sequence. The 63 bp sequence unit is actually made up of 34 different, very similar sequences (analysis by G. Sutton & S. Casjens). In the table below, the three columns represent: (1) a letter name given to each type of 63 bp repeat unit, (2) the number of that particular repeat unit present in the lp21 array, and (3) the sequence of that type of repeat unit.
a 28 --TGGAGTGTATAAATCATAAAAATAAATATTTTATAAAGAATAAGTAAAAGTGGTTTAGTTT
b 2 --TGGAGTGTATAAATCATAAAAATAAATATTTTATAAAGAATTAGTAAAAGTGGTTTAGTTT
c 1 --TGGAGTGTATAAATCATAAAAATAAATATTTTATAAATAATAAGTAAAAGTGGTTTAGTTT
d 14 --TGGAGTGTATAAATCATATAAATAAATATTTTATAAAGAATAAGTAAAAGTGGTTTAGTTT
e 1 --TGGAGTGTATAAATCATATAAATAAATATTTTATAAATAATAAGTAAAAGTGGTTTAGTTT
f 1 A--GGAGTGTATAAATCATAAAAATAAATATTTTATAAAGAATAAGTAAAAGTGGTTTAGTTT
g 1 A--GGAGTGTATAAATCATATAAATAAATATTTTATAAAGAATAAGTAAAAGTGGTTTAGTTT
h 5 ACTGGACTACATAAATCATAAAAATAAATATTTCATAAAGAATAAGTAAAAGTGGTTTAGTTT
i 6 ACTGGACTACATAAATCATAAAAATAAATATTTTATAAAGAATAAGTAAAAGTGGTTTAGTTT
j 1 ACTGGACTACATAAATCATAAAAATAAATATTTTATAAATAATAAGTAAAAGTGGTTTAGTTT
k 1 ACTGGACTACATAAATCATAAAAATAGATATTTTATAAAGAATAAGTAAAAGTGGTTTAGTTT
l 6 ACTGGACTACATAAATCATATAAATAAATATTTTATAAAGAATAAGTAAAAGTGGTTTAGTTT
m 5 ACTGGACTACATAAATCATATAAATAGATATTTTATAAAGAATAAGTAAAAGTGGTTTAGTTT
n 3 ACTGGAGTGTATAAATCATAAAAATAAATATTTCATAAAGAATAAGTAAAAGTGGTTTAGTTT
o 5 ACTGGAGTGTATAAATCATAAAAATAAATATTTCATAAATAATAAGTAAAAGTGGTTTAGTTT
p 1 ACTGGAGTGTATAAATCATAAAAATAAATATTTTATAAAGAATAAGTAAAAGTGGTGTAGTTT
q 38 ACTGGAGTGTATAAATCATAAAAATAAATATTTTATAAAGAATAAGTAAAAGTGGTTTAGTTT
r 1 ACTGGAGTGTATAAATCATAAAAATAAATATTTTATAAATAATAAGTAAAAGTGGTTTAGTTT
s 1 ACTGGAGTGTATAAATCATAAAAATAGATATTTTATAAAGAATAAGTAAAAGTGGTTTAGTTT
t 2 ACTGGAGTGTATAAATCATATAAATAAATATTTCATAAATAATAAGTAAAAGTGGTTTAGTTT
u 6 ACTGGAGTGTATAAATCATATAAATAAATATTTTATAAAGAATAAGTAAAAGTGGTTTAGTTT
v 2 AGTGGACTACATAAATCATAAAAATAAATATTTCATAAAGAATAAGTAAAAGTGGTTTAGTTT
w 14 AGTGGACTACATAAATCATAAAAATAAATATTTTATAAAGAATAAGTAAAAGTGGTTTAGTTT
x 1 AGTGGACTACATAAATCATAAAAATAAATATTTTATAAAGAATTAGTAAAAGTGGTTTAGTTT
y 3 AGTGGACTACATAAATCATAAAAATAAATATTTTATAAATAATAAGTAAAAGTGGTTTAGTTT
z 1 AGTGGACTACATAAATCATAAAAATAGATATTTTATAAAGAATAAGTAAAAGTGGTTTAGTTT
A 6 AGTGGACTACATAAATCATATAAATAAATATTTTATAAAGAATAAGTAAAAGTGGTTTAGTTT
B 1 AGTGGACTACATAAATCATATAAATAAATATTTTATAAATAATAAGTAAAAGTGGTTTAGTTT
C 3 AGTGGACTACATAAATCATATAAATAGATATTTTATAAAGAATAAGTAAAAGTGGTTTAGTTT
D 3 AGTGGACTACATAAATCATATAAATAGATATTTTATAAATAATAAGTAAAAGTGGTTTAGTTT
E 1 AGTGGAGTGTATAAATCATAAAAATAAATATTTCATAAAGAATAAGTAAAAGTGGTTTAGTTT
F 8 AGTGGAGTGTATAAATCATAAAAATAAATATTTTATAAAGAATAAGTAAAAGTGGTTTAGTTT
G 3 AGTGGAGTGTATAAATCATATAAATAAATATTTTATAAAGAATAAGTAAAAGTGGTTTAGTTT
H 1 CCTCGGGTACATAAATCATAAAAATAAATATTTCATAAAGAATAAGTAAAAGTGGTTTAGTTT
The types of repeat units appear in the order given below. The array contains two major larger perfect repeats of a contiguous group of 63 bp units (i.e., large perfect sequence repeats, marked in the original by single and double underlining) that probably indicate recent duplications of sections of the array: units 2-19 are identical to units 129-146, and units 20-30 are identical to units 31-41. Some smaller pieces (e.g., "w, l, q, q") of these groups can also be found elsewhere in the array. The longest tandem group of perfect 63 bp repeats is 3 "q" repeats in a row.
H, t, b, d, q, C, q, A, a, d, a, i, q, a, q, w, l, q, q, q, a, o, D, h, h, A,
G, a, o, a, q, a, o, D, h, h, A, G, a, o, a, e, u, F, d, d, q, C, D, a, a, d,
F, w, l, q, q, n, d, d, x, n, n, l, a, q, i, l, w, a, w, d, a, d, F, u, a, o,
w, a, q, w, l, q, q, q, a, i, a, q, q, k, d, d, q, E, u, a, q, q, p, a, q, d,
F, G, q, q, u, F, F, q, u, j, F, s, w, q, q, u, g, f, F, w, q, A, q, v, t, b,
d, q, C, q, A, a, d, a, i, q, a, q, w, l, q, q, i, q, q, a, w, A, a, w, y, B,
a, y, y, q, z, r, m, h, w, a, m, w, a, m, w, m, v, i, m, c
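The statements about this unit order (176 units in total, a longest tandem run of three "q" units, and large exact repeats within the tract) can be checked directly from the list above. The following Python sketch is one way to do so; the function names `longest_tandem_run` and `longest_exact_repeat` are hypothetical, and the simple quadratic search is an illustrative choice, not the method used in the original analysis.

```python
from itertools import groupby

# Unit order copied from the listing above.
ORDER = """
H, t, b, d, q, C, q, A, a, d, a, i, q, a, q, w, l, q, q, q, a, o, D, h, h, A,
G, a, o, a, q, a, o, D, h, h, A, G, a, o, a, e, u, F, d, d, q, C, D, a, a, d,
F, w, l, q, q, n, d, d, x, n, n, l, a, q, i, l, w, a, w, d, a, d, F, u, a, o,
w, a, q, w, l, q, q, q, a, i, a, q, q, k, d, d, q, E, u, a, q, q, p, a, q, d,
F, G, q, q, u, F, F, q, u, j, F, s, w, q, q, u, g, f, F, w, q, A, q, v, t, b,
d, q, C, q, A, a, d, a, i, q, a, q, w, l, q, q, i, q, q, a, w, A, a, w, y, B,
a, y, y, q, z, r, m, h, w, a, m, w, a, m, w, m, v, i, m, c
"""

units = [u.strip() for u in ORDER.split(",") if u.strip()]


def longest_tandem_run(seq):
    """Return (unit, run_length) for the longest run of identical adjacent units."""
    runs = ((unit, len(list(group))) for unit, group in groupby(seq))
    return max(runs, key=lambda run: run[1])


def longest_exact_repeat(seq):
    """Return (i, j, n): 0-based starts i < j of the longest block of n
    consecutive units that occurs at both positions."""
    best = (0, 0, 0)
    prev = [0] * (len(seq) + 1)  # match lengths for the row starting at i + 1
    for i in range(len(seq) - 1, -1, -1):
        cur = [0] * (len(seq) + 1)
        for j in range(len(seq) - 1, i, -1):
            if seq[i] == seq[j]:
                cur[j] = prev[j + 1] + 1
                if cur[j] > best[2]:
                    best = (i, j, cur[j])
        prev = cur
    return best


print(len(units))                  # total number of units in the array
print(longest_tandem_run(units))   # longest run of one unit type
i, j, n = longest_exact_repeat(units)
print(i + 1, j + 1, n)             # 1-based starts and length of the longest repeat
```

Positions printed in the last line are converted to 1-based unit numbers so they can be compared with the unit numbering used in the assembly discussion.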
The cp32/cp9 ~180 bp Inverted Repeats
The cp32s, the cp32-like portion of lp56 and cp9 each carry a similar pair of ~180 bp inverted repeats. Each plasmid carries a left inverted repeat unit (IRL) and a right inverted repeat unit (IRR) which surround the putative plasmid partitioning gene cluster (Casjens et al., 2000).
        outside     inside                       inside       outside
ORF-4   --------------->   "partition genes"   <-----------------   ORF-8/7
              IRL                                     IRR
These inverted repeats are not all completely identical; in some locations all IRLs are more like each other than they are like the IRRs (e.g., some members in tier 1 below), and in other places IRLs from one plasmid are more like IRRs from the same plasmid than they are like IRLs or IRRs from other plasmids (some members in tier 3 below). Below, we indicate the presence of smaller inverted repeats within the larger repeat units as follows (---> <---).
Each of the inverted repeat units overlaps a putative gene such that the ATG marked with asterisks (***) is the most likely start codon. Each of these has a credible Shine-Dalgarno ribosome binding site marked by "xxxx"; a "====>" indicates the 5'-portion of the putative gene, which is translated outward from the repeat unit. A paralogous family 161 gene (previously called ORF-4 family) starts within each IRL and a family 165 gene (previously called ORF-8/7 family) starts within each IRR; curiously, these two (otherwise unrelated) protein families have very similar sequences over their N-terminal 13-15 amino acids. In some, but not all, of these genes, the above open reading frame extends 5' of the marked ATG to the TTG marked with bullets, but its ribosome binding site is not as credible as that of the ATG described above.
The roles of these inverted repeats are unknown, but they are positioned such that one might expect an outward directed promoter to initiate transcription in each of them. It is thus possible, or even likely, because of the extreme similarity of these regions, that these outward-directed promoters might be coordinately controlled.
It is interesting to note that these inverted repeats do not surround similar "partitioning" gene clusters on the linear plasmids, where family 161 and 165 genes are not located adjacent to the "partitioning" gene cluster.
The inverted repeat sequences below are named as follows: each of the cp32s is indicated simply by its cp32 number, lp56 is indicated by "56" and cp9 by "cp9". Note that cp32-5, which is not present in strain B31 MI but is present in some other B31 cultures, is included in this analysis.
|
Inside end |
|
|
-->............< |
|
|
1-IRL |
ACGGG..CTTAACTAATTTCTTTAGTAGATAATAGAGAATTTAGCTAAGC |
|
3-IRL |
ATGGG..CTTAGCTAAGTTCTTTAACA......AGAGAACGTAGCTAAGC |
|
4-IRL |
GTGGGAACTTGGCGAAATTCTTTTTAA......AGGGAATTTGGTTAAGT |
|
5-IRL |
TAGGG..ATTAACTAAGTTTTTTAGTAGATAATAGAAAATTTAGCTAAGC |
|
6-IRL |
GTGGGGACTTAACGAGATTCTTTAAGA......AAAGAGTTTGGTTAAGT |
|
7-IRL |
TTGGGGACTTAACGAGATTCTTTGAGA......AAAGAGTTTGGTTAAGT |
|
8-IRL |
ACGGG..CTTAACTAATTTCTTTAGTAGATAATAGAGAATTTAGCTAAGC |
|
9-IRL |
ATGGG..CTTAACTAAGTTCTTTAACA......AGAGAATTTAGCTAAGC |
|
56-IRL |
ACGGG..CTTAACTAATTTCTTTAGTAGATAATAGAGAATTTAGCTAAGC |
|
cp9-IRL |
TAGGG..CTTTACTAAGTTCTTTTAAA......AGAGAATTTAGCAAAGC |
|
1-IRR |
TTGGG..TTTAGCTAAGTTCTTTAACA......AGAGAATTTAGCTAAGC |
|
3-IRR |
TTGGG..TTTAGCTAAGTTCTTGGATA......AGAGAATTTAAATAAAC |
|
4-IRR |
TTGGG..TTTAGCTAAGTTCTTTAACA......AGAGAATTTAAATAAGC |
|
6-IRR |
TTGGG..TTTAGCTAAGTTCTTTAACA......AGAGAATTTAAATAAGC |
|
7-IRR |
TTGGG..TTTAGCTAAGTTCTCTAACA......AGAGAATTTAAATAAGC |
|
8-IRR |
TTGGG..TTTAGCTAAGTTCTTAGACA......AGAGAATTTAAATAAGC |
|
9-IRR |
TTGGG..TTTAGCTAAGTTCTTTAACA......AGAGAATTTAAATAAGT |
|
56-IRR |
TTGGG..TTTAGCTAAGTTCTTAGATA......AGAGAATTTAAATAAAC |
|
cp9-IRR |
TAGGG..CTTTTCTAAGTTCTTTTAAA......AGAGAATTTAGTAAAGC |
|
....>..<.... |
|
|
1-IRL |
CC....TATTTTTTTGTAAAATTTTTTGTAAAAAAG....TTGGCAAAAA |
|
3-IRL |
CCG...CACCTTTTTGTAAAGATTTTTGTAAAAAAG....TTGGCAAAAA |
|
4-IRL |
CCCAC.TTCTTTTGTGTAAAATTTTTTGTAAAAAAG....TTGGCAAAAA |
|
5-IRL |
CC....TAATTTTTTGTAAAAATTTTTGTAAAAAAG....TTGGCAAAAA |
|
6-IRL |
CCCAC.TTCTTTTTTGTAAAAATTTTTGTAAAAAGC....CTGACAAAAA |
|
7-IRL |
CCCAC.TTCTTTTTTACAAAAATTTTTGTAAAAAAG....TTGGCAAAAA |
|
8-IRL |
CC...TATTTTTTTTGTAAAAATTTTTGTAAAAAAG....TTGGCAAAAA |
|
9-IRL |
CCGCAC...TTTTTGTAAAAATTTTTTGTAAAGAAG....TTGGCAAAAA |
|
56-IRL |
CC...TAT.TTTTTTGTAAAAATTTTTGTAAAAAAG....TTGGCAAAAA |
|
cp9-IRL |
CCTA..AGTCTTTTAACAAAAATTTTTATTAAAAAA..AGTTGACAAAAA |
|
1-IRR |
CC...TAT.TTTTTTGTAAAATTTTTTGTAAAAAAG....TTGGCAAAAA |
|
3-IRR |
CCAACTAT.TTTTTTACAAAAATTTTTGTAAAAAAAAAAGTTGGCAAAAA |
|
4-IRR |
CCAACTA.ATTTTTTGTAAAATTTTTTGTAAAAAAG....TTGGCAAAAA |
|
6-IRR |
CCAACTAT.TTTTTTGTAAAAATTTTTGTAAAAAAG....TTGGCAAAAA |
|
7-IRR |
CCAACTAA.TTTTTTGTAAAATTTTTTGTAAAAAAG....TTGGCAAAAA |
|
8-IRR |
CCAACTATTTTTTTTGTAAAGATTTTTGTAAAAAAG....TTGGCAAAAA |
|
9-IRR |
CCAACTA.TTTTTTTGTAAAAATTTTTGTAAAAAAG....TTGTCAAAAA |
|
56-IRR |
CCAACTA.TTTTTTTGTAAAATTTTTTGTAAAAAAG...CCTGACAAAAA |
|
cp9-IRb |
CCTA..AGTGTTTTAATAAAAAATTTTTTGTAAAAA..AGTTGGCAAAAA |
|
........................................xxxx |
|
|
1-IRL |
TAGTTTTTGCTATATACTTATTTTTATAAATAACC..ATA...GGAGTAA |
|
3-IRL |
TAGTTTTTGCTATATATTTATTTTTAT..ACAAAT..ATAA..GGAGAAA |
|
4-IRL |
TAGTTTTTGCTATATAATTA..TTTATTACAA....AATAA..GGAGGAA |
|
5-IRL |
TAGTTTTTGCTATATACTTATATTTATTAATACCA..ATTAAAGGAGGAA |
|
6-IRL |
TAGTTTTTGCTATATACTTAT..TTATTACTAA....ATAA..AGGAGGA |
|
7-IRL |
TAGTTTTTGCTATATACTTATATTTATTACTAT....AAAA..GGAGTAA |
|
8-IRL |
TAGTTTTTGCTATATACTTATATTTATTAATACAA..ACAA..GGAGGAA |
|
9-IRL |
TAGTTTTTGCTATATACTTATATTTATTAATACAT..ATAAACGGAGGAA |
|
56-IRL |
TAGTTTTTGCTATATACTTATATTTATTGAAAA...AACA...GGAGGAA |
|
cp9-IRa |
TAGTTTTTGCTATATATTTATATATAAGAAAATTATAACTTACGGAGTAA |
|
1-IRR |
TAGTTTTTGCTATATACTTATTTTTATAAATAACC..ATA...GGAGTAA |
|
3-IRR |
TAGTTTTTGCTATATATTTATTTTTAT.ACAAAT...ATAA..GGAGAAA |
|
4-IRR |
TAGTTTTTGCTATATAATTAT..TTATTACAA....AATAA..GGAGGAA |
|
6-IRR |
TAGTTTTTGCTATATACTTAT..TTATTACTA....AATAA..AGGAGGA |
|
7-IRR |
TAGTTTTTGCTATATACTTATATTTATTACTATAA.AA.....GGAGTAA |
|
8-IRR |
TAGTTTTTGCTATATACTTATATTTATTAATACA..AACAA..GGAGGAA |
|
9-IRR |
TAGTTTTTGCTATGTAATTAT.....TTATTACAA.AATAA..GGAGGAA |
|
56-IRR |
TAGTTTTTGCTATATACTTATATTTTTTACTATAA.AA.....GGAGTAA |
|
cp9-IRb |
TAGTTTTTGCTATATATTTATATATAAGAAAATTATAACTTGCGGAGTAA |
|
....***=============================>translation |
|
|
1-IRL |
AAAGATGGAAAATCTTTCAAACAATAAT...CAAGAAATACAAAATAATA |
|
3-IRL |
AAAGATGGAAAATCTTTCAAACAATAATAATC......CACAAGAAAATA |
|
4-IRL |
AAAGATGGAAAATCTTTCAAACAATAATAATC......CACAAGAAAATA |
|
5-IRL |
AAAGATGGAAAATCTTTCAAACAATAATAATCAAGAAATACAAAATAATA |
|
6-IRL |
AAACATGAACAATGTTTCAAACAATAATAATCAAGAAATACAAAATAATA |
|
7-IRL |
AAAGATGGAAAATCTTTCAAACAATAATAATCAAGAAATACAAAATAATA |
|
8-IRL |
AAAGATGGAAAATCTTTCAAACAATAAT...CAAGAAATACAAAATAATA |
|
9-IRL |
AAAGATGGAAAATCTTTAAAACAATAATAATC......CACAAGAAAATA |
|
56-IRL |
AAAGATGGAAAATCTTTCAAACAATAATAATCAAGAAATACAAAATAATA |
|
cp9-IRa |
AAAAATGAAAAACC..GCAAA.AACAATAATC......CACAAGAAATTA |
|
1-IRR |
AAAGATGGAAAATCTTTCAAACAATAATAATC......CACAAGAAAATA |
|
3-IRR |
AAAGATGGAAAATCTTTCAAACAATAATAATC......CACAAGAAAATA |
|
4-IRR |
AAAGATGGAAAATCTTTCAAACAATAATAATC......CACAAGAAAATA |
|
6-IRR |
AAACATGAACAATGTTTCAAACAATAATAATC......CACAAGACAATA |
|
7-IRR |
AAAGATGGAAAATCTTTCAAACAATAATAATC......CACAAGAAAATA |
|
8-IRR |
AAAGATGGAAAATCTTTCAAACAATAATAATC......CACAAGAAAATA |
|
9-IRR |
AAAGATGGAAAATCTTTCAAACAATAATAATC......CACAAGAAAATA |
|
56-IRR |
AAAGATGGAAAATCTTTCAAACAATAATAATC......CACAAGAAAATA |
|
cp9-IRb |
AAAAATGAAAAACT.....TA.AACAATAATC......CACAAAAAATTA |
|
Outside end |
|
|
1-IRL |
TTCAA |
|
3-IRL |
TTCAA |
|
4-IRL |
TTCAA |
|
5-IRL |
TTCAA |
|
6-IRL |
TTCAA |
|
7-IRL |
TTCAA |
|
8-IRL |
TTCAA |
|
9-IRL |
TTCAA |
|
56-IRL |
TTCAA |
|
cp9-IRa |
ATCAA |
|
1-IRR |
TTCAA |
|
3-IRR |
TTCAA |
|
4-IRR |
TTCAA |
|
6-IRR |
TTCAA |
|
7-IRR |
TTCAA |
|
8-IRR |
TTCAA |
|
9-IRR |
TTCAA |
|
56-IRR |
TTCAA |
|
cp9-IRb |
ATCAA |
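The degree of similarity between any two rows of the alignment above can be quantified column by column. The following is a minimal Python sketch, assuming that "." denotes an alignment gap and that two rows of the same tier have equal length; `percent_identity` is a hypothetical helper, and counting a gap opposite a base as a mismatch is only one of several reasonable conventions.

```python
def percent_identity(row_a: str, row_b: str) -> float:
    """Percent of aligned columns in which two alignment rows agree.

    Columns where both rows have a gap (".") are skipped; a gap opposite
    a base counts as a mismatch.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(row_a.upper(), row_b.upper()):
        if a == "." and b == ".":
            continue
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# First-tier rows copied from the alignment above.
irl_1 = "ACGGG..CTTAACTAATTTCTTTAGTAGATAATAGAGAATTTAGCTAAGC"
irl_8 = "ACGGG..CTTAACTAATTTCTTTAGTAGATAATAGAGAATTTAGCTAAGC"
irr_1 = "TTGGG..TTTAGCTAAGTTCTTTAACA......AGAGAATTTAGCTAAGC"

print(percent_identity(irl_1, irl_8))
print(round(percent_identity(irl_1, irr_1), 1))
```

Applying the same function to every pair of rows within a tier would give a simple identity matrix from which the IRL-like and IRR-like groupings described above can be examined.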
"AMBIGUOUS" NUCLEOTIDES IN THE BORRELIA BURGDORFERI B31 GENOME SEQUENCE
Compiled by Jeremy Peterson & Sherwood Casjens - January, 1999
A few nucleotide positions in the B. burgdorferi B31 genome sequence were determined to have sequencing template clones of two types in the DNA library. These are likely to represent heterogeneity in the culture that was sequenced and are listed below. They are indicated in the GENBANK sequences with ambiguous nucleotide symbols.
Standard ambiguous nucleotide nomenclature is used in the GENBANK entries and below:
R=A,G
Y=C,T
M=A,C
K=G,T
S=C,G
W=A,T
N=A,G,C,T
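The ambiguity symbols listed above map naturally onto a lookup table. The following Python sketch encodes only the symbols given in this document and shows how an ambiguous position can be expanded or checked against an observed base; `AMBIGUITY_CODES`, `expand` and `is_consistent` are hypothetical names introduced for illustration.

```python
from itertools import product

# Symbols exactly as listed above, plus the four unambiguous bases.
AMBIGUITY_CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",
    "Y": "CT",
    "M": "AC",
    "K": "GT",
    "S": "CG",
    "W": "AT",
    "N": "AGCT",
}


def expand(seq: str) -> list:
    """List every unambiguous sequence consistent with seq."""
    choices = (AMBIGUITY_CODES[base] for base in seq.upper())
    return ["".join(combo) for combo in product(*choices)]


def is_consistent(observed_base: str, code: str) -> bool:
    """True if an observed base is allowed by the ambiguity code."""
    return observed_base.upper() in AMBIGUITY_CODES[code.upper()]


print(expand("AR"))             # the two sequences allowed by A followed by R
print(is_consistent("G", "R"))  # G is one of the two bases allowed by R
```

Because each ambiguous position in the tables below reflects two clone types in the library, expanding a position coded R, Y, M, K, S or W yields exactly the two alternative bases that were observed.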
Ambiguous nucleotides in the long chromosome
nucleotide, position
R, 18473
R, 62338
Y, 83386
M, 88493
R, 141298
R, 175458
Y, 188937
M, 194913
S, 198850
M, 210217
Y, 260019
K, 267641
R, 318524
W, 322559
M, 351692
Y, 367774
M, 390255
M, 390264
M, 390397
Y, 461147
R, 461510
Y, 478223
R, 511724
W, 540030
R, 540041
M, 540430
M, 540431
R, 546978
S, 552218
K, 565089
S, 586413
K, 608556
K, 658807
K, 658810
R, 691440
M, 758684
R, 762044
W, 781552
R, 796301
N, 804071
N, 804561
K, 834040
W, 889305
Ambiguous nucleotides in cp9
none
Ambiguous nucleotides in cp26
none
Ambiguous nucleotides in cp32-1
none
Ambiguous nucleotides in cp32-3
none
Ambiguous nucleotides in cp32-4
none
Ambiguous nucleotides in cp32-6
none
Ambiguous nucleotides in cp32-7
none
Ambiguous nucleotides in cp32-8
none
Ambiguous nucleotides in cp32-9
none
Ambiguous nucleotides in lp5
nucleotide, position
M, 4530
Ambiguous nucleotides in lp17
none
Ambiguous nucleotides in lp21
none
Ambiguous nucleotides in lp25
nucleotide, position
Y, 2440
M, 3601
Y, 6064
S, 21223
R, 21237
W, 21835
Y, 21837
Ambiguous nucleotides in lp28-1
none
Ambiguous nucleotides in lp28-2
none
Ambiguous nucleotides in lp28-3
nucleotide, position
M, 12885
W, 27274
Ambiguous nucleotides in lp28-4
none
Ambiguous nucleotides in lp36
none
Ambiguous nucleotides in lp38
nucleotide, position
M, 30313
Y, 30403
Ambiguous nucleotides in lp54
nucleotide, position
R, 10215
K, 10662
Ambiguous nucleotides in lp56
none
B. burgdorferi B31 Plasmid Sequence Assembly
by Granger Sutton, Patti Rosa and Sherwood Casjens - February 1999
An improved version of TIGR ASSEMBLER was required to uniquely assemble the highly similar cp32s and lp56, as well as the very similar lp5 and lp21 plasmids. Very similar regions of DNA with only single base pair differences needed to be differentiated. In order to do this, unique base pairs within these repetitive regions needed to be identified. The new TIGR ASSEMBLER does this by counting the number of occurrences of each 32 bp oligomer (32mer) in the random sequence reads. 32mers which are not unique to a single plasmid or region will be over-represented. In determining the assembly, TIGR ASSEMBLER gives more weight to overlapping sequence reads which contain the same relatively under-represented 32mers, while giving much less weight to over-represented 32mers. In regions where there were few or no unique base pairs, clone mate constraints guided the assembly process. Clone mates are sequence reads from opposite ends of the same DNA clone. By using the known orientation of the clone mates and approximate size of the clone, TIGR ASSEMBLER chooses sequence read overlaps which satisfy the clone mate constraints. A new feature of TIGR ASSEMBLER which was essential for assembling the long tandem 63 bp repeat in lp21 is tandem 32mer masking. By default, any 32mer which occurs more than once in a single sequence read is not used for sequence overlap determination. This allowed small differences in the tandem repeat unit to determine correct overlaps while ignoring the myriad of possible non-unique overlaps. The last important new feature of TIGR ASSEMBLER is the ability to initialize the assembly process with a previous set of assemblies from TIGR ASSEMBLER. The new TIGR ASSEMBLER did not achieve the final assemblies of the cp32s/lp56 or lp21 on the first iteration. The assemblies were inspected and, when determined to be incorrect, an assembly or portion of an assembly was discarded.
The remaining assemblies and portions of assemblies were used to jump-start TIGR ASSEMBLER and the process was repeated with different parameters until unique, internally consistent assemblies were achieved. Internal consistency was judged from clone mate constraints and base pair differences between overlapping sequence reads. This process was carried out without using information from the restriction maps of the plasmids that are described in the text.
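The two k-mer ideas described above (down-weighting over-represented 32mers and masking 32mers that recur within a single read) can be illustrated with a small Python sketch. This is not the TIGR ASSEMBLER implementation: the function names are hypothetical, the inverse-count weighting is an assumption chosen for simplicity (the text says only that under-represented 32mers receive more weight), and a small k is used in the example so the toy reads stay short.

```python
from collections import Counter


def kmers(read: str, k: int) -> list:
    """All overlapping k-mers of a read, in order."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def kmer_counts(reads: list, k: int) -> Counter:
    """Number of occurrences of each k-mer across all reads."""
    counts = Counter()
    for read in reads:
        counts.update(kmers(read, k))
    return counts


def tandem_masked(read: str, k: int) -> set:
    """k-mers occurring more than once within this single read."""
    per_read = Counter(kmers(read, k))
    return {kmer for kmer, n in per_read.items() if n > 1}


def overlap_weight(read_a: str, read_b: str, counts: Counter, k: int) -> float:
    """Score shared k-mers, ignoring masked ones.

    Assumption: each shared k-mer contributes 1/count, so rare k-mers
    dominate the score and over-represented ones contribute little.
    """
    masked = tandem_masked(read_a, k) | tandem_masked(read_b, k)
    shared = (set(kmers(read_a, k)) & set(kmers(read_b, k))) - masked
    return sum(1.0 / counts[kmer] for kmer in shared)


# Toy reads; the source describes k = 32, k = 4 is used here for brevity.
reads = ["ACGTACGTTT", "CGTTTGGA", "ACGTACGAAA"]
k = 4
counts = kmer_counts(reads, k)
print(tandem_masked(reads[0], k))
print(overlap_weight(reads[0], reads[1], counts, k))
print(overlap_weight(reads[0], reads[2], counts, k))
```

In the toy data the 4-mer ACGT occurs twice in the first read, so it is masked and does not contribute to any overlap involving that read, mirroring the tandem masking used for the lp21 repeat tract.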
Because of the difficulties encountered in the sequence assembly process (due to the extensive similarities among the plasmids), it was necessary to confirm the accuracy of the assembly of the plasmid sequences in B31. Restriction maps of five of the cp32 plasmids and of cp26 from B31 have been previously described in Casjens et al. (1997b) and Tilly et al. (1997), respectively. We therefore screened the pUC plasmid library clones used in the sequencing project (Fraser et al., 1997) for ones that hybridized uniquely or nearly so to only one of the remaining B31 linear plasmids in Southern analysis of CHEF gels of intact and of restricted B31 MI DNA, and these were used to construct restriction site maps of the cognate plasmids according to the strategy described by Casjens et al. (1997b). The structures of cp9 and lp17 were not confirmed in this way, since (i) they assembled unambiguously even with the original, less stringent TIGR ASSEMBLER, (ii) Barbour et al. (1996) previously reported the sequence of B31 lp17, and (iii) Dunn et al. (1994) previously reported the sequence of a cp9-like plasmid from a related isolate. Our sequence assemblies agree with those sequences. For example, all 23 of the mapped sites on lp56 were correctly predicted within experimental error by the assembled sequence of lp56. In total, 344 restriction enzyme cleavage sites were mapped on the 19 plasmids, and all are correctly predicted by the sequence. In addition, no sites were found in the sequence that were not in the plasmid DNAs and vice versa (six differences between predictions made by the sequence and the previously published cp32-1, cp32-3, cp32-4 and cp32-6 restriction maps were found by further experimentation to be mapping errors).
Assembly of the sequences of the cp32s and the closely related portion of lp56 was particularly difficult. Nonetheless, they are likely to be correct since all of their restriction maps are predicted perfectly by the nucleotide sequences, which were assembled without knowledge of the restriction maps. All eight previously mapped restriction sites (or lack thereof) that are diagnostic of a single cp32 are present in the assembled sequence on the correct cp32, and all of the 19 blocks of sequence that had been previously mapped to individual cp32s (Casjens et al., 1997; Stevenson et al., 1998b; Zuckert and Meyer, 1996) are present in the correct cp32 at the experimentally determined location. (We note that the pOMB25 sequence that was attributed without mapping data to cp32-1 (Zuckert and Meyer, 1996) is actually in the cp32-3 sequence.) Although we feel it is unlikely, it remains possible that small regions of similar sequences could be placed on the wrong cp32 by our assembly technique.
Assembly of the lp21 sequence had a special problem in that it contains a long tract of a 63 bp direct repeat. The lp21 sequence reported here contains 11,004 bp, or 176 (plus one partial) tandem copies of the repeat unit. There are no unique, unrelated sequences interspersed among the 63 bp repeats, and not all of the repeats are identical, as is indicated experimentally by the small number of Tsp509I sites within the tract. This non-identity made assembly from random sequencing runs possible. In the reported sequence there are 34 distinct repeat sequences; 27 of these types (128 total repeat units) are 63 bp long and 7 types (48 units) are 61 bp long. The maximum number of adjacent identical units is three, and there are two large exact repeats within the tract; units 2-19 are identical to units 129-146 and units 20-30 are identical to units 31-41. In order to experimentally characterize this repeat tract further, we used Southern analysis to measure the sizes of several restriction fragments that contain all the repeats. CHEF electrophoresis gels of B31 MI DNA cleaved with MseI, DraI, AseI, HindIII, EcoRI, StuI, BsrGI, XbaI and EcoO109I (all of which are predicted not to cleave within the repeat region) gave single DNA bands of 11, 13, 14, 13, 16, 17, 18, 16 and 18 kbp (all ±1 kb), respectively, that hybridize to a 63 bp repeat DNA probe. In addition, Tsp509I gave probe reactive fragments of about 3.0, 3.5 and 4.3 kbp. MseI and Tsp509I cleave at TTAA and AATT, respectively, and so are expected to cleave the 71.8% A+T Borrelia DNA to an average fragment size of 60-70 bp; indeed the fragments mentioned above are by far the largest fragments (as observed by ethidium bromide staining) in complete digests of B31 DNA made by these two enzymes. The three Tsp509I fragments from genomic B31 DNA were gel purified and used as probes in Southern analyses of electrophoresis gels of uncleaved and MseI digested B31 DNA. 
All were found to hybridize only to lp21 DNA and to the 11 kbp MseI band (data not shown). We also confirmed predictions from the sequence that three other restriction enzymes with four bp recognition sites, AluI, RsaI and Sau3AI, do not cut the repeat region, and that SspI, which has a six bp recognition site, cuts the entire repeat region into very small (<300 bp) fragments. Calculations from the locations of the cleavage sites for the above nine enzymes in the surrounding unique sequence gave a repeat tract length value of 11.9±1.0 kbp. This is in reasonable agreement with the 11 kbp in the sequence, as are the predicted sizes of the Tsp509I repeat-containing bands (2.9, 3.6 and 4.5 kbp; cf. measured sizes above), and we conclude that the assembly of this repeat region is likely to be accurate. Thus the sequences of all twenty-one of the plasmids are strongly supported either by physical maps that are correctly predicted by the sequence or by independent sequence determinations.
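The expectation that MseI (TTAA) and Tsp509I (AATT) cut 71.8% A+T DNA to an average fragment size of 60-70 bp follows from a simple probability estimate, sketched below in Python. The sketch assumes that bases occur independently, that A and T are equally frequent, and that G and C are equally frequent; the function names `expected_fragment_size` and `cut_sites` are hypothetical.

```python
def expected_fragment_size(site: str, at_fraction: float) -> float:
    """Mean spacing (bp) between occurrences of a recognition site in
    random DNA of the given A+T fraction.

    Assumptions: independent bases, A = T and G = C in frequency.
    """
    p_a_or_t = at_fraction / 2.0
    p_g_or_c = (1.0 - at_fraction) / 2.0
    p_site = 1.0
    for base in site.upper():
        p_site *= p_a_or_t if base in "AT" else p_g_or_c
    return 1.0 / p_site


def cut_sites(sequence: str, site: str) -> list:
    """0-based start positions of every occurrence of site in sequence."""
    sequence, site = sequence.upper(), site.upper()
    positions = []
    index = sequence.find(site)
    while index != -1:
        positions.append(index)
        index = sequence.find(site, index + 1)
    return positions


# Repeat unit types "q" and "b" from the lp21 table (gap characters removed).
unit_q = "ACTGGAGTGTATAAATCATAAAAATAAATATTTTATAAAGAATAAGTAAAAGTGGTTTAGTTT"
unit_b = "TGGAGTGTATAAATCATAAAAATAAATATTTTATAAAGAATTAGTAAAAGTGGTTTAGTTT"

print(round(expected_fragment_size("TTAA", 0.718), 1))  # MseI
print(round(expected_fragment_size("AATT", 0.718), 1))  # Tsp509I
print(cut_sites(unit_q, "AATT"), cut_sites(unit_q, "TTAA"))
print(len(cut_sites(unit_b, "AATT")))
```

Under these assumptions both enzymes give an expected spacing of roughly 60 bp, at the lower end of the 60-70 bp range quoted above. The most common repeat unit type ("q") contains neither site, whereas the rarer type "b" contains an AATT, which is consistent with the small number of Tsp509I sites reported within the tract.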
REFERENCES
Akins, D. R., Caimano, M. J., Yang, X., Cerna, F., Norgard, M. V., and Radolf, J. D. (1999) Molecular and Evolutionary Analysis of Borrelia burgdorferi 297 Circular Plasmid-Encoded Lipoproteins with OspE- and OspF-Like Leader Peptides. Infect Immun 67: 1526-1532.
Akins, D. R., Popova, T., Brusca, J., Goldberg, M. L., Li, M., Baker, S. C., and Norgard, M. V. (1993) Use of PhoA Gene Fusions and Anchored PCR to Identify and Express Borrelia burgdorferi Candidate Outer Membrane Proteins. GENBANK ACCESSION #: L31427.
Akins, D. R., Popova, T., Brusca, J., Goldberg, M. L., Li, M., Baker, S. C., and Norgard, M. V. (1995a) Use of PhoA Gene Fusions and Anchored PCR to Identify and Express Borrelia burgdorferi Candidate Outer Membrane Proteins. GENBANK ACCESSION #: L31423.
Akins, D. R., Porcella, S., Norgard, M. V., and Radolf, J. D. (1994) Borrelia burgdorferi protein p23. GENBANK ACCESSION #: L31616.
Akins, D. R., Porcella, S. F., Popova, T. G., Shevchenko, D., Baker, S. I., Li, M., Norgard, M. V., and Radolf, J. D. (1995b) Evidence for in vivo but not in vitro expression of a Borrelia burgdorferi outer surface protein F (OspF) homologue. Mol Microbiol 18: 507-20.
Altschul, S. F., Madden, T. L., Schaffer, A. A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D. J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25: 3389-402.
Amouriaux, P., Assous, M., Margarita, D., Baranton, G., and Saint Girons, I. (1993) Polymerase chain reaction with the 30-kb circular plasmid of Borrelia burgdorferi B31 as a target for detection of the Lyme borreliosis agents in cerebrospinal fluid. Res Microbiol 144: 211-9.
Balmelli, T., Valsangiacomo, C., Peter, O., and Piffaretti, J.-C. (1996) Borrelia garinii 70 kbp plasmid D6 protein gene. GENBANK ACCESSION #: U50840.
Bancroft, I., and Wolk, C. P. (1989) Characterization of an insertion sequence (IS891) of novel structure from the cyanobacterium Anabaena sp. strain M-131. J Bacteriol 171: 5949-54.
Barbour, A. G., and Carter, C. J. (1997) Putative transposase with translational frameshifting in the Lyme disease agent Borrelia burgdorferi. GENBANK ACCESSION #: U85588.
Barbour, A. G., Carter, C. J., Bundoc, V., and Hinnebusch, J. (1996) The nucleotide sequence of a linear plasmid of Borrelia burgdorferi reveals similarities to those of circular plasmids of other prokaryotes. J Bacteriol 178: 6635-9.
Barbour, A. G., Tessier, S. L., and Hayes, S. F. (1984) Variation in a major surface protein of Lyme disease spirochetes. Infect Immun 45: 94-100.
Barbour, A. G., Tessier, S. L., and Todd, W. J. (1983) Lyme disease spirochetes and ixodid tick spirochetes share a common surface antigenic determinant defined by a monoclonal antibody. Infect Immun 41: 795-804.
Bergstrom, S., Bundoc, V. G., and Barbour, A. G. (1989) Molecular analysis of linear plasmid-encoded major surface proteins, OspA and OspB, of the Lyme disease spirochaete Borrelia burgdorferi. Mol Microbiol 3: 479-86.
Bono, J. L., Tilly, K., Stevenson, B., Hogan, D., and Rosa, P. (1998) Oligopeptide permease in Borrelia burgdorferi: putative peptide-binding components encoded by both chromosomal and plasmid loci. Microbiology 144: 1033-44.
Brandt, M. E., Riley, B. S., Radolf, J. D., and Norgard, M. V. (1990) Immunogenic integral membrane proteins of Borrelia burgdorferi are lipoproteins. Infect Immun 58: 983-91.
Bunikis, J., Olsen, B., Fingerle, V., Bonnedahl, J., Wilske, B., and Bergstrom, S. (1996) Molecular polymorphism of the Lyme disease agent Borrelia garinii in northern Europe is influenced by a novel enzootic Borrelia focus in the North Atlantic. J Clin Microbiol 34: 364-8.
Caporale, D. A., and Kocher, T. D. (1994) Sequence variation in the outer-surface-protein genes of Borrelia burgdorferi. Mol Biol Evol 11: 51-64.
Casjens, S., Delange, M., Ley, H. L., 3rd, Rosa, P., and Huang, W. M. (1995) Linear chromosomes of Lyme disease agent spirochetes: genetic diversity and conservation of gene order. J Bacteriol 177: 2769-80.
Casjens, S., Palmer, N., van Vugt, R., Huang, W. M., Stevenson, B., Rosa, P., Lathigra, R., Sutton, G., Peterson, J., Dodson, R., Haft, D., Hickey, E., Gwinn, M., White, O., and Fraser, C. (2000) A bacterial genome in flux: The twelve linear and nine circular extrachromosomal DNAs of an infectious isolate of the Lyme disease spirochete Borrelia burgdorferi. Molecular Microbiology 35: in press.
Casjens, S., van Vugt, R., Tilly, K., Rosa, P. A., and Stevenson, B. (1997) Homology throughout the multiple 32-kilobase circular plasmids present in Lyme disease spirochetes. J Bacteriol 179: 217-27.
Champion, C. I., Blanco, D. R., Skare, J. T., Haake, D. A., Giladi, M., Foley, D., Miller, J. N., and Lovett, M. A. (1994) A 9.0-kilobase-pair circular plasmid of Borrelia burgdorferi encodes an exported protein: evidence for expression only during infection. Infect Immun 62: 2653-61.
Donadio, S., and Staver, M. J. (1993) IS1136, an insertion element in the erythromycin gene cluster of Saccharopolyspora erythraea. Gene 126: 147-51.
Dunn, J. J., Buchstein, S. R., Butler, L. L., Fisenne, S., Polin, D. S., Lade, B. N., and Luft, B. J. (1994) Complete nucleotide sequence of a circular plasmid from the Lyme disease spirochete, Borrelia burgdorferi. J Bacteriol 176: 2706-17.
Feng, S., Das, S., Barthold, S. W., and Fikrig, E. (1996) Characterization of two genes, p11 and p5, on the Borrelia burgdorferi 49-kilobase linear plasmid. Biochim Biophys Acta 1307: 270-2.
Feng, S., Das, S., Lam, T., Flavell, R. A., and Fikrig, E. (1995) A 55-kilodalton antigen encoded by a gene on a Borrelia burgdorferi 49-kilobase plasmid is recognized by antibodies in sera from patients with Lyme disease. Infect Immun 63: 3459-66.
Fikrig, E., Barthold, S. W., Sun, W., Feng, W., Telford, S. R., 3rd, and Flavell, R. A. (1997) Borrelia burgdorferi P35 and P37 proteins, expressed in vivo, elicit protective immunity. Immunity 6: 531-9.
Fikrig, E., Chen, M., Barthold, S. W., Anguita, J., Feng, W., Telford, S. R., and Flavell, R. A. (1999) Borrelia burgdorferi erpT expression in the arthropod vector and murine host. Molecular Microbiology 31: 281-290.
Fraser, C. M., Casjens, S., Huang, W. M., Sutton, G. G., Clayton, R., Lathigra, R., White, O., Ketchum, K. A., Dodson, R., Hickey, E. K., Gwinn, M., Dougherty, B., Tomb, J. F., Fleischmann, R. D., Richardson, D., Peterson, J., Kerlavage, A. R., Quackenbush, J., Salzberg, S., Hanson, M., van Vugt, R., Palmer, N., Adams, M. D., Gocayne, J., Venter, J. C., et al. (1997) Genomic sequence of a Lyme disease spirochaete, Borrelia burgdorferi. Nature 390: 580-6.
Fuchs, R., Jauris, S., Lottspeich, F., Preac-Mursic, V., Wilske, B., and Soutschek, E. (1992) Molecular analysis and expression of a Borrelia burgdorferi gene encoding a 22 kDa protein (pC) in Escherichia coli. Mol Microbiol 6: 503-9.
Gilmore, R. D., Jr., Kappel, K. J., and Johnson, B. J. (1997) Molecular characterization of a 35-kilodalton protein of Borrelia burgdorferi, an antigen of diagnostic importance in early Lyme disease. J Clin Microbiol 35: 86-91.
Gilmore, R. D., Jr., and Mbow, M. L. (1998) A monoclonal antibody generated by antigen inoculation via tick bite is reactive to the Borrelia burgdorferi Rev protein, a member of the 2.9 gene family locus. Infect Immun 66: 980-6.
Guina, T., and Oliver, D. B. (1997) Cloning and analysis of a Borrelia burgdorferi membrane-interactive protein exhibiting haemolytic activity. Mol Microbiol 24: 1201-13.
Gulig, P. A., Caldwell, A. L., and Chiodo, V. A. (1992) Identification, genetic analysis and DNA sequence of a 7.8-kb virulence region of the Salmonella typhimurium virulence plasmid. Mol Microbiol 6: 1395-411.
Guo, B. P., Brown, E. L., Dorward, D. W., Rosenberg, L. C., and Hook, M. (1998) Decorin-binding adhesins from Borrelia burgdorferi. Molecular Microbiology 30: 711-23.
Hagman, K. E., Lahdenne, P., Popova, T. G., Porcella, S. F., Akins, D. R., Radolf, J. D., and Norgard, M. V. (1998) Decorin-binding protein of Borrelia burgdorferi is encoded within a two-gene operon and is protective in the murine model of Lyme borreliosis. Infect Immun 66: 2674-83.
Hanson, M. S., Cassatt, D. R., Guo, B. P., Patel, N. K., McCarthy, M. P., Dorward, D. W., and Hook, M. (1998) Active and passive immunity against Borrelia burgdorferi decorin binding protein A (DbpA) protects against infection. Infect Immun 66: 2143-53.
Hinnebusch, J., Bergstrom, S., and Barbour, A. G. (1990) Cloning and sequence analysis of linear plasmid telomeres of the bacterium Borrelia burgdorferi. Mol Microbiol 4: 811-20.
Indest, K. J., Ramamoorthy, R., Sole, M., Gilmore, R. D., Johnson, B. J., and Philipp, M. T. (1997) Cell-density-dependent expression of Borrelia burgdorferi lipoproteins in vitro. Infect Immun 65: 1165-71.
Jauris-Heipke, S., Fuchs, R., Motz, M., Preac-Mursic, V., Schwab, E., Soutschek, E., Will, G., and Wilske, B. (1993) Genetic heterogenity of the genes coding for the outer surface protein C (OspC) and the flagellin of Borrelia burgdorferi. Med Microbiol Immunol (Berl) 182: 37-50.
Jauris-Heipke, S., Liegl, G., Preac-Mursic, V., Rossler, D., Schwab, E., Soutschek, E., Will, G., and Wilske, B. (1995) Molecular analysis of genes encoding outer surface protein C (OspC) of Borrelia burgdorferi sensu lato: relationship to ospA genotype and evidence of lateral gene exchange of ospC. J Clin Microbiol 33: 1860-6.
Jonsson, M., Noppa, L., Barbour, A. G., and Bergstrom, S. (1992) Heterogeneity of outer membrane proteins in Borrelia burgdorferi: comparison of osp operons of three isolates of different geographic origins. Infect Immun 60: 1845-53.
Katona, L. I., Beck, G., and Habicht, G. S. (1992) Purification and immunological characterization of a major low-molecular-weight lipoprotein from Borrelia burgdorferi. Infect Immun 60: 4995-5003.
Krause, M., Harwood, J., Fierer, J., and Guiney, D. (1991) Genetic analysis of homology between the virulence plasmids of Salmonella dublin and Yersinia pseudotuberculosis. Infect Immun 59: 1860-3.
Lahdenne, P., Porcella, S. F., Hagman, K. E., Akins, D. R., Popova, T. G., Cox, D. L., Katona, L. I., Radolf, J. D., and Norgard, M. V. (1997) Molecular characterization of a 6.6-kilodalton Borrelia burgdorferi outer membrane-associated lipoprotein (lp6.6) which appears to be downregulated during mammalian infection. Infect Immun 65: 412-21.
Lam, T. T., Nguyen, T. P., Montgomery, R. R., Kantor, F. S., Fikrig, E., and Flavell, R. A. (1994) Outer surface proteins E and F of Borrelia burgdorferi, the agent of Lyme disease. Infect Immun 62: 290-8.
Li, H., Dunn, J. J., Luft, B. J., and Lawson, C. L. (1997) Crystal structure of Lyme disease antigen outer surface protein A complexed with an Fab. Proc Natl Acad Sci U S A 94: 3584-9.
Marconi, R. T., Casjens, S., Munderloh, U. G., and Samuels, D. S. (1996a) Analysis of linear plasmid dimers in Borrelia burgdorferi sensu lato isolates: implications concerning the potential mechanism of linear plasmid replication. J Bacteriol 178: 3357-61.
Marconi, R. T., Konkel, M. E., and Garon, C. F. (1993a) Variability of osp genes and gene products among species of Lyme disease spirochetes. Infect Immun 61: 2611-7.
Marconi, R. T., Samuels, D. S., and Garon, C. F. (1993b) Transcriptional analyses and mapping of the ospC gene in Lyme disease spirochetes. J Bacteriol 175: 926-32.
Marconi, R. T., Samuels, D. S., Landry, R. K., and Garon, C. F. (1994) Analysis of the distribution and molecular heterogeneity of the ospD gene among the Lyme disease spirochetes: evidence for lateral gene exchange. J Bacteriol 176: 4572-82.
Marconi, R. T., Samuels, D. S., Schwan, T. G., and Garon, C. F. (1993c) Identification of a protein in several Borrelia species which is related to OspC of the Lyme disease spirochetes. J Clin Microbiol 31: 2577-83.
Marconi, R. T., Sung, S. Y., Hughes, C. A., and Carlyon, J. A. (1996b) Molecular and evolutionary analyses of a variable series of genes in Borrelia burgdorferi that are related to ospE and ospF, constitute a gene family, and share a common upstream homology box. J Bacteriol 178: 5615-26.
Margolis, N., Hogan, D., Cieplak, W., Jr., Schwan, T. G., and Rosa, P. A. (1994a) Homology between Borrelia burgdorferi OspC and members of the family of Borrelia hermsii variable major proteins. Gene 143: 105-10.
Margolis, N., Hogan, D., Tilly, K., and Rosa, P. A. (1994b) Plasmid location of Borrelia purine biosynthesis gene homologs. J Bacteriol 176: 6427-32.
Masuzawa, T., Komikado, T., and Yanagihara, Y. (1997) PCR-restriction fragment length polymorphism analysis of the ospC gene for detection of mixed culture and for epidemiological typing of Borrelia burgdorferi sensu stricto. Clin Diagn Lab Immunol 4: 60-3.
Mathiesen, D. A., Oliver, J. H., Jr., Kolbert, C. P., Tullson, E. D., Johnson, B. J., Campbell, G. L., Mitchell, P. D., Reed, K. D., Telford, S. R., 3rd, Anderson, J. F., Lane, R. S., and Persing, D. H. (1997) Genetic heterogeneity of Borrelia burgdorferi in the United States. J Infect Dis 175: 98-107.
McGrath, B. C., Dunn, J. J., Buchstein, S. R., and Luft, B. J. (1997) Adaptation of RARE cleavage for mapping genes in Borrelia burgdorferi. GENBANK ACCESSION #: U22451.
McGrath, B. C., Dunn, J. J., Gorgone, G., Guttman, D., Dykhuizen, D., and Luft, B. J. (1995) Identification of an immunologically important hypervariable domain of major outer surface protein A of Borrelia burgdorferi [published erratum appears in Infect Immun 1995 Jun;63(6):2390]. Infect Immun 63: 1356-61.
Murai, N., Kamata, H., Nagashima, Y., Yagisawa, H., and Hirata, H. (1995) A novel insertion sequence (IS)-like element of the thermophilic bacterium PS3 promotes expression of the alanine carrier protein-encoding gene. Gene 163: 103-7.
Nevill-Manning, C. G., Wu, T. D., and Brutlag, D. L. (1998) Highly specific protein sequence motifs for genome analysis. Proc Natl Acad Sci U S A 95: 5865-71.
Norris, S. J., Carter, C. J., Howell, J. K., and Barbour, A. G. (1992) Low-passage-associated proteins of Borrelia burgdorferi B31: characterization and molecular cloning of OspD, a surface-exposed, plasmid-encoded lipoprotein. Infect Immun 60: 4662-72.
Porcella, S. F., Popova, T. G., Akins, D. R., Li, M., Radolf, J. D., and Norgard, M. V. (1996) Borrelia burgdorferi supercoiled plasmids encode multicopy tandem open reading frames and a lipoprotein gene family. J Bacteriol 178: 3293-307.
Probert, W., and Johnson, B. (1998) Identification of a 47 kd fibronectin-binding protein expressed by Borrelia burgdorferi isolate B31. Molecular Microbiology 30: 1003-1015.
Reindl, M., Redl, B., and Stoffler, G. (1993) Isolation and analysis of a linear plasmid-located gene of Borrelia burgdorferi B29 encoding a 27 kDa surface lipoprotein (P27) and its overexpression in Escherichia coli. Mol Microbiol 8: 1115-24.
Rosa, P. A., Schwan, T., and Hogan, D. (1992) Recombination between genes encoding major outer surface proteins A and B of Borrelia burgdorferi. Mol Microbiol 6: 3031-40.
Samuels, D. S., Marconi, R. T., and Garon, C. F. (1993) Variation in the size of the ospA-containing linear plasmid, but not the linear chromosome, among the three Borrelia species associated with Lyme disease. J Gen Microbiol 139: 2445-9.
Schwan, T. G., Piesman, J., Golde, W. T., Dolan, M. C., and Rosa, P. A. (1995) Induction of an outer surface protein on Borrelia burgdorferi during tick feeding. Proc Natl Acad Sci U S A 92: 2909-13.
Skare, J. T., Champion, C. I., Mirzabekov, T. A., Shang, E. S., Blanco, D. R., Erdjument-Bromage, H., Tempst, P., Kagan, B. L., Miller, J. N., and Lovett, M. A. (1996) Porin activity of the native and recombinant outer membrane protein Oms28 of Borrelia burgdorferi. J Bacteriol 178: 4909-18.
Stevenson, B., and Barthold, S. W. (1994) Expression and sequence of outer surface protein C among North American isolates of Borrelia burgdorferi. FEMS Microbiol Lett 124: 367-72.
Stevenson, B., Bockenstedt, L. K., and Barthold, S. W. (1994) Expression and gene sequence of outer surface protein C of Borrelia burgdorferi reisolated from chronically infected mice. Infect Immun 62: 3568-71.
Stevenson, B., Bono, J. L., Schwan, T. G., and Rosa, P. (1998a) Borrelia burgdorferi erp proteins are immunogenic in mammals infected by tick bite, and their synthesis is inducible in cultured bacteria. Infect Immun 66: 2648-54.
Stevenson, B., Casjens, S., and Rosa, P. (1998b) Evidence of past recombination events among the genes encoding the Erp antigens of Borrelia burgdorferi. Microbiology 144: 1869-79.
Stevenson, B., Casjens, S., van Vugt, R., Porcella, S. F., Tilly, K., Bono, J. L., and Rosa, P. (1997) Characterization of cp18, a naturally truncated member of the cp32 family of Borrelia burgdorferi plasmids. J Bacteriol 179: 4285-91.
Stevenson, B., Schwan, T. G., and Rosa, P. A. (1995) Temperature-related differential expression of antigens in the Lyme disease spirochete, Borrelia burgdorferi. Infect Immun 63: 4535-9.
Stevenson, B., Tilly, K., and Rosa, P. A. (1996) A family of genes located on four separate 32-kilobase circular plasmids in Borrelia burgdorferi B31. J Bacteriol 178: 3508-16.
Suk, K., Das, S., Sun, W., Jwang, B., Barthold, S. W., Flavell, R. A., and Fikrig, E. (1995) Borrelia burgdorferi genes selectively expressed in the infected host. Proc Natl Acad Sci U S A 92: 4269-73.
Sutcliffe, I., and Russell, R. (1995) Lipoproteins of Gram-positive bacteria. J Bacteriol 177: 1123-1128.
Theisen, M. (1996) Molecular cloning and characterization of nlpH, encoding a novel, surface-exposed, polymorphic, plasmid-encoded 33-kilodalton lipoprotein of Borrelia afzelii. J Bacteriol 178: 6435-42.
Tilly, K., Casjens, S., Stevenson, B., Bono, J. L., Samuels, D. S., Hogan, D., and Rosa, P. (1997) The Borrelia burgdorferi circular plasmid cp26: conservation of plasmid structure and targeted inactivation of the ospC gene. Mol Microbiol 25: 361-73.
Wallich, R., Brenner, C., Kramer, M. D., and Simon, M. M. (1995) Molecular cloning and immunological characterization of a novel linear-plasmid-encoded gene, pG, of Borrelia burgdorferi expressed only in vivo. Infect Immun 63: 3327-35.
Wallich, R., Helmes, C., Schaible, U. E., Lobet, Y., Moter, S. E., Kramer, M. D., and Simon, M. M. (1992) Evaluation of genetic divergence among Borrelia burgdorferi isolates by use of OspA, fla, HSP60, and HSP70 gene probes. Infect Immun 60: 4856-66.
Wallich, R., Schaible, U. E., Simon, M. M., Heiberger, A., and Kramer, M. D. (1989) Cloning and sequencing of the gene encoding the outer surface protein A (OspA) of a European Borrelia burgdorferi isolate. Nucleic Acids Res 17: 8864.
Wang, I., Dykhuizen, D., Qiu, W., Dunn, J., Bosler, E., and Luft, B. (1999) Genetic diversity of ospC in a local population of Borrelia burgdorferi sensu stricto. Genetics 151: 15-30.
Wang, J., Masuzawa, T., Komikado, T., and Yanagihara, Y. (1997a) Consensus sequence on the genes encoding the major outer surface proteins (OspA and OspB) of Borrelia garinii isolate. Microbiol Immunol 41: 83-91.
Wang, J., Masuzawa, T., Li, M., and Yanagihara, Y. (1997b) Deletion in the genes encoding outer surface proteins OspA and OspB of Borrelia garinii isolated from patients in Japan. Microbiol Immunol 41: 673-9.
Wang, J., Masuzawa, T., Li, M., and Yanagihara, Y. (1997c) An unusual illegitimate recombination occurs in the linear-plasmid-encoded outer-surface protein A gene of Borrelia afzelii. Microbiology 143: 3819-25.
Will, G., Jauris-Heipke, S., Schwab, E., Busch, U., Rossler, D., Soutschek, E., Wilske, B., and Preac-Mursic, V. (1995) Sequence analysis of ospA genes shows homogeneity within Borrelia burgdorferi sensu stricto and Borrelia afzelii strains but reveals major subgroups within the Borrelia garinii species. Med Microbiol Immunol (Berl) 184: 73-80.
Wilske, B., Busch, U., Eiffert, H., Fingerle, V., Pfister, H. W., Rossler, D., and Preac-Mursic, V. (1996a) Diversity of OspA and OspC among cerebrospinal fluid isolates of Borrelia burgdorferi sensu lato from patients with neuroborreliosis in Germany. Med Microbiol Immunol (Berl) 184: 195-201.
Wilske, B., Busch, U., Fingerle, V., Jauris-Heipke, S., Preac-Mursic, V., Rossler, D., and Will, G. (1996b) Immunological and molecular variability of OspA and OspC. Implications for Borrelia vaccine development. Infection 24: 208-12.
Wilske, B., Luft, B., Schubach, W. H., Zumstein, G., Jauris, S., Preac-Mursic, V., and Kramer, M. D. (1992) Molecular analysis of the outer surface protein A (OspA) of Borrelia burgdorferi for conserved and variable antibody binding domains. Med Microbiol Immunol 181: 191-207.
Wilske, B., Preac-Mursic, V., Jauris, S., Hofmann, A., Pradel, I., Soutschek, E., Schwab, E., Will, G., and Wanner, G. (1993) Immunological and molecular polymorphisms of OspC, an immunodominant major outer surface protein of Borrelia burgdorferi. Infect Immun 61: 2182-91.
Zhang, J. R., Hardham, J. M., Barbour, A. G., and Norris, S. J. (1997) Antigenic variation in Lyme disease borreliae by promiscuous recombination of VMP-like sequence cassettes. Cell 89: 275-85.
Zhang, J. R., and Norris, S. J. (1998a) Genetic variation of the Borrelia burgdorferi gene vlsE involves cassette-specific, segmental gene conversion. Infect Immun 66: 3698-704.
Zhang, J. R., and Norris, S. J. (1998b) Kinetics and in vivo induction of genetic variation of vlsE in Borrelia burgdorferi. Infect Immun 66: 3689-97.
Zhou, X., Cahoon, M., Rosa, P., and Hedstrom, L. (1997) Expression, purification, and characterization of inosine 5'-monophosphate dehydrogenase from Borrelia burgdorferi. J Biol Chem 272: 21977-81.
Zuckert, W. R., and Meyer, J. (1996) Circular and linear plasmids of Lyme disease spirochetes have extensive homology: characterization of a repeated DNA element. J Bacteriol 178: 2287-98.
Zuckert, W. R., Meyer, J., and Barbour, A. G. (1999) Comparative analysis and immunological characterization of the Borrelia Bdr protein family. Infect Immun 67: 3257-66.
Zumstein, G., Fuchs, R., Hofmann, A., Preac-Mursic, V., Soutschek, E., and Wilske, B. (1992) Genetic polymorphism of the gene encoding the outer surface protein A (OspA) of Borrelia burgdorferi. Med Microbiol Immunol 181: 57-70.