WO2002040715A2 - Molecules for disease detection and treatment - Google Patents

Molecules for disease detection and treatment

Publication number
WO2002040715A2
Authority
WO
WIPO (PCT)
Prior art keywords
2000sep08
polynucleotide
polypeptide
antibody
seq
Prior art date
Application number
PCT/US2001/027628
Other languages
French (fr)
Other versions
WO2002040715A8 (en)
WO2002040715A3 (en)
Inventor
Stuart Jackson
Stephen E. Lincoln
Christina M. Altus
Gerard E. Dufour
Michael S. Chalup
Jennifer L. Jackson
Anissa Lee Jones
Jimmy Y. Yu
Rachel J. Wright
Darryl Gietzen
Tommy F. Liu
Pierre E. Yap
Christopher R. Dahl
Monika G. Momiyama
Diana L. Bradley
Sameer D. Rohatgi
Bernard Harris
Ann M. Roseberry
Edward H. Gerstin, Jr.
Careyna H. Peralta
Marie H. David
Scott R. Panzer
Vincent Flores
Abel Daffo
Rakesh Marwaha
Alice J. Chen
Simon C. Chang
Alan P. Au
Rebekah R. Inman
Original Assignee
Incyte Genomics, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Incyte Genomics, Inc. filed Critical Incyte Genomics, Inc.
Priority to CA002420983A priority Critical patent/CA2420983A1/en
Priority to US10/363,829 priority patent/US20040142331A1/en
Priority to AU2001287108A priority patent/AU2001287108A1/en
Priority to EP01966607A priority patent/EP1343885A2/en
Publication of WO2002040715A2 publication Critical patent/WO2002040715A2/en
Publication of WO2002040715A8 publication Critical patent/WO2002040715A8/en
Publication of WO2002040715A3 publication Critical patent/WO2002040715A3/en

Classifications

    • C - CHEMISTRY; METALLURGY
    • C12 - BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q - MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 - Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 - Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • A - HUMAN NECESSITIES
    • A01 - AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
    • A01K - ANIMAL HUSBANDRY; CARE OF BIRDS, FISHES, INSECTS; FISHING; REARING OR BREEDING ANIMALS, NOT OTHERWISE PROVIDED FOR; NEW BREEDS OF ANIMALS
    • A01K2217/00 - Genetically modified animals
    • A01K2217/05 - Animals comprising random inserted nucleic acids (transgenic)
    • A - HUMAN NECESSITIES
    • A61 - MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K - PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K38/00 - Medicinal preparations containing peptides
    • A - HUMAN NECESSITIES
    • A61 - MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K - PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K48/00 - Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy
    • C - CHEMISTRY; METALLURGY
    • C12 - BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q - MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 - Oligonucleotides characterized by their use
    • C12Q2600/158 - Expression markers

Definitions

  • the present invention relates to molecules for disease detection and treatment and to the use of these sequences in the diagnosis, study, prevention, and treatment of diseases associated with, as well as effects of exogenous compounds on, the expression of molecules for disease detection and treatment.
  • the human genome comprises thousands of genes, many encoding gene products that function in the maintenance and growth of the various cells and tissues in the body. Aberrant expression of, or mutations in, these genes and their products are the cause of, or are associated with, a variety of human diseases such as cancer and other cell proliferative disorders. The identification of these genes and their products is the basis of an ever-expanding effort to find markers for early detection of diseases, and targets for their prevention and treatment.
  • cancer represents a type of cell proliferative disorder that affects nearly every tissue in the body.
  • a wide variety of molecules, either aberrantly expressed or mutated, can be the cause of, or involved with, various cancers because tissue growth involves complex and ordered patterns of cell proliferation, cell differentiation, and apoptosis.
  • Cell proliferation must be regulated to maintain both the number of cells and their spatial organization. This regulation depends upon the appropriate expression of proteins which control cell cycle progression in response to extracellular signals such as growth factors and other mitogens, and intracellular cues such as DNA damage or nutrient starvation.
  • Molecules which directly or indirectly modulate cell cycle progression fall into several categories, including growth factors and their receptors, second messenger and signal transduction proteins, oncogene products, tumor-suppressor proteins, and mitosis-promoting factors. Aberrant expression or mutations in any of these gene products can result in cell proliferative disorders such as cancer.
  • Oncogenes are genes generally derived from normal genes that, through abnormal expression or mutation, can effect the transformation of a normal cell to a malignant one (oncogenesis).
  • Oncoproteins, encoded by oncogenes, can affect cell proliferation in a variety of ways and include growth factors, growth factor receptors, intracellular signal transducers, nuclear transcription factors, and cell-cycle control proteins.
  • tumor-suppressor genes are involved in inhibiting cell proliferation. Mutations which cause reduced or loss of function in tumor-suppressor genes result in aberrant cell proliferation and cancer. Thus a wide variety of genes and their products have been found that are associated with cell proliferative disorders such as cancer, but many more may exist that are yet to be discovered.
  • DNA-based arrays can provide a simple way to explore the expression of a single polymorphic gene or a large number of genes. When the expression of a single gene is explored, DNA-based arrays are employed to detect the expression of specific gene variants. For example, a p53 tumor suppressor gene array is used to determine whether individuals are carrying mutations that predispose them to cancer. A cytochrome p450 gene array is useful to determine whether individuals have one of a number of specific mutations that could result in increased drug metabolism, drug resistance or drug toxicity.
  • DNA-based array technology is especially relevant for the rapid screening of expression of a large number of genes.
  • a genetic predisposition, disease or therapeutic treatment may affect, directly or indirectly, the expression of a large number of genes.
  • the interactions may be expected, such as when the genes are part of the same signaling pathway.
  • the interactions may be totally unexpected. Therefore, DNA-based arrays can be used to investigate how genetic predisposition, disease, or therapeutic treatment affects the expression of a large number of genes.
  • the present invention relates to human disease detection and treatment molecule polynucleotides (mddt) as presented in the Sequence Listing.
  • the invention provides an isolated polynucleotide selected from the group consisting of a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-252; b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-252; c) a polynucleotide complementary to the polynucleotide of a); d) a polynucleotide complementary to the polynucleotide of b); and e) an RNA equivalent of a) through d).
  • the polynucleotide comprises a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-252.
  • the polynucleotide comprises at least 30 contiguous nucleotides of a polynucleotide selected from the group consisting of a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-252; b) a polynucleotide comprising a naturally occurring polynucleotide comprising a polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-252; c) a polynucleotide complementary to the polynucleotide of a); d) a polynucleotide complementary to the polynucleotide of b); and e) an RNA equivalent of a) through
  • the polynucleotide comprises at least 60 contiguous nucleotides of a polynucleotide selected from the group consisting of a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-252; b) a polynucleotide comprising a naturally occurring polynucleotide comprising a polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-252; c) a polynucleotide complementary to the polynucleotide of a); d) a polynucleotide complementary to the polynucleotide of b); and e) an RNA equivalent of a) through d).
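The relationships among items a) through e) above (a sequence, its complement, its RNA equivalent, and fragments of a minimum contiguous length) can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, the complement is written as a reverse complement read 5' to 3', and "RNA equivalent" is assumed here to mean the same linear sequence with thymine replaced by uracil.

```python
# Illustrative sketch; names and conventions are assumptions, not the patent's.
_DNA_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the complementary strand, written 5'->3'."""
    return dna.translate(_DNA_COMPLEMENT)[::-1]


def rna_equivalent(dna):
    """Assumed meaning: same linear sequence with T replaced by U."""
    return dna.replace("T", "U").replace("t", "u")


def contiguous_fragments(seq, length):
    """Yield every window of `length` contiguous nucleotides in `seq`."""
    for i in range(len(seq) - length + 1):
        yield seq[i:i + length]


if __name__ == "__main__":
    toy = "ATGCCGTA"  # toy sequence, not taken from the Sequence Listing
    print(reverse_complement(toy))
    print(rna_equivalent(toy))
    print(list(contiguous_fragments(toy, 6)))
```

With real sequences, `contiguous_fragments(seq, 30)` or `contiguous_fragments(seq, 60)` would enumerate the candidate fragments of the lengths recited above.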
  • the invention further provides a composition for the detection of expression of disease detection and treatment molecule polynucleotides comprising at least one isolated polynucleotide comprising a polynucleotide selected from the group consisting of a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-252; b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-252; c) a polynucleotide complementary to the polynucleotide of a); d) a polynucleotide complementary to the polynucleotide of b); and e) an RNA equivalent of a) through d); and a detectable label.
  • the invention also provides a method for detecting a target polynucleotide in a sample, said target polynucleotide having a polynucleotide sequence of a polynucleotide selected from the group consisting of a) a polynucleotide comprising a polynucleotide sequence of a polynucleotide selected from the group consisting of SEQ ID NO: 1-252; b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-252; c) a polynucleotide complementary to the polynucleotide of a); d) a polynucleotide complementary to the polynucleotide of b); and e) an RNA equivalent of a) through d).
  • the method comprises a) amplifying said target polynucleotide or fragment thereof using polymerase chain reaction amplification, and b) detecting the presence or absence of said amplified target polynucleotide or fragment thereof, and, optionally, if present, the amount thereof.
  • the invention also provides a method for detecting a target polynucleotide in a sample, said target polynucleotide having a polynucleotide selected from the group consisting of a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-252; b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-252; c) a polynucleotide complementary to the polynucleotide of a); d) a polynucleotide complementary to the polynucleotide of b); and e) an RNA equivalent of a) through d).
  • the method comprises a) hybridizing the sample with a probe comprising at least 20 contiguous nucleotides comprising a sequence complementary to said target polynucleotide in the sample, and which probe specifically hybridizes to said target polynucleotide, under conditions whereby a hybridization complex is formed between said probe and said target polynucleotide, and b) detecting the presence or absence of said hybridization complex, and, optionally, if present, the amount thereof.
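The probe-length requirement above (at least 20 contiguous nucleotides complementary to the target) can be checked at the sequence level with a short sketch. It assumes plain A/C/G/T strings and hypothetical function names, and it tests sequence complementarity only; it says nothing about whether hybridization would actually occur under given conditions.

```python
# Illustrative sketch; function names and the default threshold argument
# are assumptions made for this example.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an A/C/G/T string (case-insensitive input)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def longest_complementary_run(probe, target):
    """Length of the longest contiguous stretch of `probe` whose reverse
    complement occurs in `target`.

    The reverse complement of any substring of the probe is a substring of
    the probe's reverse complement, so this reduces to a longest common
    substring search between reverse_complement(probe) and the target.
    """
    target = target.upper()
    probe_rc = reverse_complement(probe)
    best = 0
    prev = [0] * (len(target) + 1)
    for a in probe_rc:
        curr = [0] * (len(target) + 1)
        for j, b in enumerate(target, start=1):
            if a == b:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def meets_probe_length(probe, target, min_len=20):
    """True if the probe has at least `min_len` contiguous complementary bases."""
    return longest_complementary_run(probe, target) >= min_len
```

Raising `min_len` to 30 or 60 gives the corresponding check for the longer probes recited in the following paragraphs.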
  • the invention provides a composition comprising a target polynucleotide of the method, wherein said probe comprises at least 30 contiguous nucleotides.
  • the invention provides a composition comprising a target polynucleotide of the method, wherein said probe comprises at least 60 contiguous nucleotides.
  • the invention further provides a recombinant polynucleotide comprising a promoter sequence operably linked to an isolated polynucleotide selected from the group consisting of a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-252; b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-252; c) a polynucleotide complementary to the polynucleotide of a); d) a polynucleotide complementary to the polynucleotide of b); and e) an RNA equivalent of a) through d).
  • the invention provides a cell transformed with the recombinant polynucleotide.
  • the invention provides a transgenic organism comprising the recombin
  • the invention also provides a method for producing a disease detection and treatment polypeptide, the method comprising a) culturing a cell under conditions suitable for expression of the disease detection and treatment polypeptide, wherein said cell is transformed with a recombinant polynucleotide, said recombinant polynucleotide comprising an isolated polynucleotide selected from the group consisting of i) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-252; ii) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-252; iii) a polynucleotide complementary to the polynucleotide of i); iv) a polynucleotide complementary to the polynucleotide of ii
  • the invention additionally provides a method wherein the polypeptide has an amino acid sequence selected from the group consisting of SEQ ID NO:253-506.
  • the invention also provides an isolated disease detection and treatment polypeptide (MDDT) encoded by at least one polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-252.
  • the invention further provides a method of screening for a test compound that specifically binds to the polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:253-506.
  • the method comprises a) combining the polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:253-506 with at least one test compound under suitable conditions, and b) detecting binding of the polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:253-506 to the test compound, thereby identifying a compound that specifically binds to the polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:253-506.
  • the invention further provides a microarray wherein at least one element of the microarray is an isolated polynucleotide comprising at least 30 contiguous nucleotides of a polynucleotide selected from the group consisting of a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-252; b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-252; c) a polynucleotide complementary to the polynucleotide of a); d) a polynucleotide complementary to the polynucleotide of b); and e) an RNA equivalent of a) through d).
  • the invention also provides a method for generating a transcript image of a sample which contains polynucleotides.
  • the method comprises a) labeling the polynucleotides of the sample, b) contacting the elements of the microarray with the labeled polynucleotides of the sample under conditions suitable for the formation of a hybridization complex, and c) quantifying the expression of the polynucleotides in the sample.
  • the invention provides a method for screening a compound for effectiveness in altering expression of a target polynucleotide, wherein said target polynucleotide comprises a polynucleotide selected from the group consisting of a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-252; b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-252; c) a polynucleotide complementary to the polynucleotide of a); d) a polynucleotide complementary to the polynucleotide of b); and e) an RNA equivalent of a) through d).
  • the method comprises a) exposing a sample comprising the target polynucleotide to a compound, b) detecting altered expression of the target polynucleotide, and c) comparing the expression of the target polynucleotide in the presence of varying amounts of the compound and in the absence of the compound.
  • the invention further provides a method for assessing toxicity of a test compound, said method comprising a) treating a biological sample containing nucleic acids with the test compound; b) hybridizing the nucleic acids of the treated biological sample with a probe comprising at least 20 contiguous nucleotides of a polynucleotide selected from the group consisting of i) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-252; ii) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-252; iii) a polynucleotide complementary to the polynucleotide of i); iv) a polynucleotide complementary to the polynucleotide of ii); and v) an RNA equivalent of
  • Hybridization occurs under conditions whereby a specific hybridization complex is formed between said probe and a target polynucleotide in the biological sample, said target polynucleotide comprising a polynucleotide sequence of a polynucleotide selected from the group consisting of i) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-252; ii) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-252; iii) a polynucleotide complementary to the polynucleotide of i); iv) a polynucleotide complementary to the polynucleotide of ii); and v) an RNA equivalent of i) through iv), and alternatively, the target polyn
  • the invention further provides an isolated polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:253-506, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:253-506, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:253-506, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:253-506.
  • the invention provides an isolated polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:253-506.
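The "at least 90% identical" threshold used throughout can be made concrete with a minimal sketch. It assumes the two sequences have already been aligned to equal length by some external tool, with "-" marking gaps, and it counts gap columns in the denominator; both the function name and that convention are assumptions for illustration, and the text above does not specify how identity is computed.

```python
# Illustrative sketch; the gap-handling convention is an assumption.
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over two pre-aligned, equal-length sequences.

    Columns where both sequences have a gap are ignored; columns with a
    gap in only one sequence count as mismatches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap and y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a, aligned_b, threshold=90.0):
    """True if the aligned pair reaches the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

The same function applies unchanged to aligned nucleotide sequences, since it only compares characters column by column.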
  • the invention further provides an isolated polynucleotide encoding a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:253-506, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:253-506, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:253-506, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:253-
  • the polynucleotide encodes a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:253-506. In another alternative, the polynucleotide comprises a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-252.
  • the invention provides an isolated antibody which specifically binds to a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:253-506, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:253-506, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:253-506, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:253-506.
  • the invention further provides a composition comprising a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:253-506, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:253-
  • the composition comprises a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:253-506.
  • the invention additionally provides a method of treating a disease or condition associated with decreased expression of functional MDDT, comprising administering to a patient in need of such treatment the composition.
  • the invention also provides a method for screening a compound for effectiveness as an agonist of a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:253-506, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:253-506, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:253-506, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:253-506.
  • the method comprises a) exposing a sample comprising the polypeptide to a compound, and b) detecting agonist activity in the sample.
  • the invention provides a composition comprising an agonist compound identified by the method and a pharmaceutically acceptable excipient.
  • the invention provides a method of treating a disease or condition associated with decreased expression of functional MDDT, comprising administering to a patient in need of such treatment the composition.
  • the invention provides a method for screening a compound for effectiveness as an antagonist of a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:253-506, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO: 253-506, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:253-506, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:253-506.
  • the method comprises a) exposing a sample comprising the polypeptide to a compound, and b) detecting antagonist activity in the sample.
  • the invention provides a composition comprising an antagonist compound identified by the method and a pharmaceutically acceptable excipient.
  • the invention provides a method of treating a disease or condition associated with overexpression of functional MDDT, comprising administering to a patient in need of such treatment the composition.
  • the invention further provides a method of screening for a compound that modulates the activity of a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:253-506, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:253-506, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:253-506, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:253-506.
  • the method comprises a) combining the polypeptide with at least one test compound under conditions permissive for the activity of the polypeptide, b) assessing the activity of the polypeptide in the presence of the test compound, and c) comparing the activity of the polypeptide in the presence of the test compound with the activity of the polypeptide in the absence of the test compound, wherein a change in the activity of the polypeptide in the presence of the test compound is indicative of a compound that modulates the activity of the polypeptide.
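The comparison in step c) above amounts to measuring activity with and without the test compound and asking whether the difference is meaningful. A minimal sketch follows; the function name and the 20% cutoff are hypothetical, since the text only requires "a change in the activity" and gives no numeric threshold.

```python
# Illustrative sketch; the threshold value is an assumption for this example.
def modulates_activity(with_compound, without_compound, min_relative_change=0.2):
    """Return (relative_change, is_modulator).

    relative_change is (with - without) / without, so positive values
    indicate increased activity and negative values decreased activity.
    """
    if without_compound == 0:
        raise ValueError("baseline activity must be non-zero")
    change = (with_compound - without_compound) / without_compound
    return change, abs(change) >= min_relative_change


if __name__ == "__main__":
    # Toy numbers in arbitrary activity units.
    print(modulates_activity(150.0, 100.0))
    print(modulates_activity(105.0, 100.0))
```

In practice the cutoff would be set from replicate measurements and assay noise rather than fixed in advance.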
  • Table 1 shows the sequence identification numbers (SEQ ID NO:s) and template identification numbers (template IDs) corresponding to the polynucleotides of the present invention, along with the sequence identification numbers (SEQ ID NO:s) and open reading frame identification numbers (ORF IDs) corresponding to polypeptides encoded by the template ID.
  • Table 2 shows the sequence identification numbers (SEQ ID NO:s) and template identification numbers (template IDs) corresponding to the polynucleotides of the present invention, along with their GenBank hits (GI Numbers), probability scores, and functional annotations corresponding to the GenBank hits.
  • Table 3 shows the sequence identification numbers (SEQ ID NO:s) and template identification numbers (template IDs) corresponding to the polynucleotides of the present invention, along with polynucleotide segments of each template sequence as defined by the indicated “start” and “stop” nucleotide positions.
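Segments defined by "start" and "stop" nucleotide positions can be pulled out of a template sequence with simple slicing. The sketch below assumes the positions are 1-based and inclusive, which is a common convention for sequence tables but is not stated in the text; the function name is hypothetical.

```python
# Illustrative sketch; 1-based inclusive coordinates are an assumption.
def extract_segment(template, start, stop):
    """Return the segment of `template` from `start` to `stop` inclusive,
    using 1-based nucleotide positions."""
    if start < 1 or stop > len(template) or start > stop:
        raise ValueError("start/stop out of range for template")
    return template[start - 1:stop]


if __name__ == "__main__":
    toy_template = "ATGCCGTA"  # toy sequence, not from the Sequence Listing
    print(extract_segment(toy_template, 2, 5))
```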
  • Table 4 shows the sequence identification numbers (SEQ ID NO:s) and template identification numbers (template IDs) corresponding to the polynucleotides of the present invention, along with polynucleotide segments of each template sequence as defined by the indicated “start” and “stop” nucleotide positions.
  • the reading frames of the polynucleotide segments are shown, and the polypeptides encoded by the polynucleotide segments constitute either signal peptide (SP) or transmembrane (TM) domains, as indicated.
  • the membrane topology of the encoded polypeptide sequence is indicated, the N-terminus (N) listed as being oriented to either the cytosolic (N in) or non-cytosolic (N out) side of the cell membrane or organelle.
  • Table 5 shows the sequence identification numbers (SEQ ID NO:s) and template identification numbers (template IDs) corresponding to the polynucleotides of the present invention, along with component sequence identification numbers (component IDs) corresponding to each template.
  • the component sequences, which were used to assemble the template sequences, are defined by the indicated “start” and “stop” nucleotide positions along each template.
  • Table 6 shows the tissue distribution profiles for the templates of the invention.
  • Table 7 shows the sequence identification numbers (SEQ ID NO:s) corresponding to the polypeptides of the present invention, along with the reading frames used to obtain the polypeptide segments, the lengths of the polypeptide segments, the “start” and “stop” nucleotide positions of the polynucleotide sequences used to define the encoded polypeptide segments, the GenBank hits (GI Numbers), probability scores, and functional annotations corresponding to the GenBank hits.
  • Table 8 summarizes the bioinformatics tools which are useful for analysis of the polynucleotides of the present invention.
  • the first column of Table 8 lists analytical tools, programs, and algorithms, the second column provides brief descriptions thereof, the third column presents appropriate references, all of which are incorporated by reference herein in their entirety, and the fourth column presents, where applicable, the scores, probability values, and other parameters used to evaluate the strength of a match between two sequences (the higher the score, the greater the homology between two sequences).
  • mddt refers to a nucleic acid sequence
  • MDDT refers to an amino acid sequence encoded by mddt
  • a “full-length” mddt refers to a nucleic acid sequence containing the entire coding region of a gene endogenously expressed in human tissue.
  • adjuvants are materials such as Freund's adjuvant, mineral gels (aluminum hydroxide), and surface active substances (lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, keyhole limpet hemocyanin, and dinitrophenol) which may be administered to increase a host's immunological response.
  • Alleles refers to an alternative form of a nucleic acid sequence. Alleles result from a “mutation,” a change or an alternative reading of the genetic code. Any given gene may have none, one, or many allelic forms. Mutations which give rise to alleles include deletions, additions, or substitutions of nucleotides. Each of these changes may occur alone, or in combination with the others, one or more times in a given nucleic acid sequence.
  • the present invention encompasses allelic mddt.
  • Amino acid sequence refers to a peptide, a polypeptide, or a protein of either natural or synthetic origin. The amino acid sequence is not limited to the complete, endogenous amino acid sequence and may be a fragment, epitope, variant, or derivative of a protein expressed by a nucleic acid sequence.
  • Amplification refers to the production of additional copies of a sequence and is carried out using polymerase chain reaction (PCR) technologies well known in the art.
  • Antibody refers to intact molecules as well as to fragments thereof, such as Fab, F(ab')2, and Fv fragments, which are capable of binding the epitopic determinant.
  • Antibodies that bind MDDT polypeptides can be prepared using intact polypeptides or using fragments containing small peptides of interest as the immunizing antigen.
  • the polypeptide or peptide used to immunize an animal (e.g., a mouse, a rat, or a rabbit) can be derived from the translation of RNA, or synthesized chemically, and can be conjugated to a carrier protein if desired.
  • Commonly used carriers that are chemically coupled to peptides include bovine serum albumin, thyroglobulin, and keyhole limpet hemocyanin (KLH). The coupled peptide is then used to immunize the animal.
  • Antisense sequence refers to a sequence capable of specifically hybridizing to a target sequence.
  • the antisense sequence may include DNA, RNA, or any nucleic acid mimic or analog such as peptide nucleic acid (PNA); oligonucleotides having modified backbone linkages such as phosphorothioates, methylphosphonates, or benzylphosphonates; oligonucleotides having modified sugar groups such as 2'-methoxyethyl sugars or 2'-methoxyethoxy sugars; or oligonucleotides having modified bases such as 5-methyl cytosine, 2'-deoxyuracil, or 7-deaza-2'-deoxyguanosine.
  • Antisense technology refers to any technology which relies on the specific hybridization of an antisense sequence to a target sequence.
  • a “bin” is a portion of computer memory space used by a computer program for storage of data, and bounded in such a manner that data stored in a bin may be retrieved by the program.
  • Biologically active refers to an amino acid sequence having a structural, regulatory, or biochemical function of a naturally occurring amino acid sequence.
  • “Clone joining” is a process for combining gene bins based upon the bins' containing sequence information from the same clone.
  • the sequences may assemble into a primary gene transcript as well as one or more splice variants.
  • “Complementary” describes the relationship between two single-stranded nucleic acid sequences that anneal by base-pairing (5'-A-G-T-3' pairs with its complement 3'-T-C-A-5').
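The base-pairing relationship in the example above (5'-A-G-T-3' pairing with 3'-T-C-A-5') can be sketched in Python. This is a minimal illustration only: the function names `complement_3_to_5` and `are_complementary` are hypothetical, and the sketch assumes unmodified DNA bases and pairing along the full length of both strands.

```python
# Watson-Crick pairing for DNA; an illustrative table, not part of the specification.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement_3_to_5(strand_5_to_3: str) -> str:
    """Return the complementary strand written 3'->5', so it lines up under the input."""
    return "".join(PAIRS[base] for base in strand_5_to_3.upper())


def are_complementary(strand_a: str, strand_b: str) -> bool:
    """True if strand_b anneals to strand_a along its full length.

    Both arguments are written 5'->3', so the complement of strand_a
    is reversed before it is compared with strand_b.
    """
    return complement_3_to_5(strand_a)[::-1] == strand_b.upper()


print(complement_3_to_5("AGT"))          # the 3'->5' complement of 5'-AGT-3'
print(are_complementary("AGT", "ACT"))   # 5'-ACT-3' is 3'-TCA-5' read in reverse
```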
  • a “component sequence” is a nucleic acid sequence selected by a computer program such as PHRED and used to assemble a consensus or template sequence from one or more component sequences.
  • a “consensus sequence” or “template sequence” is a nucleic acid sequence which has been assembled from overlapping sequences, using a computer program for fragment assembly such as the GELVIEW fragment assembly system (Genetics Computer Group (GCG), Madison WI) or using a relational database management system (RDMS).
  • “Conservative amino acid substitutions” are those substitutions that, when made, least interfere with the properties of the original protein, i.e., the structure and especially the function of the protein is conserved and not significantly changed by such substitutions.
  • the table below shows amino acids which may be substituted for an original amino acid in
  • Conservative substitutions generally maintain (a) the structure of the polypeptide backbone in the area of the substitution, for example, as a beta sheet or alpha helical conformation, (b) the charge or hydrophobicity of the molecule at the target site, or (c) the bulk of the side chain.
  • “Deletion” refers to a change in either a nucleic or amino acid sequence in which at least one nucleotide or amino acid residue, respectively, is absent.
  • Derivative refers to the chemical modification of a nucleic acid sequence, such as by replacement of hydrogen by an alkyl, acyl, amino, hydroxyl, or other group.
  • “Differential expression” refers to increased or upregulated; or decreased, downregulated, or absent gene or protein expression, determined by comparing at least two different samples. Such comparisons may be carried out between, for example, a treated and an untreated sample, or a diseased and a normal sample.
  • array element refers to a polynucleotide, polypeptide, or other chemical compound having a unique and defined position on a microarray.
  • E-value refers to the statistical probability that a match between two sequences occurred by chance.
  • Exon shuffling refers to the recombination of different coding regions (exons). Since an exon may represent a structural or functional domain of the encoded protein, new proteins may be assembled through the novel reassortment of stable substructures, thus allowing acceleration of the evolution of new protein functions.
  • a “fragment” is a unique portion of mddt or MDDT which is identical in sequence to but shorter in length than the parent sequence. A fragment may comprise up to the entire length of the defined sequence, minus one nucleotide/amino acid residue. For example, a fragment may comprise from 10 to 1000 contiguous amino acid residues or nucleotides.
  • a fragment used as a probe, primer, antigen, therapeutic molecule, or for other purposes, may be at least 5, 10, 15, 16, 20, 25, 30, 40, 50, 60,
  • a polypeptide fragment may comprise a certain length of contiguous amino acids selected from the first 250 or 500 amino acids (or first 25% or 50%) of a polypeptide as shown in a certain defined sequence.
  • these lengths are exemplary, and any length that is supported by the specification, including the
  • a fragment of mddt comprises a region of unique polynucleotide sequence that specifically identifies mddt, for example, as distinct from any other sequence in the same genome.
  • a fragment of mddt is useful, for example, in hybridization and amplification technologies and in analogous methods that distinguish mddt from related polynucleotide sequences.
  • the precise length of a fragment of mddt and the region of mddt to which the fragment corresponds are routinely determinable by one of ordinary skill in the art based on the intended purpose for the fragment.
  • a fragment of MDDT is encoded by a fragment of mddt.
  • a fragment of MDDT comprises a region of unique amino acid sequence that specifically identifies MDDT.
  • a fragment of MDDT is useful as an immunogenic peptide for the development of antibodies that specifically recognize MDDT.
  • the precise length of a fragment of MDDT and the region of MDDT to which the fragment corresponds are routinely determinable by one of ordinary skill in the art based on the intended purpose for the fragment.
  • a “full length” nucleotide sequence is one containing at least a start site for translation to a protein sequence, followed by an open reading frame and a stop site, and encoding a “full length” polypeptide.
  • “Hit” refers to a sequence whose annotation will be used to describe a given template. Criteria for selecting the top hit are as follows: if the template has one or more exact nucleic acid matches, the top hit is the exact match with highest percent identity. If the template has no exact matches but has significant protein hits, the top hit is the protein hit with the lowest E-value. If the template has no significant protein hits, but does have significant non-exact nucleotide hits, the top hit is the nucleotide hit with the lowest E-value.
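The three-tier rule for choosing the top hit can be expressed as a short selection routine. The following Python sketch is illustrative only: the `Hit` record, its field names, and the `exact` and `significant` flags are assumptions introduced here. The text does not state the thresholds that make a match exact or significant, so they are treated as precomputed inputs.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Hit:
    """Hypothetical record for one database match against a template."""
    gi_number: str           # identifier of the matched database sequence
    kind: str                # "nucleotide" or "protein"
    exact: bool              # flagged upstream as an exact nucleic acid match
    significant: bool        # flagged upstream as a significant hit
    percent_identity: float
    e_value: float


def select_top_hit(hits: Sequence[Hit]) -> Optional[Hit]:
    """Apply the three selection criteria in order of priority."""
    # 1. Exact nucleic acid matches: take the highest percent identity.
    exact = [h for h in hits if h.kind == "nucleotide" and h.exact]
    if exact:
        return max(exact, key=lambda h: h.percent_identity)
    # 2. Otherwise, significant protein hits: take the lowest E-value.
    proteins = [h for h in hits if h.kind == "protein" and h.significant]
    if proteins:
        return min(proteins, key=lambda h: h.e_value)
    # 3. Otherwise, significant non-exact nucleotide hits: lowest E-value.
    nucleotides = [
        h for h in hits
        if h.kind == "nucleotide" and h.significant and not h.exact
    ]
    if nucleotides:
        return min(nucleotides, key=lambda h: h.e_value)
    return None  # no hit qualifies for annotation
```

Returning `None` when nothing qualifies is a design choice of this sketch; the text does not say how a template without any significant hit is annotated.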
  • Hybridization refers to the process by which a strand of nucleotides anneals with a complementary strand through base pairing. Specific hybridization is an indication that two nucleic acid sequences share a high degree of identity. Specific hybridization complexes form under defined annealing conditions, and remain hybridized after the "washing" step.
  • the defined hybridization conditions include the annealing conditions and the washing step(s), the latter of which is particularly important in determining the stringency of the hybridization process, with more stringent conditions allowing less non-specific binding, i.e., binding between pairs of nucleic acid probes that are not perfectly matched.
  • Permissive conditions for annealing of nucleic acid sequences are routinely determinable and may be consistent among hybridization experiments, whereas wash conditions may be varied among experiments to achieve the desired stringency.
  • Tm refers to the thermal melting point.
  • High stringency conditions for hybridization between polynucleotides of the present invention include wash conditions of 68°C in the presence of about 0.2 x SSC and about 0.1% SDS, for 1 hour.
  • temperatures of about 65°C, 60°C, or 55°C may be used.
  • SSC concentration may be varied from about 0.2 to 2 x SSC, with SDS being present at about 0.1 %.
  • blocking reagents are used to block non-specific hybridization.
  • blocking reagents include, for instance, denatured salmon sperm DNA at about 100-200 μg/ml. Useful variations on these conditions will be readily apparent to those skilled in the art.
  • Hybridization, particularly under high stringency conditions, may be suggestive of evolutionary similarity between the nucleotides. Such similarity is strongly indicative of a similar role for the nucleotides and their resultant proteins.
  • Other conditions may also be used under particular circumstances, such as for RNA:DNA hybridizations. Appropriate hybridization conditions are routinely determinable by one of ordinary skill in the art.
  • Immunologically active or “immunogenic” describes the potential for a natural, recombinant, or synthetic peptide, epitope, polypeptide, or protein to induce antibody production in appropriate animals, cells, or cell lines.
  • “Insertion” or “addition” refers to a change in either a nucleic or amino acid sequence in which at least one nucleotide or residue, respectively, is added to the sequence.
  • Labeling refers to the covalent or noncovalent joining of a polynucleotide, polypeptide, or antibody with a reporter molecule capable of producing a detectable or measurable signal.
  • “Microarray” is any arrangement of nucleic acids, amino acids, antibodies, etc., on a substrate.
  • the substrate may be a solid support such as beads, glass, paper, nitrocellulose, nylon, or an appropriate membrane.
  • Linkers are short stretches of nucleotide sequence which may be added to a vector or an mddt to create restriction endonuclease sites to facilitate cloning.
  • Polylinkers are engineered to incorporate multiple restriction enzyme sites and to provide for the use of enzymes which leave 5' or 3' overhangs (e.g., BamHI, EcoRI, and HindIII) and those which provide blunt ends (e.g., EcoRV, SnaBI, and StuI).
  • Nucleic acid sequence refers to the specific order of nucleotides joined by phosphodiester bonds in a linear, polymeric arrangement. Depending on the number of nucleotides, the nucleic acid sequence can be considered an oligomer, oligonucleotide, or polynucleotide.
  • the nucleic acid can be DNA, RNA, or any nucleic acid analog, such as PNA, may be of genomic or synthetic origin, may be either double-stranded or single-stranded, and can represent either the sense or antisense (complementary) strand.
  • Oligomers refers to a nucleic acid sequence of at least about 6 nucleotides and as many as about 60 nucleotides, preferably about 15 to 40 nucleotides, and most preferably between about 20 and 30 nucleotides, that may be used in hybridization or amplification technologies. Oligomers may be used as, e.g., primers for PCR, and are usually chemically synthesized.
  • “Operably linked” refers to the situation in which a first nucleic acid sequence is placed in a functional relationship with the second nucleic acid sequence. For instance, a promoter is operably linked to a coding sequence if the promoter affects the transcription or expression of the coding sequence.
  • PNA peptide nucleic acid
  • PNA refers to a DNA mimic in which nucleotide bases are attached to a pseudopeptide backbone to increase stability.
  • PNAs, also designated antigene agents, can prevent gene expression by targeting complementary messenger RNA.
  • percent identity and % identity refer to the percentage of residue matches between at least two polynucleotide sequences aligned using a standardized algorithm. Such an algorithm may insert, in a standardized and reproducible way, gaps in the sequences being compared in order to optimize alignment between two sequences, and therefore achieve a more meaningful comparison of the two sequences.
  • NCBI National Center for Biotechnology Information
  • BLAST Basic Local Alignment Search Tool
  • the BLAST software suite includes various sequence analysis programs including “blastn,” which is used to determine alignment between a known polynucleotide sequence and other sequences on a variety of databases.
  • “BLAST 2 Sequences” is a tool that is used for direct pairwise comparison of two nucleotide sequences.
  • “BLAST 2 Sequences” can be accessed and used interactively at http://www.ncbi.nlm.nih.gov/gorf/bl2/.
  • the “BLAST 2 Sequences” tool can be used for both blastn and blastp (discussed below). BLAST programs are commonly used with gap and other parameters set to default settings. For example, to compare two nucleotide sequences, one may use blastn with the “BLAST 2 Sequences” tool Version 2.0.9 (May-07-1999) set at default parameters. Such default parameters may be, for example: Matrix: BLOSUM62
  • Percent identity may be measured over the length of an entire defined sequence, for example, as defined by a particular SEQ ID number, or may be measured over a shorter length, for example, over the length of a fragment taken from a larger, defined sequence, for instance, a fragment of at least 20, at least 30, at least 40, at least 50, at least 70, at least 100, or at least 200 contiguous nucleotides.
  • Such lengths are exemplary only, and it is understood that any fragment length supported by the sequences shown herein, in figures or Sequence Listings, may be used to describe a length over which percentage identity may be measured.
  • Nucleic acid sequences that do not show a high degree of identity may nevertheless encode similar amino acid sequences due to the degeneracy of the genetic code. It is understood that changes in nucleic acid sequence can be made using this degeneracy to produce multiple nucleic acid sequences that all encode substantially the same protein.
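As an illustration of the percent identity measure described above, a minimal Python sketch follows. It assumes the two sequences have already been aligned (for example, by blastn) and that inserted gaps are written as "-". The function name `percent_identity` and the use of the alignment length as the denominator are assumptions made for this illustration; they are not parameters taken from the BLAST 2 Sequences tool.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of alignment columns in which both sequences carry the same residue.

    The inputs must be the two rows of a pairwise alignment, so they have equal
    length. A column containing a gap never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != gap
    )
    return 100.0 * matches / len(aligned_a)


# Five matching columns out of six; the gapped column is not a match.
print(round(percent_identity("ACGT-A", "ACGTTA"), 1))
```

To measure identity over a shorter length, such as a fragment of at least 20 contiguous nucleotides, the same function can be applied to the corresponding slice of the alignment.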
  • percent identity and % identity refer to the percentage of residue matches between at least two polypeptide sequences aligned using a standardized algorithm. Methods of polypeptide sequence alignment are well-known. Some alignment methods take into account conservative amino acid substitutions. Such conservative substitutions, explained in more detail above, generally preserve the hydrophobicity and acidity of the substituted residue, thus preserving the structure (and therefore function) of the folded polypeptide. Percent identity between polypeptide sequences may be determined using the default parameters of the CLUSTAL V algorithm as incorporated into the MEGALIGN version 3.12e sequence alignment program (described and referenced above).
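  • Percent identity is reported by CLUSTAL V as the "percent similarity" between aligned polypeptide sequence pairs.
  • Alternatively, the NCBI BLAST software suite may be used.
  • For example, for a pairwise comparison of two polypeptide sequences, one may use the “BLAST 2 Sequences” tool Version 2.0.9 (May-07-1999) with blastp set at default parameters.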
  • Such default parameters may be, for example: Matrix: BLOSUM62
  • Percent identity may be measured over the length of an entire defined polypeptide sequence, for example, as defined by a particular SEQ ID number, or may be measured over a shorter length, for example, over the length of a fragment taken from a larger, defined polypeptide sequence, for instance, a fragment of at least 15, at least 20, at least 30, at least 40, at least 50, at least 70 or at least 150 contiguous residues.
  • Such lengths are exemplary only, and it is understood that any fragment length supported by the sequences shown herein, in figures or Sequence Listings, may be used to describe a length over which percentage identity may be measured.
  • Post-translational modification of an MDDT may involve lipidation, glycosylation, phosphorylation, acetylation, racemization, proteolytic cleavage, and other modifications known in the art. These processes may occur synthetically or biochemically. Biochemical modifications will vary by cell type depending on the enzymatic milieu and the MDDT.
  • Probe refers to mddt or fragments thereof, which are used to detect identical, allelic or related nucleic acid sequences.
  • Probes are isolated oligonucleotides or polynucleotides attached to a detectable label or reporter molecule. Typical labels include radioactive isotopes, ligands, chemiluminescent agents, and enzymes.
  • Primers are short nucleic acids, usually DNA oligonucleotides, which may be annealed to a target polynucleotide by complementary base-pairing. The primer may then be extended along the target DNA strand by a DNA polymerase enzyme.
  • Probes and primers as used in the present invention typically comprise at least 15 contiguous nucleotides of a known sequence. In order to enhance specificity, longer probes and primers may also be employed, such as probes and primers that comprise at least 20, 30, 40, 50, 60, 70, 80, 90, 100, or at least 150 consecutive nucleotides of the disclosed nucleic acid sequences. Probes and primers may be considerably longer than these examples, and it is understood that any length supported by the specification, including the figures and Sequence Listing, may be used.
  • PCR primer pairs can be derived from a known sequence, for example, by using computer programs intended for that purpose such as Primer (Version 0.5, 1991, Whitehead Institute for Biomedical Research, Cambridge MA).
  • Oligonucleotides for use as primers are selected using software known in the art for such purpose. For example, OLIGO 4.06 software is useful for the selection of PCR primer pairs of up to 100 nucleotides each, and for the analysis of oligonucleotides and larger polynucleotides of up to 5,000 nucleotides from an input polynucleotide sequence of up to 32 kilobases. Similar primer selection programs have incorporated additional features for expanded capabilities. For example, the PrimOU primer selection program (available to the public from the Genome Center at University of Texas South West Medical Center, Dallas TX) is capable of choosing specific primers from megabase sequences and is thus useful for designing primers on a genome-wide scope.
  • The Primer3 primer selection program (available to the public from the Whitehead Institute/MIT Center for Genome Research, Cambridge MA) allows the user to input a "mispriming library," in which sequences to avoid as primer binding sites are user-specified. Primer3 is useful, in particular, for the selection of oligonucleotides for microarrays. (The source code for the latter two primer selection programs may also be obtained from their respective sources and modified to meet the user's specific needs.)
  • The PrimeGen program (available to the public from the UK Human Genome Mapping Project Resource Centre, Cambridge UK) designs primers based on multiple sequence alignments, thereby allowing selection of primers that hybridize to either the most conserved or least conserved regions of aligned nucleic acid sequences. Hence, this program is useful for identification of both unique and conserved oligonucleotides and polynucleotide fragments.
  • the oligonucleotides and polynucleotide fragments identified by any of the above selection methods are useful in hybridization technologies, for example, as PCR or sequencing primers, microarray elements, or specific probes to identify fully or partially complementary polynucleotides in a sample of nucleic acids.
  • Methods of oligonucleotide selection are not limited to those described above.
  • “Purified” refers to molecules, either polynucleotides or polypeptides, that are isolated or separated from their natural environment and are at least 60% free, preferably at least 75% free, and most preferably at least 90% free from other compounds with which they are naturally associated.
  • a “recombinant nucleic acid” is a sequence that is not naturally occurring or has a sequence that is made by an artificial combination of two or more otherwise separated segments of sequence. This artificial combination is often accomplished by chemical synthesis or, more commonly, by the artificial manipulation of isolated segments of nucleic acids, e.g., by genetic engineering techniques such as those described in Sambrook, supra.
  • the term recombinant includes nucleic acids that have been altered solely by addition, substitution, or deletion of a portion of the nucleic acid.
  • a recombinant nucleic acid may include a nucleic acid sequence operably linked to a promoter sequence. Such a recombinant nucleic acid may be part of a vector that is used, for example, to transform a cell.
  • such recombinant nucleic acids may be part of a viral vector, e.g., based on a vaccinia virus, that could be used to vaccinate a mammal wherein the recombinant nucleic acid is expressed, inducing a protective immunological response in the mammal.
  • regulatory element refers to a nucleic acid sequence from nontranslated regions of a gene, and includes enhancers, promoters, introns, and 3' untranslated regions, which interact with host proteins to carry out or regulate transcription or translation.
  • Reporter molecules are chemical or biochemical moieties used for labeling a nucleic acid, an amino acid, or an antibody. They include radionuclides; enzymes; fluorescent, chemiluminescent, or chromogenic agents; substrates; cofactors; inhibitors; magnetic particles; and other moieties known in the art.
  • An “RNA equivalent,” in reference to a DNA sequence, is composed of the same linear sequence of nucleotides as the reference DNA sequence with the exception that all occurrences of the nitrogenous base thymine are replaced with uracil, and the sugar backbone is composed of ribose instead of deoxyribose.
  • “Sample” is used in its broadest sense.
  • Samples may contain nucleic or amino acids, antibodies, or other materials, and may be derived from any source (e.g., bodily fluids including, but not limited to, saliva, blood, and urine; chromosome(s), organelles, or membranes isolated from a cell; genomic DNA, RNA, or cDNA in solution or bound to a substrate; and cleared cells or tissues or blots or imprints from such cells or tissues).
  • Specific binding or “specifically binding” refers to the interaction between a protein or peptide and its agonist, antibody, antagonist, or other binding partner. The interaction is dependent upon the presence of a particular structure of the protein, e.g., the antigenic
  • For example, if an antibody is specific for epitope “A,” the presence of a polypeptide containing epitope A, or the presence of free unlabeled A, in a reaction containing free labeled A and the antibody will reduce the amount of labeled A that binds to the antibody.
  • substitution refers to the replacement of at least one nucleotide or amino acid by a different nucleotide or amino acid.
  • Substrate refers to any suitable rigid or semi-rigid support including, e.g., membranes, filters, chips, slides, wafers, fibers, magnetic or nonmagnetic beads, gels, tubing, plates, polymers, microparticles or capillaries.
  • the substrate can have a variety of surface forms, such as wells, trenches, pins, channels and pores, to which polynucleotides or polypeptides are bound.
  • a “transcript image” refers to the collective pattern of gene expression by a particular tissue or cell type under given conditions at a given time.
  • Transformation refers to a process by which exogenous DNA enters a recipient cell.
  • Transformation may occur under natural or artificial conditions using various methods well known in the art. Transformation may rely on any known method for the insertion of foreign nucleic acid sequences into a prokaryotic or eukaryotic host cell. The method is selected based on the host cell being transformed. "Transformants" include stably transformed cells in which the inserted DNA is capable of replication either as an autonomously replicating plasmid or as part of the host chromosome, as well as cells which transiently express inserted DNA or RNA.
  • a “transgenic organism,” as used herein, is any organism, including but not limited to animals and plants, in which one or more of the cells of the organism contains heterologous nucleic acid introduced by way of human intervention, such as by transgenic techniques well known in the art.
  • the nucleic acid is introduced into the cell, directly or indirectly by introduction into a precursor of the cell, by way of deliberate genetic manipulation, such as by microinjection or by infection with a recombinant virus.
  • the term genetic manipulation does not include classical cross-breeding, or in vitro fertilization, but rather is directed to the introduction of a recombinant DNA molecule.
  • the transgenic organisms contemplated in accordance with the present invention include bacteria, cyanobacteria, fungi, and plants and animals.
  • the isolated DNA of the present invention can be introduced into the host by methods known in the art, for example infection, transfection, transformation or transconjugation. Techniques for transferring the DNA of the present invention into such organisms are widely known and provided in references such as Sambrook et al. (1989), supra.
  • a "variant" of a particular nucleic acid sequence is defined as a nucleic acid sequence having at least 25% sequence identity to the particular nucleic acid sequence over a certain length of one of the nucleic acid sequences using blastn with the "BLAST 2 Sequences" tool Version 2.0.9 (May-07-1999) set at default parameters.
  • Such a pair of nucleic acids may show, for example, at least 30%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% or greater sequence identity over a certain defined length.
  • the variant may result in "conservative" amino acid changes which do not affect structural and/or chemical properties.
  • a variant may be described as, for example, an “allelic” (as defined above), “splice,” “species,” or “polymorphic” variant.
  • a splice variant may have significant identity to a reference molecule, but will generally have a greater or lesser number of nucleotides due to alternate splicing of exons during mRNA processing.
  • the corresponding polypeptide may possess additional functional domains or lack domains that are present in the reference molecule.
  • Species variants are polynucleotide sequences that vary from one species to another. The resulting polypeptides generally will have significant amino acid identity relative to each other.
  • a polymorphic variant is a variation in the polynucleotide sequence of a particular gene between individuals of a given species.
  • Polymorphic variants also may encompass "single nucleotide polymorphisms" (SNPs) in which the polynucleotide sequence varies by one base.
  • the presence of SNPs may be indicative of, for example, a certain population, a disease state, or a propensity for a disease state.
  • variants of the polynucleotides of the present invention may be generated through recombinant methods.
  • One possible method is a DNA shuffling technique such as MOLECULARBREEDING (Maxygen Inc., Santa Clara CA; described in U.S. Patent Number 5,837,458; Chang, C.-C. et al. (1999) Nat. Biotechnol. 17:793-797; Christians, F.C. et al. (1999) Nat. Biotechnol. 17:259-264; and Crameri, A. et al. (1996) Nat. Biotechnol. 14:315-319) to alter or improve the biological properties of MDDT, such as its biological or enzymatic activity or its ability to bind to other molecules or compounds.
  • DNA shuffling is a process by which a library of gene variants is produced using PCR-mediated recombination of gene fragments.
  • the library is then subjected to selection or screening procedures that identify those gene variants with the desired properties. These preferred variants may then be pooled and further subjected to recursive rounds of DNA shuffling and selection/screening.
  • genetic diversity is created through "artificial" breeding and rapid molecular evolution. For example, fragments of a single gene containing random point mutations may be recombined, screened, and then reshuffled until the desired properties are optimized. Alternatively, fragments of a given gene may be recombined with fragments of homologous genes in the same gene family, either from the same or different species, thereby maximizing the genetic diversity of multiple naturally occurring genes in a directed and controllable manner.
  • a "variant" of a particular polypeptide sequence is defined as a polypeptide sequence having at least 40% sequence identity to the particular polypeptide sequence over a certain length of one of the polypeptide sequences using blastp with the "BLAST 2 Sequences" tool Version 2.0.9 (May-07-1999) set at default parameters.
  • Such a pair of polypeptides may show, for example, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% or greater sequence identity over a certain defined length of one of the polypeptides.
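The two "variant" definitions reduce to a threshold test on percent identity. The Python sketch below is a minimal illustration: the thresholds of 25% (nucleic acid) and 40% (polypeptide) are the minimum values stated in the definitions above, while the names `MIN_VARIANT_IDENTITY` and `is_variant` are hypothetical. The percent identity is assumed to have been obtained beforehand with blastn or blastp over the chosen comparison length.

```python
# Minimum percent identity for a sequence to count as a "variant",
# taken from the definitions above; the dictionary keys are illustrative labels.
MIN_VARIANT_IDENTITY = {
    "nucleic_acid": 25.0,
    "polypeptide": 40.0,
}


def is_variant(percent_identity: float, molecule: str) -> bool:
    """True if the identity meets the stated minimum for the given molecule type."""
    if not 0.0 <= percent_identity <= 100.0:
        raise ValueError("percent identity must lie between 0 and 100")
    return percent_identity >= MIN_VARIANT_IDENTITY[molecule]


print(is_variant(32.0, "nucleic_acid"))  # meets the 25% minimum
print(is_variant(32.0, "polypeptide"))   # falls short of the 40% minimum
```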
  • cDNA sequences derived from human tissues and cell lines were aligned based on nucleotide sequence identity and assembled into “consensus” or “template” sequences which are designated by the template identification numbers (template IDs) in column 2 of Table 2.
  • the sequence identification numbers (SEQ ID NO:s) corresponding to the template IDs are shown in column 1.
  • the template sequences have similarity to GenBank sequences, or “hits,” as designated by the GI Numbers in column 3.
  • the statistical probability of each GenBank hit is indicated by a probability score in column 4, and the functional annotation corresponding to each GenBank hit is listed in column 5.
  • the invention incorporates the nucleic acid sequences of these templates as disclosed in the Sequence Listing and the use of these sequences in the diagnosis and treatment of disease states characterized by defects in disease detection and treatment molecules.
  • the invention further utilizes these sequences in hybridization and amplification technologies, and in particular, in technologies which assess gene expression patterns correlated with specific cells or tissues and their responses in vivo or in vitro to pharmaceutical agents, toxins, and other treatments. In this manner, the sequences of the present invention are used to develop a transcript image for a particular cell or tissue.
  • cDNA was isolated from libraries constructed using RNA derived from normal and diseased human tissues and cell lines.
  • the human tissues and cell lines used for cDNA library construction were selected from a broad range of sources to provide a diverse population of cDNAs representative of gene transcription throughout the human body. Descriptions of the human tissues and cell lines used for cDNA library construction are provided in the LIFESEQ database (Incyte Genomics, Inc.
  • Human tissues were broadly selected from, for example, cardiovascular, dermatologic, endocrine, gastrointestinal, hematopoietic/immune system, musculoskeletal, neural, reproductive, and urologic sources.
  • Cell lines used for cDNA library construction were derived from, for example, leukemic cells, teratocarcinomas, neuroepitheliomas, cervical carcinoma, lung fibroblasts, and endothelial cells. Such cell lines include, for example, THP-1, Jurkat, HUVEC, hNT2, WI38, HeLa, and other cell lines commonly used and available from public depositories (American Type Culture Collection, Manassas VA).
  • Prior to mRNA isolation, cell lines were untreated, treated with a pharmaceutical agent such as 5'-aza-2'-deoxycytidine, treated with an activating agent such as lipopolysaccharide in the case of leukocytic cell lines, or, in the case of endothelial cell lines, subjected to shear stress.
  • Chain termination reaction products may be electrophoresed on urea-polyacrylamide gels and detected either by autoradiography (for radioisotope-labeled nucleotides) or by fluorescence (for fluorophore-labeled nucleotides).
  • Automated methods for mechanized reaction preparation, sequencing, and analysis using fluorescence detection methods have been developed.
  • Machines used to prepare cDNAs for sequencing can include the MICROLAB 2200 liquid transfer system (Hamilton Company (Hamilton), Reno NV), Peltier thermal cycler (PTC200; MJ Research, Inc.
  • Sequencing can be carried out using, for example, the ABI 373 or 377 (Applied Biosystems) or MEGABACE 1000 (Molecular Dynamics, Inc. (Molecular Dynamics), Sunnyvale CA) DNA sequencing systems, or other automated and manual sequencing systems well known in the art.
  • nucleotide sequences of the Sequence Listing have been prepared by current, state-of-the-art, automated methods and, as such, may contain occasional sequencing errors or unidentified nucleotides. Such unidentified nucleotides are designated by an N. These infrequent unidentified bases do not represent a hindrance to practicing the invention for those skilled in the art.
  • Several methods employing standard recombinant techniques may be used to correct errors and complete the missing sequence information. (See, e.g., those described in Ausubel, F.M. et al. (1997) Short Protocols in Molecular Biology. John Wiley & Sons, New York NY; and Sambrook, J. et al. (1989) Molecular Cloning. A Laboratory Manual, Cold Spring Harbor Press, Plainview NY.)
  • Human polynucleotide sequences may be assembled using programs or algorithms well known in the art. Sequences to be assembled are related, wholly or in part, and may be derived from a single or many different transcripts. Assembly of the sequences can be performed using such programs as PHRAP (Phil's Revised Assembly Program) and the GELVIEW fragment assembly system (GCG), or other methods known in the art.
  • cDNA sequences are used as "component" sequences that are assembled into "template" or "consensus" sequences as follows. Sequence chromatograms are processed, verified, and quality scores are obtained using PHRED. Raw sequences are edited using an editing pathway known as Block 1 (See, e.g., the LIFESEQ Assembled User Guide, Incyte Genomics, Palo Alto, CA).
  • A series of BLAST comparisons is performed and low-information segments and repetitive elements (e.g., dinucleotide repeats, Alu repeats, etc.) are replaced by "n's", or masked, to prevent spurious matches. Mitochondrial and ribosomal RNA sequences are also removed.
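The masking step can be sketched for the simplest case named above, dinucleotide repeats. This is an illustrative sketch only, not the Block 1 editing pathway: the regular expression, the minimum of four tandem copies, and the function name `mask_dinucleotide_repeats` are assumptions, and Alu or other repeat families would need a repeat library rather than a pattern.

```python
import re

# Four or more tandem copies of any dinucleotide (e.g. "CACACACA").
# The minimum copy number is an assumption chosen for this sketch.
DINUCLEOTIDE_REPEAT = re.compile(r"([ACGT]{2})\1{3,}", re.IGNORECASE)


def mask_dinucleotide_repeats(seq: str) -> str:
    """Replace each dinucleotide repeat run with an equal-length run of 'n'."""
    return DINUCLEOTIDE_REPEAT.sub(lambda m: "n" * len(m.group(0)), seq)
```

Because each masked run keeps its original length, coordinates in the masked sequence still line up with the unmasked read.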
  • the processed sequences are then loaded into a relational database management system (RDMS) which assigns edited sequences to existing templates, if available.
  • a process is initiated which modifies existing templates or creates new templates from works in progress (i.e., nonfinal assembled sequences) containing queued sequences or the sequences themselves. After the new sequences have been assigned to templates, the templates can be merged into bins.
  • bins can be split and the templates reannotated.
  • bins are "clone joined" based upon clone information. Clone joining occurs when the 5' sequence of one clone is present in one bin and the 3' sequence from the same clone is present in a different bin, indicating that the two bins should be merged into a single bin. Only bins which share at least two different clones are merged.
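The clone-joining rule can be sketched as a grouping problem. The following is a minimal sketch under stated assumptions: the data layout (a mapping from bin ID to a set of `(clone_id, end)` reads), the function name `clone_join`, and the use of a union-find structure are all hypothetical choices for illustration and are not taken from the LIFESEQ system.

```python
from itertools import combinations


def clone_join(bins: dict[str, set[tuple[str, str]]],
               min_shared: int = 2) -> list[set[str]]:
    """Group bin IDs that should be merged by clone joining.

    `bins` maps a bin ID to a set of (clone_id, end) reads, where end is
    "5" or "3". Two bins are linked when at least `min_shared` different
    clones have one end in the first bin and the opposite end in the second.
    """
    parent = {b: b for b in bins}

    def find(b: str) -> str:
        # Union-find lookup with path halving.
        while parent[b] != b:
            parent[b] = parent[parent[b]]
            b = parent[b]
        return b

    for a, b in combinations(bins, 2):
        shared = set()
        for clone, end in bins[a]:
            other_end = "3" if end == "5" else "5"
            if (clone, other_end) in bins[b]:
                shared.add(clone)
        if len(shared) >= min_shared:
            parent[find(a)] = find(b)

    groups: dict[str, set[str]] = {}
    for b in bins:
        groups.setdefault(find(b), set()).add(b)
    return list(groups.values())
```

With the default `min_shared=2`, a single clone spanning two bins is not enough to merge them, which matches the requirement that bins share at least two different clones.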
  • a resultant template sequence may contain either a partial or a full length open reading frame, or all or part of a genetic regulatory element. This variation is due in part to the fact that the full length cDNAs of many genes are several hundred, and sometimes several thousand, bases in length.
  • Template sequences may be extended to include additional contiguous sequences derived from the parent RNA transcript using a variety of methods known to those of skill in the art. Extension may thus be used to achieve the full length coding sequence of a gene.
  • the cDNA sequences are analyzed using a variety of programs and algorithms which are well known in the art. (See, e.g., Ausubel, 1997, supra. Chapter 1.1; Meyers, R.A. (Ed.) (1995) Molecular Biology and Biotechnology. Wiley VCH, New York NY, pp. 856-853; and Table 8.) These analyses comprise both reading frame determinations, e.g., based on triplet codon periodicity for particular organisms (Fickett, J.W. (1982) Nucleic Acids Res. 10:5303-5318); analyses of potential start and stop codons; and homology searches.
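As a simple illustration of the analysis of potential start and stop codons, the sketch below scans the three forward reading frames for stretches that begin with ATG and end at the first in-frame stop codon. It does not implement the codon-periodicity test cited above; the function name `find_orfs`, the `min_codons` parameter, and the restriction to the forward strand are assumptions for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_codons: int = 2) -> list[tuple[int, int, int]]:
    """Return (frame, start, end) for open reading frames on the forward strand.

    Each reported stretch starts at the first ATG seen in its frame and ends
    just after the first in-frame stop codon; `end` is exclusive.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                # Length in codons includes the stop codon.
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((frame, start, i + 3))
                start = None
    return orfs
```

An ATG with no downstream in-frame stop is not reported, which corresponds to a partial open reading frame in a template sequence.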
  • BLAST (Basic Local Alignment Search Tool) is especially useful in determining exact matches and comparing two sequence fragments of arbitrary but equal lengths, whose alignment is locally maximal and for which the alignment score meets or exceeds a threshold or cutoff score set by the user (Karlin, S. et al. (1988) Proc. Natl. Acad. Sci. USA 85:841-845).
  • GenBank, SwissProt, BLOCKS, PFAM, and other databases may be searched for sequences containing regions of homology to a query mddt or MDDT of the present invention.
  • Protein hierarchies can be assigned to the putative encoded polypeptide based on, e.g., motif, BLAST, or biological analysis. Methods for assigning these hierarchies are described, for example, in "Database System Employing Protein Function Hierarchies for Viewing Biomolecular Sequence Data," U.S.S.N. 08/812,290, filed March 6, 1997, incorporated herein by reference.
Human Disease Detection and Treatment Molecule Sequences
  • the mddt of the present invention may be used for a variety of diagnostic and therapeutic purposes.
  • an mddt may be used to diagnose a particular condition, disease, or disorder associated with disease detection and treatment molecules.
  • Such conditions, diseases, and disorders include, but are not limited to, a cell proliferative disorder, such as actinic keratosis, arteriosclerosis, atherosclerosis, bursitis, cirrhosis, hepatitis, mixed connective tissue disease (MCTD), myelofibrosis, paroxysmal nocturnal hemoglobinuria, polycythemia vera, psoriasis, primary thrombocythemia, and cancers including adenocarcinoma, leukemia, lymphoma, melanoma, myeloma, sarcoma, teratocarcinoma, and, in particular, a cancer of the adrenal gland, bladder, bone, bone marrow, brain, breast, cervix
  • the mddt can be used to detect the presence of, or to quantify the amount of, an mddt-related polynucleotide in a sample. This information is then compared to information obtained from appropriate reference samples, and a diagnosis is established.
  • a polynucleotide complementary to a given mddt can inhibit or inactivate a therapeutically relevant gene related to the mddt.
  • the expression of mddt may be routinely assessed by hybridization-based methods to determine, for example, the tissue-specificity, disease-specificity, or developmental stage-specificity of mddt expression.
  • the level of expression of mddt may be compared among different cell types or tissues, among diseased and normal cell types or tissues, among cell types or tissues at different developmental stages, or among cell types or tissues undergoing various treatments.
  • This type of analysis is useful, for example, to assess the relative levels of mddt expression in fully, or partially differentiated cells or tissues, to determine if changes in mddt expression levels are correlated with the development or progression of specific disease states, and to assess the response of a cell or tissue to a specific therapy, for example, in pharmacological or toxicological studies.
  • Methods for the analysis of mddt expression are based on hybridization and amplification technologies and include membrane-based procedures such as northern blot analysis, high-throughput procedures that utilize, for example, microarrays, and PCR-based procedures.
  • the mddt, their fragments, or complementary sequences may be used to identify the presence of and/or to determine the degree of similarity between two (or more) nucleic acid sequences.
  • the mddt may be hybridized to naturally occurring or recombinant nucleic acid sequences under appropriately selected temperatures and salt concentrations. Hybridization with a probe based on the nucleic acid sequence of at least one of the mddt allows for the detection of nucleic acid sequences, including genomic sequences, which are identical or related to the mddt of the Sequence Listing. Probes may be selected from non-conserved or unique regions of at least one of the polynucleotides of SEQ ID NO:1-252 and tested for their ability to identify or amplify the target nucleic acid sequence using standard protocols.
  • Polynucleotide sequences that are capable of hybridizing, in particular, to those shown in SEQ ID NO:1-252 and fragments thereof, can be identified using various conditions of stringency. (See, e.g., Wahl, G.M. and S.L. Berger (1987) Methods Enzymol. 152:399-407; Kimmel, A.R. (1987) Methods Enzymol. 152:507-511.) Hybridization conditions are discussed in "Definitions."
  • a probe for use in Southern or northern hybridization may be derived from a fragment of an mddt sequence, or its complement, that is up to several hundred nucleotides in length and is either single-stranded or double-stranded. Such probes may be hybridized in solution to biological materials such as plasmids, bacterial, yeast, or human artificial chromosomes, cleared or sectioned tissues, or to artificial substrates containing mddt. Microarrays are particularly suitable for identifying the presence of and detecting the level of expression for multiple genes of interest by examining gene expression correlated with, e.g., various stages of development, treatment with a drug or compound, or disease progression.
  • An array analogous to a dot or slot blot may be used to arrange and link polynucleotides to the surface of a substrate using one or more of the following: mechanical (vacuum), chemical, thermal, or UV bonding procedures.
  • Such an array may contain any number of mddt and may be produced by hand or by using available devices, materials, and machines.
  • Microarrays may be prepared, used, and analyzed using methods known in the art. (See, e.g.,
  • Probes may be labeled by either PCR or enzymatic techniques using a variety of commercially available reporter molecules.
  • commercial kits are available for radioactive and chemiluminescent labeling (Amersham Pharmacia Biotech) and for alkaline phosphatase labeling (Life Technologies).
  • mddt may be cloned into commercially available vectors for the production of RNA probes.
  • Such probes may be transcribed in the presence of at least one labeled nucleotide (e.g., 32P-ATP, Amersham Pharmacia Biotech).
  • polynucleotides of SEQ ID NO:1-252 or suitable fragments thereof can be used to isolate full length cDNA sequences utilizing hybridization and/or amplification procedures well known in the art, e.g., cDNA library screening, PCR amplification, etc.
  • the molecular cloning of such full length cDNA sequences may employ the method of cDNA library screening with probes using the hybridization, stringency, washing, and probing strategies described above and in Ausubel, supra.
  • Gene identification and mapping are important in the investigation and treatment of almost all conditions, diseases, and disorders. Cancer, cardiovascular disease, Alzheimer's disease, arthritis, diabetes, and mental illnesses are of particular interest. Each of these conditions is more complex than the single gene defects of sickle cell anemia or cystic fibrosis, with select groups of genes being predictive of predisposition for a particular condition, disease, or disorder.
  • cardiovascular disease may result from malfunctioning receptor molecules that fail to clear cholesterol from the bloodstream
  • diabetes may result when a particular individual's immune system is activated by an infection and attacks the insulin-producing cells of the pancreas.
  • Alzheimer's disease has been linked to a gene on chromosome 21; other studies predict a different gene and location. Mapping of disease genes is a complex and reiterative process and generally proceeds from genetic linkage analysis to physical mapping.
  • a genetic linkage map traces parts of chromosomes that are inherited in the same pattern as the condition.
  • Statistics link the inheritance of particular conditions to particular regions of chromosomes, as defined by restriction fragment length polymorphism (RFLP) or other markers.
  • markers and their locations are known from previous studies. More often, however, the markers are simply stretches of DNA that differ among individuals. Examples of genetic linkage maps can be found in various scientific journals or at the Online Mendelian Inheritance in Man (OMIM) World Wide Web site.
  • mddt sequences may be used to generate hybridization probes useful in chromosomal mapping of naturally occurring genomic sequences. Either coding or noncoding sequences of mddt may be used, and in some instances, noncoding sequences may be preferable over coding sequences. For example, conservation of an mddt coding sequence among members of a multi-gene family may potentially cause undesired cross hybridization during chromosomal mapping.
  • sequences may be mapped to a particular chromosome, to a specific region of a chromosome, or to artificial chromosome constructions, e.g., human artificial chromosomes (HACs), yeast artificial chromosomes (YACs), bacterial artificial chromosomes (BACs), bacterial P1 constructions, or single chromosome cDNA libraries.
  • Fluorescent in situ hybridization may be correlated with other physical chromosome mapping techniques and genetic map data. (See, e.g., Meyers, supra, pp. 965-968.) Correlation between the location of mddt on a physical chromosomal map and a specific disorder, or a predisposition to a specific disorder, may help define the region of DNA associated with that disorder.
  • the mddt sequences may also be used to detect polymorphisms that are genetically linked to the inheritance of a particular condition, disease, or disorder.
  • In situ hybridization of chromosomal preparations and genetic mapping techniques may be used for extending existing genetic maps. Often the placement of a gene on the chromosome of another mammalian species, such as mouse, may reveal associated markers even if the number or arm of the corresponding human chromosome is not known. These new marker sequences can be mapped to human chromosomes and may provide valuable information to investigators searching for disease genes using positional cloning or other gene discovery techniques.
  • any sequences mapping to that area may represent associated or regulatory genes for further investigation.
  • the nucleotide sequences of the subject invention may also be used to detect differences in chromosomal architecture due to translocation, inversion, etc., among normal, carrier, or affected individuals.
  • Once a disease-associated gene is mapped to a chromosomal region, the gene must be cloned in order to identify mutations or other alterations (e.g., translocations or inversions) that may be correlated with disease.
  • This process requires a physical map of the chromosomal region containing the disease-gene of interest along with associated markers. A physical map is necessary for determining the nucleotide sequence of and order of marker genes on a particular chromosomal region. Physical mapping techniques are well known in the art and require the generation of overlapping sets of cloned DNA fragments from a particular organelle, chromosome, or genome. These clones are analyzed to reconstruct and catalog their order. Once the position of a marker is determined, the DNA from that region is obtained by consulting the catalog and selecting clones from that region. The gene of interest is located through positional cloning techniques using hybridization or similar methods.
  • the mddt of the present invention may be used to design probes useful in diagnostic assays. Such assays, well known to those skilled in the art, may be used to detect or confirm conditions, disorders, or diseases associated with abnormal levels of mddt expression. Labeled probes developed from mddt sequences are added to a sample under hybridizing conditions of desired stringency. In some instances, mddt, or fragments or oligonucleotides derived from mddt, may be used as primers in amplification steps prior to hybridization. The amount of hybridization complex formed is quantified and compared with standards for that cell or tissue. If mddt expression varies significantly from the standard, the assay indicates the presence of the condition, disorder, or disease.
  • Qualitative or quantitative diagnostic methods may include northern, dot blot, or other membrane or dip-stick based technologies or multiple-sample format technologies such as PCR, enzyme-linked immunosorbent assay (ELISA)-like, pin, or chip-based assays.
  • the probes described above may also be used to monitor the progress of conditions, disorders, or diseases associated with abnormal levels of mddt expression, or to evaluate the efficacy of a particular therapeutic treatment.
  • the candidate probe may be identified from the mddt that are specific to a given human tissue and have not been observed in GenBank or other genome databases. Such a probe may be used in animal studies, preclinical tests, clinical trials, or in monitoring the treatment of an individual patient.
  • standard expression is established by methods well known in the art for use as a basis of comparison, samples from patients affected by the disorder or disease are combined with the probe to evaluate any deviation from the standard profile, and a therapeutic agent is administered and effects are monitored to generate a treatment profile. Efficacy is evaluated by determining whether the expression progresses toward or returns to the standard normal pattern. Treatment profiles may be generated over a period of several days or several months. Statistical methods well known to those skilled in the art may be used to determine the significance of such therapeutic agents.
  • the polynucleotides are also useful for identifying individuals from minute biological samples, for example, by matching the RFLP pattern of a sample's DNA to that of an individual's DNA.
  • the polynucleotides of the present invention can also be used to determine the actual base-by-base DNA sequence of selected portions of an individual's genome. These sequences can be used to prepare
  • oligonucleotide primers derived from the mddt of the invention may be used to detect single nucleotide polymorphisms (SNPs). SNPs are substitutions, insertions and deletions that are a frequent cause of inherited or acquired genetic disease in humans.
  • Methods of SNP detection include, but are not limited to, single-stranded conformation polymorphism (SSCP) and fluorescent SSCP (fSSCP) methods.
  • oligonucleotide primers derived from mddt are used to amplify DNA using the polymerase chain reaction (PCR).
  • the DNA may be derived, for example, from diseased or normal tissue, biopsy samples, bodily fluids, and the like.
  • SNPs in the DNA cause differences in the secondary and tertiary structures of PCR products in single-stranded form, and these differences are detectable using gel electrophoresis in non-denaturing gels.
  • the oligonucleotide primers are fluorescently labeled, which allows detection of the amplimers in high-throughput equipment such as DNA sequencing machines.
  • sequence database analysis methods termed in silico SNP (isSNP) are capable of identifying polymorphisms by comparing the sequences of individual overlapping DNA fragments which assemble into a common consensus sequence.
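The comparison of overlapping fragments can be sketched as a column-by-column tally over fragments that have already been placed on a shared consensus coordinate system. This is an illustrative sketch, not the isSNP method itself: the function name `candidate_snps`, the ungapped `(offset, sequence)` input format, and the `min_support` filter used to set aside isolated sequencing errors are assumptions.

```python
from collections import Counter


def candidate_snps(fragments: list[tuple[int, str]],
                   min_support: int = 2) -> dict[int, dict[str, int]]:
    """Find consensus positions where two or more bases are each well supported.

    `fragments` holds (offset, sequence) pairs already placed, without gaps,
    on a shared consensus coordinate system.
    """
    columns: dict[int, Counter] = {}
    for offset, seq in fragments:
        for i, base in enumerate(seq.upper()):
            if base in "ACGT":  # skip N and other ambiguity codes
                columns.setdefault(offset + i, Counter())[base] += 1

    snps = {}
    for pos in sorted(columns):
        supported = {b: n for b, n in columns[pos].items() if n >= min_support}
        if len(supported) >= 2:
            snps[pos] = supported
    return snps
```

A position where only one fragment disagrees with the rest is not reported, since a single discrepant read is more likely a sequencing error than a polymorphism.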
  • SNPs may be detected and characterized by mass spectrometry using, for example, the high throughput MASSARRAY system (Sequenom, Inc., San Diego CA). DNA-based identification techniques are critical in forensic technology.
  • DNA sequences taken from very small biological samples such as tissues, e.g., hair or skin, or body fluids, e.g., blood, saliva, semen, etc., can be amplified using, e.g., PCR, to identify individuals.
  • polynucleotides of the present invention can be used as polymorphic markers.
  • reagents capable of identifying the source of a particular tissue.
  • Appropriate reagents can comprise, for example, DNA probes or primers prepared from the sequences of the present invention that are specific for particular tissues. Panels of such reagents can identify tissue by species and/or by organ type. In a similar fashion, these reagents can be used to screen tissue cultures for contamination.
  • polynucleotides of the present invention can also be used as molecular weight markers on nucleic acid gels or Southern blots, as diagnostic probes for the presence of a specific mRNA in a particular cell type, in the creation of subtracted cDNA libraries which aid in the discovery of novel polynucleotides, in selection and synthesis of oligomers for attachment to an array or other support, and as an antigen to elicit an immune response.
  • the mddt of the invention or their mammalian homologs may be "knocked out" in an animal model system using homologous recombination in embryonic stem (ES) cells.
  • mouse ES cells such as the mouse 129/SvJ cell line
  • the ES cells are transformed with a vector containing the gene of interest disrupted by a marker gene, e.g., the neomycin phosphotransferase gene (neo; Capecchi, M.R. (1989) Science 244:1288-1292).
  • the vector integrates into the corresponding region of the host genome by homologous recombination.
  • homologous recombination takes place using the Cre-loxP system to knockout a gene of interest in a tissue- or developmental stage-specific manner (Marth, J.D. (1996) Clin. Invest. 97:1999-2002; Wagner, K.U. et al. (1997) Nucleic Acids Res. 25:4323-4330).
  • Transformed ES cells are identified and microinjected into mouse cell blastocysts such as those from the C57BL/6 mouse strain. The blastocysts are surgically transferred to pseudopregnant dams, and the resulting chimeric progeny are genotyped and bred to produce heterozygous or homozygous strains. Transgenic animals thus generated may be tested with potential therapeutic or toxic agents.
  • the mddt of the invention may also be manipulated in vitro in ES cells derived from human blastocysts.
  • Human ES cells have the potential to differentiate into at least eight separate cell lineages including endoderm, mesoderm, and ectodermal cell types. These cell lineages differentiate into, for example, neural cells, hematopoietic lineages, and cardiomyocytes (Thomson, J.A. et al. (1998) Science 282:1145-1147).
  • the mddt of the invention can also be used to create "knockin" humanized animals (pigs) or transgenic animals (mice or rats) to model human disease.
  • With knockin technology, a region of mddt is injected into animal ES cells, and the injected sequence integrates into the animal cell genome.
  • Transformed cells are injected into blastulae, and the blastulae are implanted as described above.
  • Transgenic progeny or inbred lines are studied and treated with potential pharmaceutical agents to obtain information on treatment of a human disease.
  • a mammal inbred to overexpress mddt resulting, e.g., in the secretion of MDDT in its milk, may also serve as a convenient source of that protein (Janne, J. et al. (1998) Biotechnol. Annu. Rev. 4:55-74).
  • MDDT encoded by polynucleotides of the present invention may be used to screen for molecules that bind to or are bound by the encoded polypeptides.
  • the binding of the polypeptide and the molecule may activate (agonist), increase, inhibit (antagonist), or decrease activity of the polypeptide or the bound molecule.
  • Examples of such molecules include antibodies, oligonucleotides, proteins (e.g., receptors), or small molecules.
  • the molecule is closely related to the natural ligand of the polypeptide, e.g., a ligand or fragment thereof, a natural substrate, or a structural or functional mimetic.
  • the molecule can be closely related to the natural receptor to which the polypeptide binds, or to at least a fragment of the receptor, e.g., the active site. In either case, the molecule can be rationally designed using known techniques.
  • the screening for these molecules involves producing appropriate cells which express the polypeptide, either as a secreted protein or on the cell membrane.
  • Preferred cells include cells from mammals, yeast, Drosophila, or E. coli. Cells expressing the polypeptide or cell membrane fractions which contain the expressed polypeptide are then contacted with a test compound and binding, stimulation, or inhibition of activity of either the polypeptide or the molecule is analyzed.
  • An assay may simply test binding of a candidate compound to the polypeptide, wherein binding is detected by a fluorophore, radioisotope, enzyme conjugate, or other detectable label. Alternatively, the assay may assess binding in the presence of a labeled competitor. Additionally, the assay can be carried out using cell-free preparations, polypeptide/molecule affixed to a solid support, chemical libraries, or natural product mixtures. The assay may also simply comprise the steps of mixing a candidate compound with a solution containing a polypeptide, measuring polypeptide/molecule activity or binding, and comparing the polypeptide/molecule activity or binding to a standard.
  • an ELISA assay using, e.g., a monoclonal or polyclonal antibody can measure polypeptide level in a sample.
  • the antibody can measure polypeptide level by either binding, directly or indirectly, to the polypeptide or by competing with the polypeptide for a substrate.
  • All of the above assays can be used in a diagnostic or prognostic context.
  • the molecules discovered using these assays can be used to treat disease or to bring about a particular result in a patient (e.g., blood vessel growth) by activating or inhibiting the polypeptide/molecule.
  • the assays can discover agents which may inhibit or enhance the production of the polypeptide from suitably manipulated cells or tissues.
Transcript Imaging and Toxicological Testing
  • a transcript image represents the global pattern of gene expression by a particular tissue or cell type. Global gene expression patterns are analyzed by quantifying the number of expressed genes and their relative abundance under given conditions and at a given time.
  • a transcript image may be generated by hybridizing the polynucleotides of the present invention or their complements to the totality of transcripts or reverse transcripts of a particular tissue or cell type.
  • the hybridization takes place in high-throughput format, wherein the polynucleotides of the present invention or their complements comprise a subset of a plurality of elements on a microarray.
  • the resultant transcript image would provide a profile of gene activity pertaining to disease detection and treatment molecules.
  • Transcript images which profile mddt expression may be generated using transcripts isolated from tissues, cell lines, biopsies, or other biological samples.
  • the transcript image may thus reflect mddt expression in vivo, as in the case of a tissue or biopsy sample, or in vitro, as in the case of a cell line.
  • Transcript images which profile mddt expression may also be used in conjunction with in vitro model systems and preclinical evaluation of pharmaceuticals, as well as toxicological testing of industrial and naturally-occurring environmental compounds. All compounds induce characteristic gene expression patterns, frequently termed molecular fingerprints or toxicant signatures, which are indicative of mechanisms of action and toxicity (Nuwaysir, E. F. et al. (1999) Mol. Carcinog. 24:153-159; Steiner, S. and Anderson, N. L. (2000) Toxicol. Lett. 112-113:467-71, expressly incorporated by reference herein). If a test compound has a signature similar to that of a compound with known toxicity, it is likely to share those toxic properties. These fingerprints or signatures are most useful and refined when they contain expression information from a large number of genes and gene families.
  • a genome-wide measurement of expression provides the highest quality signature. Genes whose expression is not altered by any tested compounds are important as well, as the levels of expression of these genes are used to normalize the rest of the expression data. The normalization procedure is useful for comparison of expression data after treatment with different compounds. While the assignment of gene function to elements of a toxicant signature aids in interpretation of toxicity mechanisms, knowledge of gene function is not necessary for the statistical matching of signatures which leads to prediction of toxicity.
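The statistical matching of signatures can be illustrated without any knowledge of gene function. The sketch below uses Pearson correlation between expression vectors as the similarity measure; the text does not specify a statistic, so this choice, the function names `pearson` and `closest_signature`, and the reference names in the usage note are assumptions.

```python
from math import sqrt


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation between two equal-length expression vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("vectors must have equal length of at least 2")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / sqrt(var_x * var_y)


def closest_signature(test: list[float],
                      references: dict[str, list[float]]) -> tuple[str, float]:
    """Return the name and correlation of the best-matching reference signature."""
    scores = {name: pearson(test, sig) for name, sig in references.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]
```

For example, given hypothetical reference signatures `{"hepatotoxin": [...], "inert": [...]}` measured over the same genes in the same order, `closest_signature` returns the reference whose pattern most resembles that of the test compound.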
  • the toxicity of a test compound is assessed by treating a biological sample containing nucleic acids with the test compound. Nucleic acids that are expressed in the treated biological sample are hybridized with one or more probes specific to the polynucleotides of the present invention, so that transcript levels corresponding to the polynucleotides of the present invention may be quantified. The transcript levels in the treated biological sample are compared with levels in an untreated biological sample. Differences in the transcript levels between the two samples are indicative of a toxic response caused by the test compound in the treated sample.
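The treated-versus-untreated comparison can be sketched as follows, using genes whose expression is not expected to change to normalize each sample before comparing. The two-fold cutoff, the median-based scaling, and the function names `normalize` and `changed_transcripts` are assumptions for illustration; the text does not prescribe a particular threshold or normalization formula.

```python
from statistics import median


def normalize(levels: dict[str, float],
              reference_genes: list[str]) -> dict[str, float]:
    """Scale transcript levels by the median level of the reference genes."""
    scale = median(levels[g] for g in reference_genes)
    if scale <= 0:
        raise ValueError("reference gene levels must be positive")
    return {gene: value / scale for gene, value in levels.items()}


def changed_transcripts(treated: dict[str, float],
                        untreated: dict[str, float],
                        reference_genes: list[str],
                        min_fold: float = 2.0) -> dict[str, float]:
    """Return treated/untreated ratios for genes changing by at least `min_fold`."""
    t = normalize(treated, reference_genes)
    u = normalize(untreated, reference_genes)
    changed = {}
    for gene in t.keys() & u.keys():
        if t[gene] <= 0 or u[gene] <= 0:
            continue  # ratio undefined; skipped in this sketch
        ratio = t[gene] / u[gene]
        if ratio >= min_fold or ratio <= 1.0 / min_fold:
            changed[gene] = ratio
    return changed
```

Normalizing first means that a sample with twice the total signal does not, by itself, make every transcript appear induced.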
  • Proteome refers to the global pattern of protein expression in a particular tissue or cell type.
  • Proteome expression patterns, or profiles, are analyzed by quantifying the number of expressed proteins and their relative abundance under given conditions and at a given time.
  • A profile of a cell's proteome may thus be generated by separating and analyzing the polypeptides of a particular tissue or cell type.
  • The separation is achieved using two-dimensional gel electrophoresis, in which proteins from a sample are separated by isoelectric focusing in the first dimension, and then according to molecular weight by sodium dodecyl sulfate slab gel electrophoresis in the second dimension (Steiner and Anderson, supra).
  • The proteins are visualized in the gel as discrete and uniquely positioned spots, typically by staining the gel with an agent such as Coomassie Blue or silver or fluorescent stains.
  • The optical density of each protein spot is generally proportional to the level of the protein in the sample.
  • The optical densities of equivalently positioned protein spots from different samples, for example, from biological samples either treated or untreated with a test compound or therapeutic agent, are compared to identify any changes in protein spot density related to the treatment.
  • The proteins in the spots are partially sequenced using, for example, standard methods employing chemical or enzymatic cleavage followed by mass spectrometry.
  • The identity of the protein in a spot may be determined by comparing its partial sequence, preferably of at least 5 contiguous amino acid residues, to the polypeptide sequences of the present invention. In some cases, further sequence data may be obtained for definitive protein identification.
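  • The comparison of a partial sequence against known polypeptide sequences can be sketched as an exact substring search. This is an illustrative simplification: the function name and the example sequences are hypothetical, and a practical comparison would likely tolerate mismatches rather than require an exact match.

```python
def identify_protein(partial_seq, references, min_len=5):
    """Return names of reference polypeptides that contain partial_seq exactly.

    min_len mirrors the preference in the text for at least 5 contiguous residues.
    """
    partial_seq = partial_seq.upper()
    if len(partial_seq) < min_len:
        raise ValueError(f"partial sequence must have at least {min_len} residues")
    return [name for name, seq in references.items() if partial_seq in seq.upper()]


# Hypothetical reference sequences, not taken from the Sequence Listing.
references = {
    "hypothetical_polypeptide_1": "MKTAYIAKQRQISFVKSHFSRQ",
    "hypothetical_polypeptide_2": "MSDNELLQKAGYPTW",
}
hits = identify_protein("AKQRQ", references)
```

    A match to more than one reference is the situation in which further sequence data would be needed for definitive identification.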
  • A proteomic profile may also be generated using antibodies specific for MDDT to quantify the levels of MDDT expression.
  • The antibodies are used as elements on a microarray, and protein expression levels are quantified by exposing the microarray to the sample and detecting the levels of protein bound to each array element (Lueking, A. et al. (1999) Anal. Biochem. 270:103-111; Mendoza, L. G. et al. (1999) Biotechniques 27:778-88). Detection may be performed by a variety of methods known in the art, for example, by reacting the proteins in the sample with a thiol- or amino-reactive fluorescent compound and detecting the amount of fluorescence bound at each array element.
  • Toxicant signatures at the proteome level are also useful for toxicological screening, and should be analyzed in parallel with toxicant signatures at the transcript level.
  • There is a poor correlation between transcript and protein abundances for some proteins in some tissues (Anderson, N. L. and Seilhamer, J. (1997) Electrophoresis 18:533-537), so proteome toxicant signatures may be useful in the analysis of compounds which do not significantly affect the transcript image, but which alter the proteomic profile.
  • The analysis of transcripts in body fluids is difficult, due to rapid degradation of mRNA, so proteomic profiling may be more reliable and informative in such cases.
  • The toxicity of a test compound is assessed by treating a biological sample containing proteins with the test compound.
  • Proteins that are expressed in the treated biological sample are separated so that the amount of each protein can be quantified.
  • The amount of each protein is compared to the amount of the corresponding protein in an untreated biological sample. A difference in the amount of protein between the two samples is indicative of a toxic response to the test compound in the treated sample.
  • Individual proteins are identified by sequencing the amino acid residues of the individual proteins and comparing these partial sequences to the MDDT encoded by polynucleotides of the present invention.
  • The toxicity of a test compound is assessed by treating a biological sample containing proteins with the test compound. Proteins from the biological sample are incubated with antibodies specific to the MDDT encoded by polynucleotides of the present invention. The amount of protein recognized by the antibodies is quantified. The amount of protein in the treated biological sample is compared with the amount in an untreated biological sample. A difference in the amount of protein between the two samples is indicative of a toxic response to the test compound in the treated sample.
  • Transcript images may be used to profile mddt expression in distinct tissue types. This process can be used to determine disease detection and treatment molecule activity in a particular tissue type relative to this activity in a different tissue type. Transcript images may be used to generate a profile of mddt expression characteristic of diseased tissue. Transcript images of tissues before and after treatment may be used for diagnostic purposes, to monitor the progression of disease, and to monitor the efficacy of drug treatments for diseases which affect the activity of disease detection and treatment molecules.
  • Transcript images of cell lines can be used to assess disease detection and treatment molecule activity and/or to identify cell lines that lack or misregulate this activity. Such cell lines may then be treated with pharmaceutical agents, and a transcript image following treatment may indicate the efficacy of these agents in restoring desired levels of this activity. A similar approach may be used to assess the toxicity of pharmaceutical agents as reflected by undesirable changes in disease detection and treatment molecule activity. Candidate pharmaceutical agents may be evaluated by comparing their associated transcript images with those of pharmaceutical agents of known effectiveness.
Antisense Molecules
  • The polynucleotides of the present invention are useful in antisense technology. Antisense technology or therapy relies on the modulation of expression of a target protein through the specific binding of an antisense sequence to a target sequence encoding the target protein or directing its expression.
  • Agrawal, S., ed. (1996) Antisense Therapeutics, Humana Press Inc., Totowa NJ; Alama, A. et al. (1997) Pharmacol. Res. 36(3):171-178; Crooke, S.T. (1997) Adv. Pharmacol. 40:1-49; Sharma, H.W. and R.
  • An antisense sequence is a polynucleotide sequence capable of specifically hybridizing to at least a portion of the target sequence. Antisense sequences bind to cellular mRNA and/or genomic DNA, affecting translation and/or transcription. Antisense sequences can be DNA, RNA, or nucleic acid mimics and analogs. (See, e.g., Rossi, J.J. et al. (1991) Antisense Res. Dev. 1(3):285-288; Lee, R. et al. (1998) Biochemistry 37(3):900-1010; Pardridge,
  • The binding which results in modulation of expression occurs through hybridization or binding of complementary base pairs.
  • Antisense sequences can also bind to DNA duplexes through specific interactions in the major groove of the double helix.
  • The polynucleotides of the present invention and fragments thereof can be used as antisense sequences to modify the expression of the polypeptide encoded by mddt.
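  • Because antisense binding relies on complementary base pairing, a DNA antisense sequence for a given target can be derived as the reverse complement of the target. The sketch below is illustrative only; the function names and the example target are hypothetical, and only perfect Watson-Crick complementarity is considered.

```python
# Map each base to its DNA complement; U in an RNA target pairs with A.
_COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def antisense(target):
    """Return the DNA reverse complement of a DNA or RNA target sequence."""
    return target.upper().translate(_COMPLEMENT)[::-1]


def hybridizes(antisense_seq, target):
    """True if antisense_seq is perfectly complementary to a portion of target."""
    dna_target = target.upper().replace("U", "T")
    return antisense(antisense_seq) in dna_target


target = "ATGGCCAAGT"               # hypothetical sense-strand fragment
full_antisense = antisense(target)  # complementary to the whole target
partial_ok = hybridizes("GGCCAT", target)
```

    An antisense fragment shorter than the target still satisfies the check, consistent with hybridization to "at least a portion" of the target sequence.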
  • Antisense sequences can be produced ex vivo, such as by using any of the ABI nucleic acid synthesizer series (Applied Biosystems) or other automated systems known in the art. Antisense sequences can also be produced biologically, such as by transforming an appropriate host cell with an expression vector containing the sequence of interest. (See, e.g., Agrawal, supra.)
  • Antisense sequences can be delivered intracellularly in the form of an expression plasmid which, upon transcription, produces a sequence complementary to at least a portion of the cellular sequence encoding the target protein.
  • Antisense sequences can also be introduced intracellularly through the use of viral vectors, such as retrovirus and adeno-associated virus vectors.
  • (See, e.g., Miller, A.D. (1990) Blood 76:271; Ausubel, F.M. et al. (1995) Current Protocols in Molecular Biology, John Wiley & Sons, New York NY; Uckert, W. and W. Walther (1994) Pharmacol. Ther. 63(3):323-347.)
  • Other gene delivery mechanisms include liposome-derived systems, artificial viral envelopes, and other systems known in the art.
  • The nucleotide sequences encoding MDDT or fragments thereof may be inserted into an appropriate expression vector, i.e., a vector which contains the necessary elements for transcriptional and translational control of the inserted coding sequence in a suitable host.
  • Methods which are well known to those skilled in the art may be used to construct expression vectors containing sequences encoding MDDT and appropriate transcriptional and translational control elements. These methods include in vitro recombinant DNA techniques, synthetic techniques, and in vivo genetic recombination. (See, e.g., Sambrook, supra. Chapters 4, 8, 16, and 17; and Ausubel, supra. Chapters 9, 10, 13, and 16.)
  • A variety of expression vector/host systems may be utilized to contain and express sequences encoding MDDT. These include, but are not limited to, microorganisms such as bacteria transformed with recombinant bacteriophage, plasmid, or cosmid DNA expression vectors; yeast transformed with yeast expression vectors; insect cell systems infected with viral expression vectors (e.g., baculovirus); plant cell systems transformed with viral expression vectors (e.g., cauliflower mosaic virus, CaMV, or tobacco mosaic virus, TMV) or with bacterial expression vectors (e.g., Ti or pBR322 plasmids); or animal (mammalian) cell systems.
  • Expression vectors derived from retroviruses, adenoviruses, or herpes or vaccinia viruses, or from various bacterial plasmids, may be used for delivery of nucleotide sequences to the targeted organ, tissue, or cell population.
  • Sequences encoding MDDT can be transformed into cell lines using expression vectors which may contain viral origins of replication and/or endogenous expression elements and a selectable marker gene on the same or on a separate vector. Any number of selection systems may be used to recover transformed cell lines.
  • The mddt of the invention may be used for somatic or germline gene therapy.
  • Gene therapy may be performed to (i) correct a genetic deficiency (e.g., in the cases of severe combined immunodeficiency (SCID)-X1 disease characterized by X-linked inheritance (Cavazzana-Calvo, M. et al. (2000) Science 288:669-672), severe combined immunodeficiency syndrome associated with an inherited adenosine deaminase (ADA) deficiency (Blaese, R.M. et al. (1995) Science 270:475-480; Bordignon, C. et al. (1995) Science 270:470-475), cystic fibrosis (Zabner, J. et al. (1993) Cell 75:207-
  • mddt hepatitis B or C virus
  • fungal parasites such as Candida albicans and Paracoccidioides brasiliensis
  • protozoan parasites such as Plasmodium falciparum and Trypanosoma cruzi.
  • The expression of mddt from an appropriate population of transduced cells may alleviate the clinical manifestations caused by the genetic deficiency.
  • Diseases or disorders caused by deficiencies in mddt are treated by constructing mammalian expression vectors comprising mddt and introducing these vectors by mechanical means into mddt-deficient cells.
  • Mechanical transfer technologies for use with cells in vivo or ex vitro include (i) direct DNA microinjection into individual cells, (ii) ballistic gold particle delivery, (iii) liposome-mediated transfection, (iv) receptor-mediated gene transfer, and (v) the use of DNA transposons (Morgan, R.A. and Anderson, W.F. (1993) Annu. Rev. Biochem. 62:191-217; Ivics, Z. (1997) Cell 91:501-510; Boulay, J-L.
  • Expression vectors that may be effective for the expression of mddt include, but are not limited to, the PCDNA 3.1, EPITAG, PRCCMV2, PREP, PVAX vectors (Invitrogen, Carlsbad CA), PCMV-SCRIPT, PCMV-TAG, PEGSH/PERV (Stratagene, La Jolla CA), and PTET-OFF, PTET-ON, PTRE2, PTRE2-LUC, PTK-HYG (Clontech, Palo Alto CA).
  • The mddt of the invention may be expressed using (i) a constitutively active promoter (e.g., from cytomegalovirus (CMV), Rous sarcoma virus (RSV), SV40 virus, thymidine kinase (TK), or β-actin genes), (ii) an inducible promoter (e.g., the tetracycline-regulated promoter (Gossen, M. and Bujard, H. (1992) Proc. Natl. Acad. Sci. U.S.A. 89:5547-5551; Gossen, M. et al. (1995) Science 268:1766-1769; Rossi, F.M.V.
  • the FK506/rapamycin inducible promoter or the RU486/mifepristone inducible promoter (Rossi, F.M.V. and Blau, H.M. supra), or (iii) a tissue-specific promoter or the native promoter of the endogenous gene encoding MDDT from a normal individual.
  • liposome transformation kits (e.g., the PERFECT LIPID TRANSFECTION KIT, available from Invitrogen)
  • Transformation is performed using the calcium phosphate method (Graham, F.L. and Eb, A.J. (1973) Virology 52:456-467), or by electroporation (Neumann, E. et al. (1982) EMBO J. 1:841-845).
  • The introduction of DNA to primary cells requires modification of these standardized mammalian transfection protocols.
  • Diseases or disorders caused by genetic defects with respect to mddt expression are treated by constructing a retrovirus vector consisting of (i) mddt under the control of an independent promoter or the retrovirus long terminal repeat (LTR) promoter, (ii) appropriate RNA packaging signals, and (iii) a Rev-responsive element (RRE) along with additional retrovirus cis-acting RNA sequences and coding sequences required for efficient vector propagation.
  • Retrovirus vectors (e.g., PFB and PFBNEO) are commercially available (Stratagene) and are based on published data (Riviere, I. et al. (1995) Proc. Natl. Acad. Sci. U.S.A. 92:6733-6737), incorporated by reference herein.
  • The vector is propagated in an appropriate vector producing cell line (VPCL) that expresses an envelope gene with a tropism for receptors on the target cells or a promiscuous envelope protein such as VSVg (Armentano, D. et al. (1987) J. Virol. 61:1647-1650; Bender, M.A. et al. (1987) J. Virol.
  • U.S. Patent Number 5,910,434 to Rigg discloses a method for obtaining retrovirus packaging cell lines and is hereby incorporated by reference.
  • Propagation of retrovirus vectors, transduction of a population of cells (e.g., CD4+ T-cells), and the return of transduced cells to a patient are procedures well known to persons skilled in the art of gene therapy and have been well documented (Ranga, U. et al. (1997) J. Virol. 71:7020-7029; Bauer, G. et al. (1997) Blood 89:2259-2267; Bonyhadi, M.L. (1997) J. Virol. 71:4707-4716; Ranga, U. et al. (1998) Proc. Natl. Acad. Sci. U.S.A. 95:1201-1206; Su, L. (1997) Blood 89:2283-2290).
  • An adenovirus-based gene therapy delivery system is used to deliver mddt to cells which have one or more genetic abnormalities with respect to the expression of mddt.
  • The construction and packaging of adenovirus-based vectors are well known to those with ordinary skill in the art. Replication defective adenovirus vectors have proven to be versatile for importing genes encoding immunoregulatory proteins into intact islets in the pancreas (Csete, M.E. et al. (1995)
  • Adenoviral vectors are described in U.S. Patent Number 5,707,618 to Armentano ("Adenovirus vectors for gene therapy"), hereby incorporated by reference.
  • For adenoviral vectors, see also Antinozzi, P.A. et al. (1999) Annu. Rev. Nutr. 19:511-544 and Verma, I.M. and Somia, N. (1997) Nature 389:239-242, both incorporated by reference herein.
  • A herpes-based gene therapy delivery system is used to deliver mddt to target cells which have one or more genetic abnormalities with respect to the expression of mddt.
  • Herpes simplex virus (HSV)-based vectors may be especially valuable for introducing mddt to cells of the central nervous system, for which HSV has a tropism.
  • The construction and packaging of herpes-based vectors are well known to those with ordinary skill in the art.
  • A replication-competent herpes simplex virus (HSV) type 1-based vector has been used to deliver a reporter gene to the eyes of primates (Liu, X. et al. (1999) Exp. Eye Res. 169:385-395).
  • The construction of a HSV-1 virus vector has also been disclosed in detail in U.S. Patent Number 5,804,413 to DeLuca ("Herpes simplex virus strains for gene transfer"), which is hereby incorporated by reference.
  • U.S. Patent Number 5,804,413 teaches the use of recombinant HSV d92 which consists of a genome containing at least one exogenous gene to be transferred to a cell under the control of the appropriate promoter for purposes including human gene therapy. Also taught by this patent are the construction and use of recombinant HSV strains deleted for ICP4, ICP27 and ICP22.
  • For HSV vectors, see also Goins, W. F. et al. (1999) J. Virol. 73:519-532 and Xu, H. et al. (1994) Dev.
  • An alphavirus (positive, single-stranded RNA virus) vector is used to deliver mddt to target cells.
  • Semliki Forest Virus (SFV)
  • RNA replicates to higher levels than the full-length genomic RNA, resulting in the overproduction of capsid proteins relative to the viral proteins with enzymatic activity (e.g., protease and polymerase).
  • Inserting mddt into the alphavirus genome in place of the capsid-coding region results in the production of a large number of mddt RNAs and the synthesis of high levels of MDDT in vector transduced cells.
  • While alphavirus infection is typically associated with cell lysis within a few days, the ability to establish a persistent infection in hamster normal kidney cells (BHK-21) with a variant of Sindbis virus (SIN) indicates that the lytic replication of alphaviruses can be altered to suit the needs of the gene therapy application (Dryga, S.A. et al. (1997) Virology 228:74-83).
  • The wide host range of alphaviruses will allow the introduction of mddt into a variety of cell types.
  • The specific transduction of a subset of cells in a population may require the sorting of cells prior to transduction.
  • The methods of manipulating infectious cDNA clones of alphaviruses, performing alphavirus cDNA and RNA transfections, and performing alphavirus infections, are well known to those with ordinary skill in the art.
  • Anti-MDDT antibodies may be used to analyze protein expression levels. Such antibodies include, but are not limited to, polyclonal, monoclonal, chimeric, single chain, and Fab fragments. For descriptions of and protocols of antibody technologies, see, e.g., Pound J.D. (1998) Immunochemical
  • The amino acid sequence encoded by the mddt of the Sequence Listing may be analyzed by appropriate software (e.g., LASERGENE NAVIGATOR software, DNASTAR) to determine regions of high immunogenicity.
  • The optimal sequences for immunization are selected from the C-terminus, the N-terminus, and those intervening, hydrophilic regions of the polypeptide which are likely to be exposed to the external environment when the polypeptide is in its natural conformation.
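  • One simple way to look for intervening hydrophilic regions is a sliding-window hydropathy scan. The sketch below is an assumption for illustration and is not the analysis performed by the software named above: it uses the Kyte-Doolittle hydropathy scale, and the window size of 7 and the cutoff of -1.5 are arbitrary, hypothetical choices.

```python
# Kyte-Doolittle hydropathy values (negative = hydrophilic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophilic_windows(seq, window=7, max_mean=-1.5):
    """Return (start, end, mean) for windows whose mean hydropathy <= max_mean.

    Positions are 0-based and end is exclusive. Expects the 20 standard
    one-letter residue codes; window and max_mean are illustrative defaults.
    """
    seq = seq.upper()
    hits = []
    for start in range(len(seq) - window + 1):
        segment = seq[start:start + window]
        mean = sum(KYTE_DOOLITTLE[aa] for aa in segment) / window
        if mean <= max_mean:
            hits.append((start, start + window, round(mean, 2)))
    return hits


# Hypothetical polypeptide: a lysine-rich stretch flanked by leucines.
hits = hydrophilic_windows("LLLLLLLKKKKKKKLLLLLLL")
most_hydrophilic = min(hits, key=lambda h: h[2])
```

    Windows that pass the cutoff are only candidates; whether a region is actually exposed in the natural conformation cannot be decided from hydropathy alone.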
  • Peptides used for antibody induction do not need to have biological activity; however, they must be antigenic.
  • Peptides used to induce specific antibodies may have an amino acid sequence consisting of at least five amino acids, preferably at least 10 amino acids, and most preferably at least 15 amino acids.
  • A peptide which mimics an antigenic fragment of the natural polypeptide may be fused with another protein such as keyhole limpet hemocyanin (KLH; Sigma, St. Louis MO) for antibody production.
  • A peptide encompassing an antigenic region may be expressed from an mddt, synthesized as described above, or purified from human cells. Procedures well known in the art may be used for the production of antibodies.
  • Various hosts, including mice, goats, and rabbits, may be immunized by injection with a peptide. Depending on the host species, various adjuvants may be used to increase immunological response.
  • Peptides about 15 residues in length may be synthesized using an ABI 431A peptide synthesizer (Applied Biosystems) using fmoc-chemistry and coupled to KLH (Sigma) by reaction with m-maleimidobenzoyl-N-hydroxysuccinimide ester (Ausubel, 1995, supra).
  • Rabbits are immunized with the peptide-KLH complex in complete Freund's adjuvant.
  • The resulting antisera are tested for antipeptide activity by binding the peptide to plastic, blocking with 1% bovine serum albumin (BSA), reacting with rabbit antisera, washing, and reacting with radioiodinated goat anti-rabbit IgG.
  • Antisera with antipeptide activity are tested for anti-MDDT activity using protocols well known in the art, including ELISA, radioimmunoassay (RIA), and immunoblotting.
  • Isolated and purified peptide may be used to immunize mice (about 100 μg of peptide) or rabbits (about 1 mg of peptide). Subsequently, the peptide is radioiodinated and used to screen the immunized animals' B-lymphocytes for production of antipeptide antibodies. Positive cells are then used to produce hybridomas using standard techniques. About 20 mg of peptide is sufficient for labeling and screening several thousand clones. Hybridomas of interest are detected by screening with radioiodinated peptide to identify those fusions producing peptide-specific monoclonal antibody.
  • Wells of a multi-well plate (FAST, Becton-Dickinson, Palo Alto, CA) are coated with affinity-purified, specific rabbit-anti-mouse (or suitable anti-species IgG) antibodies at 10 mg/ml.
  • The coated wells are blocked with 1% BSA and washed and exposed to supernatants from hybridomas. After incubation, the wells are exposed to radiolabeled peptide at 1 mg/ml.
  • Clones producing antibodies bind a quantity of labeled peptide that is detectable above background. Such clones are expanded and subjected to 2 cycles of cloning. Cloned hybridomas are injected into pristane-treated mice to produce ascites, and monoclonal antibody is purified from the ascitic fluid by affinity chromatography on protein A (Amersham Pharmacia Biotech). Several procedures for the production of monoclonal antibodies, including in vitro production, are described in
  • Antibody fragments containing specific binding sites for an epitope may also be generated.
  • Such fragments include, but are not limited to, the F(ab')2 fragments produced by pepsin digestion of the antibody molecule, and the Fab fragments generated by reducing the disulfide bridges of the F(ab')2 fragments.
  • Construction of Fab expression libraries in filamentous bacteriophage allows rapid and easy identification of monoclonal fragments with desired specificity (Pound, supra. Chaps. 45-47).
  • Antibodies generated against polypeptide encoded by mddt can be used to purify and characterize full-length MDDT protein and its activity, binding partners, etc.
Assays Using Antibodies
  • Anti-MDDT antibodies may be used in assays to quantify the amount of MDDT found in a particular human cell. Such assays include methods utilizing the antibody and a label to detect expression level under normal or disease conditions.
  • The peptides and antibodies of the invention may be used with or without modification or labeled by joining them, either covalently or noncovalently, with a reporter molecule.
  • Protocols for detecting and measuring protein expression using either polyclonal or monoclonal antibodies are well known in the art. Examples include ELISA, RIA, and fluorescence-activated cell sorting (FACS). Such immunoassays typically involve the formation of complexes between the MDDT and its specific antibody and the measurement of such complexes. These and other assays are described in Pound (supra).
  • RNA was purchased from CLONTECH Laboratories, Inc. (Palo Alto CA) or isolated from various tissues. Some tissues were homogenized and lysed in guanidinium isothiocyanate, while others were homogenized and lysed in phenol or in a suitable mixture of denaturants, such as TRIZOL (Life Technologies Inc.).
  • RNA was provided with RNA and constructed the corresponding cDNA libraries.
  • cDNA was synthesized and cDNA libraries were constructed with the UNIZAP vector system (Stratagene Cloning Systems, Inc. (Stratagene), La Jolla CA) or SUPERSCRIPT plasmid system (Life Technologies), using the recommended procedures or similar methods known in the art. (See, e.g., Ausubel, 1997, supra. Chapters 5.1 through 6.6.) Reverse transcription was initiated using oligo d(T) or random primers. Synthetic oligonucleotide adapters were ligated to double stranded cDNA, and the cDNA was digested with the appropriate restriction enzyme or enzymes.
  • cDNA was size-selected (300-1000 bp) using SEPHACRYL S1000, SEPHAROSE CL2B, or SEPHAROSE CL4B column chromatography (Amersham Pharmacia Biotech) or preparative agarose gel electrophoresis.
  • cDNAs were ligated into compatible restriction enzyme sites of the polylinker of a suitable plasmid, e.g., PBLUESCRIPT plasmid (Stratagene), PSPORT1 plasmid (Life Technologies), PCDNA2.1 plasmid (Invitrogen, Carlsbad CA), PBK-CMV plasmid (Stratagene), or pINCY (Incyte Genomics, Palo Alto CA), or derivatives thereof.
  • Recombinant plasmids were transformed into competent E. coli cells including XL1-Blue, XL1-BlueMRF, or SOLR from Stratagene or DH5α, DH10B, or ElectroMAX DH10B from Life Technologies.
  • Plasmids were recovered from host cells by in vivo excision using the UNIZAP vector system (Stratagene) or by cell lysis. Plasmids were purified using at least one of the following: the Magic or WIZARD Minipreps DNA purification system (Promega); the AGTC Miniprep purification kit (Edge
  • Plasmids were resuspended in 0.1 ml of distilled water and stored, with or without lyophilization, at 4°C.
  • Plasmid DNA was amplified from host cell lysates using direct link PCR in a high-throughput format. (Rao, V.B. (1994) Anal. Biochem. 216:1-14.) Host cell lysis and thermal cycling steps were carried out in a single reaction mixture.
  • cDNA sequencing reactions were prepared using reagents provided by Amersham Pharmacia Biotech or supplied in ABI sequencing kits such as the ABI PRISM BIGDYE Terminator cycle sequencing ready reaction kit (Applied Biosystems). Electrophoretic separation of cDNA sequencing reactions and detection of labeled polynucleotides were carried out using the MEGABACE 1000 DNA sequencing system (Molecular Dynamics); the ABI PRISM 373 or 377 sequencing system (Applied Biosystems) in conjunction with standard ABI protocols and base calling software; or other sequence analysis systems known in the art. Reading frames within the cDNA sequences were identified using standard methods (reviewed in Ausubel, 1997, supra. Chapter 7.7). Some of the cDNA sequences were selected for extension using the techniques disclosed in Example VIII.
  • Component sequences from chromatograms were subject to PHRED analysis and assigned a quality score.
  • The sequences having at least a required quality score were subject to various pre-processing editing pathways to eliminate, e.g., low quality 3' ends, vector and linker sequences, polyA tails, Alu repeats, mitochondrial and ribosomal sequences, bacterial contamination sequences, and sequences smaller than 50 base pairs.
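  • Two of the pre-processing steps named above, removal of polyA tails and removal of sequences smaller than 50 base pairs, can be sketched as follows. This is an illustration and not the actual editing pathway: the function names are hypothetical, the minimum polyA run of 10 bases is an assumption, and the required quality score is left as a caller-supplied parameter because the text does not state its value.

```python
import re

MIN_LENGTH = 50  # sequences smaller than 50 base pairs are eliminated


def trim_poly_a(seq, min_run=10):
    """Remove a trailing run of at least min_run 'A' bases (min_run is assumed)."""
    return re.sub(r"A{%d,}$" % min_run, "", seq.upper())


def preprocess(reads, min_quality, min_length=MIN_LENGTH):
    """Filter reads given as (name, sequence, quality_score) tuples.

    Keeps reads whose quality is at least min_quality, trims polyA tails,
    and drops reads shorter than min_length after trimming.
    """
    kept = {}
    for name, seq, quality in reads:
        if quality < min_quality:
            continue
        trimmed = trim_poly_a(seq)
        if len(trimmed) >= min_length:
            kept[name] = trimmed
    return kept


# Made-up reads: r1 passes, r2 is too short after trimming, r3 has low quality.
reads = [
    ("r1", "ACGT" * 15 + "A" * 20, 30),
    ("r2", "ACGT" * 5 + "A" * 30, 30),
    ("r3", "ACGT" * 20, 10),
]
kept = preprocess(reads, min_quality=20)
```

    The length filter is applied after trimming, so a read that is long only because of its polyA tail is still discarded.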
  • low-information sequences and repetitive elements (e.g., dinucleotide repeats, Alu repeats, etc.)
  • Processed sequences were then subject to assembly procedures in which the sequences were assigned to gene bins (bins). Each sequence could only belong to one bin.
  • Sequences in each gene bin were assembled to produce consensus sequences (templates). Subsequent new sequences were added to existing bins using BLASTn (v.1.4 WashU) and CROSSMATCH. Candidate pairs were identified as all BLAST hits having a quality score greater than or equal to 150. Alignments of at least 82% local identity were accepted into the bin.
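  • The two thresholds above (a BLAST score of at least 150 to become a candidate pair, then at least 82% local identity to be accepted into a bin) can be sketched as a filter over hit records. The record layout and function name are hypothetical, and choosing the highest-scoring passing hit when a sequence matches several bins is an assumption made here so that each sequence belongs to only one bin.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    query: str               # new sequence being placed
    bin_id: str              # existing gene bin it matched
    score: float             # BLAST score for the pair
    percent_identity: float  # local identity of the alignment


def assign_to_bins(hits, min_score=150, min_identity=82.0):
    """Map each query to at most one bin using the thresholds from the text."""
    best = {}
    for hit in hits:
        if hit.score < min_score or hit.percent_identity < min_identity:
            continue
        # Assumed tie-break: keep the highest-scoring passing hit per query.
        if hit.query not in best or hit.score > best[hit.query].score:
            best[hit.query] = hit
    return {query: hit.bin_id for query, hit in best.items()}


# Made-up hits for illustration.
hits = [
    Hit("seqA", "bin1", 200, 90.0),
    Hit("seqA", "bin2", 300, 85.0),
    Hit("seqB", "bin1", 140, 99.0),  # score below 150
    Hit("seqC", "bin3", 500, 70.0),  # identity below 82%
]
assignments = assign_to_bins(hits)
```

    Sequences with no passing hit are simply absent from the result and would presumably seed new bins.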
  • The component sequences from each bin were assembled using a version of PHRAP. Bins with several overlapping component sequences were assembled using DEEP PHRAP.
  • The orientation (sense or antisense) of each assembled template was determined based on the number and orientation of its component sequences. Template sequences as disclosed in the sequence listing correspond to sense strand sequences (the "forward" reading frames), to the best determination.
  • The complementary (antisense) strands are inherently disclosed herein.
  • The component sequences which were used to assemble each template consensus sequence are listed in Table 5, along with their positions along the template nucleotide sequences.
  • Bins were compared against each other and those having local similarity of at least 82% were combined and reassembled. Reassembled bins having templates of insufficient overlap (less than 95% local identity) were re-split. Assembled templates were also subject to analysis by STITCHER/EXON MAPPER algorithms which analyze the probabilities of the presence of splice variants, alternatively spliced exons, splice junctions, differential expression of alternative spliced genes across tissue types or disease states, etc. These resulting bins were subject to several rounds of the above assembly procedures. Once gene bins were generated based upon sequence alignments, bins were clone joined based upon clone information. If the 5' sequence of one clone was present in one bin and the 3' sequence from the same clone was present in a different bin, it was likely that the two bins actually belonged together in a single bin. The resulting combined bins underwent assembly procedures to regenerate the consensus sequences. The final assembled templates were subsequently annotated using the following procedure.
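  • The clone-joining step can be pictured as merging bins that are linked by the 5' and 3' reads of the same clone. The sketch below uses a union-find structure, which is an implementation choice assumed here and not stated in the text; the input layout and function name are likewise hypothetical.

```python
def clone_join(read_bins):
    """Merge bins linked by the 5' and 3' reads of a single clone.

    read_bins: list of (clone_id, end, bin_id) with end in {"5'", "3'"}.
    Returns a dict mapping every bin_id to the representative of its merged bin.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the lexicographically smaller id as representative.
            parent[max(root_a, root_b)] = min(root_a, root_b)

    ends = {}
    for clone_id, end, bin_id in read_bins:
        find(bin_id)  # register the bin even if it is never merged
        ends.setdefault(clone_id, {})[end] = bin_id

    for by_end in ends.values():
        if "5'" in by_end and "3'" in by_end:
            union(by_end["5'"], by_end["3'"])

    return {bin_id: find(bin_id) for bin_id in list(parent)}


# Made-up reads: clones c1 and c2 chain bin1, bin2, and bin3 together.
reads = [
    ("c1", "5'", "bin1"), ("c1", "3'", "bin2"),
    ("c2", "5'", "bin2"), ("c2", "3'", "bin3"),
    ("c3", "5'", "bin4"),
]
merged = clone_join(reads)
```

    Merging is transitive, so two bins never directly linked by one clone can still end up together, after which the combined bin would be reassembled to regenerate its consensus sequence.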
  • The template sequences were further analyzed by translating each template in all three forward reading frames and searching each translation against the Pfam database of hidden Markov model-based protein families and domains using the HMMER software package (available to the public from Washington University School of Medicine, St. Louis MO). Regions of templates which, when translated, contain similarity to Pfam consensus sequences are reported in Table 3, along with descriptions of Pfam protein domains and families. Only those Pfam hits with an E-value of ≤ 1 x 10^-3 are reported.
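  • Translation in all three forward reading frames, followed by an E-value cutoff on reported hits, can be sketched as follows. The sketch uses the standard genetic code and does not call HMMER or Pfam; the function names and the (name, E-value) hit layout are hypothetical.

```python
# Standard genetic code, indexed in T, C, A, G order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate_forward_frames(dna):
    """Translate a nucleotide sequence in its three forward reading frames.

    Stop codons become '*'; codons with unrecognized bases become 'X'.
    """
    dna = dna.upper().replace("U", "T")
    frames = []
    for offset in range(3):
        codons = (dna[i:i + 3] for i in range(offset, len(dna) - 2, 3))
        frames.append("".join(CODON_TABLE.get(codon, "X") for codon in codons))
    return frames


def reportable_hits(hits, max_evalue=1e-3):
    """Keep (name, evalue) hits whose E-value is at or below the cutoff."""
    return [hit for hit in hits if hit[1] <= max_evalue]


frames = translate_forward_frames("ATGGCCAAGTAA")
kept_hits = reportable_hits([("domain_x", 1e-5), ("domain_y", 0.2)])
```

    Only forward frames are translated because the template sequences are stated to correspond to the sense strand.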
  • Template sequences were also translated in all three forward reading frames, and each translation was searched against TMAP, a program that uses weight matrices to delineate transmembrane segments on protein sequences and determine orientation, with respect to the cell cytosol (Persson, B. and P. Argos (1994) J. Mol. Biol. 237: 182-192; Persson, B. and P. Argos (1996) Protein Sci. 5:363-371.) Regions of templates which, when translated, contain similarity to signal peptide or transmembrane consensus sequences are reported in Table 4.
  • HMMER analysis as reported in Tables 3 and 4 may support the results of BLAST analysis as reported in Table 2 or may suggest alternative or additional properties of template-encoded polypeptides not previously uncovered by BLAST or other analyses.
  • Template sequences are further analyzed using the bioinformatics tools listed in Table 8, or using sequence analysis software known in the art such as MACDNASIS PRO software (Hitachi Software Engineering, South San Francisco CA) and LASERGENE software (DNASTAR). Template sequences may be further queried against public databases such as the GenBank rodent, mammalian, vertebrate, prokaryote, and eukaryote databases. The template sequences were translated to derive the corresponding longest open reading frame as presented by the polypeptide sequences as reported in Table 2.
  • A polypeptide of the invention may begin at any of the methionine residues within the full length translated polypeptide.
  • Polypeptide sequences were subsequently analyzed by querying against the GenBank protein database (GENPEPT, (GenBank version 124)).
  • Full length polynucleotide sequences are also analyzed using MACDNASIS PRO software (Hitachi Software Engineering, South San Francisco
  • Columns 4 and 5 show the start and stop nucleotide positions of the polynucleotide sequences encoding the polypeptide segments.
  • Column 6 shows the GenBank identification number (GI Number) of the nearest GenBank homolog.
  • Column 7 shows the probability score for the match between each polypeptide and its GenBank homolog.
  • Column 8 shows the annotation of the GenBank homolog.
  • Northern analysis is a laboratory technique used to detect the presence of a transcript of a gene and involves the hybridization of a labeled nucleotide sequence to a membrane on which RNAs from a particular cell type or tissue have been bound. (See, e.g., Sambrook, supra, ch. 7; Ausubel, 1995, supra, ch. 4 and 16.)
  • the product score takes into account both the degree of similarity between two sequences and the length of the sequence match.
  • the product score is a normalized value between 0 and 100, and is calculated as follows: the BLAST score is multiplied by the percent nucleotide identity and the product is divided by (5 times the length of the shorter of the two sequences).
  • the BLAST score is calculated by assigning a score of +5 for every base that matches in a high-scoring segment pair (HSP), and -4 for every mismatch. Two sequences may share more than one HSP (separated by gaps). If there is more than one HSP, then the pair with the highest BLAST score is used to calculate the product score.
  • the product score represents a balance between fractional overlap and quality in a BLAST alignment.
  • a product score of 100 is produced only for 100% identity over the entire length of the shorter of the two sequences being compared.
  • a product score of 70 is produced either by 100% identity and 70% overlap at one end, or by 88% identity and 100% overlap at the other.
  • a product score of 50 is produced either by 100% identity and 50% overlap at one end, or 79% identity and 100% overlap.
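The product score arithmetic described above can be checked with a short sketch. The function names are hypothetical; the scoring values (+5 per match, -4 per mismatch, division by 5 times the shorter length) are taken from the text, and the example assumes a single ungapped HSP.

```python
def blast_score(matches, mismatches):
    """Score of one HSP: +5 per matching base, -4 per mismatch."""
    return 5 * matches - 4 * mismatches


def product_score(score, percent_identity, shorter_length):
    """Normalized product score on a 0 to 100 scale.

    score:            BLAST score of the best HSP
    percent_identity: percent nucleotide identity within that HSP (0 to 100)
    shorter_length:   length of the shorter of the two sequences
    """
    return score * percent_identity / (5 * shorter_length)
```

With a shorter sequence of 100 bases, 100% identity over the full length gives 100, and 100% identity over 70 bases gives 70. An HSP with 88% identity over the full length gives about 69, and 79% identity over the full length gives about 49, which agrees with the rounded values of 70 and 50 quoted in the text.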
  • a tissue distribution profile is determined for each template by compiling the cDNA library tissue classifications of its component cDNA sequences.
  • Each component sequence is derived from a cDNA library constructed from a human tissue.
  • Each human tissue is classified into one of the following categories: cardiovascular system; connective tissue; digestive system; embryonic structures; endocrine system; exocrine glands; genitalia, female; genitalia, male; germ cells; hemic and immune system; liver; musculoskeletal system; nervous system; pancreas; respiratory system; sense organs; skin; stomatognathic system; unclassified/mixed; or urinary tract.
  • Template sequences, component sequences, and cDNA library/tissue information are found in the LIFESEQ GOLD database (Incyte Genomics, Palo Alto CA).
  • Table 6 shows the tissue distribution profile for the templates of the invention. For each template, the three most frequently observed tissue categories are shown in column 3, along with the percentage of component sequences belonging to each category. Only tissue categories with percentage values of ≥10% are shown. A tissue distribution of "widely distributed" in column 3 indicates percentage values of <10% in all tissue categories.
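The tissue distribution profile can be sketched as a simple tally. This is a minimal illustration, assuming each component sequence has already been assigned one tissue category; the function names `tissue_profile` and `describe` and the output format are hypothetical.

```python
from collections import Counter


def tissue_profile(categories, top_n=3, threshold=10.0):
    """Return up to top_n (category, percent) pairs with percent >= threshold.

    categories: one tissue category per component sequence of a template.
    """
    counts = Counter(categories)
    total = sum(counts.values())
    if total == 0:
        return []
    profile = []
    for category, n in counts.most_common(top_n):
        percent = 100.0 * n / total
        if percent >= threshold:
            profile.append((category, percent))
    return profile


def describe(profile):
    """Format a profile; an empty profile is reported as widely distributed."""
    if not profile:
        return "widely distributed"
    return "; ".join(f"{category} ({percent:.0f}%)" for category, percent in profile)
```

A template whose component sequences are spread thinly over many categories, none reaching 10%, produces an empty profile and is described as "widely distributed".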
  • Transcript images are generated as described in Seilhamer et al., "Comparative Gene Transcript Analysis," U.S. Patent Number 5,840,484, incorporated herein by reference.
  • Oligonucleotide primers designed using an mddt of the Sequence Listing are used to extend the nucleic acid sequence.
  • One primer is synthesized to initiate 5' extension of the template, and the other primer, to initiate 3' extension of the template.
  • the initial primers may be designed using OLIGO 4.06 software (National Biosciences, Inc. (National Biosciences), Plymouth MN), or another appropriate program, to be about 22 to 30 nucleotides in length, to have a GC content of about 50% or more, and to anneal to the target sequence at temperatures of about 68°C to about 72°C.
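The length and GC-content criteria for the initial primers can be expressed as a quick filter. This sketch is not a substitute for OLIGO 4.06 or any primer design program: the function names are hypothetical, the bounds are the approximate values given in the text, and the annealing temperature criterion (about 68°C to about 72°C) is deliberately not checked because the text does not specify how it is computed.

```python
def gc_fraction(primer):
    """Fraction of G and C bases in a primer sequence."""
    primer = primer.upper()
    if not primer:
        return 0.0
    return sum(base in "GC" for base in primer) / len(primer)


def meets_primer_criteria(primer, min_len=22, max_len=30, min_gc=0.50):
    """Check the length (about 22 to 30 nt) and GC content (about 50% or more)."""
    return min_len <= len(primer) <= max_len and gc_fraction(primer) >= min_gc
```

A candidate that passes this filter would still need its annealing behavior evaluated with a dedicated tool.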
  • PCR is performed in 96-well plates using the PTC-200 thermal cycler (MJ Research).
  • The reaction mix contains DNA template, 200 nmol of each primer, reaction buffer containing Mg2+, (NH4)2SO4, and β-mercaptoethanol, Taq DNA polymerase (Amersham Pharmacia Biotech), ELONGASE enzyme (Life Technologies), and Pfu DNA polymerase (Stratagene), with the following parameters for primer pair PCI A and PCI B: Step 1: 94°C, 3 min; Step 2: 94°C, 15 sec; Step 3: 60°C, 1 min; Step 4: 68°C, 2 min; Step 5: Steps 2, 3, and 4 repeated 20 times; Step 6: 68°C, 5 min; Step 7: storage at 4°C.
  • The parameters for primer pair T7 and SK+ are as follows: Step 1: 94°C, 3 min; Step 2: 94°C, 15 sec; Step 3: 57°C, 1 min; Step 4: 68°C, 2 min; Step 5: Steps 2, 3, and 4 repeated 20 times; Step 6: 68°C, 5 min; Step 7: storage at 4°C.
  • The concentration of DNA in each well is determined by dispensing 100 µl PICOGREEN quantitation reagent (0.25% (v/v); Molecular Probes) dissolved in 1X Tris-EDTA (TE) and 0.5 µl of undiluted PCR product into each well of an opaque fluorimeter plate (Corning Incorporated (Corning), Corning NY), allowing the DNA to bind to the reagent.
  • the plate is scanned in a FLUOROSKAN II (Labsystems Oy) to measure the fluorescence of the sample and to quantify the concentration of DNA.
  • A 5 µl to 10 µl aliquot of the reaction mixture is analyzed by electrophoresis on a 1% agarose mini-gel to determine which reactions are successful in extending the sequence.
  • the extended nucleotides are desalted and concentrated, transferred to 384-well plates, digested with CviJI cholera virus endonuclease (Molecular Biology Research, Madison WI), and sonicated or sheared prior to religation into pUC 18 vector (Amersham Pharmacia Biotech).
  • the digested nucleotides are separated on low concentration (0.6 to 0.8%) agarose gels, fragments are excised, and agar digested with AGAR ACE (Promega).
  • Extended clones are religated using T4 ligase (New England Biolabs, Inc., Beverly MA) into pUC 18 vector (Amersham Pharmacia Biotech), treated with Pfu DNA polymerase (Stratagene) to fill-in restriction site overhangs, and transfected into competent E. coli cells. Transformed cells are selected on antibiotic-containing media, individual colonies are picked and cultured overnight at 37°C in 384-well plates in LB/2x carbenicillin liquid media.
  • The cells are lysed, and DNA is amplified by PCR using Taq DNA polymerase (Amersham Pharmacia Biotech) and Pfu DNA polymerase (Stratagene) with the following parameters: Step 1: 94°C, 3 min; Step 2: 94°C, 15 sec; Step 3: 60°C, 1 min; Step 4: 72°C, 2 min; Step 5: steps 2, 3, and 4 repeated 29 times; Step 6: 72°C, 5 min; Step 7: storage at 4°C.
  • DNA is quantified by PICOGREEN reagent (Molecular Probes) as described above. Samples with low DNA recoveries are reamplified using the same conditions as described above.
  • Samples are diluted with 20% dimethylsulfoxide (1:2, v/v), and sequenced using DYENAMIC energy transfer sequencing primers and the DYENAMIC DIRECT kit (Amersham Pharmacia Biotech) or the ABI PRISM BIGDYE Terminator cycle sequencing ready reaction kit (Applied Biosystems).
  • the mddt is used to obtain regulatory sequences (promoters, introns, and enhancers) using the procedure above, oligonucleotides designed for such extension, and an appropriate genomic library.
  • Hybridization probes derived from the mddt of the Sequence Listing are employed for screening cDNAs, mRNAs, or genomic DNA. The labeling of probe nucleotides between 100 and 1000 nucleotides in length is specifically described, but essentially the same procedure may be used with larger cDNA fragments. Probe sequences are labeled at room temperature for 30 minutes using a T4 polynucleotide kinase, γ-32P-ATP, and 0.5X One-Phor-All Plus (Amersham Pharmacia Biotech) buffer and purified using a ProbeQuant G-50 Microcolumn (Amersham Pharmacia Biotech). The probe mixture is diluted to 10^7 dpm/µg/ml hybridization buffer and used in a typical membrane-based hybridization analysis.
  • the DNA is digested with a restriction endonuclease such as Eco RV and is electrophoresed through a 0.7% agarose gel.
  • the DNA fragments are transferred from the agarose to nylon membrane (NYTRAN Plus, Schleicher & Schuell, Inc., Keene NH) using procedures specified by the manufacturer of the membrane.
  • Prehybridization is carried out for three or more hours at 68°C, and hybridization is carried out overnight at 68°C.
  • Blots are sequentially washed at room temperature under increasingly stringent conditions, up to 0.1x saline sodium citrate (SSC) and 0.5% sodium dodecyl sulfate. After the blots are placed in a PHOSPHORIMAGER cassette (Molecular Dynamics) or are exposed to autoradiography film, hybridization patterns of standard and experimental lanes are compared. Essentially the same procedure is employed when screening RNA.
  • Genome Research (WIGR), and Genethon are used to determine if any of the clustered sequences have been previously mapped. Inclusion of a mapped sequence in a cluster will result in the assignment of all sequences of that cluster, including its particular SEQ ID NO:, to that map location.
  • The genetic map locations of SEQ ID NO:1-252 are described as ranges, or intervals, of human chromosomes. The map position of an interval, in centiMorgans, is measured relative to the terminus of the chromosome's p-arm. The centiMorgan (cM) is a unit of measurement based on recombination frequencies between chromosomal markers.
  • One cM is roughly equivalent to 1 megabase (Mb) of DNA in humans, although this can vary widely due to hot and cold spots of recombination.
  • the cM distances are based on genetic markers mapped by Genethon which provide boundaries for radiation hybrid markers whose sequences were included in each of the clusters.
  • Total RNA is isolated from tissue samples using the guanidinium thiocyanate method and polyA+ RNA is purified using the oligo (dT) cellulose method.
  • Each polyA+ RNA sample is reverse transcribed using MMLV reverse-transcriptase, 0.05 pg/µl oligo-dT primer (21mer), 1X first strand buffer, 0.03 units/µl RNase inhibitor, 500 µM dATP, 500 µM dGTP, 500 µM dTTP, 40 µM dCTP, 40 µM dCTP-Cy3 (BDS) or dCTP-Cy5 (Amersham Pharmacia Biotech).
  • The reverse transcription reaction is performed in a 25 µl volume containing 200 ng polyA+ RNA with GEMBRIGHT kits (Incyte).
  • Specific control polyA+ RNAs are synthesized by in vitro transcription from non-coding yeast genomic DNA (W. Lei, unpublished). As quantitative controls, the control RNAs at 0.002 ng, 0.02 ng, 0.2 ng, and 2 ng are diluted into reverse transcription reaction at ratios of 1:100,000, 1:10,000, 1:1000, 1:100 (w/w) to sample mRNA respectively.
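The quantitative control ratios follow directly from the control amounts and the 200 ng of polyA+ RNA stated for each reverse transcription reaction. The short check below makes the arithmetic explicit; the function name `spike_ratio` is hypothetical, and the result is rounded to the nearest whole number to absorb floating-point error.

```python
def spike_ratio(control_ng, sample_ng=200.0):
    """Return N such that control:sample is 1:N by weight."""
    return round(sample_ng / control_ng)


# Control amounts from the text, in nanograms.
ratios = {ng: spike_ratio(ng) for ng in (0.002, 0.02, 0.2, 2)}
```

Here `ratios` maps 0.002 ng to 100000, 0.02 ng to 10000, 0.2 ng to 1000, and 2 ng to 100, matching the 1:100,000 through 1:100 (w/w) series.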
  • The control mRNAs are diluted into reverse transcription reaction at ratios of 1:3, 3:1, 1:10, 10:1, 1:25, 25:1 (w/w) to sample mRNA differential expression patterns. After incubation at 37°C for 2 hr, each reaction sample (one with Cy3 and another with Cy5 labeling) is treated with 2.5 µl of 0.5M sodium hydroxide and incubated for 20 minutes at 85°C to stop the reaction and degrade the RNA.
  • Probes are purified using two successive CHROMA SPIN 30 gel filtration spin columns (CLONTECH Laboratories, Inc. (CLONTECH), Palo Alto CA) and after combining, both reaction samples are ethanol precipitated using 1 µl of glycogen (1 mg/ml), 60 µl sodium acetate, and 300 µl of 100% ethanol. The probe is then dried to completion using a SpeedVAC (Savant Instruments Inc., Holbrook NY) and resuspended in 14 µl 5X SSC/0.2% SDS.
  • Sequences of the present invention are used to generate array elements.
  • Each array element is amplified from bacterial cells containing vectors with cloned cDNA inserts.
  • PCR amplification uses primers complementary to the vector sequences flanking the cDNA insert.
  • Array elements are amplified in thirty cycles of PCR from an initial quantity of 1-2 ng to a final quantity greater than 5 µg.
  • Amplified array elements are then purified using SEPHACRYL-400 (Amersham Pharmacia Biotech). Purified array elements are immobilized on polymer-coated glass slides. Glass microscope slides (Corning) are cleaned by ultrasound in 0.1% SDS and acetone, with extensive distilled water washes between and after treatments.
  • Patent No. 5,807,522 incorporated herein by reference.
  • 1 µl of the array element DNA is loaded into the open capillary printing element by a high-speed robotic apparatus.
  • the apparatus then deposits about 5 nl of array element sample per slide.
  • Microarrays are UV-crosslinked using a STRATALINKER UV-crosslinker (Stratagene). Microarrays are washed at room temperature once in 0.2% SDS and three times in distilled water.
  • Non-specific binding sites are blocked by incubation of microarrays in 0.2% casein in phosphate buffered saline (PBS) (Tropix, Inc., Bedford, MA) for 30 minutes at 60°C followed by washes in 0.2% SDS and distilled water as before.
  • Hybridization reactions contain 9 µl of probe mixture consisting of 0.2 µg each of Cy3 and Cy5 labeled cDNA synthesis products in 5X SSC, 0.2% SDS hybridization buffer.
  • The probe mixture is heated to 65°C for 5 minutes and is aliquoted onto the microarray surface and covered with a 1.8 cm² coverslip.
  • The arrays are transferred to a waterproof chamber having a cavity just slightly larger than a microscope slide. The chamber is kept at 100% humidity internally by the addition of 140 µl of
  • Reporter-labeled hybridization complexes are detected with a microscope equipped with an Innova 70 mixed gas 10 W laser (Coherent, Inc., Santa Clara CA) capable of generating spectral lines at 488 nm for excitation of Cy3 and at 632 nm for excitation of Cy5.
  • the excitation laser light is focused on the array using a 20X microscope objective (Nikon, Inc., Melville NY).
  • The slide containing the array is placed on a computer-controlled X-Y stage on the microscope and raster-scanned past the objective.
  • the 1.8 cm x 1.8 cm array used in the present example is scanned with a resolution of 20 micrometers.
  • a mixed gas multiline laser excites the two fluorophores sequentially. Emitted light is split, based on wavelength, into two photomultiplier tube detectors (PMT R1477, Hamamatsu Photonics Systems, Bridgewater NJ) corresponding to the two fluorophores. Appropriate filters positioned between the array and the photomultiplier tubes are used to filter the signals.
  • the emission maxima of the fluorophores used are 565 nm for Cy3 and 650 nm for Cy5.
  • Each array is typically scanned twice, one scan per fluorophore using the appropriate filters at the laser source, although the apparatus is capable of recording the spectra from both fluorophores simultaneously.
  • the sensitivity of the scans is typically calibrated using the signal intensity generated by a cDNA control species added to the probe mix at a known concentration.
  • a specific location on the array contains a complementary DNA sequence, allowing the intensity of the signal at that location to be correlated with a weight ratio of hybridizing species of 1 : 100,000.
  • the calibration is done by labeling samples of the calibrating cDNA with the two fluorophores and adding identical amounts of each to the hybridization mixture.
  • The output of the photomultiplier tube is digitized using a 12-bit RTI-835H analog-to-digital (A/D) conversion board (Analog Devices, Inc., Norwood, MA) installed in an IBM-compatible PC computer.
  • the digitized data are displayed as an image where the signal intensity is mapped using a linear 20-color transformation to a pseudocolor scale ranging from blue (low signal) to red (high signal).
  • The data are also analyzed quantitatively. Where two different fluorophores are excited and measured simultaneously, the data are first corrected for optical crosstalk (due to overlapping emission spectra) between the fluorophores using each fluorophore's emission spectrum.
  • a grid is superimposed over the fluorescence signal image such that the signal from each spot is centered in each element of the grid.
  • the fluorescence signal within each element is then integrated to obtain a numerical value corresponding to the average intensity of the signal.
  • the software used for signal analysis is the GEMTOOLS gene expression analysis program (Incyte).
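The grid-based signal integration and the linear 20-color display mapping described above can be illustrated with a small sketch. It is not the GEMTOOLS implementation: the function names are hypothetical, the image is assumed to be a list of rows of digitized intensities, grid elements are assumed to be square and aligned to the image origin, and the binning formula for the pseudocolor index is one plausible linear mapping of 12-bit values onto 20 levels.

```python
def grid_average(image, cell):
    """Average intensity within each cell-by-cell grid element.

    image: list of equal-length rows of pixel intensities.
    cell:  side length of a grid element in pixels.
    Returns a list of rows, one value per grid element.
    """
    n_rows, n_cols = len(image), len(image[0])
    averages = []
    for r0 in range(0, n_rows - cell + 1, cell):
        row = []
        for c0 in range(0, n_cols - cell + 1, cell):
            # Integrate the signal over the element, then normalize by its area.
            total = sum(
                image[r][c]
                for r in range(r0, r0 + cell)
                for c in range(c0, c0 + cell)
            )
            row.append(total / (cell * cell))
        averages.append(row)
    return averages


def pseudocolor_index(value, n_colors=20, bits=12):
    """Map a digitized intensity linearly onto color levels 0 (low) to n_colors - 1 (high)."""
    full_scale = 1 << bits
    clamped = max(0, min(value, full_scale - 1))
    return clamped * n_colors // full_scale
```

In a two-channel experiment, the per-element averages from the Cy3 and Cy5 scans could then be compared element by element after crosstalk correction.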
  • Sequences complementary to the mddt are used to detect, decrease, or inhibit expression of the naturally occurring nucleotide.
  • the use of oligonucleotides comprising from about 15 to 30 base pairs is typical in the art. However, smaller or larger sequence fragments can also be used.
  • Appropriate oligonucleotides are designed from the mddt using OLIGO 4.06 software (National Biosciences) or other appropriate programs and are synthesized using methods standard in the art or ordered from a commercial supplier.
  • To inhibit transcription, a complementary oligonucleotide is designed from the most unique 5' sequence and used to prevent transcription factor binding to the promoter sequence.
  • To inhibit translation, a complementary oligonucleotide is designed to prevent ribosomal binding and processing of the transcript.
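Deriving a complementary oligonucleotide from a target sequence amounts to taking the reverse complement of the chosen region. The sketch below shows only that step; the function names, the default length of 20, and the example sequence are hypothetical, and choosing the "most unique" region is left to a design program such as the one named in the text.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def antisense_oligo(target, start, length=20):
    """Oligonucleotide complementary to target[start:start + length].

    Lengths of about 15 to 30 bases are typical, but other lengths are allowed.
    """
    return reverse_complement(target[start:start + length])
```

For example, `antisense_oligo("ATGGCCATTGTAATGGGCCGCTG", 0, 15)` returns the 15-mer complementary to the first 15 bases of that made-up target.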
  • Expression and purification of MDDT is accomplished using bacterial or virus-based expression systems.
  • cDNA is subcloned into an appropriate vector containing an antibiotic resistance gene and an inducible promoter that directs high levels of cDNA transcription.
  • Promoters include, but are not limited to, the trp-lac (tac) hybrid promoter and the T5 or T7 bacteriophage promoter in conjunction with the lac operator regulatory element.
  • Recombinant vectors are transformed into suitable bacterial hosts, e.g., BL21(DE3).
  • Antibiotic resistant bacteria express MDDT upon induction with isopropyl beta-D-thiogalactopyranoside (IPTG). Expression of MDDT in eukaryotic cells is achieved by infecting insect or mammalian cell lines with recombinant Autographa californica nuclear polyhedrosis virus (AcMNPV), a baculovirus.
  • The nonessential polyhedrin gene of baculovirus is replaced with cDNA encoding MDDT by either homologous recombination or bacterial-mediated transposition involving transfer plasmid intermediates. Viral infectivity is maintained and the strong polyhedrin promoter drives high levels of cDNA transcription. Recombinant baculovirus is used to infect Spodoptera frugiperda (Sf9) insect cells in most cases, or human hepatocytes, in some cases.
  • MDDT is synthesized as a fusion protein with, e.g., glutathione S-transferase (GST) or a peptide epitope tag, such as FLAG or 6-His, permitting rapid, single-step, affinity-based purification of recombinant fusion protein from crude cell lysates. GST, a 26-kilodalton enzyme from Schistosoma japonicum, enables the purification of fusion proteins on immobilized glutathione under conditions that maintain protein activity and antigenicity (Amersham Pharmacia Biotech). Following purification, the GST moiety can be proteolytically cleaved from MDDT at specifically engineered sites. FLAG, an 8-amino acid peptide, enables immunoaffinity purification using commercially available monoclonal and polyclonal anti-FLAG antibodies (Eastman Kodak
  • MDDT, or biologically active fragments thereof, are labeled with 125I Bolton-Hunter reagent. (See, e.g., Bolton, A.E. and W.M. Hunter (1973) Biochem. J. 133:529-539.)
  • Candidate molecules previously arrayed in the wells of a multi-well plate are incubated with the labeled MDDT, washed, and any wells with labeled MDDT complex are assayed.
  • Data obtained using different concentrations of MDDT are used to calculate values for the number, affinity, and association of MDDT with the candidate molecules.
  • Molecules interacting with MDDT are analyzed using the yeast two-hybrid system as described in Fields, S. and O. Song (1989) Nature 340:245-246, or using commercially available kits based on the two-hybrid system, such as the MATCHMAKER system (CLONTECH).
  • MDDT may also be used in the PATHCALLING process (CuraGen Corp., New Haven CT) which employs the yeast two-hybrid system in a high-throughput manner to determine all interactions between the proteins encoded by two large libraries of genes (Nandabalan, K. et al. (2000) U.S. Patent No. 6,057,101).
  • MDDT function is assessed by expressing mddt at physiologically elevated levels in mammalian cell culture systems.
  • cDNA is subcloned into a mammalian expression vector containing a strong promoter that drives high levels of cDNA expression.
  • Vectors of choice include pCMV SPORT (Life Technologies) and pCR3.1 (Invitrogen Corporation, Carlsbad CA), both of which contain the cytomegalovirus promoter.
  • 5-10 µg of recombinant vector are transiently transfected into a human cell line, preferably of endothelial or hematopoietic origin, using either liposome formulations or electroporation.
  • 1-2 µg of an additional plasmid containing sequences encoding a marker protein are co-transfected.
  • Expression of a marker protein provides a means to distinguish transfected cells from nontransfected cells and is a reliable predictor of cDNA expression from the recombinant vector.
  • Marker proteins of choice include, e.g., Green Fluorescent Protein (GFP; CLONTECH), CD64, or a CD64-GFP fusion protein.
  • Flow cytometry (FCM), an automated laser optics-based technique, is used to identify transfected cells expressing GFP or CD64-GFP and to evaluate the apoptotic state of the cells and other cellular properties. FCM detects and quantifies the uptake of fluorescent molecules that diagnose events preceding or coincident with cell death.
  • CD64 and CD64-GFP are expressed on the surface of transfected cells and bind to conserved regions of human immunoglobulin G (IgG).
  • Transfected cells are efficiently separated from nontransfected cells using magnetic beads coated with either human IgG or antibody against CD64 (DYNAL, Inc., Lake Success NY).
  • mRNA can be purified from the cells using methods well known by those of skill in the art. Expression of mRNA encoding MDDT and other genes of interest can be analyzed by northern analysis or microarray techniques.
  • the MDDT amino acid sequence is analyzed using LASERGENE software (DNASTAR) to determine regions of high immunogenicity, and a corresponding peptide is synthesized and used to raise antibodies by means known to those of skill in the art.
  • Methods for selection of appropriate epitopes, such as those near the C-terminus or in hydrophilic regions, are well described in the art. (See, e.g., Ausubel, 1995, supra, Chapter 11.)
  • Peptides 15 residues in length are synthesized using an ABI 431A peptide synthesizer (Applied Biosystems) using fmoc-chemistry and coupled to KLH (Sigma) by reaction with N-maleimidobenzoyl-N-hydroxysuccinimide ester (MBS) to increase immunogenicity.
  • Rabbits are immunized with the peptide-KLH complex in complete Freund's adjuvant.
  • Resulting antisera are tested for antipeptide activity by, for example, binding the peptide to plastic, blocking with 1% BSA, reacting with rabbit antisera, washing, and reacting with radioiodinated goat anti-rabbit IgG.
  • Antisera with antipeptide activity are tested for anti-MDDT activity using protocols well known in the art, including ELISA, RIA, and immunoblotting.
  • Naturally occurring or recombinant MDDT is substantially purified by immunoaffinity chromatography using antibodies specific for MDDT.
  • An immunoaffinity column is constructed by covalently coupling anti-MDDT antibody to an activated chromatographic resin, such as CNBr-activated SEPHAROSE (Amersham Pharmacia Biotech). After the coupling, the resin is blocked and washed according to the manufacturer's instructions.
  • Media containing MDDT are passed over the immunoaffinity column, and the column is washed under conditions that allow the preferential absorbance of MDDT (e.g., high ionic strength buffers in the presence of detergent).
  • the column is eluted under conditions that disrupt antibody/MDDT binding (e.g., a buffer of pH 2 to pH 3, or a high concentration of a chaotrope, such as urea or thiocyanate ion), and MDDT is collected.
  • NT2RP2003000 weakly similar to TUMOR NECROSIS FACTOR, ALPHA- TABLE 2
  • NT2RP3002351 weakly similar to Human mRNA for NAD-dependent methylene tetrahydrofolate
  • NT2RP2006571 moderately similar to CYTOCHROME
  • NT2RM2000371 weakly similar to POLYRIBONUCLEOTIDE
  • IRTA2 immunoglobulin receptor translocation associated protein 2c
  • FKSG82 Homo sapiens serine/threonine kinase FKSG82 (FKSG82) mRNA
  • NT2RP2004961 moderately similar to Rattus norvegicus KRAB/zinc finger

Abstract

The present invention provides purified disease detection and treatment molecule polynucleotides (mddt). Also encompassed are the polypeptides (MDDT) encoded by mddt. The invention also provides for the use of mddt, or complements, oligonucleotides, or fragments thereof in diagnostic assays. The invention further provides for vectors and host cells containing mddt for the expression of MDDT. The invention additionally provides for the use of isolated and purified MDDT to induce antibodies and to screen libraries of compounds and the use of anti-MDDT antibodies in diagnostic assays. Also provided are microarrays containing mddt and methods of use.

Description

MOLECULES FOR DISEASE DETECTION AND TREATMENT
TECHNICAL FIELD
The present invention relates to molecules for disease detection and treatment and to the use of these sequences in the diagnosis, study, prevention, and treatment of diseases associated with, as well as effects of exogenous compounds on, the expression of molecules for disease detection and treatment.
BACKGROUND OF THE INVENTION
The human genome is comprised of thousands of genes, many encoding gene products that function in the maintenance and growth of the various cells and tissues in the body. Aberrant expression or mutations in these genes and their products is the cause of, or is associated with, a variety of human diseases such as cancer and other cell proliferative disorders. The identification of these genes and their products is the basis of an ever-expanding effort to find markers for early detection of diseases, and targets for their prevention and treatment.
For example, cancer represents a type of cell proliferative disorder that affects nearly every tissue in the body. A wide variety of molecules, either aberrantly expressed or mutated, can be the cause of, or involved with, various cancers because tissue growth involves complex and ordered patterns of cell proliferation, cell differentiation, and apoptosis. Cell proliferation must be regulated to maintain both the number of cells and their spatial organization. This regulation depends upon the appropriate expression of proteins which control cell cycle progression in response to extracellular signals such as growth factors and other mitogens, and intracellular cues such as DNA damage or nutrient starvation. Molecules which directly or indirectly modulate cell cycle progression fall into several categories, including growth factors and their receptors, second messenger and signal transduction proteins, oncogene products, tumor-suppressor proteins, and mitosis-promoting factors. Aberrant expression or mutations in any of these gene products can result in cell proliferative disorders such as cancer. Oncogenes are genes generally derived from normal genes that, through abnormal expression or mutation, can effect the transformation of a normal cell to a malignant one (oncogenesis). Oncoproteins, encoded by oncogenes, can affect cell proliferation in a variety of ways and include growth factors, growth factor receptors, intracellular signal transducers, nuclear transcription factors, and cell-cycle control proteins. In contrast, tumor-suppressor genes are involved in inhibiting cell proliferation. Mutations which cause reduced or loss of function in tumor-suppressor genes result in aberrant cell proliferation and cancer. Thus a wide variety of genes and their products have been found that are associated with cell proliferative disorders such as cancer, but many more may exist that are yet to be discovered.
DNA-based arrays can provide a simple way to explore the expression of a single polymorphic gene or a large number of genes. When the expression of a single gene is explored, DNA-based arrays are employed to detect the expression of specific gene variants. For example, a p53 tumor suppressor gene array is used to determine whether individuals are carrying mutations that predispose them to cancer. A cytochrome p450 gene array is useful to determine whether individuals have one of a number of specific mutations that could result in increased drug metabolism, drug resistance or drug toxicity.
DNA-based array technology is especially relevant for the rapid screening of expression of a large number of genes. There is a growing awareness that gene expression is affected in a global fashion. A genetic predisposition, disease or therapeutic treatment may affect, directly or indirectly, the expression of a large number of genes. In some cases the interactions may be expected, such as when the genes are part of the same signaling pathway. In other cases, such as when the genes participate in separate signaling pathways, the interactions may be totally unexpected. Therefore, DNA-based arrays can be used to investigate how genetic predisposition, disease, or therapeutic treatment affects the expression of a large number of genes.
The discovery of new molecules for disease detection and treatment satisfies a need in the art by providing new compositions which are useful in the diagnosis, study, prevention, and treatment of diseases associated with, as well as effects of exogenous compounds on, the expression of molecules for disease detection and treatment.
SUMMARY OF THE INVENTION
The present invention relates to human disease detection and treatment molecule polynucleotides (mddt) as presented in the Sequence Listing. The mddt uniquely identify genes encoding structural, functional, and regulatory disease detection and treatment molecules.
The invention provides an isolated polynucleotide selected from the group consisting of a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-252; b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-252; c) a polynucleotide complementary to the polynucleotide of a); d) a polynucleotide complementary to the polynucleotide of b); and e) an RNA equivalent of a) through d). In one alternative, the polynucleotide comprises a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-252. In another alternative, the polynucleotide comprises at least 30 contiguous nucleotides of a polynucleotide selected from the group consisting of a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-252; b) a polynucleotide comprising a naturally occurring polynucleotide comprising a polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-252; c) a polynucleotide complementary to the polynucleotide of a); d) a polynucleotide complementary to the polynucleotide of b); and e) an RNA equivalent of a) through d). In another alternative, the polynucleotide comprises at least 60 contiguous nucleotides of a polynucleotide selected from the group consisting of a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-252; b) a polynucleotide comprising a naturally occurring polynucleotide comprising a polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-252; c) a polynucleotide complementary to the polynucleotide of a); d) a polynucleotide complementary to the polynucleotide of b); and e) an RNA equivalent of a) through d).
The invention further provides a composition for the detection of expression of disease detection and treatment molecule polynucleotides comprising at least one isolated polynucleotide comprising a polynucleotide selected from the group consisting of a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-252; b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-252; c) a polynucleotide complementary to the polynucleotide of a); d) a polynucleotide complementary to the polynucleotide of b); and e) an RNA equivalent of a) through d); and a detectable label.
The invention also provides a method for detecting a target polynucleotide in a sample, said target polynucleotide having a polynucleotide sequence of a polynucleotide selected from the group consisting of a) a polynucleotide comprising a polynucleotide sequence of a polynucleotide selected from the group consisting of SEQ ID NO: 1-252; b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-252; c) a polynucleotide complementary to the polynucleotide of a); d) a polynucleotide complementary to the polynucleotide of b); and e) an RNA equivalent of a) through d).
The method comprises a) amplifying said target polynucleotide or fragment thereof using polymerase chain reaction amplification, and b) detecting the presence or absence of said amplified target polynucleotide or fragment thereof, and, optionally, if present, the amount thereof.
The invention also provides a method for detecting a target polynucleotide in a sample, said target polynucleotide having a polynucleotide selected from the group consisting of a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-252; b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-252; c) a polynucleotide complementary to the polynucleotide of a); d) a polynucleotide complementary to the polynucleotide of b); and e) an RNA equivalent of a) through d). The method comprises a) hybridizing the sample with a probe comprising at least 20 contiguous nucleotides comprising a sequence complementary to said target polynucleotide in the sample, and which probe specifically hybridizes to said target polynucleotide, under conditions whereby a hybridization complex is formed between said probe and said target polynucleotide, and b) detecting the presence or absence of said hybridization complex, and, optionally, if present, the amount thereof. In one alternative, the invention provides a composition comprising a target polynucleotide of the method, wherein said probe comprises at least 30 contiguous nucleotides. In one alternative, the invention provides a composition comprising a target polynucleotide of the method, wherein said probe comprises at least 60 contiguous nucleotides.
The invention further provides a recombinant polynucleotide comprising a promoter sequence operably linked to an isolated polynucleotide selected from the group consisting of a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-252; b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-252; c) a polynucleotide complementary to the polynucleotide of a); d) a polynucleotide complementary to the polynucleotide of b); and e) an RNA equivalent of a) through d). In one alternative, the invention provides a cell transformed with the recombinant polynucleotide. In another alternative, the invention provides a transgenic organism comprising the recombinant polynucleotide.
The invention also provides a method for producing a disease detection and treatment polypeptide, the method comprising a) culturing a cell under conditions suitable for expression of the disease detection and treatment polypeptide, wherein said cell is transformed with a recombinant polynucleotide, said recombinant polynucleotide comprising an isolated polynucleotide selected from the group consisting of i) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-252; ii) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-252; iii) a polynucleotide complementary to the polynucleotide of i); iv) a polynucleotide complementary to the polynucleotide of ii); and v) an RNA equivalent of i) through iv), and b) recovering the disease detection and treatment polypeptide so expressed. The invention additionally provides a method wherein the polypeptide has an amino acid sequence selected from the group consisting of SEQ ID NO:253-506. The invention also provides an isolated disease detection and treatment polypeptide (MDDT) encoded by at least one polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-252. The invention further provides a method of screening for a test compound that specifically binds to the polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:253-506. 
The method comprises a) combining the polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:253-506 with at least one test compound under suitable conditions, and b) detecting binding of the polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:253-506 to the test compound, thereby identifying a compound that specifically binds to the polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:253-506.
The invention further provides a microarray wherein at least one element of the microarray is an isolated polynucleotide comprising at least 30 contiguous nucleotides of a polynucleotide selected from the group consisting of a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-252; b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-252; c) a polynucleotide complementary to the polynucleotide of a); d) a polynucleotide complementary to the polynucleotide of b); and e) an RNA equivalent of a) through d).
The invention also provides a method for generating a transcript image of a sample which contains polynucleotides. The method comprises a) labeling the polynucleotides of the sample, b) contacting the elements of the microarray with the labeled polynucleotides of the sample under conditions suitable for the formation of a hybridization complex, and c) quantifying the expression of the polynucleotides in the sample.
Additionally, the invention provides a method for screening a compound for effectiveness in altering expression of a target polynucleotide, wherein said target polynucleotide comprises a polynucleotide selected from the group consisting of a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-252; b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-252; c) a polynucleotide complementary to the polynucleotide of a); d) a polynucleotide complementary to the polynucleotide of b); and e) an RNA equivalent of a) through d). The method comprises a) exposing a sample comprising the target polynucleotide to a compound, b) detecting altered expression of the target polynucleotide, and c) comparing the expression of the target polynucleotide in the presence of varying amounts of the compound and in the absence of the compound.
The invention further provides a method for assessing toxicity of a test compound, said method comprising a) treating a biological sample containing nucleic acids with the test compound; b) hybridizing the nucleic acids of the treated biological sample with a probe comprising at least 20 contiguous nucleotides of a polynucleotide selected from the group consisting of i) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-252; ii) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-252; iii) a polynucleotide complementary to the polynucleotide of i); iv) a polynucleotide complementary to the polynucleotide of ii); and v) an RNA equivalent of i) through iv). Hybridization occurs under conditions whereby a specific hybridization complex is formed between said probe and a target polynucleotide in the biological sample, said target polynucleotide comprising a polynucleotide sequence of a polynucleotide selected from the group consisting of i) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-252; ii) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-252; iii) a polynucleotide complementary to the polynucleotide of i); iv) a polynucleotide complementary to the polynucleotide of ii); and v) an RNA equivalent of i) through iv), and alternatively, the target polynucleotide comprises a polynucleotide sequence of a fragment of a polynucleotide selected from the group consisting of i-v above; c) quantifying the amount of hybridization complex; and d) comparing the amount of hybridization complex in the treated biological sample with the amount of hybridization complex in an untreated biological sample, wherein a difference in the amount of hybridization complex in the treated biological sample is indicative of toxicity of the test compound.
The invention further provides an isolated polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:253-506, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:253-506, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:253-506, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:253-506. In one alternative, the invention provides an isolated polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:253-506.
The invention further provides an isolated polynucleotide encoding a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:253-506, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:253-506, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:253-506, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:253-506. In one alternative, the polynucleotide encodes a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:253-506. In another alternative, the polynucleotide comprises a polynucleotide sequence selected from the group consisting of SEQ ID NO: 1-252.
Additionally, the invention provides an isolated antibody which specifically binds to a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:253-506, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:253-506, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:253-506, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:253-506.
The invention further provides a composition comprising a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:253-506, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:253-506, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:253-506, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:253-506, and a pharmaceutically acceptable excipient. In one embodiment, the composition comprises a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:253-506. The invention additionally provides a method of treating a disease or condition associated with decreased expression of functional MDDT, comprising administering to a patient in need of such treatment the composition.
The invention also provides a method for screening a compound for effectiveness as an agonist of a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:253-506, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:253-506, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:253-506, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:253-506. The method comprises a) exposing a sample comprising the polypeptide to a compound, and b) detecting agonist activity in the sample. In one alternative, the invention provides a composition comprising an agonist compound identified by the method and a pharmaceutically acceptable excipient. In another alternative, the invention provides a method of treating a disease or condition associated with decreased expression of functional MDDT, comprising administering to a patient in need of such treatment the composition.
Additionally, the invention provides a method for screening a compound for effectiveness as an antagonist of a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:253-506, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO: 253-506, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:253-506, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:253-506. The method comprises a) exposing a sample comprising the polypeptide to a compound, and b) detecting antagonist activity in the sample.
In one alternative, the invention provides a composition comprising an antagonist compound identified by the method and a pharmaceutically acceptable excipient. In another alternative, the invention provides a method of treating a disease or condition associated with overexpression of functional MDDT, comprising administering to a patient in need of such treatment the composition.
The invention further provides a method of screening for a compound that modulates the activity of a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:253-506, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:253-506, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:253-506, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:253-506. The method comprises a) combining the polypeptide with at least one test compound under conditions permissive for the activity of the polypeptide, b) assessing the activity of the polypeptide in the presence of the test compound, and c) comparing the activity of the polypeptide in the presence of the test compound with the activity of the polypeptide in the absence of the test compound, wherein a change in the activity of the polypeptide in the presence of the test compound is indicative of a compound that modulates the activity of the polypeptide.
DESCRIPTION OF THE TABLES
Table 1 shows the sequence identification numbers (SEQ ID NO:s) and template identification numbers (template IDs) corresponding to the polynucleotides of the present invention, along with the sequence identification numbers (SEQ ID NO:s) and open reading frame identification numbers (ORF IDs) corresponding to polypeptides encoded by the template ID.
Table 2 shows the sequence identification numbers (SEQ ID NO:s) and template identification numbers (template IDs) corresponding to the polynucleotides of the present invention, along with their GenBank hits (GI Numbers), probability scores, and functional annotations corresponding to the GenBank hits.
Table 3 shows the sequence identification numbers (SEQ ID NO:s) and template identification numbers (template IDs) corresponding to the polynucleotides of the present invention, along with polynucleotide segments of each template sequence as defined by the indicated "start" and "stop" nucleotide positions. The reading frames of the polynucleotide segments and the Pfam hits, Pfam descriptions, and E-values corresponding to the polypeptide domains encoded by the polynucleotide segments are indicated.
Table 4 shows the sequence identification numbers (SEQ ID NO:s) and template identification numbers (template IDs) corresponding to the polynucleotides of the present invention, along with polynucleotide segments of each template sequence as defined by the indicated "start" and "stop" nucleotide positions. The reading frames of the polynucleotide segments are shown, and the polypeptides encoded by the polynucleotide segments constitute either signal peptide (SP) or transmembrane (TM) domains, as indicated. The membrane topology of the encoded polypeptide sequence is indicated, the N-terminus (N) listed as being oriented to either the cytosolic (N in) or non-cytosolic (N out) side of the cell membrane or organelle.
Table 5 shows the sequence identification numbers (SEQ ID NO:s) and template identification numbers (template IDs) corresponding to the polynucleotides of the present invention, along with component sequence identification numbers (component IDs) corresponding to each template. The component sequences, which were used to assemble the template sequences, are defined by the indicated "start" and "stop" nucleotide positions along each template.
Table 6 shows the tissue distribution profiles for the templates of the invention.
Table 7 shows the sequence identification numbers (SEQ ID NO:s) corresponding to the polypeptides of the present invention, along with the reading frames used to obtain the polypeptide segments, the lengths of the polypeptide segments, the "start" and "stop" nucleotide positions of the polynucleotide sequences used to define the encoded polypeptide segments, the GenBank hits (GI Numbers), probability scores, and functional annotations corresponding to the GenBank hits.
Table 8 summarizes the bioinformatics tools which are useful for analysis of the polynucleotides of the present invention. The first column of Table 8 lists analytical tools, programs, and algorithms, the second column provides brief descriptions thereof, the third column presents appropriate references, all of which are incorporated by reference herein in their entirety, and the fourth column presents, where applicable, the scores, probability values, and other parameters used to evaluate the strength of a match between two sequences (the higher the score, the greater the homology between two sequences).
DETAILED DESCRIPTION OF THE INVENTION
Before the nucleic acid sequences and methods are presented, it is to be understood that this invention is not limited to the particular machines, methods, and materials described. Although particular embodiments are described, machines, methods, and materials similar or equivalent to these embodiments may be used to practice the invention. The preferred machines, methods, and materials set forth are not intended to limit the scope of the invention which is limited only by the appended claims.
The singular forms "a", "an", and "the" include plural reference unless the context clearly dictates otherwise. All technical and scientific terms have the meanings commonly understood by one of ordinary skill in the art. All publications are incorporated by reference for the purpose of describing and disclosing the cell lines, vectors, and methodologies which are presented and which might be used in connection with the invention. Nothing in the specification is to be construed as an admission that the invention is not entitled to antedate such disclosure by virtue of prior invention.
Definitions
As used herein, the lower case "mddt" refers to a nucleic acid sequence, while the upper case "MDDT" refers to an amino acid sequence encoded by mddt. A "full-length" mddt refers to a nucleic acid sequence containing the entire coding region of a gene endogenously expressed in human tissue.
"Adjuvants" are materials such as Freund's adjuvant, mineral gels (aluminum hydroxide), and surface active substances (lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, keyhole limpet hemocyanin, and dinitrophenol) which may be administered to increase a host's immunological response.
"Allele" refers to an alternative form of a nucleic acid sequence. Alleles result from a "mutation," a change or an alternative reading of the genetic code. Any given gene may have none, one, or many allelic forms. Mutations which give rise to alleles include deletions, additions, or substitutions of nucleotides. Each of these changes may occur alone, or in combination with the others, one or more times in a given nucleic acid sequence. The present invention encompasses allelic mddt.
"Amino acid sequence" refers to a peptide, a polypeptide, or a protein of either natural or synthetic origin. The amino acid sequence is not limited to the complete, endogenous amino acid sequence and may be a fragment, epitope, variant, or derivative of a protein expressed by a nucleic acid sequence.
"Amplification" refers to the production of additional copies of a sequence and is carried out using polymerase chain reaction (PCR) technologies well known in the art.
"Antibody" refers to intact molecules as well as to fragments thereof, such as Fab, F(ab')2, and Fv fragments, which are capable of binding the epitopic determinant. Antibodies that bind MDDT polypeptides can be prepared using intact polypeptides or using fragments containing small peptides of interest as the immunizing antigen. The polypeptide or peptide used to immunize an animal (e.g., a mouse, a rat, or a rabbit) can be derived from the translation of RNA, or synthesized chemically, and can be conjugated to a carrier protein if desired. Commonly used carriers that are chemically coupled to peptides include bovine serum albumin, thyroglobulin, and keyhole limpet hemocyanin (KLH). The coupled peptide is then used to immunize the animal.
"Antisense sequence" refers to a sequence capable of specifically hybridizing to a target sequence. The antisense sequence may include DNA, RNA, or any nucleic acid mimic or analog such as peptide nucleic acid (PNA); oligonucleotides having modified backbone linkages such as phosphorothioates, methylphosphonates, or benzylphosphonates; oligonucleotides having modified sugar groups such as 2'-methoxyethyl sugars or 2'-methoxyethoxy sugars; or oligonucleotides having modified bases such as 5-methyl cytosine, 2'-deoxyuracil, or 7-deaza-2'-deoxyguanosine. "Antisense sequence" refers to a sequence capable of specifically hybridizing to a target sequence. The antisense sequence can be DNA, RNA, or any nucleic acid mimic or analog.
"Antisense technology" refers to any technology which relies on the specific hybridization of an antisense sequence to a target sequence.
A "bin" is a portion of computer memory space used by a computer program for storage of data, and bounded in such a manner that data stored in a bin may be retrieved by the program.
"Biologically active" refers to an amino acid sequence having a structural, regulatory, or biochemical function of a naturally occurring amino acid sequence.
"Clone joining" is a process for combining gene bins based upon the bins' containing sequence information from the same clone. The sequences may assemble into a primary gene transcript as well as one or more splice variants.
"Complementary" describes the relationship between two single-stranded nucleic acid sequences that anneal by base-pairing (5'-A-G-T-3' pairs with its complement 3'-T-C-A-5').
A "component sequence" is a nucleic acid sequence selected by a computer program such as PHRED and used to assemble a consensus or template sequence from one or more component sequences.
A "consensus sequence" or "template sequence" is a nucleic acid sequence which has been assembled from overlapping sequences, using a computer program for fragment assembly such as the GELVIEW fragment assembly system (Genetics Computer Group (GCG), Madison WI) or using a relational database management system (RDMS).
"Conservative amino acid substitutions" are those substitutions that, when made, least interfere with the properties of the original protein, i.e., the structure and especially the function of the protein is conserved and not significantly changed by such substitutions. The table below shows amino acids which may be substituted for an original amino acid in a protein and which are regarded as conservative substitutions.
Original Residue    Conservative Substitution
Ala                 Gly, Ser
Arg                 His, Lys
Asn                 Asp, Gln, His
Asp                 Asn, Glu
Cys                 Ala, Ser
Gln                 Asn, Glu, His
Glu                 Asp, Gln, His
Gly                 Ala
His                 Asn, Arg, Gln, Glu
Ile                 Leu, Val
Leu                 Ile, Val
Lys                 Arg, Gln, Glu
Met                 Leu, Ile
Phe                 His, Met, Leu, Trp, Tyr
Ser                 Cys, Thr
Thr                 Ser, Val
Trp                 Phe, Tyr
Tyr                 His, Phe, Trp
Val                 Ile, Leu, Thr
Conservative substitutions generally maintain (a) the structure of the polypeptide backbone in the area of the substitution, for example, as a beta sheet or alpha helical conformation, (b) the charge or hydrophobicity of the molecule at the target site, or (c) the bulk of the side chain.
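As a minimal sketch, the substitution table can be held as a lookup keyed on the original residue. The three-letter codes are written in their standard forms (Gln, Ile); the names CONSERVATIVE and is_conservative are illustrative assumptions, not part of the specification. The lookup is directional, following the table as written: a pair listed for one original residue is not assumed to hold in reverse.

```python
# Conservative substitutions keyed by original residue, transcribed from the table.
# The dictionary and function names are hypothetical, chosen for illustration.
CONSERVATIVE = {
    "Ala": {"Gly", "Ser"},
    "Arg": {"His", "Lys"},
    "Asn": {"Asp", "Gln", "His"},
    "Asp": {"Asn", "Glu"},
    "Cys": {"Ala", "Ser"},
    "Gln": {"Asn", "Glu", "His"},
    "Glu": {"Asp", "Gln", "His"},
    "Gly": {"Ala"},
    "His": {"Asn", "Arg", "Gln", "Glu"},
    "Ile": {"Leu", "Val"},
    "Leu": {"Ile", "Val"},
    "Lys": {"Arg", "Gln", "Glu"},
    "Met": {"Leu", "Ile"},
    "Phe": {"His", "Met", "Leu", "Trp", "Tyr"},
    "Ser": {"Cys", "Thr"},
    "Thr": {"Ser", "Val"},
    "Trp": {"Phe", "Tyr"},
    "Tyr": {"His", "Phe", "Trp"},
    "Val": {"Ile", "Leu", "Thr"},
}


def is_conservative(original: str, substitute: str) -> bool:
    """True if the table lists `substitute` as conservative for `original`."""
    return substitute in CONSERVATIVE.get(original, set())


print(is_conservative("Cys", "Ala"))  # True: Ala is listed under Cys
print(is_conservative("Ala", "Cys"))  # False: Cys is not listed under Ala
```

Residues absent from the table, such as Pro, simply return False for any substitution.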
"Deletion" refers to a change in either a nucleic or amino acid sequence in which at least one nucleotide or amino acid residue, respectively, is absent.
"Derivative" refers to the chemical modification of a nucleic acid sequence, such as by replacement of hydrogen by an alkyl, acyl, amino, hydroxyl, or other group. "Differential expression" refers to increased or upregulated; or decreased, downregulated, or absent gene or protein expression, determined by comparing at least two different samples. Such comparisons may be carried out between, for example, a treated and an untreated sample, or a diseased and a normal sample.
The terms "element" and "array element" refer to a polynucleotide, polypeptide, or other chemical compound having a unique and defined position on a microarray.
"E-value" refers to the statistical probability that a match between two sequences occurred by chance.
"Exon shuffling" refers to the recombination of different coding regions (exons). Since an exon may represent a structural or functional domain of the encoded protein, new proteins may be assembled through the novel reassortment of stable substructures, thus allowing acceleration of the evolution of new protein functions. A "fragment" is a unique portion of mddt or MDDT which is identical in sequence to but shorter in length than the parent sequence. A fragment may comprise up to the entire length of the defined sequence, minus one nucleotide/amino acid residue. For example, a fragment may comprise from 10 to 1000 contiguous amino acid residues or nucleotides. A fragment used as a probe, primer, antigen, therapeutic molecule, or for other purposes, may be at least 5, 10, 15, 16, 20, 25, 30, 40, 50, 60,
75, 100, 150, 250 or at least 500 contiguous amino acid residues or nucleotides in length. Fragments may be preferentially selected from certain regions of a molecule. For example, a polypeptide fragment may comprise a certain length of contiguous amino acids selected from the first 250 or 500 amino acids (or first 25% or 50%) of a polypeptide as shown in a certain defined sequence. Clearly these lengths are exemplary, and any length that is supported by the specification, including the
Sequence Listing and the figures, may be encompassed by the present embodiments.
A fragment of mddt comprises a region of unique polynucleotide sequence that specifically identifies mddt, for example, as distinct from any other sequence in the same genome. A fragment of mddt is useful, for example, in hybridization and amplification technologies and in analogous methods that distinguish mddt from related polynucleotide sequences. The precise length of a fragment of mddt and the region of mddt to which the fragment corresponds are routinely determinable by one of ordinary skill in the art based on the intended purpose for the fragment.
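By way of illustration only, the notion of a fragment that "specifically identifies" a polynucleotide can be sketched in Python as a search for fixed-length windows of a target sequence that do not occur in any other supplied sequence. The function name `unique_fragments` and its arguments are hypothetical, and the sketch considers exact matches on one strand only; it is not the method used to select fragments in the present embodiments.

```python
def unique_fragments(target, others, length):
    """Return (offset, fragment) pairs for every window of `length` bases in
    `target` that does not occur exactly in any sequence in `others`.

    Simplifying assumptions: exact string matching, forward strand only.
    """
    # Collect every window of the requested length from the background sequences.
    background = set()
    for seq in others:
        for i in range(len(seq) - length + 1):
            background.add(seq[i:i + length])

    # Keep the target windows that are absent from the background.
    result = []
    for i in range(len(target) - length + 1):
        fragment = target[i:i + length]
        if fragment not in background:
            result.append((i, fragment))
    return result
```

For example, `unique_fragments("ACGTAC", ["CGTAAA"], 3)` keeps only the windows starting at offsets 0 and 3, since the two middle windows also occur in the second sequence.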
A fragment of MDDT is encoded by a fragment of mddt. A fragment of MDDT comprises a region of unique amino acid sequence that specifically identifies MDDT. For example, a fragment of MDDT is useful as an immunogenic peptide for the development of antibodies that specifically recognize MDDT. The precise length of a fragment of MDDT and the region of MDDT to which the fragment corresponds are routinely determinable by one of ordinary skill in the art based on the intended purpose for the fragment.
A "full length" nucleotide sequence is one containing at least a start site for translation to a protein sequence, followed by an open reading frame and a stop site, and encoding a "full length" polypeptide.
"Hit" refers to a sequence whose annotation will be used to describe a given template. Criteria for selecting the top hit are as follows: if the template has one or more exact nucleic acid matches, the top hit is the exact match with highest percent identity. If the template has no exact matches but has significant protein hits, the top hit is the protein hit with the lowest E-value. If the template has no significant protein hits, but does have significant non-exact nucleotide hits, the top hit is the nucleotide hit with the lowest E-value.
"Homology" refers to sequence similarity either between a reference nucleic acid sequence and at least a fragment of an mddt or between a reference amino acid sequence and a fragment of an MDDT. "Hybridization" refers to the process by which a strand of nucleotides anneals with a complementary strand through base pairing. Specific hybridization is an indication that two nucleic acid sequences share a high degree of identity. Specific hybridization complexes form under defined annealing conditions, and remain hybridized after the "washing" step. The defined hybridization conditions include the annealing conditions and the washing step(s), the latter of which is particularly important in determining the stringency of the hybridization process, with more stringent condiUons allowing less non-specific binding, i.e., binding between pairs of nucleic acid probes that are not perfectly matched. Permissive conditions for annealing of nucleic acid sequences are routinely determinable and may be consistent among hybridization experiments, whereas wash conditions may be varied among experiments to achieve the desired stringency.
Generally, stringency of hybridization is expressed with reference to the temperature under which the wash step is carried out. Generally, such wash temperatures are selected to be about 5°C to 20°C lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. The Tm is the temperature (under defined ionic strength and pH) at which 50% of the target sequence hybridizes to a perfectly matched probe. An equation for calculating Tm and conditions for nucleic acid hybridization is well known and can be found in Sambrook et al., 1989, Molecular Cloning: A Laboratory Manual. 2nd ed., vol. 1-3, Cold Spring Harbor Press, Plainview NY; specifically see volume 2, chapter 9.
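As a simple worked illustration of the rule above, the following Python sketch takes a Tm value as input and returns the corresponding wash temperature window of about 5°C to 20°C below Tm. The function name and parameters are hypothetical. The Tm itself is assumed to have been calculated separately, for example by the equation referenced in Sambrook et al., which is not reproduced here.

```python
def wash_temperature_window(tm_celsius, low_offset=5.0, high_offset=20.0):
    """Return (coolest, warmest) wash temperatures in degrees Celsius.

    The window spans `high_offset` to `low_offset` degrees below the supplied Tm,
    following the "about 5 to 20 degrees lower than Tm" guideline in the text.
    """
    if low_offset > high_offset:
        raise ValueError("low_offset must not exceed high_offset")
    return tm_celsius - high_offset, tm_celsius - low_offset
```

For a sequence with a Tm of 85°C at the defined ionic strength and pH, the sketch returns a window of 65°C to 80°C. Washes toward the upper end of the window are closer to Tm and thus more stringent.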
High stringency conditions for hybridization between polynucleotides of the present invention include wash conditions of 68°C in the presence of about 0.2 x SSC and about 0.1% SDS, for 1 hour.
Alternatively, temperatures of about 65°C, 60°C, or 55°C may be used. SSC concentration may be varied from about 0.2 to 2 x SSC, with SDS being present at about 0.1%. Typically, blocking reagents are used to block non-specific hybridization. Such blocking reagents include, for instance, denatured salmon sperm DNA at about 100-200 μg/ml. Useful variations on these conditions will be readily apparent to those skilled in the art. Hybridization, particularly under high stringency conditions, may be suggestive of evolutionary similarity between the nucleotides. Such similarity is strongly indicative of a similar role for the nucleotides and their resultant proteins.
Other parameters, such as temperature, salt concentration, and detergent concentration may be varied to achieve the desired stringency. Denaturants, such as formamide at a concentration of about 35-50% v/v, may also be used under particular circumstances, such as RNA:DNA hybridizations. Appropriate hybridization conditions are routinely determinable by one of ordinary skill in the art.
"Immunologically active" or "immunogenic" describes the potential for a natural, recombinant, or synthetic peptide, epitope, polypeptide, or protein to induce antibody production in appropriate animals, cells, or cell lines. "Insertion" or "addition" refers to a change in either a nucleic or amino acid sequence in which at least one nucleotide or residue, respectively, is added to the sequence.
"Labeling" refers to the covalent or noncovalent joining of a polynucleotide, polypeptide, or antibody with a reporter molecule capable of producing a detectable or measurable signal. "Microarray" is any arrangement of nucleic acids, amino acids, antibodies, etc., on a substrate. The substrate may be a solid support such as beads, glass, paper, nitrocellulose, nylon, or an appropriate membrane.
"Linkers" are short stretches of nucleotide sequence which may be added to a vector or an mddt to create restriction endonuclease sites to facilitate cloning. "Polylinkers" are engineered to incorporate multiple restriction enzyme sites and to provide for the use of enzymes which leave 5' or 3' overhangs (e.g., BamHI, EcoRI, and Hindlll) and those which provide blunt ends (e.g., EcoRV, SnaBI, and Stul).
"Naturally occurring" refers to an endogenous polynucleotide or polypeptide that may be isolated from viruses or prokaryotic or eukaryotic cells. "Nucleic acid sequence" refers to the specific order of nucleotides joined by phosphodiester bonds in a linear, polymeric arrangement. Depending on the number of nucleotides, the nucleic acid sequence can be considered an oligomer, oligonucleotide, or polynucleotide. The nucleic acid can be DNA, RNA, or any nucleic acid analog, such as PNA, may be of genomic or synthetic origin, may be either double-stranded or single-stranded, and can represent either the sense or antisense (complementary) strand.
"Oligomer" refers to a nucleic acid sequence of at least about 6 nucleotides and as many as about 60 nucleotides, preferably about 15 to 40 nucleotides, and most preferably between about 20 and 30 nucleotides, that may be used in hybridization or amplification technologies. Oligomers may be used as, e.g., primers for PCR, and are usually chemically synthesized. "Operably linked" refers to the situation in which a first nucleic acid sequence is placed in a functional relationship with the second nucleic acid sequence. For instance, a promoter is operably linked to a coding sequence if the promoter affects the transcription or expression of the coding sequence. Generally, operably linked DNA sequences may be in close proximity or contiguous and, where necessary to join two protein coding regions, in the same reading frame. "Peptide nucleic acid" (PNA) refers to a DNA mimic in which nucleotide bases are attached to a pseudopeptide backbone to increase stability. PNAs, also designated antigene agents, can prevent gene expression by targeting complementary messenger RNA.
The phrases "percent identity" and "% identity", as applied to polynucleotide sequences, refer to the percentage of residue matches between at least two polynucleotide sequences aligned using a standardized algorithm. Such an algorithm may insert, in a standardized and reproducible way, gaps in the sequences being compared in order to optimize alignment between two sequences, and therefore achieve a more meaningful comparison of the two sequences.
Percent identity between polynucleotide sequences may be determined using the default parameters of the CLUSTAL V algorithm as incorporated into the MEGALIGN version 3.12e sequence alignment program. This program is part of the LASERGENE software package, a suite of molecular biological analysis programs (DNASTAR, Madison WI). CLUSTAL V is described in Higgins, D.G. and Sharp, P.M. (1989) CABIOS 5:151-153 and in Higgins, D.G. et al. (1992) CABIOS 8:189-191. For pairwise alignments of polynucleotide sequences, the default parameters are set as follows: Ktuple=2, gap penalty=5, window=4, and "diagonals saved"=4. The "weighted" residue weight table is selected as the default. Percent identity is reported by CLUSTAL V as the "percent similarity" between aligned polynucleotide sequence pairs.
Alternatively, a suite of commonly used and freely available sequence comparison algorithms is provided by the National Center for Biotechnology Information (NCBI) Basic Local Alignment Search Tool (BLAST) (Altschul, S.F. et al. (1990) J. Mol. Biol. 215:403-410), which is available from several sources, including the NCBI, Bethesda, MD, and on the Internet at http://www.ncbi.nlm.nih.gov/BLAST/. The BLAST software suite includes various sequence analysis programs including "blastn," that is used to determine alignment between a known polynucleotide sequence and other sequences on a variety of databases. Also available is a tool called "BLAST 2 Sequences" that is used for direct pairwise comparison of two nucleotide sequences. "BLAST 2 Sequences" can be accessed and used interactively at http://www.ncbi.nlm.nih.gov/gorf/bl2/. The "BLAST 2 Sequences" tool can be used for both blastn and blastp (discussed below). BLAST programs are commonly used with gap and other parameters set to default settings. For example, to compare two nucleotide sequences, one may use blastn with the "BLAST 2 Sequences" tool Version 2.0.9 (May-07-1999) set at default parameters. Such default parameters may be, for example:
Matrix: BLOSUM62
Reward for match: 1
Penalty for mismatch: -2
Open Gap: 5 and Extension Gap: 2 penalties
Gap x drop-off: 50
Expect: 10
Word Size: 11
Filter: on
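To show how the match, mismatch, and gap parameters listed above combine, the following Python sketch computes a raw score for an already-aligned pair of nucleotide sequences, with "-" marking gap positions. It is an illustration only and is not the BLAST implementation. The function name is hypothetical, and the sketch assumes an affine gap model in which a gap of k positions costs the open penalty plus k times the extension penalty.

```python
def score_alignment(aligned_a, aligned_b, match=1, mismatch=-2,
                    gap_open=5, gap_extend=2):
    """Raw score of a gapped pairwise alignment ('-' denotes a gap).

    Assumption: a gap of k columns costs gap_open + k * gap_extend.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    score = 0
    gap_side = None  # which sequence currently holds an open gap, if any
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" and y == "-":
            raise ValueError("a column cannot be a gap in both sequences")
        if x == "-" or y == "-":
            side = "a" if x == "-" else "b"
            if side != gap_side:
                score -= gap_open      # charged once when a new gap starts
            score -= gap_extend        # charged for every gap column
            gap_side = side
        else:
            gap_side = None
            score += match if x == y else mismatch
    return score
```

With these defaults, a single mismatch cancels two matches, and a one-base gap cancels seven, which is why short gapped alignments score poorly unless flanked by long runs of matches.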
Percent identity may be measured over the length of an entire defined sequence, for example, as defined by a particular SEQ ID number, or may be measured over a shorter length, for example, over the length of a fragment taken from a larger, defined sequence, for instance, a fragment of at least 20, at least 30, at least 40, at least 50, at least 70, at least 100, or at least 200 contiguous nucleotides. Such lengths are exemplary only, and it is understood that any fragment length supported by the sequences shown herein, in figures or Sequence Listings, may be used to describe a length over which percentage identity may be measured. Nucleic acid sequences that do not show a high degree of identity may nevertheless encode similar amino acid sequences due to the degeneracy of the genetic code. It is understood that changes in nucleic acid sequence can be made using this degeneracy to produce multiple nucleic acid sequences that all encode substantially the same protein.
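For illustration, percent identity over a defined length can be computed from an aligned pair of sequences as in the Python sketch below. The function name is hypothetical. The choice of denominator is an assumption: here it is the number of alignment columns, including gap columns, whereas alignment programs may report the value using other conventions.

```python
def percent_identity(aligned_a, aligned_b):
    """Percentage of alignment columns in which both sequences carry the
    same residue. '-' denotes a gap and never counts as a match.

    Assumption: the denominator is the full alignment length, gaps included.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)
```

To measure identity over a fragment rather than an entire defined sequence, the same function can be applied to the corresponding slice of the two aligned strings.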
The phrases "percent identity" and "% identity", as applied to polypeptide sequences, refer to the percentage of residue matches between at least two polypeptide sequences aligned using a standardized algorithm. Methods of polypeptide sequence alignment are well-known. Some alignment methods take into account conservative amino acid substitutions. Such conservative substitutions, explained in more detail above, generally preserve the hydrophobicity and acidity of the substituted residue, thus preserving the structure (and therefore function) of the folded polypeptide. Percent identity between polypeptide sequences may be determined using the default parameters of the CLUSTAL V algorithm as incorporated into the MEG ALIGN version 3.12e sequence alignment program (described and referenced above). For pairwise alignments of polypeptide sequences using CLUSTAL V, the default parameters are set as follows: Ktuple=l , gap penalty=3, window=5, and "diagonals saved"=5. The PAM250 matrix is selected as the default residue weight table. As with polynucleotide alignments, the percent identity is reported by
CLUSTAL V as the "percent similarity" between aligned polypeptide sequence pairs.
Alternatively, the NCBI BLAST software suite may be used. For example, for a pairwise comparison of two polypeptide sequences, one may use the "BLAST 2 Sequences" tool Version 2.0.9 (May-07-1999) with blastp set at default parameters. Such default parameters may be, for example:
Matrix: BLOSUM62
Open Gap: 11 and Extension Gap: 1 penalty
Gap x drop-off: 50
Expect: 10
Word Size: 3
Filter: on
Percent identity may be measured over the length of an entire defined polypeptide sequence, for example, as defined by a particular SEQ ID number, or may be measured over a shorter length, for example, over the length of a fragment taken from a larger, defined polypeptide sequence, for instance, a fragment of at least 15, at least 20, at least 30, at least 40, at least 50, at least 70 or at least 150 contiguous residues. Such lengths are exemplary only, and it is understood that any fragment length supported by the sequences shown herein, in figures or Sequence Listings, may be used to describe a length over which percentage identity may be measured.
"Post-translational modification" of an MDDT may involve lipidation, glycosylation, phosphorylation, acetylation, racemization, proteolytic cleavage, and other modifications known in the art. These processes may occur synthetically or biochemically. Biochemical modifications will vary by cell type depending on the enzymatic milieu and the MDDT.
"Probe" refers to mddt or fragments thereof, which are used to detect identical, allelic or related nucleic acid sequences. Probes are isolated oligonucleotides or polynucleotides attached to a detectable label or reporter molecule. Typical labels include radioactive isotopes, ligands, chemiluminescent agents, and enzymes. "Primers" are short nucleic acids, usually DNA oligonucleotides, which may be annealed to a target polynucleotide by complementary base-pairing. The primer may then be extended along the target DNA strand by a DNA polymerase enzyme. Primer pairs can be used for amplification (and identification) of a nucleic acid sequence, e.g., by the polymerase chain reaction (PCR). Probes and primers as used in the present invention typically comprise at least 15 contiguous nucleotides of a known sequence. In order to enhance specificity, longer probes and primers may also be employed, such as probes and primers that comprise at least 20, 30, 40, 50, 60, 70, 80, 90, 100, or at least 150 consecutive nucleotides of the disclosed nucleic acid sequences. Probes and primers may be considerably longer than these examples, and it is understood that any length supported by the specification, including the figures and Sequence Listing, may be used.
Methods for preparing and using probes and primers are described in the references, for example Sambrook et al., 1989, Molecular Cloning: A Laboratory Manual, 2nd ed., vol. 1-3, Cold Spring Harbor Press, Plainview NY; Ausubel et al., 1987, Current Protocols in Molecular Biology, Greene Publ. Assoc. & Wiley-Intersciences, New York NY; Innis et al., 1990, PCR Protocols. A Guide to Methods and Applications. Academic Press, San Diego CA. PCR primer pairs can be derived from a known sequence, for example, by using computer programs intended for that purpose such as Primer (Version 0.5, 1991, Whitehead Institute for Biomedical Research, Cambridge MA).
Oligonucleotides for use as primers are selected using software known in the art for such purpose. For example, OLIGO 4.06 software is useful for the selection of PCR primer pairs of up to 100 nucleotides each, and for the analysis of oligonucleotides and larger polynucleotides of up to 5,000 nucleotides from an input polynucleotide sequence of up to 32 kilobases. Similar primer selection programs have incorporated additional features for expanded capabilities. For example, the PrimOU primer selection program (available to the public from the Genome Center at University of Texas South West Medical Center, Dallas TX) is capable of choosing specific primers from megabase sequences and is thus useful for designing primers on a genome-wide scope. The Primer3 primer selection program (available to the public from the Whitehead Institute/MIT Center for Genome Research, Cambridge MA) allows the user to input a "mispriming library," in which sequences to avoid as primer binding sites are user-specified. Primer3 is useful, in particular, for the selection of oligonucleotides for microarrays. (The source code for the latter two primer selection programs may also be obtained from their respective sources and modified to meet the user's specific needs.) The PrimeGen program (available to the public from the UK Human Genome Mapping Project Resource Centre, Cambridge UK) designs primers based on multiple sequence alignments, thereby allowing selection of primers that hybridize to either the most conserved or least conserved regions of aligned nucleic acid sequences. Hence, this program is useful for identification of both unique and conserved oligonucleotides and polynucleotide fragments. The oligonucleotides and polynucleotide fragments identified by any of the above selection methods are useful in hybridization technologies, for example, as PCR or sequencing primers, microarray elements, or specific probes to identify fully or partially complementary polynucleotides in a sample of nucleic acids. Methods of oligonucleotide selection are not limited to those described above.
"Purified" refers to molecules, either polynucleotides or polypeptides, that are isolated or separated from their natural environment and are at least 60% free, preferably at least 75% free, and most preferably at least 90% free from other compounds with which they are naturally associated.
A "recombinant nucleic acid" is a sequence that is not naturally occurring or has a sequence that is made by an artificial combination of two or more otherwise separated segments of sequence. This artificial combination is often accomplished by chemical synthesis or, more commonly, by the artificial manipulation of isolated segments of nucleic acids, e.g., by genetic engineering techniques such as those described in Sambrook, supra. The term recombinant includes nucleic acids that have been altered solely by addition, substitution, or deletion of a portion of the nucleic acid. Frequently, a recombinant nucleic acid may include a nucleic acid sequence operably linked to a promoter sequence. Such a recombinant nucleic acid may be part of a vector that is used, for example, to transform a cell.
Alternatively, such recombinant nucleic acids may be part of a viral vector, e.g., based on a vaccinia virus, that could be used to vaccinate a mammal wherein the recombinant nucleic acid is expressed, inducing a protective immunological response in the mammal.
"Regulatory element" refers to a nucleic acid sequence from nontranslated regions of a gene, and includes enhancers, promoters, introns, and 3' untranslated regions, which interact with host proteins to carry out or regulate transcription or translation.
"Reporter" molecules are chemical or biochemical moieties used for labeling a nucleic acid, an amino acid, or an antibody. They include radionuclides; enzymes; fluorescent, chemiluminescent, or chromogenic agents; substrates; cofactors; inhibitors; magnetic particles; and other moieties known in the art. An "RNA equivalent," in reference to a DNA sequence, is composed of the same linear sequence of nucleotides as the reference DNA sequence with the exception that all occurrences of the nitrogenous base thymine are replaced with uracil, and the sugar backbone is composed of ribose instead of deoxyribose. "Sample" is used in its broadest sense. Samples may contain nucleic or amino acids, antibodies, or other materials, and may be derived from any source (e.g., bodily fluids including, but not limited to, saliva, blood, and urine; chromosome(s), organelles, or membranes isolated from a cell; genomic DNA, RNA, or cDNA in solution or bound to a substrate; and cleared cells or tissues or blots or imprints from such cells or tissues). "Specific binding" or "specifically binding" refers to the interaction between a protein or peptide and its agonist, antibody, antagonist, or other binding partner. The interaction is dependent upon the presence of a particular structure of the protein, e.g., the antigenic determinant or epitope, recognized by the binding molecule. For example, if an antibody is specific for epitope "A," the presence of a polypeptide containing epitope A, or the presence of free unlabeled A, in a reaction containing free labeled A and the antibody will reduce the amount of labeled A that binds to the antibody.
"Substitution" refers to the replacement of at least one nucleotide or amino acid by a different nucleotide or amino acid.
"Substrate" refers to any suitable rigid or semi-rigid support including, e.g., membranes, filters, chips, slides, wafers, fibers, magnetic or nonmagnetic beads, gels, tubing, plates, polymers, microparticles or capillaries. The substrate can have a variety of surface forms, such as wells, trenches, pins, channels and pores, to which polynucleotides or polypeptides are bound.
A "transcript image" refers to the collective pattern of gene expression by a particular tissue or cell type under given conditions at a given time. "Transformation" refers to a process by which exogenous DNA enters a recipient cell.
Transformation may occur under natural or artificial conditions using various methods well known in the art. Transformation may rely on any known method for the insertion of foreign nucleic acid sequences into a prokaryotic or eukaryotic host cell. The method is selected based on the host cell being transformed. "Transformants" include stably transformed cells in which the inserted DNA is capable of replication either as an autonomously replicating plasmid or as part of the host chromosome, as well as cells which transiently express inserted DNA or RNA.
A "transgenic organism," as used herein, is any organism, including but not limited to animals and plants, in which one or more of the cells of the organism contains heterologous nucleic acid introduced by way of human intervention, such as by transgenic techniques well known in the art. The nucleic acid is introduced into the cell, directly or indirectly by introduction into a precursor of the cell, by way of deliberate genetic manipulation, such as by microinjection or by infection with a recombinant virus. The term genetic manipulation does not include classical cross-breeding, or in vitro fertilization, but rather is directed to the introduction of a recombinant DNA molecule. The transgenic organisms contemplated in accordance with the present invention include bacteria, cyanobacteria, fungi, and plants and animals. The isolated DNA of the present invention can be introduced into the host by methods known in the art, for example infection, transfection, transformation or transconjugation. Techniques for transferring the DNA of the present invention into such organisms are widely known and provided in references such as Sambrook et al. (1989), supra. A "variant" of a particular nucleic acid sequence is defined as a nucleic acid sequence having at least 25% sequence identity to the particular nucleic acid sequence over a certain length of one of the nucleic acid sequences using blastn with the "BLAST 2 Sequences" tool Version 2.0.9 (May-07- 1999) set at default parameters. Such a pair of nucleic acids may show, for example, at least 30%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 91 %, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% or greater sequence identity over a certain defined length. The variant may result in "conservative" amino acid changes which do not affect structural and/or chemical properties. 
A variant may be described as, for example, an "allelic" (as defined above), "splice," "species," or "polymorphic" variant. A splice variant may have significant identity to a reference molecule, but will generally have a greater or lesser number of polynucleotides due to alternate splicing of exons during mRNA processing. The corresponding polypeptide may possess additional functional domains or lack domains that are present in the reference molecule. Species variants are polynucleotide sequences that vary from one species to another. The resulting polypeptides generally will have significant amino acid identity relative to each other. A polymorphic variant is a variation in the polynucleotide sequence of a particular gene between individuals of a given species. Polymorphic variants also may encompass "single nucleotide polymorphisms" (SNPs) in which the polynucleotide sequence varies by one base. The presence of SNPs may be indicative of, for example, a certain population, a disease state, or a propensity for a disease state.
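As a small illustration of the single nucleotide polymorphisms described above, the Python sketch below lists the positions at which two equal-length, already-aligned polynucleotide sequences differ by one base. The function name is hypothetical, and the sketch does not handle insertions or deletions.

```python
def find_single_base_differences(reference, sample):
    """Return a list of (position, reference_base, sample_base) tuples for
    every position at which the two sequences differ. Positions are 1-based.

    Assumption: the sequences are already aligned and of equal length.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be the same length")
    return [
        (i + 1, r, s)
        for i, (r, s) in enumerate(zip(reference.upper(), sample.upper()))
        if r != s
    ]
```

For example, comparing "ACGTAC" with "ACCTAC" reports a single difference at position 3, where G in the reference is replaced by C in the sample.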
In an alternative, variants of the polynucleotides of the present invention may be generated through recombinant methods. One possible method is a DNA shuffling technique such as MOLECULARBREEDING (Maxygen Inc., Santa Clara CA; described in U.S. Patent Number 5,837,458; Chang, C.-C. et al. (1999) Nat. Biotechnol. 17:793-797; Christians, F.C. et al. (1999) Nat. Biotechnol. 17:259-264; and Crameri, A. et al. (1996) Nat. Biotechnol. 14:315-319) to alter or improve the biological properties of MDDT, such as its biological or enzymatic activity or its ability to bind to other molecules or compounds. DNA shuffling is a process by which a library of gene variants is produced using PCR-mediated recombination of gene fragments. The library is then subjected to selection or screening procedures that identify those gene variants with the desired properties. These preferred variants may then be pooled and further subjected to recursive rounds of DNA shuffling and selection/screening. Thus, genetic diversity is created through "artificial" breeding and rapid molecular evolution. For example, fragments of a single gene containing random point mutations may be recombined, screened, and then reshuffled until the desired properties are optimized. Alternatively, fragments of a given gene may be recombined with fragments of homologous genes in the same gene family, either from the same or different species, thereby maximizing the genetic diversity of multiple naturally occurring genes in a directed and controllable manner.
A "variant" of a particular polypeptide sequence is defined as a polypeptide sequence having at least 40% sequence identity to the particular polypeptide sequence over a certain length of one of the polypeptide sequences using blastp with the "BLAST 2 Sequences" tool Version 2.0.9 (May-07-1999) set at default parameters. Such a pair of polypeptides may show, for example, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% or greater sequence identity over a certain defined length of one of the polypeptides.
THE INVENTION
In a particular embodiment, cDNA sequences derived from human tissues and cell lines were aligned based on nucleotide sequence identity and assembled into "consensus" or "template" sequences which are designated by the template identification numbers (template IDs) in column 2 of Table 2. The sequence identification numbers (SEQ ID NO:s) corresponding to the template IDs are shown in column 1. The template sequences have similarity to GenBank sequences, or "hits," as designated by the GI Numbers in column 3. The statistical probability of each GenBank hit is indicated by a probability score in column 4, and the functional annotation corresponding to each GenBank hit is listed in column 5.
The invention incorporates the nucleic acid sequences of these templates as disclosed in the Sequence Listing and the use of these sequences in the diagnosis and treatment of disease states characterized by defects in disease detection and treatment molecules. The invention further utilizes these sequences in hybridization and amplification technologies, and in particular, in technologies which assess gene expression patterns correlated with specific cells or tissues and their responses in vivo or in vitro to pharmaceutical agents, toxins, and other treatments. In this manner, the sequences of the present invention are used to develop a transcript image for a particular cell or tissue.
Derivation of Nucleic Acid Sequences
cDNA was isolated from libraries constructed using RNA derived from normal and diseased human tissues and cell lines. The human tissues and cell lines used for cDNA library construction were selected from a broad range of sources to provide a diverse population of cDNAs representative of gene transcription throughout the human body. Descriptions of the human tissues and cell lines used for cDNA library construction are provided in the LIFESEQ database (Incyte Genomics, Inc. (Incyte), Palo Alto CA). Human tissues were broadly selected from, for example, cardiovascular, dermatologic, endocrine, gastrointestinal, hematopoietic/immune system, musculoskeletal, neural, reproductive, and urologic sources.
Cell lines used for cDNA library construction were derived from, for example, leukemic cells, teratocarcinomas, neuroepitheliomas, cervical carcinoma, lung fibroblasts, and endothelial cells. Such cell lines include, for example, THP-1, Jurkat, HUVEC, hNT2, WI38, HeLa, and other cell lines commonly used and available from public depositories (American Type Culture Collection, Manassas VA). Prior to mRNA isolation, cell lines were untreated, treated with a pharmaceutical agent such as 5'-aza-2'-deoxycytidine, treated with an activating agent such as lipopolysaccharide in the case of leukocytic cell lines, or, in the case of endothelial cell lines, subjected to shear stress.
Sequencing of the cDNAs
Methods for DNA sequencing are well known in the art. Conventional enzymatic methods employ the Klenow fragment of DNA polymerase I, SEQUENASE DNA polymerase (U.S. Biochemical Corporation, Cleveland OH), Taq polymerase (Applied Biosystems, Foster City CA), thermostable T7 polymerase (Amersham Pharmacia Biotech, Inc. (Amersham Pharmacia Biotech), Piscataway NJ), or combinations of polymerases and proofreading exonucleases such as those found in the ELONGASE amplification system (Life Technologies Inc. (Life Technologies), Gaithersburg MD), to extend the nucleic acid sequence from an oligonucleotide primer annealed to the DNA template of interest. Methods have been developed for the use of both single-stranded and double-stranded templates. Chain termination reaction products may be electrophoresed on urea-polyacrylamide gels and detected either by autoradiography (for radioisotope-labeled nucleotides) or by fluorescence (for fluorophore-labeled nucleotides). Automated methods for mechanized reaction preparation, sequencing, and analysis using fluorescence detection methods have been developed. Machines used to prepare cDNAs for sequencing can include the MICROLAB 2200 liquid transfer system (Hamilton Company (Hamilton), Reno NV), Peltier thermal cycler (PTC200; MJ Research, Inc. (MJ Research), Watertown MA), and ABI CATALYST 800 thermal cycler (Applied Biosystems). Sequencing can be carried out using, for example, the ABI 373 or 377 (Applied Biosystems) or MEGABACE 1000 (Molecular Dynamics, Inc. (Molecular Dynamics), Sunnyvale CA) DNA sequencing systems, or other automated and manual sequencing systems well known in the art.
The nucleotide sequences of the Sequence Listing have been prepared by current, state-of- the-art, automated methods and, as such, may contain occasional sequencing errors or unidentified nucleotides. Such unidentified nucleotides are designated by an N. These infrequent unidentified bases do not represent a hindrance to practicing the invention for those skilled in the art. Several methods employing standard recombinant techniques may be used to correct errors and complete the missing sequence information. (See, e.g., those described in Ausubel, F.M. et al. (1997) Short Protocols in Molecular Biology. John Wiley & Sons, New York NY; and Sambrook, J. et al. (1989) Molecular Cloning. A Laboratory Manual, Cold Spring Harbor Press, Plainview NY.)
Assembly of cDNA Sequences
Human polynucleotide sequences may be assembled using programs or algorithms well known in the art. Sequences to be assembled are related, wholly or in part, and may be derived from a single transcript or from many different transcripts. Assembly of the sequences can be performed using such programs as PHRAP (Phil's Revised Assembly Program) and the GELVIEW fragment assembly system (GCG), or other methods known in the art.
Alternatively, cDNA sequences are used as "component" sequences that are assembled into "template" or "consensus" sequences as follows. Sequence chromatograms are processed, verified, and quality scores are obtained using PHRED. Raw sequences are edited using an editing pathway known as Block 1 (See, e.g., the LIFESEQ Assembled User Guide, Incyte Genomics, Palo Alto, CA).
A series of BLAST comparisons is performed and low-information segments and repetitive elements (e.g., dinucleotide repeats, Alu repeats, etc.) are replaced by "n's", or masked, to prevent spurious matches. Mitochondrial and ribosomal RNA sequences are also removed. The processed sequences are then loaded into a relational database management system (RDMS) which assigns edited sequences to existing templates, if available. When additional sequences are added into the RDMS, a process is initiated which modifies existing templates or creates new templates from works in progress (i.e., nonfinal assembled sequences) containing queued sequences or the sequences themselves. After the new sequences have been assigned to templates, the templates can be merged into bins. If multiple templates exist in one bin, the bin can be split and the templates reannotated. Once gene bins have been generated based upon sequence alignments, bins are "clone joined" based upon clone information. Clone joining occurs when the 5' sequence of one clone is present in one bin and the 3' sequence from the same clone is present in a different bin, indicating that the two bins should be merged into a single bin. Only bins which share at least two different clones are merged.
A resultant template sequence may contain either a partial or a full length open reading frame, or all or part of a genetic regulatory element. This variation is due in part to the fact that the full length cDNAs of many genes are several hundred, and sometimes several thousand, bases in length. With current technology, cDNAs comprising the coding regions of large genes cannot be cloned because of vector limitations, incomplete reverse transcription of the mRNA, or incomplete "second strand" synthesis. Template sequences may be extended to include additional contiguous sequences derived from the parent RNA transcript using a variety of methods known to those of skill in the art. Extension may thus be used to achieve the full length coding sequence of a gene.
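The clone-joining rule described above can be illustrated with a minimal Python sketch. It is not the LIFESEQ implementation; the data layout (each bin holding a set of clone identifiers for its 5' reads and another for its 3' reads), the function name clone_join, and the parameter min_shared_clones are assumptions made for illustration. Links are evaluated between the original bins only, which is a simplification.

```python
from collections import defaultdict
from itertools import combinations


def clone_join(bins, min_shared_clones=2):
    """Group bins linked by at least `min_shared_clones` distinct clones.

    `bins` maps a bin name to {"5p": set_of_clone_ids, "3p": set_of_clone_ids}
    (a hypothetical layout). A clone links two bins when its 5' read lies in
    one bin and its 3' read lies in the other.
    """
    parent = {name: name for name in bins}  # union-find forest

    def find(name):
        while parent[name] != name:
            parent[name] = parent[parent[name]]  # path halving
            name = parent[name]
        return name

    for a, b in combinations(sorted(bins), 2):
        linking = (bins[a]["5p"] & bins[b]["3p"]) | (bins[a]["3p"] & bins[b]["5p"])
        if len(linking) >= min_shared_clones:
            parent[find(a)] = find(b)  # merge the two groups

    groups = defaultdict(set)
    for name in bins:
        groups[find(name)].add(name)
    return sorted(sorted(group) for group in groups.values())


example_bins = {
    "bin1": {"5p": {"c1", "c2", "c3"}, "3p": set()},
    "bin2": {"5p": set(), "3p": {"c1", "c2"}},
    "bin3": {"5p": set(), "3p": {"c3"}},
}
merged = clone_join(example_bins)
```

In this example bin1 and bin2 are linked by two clones and are grouped together, whereas bin3 is linked to bin1 by a single clone and remains separate.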
Analysis of the cDNA Sequences
The cDNA sequences are analyzed using a variety of programs and algorithms which are well known in the art. (See, e.g., Ausubel, 1997, supra, Chapter 1.1; Meyers, R.A. (Ed.) (1995) Molecular Biology and Biotechnology, Wiley VCH, New York NY, pp. 856-853; and Table 8.) These analyses comprise reading frame determinations, e.g., based on triplet codon periodicity for particular organisms (Fickett, J.W. (1982) Nucleic Acids Res. 10:5303-5318); analyses of potential start and stop codons; and homology searches.
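As an illustration of the analysis of potential start and stop codons, the following Python sketch scans the three forward reading frames of a sequence for segments that begin with ATG and end at the first in-frame stop codon. It is a simplified example, not the method of the cited references: it assumes the standard genetic code, ignores the reverse strand and codon periodicity statistics, and the name find_orfs and the parameter min_codons are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code, DNA alphabet


def find_orfs(seq, min_codons=2):
    """Return (frame, start, end) tuples for forward-strand ATG..stop segments.

    `start` and `end` are 0-based, end-exclusive, and include the stop codon.
    `min_codons` counts codons from ATG up to, but not including, the stop.
    Codons containing N are treated as neither start nor stop.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((frame, start, i + 3))
                start = None  # look for the next start codon in this frame
    return orfs


orfs = find_orfs("CCATGAAATAGGG")
```

A template carrying only a partial open reading frame would yield a start codon with no in-frame stop, which this sketch does not report.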
Computer programs known to those of skill in the art for performing computer-assisted searches for amino acid and nucleic acid sequence similarity, include, for example, Basic Local Alignment Search Tool (BLAST; Altschul, S.F. (1993) J. Mol. Evol. 36:290-300; Altschul, S.F. et al. (1990) J. Mol. Biol. 215:403-410). BLAST is especially useful in determining exact matches and comparing two sequence fragments of arbitrary but equal lengths, whose alignment is locally maximal and for which the alignment score meets or exceeds a threshold or cutoff score set by the user (Karlin, S. et al. (1988) Proc. Natl. Acad. Sci. USA 85:841-845). Using an appropriate search tool (e.g., BLAST or HMM), GenBank, SwissProt, BLOCKS, PFAM and other databases may be searched for sequences containing regions of homology to a query mddt or MDDT of the present invention.
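The notion of a locally maximal alignment of two equal-length fragments whose score meets a cutoff can be sketched in Python as follows. This is not the BLAST program; it only scores one ungapped alignment of two sequences of equal length. The match and mismatch values and the function name best_ungapped_segment are arbitrary illustrative assumptions.

```python
def best_ungapped_segment(a, b, match=5, mismatch=-4):
    """Return (score, start, end) of the highest-scoring contiguous segment
    when equal-length sequences `a` and `b` are aligned without gaps.

    `start` and `end` are 0-based and end-exclusive. A score of 0 means no
    positively scoring segment was found.
    """
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    best = (0, 0, 0)
    run_score, run_start = 0, 0
    for i, (x, y) in enumerate(zip(a, b)):
        run_score += match if x == y else mismatch
        if run_score <= 0:
            # A non-positive prefix can never help; restart after position i.
            run_score, run_start = 0, i + 1
        elif run_score > best[0]:
            best = (run_score, run_start, i + 1)
    return best


score, start, end = best_ungapped_segment("TTACGTACGTT", "GGACGTACGAA")
cutoff = 20                      # user-chosen threshold (illustrative)
reported = score >= cutoff       # segment is reported only if it meets the cutoff
```

The segment cannot be extended or trimmed at either end without lowering its score, which is the sense in which it is locally maximal.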
Other approaches to the identification, assembly, storage, and display of nucleotide and polypeptide sequences are provided in "Relational Database for Storing Biomolecule Information," U.S.S.N. 08/947,845, filed October 9, 1997; "Project-Based Full-Length Biomolecular Sequence Database," U.S.S.N. 08/811,758, filed March 6, 1997; and "Relational Database and System for Storing Information Relating to Biomolecular Sequences," U.S.S.N. 09/034,807, filed March 4, 1998, all of which are incorporated by reference herein in their entirety.
Protein hierarchies can be assigned to the putative encoded polypeptide based on, e.g., motif, BLAST, or biological analysis. Methods for assigning these hierarchies are described, for example, in "Database System Employing Protein Function Hierarchies for Viewing Biomolecular Sequence Data," U.S.S.N. 08/812,290, filed March 6, 1997, incorporated herein by reference.
Human Disease Detection and Treatment Molecule Sequences
The mddt of the present invention may be used for a variety of diagnostic and therapeutic purposes. For example, an mddt may be used to diagnose a particular condition, disease, or disorder associated with disease detection and treatment molecules. Such conditions, diseases, and disorders include, but are not limited to, a cell proliferative disorder, such as actinic keratosis, arteriosclerosis, atherosclerosis, bursitis, cirrhosis, hepatitis, mixed connective tissue disease (MCTD), myelofibrosis, paroxysmal nocturnal hemoglobinuria, polycythemia vera, psoriasis, primary thrombocythemia, and cancers including adenocarcinoma, leukemia, lymphoma, melanoma, myeloma, sarcoma, teratocarcinoma, and, in particular, a cancer of the adrenal gland, bladder, bone, bone marrow, brain, breast, cervix, gall bladder, ganglia, gastrointestinal tract, heart, kidney, liver, lung, muscle, ovary, pancreas, parathyroid, penis, prostate, salivary glands, skin, spleen, testis, thymus, thyroid, and uterus; and an autoimmune/inflammatory disorder, such as actinic keratosis, acquired immunodeficiency syndrome (AIDS), Addison's disease, adult respiratory distress syndrome, allergies, ankylosing spondylitis, amyloidosis, anemia, arteriosclerosis, asthma, atherosclerosis, autoimmune hemolytic anemia, autoimmune thyroiditis, bronchitis, bursitis, cholecystitis, cirrhosis, contact dermatitis, Crohn's disease, atopic dermatitis, dermatomyositis, diabetes mellitus, emphysema, erythroblastosis fetalis, erythema nodosum, atrophic gastritis, glomerulonephritis, Goodpasture's syndrome, gout, Graves' disease, Hashimoto's thyroiditis, paroxysmal nocturnal hemoglobinuria, hepatitis, hypereosinophilia, irritable bowel syndrome, episodic lymphopenia with lymphocytotoxins, mixed connective tissue disease (MCTD), multiple sclerosis, myasthenia gravis, myocardial or pericardial inflammation, myelofibrosis, osteoarthritis, osteoporosis, pancreatitis, polycythemia vera, polymyositis, psoriasis, Reiter's 
syndrome, rheumatoid arthritis, scleroderma, Sjögren's syndrome, systemic anaphylaxis, systemic lupus erythematosus, systemic sclerosis, primary thrombocythemia, thrombocytopenic purpura, ulcerative colitis, uveitis, Werner syndrome, complications of cancer, hemodialysis, and extracorporeal circulation, trauma, and hematopoietic cancer including lymphoma, leukemia, and myeloma. The mddt can be used to detect the presence of, or to quantify the amount of, an mddt-related polynucleotide in a sample. This information is then compared to information obtained from appropriate reference samples, and a diagnosis is established. Alternatively, a polynucleotide complementary to a given mddt can inhibit or inactivate a therapeutically relevant gene related to the mddt.
Analysis of mddt Expression Patterns
The expression of mddt may be routinely assessed by hybridization-based methods to determine, for example, the tissue-specificity, disease-specificity, or developmental stage-specificity of mddt expression. For example, the level of expression of mddt may be compared among different cell types or tissues, among diseased and normal cell types or tissues, among cell types or tissues at different developmental stages, or among cell types or tissues undergoing various treatments. This type of analysis is useful, for example, to assess the relative levels of mddt expression in fully, or partially differentiated cells or tissues, to determine if changes in mddt expression levels are correlated with the development or progression of specific disease states, and to assess the response of a cell or tissue to a specific therapy, for example, in pharmacological or toxicological studies. Methods for the analysis of mddt expression are based on hybridization and amplification technologies and include membrane-based procedures such as northern blot analysis, high-throughput procedures that utilize, for example, microarrays, and PCR-based procedures.
Hybridization and Genetic Analysis
The mddt, their fragments, or complementary sequences, may be used to identify the presence of and/or to determine the degree of similarity between two (or more) nucleic acid sequences. The mddt may be hybridized to naturally occurring or recombinant nucleic acid sequences under appropriately selected temperatures and salt concentrations. Hybridization with a probe based on the nucleic acid sequence of at least one of the mddt allows for the detection of nucleic acid sequences, including genomic sequences, which are identical or related to the mddt of the Sequence Listing. Probes may be selected from non-conserved or unique regions of at least one of the polynucleotides of SEQ ID NO: 1-252 and tested for their ability to identify or amplify the target nucleic acid sequence using standard protocols.
Polynucleotide sequences that are capable of hybridizing, in particular, to those shown in SEQ ID NO: 1-252 and fragments thereof, can be identified using various conditions of stringency. (See, e.g., Wahl, G.M. and S.L. Berger (1987) Methods Enzymol. 152:399-407; Kimmel, A.R. (1987) Methods Enzymol. 152:507-511.) Hybridization conditions are discussed in "Definitions."
A probe for use in Southern or northern hybridization may be derived from a fragment of an mddt sequence, or its complement, that is up to several hundred nucleotides in length and is either single-stranded or double-stranded. Such probes may be hybridized in solution to biological materials such as plasmids, bacterial, yeast, or human artificial chromosomes, cleared or sectioned tissues, or to artificial substrates containing mddt. Microarrays are particularly suitable for identifying the presence of and detecting the level of expression for multiple genes of interest by examining gene expression correlated with, e.g., various stages of development, treatment with a drug or compound, or disease progression. An array analogous to a dot or slot blot may be used to arrange and link polynucleotides to the surface of a substrate using one or more of the following: mechanical (vacuum), chemical, thermal, or UV bonding procedures. Such an array may contain any number of mddt and may be produced by hand or by using available devices, materials, and machines.
Microarrays may be prepared, used, and analyzed using methods known in the art. (See, e.g., Brennan, T.M. et al. (1995) U.S. Patent No. 5,474,796; Schena, M. et al. (1996) Proc. Natl. Acad. Sci. USA 93:10614-10619; Baldeschweiler et al. (1995) PCT application WO95/251116; Shalon, D. et al. (1995) PCT application WO95/35505; Heller, R.A. et al. (1997) Proc. Natl. Acad. Sci. USA 94:2150-2155; and Heller, M.J. et al. (1997) U.S. Patent No. 5,605,662.)
Probes may be labeled by either PCR or enzymatic techniques using a variety of commercially available reporter molecules. For example, commercial kits are available for radioactive and chemiluminescent labeling (Amersham Pharmacia Biotech) and for alkaline phosphatase labeling (Life Technologies). Alternatively, mddt may be cloned into commercially available vectors for the production of RNA probes. Such probes may be transcribed in the presence of at least one labeled nucleotide (e.g., 32P-ATP, Amersham Pharmacia Biotech).
Additionally, the polynucleotides of SEQ ID NO: 1-252 or suitable fragments thereof can be used to isolate full length cDNA sequences utilizing hybridization and/or amplification procedures well known in the art, e.g., cDNA library screening, PCR amplification, etc. The molecular cloning of such full length cDNA sequences may employ the method of cDNA library screening with probes using the hybridization, stringency, washing, and probing strategies described above and in Ausubel, supra, Chapters 3, 5, and 6. These procedures may also be employed with genomic libraries to isolate genomic sequences of mddt in order to analyze, e.g., regulatory elements.
Genetic Mapping
Gene identification and mapping are important in the investigation and treatment of almost all conditions, diseases, and disorders. Cancer, cardiovascular disease, Alzheimer's disease, arthritis, diabetes, and mental illnesses are of particular interest. Each of these conditions is more complex than the single gene defects of sickle cell anemia or cystic fibrosis, with select groups of genes being predictive of predisposition for a particular condition, disease, or disorder. For example, cardiovascular disease may result from malfunctioning receptor molecules that fail to clear cholesterol from the bloodstream, and diabetes may result when a particular individual's immune system is activated by an infection and attacks the insulin-producing cells of the pancreas. In some studies, Alzheimer's disease has been linked to a gene on chromosome 21; other studies predict a different gene and location. Mapping of disease genes is a complex and reiterative process and generally proceeds from genetic linkage analysis to physical mapping.
As a condition is noted among members of a family, a genetic linkage map traces parts of chromosomes that are inherited in the same pattern as the condition. Statistics link the inheritance of particular conditions to particular regions of chromosomes, as defined by RFLP or other markers. (See, for example, Lander, E. S. and Botstein, D. (1986) Proc. Natl. Acad. Sci. USA 83:7353-7357.) Occasionally, genetic markers and their locations are known from previous studies. More often, however, the markers are simply stretches of DNA that differ among individuals. Examples of genetic linkage maps can be found in various scientific journals or at the Online Mendelian Inheritance in Man (OMIM) World Wide Web site.
In another embodiment of the invention, mddt sequences may be used to generate hybridization probes useful in chromosomal mapping of naturally occurring genomic sequences. Either coding or noncoding sequences of mddt may be used, and in some instances, noncoding sequences may be preferable over coding sequences. For example, conservation of an mddt coding sequence among members of a multi-gene family may potentially cause undesired cross hybridization during chromosomal mapping. The sequences may be mapped to a particular chromosome, to a specific region of a chromosome, or to artificial chromosome constructions, e.g., human artificial chromosomes (HACs), yeast artificial chromosomes (YACs), bacterial artificial chromosomes (BACs), bacterial P1 constructions, or single chromosome cDNA libraries. (See, e.g., Harrington, J.J. et al. (1997) Nat. Genet. 15:345-355; Price, C.M. (1993) Blood Rev. 7:127-134; and Trask, B.J. (1991) Trends Genet. 7:149-154.)
Fluorescent in situ hybridization (FISH) may be correlated with other physical chromosome mapping techniques and genetic map data. (See, e.g., Meyers, supra, pp. 965-968.) Correlation between the location of mddt on a physical chromosomal map and a specific disorder, or a predisposition to a specific disorder, may help define the region of DNA associated with that disorder.
The mddt sequences may also be used to detect polymorphisms that are genetically linked to the inheritance of a particular condition, disease, or disorder.
In situ hybridization of chromosomal preparations and genetic mapping techniques, such as linkage analysis using established chromosomal markers, may be used for extending existing genetic maps. Often the placement of a gene on the chromosome of another mammalian species, such as mouse, may reveal associated markers even if the number or arm of the corresponding human chromosome is not known. These new marker sequences can be mapped to human chromosomes and may provide valuable information to investigators searching for disease genes using positional cloning or other gene discovery techniques. Once a disease or syndrome has been crudely correlated by genetic linkage with a particular genomic region, e.g., ataxia-telangiectasia to 11q22-23, any sequences mapping to that area may represent associated or regulatory genes for further investigation. (See, e.g., Gatti, R.A. et al. (1988) Nature 336:577-580.) The nucleotide sequences of the subject invention may also be used to detect differences in chromosomal architecture due to translocation, inversion, etc., among normal, carrier, or affected individuals.
Once a disease-associated gene is mapped to a chromosomal region, the gene must be cloned in order to identify mutations or other alterations (e.g., translocations or inversions) that may be correlated with disease. This process requires a physical map of the chromosomal region containing the disease-gene of interest along with associated markers. A physical map is necessary for determining the nucleotide sequence of and order of marker genes on a particular chromosomal region. Physical mapping techniques are well known in the art and require the generation of overlapping sets of cloned DNA fragments from a particular organelle, chromosome, or genome. These clones are analyzed to reconstruct and catalog their order. Once the position of a marker is determined, the DNA from that region is obtained by consulting the catalog and selecting clones from that region. The gene of interest is located through positional cloning techniques using hybridization or similar methods.
Diagnostic Uses
The mddt of the present invention may be used to design probes useful in diagnostic assays. Such assays, well known to those skilled in the art, may be used to detect or confirm conditions, disorders, or diseases associated with abnormal levels of mddt expression. Labeled probes developed from mddt sequences are added to a sample under hybridizing conditions of desired stringency. In some instances, mddt, or fragments or oligonucleotides derived from mddt, may be used as primers in amplification steps prior to hybridization. The amount of hybridization complex formed is quantified and compared with standards for that cell or tissue. If mddt expression varies significantly from the standard, the assay indicates the presence of the condition, disorder, or disease. Qualitative or quantitative diagnostic methods may include northern, dot blot, or other membrane or dip-stick based technologies or multiple-sample format technologies such as PCR, enzyme-linked immunosorbent assay (ELISA)-like, pin, or chip-based assays.
The probes described above may also be used to monitor the progress of conditions, disorders, or diseases associated with abnormal levels of mddt expression, or to evaluate the efficacy of a particular therapeutic treatment. The candidate probe may be identified from the mddt that are specific to a given human tissue and have not been observed in GenBank or other genome databases. Such a probe may be used in animal studies, preclinical tests, clinical trials, or in monitoring the treatment of an individual patient. In a typical process, standard expression is established by methods well known in the art for use as a basis of comparison, samples from patients affected by the disorder or disease are combined with the probe to evaluate any deviation from the standard profile, and a therapeutic agent is administered and effects are monitored to generate a treatment profile. Efficacy is evaluated by determining whether the expression progresses toward or returns to the standard normal pattern. Treatment profiles may be generated over a period of several days or several months. Statistical methods well known to those skilled in the art may be used to determine the significance of such therapeutic agents.
The polynucleotides are also useful for identifying individuals from minute biological samples, for example, by matching the RFLP pattern of a sample's DNA to that of an individual's DNA. The polynucleotides of the present invention can also be used to determine the actual base-by-base DNA sequence of selected portions of an individual's genome. These sequences can be used to prepare PCR primers for amplifying and isolating such selected DNA, which can then be sequenced. Using this technique, an individual can be identified through a unique set of DNA sequences. Once a unique ID database is established for an individual, positive identification of that individual can be made from extremely small tissue samples.
In a particular aspect, oligonucleotide primers derived from the mddt of the invention may be used to detect single nucleotide polymorphisms (SNPs). SNPs are substitutions, insertions and deletions that are a frequent cause of inherited or acquired genetic disease in humans. Methods of SNP detection include, but are not limited to, single-stranded conformation polymorphism (SSCP) and fluorescent SSCP (fSSCP) methods. In SSCP, oligonucleotide primers derived from mddt are used to amplify DNA using the polymerase chain reaction (PCR). The DNA may be derived, for example, from diseased or normal tissue, biopsy samples, bodily fluids, and the like. SNPs in the DNA cause differences in the secondary and tertiary structures of PCR products in single-stranded form, and these differences are detectable using gel electrophoresis in non-denaturing gels. In fSSCP, the oligonucleotide primers are fluorescently labeled, which allows detection of the amplimers in high-throughput equipment such as DNA sequencing machines. Additionally, sequence database analysis methods, termed in silico SNP (isSNP), are capable of identifying polymorphisms by comparing the sequences of individual overlapping DNA fragments which assemble into a common consensus sequence. These computer-based methods filter out sequence variations due to laboratory preparation of DNA and sequencing errors using statistical models and automated analyses of DNA sequence chromatograms. In the alternative, SNPs may be detected and characterized by mass spectrometry using, for example, the high throughput MASSARRAY system (Sequenom, Inc., San Diego CA).
DNA-based identification techniques are critical in forensic technology. DNA sequences taken from very small biological samples such as tissues, e.g., hair or skin, or body fluids, e.g., blood, saliva, semen, etc., can be amplified using, e.g., PCR, to identify individuals. (See, e.g., Erlich, H. (1992) PCR Technology, Freeman and Co., New York, NY.) Similarly, polynucleotides of the present invention can be used as polymorphic markers.
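The in silico SNP (isSNP) approach mentioned above can be illustrated with a simplified Python sketch. It is not the statistical model referred to in the text: simple thresholds on per-base quality and on the number of supporting reads stand in for that model, only substitutions are considered, and the read layout (offset into the consensus, bases, per-base quality scores), the function name candidate_snps, and the threshold values are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def candidate_snps(reads, min_quality=20, min_reads_per_allele=2):
    """Report consensus positions where two or more alleles are each
    supported by at least `min_reads_per_allele` high-quality base calls.

    `reads` is a list of (offset, bases, qualities) tuples, where `offset`
    is the 0-based position of the read's first base in the consensus.
    """
    columns = defaultdict(Counter)
    for offset, bases, qualities in reads:
        for i, (base, quality) in enumerate(zip(bases, qualities)):
            # Skip unidentified bases and low-quality calls (likely errors).
            if base != "N" and quality >= min_quality:
                columns[offset + i][base] += 1

    snps = {}
    for position in sorted(columns):
        supported = {base: count for base, count in columns[position].items()
                     if count >= min_reads_per_allele}
        if len(supported) >= 2:
            snps[position] = supported
    return snps


example_reads = [
    (0, "ACGTA", [30] * 5),
    (0, "ACGTA", [30] * 5),
    (0, "ACCTA", [30] * 5),
    (0, "ACCTA", [30] * 5),
    (0, "ACGTT", [30, 30, 30, 30, 5]),  # last base is a low-quality call
]
snps = candidate_snps(example_reads)
```

In this example position 2 is reported because both G and C are supported by multiple high-quality calls, while the single low-quality T at position 4 is filtered out as a probable sequencing error.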
There is also a need for reagents capable of identifying the source of a particular tissue. Appropriate reagents can comprise, for example, DNA probes or primers prepared from the sequences of the present invention that are specific for particular tissues. Panels of such reagents can identify tissue by species and/or by organ type. In a similar fashion, these reagents can be used to screen tissue cultures for contamination.
The polynucleotides of the present invention can also be used as molecular weight markers on nucleic acid gels or Southern blots, as diagnostic probes for the presence of a specific mRNA in a particular cell type, in the creation of subtracted cDNA libraries which aid in the discovery of novel polynucleotides, in selection and synthesis of oligomers for attachment to an array or other support, and as an antigen to elicit an immune response.
Disease Model Systems Using mddt
The mddt of the invention or their mammalian homologs may be "knocked out" in an animal model system using homologous recombination in embryonic stem (ES) cells. Such techniques are well known in the art and are useful for the generation of animal models of human disease. (See, e.g., U.S. Patent Number 5,175,383 and U.S. Patent Number 5,767,337.) For example, mouse ES cells, such as the mouse 129/SvJ cell line, are derived from the early mouse embryo and grown in culture. The ES cells are transformed with a vector containing the gene of interest disrupted by a marker gene, e.g., the neomycin phosphotransferase gene (neo; Capecchi, M.R. (1989) Science 244:1288-1292). The vector integrates into the corresponding region of the host genome by homologous recombination. Alternatively, homologous recombination takes place using the Cre-loxP system to knockout a gene of interest in a tissue- or developmental stage-specific manner (Marth, J.D. (1996) Clin. Invest. 97:1999-2002; Wagner, K.U. et al. (1997) Nucleic Acids Res. 25:4323-4330). Transformed ES cells are identified and microinjected into mouse cell blastocysts such as those from the C57BL/6 mouse strain. The blastocysts are surgically transferred to pseudopregnant dams, and the resulting chimeric progeny are genotyped and bred to produce heterozygous or homozygous strains. Transgenic animals thus generated may be tested with potential therapeutic or toxic agents.
The mddt of the invention may also be manipulated in vitro in ES cells derived from human blastocysts. Human ES cells have the potential to differentiate into at least eight separate cell lineages including endoderm, mesoderm, and ectodermal cell types. These cell lineages differentiate into, for example, neural cells, hematopoietic lineages, and cardiomyocytes (Thomson, J.A. et al. (1998) Science 282:1145-1147).
The mddt of the invention can also be used to create "knockin" humanized animals (pigs) or transgenic animals (mice or rats) to model human disease. With knockin technology, a region of mddt is injected into animal ES cells, and the injected sequence integrates into the animal cell genome. Transformed cells are injected into blastulae, and the blastulae are implanted as described above. Transgenic progeny or inbred lines are studied and treated with potential pharmaceutical agents to obtain information on treatment of a human disease. Alternatively, a mammal inbred to overexpress mddt, resulting, e.g., in the secretion of MDDT in its milk, may also serve as a convenient source of that protein (Janne, J. et al. (1998) Biotechnol. Annu. Rev. 4:55-74).
Screening Assays
MDDT encoded by polynucleotides of the present invention may be used to screen for molecules that bind to or are bound by the encoded polypeptides. The binding of the polypeptide and the molecule may activate (agonist), increase, inhibit (antagonist), or decrease activity of the polypeptide or the bound molecule. Examples of such molecules include antibodies, oligonucleotides, proteins (e.g., receptors), or small molecules.
Preferably, the molecule is closely related to the natural ligand of the polypeptide, e.g., a ligand or fragment thereof, a natural substrate, or a structural or functional mimetic. (See, Coligan et al., (1991) Current Protocols in Immunology 1(2): Chapter 5.) Similarly, the molecule can be closely related to the natural receptor to which the polypeptide binds, or to at least a fragment of the receptor, e.g., the active site. In either case, the molecule can be rationally designed using known techniques.
Preferably, the screening for these molecules involves producing appropriate cells which express the polypeptide, either as a secreted protein or on the cell membrane. Preferred cells include cells from mammals, yeast, Drosophila, or E. coli. Cells expressing the polypeptide or cell membrane fractions which contain the expressed polypeptide are then contacted with a test compound and binding, stimulation, or inhibition of activity of either the polypeptide or the molecule is analyzed.
An assay may simply test binding of a candidate compound to the polypeptide, wherein binding is detected by a fluorophore, radioisotope, enzyme conjugate, or other detectable label. Alternatively, the assay may assess binding in the presence of a labeled competitor. Additionally, the assay can be carried out using cell-free preparations, polypeptide/molecule affixed to a solid support, chemical libraries, or natural product mixtures. The assay may also simply comprise the steps of mixing a candidate compound with a solution containing a polypeptide, measuring polypeptide/molecule activity or binding, and comparing the polypeptide/molecule activity or binding to a standard. Preferably, an ELISA assay using, e.g., a monoclonal or polyclonal antibody, can measure polypeptide level in a sample. The antibody can measure polypeptide level by either binding, directly or indirectly, to the polypeptide or by competing with the polypeptide for a substrate.
All of the above assays can be used in a diagnostic or prognostic context. The molecules discovered using these assays can be used to treat disease or to bring about a particular result in a patient (e.g., blood vessel growth) by activating or inhibiting the polypeptide/molecule. Moreover, the assays can discover agents which may inhibit or enhance the production of the polypeptide from suitably manipulated cells or tissues.
Transcript Imaging and Toxicological Testing
Another embodiment relates to the use of mddt to develop a transcript image of a tissue or cell type. A transcript image represents the global pattern of gene expression by a particular tissue or cell type. Global gene expression patterns are analyzed by quantifying the number of expressed genes and their relative abundance under given conditions and at a given time. (See Seilhamer et al., "Comparative Gene Transcript Analysis," U.S. Patent Number 5,840,484, expressly incorporated by reference herein.) Thus a transcript image may be generated by hybridizing the polynucleotides of the present invention or their complements to the totality of transcripts or reverse transcripts of a particular tissue or cell type. In one embodiment, the hybridization takes place in high-throughput format, wherein the polynucleotides of the present invention or their complements comprise a subset of a plurality of elements on a microarray. The resultant transcript image would provide a profile of gene activity pertaining to disease detection and treatment molecules.
Transcript images which profile mddt expression may be generated using transcripts isolated from tissues, cell lines, biopsies, or other biological samples. The transcript image may thus reflect mddt expression in vivo, as in the case of a tissue or biopsy sample, or in vitro, as in the case of a cell line.
Transcript images which profile mddt expression may also be used in conjunction with in vitro model systems and preclinical evaluation of pharmaceuticals, as well as toxicological testing of industrial and naturally-occurring environmental compounds. All compounds induce characteristic gene expression patterns, frequently termed molecular fingerprints or toxicant signatures, which are indicative of mechanisms of action and toxicity (Nuwaysir, E. F. et al. (1999) Mol. Carcinog. 24:153-159; Steiner, S. and Anderson, N. L. (2000) Toxicol. Lett. 112-113:467-71, expressly incorporated by reference herein). If a test compound has a signature similar to that of a compound with known toxicity, it is likely to share those toxic properties. These fingerprints or signatures are most useful and refined when they contain expression information from a large number of genes and gene families.
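The comparison of a test compound's signature with signatures of compounds of known toxicity can be illustrated with a minimal sketch. The text does not specify a similarity measure; Pearson correlation is used here only as one plausible choice, and the compound names and expression values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length expression vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("vectors must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant vector has no defined correlation")
    return cov / math.sqrt(var_x * var_y)


def closest_signature(test, references):
    """Return (name, correlation) of the reference signature most similar to test."""
    scored = {name: pearson(test, sig) for name, sig in references.items()}
    best = max(scored, key=scored.get)
    return best, scored[best]


# Hypothetical log2 expression ratios for the same five genes (illustrative only).
references = {
    "known_toxicant_A": [2.0, -1.0, 0.5, 3.0, -2.0],
    "known_toxicant_B": [-1.5, 2.0, 0.0, -0.5, 1.0],
}
test_compound = [1.8, -0.9, 0.4, 2.7, -1.6]
best_name, best_r = closest_signature(test_compound, references)
```

A high correlation with a known toxicant's signature would only suggest shared toxic properties; in practice a signature would cover far more genes than this toy example.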
Ideally, a genome-wide measurement of expression provides the highest quality signature. Even genes whose expression is not altered by any tested compound are important, as the levels of expression of these genes are used to normalize the rest of the expression data. The normalization procedure is useful for comparison of expression data after treatment with different compounds. While the assignment of gene function to elements of a toxicant signature aids in interpretation of toxicity mechanisms, knowledge of gene function is not necessary for the statistical matching of signatures which leads to prediction of toxicity. (See, for example, Press Release 00-02 from the National Institute of Environmental Health Sciences, released February 29, 2000, available at http://www.niehs.nih.gov/oc/news/toxchip.htm.) Therefore, it is important and desirable in toxicological screening using toxicant signatures to include all expressed gene sequences. In one embodiment, the toxicity of a test compound is assessed by treating a biological sample containing nucleic acids with the test compound. Nucleic acids that are expressed in the treated biological sample are hybridized with one or more probes specific to the polynucleotides of the present invention, so that transcript levels corresponding to the polynucleotides of the present invention may be quantified. The transcript levels in the treated biological sample are compared with levels in an untreated biological sample. Differences in the transcript levels between the two samples are indicative of a toxic response caused by the test compound in the treated sample.
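The normalization and treated-versus-untreated comparison described above can be sketched as follows. This is an illustration under stated assumptions, not the procedure of the cited references: the reference genes, the gene names, the signal values, and the two-fold threshold are all hypothetical.

```python
import math


def normalize(levels, reference_genes):
    """Scale raw transcript levels by the mean level of the unaltered reference genes."""
    ref_mean = sum(levels[g] for g in reference_genes) / len(reference_genes)
    if ref_mean <= 0:
        raise ValueError("reference genes must have a positive mean signal")
    return {gene: value / ref_mean for gene, value in levels.items()}


def log2_fold_changes(treated, untreated, reference_genes):
    """Log2 ratio of normalized treated to normalized untreated levels, per gene."""
    t = normalize(treated, reference_genes)
    u = normalize(untreated, reference_genes)
    return {
        g: math.log2(t[g] / u[g])
        for g in t
        if g in u and t[g] > 0 and u[g] > 0
    }


def flag_changes(fold_changes, threshold=1.0):
    """Genes whose absolute log2 fold change meets the (assumed) threshold."""
    return sorted(g for g, fc in fold_changes.items() if abs(fc) >= threshold)


# Hypothetical raw hybridization signals; the treated sample is twice as bright overall.
untreated = {"ref1": 100.0, "ref2": 200.0, "geneA": 50.0, "geneB": 80.0}
treated = {"ref1": 200.0, "ref2": 400.0, "geneA": 400.0, "geneB": 160.0}
changes = log2_fold_changes(treated, untreated, ["ref1", "ref2"])
flagged = flag_changes(changes)
```

In this toy data the overall doubling of signal is removed by normalization, so geneB shows no change while geneA remains four-fold higher in the treated sample.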
Another particular embodiment relates to the use of MDDT encoded by polynucleotides of the present invention to analyze the proteome of a tissue or cell type. The term proteome refers to the global pattern of protein expression in a particular tissue or cell type. Each protein component of a proteome can be subjected individually to further analysis. Proteome expression patterns, or profiles, are analyzed by quantifying the number of expressed proteins and their relative abundance under given conditions and at a given time. A profile of a cell's proteome may thus be generated by separating and analyzing the polypeptides of a particular tissue or cell type. In one embodiment, the separation is achieved using two-dimensional gel electrophoresis, in which proteins from a sample are separated by isoelectric focusing in the first dimension, and then according to molecular weight by sodium dodecyl sulfate slab gel electrophoresis in the second dimension (Steiner and Anderson, supra). The proteins are visualized in the gel as discrete and uniquely positioned spots, typically by staining the gel with an agent such as Coomassie Blue or silver or fluorescent stains. The optical density of each protein spot is generally proportional to the level of the protein in the sample. The optical densities of equivalently positioned protein spots from different samples, for example, from biological samples either treated or untreated with a test compound or therapeutic agent, are compared to identify any changes in protein spot density related to the treatment. The proteins in the spots are partially sequenced using, for example, standard methods employing chemical or enzymatic cleavage followed by mass spectrometry. The identity of the protein in a spot may be determined by comparing its partial sequence, preferably of at least 5 contiguous amino acid residues, to the polypeptide sequences of the present invention. 
In some cases, further sequence data may be obtained for definitive protein identification.
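Identifying a protein spot by comparing its partial sequence to a set of polypeptide sequences amounts to a substring search. The sketch below assumes exact matching and uses the minimum of 5 contiguous residues mentioned above; the sequences and names are hypothetical placeholders, not sequences from the Sequence Listing.

```python
def matching_polypeptides(partial_sequence, polypeptides, min_length=5):
    """Return sorted names of polypeptides that contain the partial sequence exactly."""
    fragment = partial_sequence.strip().upper()
    if len(fragment) < min_length:
        raise ValueError(
            f"partial sequence must have at least {min_length} contiguous residues"
        )
    return sorted(
        name for name, seq in polypeptides.items() if fragment in seq.upper()
    )


# Hypothetical polypeptide sequences in one-letter amino acid code.
polypeptides = {
    "polypeptide_1": "MKTAYIAKQRQISFVKSHFSRQ",
    "polypeptide_2": "MSDNGPQNQRNAPRITFGGPSD",
}
hits = matching_polypeptides("AKQRQ", polypeptides)
```

A short fragment can match more than one polypeptide, which is consistent with the note that further sequence data may be needed for definitive identification.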
A proteomic profile may also be generated using antibodies specific for MDDT to quantify the levels of MDDT expression. In one embodiment, the antibodies are used as elements on a microarray, and protein expression levels are quantified by exposing the microarray to the sample and detecting the levels of protein bound to each array element (Lueking, A. et al. (1999) Anal. Biochem. 270:103-111; Mendoza, L. G. et al. (1999) Biotechniques 27:778-88). Detection may be performed by a variety of methods known in the art, for example, by reacting the proteins in the sample with a thiol- or amino-reactive fluorescent compound and detecting the amount of fluorescence bound at each array element.
Toxicant signatures at the proteome level are also useful for toxicological screening, and should be analyzed in parallel with toxicant signatures at the transcript level. There is a poor correlation between transcript and protein abundances for some proteins in some tissues (Anderson, N. L. and Seilhamer, J. (1997) Electrophoresis 18:533-537), so proteome toxicant signatures may be useful in the analysis of compounds which do not significantly affect the transcript image, but which alter the proteomic profile. In addition, the analysis of transcripts in body fluids is difficult, due to rapid degradation of mRNA, so proteomic profiling may be more reliable and informative in such cases. In another embodiment, the toxicity of a test compound is assessed by treating a biological sample containing proteins with the test compound. Proteins that are expressed in the treated biological sample are separated so that the amount of each protein can be quantified. The amount of each protein is compared to the amount of the corresponding protein in an untreated biological sample. A difference in the amount of protein between the two samples is indicative of a toxic response to the test compound in the treated sample. Individual proteins are identified by sequencing the amino acid residues of the individual proteins and comparing these partial sequences to the MDDT encoded by polynucleotides of the present invention.
In another embodiment, the toxicity of a test compound is assessed by treating a biological sample containing proteins with the test compound. Proteins from the biological sample are incubated with antibodies specific to the MDDT encoded by polynucleotides of the present invention. The amount of protein recognized by the antibodies is quantified. The amount of protein in the treated biological sample is compared with the amount in an untreated biological sample. A difference in the amount of protein between the two samples is indicative of a toxic response to the test compound in the treated sample.
Transcript images may be used to profile mddt expression in distinct tissue types. This process can be used to determine disease detection and treatment molecule activity in a particular tissue type relative to this activity in a different tissue type. Transcript images may be used to generate a profile of mddt expression characteristic of diseased tissue. Transcript images of tissues before and after treatment may be used for diagnostic purposes, to monitor the progression of disease, and to monitor the efficacy of drug treatments for diseases which affect the activity of disease detection and treatment molecules.
Transcript images of cell lines can be used to assess disease detection and treatment molecule activity and/or to identify cell lines that lack or misregulate this activity. Such cell lines may then be treated with pharmaceutical agents, and a transcript image following treatment may indicate the efficacy of these agents in restoring desired levels of this activity. A similar approach may be used to assess the toxicity of pharmaceutical agents as reflected by undesirable changes in disease detection and treatment molecule activity. Candidate pharmaceutical agents may be evaluated by comparing their associated transcript images with those of pharmaceutical agents of known effectiveness.
Antisense Molecules
The polynucleotides of the present invention are useful in antisense technology. Antisense technology or therapy relies on the modulation of expression of a target protein through the specific binding of an antisense sequence to a target sequence encoding the target protein or directing its expression. (See, e.g., Agrawal, S., ed. (1996) Antisense Therapeutics, Humana Press Inc., Totowa NJ; Alama, A. et al. (1997) Pharmacol. Res. 36(3):171-178; Crooke, S.T. (1997) Adv. Pharmacol. 40:1-49; Sharma, H.W. and R. Narayanan (1995) Bioessays 17(12):1055-1063; and Lavrosky, Y. et al. (1997) Biochem. Mol. Med. 62(1):11-22.) An antisense sequence is a polynucleotide sequence capable of specifically hybridizing to at least a portion of the target sequence. Antisense sequences bind to cellular mRNA and/or genomic DNA, affecting translation and/or transcription. Antisense sequences can be DNA, RNA, or nucleic acid mimics and analogs. (See, e.g., Rossi, J.J. et al. (1991) Antisense Res. Dev. 1(3):285-288; Lee, R. et al. (1998) Biochemistry 37(3):900-1010; Pardridge,
W.M. et al. (1995) Proc. Natl. Acad. Sci. USA 92(12):5592-5596; and Nielsen, P. E. and Haaima, G. (1997) Chem. Soc. Rev. 96:73-78.) Typically, the binding which results in modulation of expression occurs through hybridization or binding of complementary base pairs. Antisense sequences can also bind to DNA duplexes through specific interactions in the major groove of the double helix. The polynucleotides of the present invention and fragments thereof can be used as antisense sequences to modify the expression of the polypeptide encoded by mddt. The antisense sequences can be produced ex vivo, such as by using any of the ABI nucleic acid synthesizer series (Applied Biosystems) or other automated systems known in the art. Antisense sequences can also be produced biologically, such as by transforming an appropriate host cell with an expression vector containing the sequence of interest. (See, e.g., Agrawal, supra.)
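Because an antisense sequence hybridizes to its target through complementary base pairing, a candidate DNA antisense sequence can be derived as the reverse complement of a target region. The following is a minimal sketch of that derivation only; the function name and the example target are hypothetical, and the sketch says nothing about which target regions are effective in practice.

```python
# Maps each DNA or RNA base to its DNA complement (U pairs with A).
_COMPLEMENT = str.maketrans("ACGTUacgtu", "TGCAAtgcaa")


def antisense_dna(target):
    """Return the DNA reverse complement of a DNA or RNA target, written 5' to 3'."""
    seq = target.strip()
    invalid = set(seq.upper()) - set("ACGTU")
    if invalid:
        raise ValueError(f"unexpected characters in sequence: {sorted(invalid)}")
    return seq.translate(_COMPLEMENT)[::-1].upper()


# Hypothetical six-base stretch of an mRNA target.
example = antisense_dna("AUGGCC")
```

Here `example` is the DNA oligonucleotide that would base-pair with the RNA stretch AUGGCC.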
In therapeutic use, any gene delivery system suitable for introduction of the antisense sequences into appropriate target cells can be used. Antisense sequences can be delivered intracellularly in the form of an expression plasmid which, upon transcription, produces a sequence complementary to at least a portion of the cellular sequence encoding the target protein. (See, e.g., Slater, J.E., et al. (1998) J. Allergy Clin. Immunol. 102(3):469-475; and Scanlon, K.J., et al. (1995)
9(13):1288-1296.) Antisense sequences can also be introduced intracellularly through the use of viral vectors, such as retrovirus and adeno-associated virus vectors. (See, e.g., Miller, A.D. (1990) Blood 76:271; Ausubel, F.M. et al. (1995) Current Protocols in Molecular Biology, John Wiley & Sons, New York NY; Uckert, W. and W. Walther (1994) Pharmacol. Ther. 63(3):323-347.) Other gene delivery mechanisms include liposome-derived systems, artificial viral envelopes, and other systems known in the art. (See, e.g., Rossi, J.J. (1995) Br. Med. Bull. 51(1):217-225; Boado, R.J. et al. (1998) J. Pharm. Sci. 87(11):1308-1315; and Morris, M.C. et al. (1997) Nucleic Acids Res. 25(14):2730-2736.)
Expression
In order to express a biologically active MDDT, the nucleotide sequences encoding MDDT or fragments thereof may be inserted into an appropriate expression vector, i.e., a vector which contains the necessary elements for transcriptional and translational control of the inserted coding sequence in a suitable host. Methods which are well known to those skilled in the art may be used to construct expression vectors containing sequences encoding MDDT and appropriate transcriptional and translational control elements. These methods include in vitro recombinant DNA techniques, synthetic techniques, and in vivo genetic recombination. (See, e.g., Sambrook, supra, Chapters 4, 8, 16, and 17; and Ausubel, supra, Chapters 9, 10, 13, and 16.)
A variety of expression vector/host systems may be utilized to contain and express sequences encoding MDDT. These include, but are not limited to, microorganisms such as bacteria transformed with recombinant bacteriophage, plasmid, or cosmid DNA expression vectors; yeast transformed with yeast expression vectors; insect cell systems infected with viral expression vectors (e.g., baculovirus); plant cell systems transformed with viral expression vectors (e.g., cauliflower mosaic virus, CaMV, or tobacco mosaic virus, TMV) or with bacterial expression vectors (e.g., Ti or pBR322 plasmids); or animal (mammalian) cell systems. (See, e.g., Sambrook, supra; Ausubel, 1995, supra; Van Heeke, G. and S.M. Schuster (1989) J. Biol. Chem. 264:5503-5509; Bitter, G.A. et al. (1987) Methods Enzymol.
153:516-544; Scorer, C.A. et al. (1994) Bio/Technology 12:181-184; Engelhard, E.K. et al. (1994) Proc. Natl. Acad. Sci. USA 91:3224-3227; Sandig, V. et al. (1996) Hum. Gene Ther. 7:1937-1945; Takamatsu, N. (1987) EMBO J. 6:307-311; Coruzzi, G. et al. (1984) EMBO J. 3:1671-1680; Broglie, R. et al. (1984) Science 224:838-843; Winter, J. et al. (1991) Results Probl. Cell Differ. 17:85-105; The McGraw Hill Yearbook of Science and Technology (1992) McGraw Hill, New York NY, pp.
191-196; Logan, J. and T. Shenk (1984) Proc. Natl. Acad. Sci. USA 81:3655-3659; and Harrington, J.J. et al. (1997) Nat. Genet. 15:345-355.) Expression vectors derived from retroviruses, adenoviruses, or herpes or vaccinia viruses, or from various bacterial plasmids, may be used for delivery of nucleotide sequences to the targeted organ, tissue, or cell population. (See, e.g., Di Nicola, M. et al. (1998) Cancer Gen. Ther. 5(6):350-356; Yu, M. et al. (1993) Proc. Natl. Acad. Sci. USA
90(13):6340-6344; Buller, R.M. et al. (1985) Nature 317(6040):813-815; McGregor, D.P. et al. (1994) Mol. Immunol. 31(3):219-226; and Verma, I.M. and N. Somia (1997) Nature 389:239-242.) The invention is not limited by the host cell employed.
For long term production of recombinant proteins in mammalian systems, stable expression of MDDT in cell lines is preferred. For example, sequences encoding MDDT can be transformed into cell lines using expression vectors which may contain viral origins of replication and/or endogenous expression elements and a selectable marker gene on the same or on a separate vector. Any number of selection systems may be used to recover transformed cell lines. (See, e.g., Wigler, M. et al. (1977) Cell 11:223-232; Lowy, I. et al. (1980) Cell 22:817-823; Wigler, M. et al. (1980) Proc. Natl. Acad. Sci. USA 77:3567-3570; Colbere-Garapin, F. et al. (1981) J. Mol. Biol. 150:1-14; Hartman, S.C. and R.C. Mulligan (1988) Proc. Natl. Acad. Sci. USA 85:8047-8051; Rhodes, C.A. (1995) Methods Mol. Biol. 55:121-131.)
Therapeutic Uses of mddt
The mddt of the invention may be used for somatic or germline gene therapy. Gene therapy may be performed to (i) correct a genetic deficiency (e.g., in the cases of severe combined immunodeficiency (SCID)-X1 disease characterized by X-linked inheritance (Cavazzana-Calvo, M. et al. (2000) Science 288:669-672), severe combined immunodeficiency syndrome associated with an inherited adenosine deaminase (ADA) deficiency (Blaese, R.M. et al. (1995) Science 270:475-480; Bordignon, C. et al. (1995) Science 270:470-475), cystic fibrosis (Zabner, J. et al. (1993) Cell 75:207-216; Crystal, R.G. et al. (1995) Hum. Gene Therapy 6:643-666; Crystal, R.G. et al. (1995) Hum. Gene Therapy 6:667-703), thalassemias, familial hypercholesterolemia, and hemophilia resulting from Factor VIII or Factor IX deficiencies (Crystal, R.G. (1995) Science 270:404-410; Verma, I.M. and Somia, N. (1997) Nature 389:239-242)), (ii) express a conditionally lethal gene product (e.g., in the case of cancers which result from unregulated cell proliferation), or (iii) express a protein which affords protection against intracellular parasites (e.g., against human retroviruses, such as human immunodeficiency virus (HIV) (Baltimore, D. (1988) Nature 335:395-396; Poeschla, E. et al. (1996) Proc. Natl. Acad. Sci. USA 93:11395-11399), hepatitis B or C virus (HBV, HCV); fungal parasites, such as Candida albicans and Paracoccidioides brasiliensis; and protozoan parasites such as Plasmodium falciparum and Trypanosoma cruzi). In the case where a genetic deficiency in mddt expression or regulation causes disease, the expression of mddt from an appropriate population of transduced cells may alleviate the clinical manifestations caused by the genetic deficiency.
In a further embodiment of the invention, diseases or disorders caused by deficiencies in mddt are treated by constructing mammalian expression vectors comprising mddt and introducing these vectors by mechanical means into mddt-deficient cells. Mechanical transfer technologies for use with cells in vivo or ex vitro include (i) direct DNA microinjection into individual cells, (ii) ballistic gold particle delivery, (iii) liposome-mediated transfection, (iv) receptor-mediated gene transfer, and (v) the use of DNA transposons (Morgan, R.A. and Anderson, W.F. (1993) Annu. Rev. Biochem. 62:191-217; Ivics, Z. (1997) Cell 91:501-510; Boulay, J-L. and Recipon, H. (1998) Curr. Opin. Biotechnol. 9:445-450). Expression vectors that may be effective for the expression of mddt include, but are not limited to, the PCDNA 3.1, EPITAG, PRCCMV2, PREP, PVAX vectors (Invitrogen, Carlsbad CA), PCMV-SCRIPT, PCMV-TAG, PEGSH/PERV (Stratagene, La Jolla CA), and PTET-OFF, PTET-ON, PTRE2, PTRE2-LUC, PTK-HYG (Clontech, Palo Alto CA). The mddt of the invention may be expressed using (i) a constitutively active promoter (e.g., from cytomegalovirus (CMV), Rous sarcoma virus (RSV), SV40 virus, thymidine kinase (TK), or β-actin genes), (ii) an inducible promoter (e.g., the tetracycline-regulated promoter (Gossen, M. and Bujard, H. (1992) Proc. Natl. Acad. Sci. U.S.A. 89:5547-5551; Gossen, M. et al. (1995) Science 268:1766-1769; Rossi, F.M.V. and Blau, H.M. (1998) Curr. Opin. Biotechnol. 9:451-456), commercially available in the T-REX plasmid (Invitrogen); the ecdysone-inducible promoter (available in the plasmids PVGRXR and PIND; Invitrogen); the FK506/rapamycin inducible promoter; or the RU486/mifepristone inducible promoter (Rossi, F.M.V. and Blau, H.M., supra)), or (iii) a tissue-specific promoter or the native promoter of the endogenous gene encoding MDDT from a normal individual.
Commercially available liposome transformation kits (e.g., the PERFECT LIPID TRANSFECTION KIT, available from Invitrogen) allow one with ordinary skill in the art to deliver polynucleotides to target cells in culture and require minimal effort to optimize experimental parameters. In the alternative, transformation is performed using the calcium phosphate method (Graham, F.L. and Eb, A.J. (1973) Virology 52:456-467), or by electroporation (Neumann, E. et al. (1982) EMBO J. 1:841-845). The introduction of DNA to primary cells requires modification of these standardized mammalian transfection protocols.
In another embodiment of the invention, diseases or disorders caused by genetic defects with respect to mddt expression are treated by constructing a retrovirus vector consisting of (i) mddt under the control of an independent promoter or the retrovirus long terminal repeat (LTR) promoter, (ii) appropriate RNA packaging signals, and (iii) a Rev-responsive element (RRE) along with additional retrovirus cis-acting RNA sequences and coding sequences required for efficient vector propagation.
Retrovirus vectors (e.g., PFB and PFBNEO) are commercially available (Stratagene) and are based on published data (Riviere, I. et al. (1995) Proc. Natl. Acad. Sci. U.S.A. 92:6733-6737), incorporated by reference herein. The vector is propagated in an appropriate vector producing cell line (VPCL) that expresses an envelope gene with a tropism for receptors on the target cells or a promiscuous envelope protein such as VSVg (Armentano, D. et al. (1987) J. Virol. 61:1647-1650; Bender, M.A. et al. (1987) J. Virol. 61:1639-1646; Adam, M.A. and Miller, A.D. (1988) J. Virol. 62:3802-3806; Dull, T. et al. (1998) J. Virol. 72:8463-8471; Zufferey, R. et al. (1998) J. Virol. 72:9873-9880). U.S. Patent Number 5,910,434 to Rigg ("Method for obtaining retrovirus packaging cell lines producing high transducing efficiency retroviral supernatant") discloses a method for obtaining retrovirus packaging cell lines and is hereby incorporated by reference. Propagation of retrovirus vectors, transduction of a population of cells (e.g., CD4+ T-cells), and the return of transduced cells to a patient are procedures well known to persons skilled in the art of gene therapy and have been well documented (Ranga, U. et al. (1997) J. Virol. 71:7020-7029; Bauer, G. et al. (1997) Blood 89:2259-2267; Bonyhadi, M.L. (1997) J. Virol. 71:4707-4716; Ranga, U. et al. (1998) Proc. Natl. Acad. Sci. U.S.A. 95:1201-1206; Su, L. (1997) Blood 89:2283-2290).
In the alternative, an adenovirus-based gene therapy delivery system is used to deliver mddt to cells which have one or more genetic abnormalities with respect to the expression of mddt. The construction and packaging of adenovirus-based vectors are well known to those with ordinary skill in the art. Replication defective adenovirus vectors have proven to be versatile for importing genes encoding immunoregulatory proteins into intact islets in the pancreas (Csete, M.E. et al. (1995)
Transplantation 27:263-268). Potentially useful adenoviral vectors are described in U.S. Patent Number 5,707,618 to Armentano ("Adenovirus vectors for gene therapy"), hereby incorporated by reference. For adenoviral vectors, see also Antinozzi, P.A. et al. (1999) Annu. Rev. Nutr. 19:511-544 and Verma, I.M. and Somia, N. (1997) Nature 389:239-242, both incorporated by reference herein.
In another alternative, a herpes-based gene therapy delivery system is used to deliver mddt to target cells which have one or more genetic abnormalities with respect to the expression of mddt. The use of herpes simplex virus (HSV)-based vectors may be especially valuable for introducing mddt to cells of the central nervous system, for which HSV has a tropism. The construction and packaging of herpes-based vectors are well known to those with ordinary skill in the art. A replication-competent herpes simplex virus (HSV) type 1-based vector has been used to deliver a reporter gene to the eyes of primates (Liu, X. et al. (1999) Exp. Eye Res. 169:385-395). The construction of a HSV-1 virus vector has also been disclosed in detail in U.S. Patent Number 5,804,413 to DeLuca ("Herpes simplex virus strains for gene transfer"), which is hereby incorporated by reference. U.S. Patent Number 5,804,413 teaches the use of recombinant HSV d92 which consists of a genome containing at least one exogenous gene to be transferred to a cell under the control of the appropriate promoter for purposes including human gene therapy. Also taught by this patent are the construction and use of recombinant HSV strains deleted for ICP4, ICP27 and ICP22. For HSV vectors, see also Goins, W. F. et al. (1999) J. Virol. 73:519-532 and Xu, H. et al. (1994) Dev. Biol. 163:152-161, hereby incorporated by reference.
The manipulation of cloned herpesvirus sequences, the generation of recombinant virus following the transfection of multiple plasmids containing different segments of the large herpesvirus genomes, the growth and propagation of herpesvirus, and the infection of cells with herpesvirus are techniques well known to those of ordinary skill in the art.
In another alternative, an alphavirus (positive, single-stranded RNA virus) vector is used to deliver mddt to target cells. The biology of the prototypic alphavirus, Semliki Forest Virus (SFV), has been studied extensively and gene transfer vectors have been based on the SFV genome (Garoff, H. and Li, K-J. (1998) Curr. Opin. Biotech. 9:464-469). During alphavirus RNA replication, a subgenomic RNA is generated that normally encodes the viral capsid proteins. This subgenomic RNA replicates to higher levels than the full-length genomic RNA, resulting in the overproduction of capsid proteins relative to the viral proteins with enzymatic activity (e.g., protease and polymerase). Similarly, inserting mddt into the alphavirus genome in place of the capsid-coding region results in the production of a large number of mddt RNAs and the synthesis of high levels of MDDT in vector transduced cells. While alphavirus infection is typically associated with cell lysis within a few days, the ability to establish a persistent infection in hamster normal kidney cells (BHK-21) with a variant of Sindbis virus (SIN) indicates that the lytic replication of alphaviruses can be altered to suit the needs of the gene therapy application (Dryga, S.A. et al. (1997) Virology 228:74-83). The wide host range of alphaviruses will allow the introduction of mddt into a variety of cell types. The specific transduction of a subset of cells in a population may require the sorting of cells prior to transduction. The methods of manipulating infectious cDNA clones of alphaviruses, performing alphavirus cDNA and RNA transfections, and performing alphavirus infections, are well known to those with ordinary skill in the art.
Antibodies
Anti-MDDT antibodies may be used to analyze protein expression levels. Such antibodies include, but are not limited to, polyclonal, monoclonal, chimeric, single chain, and Fab fragments. For descriptions of and protocols of antibody technologies, see, e.g., Pound J.D. (1998) Immunochemical
Protocols. Humana Press, Totowa, NJ.
The amino acid sequence encoded by the mddt of the Sequence Listing may be analyzed by appropriate software (e.g., LASERGENE NAVIGATOR software, DNASTAR) to determine regions of high immunogenicity. The optimal sequences for immunization are selected from the C-terminus, the N-terminus, and those intervening, hydrophilic regions of the polypeptide which are likely to be exposed to the external environment when the polypeptide is in its natural conformation.
Analysis used to select appropriate epitopes is also described by Ausubel (1997, supra, Chapter 11.7).
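The search for hydrophilic regions can be illustrated with a sliding-window hydropathy scan. This sketch uses the Kyte-Doolittle hydropathy scale as a stand-in; it is not the algorithm of the software named above, hydrophilicity is only a rough proxy for surface exposure, and the window length and example sequence are hypothetical.

```python
# Kyte-Doolittle hydropathy values; more negative means more hydrophilic.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_scores(sequence, window=7):
    """Mean hydropathy of every window of the given length along the sequence."""
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    return [
        sum(KYTE_DOOLITTLE[aa] for aa in seq[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


def most_hydrophilic_window(sequence, window=7):
    """Return (start index, peptide, mean hydropathy) of the most hydrophilic window."""
    scores = window_scores(sequence, window)
    start = min(range(len(scores)), key=scores.__getitem__)
    return start, sequence[start:start + window].upper(), scores[start]


# Hypothetical sequence: a charged stretch flanked by hydrophobic leucines.
start, peptide, score = most_hydrophilic_window("LLLLLLLKDEKRDELLLLLLL")
```

A candidate region found this way would still need to be checked against the other criteria in the text, such as proximity to a terminus and likely exposure in the native conformation.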
Peptides used for antibody induction do not need to have biological activity; however, they must be antigenic. Peptides used to induce specific antibodies may have an amino acid sequence consisting of at least five amino acids, preferably at least 10 amino acids, and most preferably at least 15 amino acids. A peptide which mimics an antigenic fragment of the natural polypeptide may be fused with another protein such as keyhole limpet hemocyanin (KLH; Sigma, St. Louis MO) for antibody production. A peptide encompassing an antigenic region may be expressed from an mddt, synthesized as described above, or purified from human cells. Procedures well known in the art may be used for the production of antibodies. Various hosts including mice, goats, and rabbits, may be immunized by injection with a peptide. Depending on the host species, various adjuvants may be used to increase immunological response.
In one procedure, peptides about 15 residues in length may be synthesized using an ABI 431A peptide synthesizer (Applied Biosystems) using Fmoc chemistry and coupled to KLH (Sigma) by reaction with M-maleimidobenzoyl-N-hydroxysuccinimide ester (Ausubel, 1995, supra). Rabbits are immunized with the peptide-KLH complex in complete Freund's adjuvant. The resulting antisera are tested for antipeptide activity by binding the peptide to plastic, blocking with 1% bovine serum albumin (BSA), reacting with rabbit antisera, washing, and reacting with radioiodinated goat anti-rabbit IgG. Antisera with antipeptide activity are tested for anti-MDDT activity using protocols well known in the art, including ELISA, radioimmunoassay (RIA), and immunoblotting.
In another procedure, isolated and purified peptide may be used to immunize mice (about 100 μg of peptide) or rabbits (about 1 mg of peptide). Subsequently, the peptide is radioiodinated and used to screen the immunized animals' B-lymphocytes for production of antipeptide antibodies. Positive cells are then used to produce hybridomas using standard techniques. About 20 mg of peptide is sufficient for labeling and screening several thousand clones. Hybridomas of interest are detected by screening with radioiodinated peptide to identify those fusions producing peptide-specific monoclonal antibody. In a typical protocol, wells of a multi-well plate (FAST, Becton-Dickinson, Palo Alto, CA) are coated with affinity-purified, specific rabbit-anti-mouse (or suitable anti-species IgG) antibodies at 10 mg/ml. The coated wells are blocked with 1% BSA and washed and exposed to supernatants from hybridomas. After incubation, the wells are exposed to radiolabeled peptide at 1 mg/ml.
Clones producing antibodies bind a quantity of labeled peptide that is detectable above background. Such clones are expanded and subjected to 2 cycles of cloning. Cloned hybridomas are injected into pristane-treated mice to produce ascites, and monoclonal antibody is purified from the ascitic fluid by affinity chromatography on protein A (Amersham Pharmacia Biotech). Several procedures for the production of monoclonal antibodies, including in vitro production, are described in
Pound (supra). Monoclonal antibodies with antipeptide activity are tested for anti-MDDT activity using protocols well known in the art, including ELISA, RIA, and immunoblotting.
Antibody fragments containing specific binding sites for an epitope may also be generated. For example, such fragments include, but are not limited to, the F(ab')2 fragments produced by pepsin digestion of the antibody molecule, and the Fab fragments generated by reducing the disulfide bridges of the F(ab')2 fragments. Alternatively, construction of Fab expression libraries in filamentous bacteriophage allows rapid and easy identification of monoclonal fragments with desired specificity (Pound, supra, Chaps. 45-47). Antibodies generated against polypeptide encoded by mddt can be used to purify and characterize full-length MDDT protein and its activity, binding partners, etc.
Assays Using Antibodies
Anti-MDDT antibodies may be used in assays to quantify the amount of MDDT found in a particular human cell. Such assays include methods utilizing the antibody and a label to detect expression level under normal or disease conditions. The peptides and antibodies of the invention may be used with or without modification or labeled by joining them, either covalently or noncovalently, with a reporter molecule.
Protocols for detecting and measuring protein expression using either polyclonal or monoclonal antibodies are well known in the art. Examples include ELISA, RIA, and fluorescence-activated cell sorting (FACS). Such immunoassays typically involve the formation of complexes between the MDDT and its specific antibody and the measurement of such complexes. These and other assays are described in Pound (supra).
Without further elaboration, it is believed that one skilled in the art can, using the preceding description, utilize the present invention to its fullest extent. The following preferred specific embodiments are, therefore, to be construed as merely illustrative, and not limitative of the remainder of the disclosure in any way whatsoever.
The disclosures of all patents, applications, and publications mentioned above and below, including U.S. Serial No. 60/230,517, U.S. Serial No. 60/230,599, U.S. Serial No. 60/230,514, U.S. Serial No. 60/231,167, U.S. Serial No. 60/230,598, U.S. Serial No. 60/230,988, U.S. Serial No. 60/230,518, U.S. Serial No. 60/230,515, U.S. Serial No. 60/229,751, U.S. Serial No. 60/230,610, U.S. Serial No. 60/229,749, U.S. Serial No. 60/229,750, U.S. Serial No. 60/230,597, U.S. Serial No. 60/230,505, U.S. Serial No. 60/231,163, U.S. Serial No. 60/229,747, U.S. Serial No. 60/229,748, U.S. Serial No. 60/230,583, U.S. Serial No. 60/230,519, U.S. Serial No. 60/230,595, U.S. Serial No. 60/230,865, U.S. Serial No. 60/230,989, and U.S. Serial No. 60/230,951, are hereby expressly incorporated by reference.
EXAMPLES
I. Construction of cDNA Libraries
RNA was purchased from CLONTECH Laboratories, Inc. (Palo Alto CA) or isolated from various tissues. Some tissues were homogenized and lysed in guanidinium isothiocyanate, while others were homogenized and lysed in phenol or in a suitable mixture of denaturants, such as TRIZOL (Life Technologies), a monophasic solution of phenol and guanidine isothiocyanate. The resulting lysates were centrifuged over CsCl cushions or extracted with chloroform. RNA was precipitated with either isopropanol or sodium acetate and ethanol, or by other routine methods.
Phenol extraction and precipitation of RNA were repeated as necessary to increase RNA purity. In most cases, RNA was treated with DNase. For most libraries, poly(A+) RNA was isolated using oligo d(T)-coupled paramagnetic particles (Promega Corporation (Promega), Madison WI), OLIGOTEX latex particles (QIAGEN, Inc. (QIAGEN), Valencia CA), or an OLIGOTEX mRNA purification kit (QIAGEN). Alternatively, RNA was isolated directly from tissue lysates using other RNA isolation kits, e.g., the POLY(A)PURE mRNA purification kit (Ambion, Inc., Austin TX). In some cases, Stratagene was provided with RNA and constructed the corresponding cDNA libraries. Otherwise, cDNA was synthesized and cDNA libraries were constructed with the UNIZAP vector system (Stratagene Cloning Systems, Inc. (Stratagene), La Jolla CA) or SUPERSCRIPT plasmid system (Life Technologies), using the recommended procedures or similar methods known in the art. (See, e.g., Ausubel, 1997, supra, Chapters 5.1 through 6.6.) Reverse transcription was initiated using oligo d(T) or random primers. Synthetic oligonucleotide adapters were ligated to double stranded cDNA, and the cDNA was digested with the appropriate restriction enzyme or enzymes. For most libraries, the cDNA was size-selected (300-1000 bp) using SEPHACRYL S1000, SEPHAROSE CL2B, or SEPHAROSE CL4B column chromatography (Amersham Pharmacia Biotech) or preparative agarose gel electrophoresis. cDNAs were ligated into compatible restriction enzyme sites of the polylinker of a suitable plasmid, e.g., PBLUESCRIPT plasmid (Stratagene), PSPORT1 plasmid (Life Technologies), PCDNA2.1 plasmid (Invitrogen, Carlsbad CA), PBK-CMV plasmid (Stratagene), or pINCY (Incyte Genomics, Palo Alto CA), or derivatives thereof. Recombinant plasmids were transformed into competent E. coli cells including XL1-Blue, XL1-BlueMRF, or SOLR from Stratagene or DH5α, DH10B, or ElectroMAX DH10B from Life Technologies.
II. Isolation of cDNA Clones
Plasmids were recovered from host cells by in vivo excision using the UNIZAP vector system (Stratagene) or by cell lysis. Plasmids were purified using at least one of the following: the Magic or WIZARD Minipreps DNA purification system (Promega); the AGTC Miniprep purification kit (Edge BioSystems, Gaithersburg MD); and the QIAWELL 8, QIAWELL 8 Plus, and QIAWELL 8 Ultra plasmid purification systems or the R.E.A.L. PREP 96 plasmid purification kit (QIAGEN). Following precipitation, plasmids were resuspended in 0.1 ml of distilled water and stored, with or without lyophilization, at 4°C. Alternatively, plasmid DNA was amplified from host cell lysates using direct link PCR in a high-throughput format. (Rao, V.B. (1994) Anal. Biochem. 216:1-14.) Host cell lysis and thermal cycling steps were carried out in a single reaction mixture. Samples were processed and stored in 384-well plates, and the concentration of amplified plasmid DNA was quantified fluorometrically using PICOGREEN dye (Molecular Probes, Inc. (Molecular Probes), Eugene OR) and a FLUOROSKAN II fluorescence scanner (Labsystems Oy, Helsinki, Finland).
III. Sequencing and Analysis
cDNA sequencing reactions were processed using standard methods or high-throughput instrumentation such as the ABI CATALYST 800 thermal cycler (Applied Biosystems) or the PTC-200 thermal cycler (MJ Research) in conjunction with the HYDRA microdispenser (Robbins Scientific Corp., Sunnyvale CA) or the MICROLAB 2200 liquid transfer system (Hamilton). cDNA sequencing reactions were prepared using reagents provided by Amersham Pharmacia Biotech or supplied in ABI sequencing kits such as the ABI PRISM BIGDYE Terminator cycle sequencing ready reaction kit (Applied Biosystems). Electrophoretic separation of cDNA sequencing reactions and detection of labeled polynucleotides were carried out using the MEGABACE 1000 DNA sequencing system (Molecular Dynamics); the ABI PRISM 373 or 377 sequencing system (Applied Biosystems) in conjunction with standard ABI protocols and base calling software; or other sequence analysis systems known in the art. Reading frames within the cDNA sequences were identified using standard methods (reviewed in Ausubel, 1997, supra, Chapter 7.7). Some of the cDNA sequences were selected for extension using the techniques disclosed in Example VIII.
IV. Assembly and Analysis of Sequences
Component sequences from chromatograms were subject to PHRED analysis and assigned a quality score. The sequences having at least a required quality score were subject to various pre-processing editing pathways to eliminate, e.g., low quality 3' ends, vector and linker sequences, polyA tails, Alu repeats, mitochondrial and ribosomal sequences, bacterial contamination sequences, and sequences smaller than 50 base pairs. In particular, low-information sequences and repetitive elements (e.g., dinucleotide repeats, Alu repeats, etc.) were replaced by "n's", or masked, to prevent spurious matches. Processed sequences were then subject to assembly procedures in which the sequences were assigned to gene bins (bins). Each sequence could only belong to one bin. Sequences in each gene bin were assembled to produce consensus sequences (templates). Subsequent new sequences were added to existing bins using BLASTn (v.1.4 WashU) and CROSSMATCH. Candidate pairs were identified as all BLAST hits having a quality score greater than or equal to 150. Alignments of at least 82% local identity were accepted into the bin. The component sequences from each bin were assembled using a version of PHRAP. Bins with several overlapping component sequences were assembled using DEEP PHRAP. The orientation (sense or antisense) of each assembled template was determined based on the number and orientation of its component sequences. Template sequences as disclosed in the sequence listing correspond to sense strand sequences (the "forward" reading frames), to the best determination. The complementary (antisense) strands are inherently disclosed herein. The component sequences which were used to assemble each template consensus sequence are listed in Table 5, along with their positions along the template nucleotide sequences.
Bins were compared against each other and those having local similarity of at least 82% were combined and reassembled. Reassembled bins having templates of insufficient overlap (less than 95% local identity) were re-split. Assembled templates were also subject to analysis by STITCHER/EXON MAPPER algorithms which analyze the probabilities of the presence of splice variants, alternatively spliced exons, splice junctions, differential expression of alternative spliced genes across tissue types or disease states, etc. These resulting bins were subject to several rounds of the above assembly procedures. Once gene bins were generated based upon sequence alignments, bins were clone joined based upon clone information. If the 5' sequence of one clone was present in one bin and the 3' sequence from the same clone was present in a different bin, it was likely that the two bins actually belonged together in a single bin. The resulting combined bins underwent assembly procedures to regenerate the consensus sequences. The final assembled templates were subsequently annotated using the following procedure.
Template sequences were analyzed using BLASTn (v2.0, NCBI) versus gbpri (GenBank version 124). "Hits" were defined as an exact match having from 95% local identity over 200 base pairs through 100% local identity over 100 base pairs, or a homolog match having an E-value, i.e., a probability score, of < 1 x 10⁻⁸. The hits were subject to frameshift FASTx versus GENPEPT (GenBank version 124). (See Table 8.) In this analysis, a homolog match was defined as having an E-value of < 1 x 10⁻⁸. The assembly method used above was described in "System and Methods for Analyzing Biomolecular Sequences," U.S.S.N. 09/276,534, filed March 25, 1999, and the LIFESEQ Gold user manual (Incyte), both incorporated by reference herein.
Following assembly, template sequences were subjected to motif, BLAST, and functional analyses, and categorized in protein hierarchies using methods described in, e.g., "Database System Employing Protein Function Hierarchies for Viewing Biomolecular Sequence Data," U.S.S.N. 08/812,290, filed March 6, 1997; "Relational Database for Storing Biomolecule Information," U.S.S.N. 08/947,845, filed October 9, 1997; "Project-Based Full-Length Biomolecular Sequence Database," U.S.S.N. 08/811,758, filed March 6, 1997; and "Relational Database and System for Storing Information Relating to Biomolecular Sequences," U.S.S.N. 09/034,807, filed March 4, 1998, all of which are incorporated by reference herein.
The template sequences were further analyzed by translating each template in all three forward reading frames and searching each translation against the Pfam database of hidden Markov model-based protein families and domains using the HMMER software package (available to the public from Washington University School of Medicine, St. Louis MO). Regions of templates which, when translated, contain similarity to Pfam consensus sequences are reported in Table 3, along with descriptions of Pfam protein domains and families. Only those Pfam hits with an E-value of ≤ 1 x 10⁻³ are reported. (See also World Wide Web site http://pfam.wustl.edu/ for detailed descriptions of Pfam protein domains and families.) Additionally, the template sequences were translated in all three forward reading frames, and each translation was searched against hidden Markov models for signal peptides using the HMMER software package. Construction of hidden Markov models and their usage in sequence analysis has been described. (See, for example, Eddy, S.R. (1996) Curr. Opin. Str. Biol. 6:361-365.) Only those signal peptide hits with a cutoff score of 11 bits or greater are reported. A cutoff score of 11 bits or greater corresponds to at least about 91-94% true-positives in signal peptide prediction. Template sequences were also translated in all three forward reading frames, and each translation was searched against TMAP, a program that uses weight matrices to delineate transmembrane segments on protein sequences and determine orientation with respect to the cell cytosol (Persson, B. and P. Argos (1994) J. Mol. Biol. 237:182-192; Persson, B. and P. Argos (1996) Protein Sci. 5:363-371). Regions of templates which, when translated, contain similarity to signal peptide or transmembrane consensus sequences are reported in Table 4.
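By way of illustration only, the translation of a template in all three forward reading frames may be sketched as follows. The sketch assumes the standard genetic code; the function name translate_forward_frames and the use of "X" for codons containing masked bases are hypothetical choices for this example and are not taken from the HMMER or TMAP packages.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate_forward_frames(template):
    """Translate a template in forward frames 1, 2, and 3.

    Stop codons are written as '*'. Codons containing masked or
    ambiguous bases (for example 'N') are written as 'X' (an assumption
    made for this sketch).
    """
    seq = template.upper()
    frames = {}
    for offset in range(3):
        codons = (seq[i:i + 3] for i in range(offset, len(seq) - 2, 3))
        frames[offset + 1] = "".join(CODON_TABLE.get(c, "X") for c in codons)
    return frames


frames = translate_forward_frames("ATGGCCTAA")
```

Each of the three translations would then be searched separately against the protein family, signal peptide, or transmembrane models.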
The results of HMMER analysis as reported in Tables 3 and 4 may support the results of BLAST analysis as reported in Table 2 or may suggest alternative or additional properties of template-encoded polypeptides not previously uncovered by BLAST or other analyses. Template sequences are further analyzed using the bioinformatics tools listed in Table 8, or using sequence analysis software known in the art such as MACDNASIS PRO software (Hitachi Software Engineering, South San Francisco CA) and LASERGENE software (DNASTAR). Template sequences may be further queried against public databases such as the GenBank rodent, mammalian, vertebrate, prokaryote, and eukaryote databases. The template sequences were translated to derive the corresponding longest open reading frame as presented by the polypeptide sequences as reported in Table 2. Alternatively, a polypeptide of the invention may begin at any of the methionine residues within the full length translated polypeptide. Polypeptide sequences were subsequently analyzed by querying against the GenBank protein database (GENPEPT, (GenBank version 124)). Full length polynucleotide sequences are also analyzed using MACDNASIS PRO software (Hitachi Software Engineering, South San Francisco CA) and LASERGENE software (DNASTAR). Polynucleotide and polypeptide sequence alignments are generated using default parameters specified by the CLUSTAL algorithm as incorporated into the MEGALIGN multisequence alignment program (DNASTAR), which also calculates the percent identity between aligned sequences. Table 7 shows sequences with homology to the polypeptides of the invention as identified by BLAST analysis against the GenBank protein (GENPEPT) database. Column 1 shows the polypeptide sequence identification number (SEQ ID NO:) for the polypeptide segments of the invention. Column 2 shows the reading frame used in the translation of the polynucleotide sequences encoding the polypeptide segments. Column 3 shows the length of the translated polypeptide segments. Columns 4 and 5 show the start and stop nucleotide positions of the polynucleotide sequences encoding the polypeptide segments. Column 6 shows the GenBank identification number (GI Number) of the nearest GenBank homolog. Column 7 shows the probability score for the match between each polypeptide and its GenBank homolog. Column 8 shows the annotation of the GenBank homolog.
V. Analysis of Polynucleotide Expression
Northern analysis is a laboratory technique used to detect the presence of a transcript of a gene and involves the hybridization of a labeled nucleotide sequence to a membrane on which RNAs from a particular cell type or tissue have been bound. (See, e.g., Sambrook, supra, ch. 7; Ausubel, 1995, supra, ch. 4 and 16.)
Analogous computer techniques applying BLAST were used to search for identical or related molecules in cDNA databases such as GenBank or LIFESEQ (Incyte Genomics). This analysis is much faster than multiple membrane-based hybridizations. In addition, the sensitivity of the computer search can be modified to determine whether any particular match is categorized as exact or similar. The basis of the search is the product score, which is defined as:
$$\text{Product score} = \frac{\text{BLAST Score} \times \text{Percent Identity}}{5 \times \min\{\text{length(Seq. 1)},\ \text{length(Seq. 2)}\}}$$
The product score takes into account both the degree of similarity between two sequences and the length of the sequence match. The product score is a normalized value between 0 and 100, and is calculated as follows: the BLAST score is multiplied by the percent nucleotide identity and the product is divided by (5 times the length of the shorter of the two sequences). The BLAST score is calculated by assigning a score of +5 for every base that matches in a high-scoring segment pair (HSP), and -4 for every mismatch. Two sequences may share more than one HSP (separated by gaps). If there is more than one HSP, then the pair with the highest BLAST score is used to calculate the product score. The product score represents a balance between fractional overlap and quality in a BLAST alignment. For example, a product score of 100 is produced only for 100% identity over the entire length of the shorter of the two sequences being compared. A product score of 70 is produced either by 100% identity and 70% overlap at one end, or by 88% identity and 100% overlap at the other. A product score of 50 is produced either by 100% identity and 50% overlap at one end, or 79% identity and 100% overlap.
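The calculation described above may be sketched as follows. This is a minimal illustration that assumes a single ungapped HSP described by its counts of matching and mismatching bases; the function names blast_score and product_score are hypothetical and do not refer to any BLAST program interface.

```python
def blast_score(matches, mismatches):
    """Score one ungapped HSP: +5 per matching base, -4 per mismatch."""
    return 5 * matches - 4 * mismatches


def product_score(score, percent_identity, len_seq1, len_seq2):
    """(BLAST score x percent identity) / (5 x length of the shorter sequence)."""
    return score * percent_identity / (5 * min(len_seq1, len_seq2))


# Shorter sequence of 200 bases, for illustration.
full_match = product_score(blast_score(200, 0), 100, 200, 500)     # 100% identity, full overlap
partial_overlap = product_score(blast_score(140, 0), 100, 200, 500)  # 100% identity, 70% overlap
lower_identity = product_score(blast_score(176, 24), 88, 200, 500)   # 88% identity, full overlap
```

Under these assumptions the first two cases give 100 and 70, and the third gives approximately 69, in line with the approximate product score of 70 stated above for 88% identity with 100% overlap.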
VI. Tissue Distribution Profiling
A tissue distribution profile is determined for each template by compiling the cDNA library tissue classifications of its component cDNA sequences. Each component sequence is derived from a cDNA library constructed from a human tissue. Each human tissue is classified into one of the following categories: cardiovascular system; connective tissue; digestive system; embryonic structures; endocrine system; exocrine glands; genitalia, female; genitalia, male; germ cells; hemic and immune system; liver; musculoskeletal system; nervous system; pancreas; respiratory system; sense organs; skin; stomatognathic system; unclassified/mixed; or urinary tract. Template sequences, component sequences, and cDNA library/tissue information are found in the LIFESEQ GOLD database (Incyte Genomics, Palo Alto CA).
Table 6 shows the tissue distribution profile for the templates of the invention. For each template, the three most frequently observed tissue categories are shown in column 3, along with the percentage of component sequences belonging to each category. Only tissue categories with percentage values of ≥10% are shown. A tissue distribution of "widely distributed" in column 3 indicates percentage values of <10% in all tissue categories.
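The compilation of a tissue distribution profile may be sketched as follows. The sketch assumes that each component sequence has already been assigned one tissue category; the function name tissue_distribution and its parameters are hypothetical, and the handling of ties among categories is not specified in the text.

```python
from collections import Counter


def tissue_distribution(component_tissues, top_n=3, min_percent=10.0):
    """Return the most frequent tissue categories with their percentages.

    Only the top_n categories with a percentage of at least min_percent
    are kept. If no category reaches min_percent, the template is
    reported as "widely distributed".
    """
    counts = Counter(component_tissues)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("template has no component sequences")
    profile = [
        (tissue, 100.0 * n / total)
        for tissue, n in counts.most_common(top_n)
        if 100.0 * n / total >= min_percent
    ]
    return profile if profile else "widely distributed"


example = tissue_distribution(
    ["nervous system"] * 6 + ["liver"] * 3 + ["skin"] * 1
)
```

In this example the ten component sequences give nervous system (60%), liver (30%), and skin (10%).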
VII. Transcript Image Analysis
Transcript images are generated as described in Seilhamer et al., "Comparative Gene Transcript Analysis," U.S. Patent Number 5,840,484, incorporated herein by reference.
VIII. Extension of Polynucleotide Sequences and Isolation of a Full-length cDNA
Oligonucleotide primers designed using an mddt of the Sequence Listing are used to extend the nucleic acid sequence. One primer is synthesized to initiate 5' extension of the template, and the other primer, to initiate 3' extension of the template. The initial primers may be designed using OLIGO 4.06 software (National Biosciences, Inc. (National Biosciences), Plymouth MN), or another appropriate program, to be about 22 to 30 nucleotides in length, to have a GC content of about 50% or more, and to anneal to the target sequence at temperatures of about 68°C to about 72°C. Any stretch of nucleotides which would result in hairpin structures and primer-primer dimerizations is avoided. Selected human cDNA libraries are used to extend the sequence. If more than one extension is necessary or desired, additional or nested sets of primers are designed.
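A minimal sketch of a first-pass primer screen is given below. It checks only the length and GC content criteria stated above and treats the approximate values as hard cutoffs, which is an assumption; annealing temperature, hairpin formation, and primer-primer dimerization are not evaluated. The function names are hypothetical and do not correspond to the OLIGO 4.06 software.

```python
def gc_percent(primer):
    """Percentage of G and C bases in a primer sequence."""
    seq = primer.upper()
    if not seq:
        raise ValueError("empty primer sequence")
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def passes_basic_primer_screen(primer, min_len=22, max_len=30, min_gc=50.0):
    """Check length (22 to 30 nt) and GC content (50% or more) only."""
    return min_len <= len(primer) <= max_len and gc_percent(primer) >= min_gc


candidate = "GCGCATATGCGCATATGCGCAT"  # 22 nt, 12 G/C bases
accepted = passes_basic_primer_screen(candidate)
```

A candidate that passes this screen would still need to be checked for annealing temperature and secondary structure with an appropriate design program.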
High fidelity amplification is obtained by PCR using methods well known in the art. PCR is performed in 96-well plates using the PTC-200 thermal cycler (MJ Research). The reaction mix contains DNA template, 200 nmol of each primer, reaction buffer containing Mg2+, (NH4)2SO4, and β-mercaptoethanol, Taq DNA polymerase (Amersham Pharmacia Biotech), ELONGASE enzyme (Life Technologies), and Pfu DNA polymerase (Stratagene), with the following parameters for primer pair PCI A and PCI B: Step 1: 94°C, 3 min; Step 2: 94°C, 15 sec; Step 3: 60°C, 1 min; Step 4: 68°C, 2 min; Step 5: Steps 2, 3, and 4 repeated 20 times; Step 6: 68°C, 5 min; Step 7: storage at 4°C. In the alternative, the parameters for primer pair T7 and SK+ are as follows: Step 1: 94°C, 3 min; Step 2: 94°C, 15 sec; Step 3: 57°C, 1 min; Step 4: 68°C, 2 min; Step 5: Steps 2, 3, and 4 repeated 20 times; Step 6: 68°C, 5 min; Step 7: storage at 4°C.
The concentration of DNA in each well is determined by dispensing 100 μl PICOGREEN quantitation reagent (0.25% (v/v); Molecular Probes) dissolved in 1X Tris-EDTA (TE) and 0.5 μl of undiluted PCR product into each well of an opaque fluorimeter plate (Corning Incorporated (Corning), Corning NY), allowing the DNA to bind to the reagent. The plate is scanned in a FLUOROSKAN II (Labsystems Oy) to measure the fluorescence of the sample and to quantify the concentration of DNA. A 5 μl to 10 μl aliquot of the reaction mixture is analyzed by electrophoresis on a 1% agarose mini-gel to determine which reactions are successful in extending the sequence.
The extended nucleotides are desalted and concentrated, transferred to 384-well plates, digested with CviJI chlorella virus endonuclease (Molecular Biology Research, Madison WI), and sonicated or sheared prior to religation into pUC18 vector (Amersham Pharmacia Biotech). For shotgun sequencing, the digested nucleotides are separated on low concentration (0.6 to 0.8%) agarose gels, fragments are excised, and agar digested with AGAR ACE (Promega). Extended clones are religated using T4 ligase (New England Biolabs, Inc., Beverly MA) into pUC18 vector (Amersham Pharmacia Biotech), treated with Pfu DNA polymerase (Stratagene) to fill-in restriction site overhangs, and transfected into competent E. coli cells. Transformed cells are selected on antibiotic-containing media, and individual colonies are picked and cultured overnight at 37°C in 384-well plates in LB/2x carbenicillin liquid media.
The cells are lysed, and DNA is amplified by PCR using Taq DNA polymerase (Amersham Pharmacia Biotech) and Pfu DNA polymerase (Stratagene) with the following parameters: Step 1: 94°C, 3 min; Step 2: 94°C, 15 sec; Step 3: 60°C, 1 min; Step 4: 72°C, 2 min; Step 5: steps 2, 3, and 4 repeated 29 times; Step 6: 72°C, 5 min; Step 7: storage at 4°C. DNA is quantified by PICOGREEN reagent (Molecular Probes) as described above. Samples with low DNA recoveries are reamplified using the same conditions as described above. Samples are diluted with 20% dimethylsulfoxide (1:2, v/v), and sequenced using DYENAMIC energy transfer sequencing primers and the DYENAMIC DIRECT kit (Amersham Pharmacia Biotech) or the ABI PRISM BIGDYE Terminator cycle sequencing ready reaction kit (Applied Biosystems). In like manner, the mddt is used to obtain regulatory sequences (promoters, introns, and enhancers) using the procedure above, oligonucleotides designed for such extension, and an appropriate genomic library.
IX. Labeling of Probes and Southern Hybridization Analyses
Hybridization probes derived from the mddt of the Sequence Listing are employed for screening cDNAs, mRNAs, or genomic DNA. The labeling of probe nucleotides between 100 and 1000 nucleotides in length is specifically described, but essentially the same procedure may be used with larger cDNA fragments. Probe sequences are labeled at room temperature for 30 minutes using a T4 polynucleotide kinase, γ32P-ATP, and 0.5X One-Phor-All Plus (Amersham Pharmacia Biotech) buffer and purified using a ProbeQuant G-50 Microcolumn (Amersham Pharmacia Biotech). The probe mixture is diluted to 10⁷ dpm/μg/ml hybridization buffer and used in a typical membrane-based hybridization analysis.
The DNA is digested with a restriction endonuclease such as Eco RV and is electrophoresed through a 0.7% agarose gel. The DNA fragments are transferred from the agarose to nylon membrane (NYTRAN Plus, Schleicher & Schuell, Inc., Keene NH) using procedures specified by the manufacturer of the membrane. Prehybridization is carried out for three or more hours at 68°C, and hybridization is carried out overnight at 68°C. To remove non-specific signals, blots are sequentially washed at room temperature under increasingly stringent conditions, up to 0.1x saline sodium citrate (SSC) and 0.5% sodium dodecyl sulfate. After the blots are placed in a PHOSPHORIMAGER cassette (Molecular Dynamics) or are exposed to autoradiography film, hybridization patterns of standard and experimental lanes are compared. Essentially the same procedure is employed when screening RNA.
X. Chromosome Mapping of mddt
The cDNA sequences which were used to assemble SEQ ID NO:1-252 are compared with sequences from the Incyte LIFESEQ database and public domain databases using BLAST and other implementations of the Smith-Waterman algorithm. Sequences from these databases that match SEQ ID NO:1-252 are assembled into clusters of contiguous and overlapping sequences using assembly algorithms such as PHRAP (Table 8). Radiation hybrid and genetic mapping data available from public resources such as the Stanford Human Genome Center (SHGC), Whitehead Institute for Genome Research (WIGR), and Genethon are used to determine if any of the clustered sequences have been previously mapped. Inclusion of a mapped sequence in a cluster will result in the assignment of all sequences of that cluster, including its particular SEQ ID NO:, to that map location. The genetic map locations of SEQ ID NO:1-252 are described as ranges, or intervals, of human chromosomes. The map position of an interval, in centiMorgans, is measured relative to the terminus of the chromosome's p-arm. (The centiMorgan (cM) is a unit of measurement based on recombination frequencies between chromosomal markers. On average, 1 cM is roughly equivalent to 1 megabase (Mb) of DNA in humans, although this can vary widely due to hot and cold spots of recombination.) The cM distances are based on genetic markers mapped by Genethon which provide boundaries for radiation hybrid markers whose sequences were included in each of the clusters.
XI. Microarray Analysis
Probe Preparation from Tissue or Cell Samples
Total RNA is isolated from tissue samples using the guanidinium thiocyanate method and polyA+ RNA is purified using the oligo (dT) cellulose method. Each polyA+ RNA sample is reverse transcribed using MMLV reverse-transcriptase, 0.05 pg/μl oligo-dT primer (21mer), 1X first strand buffer, 0.03 units/μl RNase inhibitor, 500 μM dATP, 500 μM dGTP, 500 μM dTTP, 40 μM dCTP, 40 μM dCTP-Cy3 (BDS) or dCTP-Cy5 (Amersham Pharmacia Biotech). The reverse transcription reaction is performed in a 25 ml volume containing 200 ng polyA+ RNA with GEMBRIGHT kits (Incyte). Specific control polyA+ RNAs are synthesized by in vitro transcription from non-coding yeast genomic DNA (W. Lei, unpublished). As quantitative controls, the control RNAs at 0.002 ng, 0.02 ng, 0.2 ng, and 2 ng are diluted into reverse transcription reaction at ratios of 1:100,000, 1:10,000, 1:1000, 1:100 (w/w) to sample mRNA respectively. The control mRNAs are diluted into reverse transcription reaction at ratios of 1:3, 3:1, 1:10, 10:1, 1:25, 25:1 (w/w) to sample mRNA differential expression patterns. After incubation at 37°C for 2 hr, each reaction sample (one with Cy3 and another with Cy5 labeling) is treated with 2.5 ml of 0.5M sodium hydroxide and incubated for 20 minutes at 85°C to stop the reaction and degrade the RNA. Probes are purified using two successive CHROMA SPIN 30 gel filtration spin columns (CLONTECH Laboratories, Inc. (CLONTECH), Palo Alto CA) and after combining, both reaction samples are ethanol precipitated using 1 ml of glycogen (1 mg/ml), 60 ml sodium acetate, and 300 ml of 100% ethanol. The probe is then dried to completion using a SpeedVAC (Savant Instruments Inc., Holbrook NY) and resuspended in 14 μl 5X SSC/0.2% SDS.
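The weight ratios stated for the quantitative controls follow from the control RNA amounts and the 200 ng of polyA+ RNA per reaction, as the following sketch illustrates. The function name control_to_sample_ratio is hypothetical, and exact fractions are used only to avoid floating-point rounding in the illustration.

```python
from fractions import Fraction


def control_to_sample_ratio(control_ng, sample_ng="200"):
    """Return N such that control RNA : sample RNA is 1:N (w/w).

    Amounts are passed as decimal strings so that the arithmetic is exact.
    """
    return Fraction(sample_ng) / Fraction(control_ng)


ratios = [control_to_sample_ratio(x) for x in ("0.002", "0.02", "0.2", "2")]
```

For the four control amounts, this gives 1:100,000, 1:10,000, 1:1000, and 1:100 respectively.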
Microarray Preparation
Sequences of the present invention are used to generate array elements. Each array element is amplified from bacterial cells containing vectors with cloned cDNA inserts. PCR amplification uses primers complementary to the vector sequences flanking the cDNA insert. Array elements are amplified in thirty cycles of PCR from an initial quantity of 1-2 ng to a final quantity greater than 5 μg. Amplified array elements are then purified using SEPHACRYL-400 (Amersham Pharmacia Biotech). Purified array elements are immobilized on polymer-coated glass slides. Glass microscope slides (Corning) are cleaned by ultrasound in 0.1% SDS and acetone, with extensive distilled water washes between and after treatments. Glass slides are etched in 4% hydrofluoric acid (VWR Scientific Products Corporation (VWR), West Chester, PA), washed extensively in distilled water, and coated with 0.05% aminopropyl silane (Sigma) in 95% ethanol. Coated slides are cured in a 110°C oven. Array elements are applied to the coated glass substrate using a procedure described in US Patent No. 5,807,522, incorporated herein by reference. 1 μl of the array element DNA, at an average concentration of 100 ng/μl, is loaded into the open capillary printing element by a high-speed robotic apparatus. The apparatus then deposits about 5 nl of array element sample per slide.
Microarrays are UV-crosslinked using a STRATALINKER UV-crosslinker (Stratagene). Microarrays are washed at room temperature once in 0.2% SDS and three times in distilled water.
Non-specific binding sites are blocked by incubation of microarrays in 0.2% casein in phosphate buffered saline (PBS) (Tropix, Inc., Bedford, MA) for 30 minutes at 60°C followed by washes in 0.2% SDS and distilled water as before.
Hybridization
Hybridization reactions contain 9 μl of probe mixture consisting of 0.2 μg each of Cy3 and Cy5 labeled cDNA synthesis products in 5X SSC, 0.2% SDS hybridization buffer. The probe mixture is heated to 65°C for 5 minutes and is aliquoted onto the microarray surface and covered with a 1.8 cm² coverslip. The arrays are transferred to a waterproof chamber having a cavity just slightly larger than a microscope slide. The chamber is kept at 100% humidity internally by the addition of 140 μl of 5X SSC in a corner of the chamber. The chamber containing the arrays is incubated for about 6.5 hours at 60°C. The arrays are washed for 10 min at 45°C in a first wash buffer (1X SSC, 0.1% SDS), three times for 10 minutes each at 45°C in a second wash buffer (0.1X SSC), and dried.
Detection
Reporter-labeled hybridization complexes are detected with a microscope equipped with an Innova 70 mixed gas 10 W laser (Coherent, Inc., Santa Clara CA) capable of generating spectral lines at 488 nm for excitation of Cy3 and at 632 nm for excitation of Cy5. The excitation laser light is focused on the array using a 20X microscope objective (Nikon, Inc., Melville NY). The slide containing the array is placed on a computer-controlled X-Y stage on the microscope and raster-scanned past the objective. The 1.8 cm x 1.8 cm array used in the present example is scanned with a resolution of 20 micrometers.
In two separate scans, a mixed gas multiline laser excites the two fluorophores sequentially. Emitted light is split, based on wavelength, into two photomultiplier tube detectors (PMT R1477, Hamamatsu Photonics Systems, Bridgewater NJ) corresponding to the two fluorophores. Appropriate filters positioned between the array and the photomultiplier tubes are used to filter the signals. The emission maxima of the fluorophores used are 565 nm for Cy3 and 650 nm for Cy5. Each array is typically scanned twice, one scan per fluorophore using the appropriate filters at the laser source, although the apparatus is capable of recording the spectra from both fluorophores simultaneously. The sensitivity of the scans is typically calibrated using the signal intensity generated by a cDNA control species added to the probe mix at a known concentration. A specific location on the array contains a complementary DNA sequence, allowing the intensity of the signal at that location to be correlated with a weight ratio of hybridizing species of 1:100,000. When two probes from different sources (e.g., representing test and control cells), each labeled with a different fluorophore, are hybridized to a single array for the purpose of identifying genes that are differentially expressed, the calibration is done by labeling samples of the calibrating cDNA with the two fluorophores and adding identical amounts of each to the hybridization mixture.
The output of the photomultiplier tube is digitized using a 12-bit RTI-835H analog-to-digital (A/D) conversion board (Analog Devices, Inc., Norwood, MA) installed in an IBM-compatible PC computer. The digitized data are displayed as an image where the signal intensity is mapped using a linear 20-color transformation to a pseudocolor scale ranging from blue (low signal) to red (high signal). The data are also analyzed quantitatively. Where two different fluorophores are excited and measured simultaneously, the data are first corrected for optical crosstalk (due to overlapping emission spectra) between the fluorophores using each fluorophore's emission spectrum. A grid is superimposed over the fluorescence signal image such that the signal from each spot is centered in each element of the grid. The fluorescence signal within each element is then integrated to obtain a numerical value corresponding to the average intensity of the signal. The software used for signal analysis is the GEMTOOLS gene expression analysis program (Incyte).
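The display mapping and the per-element integration described above may be sketched as follows. This is an illustration under stated assumptions and is not the GEMTOOLS program: the function names are hypothetical, color index 0 is taken to be the blue (low signal) end and index 19 the red (high signal) end, and each grid element is assumed to be a square block of pixels.

```python
def pseudocolor_index(adc_value, levels=20, adc_bits=12):
    """Map a 12-bit A/D reading linearly onto one of 20 color levels."""
    max_value = (1 << adc_bits) - 1  # 4095 for a 12-bit converter
    if not 0 <= adc_value <= max_value:
        raise ValueError("reading outside the A/D range")
    return min(adc_value * levels // (max_value + 1), levels - 1)


def mean_spot_intensity(image, row0, col0, size):
    """Average the pixel values inside one square grid element."""
    pixels = [
        image[r][c]
        for r in range(row0, row0 + size)
        for c in range(col0, col0 + size)
    ]
    return sum(pixels) / len(pixels)


# A 4 x 4 pixel image with one 2 x 2 spot centered in it.
image = [
    [0, 0, 0, 0],
    [0, 100, 200, 0],
    [0, 300, 400, 0],
    [0, 0, 0, 0],
]
spot_value = mean_spot_intensity(image, 1, 1, 2)
```

In this example the grid element covering the spot has an average intensity of 250, which maps to color level 1 on the 20-level scale.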
XII. Complementary Nucleic Acids
Sequences complementary to the mddt are used to detect, decrease, or inhibit expression of the naturally occurring nucleotide. The use of oligonucleotides comprising from about 15 to 30 base pairs is typical in the art. However, smaller or larger sequence fragments can also be used. Appropriate oligonucleotides are designed from the mddt using OLIGO 4.06 software (National Biosciences) or other appropriate programs and are synthesized using methods standard in the art or ordered from a commercial supplier. To inhibit transcription, a complementary oligonucleotide is designed from the most unique 5' sequence and used to prevent transcription factor binding to the promoter sequence. To inhibit translation, a complementary oligonucleotide is designed to prevent ribosomal binding and processing of the transcript.
XIII. Expression of MDDT
Expression and purification of MDDT is accomplished using bacterial or virus-based expression systems. For expression of MDDT in bacteria, cDNA is subcloned into an appropriate vector containing an antibiotic resistance gene and an inducible promoter that directs high levels of cDNA transcription. Examples of such promoters include, but are not limited to, the trp-lac (tac) hybrid promoter and the T5 or T7 bacteriophage promoter in conjunction with the lac operator regulatory element. Recombinant vectors are transformed into suitable bacterial hosts, e.g., BL21(DE3). Antibiotic resistant bacteria express MDDT upon induction with isopropyl beta-D-thiogalactopyranoside (IPTG). Expression of MDDT in eukaryotic cells is achieved by infecting insect or mammalian cell lines with recombinant Autographa californica nuclear polyhedrosis virus (AcMNPV), commonly known as baculovirus. The nonessential polyhedrin gene of baculovirus is replaced with cDNA encoding MDDT by either homologous recombination or bacterial-mediated transposition involving transfer plasmid intermediates. Viral infectivity is maintained and the strong polyhedrin promoter drives high levels of cDNA transcription. Recombinant baculovirus is used to infect Spodoptera frugiperda (Sf9) insect cells in most cases, or human hepatocytes, in some cases. Infection of the latter requires additional genetic modifications to baculovirus. (See, e.g., Engelhard, supra, and Sandig, supra.)
In most expression systems, MDDT is synthesized as a fusion protein with, e.g., glutathione S-transferase (GST) or a peptide epitope tag, such as FLAG or 6-His, permitting rapid, single-step, affinity-based purification of recombinant fusion protein from crude cell lysates. GST, a 26-kilodalton enzyme from Schistosoma japonicum, enables the purification of fusion proteins on immobilized glutathione under conditions that maintain protein activity and antigenicity (Amersham Pharmacia Biotech). Following purification, the GST moiety can be proteolytically cleaved from MDDT at specifically engineered sites. FLAG, an 8-amino acid peptide, enables immunoaffinity purification using commercially available monoclonal and polyclonal anti-FLAG antibodies (Eastman Kodak Company, Rochester NY). 6-His, a stretch of six consecutive histidine residues, enables purification on metal-chelate resins (QIAGEN). Methods for protein expression and purification are discussed in Ausubel (1995, supra, Chapters 10 and 16). Purified MDDT obtained by these methods can be used directly in the following activity assay.
XIV. Demonstration of MDDT Activity
MDDT, or biologically active fragments thereof, are labeled with 125I Bolton-Hunter reagent. (See, e.g., Bolton, A.E. and W.M. Hunter (1973) Biochem. J. 133:529-539.) Candidate molecules previously arrayed in the wells of a multi-well plate are incubated with the labeled MDDT, washed, and any wells with labeled MDDT complex are assayed. Data obtained using different concentrations of MDDT are used to calculate values for the number, affinity, and association of MDDT with the candidate molecules.
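The specification does not state how the number and affinity of binding sites are calculated from the concentration series. One conventional approach for a single class of sites is a Scatchard analysis, sketched below as an assumption rather than as the method actually used. It fits bound/free against bound by least squares; the slope gives -1/Kd and the x-intercept gives Bmax. The function name and inputs are hypothetical.

```python
def scatchard_fit(free, bound):
    """Estimate (Kd, Bmax) from paired free and bound ligand concentrations.

    Assumes one class of independent sites, for which
    bound / free = Bmax / Kd - bound / Kd.
    """
    if len(free) != len(bound) or len(free) < 2:
        raise ValueError("need at least two paired measurements")
    xs = list(bound)
    ys = [b / f for b, f in zip(bound, free)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or sxy == 0:
        raise ValueError("data do not define a sloped line")
    slope = sxy / sxx                 # equals -1 / Kd
    intercept = mean_y - slope * mean_x  # equals Bmax / Kd
    kd = -1.0 / slope
    bmax = intercept * kd
    return kd, bmax
```

Curved Scatchard plots would indicate that the single-site assumption does not hold, in which case a nonlinear fit would be more appropriate.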
Alternatively, molecules interacting with MDDT are analyzed using the yeast two-hybrid system as described in Fields, S. and O. Song (1989) Nature 340:245-246, or using commercially available kits based on the two-hybrid system, such as the MATCHMAKER system (CLONTECH).
MDDT may also be used in the PATHCALLING process (CuraGen Corp., New Haven CT) which employs the yeast two-hybrid system in a high-throughput manner to determine all interactions between the proteins encoded by two large libraries of genes (Nandabalan, K. et al. (2000) U.S. Patent No. 6,057,101).
XV. Functional Assays
MDDT function is assessed by expressing mddt at physiologically elevated levels in mammalian cell culture systems. cDNA is subcloned into a mammalian expression vector containing a strong promoter that drives high levels of cDNA expression. Vectors of choice include pCMV SPORT (Life Technologies) and pCR3.1 (Invitrogen Corporation, Carlsbad CA), both of which contain the cytomegalovirus promoter. 5-10 μg of recombinant vector are transiently transfected into a human cell line, preferably of endothelial or hematopoietic origin, using either liposome formulations or electroporation. 1-2 μg of an additional plasmid containing sequences encoding a marker protein are co-transfected.
Expression of a marker protein provides a means to distinguish transfected cells from nontransfected cells and is a reliable predictor of cDNA expression from the recombinant vector.
Marker proteins of choice include, e.g., Green Fluorescent Protein (GFP; CLONTECH), CD64, or a CD64-GFP fusion protein. Flow cytometry (FCM), an automated laser optics-based technique, is used to identify transfected cells expressing GFP or CD64-GFP and to evaluate the apoptotic state of the cells and other cellular properties. FCM detects and quantifies the uptake of fluorescent molecules that diagnose events preceding or coincident with cell death. These events include changes in nuclear DNA content as measured by staining of DNA with propidium iodide; changes in cell size and granularity as measured by forward light scatter and 90 degree side light scatter; down-regulation of DNA synthesis as measured by decrease in bromodeoxyuridine uptake; alterations in expression of cell surface and intracellular proteins as measured by reactivity with specific antibodies; and alterations in plasma membrane composition as measured by the binding of fluorescein-conjugated Annexin V protein to the cell surface. Methods in flow cytometry are discussed in Ormerod, M. G. (1994) Flow Cytometry. Oxford, New York NY.
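As an illustration of how the flow cytometry readout above might be summarized, the sketch below gates events on the GFP marker and compares the fraction of Annexin V-positive cells in the transfected and nontransfected populations. The event representation, the field names, and both gate thresholds are hypothetical; real gates would be set from control samples.

```python
def annexin_positive_fractions(events, gfp_gate, annexin_gate):
    """Return (transfected_fraction, nontransfected_fraction).

    events: list of dicts with fluorescence values under "gfp" and "annexin_v".
    A fraction is None when the corresponding population is empty.
    """
    transfected = [e for e in events if e["gfp"] >= gfp_gate]
    nontransfected = [e for e in events if e["gfp"] < gfp_gate]

    def positive_fraction(group):
        if not group:
            return None
        positives = sum(1 for e in group if e["annexin_v"] >= annexin_gate)
        return positives / len(group)

    return positive_fraction(transfected), positive_fraction(nontransfected)
```

A higher Annexin V-positive fraction among GFP-positive cells than among GFP-negative cells would be consistent with an effect of the transfected sequence on the apoptotic state of the cells.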
The influence of MDDT on gene expression can be assessed using highly purified populations of cells transfected with sequences encoding MDDT and either CD64 or CD64-GFP. CD64 and CD64-GFP are expressed on the surface of transfected cells and bind to conserved regions of human immunoglobulin G (IgG). Transfected cells are efficiently separated from nontransfected cells using magnetic beads coated with either human IgG or antibody against CD64 (DYNAL, Inc., Lake Success NY). mRNA can be purified from the cells using methods well known by those of skill in the art. Expression of mRNA encoding MDDT and other genes of interest can be analyzed by northern analysis or microarray techniques.
XVI. Production of Antibodies
MDDT substantially purified using polyacrylamide gel electrophoresis (PAGE; see, e.g., Harrington, M.G. (1990) Methods Enzymol. 182:488-495), or other purification techniques, is used to immunize rabbits and to produce antibodies using standard protocols.
Alternatively, the MDDT amino acid sequence is analyzed using LASERGENE software (DNASTAR) to determine regions of high immunogenicity, and a corresponding peptide is synthesized and used to raise antibodies by means known to those of skill in the art. Methods for selection of appropriate epitopes, such as those near the C-terminus or in hydrophilic regions, are well described in the art. (See, e.g., Ausubel, 1995, supra, Chapter 11.)
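The selection of a hydrophilic region can be illustrated with a simple sliding-window scan. This is not the LASERGENE algorithm, whose details are not given here; it is a stand-in that averages Hopp-Woods hydrophilicity values over a fixed window and reports the highest-scoring stretch. The function name and the default window of 15 residues are assumptions for the sketch.

```python
# Hopp-Woods hydrophilicity values for the 20 standard amino acids.
HOPP_WOODS = {
    "R": 3.0, "D": 3.0, "E": 3.0, "K": 3.0, "S": 0.3,
    "N": 0.2, "Q": 0.2, "G": 0.0, "P": 0.0, "T": -0.4,
    "A": -0.5, "H": -0.5, "C": -1.0, "M": -1.3, "V": -1.5,
    "I": -1.8, "L": -1.8, "Y": -2.3, "F": -2.5, "W": -3.4,
}


def most_hydrophilic_window(sequence, window=15):
    """Return (start, peptide, mean_score) for the most hydrophilic window.

    start is a 0-based index; ties are resolved in favor of the first window.
    """
    seq = sequence.upper()
    if len(seq) < window:
        raise ValueError("sequence is shorter than the window")
    unknown = set(seq) - set(HOPP_WOODS)
    if unknown:
        raise ValueError("unsupported residues: " + "".join(sorted(unknown)))
    scores = [HOPP_WOODS[res] for res in seq]
    best_start = 0
    best_sum = sum(scores[:window])
    for start in range(1, len(seq) - window + 1):
        current = sum(scores[start:start + window])
        if current > best_sum:
            best_start, best_sum = start, current
    peptide = seq[best_start:best_start + window]
    return best_start, peptide, best_sum / window
```

A candidate chosen this way would still need to be checked against other criteria mentioned in the text, such as proximity to the C-terminus.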
Typically, peptides 15 residues in length are synthesized using an ABI 431A peptide synthesizer (Applied Biosystems) using Fmoc chemistry and coupled to KLH (Sigma) by reaction with N-maleimidobenzoyl-N-hydroxysuccinimide ester (MBS) to increase immunogenicity. (See, e.g., Ausubel, supra.) Rabbits are immunized with the peptide-KLH complex in complete Freund's adjuvant. Resulting antisera are tested for antipeptide activity by, for example, binding the peptide to plastic, blocking with 1% BSA, reacting with rabbit antisera, washing, and reacting with radioiodinated goat anti-rabbit IgG. Antisera with antipeptide activity are tested for anti-MDDT activity using protocols well known in the art, including ELISA, RIA, and immunoblotting.
XVII. Purification of Naturally Occurring MDDT Using Specific Antibodies
Naturally occurring or recombinant MDDT is substantially purified by immunoaffinity chromatography using antibodies specific for MDDT. An immunoaffinity column is constructed by covalently coupling anti-MDDT antibody to an activated chromatographic resin, such as CNBr-activated SEPHAROSE (Amersham Pharmacia Biotech). After the coupling, the resin is blocked and washed according to the manufacturer's instructions.
Media containing MDDT are passed over the immunoaffinity column, and the column is washed under conditions that allow the preferential absorbance of MDDT (e.g., high ionic strength buffers in the presence of detergent). The column is eluted under conditions that disrupt antibody/MDDT binding (e.g., a buffer of pH 2 to pH 3, or a high concentration of a chaotrope, such as urea or thiocyanate ion), and MDDT is collected.
All publications and patents mentioned in the above specification are herein incorporated by reference. Various modifications and variations of the described method and system of the invention will be apparent to those skilled in the art without departing from the scope and spirit of the invention.
Although the invention has been described in connection with specific preferred embodiments, it should be understood that the invention as claimed should not be unduly limited to such specific embodiments. Indeed, various modifications of the above-described modes for carrying out the invention which are obvious to those skilled in the field of molecular biology or related fields are intended to be within the scope of the following claims.
TABLE 2
SEQ ID NO: Template ID GI Number Probability Score Annotation
1 LG:150318.1:2000SEP08 g11643581 0 Homo sapiens PR-domain containing protein 14 (PRDM14) mRNA,
2 LG:022529.1:2000SEP08 g10047272 Homo sapiens mRNA for KIAA1599 protein, partial cds.
3 LG:352559.1:2000SEP08 g13560887 1.00E-11 Homo sapiens EZFIT-related protein 1 mRNA, complete
4 LG:175223.1:2000SEP08 g10433955 1.00E-43 (fl) (Homo sapiens) unnamed protein
5 LG:476989.1:2000SEP08 g12407394 Homo sapiens tripartite motif protein TRIM7 (TRIM7) mRNA,
6 LG:253268.7:2000SEP08 g12239368 0 Homo sapiens LYST-interacting protein LIP9 mRNA, partial
7 LG:401322.1:2000SEP08 g13623682 5.00E-26 Homo sapiens, tubulin alpha 1, clone MGC:2321,
8 LG:1328436.1:2000SEP08 g4589587 2.00E-55 Homo sapiens mRNA for KIAA0972 protein, complete
9 LG:475404.1:2000SEP08 g10434194 0 Homo sapiens cDNA FLJ12606 fis, clone NT2RM4001483, moderately similar
10 LG:1384132.1:2000SEP08 g14042849 1.00E-71 Homo sapiens cDNA FLJ14959 fis, clone PLACE4000156, moderately similar
11 LG:410804.18:2000SEP08 g13543325 3.00E-74 Homo sapiens, hypothetical protein MGC8407, clone MGC:1820, mRNA, complete
12 LG:1082306.1:2000SEP08 g13325336 0 Homo sapiens, clone MGC:10520, mRNA, complete
13 LG:233814.4:2000SEP08 g10434873 1.00E-166 Homo sapiens cDNA FLJ13044 fis, clone NT2RP3001355, weakly similar to TRICARBOXYLATE TRANSPORT
14 LG:977478.5:2000SEP08 g12698000 0 Homo sapiens mRNA for KIAA1728 protein, partial cds.
15 LG:025931.1:2000SEP08 g10436359 0 Homo sapiens cDNA FLJ14011 fis, clone Y79AA1002472, weakly similar to
16 LG:885368.1:2000SEP08 g440824 9.00E-60 (fl) (Arabidopsis thaliana) ribosomal
17 LG:1054900.1:2000SEP08 g12052982 0 Homo sapiens mRNA; cDNA DKFZp434I1610 (from clone DKFZp434I1610);
18 LG:995186.2:2000SEP08 g12052982 0 Homo sapiens mRNA; cDNA DKFZp434I1610 (from clone DKFZp434I1610);
19 LG:435048.23:2000SEP08 g10432866 1.00E-177 Homo sapiens cDNA FLJ11583 fis, clone HEMBA1003680, weakly similar to PUTATIVE AMINOPEPTIDASE ZK353.6 IN
20 LG:954859.1:2000SEP08 g10438224 1.00E-176 Homo sapiens cDNA: FLJ21990 fis, clone HEP06386.
21 LG:364370.1:2000SEP08 g14043150 1.00E-163 Homo sapiens, ribosomal protein L13, clone MGC:15490, mRNA,
22 LG:1098789.1:2000SEP08 g14029703 1.00E-137 Homo sapiens myosin regulatory light chain 2 (MRLC2) mRNA,
23 LG:201540.2:2000SEP08 g12407386 0 0 Homo sapiens tripartite motif protein TRIM5 isoform delta (TRIM5) mRNA,
24 LG:1077357.1:2000SEP08 g10436361 4.00E-66 Homo sapiens cDNA FLJ14012 fis, clone Y79AA1002482, moderately similar
25 LG:1048846.4:2000SEP08 g10439974 1.00E-154 Homo sapiens cDNA: FLJ23327 fis, clone HEP12630, highly similar to HSZNF37 Homo sapiens ZNF37A
26 LG:336685.1:2000SEP08 g12697317 0 Homo sapiens partial mRNA for
27 LG:1076253.1:2000SEP08 g7959276 5.00E-34 Homo sapiens mRNA for KIAA1508 protein, partial cds.
28 LG:1400601.2:2000SEP08 g14042292 Homo sapiens cDNA FLJ14636 fis, clone NT2RP2001233, weakly similar to
29 LG:1079092.3:2000SEP08 g12052982 6.00E-07 Homo sapiens mRNA; cDNA DKFZp434I1610 (from clone DKFZp434I1610);
30 LG:1086064.1:2000SEP08 g10436361 3.00E-76 Homo sapiens cDNA FLJ14012 fis, clone Y79AA1002482, moderately similar
31 LG:1400608.1:2000SEP08 g12052731 1.00E-175 Homo sapiens mRNA; cDNA DKFZp761G18121 (from clone DKFZp761G18121);
32 LG:399275.5:2000SEP08 g14042034 Homo sapiens cDNA FLJ14486 fis, clone MAMMA1002650, weakly similar to
33 LG:293943.1:2000SEP08 g7022603 2.00E-24 (fl) (Homo sapiens) unnamed protein
34 LG:345884.1:2000SEP08 g13591711 Homo sapiens immunoglobulin receptor translocation associated protein 2b (IRTA2) mRNA,
35 LG:400967.1:2000SEP08 g10047182 1.00E-176 Homo sapiens mRNA for KIAA1559 protein, partial cds.
36 LG:024556.6:2000SEP08 g11999276 0 Homo sapiens solute carrier (SLC25A18) mRNA, complete cds; nuclear gene for
37 LG:081189.3:2000SEP08 g14325768 0 Homo sapiens mRNA for KIAA1776 protein (fibrillin3),
38 LG:018258.1:2000SEP08 g12832288 3.00E-97 0 (Mus musculus)
39 LG:450399.3:2000SEP08 g13097599 1.00E-10 Homo sapiens, Similar to ribosomal protein L23, clone IMAGE:3606198.
40 LG:451122.1:2000SEP08 g6002102 2.00E-38 (fl) (Digitalis lanata) Acyl-CoA binding protein (ACBP)
41 LG:451682.1:2000SEP08 g8671496 1.00E-134 (fl) (Oryza sativa) alpha 3 subunit of 20S proteasome
42 LG:238631.4:2000SEP08 g12803994 0 Homo sapiens, Similar to HLA class II region expressed gene KE2, clone MGC:4178, mRNA,
43 LG:236654.1:2000SEP08 g11558487 2.00E-08 Homo sapiens mRNA for B-cell lymphoma/leukaemia 11B (BCL11B
44 LG:332655.1:2000SEP08 g12856559 7.00E-93 0 (Mus musculus)
45 LG:217396.2:2000SEP08 g10047294 0 Homo sapiens mRNA for KIAA1610 protein, partial cds.
46 LG:090574.1:2000SEP08 g12845416 5.00E-37 0 (Mus musculus)
47 LG:202943.1:2000SEP08 g12060829 0 Homo sapiens serologically defined breast cancer antigen NY-BR-38 mRNA,
48 LG:236928.1:2000SEP08 g14388574 Macaca fascicularis brain cDNA clone:QtrA-
49 LG:215169.2:2000SEP08 g790349 1.00E-22 Homo sapiens (clone NE68) gene
50 LG:410726.1:2000SEP08 g9963805 2.00E-17 Homo sapiens zinc finger protein ZNF287 (ZNF287)
51 LG:234372.2:2000SEP08 g10436854 0 Homo sapiens cDNA: FLJ20896 fis, clone ADKA03527.
52 LG:022629.1:2000SEP08 g13879442 1.00E-178 (fl) (Mus musculus) Similar to RIKEN cDNA 2310035M22
53 LG:068682.1:2000SEP08 g13898616 0 Homo sapiens serine/threonine protein kinase SSTK (SSTK) mRNA,
54 LG:222335.1:2000SEP08 g14250137 0 Homo sapiens, Similar to RIKEN cDNA 5730421E18 gene, clone
55 LG:331342.1:2000SEP08 g10434081 0 Homo sapiens cDNA FLJ12538 fis, clone NT2RM4000356, moderately similar to RAS-RELATED
56 LG:021770.1:2000SEP08 g9408105 2.00E-47 Homo sapiens dNT-2 gene for mitochondrial 5'(3')-deoxyribonucleotid
57 LG:181607.9:2000SEP08 g12834244 8.00E-91 0 (Mus musculus)
58 LG:1042768.1:2000SEP08 g13937982 1.00E-180 Homo sapiens, translocase of inner mitochondrial membrane 17 (yeast) homolog A, clone MGC:14756,
59 LG:282729.1:2000SEP08 g12314268 1.00E-107 (5' incom)(Homo sapiens) dJ14N1.2 (novel S-100/ICaBP type calcium binding domain
60 LG:998305.3:2000SEP08 g12053098 6.00E-88 Homo sapiens mRNA; cDNA DKFZp434A171 (from clone DKFZp434A171);
61 LG:1135213.1:2000SEP08 g6692607 2.00E-69 (fl) (Mus musculus) MGA protein
62 LG:267762.1:2000SEP08 g12854977 1.00E-135 0 (Mus musculus)
63 LG:120744.1:2000SEP08 g12052773 0 Homo sapiens mRNA; cDNA DKFZp564B052 (from clone DKFZp564B052);
64 LG:403409.1:2000SEP08 g8896163 0 Homo sapiens kinesin-like protein GAKIN mRNA,
65 LG:226874.3:2000SEP08 g3688393 3.00E-05 Homo sapiens mRNA for triple LIM
66 LG:1045521.4:2000SEP08 g10436742 4.00E-58 Homo sapiens cDNA FLJ14310 fis, clone
67 LG:275876.1:2000SEP08 g5912051 6.00E-30 (3' and 5' incom) (Homo sapiens)
68 LG:475127.7:2000SEP08 g7294107 6.00E-44 (fl) (Drosophila melanogaster) CG4638 gene
69 LG:157263.1:2000SEP08 g3047402 3.00E-40 (fl) (Homo sapiens) monocarboxylate transporter 2
70 LG:247382.7:2000SEP08 g4240292 5.00E-25 Homo sapiens mRNA for KIAA0902 protein, complete
71 LG:197367.5:2000SEP08 g10172680 7.00E-14 (fl) (Bacillus halodurans) stage V sporulation protein C (peptidyl-
72 LG:218090.5:2000SEP08 g9295344 0 Homo sapiens HSKM-B (HSKM-B) mRNA, complete
73 LG:216612.4:2000SEP08 g8655677 0 Homo sapiens mRNA; cDNA DKFZp547M236 (from clone
74 LG:197614.1:2000SEP08 g13358641 0 Macaca fascicularis brain
75 LG:378428.1:2000SEP08 g7264026 0 (fl) (Homo sapiens) dJ876B10.2 (novel protein (ortholog of
76 LG:286639.1:2000SEP08 g13383264 0 Homo sapiens mRNA for actin related protein,
77 LG:389870.1:2000SEP08 g14388335 9.00E-63 Macaca fascicularis brain cDNA clone:QflA-
78 LG:1387485.6:2000SEP08 g10435209 0 Homo sapiens cDNA FLJ13261 fis, clone OVARC1000885, weakly similar to OXIDOREDUCTASE
79 LG:230151.1:2000SEP08 g8655647 Homo sapiens mRNA; cDNA DKFZp762M115 (from clone
80 LG:215158.5:2000SEP08 g10440162 Homo sapiens cDNA: FLJ23465 fis, clone HSI10904.
81 LG:235840.1:2000SEP08 g14042343 Homo sapiens cDNA FLJ14666 fis, clone NT2RP2003000, weakly similar to TUMOR NECROSIS FACTOR, ALPHA-
82 LG:350272.1:2000SEP08 g13477234 0 Homo sapiens, Similar to RIKEN cDNA 0610037N03 gene, clone
83 LG:232190.1:2000SEP08 g10434968 1.00E-143 Homo sapiens cDNA FLJ13105 fis, clone NT2RP3002351, weakly similar to Human mRNA for NAD-dependent methylene tetrahydrofolate
84 LG:1068127.1:2000SEP08 g10436724 1.00E-140 Homo sapiens cDNA FLJ14297 fis, clone
85 LG:408751.3:2000SEP08 g8886024 0 Homo sapiens collapsin response mediator protein-5 (CRMP5) mRNA,
86 LG:1078933.1:2000SEP08 g14042843 0 Homo sapiens cDNA FLJ14954 fis, clone PLACE3000169, weakly similar to
87 LG:958731.1:2000SEP08 g9758769 6.00E-88 (fl) (Arabidopsis thaliana) 11-beta-hydroxysteroid
88 LG:024125.5:2000SEP08 g12052958 0 0 Homo sapiens mRNA; cDNA DKFZp566J2046 (from clone DKFZp566J2046);
89 LG:373637.3:2000SEP08 g11118740 0 Homo sapiens UGT1 gene locus, complete
90 LG:1053229.1:2000SEP08 g12804322 0 Homo sapiens, clone MGC:4054, mRNA, complete
91 LG:248364.1:2000SEP08 g12847599 0 0 (Mus musculus)
92 LG:477130.1:2000SEP08 g6650751 1.00E-55 (fl)(Ceratopteris richardii) ribosomal
93 LG:113786.17:2000SEP08 g10440515 0 Homo sapiens mRNA for FLJ00106 protein, partial cds.
94 LG:347635.1:2000SEP08 g11527996 0 Homo sapiens NOTCH2 protein (NOTCH2) mRNA,
95 LG:242966.4:2000SEP08 g9955987 0 Homo sapiens clone TCCCIA00164
96 LG:217814.1:2000SEP08 g10438977 0 Homo sapiens cDNA: FLJ22551 fis, clone HSI00804.
97 LG:476452.1:2000SEP08 g4126809 1.00E-121 (fl) (Oryza sativa) glyoxalase I
98 LG:1100657.1:2000SEP08 g340298 1.00E-107 Human vasopressin mRNA, complete
99 LG:1132418.2:2000SEP08 g12653472 2.00E-86 Homo sapiens, proteasome (prosome, macropain) subunit, beta type,
100 LG:1098570.1:2000SEP08 g35069 1.00E-171 H.sapiens RNA for nm23-H2 gene.
101 LG:1097987.1:2000SEP08 g9368838 Homo sapiens mRNA; cDNA DKFZp547I014 (from clone
102 LG:337818.2:2000SEP08 g14042395 Homo sapiens cDNA FLJ14699 fis, clone NT2RP2006571, moderately similar to CYTOCHROME
103 LG:1040582.1:2000SEP08 g13529277 1.00E-119 Homo sapiens, aldo-keto reductase family 1, member A1 (aldehyde reductase), clone
104 LG:1099122.1:2000SEP08 g13097716 8.00E-86 Homo sapiens, guanine nucleotide binding protein (G protein), gamma 5, clone
105 LG:1327449.1:2000SEP08 g14042109 2.00E-96 Homo sapiens cDNA FLJ14531 fis, clone NT2RM2000371, weakly similar to POLYRIBONUCLEOTIDE
106 LG:227933.5:2000SEP08 g9280028 Macaca fascicularis brain
107 LG:1043709.2:2000SEP08 g12804588 4.00E-30 Homo sapiens, Similar to CG9172 gene product, clone MGC:886,
108 LG:1099871.1:2000SEP08 g12652698 1.00E-154 Homo sapiens, purine-rich element binding protein B, clone MGC:1947,
109 LG:1399139.4:2000SEP08 g10437077 0 Homo sapiens cDNA: FLJ21069 fis, clone CAS01594.
110 LG:236386.1:2000SEP08 g13477131 6.00E-87 (fl) (Homo sapiens) SH3 and PX domain-containing
111 LG:1015157.1:2000SEP08 g1881781 5.00E-72 glyoxalase I (human, HeLa cells, mRNA Partial, 572
112 LG:1065433.1:2000SEP08 g4589587 8.00E-29 Homo sapiens mRNA for KIAA0972 protein, complete
113 LG:236992.4:2000SEP08 g12248381 0 Homo sapiens mRNA for SEMB,
114 LG:1071124.1:2000SEP08 g13543418 0 Homo sapiens, Similar to zinc finger protein 304, clone MGC:4079, mRNA,
115 LG:206425.2:2000SEP08 g12052883 2.00E-90 Homo sapiens mRNA; cDNA DKFZp564C2478 (from clone DKFZp564C2478);
116 LG:885747.2:2000SEP08 g12804504 2.00E-49 Homo sapiens, Similar to ribosomal protein L31, clone MGC:1641, mRNA,
117 LG:1140501.1:2000SEP08 g12052919 Homo sapiens mRNA; cDNA DKFZp564I1782 (from clone DKFZp564I1782);
118 LG:001239.1:2000SEP08 g14164612 Homo sapiens sialic acid binding immunoglobulin-like lectin 10
119 LG:018980.1:2000SEP08 g10435149 Homo sapiens cDNA FLJ13220 fis, clone NT2RP4002047, moderately similar to GTP-BINDING
120 LG:1083120.3:2000SEP08 g515786 3.00E-66 H.sapiens (MAR7) chromosome 19 DNA, 302bp.
121 LG:233258.3:2000SEP08 g10433125 00 Homo sapiens cDNA FLJ11790 fis, clone
122 LG:999062.1:2000SEP08 g9759463 3.00E-60 (fl) (Arabidopsis thaliana) 40S
123 LG:887776.1:2000SEP08 g342294 1.00E-66 Macaca mulatta serum albumin
124 LG:1400301.2:2000SEP08 g12751104 1.00E-23 Homo sapiens PNAS-130 mRNA,
125 LG:1329362.1:2000SEP08 g14042849 0 Homo sapiens cDNA FLJ14959 fis, clone PLACE4000156, moderately similar
126 LG:1096498.1:2000SEP08 g13097206 1.00E-110 Homo sapiens, ribosomal protein, large, P1, clone MGC:5215, mRNA,
127 LG:1096337.1:2000SEP08 g13097206 1.00E-121 Homo sapiens, ribosomal protein, large, P1, clone MGC:5215, mRNA,
128 LG:1400579.1:2000SEP08 g10437945 4.00E-61 Homo sapiens cDNA: FLJ21781 fis, clone HEP00223.
129 LG:1080091.1:2000SEP08 g10436724 1.00E-154 Homo sapiens cDNA FLJ14297 fis, clone
130 LG:1082203.1:2000SEP08 g10439929 0 Homo sapiens cDNA: FLJ23296 fis, clone HEP10656.
131 LG:1084051.1:2000SEP08 g6807586 1.00E-104 Novel human gene mapping to chomosome 1.
132 LG:1082393.1:2000SEP08 g10954043 0 Homo sapiens KRAB zinc finger protein ZFQR
133 LG:1086183.1:2000SEP08 g12655164 6.00E-58 Homo sapiens, zinc finger protein 256, clone MGC:1413, mRNA, complete
134 LG:1090268.1:2000SEP08 g13938479 0 Homo sapiens, Similar to hypothetical protein FLJ22301, clone
135 LG:1400597.5:2000SEP08 g14250145 1.00E-156 Homo sapiens, hypothetical protein FLJ23407, clone MGC:14819, mRNA, complete
136 LG:1080307.2:2000SEP08 g14043840 2.00E-34 Homo sapiens, clone MGC:14429, mRNA, complete
137 LG:1400603.2:2000SEP08 g4589565 4.00E-31 Homo sapiens mRNA for KIAA0961 protein, complete
138 LG:1052984.1:2000SEP08 g13937998 0 Homo sapiens, Similar to DNA-binding protein, clone MGC:14780,
139 LG:1091259.1:2000SEP08 g10436675 Homo sapiens cDNA FLJ14260 fis, clone PLACE1001118, weakly similar to
140 LG:1082263.2:2000SEP08 g13560887 5.00E-13 Homo sapiens EZFIT-related protein 1 mRNA, complete
141 LG:1048604.2:2000SEP08 g14249843 6.00E-67 Homo sapiens, Similar to hypothetical protein FLJ23233, clone MGC:14876,
142 LG:1085254.3:2000SEP08 g10437559 0 Homo sapiens cDNA: FLJ21457 fis, clone COL04705.
143 LG:1400606.2:2000SEP08 g14042549 9.00E-54 Homo sapiens cDNA FLJ14779 fis, clone NT2RP4000398, moderately similar
144 LG:1090358.2:2000SEP08 g10047182 6.00E-30 Homo sapiens mRNA for KIAA1559 protein, partial cds.
145 LG:1079064.2:2000SEP08 g10434649 0 Homo sapiens cDNA FLJ12895 fis, clone NT2RP2004187, weakly similar to
146 LG:1076866.1:2000SEP08 g10436460 Homo sapiens cDNA FLJ14087 fis, clone MAMMA1000183, weakly similar to
147 LG:969359.1:2000SEP08 g13279004 1.00E-120 Homo sapiens, ferritin, light polypeptide, clone MGC:10465, mRNA,
148 LG:366783.1:2000SEP08 g9651703 Homo sapiens carboxypeptidase B precursor (CPAH) mRNA, complete
149 LG:332176.3:2000SEP08 g13365900 Macaca fascicularis brain cDNA clone:QflA-
150 LG:994938.1:2000SEP08 g7023331 1.00E-76 Homo sapiens cDNA FLJ10961 fis, clone PLACE1000588, highly similar to INTERFERON-
151 LG:982800.1:2000SEP08 g12053280 0 Homo sapiens mRNA; cDNA DKFZp434J037 (from clone DKFZp434J037);
152 LG:977850.7:2000SEP08 g10436959 5.00E-64 Homo sapiens cDNA: FLJ20984 fis, clone CAE00871.
153 LG:234748.2:2000SEP08 g13182754 0 Homo sapiens HPHRP mRNA,
154 LG:306284.1:2000SEP08 g14043648 0 Homo sapiens, clone MGC:14161, mRNA, complete
155 LI:333170.3:2000SEP08 g12314083 1.00E-93 (fl) (Homo sapiens) dJ1007G16.5 (novel high-mobility group (nonhistone chromosomal)
156 LI:336685.2:2000SEP08 g12697317 0 Homo sapiens partial mRNA for
157 LI:279013.5:2000SEP08 g12840673 3.00E-55 0 (Mus musculus)
158 LI:1037075.1:2000SEP08 g11342540 0 Homo sapiens mRNA for putative white family ATP-binding cassette transporter (ABCG4
159 LI:1073403.1:2000SEP08 g12804680 9.00E-63 Homo sapiens, S100 calcium-binding protein, beta (neural), clone MGC:1323,
160 LI:1075296.1:2000SEP08 g12803084 1.00E-112 Homo sapiens, mitochondrial ribosomal protein L12, clone MGC:8610, mRNA,
161 LI:1085501.1:2000SEP08 g12653784 1.00E-140 Homo sapiens, clone IMAGE:3349601,
162 LI:1086181.1:2000SEP08 g3043444 1.00E-146 Homo sapiens mRNA for EDF-1
163 LI:1164493.1:2000SEP08 g13436439 0 Homo sapiens, clone MGC:4400, mRNA, complete
164 LI:1175097.1:2000SEP08 g13937908 4.00E-45 Homo sapiens, Similar to KIAA0961 protein, clone MGC:12515, mRNA,
165 LI:1092948.1:2000SEP08 g12804414 2.00E-49 Homo sapiens, Similar to hypothetical protein FLJ10891, clone MGC:925,
166 LI:380378.2:2000SEP08 g8655612 3.00E-79 Homo sapiens mRNA; cDNA DKFZp762O1415 (from clone
167 LI:1029674.1:2000SEP08 g9651088 4.00E-07 Macaca fascicularis brain
168 LI:2048601.3:2000SEP08 g10434880 1.00E-106 Homo sapiens cDNA FLJ13048 fis, clone NT2RP3001399, weakly similar to
169 LI:1186208.1:2000SEP08 g12052731 1.00E-175 Homo sapiens mRNA; cDNA DKFZp761G18121 (from clone DKFZp761G18121);
170 LI:1170753.1:2000SEP08 g14249995 1.00E-101 Homo sapiens, clone MGC:12518, mRNA, complete
171 LI:1180908.1:2000SEP08 g14042849 0 Homo sapiens cDNA FLJ14959 fis, clone PLACE4000156, moderately similar
172 LI:1182900.2:2000SEP08 g10047304 Homo sapiens mRNA for KIAA1615 protein, partial cds.
173 LI:1169548.2:2000SEP08 g 1 017832 Homo sapiens mRNA for KIAA1808 protein, partial cds.
174 LI:1039974.1:2000SEP08 g13366083 Homo sapiens MARKL1 mRNA for MAP/microtubule affinity-regulating kinase like 1,
175 LI:1175765.2:2000SEP08 g13752753 5.00E-16 Homo sapiens zinc finger 1111 mRNA, complete cds.
176 LI:313948.1:2000SEP08 g9651098 0 Macaca fascicularis brain
177 LI:335923.2:2000SEP08 g11990770 3.00E-73 (fl) (Homo sapiens) bA534G20.1.1 (novel protein similar to Lysozyme C-1 (1,4-beta-N-acylmuramidase C,
178 LI:345884.1:2000SEP08 g13591713 0 Homo sapiens immunoglobulin receptor translocation associated protein 2c (IRTA2) mRNA,
179 LI:417127.1:2000SEP08 g12652726 1.00E-66 Homo sapiens, clone IMAGE:3352566,
180 LI:451710.1:2000SEP08 g13899057 1.00E-61 (fl)(Mercurialis annua) ribosomal
181 LI:406882.2:2000SEP08 g7019948 2.00E-57 Homo sapiens cDNA FLJ20081 fis, clone COL03242.
182 LI:728223.1:2000SEP08 g2624328 2.00E-44 (fl)(Oryza sativa)
183 LI:289783.19:2000SEP08 g12654714 0 Homo sapiens, Similar to glucose regulated protein, 58 kDa, clone
184 LI:235255.8:2000SEP08 g12597311 0 Homo sapiens clone IMAGE:72154 tRNA-guanine transglycosylase (TGT) mRNA,
185 LI:237693.5:2000SEP08 g12406772 4.00E-52 (fl) (Homo sapiens) unnamed protein
186 LI:433670.3:2000SEP08 g14133250 0 Homo sapiens mRNA for KIAA1479 protein, partial cds.
187 LI:202943.4:2000SEP08 g12060829 0 Homo sapiens serologically defined breast cancer antigen NY-BR-38 mRNA,
188 LI:068682.1:2000SEP08 g13540325 0 Homo sapiens serine/threonine kinase FKSG82 (FKSG82) mRNA,
189 LI:203301.3:2000SEP08 g10047160 0 Homo sapiens mRNA for KIAA1548 protein, partial cds.
190 LI:020726.3:2000SEP08 g12834087 1.00E-157 0 (Mus musculus)
SEQ ID NO: Template ID Gl Number Probability Score Annotation
191 LI:027209.1:2000SEP08 g13159480 1.00E-127 (fl) (Homo sapiens) Translation may initiate at the ATG codon at nucleotides 40-42
192 LI:108819.1:2000SEP08 g12052773 0 Homo sapiens mRNA; cDNA DKFZp564B052 (from clone DKFZp564B052);
193 LI:021759.1:2000SEP08 g12006103 0 Homo sapiens IRA1 mRNA, complete cds, alternatively
194 LI:1165967.1:2000SEP08 g13529103 1.00E-172 Homo sapiens, ribosomal protein S27a, clone MGC:12414, mRNA,
195 LI:1166315.1:2000SEP08 g12653800 1.00E-116 Homo sapiens, peptidylprolyl isomerase A (cyclophilin A), clone MGC:2351,
196 LI:204626.1:2000SEP08 g12844770 1.00E-138 0 (Mus musculus)
197 LI:801140.1:2000SEP08 g12804322 0 Homo sapiens, clone MGC:4054, mRNA, complete
198 LI:286639.1:2000SEP08 g13938318 0 Homo sapiens, clone MGC:15664, mRNA, complete
199 LI:288905.4:2000SEP08 g12698056 0 Homo sapiens mRNA for KIAA1756 protein, partial cds.
200 LI:332161.1:2000SEP08 g14388335 2.00E-62 Macaca fascicularis brain cDNA clone:QflA-
201 LI:184867.1:2000SEP08 g7209586 4.00E-47 (fl) (Rattus norvegicus) DA41
202 LI:229932.4:2000SEP08 g10438187 0 Homo sapiens cDNA: FLJ21963 fis, clone HEP05583.
203 LI:1189932.1:2000SEP08 g12483887 0 Homo sapiens solute carrier 19A3 mRNA, complete
204 LI:1076689.1:2000SEP08 g12804758 6.00E-20 Homo sapiens, ribonuclease 6 precursor, clone MGC:3554, mRNA,
205 LI:415181.2:2000SEP08 g11762100 0 (fl) (Zea mays) myo-inositol 1-
206 LI:296358.1:2000SEP08 g12053148 0 Homo sapiens mRNA; cDNA DKFZp434G2226 (from clone DKFZp434G2226);
207 LI:205186.3:2000SEP08 g567232 4.00E-17 (fl) (Mus musculus) proline-rich protein
208 LI:220537.2:2000SEP08 g13365896 0 Macaca fascicularis brain cDNA clone:QflA-
209 LI:248364.2:2000SEP08 g12847599 0 0 (Mus musculus)
210 LI:2048338.1:2000SEP08 g12848905 3.00E-78 0 (Mus musculus)
211 LI:1185203.8:2000SEP08 g11231084 3.00E-85 Macaca fascicularis brain
212 LI:021770.3:2000SEP08 g5931541 2.00E-45 Homo sapiens genomic DNA, chromosome 22q11.2, BCRL2
213 LI:1185841.1:2000SEP08 g14269501 0 Homo sapiens unconventional myosin 1G valine form (MYO1G) mRNA, MYO1G-V
214 LI:1181710.1:2000SEP08 g7959206 2.00E-49 Homo sapiens mRNA for KIAA1473 protein, partial cds.
215 LI:2048959.1:2000SEP08 g5817101 1.00E-16 Homo sapiens mRNA; cDNA DKFZp434G1621 (from clone
216 LI:798494.1:2000SEP08 g10047182 2.00E-23 Homo sapiens mRNA for KIAA1559 protein, partial cds.
217 LI:2049223.1:2000SEP08 g7959206 2.00E-43 Homo sapiens mRNA for KIAA1473 protein, partial cds.
218 LI:1177833.1:2000SEP08 g12052982 0 Homo sapiens mRNA; cDNA DKFZp434ll610 (from clone DKFZp434ll610);
219 LI:2049267.1:2000SEP08 g515786 6.00E-61 H.sapiens (MAR7) chromosome 19 DNA, 302bp.
220 LI:1165939.1:2000SEP08 g7020753 9.00E-13 Homo sapiens cDNA FLJ20562 fis, clone KAT11992.
221 LI:1170958.1:2000SEP08 g5262556 1.00E-14 Homo sapiens mRNA; cDNA DKFZp569D2231 (from clone DKFZp569D2231);
222 LI:1089827.1:2000SEP08 g14042292 0 Homo sapiens cDNA FLJ14636 fis, clone NT2RP2001233, weakly similar to
223 LI:792112.1:2000SEP08 g10437945 4.00E-61 Homo sapiens cDNA: FLJ21781 fis, clone HEP00223.
224 LI:282219.2:2000SEP08 g12656630 7.00E-63 Homo sapiens Kruppel-like zinc finger protein GLIS2 mRNA, complete
225 LI:1088010.2:2000SEP08 g13623632 0 Homo sapiens, clone MGC:13105, mRNA, complete
226 LI:1165276.1:2000SEP08 g6807586 1.00E-106 Novel human gene mapping to chromosome 1.
227 LI:1169524.2:2000SEP08 g10047182 5.00E-30 Homo sapiens mRNA for KIAA1559 protein, partial cds.
228 LI:1180255.1:2000SEP08 g14017870 0 Homo sapiens mRNA for KIAA1827 protein, partial cds.
229 LI:1091903.1:2000SEP08 g14042372 2.00E-49 Homo sapiens cDNA FLJ14686 fis, clone NT2RP2004961, moderately similar to Rattus norvegicus KRAB/zinc finger
230 LI:1169219.1:2000SEP08 g13937998 0 Homo sapiens, Similar to DNA-binding protein, clone MGC:14780,
231 LI:2050313.1:2000SEP08 g12862319 0 Homo sapiens mRNA for WDC146,
232 LI:209351.3:2000SEP08 g14042794 0 Homo sapiens cDNA FLJ14923 fis, clone PLACE1008244, weakly similar to VEGETATIBLE
233 LI:119900.1:2000SEP08 g14042821 2.00E-66 Homo sapiens cDNA FLJ14939 fis, clone PLACE1010702, moderately similar
234 LI:2052274.1:2000SEP08 g8698839 4.00E-07 Homo sapiens genomic DNA, chromosome 8q23,
235 LI:1075502.1:2000SEP08 g13960141 1.00E-114 Homo sapiens, uridine monophosphate synthetase (orotate phosphoribosyl transferase and orotidine-5'-decarboxylase),
236 LI:813697.1:2000SEP08 g7020744 6.00E-59 Homo sapiens cDNA FLJ20557 fis, clone KAT11869.
237 LI:814261.1:2000SEP08 g7023331 1.00E-76 Homo sapiens cDNA FLJ10961 fis, clone PLACE1000588, highly similar to INTERFERON-
238 LI:775334.1:2000SEP08 g4589587 3.00E-23 Homo sapiens mRNA for KIAA0972 protein, complete
239 LI:1180325.1:2000SEP08 g10438159 0 Homo sapiens cDNA: FLJ21941 fis, clone HEP04524.
240 LI:1183147.3:2000SEP08 g10440084 0 Homo sapiens cDNA: FLJ23407 fis, clone HEP19601.
241 LI:1175373.3:2000SEP08 g7243242 5.00E-51 Homo sapiens mRNA for KIAA1431 protein, partial cds.
242 LI:813757.1:2000SEP08 g4589587 1.00E-31 Homo sapiens mRNA for KIAA0972 protein, complete
243 LI:1182979.2:2000SEP08 g14042292 2.00E-87 Homo sapiens cDNA FLJ14636 fis, clone NT2RP2001233, weakly similar to
244 LI:1177823.2:2000SEP08 g12052982 0 Homo sapiens mRNA; cDNA DKFZp434ll610 (from clone DKFZp434ll610);
245 LI:1174279.1:2000SEP08 g10047182 1.00E-27 Homo sapiens mRNA for KIAA1559 protein, partial cds.
246 LI:1178411.1:2000SEP08 g14042843 0 Homo sapiens cDNA FLJ14954 fis, clone PLACE3000169, weakly similar to
247 LI:1182739.1:2000SEP08 g12655164 8.00E-31 Homo sapiens, zinc finger protein 256, clone MGC:1413, mRNA, complete
248 LI:234937.4:2000SEP08 g12804418 Homo sapiens, clone MGC:1136, mRNA, complete
249 LI:1170660.1:2000SEP08 g14388419 Macaca fascicularis brain cDNA clone:QmoA-
250 LI:1144409.1:2000SEP08 g12053280 Homo sapiens mRNA; cDNA DKFZp434J037 (from clone DKFZp434J037);
251 LI:246290.10:2000SEP08 g10438695 Homo sapiens cDNA: FLJ22347 fis, clone HRC06188.
252 LI:280034.1:2000SEP08 g12847023 1.00E-132 0 (Mus musculus)
TABLE 3
SEQ ID NO: Template ID Start Stop Frame Pfam Hit Pfam Description E-value
1 LG:150318.1:2000SEP08 11 79 forward 2 zf-C2H2 Zinc finger, C2H2 type 2.00E-06
2 LG:022529.1 :2000SEP08 426 701 forward 3 C2 C2 domain 3.30E-06
3 LG:352559.1 :2000SEP08 125 313 forward 2 KRAB KRAB box 1.60E-41
4 LG:175223.1 :2000SEP08 210 431 forward 3 CSD 'Cold-shock' DNA-binding domain 1.40E-18
5 LG:476989.1 :2000SEP08 149 223 forward 2 zf-C3HC4 Zinc finger, C3HC4 type (RING finger) 3.00E-10
6 LG:253268.7:2000SEP08 211 435 forward 1 rrm RNA recognition motif, (a.k.a. RRM, RBD, or RNP domain) 3.50E-16
7 LG:401322.1 :2000SEP08 156 341 forward 3 tubulin Tubulin/FtsZ family 7.10E-20
7 LG:401322.1 :2000SEP08 371 478 forward 2 tubulin Tubulin/FtsZ family 2.50E-06
8 LG:1328436.1 :2000SEP08 134 322 forward 2 KRAB KRAB box 1.70E-42
9 LG:475404.1:2000SEP08 176 328 forward 2 KRAB KRAB box 1.10E-15
10 LG:1384132.1:2000SEP08 157 225 forward 1 zf-C2H2 Zinc finger, C2H2 type 9.00E-04
11 LG:410804.18:2000SEP08 133 243 forward 1 pkinase Protein kinase domain 7.90E-04
12 LG:1082306.1:2000SEP08 199 378 forward 1 KRAB KRAB box 4.90E-20
12 LG:1082306.1:2000SEP08 559 627 forward 1 zf-C2H2 Zinc finger, C2H2 type 9.60E-05
13 LG:233814.4:2000SEP08 624 947 forward 3 mito_carr Mitochondrial carrier protein 2.50E-04
14 LG:977478.5:2000SEP08 351 449 forward 3 ank Ank repeat 1.10E-08
15 LG:025931.1 :2000SEP08 660 728 forward 3 zf-C2H2 Zinc finger, C2H2 type 4.50E-06
15 LG:025931.1 :2000SEP08 29 97 forward 2 zf-C2H2 Zinc finger, C2H2 type 7.20E-06
16 LG:885368.1 :2000SEP08 133 462 forward 1 Ribosomal_S8 Ribosomal protein S8 7.50E-48
17 LG:1054900.1 :2000SEP08 78 218 forward 3 KRAB KRAB box 2.30E-17
18 LG:995186.2:2000SEP08 151 357 forward 1 KRAB KRAB box 1.10E-04
19 LG:435048.23:2000SEP08 2 457 forward 2 Peptidase_M17 Cytosol aminopeptidase family, catalytic domain 5.30E-09
20 LG:954859.1:2000SEP08 245 565 forward 2 Ribosomal_L7Ae Ribosomal protein L7Ae/L30e/S12e/Gadd45 family 7.80E-17
21 LG:364370.1 :2000SEP08 42 578 forward 3 Ribosomal_L13e Ribosomal protein L13e 2.30E-137
22 LG:1098789.1 :2000SEP08 45 131 forward 3 efhand EF hand 8.70E-05
23 LG:201540.2:2000SEP08 479 604 forward 2 zf-B_box B-box zinc finger. 9.60E-1
23 LG:201540.2:2000SEP08 254 385 forward 2 zf-C3HC4 Zinc finger, C3HC4 type (RING finger) 4.80E-13
24 LG:1077357.1 :2000SEP08 94 282 forward 1 KRAB KRAB box 4.80E-31
25 LG:1048846.4:2000SEP08 60 245 forward 3 KRAB KRAB box 7.10E-39
26 LG:336685.1:2000SEP08 945 1121 forward 3 homeobox Homeobox domain 3.10E-11
27 LG:1076253.1:2000SEP08 6 188 forward 3 KRAB KRAB box 4.10E-14
27 LG:1076253.1:2000SEP08 939 1007 forward 3 zf-C2H2 Zinc finger, C2H2 type 9.30E-06
28 LG:1400601.2:2000SEP08 38 106 forward 2 zf-C2H2 Zinc finger, C2H2 type 1.50E-05
29 LG:1079092.3:2000SEP08 314 382 forward 2 zf-C2H2 Zinc finger, C2H2 type 8.80E-06
30 LG:1086064.1:2000SEP08 128 316 forward 2 KRAB KRAB box 2.00E-34
31 LG: 1400608.1 :2000SEP08 138 326 forward 3 KRAB KRAB box 1.30E-40
32 LG:399275.5:2000SEP08 328 645 forward 1 BTB BTB/POZ domain 4.10E-18
33 LG:293943.1 :2000SEP08 265 327 forward 1 IQ IQ calmodulin-binding motif 5.10E-04
34 LG:345884.1:2000SEP08 203 355 forward 2 ig Immunoglobulin domain 7.40E-04
35 LG:400967.1:2000SEP08 109 300 forward 1 KRAB KRAB box 1.50E-35
36 LG:024556.6:2000SEP08 526 774 forward 1 mito_carr Mitochondrial carrier protein 2.90E-19
37 LG:081189.3:2000SEP08 362 475 forward 2 EGF EGF-like domain 3.80E-06
38 LG:018258.1 :2000SEP08 174 605 forward 3 Nitroreductase Nitroreductase family 9.00E-05
39 LG:450399.3:2000SEP08 81 446 forward 3 Ribosomal_L14 Ribosomal protein L14p/L23e 1.20E-53
40 LG:451122.1:2000SEP08 49 303 forward 1 ACBP Acyl CoA binding protein 1.70E-51
41 LG:451682.1:2000SEP08 117 560 forward 3 proteasome Proteasome A-type and B-type 4.40E-59
42 LG:238631.4:2000SEP08 100 453 forward 1 KE2 KE2 family protein 3.80E-38
43 LG:236654.1:2000SEP08 810 878 forward 3 zf-C2H2 Zinc finger, C2H2 type 1.20E-04
43 LG:236654.1 :2000SEP08 434 502 forward 2 zf-C2H2 Zinc finger, C2H2 type 6.60E-04
44 LG:332655.1 :2000SEP08 415 513 forward 1 ank Ank repeat 3.10E-10
45 LG:217396.2:2000SEP08 679 882 forward 1 ELM2 ELM2 domain 5.00E-14
45 LG:217396.2:2000SEP08 994 1134 forward 1 myb_DNA-binding Myb-like DNA-binding domain 2.00E-11
46 LG:090574.1 :2000SEP08 33 245 forward 3 carb_anhydrase Eukaryotic-type carbonic anhydrase 3.10E-31
47 LG:202943.1 :2000SEP08 199 360 forward 1 sushi Sushi domain (SCR repeat) 3.80E-18
48 LG:236928.1 :2000SEP08 259 1101 forward 1 GNS1_SUR4 GNS1 /SUR4 family 9.80E-09
49 LG:215169.2:2000SEP08 89 208 forward 2 ldl_recept_a Low-density lipoprotein receptor domain class A 1.00E-12
50 LG:410726.1 :2000SEP08 724 912 forward 1 KRAB KRAB box 2.10E-17
50 LG:410726.1 :2000SEP08 352 636 forward 1 SCAN SCAN domain 8.90E-55
51 LG:234372.2:2000SEP08 557 862 forward 2 PH PH domain 3.00E-09
51 LG:234372.2:2000SEP08 950 1408 forward 2 RhoGAP RhoGAP domain 3.30E-39
51 LG:234372.2:2000SEP08 1952 2119 forward 2 SH3 SH3 domain 4.60E-11
52 LG:022629.1 :2000SEP08 261 410 forward 3 zf-C3HC4 Zinc finger, C3HC4 type (RING finger) 2.00E-07
53 LG:068682.1 :2000SEP08 176 883 forward 2 pkinase Protein kinase domain 1.70E-65
54 LG:222335.1 :2000SEP08 7 732 forward 1 DUF71 Domain of unknown function DUF71 6.70E-83
55 LG:331342.1 :2000SEP08 558 1076 forward 3 arf ADP-ribosylation factor family 1.20E-04
55 LG:331342.1:2000SEP08 588 1154 forward 3 ras Ras family 2.60E-79
56 LG:021770.1:2000SEP08 228 1199 forward 3 adh_zinc Zinc-binding dehydrogenases 7.80E-20
57 LG:181607.9:2000SEP08 161 604 forward 2 Josephin Josephin 5.00E-50
58 LG:1042768.1:2000SEP08 42 443 forward 3 Tim17 Mitochondrial import inner membrane translocase subunit Tim17 1.40E-82
59 LG:282729.1:2000SEP08 61 192 forward 1 S_100 S-100/ICaBP type calcium binding domain 1.10E-12
60 LG:998305.3:2000SEP08 266 364 forward 2 ank Ank repeat 2.10E-07
61 LG:1135213.1:2000SEP08 340 531 forward 1 T-box T-box 8.80E-27
62 LG:267762.1:2000SEP08 64 1125 forward 1 A_deaminase Adenosine/AMP deaminase 7.80E-20
63 LG:120744.1 :2000SEP08 301 813 forward 1 vwa von Willebrand factor type A domain 1.90E-52
64 LG:403409.1 :2000SEP08 1458 1652 forward 3 FHA FHA domain 3.00E-04
64 LG:403409.1:2000SEP08 78 1193 forward 3 kinesin Kinesin motor domain 6.80E-172
65 LG:226874.3:2000SEP08 118 291 forward 1 LIM LIM domain 8.40E-13
66 LG:1045521.4:2000SEP08 160 1464 forward 1 aminotran Aminotransferase class-I 2.00E-10
67 LG:275876.1:2000SEP08 508 831 forward 1 CH Calponin homology (CH) domain 2.40E-26
68 LG:475127.7:2000SEP08 169 498 forward 1 GTP_EFTU Elongation factor Tu family 2.20E-46
69 LG:157263.1:2000SEP08 163 1314 forward 1 MCT Monocarboxylate transporter 6.10E-31
70 LG:247382.7:2000SEP08 1105 1353 forward 1 PDZ PDZ domain (Also known as DHR or GLGF). 4.20E-07
70 LG:247382.7:2000SEP08 487 684 forward 1 SAM SAM domain (Sterile alpha motif) 1.20E-11
71 LG:197367.5:2000SEP08 50 472 forward 2 Pept_tRNA_hydro Peptidyl-tRNA hydrolase 8.90E-05
72 LG:218090.5:2000SEP08 179 295 forward 2 zf-MYND MYND finger 3.00E-10
73 LG:216612.4:2000SEP08 138 1274 forward 3 Cation_efflux Cation efflux family 8.90E-63
74 LG:197614.1:2000SEP08 1699 3048 forward 1 Oxysterol_BP Oxysterol-binding protein 4.10E-31
74 LG:197614.1:2000SEP08 886 1164 forward 1 PH PH domain 1.20E-14
75 LG:378428.1 :2000SEP08 621 920 forward 3 PH PH domain 2.00E-06
76 LG:286639.1 :2000SEP08 1294 1713 forward 1 actin Actin 9.10E-64
76 LG:286639.1 :2000SEP08 630 1319 forward 3 actin Actin 1.70E-62
77 LG:389870.1 :2000SEP08 131 640 forward 2 ras Ras family 3.40E-37
78 LG:1387485.6:2000SEP08 227 709 forward 2 adh_short short chain dehydrogenase 1.10E-25
79 LG:230151.1:2000SEP08 631 1602 forward 1 E1_dehydrog Dehydrogenase E1 component 1.10E-09
80 LG:215158.5:2000SEP08 1243 1377 forward 1 zf-AN1 AN1-like Zinc finger 9.10E-04
81 LG:235840.1:2000SEP08 1241 1537 forward 2 K_tetra K+ channel tetramerisation domain 5.60E-22
82 LG:350272.1:2000SEP08 1424 1780 forward 2 SPRY SPRY domain 2.10E-10
82 LG:350272.1:2000SEP08 557 682 forward 2 zf-C3HC4 Zinc finger, C3HC4 type (RING finger) 5.30E-11
83 LG:232190.1 :2000SEP08 497 691 forward 2 zf-DHHC DHHC zinc finger domain 2.30E-38
84 LG: 1068127.1 :2000SEP08 158 307 forward 2 KRAB KRAB box 3.00E-12
85 LG:408751.3:2000SEP08 194 1204 forward 2 Dihydroorotase Dihydroorotase-like 1.20E-12
85 LG:408751.3:2000SEP08 456 1355 forward 3 Dihydroorotase Dihydroorotase-like 6.60E-08
86 LG:1078933.1:2000SEP08 373 441 forward 1 zf-C2H2 Zinc finger, C2H2 type 1.20E-07
86 LG:1078933.1:2000SEP08 689 775 forward 2 zf-C2H2 Zinc finger, C2H2 type 1.00E-03
87 LG:958731.1:2000SEP08 153 707 forward 3 adh_short short chain dehydrogenase 2.40E-39
88 LG:024125.5:2000SEP08 132 635 forward 3 FAA_hydrolase Fumarylacetoacetate (FAA) hydrolase family 1.20E-83
89 LG:373637.3:2000SEP08 61 255 forward 1 DnaJ DnaJ domain 7.40E-33
90 LG:1053229.1 :2000SEP08 514 582 forward 1 zf-C2H2 Zinc finger, C2H2 type 2.50E-08
91 LG:248364.1 :2000SEP08 253 594 forward 1 BTB BTB/POZ domain 3.60E-04
91 LG:248364.1 :2000SEP08 1544 1612 forward 2 zf-C2H2 Zinc finger, C2H2 type 7.10E-06
92 LG:477130.1:2000SEP08 174 554 forward 3 Ribosomal_L13 Ribosomal protein L13 3.60E-37
93 LG:113786.17:2000SEP08 134 376 forward 2 PDZ PDZ domain (Also known as DHR or GLGF). 8.10E-15
94 LG:347635.1 :2000SEP08 315 410 forward 3 EGF EGF-like domain 6.70E-09
94 LG:347635.1 :2000SEP08 605 709 forward 2 EGF EGF-like domain 1.10E-04
94 LG:347635.1:2000SEP08 1057 1152 forward 1 EGF EGF-like domain 1.20E-04
95 LG:242966.4:2000SEP08 7 918 forward 1 aldedh Aldehyde dehydrogenase family 1.10E-05
95 LG:242966.4:2000SEP08 1031 1993 forward 2 aldedh Aldehyde dehydrogenase family 2.20E-05
96 LG:217814.1 :2000SEP08 1167 1265 forward 3 ank Ank repeat 6.60E-08
97 LG:476452.1:2000SEP08 210 581 forward 3 Glyoxalase Glyoxalase/Bleomycin resistance protein/Dioxygenase superfamily 8.10E-39
98 LG:1100657.1:2000SEP08 151 384 forward 1 hormone5 Neurohypophysial hormones, C-terminal Domain 8.60E-47
99 LG:1132418.2:2000SEP08 109 393 forward 1 proteasome Proteasome A-type and B-type 1.10E-14
100 LG:1098570.1:2000SEP08 129 572 forward 3 NDK Nucleoside diphosphate kinases 3.70E-117
101 LG:1097987.1 :2000SEP08 747 908 forward 3 Ribosomal_L23 Ribosomal protein L23 5.80E-14
102 LG:337818.2:2000SEP08 136 1518 forward 1 p450 Cytochrome P450 6.30E-175
103 LG: 1040582.1 :2000SEP08 266 556 forward 2 aldo_ket_red Aldo/keto reductase family 1.30E-47
103 LG: 1040582.1 :2000SEP08 546 635 forward 3 aldo_ket_red Aldo/keto reductase family 7.60E-06
104 LG: 1099122.1 :2000SEP08 84 248 forward 3 G-gamma GGL domain 4.90E-31
105 LG:1327449.1:2000SEP08 461 526 forward 2 Ribosomal_S8 Ribosomal protein S8 4.80E-07
105 LG:1327449.1:2000SEP08 246 359 forward 3 Ribosomal_S8 Ribosomal protein S8 3.00E-06
106 LG:227933.5:2000SEP08 523 1455 forward 1 AMP-binding AMP-binding enzyme 7.20E-05
107 LG:1043709.2:2000SEP08 457 723 forward 1 oxidored_q6 NADH ubiquinone oxidoreductase, 20 Kd subunit 8.40E-08
108 LG:1099871.1:2000SEP08 34 411 forward 1 histone Core histone H2A/H2B/H3/H4 4.50E-47
109 LG:1399139.4:2000SEP08 117 233 forward 3 CAP_GLY CAP-Gly domain 4.30E-10
110 LG:236386.1:2000SEP08 1383 1712 forward 3 PX PX domain 1.90E-13
111 LG:1015157.1:2000SEP08 115 552 forward 1 Glyoxalase Glyoxalase/Bleomycin resistance protein/Dioxygenase superfamily 6.00E-38
112 LG:1065433.1:2000SEP08 264 449 forward 3 KRAB KRAB box 1.30E-38
113 LG:236992.4:2000SEP08 286 465 forward 1 Sema Sema domain 2.70E-21
113 LG:236992.4:2000SEP08 209 292 forward 2 Sema Sema domain 3.60E-06
114 LG:1071124.1:2000SEP08 469 630 forward 1 KRAB KRAB box 1.20E-16
115 LG:206425.2:2000SEP08 109 519 forward 1 GBP Guanylate-binding protein, N-terminal domain 1.60E-100
115 LG:206425.2:2000SEP08 491 916 forward 2 GBP Guanylate-binding protein, N-terminal domain 2.70E-70
115 LG:206425.2:2000SEP08 1164 1685 forward 3 GBP_C Guanylate-binding protein, C-terminal domain 3.50E-71
115 LG:206425.2:2000SEP08 920 1201 forward 2 GBP_C Guanylate-binding protein, C-terminal domain 3.10E-63
115 LG:206425.2:2000SEP08 1651 1773 forward 1 GBP_C Guanylate-binding protein, C-terminal domain 1.80E-04
116 LG:885747.2:2000SEP08 372 569 forward 3 Ribosomal_L31e Ribosomal protein L31e 9.50E-16
117 LG:1140501.1:2000SEP08 176 340 forward 2 ATP1G1_PLM_MAT8 ATP1G1/PLM/MAT8 family 9.30E-22
118 LG:001239.1:2000SEP08 1686 1850 forward 3 ig Immunoglobulin domain 2.50E-08
118 LG:001239.1 :2000SEP08 1391 1561 forward 2 ig Immunoglobulin domain 4.00E-08
119 LG:018980.1 :2000SEP08 395 502 forward 2 GTP_EFTU Elongation factor Tu family 3.20E-09
119 LG:018980.1:2000SEP08 634 801 forward 1 GTP_EFTU Elongation factor Tu family 3.50E-07
119 LG:018980.1 :2000SEP08 159 215 forward 3 GTP_EFTU Elongation factor Tu family 6.90E-07
120 LG:1083120.3:2000SEP08 117 266 forward 3 KRAB KRAB box 5.10E-22
121 LG:233258.3:2000SEP08 1797 2069 forward 3 cadherin Cadherin domain 3.90E-24
122 LG:999062.1 :2000SEP08 78 497 forward 3 Ribosomal_S19e Ribosomal protein S19e 2.10E-101
123 LG:887776.1:2000SEP08 101 625 forward 2 transport_prot Serum albumin family 3.30E-92
123 LG:887776.1:2000SEP08 699 1163 forward 3 transport_prot Serum albumin family 5.90E-41
124 LG: 1400301.2:2000SEP08 253 441 forward 1 KRAB KRAB box 2.10E-38
125 LG: 1329362.1 :2000SEP08 194 262 forward 2 zf-C2H2 Zinc finger, C2H2 type 3.90E-06
126 LG:1096498.1:2000SEP08 17 358 forward 2 60s_ribosomal 60s Acidic ribosomal protein 1.70E-48
127 LG:1096337.1:2000SEP08 512 637 forward 2 60s_ribosomal 60s Acidic ribosomal protein 4.10E-07
127 LG:1096337.1:2000SEP08 619 738 forward 1 60s_ribosomal 60s Acidic ribosomal protein 1.50E-04
128 LG: 1400579.1 :2000SEP08 200 268 forward 2 zf-C2H2 Zinc finger, C2H2 type 7.80E-05
129 LG:1080091.1 :2000SEP08 151 318 forward 1 KRAB KRAB box 1.80E-25
130 LG:1082203.1 :2000SEP08 961 1029 forward 1 zf-C2H2 Zinc finger, C2H2 type 2.10E-08
131 LG: 1084051.1 :2000SEP08 195 263 forward 3 zf-C2H2 Zinc finger, C2H2 type 1.80E-06
132 LG:1082393.1 :2000SEP08 237 425 forward 3 KRAB KRAB box 1.80E-36
132 LG:1082393.1 :2000SEP08 1251 1319 forward 3 zf-C2H2 Zinc finger, C2H2 type 1.40E-06
133 LG: 1086183.1 :2000SEP08 308 478 forward 2 KRAB KRAB box 1.50E-15
133 LG: 1086183.1 :2000SEP08 833 901 forward 2 zf-C2H2 Zinc finger, C2H2 type 9.60E-07
134 LG:1090268.1 :2000SEP08 1167 1235 forward 3 zf-C2H2 Zinc finger, C2H2 type 1.00E-07
135 LG:1400597.5:2000SEP08 63 131 forward 3 zf-C2H2 Zinc finger, C2H2 type 1.20E-04
136 LG:1080307.2:2000SEP08 108 239 forward 3 KRAB KRAB box 3.60E-04
137 LG:1400603.2:2000SEP08 299 490 forward 2 KRAB KRAB box 5.50E-41
137 LG:1400603.2:2000SEP08 929 997 forward 2 zf-C2H2 Zinc finger, C2H2 type 1.20E-04
138 LG:1052984.1 :2000SEP08 142 330 forward 1 KRAB KRAB box 3.70E-41
139 LG: 1091259.1.2000SEP08 555 623 forward 3 zf-C2H2 Zinc finger, C2H2 type 1.20E-07
140 LG:1082263.2:2000SEP08 258 443 forward 3 KRAB KRAB box 2.50E-29
140 LG:1082263.2:2000SEP08 981 1049 forward 3 zf-C2H2 Zinc finger, C2H2 type 1.70E-07
141 LG:1048604.2:2000SEP08 351 530 forward 3 KRAB KRAB box 3.50E-20
142 LG:1085254.3:2000SEP08 746 814 forward 2 zf-C2H2 Zinc finger, C2H2 type 3.10E-07
143 LG: 1400606.2:2000SEP08 244 432 forward 1 KRAB KRAB box 7.60E-41
143 LG: 1400606.2:2000SEP08 904 972 forward 1 zf-C2H2 Zinc finger, C2H2 type 1.80E-07
144 LG:1090358.2:2000SEP08 385 573 forward 1 KRAB KRAB box 1.30E-47
144 LG:1090358.2:2000SEP08 862 930 forward 1 zf-C2H2 Zinc finger, C2H2 type 7.90E-07
145 LG:1079064.2:2000SEP08 325 612 forward 1 SCAN SCAN domain 3.00E-39
146 LG:1076866.1:2000SEP08 203 391 forward 2 KRAB KRAB box 6.20E-39
146 LG:1076866.1:2000SEP08 1868 1936 forward 2 zf-C2H2 Zinc finger, C2H2 type 5.20E-08
147 LG:969359.1:2000SEP08 221 715 forward 2 ferritin Ferritin 3.10E-86
148 LG:366783.1 :2000SEP08 70 609 forward 1 Zn_carbOpept Zinc carboxypeptidase 8.80E-76
148 LG:366783.1 :2000SEP08 705 806 forward 3 Zn_carbOpept Zinc carboxypeptidase 2.20E-10
148 LG:366783.1 :2000SEP08 623 724 forward 2 Zn_carbOpept Zinc carboxypeptidase 1.80E-05
149 LG:332176.3:2000SEP08 5 1060 forward 2 Glyco_hydro_31 Glycosyl hydrolases family 31 2.40E-163
150 LG:994938.1:2000SEP08 3 95 forward 3 GBP Guanylate-binding protein, N-terminal domain 4.20E-14
151 LG:982800.1 :2000SEP08 16 243 forward 1 pkinase Protein kinase domain 1.30E-10
152 LG:977850.7:2000SEP08 7 102 forward 1 zf-DHHC DHHC zinc finger domain 1.30E-04
153 LG:234748.2:2000SEP08 6 149 forward 3 PTE Phosphotriesterase family 1.20E-19
154 LG:306284.1 :2000SEP08 197 613 forward 2 Band_41 FERM domain (Band 4.1 family) 2.00E-52
155 LI:333170.3:2000SEP08 292 498 forward 1 HMG_box HMG (high mobility group) box 1.80E-22
156 LI:336685.2:2000SEP08 913 1092 forward 1 homeobox Homeobox domain 4.20E-15
157 LI:279013.5:2000SEP08 393 500 forward 3 WD40 WD domain, G-beta repeat 2.70E-05
158 LI:1037075.1:2000SEP08 322 666 forward 1 ABC_tran ABC transporter 7.10E-04
158 LI:1037075.1:2000SEP08 328 375 forward 1 PRK Phosphoribulokinase / Uridine kinase family 7.40E-04
159 LI:1073403.1:2000SEP08 203 289 forward 2 efhand EF hand 5.50E-04
159 LI:1073403.1:2000SEP08 56 187 forward 2 S_100 S-100/ICaBP type calcium binding domain 3.60E-23
160 LI:1075296.1:2000SEP08 337 543 forward 1 Ribosomal_L12 Ribosomal protein L7/L12 C-terminal domain 5.40E-09
161 LI:1085501.1:2000SEP08 561 776 forward 3 EF1BD EF-1 guanine nucleotide exchange domain 1.20E-24
162 LI:1086181.1:2000SEP08 305 469 forward 2 HTH_3 Helix-turn-helix 1.20E-10
163 LI:1164493.1:2000SEP08 365 553 forward 2 KRAB KRAB box 8.10E-31
164 LI:1175097.1:2000SEP08 460 648 forward 1 KRAB KRAB box 7.60E-41
165 LI:1092948.1:2000SEP08 446 595 forward 2 KRAB KRAB box 2.80E-21
166 LI:380378.2:2000SEP08 120 419 forward 3 mito_carr Mitochondrial carrier protein 1.30E-25
167 LI:1029674.1:2000SEP08 360 431 forward 3 LRR Leucine Rich Repeat 9.00E-04
168 LI:2048601.3:2000SEP08 2 145 forward 2 ig Immunoglobulin domain 3.00E-05
169 LI:1186208.1:2000SEP08 161 349 forward 2 KRAB KRAB box 1.30E-40
170 LI:1170753.1:2000SEP08 130 318 forward 1 KRAB KRAB box 8.40E-42
171 LI:1180908.1:2000SEP08 509 577 forward 2 zf-C2H2 Zinc finger, C2H2 type 3.90E-06
171 LI:1180908.1:2000SEP08 340 408 forward 1 zf-C2H2 Zinc finger, C2H2 type 1.30E-04
172 LI:1182900.2:2000SEP08 346 414 forward 1 zf-C2H2 Zinc finger, C2H2 type 3.10E-07
173 LI:1169548.2:2000SEP08 393 500 forward 3 VHP Villin headpiece domain 7.10E-20
174 LI:1039974.1:2000SEP08 191 742 forward 2 pkinase Protein kinase domain 1.30E-37
175 LI:1175765.2:2000SEP08 279 446 forward 3 KRAB KRAB box 5.20E-23
176 LI:313948.1:2000SEP08 75 143 forward 3 zf-C2H2 Zinc finger, C2H2 type 2.20E-07
177 LI:335923.2:2000SEP08 215 523 forward 2 lys C-type lysozyme/alpha-lactalbumin family 1.60E-42
178 LI:345884.1:2000SEP08 208 360 forward 1 ig Immunoglobulin domain 7.40E-04
179 LI:417127.1:2000SEP08 359 529 forward 2 KRAB KRAB box 4.90E-22
180 LI:451710.1:2000SEP08 130 459 forward 1 Ribosomal_L32e Ribosomal protein L32 4.80E-57
181 LI:406882.2:2000SEP08 247 408 forward 1 Kelch Kelch motif 4.20E-11
181 LI:406882.2:2000SEP08 656 793 forward 2 Kelch Kelch motif 8.90E-09
182 LI:728223.1:2000SEP08 158 373 forward 2 rrm RNA recognition motif (a.k.a. RRM, RBD, or RNP domain) 1.20E-28
183 LI:289783.19:2000SEP08 821 1003 forward 2 thiored Thioredoxin 1.70E-20
183 LI:289783.19:2000SEP08 747 800 forward 3 thiored Thioredoxin 5.10E-05
184 LI:235255.8:2000SEP08 419 919 forward 2 TGT Queuine tRNA-ribosyltransferase 2.10E-24
184 LI:235255.8:2000SEP08 747 1346 forward 3 TGT Queuine tRNA-ribosyltransferase 7.00E-15
185 LI:237693.5:2000SEP08 48 518 forward 3 abhydrolase_2 Phospholipase/Carboxylesterase 6.20E-07
186 LI:433670.3:2000SEP08 404 682 forward 2 Sema Sema domain 3.80E-26
187 LI:202943.4:2000SEP08 119 223 forward 2 EGF EGF-like domain 1.60E-05
187 LI:202943.4:2000SEP08 1304 1465 forward 2 sushi Sushi domain (SCR repeat) 3.80E-18
188 LI:068682.1:2000SEP08 169 876 forward 1 pkinase Protein kinase domain 1.70E-65
189 LI:203301.3:2000SEP08 570 716 forward 3 Band_41 FERM domain (Band 4.1 family) 1.50E-20
189 LI:203301.3:2000SEP08 365 577 forward 2 Band_41 FERM domain (Band 4.1 family) 6.30E-16
190 LI:020726.3:2000SEP08 542 1915 forward 2 MCT Monocarboxylate transporter 1.80E-98
191 LI:027209.1:2000SEP08 379 987 forward 1 fibrinogen_C Fibrinogen beta and gamma chains, C-terminal globular domain 7.80E-43
192 LI:108819.1:2000SEP08 316 828 forward 1 vwa von Willebrand factor type A domain 1.90E-52
193 LI:021759.1:2000SEP08 1136 1246 forward 2 WD40 WD domain, G-beta repeat 1.70E-07
194 LI:1165967.1:2000SEP08 322 462 forward 1 Ribosomal_S27 Ribosomal protein S27a 1.50E-30
194 LI:1165967.1:2000SEP08 22 243 forward 1 ubiquitin Ubiquitin family 4.50E-42
195 LI:1166315.1:2000SEP08 131 283 forward 2 pro_isomerase Cyclophilin type peptidyl-prolyl cis-trans isomerase 1.10E-26
195 LI:1166315.1:2000SEP08 280 423 forward 1 pro_isomerase Cyclophilin type peptidyl-prolyl cis-trans isomerase 5.50E-17
195 LI:1166315.1:2000SEP08 423 482 forward 3 pro_isomerase Cyclophilin type peptidyl-prolyl cis-trans isomerase 4.10E-06
196 LI:204626.1:2000SEP08 322 1212 forward 1 Syntaxin Syntaxin 8.60E-44
197 LI:801140.1:2000SEP08 516 584 forward 3 zf-C2H2 Zinc finger, C2H2 type 2.50E-08
198 LI:286639.1:2000SEP08 1295 1714 forward 2 actin Actin 9.10E-64
198 LI:286639.1:2000SEP08 772 1296 forward 1 actin Actin 1.60E-44
198 LI:286639.1:2000SEP08 630 755 forward 3 actin Actin 1.10E-09
199 LI:288905.4:2000SEP08 285 602 forward 3 CH Calponin homology (CH) domain 5.30E-27
200 LI:332161.1:2000SEP08 75 710 forward 3 ras Ras family 2.00E-48
201 LI:184867.1:2000SEP08 365 478 forward 2 ubiquitin Ubiquitin family 1.40E-04
202 LI:229932.4:2000SEP08 58 969 forward 1 AMP-binding AMP-binding enzyme 3.00E-06
203 LI:1189932.1:2000SEP08 95 1441 forward 2 Folate_carrier Reduced folate carrier 5.80E-10
204 LI:1076689.1:2000SEP08 346 708 forward 1 ribonuclease_T2 Ribonuclease T2 family 6.50E-19
205 LI:415181.2:2000SEP08 505 1584 forward 1 Inos-1-P_synth Myo-inositol-1-phosphate synthase 2.30E-136
205 LI:415181.2:2000SEP08 294 1220 forward 3 Inos-1-P_synth Myo-inositol-1-phosphate synthase 3.40E-16
206 LI:296358.1:2000SEP08 678 1061 forward 3 kinesin Kinesin motor domain 1.40E-46
206 LI:296358.1:2000SEP08 163 510 forward 1 kinesin Kinesin motor domain 6.80E-24
207 LI:205186.3:2000SEP08 476 982 forward 2 lactamase_B Metallo-beta-lactamase superfamily 3.70E-06
208 LI:220537.2:2000SEP08 60 1574 forward 3 sugar_tr Sugar (and other) transporter 6.40E-07
209 LI:248364.2:2000SEP08 253 594 forward 1 BTB BTB/POZ domain 3.60E-04
209 LI:248364.2:2000SEP08 1544 1612 forward 2 zf-C2H2 Zinc finger, C2H2 type 7.10E-06
210 LI:2048338.1:2000SEP08 261 410 forward 3 zf-C3HC4 Zinc finger, C3HC4 type (RING finger) 2.00E-07
211 LI:1185203.8:2000SEP08 333 431 forward 3 ank Ank repeat 2.20E-07
212 LI:021770.3:2000SEP08 287 1171 forward 2 adh_zinc Zinc-binding dehydrogenases 3.70E-08
213 LI:1185841.1:2000SEP08 705 860 forward 3 myosin_head Myosin head (motor domain) 1.80E-07
213 LI:1185841.1:2000SEP08 992 1063 forward 2 myosin_head Myosin head (motor domain) 3.20E-04
214 LI:1181710.1:2000SEP08 62 130 forward 2 zf-C2H2 Zinc finger, C2H2 type 1.10E-05
215 LI:2048959.1:2000SEP08 225 293 forward 3 zf-C2H2 Zinc finger, C2H2 type 4.10E-07
216 LI:798494.1:2000SEP08 273 464 forward 3 KRAB KRAB box 2.00E-42
216 LI:798494.1:2000SEP08 675 743 forward 3 zf-C2H2 Zinc finger, C2H2 type 4.50E-05
216 LI:798494.1:2000SEP08 794 862 forward 2 zf-C2H2 Zinc finger, C2H2 type 6.80E-04
217 LI:2049223.1:2000SEP08 130 318 forward 1 KRAB KRAB box 2.10E-42
218 LI:1177833.1:2000SEP08 78 218 forward 3 KRAB KRAB box 2.30E-17
218 LI:1177833.1:2000SEP08 999 1067 forward 3 zf-C2H2 Zinc finger, C2H2 type 1.70E-04
219 LI:2049267.1:2000SEP08 78 227 forward 3 KRAB KRAB box 3.70E-22
220 LI:1165939.1:2000SEP08 247 315 forward 1 zf-C2H2 Zinc finger, C2H2 type 2.60E-06
221 LI:1170958.1:2000SEP08 397 465 forward 1 zf-C2H2 Zinc finger, C2H2 type 3.20E-07
222 LI:1089827.1:2000SEP08 722 790 forward 2 zf-C2H2 Zinc finger, C2H2 type 3.10E-07
222 LI:1089827.1:2000SEP08 105 173 forward 3 zf-C2H2 Zinc finger, C2H2 type 9.60E-07
222 LI:1089827.1:2000SEP08 433 501 forward 1 zf-C2H2 Zinc finger, C2H2 type 2.10E-06
223 LI:792112.1:2000SEP08 200 268 forward 2 zf-C2H2 Zinc finger, C2H2 type 7.80E-05
224 LI:282219.2:2000SEP08 129 197 forward 3 zf-C2H2 Zinc finger, C2H2 type 4.10E-05
225 LI:1088010.2:2000SEP08 294 362 forward 3 zf-C2H2 Zinc finger, C2H2 type 5.80E-08
226 LI:1165276.1:2000SEP08 195 263 forward 3 zf-C2H2 Zinc finger, C2H2 type 1.80E-06
227 LI:1169524.2:2000SEP08 267 455 forward 3 KRAB KRAB box 2.70E-40
227 LI:1169524.2:2000SEP08 727 795 forward 1 zf-C2H2 Zinc finger, C2H2 type 7.90E-07
228 LI:1180255.1:2000SEP08 440 628 forward 2 KRAB KRAB box 2.30E-33
228 LI:1180255.1:2000SEP08 1027 1095 forward 1 zf-C2H2 Zinc finger, C2H2 type 2.50E-07
229 LI:1091903.1:2000SEP08 182 370 forward 2 KRAB KRAB box 6.20E-39
230 LI:1169219.1:2000SEP08 142 330 forward 1 KRAB KRAB box 3.70E-41
231 LI:2050313.1:2000SEP08 803 982 forward 2 Collagen Collagen triple helix repeat (20 copies) 9.50E-10
232 LI:209351.3:2000SEP08 561 671 forward 3 WD40 WD domain, G-beta repeat 2.20E-06
232 LI:209351.3:2000SEP08 313 423 forward 1 WD40 WD domain, G-beta repeat 1.00E-04
233 LI:119900.1:2000SEP08 179 358 forward 2 KRAB KRAB box 1.50E-34
234 LI:2052274.1:2000SEP08 193 1197 forward 1 A_deaminase Adenosine/AMP deaminase 4.00E-10
235 LI:1075502.1:2000SEP08 100 315 forward 1 OMPdecase Orotidine 5'-phosphate decarboxylase 6.30E-35
235 LI:1075502.1:2000SEP08 305 451 forward 2 OMPdecase Orotidine 5'-phosphate decarboxylase 4.70E-26
235 LI:1075502.1:2000SEP08 426 806 forward 3 OMPdecase Orotidine 5'-phosphate decarboxylase 1.80E-12
236 LI:813697.1:2000SEP08 753 821 forward 3 zf-C2H2 Zinc finger, C2H2 type 1.60E-07
237 LI:814261.1:2000SEP08 3 95 forward 3 GBP Guanylate-binding protein, N-terminal domain 4.20E-14
238 LI:775334.1:2000SEP08 112 180 forward 1 zf-C2H2 Zinc finger, C2H2 type 2.10E-08
239 LI:1180325.1:2000SEP08 495 563 forward 3 zf-C2H2 Zinc finger, C2H2 type 2.20E-07
240 LI:1183147.3:2000SEP08 82 150 forward 1 zf-C2H2 Zinc finger, C2H2 type 1.20E-04
241 LI:1175373.3:2000SEP08 191 379 forward 2 KRAB KRAB box 6.80E-45
242 LI:813757.1:2000SEP08 268 453 forward 1 KRAB KRAB box 4.30E-38
243 LI:1182979.2:2000SEP08 341 409 forward 2 zf-C2H2 Zinc finger, C2H2 type 3.60E-06
244 LI:1177823.2:2000SEP08 1232 1372 forward 2 KRAB KRAB box 3.00E-15
245 LI:1174279.1:2000SEP08 276 464 forward 3 KRAB KRAB box 1.40E-46
245 LI:1174279.1:2000SEP08 753 821 forward 3 zf-C2H2 Zinc finger, C2H2 type 7.90E-07
246 LI:1178411.1:2000SEP08 787 855 forward 1 zf-C2H2 Zinc finger, C2H2 type 1.20E-07
247 LI:1182739.1:2000SEP08 582 650 forward 3 zf-C2H2 Zinc finger, C2H2 type 6.70E-08
248 LI:234937.4:2000SEP08 61 477 forward 1 DSPc Dual specificity phosphatase, catalytic domain 8.70E-29
249 LI:1170660.1:2000SEP08 35 103 forward 2 zf-C2H2 Zinc finger, C2H2 type 8.70E-07
250 LI:1144409.1:2000SEP08 16 243 forward 1 pkinase Protein kinase domain 1.30E-10
251 LI:246290.10:2000SEP08 256 453 forward 1 rrm RNA recognition motif (a.k.a. RRM, RBD, or RNP domain) 2.10E-08
252 LI:280034.1:2000SEP08 119 652 forward 2 AAA ATPase family associated with various cellular activities (AAA) 1.80E-64
Figure imgf000100_0001
Figure imgf000101_0001
Figure imgf000101_0002
TABLE 4
SEQ ID NO: Template ID Start Stop Frame Domain Topology
48 LG:236928.1:2000SEP08 2015 2101 forward 2 TM Nin
48 LG:236928.1:2000SEP08 2501 2587 forward 2 TM Nin
48 LG:236928.1:2000SEP08 2639 2716 forward 2 TM Nin
48 LG:236928.1:2000SEP08 2786 2872 forward 2 TM Nin
48 LG:236928.1:2000SEP08 2894 2980 forward 2 TM Nin
48 LG:236928.1:2000SEP08 471 551 forward 3 TM Nin
48 LG:236928.1:2000SEP08 738 824 forward 3 TM Nin
48 LG:236928.1:2000SEP08 1287 1340 forward 3 TM Nin
48 LG:236928.1:2000SEP08 1626 1712 forward 3 TM Nin
48 LG:236928.1:2000SEP08 1890 1976 forward 3 TM Nin
48 LG:236928.1:2000SEP08 2034 2120 forward 3 TM Nin
48 LG:236928.1:2000SEP08 2118 2186 forward 3 TM Nin
48 LG:236928.1:2000SEP08 2520 2591 forward 3 TM Nin
48 LG:236928.1:2000SEP08 2763 2846 forward 3 TM Nin
48 LG:236928.1:2000SEP08 2916 2978 forward 3 TM Nin
48 LG:236928.1:2000SEP08 3003 3065 forward 3 TM Nin
49 LG:215169.2:2000SEP08 1363 1413 forward 1 TM Nin
49 LG:215169.2:2000SEP08 503 589 forward 2 TM Nout
49 LG:215169.2:2000SEP08 1727 1813 forward 2 TM Nout
49 LG:215169.2:2000SEP08 237 302 forward 3 TM Nout
49 LG:215169.2:2000SEP08 837 923 forward 3 TM Nout
49 LG:215169.2:2000SEP08 1398 1460 forward 3 TM Nout
49 LG:215169.2:2000SEP08 1470 1532 forward 3 TM Nout
49 LG:215169.2:2000SEP08 1605 1658 forward 3 TM Nout
50 LG:410726.1:2000SEP08 31 93 forward 1 TM Nout
50 LG:410726.1:2000SEP08 115 177 forward 1 TM Nout
50 LG:410726.1:2000SEP08 463 549 forward 1 TM Nout
51 LG:234372.2:2000SEP08 325 411 forward 1 TM Nin
51 LG:234372.2:2000SEP08 2492 2578 forward 2 TM Nout
51 LG:234372.2:2000SEP08 2669 2743 forward 2 TM Nout
51 LG:234372.2:2000SEP08 723 809 forward 3 TM Nin
52 LG:022629.1:2000SEP08 268 354 forward 1 TM Nout
52 LG:022629.1:2000SEP08 385 459 forward 1 TM Nout
52 LG:022629.1:2000SEP08 559 645 forward 1 TM Nout
52 LG:022629.1:2000SEP08 796 882 forward 1 TM Nout
52 LG:022629.1:2000SEP08 281 358 forward 2 TM Nin
53 LG:068682.1:2000SEP08 707 793 forward 2 TM Nout
54 LG:222335.1:2000SEP08 847 933 forward 1 TM Nin
54 LG:222335.1:2000SEP08 973 1023 forward 1 TM Nin
54 LG:222335.1:2000SEP08 1030 1101 forward 1 TM Nin
54 LG:222335.1:2000SEP08 1216 1290 forward 1 TM Nin
54 LG:222335.1:2000SEP08 23 109 forward 2 TM
54 LG:222335.1:2000SEP08 314 373 forward 2 TM
54 LG:222335.1:2000SEP08 428 514 forward 2 TM
54 LG:222335.1:2000SEP08 728 784 forward 2 TM
54 LG:222335.1:2000SEP08 794 868 forward 2 TM
54 LG:222335.1:2000SEP08 965 1051 forward 2 TM
54 LG:222335.1:2000SEP08 603 674 forward 3 TM Nin
54 LG:222335.1:2000SEP08 921 983 forward 3 TM Nin
54 LG:222335.1:2000SEP08 1014 1076 forward 3 TM Nin
Figure imgf000103_0001
Figure imgf000103_0002
Figure imgf000103_0003
66 LG:1045521.4:2000SEP08 3508 3582 forward 1 TM Nin
66 LG:1045521.4:2000SEP08 1652 1714 forward 2 TM Nin
66 LG:1045521.4:2000SEP08 2444 2500 forward 2 TM Nin
66 LG:1045521.4:2000SEP08 2627 2713 forward 2 TM Nin
66 LG:1045521.4:2000SEP08 3017 3088 forward 2 TM Nin
66 LG:1045521.4:2000SEP08 3317 3385 forward 2 TM Nin
66 LG:1045521.4:2000SEP08 1530 1604 forward 3 TM Nin
66 LG:1045521.4:2000SEP08 2496 2549 forward 3 TM Nin
66 LG:1045521.4:2000SEP08 2931 3017 forward 3 TM Nin
66 LG:1045521.4:2000SEP08 3267 3341 forward 3 TM Nin
67 LG:275876.1:2000SEP08 775 849 forward 1 TM Nin
67 LG:275876.1:2000SEP08 949 1002 forward 1 TM Nin
67 LG:275876.1:2000SEP08 842 928 forward 2 TM Nout
67 LG:275876.1:2000SEP08 777 842 forward 3 TM Nin
68 LG:475127.7:2000SEP08 137 223 forward 2 TM Nin
69 LG:157263.1:2000SEP08 175 249 forward 1 TM Nin
69 LG:157263.1:2000SEP08 295 345 forward 1 TM Nin
69 LG:157263.1:2000SEP08 406 492 forward 1 TM Nin
69 LG:157263.1:2000SEP08 793 867 forward 1 TM Nin
69 LG:157263.1:2000SEP08 889 951 forward 1 TM Nin
69 LG:157263.1:2000SEP08 1081 1143 forward 1 TM Nin
69 LG:157263.1:2000SEP08 1168 1230 forward 1 TM Nin
69 LG:157263.1:2000SEP08 1255 1341 forward 1 TM Nin
69 LG:157263.1:2000SEP08 482 544 forward 2 TM Nout
69 LG:157263.1:2000SEP08 563 625 forward 2 TM Nout
69 LG:157263.1:2000SEP08 180 266 forward 3 TM Nout
70 LG:247382.7:2000SEP08 808 894 forward 1 TM
70 LG:247382.7:2000SEP08 650 709 forward 2 TM Nout
70 LG:247382.7:2000SEP08 1412 1495 forward 2 TM Nout
71 LG:197367.5:2000SEP08 454 510 forward 1 TM Nout
72 LG:218090.5:2000SEP08 85 156 forward 1 TM Nout
72 LG:218090.5:2000SEP08 417 494 forward 3 TM Nout
73 LG:216612.4:2000SEP08 1615 1701 forward 1 TM Nin
73 LG:216612.4:2000SEP08 1942 2007 forward 1 TM Nin
73 LG:216612.4:2000SEP08 2182 2268 forward 1 TM Nin
73 LG:216612.4:2000SEP08 2413 2487 forward 1 TM Nin
73 LG:216612.4:2000SEP08 1583 1660 forward 2 TM Nin
73 LG:216612.4:2000SEP08 2114 2176 forward 2 TM Nin
73 LG:216612.4:2000SEP08 2216 2278 forward 2 TM Nin
73 LG:216612.4:2000SEP08 2339 2425 forward 2 TM Nin
73 LG:216612.4:2000SEP08 2483 2542 forward 2 TM Nin
73 LG:216612.4:2000SEP08 120 206 forward 3 TM Nin
73 LG:216612.4:2000SEP08 234 284 forward 3 TM Nin
73 LG:216612.4:2000SEP08 327 413 forward 3 TM Nin
73 LG:216612.4:2000SEP08 444 530 forward 3 TM Nin
73 LG:216612.4:2000SEP08 810 896 forward 3 TM Nin
73 LG:216612.4:2000SEP08 942 1028 forward 3 TM Nin
73 LG:216612.4:2000SEP08 1644 1730 forward 3 TM Nin
73 LG:216612.4:2000SEP08 1950 2012 forward 3 TM Nin
73 LG:216612.4:2000SEP08 2037 2099 forward 3 TM Nin
Figure imgf000105_0001
Figure imgf000105_0002
Figure imgf000105_0003
Figure imgf000105_0004
77 LG:389870.1:2000SEP08 1055 1120 forward 2 TM Nin
77 LG:389870.1:2000SEP08 1059 1142 forward 3 TM Nin
78 LG:1387485.6:2000SEP08 1213 1275 forward 1 TM Nin
78 LG:1387485.6:2000SEP08 206 292 forward 2 TM Nin
78 LG:1387485.6:2000SEP08 824 886 forward 2 TM Nin
78 LG:1387485.6:2000SEP08 914 976 forward 2 TM Nin
78 LG:1387485.6:2000SEP08 1244 1321 forward 2 TM Nin
78 LG:1387485.6:2000SEP08 396 473 forward 3 TM Nout
78 LG:1387485.6:2000SEP08 537 602 forward 3 TM Nout
78 LG:1387485.6:2000SEP08 660 746 forward 3 TM Nout
78 LG:1387485.6:2000SEP08 786 872 forward 3 TM Nout
78 LG:1387485.6:2000SEP08 1164 1226 forward 3 TM Nout
78 LG:1387485.6:2000SEP08 1245 1307 forward 3 TM Nout
79 LG:230151.1:2000SEP08 1774 1860 forward 1 TM Nin
79 LG:230151.1:2000SEP08 1091 1165 forward 2 TM Nout
79 LG:230151.1:2000SEP08 1193 1246 forward 2 TM Nout
79 LG:230151.1:2000SEP08 1757 1843 forward 2 TM Nout
80 LG:215158.5:2000SEP08 199 267 forward 1 TM Nout
80 LG:215158.5:2000SEP08 820 873 forward 1 TM Nout
80 LG:215158.5:2000SEP08 892 945 forward 1 TM Nout
80 LG:215158.5:2000SEP08 908 985 forward 2 TM Nout
80 LG:215158.5:2000SEP08 1433 1504 forward 2 TM Nout
80 LG:215158.5:2000SEP08 447 533 forward 3 TM
81 LG:235840.1:2000SEP08 265 342 forward 1 TM Nout
81 LG:235840.1:2000SEP08 511 579 forward 1 TM Nout
81 LG:235840.1:2000SEP08 577 639 forward 1 TM Nout
81 LG:235840.1:2000SEP08 730 786 forward 1 TM Nout
81 LG:235840.1:2000SEP08 2011 2082 forward 1 TM Nout
81 LG:235840.1:2000SEP08 2110 2172 forward 1 TM Nout
81 LG:235840.1:2000SEP08 2326 2406 forward 1 TM Nout
81 LG:235840.1:2000SEP08 2416 2502 forward 1 TM Nout
81 LG:235840.1:2000SEP08 116 178 forward 2 TM Nout
81 LG:235840.1:2000SEP08 191 253 forward 2 TM Nout
81 LG:235840.1:2000SEP08 500 586 forward 2 TM Nout
81 LG:235840.1:2000SEP08 731 817 forward 2 TM Nout
81 LG:235840.1:2000SEP08 848 916 forward 2 TM Nout
81 LG:235840.1:2000SEP08 956 1030 forward 2 TM Nout
81 LG:235840.1:2000SEP08 1973 2050 forward 2 TM Nout
81 LG:235840.1:2000SEP08 192 278 forward 3 TM Nout
81 LG:235840.1:2000SEP08 516 590 forward 3 TM Nout
81 LG:235840.1:2000SEP08 711 797 forward 3 TM Nout
81 LG:235840.1:2000SEP08 819 905 forward 3 TM Nout
81 LG:235840.1:2000SEP08 1446 1529 forward 3 TM Nout
81 LG:235840.1:2000SEP08 1893 1976 forward 3 TM Nout
81 LG:235840.1:2000SEP08 2298 2360 forward 3 TM Nout
82 LG:350272.1:2000SEP08 1675 1746 forward 1 TM Nin
82 LG:350272.1:2000SEP08 2374 2436 forward 1 TM Nin
82 LG:350272.1:2000SEP08 104 190 forward 2 TM Nin
82 LG:350272.1:2000SEP08 2264 2332 forward 2 TM Nin
82 LG:350272.1:2000SEP08 2369 2440 forward 2 TM Nin
Figure imgf000107_0001
Figure imgf000107_0002
Figure imgf000107_0003
Figure imgf000107_0004
Figure imgf000108_0001
Figure imgf000108_0002
Figure imgf000108_0003
116 LG:885747.2:2000SEP08 125 205 forward 2 TM Nin
116 LG:885747.2:2000SEP08 135 191 forward 3 TM Nout
116 LG:885747.2:2000SEP08 189 239 forward 3 TM Nout
117 LG:1140501.1:2000SEP08 622 708 forward 1 TM Nout
117 LG:1140501.1:2000SEP08 844 918 forward 1 TM Nout
117 LG:1140501.1:2000SEP08 1138 1206 forward 1 TM Nout
117 LG:1140501.1:2000SEP08 86 169 forward 2 TM Nout
117 LG:1140501.1:2000SEP08 221 301 forward 2 TM Nout
117 LG:1140501.1:2000SEP08 617 703 forward 2 TM Nout
117 LG:1140501.1:2000SEP08 606 677 forward 3 TM Nin
118 LG:001239.1:2000SEP08 1048 1122 forward 1 TM Nout
118 LG:001239.1:2000SEP08 669 755 forward 3 TM Nout
118 LG:001239.1:2000SEP08 1911 1976 forward 3 TM Nout
119 LG:018980.1:2000SEP08 124 204 forward 1 TM Nout
119 LG:018980.1:2000SEP08 944 997 forward 2 TM Nout
119 LG:018980.1:2000SEP08 405 464 forward 3 TM Nout
119 LG:018980.1:2000SEP08 900 986 forward 3 TM Nout
119 LG:018980.1:2000SEP08 1017 1103 forward 3 TM Nout
120 LG:1083120.3:2000SEP08 214 291 forward 1 TM Nout
120 LG:1083120.3:2000SEP08 233 319 forward 2 TM Nout
120 LG:1083120.3:2000SEP08 252 320 forward 3 TM Nin
121 LG:233258.3:2000SEP08 58 141 forward 1 TM
121 LG:233258.3:2000SEP08 1783 1842 forward 1 TM
121 LG:233258.3:2000SEP08 2248 2322 forward 1 TM
121 LG:233258.3:2000SEP08 4522 4596 forward 1 TM
121 LG:233258.3:2000SEP08 4208 4294 forward 2 TM Nout
121 LG:233258.3:2000SEP08 4478 4534 forward 2 TM Nout
121 LG:233258.3:2000SEP08 390 476 forward 3 TM Nin
121 LG:233258.3:2000SEP08 2766 2852 forward 3 TM Nin
122 LG:999062.1:2000SEP08 455 508 forward 2 TM Nin
122 LG:999062.1:2000SEP08 510 596 forward 3 TM Nin
123 LG:887776.1:2000SEP08 14 91 forward 2 TM Nout
124 LG:1400301.2:2000SEP08 445 531 forward 1 TM Nout
124 LG:1400301.2:2000SEP08 456 518 forward 3 TM Nout
125 LG:1329362.1:2000SEP08 18 104 forward 3 TM Nout
126 LG:1096498.1:2000SEP08 137 199 forward 2 TM Nout
126 LG:1096498.1:2000SEP08 201 269 forward 3 TM Nout
126 LG:1096498.1:2000SEP08 321 371 forward 3 TM Nout
127 LG:1096337.1:2000SEP08 625 711 forward 1 TM Nout
127 LG:1096337.1:2000SEP08 500 553 forward 2 TM
128 LG:1400579.1:2000SEP08 797 883 forward 2 TM Nout
128 LG:1400579.1:2000SEP08 9 86 forward 3 TM Nout
128 LG:1400579.1:2000SEP08 669 755 forward 3 TM Nout
129 LG:1080091.1:2000SEP08 67 114 forward 1 TM Nout
129 LG:1080091.1:2000SEP08 435 497 forward 3 TM Nout
130 LG:1082203.1:2000SEP08 1438 1521 forward 1 TM Nin
130 LG:1082203.1:2000SEP08 155 217 forward 2 TM Nout
130 LG:1082203.1:2000SEP08 272 358 forward 2 TM Nout
131 LG:1084051.1:2000SEP08 301 366 forward 1 TM Nout
131 LG:1084051.1:2000SEP08 934 1017 forward 1 TM Nout
131 LG:1084051.1:2000SEP08 1072 1140 forward 1 TM Nout
131 LG:1084051.1:2000SEP08 875 961 forward 2 TM Nout
131 LG:1084051.1:2000SEP08 882 968 forward 3 TM Nin
131 LG:1084051.1:2000SEP08 1071 1151 forward 3 TM Nin
132 LG:1082393.1:2000SEP08 505 579 forward 1 TM Nin
132 LG:1082393.1:2000SEP08 1710 1784 forward 3 TM Nin
133 LG:1086183.1:2000SEP08 1093 1179 forward 1 TM Nout
133 LG:1086183.1:2000SEP08 1166 1252 forward 2 TM Nin
133 LG:1086183.1:2000SEP08 15 86 forward 3 TM
133 LG:1086183.1:2000SEP08 381 437 forward 3 TM
133 LG:1086183.1:2000SEP08 1194 1280 forward 3 TM
134 LG:1090268.1:2000SEP08 1882 1953 forward 1 TM Nout
134 LG:1090268.1:2000SEP08 1969 2055 forward 1 TM Nout
134 LG:1090268.1:2000SEP08 1670 1729 forward 2 TM Nout
134 LG:1090268.1:2000SEP08 1853 1939 forward 2 TM Nout
134 LG:1090268.1:2000SEP08 1997 2083 forward 2 TM Nout
134 LG:1090268.1:2000SEP08 1485 1550 forward 3 TM Nin
134 LG:1090268.1:2000SEP08 1869 1943 forward 3 TM Nin
134 LG:1090268.1:2000SEP08 1962 2048 forward 3 TM Nin
135 LG:1400597.5:2000SEP08 134 199 forward 2 TM Nin
136 LG:1080307.2:2000SEP08 275 346 forward 2 TM Nout
136 LG:1080307.2:2000SEP08 347 403 forward 2 TM Nout
136 LG:1080307.2:2000SEP08 183 269 forward 3 TM Nin
136 LG:1080307.2:2000SEP08 303 371 forward 3 TM Nin
137 LG:1400603.2:2000SEP08 792 878 forward 3 TM Nin
137 LG:1400603.2:2000SEP08 903 971 forward 3 TM Nin
137 LG:1400603.2:2000SEP08 1068 1151 forward 3 TM Nin
138 LG:1052984.1:2000SEP08 496 582 forward 1 TM Nin
138 LG:1052984.1:2000SEP08 509 595 forward 2 TM Nout
138 LG:1052984.1:2000SEP08 495 581 forward 3 TM
139 LG:1091259.1:2000SEP08 799 885 forward 1 TM Nin
140 LG:1082263.2:2000SEP08 1390 1458 forward 1 TM Nin
140 LG:1082263.2:2000SEP08 1558 1644 forward 1 TM Nin
140 LG:1082263.2:2000SEP08 83 169 forward 2 TM
140 LG:1082263.2:2000SEP08 1526 1612 forward 2 TM
140 LG:1082263.2:2000SEP08 1578 1664 forward 3 TM Nin
141 LG:1048604.2:2000SEP08 562 618 forward 1 TM Nin
141 LG:1048604.2:2000SEP08 697 768 forward 1 TM Nin
141 LG:1048604.2:2000SEP08 856 930 forward 1 TM Nin
141 LG:1048604.2:2000SEP08 332 418 forward 2 TM Nin
141 LG:1048604.2:2000SEP08 689 775 forward 2 TM Nin
141 LG:1048604.2:2000SEP08 1115 1192 forward 2 TM Nin
141 LG:1048604.2:2000SEP08 483 557 forward 3 TM Nin
141 LG:1048604.2:2000SEP08 570 638 forward 3 TM Nin
141 LG:1048604.2:2000SEP08 678 743 forward 3 TM Nin
141 LG:1048604.2:2000SEP08 1113 1199 forward 3 TM Nin
142 LG:1085254.3:2000SEP08 331 378 forward 1 TM Nin
142 LG:1085254.3:2000SEP08 204 290 forward 3 TM Nout
143 LG:1400606.2:2000SEP08 1096 1176 forward 1 TM Nout
143 LG:1400606.2:2000SEP08 794 880 forward 2 TM Nin
143 LG:1400606.2:2000SEP08 1031 1117 forward 2 TM Nin
143 LG:1400606.2:2000SEP08 93 173 forward 3 TM Nin
143 LG:1400606.2:2000SEP08 765 851 forward 3 TM Nin
144 LG:1090358.2:2000SEP08 250 336 forward 1 TM Nin
144 LG:1090358.2:2000SEP08 758 844 forward 2 TM Nout
144 LG:1090358.2:2000SEP08 848 919 forward 2 TM Nout
144 LG:1090358.2:2000SEP08 974 1060 forward 2 TM Nout
145 LG:1079064.2:2000SEP08 862 936 forward 1 TM Nout
146 LG:1076866.1:2000SEP08 2113 2187 forward 1 TM Nin
146 LG:1076866.1:2000SEP08 2111 2170 forward 2 TM
146 LG:1076866.1:2000SEP08 50 V 587 forward 3 TM Nin
146 LG:1076866.1:2000SEP08 2148 2234 forward 3 TM Nin
147 LG:969359.1:2000SEP08 256 303 forward 1 TM Nout
147 LG:969359.1:2000SEP08 276 350 forward 3 TM Nout
148 LG:366783.1:2000SEP08 1060 1143 forward 1 TM Nin
148 LG:366783.1:2000SEP08 470 550 forward 2 TM
148 LG:366783.1:2000SEP08 1050 1109 forward 3 TM Nin
149 LG:332176.3:2000SEP08 442 495 forward 1 TM Nin
149 LG:332176.3:2000SEP08 209 295 forward 2 TM Nin
149 LG:332176.3:2000SEP08 234 299 forward 3 TM Nin
149 LG:332176.3:2000SEP08 792 851 forward 3 TM Nin
149 LG:332176.3:2000SEP08 876 938 forward 3 TM Nin
149 LG:332176.3:2000SEP08 966 1028 forward 3 TM Nin
150 LG:994938.1:2000SEP08 562 618 forward 1 TM Nout
150 LG:994938.1:2000SEP08 287 343 forward 2 TM Nout
150 LG:994938.1:2000SEP08 512 598 forward 2 TM Nout
150 LG:994938.1:2000SEP08 279 356 forward 3 TM Nout
150 LG:994938.1:2000SEP08 474 557 forward 3 TM Nout
151 LG:982800.1:2000SEP08 25 81 forward 1 TM Nout
151 LG:982800.1:2000SEP08 1708 1782 forward 1 TM Nout
151 LG:982800.1:2000SEP08 2305 2391 forward 1 TM Nout
151 LG:982800.1:2000SEP08 1658 1744 forward 2 TM Nin
151 LG:982800.1:2000SEP08 1880 1927 forward 2 TM Nin
151 LG:982800.1:2000SEP08 2255 2341 forward 2 TM Nin
151 LG:982800.1:2000SEP08 1614 1697 forward 3 TM Nin
151 LG:982800.1:2000SEP08 2220 2306 forward 3 TM Nin
152 LG:977850.7:2000SEP08 58 144 forward 1 TM Nout
152 LG:977850.7:2000SEP08 187 237 forward 1 TM Nout
152 LG:977850.7:2000SEP08 35 112 forward 2 TM Nout
152 LG:977850.7:2000SEP08 48 116 forward 3 TM Nin
153 LG:234748.2:2000SEP08 25 99 forward 1 TM Nout
153 LG:234748.2:2000SEP08 601 663 forward 1 TM Nout
153 LG:234748.2:2000SEP08 679 741 forward 1 TM Nout
153 LG:234748.2:2000SEP08 871 957 forward 1 TM Nout
153 LG:234748.2:2000SEP08 1162 1230 forward 1 TM Nout
153 LG:234748.2:2000SEP08 1237 1302 forward 1 TM Nout
153 LG:234748.2:2000SEP08 1579 1641 forward 1 TM Nout
153 LG:234748.2:2000SEP08 1984 2070 forward 1 TM Nout
153 LG:234748.2:2000SEP08 2110 2196 forward 1 TM Nout
153 LG:234748.2:2000SEP08 2308 2355 forward 1 TM Nout
Figure imgf000112_0001
Figure imgf000112_0002
Figure imgf000113_0001
Figure imgf000113_0002
Figure imgf000113_0003
191 LI:027209.1:2000SEP08 1447 1533 forward 1 TM Nout
191 LI:027209.1:2000SEP08 413 475 forward 2 TM Nout
191 LI:027209.1:2000SEP08 527 601 forward 2 TM Nout
191 LI:027209.1:2000SEP08 659 739 forward 2 TM Nout
191 LI:027209.1:2000SEP08 785 847 forward 2 TM Nout
191 LI:027209.1:2000SEP08 1178 1234 forward 2 TM Nout
191 LI:027209.1:2000SEP08 1232 1309 forward 2 TM Nout
191 LI:027209.1:2000SEP08 1379 1465 forward 2 TM Nout
191 LI:027209.1:2000SEP08 510 563 forward 3 TM Nin
191 LI:027209.1:2000SEP08 882 938 forward 3 TM Nin
191 LI:027209.1:2000SEP08 945 1016 forward 3 TM Nin
191 LI:027209.1:2000SEP08 1041 1112 forward 3 TM Nin
191 LI:027209.1:2000SEP08 1146 1208 forward 3 TM Nin
191 LI:027209.1:2000SEP08 1224 1286 forward 3 TM Nin
192 LI:108819.1:2000SEP08 196 264 forward 1 TM Nin
192 LI:108819.1:2000SEP08 203 271 forward 2 TM
192 LI:108819.1:2000SEP08 290 343 forward 2 TM
193 LI:021759.1:2000SEP08 133 186 forward 1 TM Nout
193 LI:021759.1:2000SEP08 59 115 forward 2 TM Nout
193 LI:021759.1:2000SEP08 495 581 forward 3 TM Nin
193 LI:021759.1:2000SEP08 981 1067 forward 3 TM Nin
194 LI:1165967.1:2000SEP08 323 403 forward 2 TM Nout
195 LI:1166315.1:2000SEP08 693 749 forward 3 TM Nout
196 LI:204626.1:2000SEP08 19 99 forward 1 TM Nout
197 LI:801140.1:2000SEP08 421 498 forward 1 TM Nin
198 LI:286639.1:2000SEP08 127 213 forward 1 TM Nout
198 LI:286639.1:2000SEP08 982 1068 forward 1 TM Nout
198 LI:286639.1:2000SEP08 1879 1944 forward 1 TM Nout
198 LI:286639.1:2000SEP08 62 148 forward 2 TM Nout
198 LI:286639.1:2000SEP08 965 1027 forward 2 TM Nout
198 LI:286639.1:2000SEP08 1061 1123 forward 2 TM Nout
198 LI:286639.1:2000SEP08 1589 1663 forward 2 TM Nout
198 LI:286639.1:2000SEP08 1787 1873 forward 2 TM Nout
198 LI:286639.1:2000SEP08 267 320 forward 3 TM Nout
198 LI:286639.1:2000SEP08 1794 1847 forward 3 TM Nout
199 LI:288905.4:2000SEP08 868 927 forward 1 TM Nout
199 LI:288905.4:2000SEP08 1552 1638 forward 1 TM Nout
199 LI:288905.4:2000SEP08 1913 1975 forward 2 TM Nout
199 LI:288905.4:2000SEP08 2000 2062 forward 2 TM Nout
199 LI:288905.4:2000SEP08 99 158 forward 3 TM Nin
200 LI:332161.1:2000SEP08 1507 1587 forward 1 TM Nin
200 LI:332161.1:2000SEP08 1663 1743 forward 1 TM Nin
200 LI:332161.1:2000SEP08 2314 2400 forward 1 TM Nin
200 LI:332161.1:2000SEP08 2890 2940 forward 1 TM Nin
200 LI:332161.1:2000SEP08 776 835 forward 2 TM
200 LI:332161.1:2000SEP08 1340 1393 forward 2 TM
200 LI:332161.1:2000SEP08 1493 1579 forward 2 TM
200 LI:332161.1:2000SEP08 1637 1717 forward 2 TM
200 LI:332161.1:2000SEP08 1880 1966 forward 2 TM
200 LI:332161.1:2000SEP08 2075 2161 forward 2 TM
200 LI:332161.1:2000SEP08 2744 2818 forward 2 TM
200 LI:332161.1:2000SEP08 258 311 forward 3 TM Nin
200 LI:332161.1:2000SEP08 1524 1610 forward 3 TM Nin
200 LI:332161.1:2000SEP08 2025 2111 forward 3 TM Nin
200 LI:332161.1:2000SEP08 2289 2375 forward 3 TM Nin
201 LI:184867.1:2000SEP08 1985 2041 forward 2 TM Nin
201 LI:184867.1:2000SEP08 771 854 forward 3 TM Nout
202 LI:229932.4:2000SEP08 76 162 forward 1 TM Nout
202 LI:229932.4:2000SEP08 229 300 forward 1 TM Nout
202 LI:229932.4:2000SEP08 1249 1329 forward 1 TM Nout
202 LI:229932.4:2000SEP08 1438 1524 forward 1 TM Nout
202 LI:229932.4:2000SEP08 1678 1764 forward 1 TM Nout
202 LI:229932.4:2000SEP08 68 142 forward 2 TM Nout
202 LI:229932.4:2000SEP08 215 271 forward 2 TM Nout
202 LI:229932.4:2000SEP08 734 820 forward 2 TM Nout
202 LI:229932.4:2000SEP08 1220 1291 forward 2 TM Nout
202 LI:229932.4:2000SEP08 1565 1618 forward 2 TM Nout
202 LI:229932.4:2000SEP08 60 146 forward 3 TM
202 LI:229932.4:2000SEP08 348 425 forward 3 TM
202 LI:229932.4:2000SEP08 762 848 forward 3 TM
202 LI:229932.4:2000SEP08 1239 1325 forward 3 TM
202 LI:229932.4:2000SEP08 1401 1487 forward 3 TM
202 LI:229932.4:2000SEP08 1629 1703 forward 3 TM
203 LI:1189932.1:2000SEP08 277 363 forward 1 TM Nout
203 LI:1189932.1:2000SEP08 1141 1203 forward 1 TM Nout
203 LI:1189932.1:2000SEP08 1216 1278 forward 1 TM Nout
203 LI:1189932.1:2000SEP08 1411 1497 forward 1 TM Nout
203 LI:1189932.1:2000SEP08 1642 1707 forward 1 TM Nout
203 LI:1189932.1:2000SEP08 1795 1881 forward 1 TM Nout
203 LI:1189932.1:2000SEP08 2125 2211 forward 1 TM Nout
203 LI:1189932.1:2000SEP08 89 151 forward 2 TM Nin
203 LI:1189932.1:2000SEP08 218 304 forward 2 TM Nin
203 LI:1189932.1:2000SEP08 320 388 forward 2 TM Nin
203 LI:1189932.1:2000SEP08 386 451 forward 2 TM Nin
203 LI:1189932.1:2000SEP08 572 643 forward 2 TM Nin
203 LI:1189932.1:2000SEP08 1142 1222 forward 2 TM Nin
203 LI:1189932.1:2000SEP08 1412 1498 forward 2 TM Nin
203 LI:1189932.1:2000SEP08 1928 1990 forward 2 TM Nin
203 LI:1189932.1:2000SEP08 2021 2107 forward 2 TM Nin
203 LI:1189932.1:2000SEP08 2141 2227 forward 2 TM Nin
203 LI:1189932.1:2000SEP08 522 608 forward 3 TM Nout
203 LI:1189932.1:2000SEP08 858 944 forward 3 TM Nout
203 LI:1189932.1:2000SEP08 1161 1247 forward 3 TM Nout
203 LI:1189932.1:2000SEP08 1992 2066 forward 3 TM Nout
203 LI:1189932.1:2000SEP08 2139 2210 forward 3 TM Nout
204 LI:1076689.1:2000SEP08 43 114 forward 1 TM Nin
204 LI:1076689.1:2000SEP08 326 373 forward 2 TM Nin
205 LI:415181.2:2000SEP08 1033 1113 forward 1 TM Nin
205 LI:415181.2:2000SEP08 1363 1434 forward 1 TM Nin
205 LI:415181.2:2000SEP08 914 1000 forward 2 TM
206 LI:296358.1:2000SEP08 760 819 forward 1 TM Nout
206 LI:296358.1:2000SEP08 997 1068 forward 1 TM Nout
206 LI:296358.1:2000SEP08 1147 1233 forward 1 TM Nout
206 LI:296358.1:2000SEP08 197 247 forward 2 TM Nout
206 LI:296358.1:2000SEP08 356 442 forward 2 TM Nout
207 LI:205186.3:2000SEP08 644 697 forward 2 TM Nin
207 LI:205186.3:2000SEP08 812 877 forward 2 TM Nin
208 LI:220537.2:2000SEP08 595 681 forward 1 TM Nin
208 LI:220537.2:2000SEP08 1429 1515 forward 1 TM Nin
208 LI:220537.2:2000SEP08 245 331 forward 2 TM Nout
208 LI:220537.2:2000SEP08 362 415 forward 2 TM Nout
208 LI:220537.2:2000SEP08 1058 1129 forward 2 TM Nout
208 LI:220537.2:2000SEP08 1136 1222 forward 2 TM Nout
208 LI:220537.2:2000SEP08 1472 1555 forward 2 TM Nout
208 LI:220537.2:2000SEP08 1820 1897 forward 2 TM Nout
208 LI:220537.2:2000SEP08 216 299 forward 3 TM Nout
208 LI:220537.2:2000SEP08 447 509 forward 3 TM Nout
208 LI:220537.2:2000SEP08 522 584 forward 3 TM Nout
208 LI:220537.2:2000SEP08 624 686 forward 3 TM Nout
208 LI:220537.2:2000SEP08 702 764 forward 3 TM Nout
208 LI:220537.2:2000SEP08 1467 1553 forward 3 TM Nout
208 LI:220537.2:2000SEP08 1806 1892 forward 3 TM Nout
209 LI:248364.2:2000SEP08 439 495 forward 1 TM Nout
210 LI:2048338.1:2000SEP08 268 354 forward 1 TM Nin
210 LI:2048338.1:2000SEP08 385 459 forward 1 TM Nin
210 LI:2048338.1:2000SEP08 541 591 forward 1 TM Nin
210 LI:2048338.1:2000SEP08 281 358 forward 2 TM Nin
211 LI:1185203.8:2000SEP08 73 135 forward 1 TM Nout
211 LI:1185203.8:2000SEP08 148 210 forward 1 TM Nout
211 LI:1185203.8:2000SEP08 226 294 forward 1 TM Nout
212 LI:021770.3:2000SEP08 593 673 forward 2 TM Nout
212 LI:021770.3:2000SEP08 348 434 forward 3 TM Nin
212 LI:021770.3:2000SEP08 462 539 forward 3 TM Nin
212 LI:021770.3:2000SEP08 564 641 forward 3 TM Nin
212 LI:021770.3:2000SEP08 861 944 forward 3 TM Nin
213 LI:1185841.1:2000SEP08 1132 1182 forward 1 TM Nout
213 LI:1185841.1:2000SEP08 2434 2520 forward 1 TM Nout
213 LI:1185841.1:2000SEP08 965 1030 forward 2 TM Nin
213 LI:1185841.1:2000SEP08 2381 2437 forward 2 TM Nin
214 LI:1181710.1:2000SEP08 221 307 forward 2 TM Nout
215 LI:2048959.1:2000SEP08 262 318 forward 1 TM Nout
216 LI:798494.1:2000SEP08 595 657 forward 1 TM Nout
216 LI:798494.1:2000SEP08 670 732 forward 1 TM Nout
216 LI:798494.1:2000SEP08 215 268 forward 2 TM Nout
216 LI:798494.1:2000SEP08 12 62 forward 3 TM Nout
216 LI:798494.1:2000SEP08 60 113 forward 3 TM Nout
217 LI:2049223.1:2000SEP08 436 507 forward 1 TM Nin
217 LI:2049223.1:2000SEP08 416 484 forward 2 TM Nout
218 LI:1177833.1:2000SEP08 712 798 forward 1 TM Nin
219 LI:2049267.1:2000SEP08 215 289 forward 2 TM Nout
219 LI:2049267.1:2000SEP08 240 323 forward 3 TM Nout
220 LI:1165939.1:2000SEP08 554 640 forward 2 TM
221 LI:1170958.1:2000SEP08 326 376 forward 2 TM Nout
222 LI:1089827.1:2000SEP08 1507 1584 forward 1 TM Nin
223 LI:792112.1:2000SEP08 818 892 forward 2 TM Nin
223 LI:792112.1:2000SEP08 9 86 forward 3 TM
223 LI:792112.1:2000SEP08 669 755 forward 3 TM
223 LI:792112.1:2000SEP08 783 860 forward 3 TM
224 LI:282219.2:2000SEP08 661 732 forward 1 TM Nout
224 LI:282219.2:2000SEP08 491 577 forward 2 TM Nin
224 LI:282219.2:2000SEP08 659 745 forward 2 TM Nin
224 LI:282219.2:2000SEP08 408 458 forward 3 TM Nin
225 LI:1088010.2:2000SEP08 289 375 forward 1 TM Nin
225 LI:1088010.2:2000SEP08 716 793 forward 2 TM Nin
225 LI:1088010.2:2000SEP08 1229 1315 forward 2 TM Nin
225 LI:1088010.2:2000SEP08 903 980 forward 3 TM Nout
225 LI:1088010.2:2000SEP08 1002 1088 forward 3 TM Nout
225 LI:1088010.2:2000SEP08 1185 1247 forward 3 TM Nout
225 LI:1088010.2:2000SEP08 1272 1334 forward 3 TM Nout
226 LI:1165276.1:2000SEP08 301 363 forward 1 TM Nin
226 LI:1165276.1:2000SEP08 882 968 forward 3 TM Nout
227 LI:1169524.2:2000SEP08 623 709 forward 2 TM
227 LI:1169524.2:2000SEP08 713 778 forward 2 TM
227 LI:1169524.2:2000SEP08 842 928 forward 2 TM
228 LI:1180255.1:2000SEP08 731 784 forward 2 TM Nin
228 LI:1180255.1:2000SEP08 872 931 forward 2 TM Nin
228 LI:1180255.1:2000SEP08 1199 1285 forward 2 TM Nin
229 LI:1091 03.1:2000SEP08 454 540 forward 1 TM Nout
230 LI:1169219.1:2000SEP08 496 582 forward 1 TM Nin
230 LI:1169219.1:2000SEP08 509 595 forward 2 TM Nout
230 LI:1169219.1:2000SEP08 495 581 forward 3 TM
231 LI:2050313.1:2000SEP08 2605 2664 forward 1 TM Nin
231 LI:2050313.1:2000SEP08 3229 3315 forward 1 TM Nin
231 LI:2050313.1:2000SEP08 3412 3498 forward 1 TM Nin
231 LI:2050313.1:2000SEP08 1886 1954 forward 2 TM Nin
231 LI:2050313.1:2000SEP08 2654 2731 forward 2 TM Nin
231 LI:2050313.1:2000SEP08 2924 2986 forward 2 TM Nin
231 LI:2050313.1:2000SEP08 3029 3091 forward 2 TM Nin
231 LI:2050313.1:2000SEP08 3482 3568 forward 2 TM Nin
231 LI:2050313.1:2000SEP08 855 920 forward 3 TM Nout
231 LI:2050313.1:2000SEP08 2445 2531 forward 3 TM Nout
231 LI:2050313.1:2000SEP08 2646 2708 forward 3 TM Nout
231 LI:2050313.1:2000SEP08 2721 2783 forward 3 TM Nout
231 LI:2050313.1:2000SEP08 3114 3170 forward 3 TM Nout
231 LI:2050313.1:2000SEP08 3504 3590 forward 3 TM Nout
232 LI:209351.3:2000SEP08 586 645 forward 1 TM Nin
232 LI:209351.3:2000SEP08 1945 2025 forward 1 TM Nin
232 LI:209351.3:2000SEP08 353 415 forward 2 TM
232 LI:209351.3:2000SEP08 665 718 forward 2 TM
232 LI:209351.3:2000SEP08 1313 1387 forward 2 TM
Figure imgf000118_0001
244 LI:1177823.2:2000SEP08 311 370 forward 2 TM Nin
245 LI:1174279.1:2000SEP08 634 720 forward 1 TM Nout
245 LI:1174279.1:2000SEP08 724 789 forward 1 TM Nout
246 LI:1178411.1:2000SEP08 1357 1443 forward 1 TM
246 LI:1178411.1:2000SEP08 1290 1376 forward 3 TM Nin
246 LI:1178411.1:2000SEP08 1419 1472 forward 3 TM Nin
247 LI:1182739.1:2000SEP08 853 939 forward 1 TM Nin
247 LI:1182739.1:2000SEP08 1900 1986 forward 1 TM Nin
247 LI:1182739.1:2000SEP08 2131 2217 forward 1 TM Nin
247 LI:1182739.1:2000SEP08 2341 2403 forward 1 TM Nin
247 LI:1182739.1:2000SEP08 2446 2508 forward 1 TM Nin
247 LI:1182739.1:2000SEP08 803 859 forward 2 TM Nin
247 LI:1182739.1:2000SEP08 1268 1351 forward 2 TM Nin
247 LI:1182739.1:2000SEP08 1445 1513 forward 2 TM Nin
247 LI:1182739.1:2000SEP08 1796 1882 forward 2 TM Nin
247 LI:1182739.1:2000SEP08 2090 2176 forward 2 TM Nin
247 LI:1182739.1:2000SEP08 2267 2323 forward 2 TM Nin
247 LI:1182739.1:2000SEP08 2363 2449 forward 2 TM Nin
247 LI:1182739.1:2000SEP08 1182 1244 forward 3 TM Nout
247 LI:1182739.1:2000SEP08 1254 1316 forward 3 TM Nout
247 LI:1182739.1:2000SEP08 2100 2156 forward 3 TM Nout
247 LI:1182739.1:2000SEP08 2301 2387 forward 3 TM Nout
248 LI:234937.4:2000SEP08 301 387 forward 1 TM Nout
249 LI:1170660.1:2000SEP08 604 690 forward 1 TM Nin
249 LI:1170660.1:2000SEP08 757 843 forward 1 TM Nin
249 LI:1170660.1:2000SEP08 937 990 forward 1 TM Nin
249 LI:1170660.1:2000SEP08 1111 1179 forward 1 TM Nin
249 LI:1170660.1:2000SEP08 512 583 forward 2 TM
249 LI:1170660.1:2000SEP08 911 973 forward 2 TM
249 LI:1170660.1:2000SEP08 986 1048 forward 2 TM
249 LI:1170660.1:2000SEP08 252 323 forward 3 TM Nout
249 LI:1170660.1:2000SEP08 591 677 forward 3 TM Nout
249 LI:1170660.1:2000SEP08 774 860 forward 3 TM Nout
250 LI:1144409.1:2000SEP08 25 81 forward 1 TM Nout
250 LI:1144409.1:2000SEP08 1708 1782 forward 1 TM Nout
250 LI:1144409.1:2000SEP08 2281 2367 forward 1 TM Nout
250 LI:1144409.1:2000SEP08 1658 1744 forward 2 TM Nin
250 LI:1144409.1:2000SEP08 1880 1927 forward 2 TM Nin
250 LI:1144409.1:2000SEP08 2294 2380 forward 2 TM Nin
250 LI:1144409.1:2000SEP08 1614 1697 forward 3 TM Nin
250 LI:1144409.1:2000SEP08 2217 2303 forward 3 TM Nin
251 LI:246290.10:2000SEP08 248 304 forward 2 TM Nout
251 LI:246290.10:2000SEP08 641 727 forward 2 TM Nout
251 LI:246290.10:2000SEP08 3 89 forward 3 TM Nout
252 :280034.1:2000SEP08 279 365 forward 3 TM Nout
TABLE 5
SEQ ID NO: Template ID Component ID Start Stop
LG:150318.1:2000SEP08 482540R6 1 350
LG:150318.1:2000SEP08 482540H1 1 238
LG:150318.1:2000SEP08 6021134T8 126 531
LG:150318.1:2000SEP08 6021134R8 126 537
LG:150318.1:2000SEP08 6021134H1 126 629
2 LG:022529.1:2000SEP08 7593029H1 1 521
2 LG:022529.1:2000SEP08 g1645695 112 517
2 LG:022529.1:2000SEP08 2160267H1 275 514
2 LG:022529.1:2000SEP08 5284828H1 309 583
2 LG:022529.1:2000SEP08 g1880347 313 601
2 LG:022529.1:2000SEP08 7352207H1 330 707
2 LG:022529.1:2000SEP08 6863346H1 339 830
2 LG:022529.1:2000SEP08 3351341H1 423 681
2 LG:022529.1:2000SEP08 077299H1 432 632
2 LG:022529.1:2000SEP08 078146H1 438 646
2 LG:022529.1:2000SEP08 3801790F6 523 1030
2 LG:022529.1:2000SEP08 3801790H1 523 823
2 LG:022529.1:2000SEP08 3769238F6 523 1043
2 LG:022529.1:2000SEP08 3802590H1 524 810
2 LG:022529.1:2000SEP08 6217218H1 549 1039
2 LG:022529.1:2000SEP08 g5743872 664 1137
2 LG:022529.1:2000SEP08 g3051534 685 1047
2 LG:022529.1:2000SEP08 g4195758 789 1045
3 LG:352559.1:2000SEP08 6453567H1 1 503
3 LG:352559.1:2000SEP08 4052122H1 185 457
3 LG:352559.1:2000SEP08 4052122F7 185 636
3 LG:352559.1:2000SEP08 g3897399 255 371
4 LG:175223.1:2000SEP08 3081155F6 1 417
4 LG:175223.1:2000SEP08 3081155H1 2 315
4 LG:175223.1:2000SEP08 3081155T6 31 465
4 LG:175223.1:2000SEP08 g1648354 58 457
4 LG:175223.1:2000SEP08 5800009H1 99 507
5 LG:476989.1:2000SEP08 6874941H1 1 169
5 LG:476989.1:2000SEP08 6874955H1 1 204
5 LG:476989.1:2000SEP08 g4393499 51 432
5 LG:476989.1:2000SEP08 g4190810 96 432
6 LG:253268.7:2000SEP08 6772515H1 607 1083
6 LG:253268.7:2000SEP08 g5526194 596 941
6 LG:253268.7:2000SEP08 g4729197 690 941
6 LG:253268.7:2000SEP08 g5809986 501 941
6 LG:253268.7:2000SEP08 g2540947 535 937
6 LG:253268.7:2000SEP08 g922062 586 935
6 LG:253268.7:2000SEP08 6772515J1 454 961
6 LG:253268.7:2000SEP08 2535554F6 493 929
6 LG:253268.7:2000SEP08 g4392225 520 923
6 LG:253268.7:2000SEP08 g3424830 561 923
6 LG:253268.7:2000SEP08 g4533991 464 923
6 LG:253268.7:2000SEP08 g885146 693 914
6 LG:253268.7:2000SEP08 g917418 564 914
6 LG:253268.7:2000SEP08 g888754 605 914
Figure imgf000121_0001
Figure imgf000121_0002
9 LG:475404.1:2000SEP08 3206542H1 25 202
9 LG:475404.1:2000SEP08 4663964H1 37 289
9 LG:475404.1:2000SEP08 7652675J1 47 172
9 LG:475404.1:2000SEP08 g1807165 81 278
9 LG:475404.1:2000SEP08 g2401959 129 486
9 LG:475404.1:2000SEP08 3614680H1 155 272
9 LG:475404.1:2000SEP08 g5101146 180 486
10 LG:1384132.1:2000SEP08 5754964H1 1 453
10 LG:1384132.1:2000SEP08 g5366771 73 480
10 LG:1384132.1:2000SEP08 3246016H1 198 446
11 LG:410804.18:2000SEP08 3270045H1 1 252
12 LG:1082306.1:2000SEP08 2590822H1 60 224
12 LG:1082306.1:2000SEP08 2875044H1 13 269
12 LG:1082306.1:2000SEP08 5944268H1 12 288
12 LG:1082306.1:2000SEP08 908479H1 39 241
12 LG:1082306.1:2000SEP08 2972368F6 1 586
12 LG:1082306.1:2000SEP08 2972368H1 1 292
12 LG:1082306.1:2000SEP08 1831460H1 393 656
12 LG:1082306.1:2000SEP08 g2270170 247 638
12 LG:1082306.1:2000SEP08 1732781H1 118 297
12 LG:1082306.1:2000SEP08 4136083H1 590 657
12 LG:1082306.1:2000SEP08 5152624H1 463 723
12 LG:1082306.1:2000SEP08 2634451H1 434 685
13 LG:233814.4:2000SEP08 6907331J1 757 1055
13 LG:233814.4:2000SEP08 7435089H1 1 580
13 LG:233814.4:2000SEP08 7579295H1 475 927
13 LG:233814.4:2000SEP08 2818676H1 511 729
13 LG:233814.4:2000SEP08 1943949H1 632 863
13 LG:233814.4:2000SEP08 7403021H1 703 1146
14 LG:977478.5:2000SEP08 7087230H1 1 549
14 LG:977478.5:2000SEP08 5525655H2 50 293
14 LG:977478.5:2000SEP08 7198649H1 74 395
14 LG:977478.5:2000SEP08 3218939H1 164 284
14 LG:977478.5:2000SEP08 2137947T7 182 714
14 LG:977478.5:2000SEP08 2137242H1 274 541
14 LG:977478.5:2000SEP08 2970423T6 312 783
14 LG:977478.5:2000SEP08 2970423F6 319 720
14 LG:977478.5:2000SEP08 2970423H2 319 633
14 LG:977478.5:2000SEP08 g6464000 464 821
14 LG:977478.5:2000SEP08 974781H1 501 774
14 LG:977478.5:2000SEP08 g6044798 530 821
15 LG:025931.1:2000SEP08 6075128F6 1 524
15 LG:025931.1:2000SEP08 7002695H1 272 846
15 LG:025931.1:2000SEP08 1949002R6 435 707
15 LG:025931.1:2000SEP08 1949002H1 435 692
15 LG:025931.1:2000SEP08 g6711855 473 929
15 LG:025931.1:2000SEP08 3348048H1 497 748
15 LG:025931.1:2000SEP08 7382758H1 573 962
15 LG:025931.1:2000SEP08 5844022H1 224 474
15 LG:025931.1:2000SEP08 5844022F6 223 614
15 LG:025931.1:2000SEP08 6075128H1 1 185
16 LG:885368.1:2000SEP08 5909782T7 1 494
16 LG:885368.1:2000SEP08 6269652H1 1 415
16 LG:885368.1:2000SEP08 5909782H1 1 306
16 LG:885368.1:2000SEP08 5909782F7 1 578
16 LG:885368.1:2000SEP08 6269652F8 1 682
16 LG:885368.1:2000SEP08 5909795H1 1 305
16 LG:885368.1:2000SEP08 5909895H1 1 133
16 LG:885368.1:2000SEP08 6269652T8 99 618
16 LG:885368.1:2000SEP08 4982679T6 377 550
17 LG:1054900.1:2000SEP08 2327449H1 15 248
17 LG:1054900.1:2000SEP08 2327457T6 1 364
17 LG:1054900.1:2000SEP08 2327449T6 1 288
17 LG:1054900.1:2000SEP08 2327449R6 13 408
17 LG:1054900.1:2000SEP08 2327457R6 13 402
17 LG:1054900.1:2000SEP08 6537441H1 147 499
17 LG:1054900.1:2000SEP08 5108773H1 196 254
18 LG:995186.2:2000SEP08 5056145F6 1 549
18 LG:995186.2:2000SEP08 6061051F8 169 669
19 LG:435048.23:2000SEP08 6867561H1 1 461
19 LG:435048.23:2000SEP08 3531553H1 53 327
20 LG:954859.1:2000SEP08 g6196726 233 638
20 LG:954859.1:2000SEP08 g2344451 154 637
20 LG:954859.1:2000SEP08 3814110T6 148 602
20 LG:954859.1:2000SEP08 3815093T6 146 591
20 LG:954859.1:2000SEP08 6475045H1 1 548
20 LG:954859.1:2000SEP08 6413028H1 1 373
21 LG:364370.1:2000SEP08 6798273H1 1 497
21 LG:364370.1:2000SEP08 6798273F8 1 574
21 LG:364370.1:2000SEP08 4431872H1 3 82
21 LG:364370.1:2000SEP08 6792402T8 9 592
21 LG:364370.1:2000SEP08 6792402F8 9 705
21 LG:364370.1:2000SEP08 6792402H1 9 500
21 LG:364370.1:2000SEP08 6798273T8 362 543
22 LG:1098789.1:2000SEP08 3537002T6 1 461
23 LG:201540.2:2000SEP08 5764854H1 2 545
23 LG:201540.2:2000SEP08 4664068F6 1 316
23 LG:201540.2:2000SEP08 4664068H1 2 237
23 LG:201540.2:2000SEP08 5968685H1 26 592
23 LG:201540.2:2000SEP08 3160094H1 98 362
23 LG:201540.2:2000SEP08 652754H1 101 353
23 LG:201540.2:2000SEP08 6788325H1 118 489
23 LG:201540.2:2000SEP08 g1950834 690 910
23 LG:201540.2:2000SEP08 2739044F6 689 1171
23 LG:201540.2:2000SEP08 6829490J1 703 1219
23 LG:201540.2:2000SEP08 2739044H1 690 924
23 LG:201540.2:2000SEP08 7328749H1 733 1261
23 LG:201540.2:2000SEP08 6829490H1 712 1219
23 LG:201540.2:2000SEP08 2857116H1 777 1049
23 LG:201540.2:2000SEP08 g769612 173 528
23 LG:201540.2:2000SEP08 6490433H1 287 779
23 LG:201540.2:2000SEP08 5096074T6 604 1006
23 LG:201540.2:2000SEP08 7262169H1 642 1193
23 LG:201540.2:2000SEP08 6310909H1 308 887
23 LG:201540.2:2000SEP08 5096074H2 347 523
23 LG:201540.2:2000SEP08 5096074F6 347 703
23 LG:201540.2:2000SEP08 7701612J1 518 1093
23 LG:201540.2:2000SEP08 4053114H1 100 386
23 LG:201540.2:2000SEP08 4319192H1 131 397
23 LG:201540.2:2000SEP08 3414632H1 132 252
23 LG:201540.2:2000SEP08 7200283H1 142 699
24 LG:1077357.1:2000SEP08 6245903F8 120 547
24 LG:1077357.1:2000SEP08 6245903H1 121 532
24 LG:1077357.1:2000SEP08 g3250058 1 408
24 LG:1077357.1:2000SEP08 g3694145 26 408
24 LG:1077357.1:2000SEP08 g4333202 140 407
24 LG:1077357.1:2000SEP08 6779728J1 1 370
25 LG:1048846.4:2000SEP08 5433043T9 1 476
25 LG:1048846.4:2000SEP08 3256037H1 191 435
25 LG:1048846.4:2000SEP08 3256037R6 191 354
25 LG:1048846.4:2000SEP08 3207387H1 213 461
25 LG:1048846.4:2000SEP08 4316787H1 247 532
25 LG:1048846.4:2000SEP08 4630864T8 263 483
26 LG:336685.1:2000SEP08 g1422427 85 461
26 LG:336685.1:2000SEP08 3167506H1 1 292
26 LG:336685.1:2000SEP08 g3751213 82 477
26 LG:336685.1:2000SEP08 g3039675 93 477
26 LG:336685.1:2000SEP08 7612160J1 631 1172
26 LG:336685.1:2000SEP08 7585885H2 458 1146
26 LG:336685.1:2000SEP08 7330357H1 428 965
26 LG:336685.1:2000SEP08 7612160H1 246 869
26 LG:336685.1:2000SEP08 g3844561 389 808
26 LG:336685.1:2000SEP08 g2702908 347 805
26 LG:336685.1:2000SEP08 g1422428 441 805
27 LG:1076253.1:2000SEP08 5677775H1 24 273
27 LG:1076253.1:2000SEP08 828479H1 617 874
27 LG:1076253.1:2000SEP08 5677775F6 24 610
27 LG:1076253.1:2000SEP08 7392292H1 1 584
27 LG:1076253.1:2000SEP08 4896201F8 298 724
27 LG:1076253.1:2000SEP08 4896201H1 298 566
27 LG:1076253.1:2000SEP08 7933083H1 389 898
27 LG:1076253.1:2000SEP08 2717909H1 429 684
27 LG:1076253.1:2000SEP08 392443R1 585 1112
27 LG:1076253.1:2000SEP08 4896201T8 826 1314
27 LG:1076253.1:2000SEP08 4896201T9 872 1357
27 LG:1076253.1:2000SEP08 3989646T6 1029 1418
27 LG:1076253.1:2000SEP08 2779195T6 1078 1411
27 LG:1076253.1:2000SEP08 3989646R6 354 748
27 LG:1076253.1:2000SEP08 3989646H1 354 512
28 LG:1400601.2:2000SEP08 6967393H1 1 600
29 LG:1079092.3:2000SEP08 6002889F8 1 660
29 LG:1079092.3:2000SEP08 6002889T8 104 612
30 LG:1086064.1:2000SEP08 456122R6 6 395
30 LG:1086064.1:2000SEP08 5941956H1 8 286
30 LG:1086064.1:2000SEP08 4727660F6 1 542
30 LG:1086064.1:2000SEP08 7355369H1 1 526
30 LG:1086064.1:2000SEP08 3730253H1 9 315
30 LG:1086064.1:2000SEP08 1798649F6 6 361
30 LG:1086064.1:2000SEP08 456122R1 6 583
30 LG:1086064.1:2000SEP08 787063H1 16 134
30 LG:1086064.1:2000SEP08 7354 16H1 19 537
30 LG:1086064.1:2000SEP08 456122F1 45 518
30 LG:1086064.1:2000SEP08 3523404H1 127 467
30 LG:1086064.1:2000SEP08 456122T6 154 476
30 LG:1086064.1:2000SEP08 1698965F6 165 529
30 LG:1086064.1:2000SEP08 1698965T6 179 500
30 LG:1086064.1:2000SEP08 2514834H1 16 323
30 LG:1086064.1:2000SEP08 456180H1 6 246
30 LG:1086064.1:2000SEP08 4549786H1 1 245
30 LG:1086064.1:2000SEP08 4822628H1 2 283
30 LG:1086064.1:2000SEP08 457138H1 6 149
30 LG:1086064.1:2000SEP08 4727660H1 1 267
30 LG:1086064.1:2000SEP08 458658H1 6 239
30 LG:1086064.1:2000SEP08 461412H1 6 260
30 LG:1086064.1:2000SEP08 456122H1 6 247
30 LG:1086064.1:2000SEP08 460344H1 6 278
30 LG:1086064.1:2000SEP08 460848H1 6 242
30 LG:1086064.1:2000SEP08 1698965H1 165 396
30 LG:1086064.1:2000SEP08 1322674H1 2 256
30 LG:1086064.1:2000SEP08 460283H1 6 240
30 LG:1086064.1:2000SEP08 3735303H1 22 340
30 LG:1086064.1:2000SEP08 1798649H1 6 280
30 LG:1086064.1:2000SEP08 1698730H1 165 359
31 LG:1400608.1:2000SEP08 4399221H1 25 234
31 LG:1400608.1:2000SEP08 5120882F6 1 353
31 LG:1400608.1:2000SEP08 014670H1 7 288
31 LG:1400608.1:2000SEP08 5378061H1 133 391
31 LG:1400608.1:2000SEP08 5378061F6 133 591
31 LG:1400608.1:2000SEP08 g4265183 180 641
31 LG:1400608.1:2000SEP08 4881456H1 274 353
31 LG:1400608.1:2000SEP08 2882961H1 13 272
31 LG:1400608.1:2000SEP08 5120882H1 1 265
31 LG:1400608.1:2000SEP08 1899877H1 41 265
31 LG:1400608.1:2000SEP08 g1959316 41 357
32 LG:399275.5:2000SEP08 2054642H1 354 539
32 LG:399275.5:2000SEP08 7688287H1 1 410
32 LG:399275.5:2000SEP08 1561014H1 169 351
32 LG:399275.5:2000SEP08 7688287J1 192 657
33 LG:293943.1:2000SEP08 7102403R8 1 647
33 LG:293943.1:2000SEP08 g2007300 100 453
Figure imgf000126_0001
Figure imgf000126_0002
Figure imgf000127_0001
42 LG:238631.4:2000SEP08 3401789H1 13 239
42 LG:238631.4:2000SEP08 4648809H1 13 297
42 LG:238631.4:2000SEP08 3634010H1 15 201
42 LG:238631.4:2000SEP08 4653815H1 16 316
42 LG:238631.4:2000SEP08 3152423H1 13 281
42 LG:238631.4:2000SEP08 4414637H1 12 275
42 LG:238631.4:2000SEP08 6121345H1 14 648
42 LG:238631.4:2000SEP08 3428203H1 15 274
42 LG:238631.4:2000SEP08 3744645H1 16 309
42 LG:238631.4:2000SEP08 2698861H1 18 305
42 LG:238631.4:2000SEP08 4143702H1 18 299
42 LG:238631.4:2000SEP08 3995879H1 18 309
42 LG:238631.4:2000SEP08 1830216H1 18 247
42 LG:238631.4:2000SEP08 2998645H1 19 295
42 LG:238631.4:2000SEP08 5464919H1 19 306
42 LG:238631.4:2000SEP08 1713036H1 19 257
42 LG:238631.4:2000SEP08 6116806H1 19 314
42 LG:238631.4:2000SEP08 5788396H1 19 305
42 LG:238631.4:2000SEP08 5121572H1 19 299
42 LG:238631.4:2000SEP08 4715287H1 20 256
42 LG:238631.4:2000SEP08 6197104H1 23 561
42 LG:238631.4:2000SEP08 493734H1 28 246
42 LG:238631.4:2000SEP08 3988991H1 40 341
42 LG:238631.4:2000SEP08 2048688H1 59 314
42 LG:238631.4:2000SEP08 2757550H1 59 328
42 LG:238631.4:2000SEP08 2755553H1 59 330
43 LG:236654.1:2000SEP08 670448T6 802 1151
43 LG:236654.1:2000SEP08 677184R6 802 1154
43 LG:236654.1:2000SEP08 g2810503 842 1188
43 LG:236654.1:2000SEP08 g2336018 909 1189
43 LG:236654.1:2000SEP08 2892611H1 947 1180
43 LG:236654.1:2000SEP08 1289991H1 1083 1189
43 LG:236654.1:2000SEP08 g1979821 1 187
43 LG:236654.1:2000SEP08 g3835204 1 325
43 LG:236654.1:2000SEP08 g6990824 1 298
43 LG:236654.1:2000SEP08 g6704413 1 299
43 LG:236654.1:2000SEP08 6757618J1 8 93
43 LG:236654.1:2000SEP08 7604986J1 9 544
43 LG:236654.1:2000SEP08 g4072180 18 469
43 LG:236654.1:2000SEP08 3519808H1 80 368
43 LG:236654.1:2000SEP08 g843784 234 500
43 LG:236654.1:2000SEP08 1614746F6 288 734
43 LG:236654.1:2000SEP08 1614746H1 288 506
43 LG:236654.1:2000SEP08 7604986H1 364 845
43 LG:236654.1:2000SEP08 1575232H1 366 578
43 LG:236654.1:2000SEP08 g2209653 404 914
43 LG:236654.1:2000SEP08 6322462H1 521 720
43 LG:236654.1:2000SEP08 2892611T6 647 1138
43 LG:236654.1:2000SEP08 g5660108 723 1188
43 LG:236654.1:2000SEP08 1614746T6 735 1150
Figure imgf000129_0001
45 LG:217396.2:2000SEP08 g2818339 123 530
45 LG:217396.2:2000SEP08 3750696H1 232 517
45 LG:217396.2:2000SEP08 7676247H2 401
45 LG:217396.2:2000SEP08 50 1261T6 375
45 LG:217396.2:2000SEP08 2996140H1 265
46 LG:090574.1:2000SEP08 2659966F6 192
46 LG:090574.1:2000SEP08 2659966T6 194
46 LG:090574.1:2000SEP08 2659966H1 247
47 LG:202943.1:2000SEP08 1226590R6 174 583
47 LG:202943.1:2000SEP08 5639884H1 136 365
47 LG:202943.1:2000SEP08 1226590T6 383 888
47 LG:202943.1:2000SEP08 5092132H1 515 763
47 LG:202943.1:2000SEP08 1226590H1 174 418
47 LG:202943.1:2000SEP08 g2020858 180 420
47 LG:202943.1:2000SEP08 7213845H1 242 792
47 LG:202943.1:2000SEP08 2579002H2 975 1177
47 LG:202943.1:2000SEP08 6472564H1 942 1357
47 LG:202943.1:2000SEP08 7151349H1 58 561
47 LG:202943.1:2000SEP08 5089408H1 1 270
47 LG:202943.1:2000SEP08 5091008H1 1 269
47 LG:202943.1:2000SEP08 7594783H1 818 1226
47 LG:202943.1:2000SEP08 6469878H1 624 1185
47 LG:202943.1:2000SEP08 5092608H1 1 274
48 LG:236928.1:2000SEP08 3227840H1 1509 1789
48 LG:236928.1:2000SEP08 2487371H1 1524 1754
48 LG:236928.1:2000SEP08 2908866H1 1579 1768
48 LG:236928.1:2000SEP08 g2538920 1584 1877
48 LG:236928.1:2000SEP08 487800H1 1633 1905
48 LG:236928.1:2000SEP08 5404331H1 1669 1966
48 LG:236928.1:2000SEP08 2904955T6 1672 2110
48 LG:236928.1:2000SEP08 4243773H1 1679 1938
48 LG:236928.1:2000SEP08 2904624H1 1682 1978
48 LG:236928.1:2000SEP08 g4990382 1689 2149
48 LG:236928.1:2000SEP08 3291485T6 1692 2110
48 LG:236928.1:2000SEP08 7365257H1 1694 2145
48 LG:236928.1:2000SEP08 g3144565 1717 2158
48 LG:236928.1:2000SEP08 g3144841 1723 2149
48 LG:236928.1:2000SEP08 7761153H1 1 630
48 LG:236928.1:2000SEP08 2905587H1 26 296
48 LG:236928.1:2000SEP08 7741081H1 173 752
48 LG:236928.1:2000SEP08 1491957H1 180 280
48 LG:236928.1:2000SEP08 4741520H1 195 447
48 LG:236928.1:2000SEP08 4741520F6 195 579
48 LG:236928.1:2000SEP08 6616428H1 203 717
48 LG:236928.1:2000SEP08 4697183H1 235 483
48 LG:236928.1:2000SEP08 4697183F6 235 679
48 LG:236928.1:2000SEP08 4028773H1 432 694
48 LG:236928.1:2000SEP08 4028773F8 441 957
48 LG:236928.1:2000SEP08 7761153J1 472 1051
48 LG:236928.1:2000SEP08 7109534H1 524 1104
48 LG:236928.1:2000SEP08 3291485F6 798 1256
48 LG:236928.1:2000SEP08 2285984H1 798 1018
48 LG:236928.1:2000SEP08 3291485H1 800 1044
48 LG:236928.1:2000SEP08 6842952H1 822 1148
48 LG:236928.1:2000SEP08 6018945H1 841 1148
48 LG:236928.1:2000SEP08 6616428J1 847 1412
48 LG:236928.1:2000SEP08 4785174H1 867 1127
48 LG:236928.1:2000SEP08 2904955F6 1086 1487
48 LG:236928.1:2000SEP08 2904955H1 1086 1380
48 LG:236928.1:2000SEP08 g3163646 1101 1348
48 LG:236928.1:2000SEP08 2918045H1 1170 1452
48 LG:236928.1:2000SEP08 4740283H1 1210 1470
48 LG:236928.1:2000SEP08 g1962038 1291 1631
48 LG:236928.1:2000SEP08 1251534H1 1448 1674
48 LG:236928.1:2000SEP08 1251534F6 1448 1853
48 LG:236928.1:2000SEP08 2556286H1 1483 1733
48 LG:236928.1:2000SEP08 6131688H1 1790 2071
48 LG:236928.1:2000SEP08 g3679201 1791 2159
48 LG:236928.1:2000SEP08 2919136H1 1807 2066
48 LG:236928.1:2000SEP08 6442673H1 1814 2356
48 LG:236928.1:2000SEP08 6870682H1 1869 2347
48 LG:236928.1:2000SEP08 6438933H1 1871 2347
48 LG:236928.1:2000SEP08 645255H1 1877 2114
48 LG:236928.1:2000SEP08 1251534T7 1922 2108
48 LG:236928.1:2000SEP08 2288452H1 1966 2122
48 LG:236928.1:2000SEP08 6855869H1 1968 2537
48 LG:236928.1:2000SEP08 1296086F6 2002 2338
48 LG:236928.1:2000SEP08 1296086H1 2002 2237
48 LG:236928.1:2000SEP08 5021561H1 2022 2302
48 LG:236928.1:2000SEP08 6616156H1 2040 2488
48 LG:236928.1:2000SEP08 g1624634 2135 2305
48 LG:236928.1:2000SEP08 3638554H1 2167 2353
48 LG:236928.1:2000SEP08 3356574H1 2207 2428
48 LG:236928.1:2000SEP08 531771R6 2245 2582
48 LG:236928.1:2000SEP08 531771H1 2245 2486
48 LG:236928.1:2000SEP08 2556128F6 2315 2567
48 LG:236928.1:2000SEP08 2556128H1 2315 2566
48 LG:236928.1:2000SEP08 g1186813 2339 2535
48 LG:236928.1:2000SEP08 2781958H1 2345 2603
48 LG:236928.1:2000SEP08 4748766H1 2371 2635
48 LG:236928.1:2000SEP08 4748737H1 2371 2635
48 LG:236928.1:2000SEP08 3563133H1 2372 2566
48 LG:236928.1:2000SEP08 531771T6 2411 2935
48 LG:236928.1:2000SEP08 5021561T1 2492 2929
48 LG:236928.1:2000SEP08 1269755F6 2505 2772
48 LG:236928.1:2000SEP08 1269755H1 2505 2761
48 LG:236928.1:2000SEP08 1269755F1 2505 2873
48 LG:236928.1:2000SEP08 1269755T6 2506 2934
48 LG:236928.1:2000SEP08 2556128T6 2511 2933
48 LG:236928.1:2000SEP08 g2106737 2522 2954
Figure imgf000132_0001
Figure imgf000133_0001
Figure imgf000133_0002
51 LG:234372.2:2000SEP08 2541227H1 1552 1796
51 LG:234372.2:2000SEP08 7653281H1 1556 2161
51 LG:234372.2:2000SEP08 2308666H1 1612 1801
51 LG:234372.2:2000SEP08 1719968F6 1614 2114
51 LG:234372.2:2000SEP08 1719968H1 1614 1837
51 LG:234372.2:2000SEP08 7322428H1 1727 2305
51 LG:234372.2:2000SEP08 g1646882 1750 2182
51 LG:234372.2:2000SEP08 4317152H1 1763 2035
51 LG:234372.2:2000SEP08 5301059F8 1776 2249
51 LG:234372.2:2000SEP08 7940028H1 1829 2452
51 LG:234372.2:2000SEP08 2666542F6 1854 2343
51 LG:234372.2:2000SEP08 2666542H1 1854 2099
51 LG:234372.2:2000SEP08 964975H1 1879 1991
51 LG:234372.2:2000SEP08 1381321H1 1890 2142
51 LG:234372.2:2000SEP08 5979524H1 1891 2177
51 LG:234372.2:2000SEP08 7060659H1 1895 2448
51 LG:234372.2:2000SEP08 2616003H1 799 1062
51 LG:234372.2:2000SEP08 2616003F6 799 1287
51 LG:234372.2:2000SEP08 7406896H1 837 1173
51 LG:234372.2:2000SEP08 3086412H1 861 1155
51 LG:234372.2:2000SEP08 6075203H1 879 1174
51 LG:234372.2:2000SEP08 7653281J1 888 1467
51 LG:234372.2:2000SEP08 5059222H1 1071 1336
51 LG:234372.2:2000SEP08 590936H1 1088 1346
51 LG:234372.2:2000SEP08 590970H1 1088 1209
51 LG:234372.2:2000SEP08 590936R1 1088 1612
51 LG:234372.2:2000SEP08 1317631H1 1091 1427
51 LG:234372.2:2000SEP08 3601816H1 1109 1413
51 LG:234372.2:2000SEP08 3529032H1 1114 1423
51 LG:234372.2:2000SEP08 3116380H1 1123 1407
51 LG:234372.2:2000SEP08 3117081H1 1124 1391
51 LG:234372.2:2000SEP08 4405926H1 1137 1397
51 LG:234372.2:2000SEP08 1511731F6 1143 1655
51 LG:234372.2:2000SEP08 2841541H1 1144 1399
51 LG:234372.2:2000SEP08 1511731H1 1143 1328
51 LG:234372.2:2000SEP08 1511356H1 1143 1327
51 LG:234372.2:2000SEP08 4056493H1 1302 1580
51 LG:234372.2:2000SEP08 g1959546 1304 1747
51 LG:234372.2:2000SEP08 g1349612 1328 1793
51 LG:234372.2:2000SEP08 g880451 1336 1792
51 LG:234372.2:2000SEP08 g830969 1336 1711
51 LG:234372.2:2000SEP08 2181970H1 1336 1606
51 LG:234372.2:2000SEP08 5326001H2 1356 1608
51 LG:234372.2:2000SEP08 5325701H1 1356 1610
51 LG:234372.2:2000SEP08 1564961H1 1368 1557
51 LG:234372.2:2000SEP08 g2968054 2661 2838
51 LG:234372.2:2000SEP08 7033565H1 1 576
51 LG:234372.2:2000SEP08 4252317H1 46 191
51 LG:234372.2:2000SEP08 7171693H1 221 736
51 LG:234372.2:2000SEP08 7179847H1 221 651
51 LG:234372.2:2000SEP08 140734H1 240 341
51 LG:234372.2:2000SEP08 4938818H1 286 553
51 LG:234372.2:2000SEP08 7592182H1 465 1020
51 LG:234372.2:2000SEP08 1840424F6 478 838
51 LG:234372.2:2000SEP08 1840424H1 478 755
51 LG:234372.2:2000SEP08 4011776F6 494 963
51 LG:234372.2:2000SEP08 4011776H1 496 749
51 LG:234372.2:2000SEP08 2770053H1 556 809
51 LG:234372.2:2000SEP08 7759056H1 616 937
51 LG:234372.2:2000SEP08 4447071H1 636 912
51 LG:234372.2:2000SEP08 5373391H1 767 999
51 LG:234372.2:2000SEP08 1943237H1 2481 2735
51 LG:234372.2:2000SEP08 g2077756 2484 2840
51 LG:234372.2:2000SEP08 g883241 2495 2847
51 LG:234372.2:2000SEP08 g831477 2497 2849
51 LG:234372.2:2000SEP08 5301059T9 2527 2732
51 LG:234372.2:2000SEP08 g3700868 2531 2837
51 LG:234372.2:2000SEP08 g3872992 2532 2842
51 LG:234372.2:2000SEP08 971981T6 2531 2813
51 LG:234372.2:2000SEP08 5102936H1 2539 2791
51 LG:234372.2:2000SEP08 g1664296 2539 2633
51 LG:234372.2:2000SEP08 1292736H1 2564 2796
51 LG:234372.2:2000SEP08 3112301H1 2565 2851
51 LG:234372.2:2000SEP08 g4109330 2565 2633
51 LG:234372.2:2000SEP08 g2968055 2573 2838
51 LG:234372.2:2000SEP08 g4971344 2598 2839
51 LG:234372.2:2000SEP08 2717958H1 2609 2837
51 LG:234372.2:2000SEP08 2766115H1 2752 2837
52 LG:022629.1:2000SEP08 3627874H1 253 538
52 LG:022629.1:2000SEP08 5786985H1 209 495
52 LG:022629.1:2000SEP08 3627896H1 242 449
52 LG:022629.1:2000SEP08 5403369H1 1 231
52 LG:022629.1:2000SEP08 5403369T9 337 738
52 LG:022629.1:2000SEP08 6082254H1 201 724
52 LG:022629.1:2000SEP08 g2198283 364 668
52 LG:022629.1:2000SEP08 5403369T8 192 649
52 LG:022629.1:2000SEP08 g3162701 219 638
52 LG:022629.1:2000SEP08 3627896F6 155 599
52 LG:022629.1:2000SEP08 5403369F8 19 594
52 LG:022629.1:2000SEP08 1519408H1 354 551
52 LG:022629.1:2000SEP08 1519416H1 354 541
52 LG:022629.1:2000SEP08 7683101H1 748 1276
52 LG:022629.1:2000SEP08 6569535H1 649 1171
52 LG:022629.1:2000SEP08 1519416T6 592 877
52 LG:022629.1:2000SEP08 5958910H1 759 865
52 LG:022629.1:2000SEP08 g6450919 360 843
52 LG:022629.1:2000SEP08 g6836892 618 843
52 LG:022629.1:2000SEP08 g4738617 377 842
52 LG:022629.1:2000SEP08 g4186678 404 842
52 LG:022629.1:2000SEP08 g5541594 602 840
52 LG:022629.1:2000SEP08 3627896T6 288 800
52 LG:022629.1:2000SEP08 1519416F6 354 771
52 LG:022629.1:2000SEP08 g2783063 400 762
52 LG:022629.1:2000SEP08 g2958447 367 762
52 LG:022629.1:2000SEP08 g2198251 454 762
52 LG:022629.1:2000SEP08 g2876399 393 762
53 LG:068682.1:2000SEP08 2011384H1 190 263
53 LG:068682.1:2000SEP08 g5438746 1 423
53 LG:068682.1:2000SEP08 g3805312 34 423
53 LG:068682.1:2000SEP08 g4372490 78 423
53 LG:068682.1:2000SEP08 g2954208 76 423
53 LG:068682.1:2000SEP08 g2954218 142 422
53 LG:068682.1:2000SEP08 g6838215 124 383
53 LG:068682.1:2000SEP08 g3307490 142 344
53 LG:068682.1:2000SEP08 6829315H1 314 884
53 LG:068682.1:2000SEP08 g3109791 492 811
53 LG:068682.1:2000SEP08 g5452473 492 650
53 LG:068682.1:2000SEP08 g4564783 26 423
53 LG:068682.1:2000SEP08 g6043518 78 423
54 LG:222335.1:2000SEP08 6804715H1 612 1189
54 LG:222335.1:2000SEP08 2317449H1 764 853
54 LG:222335.1:2000SEP08 2583954F6 816 1189
54 LG:222335.1:2000SEP08 2583954H1 816 1087
54 LG:222335.1:2000SEP08 5591720H1 954 1099
54 LG:222335.1:2000SEP08 785266H1 1056 1206
54 LG:222335.1:2000SEP08 785266R6 1056 1423
54 LG:222335.1:2000SEP08 6336926H1 1134 1419
54 LG:222335.1:2000SEP08 6336926F8 1134 1420
54 LG:222335.1:2000SEP08 7063506H1 1 370
54 LG:222335.1:2000SEP08 3778951H1 1 282
54 LG:222335.1:2000SEP08 6258260H1 1 284
54 LG:222335.1:2000SEP08 5560479H1 7 218
54 LG:222335.1:2000SEP08 3113348H1 10 289
54 LG:222335.1:2000SEP08 702435H1 15 242
54 LG:222335.1:2000SEP08 4974222H1 18 276
54 LG:222335.1:2000SEP08 2343445H1 20 268
54 LG:222335.1:2000SEP08 2343445F6 20 546
54 LG:222335.1:2000SEP08 711988H1 85 306
54 LG:222335.1:2000SEP08 704169H1 85 364
54 LG:222335.1:2000SEP08 337408H1 93 282
54 LG:222335.1:2000SEP08 3163266H1 155 431
54 LG:222335.1:2000SEP08 4640786H1 165 437
54 LG:222335.1:2000SEP08 2343445T6 264 815
54 LG:222335.1:2000SEP08 439477H1 272 514
54 LG:222335.1:2000SEP08 3778951T6 332 830
54 LG:222335.1:2000SEP08 1737725F6 401 888
54 LG:222335.1:2000SEP08 1737725H1 401 608
54 LG:222335.1:2000SEP08 4001016H1 435 583
54 LG:222335.1:2000SEP08 g3841204 548 853
54 LG:222335.1:2000SEP08 6336726H1 1174 1420
54 LG:222335.1:2000SEP08 6850125H1 1178 1420
55 LG:331342.1:2000SEP08 3415969F7 497 980
55 LG:331342.1:2000SEP08 1003818H1 601 809
55 LG:331342.1:2000SEP08 7340365H1 601 1134
55 LG:331342.1:2000SEP08 4542316H1 633 873
55 LG:331342.1:2000SEP08 4542316F6 639 1203
55 LG:331342.1:2000SEP08 4519202H1 712 975
55 LG:331342.1:2000SEP08 1272054H1 899 1126
55 LG:331342.1:2000SEP08 3966023H1 915 1057
55 LG:331342.1:2000SEP08 5316380H1 376 591
55 LG:331342.1:2000SEP08 5316222H1 376 534
55 LG:331342.1:2000SEP08 4050724H1 457 746
55 LG:331342.1:2000SEP08 3415969H1 468 727
55 LG:331342.1:2000SEP08 2414808T6 1 201
55 LG:331342.1:2000SEP08 194403H1 3 176
55 LG:331342.1:2000SEP08 3962146F6 8 193
55 LG:331342.1:2000SEP08 2414808F6 8 441
55 LG:331342.1:2000SEP08 3962146H1 60 193
55 LG:331342.1:2000SEP08 2414808H1 191 441
55 LG:331342.1:2000SEP08 7637162H1 302 876
55 LG:331342.1:2000SEP08 6606050H1 312 789
55 LG:331342.1:2000SEP08 7312830H1 361 823
55 LG:331342.1:2000SEP08 5316167H1 376 584
55 LG:331342.1:2000SEP08 5318337H1 376 622
56 LG:021770.1:2000SEP08 5875430H1 597 866
56 LG:021770.1:2000SEP08 718539H1 656 941
56 LG:021770.1:2000SEP08 6553139T8 594 1115
56 LG:021770.1:2000SEP08 4618506H1 590 827
56 LG:021770.1:2000SEP08 7160062H1 453 950
56 LG:021770.1:2000SEP08 7603731J1 525 957
56 LG:021770.1:2000SEP08 3778516H1 433 735
56 LG:021770.1:2000SEP08 4819519H1 70 340
56 LG:021770.1:2000SEP08 4970514H1 341 610
56 LG:021770.1:2000SEP08 5686262F6 57 592
56 LG:021770.1:2000SEP08 g2155825 66 560
56 LG:021770.1:2000SEP08 4819519F7 69 652
56 LG:021770.1:2000SEP08 5754744H1 1 497
56 LG:021770.1:2000SEP08 5686262H1 43 299
56 LG:021770.1:2000SEP08 5068302H1 43 315
56 LG:021770.1:2000SEP08 g1442458 54 265
56 LG:021770.1:2000SEP08 2477611H1 858 1061
56 LG:021770.1:2000SEP08 2477611F7 858 1292
56 LG:021770.1:2000SEP08 2477611F6 858 1333
56 LG:021770.1:2000SEP08 4313608H1 871 1193
56 LG:021770.1:2000SEP08 1461091H1 899 1137
56 LG:021770.1:2000SEP08 639781H1 945 1212
56 LG:021770.1:2000SEP08 7619831J1 953 1308
56 LG:021770.1:2000SEP08 2477611T6 1011 1575
56 LG:021770.1:2000SEP08 1967278R6 1019 1507
56 LG:021770.1:2000SEP08 1967278H1 1020 1295
61 LG:1135213.1:2000SEP08 g4326525 1 141
61 LG:1135213.1:2000SEP08 g2525795 28 236
62 LG:267762.1:2000SEP08 3941090F8 1 334
62 LG:267762.1:2000SEP08 1704531H1 153 292
62 LG:267762.1:2000SEP08 7658244J1 267 830
62 LG:267762.1:2000SEP08 5674936F6 283 888
62 LG:267762.1:2000SEP08 5674936H1 283 545
62 LG:267762.1:2000SEP08 5785070H1 482 768
62 LG:267762.1:2000SEP08 2831361H1 514 783
62 LG:267762.1:2000SEP08 g1523057 527 994
62 LG:267762.1:2000SEP08 6244881H1 600 677
62 LG:267762.1:2000SEP08 3731660H1 675 978
62 LG:267762.1:2000SEP08 g1950270 777 1103
62 LG:267762.1:2000SEP08 5674936T6 782 1313
62 LG:267762.1:2000SEP08 7762849J1 824 1474
62 LG:267762.1:2000SEP08 g1523013 882 1063
62 LG:267762.1:2000SEP08 g3678477 895 1352
62 LG:267762.1:2000SEP08 2797262H1 986 1231
62 LG:267762.1:2000SEP08 2797262F6 986 1555
62 LG:267762.1:2000SEP08 3732853H1 1059 1205
62 LG:267762.1:2000SEP08 g766967 1146 1503
62 LG:267762.1:2000SEP08 2797262T6 1199 1645
62 LG:267762.1:2000SEP08 6768646J1 1221 1437
62 LG:267762.1:2000SEP08 2561680H1 1367 1657
62 LG:267762.1:2000SEP08 g3750627 1401 1645
62 LG:267762.1:2000SEP08 g2557941 1510 1645
63 LG:120744.1:2000SEP08 5643301H1 1 266
63 LG:120744.1:2000SEP08 5643301F6 1 412
63 LG:120744.1:2000SEP08 7007819H1 12 557
63 LG:120744.1:2000SEP08 7161666H1 34 435
63 LG:120744.1:2000SEP08 7008530H1 19 463
63 LG:120744.1:2000SEP08 7161666F8 34 651
63 LG:120744.1:2000SEP08 7017571H1 42 211
63 LG:120744.1:2000SEP08 7017571F8 41 656
63 LG:120744.1:2000SEP08 3277305H1 128 393
63 LG:120744.1:2000SEP08 3277305F6 138 728
63 LG:120744.1:2000SEP08 3565182H1 492 782
63 LG:120744.1:2000SEP08 5643301 8 940 1116
63 LG:120744.1:2000SEP08 g4293857 672 947
63 LG:120744.1:2000SEP08 7161666R8 707 1169
63 LG:120744.1:2000SEP08 299231H1 816 1098
63 LG:120744.1:2000SEP08 5643301 6 823 1115
64 LG:403409.1:2000SEP08 5643224H1 22 276
64 LG:403409.1:2000SEP08 g1998848 22 296
64 LG:403409.1:2000SEP08 3126704H1 22 298
64 LG:403409.1:2000SEP08 3126978F6 22 611
64 LG:403409.1:2000SEP08 3126704F6 22 600
64 LG:403409.1:2000SEP08 3256088H1 24 259
64 LG:403409.1:2000SEP08 1728945H1 26 246
64 LG:403409.1:2000SEP08 6141236H1 28 370
64 LG:403409.1:2000SEP08 6141236F8 28 633
64 LG:403409.1:2000SEP08 6328822H1 47 592
64 LG:403409.1:2000SEP08 3388023H1 89 361
64 LG:403409.1:2000SEP08 6134014H1 110 401
64 LG:403409.1:2000SEP08 3323733H1 192 461
64 LG:403409.1:2000SEP08 3409854H1 357 618
64 LG:403409.1:2000SEP08 4068325F6 515 1108
64 LG:403409.1:2000SEP08 4068325H1 517 804
64 LG:403409.1:2000SEP08 3629455H1 560 835
64 LG:403409.1:2000SEP08 5499626H1 1 255
64 LG:403409.1:2000SEP08 5500026H1 1 161
64 LG:403409.1:2000SEP08 5500309H1 1 227
64 LG:403409.1:2000SEP08 5499909H1 1 206
64 LG:403409.1:2000SEP08 2723940H1 1 243
64 LG:403409.1:2000SEP08 2723940F6 1 333
64 LG:403409.1:2000SEP08 3751233H1 577 867
64 LG:403409.1:2000SEP08 5812189H1 644 840
64 LG:403409.1:2000SEP08 5812190H1 644 845
64 LG:403409.1:2000SEP08 311752H1 657 744
64 LG:403409.1:2000SEP08 373638H1 663 880
64 LG:403409.1:2000SEP08 7377730H1 697 1236
64 LG:403409.1:2000SEP08 6333709H1 705 1240
64 LG:403409.1:2000SEP08 6329688H1 705 1325
64 LG:403409.1:2000SEP08 7080559H1 760 1170
64 LG:403409.1:2000SEP08 6532643H1 803 1384
64 LG:403409.1:2000SEP08 g6570650 881 1310
64 LG:403409.1:2000SEP08 6141236T8 989 1579
64 LG:403409.1:2000SEP08 4068325T6 1018 1641
64 LG:403409.1:2000SEP08 3126704T6 1056 1580
64 LG:403409.1:2000SEP08 7068876H1 1069 1483
64 LG:403409.1:2000SEP08 3256362H1 1308 1554
64 LG:403409.1:2000SEP08 g1400213 1337 1682
64 LG:403409.1:2000SEP08 g3254782 1346 1682
64 LG:403409.1:2000SEP08 g1383466 1395 1696
64 LG:403409.1:2000SEP08 6412487H1 1494 1931
64 LG:403409.1:2000SEP08 5534143H1 1578 1820
64 LG:403409.1:2000SEP08 7468129H1 1672 2123
64 LG:403409.1:2000SEP08 6869946H1 1717 2275
64 LG:403409.1:2000SEP08 g4510725 1831 2272
64 LG:403409.1:2000SEP08 6298042H1 1871 2146
64 LG:403409.1:2000SEP08 g1998847 1973 2275
64 LG:403409.1:2000SEP08 3541104H1 1991 2272
65 LG:226874.3:2000SEP08 6435270H1 1 397
65 LG:226874.3:2000SEP08 6431679H1 1 485
65 LG:226874.3:2000SEP08 6427109H1 1 445
65 LG:226874.3:2000SEP08 6779705H1 178 713
65 LG:226874.3:2000SEP08 5767442F8 178 779
65 LG:226874.3:2000SEP08 5767442H1 316 780
65 LG:226874.3:2000SEP08 6779705R8 335 929
65 LG:226874.3:2000SEP08 6779705J1 335 933
65 LG:226874.3:2000SEP08 7198301H1 693 1099
65 LG:226874.3:2000SEP08 6536795H1 895 1461
65 LG:226874.3:2000SEP08 g2071279 927 1231
65 LG:226874.3:2000SEP08 g3802687 950 1338
65 LG:226874.3:2000SEP08 7264450H1 1077 1547
65 LG:226874.3:2000SEP08 6246443H1 1364 1871
66 LG:1045521.4:2000SEP08 g4453997 3410 3766
66 LG:1045521.4:2000SEP08 g5436696 3402 3773
66 LG:1045521.4:2000SEP08 g4266778 3401 3775
66 LG:1045521.4:2000SEP08 g3174193 3398 3774
66 LG:1045521.4:2000SEP08 4136385H1 3545 3829
66 LG:1045521.4:2000SEP08 g4195581 3607 3774
66 LG:1045521.4:2000SEP08 g4438434 3604 3773
66 LG:1045521.4:2000SEP08 g4392683 3394 3775
66 LG:1045521.4:2000SEP08 g5745081 3397 3777
66 LG:1045521.4:2000SEP08 2761101H1 3369 3598
66 LG:1045521.4:2000SEP08 2816971H1 3378 3641
66 LG:1045521.4:2000SEP08 g3756033 3380 3775
66 LG:1045521.4:2000SEP08 g3094407 3385 3679
66 LG:1045521.4:2000SEP08 2189244H1 3390 3636
66 LG:1045521.4:2000SEP08 g1376982 3338 3766
66 LG:1045521.4:2000SEP08 7247994H1 3343 3773
66 LG:1045521.4:2000SEP08 g4223122 3345 3775
66 LG:1045521.4:2000SEP08 g3839512 3351 3775
66 LG:1045521.4:2000SEP08 g5887631 3350 3775
66 LG:1045521.4:2000SEP08 g5855693 3351 3776
66 LG:1045521.4:2000SEP08 g4394054 3354 3773
66 LG:1045521.4:2000SEP08 g5743698 3363 3775
66 LG:1045521.4:2000SEP08 2112746T6 3317 3735
66 LG:1045521.4:2000SEP08 1813318T6 3319 3739
66 LG:1045521.4:2000SEP08 g5673655 3323 3775
66 LG:1045521.4:2000SEP08 1813318F6 3326 3657
66 LG:1045521.4:2000SEP08 1813318H1 3326 3558
66 LG:1045521.4:2000SEP08 g4309999 3332 3776
66 LG:1045521.4:2000SEP08 g5837582 3338 3781
66 LG:1045521.4:2000SEP08 g694185 3465 3787
66 LG:1045521.4:2000SEP08 g566999 3478 3773
66 LG:1045521.4:2000SEP08 2082531T6 3413 3734
66 LG:1045521.4:2000SEP08 g5934167 3415 3778
66 LG:1045521.4:2000SEP08 g5365380 3415 3775
66 LG:1045521.4:2000SEP08 g1194703 3414 3773
66 LG:1045521.4:2000SEP08 g2348619 3414 3775
66 LG:1045521.4:2000SEP08 g5675116 3415 3776
66 LG:1045521.4:2000SEP08 g2584365 3416 3775
66 LG:1045521.4:2000SEP08 g2953747 3417 3777
66 LG:1045521.4:2000SEP08 g2264833 3419 3775
66 LG:1045521.4:2000SEP08 g5152477 3424 3753
66 LG:1045521.4:2000SEP08 951625H1 3450 3682
66 LG:1045521.4:2000SEP08 3016423H1 3453 3753
66 LG:1045521.4:2000SEP08 g6474480 3316 3775
66 LG:1045521.4:2000SEP08 2643406T6 1457 1878
66 LG:1045521.4:2000SEP08 6630849H1 1476 1921
66 LG:1045521.4:2000SEP08 7340290H1 1192 1771
66 LG:1045521.4:2000SEP08 g4327763 3512 3775
66 LG:1045521.4:2000SEP08 2301252H2 3493 3666
66 LG:1045521.4:2000SEP08 2510130H1 3513 3734
66 LG:1045521.4:2000SEP08 g5671827 3479 3773
66 LG:1045521.4:2000SEP08 g3677607 3479 3773
66 LG:1045521.4:2000SEP08 g3755042 3481 3775
66 LG:1045521.4:2000SEP08 5102917H1 3489 3763
66 LG:1045521.4:2000SEP08 g3755759 3486 3776
66 LG:1045521.4:2000SEP08 g4328020 3488 3775
66 LG:1045521.4:2000SEP08 6616957H1 1254 1386
66 LG:1045521.4:2000SEP08 5447708H2 3314 3558
66 LG:1045521.4:2000SEP08 2115461H1 1425 1704
66 LG:1045521.4:2000SEP08 g1947654 1424 1631
66 LG:1045521.4:2000SEP08 3526153H1 1444 1718
66 LG:1045521.4:2000SEP08 2758729F6 1418 1919
66 LG:1045521.4:2000SEP08 3151868H1 1417 1703
66 LG:1045521.4:2000SEP08 3151862T6 1417 1880
66 LG:1045521.4:2000SEP08 7705621J1 2286 2850
66 LG:1045521.4:2000SEP08 6549421H1 1277 1806
66 LG:1045521.4:2000SEP08 6549423H1 1277 1587
66 LG:1045521.4:2000SEP08 917768H1 1289 1522
66 LG:1045521.4:2000SEP08 915656H1 1289 1584
66 LG:1045521.4:2000SEP08 1688229T6 1310 1882
66 LG:1045521.4:2000SEP08 2758729H1 1324 1502
66 LG:1045521.4:2000SEP08 2758729R6 1324 1851
66 LG:1045521.4:2000SEP08 5813069T8 1334 1842
66 LG:1045521.4:2000SEP08 945106H1 1329 1605
66 LG:1045521.4:2000SEP08 1686019H1 1331 1520
66 LG:1045521.4:2000SEP08 1686019F6 1331 1850
66 LG:1045521.4:2000SEP08 5104890H1 1334 1560
66 LG:1045521.4:2000SEP08 5104512H1 1336 1575
66 LG:1045521.4:2000SEP08 6500521H1 1360 1913
66 LG:1045521.4:2000SEP08 601501H1 1382 1634
66 LG:1045521.4:2000SEP08 3151862R6 1417 1919
66 LG:1045521.4:2000SEP08 2510294H1 934 1252
66 LG:1045521.4:2000SEP08 2510294F6 937 1482
66 LG:1045521.4:2000SEP08 6977627H1 964 1515
66 LG:1045521.4:2000SEP08 3297985H1 973 1220
66 LG:1045521.4:2000SEP08 g1964154 1107 1527
66 LG:1045521.4:2000SEP08 3951209H1 1127 1375
66 LG:1045521.4:2000SEP08 7279066H1 1131 1540
66 LG:1045521.4:2000SEP08 6839180H1 1147 1699
66 LG:1045521.4:2000SEP08 5542001H1 1154 1374
66 LG:1045521.4:2000SEP08 4892761H1 1181 1459
66 LG:1045521.4:2000SEP08 g2159379 1189 1613
66 LG:1045521.4:2000SEP08 5387974H1 1192 1461
66 LG:1045521.4:2000SEP08 3069606H1 1208 1500
66 LG:1045521.4:2000SEP08 3501320H1 1214 1509
66 LG:1045521.4:2000SEP08 5960233H1 1223 1761
66 LG:1045521.4:2000SEP08 5817458H1 1231 1513
66 LG:1045521.4:2000SEP08 5821151H1 1231 1513
66 LG:1045521.4:2000SEP08 6315173H1 1231 1819
66 LG:1045521.4:2000SEP08 985023R1 1248 1697
66 LG:1045521.4:2000SEP08 985023H1 1248 1524
66 LG:1045521.4:2000SEP08 6270034H2 1272 1798
66 LG:1045521.4:2000SEP08 2643406H1 330 583
66 LG:1045521.4:2000SEP08 7256990H1 431 803
66 LG:1045521.4:2000SEP08 5877287H1 498 743
66 LG:1045521.4:2000SEP08 2112746H1 499 749
66 LG:1045521.4:2000SEP08 5072105H1 543 816
66 LG:1045521.4:2000SEP08 1493714H1 599 814
66 LG:1045521.4:2000SEP08 5897630H1 620 928
66 LG:1045521.4:2000SEP08 6041048H1 625 1192
66 LG:1045521.4:2000SEP08 g707605 636 884
66 LG:1045521.4:2000SEP08 7280773H1 638 1185
66 LG:1045521.4:2000SEP08 g1815503 641 896
66 LG:1045521.4:2000SEP08 5598229H1 677 926
66 LG:1045521.4:2000SEP08 2706109F6 683 1153
66 LG:1045521.4:2000SEP08 2706109H1 683 978
66 LG:1045521.4:2000SEP08 4308862H1 708 842
66 LG:1045521.4:2000SEP08 g1189689 727 1134
66 LG:1045521.4:2000SEP08 3268156H1 746 981
66 LG:1045521.4:2000SEP08 g1959888 747 1207
66 LG:1045521.4:2000SEP08 6913104J1 765 1279
66 LG:1045521.4:2000SEP08 g1692292 767 1116
66 LG:1045521.4:2000SEP08 7584006H1 858 941
66 LG:1045521.4:2000SEP08 7280450H1 867 1393
66 LG:1045521.4:2000SEP08 4557319H1 873 1128
66 LG:1045521.4:2000SEP08 6315112H1 873 996
66 LG:1045521.4:2000SEP08 7170747H1 875 972
66 LG:1045521.4:2000SEP08 4127083H1 909 1138
66 LG:1045521.4:2000SEP08 g4243806 340
66 LG:1045521.4:2000SEP08 7601257J1 256
66 LG:1045521.4:2000SEP08 g4244618 450
66 LG:1045521.4:2000SEP08 g4851668 475
66 LG:1045521.4:2000SEP08 g4111830 469
66 LG:1045521.4:2000SEP08 g898045 349
66 LG:1045521.4:2000SEP08 6818820J1 232
66 LG:1045521.4:2000SEP08 6764773J1 591
66 LG:1045521.4:2000SEP08 g5511183 476
66 LG:1045521.4:2000SEP08 g5590133 16 474
66 LG:1045521.4:2000SEP08 g4088559 33 452
66 LG:1045521.4:2000SEP08 g4300922 33 418
66 LG:1045521.4:2000SEP08 g898069 40 304
66 LG:1045521.4:2000SEP08 g810354 45 328
66 LG:1045521.4:2000SEP08 g2898466 48 455
66 LG:1045521.4:2000SEP08 g3050233 49 227
66 LG:1045521.4:2000SEP08 5623371H1 71 387
66 LG:1045521.4:2000SEP08 6863535H1 104 594
66 LG:1045521.4:2000SEP08 6996631H1 111 438
66 LG:1045521.4:2000SEP08 g1648497 139 498
66 LG:1045521.4:2000SEP08 g1668085 139 495
66 LG:1045521.4:2000SEP08 g774564 177 307
66 LG:1045521.4:2000SEP08 g810462 259 566
66 LG:1045521.4:2000SEP08 7081930H1 287 837
66 LG:1045521.4:2000SEP08 2643406F6 330 832
66 LG:1045521.4:2000SEP08 5813069F8 1231 1851
66 LG:1045521.4:2000SEP08 1551431H1 3304 3506
66 LG:1045521.4:2000SEP08 1811620H1 3411 3589
66 LG:1045521.4:2000SEP08 4722921H1 3449 3513
67 LG:275876.1:2000SEP08 5502856R6 711 1190
67 LG:275876.1:2000SEP08 g6704557 1 400
67 LG:275876.1:2000SEP08 g3250183 33 488
67 LG:275876.1:2000SEP08 4251835H1 91 305
67 LG:275876.1:2000SEP08 6880661H1 263 857
68 LG:475127.7:2000SEP08 g1964734 1 363
68 LG:475127.7:2000SEP08 7169164H1 1 501
68 LG:475127.7:2000SEP08 3601433H1 160 458
69 LG:157263.1:2000SEP08 7401256H1 1076 1531
69 LG:157263.1:2000SEP08 g4018570 221 444
69 LG:157263.1:2000SEP08 7618286H1 1 565
69 LG:157263.1:2000SEP08 6916646H1 406 716
69 LG:157263.1:2000SEP08 7641384H1 475 826
69 LG:157263.1:2000SEP08 7765238H1 569 1157
69 LG:157263.1:2000SEP08 7618286J1 700 1286
69 LG:157263.1:2000SEP08 7765238J1 929 1542
69 LG:157263.1:2000SEP08 817477H1 958 1222
70 LG:247382.7:2000SEP08 2989208H1 1 192
70 LG:247382.7:2000SEP08 7600764J1 1 498
70 LG:247382.7:2000SEP08 7600764H1 16 515
70 LG:247382.7:2000SEP08 2071263H1 1108 1372
70 LG:247382.7:2000SEP08 2852080H1 1140 1356
70 LG:247382.7:2000SEP08 7703934H1 1154 1680
70 LG:247382.7:2000SEP08 3532904H1 1196 1480
70 LG:247382.7:2000SEP08 4305547H1 1235 1545
70 LG:247382.7:2000SEP08 4640181H1 1462 1671
70 LG:247382.7:2000SEP08 7697061 HI 495 1087
70 LG:247382.7:2000SEP08 2686388H1 515 755
70 LG:247382.7:2000SEP08 010297H1 533 800
70 LG:247382.7:2000SEP08 010377H1 533 816
70 LG:247382.7:2000SEP08 010299H1 535 796
70 LG:247382.7:2000SEP08 013850H1 550 837
70 LG:247382.7:2000SEP08 013958H1 550 831
70 LG:247382.7:2000SEP08 g5540183 605 952
70 LG:247382.7:2000SEP08 5291818H1 670 929
70 LG:247382.7:2000SEP08 7703934J1 869 1414
70 LG:247382.7:2000SEP08 2503245H1 929 1156
74 LG:197614.1:2000SEP08 7650036H2 2417 2975
74 LG:197614.1:2000SEP08 7156980H1 938 1507
74 LG:197614.1:2000SEP08 7663421H1 1134 1649
74 LG:197614.1:2000SEP08 7650036J2 1157 1772
74 LG:197614.1:2000SEP08 7359954H1 1252 1784
74 LG:197614.1:2000SEP08 7621265J1 1262 1833
74 LG:197614.1:2000SEP08 5732485H1 1126 1318
74 LG:197614.1:2000SEP08 7080048H1 1640 2223
74 LG:197614.1:2000SEP08 6485720H1 1880 2304
74 LG:197614.1:2000SEP08 6485720F9 1926 2075
74 LG:197614.1:2000SEP08 081459H1 1973 2232
74 LG:197614.1:2000SEP08 7663421J1 2044 2608
74 LG:197614.1:2000SEP08 g3423345 3018 3467
74 LG:197614.1:2000SEP08 g5590054 3020 3467
74 LG:197614.1:2000SEP08 g4111681 3027 3467
74 LG:197614.1:2000SEP08 2617018H2 3029 3265
74 LG:197614.1:2000SEP08 3815347H1 3033 3315
74 LG:197614.1:2000SEP08 g777251 3046 3330
74 LG:197614.1:2000SEP08 g2817792 3086 3475
74 LG:197614.1:2000SEP08 4897664H1 3826 3991
74 LG:197614.1:2000SEP08 4896259H1 3849 4001
74 LG:197614.1:2000SEP08 g1212198 3850 4178
74 LG:197614.1:2000SEP08 g1748459 3880 3967
74 LG:197614.1:2000SEP08 5063386H1 3886 4001
74 LG:197614.1:2000SEP08 1491069H1 3900 4105
74 LG:197614.1:2000SEP08 1491069F6 3900 4001
74 LG:197614.1:2000SEP08 g896855 4022 4245
74 LG:197614.1:2000SEP08 g4901112 4022 4378
74 LG:197614.1:2000SEP08 g4990162 4025 4297
74 LG:197614.1:2000SEP08 g6699628 4025 4483
74 LG:197614.1:2000SEP08 180009T6 4072 4569
74 LG:197614.1:2000SEP08 1491069T6 4075 4576
74 LG:197614.1:2000SEP08 6208837H1 4088 4388
74 LG:197614.1:2000SEP08 7355849H1 4097 4587
74 LG:197614.1:2000SEP08 2588654H1 2874 3132
74 LG:197614.1:2000SEP08 426115R6 2948 3453
74 LG:197614.1:2000SEP08 g1492260 2879 3120
74 LG:197614.1:2000SEP08 3873481H1 2937 3214
74 LG:197614.1:2000SEP08 425031H1 2948 3210
75 LG:378428.1:2000SEP08 6910888F8 560 708
75 LG:378428.1:2000SEP08 7379444H1 36 599
75 LG:378428.1:2000SEP08 6816543H1 185 640
75 LG:378428.1:2000SEP08 6816543F8 285 641
75 LG:378428.1:2000SEP08 7429391H1 439 956
75 LG:378428.1:2000SEP08 6816543R8 1 644
75 LG:378428.1:2000SEP08 7379418H1 36 609
75 LG:378428.1:2000SEP08 6910888J1 768 1376
75 LG:378428.1:2000SEP08 6910888R8 773 1449
75 LG:378428.1:2000SEP08 6044422H1 726 1023
75 LG:378428.1:2000SEP08 6044422F8 725 1341
Figure imgf000154_0001
Figure imgf000154_0002
TABLE 5
SEQ ID NO: Template ID Component ID Start Stop
77 LG:389870.1:2000SEP08 7065623H1 534 764
77 LG:389870.1:2000SEP08 7191788H2 368 640
77 LG:389870.1:2000SEP08 7354238H1 129 677
77 LG:389870.1:2000SEP08 g2740716 1 454
77 LG:389870.1:2000SEP08 g4389793 1 454
77 LG:389870.1:2000SEP08 7459406H1 769 1222
77 LG:389870.1:2000SEP08 4030167H1 620 850
77 LG:389870.1:2000SEP08 7583301H1 710 1249
78 LG:1387485.6:2000SEP08 1259442F6 727 1286
78 LG:1387485.6:2000SEP08 575641H1 210 423
78 LG:1387485.6:2000SEP08 2695041H1 441 744
78 LG:1387485.6:2000SEP08 5394337H1 572 857
78 LG:1387485.6:2000SEP08 4773576H1 617 849
78 LG:1387485.6:2000SEP08 1722977H1 707 964
78 LG:1387485.6:2000SEP08 1722977F6 707 1113
78 LG:1387485.6:2000SEP08 1259442H1 726 966
78 LG:1387485.6:2000SEP08 5004203H1 894 1024
78 LG:1387485.6:2000SEP08 4545090H1 897 1175
78 LG:1387485.6:2000SEP08 2944136F6 907 1283
78 LG:1387485.6:2000SEP08 2944136H1 907 1198
78 LG:1387485.6:2000SEP08 3495979H1 912 1194
78 LG:1387485.6:2000SEP08 g1202624 939 1381
78 LG:1387485.6:2000SEP08 g1012265 955 1249
78 LG:1387485.6:2000SEP08 g1042452 956 1258
78 LG:1387485.6:2000SEP08 g983177 962 1336
78 LG:1387485.6:2000SEP08 3049691H1 963 1254
78 LG:1387485.6:2000SEP08 g1218110 1001 1269
78 LG:1387485.6:2000SEP08 2652533H1 1016 1269
78 LG:1387485.6:2000SEP08 2098947H1 1224 1456
78 LG:1387485.6:2000SEP08 1309581H1 1328 1575
78 LG:1387485.6:2000SEP08 1004812H1 1367 1552
78 LG:1387485.6:2000SEP08 1568541H1 1392 1607
78 LG:1387485.6:2000SEP08 2764023H1 1 259
78 LG:1387485.6:2000SEP08 530901H1 2 134
78 LG:1387485.6:2000SEP08 4049678H1 11 290
78 LG:1387485.6:2000SEP08 3594169H1 11 279
78 LG:1387485.6:2000SEP08 4330447H1 17 271
78 LG:1387485.6:2000SEP08 4565262H1 39 285
78 LG:1387485.6:2000SEP08 6408733H1 162 683
78 LG:1387485.6:2000SEP08 g1295497 175 615
78 LG:1387485.6:2000SEP08 7039706H1 356 875
78 LG:1387485.6:2000SEP08 986902H1 225 462
78 LG:1387485.6:2000SEP08 7939046H1 446 921
78 LG:1387485.6:2000SEP08 7270571H1 206 743
78 LG:1387485.6:2000SEP08 1704717H1 255 425
78 LG:1387485.6:2000SEP08 4723677H1 207 436
79 LG:230151.1:2000SEP08 g1760799 1489 1929
79 LG:230151.1:2000SEP08 2834338F6 1524 2017
79 LG:230151.1:2000SEP08 2588088F6 1290 1516
79 LG:230151.1:2000SEP08 2588088H1 1290 1533
Figure imgf000156_0001
Figure imgf000156_0002
Figure imgf000156_0003
Figure imgf000156_0004
Figure imgf000156_0005
Figure imgf000157_0001
Figure imgf000157_0002
Figure imgf000157_0003
Figure imgf000158_0001
Figure imgf000158_0002
Figure imgf000159_0001
Figure imgf000160_0001
Figure imgf000160_0002
Figure imgf000160_0003
Figure imgf000160_0004
TABLE 5
SEQ ID NO: Template ID Component ID Start Stop
81 LG:235840.1:2000SEP08 7613750H1 1669 2267
81 LG:235840.1:2000SEP08 4898449H1 1682 1867
81 LG:235840.1:2000SEP08 3464906H1 1690 2022
81 LG:235840.1:2000SEP08 004533H1 1582 1874
81 LG:235840.1:2000SEP08 6730659H1 1604 1825
81 LG:235840.1:2000SEP08 41 6254H1 2224 2519
81 LG:235840.1:2000SEP08 5914590R8 1568 2109
81 LG:235840.1:2000SEP08 2960481H2 1566 1853
81 LG:235840.1:2000SEP08 5056390H1 1578 1850
81 LG:235840.1:2000SEP08 5034382T6 2283 2679
81 LG:235840.1:2000SEP08 2676562H1 2277 2526
81 LG:235840.1:2000SEP08 5693701H1 2282 2543
81 LG:235840.1:2000SEP08 7738333H1 2045 2594
81 LG:235840.1:2000SEP08 4664169T6 2049 2595
81 LG:235840.1:2000SEP08 639274H1 2058 2332
81 LG:235840.1:2000SEP08 544137H1 2149 2293
81 LG:235840.1:2000SEP08 g1270286 2033 2216
81 LG:235840.1:2000SEP08 1285809H1 1979 2216
81 LG:235840.1:2000SEP08 6442655H1 1795 2303
81 LG:235840.1:2000SEP08 6711829H1 1755 2246
81 LG:235840.1:2000SEP08 4351829H1 1735 2063
81 LG:235840.1:2000SEP08 7213696H1 1946 2423
81 LG:235840.1:2000SEP08 g692276 1883 2221
81 LG:235840.1:2000SEP08 g692306 1883 1965
81 LG:235840.1:2000SEP08 6377861H1 1919 2189
82 LG:350272.1:2000SEP08 960820R6 1570 1941
82 LG:350272.1:2000SEP08 2697244H1 1573 1858
82 LG:350272.1:2000SEP08 960820H1 1570 1845
82 LG:350272.1:2000SEP08 7326176H1 1452 1912
82 LG:350272.1:2000SEP08 6311958H1 1468 1895
82 LG:350272.1:2000SEP08 6201854H1 1480 1915
82 LG:350272.1:2000SEP08 3559406H1 1479 1582
82 LG:350272.1:2000SEP08 3490063H1 1481 1775
82 LG:350272.1:2000SEP08 5020696T1 1335 1784
82 LG:350272.1:2000SEP08 2538770H1 1336 1545
82 LG:350272.1:2000SEP08 g2674996 1426 1831
82 LG:350272.1:2000SEP08 g1186534 1432 1832
82 LG:350272.1:2000SEP08 603382H1 1336 1584
82 LG:350272.1:2000SEP08 3740987H1 1339 1633
82 LG:350272.1:2000SEP08 5020588T1 1354 1783
82 LG:350272.1:2000SEP08 4241989H1 1420 1748
82 LG:350272.1:2000SEP08 581813R6 1212 1561
82 LG:350272.1:2000SEP08 7322258H1 1251 1868
82 LG:350272.1:2000SEP08 2127622H1 1277 1536
82 LG:350272.1:2000SEP08 7082642H1 857 1251
82 LG:350272.1:2000SEP08 g1152456 872 1192
82 LG:350272.1:2000SEP08 g3958809 873 1219
82 LG:350272.1:2000SEP08 g3277672 873 1323
82 LG:350272.1:2000SEP08 g3249855 873 1308
82 LG:350272.1:2000SEP08 g1101302 901 1173
82 LG:350272.1:2000SEP08 g2820552 903 1337
82 LG:350272.1:2000SEP08 5020696H1 907 1173
82 LG:350272.1:2000SEP08 5062519H1 941 1210
82 LG:350272.1:2000SEP08 g3163179 974 1240
82 LG:350272.1:2000SEP08 5696180H1 1003 1267
82 LG:350272.1:2000SEP08 6821260J1 1043 1481
82 LG:350272.1:2000SEP08 3224632H2 1053 1337
82 LG:350272.1:2000SEP08 7764948H1 1150 1503
82 LG:350272.1:2000SEP08 3721403H1 1177 1425
82 LG:350272.1:2000SEP08 5865088H1 1201 1485
82 LG:350272.1:2000SEP08 581813H1 1212 1475
82 LG:350272.1:2000SEP08 4010319H1 2042 2252
82 LG:350272.1:2000SEP08 1741002T6 2042 2491
82 LG:350272.1:2000SEP08 2650967H1 2043 2222
82 LG:350272.1:2000SEP08 4013519H1 2051 2256
82 LG:350272.1:2000SEP08 4353493H1 2051 2142
82 LG:350272.1:2000SEP08 211293H1 2348 2536
82 LG:350272.1:2000SEP08 4353485H1 2051 2140
82 LG:350272.1:2000SEP08 211114H1 2052 2102
82 LG:350272.1:2000SEP08 3621890H1 2051 2133
82 LG:350272.1:2000SEP08 581813T6 2053 2491
82 LG:350272.1:2000SEP08 211696H1 2348 2529
82 LG:350272.1:2000SEP08 g2820887 2054 2532
82 LG:350272.1:2000SEP08 1684883T6 2062 2494
82 LG:350272.1:2000SEP08 633648H1 2354 2542
82 LG:350272.1:2000SEP08 1684883F6 2062 2529
82 LG:350272.1:2000SEP08 6606673H1 2077 2530
82 LG:350272.1:2000SEP08 3565543H1 2365 2487
82 LG:350272.1:2000SEP08 g2359505 2443 2529
82 LG:350272.1:2000SEP08 g4649884 2445 2523
82 LG:350272.1:2000SEP08 g4970896 2078 2534
82 LG:350272.1:2000SEP08 g4125734 2080 2518
82 LG:350272.1:2000SEP08 g3741618 2095 2538
82 LG:350272.1:2000SEP08 g6038705 2111 2529
82 LG:350272.1:2000SEP08 g4074869 2114 2535
82 LG:350272.1:2000SEP08 g5755616 2127 2535
82 LG:350272.1:2000SEP08 g2465965 2144 2535
82 LG:350272.1:2000SEP08 g2669493 2154 2531
82 LG:350272.1:2000SEP08 g4618967 2155 2529
82 LG:350272.1:2000SEP08 g2270187 2176 2532
82 LG:350272.1:2000SEP08 g855861 2193 2526
82 LG:350272.1:2000SEP08 g5397025 2183 2529
82 LG:350272.1:2000SEP08 g3307326 2185 2538
82 LG:350272.1:2000SEP08 6615126H1 2467 2529
82 LG:350272.1:2000SEP08 g2715505 2186 2535
82 LG:350272.1:2000SEP08 g4267877 2203 2523
82 LG:350272.1:2000SEP08 g4267523 2203 2523
82 LG:350272.1:2000SEP08 g4267458 2203 2523
82 LG:350272.1:2000SEP08 1637245H1 2205 2400
82 LG:350272.1:2000SEP08 g2669985 2196 2427
82 LG:350272.1:2000SEP08 2561156H1 2221 2501
82 LG:350272.1:2000SEP08 g5364704 2224 2530
82 LG:350272.1:2000SEP08 6843381H1 2239 2529
82 LG:350272.1:2000SEP08 g5755074 2262 2535
82 LG:350272.1:2000SEP08 6326064H1 2271 2532
82 LG:350272.1:2000SEP08 5306551H1 2276 2400
82 LG:350272.1:2000SEP08 644238T6 2276 2495
82 LG:350272.1:2000SEP08 5306583H1 2277 2429
82 LG:350272.1:2000SEP08 6383176H1 2285 2491
82 LG:350272.1:2000SEP08 g4079566 2287 2529
82 LG:350272.1:2000SEP08 g2577306 2288 2529
82 LG:350272.1:2000SEP08 g5812197 2319 2515
82 LG:350272.1:2000SEP08 1684883H1 2327 2529
82 LG:350272.1:2000SEP08 3932451H1 2346 2529
82 LG:350272.1:2000SEP08 6058322H1 1563 1897
82 LG:350272.1:2000SEP08 4121623H1 1566 1861
82 LG:350272.1:2000SEP08 1888879H1 1519 1804
82 LG:350272.1:2000SEP08 4665428H1 1578 1849
82 LG:350272.1:2000SEP08 708387H1 1595 1868
82 LG:350272.1:2000SEP08 g3213833 1605 1830
82 LG:350272.1:2000SEP08 2715220H1 1611 1854
82 LG:350272.1:2000SEP08 g2575091 1627 1878
82 LG:350272.1:2000SEP08 6486886H1 1625 2162
82 LG:350272.1:2000SEP08 2206242H1 1627 1874
82 LG:350272.1:2000SEP08 6821260H1 710 1191
82 LG:350272.1:2000SEP08 7746167H1 848 1336
82 LG:350272.1:2000SEP08 3107667H1 1 210
82 LG:350272.1:2000SEP08 7674927H2 133 323
82 LG:350272.1:2000SEP08 754982H1 168 390
82 LG:350272.1:2000SEP08 7346935H1 183 661
82 LG:350272.1:2000SEP08 g4685374 380 855
82 LG:350272.1:2000SEP08 g5878247 383 855
82 LG:350272.1:2000SEP08 7313314H1 494 846
82 LG:350272.1:2000SEP08 g4649651 510 855
82 LG:350272.1:2000SEP08 g3191587 668 829
82 LG:350272.1:2000SEP08 g4889943 847 1276
82 LG:350272.1:2000SEP08 3844367H1 1634 1945
82 LG:350272.1:2000SEP08 g2900340 1644 1819
82 LG:350272.1:2000SEP08 2805489H1 1667 1917
82 LG:350272.1:2000SEP08 5766360H1 1673 2184
82 LG:350272.1:2000SEP08 3016843H1 1699 1895
82 LG:350272.1:2000SEP08 6552652H1 1766 2282
82 LG:350272.1:2000SEP08 6552052H1 1766 2225
82 LG:350272.1:2000SEP08 4138187H1 1789 1895
82 LG:350272.1:2000SEP08 1864387H1 1808 1895
82 LG:350272.1:2000SEP08 2116744H1 1814 1895
82 LG:350272.1:2000SEP08 g855860 1834 2117
82 LG:350272.1:2000SEP08 644363H1 1838 2105
82 LG:350272.1:2000SEP08 644238R6 1838 2391
82 LG:350272.1:2000SEP08 960820T6 2038 2494
Figure imgf000164_0001
Figure imgf000164_0002
Figure imgf000164_0003
TABLE 5
SEQ ID NO: Template ID Component ID Start Stop
85 LG:408751.3:2000SEP08 7359626H1 379 948
85 LG:408751.3:2000SEP08 4338771H1 359 628
85 LG:408751.3:2000SEP08 g708822 393 694
85 LG:408751.3:2000SEP08 g764692 395 736
85 LG:408751.3:2000SEP08 g816062 372 784
85 LG:408751.3:2000SEP08 7587293H1 379 952
85 LG:408751.3:2000SEP08 3864471H1 374 591
85 LG:408751.3:2000SEP08 6990907H1 383 885
85 LG:408751.3:2000SEP08 6866026H1 381 974
85 LG:408751.3:2000SEP08 5674272H1 391 645
85 LG:408751.3:2000SEP08 6120160H1 391 790
85 LG:408751.3:2000SEP08 6448066H1 412 963
85 LG:408751.3:2000SEP08 7580272H1 393 907
85 LG:408751.3:2000SEP08 g691925 443 755
85 LG:408751.3:2000SEP08 533539R6 424 939
85 LG:408751.3:2000SEP08 4705993H1 1999 2149
85 LG:408751.3:2000SEP08 7158434H1 2001 2270
85 LG:408751.3:2000SEP08 6983326H1 2077 2666
85 LG:408751.3:2000SEP08 6770575H1 2300 2823
85 LG:408751.3:2000SEP08 6765966H1 2133 2654
85 LG:408751.3:2000SEP08 7580816H1 2166 2665
85 LG:408751.3:2000SEP08 1270536H1 2175 2415
85 LG:408751.3:2000SEP08 7281449H1 2232 2815
85 LG:408751.3:2000SEP08 7582690H1 2242 2758
85 LG:408751.3:2000SEP08 7275431J1 2370 2836
85 LG:408751.3:2000SEP08 5920291H1 208 267
85 LG:408751.3:2000SEP08 1456735F6 189 605
85 LG:408751.3:2000SEP08 6721132H1 193 579
85 LG:408751.3:2000SEP08 4203426H1 212 337
85 LG:408751.3:2000SEP08 1992224H1 206 475
85 LG:408751.3:2000SEP08 7259028H1 204 579
85 LG:408751.3:2000SEP08 g766593 289 587
85 LG:408751.3:2000SEP08 7058996H1 309 890
85 LG:408751.3:2000SEP08 g570855 2997 3307
85 LG:408751.3:2000SEP08 5371992H1 3015 3187
85 LG:408751.3:2000SEP08 g389944 3100 3330
85 LG:408751.3:2000SEP08 2416693F6 3191 3330
85 LG:408751.3:2000SEP08 2416693H1 3191 3330
85 LG:408751.3:2000SEP08 g318775 3202 3330
85 LG:408751.3:2000SEP08 1594111H1 3236 3330
85 LG:408751.3:2000SEP08 g389942 3242 3330
85 LG:408751.3:2000SEP08 5406949T6 1495 1919
85 LG:408751.3:2000SEP08 5945223H1 1574 1656
85 LG:408751.3:2000SEP08 6773005H1 1585 2157
85 LG:408751.3:2000SEP08 6869723H1 1599 1944
85 LG:408751.3:2000SEP08 g2985356 1730 1957
85 LG:408751.3:2000SEP08 6768978H1 1703 2265
85 LG:408751.3:2000SEP08 2154505H1 1905 2188
85 LG:408751.3:2000SEP08 7585176H1 1906 2405
85 LG:408751.3:2000SEP08 2154505F7 1906 2204
Figure imgf000166_0001
TABLE 5
SEQ ID NO: Template ID Component ID Start Stop
85 LG:408751.3:2000SEP08 6572506H1 1480 2041
85 LG:408751.3:2000SEP08 7410777H1 1480 2021
85 LG:408751.3:2000SEP08 g5511164 1289 1671
85 LG:408751.3:2000SEP08 g3649444 1290 1673
85 LG:408751.3:2000SEP08 g314750 1302 1671
85 LG:408751.3:2000SEP08 g775420 1276 1685
85 LG:408751.3:2000SEP08 g4617815 1287 1678
85 LG:408751.3:2000SEP08 g314920 1339 1671
85 LG:408751.3:2000SEP08 1270695T6 1192 1632
85 LG:408751.3:2000SEP08 4717574T6 1201 1650
85 LG:408751.3:2000SEP08 1476570F6 1203 1671
85 LG:408751.3:2000SEP08 1476571F6 1203 1547
85 LG:408751.3:2000SEP08 1476570H1 1203 1409
85 LG:408751.3:2000SEP08 g614326 1215 1678
85 LG:408751.3:2000SEP08 1476571T6 1221 1634
85 LG:408751.3:2000SEP08 g4152280 1234 1403
85 LG:408751.3:2000SEP08 g4598685 1244 1672
85 LG:408751.3:2000SEP08 2153570H1 1256 1530
85 LG:408751.3:2000SEP08 g314775 1258 1671
85 LG:408751.3:2000SEP08 4492503H1 1264 1672
85 LG:408751.3:2000SEP08 g615988 1270 1671
85 LG:408751.3:2000SEP08 g4223790 827 1266
85 LG:408751.3:2000SEP08 5407853F6 814 1181
85 LG:408751.3:2000SEP08 6717166H1 833 1295
85 LG:408751.3:2000SEP08 7160561H1 835 1188
85 LG:408751.3:2000SEP08 g3331126 848 1268
85 LG:408751.3:2000SEP08 5310872H1 850 1076
85 LG:408751.3:2000SEP08 7388116H1 852 1314
85 LG:408751.3:2000SEP08 g6992853 862 1266
85 LG:408751.3:2000SEP08 5267191H1 869 1129
85 LG:408751.3:2000SEP08 4940779H1 890 1162
85 LG:408751.3:2000SEP08 1270258H1 892 1130
85 LG:408751.3:2000SEP08 g794503 898 1291
85 LG:408751.3:2000SEP08 g816007 896 1257
85 LG:408751.3:2000SEP08 g901436 904 1267
85 LG:408751.3:2000SEP08 5205391H1 597 836
85 LG:408751.3:2000SEP08 5498383F6 573 1056
85 LG:408751.3:2000SEP08 5498383H1 573 811
85 LG:408751.3:2000SEP08 5311056H1 591 753
85 LG:408751.3:2000SEP08 7372949H1 644 1194
85 LG:408751.3:2000SEP08 5907142H1 671 950
85 LG:408751.3:2000SEP08 5924427H1 693 983
85 LG:408751.3:2000SEP08 6869327H1 736 1241
85 LG:408751.3:2000SEP08 5406949H1 814 1046
85 LG:408751.3:2000SEP08 6984258H1 733 1127
85 LG:408751.3:2000SEP08 5406949F6 814 981
85 LG:408751.3:2000SEP08 009349H1 773 1119
85 LG:408751.3:2000SEP08 6888770H1 784 1299
85 LG:408751.3:2000SEP08 4943311T6 797 1243
85 LG:408751.3:2000SEP08 7279978H1 787 1322
85 LG:408751.3:2000SEP08 7292792H1 805 1378
85 LG:408751.3:2000SEP08 g1192539 814 1266
85 LG:408751.3:2000SEP08 6037020H1 802 1344
85 LG:408751.3:2000SEP08 5407853H1 814 940
85 LG:408751.3:2000SEP08 4705993T9 1119 1569
85 LG:408751.3:2000SEP08 004952H1 1179 1438
85 LG:408751.3:2000SEP08 1476570T6 1187 1631
85 LG:408751.3:2000SEP08 748579H1 1095 1335
85 LG:408751.3:2000SEP08 748579R1 1091 1668
85 LG:408751.3:2000SEP08 859218T6 1028 1600
85 LG:408751.3:2000SEP08 6855475H1 1028 1225
85 LG:408751.3:2000SEP08 6871051H1 1080 1596
85 LG:408751.3:2000SEP08 1270292T6 1031 1592
85 LG:408751.3:2000SEP08 g822109 1041 1250
85 LG:408751.3:2000SEP08 2416693T6 1106 1626
85 LG:408751.3:2000SEP08 g6086997 914 1274
85 LG:408751.3:2000SEP08 533539T6 921 1238
85 LG:408751.3:2000SEP08 5371992T9 928 1565
85 LG:408751.3:2000SEP08 g314842 931 1237
85 LG:408751.3:2000SEP08 g683067 953 1236
85 LG:408751.3:2000SEP08 7290682H1 961 1496
85 LG:408751.3:2000SEP08 859218R1 990 1510
85 LG:408751.3:2000SEP08 859218H1 990 1204
85 LG:408751.3:2000SEP08 859218R6 990 1432
85 LG:408751.3:2000SEP08 g567610 995 1237
85 LG:408751.3:2000SEP08 4943311H1 175 458
85 LG:408751.3:2000SEP08 4943311F6 175 595
85 LG:408751.3:2000SEP08 6818987H1 197 267
85 LG:408751.3:2000SEP08 1265660F1 181 790
85 LG:408751.3:2000SEP08 g1978747 1 307
85 LG:408751.3:2000SEP08 g5553287 1 315
85 LG:408751.3:2000SEP08 6818987J1 33 250
85 LG:408751.3:2000SEP08 6989857H1 1 436
85 LG:408751.3:2000SEP08 6955370H1 22 540
85 LG:408751.3:2000SEP08 g6701393 25 503
85 LG:408751.3:2000SEP08 g4390046 24 500
85 LG:408751.3:2000SEP08 g4534562 24 504
85 LG:408751.3:2000SEP08 g1192915 25 170
85 LG:408751.3:2000SEP08 g2003054 31 344
85 LG:408751.3:2000SEP08 6770575J1 35 555
85 LG:408751.3:2000SEP08 6572506J1 33 460
85 LG:408751.3:2000SEP08 6768978J1 33 631
85 LG:408751.3:2000SEP08 6765966J1 33 606
85 LG:408751.3:2000SEP08 6773005J1 33 637
85 LG:408751.3:2000SEP08 6818431J1 33 570
85 LG:408751.3:2000SEP08 g2003419 45 421
85 LG:408751.3:2000SEP08 g1551472 61 213
85 LG:408751.3:2000SEP08 6147606H1 71 625
85 LG:408751.3:2000SEP08 g615579 115 462
85 LG:408751.3:2000SEP08 g389770 122 510
85 LG:408751.3:2000SEP08 5311056F8 141 753
85 LG:408751.3:2000SEP08 6888770J1 153 753
85 LG:408751.3:2000SEP08 g615989 174 503
86 LG:1078933.1:2000SEP08 2730285H1 654 890
86 LG:1078933.1:2000SEP08 g2728816 753 922
86 LG:1078933.1:2000SEP08 4002923H1 773 1072
86 LG:1078933.1:2000SEP08 g2537503 800 1089
86 LG:1078933.1:2000SEP08 g6299364 1051 1124
86 LG:1078933.1:2000SEP08 4992604T6 442 970
86 LG:1078933.1:2000SEP08 2795327H1 459 701
86 LG:1078933.1:2000SEP08 6000894T8 511 998
86 LG:1078933.1:2000SEP08 4992604F6 1 561
86 LG:1078933.1:2000SEP08 4992604H1 1 207
86 LG:1078933.1:2000SEP08 7264011H1 45 555
86 LG:1078933.1:2000SEP08 4251691H1 398 656
86 LG:1078933.1:2000SEP08 5305837H1 410 651
86 LG:1078933.1:2000SEP08 2651913F6 438 990
86 LG:1078933.1:2000SEP08 2651913H1 438 683
86 LG:1078933.1:2000SEP08 1720617T6 592 1054
86 LG:1078933.1:2000SEP08 4030861T6 612 1166
87 LG:958731.1:2000SEP08 6274461H2 1 390
87 LG:958731.1:2000SEP08 6274461F8 1 586
87 LG:958731.1:2000SEP08 6274461T8 394 1095
88 LG:024125.5:2000SEP08 7937162H1 1 577
88 LG:024125.5:2000SEP08 3359165H1 14 254
88 LG:024125.5:2000SEP08 927867H1 28 176
88 LG:024125.5:2000SEP08 4729644H1 32 297
88 LG:024125.5:2000SEP08 g5430947 34 309
88 LG:024125.5:2000SEP08 g6300785 626 783
88 LG:024125.5:2000SEP08 g3918107 627 783
88 LG:024125.5:2000SEP08 1848687T6 640 818
88 LG:024125.5:2000SEP08 g4078394 641 783
88 LG:024125.5:2000SEP08 g6301603 646 783
88 LG:024125.5:2000SEP08 g4109417 671 783
88 LG:024125.5:2000SEP08 3799006H1 673 805
88 LG:024125.5:2000SEP08 g5531063 676 783
88 LG:024125.5:2000SEP08 g2115664 677 783
88 LG:024125.5:2000SEP08 g4004611 682 783
88 LG:024125.5:2000SEP08 g3417887 690 783
88 LG:024125.5:2000SEP08 g3422421 690 783
88 LG:024125.5:2000SEP08 g1141030 716 783
88 LG:024125.5:2000SEP08 g3078258 744 820
88 LG:024125.5:2000SEP08 g873980 752 809
88 LG:024125.5:2000SEP08 g5431328 622 805
88 LG:024125.5:2000SEP08 1528650H1 625 829
88 LG:024125.5:2000SEP08 3246738H1 568 823
88 LG:024125.5:2000SEP08 4544253H1 571 808
88 LG:024125.5:2000SEP08 4329943H1 557 759
88 LG:024125.5:2000SEP08 4270595H1 567 820
88 LG:024125.5:2000SEP08 2995696H1 549 800
Figure imgf000170_0001
Figure imgf000170_0002
Figure imgf000170_0003
Figure imgf000170_0004
TABLE 5
SEQ ID NO: Template ID Component ID Start Stop
89 LG:373637.3:2000SEP08 g2942533 587 903
89 LG:373637.3:2000SEP08 g6041238 605 900
89 LG:373637.3:2000SEP08 g3051904 649 900
89 LG:373637.3:2000SEP08 g3804542 400 782
89 LG:373637.3:2000SEP08 6245663F8 1 684
89 LG:373637.3:2000SEP08 6245663H1 1 508
89 LG:373637.3:2000SEP08 g3180013 567 1011
89 LG:373637.3:2000SEP08 g3594985 588 1008
89 LG:373637.3:2000SEP08 g5675509 577 1004
89 LG:373637.3:2000SEP08 6245663T8 237 904
89 LG:373637.3:2000SEP08 g2953832 641 903
89 LG:373637.3:2000SEP08 g6036927 493 903
89 LG:373637.3:2000SEP08 g5394478 447 903
90 LG:1053229.1:2000SEP08 1445465F6 56 334
90 LG:1053229.1:2000SEP08 6103338F7 66 595
90 LG:1053229.1:2000SEP08 1445465H1 53 322
90 LG:1053229.1:2000SEP08 g1259655 1 217
90 LG:1053229.1:2000SEP08 g1439745 13 308
90 LG:1053229.1:2000SEP08 6103338H1 66 376
90 LG:1053229.1:2000SEP08 1415866T6 393 587
90 LG:1053229.1:2000SEP08 g1442104 1 292
91 LG:248364.1:2000SEP08 7363704H1 621 1110
91 LG:248364.1:2000SEP08 7041822R8 693 1335
91 LG:248364.1:2000SEP08 7468175H1 456 898
91 LG:248364.1:2000SEP08 6758436H1 1100 1572
91 LG:248364.1:2000SEP08 6353449F7 1048 1595
91 LG:248364.1:2000SEP08 6608287H1 1244 1796
91 LG:248364.1:2000SEP08 7754413H1 1124 1659
91 LG:248364.1:2000SEP08 814363R1 1500 1930
91 LG:248364.1:2000SEP08 814363R6 1500 1854
91 LG:248364.1:2000SEP08 5047370H1 1447 1737
91 LG:248364.1:2000SEP08 7608161J1 1374 1930
91 LG:248364.1:2000SEP08 3071079H1 1364 1661
91 LG:248364.1:2000SEP08 g3931760 1512 1907
91 LG:248364.1:2000SEP08 814363H1 1500 1729
91 LG:248364.1:2000SEP08 7092325H1 146 384
91 LG:248364.1:2000SEP08 6567819H1 146 688
91 LG:248364.1:2000SEP08 6567819F6 146 606
91 LG:248364.1:2000SEP08 6776971J1 1 596
91 LG:248364.1:2000SEP08 g3873143 220 416
91 LG:248364.1:2000SEP08 g1067627 187 516
91 LG:248364.1:2000SEP08 g6575611 182 600
91 LG:248364.1:2000SEP08 g3734235 180 494
91 LG:248364.1:2000SEP08 7608161H1 805 1211
91 LG:248364.1:2000SEP08 7384566H1 843 1160
91 LG:248364.1:2000SEP08 7097046H1 1009 1496
92 LG:477130.1:2000SEP08 5910620H1 1 302
92 LG:477130.1:2000SEP08 5910620F8 1 460
92 LG:477130.1:2000SEP08 5910668H1 2 301
92 LG:477130.1:2000SEP08 5910620T9 280 764
92 LG:477130.1:2000SEP08 5910620T6 480 889
93 LG:113786.17:2000SEP08 2441606F6 1 496
93 LG:113786.17:2000SEP08 2441606H1 1 218
93 LG:113786.17:2000SEP08 2472557H1 33 262
93 LG:113786.17:2000SEP08 2472557F6 33 395
94 LG:347635.1:2000SEP08 5639879H1 58 292
94 LG:347635.1:2000SEP08 5639879F6 58 586
94 LG:347635.1:2000SEP08 7186104H1 1 542
94 LG:347635.1:2000SEP08 g2359789 585 1021
94 LG:347635.1:2000SEP08 7714779H1 677 1165
94 LG:347635.1:2000SEP08 g274093 691 991
94 LG:347635.1:2000SEP08 5699927F6 1064 1517
94 LG:347635.1:2000SEP08 g2743365 457 916
94 LG:347635.1:2000SEP08 7152112H1 582 1003
94 LG:347635.1:2000SEP08 6777521J1 235 837
94 LG:347635.1:2000SEP08 7714779J1 23 594
94 LG:347635.1:2000SEP08 7757195J1 1 542
94 LG:347635.1:2000SEP08 7757195H1 89 667
94 LG:347635.1:2000SEP08 5699927H1 1065 1313
95 LG:242966.4:2000SEP08 568455H1 1669 1916
95 LG:242966.4:2000SEP08 2512929T6 1690 1934
95 LG:242966.4:2000SEP08 2196270H1 1716 1926
95 LG:242966.4:2000SEP08 1660835H1 1719 1939
95 LG:242966.4:2000SEP08 1538251H1 1742 1964
95 LG:242966.4:2000SEP08 600839H1 1755 2010
95 LG:242966.4:2000SEP08 039755H1 1920 2026
95 LG:242966.4:2000SEP08 g3329925 1786 1973
95 LG:242966.4:2000SEP08 g2524293 1878 1941
95 LG:242966.4:2000SEP08 g6133327 1668 1982
95 LG:242966.4:2000SEP08 1417037H1 1658 1903
95 LG:242966.4:2000SEP08 1417037F6 1658 2026
95 LG:242966.4:2000SEP08 2597301T6 1602 1937
95 LG:242966.4:2000SEP08 983791H1 1606 1903
95 LG:242966.4:2000SEP08 g6701826 1609 1973
95 LG:242966.4:2000SEP08 1383448T6 1610 1933
95 LG:242966.4:2000SEP08 3282241H1 1628 1865
95 LG:242966.4:2000SEP08 5098821H1 1634 1895
95 LG:242966.4:2000SEP08 g4648103 1652 1973
95 LG:242966.4:2000SEP08 g5836941 1652 1973
95 LG:242966.4:2000SEP08 g5741352 1652 1977
95 LG:242966.4:2000SEP08 5817026H1 1524 1697
95 LG:242966.4:2000SEP08 g1734258 1550 1642
95 LG:242966.4:2000SEP08 477616R7 309 763
95 LG:242966.4:2000SEP08 7766841J2 338 939
95 LG:242966.4:2000SEP08 7763806H1 469 1087
95 LG:242966.4:2000SEP08 g2009268 496 708
95 LG:242966.4:2000SEP08 867326H1 504 762
95 LG:242966.4:2000SEP08 2180603F6 515 955
95 LG:242966.4:2000SEP08 2180603H1 515 800
95 LG:242966.4:2000SEP08 3344274H1 566 824
95 LG:242966.4:2000SEP08 7262796H1 1 524
95 LG:242966.4:2000SEP08 7633839J1 55 559
95 LG:242966.4:2000SEP08 6699726H1 189 550
95 LG:242966.4:2000SEP08 477616H1 309 570
95 LG:242966.4:2000SEP08 6420668H1 214 388
95 LG:242966.4:2000SEP08 2159734H1 277 500
95 LG:242966.4:2000SEP08 784416H1 286 603
95 LG:242966.4:2000SEP08 7677844H1 1485 1957
95 LG:242966.4:2000SEP08 1403054H1 1479 1725
95 LG:242966.4:2000SEP08 1403054F6 1479 1859
95 LG:242966.4:2000SEP08 6769563J1 1497 1981
95 LG:242966.4:2000SEP08 2490413H1 1506 1745
95 LG:242966.4:2000SEP08 2180603T6 1509 1931
95 LG:242966.4:2000SEP08 5817345H1 1522 1748
95 LG:242966.4:2000SEP08 1475794R6 1281 1636
95 LG:242966.4:2000SEP08 6352941F7 1403 1879
95 LG:242966.4:2000SEP08 1475794H1 1281 1459
95 LG:242966.4:2000SEP08 3016785H1 1414 1708
95 LG:242966.4:2000SEP08 2489244T6 1290 1789
95 LG:242966.4:2000SEP08 756332H1 1301 1559
95 LG:242966.4:2000SEP08 7766841H1 1308 1587
95 LG:242966.4:2000SEP08 1543702H1 1308 1491
95 LG:242966.4:2000SEP08 477616T7 1357 1905
95 LG:242966.4:2000SEP08 1475794T6 1371 1921
95 LG:242966.4:2000SEP08 2760146H1 1376 1648
95 LG:242966.4:2000SEP08 6352941F8 1417 2023
95 LG:242966.4:2000SEP08 6512741H1 1446 1589
95 LG:242966.4:2000SEP08 7696070H1 1376 1565
95 LG:242966.4:2000SEP08 7035051H1 1391 1964
95 LG:242966.4:2000SEP08 2491496H1 1960 2026
95 LG:242966.4:2000SEP08 2891711H1 613 884
95 LG:242966.4:2000SEP08 2597301H1 1246 1515
95 LG:242966.4:2000SEP08 3223664H1 632 893
95 LG:242966.4:2000SEP08 6937623H1 757 1032
95 LG:242966.4:2000SEP08 6300077H1 835 1118
95 LG:242966.4:2000SEP08 2860363H1 881 1149
95 LG:242966.4:2000SEP08 7763806J1 887 1510
95 LG:242966.4:2000SEP08 2519487H1 945 1116
95 LG:242966.4:2000SEP08 1572340H1 1134 1316
95 LG:242966.4:2000SEP08 1507445H1 1146 1318
95 LG:242966.4:2000SEP08 4635969H1 1184 1419
95 LG:242966.4:2000SEP08 2597301F6 1246 1631
95 LG:242966.4:2000SEP08 4599922H1 1564 1843
95 LG:242966.4:2000SEP08 g5527462 1579 1973
95 LG:242966.4:2000SEP08 g4875402 1581 1981
95 LG:242966.4:2000SEP08 g3329924 1584 1973
95 LG:242966.4:2000SEP08 g5527471 1585 1973
95 LG:242966.4:2000SEP08 7128343H2 1597 1970
96 LG:217814.1:2000SEP08 2203116F6 1355 1841
96 LG:217814.1:2000SEP08 2203116T6 1344 1838
Figure imgf000174_0001
Figure imgf000175_0001
Figure imgf000176_0001
Figure imgf000176_0002
102 LG:337818.2:2000SEP08 7719403J1 528 1194
102 LG:337818.2:2000SEP08 6826541H1 672 1240
102 LG:337818.2:2000SEP08 7634622J1 718 1162
102 LG:337818.2:2000SEP08 7634622H1 720 1162
102 LG:337818.2:2000SEP08 3857549H1 797 1106
102 LG:337818.2:2000SEP08 1749882H1 811 1085
102 LG:337818.2:2000SEP08 1749882F6 811 1171
102 LG:337818.2:2000SEP08 7675304H1 658 1121
102 LG:337818.2:2000SEP08 6829715H1 860 1357
102 LG:337818.2:2000SEP08 7746862H1 873 1313
102 LG:337818.2:2000SEP08 g1953371 885 986
102 LG:337818.2:2000SEP08 g1953334 885 1086
102 LG:337818.2:2000SEP08 2271032R6 893 1370
102 LG:337818.2:2000SEP08 2271032H1 893 1152
102 LG:337818.2:2000SEP08 7128738H1 913 1433
103 LG:1040582.1:2000SEP08 g1260446 2 321
103 LG:1040582.1:2000SEP08 6791379H1 1 402
103 LG:1040582.1:2000SEP08 g1614819 220 660
103 LG:1040582.1:2000SEP08 g1647514 249 548
104 LG:1099122.1:2000SEP08 6798356T8 13 405
104 LG:1099122.1:2000SEP08 6793934H1 2 425
104 LG:1099122.1:2000SEP08 6798356H1 14 514
104 LG:1099122.1:2000SEP08 6790886T8 1 416
104 LG:1099122.1:2000SEP08 6790886F8 1 537
104 LG:1099122.1:2000SEP08 595627H1 138 223
104 LG:1099122.1:2000SEP08 6798356F8 14 518
104 LG:1099122.1:2000SEP08 6790886H1 15 529
105 LG:1327449.1:2000SEP08 2870587F6 1 500
105 LG:1327449.1:2000SEP08 2870587H1 1 262
105 LG:1327449.1:2000SEP08 2870587T6 233 564
105 LG:1327449.1:2000SEP08 1692515H1 338 553
105 LG:1327449.1:2000SEP08 2753173H1 418 595
105 LG:1327449.1:2000SEP08 5754938H1 497 596
106 LG:227933.5:2000SEP08 7701259J1 183 820
106 LG:227933.5:2000SEP08 g6402044 1 505
106 LG:227933.5:2000SEP08 7701259H1 1 503
106 LG:227933.5:2000SEP08 7700960H1 7 583
106 LG:227933.5:2000SEP08 7700960J1 301 820
106 LG:227933.5:2000SEP08 7242063H1 113 702
106 LG:227933.5:2000SEP08 270716H1 1017 1350
106 LG:227933.5:2000SEP08 3337660H1 1114 1369
106 LG:227933.5:2000SEP08 3337660F6 1114 1457
106 LG:227933.5:2000SEP08 008680H1 1221 1504
106 LG:227933.5:2000SEP08 4163591H1 1276 1576
106 LG:227933.5:2000SEP08 7583129H1 303 862
106 LG:227933.5:2000SEP08 3825814H1 305 598
106 LG:227933.5:2000SEP08 3825862H1 305 589
106 LG:227933.5:2000SEP08 6859841H1 337 895
106 LG:227933.5:2000SEP08 3618330H1 468 766
106 LG:227933.5:2000SEP08 2984273H1 469 731
Figure imgf000178_0001
Figure imgf000178_0002
Figure imgf000178_0003
110 LG:236386.1:2000SEP08 2857322H1 2920 3177
110 LG:236386.1:2000SEP08 792748H1 2924 3151
110 LG:236386.1:2000SEP08 793130H1 2925 3131
110 LG:236386.1:2000SEP08 792748R1 2925 3509
110 LG:236386.1:2000SEP08 7159471H1 2934 3484
110 LG:236386.1:2000SEP08 1541872H1 2951 3157
110 LG:236386.1:2000SEP08 6559394H1 1836 2450
110 LG:236386.1:2000SEP08 6553230H1 1836 2188
110 LG:236386.1:2000SEP08 3382113H1 1906 2113
110 LG:236386.1:2000SEP08 2272356H1 1647 1915
110 LG:236386.1:2000SEP08 2661806H1 2112 2383
110 LG:236386.1:2000SEP08 2661806F6 2112 2553
110 LG:236386.1:2000SEP08 g6476309 2172 2528
110 LG:236386.1:2000SEP08 2627073H1 2183 2413
110 LG:236386.1:2000SEP08 2627315H1 2183 2411
110 LG:236386.1:2000SEP08 3901711H1 2270 2513
110 LG:236386.1:2000SEP08 5763849H1 2373 2890
110 LG:236386.1:2000SEP08 7256511H1 2420 2921
110 LG:236386.1:2000SEP08 3572311H1 2509 2721
110 LG:236386.1:2000SEP08 3572311F6 2509 3086
110 LG:236386.1:2000SEP08 7336064H1 2549 2991
110 LG:236386.1:2000SEP08 2272356T6 2588 3010
110 LG:236386.1:2000SEP08 685902H1 2627 2848
110 LG:236386.1:2000SEP08 6831490J1 496 688
110 LG:236386.1:2000SEP08 6831490H1 496 688
110 LG:236386.1:2000SEP08 7738646H1 755 1252
110 LG:236386.1:2000SEP08 6938224H1 973 1366
110 LG:236386.1:2000SEP08 7267489H1 1082 1603
110 LG:236386.1:2000SEP08 g3888759 1148 1514
110 LG:236386.1:2000SEP08 6454789H1 1317 1820
110 LG:236386.1:2000SEP08 684735H1 1380 1626
110 LG:236386.1:2000SEP08 7649660H2 1420 2039
110 LG:236386.1:2000SEP08 6952285H1 1506 2072
110 LG:236386.1:2000SEP08 4458494F6 1519 1967
110 LG:236386.1:2000SEP08 4458494H1 1520 1755
110 LG:236386.1:2000SEP08 7255931H2 1596 1777
110 LG:236386.1:2000SEP08 6909665J1 1633 2177
110 LG:236386.1:2000SEP08 2272356R6 1647 1966
110 LG:236386.1:2000SEP08 4181419H1 1 220
110 LG:236386.1:2000SEP08 6779195J1 119 758
110 LG:236386.1:2000SEP08 113399R6 483 847
110 LG:236386.1:2000SEP08 5104505H1 3515 3738
110 LG:236386.1:2000SEP08 g4081742 3517 3883
110 LG:236386.1:2000SEP08 1452312T6 3521 3836
110 LG:236386.1:2000SEP08 g898312 3538 3878
110 LG:236386.1:2000SEP08 6499719H1 3537 3869
110 LG:236386.1:2000SEP08 g4081564 3538 3883
110 LG:236386.1:2000SEP08 g2335900 3571 3880
110 LG:236386.1:2000SEP08 g6451467 3573 3875
110 LG:236386.1:2000SEP08 g1521304 3576 3891
Figure imgf000180_0001
Figure imgf000180_0002
110 LG:236386.1:2000SEP08 2658395H1 3974 4210
110 LG:236386.1:2000SEP08 4099042H2 3777 3887
110 LG:236386.1:2000SEP08 1243554H1 3777 3883
110 LG:236386.1:2000SEP08 7645333H1 3782 4137
110 LG:236386.1:2000SEP08 7645333J1 3782 4131
110 LG:236386.1:2000SEP08 g4325490 3795 3875
110 LG:236386.1:2000SEP08 700495H1 3961 4212
110 LG:236386.1:2000SEP08 2623608H1 3349 3576
110 LG:236386.1:2000SEP08 840648R1 3394 3875
111 LG:1015157.1:2000SEP08 6795378F8 1 638
111 LG:1015157.1:2000SEP08 6795378H1 2 438
111 LG:1015157.1:2000SEP08 6795378T8 175 658
112 LG:1065433.1:2000SEP08 4023626H1 701 980
112 LG:1065433.1:2000SEP08 4216384H1 563 832
112 LG:1065433.1:2000SEP08 3449946H1 1 233
112 LG:1065433.1:2000SEP08 5133790T9 317 911
112 LG:1065433.1:2000SEP08 4216384F6 563 1029
112 LG:1065433.1:2000SEP08 3449946R6 1 580
113 LG:236992.4:2000SEP08 2023720H1 380 603
113 LG:236992.4:2000SEP08 5711092T6 425 897
113 LG:236992.4:2000SEP08 2023720T6 621 988
113 LG:236992.4:2000SEP08 1957414H1 645 918
113 LG:236992.4:2000SEP08 3691485H1 302 574
113 LG:236992.4:2000SEP08 7582909H1 1 443
113 LG:236992.4:2000SEP08 5711092F6 1 595
113 LG:236992.4:2000SEP08 5711092H1 1 277
113 LG:236992.4:2000SEP08 3366992F7 9 280
113 LG:236992.4:2000SEP08 3366992H1 9 276
113 LG:236992.4:2000SEP08 4415026H1 10 250
113 LG:236992.4:2000SEP08 1672338H1 18 238
113 LG:236992.4:2000SEP08 2023720F6 379 746
114 LG:1071124.1:2000SEP08 6182057F7 342 928
114 LG:1071124.1:2000SEP08 g3092207 896 1332
114 LG:1071124.1:2000SEP08 2950882H1 1035 1171
114 LG:1071124.1:2000SEP08 g3739050 494 820
114 LG:1071124.1:2000SEP08 6880110H1 1 365
114 LG:1071124.1:2000SEP08 6273782F8 656 1305
114 LG:1071124.1:2000SEP08 g1880374 545 648
114 LG:1071124.1:2000SEP08 6182057H1 668 928
114 LG:1071124.1:2000SEP08 6272957T8 583 1276
114 LG:1071124.1:2000SEP08 g1639276 582 840
114 LG:1071124.1:2000SEP08 6272957H1 755 1305
114 LG:1071124.1:2000SEP08 4290705H1 229 481
114 LG:1071124.1:2000SEP08 494966H1 245 471
114 LG:1071124.1:2000SEP08 4099889H1 225 479
114 LG:1071124.1:2000SEP08 4290705F6 228 609
114 LG:1071124.1:2000SEP08 1676346F6 213 459
114 LG:1071124.1:2000SEP08 7196034H1 1 453
114 LG:1071124.1:2000SEP08 1676346H1 213 438
115 LG:206425.2:2000SEP08 5523393H1 1 255
Figure imgf000182_0001
Figure imgf000182_0002
Figure imgf000182_0003
Figure imgf000183_0001
Figure imgf000183_0002
Figure imgf000184_0001
Figure imgf000184_0002
Figure imgf000184_0003
Figure imgf000185_0001
Figure imgf000186_0001
Figure imgf000186_0002
121 LG:233258.3:2000SEP08 6840521H1 3890 4335
121 LG:233258.3:2000SEP08 1642516T6 3913 4287
121 LG:233258.3:2000SEP08 4533427T1 3895 4291
121 LG:233258.3:2000SEP08 g2835576 3906 4329
121 LG:233258.3:2000SEP08 g5879910 3889 4329
121 LG:233258.3:2000SEP08 7728747J1 1892 2497
121 LG:233258.3:2000SEP08 7628991H1 1905 2431
121 LG:233258.3:2000SEP08 7717561H1 1990 2623
121 LG:233258.3:2000SEP08 4900264F6 2105 2383
121 LG:233258.3:2000SEP08 4900264H2 2105 2391
121 LG:233258.3:2000SEP08 7410352H1 2137 2678
121 LG:233258.3:2000SEP08 5426338H1 2626 2805
121 LG:233258.3:2000SEP08 4901947H1 2753 3021
121 LG:233258.3:2000SEP08 2509912H1 2761 3000
121 LG:233258.3:2000SEP08 5475476H1 2892 3131
121 LG:233258.3:2000SEP08 7587947H1 3589 4171
121 LG:233258.3:2000SEP08 1642516H1 3601 3776
121 LG:233258.3:2000SEP08 2922448H1 3608 3881
121 LG:233258.3:2000SEP08 2411609H1 3557 3800
121 LG:233258.3:2000SEP08 5119288H1 3509 3799
121 LG:233258.3:2000SEP08 5349663H1 3513 3754
121 LG:233258.3:2000SEP08 1349048F1 3578 4053
121 LG:233258.3:2000SEP08 7687714J1 3522 3616
121 LG:233258.3:2000SEP08 3345379H1 3531 3799
121 LG:233258.3:2000SEP08 1456075H1 3542 3780
121 LG:233258.3:2000SEP08 1349048H1 3579 3836
121 LG:233258.3:2000SEP08 4385471H1 3582 3820
121 LG:233258.3:2000SEP08 2312950H1 3429 3675
121 LG:233258.3:2000SEP08 7057669H1 3448 4059
121 LG:233258.3:2000SEP08 2363124H1 3448 3694
121 LG:233258.3:2000SEP08 5216535H1 3479 3765
121 LG:233258.3:2000SEP08 5640562H1 3481 3729
121 LG:233258.3:2000SEP08 4309879H1 3478 3788
121 LG:233258.3:2000SEP08 3500829H1 3490 3771
121 LG:233258.3:2000SEP08 7580996H1 3184 3575
121 LG:233258.3:2000SEP08 g317699 3198 3516
121 LG:233258.3:2000SEP08 6208746H1 3216 3477
121 LG:233258.3:2000SEP08 4165236H1 3246 3550
121 LG:233258.3:2000SEP08 4139559H1 3307 3595
121 LG:233258.3:2000SEP08 4143858H1 3307 3578
121 LG:233258.3:2000SEP08 1482987F6 3311 3803
121 LG:233258.3:2000SEP08 1482987H1 3311 3391
121 LG:233258.3:2000SEP08 699001H1 3373 3579
121 LG:233258.3:2000SEP08 3882343H1 3327 3546
121 LG:233258.3:2000SEP08 2349726F6 3334 3952
121 LG:233258.3:2000SEP08 2349726H1 3334 3539
121 LG:233258.3:2000SEP08 6777575H1 3358 3936
121 LG:233258.3:2000SEP08 3498890H1 3359 3643
121 LG:233258.3:2000SEP08 7244035H1 3385 3935
121 LG:233258.3:2000SEP08 g894952 3640 3790
121 LG:233258.3:2000SEP08 16 1794H1 3701 3836
121 LG:233258.3:2000SEP08 5133771H1 3674 3938
121 LG:233258.3:2000SEP08 g779021 3678 3931
121 LG:233258.3:2000SEP08 g850896 3678 3934
121 LG:233258.3:2000SEP08 4824305H1 3696 3827
121 LG:233258.3:2000SEP08 3771947H1 3610 3885
121 LG:233258.3:2000SEP08 6311763H1 3629 4146
121 LG:233258.3:2000SEP08 5475930T8 3629 4228
121 LG:233258.3:2000SEP08 6311663H1 3630 4189
121 LG:233258.3:2000SEP08 4620140H1 3629 3876
121 LG:233258.3:2000SEP08 5913558H1 3636 3912
121 LG:233258.3:2000SEP08 g2167549 3637 4101
121 LG:233258.3:2000SEP08 977882H1 3758 4064
121 LG:233258.3:2000SEP08 5323426T9 3765 4223
121 LG:233258.3:2000SEP08 5021669T1 3768 4277
121 LG:233258.3:2000SEP08 4627772H1 3710 3967
121 LG:233258.3:2000SEP08 2343388H1 3711 3968
121 LG:233258.3:2000SEP08 7628991J1 3717 4252
121 LG:233258.3:2000SEP08 2355986H1 3738 3956
121 LG:233258.3:2000SEP08 4533427H1 3740 3897
121 LG:233258.3:2000SEP08 4706876H1 3750 4021
121 LG:233258.3:2000SEP08 977882R1 3758 4319
121 LG:233258.3:2000SEP08 5598851H1 307 562
121 LG:233258.3:2000SEP08 7324630H1 408 686
121 LG:233258.3:2000SEP08 7660821H1 524 1076
121 LG:233258.3:2000SEP08 2197276F6 584 953
121 LG:233258.3:2000SEP08 1897612H1 1 326
121 LG:233258.3:2000SEP08 1897612F6 1 501
121 LG:233258.3:2000SEP08 6776039F8 78 688
121 LG:233258.3:2000SEP08 7433752H1 161 667
121 LG:233258.3:2000SEP08 4458276H1 229 490
121 LG:233258.3:2000SEP08 7734751J1 271 903
121 LG:233258.3:2000SEP08 6779277H1 303 715
121 LG:233258.3:2000SEP08 6776039R8 861 1469
121 LG:233258.3:2000SEP08 6776039J1 878 1494
121 LG:233258.3:2000SEP08 7176796H1 1004 1462
121 LG:233258.3:2000SEP08 7660821J1 1176 1705
121 LG:233258.3:2000SEP08 6779277J1 1222 1791
121 LG:233258.3:2000SEP08 2880118H1 1265 1493
121 LG:233258.3:2000SEP08 7717561J1 1365 1989
121 LG:233258.3:2000SEP08 1427649H1 1546 1795
121 LG:233258.3:2000SEP08 1427649F6 1567 1990
121 LG:233258.3:2000SEP08 1721856H1 1592 1797
121 LG:233258.3:2000SEP08 7359946H1 1643 2184
121 LG:233258.3:2000SEP08 6259602H1 1604 1679
121 LG:233258.3:2000SEP08 2987344H1 1681 1937
121 LG:233258.3:2000SEP08 7329967H1 826 1418
121 LG:233258.3:2000SEP08 5616252H1 688 973
121 LG:233258.3:2000SEP08 2199941F6 584 790
121 LG:233258.3:2000SEP08 2199941H1 584 834
121 LG:233258.3:2000SEP08 2197276H1 584 818
121 LG:233258.3:2000SEP08 7261410H1 618 1237
121 LG:233258.3:2000SEP08 g3163278 640 1003
121 LG:233258.3:2000SEP08 7595923H1 2139 2739
121 LG:233258.3:2000SEP08 6456564H1 2153 2732
121 LG:233258.3:2000SEP08 6351657H1 2218 2541
121 LG:233258.3:2000SEP08 6777080J1 2298 2876
121 LG:233258.3:2000SEP08 6777575J1 2478 3097
121 LG:233258.3:2000SEP08 6809274J1 2315 2761
121 LG:233258.3:2000SEP08 6809274H1 2315 2761
121 LG:233258.3:2000SEP08 g564218 2493 2651
121 LG:233258.3:2000SEP08 683211H1 2399 2666
121 LG:233258.3:2000SEP08 7745596H1 2515 3057
121 LG:233258.3:2000SEP08 7371787H1 2520 3034
121 LG:233258.3:2000SEP08 687106H1 2399 2653
121 LG:233258.3:2000SEP08 7260937H1 2581 3160
121 LG:233258.3:2000SEP08 683211R6 2399 2638
121 LG:233258.3:2000SEP08 7728747H1 2416 3060
121 LG:233258.3:2000SEP08 3467934H1 2453 2702
121 LG:233258.3:2000SEP08 1645077H1 2455 2669
121 LG:233258.3:2000SEP08 1645077F6 2455 2883
121 LG:233258.3:2000SEP08 3986427H1 2463 2739
121 LG:233258.3:2000SEP08 6777134J1 2478 3125
121 LG:233258.3:2000SEP08 1854836H1 2592 2839
121 LG:233258.3:2000SEP08 1854036H1 2592 2876
121 LG:233258.3:2000SEP08 1854836F6 2592 2936
122 LG:999062.1:2000SEP08 6268754H1 1 563
122 LG:999062.1:2000SEP08 6268754T8 1 594
122 LG:999062.1:2000SEP08 6268754F8 18 682
123 LG:887776.1:2000SEP08 5985557H1 5 96
123 LG:887776.1:2000SEP08 7381148H1 261 852
123 LG:887776.1:2000SEP08 6045524H1 1776 1997
123 LG:887776.1:2000SEP08 6080103H1 1586 1923
123 LG:887776.1:2000SEP08 5338564T8 1626 1783
123 LG:887776.1:2000SEP08 7381784H1 1 540
123 LG:887776.1:2000SEP08 5338555F8 1238 1817
123 LG:887776.1:2000SEP08 5338555H1 1238 1375
123 LG:887776.1:2000SEP08 5338564H1 1239 1450
123 LG:887776.1:2000SEP08 5335668T8 1562 1832
123 LG:887776.1:2000SEP08 6454788H1 1485 1991
123 LG:887776.1:2000SEP08 5338555T8 1507 1886
123 LG:887776.1:2000SEP08 6601228F8 1556 2010
123 LG:887776.1:2000SEP08 6601228H1 1556 1999
123 LG:887776.1:2000SEP08 6601228T8 1556 1908
123 LG:887776.1:2000SEP08 5335668H1 455 630
123 LG:887776.1:2000SEP08 5335668F8 455 1051
123 LG:887776.1:2000SEP08 7380908H1 525 1117
123 LG:887776.1:2000SEP08 6453407H1 863 1426
123 LG:887776.1:2000SEP08 5952782H1 886 1181
123 LG:887776.1:2000SEP08 5338564F8 1239 1812
124 LG:1400301.2:2000SEP08 3796761F6 1 403
124 LG:1400301.2:2000SEP08 3796761H1 1 306
124 LG:1400301.2:2000SEP08 4654776T6 126 637
124 LG:1400301.2:2000SEP08 g895564 59 388
124 LG:1400301.2:2000SEP08 4654776F6 34 415
124 LG:1400301.2:2000SEP08 1729735H1 52 280
124 LG:1400301.2:2000SEP08 4654776H1 34 285
125 LG:1329362.1:2000SEP08 g1734408 55 460
125 LG:1329362.1:2000SEP08 7697049J1 156 405
125 LG:1329362.1:2000SEP08 6490895R9 1 461
125 LG:1329362.1:2000SEP08 5564064H1 240 445
125 LG:1329362.1:2000SEP08 7697049H1 167 405
126 LG:1096498.1:2000SEP08 6798228H1 1 331
126 LG:1096498.1:2000SEP08 6792384H1 1 502
126 LG:1096498.1:2000SEP08 6792384F8 1 579
126 LG:1096498.1:2000SEP08 6790975T8 8 281
126 LG:1096498.1:2000SEP08 6790975H1 8 403
126 LG:1096498.1:2000SEP08 6790975F8 8 402
127 LG:1096337.1:2000SEP08 g812407 542 897
127 LG:1096337.1:2000SEP08 3668515T6 251 782
127 LG:1096337.1:2000SEP08 g812493 379 638
127 LG:1096337.1:2000SEP08 6451060H1 51 513
127 LG:1096337.1:2000SEP08 5665201H1 398 508
127 LG:1096337.1:2000SEP08 3668515F6 1 492
128 LG:1400579.1:2000SEP08 6895912H1 1 353
128 LG:1400579.1:2000SEP08 6895912F8 1 353
128 LG:1400579.1:2000SEP08 7388585H1 89 627
128 LG:1400579.1:2000SEP08 1251881H1 111 367
128 LG:1400579.1:2000SEP08 6895946F8 185 353
128 LG:1400579.1:2000SEP08 5547349H1 358 577
128 LG:1400579.1:2000SEP08 697696T6 583 914
128 LG:1400579.1:2000SEP08 697696R6 595 914
128 LG:1400579.1:2000SEP08 697696H1 598 773
128 LG:1400579.1:2000SEP08 696573H1 595 813
128 LG:1400579.1:2000SEP08 4418332H1 149 394
128 LG:1400579.1:2000SEP08 6895946H1 1 353
128 LG:1400579.1:2000SEP08 477786F1 251 865
128 LG:1400579.1:2000SEP08 3449269H1 361 504
128 LG:1400579.1:2000SEP08 3898628H1 1 241
128 LG:1400579.1:2000SEP08 477786T6 433 830
129 LG:1080091.1:2000SEP08 5484791H2 7 250
129 LG:1080091.1:2000SEP08 4850844H1 7 286
129 LG:1080091.1:2000SEP08 1274729H1 7 268
129 LG:1080091.1:2000SEP08 5491715H1 7 257
129 LG:1080091.1:2000SEP08 3069676H1 30 286
129 LG:1080091.1:2000SEP08 212172H1 274 455
129 LG:1080091.1:2000SEP08 7345695H1 1 551
129 LG:1080091.1:2000SEP08 4850844F6 7 477
129 LG:1080091.1:2000SEP08 1274729F6 7 285
129 LG:1080091.1:2000SEP08 3596975H1 17 300
129 LG:1080091.1:2000SEP08 5189963H1 31 299
129 LG:1080091.1:2000SEP08 2887410H1 28 283
129 LG:1080091.1:2000SEP08 2662961H1 16 252
129 LG:1080091.1:2000SEP08 3069676F6 30 229
129 LG:1080091.1:2000SEP08 5078633H1 26 274
129 LG:1080091.1:2000SEP08 g3434079 181 285
130 LG:1082203.1:2000SEP08 3770530F6 907 1374
130 LG:1082203.1:2000SEP08 6255271H1 730 986
130 LG:1082203.1:2000SEP08 2877410H1 1038 1321
130 LG:1082203.1:2000SEP08 3497137T6 934 1471
130 LG:1082203.1:2000SEP08 7430791H1 977 1540
130 LG:1082203.1:2000SEP08 4216384T6 997 1673
130 LG:1082203.1:2000SEP08 g5436703 1058 1510
130 LG:1082203.1:2000SEP08 2881835T6 1097 1658
130 LG:1082203.1:2000SEP08 5802512H1 1118 1404
130 LG:1082203.1:2000SEP08 5542972H1 1210 1412
130 LG:1082203.1:2000SEP08 3216760T6 1388 1662
130 LG:1082203.1:2000SEP08 7766080J1 569 1181
130 LG:1082203.1:2000SEP08 5529680H1 961 1238
130 LG:1082203.1:2000SEP08 3497137F6 524 1061
130 LG:1082203.1:2000SEP08 065669H1 699 986
130 LG:1082203.1:2000SEP08 6489704R9 1 560
130 LG:1082203.1:2000SEP08 4023626F8 69 658
130 LG:1082203.1:2000SEP08 6615979H1 497 921
130 LG:1082203.1:2000SEP08 1233083H1 539 745
130 LG:1082203.1:2000SEP08 4708380H1 783 1045
130 LG:1082203.1:2000SEP08 g1099997 791 1046
130 LG:1082203.1:2000SEP08 3216760H1 832 1108
130 LG:1082203.1:2000SEP08 3216760F6 833 1324
130 LG:1082203.1:2000SEP08 2881835H1 459 722
130 LG:1082203.1:2000SEP08 3497137H1 525 810
130 LG:1082203.1:2000SEP08 2881835F6 457 823
130 LG:1082203.1:2000SEP08 4707980H1 783 988
130 LG:1082203.1:2000SEP08 3026964H1 1012 1128
131 LG:1084051.1:2000SEP08 5217490H1 1 198
131 LG:1084051.1:2000SEP08 1542833H1 1 137
131 LG:1084051.1:2000SEP08 3836068F6 7 549
131 LG:1084051.1:2000SEP08 3052342H1 46 335
131 LG:1084051.1:2000SEP08 g5661058 155 347
131 LG:1084051.1:2000SEP08 3525414H1 171 494
131 LG:1084051.1:2000SEP08 7113909H1 182 747
131 LG:1084051.1:2000SEP08 6182691T8 227 873
131 LG:1084051.1:2000SEP08 3679568T9 546 925
131 LG:1084051.1:2000SEP08 3836068T6 560 1044
131 LG:1084051.1:2000SEP08 4315370T9 585 850
131 LG:1084051.1:2000SEP08 3022533T6 568 902
131 LG:1084051.1:2000SEP08 3021710R6 568 943
131 LG:1084051.1:2000SEP08 6371853H1 587 865
131 LG:1084051.1:2000SEP08 g4737276 630 1082
131 LG:1084051.1:2000SEP08 7108014H1 774 1258
131 LG:1084051.1:2000SEP08 6018614H1 102 700
131 LG:1084051.1:2000SEP08 6182691F8 334 815
131 LG:1084051.1:2000SEP08 5735738H1 78 348
131 LG:1084051.1:2000SEP08 1891884H1 19 264
131 LG:1084051.1:2000SEP08 1895443H1 19 272
131 LG:1084051.1:2000SEP08 4729485H1 86 235
131 LG:1084051.1:2000SEP08 3021710H1 568 853
131 LG:1084051.1:2000SEP08 3021038T6 568 880
131 LG:1084051.1:2000SEP08 3836068H1 7 289
132 LG 1082393.1 2000SEP08 4129648H1 1189 1391
132 LG 1082393.1 2000SEP08 6118209F8 1050 1590
132 LG 1082393.1 2000SEP08 4129281F6 1189 1637
132 LG 1082393.1 2000SEP08 2041133H1 1341 1576
132 LG 1082393.1 2000SEP08 7235041H1 1357 1907
132 LG 1082393.1 2000SEP08 7626695J1 1370 1963
132 LG 1082393.1 2000SEP08 3392427T8 1407 1953
132 LG 1082393.1 2000SEP08 g1291764 1425 1959
132 LG 1082393.1 2000SEP08 4181819T8 1554 2024
132 LG 1082393.1 2000SEP08 6051495R8 1570 2175
132 LG 1082393.1 2000SEP08 6118412T8 1709 2206
132 LG 1082393.1 2000SEP08 6118209H1 1048 1543
132 LG 1082393.1 2000SEP08 1691842H1 781 1020
132 LG 1082393.1 2000SEP08 5004434H1 1076 1286
132 LG 1082393.1 2000SEP08 7275193H1 1 456
132 LG 1082393.1 2000SEP08 7740514H1 13 431
132 LG 1082393.1 2000SEP08 5814635T8 35 590
132 LG 1082393.1 2000SEP08 4820753F6 38 605
132 LG 1082393.1 2000SEP08 7438990H1 46 456
132 LG 1082393.1 2000SEP08 6919464F6 44 616
132 LG 1082393.1 2000SEP08 7160272H1 47 452
132 LG 1082393.1 2000SEP08 g4895686 291 692
132 LG 1082393.1 2000SEP08 g5366722 365 733
132 LG 1082393.1 2000SEP08 7740514J1 409 1070
132 LG 1082393.1 2000SEP08 4200727F8 518 1091
132 LG 1082393.1 2000SEP08 424948H1 558 783
132 LG 1082393.1 2000SEP08 3392427H1 597 885
132 LG 1082393.1 2000SEP08 6244520F8 597 1264
132 LG 1082393.1 2000SEP08 7626695H1 658 1192
132 LG 1082393.1 2000SEP08 6121404H1 1048 1519
132 LG 1082393.1 2000SEP08 6051387J1 1662 2175
132 LG 1082393.1 2000SEP08 5813479F8 8 619
132 LG 1082393.1 2000SEP08 3500086H1 1252 1450
132 LG 1082393.1 2000SEP08 g6567918 453 686
132 LG 1082393.1 2000SEP08 g6700463 1626 1968
132 LG 1082393.1 2000SEP08 g6567913 267 687
132 LG 1082393.1 2000SEP08 5814635F8 8 617
132 LG 1082393.1 2000SEP08 1691166H1 964 1056
132 LG 1082393.1 2000SEP08 1694272F7 781 1380
132 LG 1082393.1 2000SEP08 6560051F8 697 1208
132 LG 1082393.1 2000SEP08 6121404F8 1048 1620
Figure imgf000193_0001
Figure imgf000193_0002
Figure imgf000193_0003
Figure imgf000194_0001
134 LG:1090268.1 2000SEP08 7601734J1 981 1180
134 LG:1090268.1 2000SEP08 893366H1 1846 2102
134 LG:1090268.1 2000SEP08 g1445330 853 1188
134 LG:1090268.1 2000SEP08 g6399975 2211 2351
134 LG:1090268.1 2000SEP08 g2183965 2159 2483
134 LG:1090268.1 2000SEP08 g846167 2311 2483
134 LG:1090268.1 2000SEP08 g855476 1251 1586
134 LG:1090268.1 2000SEP08 3244065H1 2290 2482
134 LG:1090268.1 2000SEP08 2233146H1 2139 2360
134 LG:1090268.1 2000SEP08 2568814H1 1826 2076
134 LG:1090268.1 2000SEP08 2152933H1 1606 1858
134 LG:1090268.1 2000SEP08 3144079H1 1277 1586
134 LG:1090268.1 2000SEP08 1732920H1 1930 2144
134 LG:1090268.1 2000SEP08 7718061J1 650 1265
134 LG:1090268.1 2000SEP08 g4373019 742 1196
134 LG:1090268.1 2000SEP08 g3431675 1207 1652
134 LG:1090268.1 2000SEP08 g4522588 741 1196
134 LG:1090268.1 2000SEP08 g3003118 1189 1336
134 LG: 1090268.1 2000SEP08 g3934836 738 1178
134 LG: 1090268.1 2000SEP08 g1483699 1189 1317
134 LG: 1090268.1 2000SEP08 g5396099 1208 1664
134 LG: 1090268.1 2000SEP08 g4852321 902 1196
134 LG:1090268.1 2000SEP08 2903867F6 877 1051
134 LG:1090268.1 2000SEP08 6754670J1 639 1189
134 LG:1090268.1 2000SEP08 g2841837 1218 1587
134 LG:1090268.1 2000SEP08 g3797598 754 1196
134 LG:1090268.1 2000SEP08 1574256H1 2148 2377
134 LG:1090268.1 2000SEP08 2492240H1 2209 2482
134 LG:1090268.1 2000SEP08 1306727F6 1355 1876
134 LG:1090268.1 2000SEP08 1617117H1 833 1035
134 LG:1090268.1 2000SEP08 2095111H1 2030 2319
134 LG:1090268.1 2000SEP08 1760228H1 1769 2031
134 LG:1090268.1 2000SEP08 g2563663 1255 1463
134 LG:1090268.1 2000SEP08 1304083H1 2028 2245
134 LG: 1090268.1 2000SEP08 1909588H1 2297 2482
134 LG:1090268.1 2000SEP08 1213940H1 2097 2316
134 LG: 1090268.1 2000SEP08 g786692 1236 1488
134 LG:1090268.1 2000SEP08 g2818092 982 1126
134 LG:1090268.1 2000SEP08 g4282074 1094 1187
134 LG:1090268.1 2000SEP08 g1898035 909 1196
134 LG:1090268.1 2000SEP08 g2818846 985 1130
134 LG:1090268.1 2000SEP08 g2817406 912 1101
134 LG:1090268.1 2000SEP08 g2910074 903 1110
134 LG:1090268.1 2000SEP08 g2903216 913 1132
134 LG:1090268.1 2000SEP08 g3835163 900 1196
134 LG:1090268.1 2000SEP08 g2877817 897 1129
134 LG:1090268.1 2000SEP08 g3181114 984 1196
134 LG:1090268.1 2000SEP08 g5545123 700 1196
134 LG:1090268.1 2000SEP08 g5592081 930 1178
134 LG:1090268.1 2000SEP08 g6438602 731 1196
134 LG: 1090268.1 2000SEP08 6753494H1 319 804
134 LG: 1090268.1 2000SEP08 6753494J1 581 1187
134 LG: 1090268.1 2000SEP08 2930018H1 640 942
134 LG:1090268.1 2000SEP08 4960753H1 674 947
134 LG:1090268.1 2000SEP08 g4971679 698 1196
134 LG: 1090268.1 2000SEP08 7706430J1 726 1307
134 LG:1090268.1 2000SEP08 2903867H1 877 1165
134 LG:1090268.1 2000SEP08 7763068J1 980 1195
134 LG:1090268.1 2000SEP08 g2754250 1195 1642
134 LG:1090268.1 2000SEP08 6770449J1 1198 1745
134 LG:1090268.1 2000SEP08 4111913H1 1237 1534
134 LG:1090268.1 2000SEP08 3180441H1 1271 1571
134 LG:1090268.1 2000SEP08 4277537H1 1271 1545
134 LG:1090268.1 2000SEP08 3144349H1 1277 1595
134 LG:1090268.1 2000SEP08 6824119J1 1315 1982
134 LG:1090268.1 2000SEP08 7763068H1 1350 1827
134 LG: 1090268.1 2000SEP08 7348839H1 1405 1982
134 LG:1090268.1 2000SEP08 2541932H1 1572 1806
134 LG:1090268.1 2000SEP08 g2184071 1601 2026
134 LG: 1090268.1 2000SEP08 063284H1 1669 1844
134 LG: 1090268.1 2000SEP08 g786737 1714 1983
134 LG:1090268.1 2000SEP08 6812555R8 1790 1934
134 LG: 1090268.1 2000SEP08 503446H1 1831 2058
134 LG:1090268.1 2000SEP08 2109222H1 1832 2073
134 LG:1090268.1 2000SEP08 3480694H1 1849 1968
134 LG:1090268.1 2000SEP08 7308968H1 1935 2484
134 LG:1090268.1 2000SEP08 345564H1 1966 2178
134 LG:1090268.1 2000SEP08 3495146H1 311 534
134 LG:1090268.1 2000SEP08 g810697 2015 2250
134 LG:1090268.1 2000SEP08 1323858H1 2041 2292
134 LG:1090268.1 2000SEP08 1323858T6 2041 2445
134 LG:1090268.1 2000SEP08 g2069240 2061 2485
134 LG:1090268.1 2000SEP08 504232H1 2143 2359
134 LG:1090268.1 2000SEP08 2753977H1 2210 2483
134 LG:1090268.1 2000SEP08 3181838H1 2221 2484
134 LG:1090268.1 2000SEP08 g3422817 1193 1566
134 LG:1090268.1 2000SEP08 g1892524 20 475
134 LG:1090268.1 2000SEP08 3020237H1 170 452
134 LG:1090268.1 2000SEP08 7461969H1 187 717
134 LG:1090268.1 2000SEP08 3514175H1 195 452
134 LG:1090268.1 2000SEP08 7718061H1 1 563
134 LG:1090268.1 2000SEP08 7706430H1 237 792
134 LG:1090268.1 2000SEP08 2304939H1 1670 1895
134 LG:1090268.1 2000SEP08 g5513894 719 1196
134 LG:1090268.1 2000SEP08 g678609 2077 2478
134 LG:1090268.1 2000SEP08 g1483703 1189 1579
134 LG:1090268.1 2000SEP08 g3595778 1207 1667
134 LG: 1090268.1 2000SEP08 g2848875 1189 1499
134 LG: 1090268.1 2000SEP08 g855586 1619 1983
134 LG:1090268.1 2000SEP08 g5395903 1205 1663
134 LG:1090268.1:2000SEP08 7606141J1 1200 1786
134 LG:1090268.1:2000SEP08 g1516773 1670 2113
134 LG:1090268.1:2000SEP08 g5395662 1207 1679
134 LG:1090268.1:2000SEP08 g5367381 721 1175
134 LG:1090268.1:2000SEP08 g3645380 1189 1674
134 LG:1090268.1:2000SEP08 839547H1 2258 2482
134 LG:1090268.1:2000SEP08 g5363202 1207 1667
134 LG:1090268.1:2000SEP08 g2458340 1189 1615
134 LG:1090268.1:2000SEP08 g5904809 1220 1666
134 LG:1090268.1:2000SEP08 g6451781 1189 1489
134 LG:1090268.1:2000SEP08 g5887491 1189 1509
134 LG:1090268.1:2000SEP08 063491H1 1669 1883
134 LG:1090268.1:2000SEP08 g1064322 914 1150
134 LG:1090268.1:2000SEP08 g3147590 884 1196
134 LG:1090268.1:2000SEP08 g6704625 893 1195
134 LG:1090268.1:2000SEP08 g2554235 850 1115
134 LG:1090268.1:2000SEP08 g1925854 1067 1173
134 LG:1090268.1:2000SEP08 g4486399 849 1196
134 LG:1090268.1:2000SEP08 g2783700 846 1144
134 LG:1090268.1:2000SEP08 g5513830 840 1196
134 LG:1090268.1:2000SEP08 g5812533 840 1196
134 LG:1090268.1:2000SEP08 g3279089 838 1196
134 LG:1090268.1:2000SEP08 g6462439 837 1196
134 LG:1090268.1:2000SEP08 g6043117 837 1187
134 LG:1090268.1:2000SEP08 g2993463 1241 1412
134 LG:1090268.1:2000SEP08 g5526060 808 1150
134 LG:1090268.1:2000SEP08 g810364 1639 1912
134 LG:1090268.1:2000SEP08 g726667 1959 2182
134 LG:1090268.1:2000SEP08 g846116 1216 1508
134 LG:1090268.1:2000SEP08 842557H1 2325 2482
134 LG:1090268.1:2000SEP08 g3086781 1189 1525
134 LG:1090268.1:2000SEP08 1306727H1 1355 1566
134 LG:1090268.1:2000SEP08 g761083 2232 2470
134 LG:1090268.1:2000SEP08 g810267 2228 2482
134 LG:1090268.1:2000SEP08 g4899638 721 1196
134 LG:1090268.1:2000SEP08 g3430748 720 1178
134 LG:1090268.1:2000SEP08 g2817276 696 1088
134 LG:1090268.1:2000SEP08 3289524H1 1326 1569
134 LG:1090268.1:2000SEP08 063093H1 1669 1909
134 LG:1090268.1:2000SEP08 1684795H1 2294 2482
134 LG:1090268.1:2000SEP08 g4535972 756 1196
134 LG:1090268.1:2000SEP08 g810598 2221 2483
134 LG:1090268.1:2000SEP08 g5110273 720 1196
134 LG:1090268.1:2000SEP08 g2881573 1260 1617
134 LG:1090268.1:2000SEP08 g4851194 748 1196
134 LG:1090268.1:2000SEP08 g1970707 1193 1467
134 LG:1090268.1:2000SEP08 g2884408 1215 1617
134 LG:1090268.1:2000SEP08 1704927H1 1982 2183
134 LG:1090268.1:2000SEP08 389097H1 1892 2040
134 LG:1090268.1:2000SEP08 g1477294 1189 1391
134 LG:1090268.1:2000SEP08 g5446508 700 1196
134 LG:1090268.1:2000SEP08 g1023498 2096 2472
134 LG:1090268.1:2000SEP08 g3086610 1189 1713
134 LG:1090268.1:2000SEP08 6753471R8 591 1187
134 LG:1090268.1:2000SEP08 737337H1 1574 1806
134 LG:1090268.1:2000SEP08 6308005H1 1875 1943
134 LG:1090268.1:2000SEP08 g2732602 1188 1472
134 LG:1090268.1:2000SEP08 g1023507 1848 2152
134 LG:1090268.1:2000SEP08 g1634136 1590 1983
134 LG:1090268.1:2000SEP08 g5675999 493 707
134 LG:1090268.1:2000SEP08 g901373 1746 2119
134 LG:1090268.1:2000SEP08 g1114673 2128 2461
134 LG:1090268.1:2000SEP08 g680642 1802 2070
134 LG:1090268.1:2000SEP08 6753471J1 584 1189
134 LG:1090268.1:2000SEP08 691673H1 1662 1869
134 LG:1090268.1:2000SEP08 6754141R8 753 1186
134 LG:1090268.1:2000SEP08 g6197565 1189 1684
135 LG:1400597.5:2000SEP08 g6025620 1 396
135 LG:1400597.5:2000SEP08 2503713T6 1 308
136 LG:1080307.2:2000SEP08 7734018H2 1 575
137 LG:1400603.2:2000SEP08 6949927H1 18 605
137 LG:1400603.2:2000SEP08 7129806R8 748 1195
137 LG:1400603.2:2000SEP08 1257339H1 172 409
137 LG:1400603.2:2000SEP08 g5451096 176 626
137 LG:1400603.2:2000SEP08 1398471F6 177 586
137 LG:1400603.2:2000SEP08 g3245052 228 519
137 LG:1400603.2:2000SEP08 2694772H1 302 513
137 LG:1400603.2:2000SEP08 6449019H1 330 932
137 LG:1400603.2:2000SEP08 g6837153 346 765
137 LG:1400603.2:2000SEP08 7740723H1 691 1296
137 LG:1400603.2:2000SEP08 6949927R8 706 1300
137 LG:1400603.2:2000SEP08 6768303J1 747 1319
137 LG:1400603.2:2000SEP08 4178665F6 785 1307
137 LG:1400603.2:2000SEP08 4178665H1 786 1047
137 LG:1400603.2:2000SEP08 1290821H1 975 1223
137 LG:1400603.2:2000SEP08 7265187H1 1 570
137 LG:1400603.2:2000SEP08 6949927F8 18 653
137 LG:1400603.2:2000SEP08 g4690049 24 371
137 LG:1400603.2:2000SEP08 1592931T6 53 588
137 LG:1400603.2:2000SEP08 3035748H1 152 433
137 LG:1400603.2:2000SEP08 7740723J1 167 744
137 LG:1400603.2:2000SEP08 6304330H2 21 520
137 LG:1400603.2:2000SEP08 7265118H1 12 534
137 LG:1400603.2:2000SEP08 6768403H1 16 482
137 LG:1400603.2:2000SEP08 6609870H2 40 608
137 LG:1400603.2:2000SEP08 7129806H1 748 1318
138 LG:1052984.1:2000SEP08 651414H1 16 252
138 LG:1052984.1:2000SEP08 5486139H1 2 263
138 LG:1052984.1:2000SEP08 7315588H1 1 538
138 LG:1052984.1:2000SEP08 2939892H1 14 274
Figure imgf000199_0001
Figure imgf000199_0002
140 LG:1082263.2:2000SEP08 4384384H1 213 347
140 LG:1082263.2:2000SEP08 6331250F8 285 913
140 LG:1082263.2:2000SEP08 7613142J1 538 1050
140 LG:1082263.2:2000SEP08 4029293H1 548 786
140 LG:1082263.2:2000SEP08 4029293F8 548 932
140 LG:1082263.2:2000SEP08 4676879H1 560 826
140 LG:1082263.2:2000SEP08 1348730H1 807 990
140 LG:1082263.2:2000SEP08 6124822F8 922 1468
140 LG:1082263.2:2000SEP08 g1887432 1112 1496
140 LG:1082263.2:2000SEP08 2666231T6 1164 1687
140 LG:1082263.2:2000SEP08 1403760H1 1169 1366
140 LG:1082263.2:2000SEP08 7287944H1 1170 1302
140 LG:1082263.2:2000SEP08 6124822H1 913 1357
140 LG:1082263.2:2000SEP08 g2904589 142 520
140 LG:1082263.2:2000SEP08 g1933867 629 1032
140 LG:1082263.2:2000SEP08 257020H1 82 349
140 LG:1082263.2:2000SEP08 2990552F6 87 495
140 LG:1082263.2:2000SEP08 1403760F6 1169 1567
140 LG:1082263.2:2000SEP08 7032020H1 997 1310
140 LG:1082263.2:2000SEP08 1916902H1 1346 1611
140 LG:1082263.2:2000SEP08 1849907H1 1281 1442
140 LG:1082263.2:2000SEP08 2990552H1 87 283
140 LG:1082263.2:2000SEP08 g3230500 143 460
140 LG:1082263.2:2000SEP08 6197337F8 91 441
140 LG:1082263.2:2000SEP08 g2204741 1230 1632
141 LG:1048604.2:2000SEP08 g2837787 383 465
141 LG:1048604.2:2000SEP08 4252763F8 1 500
141 LG:1048604.2:2000SEP08 4252763R8 26 602
141 LG:1048604.2:2000SEP08 5767951F8 225 808
141 LG:1048604.2:2000SEP08 6840843H1 679 1285
141 LG:1048604.2:2000SEP08 6837578H1 898 1285
141 LG:1048604.2:2000SEP08 6837261H1 820 1285
141 LG:1048604.2:2000SEP08 5767951H1 225 606
142 LG:1085254.3:2000SEP08 7214794H1 324 851
142 LG:1085254.3:2000SEP08 g4069986 418 854
142 LG:1085254.3:2000SEP08 5048141R8 1 323
142 LG:1085254.3:2000SEP08 179261R1 48 539
142 LG:1085254.3:2000SEP08 179261R6 47 451
142 LG:1085254.3:2000SEP08 7664077J1 322 877
142 LG:1085254.3:2000SEP08 721 657H1 322 897
142 LG:1085254.3:2000SEP08 g4069988 401 855
143 LG:1400606.2:2000SEP08 5798223H1 739 1070
143 LG:1400606.2:2000SEP08 1459526H1 769 998
143 LG:1400606.2:2000SEP08 3879554H1 805 1033
143 LG:1400606.2:2000SEP08 g669395 898 1183
143 LG:1400606.2:2000SEP08 g3988920 476 932
143 LG:1400606.2:2000SEP08 2512112H1 826 1147
143 LG:1400606.2:2000SEP08 g2824819 857 1167
143 LG:1400606.2:2000SEP08 g884098 887 1261
143 LG:1400606.2:2000SEP08 063489H1 935 1129
143 LG:1400606.2:2000SEP08 117874R1 369 829
143 LG:1400606.2:2000SEP08 6946885H1 978 1298
143 LG:1400606.2:2000SEP08 6442528H1 1 516
143 LG:1400606.2:2000SEP08 3081595F6 169 414
143 LG:1400606.2:2000SEP08 6819534J1 314 903
143 LG:1400606.2:2000SEP08 117874H1 366 597
143 LG:1400606.2:2000SEP08 3081595H1 197 466
143 LG:1400606.2:2000SEP08 117874R6 379 825
143 LG:1400606.2:2000SEP08 1459526F6 769 1070
143 LG:1400606.2:2000SEP08 g669394 898 1083
143 LG:1400606.2:2000SEP08 5798223F7 738 1070
143 LG:1400606.2:2000SEP08 g673044 898 1120
143 LG:1400606.2:2000SEP08 6442528F8 13 518
143 LG:1400606.2:2000SEP08 5928832F8 248 682
144 LG:1090358.2:2000SEP08 g5741053 367 581
144 LG:1090358.2:2000SEP08 g4536188 361 585
144 LG:1090358.2:2000SEP08 6478543H1 715 1149
144 LG:1090358.2:2000SEP08 6 10793J1 313 925
144 LG:1090358.2:2000SEP08 505833R6 453 710
144 LG:1090358.2:2000SEP08 g6576993 263 710
144 LG:1090358.2:2000SEP08 g2540081 257 710
144 LG:1090358.2:2000SEP08 505833F1 453 710
144 LG:1090358.2:2000SEP08 505847R7 249 700
144 LG:1090358.2:2000SEP08 1726344T6 117 671
144 LG:1090358.2:2000SEP08 505833T6 453 668
144 LG:1090358.2:2000SEP08 g2583400 160 656
144 LG:1090358.2:2000SEP08 3841840T8 1 639
144 LG:1090358.2:2000SEP08 505847T7 249 637
144 LG:1090358.2:2000SEP08 3015762H1 275 561
144 LG:1090358.2:2000SEP08 6156187H1 130 452
144 LG:1090358.2:2000SEP08 g6117148 356 581
144 LG:1090358.2:2000SEP08 g4983712 346 710
144 LG:1090358.2:2000SEP08 505833H1 453 651
145 LG:1079064.2:2000SEP08 7191108H2 39 631
145 LG:1079064.2:2000SEP08 6999243R8 1 657
145 LG:1079064.2:2000SEP08 4696877F8 11 583
145 LG:1079064.2:2000SEP08 7578884H1 27 508
145 LG:1079064.2:2000SEP08 3038074F6 43 658
145 LG:1079064.2:2000SEP08 7583916H1 42 629
145 LG:1079064.2:2000SEP08 7017183H1 44 638
145 LG:1079064.2:2000SEP08 6605987H1 73 661
145 LG:1079064.2:2000SEP08 g917354 226 548
145 LG:1079064.2:2000SEP08 6247848T8 394 917
145 LG:1079064.2:2000SEP08 5963556F8 484 1029
145 LG:1079064.2:2000SEP08 5994871F8 525 1127
145 LG:1079064.2:2000SEP08 5808655F8 525 1098
145 LG:1079064.2:2000SEP08 5393866F7 527 1123
145 LG:1079064.2:2000SEP08 7654781H1 545 1044
145 LG:1079064.2:2000SEP08 5531790H1 586 838
145 LG:1079064.2:2000SEP08 7178988H1 58 588
145 LG:1079064.2:2000SEP08 6247848F8 45 720
145 LG:1079064.2:2000SEP08 7188233H1 29 559
145 LG:1079064.2:2000SEP08 7654781J1 286 662
145 LG:1079064.2:2000SEP08 7738722H1 43 703
145 LG:1079064.2:2000SEP08 7260311H1 421 895
145 LG:1079064.2:2000SEP08 7642987H1 128 324
145 LG:1079064.2:2000SEP08 7758240J1 107 512
145 LG:1079064.2:2000SEP08 g6301088 80 362
145 LG:1079064.2:2000SEP08 7176120H1 11 546
145 LG:1079064.2:2000SEP08 1389713H1 384 632
145 LG:1079064.2:2000SEP08 5963556H1 484 1008
145 LG:1079064.2:2000SEP08 7580135H1 29 483
145 LG:1079064.2:2000SEP08 7581585H1 73 619
145 LG:1079064.2:2000SEP08 7189302H2 69 644
145 LG:1079064.2:2000SEP08 6445475H2 137 658
145 LG:1079064.2:2000SEP08 6298888F7 71 698
145 LG:1079064.2:2000SEP08 5994871H1 525 810
145 LG:1079064.2:2000SEP08 6247848H1 45 586
145 LG:1079064.2:2000SEP08 6929423H1 338 798
145 LG:1079064.2:2000SEP08 7267188H2 45 531
145 LG:1079064.2:2000SEP08 6929423F8 338 933
145 LG:1079064.2:2000SEP08 7461231H2 79 496
145 LG:1079064.2:2000SEP08 7367949H1 43 616
146 LG:1076866.1:2000SEP08 5879062H1 1062 1314
146 LG:1076866.1:2000SEP08 264599H1 1182 1430
146 LG:1076866.1:2000SEP08 4830902H1 1600 1674
146 LG:1076866.1:2000SEP08 5879056H1 1062 1322
146 LG:1076866.1:2000SEP08 2939523H1 1791 2057
146 LG:1076866.1:2000SEP08 184111H1 1476 1707
146 LG:1076866.1:2000SEP08 3174078H1 97 329
146 LG:1076866.1:2000SEP08 2585657F6 206 419
146 LG:1076866.1:2000SEP08 6742253H1 1198 1653
146 LG:1076866.1:2000SEP08 6919910H1 1657 2117
146 LG:1076866.1:2000SEP08 625782R1 1796 2292
146 LG:1076866.1:2000SEP08 1001517R1 1984 2476
146 LG:1076866.1:2000SEP08 1001517H1 1984 2228
146 LG:1076866.1:2000SEP08 3957124H2 2035 2322
146 LG:1076866.1:2000SEP08 7107946H1 1 488
146 LG:1076866.1:2000SEP08 3174078F6 91 623
146 LG:1076866.1:2000SEP08 4323175H1 303 560
146 LG:1076866.1:2000SEP08 4323175F7 303 899
146 LG:1076866.1:2000SEP08 4323175T7 332 816
146 LG:1076866.1:2000SEP08 g5850461 457 919
146 LG:1076866.1:2000SEP08 g3837638 556 944
146 LG:1076866.1:2000SEP08 3174078T6 573 881
146 LG:1076866.1:2000SEP08 3778843F9 797 1255
146 LG:1076866.1:2000SEP08 3688101H1 1114 1344
146 LG:1076866.1:2000SEP08 3688101F6 1114 1516
146 LG:1076866.1:2000SEP08 6765719J1 1291 1870
146 LG:1076866.1:2000SEP08 1543396R6 1477 1987
146 LG:1076866.1:2000SEP08 6078450H1 1574 1858
146 LG:1076866.1:2000SEP08 4830902F8 1600 2178
146 LG:1076866.1:2000SEP08 1697599H1 1610 1747
146 LG:1076866.1:2000SEP08 g5439960 519 944
146 LG:1076866.1:2000SEP08 625782R6 1796 2053
146 LG:1076866.1:2000SEP08 1543396H1 1477 1682
146 LG:1076866.1:2000SEP08 6742253F8 1205 1744
146 LG:1076866.1:2000SEP08 4830902F9 1600 2157
146 LG:1076866.1:2000SEP08 625782H1 1796 2039
146 LG:1076866.1:2000SEP08 4764175H1 1755 2006
147 LG:969359.1:2000SEP08 6795715H1 3 510
147 LG:969359.1:2000SEP08 6798649F8 3 488
147 LG:969359.1:2000SEP08 6798649H1 3 299
147 LG:969359.1:2000SEP08 6795715F8 3 571
147 LG:969359.1:2000SEP08 6793426H1 10 454
147 LG:969359.1:2000SEP08 6790981T8 169 792
147 LG:969359.1:2000SEP08 6292801H1 303 449
147 LG:969359.1:2000SEP08 6798649T8 599 773
147 LG:969359.1:2000SEP08 6793426T8 657 779
147 LG:969359.1:2000SEP08 g1616171 710 900
147 LG:969359.1:2000SEP08 6790981H1 1 486
147 LG:969359.1:2000SEP08 6790981F8 1 497
148 LG:366783.1:2000SEP08 g1068298 1080 1313
148 LG:366783.1:2000SEP08 g1068059 1069 1310
148 LG:366783.1:2000SEP08 4948403T6 621 1232
148 LG:366783.1:2000SEP08 4948403F6 441 948
148 LG:366783.1:2000SEP08 4948403H1 441 676
148 LG:366783.1:2000SEP08 4600759H1 286 543
148 LG:366783.1:2000SEP08 5763587T7 1 472
149 LG:332176.3:2000SEP08 6623593H1 203 786
149 LG:332176.3:2000SEP08 7338995H1 662 1225
149 LG:332176.3:2000SEP08 6623593J1 731 1329
149 LG:332176.3:2000SEP08 6326238T7 856 1019
149 LG:332176.3:2000SEP08 7618143J1 1 467
149 LG:332176.3:2000SEP08 g1975879 32 318
149 LG:332176.3:2000SEP08 1467511T7 100 537
149 LG:332176.3:2000SEP08 6326238H1 107 405
149 LG:332176.3:2000SEP08 526662H1 124 307
149 LG:332176.3:2000SEP08 2856544H1 133 245
149 LG:332176.3:2000SEP08 7247177H1 161 539
150 LG:994938.1:2000SEP08 959442H1 1 275
150 LG:994938.1:2000SEP08 960385H1 1 184
150 LG:994938.1:2000SEP08 959442R6 1 366
150 LG:994938.1:2000SEP08 3387872T8 185 740
150 LG:994938.1:2000SEP08 g2934522 342 630
150 LG:994938.1:2000SEP08 6325528H1 346 538
150 LG:994938.1:2000SEP08 407022H1 371 614
150 LG:994938.1:2000SEP08 960385T6 423 772
150 LG:994938.1:2000SEP08 g2242138 425 810
150 LG:994938.1:2000SEP08 959442T6 431 761
150 LG:994938.1 2000SEP08 g2969314 445 541
151 LG:982800.1 2000SEP08 3118103H1 2147 2437
151 LG:982800.1 2000SEP08 6020943H1 2062 2430
151 LG:982800.1 2000SEP08 3219963H1 2062 2228
151 LG:982800.1 2000SEP08 g2540261 2063 2448
151 LG:982800.1 2000SEP08 6147820H1 2078 2363
151 LG:982800.1 2000SEP08 4445313H1 2080 2266
151 LG:982800.1 2000SEP08 955100T2 2090 2570
151 LG:982800.1 2000SEP08 955100R1 2090 2453
151 LG:982800.1 2000SEP08 955100H1 2090 2358
151 LG:982800.1 2000SEP08 3814175T6 2095 2570
151 LG:982800.1 2000SEP08 g2011857 2103 2393
151 LG:982800.1 2000SEP08 2875837H1 2116 2394
151 LG:982800.1 2000SEP08 g1849425 2131 2424
151 LG:982800.1 2000SEP08 4337528T6 2144 2581
151 LG:982800.1 2000SEP08 3118103F6 2146 2582
151 LG:982800.1 2000SEP08 822341R1 2189 2449
151 LG:982800.1 2000SEP08 g3086258 2215 2609
151 LG:982800.1 2000SEP08 2101604T6 2217 2565
151 LG:982800.1 2000SEP08 636591H1 2217 2470
151 LG:982800.1 2000SEP08 g2809907 2239 2430
151 LG:982800.1 2000SEP08 496959H1 2271 2449
151 LG:982800.1 2000SEP08 g1927355 2303 2606
151 LG:982800.1 2000SEP08 406063H1 2316 2543
151 LG:982800.1 2000SEP08 7335618H1 2331 2616
151 LG:982800.1 2000SEP08 g4329276 2347 2613
151 LG:982800.1 2000SEP08 630611H1 2382 2607
151 LG:982800.1 2000SEP08 904646R2 2397 2601
151 LG:982800.1 2000SEP08 904646H1 2398 2608
151 LG:982800.1 2000SEP08 6870517H1 1296 1814
151 LG:982800.1 2000SEP08 3814709H1 1477 1746
151 LG:982800.1 2000SEP08 3814709F7 1498 1730
151 LG:982800.1 2000SEP08 4556204H1 1561 1642
151 LG:982800.1 2000SEP08 4551845F6 1561 1744
151 LG:982800.1 2000SEP08 4557749H1 1561 1744
151 LG:982800.1 2000SEP08 4556204F8 1561 2151
151 LG:982800.1 2000SEP08 4557749F8 1580 2139
151 LG:982800.1 2000SEP08 2101604R6 2028 2532
151 LG:982800.1 2000SEP08 2101604H1 2028 2294
151 LG:982800.1 2000SEP08 g2016400 2051 2147
151 LG:982800.1 2000SEP08 804149H1 2058 2162
151 LG:982800.1 2000SEP08 3485587H1 2062 2165
151 LG:982800.1 2000SEP08 3999943H1 2062 2229
151 LG:982800.1 2000SEP08 4337528F6 583 991
151 LG:982800.1 2000SEP08 4337528H1 583 841
151 LG:982800.1 2000SEP08 2790452H2 853 1006
151 LG:982800.1 2000SEP08 2792488H1 853 1139
151 LG:982800.1 2000SEP08 2792488F6 853 1240
151 LG:982800.1 2000SEP08 7269729H1 1056 1608
151 LG:982800.1 2000SEP08 g1802842 1156 1515
Figure imgf000205_0001
Figure imgf000205_0002
Figure imgf000206_0001
Figure imgf000206_0002
Figure imgf000206_0003
Figure imgf000206_0004
153 LG:234748.2:2000SEP08 g1696922 114 441
153 LG:234748.2:2000SEP08 g3849721 5 392
153 LG:234748.2:2000SEP08 g4606643 98 386
153 LG:234748.2:2000SEP08 5411009H1 1618 1874
153 LG:234748.2:2000SEP08 042271H1 1660 1862
153 LG:234748.2:2000SEP08 418644T6 1358 1853
153 LG:234748.2:2000SEP08 3705837T6 1220 1841
153 LG:234748.2:2000SEP08 1505116H1 1546 1765
153 LG:234748.2:2000SEP08 418644R6 1294 1753
153 LG:234748.2:2000SEP08 2750637H1 1449 1716
153 LG:234748.2:2000SEP08 029958H1 1328 1571
153 LG:234748.2:2000SEP08 5909796H1 1264 1557
153 LG:234748.2:2000SEP08 418644H1 1294 1544
153 LG:234748.2:2000SEP08 2896217H1 1242 1535
153 LG:234748.2:2000SEP08 1582705H1 1344 1533
153 LG:234748.2:2000SEP08 413712H1 1294 1527
153 LG:234748.2:2000SEP08 7694715J1 1008 1523
153 LG:234748.2:2000SEP08 3155457H1 1228 1518
153 LG:234748.2:2000SEP08 412062H1 1294 1508
153 LG:234748.2:2000SEP08 3320618H1 504 759
153 LG:234748.2:2000SEP08 7694715H1 301 718
153 LG:234748.2:2000SEP08 g5633027 254 708
153 LG:234748.2:2000SEP08 g2725769 279 705
153 LG:234748.2:2000SEP08 g3147355 280 705
153 LG:234748.2:2000SEP08 4798887H1 428 685
153 LG:234748.2:2000SEP08 2768878T6 94 666
153 LG:234748.2:2000SEP08 3705837H1 356 639
153 LG:234748.2:2000SEP08 4715141H1 281 563
153 LG:234748.2:2000SEP08 g2737814 1629 1900
153 LG:234748.2:2000SEP08 g6746116 1480 1894
153 LG:234748.2:2000SEP08 g5109422 1480 1891
153 LG:234748.2:2000SEP08 g5444199 1441 1891
153 LG:234748.2:2000SEP08 g4852038 1433 1891
154 LG:306284.1:2000SEP08 5972128H1 167 621
154 LG:306284.1:2000SEP08 7091536H1 1 511
154 LG:306284.1:2000SEP08 7190549H1 1 488
154 LG:306284.1:2000SEP08 g2882959 58 298
155 LI:333170.3:2000SEP08 g5837651 208 614
155 LI:333170.3:2000SEP08 g3742832 264 617
155 LI:333170.3:2000SEP08 2004907R6 58 580
155 LI:333170.3:2000SEP08 2004907T6 245 576
155 LI:333170.3:2000SEP08 g5363197 166 618
155 LI:333170.3:2000SEP08 g5445884 247 618
155 LI:333170.3:2000SEP08 g5637698 360 618
155 LI:333170.3:2000SEP08 g2805844 246 618
155 LI:333170.3:2000SEP08 g5753838 157 618
155 LI:333170.3:2000SEP08 g3038739 476 618
155 LI:333170.3:2000SEP08 g3017051 280 623
155 LI:333170.3:2000SEP08 g5445633 156 618
155 LI:333170.3:2000SEP08 g6700310 210 617
Figure imgf000208_0001
Figure imgf000208_0002
Figure imgf000208_0003
TABLE 5
SEQ ID NO: Template ID Component ID Start Stop
157 LI:279013.5:2000SEP08 70600920V1 545 1103
157 LI:279013.5:2000SEP08 70600178V1 548 988
157 LI:279013.5:2000SEP08 70598102V1 554 1073
157 LI:279013.5:2000SEP08 70602680V1 562 1161
157 LI:279013.5:2000SEP08 70601624V1 563 1131
157 LI:279013.5:2000SEP08 70598886V1 603 1087
157 LI:279013.5:2000SEP08 70598993V1 619 1089
157 LI:279013.5:2000SEP08 70600050V1 616 1138
157 LI:279013.5:2000SEP08 70601863V1 618 1134
157 LI:279013.5:2000SEP08 70601661V1 615 1033
157 LI:279013.5:2000SEP08 70599912V1 666 1215
157 LI:279013.5:2000SEP08 70598696V1 683 1067
157 LI:279013.5:2000SEP08 70601955V1 696 1131
157 LI:279013.5:2000SEP08 70600352V1 1 283
157 LI:279013.5:2000SEP08 70599180V1 1 440
157 LI:279013.5:2000SEP08 70600806V1 1 76
157 LI:279013.5:2000SEP08 2083811F6 1 455
157 LI:279013.5:2000SEP08 70611716V1 1 219
157 LI:279013.5:2000SEP08 70597203V1 1 454
157 LI:279013.5:2000SEP08 70597842V1 1 234
157 LI:279013.5:2000SEP08 70597537V1 44 737
157 LI:279013.5:2000SEP08 70598005V1 275 459
157 LI:279013.5:2000SEP08 70602584V1 279 459
157 LI:279013.5:2000SEP08 70600575V1 280 860
157 LI:279013.5:2000SEP08 70599593V1 312 816
157 LI:279013.5:2000SEP08 70598759V1 302 617
157 LI:279013.5:2000SEP08 70594144V1 317 625
157 LI:279013.5:2000SEP08 70598864V1 321 808
157 LI:279013.5:2000SEP08 70599792V1 404 989
157 LI:279013.5:2000SEP08 70598913V1 419 901
157 LI:279013.5:2000SEP08 70600311V1 484 916
157 LI:279013.5:2000SEP08 2083811T6 496 733
157 LI:279013.5:2000SEP08 70599521V1 1 404
157 LI:279013.5:2000SEP08 70598407V1 1 162
157 LI:279013.5:2000SEP08 70597887V1 1 650
157 LI:279013.5:2000SEP08 70611227V1 904 1068
157 LI:279013.5:2000SEP08 70612859V1 928 1087
158 LI:1037075.1:2000SEP08 7413090H1 84 687
158 LI:1037075.1:2000SEP08 7413080H1 102 687
158 LI:1037075.1:2000SEP08 5758628T8 1 379
158 LI:1037075.1:2000SEP08 5758628H1 2 317
158 LI:1037075.1:2000SEP08 5758628F8 117 301
159 LI:1073403.1:2000SEP08 6792032T8 10 226
159 LI:1073403.1:2000SEP08 6795549H1 54 327
159 LI:1073403.1:2000SEP08 6795549F8 54 327
159 LI:1073403.1:2000SEP08 6794588F8 69 181
159 LI:1073403.1:2000SEP08 6794588H1 1 329
159 LI:1073403.1:2000SEP08 6793290H1 1 327
159 LI:1073403.1:2000SEP08 6792032H1 10 329
159 LI:1073403.1:2000SEP08 6792032F8 10 329
Figure imgf000210_0001
Figure imgf000210_0002
Figure imgf000210_0003
Figure imgf000210_0004
TABLE 5
SEQ ID NO: Template ID Component ID Start Stop
165 1092948.1:2000SEP08 3808866F8 7 463
165 1092948.1:2000SEP08 1525513H1 9 225
165 1092948.1:2000SEP08 2889208F6 8 343
165 1092948.1:2000SEP08 2889208H1 8 282
165 1092948.1:2000SEP08 7264977H1 19 611
165 1092948.1:2000SEP08 3016535H1 23 307
165 1092948.1:2000SEP08 4760775F6 40 613
166 LI:380378.2:2000SEP08 6052233R8 22 497
166 LI:380378.2:2000SEP08 6861022H1 138 218
166 LI:380378.2:2000SEP08 6052233H1 1 502
166 LI:380378.2:2000SEP08 6052233J1 1 497
166 LI:380378.2:2000SEP08 6052233F8 1 497
167 1029674.1:2000SEP08 g6716886 1321 1562
167 1029674.1:2000SEP08 6763379J1 1073 1551
167 1029674.1:2000SEP08 676807UI 1117 1552
167 1029674.1:2000SEP08 g7039790 1343 1522
167 1029674.1:2000SEP08 6803363J1 634 1235
167 1029674.1:2000SEP08 6763379H1 221 774
167 1029674.1:2000SEP08 6803363H1 222 613
167 1029674.1:2000SEP08 6768071H1 1 561
168 2048601.3:2000SEP08 71558389V1 1 391
168 2048601.3:2000SEP08 71558424V1 1 391
169 1186208.1:2000SEP08 4294932F6 35 302
169 1186208.1:2000SEP08 1899877H1 67 291
169 1186208.1:2000SEP08 2882961H1 40 298
169 1186208.1:2000SEP08 1315261H1 37 159
169 1186208.1:2000SEP08 5120882H1 29 291
169 1186208.1:2000SEP08 g1959316 67 383
169 1186208.1:2000SEP08 1313874H1 37 159
169 1186208.1:2000SEP08 4399221H1 48 260
169 1186208.1:2000SEP08 g2401851 35 267
169 1186208.1:2000SEP08 1257339H1 46 267
169 1186208.1:2000SEP08 7976538H1 1 562
169 1186208.1:2000SEP08 5120882F6 29 379
169 1186208.1:2000SEP08 014670H1 34 314
169 1186208.1:2000SEP08 8099166H1 37 657
169 1186208.1:2000SEP08 1315261F6 37 540
169 1186208.1:2000SEP08 5378061H1 159 400
169 1186208.1:2000SEP08 5378061F6 159 617
169 1186208.1:2000SEP08 g4265183 206 667
169 1186208.1:2000SEP08 4881456H1 300 379
169 1186208.1:2000SEP08 1315261T6 1 65
170 1170753.1:2000SEP08 6200892T8 1 296
170 1170753.1:2000SEP08 5406931H1 256 392
170 1170753.1:2000SEP08 g2264772 1 331
170 1170753.1:2000SEP08 6200319T8 1 297
170 1170753.1:2000SEP08 6200892H1 1 420
170 1170753.1:2000SEP08 6200792H1 1 413
170 1170753.1:2000SEP08 6200319F8 12 383
170 1170753.1:2000SEP08 5406931F8 264 393
Figure imgf000212_0001
Figure imgf000212_0002
Figure imgf000212_0003
TABLE 5
SEQ ID NO: Template ID Component ID Start Stop
174 LI:1039974.1:2000SEP08 6893004J1 689 1290
174 LI:1039974.1:2000SEP08 7438574H1 1013 1333
174 LI:1039974.1:2000SEP08 7709152J1 1039 1598
174 LI:1039974.1:2000SEP08 3423941H1 1085 1344
174 LI:1039974.1:2000SEP08 3423941F6 1085 1374
174 LI:1039974.1:2000SEP08 6982460H1 1104 1534
174 LI:1039974.1:2000SEP08 6947579H1 1112 1620
174 LI:1039974.1:2000SEP08 7763864H1 1 489
174 LI:1039974.1:2000SEP08 70938370V1 763 961
174 LI:1039974.1:2000SEP08 3463824F6 844 1240
174 LI:1039974.1:2000SEP08 3463824H1 844 1063
174 LI:1039974.1:2000SEP08 70941631V1 856 1063
174 LI:1039974.1:2000SEP08 7091561H1 871 1157
174 LI:1039974.1:2000SEP08 71152516V1 891 1393
174 LI:1039974.1:2000SEP08 6883174H1 906 1417
174 LI:1039974.1:2000SEP08 6883129H1 906 1237
174 LI:1039974.1:2000SEP08 6996726H1 907 1480
174 LI:1039974.1:2000SEP08 7663341J1 924 1545
174 LI:1039974.1:2000SEP08 70941275V1 924 1063
174 LI:1039974.1:2000SEP08 g1695282 710 919
174 LI:1039974.1:2000SEP08 71152694V1 626 1063
174 LI:1039974.1:2000SEP08 7091561F8 671 1149
174 LI:1039974.1:2000SEP08 71151884V1 685 1063
174 LI:1039974.1:2000SEP08 1251928H1 1166 1438
174 LI:1039974.1:2000SEP08 7707065J1 947 1424
174 LI:1039974.1:2000SEP08 3463824T7 976 1081
174 LI:1039974.1:2000SEP08 7746155J1 993 1346
174 LI:1039974.1:2000SEP08 71301718V1 647 1063
174 LI:1039974.1:2000SEP08 7639946J2 416 1070
174 LI:1039974.1:2000SEP08 7121329H1 490 909
174 LI:1039974.1:2000SEP08 7735245J1 669 1255
174 LI:1039974.1:2000SEP08 3337158H1 564 843
174 LI:1039974.1:2000SEP08 71302652V1 584 1112
174 LI:1039974.1:2000SEP08 71301474V1 595 1108
174 LI:1039974.1:2000SEP08 7709152H1 532 1137
174 LI:1039974.1:2000SEP08 7091561R8 319 951
174 LI:1039974.1:2000SEP08 7678465H1 239 692
174 LI:1039974.1:2000SEP08 5517746H1 346 566
174 LI:1039974.1:2000SEP08 8036193H1 406 1033
175 LI:1175765.2:2000SEP08 6985759F8 1 481
175 LI:1175765.2:2000SEP08 6985759R8 1 481
176 LI:313948.1:2000SEP08 g4150106 244 602
176 LI:313948.1:2000SEP08 g2930824 278 601
176 LI:313948.1:2000SEP08 8025310J1 1 593
176 LI:313948.1:2000SEP08 8020685J1 29 593
176 LI:313948.1:2000SEP08 g5111878 111 569
176 LI:313948.1:2000SEP08 g4533060 154 569
177 LI:335923.2:2000SEP08 6243152H1 1 565
177 LI:335923.2:2000SEP08 5283710H1 8 159
177 LI:335923.2:2000SEP08 5283642H1 8 201
177 LI:335923.2:2000SEP08 2011071R6 15 484
177 LI:335923.2:2000SEP08 2011071H1 15 221
177 LI:335923.2:2000SEP08 5283929H1 23 296
177 LI:335923.2:2000SEP08 5283676H1 23 285
177 LI:335923.2:2000SEP08 6247161H1 23 479
177 LI:335923.2:2000SEP08 g2035703 28 384
177 LI:335923.2:2000SEP08 2005418H1 168 357
177 LI:335923.2:2000SEP08 g4683784 245 728
177 LI:335923.2:2000SEP08 g3678673 308 738
177 LI:335923.2:2000SEP08 g4734389 311 735
177 LI:335923.2:2000SEP08 g3038675 646 737
178 LI:345884.1:2000SEP08 g2240240 1 315
178 LI:345884.1:2000SEP08 4676488H1 8 263
178 LI:345884.1:2000SEP08 4676488F6 7 392
178 LI:345884.1:2000SEP08 776081 UI 59 265
178 LI:345884.1:2000SEP08 g4991320 192 650
178 LI:345884.1:2000SEP08 4676488T6 214 620
178 LI:345884.1:2000SEP08 g7319648 465 557
179 LI:417127.1:2000SEP08 5854850T8 1 541
179 LI:417127.1:2000SEP08 5854850F8 1 556
179 LI:417127.1:2000SEP08 5854850H1 1 286
180 LI:451710.1:2000SEP08 5914133H1 1 282
180 LI:451710.1:2000SEP08 5914133F6 1 630
180 LI:451710.1:2000SEP08 5321108F9 1 537
180 LI:451710.1:2000SEP08 5914133F8 1 391
180 LI:451710.1:2000SEP08 5914133T6 52 609
181 LI:406882.2:2000SEP08 2074401H1 1 240
181 LI:406882.2:2000SEP08 7277422H1 1 572
181 LI:406882.2:2000SEP08 6387643H1 1 138
181 LI:406882.2:2000SEP08 5678260H1 2 262
181 LI:406882.2:2000SEP08 3488977F6 19 435
181 LI:406882.2:2000SEP08 3488977H1 19 214
181 LI:406882.2:2000SEP08 3557611H1 21 316
181 LI:406882.2:2000SEP08 6771944J1 110 686
181 LI:406882.2:2000SEP08 g2179336 168 373
181 LI:406882.2:2000SEP08 322004H1 316 451
181 LI:406882.2:2000SEP08 322004R6 322 786
181 LI:406882.2:2000SEP08 g1186140 342 702
181 LI:406882.2:2000SEP08 g2184370 491 909
181 LI:406882.2:2000SEP08 g1426251 665 963
181 LI:406882.2:2000SEP08 g2715236 750 1226
181 LI:406882.2:2000SEP08 3099966T6 767 1180
181 LI:406882.2:2000SEP08 3099966F6 774 1198
181 LI:406882.2:2000SEP08 3099966H1 774 1087
181 LI:406882.2:2000SEP08 6984563H1 849 1340
181 LI:406882.2:2000SEP08 g2184149 860 1226
181 LI:406882.2:2000SEP08 g3801693 863 1227
181 LI:406882.2:2000SEP08 g1153584 1000 1236
181 LI:406882.2:2000SEP08 6983618H1 1013 1165
181 LI:406882.2:2000SEP08 6983555H1 1013 1350
181 LI:406882.2:2000SEP08 g2348522 1012 1226
181 LI:406882.2:2000SEP08 g3841648 1047 1230
181 LI:406882.2:2000SEP08 g3842646 1049 1232
182 LI:728223.1:2000SEP08 6269813T8 1 599
182 LI:728223.1:2000SEP08 6269813F8 1 658
182 LI:728223.1:2000SEP08 6269813H1 1 404
183 LI:289783.19:2000SEP08 550475R1 800 1163
183 LI:289783.19:2000SEP08 5512967H1 811 1037
183 LI:289783.19:2000SEP08 3326063H1 816 1105
183 LI:289783.19:2000SEP08 3401385H1 814 1043
183 LI:289783.19:2000SEP08 777649H1 833 1046
183 LI:289783.19:2000SEP08 495183H1 848 1089
183 LI:289783.19:2000SEP08 2787013H1 878 1139
183 LI:289783.19:2000SEP08 5432830H1 889 1108
183 LI:289783.19:2000SEP08 550475H1 800 1026
183 LI:289783.19:2000SEP08 7267409H1 688 1177
183 LI:289783.19:2000SEP08 g2537937 557 1041
183 LI:289783.19:2000SEP08 60214935U1 324 873
183 LI:289783.19:2000SEP08 7084522H1 328 823
183 LI:289783.19:2000SEP08 8016832J1 1 455
183 LI:289783.19:2000SEP08 8016705J1 1 455
184 LI:235255.8:2000SEP08 5898645H1 511 773
184 LI:235255.8:2000SEP08 5901761H1 511 796
184 LI:235255.8:2000SEP08 5643211H1 514 771
184 LI:235255.8:2000SEP08 3188563H1 523 863
184 LI:235255.8:2000SEP08 7254655H1 554 911
184 LI:235255.8:2000SEP08 4569548H1 568 845
184 LI:235255.8:2000SEP08 2213566F6 580 856
184 LI:235255.8:2000SEP08 2224227F6 580 1019
184 LI:235255.8:2000SEP08 2224227H1 580 823
184 LI:235255.8:2000SEP08 2834918H1 592 839
184 LI:235255.8:2000SEP08 7613084H1 597 1179
184 LI:235255.8:2000SEP08 5528066H1 610 871
184 LI:235255.8:2000SEP08 675898H1 634 903
184 LI:235255.8:2000SEP08 70568158V1 660 1065
184 LI:235255.8:2000SEP08 4625774H1 1 250
184 LI:235255.8:2000SEP08 4625546H1 5 259
184 LI:235255.8:2000SEP08 7203158H1 90 447
184 LI:235255.8:2000SEP08 70300587D1 120 420
184 LI:235255.8:2000SEP08 70513655D1 122 698
184 LI:235255.8:2000SEP08 6883819H1 185 453
184 LI:235255.8:2000SEP08 6777538J1 224 898
184 LI:235255.8:2000SEP08 7680150J1 272 878
184 LI:235255.8:2000SEP08 3594396H1 283 584
184 LI:235255.8:2000SEP08 4383752H1 285 541
184 LI:235255.8:2000SEP08 70570626V1 294 969
184 LI:235255.8:2000SEP08 5501591 OH 1 310 726
184 LI:235255.8:2000SEP08 744281H1 317 547
184 LI:235255.8:2000SEP08 761 975H1 318 826
184 LI:235255.8:2000SEP08 1436369F1 317 798
184 LI:235255.8:2000SEP08 7619975J1 360 901
184 LI:235255.8:2000SEP08 5021074H1 349 613
184 LI:235255.8:2000SEP08 5020874H1 349 617
184 LI:235255.8:2000SEP08 70566905V1 376 466
184 LI:235255.8:2000SEP08 7688171J1 381 901
184 LI:235255.8:2000SEP08 70570192V1 414 1008
184 LI:235255.8:2000SEP08 70570560V1 417 983
184 LI:235255.8:2000SEP08 5775092H1 418 911
184 LI:235255.8:2000SEP08 4533965H1 453 706
184 LI:235255.8:2000SEP08 5275166H1 454 705
184 LI:235255.8:2000SEP08 2362011H1 456 699
184 LI:235255.8:2000SEP08 2362011R6 456 796
184 LI:235255.8:2000SEP08 70568674V1 462 769
184 LI:235255.8:2000SEP08 6889089J1 462 914
184 LI:235255.8:2000SEP08 70553659V1 462 768
184 LI:235255.8:2000SEP08 70570674V1 462 768
184 LI:235255.8:2000SEP08 1506551H1 478 675
184 LI:235255.8:2000SEP08 70571785V1 496 729
184 LI:235255.8:2000SEP08 5903079H1 511 796
184 LI:235255.8:2000SEP08 5897612H1 511 821
184 LI:235255.8:2000SEP08 5903174H1 511 811
184 LI:235255.8:2000SEP08 3029777H1 679 970
184 LI:235255.8:2000SEP08 6931063H1 708 1207
184 LI:235255.8:2000SEP08 4226879H1 726 1017
184 LI:235255.8:2000SEP08 7688171H1 736 1290
184 LI:235255.8:2000SEP08 7606153J1 749 1198
184 LI:235255.8:2000SEP08 1968849H1 771 969
184 LI:235255.8:2000SEP08 630569H1 770 899
184 LI:235255.8:2000SEP08 3051523H1 775 1066
184 LI:235255.8:2000SEP08 6621315H1 796 1284
184 LI:235255.8:2000SEP08 2791215H1 799 1103
184 LI:235255.8:2000SEP08 1804236H1 803 1053
184 LI:235255.8:2000SEP08 1460331H1 805 1044
184 LI:235255.8:2000SEP08 70571366V1 813 1318
184 LI:235255.8:2000SEP08 4835570H1 825 1092
184 LI:235255.8:2000SEP08 3150504H1 837 951
184 LI:235255.8:2000SEP08 7931885H1 839 1427
184 LI:235255.8:2000SEP08 5041870H1 841 1081
184 LI:235255.8:2000SEP08 2433958R6 849 1394
184 LI:235255.8:2000SEP08 6840250H1 866 1015
184 LI:235255.8:2000SEP08 4899142H2 863 1149
184 LI:235255.8:2000SEP08 4898538H1 863 1138
184 LI:235255.8:2000SEP08 7071147H1 876 1275
184 LI:235255.8:2000SEP08 3663348H1 877 1153
184 LI:235255.8:2000SEP08 5605778H1 878 992
184 LI:235255.8:2000SEP08 7215091H1 878 1135
184 LI:235255.8:2000SEP08 893148H1 906 1188
184 LI:235255.8:2000SEP08 1652130H1 906 1145
184 LI:235255.8:2000SEP08 2691823H1 923 1162
184 LI:235255.8:2000SEP08 600445H1 927 1159
184 LI:235255.8:2000SEP08 7658901J1 954 1477
184 LI:235255.8:2000SEP08 3725892H1 974 1263
184 LI:235255.8:2000SEP08 2272178H1 976 1247
184 LI:235255.8:2000SEP08 3725965H1 975 1256
184 LI:235255.8:2000SEP08 6910727J1 986 1098
184 LI:235255.8:2000SEP08 4859915H1 1000 1239
184 LI:235255.8:2000SEP08 6883819J1 1014 1509
184 LI:235255.8:2000SEP08 1361213H1 1018 1250
184 LI:235255.8:2000SEP08 1361092F1 1018 1350
184 LI:235255.8:2000SEP08 1718984T6 1029 1514
184 LI:235255.8:2000SEP08 1705634T6 1033 1514
184 LI:235255.8:2000SEP08 2047749H1 1037 1208
184 LI:235255.8:2000SEP08 2128803H1 1041 1211
184 LI:235255.8:2000SEP08 5290142H1 1041 1286
184 LI:235255.8:2000SEP08 1641905H1 1053 1268
184 LI:235255.8:2000SEP08 3089073H1 1061 1341
184 LI:235255.8:2000SEP08 2418969H1 1085 1319
184 LI:235255.8:2000SEP08 6163740H1 1094 1474
184 LI:235255.8:2000SEP08 g2900015 1097 1559
184 LI:235255.8:2000SEP08 5108013H1 1104 1308
184 LI:235255.8:2000SEP08 2433958T6 1107 1547
184 LI:235255.8:2000SEP08 2100492H1 1112 1301
184 LI:235255.8:2000SEP08 g4896364 1132 1477
184 LI:235255.8:2000SEP08 g6044400 1149 1477
184 LI:235255.8:2000SEP08 1655388T6 1149 1509
184 LI:235255.8:2000SEP08 1439226H1 1161 1420
184 LI:235255.8:2000SEP08 223 189H1 1162 1408
184 LI:235255.8:2000SEP08 g3229346 1173 1561
184 LI:235255.8:2000SEP08 2071873T6 1203 1517
184 LI:235255.8:2000SEP08 2070989H1 1205 1459
184 LI:235255.8:2000SEP08 2096546H1 1222 1460
184 LI:235255.8:2000SEP08 g2873693 1236 1551
184 LI:235255.8:2000SEP08 5311408H1 1275 1477
184 LI:235255.8:2000SEP08 1694442H1 1346 1560
184 LI:235255.8:2000SEP08 3349083H1 1382 1556
185 LI:237693.5:2000SEP08 5083439F8 1 528
185 LI:237693.5:2000SEP08 2989661H1 1 301
185 LI:237693.5:2000SEP08 2752263R6 2 478
185 LI:237693.5:2000SEP08 5446248H1 1 264
185 LI:237693.5:2000SEP08 2752263H1 2 269
185 LI:237693.5:2000SEP08 2922381H1 24 287
185 LI:237693.5:2000SEP08 2473969H1 33 243
185 LI:237693.5:2000SEP08 2473969F6 33 243
185 LI:237693.5:2000SEP08 3316534H1 54 296
185 LI:237693.5:2000SEP08 g4703345 65 509
185 LI:237693.5:2000SEP08 3502790H1 312 587
186 LI:433670.3:2000SEP08 3319844H1 1 268
186 LI:433670.3:2000SEP08 3319844F6 1 302
186 LI:433670.3:2000SEP08 7232871H1 179 787
186 LI:433670.3:2000SEP08 4172439T6 662 1103
187 202943.4:2000SEP08 60210577U1 1 432
187 202943.4:2000SEP08 70443926D1 31 494
187 202943.4:2000SEP08 70444091D1 31 398
187 202943.4:2000SEP08 70443917D1 31 419
187 202943.4:2000SEP08 70442607D1 79 470
187 202943.4:2000SEP08 70444223D1 246 854
187 202943.4:2000SEP08 70444717D1 512 881
187 202943.4:2000SEP08 8108821H1 568 1216
187 202943.4:2000SEP08 70444084D1 629 1023
187 202943.4:2000SEP08 70442853D1 685 1115
187 202943.4:2000SEP08 g2020942 809 1047
187 202943.4:2000SEP08 70443670D1 844 1182
187 202943.4:2000SEP08 8108821J1 943 1553
187 202943.4:2000SEP08 7152636H1 1063 1150
187 202943.4:2000SEP08 5092608H1 1106 1379
187 202943.4:2000SEP08 5089408H1 1106 1375
187 202943.4:2000SEP08 5091008H1 1106 1374
187 202943.4:2000SEP08 7151349H1 1163 1666
187 202943.4:2000SEP08 5639884H1 1241 1470
187 202943.4:2000SEP08 1226590R6 1279 1688
187 202943.4:2000SEP08 1226590H1 1279 1523
187 202943.4:2000SEP08 g2020858 1285 1525
187 202943.4:2000SEP08 7213845H1 1347 1897
187 202943.4:2000SEP08 1226590T6 1488 1993
187 202943.4:2000SEP08 5092132H1 1620 1868
187 202943.4:2000SEP08 6469878H1 1729 2290
187 202943.4:2000SEP08 7594783H1 1923 2331
187 202943.4:2000SEP08 8113737H1 1946 2575
187 202943.4:2000SEP08 6472564H1 2047 2462
187 202943.4:2000SEP08 2579002H2 2080 2282
188 068682.1:2000SEP08 g3307490 141 343
188 068682.1:2000SEP08 6829315H1 313 883
188 068682.1:2000SEP08 g3109791 491 810
188 068682.1:2000SEP08 g5452473 491 649
188 068682.1:2000SEP08 g4372490 78 422
188 068682.1:2000SEP08 g3805312 34 422
188 068682.1:2000SEP08 g6043518 78 422
188 068682.1:2000SEP08 g4564783 26 422
188 068682.1:2000SEP08 g5438746 1 422
188 068682.1:2000SEP08 g2954208 76 422
188 068682.1:2000SEP08 g2954218 141 421
188 068682.1:2000SEP08 g6838215 123 382
188 068682.1:2000SEP08 2011384H1 189 262
189 LI:203301.3:2000SEP08 2613134T6 2172 2550
189 LI:203301.3:2000SEP08 3904374H1 2189 2332
189 LI:203301.3:2000SEP08 g5810033 2192 2612
189 LI:203301.3:2000SEP08 7931883H1 2214 2633
189 LI:203301.3:2000SEP08 g5540842 2229 2602
189 LI:203301.3:2000SEP08 g7040110 2231 2602
189 LI:203301.3:2000SEP08 g6658405 2231 2608
189 LI:203301.3:2000SEP08 7667756H1 1 611
189 LI:203301.3:2000SEP08 7263722H1 55 606
189 LI:203301.3:2000SEP08 2516560F6 58 548
189 LI:203301.3:2000SEP08 2516560H1 58 402
189 LI:203301.3:2000SEP08 5541375H1 57 257
189 LI:203301.3:2000SEP08 60212962U1 61 615
189 LI:203301.3:2000SEP08 7283792H1 69 696
189 LI:203301.3:2000SEP08 60212961U1 69 545
189 LI:203301.3:2000SEP08 267136H1 69 494
189 LI:203301.3:2000SEP08 5182929H1 74 301
189 LI:203301.3:2000SEP08 7159563H1 86 568
189 LI:203301.3:2000SEP08 1234347F1 123 720
189 LI:203301.3:2000SEP08 1234347F6 123 628
189 LI:203301.3:2000SEP08 1234347H1 123 389
189 LI:203301.3:2000SEP08 g2079759 135 549
189 LI:203301.3:2000SEP08 g318174 146 498
189 LI:203301.3:2000SEP08 60212963U1 234 677
189 LI:203301.3:2000SEP08 g1509838 280 523
189 LI:203301.3:2000SEP08 2074934H1 299 560
189 LI:203301.3:2000SEP08 2074934F6 299 551
189 LI:203301.3:2000SEP08 2791160F6 320 806
189 LI:203301.3:2000SEP08 2791160H2 320 641
189 LI:203301.3:2000SEP08 4310273H1 467 786
189 LI:203301.3:2000SEP08 6609896H2 474 960
189 LI:203301.3:2000SEP08 g6474225 508 832
189 LI:203301.3:2000SEP08 7087174H1 568 1086
189 LI:203301.3:2000SEP08 1675180H1 723 931
189 LI:203301.3:2000SEP08 7606685H1 717 1141
189 LI:203301.3:2000SEP08 6308156H1 739 1116
189 LI:203301.3:2000SEP08 3039416F6 956 1492
189 LI:203301.3:2000SEP08 3039416H1 956 1223
189 LI:203301.3:2000SEP08 7933647H1 1024 1706
189 LI:203301.3:2000SEP08 7674856H2 1152 1709
189 LI:203301.3:2000SEP08 3892582H1 1338 1621
189 LI:203301.3:2000SEP08 g618400 1359 1709
189 LI:203301.3:2000SEP08 6531461H1 1579 2120
189 LI:203301.3:2000SEP08 125847H1 1680 1862
189 LI:203301.3:2000SEP08 4769649H1 1770 2033
189 LI:203301.3:2000SEP08 3405320H1 1771 2032
189 LI:203301.3:2000SEP08 2791160T6 1930 2432
189 LI:203301.3:2000SEP08 4080504H1 1937 2244
189 LI:203301.3:2000SEP08 6609896T2 1941 2533
189 LI:203301.3:2000SEP08 789847H1 1945 2178
189 LI:203301.3:2000SEP08 790089H1 1946 2161
189 LI:203301.3:2000SEP08 3039416T6 1962 2566
189 LI:203301.3:2000SEP08 956099R7 1991 2394
189 LI:203301.3:2000SEP08 956099H1 1991 2252
189 LI:203301.3:2000SEP08 g7237547 2035 2493
189 LI:203301.3:2000SEP08 g6464289 2062 2472
189 LI:203301.3:2000SEP08 2074934T6 2078 2574
189 LI:203301.3:2000SEP08 g4310000 2084 2547
189 LI:203301.3:2000SEP08 2187776T6 2086 2569
189 LI:203301.3:2000SEP08 2187776F6 2093 2584
189 LI:203301.3:2000SEP08 2187776H1 2093 2377
189 LI:203301.3:2000SEP08 956099T6 2098 2568
189 LI:203301.3:2000SEP08 g4683266 2139 2608
189 LI:203301.3:2000SEP08 7247184H1 2140 2722
189 LI:203301.3:2000SEP08 g3240807 2142 2612
189 LI:203301.3:2000SEP08 g3804586 2145 2605
189 LI:203301.3:2000SEP08 g5638660 2154 2602
189 LI:203301.3:2000SEP08 g4297425 2160 2602
189 LI:203301.3:2000SEP08 g4736851 2167 2602
189 LI:203301.3:2000SEP08 2516560T6 2166 2550
189 LI:203301.3:2000SEP08 g7150807 2173 2610
189 LI:203301.3:2000SEP08 g3753686 2240 2617
189 LI:203301.3:2000SEP08 669418H1 2286 2573
189 LI:203301.3:2000SEP08 669124H1 2286 2558
189 LI:203301.3:2000SEP08 g5742262 2297 2602
189 LI:203301.3:2000SEP08 g1220014 2302 2615
189 LI:203301.3:2000SEP08 g4987993 2334 2602
189 LI:203301.3:2000SEP08 g3839358 2344 2602
189 LI:203301.3:2000SEP08 4913428H1 2456 2741
189 LI:203301.3:2000SEP08 3641081H1 2602 2728
189 LI:203301.3:2000SEP08 3485460H1 2614 2890
189 LI:203301.3:2000SEP08 4106769F6 2722 3162
189 LI:203301.3:2000SEP08 4106769H1 2721 2910
190 LI:020726.3:2000SEP08 70858739V1 458 1073
190 LI:020726.3:2000SEP08 71225604V1 499 1009
190 LI:020726.3:2000SEP08 70858001V1 500 1062
190 LI:020726.3:2000SEP08 70796452V1 508 901
190 LI:020726.3:2000SEP08 71226010V1 563 1119
190 LI:020726.3:2000SEP08 7179848H1 562 912
190 LI:020726.3:2000SEP08 71224790V1 581 1124
190 LI:020726.3:2000SEP08 70793655V1 639 743
190 LI:020726.3:2000SEP08 70855458V1 701 1340
190 LI:020726.3:2000SEP08 70858062V1 702 1306
190 LI:020726.3:2000SEP08 71226289V1 729 1174
190 LI:020726.3:2000SEP08 71224784V1 765 1232
190 LI:020726.3:2000SEP08 71224903V1 774 1282
190 LI:020726.3:2000SEP08 70861562V1 837 1436
190 LI:020726.3:2000SEP08 70855548V1 835 1325
190 LI:020726.3:2000SEP08 70855022V1 860 1390
190 LI:020726.3:2000SEP08 7724233H1 854 1272
190 LI:020726.3:2000SEP08 5311769H1 862 1049
190 LI:020726.3:2000SEP08 5311769F8 862 1491
190 LI:020726.3:2000SEP08 70857896V1 913 1567
190 LI:020726.3:2000SEP08 70857337V1 915 1530
190 LI:020726.3:2000SEP08 71225621V1 947 1531
190 LI:020726.3:2000SEP08 71225381V1 981 1564
190 LI:020726.3:2000SEP08 5990468H1 1046 1194
190 LI:020726.3:2000SEP08 6059824F8 1061 1589
190 LI:020726.3:2000SEP08 71225618V1 1091 1435
190 LI:020726.3:2000SEP08 70858268V1 788 1411
190 LI:020726.3:2000SEP08 70858492V1 790 1259
190 LI:020726.3:2000SEP08 70796790V1 810 1065
190 LI:020726.3:2000SEP08 71227063V1 812 1095
190 LI:020726.3:2000SEP08 70858313V1 815 1364
190 LI:020726.3:2000SEP08 8097890H1 1 501
190 LI:020726.3:2000SEP08 7441541H1 13 484
190 LI:020726.3:2000SEP08 70855180V1 111 1400
190 LI:020726.3:2000SEP08 70854812V1 781 1360
190 LI:020726.3:2000SEP08 71225057V1 314 873
190 LI:020726.3:2000SEP08 70856238V1 1167 1506
190 LI:020726.3:2000SEP08 7724233J1 1178 1737
190 LI:020726.3:2000SEP08 70858568V1 1248 1728
190 LI:020726.3:2000SEP08 70856388V1 1272 1750
190 LI:020726.3:2000SEP08 2738605T6 1334 1877
190 LI:020726.3:2000SEP08 71224942V1 1370 1816
190 LI:020726.3:2000SEP08 71224964V1 1376 1874
190 LI:020726.3:2000SEP08 2738601T7 1383 1810
190 LI:020726.3:2000SEP08 5906145H1 1506 1811
190 LI:020726.3:2000SEP08 5906145F6 1506 1790
190 LI:020726.3:2000SEP08 g1383637 1553 1914
190 LI:020726.3:2000SEP08 7427246H1 1717 2139
191 LI:027209.1:2000SEP08 1388139H1 1 262
191 LI:027209.1:2000SEP08 2737908H1 3 238
191 LI:027209.1:2000SEP08 2737908F6 3 395
191 LI:027209.1:2000SEP08 6306041F8 5 651
191 LI:027209.1:2000SEP08 6306041H1 5 464
191 LI:027209.1:2000SEP08 70563274V1 229 974
191 LI:027209.1:2000SEP08 70563246V1 340 843
191 LI:027209.1:2000SEP08 70565819V1 440 1018
191 LI:027209.1:2000SEP08 1904413H1 563 709
191 LI:027209.1:2000SEP08 70563142V1 627 1120
191 LI:027209.1:2000SEP08 70564681V1 633 1098
191 LI:027209.1:2000SEP08 70565477V1 675 1279
191 LI:027209.1:2000SEP08 70562989V1 740 1342
191 LI:027209.1:2000SEP08 70563804V1 745 1341
191 LI:027209.1:2000SEP08 70565738V1 758 1337
191 LI:027209.1:2000SEP08 2739431H1 801 1034
191 LI:027209.1:2000SEP08 2739431F6 801 1135
191 LI:027209.1:2000SEP08 70564623V1 974 1362
191 LI:027209.1:2000SEP08 6306041T8 982 1512
191 LI:027209.1:2000SEP08 2737908T6 1116 1572
191 LI:027209.1:2000SEP08 1265467H1 1494 1829
191 LI:027209.1:2000SEP08 2819460T6 1496 1585
191 LI:027209.1:2000SEP08 2819460F6 1503 1615
191 LI:027209.1:2000SEP08 2819460H1 1504 1587
192 LI:108819.1:2000SEP08 71304273V1 726 1132
192 LI:108819.1:2000SEP08 2992317H1 831 1113
Figure imgf000222_0001
Figure imgf000222_0002
TABLE 5
SEQ ID NO: Template ID Component ID Start Stop
193 LI:021759.1:2000SEP08 7098314H1 579 949
193 LI:021759.1:2000SEP08 2966371F6 685 1166
193 LI:021759.1:2000SEP08 2966371H1 686 989
193 LI:021759.1:2000SEP08 7739693H1 745 1294
193 LI:021759.1:2000SEP08 g1642049 756 1057
193 LI:021759.1:2000SEP08 g1753778 957 1246
194 LI:1165967.1:2000SEP08 g6639152 1 415
194 LI:1165967.1:2000SEP08 6795757H1 1 531
194 LI:1165967.1:2000SEP08 6795757F8 1 523
194 LI:1165967.1:2000SEP08 6795757T8 1 425
194 LI:1165967.1:2000SEP08 6790411T8 140 415
194 LI:1165967.1:2000SEP08 6790411F8 141 406
194 LI:1165967.1:2000SEP08 6790411H1 280 526
194 LI:1165967.1:2000SEP08 6795161T8 301 421
194 LI:1165967.1:2000SEP08 6795161F8 301 522
194 LI:1165967.1:2000SEP08 6795161H1 301 512
195 LI:1166315.1:2000SEP08 2996394H1 1 264
195 LI:1166315.1:2000SEP08 2919416H2 28 275
195 LI:1166315.1:2000SEP08 2972940H1 241 530
195 LI:1166315.1:2000SEP08 2972940F6 241 389
195 LI:1166315.1:2000SEP08 2829325H1 487 756
195 LI:1166315.1:2000SEP08 2950735H1 526 796
195 LI:1166315.1:2000SEP08 71601665V1 619 1055
195 LI:1166315.1:2000SEP08 2972940T6 717 895
196 LI:204626.1:2000SEP08 70901761V1 593 1048
196 LI:204626.1:2000SEP08 1543453T6 610 1221
196 LI:204626.1:2000SEP08 1921258H1 832 1046
196 LI:204626.1:2000SEP08 1921258F6 833 1233
196 LI:204626.1:2000SEP08 70902979V1 25 623
196 LI:204626.1:2000SEP08 70902424V1 520 1141
196 LI:204626.1:2000SEP08 4778273T9 559 1136
196 LI:204626.1:2000SEP08 71269109V1 24 482
196 LI:204626.1:2000SEP08 1543453R6 24 440
196 LI:204626.1:2000SEP08 71268878V1 24 332
196 LI:204626.1:2000SEP08 5055563F9 38 607
196 LI:204626.1:2000SEP08 5055563H1 38 307
196 LI:204626.1:2000SEP08 71269305V1 48 656
196 LI:204626.1:2000SEP08 70899115V1 100 741
196 LI:204626.1:2000SEP08 4778273F8 126 752
196 LI:204626.1:2000SEP08 4778273H1 126 397
196 LI:204626.1:2000SEP08 2724784H1 137 372
196 LI:204626.1:2000SEP08 71269307V1 152 747
196 LI:204626.1:2000SEP08 70901209V1 181 645
196 LI:204626.1:2000SEP08 71269923V1 198 768
196 LI:204626.1:2000SEP08 71270183V1 244 806
196 LI:204626.1:2000SEP08 70900135V1 264 662
196 LI:204626.1:2000SEP08 71270585V1 282 757
196 LI:204626.1:2000SEP08 71268793V1 282 772
196 LI:204626.1:2000SEP08 70899191V1 310 668
196 LI:204626.1:2000SEP08 2925738H1 339 605
196 LI:204626.1:2000SEP08 70901814V1 399 1015
196 LI:204626.1:2000SEP08 71270451V1 484 1071
196 LI:204626.1:2000SEP08 7931380H1 497 907
196 LI:204626.1:2000SEP08 6868907H1 1 542
196 LI:204626.1:2000SEP08 1543453H1 24 217
196 LI:204626.1:2000SEP08 71269839V1 24 555
196 LI:204626.1:2000SEP08 70902750V1 24 509
196 LI:204626.1:2000SEP08 71269008V1 24 569
197 LI:801140.1:2000SEP08 6103338H1 66 375
197 LI:801140.1:2000SEP08 1415866T6 391 627
197 LI:801140.1:2000SEP08 1445465H1 53 320
197 LI:801140.1:2000SEP08 6103338F7 66 594
197 LI:801140.1:2000SEP08 g1442104 1 290
197 LI:801140.1:2000SEP08 g1439745 13 306
197 LI:801140.1:2000SEP08 1445465T6 44 628
197 LI:801140.1:2000SEP08 1445465F6 53 367
198 LI:286639.1:2000SEP08 g6569543 1 466
198 LI:286639.1:2000SEP08 4138803H1 17 304
198 LI:286639.1:2000SEP08 3595464H1 344 623
198 LI:286639.1:2000SEP08 7687060J1 370 953
198 LI:286639.1:2000SEP08 8031936J1 648 1260
198 LI:286639.1:2000SEP08 5508394F6 681 1105
198 LI:286639.1:2000SEP08 5508394H1 681 879
198 LI:286639.1:2000SEP08 g2025699 726 1057
198 LI:286639.1:2000SEP08 3217093H1 740 1010
198 LI:286639.1:2000SEP08 3217093F6 740 1172
198 LI:286639.1:2000SEP08 7687060H1 1022 1613
198 LI:286639.1:2000SEP08 4919304F6 1113 1524
198 LI:286639.1:2000SEP08 5507629H1 1179 1448
198 LI:286639.1:2000SEP08 5832442H1 1196 1449
198 LI:286639.1:2000SEP08 70449421V1 1211 1798
198 LI:286639.1:2000SEP08 301366R6 1211 1673
198 LI:286639.1:2000SEP08 301366H1 1211 1427
198 LI:286639.1:2000SEP08 301366T6 1243 1769
198 LI:286639.1:2000SEP08 3217093T6 1310 1845
198 LI:286639.1:2000SEP08 6178302H1 1388 1561
198 LI:286639.1:2000SEP08 491 304H1 1393 1524
198 LI:286639.1:2000SEP08 70450250V1 1444 1656
198 LI:286639.1:2000SEP08 g2057022 1461 1801
198 LI:286639.1:2000SEP08 g3933409 1496 1976
198 LI:286639.1:2000SEP08 g3446647 1520 1976
198 LI:286639.1:2000SEP08 g7037249 1645 2077
198 LI:286639.1:2000SEP08 5508394R6 1791 2258
198 LI:286639.1:2000SEP08 g5812196 1868 1967
198 LI:286639.1:2000SEP08 g5397776 1882 2282
198 LI:286639.1:2000SEP08 g4739487 1905 2282
198 LI:286639.1:2000SEP08 g4739468 1907 2282
198 LI:286639.1:2000SEP08 4249162F6 1966 2348
198 LI:286639.1:2000SEP08 4249162H1 1966 2116
198 LI:286639.1:2000SEP08 g3109536 2092 2282
Figure imgf000225_0001
Figure imgf000225_0002
Figure imgf000226_0001
Figure imgf000226_0002
Figure imgf000227_0001
Figure imgf000227_0002
TABLE 5
SEQ ID NO: Template ID Component ID Start Stop
200 LI:332161.1:2000SEP08 g1162883 2013 2342
200 LI:332161.1:2000SEP08 5719879H1 2039 2437
200 LI:332161.1:2000SEP08 1597684F6 2050 2516
200 LI:332161.1:2000SEP08 1597684H1 2050 2258
200 LI:332161.1:2000SEP08 5290786H1 2052 2311
200 LI:332161.1:2000SEP08 3243490H1 2006 2243
200 LI:332161.1:2000SEP08 g705533 2089 2251
200 LI:332161.1:2000SEP08 2244532H1 2089 2207
200 LI:332161.1:2000SEP08 6148317H1 2253 2752
200 LI:332161.1:2000SEP08 7238430H1 2322 2657
200 LI:332161.1:2000SEP08 6865396H1 2320 2873
200 LI:332161.1:2000SEP08 6865595H1 2320 2847
200 LI:332161.1:2000SEP08 g2159107 2328 2715
200 LI:332161.1:2000SEP08 2263256H1 2328 2567
200 LI:332161.1:2000SEP08 6058081H1 2241 2307
200 LI:332161.1:2000SEP08 1597684T6 2483 3012
200 LI:332161.1:2000SEP08 4694344H1 2484 2590
200 LI:332161.1:2000SEP08 4030167T8 2497 2963
201 LI:184867.1:2000SEP08 2728616H1 39 290
201 LI:184867.1:2000SEP08 70868482V1 288 920
201 LI:184867.1:2000SEP08 4897718F6 342 738
201 LI:184867.1:2000SEP08 70868801V1 1342 1956
201 LI:184867.1:2000SEP08 71229928V1 1560 2254
201 LI:184867.1:2000SEP08 70867474V1 920 1530
201 LI:184867.1:2000SEP08 71230292V1 898 1427
201 LI:184867.1:2000SEP08 71220802V1 1570 1837
201 LI:184867.1:2000SEP08 70867220V1 1199 1738
201 LI:184867.1:2000SEP08 2614428H1 1272 1523
201 LI:184867.1:2000SEP08 70870340V1 1285 1456
201 LI:184867.1:2000SEP08 g3229291 1837 2254
201 LI:184867.1:2000SEP08 g4187447 1875 2257
201 LI:184867.1:2000SEP08 g3245634 1882 2251
201 LI:184867.1:2000SEP08 g3148017 1907 2248
201 LI:184867.1:2000SEP08 g661909 1925 2261
201 LI:184867.1:2000SEP08 g3238463 2003 2252
201 LI:184867.1:2000SEP08 4897718T6 1725 2206
201 LI:184867.1:2000SEP08 g5630597 1816 2261
201 LI:184867.1:2000SEP08 70838826V1 342 614
201 LI:184867.1:2000SEP08 4897718H1 342 594
201 LI:184867.1:2000SEP08 70868224V1 342 523
201 LI:184867.1:2000SEP08 70870473V1 1292 1908
201 LI:184867.1:2000SEP08 70870290V1 343 929
201 LI:184867.1:2000SEP08 71229226V1 386 922
201 LI:184867.1:2000SEP08 5876947H1 417 693
201 LI:184867.1:2000SEP08 70870733V1 555 1027
201 LI:184867.1:2000SEP08 70868874V1 622 814
201 LI:184867.1:2000SEP08 70840206V1 662 1019
201 LI:184867.1:2000SEP08 70869882V1 725 1390
201 LI:184867.1:2000SEP08 70818463V1 785 967
201 LI:184867.1:2000SEP08 70869040V1 783 1385
Figure imgf000229_0001
Figure imgf000230_0002
Figure imgf000230_0001
Figure imgf000230_0003
Figure imgf000231_0001
Figure imgf000231_0002
Figure imgf000232_0001
Figure imgf000232_0002
TABLE 5
SEQ ID NO: Template ID Component ID Start Stop
208 LI:220537.2:2000SEP08 70036886V1 1202 1670
208 LI:220537.2:2000SEP08 70038905V1 1202 1532
208 LI:220537.2:2000SEP08 3988928 6 1202 1693
208 LI:220537.2:2000SEP08 70037558V1 1202 1625
208 LI:220537.2:2000SEP08 70113012V1 1799 1927
208 LI:220537.2:2000SEP08 70112530V1 1799 1886
208 LI:220537.2:2000SEP08 4822576H1 1908 2170
208 LI:220537.2:2000SEP08 70036263V1 1202 1550
208 LI:220537.2:2000SEP08 70036856V1 1202 1559
208 LI:220537.2:2000SEP08 3988928H1 1202 1391
208 LI:220537.2:2000SEP08 4320977H1 1272 1562
208 LI:220537.2:2000SEP08 70112984V1 1279 1395
208 LI:220537.2:2000SEP08 70035465V1 1202 1607
208 LI:220537.2:2000SEP08 4248779H1 120 271
208 LI:220537.2:2000SEP08 70037119V1 1432 2003
208 LI:220537.2:2000SEP08 3988928T6 1450 1964
208 LI:220537.2:2000SEP08 1681947H1 625 839
208 LI:220537.2:2000SEP08 70036202V1 1474 1583
208 LI:220537.2:2000SEP08 3436719H1 1334 1577
208 LI:220537.2:2000SEP08 70039272V1 1374 1829
208 LI:220537.2:2000SEP08 70037845V1 1389 1853
208 LI:220537.2:2000SEP08 70035348V1 1376 1939
208 LI:220537.2:2000SEP08 988064H1 327 607
208 LI:220537.2:2000SEP08 7664878J1 387 1027
208 LI:220537.2:2000SEP08 4248779R6 567 1023
208 LI:220537.2:2000SEP08 4248779F6 120 545
208 LI:220537.2:2000SEP08 7247233H1 144 680
208 LI:220537.2:2000SEP08 5390613H1 24 282
208 LI:220537.2:2000SEP08 g2022820 747 1095
208 LI:220537.2:2000SEP08 7664878H1 951 1459
208 LI:220537.2:2000SEP08 g839595 1056 1292
208 LI:220537.2:2000SEP08 70039014V1 1202 1800
208 LI:220537.2:2000SEP08 70036954V1 1202 1664
208 LI:220537.2:2000SEP08 70035507V1 1202 1625
209 LI:248364.2:2000SEP08 814363R6 1500 1854
209 LI:248364.2:2000SEP08 814363H1 1500 1729
209 LI:248364.2:2000SEP08 g3931760 1512 1907
209 LI:248364.2:2000SEP08 5047370H1 1447 1737
209 LI:248364.2:2000SEP08 814363R1 1500 1930
209 LI:248364.2:2000SEP08 60209134U1 1401 1704
209 LI:248364.2:2000SEP08 60209135U1 1408 1901
209 LI:248364.2:2000SEP08 6758436H1 1100 1572
209 LI:248364.2:2000SEP08 7754413H1 1124 1659
209 LI:248364.2:2000SEP08 6608287H1 1244 1796
209 LI:248364.2:2000SEP08 8053666J1 1323 1932
209 LI:248364.2:2000SEP08 3071079H1 1364 1661
209 LI:248364.2:2000SEP08 7608161J1 1374 1930
209 LI:248364.2:2000SEP08 60201844V1 1245 1645
209 LI:248364.2:2000SEP08 g3873143 220 416
209 LI:248364.2:2000SEP08 7468175H1 456 898
Figure imgf000234_0001
Figure imgf000235_0001
Figure imgf000235_0002
Figure imgf000235_0003
Figure imgf000236_0001
Figure imgf000237_0001
Figure imgf000238_0001
Figure imgf000238_0002
Figure imgf000238_0003
TABLE 5
SEQ ID NO: Template ID Component ID Start Stop
213 LI:1185841.1:2000SEP08 71358284V1 2678 3220
213 LI:1185841.1:2000SEP08 2241089F6 2743 3171
213 LI:1185841.1:2000SEP08 3275630H1 2611 2893
213 LI:1185841.1:2000SEP08 290499H1 2632 2978
213 LI:1185841.1:2000SEP08 2241089H1 2743 2990
213 LI:1185841.1:2000SEP08 2399690H1 2771 3008
213 LI:1185841.1:2000SEP08 71356842V1 2140 2807
213 LI:1185841.1:2000SEP08 71355693V1 2191 2945
213 LI:1185841.1:2000SEP08 7961057H1 2184 2857
213 LI:1185841.1:2000SEP08 71356810V1 458 1117
213 LI:1185841.1:2000SEP08 71358113V1 469 1065
213 LI:1185841.1:2000SEP08 8041383J1 501 1058
213 LI:1185841.1:2000SEP08 1749048H1 515 783
213 LI:1185841.1:2000SEP08 1749048F6 515 875
213 LI:1185841.1:2000SEP08 70861180V1 515 727
213 LI:1185841.1:2000SEP08 70794384V1 525 1062
213 LI:1185841.1:2000SEP08 4957262F6 1524 1984
213 LI:1185841.1:2000SEP08 4957262H1 1524 1809
213 LI:1185841.1:2000SEP08 2443857F6 1524 1871
213 LI:1185841.1:2000SEP08 2443857H1 1524 1772
213 LI:1185841.1:2000SEP08 71336855V1 1564 1876
213 LI:1185841.1:2000SEP08 70797142V1 1570 2252
213 LI:1185841.1:2000SEP08 70794533V1 938 1489
213 LI:1185841.1:2000SEP08 71357423V1 970 1608
213 LI:1185841.1:2000SEP08 71351527V1 989 1192
213 LI:1185841.1:2000SEP08 71351586V1 989 1192
213 LI:1185841.1:2000SEP08 71347649V1 1032 1279
213 LI:1185841.1:2000SEP08 3518889H1 1044 1368
214 LI:1181710.1:2000SEP08 1476485H1 1 85
214 LI:1181710.1:2000SEP08 1476477H1 1 205
214 LI:1181710.1:2000SEP08 1476477H6 1 207
214 LI:1181710.1:2000SEP08 1476477F6 1 546
214 LI:1181710.1:2000SEP08 4398045H1 1 226
214 LI:1181710.1:2000SEP08 105065H1 327 504
214 LI:1181710.1:2000SEP08 1476477T6 122 543
214 LI:1181710.1:2000SEP08 105065 1 327 461
215 LI:2048959.1:2000SEP08 g1809909 64 414
215 LI:2048959.1:2000SEP08 3325402T7 1 458
215 LI:2048959.1:2000SEP08 g1757886 64 317
216 LI:798494.1:2000SEP08 2110417H1 559 823
216 LI:798494.1:2000SEP08 1398471H1 270 388
216 LI:798494.1:2000SEP08 g5880265 1 409
216 LI:798494.1:2000SEP08 1398471F6 151 560
216 LI:798494.1:2000SEP08 3382640F8 498 966
216 LI:798494.1:2000SEP08 1399832H1 270 377
216 LI:798494.1:2000SEP08 g4690049 1 330
217 LI:2049223.1:2000SEP08 4 14013T8 166 672
217 LI:2049223.1:2000SEP08 g2907281 202 361
217 LI:2049223.1:2000SEP08 4914013H1 1 255
218 LI:1177833.1:2000SEP08 5502992F6 717 1037
Figure imgf000240_0002
Figure imgf000241_0001
Figure imgf000241_0002
Figure imgf000241_0003
Figure imgf000241_0004
Figure imgf000242_0001
Figure imgf000242_0002
Figure imgf000242_0003
TABLE 5
SEQ ID NO: Template ID Component ID Start Stop
225 LI:1088010.2:2000SEP08 60215398U1 281 787
225 LI:1088010.2:2000SEP08 6002309F8 296 914
225 LI:1088010.2:2000SEP08 71659572V1 347 963
225 LI:1088010.2:2000SEP08 71660290V1 383 1018
225 LI:1088010.2:2000SEP08 60215397U1 424 921
225 LI:1088010.2:2000SEP08 60208437U1 515 952
225 LI:1088010.2:2000SEP08 60208271U2 515 952
225 LI:1088010.2:2000SEP08 60215409U1 584 1055
225 LI:1088010.2:2000SEP08 2887278F6 642 1183
225 LI:1088010.2:2000SEP08 6354996F8 645 1074
225 LI:1088010.2:2000SEP08 2665463F7 650 1203
225 LI:1088010.2:2000SEP08 g2825271 788 1219
225 LI:1088010.2:2000SEP08 60215394U1 772 1117
225 LI:1088010.2:2000SEP08 5989526T8 994 1514
225 LI:1088010.2:2000SEP08 696418R6 1007 1341
225 LI:1088010.2:2000SEP08 71653192V1 1033 1647
225 LI:1088010.2:2000SEP08 997293T6 1058 1527
225 LI:1088010.2:2000SEP08 997293R6 1070 1601
225 LI:1088010.2:2000SEP08 60208436U1 1073 1609
225 LI:1088010.2:2000SEP08 539754F1 1144 1629
225 LI:1088010.2:2000SEP08 539754H1 1155 1376
225 LI:1088010.2:2000SEP08 1662073T6 1149 1582
225 LI:1088010.2:2000SEP08 539754R1 1156 1549
225 LI:1088010.2:2000SEP08 2447370H1 1176 1397
225 LI:1088010.2:2000SEP08 60208438U1 1166 1606
225 LI:1088010.2:2000SEP08 1932152T6 1220 1592
225 LI:1088010.2:2000SEP08 1860333F6 1288 1629
225 LI:1088010.2:2000SEP08 696 18T6 1300 1592
225 LI:1088010.2:2000SEP08 g1685966 1457 1633
225 LI:1088010.2:2000SEP08 2588430T6 1304 1590
225 LI:1088010.2:2000SEP08 6836029H1 394 859
225 LI:1088010.2:2000SEP08 2994174H1 527 827
225 LI:1088010.2:2000SEP08 1369182R1 1298 1629
225 LI:1088010.2:2000SEP08 4302572H1 278 529
225 LI:1088010.2:2000SEP08 696418H1 1007 1229
225 LI:1088010.2:2000SEP08 3703176F6 1 538
225 LI:1088010.2:2000SEP08 6816514J1 90 657
225 LI:1088010.2:2000SEP08 6772720J1 112 624
225 LI:1088010.2:2000SEP08 8010886H1 114 618
225 LI:1088010.2:2000SEP08 60215401U1 160 647
225 LI:1088010.2:2000SEP08 g1506103 637 925
225 LI:1088010.2:2000SEP08 6002309H1 302 582
225 LI:1088010.2:2000SEP08 7617283J1 354 747
225 LI:1088010.2:2000SEP08 025139H1 1005 1081
225 LI:1088010.2:2000SEP08 2541212H1 1551 1629
225 LI:1088010.2:2000SEP08 5989526H1 721 1001
225 LI:1088010.2:2000SEP08 5570460H1 199 416
225 LI:1088010.2:2000SEP08 1662073H1 1156 1260
225 LI:1088010.2:2000SEP08 1369182T1 1298 1588
225 LI:1088010.2:2000SEP08 5801043H1 1034 1190
225 LI:1088010.2:2000SEP08 60215402U1 500 955
225 LI:1088010.2:2000SEP08 60215405U1 593 949
225 LI:1088010.2:2000SEP08 999496H1 1083 1191
225 LI:1088010.2:2000SEP08 3441343H1 651 899
225 LI:1088010.2:2000SEP08 1476972T6 1347 1591
225 LI:1088010.2:2000SEP08 6515741H1 103 649
225 LI:1088010.2:2000SEP08 997293R2 1070 1584
225 LI:1088010.2:2000SEP08 g1502060 1346 1629
225 LI:1088010.2:2000SEP08 g1785439 788 1245
226 LI:1165276.1:2000SEP08 1895443H1 19 272
226 LI:1165276.1:2000SEP08 3052342H1 46 335
226 LI:1165276.1:2000SEP08 3021038T6 568 880
226 LI:1165276.1:2000SEP08 1891884H1 19 264
226 LI:1165276.1:2000SEP08 3022533T6 568 902
226 LI:1165276.1:2000SEP08 3021710H1 568 853
226 LI:1165276.1:2000SEP08 4729485H1 86 235
226 LI:1165276.1:2000SEP08 5217490H1 1 198
226 LI:1165276.1:2000SEP08 1542833H1 1 137
226 LI:1165276.1:2000SEP08 3836068F6 7 549
226 LI:1165276.1:2000SEP08 6018614H1 102 701
226 LI:1165276.1:2000SEP08 g5661058 155 347
226 LI:1165276.1:2000SEP08 3525414H1 171 494
226 LI:1165276.1:2000SEP08 6621760H1 188 747
226 LI:1165276.1:2000SEP08 3679568T9 546 925
226 LI:1165276.1:2000SEP08 3836068T6 560 1044
226 LI:1165276.1:2000SEP08 3021710R6 568 943
226 LI:1165276.1:2000SEP08 6371853H1 587 865
226 LI:1165276.1:2000SEP08 3836068H1 7 289
226 LI:1165276.1:2000SEP08 5735738H1 78 348
227 LI:1169524.2:2000SEP08 g5741053 237 456
227 LI:1169524.2:2000SEP08 g4983712 216 581
227 LI:1169524.2:2000SEP08 g4536188 231 456
227 LI:1169524.2:2000SEP08 5645239H1 752 935
227 LI:1169524.2:2000SEP08 6156187H1 1 322
227 LI:1169524.2:2000SEP08 g2583400 27 527
227 LI:1169524.2:2000SEP08 505847H1 119 324
227 LI:1169524.2:2000SEP08 g2540081 125 581
227 LI:1169524.2:2000SEP08 3015762H1 145 431
227 LI:1169524.2:2000SEP08 8109677J1 532 1030
227 LI:1169524.2:2000SEP08 8109677H1 549 1033
227 LI:1169524.2:2000SEP08 6478543H1 586 1019
227 LI:1169524.2:2000SEP08 g2278640 151 581
228 LI:1180255.1:2000SEP08 6946211H1 1102 1636
228 LI:1180255.1:2000SEP08 g2159550 567 933
228 LI:1180255.1:2000SEP08 g2011335 1381 1758
228 LI:1180255.1:2000SEP08 5990950H1 1429 1730
228 LI:1180255.1:2000SEP08 6946142H1 1272 1603
228 LI:1180255.1:2000SEP08 8039729H1 812 1485
228 LI:1180255.1:2000SEP08 g401 078 498 942
228 LI:1180255.1:2000SEP08 5286647T8 506 818
Figure imgf000245_0001
Figure imgf000245_0002
Figure imgf000245_0003
Figure imgf000246_0001
TABLE 5
SEQ ID NO: Template ID Component ID Start Stop
231 LI:2050313.1:2000SEP08 g1760607 1992 2359
231 LI:2050313.1:2000SEP08 71191657V1 1992 2629
231 LI:2050313.1:2000SEP08 7659719J1 1994 2665
231 LI:2050313.1:2000SEP08 g4889106 3344 3563
231 LI:2050313.1:2000SEP08 70697058V1 2435 2991
231 LI:2050313.1:2000SEP08 70861844V1 1946 2492
231 LI:2050313.1:2000SEP08 2800074H1 1946 2241
231 LI:2050313.1:2000SEP08 g857159 1958 2269
231 LI:2050313.1:2000SEP08 g866480 1958 2280
231 LI:2050313.1:2000SEP08 4778417H1 2979 3263
231 LI:2050313.1:2000SEP08 g4373468 3070 3544
231 LI:2050313.1:2000SEP08 4903502H2 2864 2921
231 LI:2050313.1:2000SEP08 5722778H1 2898 3447
231 LI:2050313.1:2000SEP08 70696758V1 2231 2506
231 LI:2050313.1:2000SEP08 4694505H1 2232 2431
231 LI:2050313.1:2000SEP08 7063383H1 2471 2801
231 LI:2050313.1:2000SEP08 71635157V1 2473 3045
231 LI:2050313.1:2000SEP08 2079269T6 2481 2851
231 LI:2050313.1:2000SEP08 2130543H1 2454 2737
231 LI:2050313.1:2000SEP08 70694490V1 2458 2927
231 LI:2050313.1:2000SEP08 g2220430 2465 2911
231 LI:2050313.1:2000SEP08 g1928342 2487 2884
231 LI:2050313.1:2000SEP08 816379T6 2435 2852
231 LI:2050313.1:2000SEP08 5119740F6 292 753
231 LI:2050313.1:2000SEP08 g1970036 1817 2199
231 LI:2050313.1:2000SEP08 g3095393 2522 2913
231 LI:2050313.1:2000SEP08 g866395 2517 2882
231 LI:2050313.1:2000SEP08 517655H1 1 226
231 LI:2050313.1:2000SEP08 2012964H1 1982 2271
231 LI:2050313.1:2000SEP08 g1928165 1983 2430
231 LI:2050313.1:2000SEP08 3742339H1 1986 2296
231 LI:2050313.1:2000SEP08 g823926 2614 2924
231 LI:2050313.1:2000SEP08 1899389H1 1132 1371
231 LI:2050313.1:2000SEP08 70864752V1 1137 1586
231 LI:2050313.1:2000SEP08 71228173V1 1160 1709
231 LI:2050313.1:2000SEP08 4572460H1 1167 1458
231 LI:2050313.1:2000SEP08 835906H1 1802 2105
231 LI:2050313.1:2000SEP08 5460718H1 1122 1379
231 LI:2050313.1:2000SEP08 g6398687 2432 2899
231 LI:2050313.1:2000SEP08 g2669938 2432 2899
231 LI:2050313.1:2000SEP08 71188286V1 1885 2447
231 LI:2050313.1:2000SEP08 g2878831 2652 2907
231 LI:2050313.1:2000SEP08 g1548944 2664 2913
231 LI:2050313.1:2000SEP08 g2360009 2675 2902
231 LI:2050313.1:2000SEP08 g5837028 2427 2917
231 LI:2050313.1:2000SEP08 6011505H1 2425 2729
231 LI:2050313.1:2000SEP08 71637073V1 1676 2206
231 LI:2050313.1:2000SEP08 3408725F6 1676 2141
231 LI:2050313.1:2000SEP08 g2008872 1741 2031
231 LI:2050313.1:2000SEP08 71638790V1 1774 2386
Figure imgf000248_0001
Figure imgf000248_0002
Figure imgf000249_0001
Figure imgf000250_0001
Figure imgf000251_0001
Figure imgf000251_0002
Figure imgf000251_0003
TABLE 5
SEQ ID NO: Template ID Component ID Start Stop
232 LI:209351.3:2000SEP08 70156234V1 1081 1435
232 LI:209351.3:2000SEP08 70152471V1 1087 1454
232 LI:209351.3:2000SEP08 70157391V1 1107 1458
232 LI:209351.3:2000SEP08 70161681V1 1116 1447
232 LI:209351.3:2000SEP08 70156138V1 1118 1462
232 LI:209351.3:2000SEP08 70156038V1 1123 1462
232 LI:209351.3:2000SEP08 70719134V1 1195 1590
232 LI:209351.3:2000SEP08 71273545V1 2074 2675
232 LI:209351.3:2000SEP08 6623018J1 708 1308
232 LI:209351.3:2000SEP08 70151903V1 685 1170
232 LI:209351.3:2000SEP08 g1988613 740 1049
232 LI:209351.3:2000SEP08 70151788V1 845 1389
232 LI:209351.3:2000SEP08 70157541V1 875 1294
232 LI:209351.3:2000SEP08 70712620V1 880 1218
232 LI:209351.3:2000SEP08 70689377V1 883 1488
232 LI:209351.3:2000SEP08 70151840V1 899 1361
232 LI:209351.3:2000SEP08 g713726 903 1201
232 LI:209351.3:2000SEP08 70720502V1 1590 2144
232 LI:209351.3:2000SEP08 70717624V1 1625 1804
232 LI:209351.3:2000SEP08 70717872V1 1649 2012
232 LI:209351.3:2000SEP08 70718549V1 1649 1773
232 LI:209351.3:2000SEP08 6993359H1 1575 2193
232 LI:209351.3:2000SEP08 70717405V1 1590 2079
232 LI:209351.3:2000SEP08 70647862V1 1700 2090
232 LI:209351.3:2000SEP08 70717416V1 755 1280
232 LI:209351.3:2000SEP08 70159034V1 273 735
232 LI:209351.3:2000SEP08 70158626V1 811 1316
232 LI:209351.3:2000SEP08 71271703V1 2199 2776
232 LI:209351.3:2000SEP08 2402824H1 2212 2442
232 LI:209351.3:2000SEP08 71271047V1 2296 2603
232 LI:209351.3:2000SEP08 71272430V1 2413 2603
232 LI:209351.3:2000SEP08 4500146T6 2522 2977
232 LI:209351.3:2000SEP08 1377137H1 2728 2927
232 LI:209351.3:2000SEP08 5193988H1 2111 2376
232 LI:209351.3:2000SEP08 1876631H1 2135 2409
232 LI:209351.3:2000SEP08 71272644V1 2148 2603
232 LI:209351.3:2000SEP08 70912859V1 2166 2650
232 LI:209351.3:2000SEP08 2255236H1 2182 2410
232 LI:209351.3:2000SEP08 70158340V1 761 1315
232 LI:209351.3:2000SEP08 70160323V1 774 1173
232 LI:209351.3:2000SEP08 70151309V1 485 903
232 LI:209351.3:2000SEP08 6482421H1 605 1155
232 LI:209351.3:2000SEP08 70716723V1 1200 1399
232 LI:209351.3:2000SEP08 70720944V1 1200 1399
232 LI:209351.3:2000SEP08 70157680V1 1 458
232 LI:209351.3:2000SEP08 70154364V1 1 410
232 LI:209351.3:2000SEP08 8123890H1 183 851
232 LI:209351.3:2000SEP08 70160906V1 1 442
232 LI:209351.3:2000SEP08 70152279V1 775 1455
232 LI:209351.3:2000SEP08 4500146F6 1494 1814
Figure imgf000253_0001
TABLE 5
SEQ ID NO: Template ID Component ID Start Stop
234 LI:2052274.1:2000SEP08 71742947V1 1071 1509
234 LI:2052274.1:2000SEP08 71744365V1 1154 1687
234 LI:2052274.1:2000SEP08 3732853H1 1138 1288
234 LI:2052274.1:2000SEP08 7318884H2 1212 1732
234 LI:2052274.1:2000SEP08 877185H1 341 601
234 LI:2052274.1:2000SEP08 5674936T6 858 1398
234 LI:2052274.1:2000SEP08 71741876V1 885 1417
234 LI:2052274.1:2000SEP08 71742240V1 1008 1509
234 LI:2052274.1:2000SEP08 71746493V1 1016 1530
234 LI:2052274.1:2000SEP08 71742262V1 1052 1509
234 LI:2052274.1:2000SEP08 2797262F6 1065 1642
234 LI:2052274.1:2000SEP08 2797262H1 1065 1314
234 LI:2052274.1:2000SEP08 4157540H1 60 240
234 LI:2052274.1:2000SEP08 4157540F8 96 559
234 LI:2052274.1:2000SEP08 1704531H1 160 301
234 LI:2052274.1:2000SEP08 7658244J1 342 905
234 LI:2052274.1:2000SEP08 71741928V1 576 809
234 LI:2052274.1:2000SEP08 2831361H1 586 859
234 LI:2052274.1:2000SEP08 71744286V1 993 1509
234 LI:2052274.1:2000SEP08 g1523013 960 1142
234 LI:2052274.1:2000SEP08 g3678477 973 1437
234 LI:2052274.1:2000SEP08 71747321V1 931 1123
234 LI:2052274.1:2000SEP08 71741847V1 957 1509
234 LI:2052274.1:2000SEP08 71743248V1 355 807
234 LI:2052274.1:2000SEP08 7986547H1 830 1316
234 LI:2052274.1:2000SEP08 71746536V1 753 1314
234 LI:2052274.1:2000SEP08 3731660H1 750 1057
234 LI:2052274.1:2000SEP08 71745329V1 795 1125
234 LI:2052274.1:2000SEP08 71743782V1 811 1172
234 LI:2052274.1:2000SEP08 71745857V1 811 1172
234 LI:2052274.1:2000SEP08 877185R6 341 824
234 LI:2052274.1:2000SEP08 2672580F6 559 824
234 LI:2052274.1:2000SEP08 5674936F6 355 966
234 LI:2052274.1:2000SEP08 71743956V1 355 937
234 LI:2052274.1:2000SEP08 71743912V1 355 872
234 LI:2052274.1:2000SEP08 71743192V1 351 981
234 LI:2052274.1:2000SEP08 71744831V1 355 1004
234 LI:2052274.1:2000SEP08 71742129V1 576 808
234 LI:2052274.1:2000SEP08 4333137H1 356 633
234 LI:2052274.1:2000SEP08 7762849J1 903 1561
234 LI:2052274.1:2000SEP08 2561680H1 1452 1744
234 LI:2052274.1:2000SEP08 g7456825 1461 1765
234 LI:2052274.1:2000SEP08 71744355V1 1451 1509
234 LI:2052274.1:2000SEP08 g3750627 1488 1732
234 LI:2052274.1:2000SEP08 g2557941 1597 1732
234 LI:2052274.1:2000SEP08 6768646J1 1304 1524
234 LI:2052274.1:2000SEP08 71746594V1 1336 1420
234 LI:2052274.1:2000SEP08 71742519V1 599 1286
234 LI:2052274.1:2000SEP08 6244881H1 673 752
234 LI:2052274.1:2000SEP08 71742310V1 692 1387
Figure imgf000255_0001
Figure imgf000255_0002
Figure imgf000256_0001
Figure imgf000256_0002
Figure imgf000257_0001
Figure imgf000257_0002
Figure imgf000257_0003
Figure imgf000257_0004
TABLE 5
SEQ ID NO: Template ID Component ID Start Stop
240 LI:1183147.3:2000SEP08 6560934T8 879 1494
240 LI:1183147.3:2000SEP08 6560934H1 880 1451
240 LI:1183147.3:2000SEP08 6560934F8 880 1603
240 LI:1183147.3:2000SEP08 2844761H1 887 1147
240 LI:1183147.3:2000SEP08 1832094H1 935 1025
240 LI:1183147.3:2000SEP08 g1300493 1073 1487
240 LI:1183147.3:2000SEP08 1007614H1 595 776
240 LI:1183147.3:2000SEP08 1401826F6 630 1115
240 LI:1183147.3:2000SEP08 1401826H1 630 876
240 LI:1183147.3:2000SEP08 1333095F6 632 949
240 LI:1183147.3:2000SEP08 70651793V1 690 1236
240 LI:1183147.3:2000SEP08 70649650V1 706 1269
240 LI:1183147.3:2000SEP08 2597282T6 1329 1606
240 LI:1183147.3:2000SEP08 g6041091 1329 1598
240 LI:1183147.3:2000SEP08 g3803626 1329 1602
240 LI:1183147.3:2000SEP08 g4990175 1362 1598
240 LI:1183147.3:2000SEP08 g4890819 1146 1602
240 LI:1183147.3:2000SEP08 196640R6 1 205
241 LI:1175373.3:2000SEP08 70394043D1 245 647
241 LI:1175373.3:2000SEP08 70503364V1 9 417
241 LI:1175373.3:2000SEP08 70392349D1 21 557
241 LI:1175373.3:2000SEP08 70393943D1 215 521
241 LI:1175373.3:2000SEP08 70503499V1 35 722
241 LI:1175373.3:2000SEP08 60217417D1 39 230
241 LI:1175373.3:2000SEP08 70497402V1 281 581
241 LI:1175373.3:2000SEP08 7312966H1 317 725
241 LI:1175373.3:2000SEP08 6772905J1 1 603
241 LI:1175373.3:2000SEP08 2189954H1 9 279
241 LI:1175373.3:2000SEP08 70502455V1 9 578
241 LI:1175373.3:2000SEP08 2189954F6 9 365
241 LI:1175373.3:2000SEP08 60217438D1 12 329
241 LI:1175373.3:2000SEP08 60217436D1 16 346
241 LI:1175373.3:2000SEP08 60217437D1 18 346
241 LI:1175373.3:2000SEP08 70355479D1 332 593
241 LI:1175373.3:2000SEP08 70393236D1 213 647
242 LI:813757.1:2000SEP08 70250012V1 7 394
242 LI:813757.1:2000SEP08 70247960V1 114 615
242 LI:813757.1:2000SEP08 70248697V1 235 565
242 LI:813757.1:2000SEP08 70250218V1 161 693
242 LI:813757.1:2000SEP08 70250104V1 120 722
242 LI:813757.1:2000SEP08 70250824V1 142 721
242 LI:813757.1:2000SEP08 70250254V1 221 716
242 LI:813757.1:2000SEP08 70249632V1 157 707
242 LI:813757.1:2000SEP08 70247802V1 59 589
242 LI:813757.1:2000SEP08 70251286V1 588
242 LI:813757.1:2000SEP08 3449946R6 577
242 LI:813757.1:2000SEP08 70251559V1 210 580
242 LI:813757.1:2000SEP08 70251049V1 549
242 LI:813757.1:2000SEP08 70247407V1 535
242 LI:813757.1:2000SEP08 70251324V1 532
Figure imgf000259_0001
Figure imgf000260_0002
Figure imgf000260_0001
Figure imgf000260_0003
Figure imgf000261_0001
Figure imgf000261_0002
Figure imgf000261_0003
Figure imgf000262_0001
Figure imgf000262_0002
Figure imgf000262_0003
Figure imgf000263_0001
TABLE 5
SEQ ID NO: Template ID Component ID Start Stop
248 LI:234937.4:2000SEP08 6201371H1 28 589
248 LI:234937.4:2000SEP08 6444066H1 86 619
248 LI:234937.4:2000SEP08 70170335V1 88 512
248 LI:234937.4:2000SEP08 70199890V1 103 612
248 LI:234937.4:2000SEP08 6951283H1 119 554
248 LI:234937.4:2000SEP08 5400740H1 147 415
248 LI:234937.4:2000SEP08 971204R6 164 574
248 LI:234937.4:2000SEP08 971204H7 164 475
248 LI:234937.4:2000SEP08 60210282U1 315 883
248 LI:234937.4:2000SEP08 70197633V1 381 680
248 LI:234937.4:2000SEP08 70165167V1 204 493
248 LI:234937.4:2000SEP08 70166455V1 210 680
248 LI:234937.4:2000SEP08 70168325V1 207 490
248 LI:234937.4:2000SEP08 5900730H1 216 482
248 LI:234937.4:2000SEP08 71704061V1 249 423
248 LI:234937.4:2000SEP08 70200163V1 268 716
248 LI:234937.4:2000SEP08 70199868V1 284 711
248 LI:234937.4:2000SEP08 70165466V1 286 698
248 LI:234937.4:2000SEP08 70200075V1 297 796
248 LI:234937.4:2000SEP08 70199839V1 309 674
248 LI:234937.4:2000SEP08 70199994V1 314 744
248 LI:234937.4:2000SEP08 2815595H1 313 584
248 LI:234937.4:2000SEP08 971204H1 164 427
248 LI:234937.4:2000SEP08 5907378H1 177 434
248 LI:234937.4:2000SEP08 70200111V1 182 708
248 LI:234937.4:2000SEP08 6439781H1 198 708
248 LI:234937.4:2000SEP08 70169923V1 386 666
248 LI:234937.4:2000SEP08 70200182V1 391 835
248 LI:234937.4:2000SEP08 4245558H1 421 665
248 LI:234937.4:2000SEP08 984607H1 440 753
248 LI:234937.4:2000SEP08 4194365H1 442 750
249 LI:1170660.1:2000SEP08 7582236H1 1 568
249 LI:1170660.1:2000SEP08 60218360D1 45 533
249 LI:1170660.1:2000SEP08 60218356D1 69 309
249 LI:1170660.1:2000SEP08 5547632H1 223 419
249 LI:1170660.1:2000SEP08 7186676H1 223 780
249 LI:1170660.1:2000SEP08 4706707H1 229 494
249 LI:1170660.1:2000SEP08 4543751H1 229 492
249 LI:1170660.1:2000SEP08 g1635530 251 552
249 LI:1170660.1:2000SEP08 g1230443 257 449
249 LI:1170660.1:2000SEP08 60218359D1 264 759
249 LI:1170660.1:2000SEP08 172943H1 324 547
249 LI:1170660.1:2000SEP08 60218364D1 513 923
249 LI:1170660.1:2000SEP08 4572811H1 588 716
249 LI:1170660.1:2000SEP08 7328676H1 604 1045
249 LI:1170660.1:2000SEP08 g706516 643 871
249 LI:1170660.1:2000SEP08 g883656 644 1014
249 LI:1170660.1:2000SEP08 3806727H1 649 774
249 LI:1170660.1:2000SEP08 g675102 664 1020
249 LI:1170660.1:2000SEP08 60218358D1 851 1293
Figure imgf000265_0002
Figure imgf000265_0001
Figure imgf000265_0003
Figure imgf000266_0001
TABLE 5
SEQ ID NO: Template ID Component ID Start Stop
251 LI:246290.10:2000SEP08 7644139J1 512 1059
251 LI:246290.10:2000SEP08 6723259H1 516 916
251 LI:246290.10:2000SEP08 747973R6 330 859
251 LI:246290.10:2000SEP08 747973H1 330 572
251 LI:246290.10:2000SEP08 7170570H1 383 910
251 LI:246290.10:2000SEP08 5080008H1 88 328
251 LI:246290.10:2000SEP08 5957953H1 89 545
251 LI:246290.10:2000SEP08 6937991H1 99 200
251 LI:246290.10:2000SEP08 7168155H1 137 243
251 LI:246290.10:2000SEP08 4831526F8 229 767
251 LI:246290.10:2000SEP08 4831526H1 230 359
251 LI:246290.10:2000SEP08 4831526F9 229 744
251 LI:246290.10:2000SEP08 7387420H1 285 504
251 LI:246290.10:2000SEP08 5761576H1 540 807
251 LI:246290.10:2000SEP08 4930979H1 612 891
251 LI:246290.10:2000SEP08 806868H1 728 958
251 LI:246290.10:2000SEP08 6866060H1 68 403
251 LI:246290.10:2000SEP08 3506257H1 73 350
251 LI:246290.10:2000SEP08 3550395H1 1 288
251 LI:246290.10:2000SEP08 7988114H1 2 511
251 LI:246290.10:2000SEP08 3316757H1 18 304
251 LI:246290.10:2000SEP08 8122994H1 19 612
251 LI:246290.10:2000SEP08 5457648H1 23 233
251 LI:246290.10:2000SEP08 3508411H1 23 303
251 LI:246290.10:2000SEP08 3577541H1 23 287
251 LI:246290.10:2000SEP08 5460535H1 23 271
251 LI:246290.10:2000SEP08 7002420H1 39 596
251 LI:246290.10:2000SEP08 4328953H1 42 304
251 LI:246290.10:2000SEP08 3973335H1 57 325
251 LI:246290.10:2000SEP08 7470753H1 67 633
251 LI:246290.10:2000SEP08 2966230H1 1 297
251 LI:246290.10:2000SEP08 2966230F6 1 377
252 LI:280034.1:2000SEP08 8045415H1 1 646
252 LI:280034.1:2000SEP08 8045415J1 482 1096
252 LI:280034.1:2000SEP08 681866T6 584 802
252 LI:280034.1:2000SEP08 681866H1 584 845
252 LI:280034.1:2000SEP08 681866R6 584 841
252 LI:280034.1:2000SEP08 4936924H1 781 1020
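By way of non-limiting illustration only, the rows of Table 5 can be processed programmatically. The following Python sketch is not part of the original disclosure. It assumes that Start and Stop give 1-based, inclusive positions of each component sequence along its template, and that each row is whitespace-delimited with no spaces inside the Template ID. The names `Component`, `parse_row`, and `template_extent` are hypothetical.

```python
from typing import List, NamedTuple, Tuple


class Component(NamedTuple):
    """One row of Table 5 (field names are illustrative assumptions)."""
    seq_id: int
    template_id: str
    component_id: str
    start: int
    stop: int

    @property
    def length(self) -> int:
        # Assumes 1-based, inclusive coordinates.
        return self.stop - self.start + 1


def parse_row(line: str) -> Component:
    """Split a whitespace-delimited Table 5 row into its five columns."""
    seq_id, template_id, component_id, start, stop = line.split()
    return Component(int(seq_id), template_id, component_id, int(start), int(stop))


def template_extent(components: List[Component]) -> Tuple[int, int]:
    """Return the lowest Start and highest Stop over all components."""
    return min(c.start for c in components), max(c.stop for c in components)


# Rows for SEQ ID NO:252, copied from Table 5.
ROWS = """\
252 LI:280034.1:2000SEP08 8045415H1 1 646
252 LI:280034.1:2000SEP08 8045415J1 482 1096
252 LI:280034.1:2000SEP08 681866T6 584 802
252 LI:280034.1:2000SEP08 681866H1 584 845
252 LI:280034.1:2000SEP08 681866R6 584 841
252 LI:280034.1:2000SEP08 4936924H1 781 1020
"""

components = [parse_row(line) for line in ROWS.splitlines() if line.strip()]

for c in components:
    print(c.component_id, c.length)
print("extent:", template_extent(components))
```

Under these assumptions, the six components listed for SEQ ID NO:252 together span positions 1 through 1096 of the template.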
TABLE 6
SEQ ID NO: Template ID Tissue Distribution
1 LG:150318.1:2000SEP08 Nervous System - 100%
2 LG:022529.1:2000SEP08 Musculoskeletal System - 25%, Unclassified/Mixed - 22%, Cardiovascular
System - 14%
3 LG:352559.1:2000SEP08 Unclassified/Mixed - 67%, Digestive System - 33%
4 LG:175223.1:2000SEP08 Embryonic Structures - 82%, Nervous System - 18%
5 LG:476989.1:2000SEP08 Exocrine Glands - 62%, Urinary Tract - 31%
6 LG:253268.7:2000SEP08 Germ Cells - 56%, Nervous System - 23%
7 LG:401322.1:2000SEP08 Sense Organs - 50%, Liver - 17%, Skin - 13%
8 LG:1328436.1:2000SEP08 Respiratory System - 38%, Endocrine System - 35%, Connective Tissue -
27%
9 LG:475404.1:2000SEP08 Skin - 81%
10 LG:1384132.1:2000SEP08 Respiratory System - 50%, Digestive System - 33%, Nervous System - 17%
11 LG:410804.18:2000SEP08 Nervous System - 100%
12 LG:1082306.1:2000SEP08 Cardiovascular System - 28%, Digestive System - 21%, Exocrine Glands -
14%, Endocrine System - 14%
13 LG:233814.4:2000SEP08 Endocrine System - 36%, Pancreas - 36%, Exocrine Glands - 16%
14 LG:977478.5:2000SEP08 Cardiovascular System - 34%, Musculoskeletal System - 19%, Female
Genitalia - 16%
15 LG:025931.1:2000SEP08 Female Genitalia - 29%, Urinary Tract - 24%, Endocrine System - 24%
16 LG:885368.1:2000SEP08 Nervous System - 67%, Female Genitalia - 33%
17 LG:1054900.1:2000SEP08 Digestive System - 50%, Female Genitalia - 25%, Male Genitalia - 25%
18 LG:995186.2:2000SEP08 Digestive System - 67%, Nervous System - 33%
19 LG:435048.23:2000SEP08 Urinary Tract - 80%, Nervous System - 20%
20 LG:954859.1:2000SEP08 Embryonic Structures - 56%, Hemic and Immune System - 19%, Female
Genitalia - 13%, Nervous System - 13%
21 LG:364370.1:2000SEP08 Liver - 90%, Male Genitalia - 10%
22 LG:1098789.1:2000SEP08 Urinary Tract - 100%
23 LG:201540.2:2000SEP08 Unclassified/Mixed - 15%, Female Genitalia - 13%, Connective Tissue -
13%, Male Genitalia - 13%
24 LG:1077357.1:2000SEP08 Nervous System - 43%, Male Genitalia - 29%, Female Genitalia - 29%
25 LG:1048846.4:2000SEP08 Female Genitalia - 25%, Digestive System - 25%, Male Genitalia - 25%
26 LG:336685.1:2000SEP08 Hemic and Immune System - 35%, Urinary Tract - 24%, Male Genitalia -
27 LG:1076253.1:2000SEP08 Liver - 50%, Urinary Tract - 11%, Endocrine System - 11%
28 LG:1400601.2:2000SEP08 Skin - 100%
29 LG:1079092.3:2000SEP08 Musculoskeletal System - 100%
30 LG:1086064.1:2000SEP08 Skin - 69%
31 LG:1400608.1:2000SEP08 Female Genitalia - 26%, Urinary Tract - 21%, Respiratory System - 16%
32 LG:399275.5:2000SEP08 Respiratory System - 50%, Male Genitalia - 33%, Hemic and Immune
System - 17%
33 LG:293943.1:2000SEP08 Exocrine Glands - 50%, Digestive System - 25%, Hemic and Immune
System - 13%, Nervous System - 13%
34 LG:345884.1:2000SEP08 Respiratory System - 63%, Hemic and Immune System - 38%
35 LG:400967.1:2000SEP08 Embryonic Structures - 36%, Urinary Tract - 28%, Cardiovascular System -
16%
36 LG:024556.6:2000SEP08 Nervous System - 89%, Male Genitalia - 11%
37 LG:081189.3:2000SEP08 Germ Cells - 88%
38 LG:018258.1:2000SEP08 Digestive System - 44%, Endocrine System - 44%, Hemic and Immune
System - 11%
39 LG:450399.3:2000SEP08 Nervous System - 100%
40 LG:451122.1:2000SEP08 Nervous System - 100%
41 LG:451682.1:2000SEP08 Nervous System - 100%
42 LG:238631.4:2000SEP08 Liver - 17%, Endocrine System - 14%, Respiratory System - 11%, Male
Genitalia - 11%
43 LG:236654.1:2000SEP08 Unclassified/Mixed - 38%, Respiratory System - 15%
44 LG:332655.1:2000SEP08 Pancreas - 26%, Digestive System - 17%, Female Genitalia - 13%
45 LG:217396.2:2000SEP08 Skin - 24%, Embryonic Structures - 15%, Pancreas - 15%
46 LG:090574.1:2000SEP08 Respiratory System - 100%
47 LG:202943.1:2000SEP08 Embryonic Structures - 44%, Female Genitalia - 19%, Liver - 14%
48 LG:236928.1:2000SEP08 Unclassified/Mixed - 15%, Hemic and Immune System - 14%, Nervous
System - 12%
49 LG:215169.2:2000SEP08 Endocrine System - 22%, Unclassified/Mixed - 20%, Nervous System - 20%
50 LG:410726.1:2000SEP08 Embryonic Structures - 38%, Pancreas - 24%, Endocrine System - 18%
51 LG:234372.2:2000SEP08 Stomatognathic System - 17%, Germ Cells - 12%, Musculoskeletal System
- 11%
52 LG:022629.1:2000SEP08 Unclassified/Mixed - 36%, Germ Cells - 28%
53 LG:068682.1:2000SEP08 Unclassified/Mixed - 61%, Male Genitalia - 20%
54 LG:222335.1:2000SEP08 Musculoskeletal System - 27%, Exocrine Glands - 17%, Male Genitalia -
13%
55 LG:331342.1:2000SEP08 Male Genitalia - 35%, Digestive System - 22%, Endocrine System - 20%
56 LG:021770.1:2000SEP08 Pancreas - 18%, Exocrine Glands - 14%, Nervous System - 12%, Urinary
Tract - 12%, Male Genitalia - 12%
57 LG:181607.9:2000SEP08 Musculoskeletal System - 25%, Cardiovascular System - 22%, Embryonic
Structures - 18%
58 LG:1042768.1:2000SEP08 Liver - 100%
59 LG:282729.1:2000SEP08 Skin - 74%, Unclassified/Mixed - 21%
60 LG:998305.3:2000SEP08 Urinary Tract - 50%, Cardiovascular System - 29%, Nervous System - 21%
61 LG:1135213.1:2000SEP08 Embryonic Structures - 26%, Cardiovascular System - 20%, Digestive
System - 11%, Unclassified/Mixed - 11%
62 LG:267762.1:2000SEP08 Cardiovascular System - 16%, Urinary Tract - 14%, Connective Tissue -
14%
63 LG:120744.1:2000SEP08 Skin - 35%, Embryonic Structures - 23%, Digestive System - 20%
64 LG:403409.1:2000SEP08 Stomatognathic System - 45%, Respiratory System - 12%
65 LG:226874.3:2000SEP08 Respiratory System - 38%, Male Genitalia - 35%, Female Genitalia - 19%
66 LG:1045521.4:2000SEP08 Germ Cells - 16%
67 LG:275876.1:2000SEP08 Unclassified/Mixed - 53%, Endocrine System - 27%, Nervous System - 20%
68 LG:475127.7:2000SEP08 Hemic and Immune System - 75%, Nervous System - 25%
69 LG:157263.1:2000SEP08 Embryonic Structures - 39%, Urinary Tract - 30%
70 LG:247382.7:2000SEP08 Urinary Tract - 19%, Hemic and Immune System - 13%, Embryonic
Structures - 13%, Endocrine System - 13%
71 LG:197367.5:2000SEP08 Endocrine System - 100%
72 LG:218090.5:2000SEP08 Unclassified/Mixed - 42%, Urinary Tract - 21%, Exocrine Glands - 21%
73 LG:216612.4:2000SEP08 Nervous System - 45%, Digestive System - 30%, Endocrine System - 20%
74 LG:197614.1:2000SEP08 Germ Cells - 17%, Unclassified/Mixed - 16%
75 LG:378428.1:2000SEP08 Liver - 30%, Endocrine System - 15%, Embryonic Structures - 15%
76 LG:286639.1:2000SEP08 Germ Cells - 72%
77 LG:389870.1:2000SEP08 Liver - 41%, Nervous System - 23%, Cardiovascular System - 18%,
Endocrine System - 18%
78 LG:1387485.6:2000SEP08 Female Genitalia - 16%, Digestive System - 14%, Urinary Tract - 13%
79 LG:230151.1:2000SEP08 Exocrine Glands - 16%, Urinary Tract - 14%, Liver - 12%, Pancreas - 12%
80 LG_1515852000SEP08 Liver - 13%, Nervous System - 12%, Unclassified/Mixed - 11%
81 LG:235840.1:2000SEP08 Connective Tissue - 13%, Nervous System - 12%, Hemic and Immune
System - 11%, Liver - 11%
82 LG:350272.1:2000SEP08 Germ Cells - 17%, Musculoskeletal System - 15%
83 LG:232190.1:2000SEP08 Liver - 24%, Unclassified/Mixed - 23%, Skin - 19%
84 LG:1068127.1:2000SEP08 Hemic and Immune System - 43%, Male Genitalia - 29%, Nervous System
- 29%
85 LG:408751.3:2000SEP08 Nervous System - 43%, Sense Organs - 31%
86 LG:1078933.V2000SEP08 Liver - 23%, Unclassified/Mixed - 20%, Nervous System - 13%
87 LG:958731.1:2000SEP08 Nervous System - 100%
88 LG:024125.5:2000SEP08 Connective Tissue - 17%, Embryonic Structures - 16%
89 LG:373637.3:2000SEP08 Unclassified/Mixed - 73%, Male Genitalia - 23%
90 LG:1053229.1:2000SEP08 Cardiovascular System - 50%, Embryonic Structures - 41%
91 LG:248364.1:2000SEP08 Sense Organs - 48%, Female Genitalia - 17%
92 LG:477130.1:2000SEP08 Nervous System - 100%
93 LG:113786.17:2000SEP08 Hemic and Immune System - 100%
94 LG:347635.1:2000SEP08 Musculoskeletal System - 45%, Female Genitalia - 17%, Urinary Tract -
14%
95 LG:242966.4:2000SEP08 Germ Cells - 23%, Stomatognathic System - 23%
96 LG:217814.1:2000SEP08 Skin - 25%, Liver - 16%, Unclassified/Mixed - 15%
97 LG:476452.1:2000SEP08 Nervous System - 100%
98 LG' 1100657 V2000SEP08 Liver - 100%
99 LG:1132418.2:2000SEP08 Cardiovascular System - 100%
100 LG:1098570.1:2000SEP08 Liver - 100%
101 LG:1097987.1:2000SEP08 Embryonic Structures - 41%, Musculoskeletal System - 27%, Female
Genitalia - 23%
102 LG:337818.2:2000SEP08 Digestive System - 35%, Liver - 13%, Female Genitalia - 10%
103 LG:1040582.1:2000SEP08 Liver - 38%, Pancreas - 38%, Cardiovascular System - 17%
104 LG:1099122.1:2000SEP08 Liver - 96%
105 LG:1327449.1:2000SEP08 Endocrine System - 40%, Respiratory System - 30%, Digestive System -
20%
106 LG:227933.5:2000SEP08 Male Genitalia - 33%, Nervous System - 21%, Female Genitalia - 15%
107 LG:1043709.2:2000SEP08 Liver - 100%
108 LG:1099871.1:2000SEP08 Liver - 90%, Nervous System - 10%
109 LG:1399139.4:2000SEP08 Unclassified/Mixed - 30%, Male Genitalia - 26%, Musculoskeletal System -
22%
110 LG:236386.1:2000SEP08 Skin - 16%, Connective Tissue - 13%, Pancreas - 10%
111 LG:1015157.1:2000SEP08 Liver - 100%
112 LG:1065433.1:2000SEP08 Female Genitalia - 50%, Endocrine System - 40%, Nervous System - 10%
113 LG:236992.4:2000SEP08 Connective Tissue - 64%, Urinary Tract - 12%, Cardiovascular System -
12%
114 LG:1071124.1:2000SEP08 Unclassified/Mixed - 22%, Cardiovascular System - 22%, Urinary Tract -
19%
115 LG:206425.2:2000SEP08 Sense Organs - 39%, Female Genitalia - 10%
116 LG:885747.2:2000SEP08 Female Genitalia - 100%
117 LG:1140501.1:2000SEP08 Sense Organs - 20%, Nervous System - 15%
118 LG:001239.1:2000SEP08 Hemic and Immune System - 35%, Liver - 17%, Male Genitalia - 13%
119 LG:018980.1:2000SEP08 Unclassified/Mixed - 52%, Urinary Tract - 12%, Nervous System - 12%
120 LG:1083120.3:2000SEP08 Digestive System - 50%, Hemic and Immune System - 25%, Nervous
System - 25%
121 LG:233258.3:2000SEP08 Germ Cells - 25%, Female Genitalia - 12%
122 LG:999062.1:2000SEP08 Nervous System - 100%
123 LG:887776.1:2000SEP08 Cardiovascular System - 35%, Skin - 33%, Hemic and Immune System -
14%
124 LG:1400301.2:2000SEP08 Exocrine Glands - 50%, Connective Tissue - 44%
125 LG:1329362.1:2000SEP08 Unclassified/Mixed - 44%, Urinary Tract - 22%, Exocrine Glands - 22%
126 LG:1096498.1:2000SEP08 Liver - 100%
127 LG:1096337.1:2000SEP08 Urinary Tract - 40%, Exocrine Glands - 40%, Nervous System - 20%
128 LG:1400579.1:2000SEP08 Liver - 53%, Musculoskeletal System - 25%
129 LG:1080091.1:2000SEP08 Connective Tissue - 21%, Musculoskeletal System - 18%, Female
Genitalia - 15%
130 LG:1082203.1:2000SEP08 Embryonic Structures - 23%, Endocrine System - 11%, Male Genitalia -
11%
131 LG:1084051.1:2000SEP08 Germ Cells - 33%, Pancreas - 11%, Male Genitalia - 11%
132 LG:1082393.1:2000SEP08 Male Genitalia - 20%, Urinary Tract - 13%, Unclassified/Mixed - 13%
133 LG:1086183.1:2000SEP08 Liver - 25%, Skin - 20%, Hemic and Immune System - 14%
134 LG:1090268.1:2000SEP08 Sense Organs - 40%, Unclassified/Mixed - 17%
135 LG:1400597.5:2000SEP08 Unclassified/Mixed - 53%, Connective Tissue - 47%
136 LG:1080307.2:2000SEP08 Digestive System - 100%
137 LG:1400603.2:2000SEP08 Germ Cells - 40%, Nervous System - 18%, Unclassified/Mixed - 12%
138 LG:1052984.1:2000SEP08 Skin - 31%, Musculoskeletal System - 15%, Endocrine System - 10%,
Pancreas - 10%
139 LG:1091259.1:2000SEP08 Connective Tissue - 66%, Respiratory System - 25%
140 LG:1082263.2:2000SEP08 Embryonic Structures - 32%, Endocrine System - 15%, Nervous System -
10%
141 LG:1048604.2:2000SEP08 Exocrine Glands - 50%, Embryonic Structures - 38%
142 LG:1085254.3:2000SEP08 Embryonic Structures - 61%, Respiratory System - 32%
143 LG:1400606.2:2000SEP08 Embryonic Structures - 34%, Unclassified/Mixed - 14%, Nervous System -
13%, Connective Tissue - 13%
144 LG:1090358.2:2000SEP08 Urinary Tract - 36%, Male Genitalia - 23%, Hemic and Immune System -
15%, Musculoskeletal System - 15%
145 LG:1079064.2:2000SEP08 Nervous System - 21%, Female Genitalia - 16%, Liver - 12%
146 LG:1076866.1:2000SEP08 Germ Cells - 28%, Embryonic Structures - 20%, Cardiovascular System -
12%
147 LG:969359.1:2000SEP08 Liver - 79%, Pancreas - 19%
148 LG:366783.1:2000SEP08 Sense Organs - 95%
149 LG:332176.3:2000SEP08 Pancreas - 28%, Connective Tissue - 22%, Urinary Tract - 22%
150 LG:994938.1:2000SEP08 Exocrine Glands - 50%, Respiratory System - 21%, Urinary Tract - 17%
151 LG:982800.1:2000SEP08 Urinary Tract - 22%, Skin - 21%, Germ Cells - 13%
152 LG:977850.7:2000SEP08 Exocrine Glands - 100%
153 LG:234748.2:2000SEP08 Sense Organs - 27%, Urinary Tract - 15%, Unclassified/Mixed - 11%
154 LG:306284.1:2000SEP08 Nervous System - 60%, Male Genitalia - 40%
155 LI:333170.3:2000SEP08 Germ Cells - 82%, Unclassified/Mixed - 13%
156 LI:336685.2:2000SEP08 Hemic and Immune System - 35%, Male Genitalia - 25%, Urinary Tract -
20%
157 LI:279013.5:2000SEP08 Female Genitalia - 100%
158 LI:1037075.1:2000SEP08 Musculoskeletal System - 94%
159 LI:1073403.1:2000SEP08 Liver - 100%
160 LI:1075296.1:2000SEP08 Liver - 100%
161 LI:1085501.1:2000SEP08 Connective Tissue - 100%
162 LI:1086181.1:2000SEP08 Liver - 100%
163 LI:1164493.1:2000SEP08 Urinary Tract - 52%, Female Genitalia - 20%, Digestive System - 12%, Male
Genitalia - 12%
164 LI:1175097.1:2000SEP08 Unclassified/Mixed - 57%, Digestive System - 21%, Nervous System - 21%
165 LI:1092948.1:2000SEP08 Embryonic Structures - 26%, Connective Tissue - 19%, Musculoskeletal
System - 17%
166 LI:380378.2:2000SEP08 Nervous System - 100%
167 LI:1029674.1:2000SEP08 Digestive System - 42%, Nervous System - 36%, Unclassified/Mixed - 22%
169 LI:1186208.1:2000SEP08 Stomatognathic System - 51%, Sense Organs - 38%
170 LI:1170753.1:2000SEP08 Endocrine System - 95%
171 LI:1180908.1:2000SEP08 Skin - 21%, Embryonic Structures - 14%, Male Genitalia - 13%
172 LI:1182900.2:2000SEP08 Digestive System - 47%, Pancreas - 34%, Respiratory System - 1 %
173 LI:1169548.2:2000SEP08 Nervous System - 100%
174 LI:1039974.1:2000SEP08 Female Genitalia - 31%, Urinary Tract - 29%, Nervous System - 16%
175 LI:1175765.2:2000SEP08 Nervous System - 100%
176 LI:313948.1:2000SEP08 Unclassified/Mixed - 31%, Endocrine System - 23%, Cardiovascular
System - 19%
177 LI:335923.2:2000SEP08 Germ Cells - 74%, Male Genitalia - 26%
178 LI:345884.1:2000SEP08 Respiratory System - 75%, Hemic and Immune System - 25%
179 LI:417127.1:2000SEP08 Connective Tissue - 100%
180 LI:451710.1:2000SEP08 Connective Tissue - 90%, Nervous System - 10%
181 LI:406882.2:2000SEP08 Sense Organs - 34%, Unclassified/Mixed - 15%, Nervous System - 15%
182 LI:728223.1:2000SEP08 Nervous System - 100%
183 LI:289783.19:2000SEP08 Digestive System - 38%, Unclassified/Mixed - 15%, Respiratory System -
12%
184 LI:235255.8:2000SEP08 Exocrine Glands - 23%, Connective Tissue - 12%, Female Genitalia - 11%
185 LI:237693.5:2000SEP08 Urinary Tract - 32%, Hemic and Immune System - 25%, Endocrine System -
21%
186 LI:433670.3:2000SEP08 Nervous System - 50%, Male Genitalia - 25%, Digestive System - 25%
187 LI:202943.4:2000SEP08 Embryonic Structures - 42%, Liver - 19%, Unclassified/Mixed - 16%
188 LI:068682.1:2000SEP08 Unclassified/Mixed - 47%, Digestive System - 21%, Male Genitalia - 19%
189 LI:203301.3:2000SEP08 Germ Cells - 19%, Male Genitalia - 14%, Respiratory System - 12%
190 LI:020726.3:2000SEP08 Sense Organs - 79%
191 LI:027209.1:2000SEP08 Musculoskeletal System - 53%, Female Genitalia - 27%, Exocrine Glands -
Figure imgf000275_0002
ry Tract - 45%, Digestive System - 28%, Skin - 15%
193 LI:021759.1:2000SEP08 Hemic and Immune System - 40%, Nervous System - 22%, Liver - 11%
194 LI:1165967.1:2000SEP08 Liver - 50%, Stomatognathic System - 50%
195 LI:1166315.1:2000SEP08 Cardiovascular System - 33%, Urinary Tract - 27%, Female Genitalia -
20%, Hemic and Immune System - 20%
196 LI:204626.1:2000SEP08 Digestive System - 58%, Nervous System - 15%, Exocrine Glands - 13%
197 LI:801140.1:2000SEP08 Embryonic Structures - 50%, Cardiovascular System - 38%
198 LI:286639.1:2000SEP08 Germ Cells - 64%
199 LI:288905.4:2000SEP08 Unclassified/Mixed - 76%, Nervous System - 24%
200 LI:332161.1:2000SEP08 Cardiovascular System - 47%, Nervous System - 14%, Connective Tissue -
10%
201 LI:184867.1:2000SEP08 Unclassified/Mixed - 33%, Embryonic Structures - 24%, Male Genitalia -
16%
202 LI:229932.4:2000SEP08 Musculoskeletal System - 40%, Cardiovascular System - 21%
203 LI:1189932.1:2000SEP08 Embryonic Structures - 37%, Germ Cells - 18%, Sense Organs - 13%
204 LI:1076689.1:2000SEP08 Liver - 99%
205 LI:415181.2:2000SEP08 Nervous System - 100%
206 LI:296358.1:2000SEP08 Hemic and Immune System - 83%
207 LI:205186.3:2000SEP08 Cardiovascular System - 59%, Male Genitalia - 20%, Unclassified/Mixed -
16%
208 LI:220537.2:2000SEP08 Stomatognathic System - 52%, Male Genitalia - 32%
209 LI:248364.2:2000SEP08 Cardiovascular System - 30%, Female Genitalia - 23%, Sense Organs -
22%
210 LI:2048338.1:2000SEP08 Connective Tissue - 90%, Nervous System - 10%
211 LI:1185203.8:2000SEP08 Nervous System - 75%, Female Genitalia - 25%
212 LI:021770.3:2000SEP08 Pancreas - 77%
213 LI:1185841.1:2000SEP08 Hemic and Immune System - 30%, Respiratory System - 26%
214 LI:1181710.1:2000SEP08 Male Genitalia - 38%, Nervous System - 38%, Hemic and Immune System
- 25%
215 LI:2048959.1:2000SEP08 Musculoskeletal System - 73%, Endocrine System - 27%
216 LI:798494.1:2000SEP08 Germ Cells - 88%
217 LI:2049223.1:2000SEP08 Male Genitalia - 60%, Hemic and Immune System - 40%
218 LI:1177833.1:2000SEP08 Liver - 55%, Embryonic Structures - 10%
219 LI:2049267.1:2000SEP08 Cardiovascular System - 36%, Urinary Tract - 29%, Digestive System - 21%
220 LI:1165939.1:2000SEP08 Sense Organs - 62%, Endocrine System - 20%
221 LI:1170958.1:2000SEP08 Exocrine Glands - 53%, Cardiovascular System - 26%, Male Genitalia -
16%
222 LI:1089827.1:2000SEP08 Endocrine System - 16%, Male Genitalia - 13%, Musculoskeletal System -
12%
223 LI:792112.1:2000SEP08 Liver - 83%
224 LI:282219.2:2000SEP08 Embryonic Structures - 55%, Exocrine Glands - 23%, Male Genitalia - 14%
225 LI:1088010.2:2000SEP08 Exocrine Glands - 31%, Musculoskeletal System - 23%, Respiratory System
- 13%
226 LI:1165276.1:2000SEP08 Nervous System - 18%, Male Genitalia - 16%, Pancreas - 16%
227 LI:1169524.2:2000SEP08 Endocrine System - 41%, Urinary Tract - 16%, Musculoskeletal System -
14%, Male Genitalia - 14%
228 LI:1180255.1:2000SEP08 Female Genitalia - 66%, Liver - 11%, Embryonic Structures - 11%
229 LI:1091903.1:2000SEP08 Germ Cells - 93%
230 LI:1169219.1:2000SEP08 Sense Organs - 34%, Skin - 22%
231 LI:2050313.1:2000SEP08 Skin - 15%, Respiratory System - 14%, Male Genitalia - 14%
232 LI:209351.3:2000SEP08 Cardiovascular System - 41%
233 LI:119900.1:2000SEP08 Stomatognathic System - 77%
234 LI:2052274.1:2000SEP08 Urinary Tract - 31%, Female Genitalia - 1 %, Liver - 12%
235 LI:1075502.1:2000SEP08 Liver - 100%
236 LI:813697.1:2000SEP08 Endocrine System - 36%, Unclassified/Mixed - 25%, Exocrine Glands - 15%
237 LI:814261.1:2000SEP08 Exocrine Glands - 50%, Respiratory System - 20%, Urinary Tract - 13%
238 LI:775334.1:2000SEP08 Cardiovascular System - 26%, Embryonic Structures - 23%,
Unclassified/Mixed - 15%
239 LI:1180325.1:2000SEP08 Skin - 38%, Pancreas - 23%, Female Genitalia - 23%
240 LI:1183147.3:2000SEP08 Female Genitalia - 42%, Respiratory System - 17%, Nervous System - 13%
241 LI:1175373.3:2000SEP08 Nervous System - 50%, Male Genitalia - 25%, Digestive System - 25%
242 LI:813757.1:2000SEP08 Female Genitalia - 50%, Nervous System - 50%
243 LI:1182979.2:2000SEP08 Respiratory System - 67%, Pancreas - 19%
244 LI:1177823.2:2000SEP08 Embryonic Structures - 48%, Cardiovascular System - 20%, Exocrine
Glands - 20%
245 LI:1174279.1:2000SEP08 Endocrine System - 36%, Nervous System - 1 %, Liver - 18%
246 LI:1178411.1:2000SEP08 Nervous System - 21%, Liver - 18%, Exocrine Glands - 15%
247 LI:1182739.1:2000SEP08 Unclassified/Mixed - 34%, Skin - 18%, Nervous System - 12%
248 LI:234937.4:2000SEP08 Endocrine System - 33%, Nervous System - 24%, Digestive System - 21%
249 LI:1170660.1:2000SEP08 Musculoskeletal System - 21%, Nervous System - 18%, Cardiovascular
System - 13%, Male Genitalia - 13%
250 LI:1144409.1:2000SEP08 Sense Organs - 53%
251 LI:246290.10:2000SEP08 Cardiovascular System - 29%, Female Genitalia - 29%
252 LI:280034.1:2000SEP08 Female Genitalia - 100%
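By way of non-limiting illustration only, a Tissue Distribution entry of Table 6 can be converted into a mapping from tissue category to percentage. The following Python sketch is not part of the original disclosure. It assumes that each entry is a comma-separated list of items of the form "category - NN%" and that a row wrapped across two lines has first been joined into one string. The names `ENTRY` and `parse_distribution` are hypothetical.

```python
import re
from typing import Dict

# One "category - NN%" item; the category is everything before the final " - NN%".
ENTRY = re.compile(r"\s*(?P<tissue>.+?)\s*-\s*(?P<pct>\d+)%\s*$")


def parse_distribution(text: str) -> Dict[str, int]:
    """Map each tissue category in a Table 6 entry to its integer percentage."""
    result: Dict[str, int] = {}
    for part in text.split(","):
        match = ENTRY.match(part)
        if match is None:
            # Entries damaged in reproduction (e.g. a missing digit) are rejected
            # rather than guessed at.
            raise ValueError(f"unrecognized entry: {part!r}")
        result[match.group("tissue")] = int(match.group("pct"))
    return result


# Entry for SEQ ID NO:249, with its two wrapped lines joined.
example = parse_distribution(
    "Musculoskeletal System - 21%, Nervous System - 18%, "
    "Cardiovascular System - 13%, Male Genitalia - 13%"
)
print(example)
print(sum(example.values()))
```

The listed percentages need not total 100%; for SEQ ID NO:249 the four listed categories account for 65%.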
TABLE 7
SEQ ID NO: Frame Length Start Stop GI Number Probability Score Annotation
253 2 114 2 343 g11643582 2.00E-68 PR-domain containing protein 14
253 2 114 2 343 g10434076 2.00E-68 unnamed protein product
253 2 114 2 343 g7020503 4.00E-27 unnamed protein product
255 2 211 2 634 g7243243 2.00E-43 KIAA1431 protein
255 2 211 2 634 g4567178 2.00E-41 R31665_2 (AA 1- 673 )
255 2 211 2 634 g3445181 2.00E-41 R31665.2
256 3 151 54 506 g10433955 5.00E-40 unnamed protein product
256 3 151 54 506 g7295442 4.00E-14 CG17334 gene product
256 3 151 54 506 g2073111 3.00E-11 Y box protein 2
257 2 138 17 430 g12407395 4.00E-54 tripartite motif protein TRIM7
257 2 138 17 430 g12407397 2.00E-42 tripartite motif protein TRIM7
257 2 138 17 430 g12407379 8.00E-16 tripartite motif protein TRIM4 isoform beta
258 1 317 1 951 g9246977 1.00E-137 RNA-binding protein BRUNOL4
258 1 317 1 951 g12746394 1.00E-136 CUG-BP and ETR-3 like factor 4
258 1 317 1 951 g13278792 1.00E-106 Bruno (Drosophila) -like 4, RNA binding protein
260 2 114 14 355 g4589588 2.00E-34 KIAA0972 protein
260 2 114 14 355 g7576272 2.00E-30 bA393J16.1 (zinc finger protein 33a (KOX 31))
260 2 114 14 355 g498152 2.00E-30 ha0946 protein is Kruppel-related.
261 2 127 140 520 g10434195 2.00E-64 unnamed protein product
261 2 127 140 520 g13529188 4.00E-42 Unknown (protein for MGC: 12466)
261 2 127 140 520 g6467206 3.00E-36 gonadotropin inducible transcription repressor-4
262 1 151 1 453 g14042293 4.00E-47 unnamed protein product
262 1 151 1 453 g12052983 1.00E-46 hypothetical protein
262 1 151 1 453 g487785 7.00E-46 zinc finger protein ZNF136
263 1 79 16 252 g349075 4.00E-16 calmodulin-binding protein
263 1 79 16 252 g13543326 4.00E-16 hypothetical protein MGC8407
263 1 79 16 252 g12804937 4.00E-16 Unknown (protein for MGC:3732)
264 1 183 112 660 g13325337 3.00E-83 Unknown (protein for MGC:10520)
264 1 183 112 660 g8439407 3.00E-22 zinc finger protein
264 1 183 112 660 g7020166 3.00E-22 unnamed protein product
266 3 260 3 782 g12698001 1.00E-147 KIAA1728 protein
266 3 260 3 782 g8052233 1.00E-94 putative ankyrin-repeat containing protein
266 3 260 3 782 g7294632 2.00E-61 CG5679 gene product
267 2 196 2 589 g6249687 1.00E-108 R31155_1
267 2 196 2 589 g10436360 1.00E-53 unnamed protein product
267 2 196 2 589 g14042803 6.00E-50 unnamed protein product
268 1 113 130 468 g14596085 7.00E-55 (AY042830) Putative 40S ribosomal protein S15A
268 1 113 130 468 g9757906 7.00E-55 40S ribosomal protein S15A
268 1 113 130 468 g8439890 7.00E-55 Strong similarity to 40S ribosomal protein S15A from Arabidopsis thaliana gb|L27461. EST gb|R30315 comes from this gene.
269 3 165 3 497 g12052983 2.00E-73 hypothetical protein
269 3 165 3 497 g5262560 1.00E-47 hypothetical protein
269 3 165 3 497 g10434856 1.00E-47 unnamed protein product
270 3 168 165 668 g12052983 4.00E-70 hypothetical protein
270 3 168 165 668 g5262560 4.00E-35 hypothetical protein
270 3 168 165 668 g10434856 5.00E-34 unnamed protein product
271 2 122 2 367 g12044553 8.00E-48 bA261P9.2 (putative novel protein similar to fly CG7340 and human putative aminopeptidase ZK353.6 in chromosome 3 (EC 3.4.11.-))
271 2 122 2 367 g10432867 5.00E-47 unnamed protein product
271 2 122 2 367 g7299691 5.00E-42 BcDNA:LD41548 gene product (alt 1)
272 2 91 212 484 g872315 3.00E-35 40S ribosomal protein S12
272 2 91 212 484 g12842004 3.00E-35 putative
272 2 91 212 484 g12833134 3.00E-35 putative
273 3 225 3 677 g510552 1.00E-100 ribosomal protein L13
273 3 225 3 677 g12833027 1.00E-99 putative
273 3 225 3 677 g3869148 4.00E-98 ribosomal protein L13
274 3 153 3 461 g825539 2.00E-84 MLC2
274 3 153 3 461 g1675396 2.00E-84 myosin light chain 2
274 3 153 3 461 g1637 2.00E-84 myosin light chain 2 type 2
275 2 296 176 1063 g12407387 1.00E-147 tripartite motif protein TRIM5 isoform delta
275 2 296 176 1063 g12407385 1.00E-147 tripartite motif protein TRIM5 isoform gamma
275 2 296 176 1063 g12407383 1.00E-147 tripartite motif protein TRIM5 isoform beta
276 1 175 22 546 g13752754 1.00E-26 zinc finger l l l l
276 1 175 22 546 g14348588 5.00E-25 KRAB zinc finger protein
276 1 175 22 546 g12654015 4.00E-24 Similar to hypothetical protein FLJ10891
277 3 76 141 368 g9714522 5.00E-13 bA162G10.3 (zinc finger protein)
277 3 76 141 368 g5730194 5.00E-13 KRAB protein domain
277 3 76 141 368 g4589588 8.00E-13 KIAA0972 protein
278 3 146 624 1061 g12697318 5.00E-14 PBX4 protein
278 3 146 624 1061 g7160798 9.00E-13 pbxy homeodomain protein
278 3 146 624 1061 g634053 9.00E-13 PBX2
279 3 391 3 1175 g10439850 1.00E-116 unnamed protein product
279 3 391 3 1175 g9968290 1.00E-112 zinc finger protein 304
279 3 391 3 1175 g14249844 1.00E-111 Similar to hypothetical protein FLJ23233
280 3 80 156 395 g12052983 7.00E-15 hypothetical protein
280 3 80 156 395 g13277768 1.00E-11 zinc finger protein 93
280 3 80 156 395 g1184371 1.00E-11 zinc finger protein; Method: conceptual translation supplied by author
281 2 162 74 559 g6467206 8.00E-37 gonadotropin inducible transcription repressor-4
281 2 162 74 559 g9187356 2.00E-36 hypothetical protein, similar to (AB021644)GONADOTROPIN INDUCIBLE TRANSCRIPTION REPRESSOR-4
281 2 162 74 559 g220637 2.00E-36 zinc finger protein
282 2 164 50 541 g13752754 1.00E-28 zinc finger l l l l
282 2 164 50 541 g14348588 5.00E-28 KRAB zinc finger protein
282 2 164 50 541 g12654015 3.00E-27 Similar to hypothetical protein FLJ10891
283 3 105 75 389 g12052732 2.00E-38 hypothetical protein
283 3 105 75 389 g3329372 9.00E-37 DNA-binding protein
283 3 105 75 389 g7959207 3.00E-34 KIAA1473 protein
284 1 123 289 657 g14042035 9.00E-64 unnamed protein product
284 1 123 289 657 g6224922 1.00E-41 BTB/POZ domain zinc finger factor HOF-S
284 1 123 289 657 g6063139 1.00E-41 BTB/POZ domain zinc finger factor HOF-L
285 1 174 199 720 g7022603 2.00E-20 unnamed protein product
285 1 174 199 720 g7022652 3.00E-10 unnamed protein product
285 1 174 199 720 g3283350 3.00E-08 calmodulin-binding protein SHA1
286 2 181 71 613 g13591714 4.00E-97 (AF343664) immunoglobulin superfamily receptor translocation associated protein 2c
286 2 181 71 613 g13591712 4.00E-97 (AF343663) immunoglobulin superfamily receptor translocation associated protein 2b
286 2 181 71 613 g13591710 4.00E-97 (AF343662) immunoglobulin superfamily receptor translocation associated protein 2a
287 1 91 64 336 g10047183 8.00E-48 KIAA1559 protein
287 1 91 64 336 g5080758 8.00E-25 BC331191_1
287 1 91 64 336 g456269 1.00E-21 zinc finger protein 30
288 1 188 226 789 g11999277 4.00E-45 solute carrier
288 1 188 226 789 g12845461 8.00E-34 putative
288 1 188 226 789 g10434874 2.00E-33 unnamed protein product
289 2 290 2 871 g14017771 1.00E-157 fibrillin3
289 2 290 2 871 g762831 1.00E-113 fibrillin 2
289 2 290 2 871 g4959652 1.00E-113 fibrillin-2
290 3 199 3 599 g375771 1.00E-83 dJ422F24.1 (PUTATIVE novel protein similar to C. elegans C02C2.5)
290 3 199 3 599 g12832288 5.00E-83 putative
290 3 199 3 599 g7294769 2.00E-38 CG6279 gene product
291 3 148 3 446 g13489168 5.00E-77 60S ribosomal protein L17
291 3 148 3 446 g13430182 3.00E-76 ribosomal protein L17
291 3 148 3 446 g14596111 2.00E-75 (AY042843) 60S ribosomal protein L17
292 1 92 37 312 g6002102 5.00E-39 Acyl-CoA binding protein (ACBP)
292 1 92 37 312 g1938236 3.00E-38 acyl-CoA-binding protein
292 1 92 37 312 g6002104 2.00E-37 Acyl-CoA binding protein (ACBP)
293 3 256 3 770 g501520 1.00E-127 ESTs AU058081(E30812), AU058365(E50679), AU030138(E50679) correspond to a region of the predicted gene.; Similar to Spinacia oleracea mRNA for proteasome 37kD subunit.(X96974)
293 3 256 3 770 g8671496 1.00E-127 alpha 3 subunit of 20S proteasome
293 3 256 3 770 g8096329 1.00E-127 ESTs AU058081(E3082), AU075427(E30384) correspond to a region of the predicted gene. -Similar to Spinacia oleracea proteasome 27 kD subunit (P52427)
295 3 318 132 1085 g11602755 1.00E-37 zinc finger protein
295 3 318 132 1085 g12843135 6.00E-33 putative
295 3 318 132 1085 g13094151 3.00E-25 zinc finger protein hRit1 alpha
296 1 194 304 885 g12856559 1.00E-85 putative
296 1 194 304 885 g12856631 3.00E-85 putative
296 1 194 304 885 g12849896 3.00E-85 putative
297 1 469 1 1407 g9927838 0 unnamed protein product
297 1 469 1 1407 g10045028 0 unnamed protein product
297 1 469 1 1407 g10047295 1.00E-169 KIAA1610 protein
298 3 81 3 245 g13249093 5.00E-37 carbonic anhydrase XIII
298 3 81 3 245 g12845416 5.00E-37 putative
298 3 81 3 245 g65332 4.00E-22 carbonic anhydrase
299 1 452 1 1356 g11177164 0 polydom protein
299 1 452 1 1356 g12060830 1.00E-142 serologically defined breast cancer antigen NY-BR-38
299 1 452 1 1356 g7292728 1.00E-52 fw gene product
300 1 362 97 1182 g14594722 0 (AY037298) elongation of very long chain fatty acids protein
300 1 362 97 1182 g12044051 0 ELOVL4
300 1 362 97 1182 g12044043 0 ELOVL4
301 2 354 2 1063 g14272764 1.00E-146 unnamed protein product
301 2 354 2 1063 g561853 5.00E-28 megalin
301 2 354 2 1063 g1809240 7.00E-27 gp330 precursor
302 1 305 220 1134 g12483902 1.00E-83 zinc finger protein HIT-10
302 1 305 220 1134 g6002480 4.00E-49 BWSCR2 associated zinc-finger protein BAZ2
302 1 305 220 1134 g9963806 4.00E-47 zinc finger protein ZNF287
303 2 659 164 2140 g14587851 0 (AB050785) Graf2
303 2 659 164 2140 g13310137 0 PSGAP-m
303 2 659 164 2140 g13310135 0 PSGAP-s
304 3 390 102 1271 g13879442 1.00E-170 Similar to RIKEN cDNA 2310035M22 gene
304 3 390 102 1271 g12855490 1.00E-165 putative
304 3 390 102 1271 g12848905 1.00E-102 putative
305 2 294 2 883 g13898617 1.00E-142 serine/threonine protein kinase SSTK
305 2 294 2 883 g13540326 1.00E-142 serine/threonine kinase FKSG82
305 2 294 2 883 g13898619 1.00E-140 serine/threonine protein kinase SSTK
306 1 269 1 807 g14250138 1.00E-139 Similar to RIKEN cDNA 5730421E18 gene
306 1 269 1 807 g12856916 1.00E-120 putative
306 1 269 1 807 g12840367 1.00E-120 putative
307 3 270 393 1202 g10434082 1.00E-103 unnamed protein product
307 3 270 393 1202 g12052816 1.00E-102 hypothetical protein
307 3 270 393 1202 g297157 1.00E-74 rab17
308 3 358 150 1223 g12861800 1.00E-170 putative
308 3 358 150 1223 g3878713 5.00E-73 (Z46935) weak similarity with quinone oxidoreductase, contains similarity to Pfam domain: PF00107 (Zinc-binding dehydrogenases), Score=-80.6, E-value=6.2e-06, N=1~cDNA EST yk164b4.5 comes from this gene~cDNA EST yk164b4.3 comes from this gene~cDNA EST yk264f3.5 comes from this
308 3 358 150 1223 g2633069 2.00E-52 similar to quinone oxidoreductase
309 2 232 2 697 g12834244 1.00E-71 putative
309 2 232 2 697 g12833174 1.00E-71 putative
309 2 232 2 697 g13905260 1.00E-36 RIKEN cDNA 1300006C06 gene
310 3 184 3 554 g2335037 9.00E-67 Tim17
310 3 184 3 554 g4378524 2.00E-65 mitochondrial inner membrane translocase component Tim17a
310 3 184 3 554 g12833600 2.00E-65 putative
311 1 206 46 663 g12314268 4.00E-82 dJ14N1.2 (novel S-100/ICaBP type calcium binding domain protein, similar to trichohyalin)
311 1 206 46 663 g553621 3.00E-14 profilaggrin
311 1 206 46 663 g12314267 3.00E-14 dJ14N1.1.1 (profilaggrin 5' end)
312 2 301 59 961 g12314195 1.00E-171 bA255A11.3 (novel protein similar to KIAA1074)
312 2 301 59 961 g12314164 1.00E-134 bA526D8.2 (novel protein similar to KIAA1074)
312 2 301 59 961 g12053099 4.00E-95 hypothetical protein
313 1 156 70 537 g6692607 2.00E-65 MGA protein
313 1 156 70 537 g5931585 1.00E-44 T-box family member; T-box domain
313 1 156 70 537 g4049463 4.00E-16 transcription factor TBX6
314 1 279 340 1176 g12854977 1.00E-123 putative
314 1 279 340 1176 g7267246 1.00E-41 putative adenosine deaminase
314 1 279 340 1176 g7299138 2.00E-39 CG11994 gene product
315 1 329 181 1167 g4582324 0 dJ708F5.1 (PUTATIVE novel Collagen alpha 1 LIKE protein)
315 1 329 181 1167 g12052774 0 hypothetical protein
315 1 329 181 1167 g2326442 1.00E-38 collagen type XII alpha 1 chain
316 3 757 3 2273 g8896164 0 kinesin-like protein GAKIN
316 3 757 3 2273 g10697238 0 KIF13A
316 3 757 3 2273 g12054032 0 KINESIN-13A2
317 1 329 1 987 g14595658 8.00E-97 (AF387815) LIM protein prickle
317 1 329 1 987 g9229890 3.00E-84 prickle 2
317 1 329 1 987 g9229888 3.00E-83 prickle 1
318 1 494 1 1482 g1763096 0 glutamate pyruvate transaminase
318 1 494 1 1482 g467528 0 alanine aminotransferase
318 1 494 1 1482 g1507680 0 alanine aminotransferase
319 1 156 400 867 g5912051 2.00E-30 hypothetical protein
319 1 156 400 867 g14278937 2.00E-30 calmin
319 1 156 400 867 g10437695 2.00E-30 unnamed protein product
320 1 128 118 501 g7294107 4.00E-44 CG4638 gene product
320 1 128 118 501 g3169065 3.00E-40 elongation factor 2-like protein
320 1 128 118 501 g1302132 4.00E-38 ORF YNL163c
321 1 197 151 741 g13185203 2.00E-21 unnamed protein product
321 1 197 151 741 g2463632 3.00E-14 monocarboxylate transporter homologue MCT6
321 1 197 151 741 g6103363 3.00E-13 monocarboxylate transporter MCT3
322 1 560 1 1680 g4240293 1.00E-141 KIAA0902 protein
322 1 560 1 1680 g4151807 1.00E-141 membrane-associated guanylate kinase-interacting protein 2 Maguin-2
322 1 560 1 1680 g4151805 1.00E-141 membrane-associated guanylate kinase-interacting protein 1 Maguin-1
323 2 163 2 490 g10172680 3.00E-06 stage V sporulation protein C (peptidyl-tRNA hydrolase)
323 2 163 2 490 g1001232 4.00E-06 (D64003) peptidyl-tRNA hydrolase
323 2 163 2 490 g2983032 1.00E-05 peptidyl-tRNA hydrolase
324 2 197 2 592 g9295345 2.00E-67 HSKM-B
324 2 197 2 592 g12834773 5.00E-19 putative
324 2 197 2 592 g5870834 3.00E-17 skm-BOP2
325 3 520 3 1562 g8655678 1.00E-139 hypothetical protein
325 3 520 3 1562 g12382779 1.00E-59 zinc transporter 1
325 3 520 3 1562 g577843 2.00E-59 ZnT-1
326 1 772 640 2955 g13358642 0 hypothetical protein
326 1 772 640 2955 g10435064 0 unnamed protein product
326 1 772 640 2955 g10434826 0 unnamed protein product
327 3 702 153 2258 g7264026 0 dJ876B10.2 (novel protein (ortholog of rat EXO84))
327 3 702 153 2258 g2827164 0 exo84
327 3 702 153 2258 g7301432 2.00E-38 CG6095 gene product
328 3 255 609 1373 g13383265 1.00E-106 actin related protein
328 3 255 609 1373 g13938319 1.00E-105 Unknown (protein for MGC:15664)
328 3 255 609 1373 g12840619 9.00E-72 putative
330 2 178 203 736 g10435210 2.00E-76 unnamed protein product
330 2 178 203 736 g14598968 3.00E-58 (AX179297) 21615 ADH
330 2 178 203 736 g8895083 3.00E-58 oxidoreductase UCPA
331 1 217 1624 2274 g8655648 1.00E-108 hypothetical protein
331 1 217 1624 2274 g12803319 1.00E-108 Unknown (protein for MGC3090)
331 1 217 1624 2274 g10047337 1.00E-108 KIAA1630 protein
332 1 191 823 1395 g9651711 4.00E-72 arsenite inducible RNA associated protein
332 1 191 823 1395 g12835478 3.00E-53 putative
332 1 191 823 1395 g7295806 6.00E-45 CG12795 gene product
333 2 252 1223 1978 g12856598 6.00E-93 putative
333 2 252 1223 1978 g14042913 2.00E-19 unnamed protein product
333 2 252 1223 1978 g14424618 9.00E-19 hypothetical protein MGC2628
334 2 502 305 1810 g12845866 1.00E-131 putative
334 2 502 305 1810 g13477235 4.00E-80 Similar to RIKEN cDNA 0610037N03 gene
334 2 502 305 1810 g12833017 6.00E-22 putative
335 2 369 2 1108 g12856270 1.00E-165 putative
335 2 369 2 1108 g6682873 1.00E-119 reduced expression in cancer
335 2 369 2 1108 g7230612 1.00E-116 small rec
337 2 332 2 997 g8886025 1.00E-166 collapsin response mediator protein-5
337 2 332 2 997 g8671168 1.00E-166 hypothetical protein
337 2 332 2 997 g13259169 1.00E-166 phosphoprotein ULIP6
338 1 215 73 717 g12652727 1.00E-125 Unknown (protein for IMAGE:3352566)
338 1 215 73 717 g488555 2.00E-67 zinc finger protein ZNF135
338 1 215 73 717 g8050899 6.00E-65 ZNF180
339 3 364 3 1094 g9758769 1.00E-64 11-beta-hydroxysteroid dehydrogenase-like
339 3 364 3 1094 g8777393 1.00E-64 11-beta-hydroxysteroid dehydrogenase-like
339 3 364 3 1094 g8777400 1.00E-54 contains similarity to oxidoreductase~gene_id:MFB16.17
340 3 229 21 707 g12052959 1.00E-130 hypothetical protein
340 3 229 21 707 g14336767 1.00E-128 similar to homoprotocatechuate catabolism bifunctional isomerase/decarboxylase
340 3 229 21 707 g7670464 1.00E-115 unnamed protein product
341 1 164 235 726 g2286123 8.00E-47 testis specific DNAj-homolog
341 1 164 235 726 g12838392 8.00E-47 putative
341 1 164 235 726 g12838396 4.00E-46 putative
342 1 131 199 591 g12804323 5.00E-60 Unknown (protein for MGC:4054)
342 1 131 199 591 g3264773 5.00E-32 zinc-finger protein-37; ZFP-37
342 1 131 199 591 g10440398 5.00E-32 FLJ00032 protein
343 2 406 701 1918 g12697482 1.00E-121 dJ583P15.7.2 (novel zinc finger protein similar to rat RIN ZF)
343 2 406 701 1918 g12855580 1.00E-81 putative
343 2 406 701 1918 g12847599 7.00E-78 putative
344 3 219 3 659 g601722 3.00E-51 putative ribosomal protein L13
344 3 219 3 659 g6650751 2.00E-49 ribosomal protein 1
344 3 219 3 659 g2984157 4.00E-28 ribosomal protein L13
345 2 134 95 496 g14042747 4.00E-63 unnamed protein product
345 2 134 95 496 g10440516 4.00E-63 FLJ00106 protein
345 2 134 95 496 g10445215 7.00E-62 PDZ-LIM protein mystique
346 3 174 3 524 g11527997 1.00E-112 NOTCH2 protein
346 3 174 3 524 g11275978 1.00E-112 NOTCH 2
346 3 174 3 524 g287990 1.00E-112 Motch B
347 1 306 202 1119 g14024759 4.00E-32 aldehyde dehydrogenase
347 1 306 202 1119 g13162101 3.00E-28 putative aldehyde dehydrogenase
347 1 306 202 1119 g14025513 1.00E-23 succinate-semialdehyde dehydrogenase
348 3 283 819 1667 g10438978 1.00E-156 unnamed protein product
348 3 283 819 1667 g13358630 1.00E-155 hypothetical protein
348 3 283 819 1667 g9280094 1.00E-36 unnamed protein product
349 3 188 123 686 g4126809 5.00E-88 glyoxalase 1
349 3 188 123 686 g1808684 3.00E-84 hypothetical protein
349 3 188 123 686 g2213425 8.00E-81 hypothetical protein
350 1 176 1 528 g57469 2.00E-98 vasopressin
350 1 176 1 528 g4469315 2.00E-98 vasopressin precursor
350 1 176 1 528 g207674 3.00E-96 vasopressin/neurophysin precursor
351 1 52 253 408 g3790135 3.00E-18 dJ191N21.3 (proteasome subunit HC5)
351 1 52 253 408 g220026 3.00E-18 proteasome subunit C5
351 1 52 253 408 g1698584 3.00E-18 proteasome beta-subunit C5
352 3 190 3 572 g205662 5.00E-86 nucleoside diphosphate kinase
352 3 190 3 572 g206580 4.00E-85 RBL-NDP kinase 18kDa subunit (p18)
352 3 190 3 572 g53354 5.00E-85 nucleoside diphosphate kinase B
353 3 135 546 950 g9368839 1.00E-37 hypothetical protein
353 3 135 546 950 g57690 9.00E-29 ribosomal protein L23a
353 3 135 546 950 g404015 9.00E-29 ribosomal protein L23a
354 1 517 1 1551 g13161184 0 cytochrome P4502S1
354 1 517 1 1551 g14042396 0 unnamed protein product
354 1 517 1 1551 g12836063 0 putative
355 2 135 203 607 g7677318 5.00E-55 aldehyde reductase
355 2 135 203 607 g12848322 5.00E-55 putative
355 2 135 203 607 g12848318 5.00E-55 putative
356 3 77 42 272 g3851089 1.00E-24 guanine nucleotide binding protein gamma 5
356 3 77 42 272 g3329380 1.00E-24 G protein gamma 5 subunit
356 3 77 42 272 g204241 1.00E-24 G protein gamma-5 subunit
357 3 110 96 425 g7340072 4.00E-15 40S ribosomal protein S15A
357 3 110 96 425 g495273 4.00E-15 ribosomal protein S15a
357 3 110 96 425 g12859337 4.00E-15 putative
358 1 485 1 1455 g10436300 0 unnamed protein product
358 1 485 1 1455 g10433852 0 unnamed protein product
358 1 485 1 1455 g9280029 0 unnamed protein product
360 1 143 1 429 g63466 2.00E-56 histone H2A
360 1 143 1 429 g6094631 2.00E-56 histone H2A.F
360 1 143 1 429 g3420799 2.00E-56 histone H2A.F/Z variant
361 3 81 159 401 g10437078 4.00E-34 unnamed protein product
361 3 81 159 401 g12859774 6.00E-33 putative
361 3 81 159 401 g12857408 6.00E-33 putative
362 3 543 834 2462 g12849316 4.00E-87 putative
362 3 543 834 2462 g6164628 3.00E-86 SH3 and PX domain-containing protein SH3PX1
362 3 543 834 2462 g5410249 3.00E-86 SDP1 protein
363 1 185 1 555 g12836716 1.00E-103 putative
363 1 185 1 555 g12834312 1.00E-103 putative
363 1 185 1 555 g12832330 1.00E-103 putative
364 3 156 177 644 g14456629 1.00E-31 dJ54B20.2 (novel KRAB box containing C2H2 type zinc finger protein)
364 3 156 177 644 g4589588 9.00E-27 KIAA0972 protein
364 3 156 177 644 g3970712 3.00E-23 zinc finger protein 10
365 1 183 235 783 g12248382 7.00E-31 SEMB
365 1 183 235 783 g854326 7.00E-29 semaphorin B
365 1 183 235 783 g1110599 1.00E-14 semaphorin homolog=M-Sema F (mice, neonatal brain, Peptide, 834 aa)
366 1 416 1 1248 g13543419 1.00E-161 Similar to zinc finger protein 304
366 1 416 1 1248 g7020745 5.00E-53 unnamed protein product
366 1 416 1 1248 g12652759 5.00E-53 hypothetical protein FLJ20557
367 2 258 455 1228 g1174187 2.00E-94 purine nucleotide binding protein
367 2 258 455 1228 g193444 8.00E-88 guanylate binding protein
367 2 258 455 1228 g829177 1.00E-68 guanylate binding protein isoform II
368 3 68 270 473 g14586963 7.00E-10 (AF362574) M75
368 3 68 270 473 g57115 7.00E-10 ribosomal protein L31 (AA 1-125)
368 3 68 270 473 g36130 7.00E-10 ribosomal protein L31 (AA 1-125)
371 1 122 568 933 g10435150 2.00E-39 unnamed protein product
371 1 122 568 933 g498215 2.00E-16 lepA protein
371 1 122 568 933 g2984041 6.00E-16 G-protein LepA
372 3 111 3 335 g13752754 3.00E-15 zinc finger 1111
372 3 111 3 335 g7023216 2.00E-14 unnamed protein product
372 3 111 3 335 g14348588 2.00E-14 KRAB zinc finger protein
373 3 1281 36 3878 g157409 2.00E-95 fat protein
373 3 1281 36 3878 g7295732 3.00E-94 ft gene product
373 3 1281 36 3878 g10727403 2.00E-93 ds gene product
374 3 164 3 494 g9759463 2.00E-60 40S ribosomal protein S19
374 3 164 3 494 g6513924 4.00E-60 putative 40S ribosomal protein S19
374 3 164 3 494 g13878029 4.00E-60 putative 40S ribosomal protein S19
375 3 365 750 1844 g55628 0 reading frame (preproalbumin)
375 3 365 750 1844 g3647327 0 serum albumin
375 3 365 750 1844 g12845183 0 putative
376 1 96 184 471 g12853416 5.00E-25 putative
376 1 96 184 471 g13529497 8.00E-23 Unknown (protein for MGC6652)
376 1 96 184 471 g4589588 5.00E-22 KIAA0972 protein
377 2 106 125 442 g14042293 3.00E-37 unnamed protein product
377 2 106 125 442 g12052983 3.00E-37 hypothetical protein
377 2 106 125 442 g14043841 7.00E-34 Unknown (protein for MGC:14429)
378 2 119 2 358 g57710 2.00E-31 ribosomal phosphoprotein P1 (AA 1-114)
378 2 119 2 358 g190234 2.00E-31 acidic ribosomal phosphoprotein (P1)
378 2 119 2 358 g14043204 2.00E-31 ribosomal protein, large, P1
380 2 282 2 847 g10440398 8.00E-93 FLJ00032 protein
380 2 282 2 847 g10047297 8.00E-93 KIAA1611 protein
380 2 282 2 847 g10436789 6.00E-92 unnamed protein product
381 2 47 287 427 g7290056 2.00E-10 EG:118B3.2 gene product (alt 2)
381 2 47 287 427 g5901822 2.00E-10 EG:118B3.2
381 2 47 287 427 g3645991 2.00E-10 /prediction=(method:"genefinder", version:"084", score:"128.32")~/prediction=(method:"genscan", version:"1.0")~/match=(desc:"KIAA0376 PROTEIN (FRAGMENT)", species:"HOMO SAPIENS (HUMAN)", ranges:(query:15779..16030, target:SPTREMBL::O15081:314..231, score:"229.00"), (query:14786..15736, target:SPTREMBL::O15081:650..334, score:"319.00"), (query:13868..13960, target:SPTREMBL::O15081:884..854, score:"156.00")), method:"blastx", version:"1.4.9")~/match=(desc:"SPECTRIN BETA CHAIN, ERYTHROCYTE", species:"HOMO SAPIENS (HUMAN)", ranges:(query:13748..13954, target:SWISS-PROT::P11277:242..174, score:"201.00")), method:"blastx", version:"1.4.9")~/match=(desc:"GH04661.5prime GH Drosophila melanogaster head pOT2 Drosophila melanogaster cDNA clone GH04661 5prime, mRNA sequence", species:"Drosophila melanogaster (fruit fly)", ranges:(query:22139..22499, target:EMBL::AI064293:361..1, score:"1796.00")), method:"blastn", version:"1.4.9")~/match=(desc:"GH05563.5prime GH Drosophila melanogaster head pOT2 Drosophila melanogaster cDNA clone GH05563
383 1 280 16 855 g14456629 9.00E-90 dJ54B20.2 (novel KRAB box containing C2H2 type zinc finger protein)
383 1 280 16 855 g3378094 3.00E-72 KRAB domain zinc finger protein
383 1 280 16 855 g1020145 1.00E-71 DNA binding protein
384 3 264 3 794 g6807587 1.00E-124 hypothetical protein
384 3 264 3 794 g5931821 1.00E-124 dJ228H13.3 (zinc finger protein)
384 3 264 3 794 g488555 1.00E-96 zinc finger protein ZNF135
385 3 561 135 1817 g10954044 0 KRAB zinc finger protein ZFQR
385 3 561 135 1817 g10442700 0 zinc-finger protein ZBRK1
385 3 561 135 1817 g10435411 0 unnamed protein product
386 2 288 206 1069 g10439850 4.00E-70 unnamed protein product
386 2 288 206 1069 g4894364 6.00E-64 zinc finger protein 3
386 2 288 206 1069 g12655165 6.00E-64 zinc finger protein 256
387 3 476 171 1598 g14042538 0 unnamed protein product
387 3 476 171 1598 g10438630 0 unnamed protein product
387 3 476 171 1598 g1309691 0 Similar to zinc finger protein 135 (clone pHZ-17)
389 3 60 78 257 g5262560 5.00E-07 hypothetical protein
389 3 60 78 257 g10434856 5.00E-07 unnamed protein product
389 3 60 78 257 g13623354 6.00E-07 Similar to zinc finger protein 136 (clone pHZ-20)
390 2 355 254 1318 g10047183 1.00E-94 KIAA1559 protein
390 2 355 254 1318 g13676461 6.00E-81 hypothetical protein
390 2 355 254 1318 g4589566 1.00E-80 KIAA0961 protein
391 1 101 112 414 g13937999 3.00E-48 Similar to DNA-binding protein
391 1 101 112 414 g3329372 1.00E-34 DNA-binding protein
391 1 101 112 414 g12052732 3.00E-34 hypothetical protein
392 3 326 3 980 g14017921 0 KIAA1852 protein
392 3 326 3 980 g8050899 1.00E-133 ZNF180
392 3 326 3 980 g6409345 1.00E-133 zinc finger protein ZNF180
393 3 263 792 1580 g5441615 1.00E-96 zinc finger protein
393 3 263 792 1580 g498721 1.00E-94 zinc finger protein
393 3 263 792 1580 g8099348 2.00E-93 zinc finger protein
394 3 121 372 734 g10439850 3.00E-18 unnamed protein product
394 3 121 372 734 g14249844 7.00E-18 Similar to hypothetical protein FLJ23233
394 3 121 372 734 g2689441 4.00E-13 F18547_1
395 2 298 2 895 g10437560 0 unnamed protein product
395 2 298 2 895 g10047305 0 KIAA1615 protein
395 2 298 2 895 g498721 4.00E-86 zinc finger protein
396 1 287 220 1080 g14042550 3.00E-94 unnamed protein product
396 1 287 220 1080 g13937909 3.00E-94 Similar to KIAA0961 protein
396 1 287 220 1080 g10047183 5.00E-78 KIAA1559 protein
397 1 281 277 1119 g3540177 1.00E-123 F23269_2
397 1 281 277 1119 g5080758 1.00E-120 BC331191_1
397 1 281 277 1119 g12855931 5.00E-73 putative
398 1 263 1 789 g10434650 4.00E-88 unnamed protein product
398 1 263 1 789 g13623217 2.00E-40 Similar to hypothetical protein FLJ12895
398 1 263 1 789 g7020855 6.00E-27 unnamed protein product
399 2 646 200 2137 g14042373 0 unnamed protein product
399 2 646 200 2137 g498152 0 ha0946 protein is Kruppel-related.
399 2 646 200 2137 g7243633 0 RB-associated KRAB repressor
400 2 199 140 736 g204123 5.00E-89 ferritin light chain
400 2 199 140 736 g204133 6.00E-89 ferritin light chain
400 2 199 140 736 g193275 6.00E-89 ferritin light chain
401 1 210 25 654 g9651704 4.00E-79 carboxypeptidase B precursor
401 1 210 25 654 g6013463 8.00E-61 carboxypeptidase homolog
401 1 210 25 654 g203295 8.00E-61 carboxypeptidase B
402 2 390 2 1171 g13365901 0 hypothetical protein
402 2 390 2 1171 g2104689 1.00E-126 alpha glucosidase II, alpha subunit
402 2 390 2 1171 g1890664 1.00E-126 glucosidase II
403 3 43 348 476 g12052884 3.00E-06 hypothetical protein
403 3 43 348 476 g7023332 4.00E-06 unnamed protein product
403 3 43 348 476 g183002 6.00E-05 guanylate binding protein isoform I
404 1 406 1 1218 g12053281 0 hypothetical protein
404 1 406 1 1218 g12836135 1.00E-167 putative
404 1 406 1 1218 g3043598 2.00E-64 KIAA0537 protein
405 1 90 1 270 g7981261 4.00E-33 dJ50O24.4 (novel protein with DHHC zinc finger domain)
405 1 90 1 270 g14035816 4.00E-33 unnamed protein product
405 1 90 1 270 g12224992 4.00E-33 hypothetical protein
407 2 192 44 619 g7297667 1.00E-30 CG5022 gene product
407 2 192 44 619 g2224617 4.00E-30 KIAA0338
407 2 192 44 619 g13277297 4.00E-30 bA234K24.1.2 (Erythrocyte membrane protein band 4.1-like 1 protein (KIAA0338) isoform 2)
408 1 183 28 576 g12314083 9.00E-94 dJ1007G16.5 (novel high-mobility group (nonhistone chromosomal) protein 2 (HMG2) like protein)
408 1 183 28 576 g12838247 4.00E-67 putative
408 1 183 28 576 g1304193 9.00E-35 HMG2
409 3 163 3 491 g12697320 8.00E-51 Pbx4 protein
409 3 163 3 491 g7160800 3.00E-50 Pbx4/Lazarus homeodomain protein
409 3 163 3 491 g5679283 3.00E-50 Pbx4 homeodomain protein
410 3 186 198 755 g12840673 2.00E-48 putative
410 3 186 198 755 g607003 4.00E-18 beta transducin-like protein
410 3 186 198 755 g3878300 1.00E-14 predicted using Genefinder~Similarity to C.elegans Guanine nucleotide binding protein (WP:C14B1.4), contains similarity to Pfam domain: PF00400 (WD domain, G-beta repeat), Score=196.1, E-value=1.8e-55, N=7~cDNA EST yk567g12.3 comes from this gene~cDNA EST yk567g12.5 comes from this gene
411 1 134 286 687 g11342541 2.00E-67 putative white family ATP-binding cassette transporter
411 1 134 286 687 g9665220 3.00E-50 ATP-binding cassette transporter, sub-family G member 1
411 1 134 286 687 g7768742 3.00E-50 white protein homolog (ATP-binding cassette transporter 8)
412 2 97 35 325 g57175 1.00E-37 S-100 protein
412 2 97 35 325 g206825 1.00E-37 S100 protein
412 2 97 35 325 g404769 2.00E-37 S100 beta protein
413 1 181 1 543 g13543071 9.00E-64 RIKEN cDNA 1500031N16 gene
413 1 181 1 543 g12837801 9.00E-64 putative
413 1 181 1 543 g12832973 5.00E-60 putative
414 3 258 3 776 g12856949 4.00E-98 putative
414 3 258 3 776 g12653785 3.00E-97 Unknown (protein for IMAGE:3349601)
414 3 258 3 776 g12845723 4.00E-94 putative
415 2 169 2 508 g7259240 1.00E-79 unnamed protein product
415 2 169 2 508 g12845457 1.00E-79 putative
415 2 169 2 508 g12834293 1.00E-79 putative
416 1 177 310 840 g9502403 5.00E-06 Hypothetical zinc finger-like protein
417 1 81 451 693 g14042550 6.00E-27 unnamed protein product
417 1 81 451 693 g13937909 6.00E-27 Similar to KIAA0961 protein
417 1 81 451 693 g487787 3.00E-23 zinc finger protein ZNF140
418 2 94 329 610 g13752754 7.00E-18 zinc finger 1111
418 2 94 329 610 g14348588 1.00E-16 KRAB zinc finger protein
418 2 94 329 610 g12654015 1.00E-16 Similar to hypothetical protein FLJ10891
419 3 165 3 497 g12856090 2.00E-51 putative
419 3 165 3 497 g12854104 2.00E-51 putative
419 3 165 3 497 g4678718 4.00E-43 dJ20I3.1 (brain mitochondrial carrier protein-1 (BMCP1))
420 3 352 111 1166 g13444976 4.00E-36 unnamed protein product
420 3 352 111 1166 g12309630 4.00E-36 bA438B23.1 (neuronal leucine-rich repeat protein)
420 3 352 111 1166 g9651089 8.00E-34 hypothetical protein
421 2 113 2 340 g14164613 3.00E-27 sialic acid binding immunoglobulin-like lectin 10
421 2 113 2 340 g13991167 6.00E-16 sialic acid-binding immunoglobulin-like lectin-like long splice variant
421 2 113 2 340 g13991166 6.00E-16 sialic acid-binding immunoglobulin-like lectin-like short splice variant
422 2 100 101 400 g12052732 1.00E-38 hypothetical protein
422 2 100 101 400 g3329372 8.00E-37 DNA-binding protein
422 2 100 101 400 g7959207 2.00E-34 KIAA1473 protein
423 1 117 1 351 g7959207 8.00E-39 KIAA1473 protein
423 1 117 1 351 g3342002 2.00E-36 hematopoietic cell derived zinc finger protein
423 1 117 1 351 g186774 5.00E-35 zinc finger protein
424 2 250 2 751 g14042850 2.00E-50 unnamed protein product
424 2 250 2 751 g12052983 8.00E-37 hypothetical protein
424 2 250 2 751 g14042293 1.00E-36 unnamed protein product
425 1 199 1 597 g10437560 1.00E-105 unnamed protein product
425 1 199 1 597 g10047305 1.00E-105 KIAA1615 protein
425 1 199 1 597 g13436440 2.00E-77 Unknown (protein for MGC:4400)
426 3 166 3 500 g14017833 3.00E-86 KIAA1808 protein
426 3 166 3 500 g4240175 6.00E-24 KIAA0843 protein
426 3 166 3 500 g505094 5.00E-21 similar to an actin bundling protein, dematn.
427 1 157 1168 1638 g14042421 4.00E-11 unnamed protein product
427 1 157 1168 1638 g14017937 4.00E-11 KIAA1860 protein
427 1 157 1168 1638 g13366084 4.00E-11 MAP/microtubule affinity-regulating kinase like 1
429 3 200 3 602 g9651099 1.00E-118 hypothetical protein
429 3 200 3 602 g881564 1.00E-55 ZNF157
429 3 200 3 602 g453466 5.00E-55 zinc finger protein
430 2 173 2 520 g11990770 2.00E-67 bA534G20.1.1 (novel protein similar to Lysozyme C-1 (1,4-beta-N-acylmuramidase C, EC 3.2.1.17) (isoform 1))
430 2 173 2 520 g11990771 1.00E-56 bA534G20.1.2 (novel protein similar to Lysozyme C-1 (1,4-beta-N-acylmuramidase C, EC 3.2.1.17) (isoform 2))
430 2 173 2 520 g12839824 3.00E-49 putative
431 1 181 76 618 g13591714 4.00E-97 (AF343664) immunoglobulin superfamily receptor translocation associated protein 2c
431 1 181 76 618 g13591712 4.00E-97 (AF343663) immunoglobulin superfamily receptor translocation associated protein 2b
431 1 181 76 618 g13591710 4.00E-97 (AF343662) immunoglobulin superfamily receptor translocation associated protein 2a
433 1 157 22 492 g7268562 7.00E-54 ribosomal protein L32-like protein
433 1 157 22 492 g5816996 7.00E-54 ribosomal protein L32-like protein
433 1 157 22 492 g13899057 2.00E-53 ribosomal protein L32
434 1 182 157 702 g12855141 5.00E-89 putative
434 1 182 157 702 g12852338 5.00E-89 putative
434 1 182 157 702 g12849745 5.00E-89 putative
435 2 134 2 403 g2624328 2.00E-44 OsGRP2
435 2 134 2 403 g2366750 2.00E-34 RNA binding protein
435 2 134 2 403 g7268089 8.00E-34 glycine-rich RNA-binding protein AtGRP2-like
436 2 148 731 1174 g13397122 1.00E-58 unnamed protein product
436 2 148 731 1174 g12654715 1.00E-58 Similar to glucose regulated protein, 58 kDa
436 2 148 731 1174 g10728195 3.00E-15 CG1837 gene product
437 2 296 2 889 g12597312 1.00E-149 tRNA-guanine transglycosylase
437 2 296 2 889 g12597314 1.00E-137 tRNA-guanine transglycosylase
437 2 296 2 889 g7415812 1.00E-135 tRNA-guanine transglycosylase
438 3 169 9 515 g12406772 4.00E-52 unnamed protein product
438 3 169 9 515 g12406688 4.00E-52 unnamed protein product
438 3 169 9 515 g7299372 4.00E-29 CG6567 gene product
439 2 170 200 709 g14133251 3.00E-93 KIAA1479 protein
439 2 170 200 709 g2623162 7.00E-39 semaphorin VIa
439 2 170 200 709 g11093909 7.00E-39 axon guidance signal SEMA6A1
440 2 841 2 2524 g11177164 0 polydom protein
440 2 841 2 2524 g12060830 1.00E-155 serologically defined breast cancer antigen NY-BR-38
440 2 841 2 2524 g14198157 4.00E-83 polydomain protein
441 1 271 70 882 g13898617 1.00E-142 serine/threonine protein kinase SSTK
441 1 271 70 882 g13540326 1.00E-142 serine/threonine kinase FKSG82
441 1 271 70 882 g13898619 1.00E-140 serine/threonine protein kinase SSTK
442 2 311 743 1675 g8979743 1.00E-160 Band4.1-like5 protein
442 2 311 743 1675 g13278193 1.00E-153 Similar to EHM2 gene
442 2 311 743 1675 g10434740 1.00E-140 unnamed protein product
443 2 529 416 2002 g12834087 1.00E-155 putative
443 2 529 416 2002 g2463628 6.00E-46 putative monocarboxylate transporter
443 2 529 416 2002 g7328162 1.00E-40 hypothetical protein
444 1 335 4 1008 g13159480 1.00E-128 Translation may initiate at the ATG codon at nucleotides 40-42 or the ATG at nucleotides 43-45
444 1 335 4 1008 g9229906 5.00E-35 fibrinogen-like protein
444 1 335 4 1008 g387156 8.00E-33 fibrinogen-like protein
445 1 329 196 1182 g4582324 0 dJ708F5.1 (PUTATIVE novel Collagen alpha 1 LIKE protein)
445 1 329 196 1182 g12052774 0 hypothetical protein
445 1 329 196 1182 g2326442 1.00E-38 collagen type XII alpha 1 chain
446 2 395 110 1294 g12642596 0 nuclear receptor co-repressor/HDAC3 complex subunit TBLR1
446 2 395 110 1294 g10434648 0 unnamed protein product
446 2 395 110 1294 g12006104 0 IRA1
447 1 163 1 489 g12858551 7.00E-67 putative
447 1 163 1 489 g12805285 7.00E-67 Similar to ribosomal protein S27a
447 1 163 1 489 g1050756 7.00E-67 fusion protein: ubiquitin (bases 43_513); ribosomal protein S27a (bases 217.532)
448 3 78 330 563 g8699209 5.00E-06 cyclophilin A
448 3 78 330 563 g50621 5.00E-06 cyclophilin (AA 1-164)
448 3 78 330 563 g49496 5.00E-06 cyclophilin (AA 1-164)
449 1 314 277 1218 g12844770 1.00E-130 putative
449 1 314 277 1218 g12861366 1.00E-127 putative
449 1 314 277 1218 g12857383 7.00E-50 putative
450 1 130 181 570 g2843171 4.00E-06 zinc finger protein
450 1 130 181 570 g5817149 5.00E-06 hypothetical protein
450 1 130 181 570 g10434142 5.00E-06 unnamed protein product
451 1 176 697 1224 g13383265 1.00E-63 actin related protein
451 1 176 697 1224 g13938319 3.00E-62 Unknown (protein for MGC:15664)
451 1 176 697 1224 g12840619 2.00E-51 putative
452 3 671 3 2015 g12698057 1.00E-168 KIAA1756 protein
452 3 671 3 2015 g1177322 1.00E-124 CPG2 protein
452 3 671 3 2015 g10728577 1.00E-82 Msp-300 gene product
453 3 293 3 881 g4235148 8.00E-85 BC41195_1
453 3 293 3 881 g14165525 7.00E-71 Similar to CG8500 gene product
453 3 293 3 881 g14388336 2.00E-70 hypothetical protein
455 1 253 478 1236 g14424576 1.00E-118 hypothetical protein FLJ21963
455 1 253 478 1236 g10438188 1.00E-118 unnamed protein product
455 1 253 478 1236 g14025162 3.00E-57 acetyl-coa synthetase
457 1 125 529 903 g12859335 1.00E-26 putative
457 1 125 529 903 g12858578 1.00E-26 putative
457 1 125 529 903 g12843085 6.00E-21 putative
458 1 394 577 1758 g14533376 1.00E-148 (AX151207) unnamed protein product
458 1 394 577 1758 g11762100 1.00E-144 myo-inositol 1-phosphate synthase
458 1 394 577 1758 g3108053 1.00E-144 myo-inositol 1-phosphate synthase; INO1
459 3 149 1137 1583 g12053149 2.00E-39 hypothetical protein
460 2 323 122 1090 g7297059 3.00E-06 CG9117 gene product
461 3 200 387 986 g13365897 1.00E-82 hypothetical protein
461 3 200 387 986 g2636109 2.00E-24 similar to metabolite transport protein
461 3 200 387 986 g1894771 2.00E-24 product highly similar to metabolite transport proteins
462 2 406 701 1918 g12697482 1.00E-121 dJ583P15.7.2 (novel zinc finger protein similar to rat RIN ZF)
462 2 406 701 1918 g12855580 1.00E-81 putative
462 2 406 701 1918 g12847599 7.00E-78 putative
463 3 181 102 644 g13879442 2.00E-78 Similar to RIKEN cDNA 2310035M22 gene
463 3 181 102 644 g12848905 2.00E-78 putative
463 3 181 102 644 g12855490 1.00E-73 putative
464 3 137 69 479 g11231085 7.00E-29 hypothetical protein
464 3 137 69 479 g10998425 6.00E-09 NORPEG-like protein
464 3 137 69 479 g10937641 6.00E-09 ankycorbin
465 2 278 218 1051 g12861800 1.00E-111 putative
465 2 278 218 1051 g3878713 2.00E-38 (Z46935) weak similarity with quinone oxidoreductase, contains similarity to Pfam domain: PF00107 (Zinc-binding dehydrogenases), Score=-80.6, E-value=6.2e-06, N=1 -cDNA EST yk164b4.5 comes from this gene-cDNA EST yk164b4.3 comes from this gene-cDNA EST yk264f3.5 comes from this
465 2 278 218 1051 g9948219 4.00E-36 conserved hypothetical protein
467 1 142 118 543 g4559318 4.00E-17 BC273239_1
467 1 142 118 543 g186774 9.00E-17 zinc finger protein
467 1 142 118 543 g1017722 9.00E-17 repressor transcriptional factor
468 3 126 78 455 g9963804 5.00E-47 zinc finger protein ZNF286
468 3 126 78 455 g14017965 5.00E-47 KIAA1874 protein
468 3 126 78 455 g5640017 2.00E-46 zinc finger protein ZFP113
469 3 246 228 965 g13676461 1.00E-45 hypothetical protein
469 3 246 228 965 g4589566 1.00E-45 KIAA0961 protein
469 3 246 228 965 g487787 2.00E-37 zinc finger protein ZNF140
470 1 107 40 360 g14348591 2.00E-28 KRAB zinc finger protein
470 1 107 40 360 g7959207 1.00E-27 KIAA1473 protein
470 1 107 40 360 g186774 4.00E-27 zinc finger protein
471 3 357 3 1073 g12052983 1.00E-167 hypothetical protein
471 3 357 3 1073 g14042293 1.00E-135 unnamed protein product
471 3 357 3 1073 g5262560 1.00E-113 hypothetical protein
472 3 90 3 272 g13752754 2.00E-16 zinc finger l l l l
472 3 90 3 272 g14348588 9.00E-15 KRAB zinc finger protein
472 3 90 3 272 g12654015 9.00E-15 Similar to hypothetical protein FLJ10891
473 295 885 g14495650 1.00E-75 (BC009433) zinc finger protein 331; zinc finger protein 463
473 295 885 g8575775 1.00E-75 KRAB zinc finger protein
473 295 885 g13939858 1.00E-75 RITA
474 195 585 g7020503 9.00E-68 unnamed protein product
474 195 585 g8453103 1.00E-67 zinc finger protein
474 195 585 g8099348 3.00E-67 zinc finger protein
475 2 232 257 952 g14042293 4.00E-53 unnamed protein product
475 2 232 257 952 g14042850 3.00E-47 unnamed protein product
475 2 232 257 952 g12052983 6.00E-38 hypothetical protein
476 2 282 2 847 g10440398 1.00E-93 FLJ00032 protein
476 2 282 2 847 g10047297 1.00E-93 KIAA1611 protein
476 2 282 2 847 g10436789 8.00E-93 unnamed protein product
477 3 149 3 449 g12656631 7.00E-20 Kruppel-like zinc finger protein GLIS2
477 3 149 3 449 g13507039 2.00E-19 Gli-Kruppel zinc-finger protein NKL
477 3 149 3 449 g13507037 2.00E-19 Gli-Kruppel zinc-finger protein NKL
478 3 283 3 851 g13623633 1.00E-174 Unknown (protein for MGC:13105)
478 3 283 3 851 g488555 3.00E-98 zinc finger protein ZNF135
478 3 283 3 851 g10437767 9.00E-97 unnamed protein product
479 3 264 3 794 g6807587 1.00E-124 hypothetical protein
479 3 264 3 794 g5931821 1.00E-124 dJ228H13.3 (zinc finger protein)
479 3 264 3 794 g488555 1.00E-96 zinc finger protein ZNF135
480 201 430 1032 g3540177 1.00E-117 F23269_2
480 201 430 1032 g5080758 6.00E-92 BC331191_1
480 201 430 1032 g12855931 1.00E-55 putative
481 283 880 1728 g14017871 1.00E-143 KIAA1827 protein
481 283 880 1728 g10436789 2.00E-97 unnamed protein product
481 283 880 1728 g13752754 1.00E-95 zinc finger l l l l
482 2 101 131 433 g14042373 1.00E-24 unnamed protein product
482 2 101 131 433 g1389741 7.00E-24 KRAB/zinc finger suppressor protein 1
482 2 101 131 433 g9800824 2.00E-20 bA179N14.1 (novel zinc finger protein)
483 1 101 112 414 g13937999 3.00E-48 Similar to DNA-binding protein
483 1 101 112 414 g3329372 1.00E-34 DNA-binding protein
483 1 101 112 414 g12052732 3.00E-34 hypothetical protein
484 2 297 215 1105 g12862320 2.00E-51 WDC146
486 3 108 462 785 g12856025 1.00E-51 putative
486 3 108 462 785 g12846941 4.00E-37 putative
486 3 108 462 785 g13938537 2.00E-36 Similar to RIKEN CDNA 4933430F16 gene
487 2 122 122 487 g4235144 4.00E-47 BC39498_1
487 2 122 122 487 g4235143 7.00E-34 BC39498_3
487 2 122 122 487 g8163824 4.00E-30 krueppel-like zinc finger protein HZF2
488 1 182 148 693 g12854977 1.00E-41 putative
488 1 182 148 693 g7267246 1.00E-21 putative adenosine deaminase
488 1 182 148 693 g7299138 1.00E-16 CGI 1 94 gene product
489 2 114 293 634 g5262523 3.00E-19 hypothetical protein
489 2 114 293 634 g340170 3.00E-19 orotidine 5'-monophosphate decarboxylase (EC 4.1.1.23)
489 2 114 293 634 g340168 3.00E-19 UMP synthase
490 3 415 3 1247 g2689442 0 R28830_1
490 3 415 3 1247 g1572600 1.00E-172 Zik1
490 3 415 3 1247 g12652759 1.00E-141 hypothetical protein FLJ20557
491 3 43 348 476 g 12052884 3.00E-06 hypothetical protein
491 3 43 348 476 g7023332 4.00E-06 unnamed protein product
491 3 43 348 476 g 183002 6.00E-05 guanylate binding protein isoform 1
492 1 134 49 450 g220637 4.00E-52 zinc finger protein
492 1 134 49 450 g5730196 2.00E-51 Kruppel-type zinc finger
492 1 134 49 450 g14456631 8.00E-51 dJ54B20.4 (novel KRAB box containing C2H2 type zinc finger protein)
493 3 239 3 719 g14042330 2.00E-96 unnamed protein product
493 3 239 3 719 g10954044 8.00E-92 KRAB zinc finger protein ZFQR
493 3 239 3 719 g 10442700 8.00E-92 zinc-finger protein ZBRK1
495 2 185 134 688 g 13560888 2.00E-38 EZFIT-related protein 1
495 2 185 134 688 g7243243 2.00E-37 KIAA1431 protein
495 2 185 134 688 g4567178 5.00E-35 R31665_2 (AA 1-673)
496 1 277 178 1008 g4589588 4.00E-67 KIAA0972 protein
496 1 277 178 1008 g498152 4.00E-51 ha0946 protein is Kruppel-related.
496 1 277 178 1008 g6467204 3.00E-38 gonadotropin inducible transcription repressor-3
497 3 241 3 725 g 14043841 3.00E-41 Unknown (protein for MGC: 14429)
497 3 241 3 725 g 14042293 6.00E-41 unnamed protein product
497 3 241 3 725 g 12052983 2.00E-40 hypothetical protein
498 2 251 1139 1891 g12052983 1.00E-114 hypothetical protein
498 2 251 1139 1891 g5262560 4.00E-68 hypothetical protein
498 2 251 1139 1891 g 10434856 1.00E-61 unnamed protein product
499 3 282 54 899 g5080758 1.00E-103 BC331191_1
499 3 282 54 899 g3540177 1.00E-103 F23269_2
499 3 282 54 899 g 12855931 1.00E-58 putative
500 1 442 28 1353 g 12652727 0 Unknown (protein for IMAGE:3352566)
500 1 442 28 1353 g488555 1.00E-97 zinc finger protein ZNF135
500 1 442 28 1353 g3378094 1.00E-95 KRAB domain zinc finger protein
501 3 292 3 878 g488551 1.00E-97 zinc finger protein ZNF132
501 3 292 3 878 g 1354341 2.00E-97 Similar to zinc finger protein 304
501 3 292 3 878 g 1 1 9604 3.00E-97 zinc finger protein C2H2-25
502 166 498 g12804419 2.00E-65 Unknown (protein for MGC:1136)
502 166 498 g12844790 5.00E-64 putative
502 166 498 g13111895 3.00E-33 Unknown (protein for MGC:2627)
504 406 1218 g12053281 0 hypothetical protein
504 406 1218 g12836135 1.00E-167 putative
504 406 1218 g3043598 2.00E-64 KIAA0537 protein
505 332 996 g10438696 1.00E-140 unnamed protein product
505 332 996 g12847740 7.00E-11 putative
505 332 996 g7022551 1.00E-09 unnamed protein product
506 2 183 2 550 g4105619 1.00E-94 SPAF
506 2 183 2 550 g 12847023 1.00E-94 putative
506 2 183 2 550 g7297973 5.00E-67 CG5776 gene product
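The coordinate columns in the Table 7 rows shown here appear to be related: the span from Start to Stop covers three nucleotides per residue of Length, and Frame matches the codon position of Start. The source does not state this rule; it is an observation from the rows above. A minimal Python sketch of that check follows. The names `Table7Row`, `parse_coords`, and `is_consistent` are hypothetical and introduced only for illustration, and rows whose Frame column is missing from the text are accepted with the frame left unset.

```python
from typing import List, NamedTuple, Optional


class Table7Row(NamedTuple):
    seq_id: int
    frame: Optional[int]  # None when the Frame column is absent from the row
    length: int           # length in amino acid residues
    start: int            # 1-based nucleotide position
    stop: int             # 1-based nucleotide position


def parse_coords(fields: List[str]) -> Table7Row:
    """Parse the leading numeric columns: SEQ ID NO, [Frame], Length, Start, Stop."""
    nums = [int(f) for f in fields]
    if len(nums) == 5:
        return Table7Row(*nums)
    if len(nums) == 4:
        seq_id, length, start, stop = nums
        return Table7Row(seq_id, None, length, start, stop)
    raise ValueError("expected 4 or 5 numeric columns, got %d" % len(nums))


def is_consistent(row: Table7Row) -> bool:
    """Assumed rule: Stop - Start + 1 == 3 * Length, and Frame == (Start - 1) % 3 + 1."""
    if row.stop - row.start + 1 != 3 * row.length:
        return False
    if row.frame is not None and row.frame != (row.start - 1) % 3 + 1:
        return False
    return True


if __name__ == "__main__":
    for text in ("506 2 183 2 550", "460 2 323 122 1090", "451 176 697 1224"):
        row = parse_coords(text.split())
        print(row, is_consistent(row))
```

A check of this kind can help spot digits that were split or dropped during text extraction, since a damaged Start or Length value will usually break the relation.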
TABLE 8

Program: ABI FACTURA
Description: A program that removes vector sequences and masks ambiguous bases in nucleic acid sequences.
Reference: Applied Biosystems, Foster City, CA.

Program: ABI/PARACEL FDF
Description: A Fast Data Finder useful in comparing and annotating amino acid or nucleic acid sequences.
Reference: Applied Biosystems, Foster City, CA; Paracel Inc., Pasadena, CA.
Parameter Threshold: Mismatch <50%

Program: ABI AutoAssembler
Description: A program that assembles nucleic acid sequences.
Reference: Applied Biosystems, Foster City, CA.

Program: BLAST
Description: A Basic Local Alignment Search Tool useful in sequence similarity search for amino acid and nucleic acid sequences. BLAST includes five functions: blastp, blastn, blastx, tblastn, and tblastx.
Reference: Altschul, S.F. et al. (1990) J. Mol. Biol. 215:403-410; Altschul, S.F. et al. (1997) Nucleic Acids Res. 25:3389-3402.
Parameter Threshold: ESTs: Probability value = 1.0E-8 or less. Full Length sequences: Probability value = 1.0E-10 or less.

Program: FASTA
Description: A Pearson and Lipman algorithm that searches for similarity between a query sequence and a group of sequences of the same type. FASTA comprises at least five functions: fasta, tfasta, fastx, tfastx, and ssearch.
Reference: Pearson, W.R. and D.J. Lipman (1988) Proc. Natl. Acad. Sci. USA 85:2444-2448; Pearson, W.R. (1990) Methods Enzymol. 183:63-98; and Smith, T.F. and M.S. Waterman (1981) Adv. Appl. Math. 2:482-489.
Parameter Threshold: ESTs: fasta E value = 1.06E-6. Assembled ESTs: fasta Identity = 95% or greater and Match length = 200 bases or greater; fastx E value = 1.0E-8 or less. Full Length sequences: fastx score = 100 or greater.

Program: BLIMPS
Description: A BLocks IMProved Searcher that matches a sequence against those in BLOCKS, PRINTS, DOMO, PRODOM, and PFAM databases to search for gene families, sequence homology, and structural fingerprint regions.
Reference: Henikoff, S. and J.G. Henikoff (1991) Nucleic Acids Res. 19:6565-6572; Henikoff, J.G. and S. Henikoff (1996) Methods Enzymol. 266:88-105; and Attwood, T.K. et al. (1997) J. Chem. Inf. Comput. Sci. 37:417-424.
Parameter Threshold: Probability value = 1.0E-3 or less

Program: HMMER
Description: An algorithm for searching a query sequence against hidden Markov model (HMM)-based databases of protein family consensus sequences, such as PFAM.
Reference: Krogh, A. et al. (1994) J. Mol. Biol. 235:1501-1531; Sonnhammer, E.L.L. et al. (1988) Nucleic Acids Res. 26:320-322; Durbin, R. et al. (1998) Our World View, in a Nutshell, Cambridge Univ. Press, pp. 1-350.
Parameter Threshold: PFAM hits: Probability value = 1.0E-3 or less. Signal peptide hits: Score = 0 or greater.

Program: ProfileScan
Description: An algorithm that searches for structural and sequence motifs in protein sequences that match sequence patterns defined in Prosite.
Reference: Gribskov, M. et al. (1988) CABIOS 4:61-66; Gribskov, M. et al. (1989) Methods Enzymol. 183:146-159; Bairoch, A. et al. (1997) Nucleic Acids Res. 25:217-221.
Parameter Threshold: Normalized quality score ≥ GCG-specified "HIGH" value for that particular Prosite motif. Generally, score = 1.4-2.1.

Program: Phred
Description: A base-calling algorithm that examines automated sequencer traces with high sensitivity and probability.
Reference: Ewing, B. et al. (1998) Genome Res. 8:175-185; Ewing, B. and P. Green (1998) Genome Res. 8:186-194.

Program: Phrap
Description: A Phils Revised Assembly Program including SWAT and CrossMatch, programs based on efficient implementation of the Smith-Waterman algorithm, useful in searching sequence homology and assembling DNA sequences.
Reference: Smith, T.F. and M.S. Waterman (1981) Adv. Appl. Math. 2:482-489; Smith, T.F. and M.S. Waterman (1981) J. Mol. Biol. 147:195-197; and Green, P., University of Washington, Seattle, WA.
Parameter Threshold: Score = 120 or greater; Match length = 56 or greater

Program: Consed
Description: A graphical tool for viewing and editing Phrap assemblies.
Reference: Gordon, D. et al. (1998) Genome Res. 8:195-202.

Program: SPScan
Description: A weight matrix analysis program that scans protein sequences for the presence of secretory signal peptides.
Reference: Nielson, H. et al. (1997) Protein Engineering 10:1-6; Claverie, J.M. and S. Audic (1997) CABIOS 12:431-439.
Parameter Threshold: Score = 3.5 or greater

Program: TMAP
Description: A program that uses weight matrices to delineate transmembrane segments on protein sequences and determine orientation.
Reference: Persson, B. and P. Argos (1994) J. Mol. Biol. 237:182-192; Persson, B. and P. Argos (1996) Protein Sci. 5:363-371.

Program: TMHMMER
Description: A program that uses a hidden Markov model (HMM) to delineate transmembrane segments on protein sequences and determine orientation.
Reference: Sonnhammer, E.L. et al. (1998) Proc. Sixth Intl. Conf. on Intelligent Systems for Mol. Biol., Glasgow et al., eds., The Am. Assoc. for Artificial Intelligence (AAAI) Press, Menlo Park, CA, and MIT Press, Cambridge, MA, pp. 175-182.

Program: Motifs
Description: A program that searches amino acid sequences for patterns that matched those defined in Prosite.
Reference: Bairoch, A. et al. (1997) Nucleic Acids Res. 25:217-221; Wisconsin Package Program Manual, version 9, page M51-59, Genetics Computer Group, Madison, WI.
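The BLAST cutoffs in Table 8 can be read as a simple filter on the probability value of each hit. The sketch below shows one way to express that filter in Python. The dictionary keys and the function name `passes_blast_threshold` are hypothetical, chosen for illustration; only the two numeric cutoffs come from Table 8.

```python
# Maximum allowed BLAST probability value per sequence type (values from Table 8).
BLAST_MAX_PROBABILITY = {
    "EST": 1.0e-8,           # ESTs: probability value of 1.0E-8 or less
    "full_length": 1.0e-10,  # full length sequences: 1.0E-10 or less
}


def passes_blast_threshold(probability: float, sequence_type: str) -> bool:
    """Return True when a hit's probability value is at or below the cutoff."""
    try:
        cutoff = BLAST_MAX_PROBABILITY[sequence_type]
    except KeyError:
        raise ValueError("unknown sequence type: %r" % sequence_type)
    return probability <= cutoff


if __name__ == "__main__":
    for p, kind in ((2.0e-51, "full_length"), (1.0e-9, "full_length"), (1.0e-9, "EST")):
        print(p, kind, passes_blast_threshold(p, kind))
```

Keeping the cutoffs in one mapping makes it straightforward to add the other Table 8 thresholds (for example the FASTA or HMMER values) under their own keys, if a similar filter is wanted for those programs.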

Claims

What is claimed is:
1. An isolated polynucleotide selected from the group consisting of: a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:1-252, b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO:1-252, c) a polynucleotide complementary to the polynucleotide of a), d) a polynucleotide complementary to the polynucleotide of b), and e) an RNA equivalent of a)-d).
2. An isolated polynucleotide of claim 1, selected from the group consisting of SEQ ID NO:1-252.
3. An isolated polynucleotide comprising at least 30 contiguous nucleotides of a polynucleotide of claim 1.
4. An isolated polynucleotide comprising at least 60 contiguous nucleotides of a polynucleotide of claim 1.
5. A composition for the detection of expression of disease detection and treatment polynucleotides comprising at least one of the polynucleotides of claim 1 and a detectable label.
6. A method for detecting a target polynucleotide in a sample, said target polynucleotide having a sequence of a polynucleotide of claim 1, the method comprising: a) amplifying said target polynucleotide or fragment thereof using polymerase chain reaction amplification, and b) detecting the presence or absence of said amplified target polynucleotide or fragment thereof, and, optionally, if present, the amount thereof.
7. A method for detecting a target polynucleotide in a sample, said target polynucleotide having a sequence of a polynucleotide of claim 1, the method comprising: a) hybridizing the sample with a probe comprising at least 20 contiguous nucleotides comprising a sequence complementary to said target polynucleotide in the sample, and which probe specifically hybridizes to said target polynucleotide, under conditions whereby a hybridization complex is formed between said probe and said target polynucleotide or fragments thereof, and b) detecting the presence or absence of said hybridization complex, and, optionally, if present, the amount thereof.
8. A method of claim 7, wherein the probe comprises at least 30 contiguous nucleotides.
9. A method of claim 7, wherein the probe comprises at least 60 contiguous nucleotides.
10. A recombinant polynucleotide comprising a promoter sequence operably linked to a polynucleotide of claim 1.
11. A cell transformed with a recombinant polynucleotide of claim 10.
12. A transgenic organism comprising a recombinant polynucleotide of claim 10.
13. A method for producing a disease detection and treatment polypeptide encoded by a polynucleotide of claim 1, the method comprising: a) culturing a cell under conditions suitable for expression of the disease detection and treatment polypeptide, wherein said cell is transformed with a recombinant polynucleotide comprising a promoter sequence operably linked to a polynucleotide of claim 1, and b) recovering the disease detection and treatment polypeptide so expressed.
14. A method of claim 13, wherein the polypeptide has an amino acid sequence selected from the group consisting of SEQ ID NO:253-506.
15. An isolated disease detection and treatment polypeptide (MDDT) encoded by at least one of the polynucleotides of claim 2.
16. A method of screening for a test compound that specifically binds to the polypeptide of claim 15, the method comprising: a) combining the polypeptide of claim 15 with at least one test compound under suitable conditions, and b) detecting binding of the polypeptide of claim 15 to the test compound, thereby identifying a compound that specifically binds to the polypeptide of claim 15.
17. A microarray wherein at least one element of the microarray is a polynucleotide of claim
18. A method for generating a transcript image of a sample which contains polynucleotides, the method comprising: a) labeling the polynucleotides of the sample, b) contacting the elements of the microarray of claim 17 with the labeled polynucleotides of the sample under conditions suitable for the formation of a hybridization complex, and c) quantifying the expression of the polynucleotides in the sample.
19. A method for screening a compound for effectiveness in altering expression of a target polynucleotide, wherein said target polynucleotide comprises a polynucleotide sequence of a polynucleotide of claim 1, the method comprising: a) exposing a sample comprising the target polynucleotide to a compound, under conditions suitable for the expression of the target polynucleotide, b) detecting altered expression of the target polynucleotide, and c) comparing the expression of the target polynucleotide in the presence of varying amounts of the compound and in the absence of the compound.
20. A method for assessing toxicity of a test compound, said method comprising: a) treating a biological sample containing nucleic acids with the test compound, b) hybridizing the nucleic acids of the treated biological sample with a probe comprising at least 20 contiguous nucleotides of a polynucleotide of claim 1 under conditions whereby a specific hybridization complex is formed between said probe and a target polynucleotide in the biological sample, said target polynucleotide comprising a polynucleotide sequence of a polynucleotide of claim 1 or fragment thereof, c) quantifying the amount of hybridization complex, and d) comparing the amount of hybridization complex in the treated biological sample with the amount of hybridization complex in an untreated biological sample, wherein a difference in the amount of hybridization complex in the treated biological sample is indicative of toxicity of the test compound.
21. An array comprising different nucleotide molecules affixed in distinct physical locations on a solid substrate, wherein at least one of said nucleotide molecules comprises a first oligonucleotide or polynucleotide sequence specifically hybridizable with at least 30 contiguous nucleotides of a target polynucleotide, and wherein said target polynucleotide is a polynucleotide of claim 1.
22. An array of claim 21, wherein said first oligonucleotide or polynucleotide sequence is completely complementary to at least 30 contiguous nucleotides of said target polynucleotide.
23. An array of claim 21, wherein said first oligonucleotide or polynucleotide sequence is completely complementary to at least 60 contiguous nucleotides of said target polynucleotide.
24. An array of claim 21, wherein said first oligonucleotide or polynucleotide sequence is completely complementary to said target polynucleotide.
25. An array of claim 21, which is a microarray.
26. An array of claim 21, further comprising said target polynucleotide hybridized to a nucleotide molecule comprising said first oligonucleotide or polynucleotide sequence.
27. An array of claim 21, wherein a linker joins at least one of said nucleotide molecules to said solid substrate.
28. An array of claim 21, wherein each distinct physical location on the substrate contains multiple nucleotide molecules, and the multiple nucleotide molecules at any single distinct physical location have the same sequence, and each distinct physical location on the substrate contains nucleotide molecules having a sequence which differs from the sequence of nucleotide molecules at another distinct physical location on the substrate.
29. An isolated polypeptide selected from the group consisting of: a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:253-506, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:253-506, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:253-506, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:253-506.
30. An isolated polypeptide of claim 29, having a sequence selected from the group consisting of SEQ ID NO:253-506.
31. An isolated polynucleotide encoding a polypeptide of claim 29.
32. An isolated polynucleotide encoding a polypeptide of claim 30.
33. An isolated polynucleotide of claim 32, having a sequence selected from the group consisting of SEQ ID NO:1-252.
34. An isolated antibody which specifically binds to a disease detection and treatment polypeptide of claim 29.
35. A diagnostic test for a condition or disease associated with the expression of MDDT in a biological sample, the method comprising: a) combining the biological sample with an antibody of claim 34, under conditions suitable for the antibody to bind the polypeptide and form an antibody:polypeptide complex, and b) detecting the complex, wherein the presence of the complex correlates with the presence of the polypeptide in the biological sample.
36. The antibody of claim 34, wherein the antibody is: a) a chimeric antibody, b) a single chain antibody, c) a Fab fragment, d) a F(ab')2 fragment, or e) a humanized antibody.
37. A composition comprising an antibody of claim 34 and an acceptable excipient.
38. A method of diagnosing a condition or disease associated with the expression of MDDT in a subject, comprising administering to said subject an effective amount of the composition of claim 37.
39. A composition of claim 37, wherein the antibody is labeled.
40. A method of diagnosing a condition or disease associated with the expression of MDDT in a subject, comprising administering to said subject an effective amount of the composition of claim 39.
41. A method of preparing a polyclonal antibody with the specificity of the antibody of claim 34, the method comprising: a) immunizing an animal with a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:253-506, or an immunogenic fragment thereof, under conditions to elicit an antibody response, b) isolating antibodies from said animal, and c) screening the isolated antibodies with the polypeptide, thereby identifying a polyclonal antibody which binds specifically to a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:253-506.
42. An antibody produced by a method of claim 41.
43. A composition comprising the antibody of claim 42 and a suitable carrier.
44. A method of making a monoclonal antibody with the specificity of the antibody of claim 34, the method comprising: a) immunizing an animal with a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:253-506, or an immunogenic fragment thereof, under conditions to elicit an antibody response, b) isolating antibody producing cells from the animal, c) fusing the antibody producing cells with immortalized cells to form monoclonal antibody-producing hybridoma cells, d) culturing the hybridoma cells, and e) isolating from the culture monoclonal antibody which binds specifically to a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:253-506.
45. A monoclonal antibody produced by a method of claim 44.
46. A composition comprising the antibody of claim 45 and a suitable carrier.
47. The antibody of claim 34, wherein the antibody is produced by screening a Fab expression library.
48. The antibody of claim 34, wherein the antibody is produced by screening a recombinant immunoglobulin library.
49. A method of detecting a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:253-506 in a sample, the method comprising: a) incubating the antibody of claim 34 with a sample under conditions to allow specific binding of the antibody and the polypeptide, and b) detecting specific binding, wherein specific binding indicates the presence of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:253-506 in the sample.
50. A method of purifying a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:253-506 from a sample, the method comprising: a) incubating the antibody of claim 34 with a sample under conditions to allow specific binding of the antibody and the polypeptide, and b) separating the antibody from the sample and obtaining the purified polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:253-506.
51. A composition comprising a polypeptide of claim 29 and a pharmaceutically acceptable excipient.
52. A composition of claim 51, wherein the polypeptide has an amino acid sequence of SEQ ID NO:253-506.
53. A method for treating a disease or condition associated with decreased expression of functional MDDT, comprising administering to a patient in need of such treatment the composition of claim 51.
54. A method for screening a compound for effectiveness as an agonist of a polypeptide of claim 29, the method comprising: a) exposing a sample comprising a polypeptide of claim 29 to a compound, and b) detecting agonist activity in the sample.
55. A composition comprising an agonist compound identified by a method of claim 54 and a pharmaceutically acceptable excipient.
56. A method for treating a disease or condition associated with decreased expression of functional MDDT, comprising administering to a patient in need of such treatment a composition of claim 55.
57. A method for screening a compound for effectiveness as an antagonist of a polypeptide of claim 29, the method comprising: a) exposing a sample comprising a polypeptide of claim 29 to a compound, and b) detecting antagonist activity in the sample.
58. A composition comprising an antagonist compound identified by a method of claim 57 and a pharmaceutically acceptable excipient.
59. A method for treating a disease or condition associated with overexpression of functional MDDT, comprising administering to a patient in need of such treatment a composition of claim 58.
60. A method of screening for a compound that modulates the activity of the polypeptide of claim 29, said method comprising: a) combining the polypeptide of claim 29 with at least one test compound under conditions permissive for the activity of the polypeptide of claim 29, b) assessing the activity of the polypeptide of claim 29 in the presence of the test compound, and c) comparing the activity of the polypeptide of claim 29 in the presence of the test compound with the activity of the polypeptide of claim 29 in the absence of the test compound, wherein a change in the activity of the polypeptide of claim 29 in the presence of the test compound is indicative of a compound that modulates the activity of the polypeptide of claim 29.
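Several of the claims above recite probes or array elements that are complementary to at least 20, 30, or 60 contiguous nucleotides of a target polynucleotide. As an illustration only, the following Python sketch computes the longest contiguous stretch of a probe that is exactly complementary to a target and compares it with a minimum length. It assumes DNA written 5' to 3' with the letters A, C, G, and T, and it treats complementarity as exact antiparallel Watson-Crick pairing, which is a simplification: the claims refer to specific hybridization under experimental conditions, not to a string comparison. All function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def longest_complementary_run(probe: str, target: str) -> int:
    """Longest contiguous probe segment exactly complementary to part of target.

    A probe segment pairs with the target exactly when the segment's reverse
    complement occurs in the target, so this is a longest-common-substring
    search between reverse_complement(probe) and target.
    """
    rc = reverse_complement(probe)
    tgt = target.upper()
    best = 0
    prev = [0] * (len(tgt) + 1)
    for i in range(1, len(rc) + 1):
        cur = [0] * (len(tgt) + 1)
        for j in range(1, len(tgt) + 1):
            if rc[i - 1] == tgt[j - 1]:
                cur[j] = prev[j - 1] + 1
                if cur[j] > best:
                    best = cur[j]
        prev = cur
    return best


def meets_min_contiguous(probe: str, target: str, min_contiguous: int = 20) -> bool:
    """True when the exact complementary run is at least min_contiguous long."""
    return longest_complementary_run(probe, target) >= min_contiguous


if __name__ == "__main__":
    target = "ATGGCCAAGTTGACCAGTGCCGTTCCGGTGCTCAC"  # made-up example sequence
    probe = reverse_complement(target[5:27])       # 22-nucleotide probe
    print(longest_complementary_run(probe, target))
    print(meets_min_contiguous(probe, target, 20), meets_min_contiguous(probe, target, 30))
```

The dynamic-programming search runs in time proportional to the product of the two sequence lengths, which is adequate for probe-sized inputs; a suffix-based index would be a more suitable choice for screening many probes against long targets.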
PCT/US2001/027628 2000-09-05 2001-09-05 Molecules for disease detection and treatment WO2002040715A2 (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
CA002420983A CA2420983A1 (en) 2000-09-05 2001-09-05 Molecules for disease detection and treatment
US10/363,829 US20040142331A1 (en) 2001-09-05 2001-09-05 Molecules for disease detection and treatment
AU2001287108A AU2001287108A1 (en) 2000-09-05 2001-09-05 Molecules for disease detection and treatment
EP01966607A EP1343885A2 (en) 2000-09-05 2001-09-05 Molecules for disease detection and treatment

Applications Claiming Priority (46)

Application Number Priority Date Filing Date Title
US22974800P 2000-09-05 2000-09-05
US22974900P 2000-09-05 2000-09-05
US22975100P 2000-09-05 2000-09-05
US22974700P 2000-09-05 2000-09-05
US22975000P 2000-09-05 2000-09-05
US23058300P 2000-09-05 2000-09-05
US60/229,747 2000-09-05
US60/229,750 2000-09-05
US60/229,748 2000-09-05
US60/230,583 2000-09-05
US60/229,749 2000-09-05
US60/229,751 2000-09-05
US23086500P 2000-09-06 2000-09-06
US23098900P 2000-09-06 2000-09-06
US23059900P 2000-09-06 2000-09-06
US23061000P 2000-09-06 2000-09-06
US23051800P 2000-09-06 2000-09-06
US23051500P 2000-09-06 2000-09-06
US23051700P 2000-09-06 2000-09-06
US23059500P 2000-09-06 2000-09-06
US23098800P 2000-09-06 2000-09-06
US23051400P 2000-09-06 2000-09-06
US23050500P 2000-09-06 2000-09-06
US23051900P 2000-09-06 2000-09-06
US23059700P 2000-09-06 2000-09-06
US23059800P 2000-09-06 2000-09-06
US60/230,865 2000-09-06
US60/230,988 2000-09-06
US60/230,505 2000-09-06
US60/230,597 2000-09-06
US60/230,519 2000-09-06
US60/230,517 2000-09-06
US60/230,989 2000-09-06
US60/230,514 2000-09-06
US60/230,610 2000-09-06
US60/230,515 2000-09-06
US60/230,599 2000-09-06
US60/230,595 2000-09-06
US60/230,518 2000-09-06
US60/230,598 2000-09-06
US23116300P 2000-09-07 2000-09-07
US23116700P 2000-09-07 2000-09-07
US23095100P 2000-09-07 2000-09-07
US60/230,951 2000-09-07
US60/231,163 2000-09-07
US60/231,167 2000-09-07

Publications (3)

Publication Number Publication Date
WO2002040715A2 true WO2002040715A2 (en) 2002-05-23
WO2002040715A8 WO2002040715A8 (en) 2002-10-24
WO2002040715A3 WO2002040715A3 (en) 2003-07-17

Family

ID=27586707

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2001/027628 WO2002040715A2 (en) 2000-09-05 2001-09-05 Molecules for disease detection and treatment

Country Status (3)

Country Link
EP (1) EP1343885A2 (en)
CA (1) CA2420983A1 (en)
WO (1) WO2002040715A2 (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2002008400A2 (en) * 2000-07-20 2002-01-31 Millennium Pharmaceuticals, Inc. 25233, a human aminotransferase and uses therefor
WO2002014483A2 (en) * 2000-08-15 2002-02-21 Zymogenetics Inc Human adenosine deaminase
WO2002024895A2 (en) * 2000-09-22 2002-03-28 Incyte Genomics, Inc. Transcription factors and zinc finger proteins
WO2002074960A2 (en) * 2000-11-08 2002-09-26 Millennium Pharmaceuticals, Inc. 38650, 28472, 5495, 65507, 81588 and 14354 methods and compositions of human proteins and uses thereof
WO2002055701A3 (en) * 2000-12-15 2003-06-26 Millennium Pharmaceuticals, Inc. Human sugar transporter proteins, potassium channel proteins, phospholipid transporter proteins and methods of use thereof
US6706513B2 (en) 2000-08-21 2004-03-16 Bristol-Myers Squibb Company Adenosine deaminase homolog
EP1464651A1 (en) * 2003-04-03 2004-10-06 Deutsches Krebsforschungszentrum Stiftung des öffentlichen Rechts Dystrophin-related protein (Drop1), a marker for carcinomas
US6989441B2 (en) 2001-02-15 2006-01-24 Millennium Pharmaceuticals, Inc. 25466, a human transporter family member and uses therefor
US7078205B2 (en) 2000-02-17 2006-07-18 Millennium Pharmaceuticals, Inc. Nucleic acid sequences encoding melanoma associated antigen molecules, aminotransferase molecules, atpase molecules, acyltransferase molecules, pyridoxal-phosphate dependent enzyme molecules and uses therefor

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1997033551A2 (en) * 1996-03-15 1997-09-18 Millennium Pharmaceuticals Compositions and methods for the diagnosis, prevention, and treatment of neoplastic cell growth and proliferation
WO1999067386A2 (en) * 1998-06-23 1999-12-29 Chiron Corporation Differentially expressed genes in pancreatic cancer and displasia
WO2000020869A1 (en) * 1998-10-06 2000-04-13 Georgetown University Detection of pleiotrophin

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1074617A3 (en) * 1999-07-29 2004-04-21 Research Association for Biotechnology Primers for synthesising full-length cDNA and their use

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
DATABASE EMBL [Online] 26 June 2001 (2001-06-26) T. OTA ET AL: "Human cDNA sequence seq ID no.14325" retrieved from EBI, HINXTON, UK Database accession no. AAH15833 XP002219862 -& EP 1 074 617 A (HELIX RESEARCH INSTITUTE) 7 February 2001 (2001-02-07) *
DATABASE EMBL [Online] 26 June 2001 (2001-06-26) T. OTA ET AL: "Human protein sequence ID no.14326" retrieved from EBI, HINXTON, UK Database accession no. AAB94103 XP002219864 -& EP 1 074 617 A (HELIX RESEARCH INSTITUTE) 7 February 2001 (2001-02-07) *
DATABASE EMBL [Online] 29 September 2000 (2000-09-29) T. ISOGAI ET AL: "NEDO human DNA sequencing project. Homo sapiens cDNA FLJ12533 fis, clone NT2RM4000202, weakly similar to ZINC Finger PROTEIN MOK-2" retrieved from EBI, HINXTON, UK Database accession no. AK022595 XP002219863 & UNPUBLISHED, *
EMMANUEL DIAS NETO ET AL: "SHOTGUN SEQUENCING OF THE HUMAN TRANSCRIPTOME WITH ORF EXPRESSED SEQUENCE TAGS" PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF USA, NATIONAL ACADEMY OF SCIENCE. WASHINGTON, US, vol. 97, no. 7, 28 March 2000 (2000-03-28), pages 3491-3496, XP000996193 ISSN: 0027-8424 -& DATABASE EMBL [Online] 15 January 2001 (2001-01-15) DIAS NETO E. ET AL: "RC1-IT025-201100-021-d12 IT0025 Homo sapiens cDNA, mRNA sequence" retrieved from EBI, HINXTON, UK Database accession no. BF770200 XP002219868 *
TOMMERUP N ET AL: "Isolation and fine mapping of 16 novel human zinc finger-encoding cDNAs identify putative candidate genes for developmental and malignant disorders" GENOMICS, ACADEMIC PRESS, SAN DIEGO, US, vol. 27, no. 2, 20 May 1995 (1995-05-20), pages 259-264, XP002117526 ISSN: 0888-7543 *

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7078205B2 (en) 2000-02-17 2006-07-18 Millennium Pharmaceuticals, Inc. Nucleic acid sequences encoding melanoma associated antigen molecules, aminotransferase molecules, atpase molecules, acyltransferase molecules, pyridoxal-phosphate dependent enzyme molecules and uses therefor
US7256010B2 (en) 2000-02-17 2007-08-14 Millennium Pharmaceuticals, Inc. Nucleic acid sequences encoding melanoma associated antigen molecules, aminotransferase molecules, ATPase molecules, acyltransferase molecules, pyridoxal-phosphate dependant enzyme molecules and uses therefor
WO2002008400A3 (en) * 2000-07-20 2003-04-17 Millennium Pharm Inc 25233, a human aminotransferase and uses therefor
WO2002008400A2 (en) * 2000-07-20 2002-01-31 Millennium Pharmaceuticals, Inc. 25233, a human aminotransferase and uses therefor
WO2002014483A2 (en) * 2000-08-15 2002-02-21 Zymogenetics Inc Human adenosine deaminase
WO2002014483A3 (en) * 2000-08-15 2003-04-24 Zymogenetics Inc Human adenosine deaminase
US7169596B2 (en) 2000-08-21 2007-01-30 Bristol-Myers Squibb Company Adenosine deaminase homolog
US6706513B2 (en) 2000-08-21 2004-03-16 Bristol-Myers Squibb Company Adenosine deaminase homolog
WO2002024895A2 (en) * 2000-09-22 2002-03-28 Incyte Genomics, Inc. Transcription factors and zinc finger proteins
WO2002024895A3 (en) * 2000-09-22 2003-04-24 Incyte Genomics Inc Transcription factors and zinc finger proteins
WO2002074960A3 (en) * 2000-11-08 2003-09-12 Millennium Pharm Inc 38650, 28472, 5495, 65507, 81588 and 14354 methods and compositions of human proteins and uses thereof
WO2002074960A2 (en) * 2000-11-08 2002-09-26 Millennium Pharmaceuticals, Inc. 38650, 28472, 5495, 65507, 81588 and 14354 methods and compositions of human proteins and uses thereof
WO2002055701A3 (en) * 2000-12-15 2003-06-26 Millennium Pharmaceuticals, Inc. Human sugar transporter proteins, potassium channel proteins, phospholipid transporter proteins and methods of use thereof
US6989441B2 (en) 2001-02-15 2006-01-24 Millennium Pharmaceuticals, Inc. 25466, a human transporter family member and uses therefor
WO2004087751A1 (en) * 2003-04-03 2004-10-14 Deutsches Krebsforschungszentrum Stiftung des öffentlichen Rechts Dystrophin-related protein (drop1), a marker for carcinomas
EP1464651A1 (en) * 2003-04-03 2004-10-06 Deutsches Krebsforschungszentrum Stiftung des öffentlichen Rechts Dystrophin-related protein (Drop1), a marker for carcinomas

Also Published As

Publication number Publication date
EP1343885A2 (en) 2003-09-17
WO2002040715A8 (en) 2002-10-24
WO2002040715A3 (en) 2003-07-17
CA2420983A1 (en) 2002-05-23

Similar Documents

Publication Publication Date Title
CA2447183A1 (en) Molecules for disease detection and treatment
CA2447212A1 (en) Secretory molecules
WO2002040715A2 (en) Molecules for disease detection and treatment
EP1368375A2 (en) Secretory molecules
US20050095587A1 (en) Molecules for disease detection and treatment
WO2003062379A2 (en) Molecules for disease detection and treatment
WO2001062918A2 (en) Secretory polypeptides and corresponding polynucleotides
US20040058365A1 (en) Molecules for disease detection and treatment
US20040142331A1 (en) Molecules for disease detection and treatment
EP1181357A2 (en) Molecules for disease detection and treatment
WO2002016587A2 (en) Microtubule-associated proteins and tubulins
EP1472285A2 (en) Secretory molecules
EP1200571A1 (en) Secretory molecules
EP1222258A2 (en) Molecules for disease detection and treatment
EP1220907A2 (en) Human secretory molecules
EP1325128A2 (en) Lipocalins
CA2402747A1 (en) G-protein associated molecules
CA2430906A1 (en) Proteins associated with cell growth, differentiation, and death
US20030208040A1 (en) G-protein associated molecules
US20040023251A1 (en) Cell cycle proteins and mitosis-associated molecules
WO2002092759A9 (en) Molecules for disease detection and treatment
WO2002012339A2 (en) Sequences for integrin alpha-8
WO2002077235A2 (en) Intracellular signaling molecules
CA2415077A1 (en) Cell cycle proteins and mitosis-associated molecules
EP1390396A2 (en) Molecules for disease detection and treatment

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

AK Designated states

Kind code of ref document: C1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: C1

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

CFP Corrected version of a pamphlet front page
CR1 Correction of entry in section i

Free format text: PAT. BUL. 21/2002 UNDER (22) REPLACE "20010906" BY "20010905" AND UNDER (30) ADD "60/229751, 20000905, US 60/229749, 20000905, US 60/229749, 20000905, US 60/229750, 20000905, US 60/229747, 20000905, US 60/229748, 20000905, US 60/230583, 20000905, US"

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
WWE Wipo information: entry into national phase

Ref document number: 2420983

Country of ref document: CA

WWE Wipo information: entry into national phase

Ref document number: 2001966607

Country of ref document: EP

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

WWE Wipo information: entry into national phase

Ref document number: 10363829

Country of ref document: US

WWP Wipo information: published in national office

Ref document number: 2001966607

Country of ref document: EP

WWW Wipo information: withdrawn in national office

Ref document number: 2001966607

Country of ref document: EP

NENP Non-entry into the national phase in:

Ref country code: JP