Berkeley Structural Genomics Center
   
BSGC PUBLICATIONS  
   

This page lists abstracts of BSGC publications, where available. Entries are grouped by year and listed alphabetically within each year.

2007:

  • Chandonia JM. 2007. StrBioLib: a Java library for development of custom computational structural biology applications. Bioinformatics [Preprint PDF]

    SUMMARY: StrBioLib is a library of Java classes useful for developing software for computational structural biology research. StrBioLib contains classes to represent and manipulate protein structures, biopolymer sequences, sets of biopolymer sequences, and alignments between biopolymers based on either sequence or structure. Interfaces are provided to interact with commonly used bioinformatics applications, including (PSI)-BLAST, MODELLER, MUSCLE, and Primer3, and tools are provided to read and write many file formats used to represent bioinformatic data. The library includes a general-purpose neural network object with multiple training algorithms, the Hooke and Jeeves nonlinear optimization algorithm, and tools for efficient C-style string parsing and formatting. StrBioLib is the basis for the Pred2ary secondary structure prediction program, is used to build the ASTRAL compendium for sequence and structure analysis, and has been extensively tested through use in many smaller projects. Examples and documentation are available at the site below. AVAILABILITY: StrBioLib may be obtained under the terms of the GNU LGPL license from http://strbio.sourceforge.net/

  • Das D, Xu QS, Lee JY, Ankoudinova I, Huang C, ... Kim R, Kim SH. 2007. Crystal structure of the multidrug efflux transporter AcrB at 3.1 A resolution reveals the N-terminal region with conserved amino acids. J Struct Biol 158:494-502.

    Crystal structures of the bacterial multidrug transporter AcrB in R32 and C2 space groups showing both symmetric and asymmetric trimeric assemblies, respectively, supplemented with biochemical investigations, have provided most of the structural basis for a molecular level understanding of the protein structure and mechanisms for substrate uptake and translocation carried out by this 114-kDa inner membrane protein. They suggest that AcrB captures ligands primarily from the periplasm. Substrates can also enter the inner cavity of the transporter from the cytoplasm, but the exact mechanism of this remains undefined. Analysis of the amino acid sequences of AcrB and its homologs revealed the presence of conserved residues at the N-terminus including two phenylalanines which may be exposed to the cytoplasm. Any potential role that these conserved residues may play in function has not been addressed by existing biochemical or structural studies. Since phenylalanine residues elsewhere in the protein have been implicated in ligand binding, we explored the structure of this N-terminal region to investigate structural determinants near the cytoplasmic opening that may mediate drug uptake. Our structure of AcrB in R32 space group reveals an N-terminus loop, reducing the diameter of the central opening to approximately 15 A as opposed to the previously reported value of approximately 30 A for crystal structures in this space group with disordered N-terminus. Recent structures of the AcrB in C2 space group have revealed a helical conformation of this N-terminus but have not discussed its possible implications. We present the crystal structure of AcrB that reveals the structure of the N-terminus containing the conserved residues. We hope that the structural information provides a structural basis for others to design further biochemical investigation of the role of this portion of AcrB in mediating cytoplasmic ligand discrimination and uptake.

  • Lowery TJ, Pelton JG, Chandonia JM, Kim R, Yokota H, Wemmer DE. 2007. NMR structure of the N-terminal domain of the replication initiator protein DnaA. J Struct Funct Genomics

    DnaA is an essential component in the initiation of bacterial chromosomal replication. DnaA binds to a series of 9 base pair repeats leading to oligomerization, recruitment of the DnaBC helicase, and the assembly of the replication fork machinery. The structure of the N-terminal domain (residues 1-100) of DnaA from Mycoplasma genitalium was determined by NMR spectroscopy. The backbone r.m.s.d. for the first 86 residues was 0.6 +/- 0.2 A based on 742 NOE, 50 hydrogen bond, 46 backbone angle, and 88 residual dipolar coupling restraints. Ultracentrifugation studies revealed that the domain is monomeric in solution. Features on the protein surface include a hydrophobic cleft flanked by several negative residues on one side, and positive residues on the other. A negatively charged ridge is present on the opposite face of the protein. These surfaces may be important sites of interaction with other proteins involved in the replication process. Together, the structure and NMR assignments should facilitate the design of new experiments to probe the protein-protein interactions essential for the initiation of DNA replication.

  • Oganesyan V, Adams PD, Jancarik J, Kim R, Kim SH. 2007. Structure of O67745_AQUAE, a hypothetical protein from Aquifex aeolicus. Acta Crystallogr Sect F Struct Biol Cryst Commun 63:369-74.

    Using single-wavelength anomalous dispersion data obtained from a gold-derivatized crystal, the X-ray crystal structure of the protein O67745_AQUAE from the prokaryotic organism Aquifex aeolicus has been determined to a resolution of 2.0 A. Amino-acid residues 1-371 of the 44 kDa protein were identified by Pfam as an HD domain and a member of the metal-dependent phosphohydrolase superfamily (accession No. PF01966). Although three families from this large and diverse group of enzymatic proteins are represented in the PDB, the structure of O67745_AQUAE reveals a unique fold that is unlike the others and that is likely to represent a new subfamily, further organizing the families and characterizing the proteins. Data are presented that provide the first insights into the structural organization of the proteins within this clan and a distal alternative GDP-binding domain outside the metal-binding active site is proposed.

  • Shin DH, Hou J, Chandonia JM, Das D, Choi IG, Kim R, Kim SH. 2007. Structure-based inference of molecular functions of proteins of unknown function from Berkeley Structural Genomics Center. J Struct Funct Genomics

    Advances in sequence genomics have resulted in an accumulation of a huge number of protein sequences derived from genome sequences. However, the functions of a large portion of them cannot be inferred based on the current methods of sequence homology detection to proteins of known functions. Three-dimensional structure can have an important impact in providing inference of molecular function (physical and chemical function) of a protein of unknown function. Structural genomics centers worldwide have been determining many 3-D structures of the proteins of unknown functions, and possible molecular functions of them have been inferred based on their structures. Combined with bioinformatics and enzymatic assay tools, the successful acceleration of the process of protein structure determination through high throughput pipelines enables the rapid functional annotation of a large fraction of hypothetical proteins. We present a brief summary of the process we used at the Berkeley Structural Genomics Center to infer molecular functions of proteins of unknown function.

  • Shin DH, Proudfoot M, Lim HJ, Choi IK, Yokota H, ... Kim R, Kim SH. 2007. Structural and enzymatic characterization of DR1281: A calcineurin-like phosphoesterase from Deinococcus radiodurans. Proteins

    We have determined the crystal structure of DR1281 from Deinococcus radiodurans. DR1281 is a protein of unknown function with over 170 homologs found in prokaryotes and eukaryotes. To elucidate the molecular function of DR1281, its crystal structure at 2.3 A resolution was determined and a series of biochemical screens for catalytic activity was performed. The crystal structure shows that DR1281 has two domains, a small alpha domain and a putative catalytic domain formed by a four-layered structure of two beta-sheets flanked by five alpha-helices on both sides. The small alpha domain interacts with other molecules in the asymmetric unit and contributes to the formation of oligomers. The structural comparison of the putative catalytic domain with known structures suggested its biochemical function to be a phosphatase, phosphodiesterase, nuclease, or nucleotidase. Structural analyses with its homologues also indicated that there is a dinuclear center at the interface of two domains formed by Asp8, Glu37, Asn38, Asn65, His148, His173, and His175. An absolute requirement of metal ions for activity has been proved by enzymatic assay with various divalent metal ions. A panel of general enzymatic assays of DR1281 revealed metal-dependent catalytic activity toward model substrates for phosphatases (p-nitrophenyl phosphate) and phosphodiesterases (bis-p-nitrophenyl phosphate). Subsequent secondary enzymatic screens with natural substrates demonstrated significant phosphatase activity toward phosphoenolpyruvate and phosphodiesterase activity toward 2',3'-cAMP. Thus, our structural and enzymatic studies have identified the biochemical function of DR1281 as a novel phosphatase/phosphodiesterase and disclosed key conserved residues involved in metal binding and catalytic activity. Proteins 2007. (c) 2007 Wiley-Liss, Inc.

2006:

  • Chandonia JM, Brenner SE. 2006. The impact of structural genomics: expectations and outcomes. Science 311:347-51. [PDF]|[Supplementary Info]

    Structural genomics (SG) projects aim to expand our structural knowledge of biological macromolecules while lowering the average costs of structure determination. We quantitatively analyzed the novelty, cost, and impact of structures solved by SG centers, and we contrast these results with traditional structural biology. The first structure identified in a protein family enables inference of the fold and of ancient relationships to other proteins; in the year ending 31 January 2005, about half of such structures were solved at a SG center rather than in a traditional laboratory. Furthermore, the cost of solving a structure at the most efficient SG center in the United States has dropped to one-quarter of the estimated cost of solving a structure by traditional methods. However, the efficiency of the top structural biology laboratories-even though they work on very challenging structures-is comparable to that of SG centers; moreover, traditional structural biology papers are cited significantly more often, suggesting greater current impact.

  • Chandonia JM, Kim SH, Brenner SE. 2006. Target selection and deselection at the Berkeley Structural Genomics Center. Proteins 62:356-370. [PDF]|[Supplementary Info]

    At the Berkeley Structural Genomics Center (BSGC), our goal is to obtain a near-complete structural complement of proteins in the minimal organisms Mycoplasma genitalium and M. pneumoniae, two closely related pathogens. Current targets for structure determination have been selected in six major stages, starting with those predicted to be most tractable to high throughput study and likely to yield new structural information. We report on the process used to select these proteins, as well as our target deselection procedure. Target deselection reduces experimental effort by eliminating targets similar to those recently solved by the structural biology community or other centers. We measure the impact of the 69 structures solved at the BSGC as of July 2004 on structure prediction coverage of the M. pneumoniae and M. genitalium proteomes. The number of Mycoplasma proteins for which the fold could first be reliably assigned based on structures solved at the BSGC (24 M. pneumoniae and 21 M. genitalium) is approximately 25% of the total resulting from work at all structural genomics centers and the worldwide structural biology community (94 M. pneumoniae and 86 M. genitalium) during the same period. As the number of structures contributed by the BSGC during that period is less than 1% of the total worldwide output, the benefits of a focused target selection strategy are apparent. If the structures of all current targets were solved, the percentage of M. pneumoniae proteins for which folds could be reliably assigned would increase from approximately 57% (391 of 687) at present to around 80% (550 of 687), and the percentage of the proteome that could be accurately modeled would increase from around 37% (254 of 687) to about 64% (438 of 687). In M. genitalium, the percentage of the proteome that could be structurally annotated based on structures of our remaining targets would rise from 72% (348 of 486) to around 76% (371 of 486), and the percentage of accurately modeled proteins would rise from 50% (243 of 486) to 58% (283 of 486). Sequences and data on experimental progress on our targets are available in the public databases TargetDB and PEPCdb. Proteins 2006. (c) 2005 Wiley-Liss, Inc.
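
    The coverage percentages quoted in this abstract follow directly from the protein counts given in parentheses. The short Python sketch below recomputes them as a sanity check. It is not part of the publication; the function name, dictionary labels, and rounding to whole percent are assumptions made for illustration.

```python
# Illustrative check of the coverage figures quoted in the abstract.
# The counts come from the abstract; the names and the whole-percent
# rounding are assumptions made for this sketch.

def percent(part: int, total: int) -> int:
    """Return part/total as a percentage rounded to the nearest integer."""
    return round(100 * part / total)

# (current count, projected count) out of the proteome size
coverage = {
    "M. pneumoniae, fold assigned": ((391, 687), (550, 687)),
    "M. pneumoniae, accurately modeled": ((254, 687), (438, 687)),
    "M. genitalium, structurally annotated": ((348, 486), (371, 486)),
    "M. genitalium, accurately modeled": ((243, 486), (283, 486)),
}

for label, (current, projected) in coverage.items():
    print(f"{label}: {percent(*current)}% -> {percent(*projected)}%")
```

    Each recomputed value agrees with the rounded percentage stated in the abstract.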

  • Chandonia JM, Kim SH. 2006. Structural proteomics of minimal organisms: Conservation of protein fold usage and evolutionary implications. BMC Struct Biol 6:7. [PDF]

    ABSTRACT: BACKGROUND: Determining the complete repertoire of protein structures for all soluble, globular proteins in a single organism has been one of the major goals of several structural genomics projects in recent years. RESULTS: We report that this goal has nearly been reached for several "minimal organisms"--parasites or symbionts with reduced genomes--for which over 95% of the soluble, globular proteins may now be assigned folds, overall 3-D backbone structures. We analyze the structures of these proteins as they relate to cellular functions, and compare conservation of fold usage between functional categories. We also compare patterns in the conservation of folds among minimal organisms and those observed between minimal organisms and other bacteria. CONCLUSION: We find that proteins performing essential cellular functions closely related to transcription and translation exhibit a higher degree of conservation in fold usage than proteins in other functional categories. Folds related to transcription and translation functional categories were also overrepresented in minimal organisms compared to other bacteria.

  • Kim JS, Shin DH, Pufan R, Huang C, Yokota H, Kim R, Kim SH. 2006. Crystal structure of ScpB from Chlorobium tepidum, a protein involved in chromosome partitioning. Proteins 62:322-8.

    Structural maintenance of chromosome (SMC) proteins are essential in chromosome condensation and interact with non-SMC proteins in eukaryotes and with segregation and condensation proteins (ScpA and ScpB) in prokaryotes. The highly conserved gene in Chlorobium tepidum gi 21646405 encodes ScpB (ScpB_ChTe). The high resolution crystal structure of ScpB_ChTe shows that the monomeric structure consists of two similarly shaped globular domains composed of three helices sided by beta-strands [a winged helix-turn-helix (HTH)], a motif observed in the C-terminal domain of Scc1, a functionally related eukaryotic ScpA homolog, as well as in many DNA binding proteins.

  • Shin DH, Kim JS, Yokota H, Kim R, Kim SH. 2006. Crystal structure of the DUF16 domain of MPN010 from Mycoplasma pneumoniae. Protein Sci 15:921-8.

    We have determined the crystal structure of the DUF16 domain of unknown function encoded by the gene MPN010 of Mycoplasma pneumoniae at 1.8 A resolution. The crystal structure revealed that this domain is composed of two separated homotrimeric coiled-coils. The shorter one consists of 11 highly conserved residues. The sequence comprises noncanonical heptad repeats that induce a right-handed coiled-coil structure. The longer one is composed of approximately nine heptad repeats. In this coiled-coil structure, there are three distinguishable regions that confer unique structural properties compared with other known homotrimeric coiled-coils. The first part, containing one stutter, is an unusual phenylalanine-rich region that is not found in any other coiled-coil structures. The second part is a highly conserved glutamine-rich region, frequently found in other trimeric coiled-coil structures. The last part is composed of prototype heptad repeats. The phylogenetic analysis of the DUF16 family together with a secondary structure prediction shows that the DUF16 family can be classified into five subclasses according to N-terminal sequences. Based on the structural comparison with other coiled-coil structures, a probable molecular function of the DUF16 family is discussed.

  • Sims GE, Kim SH. 2006. A method for evaluating the structural quality of protein models by using higher-order phi-psi pairs scoring. Proc Natl Acad Sci U S A 103:4428-32.

    A method is presented for scoring the model quality of experimental and theoretical protein structures. The structural model to be evaluated is dissected into small fragments via a sliding window, where each fragment is represented by a vector of multiple phi-psi angles. The sliding window ranges in size from a length of 1-10 phi-psi pairs (3-12 residues). In this method, the conformation of each fragment is scored based on the fit of multiple phi-psi angles of the fragment to a database of multiple phi-psi angles from high-resolution x-ray crystal structures. We show that measuring the fit of predicted structural models to the allowed conformational space of longer fragments is a significant discriminator for model quality. Reasonable models have higher-order phi-psi score fit values (m) > -1.00.
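
    The fragment-dissection step described in this abstract can be sketched in a few lines of Python. This is only an illustration of the sliding window, assuming the backbone is already available as a list of (phi, psi) pairs in degrees; the function names are hypothetical, and the scoring against a database of high-resolution structures is not reproduced because the abstract does not specify it.

```python
from typing import Iterator, List, Sequence, Tuple

PhiPsi = Tuple[float, float]  # one (phi, psi) pair in degrees

def phi_psi_fragments(
    angles: Sequence[PhiPsi], window: int
) -> Iterator[Tuple[int, List[float]]]:
    """Yield (start index, flattened angle vector) for each window of
    `window` consecutive phi-psi pairs. Hypothetical helper for illustration."""
    if window < 1:
        raise ValueError("window must be at least 1")
    for start in range(len(angles) - window + 1):
        # Flatten [(phi1, psi1), (phi2, psi2), ...] into one vector.
        vector = [a for pair in angles[start:start + window] for a in pair]
        yield start, vector

def residues_spanned(window: int) -> int:
    """Residues covered by a window, inferred from the abstract's ranges
    (1 pair -> 3 residues, 10 pairs -> 12 residues)."""
    return window + 2

# Made-up example angles, not taken from any real structure.
example = [(-60.0, -45.0), (-62.0, -41.0), (-120.0, 130.0),
           (-115.0, 125.0), (-70.0, 150.0)]
fragments = list(phi_psi_fragments(example, window=3))
```

    With five phi-psi pairs and a window of three, this yields three overlapping fragments, each a six-element vector. A full implementation would repeat this for window sizes 1 through 10 and score each vector against the reference database.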

  • Smith A, Chandonia JM, Brenner SE. 2006. ANDY: a general, fault-tolerant tool for database searching on computer clusters. Bioinformatics [PDF]|[Supplementary Info]

    SUMMARY: ANDY (seArch coordination aND analYsis) is a set of Perl programs and modules for distributing large biological database searches, and in general any sequence of commands, across the nodes of a Linux computer cluster. ANDY is compatible with several commonly used Distributed Resource Management (DRM) systems, and it can be easily extended to new DRMs. A distinctive feature of ANDY is the choice of either dedicated or fair-use operation: ANDY is almost as efficient as single-purpose tools that require a dedicated cluster, but it runs on a general-purpose cluster along with any other jobs scheduled by a DRM. Other features include communication through named pipes for performance, flexible customizable routines for error-checking and summarizing results, and multiple fault-tolerance mechanisms. AVAILABILITY: ANDY is freely available and may be obtained from http://compbio.berkeley.edu/proj/andy; this site also contains supplemental data and figures and a more detailed overview of the software.

2005:

  • Chandonia JM, Brenner SE. 2005. Implications of structural genomics target selection strategies: Pfam5000, whole genome, and random approaches. Proteins 58:166-79. [PDF]

    Structural genomics is an international effort to determine the three-dimensional shapes of all important biological macromolecules, with a primary focus on proteins. Target proteins should be selected according to a strategy that is medically and biologically relevant, of good value, and tractable. As an option to consider, we present the "Pfam5000" strategy, which involves selecting the 5000 most important families from the Pfam database as sources for targets. We compare the Pfam5000 strategy to several other proposed strategies that would require similar numbers of targets. These strategies include complete solution of several small to moderately sized bacterial proteomes, partial coverage of the human proteome, and random selection of approximately 5000 targets from sequenced genomes. We measure the impact that successful implementation of these strategies would have upon structural interpretation of the proteins in Swiss-Prot, TrEMBL, and 131 complete proteomes (including 10 of eukaryotes) from the Proteome Analysis database at the European Bioinformatics Institute (EBI). Solving the structures of proteins from the 5000 largest Pfam families would allow accurate fold assignment for approximately 68% of all prokaryotic proteins (covering 59% of residues) and 61% of eukaryotic proteins (40% of residues). More fine-grained coverage that would allow accurate modeling of these proteins would require an order of magnitude more targets. The Pfam5000 strategy may be modified in several ways, for example, to focus on larger families, bacterial sequences, or eukaryotic sequences; as long as secondary consideration is given to large families within Pfam, coverage results vary only slightly. In contrast, focusing structural genomics on a single tractable genome would have only a limited impact in structural knowledge of other proteomes: A significant fraction (about 30-40% of the proteins and 40-60% of the residues) of each proteome is classified in small families, which may have little overlap with other species of interest. Random selection of targets from one or more genomes is similar to the Pfam5000 strategy in that proteins from larger families are more likely to be chosen, but substantial effort would be spent on small families.

  • Chandonia JM, Brenner SE. 2005. Update on the Pfam5000 Strategy for Selection of Structural Genomics Targets. Proceedings of the 2005 IEEE Engineering in Medicine and Biology 27th Annual Conference, Shanghai, China [PDF]

    Structural Genomics is an international effort to determine the three-dimensional shapes of all important biological macromolecules, with a primary focus on proteins. Target proteins should be selected according to a strategy that is medically and biologically relevant, of good financial value, and tractable. In 2003, we presented the "Pfam5000" strategy, which involves selecting the 5,000 most important families from the Pfam database as sources for targets. In this update, we show that although both the Pfam database and the number of sequenced genomes have increased in size, the expected benefits of the Pfam5000 strategy have not changed substantially. Solving the structures of proteins from the 5,000 largest Pfam families would allow accurate fold assignment for approximately 65% of all prokaryotic proteins (covering 54% of residues) and 63% of eukaryotic proteins (42% of residues). Fewer than 2,300 of the largest families on this list remain to be solved, making the project feasible in the next five years given the expected throughput to be achieved in the production phase of the Protein Structure Initiative.

  • Chen S, Yakunin AF, Proudfoot M, Kim R, Kim SH. 2005. Structural and functional characterization of a 5,10-methenyltetrahydrofolate synthetase from Mycoplasma pneumoniae (GI: 13508087). Proteins 61:433-43.

    Mycoplasma pneumoniae 5,10-methenyltetrahydrofolate synthetase [MTHFS; also known as 5-formyltetrahydrofolate cycloligase; Enzyme Commission (EC) 6.3.3.2] belongs to a large cycloligase protein family with 97 sequence homologues from bacteria to human. To help define the molecular (biochemical and biophysical) function of the M. pneumoniae MTHFS, we have previously determined its crystal structure at 2.2 A resolution (Chen et al., Proteins 2004;56:839-843). In this current study, activity assays confirmed the functionality of the recombinant protein, with K(m) = 165 microM for 5-formyltetrahydrofolate (5-FTHF) and K(m) = 166 microM for MgATP. The methenyltetrahydrofolate activity of M. pneumoniae MTHFS has a requirement for divalent metal ions with Mg2+ being most effective, and an absolute requirement for nucleoside 5'-triphosphates with adenosine triphosphate (ATP) being most effective. Crystallization in the presence of substrates (MgATP, with or without 5-FTHF) produced the complex structures of the protein with adenosine diphosphate (ADP) and phosphate at 2.2 A resolution; with ADP, phosphate, and 5-FTHF at 2.5 A resolution. These structures directly demonstrated that the role of Mg2+ in the reaction is to form the ATP--Mg2+-enzyme complex.

  • Kim SH, Shin DH, Liu J, Oganesyan V, Chen S, ... Adams PD, Kim R. 2005. Structural genomics of minimal organisms and protein fold space. J Struct Funct Genomics 6:63-70.

    The initial aim of the Berkeley Structural Genomics Center is to obtain a near-complete structural complement of two minimal organisms, closely related pathogens Mycoplasma genitalium and M. pneumoniae. The former has fewer than 500 genes and the latter fewer than 700 genes. To achieve this goal, the current protein targets have been selected starting with those predicted to be most tractable and likely to yield new structural and functional information. During the past 3 years, a semi-automated structural genomics pipeline has been set up, running from cloning, expression, and purification through to structure determination. The results from the pipeline substantially increased the coverage of the protein fold space of M. pneumoniae and M. genitalium. Furthermore, about 1/2 of the structures of 'unique' protein sequences revealed novel folds, and over 2/3 of the structures of previously annotated 'hypothetical proteins' allowed their molecular functions to be inferred.

  • Kim JS, DeGiovanni A, Jancarik J, Adams PD, Yokota H, Kim R, Kim SH. 2005. Crystal structure of DNA sequence specificity subunit of a type I restriction-modification enzyme and its functional implications. Proc Natl Acad Sci U S A 102:3248-53.

    Type I restriction-modification enzymes are differentiated from type II and type III enzymes by their recognition of two specific dsDNA sequences separated by a given spacer and cleaving DNA randomly away from the recognition sites. They are oligomeric proteins formed by three subunits: a specificity subunit, a methylation subunit, and a restriction subunit. We solved the crystal structure of a specificity subunit from Methanococcus jannaschii at 2.4-A resolution. Two highly conserved regions (CRs) in the middle and at the C terminus form a coiled-coil of long antiparallel alpha-helices. Two target recognition domains form globular structures with almost identical topologies and two separate DNA binding clefts with a modeled DNA helix axis positioned across the CR helices. The structure suggests that the coiled-coil CRs act as a molecular ruler for the separation between two recognized DNA sequences. Furthermore, the relative orientation of the two DNA binding clefts suggests kinking of bound dsDNA and exposing of target adenines from the recognized DNA sequences.

  • Liu J, Huang C, Shin DH, Yokota H, Jancarik J, ... Kim R, Kim SH. 2005. Crystal structure of a heat-inducible transcriptional repressor HrcA from Thermotoga maritima: structural insight into DNA binding and dimerization. J Mol Biol 350:987-96.

    All cells have a defense mechanism against a sudden heat-shock stress. Commonly, they express a set of proteins that protect cellular proteins from being denatured by heat. Among them, GroE and DnaK chaperones are representative defending systems, and their transcription is regulated by a heat-shock repressor protein HrcA. HrcA repressor controls the transcription of groE and dnaK operons by binding the palindromic CIRCE element, presumably as a dimer, and the activity of HrcA repressor is modulated by GroE chaperones. Here, we report the first crystal structure of a heat-inducible transcriptional repressor, HrcA, from Thermotoga maritima at 2.2 A resolution. The Tm_HrcA protein crystallizes as a dimer. The monomer is composed of three domains: an N-terminal winged helix-turn-helix domain (WH), a GAF-like domain, and an inserted dimerizing domain (IDD). The IDD shows a unique structural fold with an anti-parallel beta-sheet composed of three beta-strands sided by four alpha-helices. The Tm_HrcA dimer structure is formed through hydrophobic contact between the IDDs and a limited contact that involves conserved residues between the GAF-like domains. In the overall dimer structure, the two WH domains are exposed, but the conformation of these two domains seems to be incompatible with DNA binding. We suggest that our structure may represent an inactive form of the HrcA repressor. The structural implications of how the inactive form of HrcA may be converted to the active form by GroEL binding to a conserved C-terminal sequence region of HrcA are discussed.

  • Liu J, Lou Y, Yokota H, Adams PD, Kim R, Kim SH. 2005. Crystal structure of a PhoU protein homologue: a new class of metalloprotein containing multinuclear iron clusters. J Biol Chem 280:15960-6.

    PhoU proteins are known to play a role in the regulation of phosphate uptake. In Thermotoga maritima, two PhoU homologues have been identified bioinformatically. Here we report the crystal structure of one of the PhoU homologues at 2.0 A resolution. The structure of the PhoU protein homologue contains a highly symmetric new structural fold composed of two repeats of a three-helix bundle. The structure unexpectedly revealed a trinuclear and a tetranuclear iron cluster that were found to be bound on the surface. Each of the two multinuclear iron clusters is coordinated by a conserved E(D)XXXD motif pair. Our structure reveals a new class of metalloprotein containing multinuclear iron clusters. Possible functional implications based on the structure are discussed.

  • Liu J, Lou Y, Yokota H, Adams PD, Kim R, Kim SH. 2005. Crystal structures of an NAD kinase from Archaeoglobus fulgidus in complex with ATP, NAD, or NADP. J Mol Biol 354:289-303.

    NAD kinase is a ubiquitous enzyme that catalyzes the phosphorylation of NAD to NADP using ATP or inorganic polyphosphate (poly(P)) as phosphate donor, and is regarded as the only enzyme responsible for the synthesis of NADP. We present here the crystal structures of an NAD kinase from the archaeal organism Archaeoglobus fulgidus in complex with its phosphate donor ATP at 1.7 A resolution, with its substrate NAD at 3.05 A resolution, and with the product NADP in two different crystal forms at 2.45 A and 2.0 A resolution, respectively. In the ATP bound structure, the AMP portion of the ATP molecule is found to use the same binding site as the nicotinamide ribose portion of NAD/NADP in the NAD/NADP bound structures. A magnesium ion is found to be coordinated to the phosphate tail of ATP as well as to a pyrophosphate group. The conserved GGDG loop forms hydrogen bonds with the pyrophosphate group in the ATP-bound structure and the 2' phosphate group of the NADP in the NADP-bound structures. A possible phosphate transfer mechanism is proposed on the basis of the structures presented.

  • Oganesyan N, Kim SH, Kim R. 2005. On-column protein refolding for crystallization. J Struct Funct Genomics 6:177-82.

    One major bottleneck in protein production in Escherichia coli for structural genomics projects is the formation of insoluble protein aggregates (inclusion bodies). The efficient refolding of proteins from inclusion bodies is becoming an important tool that can provide soluble native proteins for structural and functional studies. Here we report an on-column refolding method established at the Berkeley Structural Genomics Center (BSGC). Our method combines a previously proposed 'artificial chaperone-assisted refolding' method with affinity chromatography to take advantage of a chromatographic step: it is less time-consuming, requires no filtration or concentration, and has the additional benefit of protein purification. It can be easily automated and formatted for high-throughput processing.

  • Oganesyan V, Huang C, Adams PD, Jancarik J, Yokota HA, Kim R, Kim SH. 2005. Structure of a NAD kinase from Thermotoga maritima at 2.3 A resolution. Acta Crystallograph Sect F Struct Biol Cryst Commun 61:640-6.

    NAD kinase is the only known enzyme that catalyzes the formation of NADP, a coenzyme involved in most anabolic reactions and in the antioxidant defense system. Despite its importance, very little is known regarding the mechanism of catalysis and only recently have several NAD kinase structures been deposited in the PDB. Here, an independent investigation of the crystal structure of inorganic polyphosphate/ATP-NAD kinase, PPNK_THEMA, a protein from Thermotoga maritima, is reported at a resolution of 2.3 A. The crystal structure was solved using single-wavelength anomalous diffraction (SAD) data collected at the Se absorption-peak wavelength in a state in which no cofactors or substrates were bound. It revealed that the 258-amino-acid protein is folded into two distinct domains, similar to recently reported NAD kinases. The N-terminal alpha/beta-domain spans the first 100 amino acids and the last 30 amino acids of the polypeptide and has several topological matches in the PDB, whereas the other domain, which spans the middle 130 residues, adopts a unique beta-sandwich architecture and only appreciably matches the recently deposited PDB structures of NAD kinases.

  • Oganesyan V, Oganesyan N, Adams PD, Jancarik J, Yokota HA, Kim R, Kim SH. 2005. Crystal structure of the "PhoU-like" phosphate uptake regulator from Aquifex aeolicus. J Bacteriol 187:4238-44.

    The phoU gene of Aquifex aeolicus encodes a protein called PHOU_AQUAE with sequence similarity to the PhoU protein of Escherichia coli. Despite the fact that there is a large number of family members (more than 300) attributed to almost all known bacteria and despite PHOU_AQUAE's association with the regulation of genes for phosphate metabolism, the nature of its regulatory function is not well understood. Nearly one-half of these PhoU-like proteins, including both PHOU_AQUAE and the one from E. coli, form a subfamily with an apparent dimer structure of two PhoU domains on the basis of their amino acid sequence. The crystal structure of PHOU_AQUAE (a 221-amino-acid protein) reveals two similar coiled-coil PhoU domains, each forming a three-helix bundle. The structures of PHOU_AQUAE proteins from both a soluble fraction and refolded inclusion bodies (at resolutions of 2.8 and 3.2A, respectively) showed no significant differences. The folds of the PhoU domain and Bag domains (for a class of cofactors of the eukaryotic chaperone Hsp70 family) are similar. Accordingly, we propose that gene regulation by PhoU may occur by association of PHOU_AQUAE with the ATPase domain of the histidine kinase PhoR, promoting release of its substrate PhoB. Other proteins that share the PhoU domain fold include the coiled-coil domains of the STAT protein, the ribosome-recycling factor, and structural proteins like spectrin.

  • Pajon A, Ionides J, Diprose J, Fillon J, Fogh R, ... Stuart DI, Henrick K. 2005. Design of a data model for developing laboratory information management and analysis systems for protein production. Proteins 58:278-84.

    Data management has emerged as one of the central issues in the high-throughput processes of taking a protein target sequence through to a protein sample. To simplify this task, and following extensive consultation with the international structural genomics community, we describe here a model of the data related to protein production. The model is suitable for both large and small facilities for use in tracking samples, experiments, and results through the many procedures involved. The model is described in Unified Modeling Language (UML). In addition, we present relational database schemas derived from the UML. These relational schemas are already in use in a number of data management projects.

  • Schulze-Gahmen U, Aono S, Chen S, Yokota H, Kim R, Kim SH. 2005. Structure of the hypothetical Mycoplasma protein MPN555 suggests a chaperone function. Acta Crystallogr D Biol Crystallogr 61:1343-7.

    The crystal structure of the hypothetical protein MPN555 from Mycoplasma pneumoniae (gi|1673958) has been determined to a resolution of 2.8 Angstrom using anomalous diffraction data at the Se-peak wavelength. Structure determination revealed a mostly alpha-helical protein with a three-lobed shape. The three lobes or fingers delineate a central binding groove and additional grooves between lobes 1 and 3 and between lobes 2 and 3. For one of the molecules in the asymmetric unit, the central binding pocket was filled with a peptide from the uncleaved N-terminal affinity tag. The MPN555 structure has structural homology to two bacterial chaperone proteins: SurA and trigger factor from Escherichia coli. The structural data and the homology to other chaperone proteins suggest an involvement in protein folding as a molecular chaperone for MPN555.

  • Shin DH, Oganesyan N, Jancarik J, Yokota H, Kim R, Kim SH. 2005. Crystal structure of a nicotinate phosphoribosyltransferase from Thermoplasma acidophilum. J Biol Chem 280:18326-35.

    We have determined the crystal structure of nicotinate phosphoribosyltransferase from Thermoplasma acidophilum (TaNAPRTase). The TaNAPRTase has three domains, an N-terminal domain, a central functional domain, and a unique C-terminal domain. The crystal structure revealed that the functional domain has a type II phosphoribosyltransferase fold that may be a common architecture for both nicotinic acid and quinolinic acid (QA) phosphoribosyltransferases (PRTase) despite low sequence similarity between them. Unlike QAPRTase, TaNAPRTase has a unique extra C-terminal domain containing a zinc knuckle-like motif containing 4 cysteines. The TaNAPRTase forms a trimer of dimers in the crystal. The active site pocket is formed at dimer interfaces. The complex structures with phosphoribosylpyrophosphate (PRPP) and nicotinate mononucleotide (NAMN) showed, surprisingly, that functional residues lining the active site of TaNAPRTase are quite different from those of QAPRTase, although their substrates are quite similar to each other. The phosphate moiety of PRPP and NAMN is anchored to the phosphate-binding loops formed by backbone amides, as found in many alpha/beta barrel enzymes. The pyrophosphate moiety of PRPP is located at the entrance of the active site pocket, whereas the nicotinate moiety of NAMN is located deep inside. Interestingly, the nicotinate moiety of NAMN is intercalated between highly conserved aromatic residues Tyr(21) and Phe(138). Careful structural analyses combined with other NAPRTase sequence subfamilies reveal that TaNAPRTase represents a unique sequence subfamily of NAPRTase. The structures of TaNAPRTase also provide valuable insight for other sequence subfamilies such as pre-B cell colony-enhancing factor, known to have nicotinamide phosphoribosyltransferase activity.

  • Shin DH, Lou Y, Jancarik J, Yokota H, Kim R, Kim SH. 2005. Crystal structure of TM1457 from Thermotoga maritima. J Struct Biol 152:113-7.

    The crystal structure of a hypothetical protein, TM1457, from Thermotoga maritima has been determined at 2.0A resolution. TM1457 belongs to the DUF464 family (57 members) for which there is no known function. The structure shows that it is composed of two helices in contact with one side of a five-stranded beta-sheet. Two identical monomers form a pseudo-dimer in the asymmetric unit. There is a large cleft between the first alpha-helix and the second beta-strand. This cleft may be functionally important, since the two highly conserved motifs, GHA and VCAXV(S/T), are located around the cleft. A structural comparison of TM1457 with known protein structures shows the best hit with another hypothetical protein, Ybl001C from Saccharomyces cerevisiae, though they share low structural similarity. Therefore, TM1457 still retains a unique topology and reveals a novel fold.

  • Sims GE, Choi IG, Kim SH. 2005. Protein conformational space in higher order phi-psi maps. Proc Natl Acad Sci U S A

    We have mapped protein conformational space from two to seven residue lengths by employing multidimensional scaling on a data matrix composed of pair-wise angular distances for multiple phi-psi values collected from high-resolution protein structures. The resulting global maps show clustering of peptide conformations that reveals a dramatic reduction of conformational space as sampled by experimentally observed peptides. Each map can be viewed as a higher order phi-psi plot defining regions of space that are conformationally allowed.
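
    The pair-wise angular distance matrix that feeds the multidimensional scaling step can be sketched as follows. This is a minimal illustration only: the abstract does not define the distance, so the wrapped per-angle difference combined as a Euclidean norm, and the names angle_diff, angular_distance and distance_matrix, are assumptions rather than the authors' method.

```python
import math

def angle_diff(a, b):
    """Smallest absolute difference between two angles in degrees (0 to 180)."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def angular_distance(conf_a, conf_b):
    """Assumed metric: Euclidean norm of the wrapped per-angle differences."""
    return math.sqrt(sum(angle_diff(x, y) ** 2 for x, y in zip(conf_a, conf_b)))

def distance_matrix(conformations):
    """Symmetric matrix of pair-wise distances between phi-psi tuples."""
    return [[angular_distance(a, b) for b in conformations] for a in conformations]

# Each conformation is a flat tuple of phi-psi values for a short peptide.
example = [(-60.0, -45.0), (-120.0, 130.0), (170.0, -170.0)]
matrix = distance_matrix(example)
```

    Wrapping matters because 170 and -170 degrees are only 20 degrees apart on the circle; a plain subtraction would report 340. The resulting matrix is the kind of input a multidimensional scaling routine expects.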

  • Xu QS, Jancarik J, Lou Y, Kuznetsova K, Yakunin AF, ... Kim R, Kim SH. 2005. Crystal structures of a phosphotransacetylase from Bacillus subtilis and its complex with acetyl phosphate. J Struct Funct Genomics 6:269-79.

    Phosphotransacetylase (Pta) [EC 2.3.1.8] plays a major role in acetate metabolism by catalyzing the reversible transfer of the acetyl group between coenzyme A (CoA) and orthophosphate: CH(3)COSCoA + HPO(4)(2-) <=> CH(3)COOPO(3)(2-) + CoASH. In this study, we report the crystal structures of Pta from Bacillus subtilis at 2.75 A resolution and its complex with acetyl phosphate, one of its substrates, at 2.85 A resolution. In addition, the Pta activity of the enzyme has been assayed. The enzyme folds into an alpha/beta architecture with two domains separated by a prominent cleft, very similar to two other known Pta structures. The enzyme-acetyl phosphate complex structure reveals a few potential substrate binding sites. Two of them are located in the middle of the interdomain cleft: each one is surrounded by a region of strictly and highly conserved residues. High structural similarities are found with 4-hydroxythreonine-4-phosphate dehydrogenase (PdxA), and isocitrate and isopropylmalate dehydrogenases, all of which utilize NADP(+) as their cofactor, which binds in the interdomain cleft. Their substrate binding sites are close to the acetyl phosphate binding sites of Pta in the cleft as well. These results suggest that the CoA is likely to bind to the interdomain cleft of Pta in a similar way as NADP(+) binds to the other three enzymes.

  • Zhang Y, Chandonia JM, Ding C, Holbrook SR. 2005. Comparative mapping of sequence-based and structure-based protein domains. BMC Bioinformatics 6:77. [PDF]

    BACKGROUND: Protein domains have long been an ill-defined concept in biology. They are generally described as autonomous folding units with evolutionary and functional independence. Both structure-based and sequence-based domain definitions have been widely used. But whether these types of models alone can capture all essential features of domains is still an open question. METHODS: Here we provide insight on domain definitions through comparative mapping of two domain classification databases, one sequence-based (Pfam) and the other structure-based (SCOP). A mapping score is defined to indicate the significance of the mapping, and the properties of the mapping matrices are studied. RESULTS: The mapping results show a general agreement between the two databases, as well as many interesting areas of disagreement. In the cases of disagreement, the functional and evolutionary characteristics of the domains are examined to determine which domain definition is biologically more informative.

2004:

  • Busso D, Kim R, Kim SH. 2004. Using an Escherichia coli cell-free extract to screen for soluble expression of recombinant proteins. J Struct Funct Genomics 5:69-74.

    For structural and functional genomics programs, new high-throughput methods to characterize well-expressing and highly soluble proteins are essential. A faster and more convenient approach to screen expression conditions of recombinant proteins compared to classical in vivo systems is the Escherichia coli cell-free expression system. Here, we describe a rapid procedure to screen for expression and solubility of recombinant proteins using an E. coli cell-free extract. The results presented cover 24 open reading frames of unknown function from different micro-organisms. In order to screen different variables that may interfere with solubility, we expressed the recombinant proteins with a histidine(6) tag, either N-terminal or C-terminal at two temperatures (25 °C and 30 °C). The identification of recombinant proteins is performed by the dot blot procedure using an anti-histidine tag antibody. We designed a rapid method that allows the characterization of soluble candidates from a large number of genes or from a large number of variants that is highly compatible with structural genomics expectations. Abbreviations: IPTG - isopropyl beta-d-1 thiogalactopyranoside; Mr - molecular mass; ORF - open reading frame; PCR - polymerase chain reaction; TBST - Tris-buffered saline Tween; Tris - tris(hydroxymethyl)aminomethane.

  • Card GL, England BP, Suzuki Y, Fong D, Powell B, ... Schlessinger J, Zhang KY. 2004. Structural basis for the activity of drugs that inhibit phosphodiesterases. Structure (Camb) 12:2233-47.

    Phosphodiesterases (PDEs) comprise a large family of enzymes that catalyze the hydrolysis of cAMP or cGMP and are implicated in various diseases. We describe the high-resolution crystal structures of the catalytic domains of PDE4B, PDE4D, and PDE5A with ten different inhibitors, including the drug candidates cilomilast and roflumilast, for respiratory diseases. These cocrystal structures reveal a common scheme of inhibitor binding to the PDEs: (i) a hydrophobic clamp formed by highly conserved hydrophobic residues that sandwich the inhibitor in the active site; (ii) hydrogen bonding to an invariant glutamine that controls the orientation of inhibitor binding. A scaffold can be readily identified for any given inhibitor based on the formation of these two types of conserved interactions. These structural insights will enable the design of isoform-selective inhibitors with improved binding affinity and should facilitate the discovery of more potent and selective PDE inhibitors for the treatment of a variety of diseases.

  • Chandonia JM, Hon G, Walker NS, Lo Conte L, Koehl P, Levitt M, Brenner SE. 2004. The ASTRAL Compendium in 2004. Nucleic Acids Res 32 Database issue:D189-92. [PDF]

    The ASTRAL Compendium provides several databases and tools to aid in the analysis of protein structures, particularly through the use of their sequences. Partially derived from the SCOP database of protein structure domains, it includes sequences for each domain and other resources useful for studying these sequences and domain structures. The current release of ASTRAL contains 54,745 domains, more than three times as many as the initial release 4 years ago. ASTRAL has undergone major transformations in the past 2 years. In addition to several complete updates each year, ASTRAL is now updated on a weekly basis with preliminary classifications of domains from newly released PDB structures. These classifications are available as a stand-alone database, as well as integrated into other ASTRAL databases such as representative subsets. To enhance the utility of ASTRAL to structural biologists, all SCOP domains are now made available as PDB-style coordinate files as well as sequences. In addition to sequences and representative subsets based on SCOP domains, sequences and subsets based on PDB chains are newly included in ASTRAL. Several search tools have been added to ASTRAL to facilitate retrieval of data by individual users and automated methods. ASTRAL may be accessed at http://astral.stanford.edu/.

  • Chen S, Yakunin AF, Kuznetsova E, Busso D, Pufan R, ... Kim R, Kim SH. 2004. Structural and Functional Characterization of a Novel Phosphodiesterase from Methanococcus jannaschii. J Biol Chem 279:31854-62.

    Methanococcus jannaschii MJ0936 is a hypothetical protein of unknown function with over 50 homologs found in many bacteria and Archaea. To help define the molecular (biochemical and biophysical) function of MJ0936, we determined its crystal structure at 2.4-A resolution and performed a series of biochemical screens for catalytic activity. The overall fold of this single domain protein consists of a four-layered structure formed by two beta-sheets flanked by alpha-helices on both sides. The crystal structure suggested its biochemical function to be a nuclease, phosphatase, or nucleotidase, with a requirement for some metal ions. Crystallization in the presence of Ni(2+) or Mn(2+) produced a protein containing a binuclear metal center in the putative active site formed by a cluster of conserved residues. Analysis of MJ0936 against a panel of general enzymatic assays revealed catalytic activity toward bis-p-nitrophenyl phosphate, an indicator substrate for phosphodiesterases and nucleases. Significant activity was also found with two other phosphodiesterase substrates, thymidine 5'-monophosphate p-nitrophenyl ester and p-nitrophenylphosphorylcholine, but no activity was found for cAMP or cGMP. Phosphodiesterase activity of MJ0936 had an absolute requirement for divalent metal ions with Ni(2+) and Mn(2+) being most effective. Thus, our structural and enzymatic studies have identified the biochemical function of MJ0936 as that of a novel phosphodiesterase.

  • Chen S, Jancarik J, Yokota H, Kim R, Kim SH. 2004. Crystal structure of a protein associated with cell division from Mycoplasma pneumoniae (GI: 13508053): a novel fold with a conserved sequence motif. Proteins 55:785-91.

    UPF0040 is a family of proteins implicated in bacterial cell division. No structural information has been available for proteins of this family. We have determined the crystal structure of a protein from Mycoplasma pneumoniae that belongs to this family using X-ray crystallography. A structural homology search reveals that this protein has a novel fold with no significant similarity to any protein of known three-dimensional structure. The crystal structures of the protein in three different crystal forms reveal that the protein exists as an octameric ring. The conserved protein residues, including a highly conserved DXXXR motif, are examined on the basis of the crystal structure.

  • Crooks GE, Hon G, Chandonia JM, Brenner SE. 2004. WebLogo: a sequence logo generator. Genome Res 14:1188-90. [PDF]

    WebLogo generates sequence logos, graphical representations of the patterns within a multiple sequence alignment. Sequence logos provide a richer and more precise description of sequence similarity than consensus sequences and can rapidly reveal significant features of the alignment otherwise difficult to perceive. Each logo consists of stacks of letters, one stack for each position in the sequence. The overall height of each stack indicates the sequence conservation at that position (measured in bits), whereas the height of symbols within the stack reflects the relative frequency of the corresponding amino or nucleic acid at that position. WebLogo has been enhanced recently with additional features and options, to provide a convenient and highly configurable sequence logo generator. A command line interface and the complete, open WebLogo source code are available for local installation and customization.
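
    The stack and symbol heights described above can be sketched in a few lines. This is a minimal illustration, not WebLogo's own code: it assumes an ungapped nucleotide alignment (alphabet size 4) and omits any small-sample correction, and the function names column_heights and logo_heights are hypothetical.

```python
import math
from collections import Counter

def column_heights(column, alphabet_size=4):
    """Return (stack height in bits, {symbol: symbol height}) for one column."""
    counts = Counter(column)
    total = sum(counts.values())
    freqs = {sym: n / total for sym, n in counts.items()}
    # Shannon entropy of the observed symbol frequencies at this position.
    entropy = -sum(p * math.log2(p) for p in freqs.values())
    # Conservation is the maximum possible entropy minus the observed entropy.
    stack_height = math.log2(alphabet_size) - entropy
    # Each symbol takes a share of the stack proportional to its frequency.
    return stack_height, {sym: p * stack_height for sym, p in freqs.items()}

def logo_heights(sequences, alphabet_size=4):
    """Compute heights for every position of an ungapped alignment."""
    return [column_heights(col, alphabet_size) for col in zip(*sequences)]

alignment = ["ACGT", "ACGA", "ACCA", "ATCA"]
heights = logo_heights(alignment)
```

    In this toy alignment the first column is fully conserved and reaches the 2-bit maximum for nucleotides, while the third column, split evenly between G and C, carries 1 bit shared equally by the two letters.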

  • Grosse-Kunstleve RW, Sauter NK, Adams PD. 2004. Numerically stable algorithms for the computation of reduced unit cells. Acta Crystallogr A 60:1-6.

    The computation of reduced unit cells is an important building block for a number of crystallographic applications, but unfortunately it is very easy to demonstrate that the conventional implementation of cell reduction algorithms is not numerically stable. A numerically stable implementation of the Niggli-reduction algorithm of Krivy & Gruber [Acta Cryst. (1976), A32, 297-298] is presented. The stability is achieved by consistently using a tolerance in all floating-point comparisons. The tolerance must be greater than the accumulated rounding errors. A second stable algorithm is also presented, the minimum reduction, that does not require using a tolerance. It produces a cell with minimum lengths and all angles acute or obtuse. The algorithm is a simplified and modified version of the Buerger-reduction algorithm of Gruber [Acta Cryst. (1973), A29, 433-440]. Both algorithms have been enhanced to generate a change-of-basis matrix along with the parameters of the reduced cell.
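
    The idea of consistently using a tolerance in all floating-point comparisons can be illustrated with a small set of helper predicates. This is a sketch under stated assumptions, not the published implementation: the helper name make_comparators and the tolerance value 1e-9 are hypothetical, and a real cell-reduction code would choose a tolerance larger than its accumulated rounding errors.

```python
def make_comparators(eps):
    """Build comparison predicates that treat values within eps as equal."""
    def lt(a, b):
        # a is smaller than b only if it is smaller by more than eps.
        return a < b - eps

    def gt(a, b):
        return b < a - eps

    def eq(a, b):
        # Equal when neither value is clearly smaller than the other.
        return not (lt(a, b) or gt(a, b))

    return lt, gt, eq

lt, gt, eq = make_comparators(1e-9)

# 0.1 + 0.2 differs from 0.3 only by rounding error.
a = 0.1 + 0.2
b = 0.3
plain_result = a > b          # True: the branch is decided by rounding noise
tolerant_result = gt(a, b)    # False: the two values are treated as equal
```

    Routing every branch condition through predicates like these means that two values differing only by rounding noise always take the same path, which is what keeps an iterative procedure from cycling between equivalent states.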

  • Jancarik J, Pufan R, Hong C, Kim SH, Kim R. 2004. Optimum solubility (OS) screening: an efficient method to optimize buffer conditions for homogeneity and crystallization of proteins. Acta Crystallogr D Biol Crystallogr 60:1670-3.

    One of the most critical steps in the preparation of protein samples for structural studies by X-ray crystallography is to obtain biochemically pure and conformationally homogeneous protein samples. Very often, the purified sample does not meet these qualifications and therefore does not crystallize. A screening method, Optimum Solubility Screen, has been developed that consists of two steps. The first step selects a better buffer than that used during purification. 24 different buffers ranging from pH 3 to pH 10 are screened using a vapor-diffusion method and very small amounts of protein. The solubility of the protein is first determined by visual examination using a light microscope and those drops that remain clear after 24 h are further evaluated using dynamic light scattering. If the results from the first step are still not satisfactory, a second step explores a variety of chemical additives in order to improve the monodispersity of the protein sample. In 64% of the cases, crystallization was successful from proteins that had initially shown high levels of aggregation. This screen can be configured to perform in an automated high-throughput mode and can be expanded for additional buffers and additives.

  • Nguyen H, Martinez B, Oganesyan N, Kim R. 2004. An automated small-scale protein expression and purification screening provides beneficial information for protein production. J Struct Funct Genomics 5:23-7.

    One of the first key steps in structural genomics is high-throughput expression and rapid screening to select highly soluble proteins, the preferred candidates for crystal production. Here we describe the methodology used at the Berkeley Structural Genomics Center (BSGC) for automated parallel expression and small-scale purification of fusion proteins using a 96-well format. Our robotic method includes cell lysis, soluble fraction separation and purification with affinity resins. For detection of His-tagged proteins in the soluble fractions and after affinity resin elution, a dot-blot procedure with an anti-His-antibody is used. The expression level and molecular mass of recombinant proteins are checked by SDS-PAGE. With this approach, we are able to obtain beneficial information to be used for large-scale protein expression and purification.

  • Oganesyan V, Pufan R, DeGiovanni A, Yokota H, Kim R, Kim SH. 2004. Structure of the putative DNA-binding protein SP_1288 from Streptococcus pyogenes. Acta Crystallogr D Biol Crystallogr 60:1266-71.

    The crystal structure of the putative DNA-binding protein SP_1288 (gi/15675166, also listed as gi/28895954) from Streptococcus pyogenes has been determined by X-ray crystallography to a resolution of 2.3 A using anomalous diffraction data at the Se peak wavelength. SP_1288 belongs to a family of proteins whose cellular function is associated with the signal recognition particle; no structural information has been available until now about the members of the family. Crystallographic analysis revealed that the overall fold of SP_1288 consists exclusively of alpha-helices and that 75% of the structure has good similarity to domain 4 of the sigma subunit of RNA polymerase. This suggests its possible involvement in the biochemical function of transcription initiation, which includes interaction with DNA.

  • Ranatunga W, Hill EE, Mooster JL, Holbrook EL, Schulze-Gahmen U, ... Brenner SE, Holbrook SR. 2004. Structural studies of the Nudix hydrolase DR1025 from Deinococcus radiodurans and its ligand complexes. J Mol Biol 339:103-16.

    We have determined the crystal structure, at 1.4A, of the Nudix hydrolase DR1025 from the extremely radiation resistant bacterium Deinococcus radiodurans. The protein forms an intertwined homodimer by exchanging N-terminal segments between chains. We have identified additional conserved elements of the Nudix fold, including the metal-binding motif, a kinked beta-strand characterized by a proline two positions upstream of the Nudix consensus sequence, and participation of the N-terminal extension in the formation of the substrate-binding pocket. Crystal structures were also solved of DR1025 crystallized in the presence of magnesium and either a GTP analog or Ap(4)A (both at 1.6A resolution). In the Ap(4)A co-crystal, the electron density indicated that the product of asymmetric hydrolysis, ATP, was bound to the enzyme. The GTP analog bound structure showed that GTP was bound almost identically as ATP. Neither nucleoside triphosphate was further cleaved.

  • Sauter NK, Grosse-Kunstleve RW, Adams PD. 2004. Robust indexing for automatic data collection. Journal of Applied Crystallography 37:399-409.

    Improved methods for indexing diffraction patterns from macromolecular crystals are presented. The novel procedures include a more robust way to verify the position of the incident X-ray beam on the detector, an algorithm to verify that the deduced lattice basis is consistent with the observations, and an alternative approach to identify the metric symmetry of the lattice. These methods help to correct failures commonly experienced during indexing, and increase the overall success rate of the process. Rapid indexing, without the need for visual inspection, will play an important role as beamlines at synchrotron sources prepare for high-throughput automation.

  • Shi J, Pelton JG, Cho HS, Wemmer DE. 2004. Protein signal assignments using specific labeling and cell-free synthesis. J Biomol NMR 28:235-47.

    The goal of structural genomics initiatives is to determine complete sets of protein structures that represent recently sequenced genomes. The development of new high throughput methods is an essential aspect of this enterprise. Residue type and sequential assignments obtained from specifically labeled samples, when combined with 3D heteronuclear data, can significantly increase the efficiency and accuracy of the assignment process, the first step in structure determination by NMR. A protocol for the design of specifically labeled samples with high information content is presented along with a description of the experiments used to extract essential information using 2D versions of 3D heteronuclear experiments. In vitro protein synthesis methods were used to produce four specifically labeled samples of the 23.5 kDa protein phosphoserine phosphatase (PSP) from Methanococcus jannaschii (MJ1594). Each sample contained two (13)C/(15)N-labeled amino acids and one (15)N-labeled amino acid. The 135 type and 14 sequential assignments obtained from these samples were used in conjunction with 3D data obtained from uniformly (13)C/(15)N-labeled and (2)H/(13)C/(15)N-labeled protein to manually assign the backbone (1)H(N), (15)N, (13)CO, (13)C(alpha), and (13)C(beta) signals. Using an automated assignment algorithm, 30% more assignments were obtained when the type and sequential assignments were used in the calculations.

  • Shin DH, Choi IG, Busso D, Jancarik J, Yokota H, Kim R, Kim SH. 2004. Structure of OsmC from Escherichia coli: a salt-shock-induced protein. Acta Crystallogr D Biol Crystallogr 60:903-11. [TXT]

    The crystal structure of an osmotically inducible protein (OsmC) from Escherichia coli has been determined at 2.4 A resolution. OsmC is a representative protein of the OsmC sequence family, which is composed of three sequence subfamilies. The structure of OsmC provides a view of a salt-shock-induced protein. Two identical monomers form a cylindrically shaped dimer in which six helices are located on the inside and two six-stranded beta-sheets wrap around these helices. Structural comparison suggests that the OsmC sequence family has a peroxiredoxin function and has a unique structure compared with other peroxiredoxin families. A detailed analysis of structures and sequence comparisons in the OsmC sequence family revealed that each subfamily has unique motifs. In addition, the molecular function of the OsmC sequence family is discussed based on structural comparisons among the subfamily members.

  • Shin DH, Brandsen J, Jancarik J, Yokota H, Kim R, Kim SH. 2004. Structural analyses of peptide release factor 1 from Thermotoga maritima reveal domain flexibility required for its interaction with the ribosome. J Mol Biol 341:227-39.

    We have determined the crystal structure of peptide chain release factor 1 (RF1) from Thermotoga maritima (gi 4981173) at 2.65 Angstrom resolution by selenomethionine single-wavelength anomalous dispersion (SAD) techniques. RF1 is a protein that recognizes stop codons and promotes the release of a nascent polypeptide from tRNA on the ribosome. Selenomethionine-labeled RF1 crystallized in space group P2(1) with three monomers per asymmetric unit. It has approximate dimensions of 75 Angstrom x 70 Angstrom x 45 Angstrom and is composed of four domains. The overall fold of each RF1 domain shows almost the same topology as Escherichia coli RF2, except that the RF1 N-terminal domain is shorter and the C-terminal domain is longer than those of RF2. The N-terminal domain of RF1 indicates a rigid-body movement relative to that of RF2 with an angle of approximately 90 degrees. In addition to these features, RF1 has a tripeptide anticodon PVT motif instead of the SPF motif of RF2, which confers the specificity towards the stop codons. The analyses of three molecules in the asymmetric unit and comparison with RF2 revealed the presence of dynamic movement of domains I and III, which are anchored to the central domain by hinge loops. The crystal structure of RF1 elucidates the intrinsic property of this family of having large domain movements for proper function with the ribosome.

  • Shin DH, Lou Y, Jancarik J, Yokota H, Kim R, Kim SH. 2004. Crystal structure of YjeQ from Thermotoga maritima contains a circularly permuted GTPase domain. Proc Natl Acad Sci U S A 101:13198-203.

    We have determined the crystal structure of the GDP complex of the YjeQ protein from Thermotoga maritima (TmYjeQ), a member of the YjeQ GTPase subfamily. TmYjeQ, a homologue of Escherichia coli YjeQ, which is known to bind to the ribosome, is composed of three domains: an N-terminal oligonucleotide/oligosaccharide-binding fold domain, a central GTPase domain, and a C-terminal zinc-finger domain. The crystal structure of TmYjeQ reveals two interesting domains: a circularly permuted GTPase domain and an unusual zinc-finger domain. The binding mode of GDP in the GTPase domain of TmYjeQ is similar to those of GDP or GTP analogs in ras proteins, a prototype GTPase. The N-terminal oligonucleotide/oligosaccharide-binding fold domain, together with the GTPase domain, forms the extended RNA-binding site. The C-terminal domain has an unusual zinc-finger motif composed of Cys-250, Cys-255, Cys-263, and His-257, with a remote structural similarity to a portion of a DNA-repair protein, rad51 fragment. The overall structural features of TmYjeQ make it a good candidate for an RNA-binding protein, which is consistent with the biochemical data of the YjeQ subfamily in binding to the ribosome.

  • Snell G, Cork C, Nordmeyer R, Cornell E, Meigs G, ... Stevens RC, Earnest T. 2004. Automated sample mounting and alignment system for biological crystallography at a synchrotron source. Structure (Camb) 12:537-45.

    High-throughput data collection for macromolecular crystallography requires an automated sample mounting and alignment system for cryo-protected crystals that functions reliably when integrated into protein-crystallography beamlines at synchrotrons. Rapid mounting and dismounting of the samples increases the efficiency of the crystal screening and data collection processes, where many crystals can be tested for the quality of diffraction. The sample-mounting subsystem has random access to 112 samples, stored under liquid nitrogen. Results of extensive tests regarding the performance and reliability of the system are presented. To further increase throughput, we have also developed a sample transport/storage system based on "puck-shaped" cassettes, which can hold sixteen samples each. Seven cassettes fit into a standard dry shipping Dewar. The capabilities of a robotic crystal mounting and alignment system with instrumentation control software and a relational database allow for automated screening and data collection to be developed.
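
    The numbers in this abstract are consistent with one another: seven cassettes of sixteen samples give 112 positions, the same count the mounting subsystem can access. The sketch below illustrates that arithmetic as a simple addressing scheme. The abstract does not say that the 112 positions are organized this way, so the layout, the 0-based numbering, and the name locate_sample are all assumptions made for illustration.

```python
# Illustrative sketch only: the 7 x 16 layout and 0-based numbering are
# assumptions, not a description of the actual beamline software.
SAMPLES_PER_CASSETTE = 16  # "sixteen samples each" (from the abstract)
CASSETTES_PER_DEWAR = 7    # "Seven cassettes fit into a standard ... Dewar"
TOTAL_SAMPLES = SAMPLES_PER_CASSETTE * CASSETTES_PER_DEWAR


def locate_sample(index):
    """Map a 0-based sample index to a (cassette, position) pair, both 0-based."""
    if not 0 <= index < TOTAL_SAMPLES:
        raise ValueError("sample index out of range: %r" % (index,))
    # divmod returns (index // 16, index % 16)
    return divmod(index, SAMPLES_PER_CASSETTE)
```

    With this numbering, random access to any sample reduces to choosing a cassette and a slot within it.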

  • Zhang KY, Card GL, Suzuki Y, Artis DR, Fong D, ... Schlessinger J, Bollag G. 2004. A glutamine switch mechanism for nucleotide selectivity by phosphodiesterases. Mol Cell 15:279-86.

    Phosphodiesterases (PDEs) comprise a family of enzymes that modulate the immune response, inflammation, and memory, among many other functions. There are three types of PDEs: cAMP-specific, cGMP-specific, and dual-specific. Here we describe the mechanism of nucleotide selectivity on the basis of high-resolution co-crystal structures of the cAMP-specific PDE4B and PDE4D with AMP, the cGMP-specific PDE5A with GMP, and the apo-structure of the dual-specific PDE1B. These structures show that an invariant glutamine functions as the key specificity determinant by a "glutamine switch" mechanism for recognizing the purine moiety in cAMP or cGMP. The surrounding residues anchor the glutamine residue in different orientations for cAMP and for cGMP. The PDE1B structure shows that in dual-specific PDEs a key histidine residue may enable the invariant glutamine to toggle between cAMP and cGMP. The structural understanding of nucleotide binding enables the design of new PDE inhibitors that may treat diseases in which cyclic nucleotides play a critical role.

2003:

  • Busso D, Kim R, Kim SH. 2003. Expression of soluble recombinant proteins in a cell-free system using a 96-well format. J Biochem Biophys Methods 55:233-40.

    For structural and functional genomics programs, new high-throughput methods to obtain well-expressing and highly soluble proteins are essential. Here, we describe a rapid procedure to express recombinant proteins in an Escherichia coli cell-free system using a 96-well format. The identification of soluble proteins is performed by the Dot Blot procedure using an anti-His tag antibody. The applications and the automation of this method are described.

  • Hou J, Sims GE, Zhang C, Kim SH. 2003. A global representation of the protein fold space. Proc Natl Acad Sci U S A 100:2386-90.

    One of the principal goals of the structural genomics initiative is to identify the total repertoire of protein folds and obtain a global view of the "protein structure universe." Here, we present a 3D map of the protein fold space in which structurally related folds are represented by spatially adjacent points. Such a representation reveals a high-level organization of the fold space that is intuitively interpretable. The shape of the fold space and the overall distribution of the folds are defined by three dominant trends: secondary structure class, chain topology, and protein domain size. Random coil-like structures of small proteins and peptides are mapped to a region where the three trends converge, offering an interesting perspective on both the demography of fold space and the evolution of protein structures.

  • Kim SH, Shin DH, Choi IG, Schulze-Gahmen U, Chen S, Kim R. 2003. Structure-based functional inference in structural genomics. J Struct Funct Genomics 4:129-35.

    The dramatically increasing number of new protein sequences arising from genomics and proteomics creates a need for methods to rapidly and reliably infer the molecular and cellular functions of these proteins. One such approach, structural genomics, aims to delineate the total repertoire of protein folds in nature, thereby providing three-dimensional folding patterns for all proteins and to infer molecular functions of the proteins based on the combined information of structures and sequences. The goal of obtaining protein structures on a genomic scale has motivated the development of high throughput technologies and protocols for macromolecular structure determination that have begun to produce structures at a greater rate than previously possible. These new structures have revealed many unexpected functional inferences and evolutionary relationships that were hidden at the sequence level. Here, we present samples of structures determined at Berkeley Structural Genomics Center and collaborators' laboratories to illustrate how structural information provides and complements sequence information to deduce the functional inferences of proteins with unknown molecular functions. Two of the major premises of structural genomics are to discover a complete repertoire of protein folds in nature and to find molecular functions of the proteins whose functions are not predicted from sequence comparison alone. To achieve these objectives on a genomic scale, new methods, protocols, and technologies need to be developed by multi-institutional collaborations worldwide. As part of this effort, the Protein Structure Initiative has been launched in the United States (PSI; www.nigms.nih.gov/funding/psi.html).

    Although infrastructure building and technology development are still the main focus of structural genomics programs, a considerable number of protein structures have already been produced, some of them coming directly out of semiautomated structure determination pipelines. The Berkeley Structural Genomics Center (BSGC) has focused on the proteins of Mycoplasma or their homologues from other organisms as its structural genomics targets because of the minimal genome size of the Mycoplasmas as well as their relevance to human and animal pathogenicity (http://www.strgen.org). Here we present several protein examples encompassing a spectrum of functional inferences obtainable from their three-dimensional structures in five situations, where the inferences are new and testable, and are not predictable from protein sequence information alone.

  • Kim R, Lai L, Lee HH, Cheong GW, Kim KK, ... Marqusee S, Kim SH. 2003. On the mechanism of chaperone activity of the small heat-shock protein of Methanococcus jannaschii. Proc Natl Acad Sci U S A 100:8151-5.

    The small heat-shock protein (sHSP) from Methanococcus jannaschii (Mj HSP16.5) forms a homomeric complex of 24 subunits and has an overall structure of a multiwindowed hollow sphere with an external diameter of approximately 120 A and an internal diameter of approximately 65 A with six square "windows" of approximately 17 A across and eight triangular windows of approximately 30 A across. This sHSP has been known to protect other proteins from thermal denaturation. Using purified single-chain monellin as a substrate and a series of methods such as protease digestion, antibody binding, and electron microscopy, we show that the substrates bind to Mj HSP16.5 at a high temperature (80 degrees C) on the outside surface of the sphere and are prevented from forming insoluble substrate aggregates in vitro. Circular dichroism studies suggest that a very small, if any, conformational change occurs in sHSP even at 80 degrees C, but substantial conformational changes of the substrate are required for complex formation at 80 degrees C. Furthermore, deletion mutation studies of Mj HSP16.5 suggest that the N-terminal region of the protein has no structural role but may play an important kinetic role in the assembly of the sphere by "preassembly condensation" of multiple monomers before final assembly of the sphere.

  • Moshinsky DJ, Bellamacina CR, Boisvert DC, Huang P, Hui T, ... Kim SH, Rice AG. 2003. SU9516: biochemical analysis of cdk inhibition and crystal structure in complex with cdk2. Biochem Biophys Res Commun 310:1026-31.

    SU9516 is a 3-substituted indolinone compound with demonstrated potent and selective inhibition toward cyclin dependent kinases (cdks). Here, we describe the kinetic characterization of this inhibition with respect to cdk2, 1, and 4, along with the crystal structure in complex with cdk2. The molecule is competitive with respect to ATP for cdk2/cyclin A, with a K(i) value of 0.031 microM. Similarly, SU9516 inhibits cdk2/cyclin E and cdk1/cyclin B1 in an ATP-competitive manner, although at a 2- to 8-fold reduced potency. In contrast, the compound exhibited non-competitive inhibition with respect to ATP toward cdk4/cyclin D1, with a 45-fold reduced potency. The X-ray crystal structure of SU9516 bound to cdk2 revealed interactions between the molecule and Leu83 and Glu81 of the kinase. This study should aid in the development of more potent and selective cdk inhibitors for potential therapeutic agents.

  • Oganesyan V, Busso D, Brandsen J, Chen S, Jancarik J, Kim R, Kim SH. 2003. Structure of the hypothetical protein AQ_1354 from Aquifex aeolicus. Acta Crystallogr D Biol Crystallogr 59:1219-23.

    The crystal structure of a hypothetical protein AQ_1354 (gi 2983779) from the hyperthermophilic bacterium Aquifex aeolicus has been determined using X-ray crystallography. As found in many structural genomics studies, this protein is not associated with any known function based on its amino-acid sequence. PSI-BLAST analysis against a non-redundant sequence database gave 68 similar sequences referred to as 'conserved hypothetical proteins' from the uncharacterized protein family UPF0054 (accession No. PF02310). Crystallographic analysis revealed that the overall fold of this protein consists of one central alpha-helix surrounded by a four-stranded beta-sheet and four other alpha-helices. Structure-based homology analysis with DALI revealed that the structure has a moderate to good resemblance to metal-dependent proteinases such as collagenases and gelatinases, thus suggesting its possible molecular function. However, experimental tests for collagenase and gelatinase-type function show no detectable activity under standard assay conditions. Therefore, we suggest either that the members of the UPF0054 family have a similar fold but different biochemical functions to those of collagenases and gelatinases or that they have a similar function but perform it under different conditions.

  • Rubin SM, Pelton JG, Yokota H, Kim R, Wemmer DE. 2003. Solution structure of a putative ribosome binding protein from Mycoplasma pneumoniae and comparison to a distant homolog. J Struct Funct Genomics 4:235-43.

    The solution structure of MPN156, a ribosome-binding factor A (RBFA) protein family member from Mycoplasma pneumoniae, is presented. The structure, solved by nuclear magnetic resonance, has a type II KH fold typical of RNA-binding proteins. Despite only approximately 20% sequence identity between MPN156 and another family member from Escherichia coli, the two proteins have high structural similarity. The comparison demonstrates that many of the conserved residues correspond to conserved elements in the structures. Compared to a structure-based alignment, standard alignment methods based on sequence alone mispair a majority of amino acids in the two proteins. Implications of these discrepancies for sequence-based structural modeling are discussed.

  • Schulze-Gahmen U, Pelaschier J, Yokota H, Kim R, Kim SH. 2003. Crystal structure of a hypothetical protein, TM841 of Thermotoga maritima, reveals its function as a fatty acid-binding protein. Proteins 50:526-30.

    We determined the three-dimensional (3D) crystal structure of protein TM841, a protein product from a hypothetical open-reading frame in the genome of the hyperthermophile bacterium Thermotoga maritima, to 2.0 A resolution. The protein belongs to a large protein family, DegV or COG1307 of unknown function. The 35 kDa protein consists of two separate domains, with low-level structural resemblance to domains from other proteins with known 3D structures. These structural homologies, however, provided no clues for the function of TM841. But the electron density maps revealed clear density for a bound fatty-acid molecule in a pocket between the two protein domains. The structure indicates that TM841 has the molecular function of fatty-acid binding and may play a role in the cellular functions of fatty acid transport or metabolism.

  • Shin DH, Nguyen HH, Jancarik J, Yokota H, Kim R, Kim SH. 2003. Crystal structure of NusA from Thermotoga maritima and functional implication of the N-terminal domain. Biochemistry 42:13429-37.

    We report the crystal structure of N-utilizing substance A protein (NusA) from Thermotoga maritima (TmNusA), a protein involved in transcriptional pausing, termination, and antitermination. TmNusA has an elongated rod-shaped structure consisting of an N-terminal domain (NTD, residues 1-132) and three RNA binding domains (RBD). The NTD consists of two subdomains, the globular head and the helical body domains, that comprise a unique three-dimensional structure that may be important for interacting with RNA polymerase. The globular head domain possesses a high content of negatively charged residues that may interact with the positively charged flaplike domain of RNA polymerase. The helical body domain is composed of a three-helix bundle that forms a hydrophobic core with the aid of two neighboring beta-strands. This domain shows structural similarity with one of the helical domains of sigma(70) factor from Escherichia coli. One side of the molecular surface shows positive electrostatic potential suitable for nonspecific RNA interaction. The RBD is composed of one S1 domain and two K-homology (KH) domains forming an elongated RNA binding surface. Structural comparison between TmNusA and Mycobacterium tuberculosis NusA reveals a possible hinge motion between NTD and RBD. In addition, a functional implication of the NTD in its interaction with RNA polymerase is discussed.

  • Shin DH, Roberts A, Jancarik J, Yokota H, Kim R, Wemmer DE, Kim SH. 2003. Crystal structure of a phosphatase with a unique substrate binding domain from Thermotoga maritima. Protein Sci 12:1464-72.

    We have determined the crystal structure of a phosphatase with a unique substrate binding domain from Thermotoga maritima, TM0651 (gi 4981173), at 2.2 A resolution by selenomethionine single-wavelength anomalous diffraction (SAD) techniques. TM0651 is a member of the haloacid dehalogenase (HAD) superfamily, with sequence homology to trehalose-6-phosphate phosphatase and sucrose-6(F)-phosphate phosphohydrolase. Selenomethionine labeled TM0651 crystallized in space group C2 with three monomers per asymmetric unit. Each monomer has approximate dimensions of 65 x 40 x 35 A(3), and contains two domains: a domain of known hydrolase fold characteristic of the HAD family, and a domain with a new tertiary fold consisting of a six-stranded beta-sheet surrounded by four alpha-helices. There is one disulfide bond between residues Cys35 and Cys265 in each monomer. One magnesium ion and one sulfate ion are bound in the active site. The superposition of active site residues with other HAD family members indicates that TM0651 is very likely a phosphatase that acts through the formation of a phosphoaspartate intermediate, which is supported by both NMR titration data and a biochemical assay. Structural and functional database searches and the presence of many aromatic residues in the interface of the two domains suggest the substrate of TM0651 is a carbohydrate molecule. From the crystal structure and NMR data, the protein likely undergoes a conformational change upon substrate binding.

  • Sims GE, Kim SH. 2003. Global mapping of nucleic acid conformational space: dinucleoside monophosphate conformations and transition pathways among conformational classes. Nucleic Acids Res 31:5607-16.

    A global conformational space of 6253 dinucleoside monophosphate (DMP) units consisting of RNA and DNA (free and protein/drug-bound) was 'mapped' using high resolution crystal structures cataloged in the Nucleic Acid Database (NDB). The torsion angles of each DMP were clustered in a reduced three-dimensional space using a classical multi-dimensional scaling method. The mapping of the conformational space reveals nine primary clusters which distinguish among the common A-, B- and Z-forms and their various substates, plus five secondary clusters for kinked or bent structures. Conformational relationships and possible transitional pathways among the substates are also examined using the conformational states of DNA and RNA bound with proteins or drugs as potential pathway intermediates.

  • Zhang C, Kim SH. 2003. Overview of structural genomics: from structure to function. Curr Opin Chem Biol 7:28-32.

    The unprecedented increase in the number of new protein sequences arising from genomics and proteomics highlights directly the need for methods to rapidly and reliably determine the molecular and cellular functions of these proteins. One such approach, structural genomics, aims to delineate the total repertoire of protein folds, thereby providing three-dimensional portraits for all proteins in a living organism and to infer molecular functions of the proteins. The goal of obtaining protein structures on a genomic scale has motivated the development of high-throughput technologies for macromolecular structure determination, which have begun to produce structures at a greater rate than previously possible. These new structures have revealed many unexpected functional and evolutionary relationships that were hidden at the sequence level.

2002:

  • Chandonia JM, Walker NS, Lo Conte L, Koehl P, Levitt M, Brenner SE. 2002. ASTRAL compendium enhancements. Nucleic Acids Res 30:260-3.

    The ASTRAL compendium provides several databases and tools to aid in the analysis of protein structures, particularly through the use of their sequences. It is partially derived from the SCOP database of protein domains, and it includes sequences for each domain as well as other resources useful for studying these sequences and domain structures. Several major improvements have been made to the ASTRAL compendium since its initial release 2 years ago. The number of protein domain sequences included has doubled from 15 190 to 30 867, and additional databases have been added. The Rapid Access Format (RAF) database contains manually curated mappings linking the biological amino acid sequences described in the SEQRES records of PDB entries to the amino acid sequences structurally observed (provided in the ATOM records) in a format designed for rapid access by automated tools. This information is used to derive sequences for protein domains in the SCOP database. In cases where a SCOP domain spans several protein chains, all of which can be traced back to a single genetic source, a 'genetic domain' sequence is created by concatenating the sequences of each chain in the order found in the original gene sequence. Both the original-style library of SCOP sequences and a new library including genetic domain sequences are available. Selected representative subsets of each of these libraries, based on multiple criteria and degrees of similarity, are also included. ASTRAL may be accessed at http://astral.stanford.edu/.
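
    The 'genetic domain' construction described above amounts to ordering the chains of a domain by their position in the original gene and joining their sequences. The sketch below illustrates that idea only. The function name, the (gene_order, sequence) input format, and the example sequences are hypothetical and are not taken from ASTRAL or its file formats.

```python
# Illustrative sketch only: the names and input format are hypothetical,
# not part of ASTRAL.
def genetic_domain_sequence(chain_fragments):
    """Concatenate chain sequences in the order found in the original gene.

    chain_fragments is a list of (gene_order, sequence) tuples, where
    gene_order is the rank of that chain within the original gene sequence.
    """
    ordered = sorted(chain_fragments, key=lambda fragment: fragment[0])
    return "".join(sequence for _, sequence in ordered)
```

    For example, a domain whose two chains are listed out of gene order, [(2, "GHK"), (1, "MAE")], would yield the single sequence "MAEGHK".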

  • Grosse-Kunstleve RW, Adams PD. 2002. Algorithms for deriving crystallographic space-group information. II. Treatment of special positions. Acta Crystallogr A 58:60-5.

    Algorithms for the treatment of special positions in three-dimensional crystallographic space groups are presented. These include an algorithm for the determination of the site-symmetry group given the coordinates of a point, an algorithm for the determination of the exact location of the nearest special position, an algorithm for the assignment of a Wyckoff letter given the site-symmetry group, and an alternative algorithm for the assignment of a Wyckoff letter given the coordinates of a point directly. All algorithms are implemented in ISO C++ and are integrated into the Computational Crystallography Toolbox. The source code is freely available.
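
    As a rough illustration of the first task listed above, the site-symmetry group of a point can be found by keeping every space-group operation that maps the point onto itself up to a lattice translation. The sketch below is a simplified stand-in, not the authors' algorithm or the Computational Crystallography Toolbox API. It assumes exact fractional coordinates (no distance tolerance), represents each operation as a hypothetical (rotation, translation) pair, and uses space group P-1 as the example.

```python
# Illustrative sketch only: exact rational coordinates are assumed, and the
# (rotation, translation) representation is a hypothetical simplification.
from fractions import Fraction as F


def apply_op(op, point):
    """Apply a symmetry operation (3x3 integer rotation, translation) to a point."""
    rot, trans = op
    return tuple(
        sum(rot[i][j] * point[j] for j in range(3)) + trans[i] for i in range(3)
    )


def differs_by_lattice_translation(a, b):
    """True if a - b has only integer components."""
    return all((x - y).denominator == 1 for x, y in zip(a, b))


def site_symmetry(ops, point):
    """Return the operations that leave the point fixed modulo the lattice."""
    return [op for op in ops if differs_by_lattice_translation(apply_op(op, point), point)]


ZERO = (F(0), F(0), F(0))
IDENTITY = (((1, 0, 0), (0, 1, 0), (0, 0, 1)), ZERO)
INVERSION = (((-1, 0, 0), (0, -1, 0), (0, 0, -1)), ZERO)
P_MINUS_1 = [IDENTITY, INVERSION]  # the two operations of space group P-1
```

    In P-1, the point (1/2, 0, 1/2) lies on an inversion center, so both operations are returned, whereas a general point such as (1/10, 1/5, 3/10) is fixed only by the identity.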

  • Huang L, Hung L, Odell M, Yokota H, Kim R, Kim SH. 2002. Structure-based experimental confirmation of biochemical function to a methyltransferase, MJ0882, from hyperthermophile Methanococcus jannaschii. J Struct Funct Genomics 2:121-7.

    We have determined the three-dimensional (3-D) structure of protein MJ0882, which derives from a hypothetical open reading frame in the genome of the hyperthermophile Methanococcus jannaschii. The 3-D fold of MJ0882 at 1.8 A highly resembles that of a methyltransferase, despite limited sequence similarity to any confirmed methyltransferase. The structure has an S-adenosylmethionine (AdoMet) binding pocket surrounded by motifs with similarities to those commonly found among AdoMet binding proteins. Preliminary biochemical experiments show that MJ0882 specifically binds to AdoMet, which is the essential co-factor for methyltransferases.

  • Kim SH, Wang W, Kim KK. 2002. Dynamic and clustering model of bacterial chemotaxis receptors: structural basis for signaling and high sensitivity. Proc Natl Acad Sci U S A 99:11611-5.

    Bacterial chemotaxis receptors can detect a small concentration gradient of attractants and repellents in the environment over a wide range of background concentration. The clustering of these receptors to form patches observed in vivo and in vitro has been suspected as a reason for the high sensitivity, and such wide dynamic range is thought to be due to the resetting of the receptor sensitivity threshold by methylation/demethylation of the receptors. However, the mechanisms by which such high sensitivity is achieved and how the methylation/demethylation resets the sensitivity are not well understood. Molecular modeling of an intact bacterial chemotaxis receptor based on the crystal structures of a cytoplasmic domain and a periplasmic domain suggests an interesting clustering of three dimeric receptors and a two-dimensional, close-packed lattice formation of the clusters, where each receptor dimer contacts two other receptor dimers at the cytoplasmic domain and two yet different receptor dimers at the periplasmic domain. This interconnection of the receptors to form a patch of receptor clusters suggests a structural basis for the high sensitivity of the bacterial chemotaxis receptors. Furthermore, we present crystallographic data suggesting that, in contrast to most molecular signaling by conformational changes and/or oligomerization of the signaling molecules, the changes in dynamic property of the receptors on ligand binding or methylation may be the language of the signaling by the chemotaxis receptors. Taken together, the changes of the dynamic property of one receptor propagating mechanically to many others in the receptor patch provide a plausible, simple mechanism for the high sensitivity and the dynamic range of the receptors.

  • Martinez-Cruz LA, Dreyer MK, Boisvert DC, Yokota H, Martinez-Chantar ML, Kim R, Kim SH. 2002. Crystal structure of MJ1247 protein from M. jannaschii at 2.0 A resolution infers a molecular function of 3-hexulose-6-phosphate isomerase. Structure (Camb) 10:195-204.

    The crystal structure of the hypothetical protein MJ1247 from Methanococcus jannaschii at 2 A resolution, a detailed sequence analysis, and biochemical assays infer its molecular function to be 3-hexulose-6-phosphate isomerase (PHI). In the dissimilatory ribulose monophosphate (RuMP) cycle, ribulose-5-phosphate is coupled to formaldehyde by the 3-hexulose-6-phosphate synthase (HPS), yielding hexulose-6-phosphate, which is then isomerized to fructose-6-phosphate by the enzyme 3-hexulose-6-phosphate isomerase. MJ1247 is an alpha/beta structure consisting of a five-stranded parallel beta sheet flanked on both sides by alpha helices, forming a three-layered alpha-beta-alpha sandwich. The fold represents the nucleotide binding motif of a flavodoxin type. MJ1247 is a tetramer in the crystal and in solution and each monomer has a folding similar to the isomerase domain of glucosamine-6-phosphate synthase (GlmS).

  • Schulze-Gahmen U, Kim SH. 2002. Structural basis for CDK6 activation by a virus-encoded cyclin. Nat Struct Biol 9:177-81.

    Cyclin from herpesvirus saimiri (Vcyclin) preferentially forms complexes with cyclin-dependent kinase 6 (CDK6) from primate host cells. These complexes show higher kinase activity than host cell CDKs in complex with cellular cyclins and are resistant to cyclin-dependent inhibitory proteins (CDKIs). The crystal structure of human CDK6--Vcyclin in an active state was determined to 3.1 A resolution to better understand the structural basis of CDK6 activation by viral cyclins. The unphosphorylated CDK6 in complex with Vcyclin has many features characteristic of cyclinA-activated, phosphorylated CDK2. There are, however, differences in the conformation at the tip of the T-loop and its interactions with Vcyclin. Residues in the N-terminal extension of Vcyclin wrap around the tip of the CDK6 T-loop and form a short beta-sheet with the T-loop backbone. These interactions lead to a 20% larger buried surface in the CDK6--Vcyclin interface than in the CDK2--cyclinA complex and are probably largely responsible for the specificity of Vcyclin for CDK6 and resistance of the complex to inhibition by INK-type CDKIs.

  • Shin DH, Yokota H, Kim R, Kim SH. 2002. Crystal structure of a conserved hypothetical protein from Escherichia coli. J Struct Funct Genomics 2:53-66.

    The crystal structure of a conserved hypothetical protein from Escherichia coli has been determined using X-ray crystallography. The protein belongs to the Cluster of Orthologous Group COG1553 (National Center for Biotechnology Information database, NLM, NIH), for which there was no structural information available until now. A structural homology search with the DALI algorithm indicated that this protein has a new fold with no obvious similarity to those of other proteins with known three-dimensional structures. The protein quaternary structure consists of a dimer of trimers, which makes a characteristic cylinder shape. There is a large closed cavity with approximate dimensions of 16 A x 16 A x 20 A in the center of the hexameric structure. Six putative active sites are positioned along the equatorial surface of the hexamer. There are several highly conserved residues including two possible functional cysteines in the putative active site. The possible molecular function of the protein is discussed.

  • Shin DH, Yokota H, Kim R, Kim SH. 2002. Crystal structure of conserved hypothetical protein Aq1575 from Aquifex aeolicus. Proc Natl Acad Sci U S A 99:7980-5.

    The crystal structure of a conserved hypothetical protein, Aq1575, from Aquifex aeolicus has been determined by using x-ray crystallography. The protein belongs to the domain of unknown function DUF28 in the Pfam and PALI databases for which there was no structural information available until now. A structural homology search with the DALI algorithm indicates that this protein has a new fold with no obvious similarity to those of other proteins of known three-dimensional structure. The protein reveals a monomer consisting of three domains arranged along a pseudo threefold symmetry axis. There is a large cleft with approximate dimensions of 10 A x 10 A x 20 A in the center of the three domains along the symmetry axis. Two possible active sites are suggested based on the structure and multiple sequence alignment. There are several highly conserved residues in these putative active sites. The structure based molecular properties and thermostability of the protein are discussed.

  • Wang W, Cho HS, Kim R, Jancarik J, Yokota H, ... Wemmer DE, Kim SH. 2002. Structural characterization of the reaction pathway in phosphoserine phosphatase: crystallographic "snapshots" of intermediate states. J Mol Biol 319:421-31.

    Phosphoserine phosphatase (PSP) is a member of a large class of enzymes that catalyze phosphoester hydrolysis using a phosphoaspartate-enzyme intermediate. PSP is a likely regulator of the steady-state d-serine level in the brain, which is a critical co-agonist of the N-methyl-d-aspartate type of glutamate receptors. Here, we present high-resolution (1.5-1.9 A) structures of PSP from Methanococcus jannaschii, which define the open state prior to substrate binding, the complex with phosphoserine substrate bound (with a D to N mutation in the active site), and the complex with AlF3, a transition-state analog for the phospho-transfer steps in the reaction. These structures, together with those described for the BeF3- complex (mimicking the phospho-enzyme) and the enzyme with phosphate product in the active site, provide a detailed structural picture of the full reaction cycle. The structure of the apo state indicates partial unfolding of the enzyme to allow substrate binding, with refolding in the presence of substrate to provide specificity. Interdomain and active-site conformational changes are identified. The structure with the transition state analog bound indicates a "tight" intermediate. A striking structure homology, with significant sequence conservation, among PSP, P-type ATPases and response regulators suggests that the knowledge of the PSP reaction mechanism from the structures determined will provide insights into the reaction mechanisms of the other enzymes in this family.


  • Zhang C, Hou J, Kim SH. 2002. Fold prediction of helical proteins using torsion angle dynamics and predicted restraints. Proc Natl Acad Sci U S A 99:3581-5.

    We describe a procedure for predicting the tertiary folds of alpha-helical proteins from their primary sequences. The central component of the procedure is a method for predicting interhelical contacts that is based on a helix-packing model. Instead of predicting the individual contacts, our method attempts to identify the entire patch of contacts that involve residues regularly spaced in the sequences. We use this component to glue together two powerful existing methods: a secondary structure prediction program, whose output serves as the input to the contact prediction algorithm, and the torsion angle dynamics program, which uses the predicted tertiary contacts and secondary structural states to assemble three-dimensional structures. In the final step, the procedure uses the initial set of simulated structures to refine the predicted contacts for a new round of structure calculation. When tested against 24 small to medium-sized proteins representing a wide range of helical folds, the completely automated procedure is able to generate native-like models within a limited number of trials consistently.


2001:

  • Cave JW, Cho HS, Batchelder AM, Yokota H, Kim R, Wemmer DE. 2001. Solution nuclear magnetic resonance structure of a protein disulfide oxidoreductase from Methanococcus jannaschii. Protein Sci 10:384-96.

    The solution structure of the protein disulfide oxidoreductase Mj0307 in the reduced form has been solved by nuclear magnetic resonance. The secondary and tertiary structure of this protein from the archaebacterium Methanococcus jannaschii is similar to the structures that have been solved for the glutaredoxin proteins from Escherichia coli, although Mj0307 also shows features that are characteristic of thioredoxin proteins. Some aspects of Mj0307's unique behavior can be explained by comparing structure-based sequence alignments with mesophilic bacterial and eukaryotic glutaredoxin and thioredoxin proteins. It is proposed that Mj0307, and similar archaebacterial proteins, may be most closely related to the mesophilic bacterial NrdH proteins. Together these proteins may form a unique subgroup within the family of protein disulfide oxidoreductases.


  • Cho H, Wang W, Kim R, Yokota H, Damo S, ... Kustu S, Yan D. 2001. BeF(3)(-) acts as a phosphate analog in proteins phosphorylated on aspartate: structure of a BeF(3)(-) complex with phosphoserine phosphatase. Proc Natl Acad Sci U S A 98:8525-30.

    Protein phosphoaspartate bonds play a variety of roles. In response regulator proteins of two-component signal transduction systems, phosphorylation of an aspartate residue is coupled to a change from an inactive to an active conformation. In phosphatases and mutases of the haloacid dehalogenase (HAD) superfamily, phosphoaspartate serves as an intermediate in phosphotransfer reactions, and in P-type ATPases, also members of the HAD family, it serves in the conversion of chemical energy to ion gradients. In each case, lability of the phosphoaspartate linkage has hampered a detailed study of the phosphorylated form. For response regulators, this difficulty was recently overcome with a phosphate analog, BeF(3)(-), which yields persistent complexes with the active site aspartate of their receiver domains. We now extend the application of this analog to a HAD superfamily member by solving at 1.5-A resolution the x-ray crystal structure of the complex of BeF(3)(-) with phosphoserine phosphatase (PSP) from Methanococcus jannaschii. The structure is comparable to that of a phosphoenzyme intermediate: BeF(3)(-) is bound to Asp-11 with the tetrahedral geometry of a phosphoryl group, is coordinated to Mg(2+), and is bound to residues surrounding the active site that are conserved in the HAD superfamily. Comparison of the active sites of BeF(3)(-) x PSP and BeF(3)(-) x CheY, a receiver domain/response regulator, reveals striking similarities that provide insights into the function not only of PSP but also of P-type ATPases. Our results indicate that use of BeF(3)(-) for structural studies of proteins that form phosphoaspartate linkages will extend well beyond response regulators.


  • Dreyer MK, Borcherding DR, Dumont JA, Peet NP, Tsay JT, ... Shen J, Kim SH. 2001. Crystal structure of human cyclin-dependent kinase 2 in complex with the adenine-derived inhibitor H717. J Med Chem 44:524-30.

    Cyclin-dependent kinases (CDKs) are regulatory proteins of the eukaryotic cell cycle. They act after association with different cyclins, the concentrations of which vary throughout the progression of the cell cycle. As central mediators of cell growth, CDKs are potential targets for inhibitory molecules that would allow disruption of the cell cycle in order to evoke an antiproliferative effect and may therefore be useful as cancer therapeutics. We synthesized several inhibitory 2,6,9-trisubstituted purine derivatives and solved the crystal structure of one of these compounds, H717, in complex with human CDK2 at 2.6 A resolution. The orientation of the C2-p-diaminocyclohexyl portion of the inhibitor is strikingly different from those of similar moieties in other related inhibitor complexes. The N9-cyclopentyl ring fully occupies a space in the enzyme which is otherwise empty, while the C6-N-aminobenzyl substituent points out of the ATP-binding site. The structure provides a basis for the further development of more potent inhibitory drugs.


  • Du X, Wang W, Kim R, Yokota H, Nguyen H, Kim SH. 2001. Crystal structure and mechanism of catalysis of a pyrazinamidase from Pyrococcus horikoshii. Biochemistry 40:14166-72.

    Bacterial pyrazinamidase (PZAase)/nicotinamidase converts pyrazinamide (PZA) to ammonia and pyrazinoic acid, which is active against Mycobacterium tuberculosis. Loss of PZAase activity is the major mechanism of pyrazinamide-resistance by M. tuberculosis. We have determined the crystal structure of the gene product of Pyrococcus horikoshii 999 (PH999), a PZAase, and its complex with zinc ion by X-ray crystallography. The overall fold of PH999 is similar to that of N-carbamoylsarcosine amidohydrolase (CSHase) of Arthrobacter sp. and YcaC of Escherichia coli, a protein with unknown physiological function. The active site of PH999 was identified by structural features that are also present in the active sites of CSHase and YcaC: a triad (D10, K94, and C133) and a cis-peptide (between V128 and A129). Surprisingly, a metal ion-binding site was revealed in the active site and subsequently confirmed by crystal structure of PH999 in complex with Zn(2+). The roles of the triad, cis-peptide, and metal ion in the catalysis are proposed. Because of extensive homology between PH999 and PZAase of M. tuberculosis (37% sequence identity), the structure of PH999 provides a structural basis for understanding PZA-resistance by M. tuberculosis harboring PZAase mutations.


  • Du X, Frei H, Kim SH. 2001. Comparison of nitrophenylethyl and hydroxyphenacyl caging groups. Biopolymers 62:147-9.

    Nitrophenylethyl (NPE)- and hydroxyphenacyl (HPA)-caged nucleotides were employed in a time-resolved Fourier transform IR spectroscopy study on Ras-catalyzed guanosine triphosphate (GTP) hydrolysis. A fast kinetic component was observed following the photolysis of NPE-caged nucleotides in the NPE-GTP-Ras complex. However, this kinetic component was not observed in the HPA-GTP-Ras experiment. This fast kinetic component was likely due to a chemical reaction between Ras and the detached caging group, nitrosoacetophenone. This communication serves as a note of caution in interpreting spectral changes and kinetic behavior of the enzymatic systems employing NPE-caged compounds.


  • Frankenberg RJ, Hsu TS, Yokota H, Kim R, Clark DS. 2001. Chemical denaturation and elevated folding temperatures are required for wild-type activity and stability of recombinant Methanococcus jannaschii 20S proteasome. Protein Sci 10:1887-96.

    The 20S proteasome from the extreme thermophile Methanococcus jannaschii (Mj) was purified and sequenced to facilitate production of the recombinant proteasome in E. coli. The recombinant proteasome remained in solution at a purity level of 80-85% (according to SDS PAGE) following incubation of cell lysates at 70 degrees C. Temperature-activity profiles indicated that the temperature optima of the wild-type and recombinant enzymes differed substantially, with optimal activities occurring at 119 degrees C and 95 degrees C, respectively. To ameliorate this discrepancy, two recombinant enzyme preparations were produced, each of which included denaturation of the proteasome by 4 M urea followed by high-temperature (85 degrees C) dialysis. The wild-type temperature optimum was restored, but only if proteasome subunits were denatured and refolded prior to assembly (a preparation designated as alpha & beta). In contrast, when proteasome assembly preceded denaturation (designated alpha + beta) the optimum temperature was raised to a lesser degree. Moreover, the alpha & beta and alpha + beta preparations had apparent thermal half-lives at 114 degrees C of 54.2 and 26.2 min, respectively, and the thermostability of the less stable enzyme was more sensitive to a reduction in pH. Attainment of wild-type activity and stability thus required the proper folding of both the alpha- and beta-subunits prior to proteasome assembly. Consistent with this behavior, differential scanning calorimetry (DSC) measurements revealed differences in the reassembly efficiency of the two proteasome preparations. The ability to produce structural conformers with dramatically different thermal optima and thermostabilities may facilitate the determination of molecular forces and structural motifs responsible for enzyme thermostability and high-temperature activity.


  • Kim VN, Kataoka N, Dreyfuss G. 2001. Role of the nonsense-mediated decay factor hUpf3 in the splicing-dependent exon-exon junction complex. Science 293:1832-6.

    Nonsense-mediated messenger RNA (mRNA) decay, or NMD, is a critical process of selective degradation of mRNAs that contain premature stop codons. NMD depends on both pre-mRNA splicing and translation, and it requires recognition of the position of stop codons relative to exon-exon junctions. A key factor in NMD is hUpf3, a mostly nuclear protein that shuttles between the nucleus and cytoplasm and interacts specifically with spliced mRNAs. We found that hUpf3 interacts with Y14, a component of post-splicing mRNA-protein (mRNP) complexes, and that hUpf3 is enriched in Y14-containing mRNP complexes. The mRNA export factors Aly/REF and TAP are also associated with nuclear hUpf3, indicating that hUpf3 is in mRNP complexes that are poised for nuclear export. Like Y14 and Aly/REF, hUpf3 binds to spliced mRNAs specifically (approximately 20 nucleotides) upstream of exon-exon junctions. The splicing-dependent binding of hUpf3 to mRNAs before export, as part of the complex that assembles near exon-exon junctions, allows it to serve as a link between splicing and NMD in the cytoplasm.


2000:

  • Adler M, Davey DD, Phillips GB, Kim SH, Jancarik J, ... Light DR, Whitlow M. 2000. Preparation, characterization, and the crystal structure of the inhibitor ZK-807834 (CI-1031) complexed with factor Xa. Biochemistry 39:12534-42.

    Factor Xa plays a critical role in the formation of blood clots. This serine protease catalyzes the conversion of prothrombin to thrombin, the first joint step that links the intrinsic and extrinsic coagulation pathways. There is considerable interest in the development of factor Xa inhibitors for the intervention in thrombotic diseases. This paper presents the structure of the inhibitor ZK-807834, also known as CI-1031, bound to factor Xa and provides the details of the protein purification and crystallization. Results from mass spectrometry indicate that the factor Xa underwent autolysis during crystallization and the first EGF-like domain was cleaved from the protein. The crystal structure of the complex shows that the amidine of ZK-807834 forms a salt bridge with Asp189 in the S1 pocket and the basic imidazoline fits snugly into the S4 site. The central pyridine ring provides a fairly rigid linker between these groups. This rigidity helps minimize entropic losses during binding. In addition, the structure reveals new interactions that were not found in the previous factor Xa/inhibitor complexes. ZK-807834 forms a strong hydrogen bond between an ionized 2-hydroxy group and Ser195 of factor Xa. There is also an aromatic ring-stacking interaction between the inhibitor and Trp215 in the S4 pocket. These interactions contribute to both the potency of this compound (K(I) = 0.11 nM) and the >2500-fold selectivity against homologous serine proteases such as trypsin.


  • Du X, Frei H, Kim SH. 2000. The mechanism of GTP hydrolysis by Ras probed by Fourier transform infrared spectroscopy. J Biol Chem 275:8492-500.

    Time-resolved Fourier transform infrared spectroscopy (FTIR) in combination with photo-induced release of (18)O-labeled caged nucleotide has been employed to address mechanistic issues of GTP hydrolysis by Ras protein. Infrared spectroscopy of Ras complexes with nitrophenylethyl (NPE)-[alpha-(18)O(2)]GTP, NPE-[beta-(18)O(4)]GTP, or NPE-[gamma-(18)O(3)]GTP upon photolysis or during hydrolysis afforded a substantially improved mode assignment of phosphoryl group absorptions. Photolysis spectra of hydroxyphenylacyl-GTP and hydroxyphenylacyl-GDP bound to Ras and several mutants, Ras(Gly(12))-Mn(2+), Ras(Pro(12)), Ras(Ala(12)), and Ras(Val(12)), were obtained and yielded valuable information about structures of GTP or GDP bound to Ras mutants. IR spectra revealed stronger binding of GDP beta-PO(3)(2-) moiety by Ras mutants with higher activity, suggesting that the transition state is largely GDP-like. Analysis of the photolysis and hydrolysis FTIR spectra of the [beta-nonbridge-(18)O(2), alphabeta-bridge-(18)O]GTP isotopomer allowed us to probe for positional isotope exchange. Such a reaction might signal the existence of metaphosphate as a discrete intermediate, a key species for a dissociative mechanism. No positional isotope exchange was observed. Overall, our results support a concerted mechanism, but the transition state seems to have a considerable amount of dissociative character. This work demonstrates that time-resolved FTIR is highly suitable for monitoring positional isotope exchange and advantageous in many aspects over previously used methods, such as (31)P NMR and mass spectrometry.


  • Du X, Choi IG, Kim R, Wang W, Jancarik J, Yokota H, Kim SH. 2000. Crystal structure of an intracellular protease from Pyrococcus horikoshii at 2-A resolution. Proc Natl Acad Sci U S A 97:14079-84.

    The intracellular protease from Pyrococcus horikoshii (PH1704) and PfpI from Pyrococcus furiosus are members of a class of intracellular proteases that have no sequence homology to any other known protease family. We report the crystal structure of PH1704 at 2.0-A resolution. The protease is tentatively identified as a cysteine protease based on the presence of cysteine (residue 100) in a nucleophile elbow motif. In the crystal, PH1704 forms a hexameric ring structure, and the active sites are formed at the interfaces between three pairs of monomers.


  • Falke JJ, Kim SH. 2000. Structure of a conserved receptor domain that regulates kinase activity: the cytoplasmic domain of bacterial taxis receptors. Curr Opin Struct Biol 10:462-9.

    Many bacteria are motile and use a conserved class of transmembrane sensory receptor to regulate cellular taxis toward an optimal living environment. These conserved receptors are typically stimulated by extracellular signals, but also undergo adaptation via covalent modification at specific sites on their cytoplasmic domains. The function of the cytoplasmic domain is to integrate the extracellular and adaptive signals, and to use this integrated information to regulate an associated histidine kinase. The kinase, in turn, triggers a cytoplasmic phosphorylation pathway of the two-component class. The high-resolution structure of a receptor cytoplasmic domain has recently been determined by crystallographic methods and is largely consistent with a structural model independently generated by chemical studies of the domain in the full-length, membrane-bound receptor. These results represent an important step toward a mechanistic understanding of receptor-to-kinase information transfer.


  • Kim SH. 2000. Structural genomics of microbes: an objective. Curr Opin Struct Biol 10:380-3.

    A comparison of the genome sequences of more than 20 microorganisms reveals that a large fraction of the genes have unknown functions. Determining the structures of the proteins coded by these genes may provide additional key information in an effort to uncover the molecular functions of such proteins and new protein fold patterns. Using existing technology, it is possible to obtain a complete sequence complement and a near complete structural complement for a small microbial genome. Such information may provide a comprehensive view of a small organism, which, in turn, can serve as a platform for understanding more complex organisms.


  • Lai L, Yokota H, Hung LW, Kim R, Kim SH. 2000. Crystal structure of archaeal RNase HII: a homologue of human major RNase H. Structure Fold Des 8:897-904.

    BACKGROUND: RNases H are present in all organisms and cleave RNAs in RNA/DNA hybrids. There are two major types of RNases H that have little similarity in sequence, size and specificity. The structure of RNase HI, the smaller enzyme and most abundant in bacteria, has been extensively studied. However, no structural information is available for the larger RNase H, which is most abundant in eukaryotes and archaea. Mammalian RNase H participates in DNA replication, removal of the Okazaki fragments and possibly DNA repair. RESULTS: The crystal structure of RNase HII from the hyperthermophile Methanococcus jannaschii, which is homologous to mammalian RNase H, was solved using a multiwavelength anomalous dispersion (MAD) phasing method at 2 A resolution. The structure contains two compact domains. Despite the absence of sequence similarity, the large N-terminal domain shares a similar fold with the RNase HI of bacteria. The active site of RNase HII contains three aspartates: Asp7, Asp112 and Asp149. The nucleotide-binding site is located in the cleft between the N-terminal and C-terminal domains. CONCLUSIONS: Despite a lack of any detectable similarity in primary structure, RNase HII shares a similar structural domain with RNase HI, suggesting that the two classes of RNases H have a common catalytic mechanism and possibly a common evolutionary origin. The involvement of the unique C-terminal domain in substrate recognition explains the different reaction specificity observed between the two classes of RNase H.


  • Meijer L, Thunnissen AM, White AW, Garnier M, Nikolic M, ... Kim SH, Pettit GR. 2000. Inhibition of cyclin-dependent kinases, GSK-3beta and CK1 by hymenialdisine, a marine sponge constituent. Chem Biol 7:51-63.

    BACKGROUND: Over 2000 protein kinases regulate cellular functions. Screening for inhibitors of some of these kinases has already yielded some potent and selective compounds with promising potential for the treatment of human diseases. RESULTS: The marine sponge constituent hymenialdisine is a potent inhibitor of cyclin-dependent kinases, glycogen synthase kinase-3beta and casein kinase 1. Hymenialdisine competes with ATP for binding to these kinases. A CDK2-hymenialdisine complex crystal structure shows that three hydrogen bonds link hymenialdisine to the Glu81 and Leu83 residues of CDK2, as observed with other inhibitors. Hymenialdisine inhibits CDK5/p35 in vivo as demonstrated by the lack of phosphorylation/down-regulation of Pak1 kinase in E18 rat cortical neurons, and also inhibits GSK-3 in vivo as shown by the inhibition of MAP-1B phosphorylation. Hymenialdisine also blocks the in vivo phosphorylation of the microtubule-binding protein tau at sites that are hyperphosphorylated by GSK-3 and CDK5/p35 in Alzheimer's disease (cross-reacting with Alzheimer's-specific AT100 antibodies). CONCLUSIONS: The natural product hymenialdisine is a new kinase inhibitor with promising potential applications for treating neurodegenerative disorders.


  • Zhang C, Kim SH. 2000. The anatomy of protein beta-sheet topology. J Mol Biol 299:1075-89.

    Here, we present a systematic analysis of the open-faced beta-sheet topologies in a set of non-redundant protein domain structures; in particular, we focus on the topological diversity of four-stranded beta-sheet motifs. Of the 96 topologies that are possible for a four-stranded beta-sheet, 42 were identified in known protein structures. Of these, four account for 50% of the structures that we have studied. Two sets of the topologies that were not observed may represent the section of the topological space that is not readily accessible to proteins on either thermodynamic or kinetic grounds. The first set contains topologies with alternating parallel and antiparallel beta-ladders. Their rare occurrence reflects the expectation that it is energetically unfavorable to match different hydrogen bonding patterns. The polypeptide chains in the second set of topologies go through convoluted paths and are expected to experience great kinetic frustrations during the folding processes. A knowledge of the potential causes for the topological preference of small beta-sheets also helps us to understand the topological properties of larger beta-sheet structures which frequently contain four-stranded motifs. The notion that protein topologies can only be taken from a confined and discrete space has important implications for structural genomics.


  • Zhang C, Kim SH. 2000. Environment-dependent residue contact energies for proteins. Proc Natl Acad Sci U S A 97:2550-5.

    We examine the interactions between amino acid residues in the context of their secondary structural environments (helix, strand, and coil) in proteins. Effective contact energies for an expanded 60-residue alphabet (20 aa x three secondary structural states) are estimated from the residue-residue contacts observed in known protein structures. Similar to the prototypical contact energies for 20 aa, the newly derived energy parameters reflect mainly the hydrophobic interactions; however, the relative strength of such interactions shows a strong dependence on the secondary structural environment, with nonlocal interactions in beta-sheet structures and alpha-helical structures dominating the energy table. Environment-dependent residue contact energies outperform existing residue pair potentials in both threading and three-dimensional contact prediction tests and should be generally applicable to protein structure prediction.


  • Zhang C, Kim SH. 2000. A comprehensive analysis of the Greek key motifs in protein beta-barrels and beta-sandwiches. Proteins 40:409-19.

    The Greek key motifs are the topological signature of many beta-barrels and a majority of beta-sandwich structures. An updated survey of these structures integrates many early observations and newly emerging patterns and provides a better understanding of the unique role of Greek keys in protein structures. A stereotypical Greek key beta-barrel accommodates five or six strands and can have 12 possible topologies. All but one of the six-stranded topologies have been observed, and only one of the five-stranded topologies has been seen in actual structures. Of the representative beta-barrel structures analyzed here, half have left-handed Greek keys. This result challenges the empirical claim of the handedness regularity of Greek keys in beta-barrels. One of the five-stranded topologies that has not been observed in beta-barrels comprises two overlapping Greek keys. The two three-dimensional forms of this topology constitute a structural unit that is present in a vast majority of known beta-sandwich structures. Using this unit as the root, we have built a new taxonomy tree for the beta-sandwich folds and deduced a set of rules that appear to constrain how other beta-strands adjoin the unit to form a larger double-layered structure. These rules, though derived from a larger data set, are essentially the same as those drawn from earlier studies, suggesting that they may reflect the true topological constraints in the design of beta-sandwich structures. Finally, a novel variant of the Greek key motif (defined here as the twisted Greek key) has emerged which introduces loop crossings into the folded structures.


  • Zhang C, Kim SH. 2000. The effect of dynamic receptor clustering on the sensitivity of biochemical signaling. Pac Symp Biocomput:353-64.

    Lateral clustering has emerged as a general mechanism used by many cellular receptors to control their responses to critical changes in the external environment. Here we derive a general mathematical framework to characterize the effect of receptor clustering on the sensitivity and dynamic range of biochemical signaling. In particular, we apply the theory to the bacterial chemosensory system and show that it can integrate a large body of experimental observations and provide a unified explanation to many aspects of chemotaxis. The principles of dynamic receptor clustering and signal amplification incorporated into this theory may underlie the design of many cellular networks.


1999:

  • Grigoriev IV, Kim SH. 1999. Detection of protein fold similarity based on correlation of amino acid properties. Proc Natl Acad Sci U S A 96:14318-23.

    An increasing number of proteins with weak sequence similarity have been found to assume similar three-dimensional folds and often have similar or related biochemical or biophysical functions. We propose a method for detecting the fold similarity between two proteins with low sequence similarity based on their amino acid properties alone. The method, the proximity correlation matrix (PCM) method, is built on the observation that the physical properties of neighboring amino acid residues in sequence at structurally equivalent positions of two proteins of similar fold are often correlated even when amino acid sequences are different. The hydrophobicity is shown to be the most strongly correlated property for all protein fold classes. The PCM method was tested on 420 proteins belonging to 64 different known folds, each having at least three proteins with little sequence similarity. The method was able to detect fold similarities for 40% of the 420 sequences. Compared with sequence comparison and several fold-recognition methods, the method demonstrates good performance in detecting fold similarities among the proteins with low sequence identity. Applied to the complete genome of Methanococcus jannaschii, the method recognized the folds for 22 hypothetical proteins.

