BSGC PUBLICATIONS
This page gives abstracts of BSGC publications, where available.
Abstracts are listed alphabetically, by year.
Click to go back to the publication index.
2007:
- Chandonia JM. 2007. StrBioLib: a Java library for development of custom computational structural biology applications. Bioinformatics [Preprint PDF]
SUMMARY: StrBioLib is a library of Java classes useful for developing
software for computational structural biology research. StrBioLib contains
classes to represent and manipulate protein structures, biopolymer
sequences, sets of biopolymer sequences, and alignments between
biopolymers based on either sequence or structure. Interfaces are provided
to interact with commonly used bioinformatics applications, including
(PSI)-BLAST, MODELLER, MUSCLE, and Primer3, and tools are provided to read
and write many file formats used to represent bioinformatic data. The
library includes a general-purpose neural network object with multiple
training algorithms, the Hooke and Jeeves nonlinear optimization
algorithm, and tools for efficient C-style string parsing and formatting.
StrBioLib is the basis for the Pred2ary secondary structure prediction
program, is used to build the ASTRAL compendium for sequence and structure
analysis, and has been extensively tested through use in many smaller
projects. Examples and documentation are available at the site below.
AVAILABILITY: StrBioLib may be obtained under the terms of the GNU LGPL
license from http://strbio.sourceforge.net/
- Das D, Xu QS, Lee JY, Ankoudinova I, Huang C, ... Kim R, Kim SH. 2007. Crystal structure of the multidrug efflux transporter AcrB at 3.1 A resolution reveals the N-terminal region with conserved amino acids. J Struct Biol 158:494-502.
Crystal structures of the bacterial multidrug transporter AcrB in R32 and
C2 space groups showing both symmetric and asymmetric trimeric assemblies,
respectively, supplemented with biochemical investigations, have provided
most of the structural basis for a molecular level understanding of the
protein structure and mechanisms for substrate uptake and translocation
carried out by this 114-kDa inner membrane protein. They suggest that AcrB
captures ligands primarily from the periplasm. Substrates can also enter
the inner cavity of the transporter from the cytoplasm, but the exact
mechanism of this remains undefined. Analysis of the amino acid sequences
of AcrB and its homologs revealed the presence of conserved residues at
the N-terminus including two phenylalanines which may be exposed to the
cytoplasm. Any potential role that these conserved residues may play in
function has not been addressed by existing biochemical or structural
studies. Since phenylalanine residues elsewhere in the protein have been
implicated in ligand binding, we explored the structure of this N-terminal
region to investigate structural determinants near the cytoplasmic opening
that may mediate drug uptake. Our structure of AcrB in R32 space group
reveals an N-terminus loop, reducing the diameter of the central opening
to approximately 15 A as opposed to the previously reported value of
approximately 30 A for crystal structures in this space group with
disordered N-terminus. Recent structures of the AcrB in C2 space group
have revealed a helical conformation of this N-terminus but have not
discussed its possible implications. We present the crystal structure of
AcrB that reveals the structure of the N-terminus containing the conserved
residues. We hope that the structural information provides a structural
basis for others to design further biochemical investigation of the role
of this portion of AcrB in mediating cytoplasmic ligand discrimination and
uptake.
- Lowery TJ, Pelton JG, Chandonia JM, Kim R, Yokota H, Wemmer DE. 2007. NMR structure of the N-terminal domain of the replication initiator protein DnaA. J Struct Funct Genomics
DnaA is an essential component in the initiation of bacterial chromosomal
replication. DnaA binds to a series of 9 base pair repeats leading to
oligomerization, recruitment of the DnaBC helicase, and the assembly of
the replication fork machinery. The structure of the N-terminal domain
(residues 1-100) of DnaA from Mycoplasma genitalium was determined by NMR
spectroscopy. The backbone r.m.s.d. for the first 86 residues was 0.6 +/-
0.2 A based on 742 NOE, 50 hydrogen bond, 46 backbone angle, and 88
residual dipolar coupling restraints. Ultracentrifugation studies revealed
that the domain is monomeric in solution. Features on the protein surface
include a hydrophobic cleft flanked by several negative residues on one
side, and positive residues on the other. A negatively charged ridge is
present on the opposite face of the protein. These surfaces may be
important sites of interaction with other proteins involved in the
replication process. Together, the structure and NMR assignments should
facilitate the design of new experiments to probe the protein-protein
interactions essential for the initiation of DNA replication.
- Oganesyan V, Adams PD, Jancarik J, Kim R, Kim SH. 2007. Structure of O67745_AQUAE, a hypothetical protein from Aquifex aeolicus. Acta Crystallogr Sect F Struct Biol Cryst Commun 63:369-74.
Using single-wavelength anomalous dispersion data obtained from a
gold-derivatized crystal, the X-ray crystal structure of the protein
O67745_AQUAE from the prokaryotic organism Aquifex aeolicus has been
determined to a resolution of 2.0 A. Amino-acid residues 1-371 of the 44
kDa protein were identified by Pfam as an HD domain and a member of the
metal-dependent phosphohydrolase superfamily (accession No. PF01966).
Although three families from this large and diverse group of enzymatic
proteins are represented in the PDB, the structure of O67745_AQUAE reveals
a unique fold that is unlike the others and that is likely to represent a
new subfamily, further organizing the families and characterizing the
proteins. Data are presented that provide the first insights into the
structural organization of the proteins within this clan and a distal
alternative GDP-binding domain outside the metal-binding active site is
proposed.
- Shin DH, Hou J, Chandonia JM, Das D, Choi IG, Kim R, Kim SH. 2007. Structure-based inference of molecular functions of proteins of unknown function from Berkeley Structural Genomics Center. J Struct Funct Genomics
Advances in sequence genomics have resulted in an accumulation of a huge
number of protein sequences derived from genome sequences. However, the
functions of a large portion of them cannot be inferred based on the
current methods of sequence homology detection to proteins of known
functions. Three-dimensional structure can have an important impact in
providing inference of molecular function (physical and chemical function)
of a protein of unknown function. Structural genomics centers worldwide
have been determining many 3-D structures of the proteins of unknown
functions, and possible molecular functions of them have been inferred
based on their structures. Combined with bioinformatics and enzymatic
assay tools, the successful acceleration of the process of protein
structure determination through high throughput pipelines enables the
rapid functional annotation of a large fraction of hypothetical proteins.
We present a brief summary of the process we used at the Berkeley
Structural Genomics Center to infer molecular functions of proteins of
unknown function.
- Shin DH, Proudfoot M, Lim HJ, Choi IK, Yokota H, ... Kim R, Kim SH. 2007. Structural and enzymatic characterization of DR1281: A calcineurin-like phosphoesterase from Deinococcus radiodurans. Proteins
We have determined the crystal structure of DR1281 from Deinococcus
radiodurans. DR1281 is a protein of unknown function with over 170
homologs found in prokaryotes and eukaryotes. To elucidate the molecular
function of DR1281, its crystal structure at 2.3 A resolution was
determined and a series of biochemical screens for catalytic activity was
performed. The crystal structure shows that DR1281 has two domains, a
small alpha domain and a putative catalytic domain formed by a
four-layered structure of two beta-sheets flanked by five alpha-helices on
both sides. The small alpha domain interacts with other molecules in the
asymmetric unit and contributes to the formation of oligomers. The
structural comparison of the putative catalytic domain with known
structures suggested its biochemical function to be a phosphatase,
phosphodiesterase, nuclease, or nucleotidase. Structural analyses with its
homologues also indicated that there is a dinuclear center at the
interface of two domains formed by Asp8, Glu37, Asn38, Asn65, His148,
His173, and His175. An absolute requirement of metal ions for activity has
been proved by enzymatic assay with various divalent metal ions. A panel
of general enzymatic assays of DR1281 revealed metal-dependent catalytic
activity toward model substrates for phosphatases (p-nitrophenyl
phosphate) and phosphodiesterases (bis-p-nitrophenyl phosphate).
Subsequent secondary enzymatic screens with natural substrates
demonstrated significant phosphatase activity toward phosphoenolpyruvate
and phosphodiesterase activity toward 2',3'-cAMP. Thus, our structural and
enzymatic studies have identified the biochemical function of DR1281 as a
novel phosphatase/phosphodiesterase and disclosed key conserved residues
involved in metal binding and catalytic activity. Proteins 2007. (c) 2007
Wiley-Liss, Inc.
2006:
- Chandonia JM, Brenner SE. 2006. The impact of structural genomics: expectations and outcomes. Science 311:347-51. [PDF]|[Supplementary Info]
Structural genomics (SG) projects aim to expand our structural knowledge
of biological macromolecules while lowering the average costs of structure
determination. We quantitatively analyzed the novelty, cost, and impact of
structures solved by SG centers, and we contrast these results with
traditional structural biology. The first structure identified in a
protein family enables inference of the fold and of ancient relationships
to other proteins; in the year ending 31 January 2005, about half of such
structures were solved at a SG center rather than in a traditional
laboratory. Furthermore, the cost of solving a structure at the most
efficient SG center in the United States has dropped to one-quarter of the
estimated cost of solving a structure by traditional methods. However, the
efficiency of the top structural biology laboratories, even though they
work on very challenging structures, is comparable to that of SG centers;
moreover, traditional structural biology papers are cited significantly
more often, suggesting greater current impact.
- Chandonia JM, Kim SH, Brenner SE. 2005. Target selection and deselection at the Berkeley Structural Genomics Center. Proteins 62:356-370. [PDF]|[Supplementary Info]
At the Berkeley Structural Genomics Center (BSGC), our goal is to obtain a
near-complete structural complement of proteins in the minimal organisms
Mycoplasma genitalium and M. pneumoniae, two closely related pathogens.
Current targets for structure determination have been selected in six
major stages, starting with those predicted to be most tractable to high
throughput study and likely to yield new structural information. We report
on the process used to select these proteins, as well as our target
deselection procedure. Target deselection reduces experimental effort by
eliminating targets similar to those recently solved by the structural
biology community or other centers. We measure the impact of the 69
structures solved at the BSGC as of July 2004 on structure prediction
coverage of the M. pneumoniae and M. genitalium proteomes. The number of
Mycoplasma proteins for which the fold could first be reliably assigned
based on structures solved at the BSGC (24 M. pneumoniae and 21 M.
genitalium) is approximately 25% of the total resulting from work at all
structural genomics centers and the worldwide structural biology community
(94 M. pneumoniae and 86 M. genitalium) during the same period. As the
number of structures contributed by the BSGC during that period is less
than 1% of the total worldwide output, the benefits of a focused target
selection strategy are apparent. If the structures of all current targets
were solved, the percentage of M. pneumoniae proteins for which folds
could be reliably assigned would increase from approximately 57% (391 of
687) at present to around 80% (550 of 687), and the percentage of the
proteome that could be accurately modeled would increase from around 37%
(254 of 687) to about 64% (438 of 687). In M. genitalium, the percentage
of the proteome that could be structurally annotated based on structures
of our remaining targets would rise from 72% (348 of 486) to around 76%
(371 of 486), and the percentage of accurately modeled proteins would
rise from 50% (243 of 486) to 58% (283 of 486). Sequences and data on
experimental progress on our targets are available in the public databases
TargetDB and PEPCdb. Proteins 2006. (c) 2005 Wiley-Liss, Inc.
- Chandonia JM, Kim SH. 2006. Structural proteomics of minimal organisms: Conservation of protein fold usage and evolutionary implications. BMC Struct Biol 6:7. [PDF]
ABSTRACT: BACKGROUND: Determining the complete repertoire of protein
structures for all soluble, globular proteins in a single organism has
been one of the major goals of several structural genomics projects in
recent years. RESULTS: We report that this goal has nearly been reached
for several "minimal organisms"--parasites or symbionts with reduced
genomes--for which over 95% of the soluble, globular proteins may now be
assigned folds, overall 3-D backbone structures. We analyze the structures
of these proteins as they relate to cellular functions, and compare
conservation of fold usage between functional categories. We also compare
patterns in the conservation of folds among minimal organisms and those
observed between minimal organisms and other bacteria. CONCLUSION: We find
that proteins performing essential cellular functions closely related to
transcription and translation exhibit a higher degree of conservation in
fold usage than proteins in other functional categories. Folds related to
transcription and translation functional categories were also
overrepresented in minimal organisms compared to other bacteria.
- Kim JS, Shin DH, Pufan R, Huang C, Yokota H, Kim R, Kim SH. 2006. Crystal structure of ScpB from Chlorobium tepidum, a protein involved in chromosome partitioning. Proteins 62:322-8.
Structural maintenance of chromosome (SMC) proteins are essential in
chromosome condensation and interact with non-SMC proteins in eukaryotes
and with segregation and condensation proteins (ScpA and ScpB) in
prokaryotes. The highly conserved gene in Chlorobium tepidum gi 21646405
encodes ScpB (ScpB_ChTe). The high resolution crystal structure of
ScpB_ChTe shows that the monomeric structure consists of two similarly
shaped globular domains composed of three helices sided by beta-strands [a
winged helix-turn-helix (HTH)], a motif observed in the C-terminal domain
of Scc1, a functionally related eukaryotic ScpA homolog, as well as in
many DNA binding proteins.
- Shin DH, Kim JS, Yokota H, Kim R, Kim SH. 2006. Crystal structure of the DUF16 domain of MPN010 from Mycoplasma pneumoniae. Protein Sci 15:921-8.
We have determined the crystal structure of the DUF16 domain of unknown
function encoded by the gene MPN010 of Mycoplasma pneumoniae at 1.8 A
resolution. The crystal structure revealed that this domain is composed of
two separated homotrimeric coiled-coils. The shorter one consists of 11
highly conserved residues. The sequence comprises noncanonical heptad
repeats that induce a right-handed coiled-coil structure. The longer one
is composed of approximately nine heptad repeats. In this coiled-coil
structure, there are three distinguishable regions that confer unique
structural properties compared with other known homotrimeric coiled-coils.
The first part, containing one stutter, is an unusual phenylalanine-rich
region that is not found in any other coiled-coil structures. The second
part is a highly conserved glutamine-rich region, frequently found in
other trimeric coiled-coil structures. The last part is composed of
prototype heptad repeats. The phylogenetic analysis of the DUF16 family
together with a secondary structure prediction shows that the DUF16 family
can be classified into five subclasses according to N-terminal sequences.
Based on the structural comparison with other coiled-coil structures, a
probable molecular function of the DUF16 family is discussed.
- Sims GE, Kim SH. 2006. A method for evaluating the structural quality of protein models by using higher-order phi-psi pairs scoring. Proc Natl Acad Sci U S A 103:4428-32.
A method is presented for scoring the model quality of experimental and
theoretical protein structures. The structural model to be evaluated is
dissected into small fragments via a sliding window, where each fragment
is represented by a vector of multiple phi-psi angles. The sliding window
ranges in size from a length of 1-10 phi-psi pairs (3-12 residues). In
this method, the conformation of each fragment is scored based on the fit
of multiple phi-psi angles of the fragment to a database of multiple
phi-psi angles from high-resolution x-ray crystal structures. We show that
measuring the fit of predicted structural models to the allowed
conformational space of longer fragments is a significant discriminator
for model quality. Reasonable models have higher-order phi-psi score fit
values (m) > -1.00.
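The fragment-dissection step described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `phi_psi_fragments` and `residues_spanned` and the sample angles are hypothetical, and the database-fit scoring itself is not reproduced. The sketch only shows how a sliding window of 1-10 phi-psi pairs turns a chain into fragment vectors, and how a window of k pairs corresponds to k + 2 residues, matching the 3-12 residue range quoted above.

```python
from typing import List, Tuple

Angles = Tuple[float, float]  # one (phi, psi) pair, in degrees


def phi_psi_fragments(pairs: List[Angles], window: int) -> List[Tuple[float, ...]]:
    """Return each contiguous run of `window` phi-psi pairs as one flat vector.

    Hypothetical helper: the abstract describes windows of 1-10 pairs,
    so other sizes are rejected here.
    """
    if not 1 <= window <= 10:
        raise ValueError("window must be between 1 and 10 phi-psi pairs")
    fragments = []
    # Slide the window one pair at a time along the chain.
    for start in range(len(pairs) - window + 1):
        vector = tuple(angle for pair in pairs[start:start + window] for angle in pair)
        fragments.append(vector)
    return fragments


def residues_spanned(window: int) -> int:
    """A window of k phi-psi pairs spans k + 2 residues (1 -> 3, 10 -> 12)."""
    return window + 2


# Made-up angles for a four-residue stretch, for illustration only.
example = [(-60.0, -45.0), (-65.0, -40.0), (-120.0, 130.0), (-75.0, 150.0)]
for fragment in phi_psi_fragments(example, 2):
    print(fragment)
```

Each fragment vector would then be compared against a database of vectors from high-resolution structures; that comparison is outside the scope of this sketch.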
- Smith A, Chandonia JM, Brenner SE. 2006. ANDY: a general, fault-tolerant tool for database searching on computer clusters. Bioinformatics [PDF]|[Supplementary Info]
SUMMARY: ANDY (seArch coordination aND analYsis) is a set of Perl programs
and modules for distributing large biological database searches, and in
general any sequence of commands, across the nodes of a Linux computer
cluster. ANDY is compatible with several commonly used Distributed
Resource Management (DRM) systems, and it can be easily extended to new
DRMs. A distinctive feature of ANDY is the choice of either dedicated or
fair-use operation: ANDY is almost as efficient as single-purpose tools
that require a dedicated cluster, but it runs on a general-purpose cluster
along with any other jobs scheduled by a DRM. Other features include
communication through named pipes for performance, flexible customizable
routines for error-checking and summarizing results, and multiple
fault-tolerance mechanisms. AVAILABILITY: ANDY is freely available and may
be obtained from http://compbio.berkeley.edu/proj/andy; this site also
contains supplemental data and figures and a more detailed overview of the
software.
2005:
- Chandonia JM, Brenner SE. 2005. Implications of structural genomics target selection strategies: Pfam5000, whole genome, and random approaches. Proteins 58:166-79. [PDF]
Structural genomics is an international effort to determine the
three-dimensional shapes of all important biological macromolecules, with
a primary focus on proteins. Target proteins should be selected according
to a strategy that is medically and biologically relevant, of good value,
and tractable. As an option to consider, we present the "Pfam5000"
strategy, which involves selecting the 5000 most important families from
the Pfam database as sources for targets. We compare the Pfam5000 strategy
to several other proposed strategies that would require similar numbers of
targets. These strategies include complete solution of several small to
moderately sized bacterial proteomes, partial coverage of the human
proteome, and random selection of approximately 5000 targets from
sequenced genomes. We measure the impact that successful implementation of
these strategies would have upon structural interpretation of the proteins
in Swiss-Prot, TrEMBL, and 131 complete proteomes (including 10 of
eukaryotes) from the Proteome Analysis database at the European
Bioinformatics Institute (EBI). Solving the structures of proteins from
the 5000 largest Pfam families would allow accurate fold assignment for
approximately 68% of all prokaryotic proteins (covering 59% of residues)
and 61% of eukaryotic proteins (40% of residues). More fine-grained
coverage that would allow accurate modeling of these proteins would
require an order of magnitude more targets. The Pfam5000 strategy may be
modified in several ways, for example, to focus on larger families,
bacterial sequences, or eukaryotic sequences; as long as secondary
consideration is given to large families within Pfam, coverage results
vary only slightly. In contrast, focusing structural genomics on a single
tractable genome would have only a limited impact in structural knowledge
of other proteomes: A significant fraction (about 30-40% of the proteins
and 40-60% of the residues) of each proteome is classified in small
families, which may have little overlap with other species of interest.
Random selection of targets from one or more genomes is similar to the
Pfam5000 strategy in that proteins from larger families are more likely to
be chosen, but substantial effort would be spent on small families.
- Chandonia JM, Brenner SE. 2005. Update on the Pfam5000 Strategy for Selection of Structural Genomics Targets. Proceedings of the 2005 IEEE Engineering in Medicine and Biology 27th Annual Conference, Shanghai, China [PDF]
Structural Genomics is an international effort to
determine the three-dimensional shapes of all important
biological macromolecules, with a primary focus on proteins.
Target proteins should be selected according to a strategy that is
medically and biologically relevant, of good financial value, and
tractable. In 2003, we presented the "Pfam5000" strategy, which
involves selecting the 5,000 most important families from the Pfam
database as sources for targets. In this update, we show that although
both the Pfam database and the number of sequenced genomes have
increased in size, the expected benefits of the Pfam5000 strategy have
not changed substantially. Solving the structures of proteins from
the 5,000 largest Pfam families would allow accurate fold assignment
for approximately 65% of all prokaryotic proteins (covering 54% of
residues) and 63% of eukaryotic proteins (42% of residues). Fewer
than 2,300 of the largest families on this list remain to be solved,
making the project feasible in the next five years given the expected
throughput to be achieved in the production phase of the Protein
Structure Initiative.
- Chen S, Yakunin AF, Proudfoot M, Kim R, Kim SH. 2005. Structural and functional characterization of a 5,10-methenyltetrahydrofolate synthetase from Mycoplasma pneumoniae (GI: 13508087). Proteins 61:433-43.
Mycoplasma pneumoniae 5,10-methenyltetrahydrofolate synthetase [MTHFS;
also known as 5-formyltetrahydrofolate cycloligase; Enzyme Commission (EC)
6.3.3.2] belongs to a large cycloligase protein family with 97 sequence
homologues from bacteria to human. To help define the molecular
(biochemical and biophysical) function of the M. pneumoniae MTHFS, we have
previously determined its crystal structure at 2.2 A resolution (Chen et
al., Proteins 2004;56:839-843). In this current study, activity assays
confirmed the functionality of the recombinant protein, with K(m) = 165
microM for 5-formyltetrahydrofolate (5-FTHF) and K(m) = 166 microM for
MgATP. The methenyltetrahydrofolate activity of M. pneumoniae MTHFS has a
requirement for divalent metal ions with Mg2+ being most effective, and an
absolute requirement for nucleoside 5'-triphosphates with adenosine
triphosphate (ATP) being most effective. Crystallization in the presence
of substrates (MgATP, with or without 5-FTHF) produced the complex
structures of the protein with adenosine diphosphate (ADP) and phosphate
at 2.2 A resolution; with ADP, phosphate, and 5-FTHF at 2.5 A resolution.
These structures directly demonstrated that the role of Mg2+ in the
reaction is to form the ATP--Mg2+-enzyme complex.
- Kim SH, Shin DH, Liu J, Oganesyan V, Chen S, ... Adams PD, Kim R. 2005. Structural genomics of minimal organisms and protein fold space. J Struct Funct Genomics 6:63-70.
The initial aim of the Berkeley Structural Genomics Center is to obtain a
near-complete structural complement of two minimal organisms, closely
related pathogens Mycoplasma genitalium and M. pneumoniae. The former has
fewer than 500 genes and the latter fewer than 700 genes. To achieve this
goal, the current protein targets have been selected starting with those
predicted to be most tractable and likely to yield new structural and
functional information. During the past 3 years, the semi-automated
structural genomics pipeline has been set up from cloning, expression,
purification, and ultimately to structural determination. The results from
the pipeline substantially increased the coverage of the protein fold
space of M. pneumoniae and M. genitalium. Furthermore, about 1/2 of the
structures of 'unique' protein sequences revealed new and novel folds, and
over 2/3 of the structures of previously annotated 'hypothetical proteins'
inferred their molecular functions.
- Kim JS, DeGiovanni A, Jancarik J, Adams PD, Yokota H, Kim R, Kim SH. 2005. Crystal structure of DNA sequence specificity subunit of a type I restriction-modification enzyme and its functional implications. Proc Natl Acad Sci U S A 102:3248-53.
Type I restriction-modification enzymes are differentiated from type II
and type III enzymes by their recognition of two specific dsDNA sequences
separated by a given spacer and cleaving DNA randomly away from the
recognition sites. They are oligomeric proteins formed by three subunits:
a specificity subunit, a methylation subunit, and a restriction subunit.
We solved the crystal structure of a specificity subunit from
Methanococcus jannaschii at 2.4-A resolution. Two highly conserved regions
(CRs) in the middle and at the C terminus form a coiled-coil of long
antiparallel alpha-helices. Two target recognition domains form globular
structures with almost identical topologies and two separate DNA binding
clefts with a modeled DNA helix axis positioned across the CR helices. The
structure suggests that the coiled-coil CRs act as a molecular ruler for
the separation between two recognized DNA sequences. Furthermore, the
relative orientation of the two DNA binding clefts suggests kinking of
bound dsDNA and exposing of target adenines from the recognized DNA
sequences.
- Liu J, Huang C, Shin DH, Yokota H, Jancarik J, ... Kim R, Kim SH. 2005. Crystal structure of a heat-inducible transcriptional repressor HrcA from Thermotoga maritima: structural insight into DNA binding and dimerization. J Mol Biol 350:987-96.
All cells have a defense mechanism against a sudden heat-shock stress.
Commonly, they express a set of proteins that protect cellular proteins
from being denatured by heat. Among them, GroE and DnaK chaperones are
representative defending systems, and their transcription is regulated by
a heat-shock repressor protein HrcA. HrcA repressor controls the
transcription of groE and dnaK operons by binding the palindromic CIRCE
element, presumably as a dimer, and the activity of HrcA repressor is
modulated by GroE chaperones. Here, we report the first crystal structure
of a heat-inducible transcriptional repressor, HrcA, from Thermotoga
maritima at 2.2 A resolution. The Tm_HrcA protein crystallizes as a dimer.
The monomer is composed of three domains: an N-terminal winged
helix-turn-helix domain (WH), a GAF-like domain, and an inserted
dimerizing domain (IDD). The IDD shows a unique structural fold with an
anti-parallel beta-sheet composed of three beta-strands sided by four
alpha-helices. The Tm_HrcA dimer structure is formed through hydrophobic
contact between the IDDs and a limited contact that involves conserved
residues between the GAF-like domains. In the overall dimer structure, the
two WH domains are exposed, but the conformation of these two domains
seems to be incompatible with DNA binding. We suggest that our structure
may represent an inactive form of the HrcA repressor. The structural
implications of how the inactive form of HrcA may be converted to the
active form by GroEL binding to a conserved C-terminal sequence region of
HrcA are discussed.
- Liu J, Lou Y, Yokota H, Adams PD, Kim R, Kim SH. 2005. Crystal structure of a PhoU protein homologue: a new class of metalloprotein containing multinuclear iron clusters. J Biol Chem 280:15960-6.
PhoU proteins are known to play a role in the regulation of phosphate
uptake. In Thermotoga maritima, two PhoU homologues have been identified
bioinformatically. Here we report the crystal structure of one of the PhoU
homologues at 2.0 A resolution. The structure of the PhoU protein
homologue contains a highly symmetric new structural fold composed of two
repeats of a three-helix bundle. The structure unexpectedly revealed a
trinuclear and a tetranuclear iron cluster that were found to be bound on
the surface. Each of the two multinuclear iron clusters is coordinated by
a conserved E(D)XXXD motif pair. Our structure reveals a new class of
metalloprotein containing multinuclear iron clusters. The possible
functional implications based on the structure are discussed.
- Liu J, Lou Y, Yokota H, Adams PD, Kim R, Kim SH. 2005. Crystal structures of an NAD kinase from Archaeoglobus fulgidus in complex with ATP, NAD, or NADP. J Mol Biol 354:289-303.
NAD kinase is a ubiquitous enzyme that catalyzes the phosphorylation of
NAD to NADP using ATP or inorganic polyphosphate (poly(P)) as phosphate
donor, and is regarded as the only enzyme responsible for the synthesis of
NADP. We present here the crystal structures of an NAD kinase from the
archaeal organism Archaeoglobus fulgidus in complex with its phosphate
donor ATP at 1.7 A resolution, with its substrate NAD at 3.05 A
resolution, and with the product NADP in two different crystal forms at
2.45 A and 2.0 A resolution, respectively. In the ATP bound structure, the
AMP portion of the ATP molecule is found to use the same binding site as
the nicotinamide ribose portion of NAD/NADP in the NAD/NADP bound
structures. A magnesium ion is found to be coordinated to the phosphate
tail of ATP as well as to a pyrophosphate group. The conserved GGDG loop
forms hydrogen bonds with the pyrophosphate group in the ATP-bound
structure and the 2' phosphate group of the NADP in the NADP-bound
structures. A possible phosphate transfer mechanism is proposed on the
basis of the structures presented.
- Oganesyan N, Kim SH, Kim R. 2005. On-column protein refolding for crystallization. J Struct Funct Genomics 6:177-82.
One major bottleneck in protein production in Escherichia coli for
structural genomics projects is the formation of insoluble protein
aggregates (inclusion bodies). The efficient refolding of proteins from
inclusion bodies is becoming an important tool that can provide soluble
native proteins for structural and functional studies. Here we report an
on-column refolding method established at the Berkeley Structural Genomics
Center (BSGC). Our method is a combination of an 'artificial
chaperone-assisted refolding' method previously proposed and affinity
chromatography to take advantage of a chromatographic step: less
time-consuming, no filtration or concentration, with the additional
benefit of protein purification. It can be easily automated and formatted
for a high-throughput process.
- Oganesyan V, Huang C, Adams PD, Jancarik J, Yokota HA, Kim R, Kim SH. 2005. Structure of a NAD kinase from Thermotoga maritima at 2.3 A resolution. Acta Crystallograph Sect F Struct Biol Cryst Commun 61:640-6.
NAD kinase is the only known enzyme that catalyzes the formation of NADP,
a coenzyme involved in most anabolic reactions and in the antioxidant
defense system. Despite its importance, very little is known regarding the
mechanism of catalysis and only recently have several NAD kinase
structures been deposited in the PDB. Here, an independent investigation
of the crystal structure of inorganic polyphosphate/ATP-NAD kinase,
PPNK_THEMA, a protein from Thermotoga maritima, is reported at a
resolution of 2.3 A. The crystal structure was solved using
single-wavelength anomalous diffraction (SAD) data collected at the Se
absorption-peak wavelength in a state in which no cofactors or substrates
were bound. It revealed that the 258-amino-acid protein is folded into two
distinct domains, similar to recently reported NAD kinases. The N-terminal
alpha/beta-domain spans the first 100 amino acids and the last 30 amino
acids of the polypeptide and has several topological matches in the PDB,
whereas the other domain, which spans the middle 130 residues, adopts a
unique beta-sandwich architecture and only appreciably matches the
recently deposited PDB structures of NAD kinases.
- Oganesyan V, Oganesyan N, Adams PD, Jancarik J, Yokota HA, Kim R, Kim SH. 2005. Crystal structure of the "PhoU-like" phosphate uptake regulator from Aquifex aeolicus. J Bacteriol 187:4238-44.
The phoU gene of Aquifex aeolicus encodes a protein called PHOU_AQUAE with
sequence similarity to the PhoU protein of Escherichia coli. Despite the
fact that there is a large number of family members (more than 300)
attributed to almost all known bacteria and despite PHOU_AQUAE's
association with the regulation of genes for phosphate metabolism, the
nature of its regulatory function is not well understood. Nearly one-half
of these PhoU-like proteins, including both PHOU_AQUAE and the one from E.
coli, form a subfamily with an apparent dimer structure of two PhoU
domains on the basis of their amino acid sequence. The crystal structure
of PHOU_AQUAE (a 221-amino-acid protein) reveals two similar coiled-coil
PhoU domains, each forming a three-helix bundle. The structures of
PHOU_AQUAE proteins from both a soluble fraction and refolded inclusion
bodies (at resolutions of 2.8 and 3.2A, respectively) showed no
significant differences. The folds of the PhoU domain and Bag domains (for
a class of cofactors of the eukaryotic chaperone Hsp70 family) are
similar. Accordingly, we propose that gene regulation by PhoU may occur by
association of PHOU_AQUAE with the ATPase domain of the histidine kinase
PhoR, promoting release of its substrate PhoB. Other proteins that share
the PhoU domain fold include the coiled-coil domains of the STAT protein,
the ribosome-recycling factor, and structural proteins like spectrin.
- Pajon A, Ionides J, Diprose J, Fillon J, Fogh R, ... Stuart DI, Henrick K. 2005. Design of a data model for developing laboratory information management and analysis systems for protein production. Proteins 58:278-84.
Data management has emerged as one of the central issues in the
high-throughput processes of taking a protein target sequence through to a
protein sample. To simplify this task, and following extensive
consultation with the international structural genomics community, we
describe here a model of the data related to protein production. The model
is suitable for both large and small facilities for use in tracking
samples, experiments, and results through the many procedures involved.
The model is described in Unified Modeling Language (UML). In addition, we
present relational database schemas derived from the UML. These relational
schemas are already in use in a number of data management projects.
- Schulze-Gahmen U, Aono S, Chen S, Yokota H, Kim R, Kim SH. 2005. Structure of the hypothetical Mycoplasma protein MPN555 suggests a chaperone function. Acta Crystallogr D Biol Crystallogr 61:1343-7.
The crystal structure of the hypothetical protein MPN555 from Mycoplasma
pneumoniae (gi|1673958) has been determined to a resolution of 2.8
Angstrom using anomalous diffraction data at the Se-peak wavelength.
Structure determination revealed a mostly alpha-helical protein with a
three-lobed shape. The three lobes or fingers delineate a central binding
groove and additional grooves between lobes 1 and 3 and between lobes 2
and 3. For one of the molecules in the asymmetric unit, the central
binding pocket was filled with a peptide from the uncleaved N-terminal
affinity tag. The MPN555 structure has structural homology to two
bacterial chaperone proteins: SurA and trigger factor from Escherichia
coli. The structural data and the homology to other chaperone proteins
suggest an involvement in protein folding as a molecular chaperone for
MPN555.
- Shin DH, Oganesyan N, Jancarik J, Yokota H, Kim R, Kim SH. 2005. Crystal structure of a nicotinate phosphoribosyltransferase from Thermoplasma acidophilum. J Biol Chem 280:18326-35.
We have determined the crystal structure of nicotinate
phosphoribosyltransferase from Thermoplasma acidophilum (TaNAPRTase). The
TaNAPRTase has three domains, an N-terminal domain, a central functional
domain, and a unique C-terminal domain. The crystal structure revealed
that the functional domain has a type II phosphoribosyltransferase fold
that may be a common architecture for both nicotinic acid and quinolinic
acid (QA) phosphoribosyltransferases (PRTase) despite low sequence
similarity between them. Unlike QAPRTase, TaNAPRTase has a unique extra
C-terminal domain containing a zinc knuckle-like motif containing 4
cysteines. The TaNAPRTase forms a trimer of dimers in the crystal. The
active site pocket is formed at dimer interfaces. The complex structures
with phosphoribosylpyrophosphate (PRPP) and nicotinate mononucleotide
(NAMN) showed, surprisingly, that functional residues lining the active
site of TaNAPRTase are quite different from those of QAPRTase, although
their substrates are quite similar to each other. The phosphate moiety of
PRPP and NAMN is anchored to the phosphate-binding loops formed by
backbone amides, as found in many alpha/beta barrel enzymes. The
pyrophosphate moiety of PRPP is located at the entrance of the active site
pocket, whereas the nicotinate moiety of NAMN is located deep inside.
Interestingly, the nicotinate moiety of NAMN is intercalated between
highly conserved aromatic residues Tyr(21) and Phe(138). Careful
structural analyses combined with other NAPRTase sequence subfamilies
reveal that TaNAPRTase represents a unique sequence subfamily of NAPRTase.
The structures of TaNAPRTase also provide valuable insight for other
sequence subfamilies such as pre-B cell colony-enhancing factor, known to
have nicotinamide phosphoribosyltransferase activity.
- Shin DH, Lou Y, Jancarik J, Yokota H, Kim R, Kim SH. 2005. Crystal structure of TM1457 from Thermotoga maritima. J Struct Biol 152:113-7.
The crystal structure of a hypothetical protein, TM1457, from Thermotoga
maritima has been determined at 2.0A resolution. TM1457 belongs to the
DUF464 family (57 members) for which there is no known function. The
structure shows that it is composed of two helices in contact with one
side of a five-stranded beta-sheet. Two identical monomers form a
pseudo-dimer in the asymmetric unit. There is a large cleft between the
first alpha-helix and the second beta-strand. This cleft may be
functionally important, since the two highly conserved motifs, GHA and
VCAXV(S/T), are located around the cleft. A structural comparison of
TM1457 with known protein structures shows the best hit with another
hypothetical protein, Ybl001C from Saccharomyces cerevisiae, though they
share low structural similarity. Therefore, TM1457 still retains a unique
topology and reveals a novel fold.
- Sims GE, Choi IG, Kim SH. 2005. Protein conformational space in higher order phi-psi maps. Proc Natl Acad Sci U S A
We have mapped protein conformational space from two to seven residue
lengths by employing multidimensional scaling on a data matrix composed of
pair-wise angular distances for multiple phi-psi values collected from
high-resolution protein structures. The resulting global maps show
clustering of peptide conformations that reveals a dramatic reduction of
conformational space as sampled by experimentally observed peptides. Each
map can be viewed as a higher order phi-psi plot defining regions of space
that are conformationally allowed.
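The pair-wise angular distance matrix mentioned in this abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact metric, so the root-sum-of-squares distance over periodic angle differences used here is an assumption, and the function names (`angle_diff`, `conformation_distance`, `distance_matrix`) are hypothetical. The resulting matrix is the kind of input a multidimensional scaling routine would take; the scaling step itself is not shown.

```python
import math

def angle_diff(a, b):
    """Smallest absolute difference between two angles in degrees (0 to 180)."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def conformation_distance(conf_a, conf_b):
    """Distance between two peptide conformations.

    Each conformation is a flat sequence of phi/psi angles in degrees.
    Assumed metric (not taken from the paper): square root of the summed
    squared periodic differences.
    """
    if len(conf_a) != len(conf_b):
        raise ValueError("conformations must have the same number of angles")
    return math.sqrt(sum(angle_diff(a, b) ** 2 for a, b in zip(conf_a, conf_b)))

def distance_matrix(confs):
    """Symmetric matrix of pair-wise distances, suitable as input to MDS."""
    return [[conformation_distance(a, b) for b in confs] for a in confs]
```

Treating the angles as periodic matters here: a naive subtraction would place conformations at +179 and -179 degrees far apart, although they differ by only 2 degrees.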
- Xu QS, Jancarik J, Lou Y, Kuznetsova K, Yakunin AF, ... Kim R, Kim SH. 2005. Crystal structures of a phosphotransacetylase from Bacillus subtilis and its complex with acetyl phosphate. J Struct Funct Genomics 6:269-79.
Phosphotransacetylase (Pta) [EC 2.3.1.8] plays a major role in acetate
metabolism by catalyzing the reversible transfer of the acetyl group
between coenzyme A (CoA) and orthophosphate: CH(3)COSCoA + HPO(4)(2-) <=>
CH(3)COOPO(3)(2-) + CoASH. In this study, we report the crystal
structures of Pta from Bacillus subtilis at 2.75 A resolution and its
complex with acetyl phosphate, one of its substrates, at 2.85 A
resolution. In addition, the Pta activity of the enzyme has been assayed.
The enzyme folds into an alpha/beta architecture with two domains
separated by a prominent cleft, very similar to two other known Pta
structures. The enzyme-acetyl phosphate complex structure reveals a few
potential substrate binding sites. Two of them are located in the middle
of the interdomain cleft: each one is surrounded by a region of strictly
and highly conserved residues. High structural similarities are found with
4-hydroxythreonine-4-phosphate dehydrogenase (PdxA), and isocitrate and
isopropylmalate dehydrogenases, all of which utilize NADP(+) as their
cofactor, which binds in the interdomain cleft. Their substrate binding
sites are close to the acetyl phosphate binding sites of Pta in the cleft
as well. These results suggest that the CoA is likely to bind to the
interdomain cleft of Pta in a similar way as NADP(+) binds to the other
three enzymes.
- Zhang Y, Chandonia JM, Ding C, Holbrook SR. 2005. Comparative mapping of sequence-based and structure-based protein domains. BMC Bioinformatics 6:77. [PDF]
BACKGROUND: Protein domains have long been an ill-defined concept in
biology. They are generally described as autonomous folding units with
evolutionary and functional independence. Both structure-based and
sequence-based domain definitions have been widely used. But whether these
types of models alone can capture all essential features of domains is
still an open question. METHODS: Here we provide insight on domain
definitions through comparative mapping of two domain classification
databases, one sequence-based (Pfam) and the other structure-based (SCOP).
A mapping score is defined to indicate the significance of the mapping,
and the properties of the mapping matrices are studied. RESULTS: The
mapping results show a general agreement between the two databases, as
well as many interesting areas of disagreement. In the cases of
disagreement, the functional and evolutionary characteristics of the
domains are examined to determine which domain definition is biologically
more informative.
2004:
- Busso D, Kim R, Kim SH. 2004. Using an Escherichia coli cell-free extract to screen for soluble expression of recombinant proteins. J Struct Funct Genomics 5:69-74.
For structural and functional genomics programs, new high-throughput
methods to characterize well-expressing and highly soluble proteins are
essential. A faster and more convenient approach to screen expression
conditions of recombinant proteins compared to classical in vivo systems
is the Escherichia coli cell-free expression system. Here, we describe a
rapid procedure to screen for expression and solubility of recombinant
proteins using an E. coli cell-free extract. The results presented cover
24 open reading frames of unknown function from different micro-organisms.
In order to screen different variables that may interfere with solubility,
we expressed the recombinant proteins with a histidine(6) tag, either
N-terminal or C-terminal, at two temperatures (25 °C and 30 °C). The
identification of recombinant proteins is performed by the dot blot
procedure using an anti-histidine tag antibody. We designed a rapid method
that allows the characterization of soluble candidates from a large number
of genes or from a large number of variants that is highly compatible with
structural genomics expectations. Abbreviations: IPTG - isopropyl
beta-D-1-thiogalactopyranoside; Mr - molecular mass; ORF - open reading
frame; PCR - polymerase chain reaction; TBST - Tris-buffered saline Tween;
Tris - tris(hydroxymethyl)aminomethane.
- Card GL, England BP, Suzuki Y, Fong D, Powell B, ... Schlessinger J, Zhang KY. 2004. Structural basis for the activity of drugs that inhibit phosphodiesterases. Structure (Camb) 12:2233-47.
Phosphodiesterases (PDEs) comprise a large family of enzymes that catalyze
the hydrolysis of cAMP or cGMP and are implicated in various diseases. We
describe the high-resolution crystal structures of the catalytic domains
of PDE4B, PDE4D, and PDE5A with ten different inhibitors, including the
drug candidates cilomilast and roflumilast, for respiratory diseases.
These cocrystal structures reveal a common scheme of inhibitor binding to
the PDEs: (i) a hydrophobic clamp formed by highly conserved hydrophobic
residues that sandwich the inhibitor in the active site; (ii) hydrogen
bonding to an invariant glutamine that controls the orientation of
inhibitor binding. A scaffold can be readily identified for any given
inhibitor based on the formation of these two types of conserved
interactions. These structural insights will enable the design of
isoform-selective inhibitors with improved binding affinity and should
facilitate the discovery of more potent and selective PDE inhibitors for
the treatment of a variety of diseases.
- Chandonia JM, Hon G, Walker NS, Lo Conte L, Koehl P, Levitt M, Brenner SE. 2004. The ASTRAL Compendium in 2004. Nucleic Acids Res 32 Database issue:D189-92. [PDF]
The ASTRAL Compendium provides several databases and tools to aid in the
analysis of protein structures, particularly through the use of their
sequences. Partially derived from the SCOP database of protein structure
domains, it includes sequences for each domain and other resources useful
for studying these sequences and domain structures. The current release of
ASTRAL contains 54,745 domains, more than three times as many as the
initial release 4 years ago. ASTRAL has undergone major transformations in
the past 2 years. In addition to several complete updates each year,
ASTRAL is now updated on a weekly basis with preliminary classifications
of domains from newly released PDB structures. These classifications are
available as a stand-alone database, as well as integrated into other
ASTRAL databases such as representative subsets. To enhance the utility of
ASTRAL to structural biologists, all SCOP domains are now made available
as PDB-style coordinate files as well as sequences. In addition to
sequences and representative subsets based on SCOP domains, sequences and
subsets based on PDB chains are newly included in ASTRAL. Several search
tools have been added to ASTRAL to facilitate retrieval of data by
individual users and automated methods. ASTRAL may be accessed at
http://astral.stanford.edu/.
- Chen S, Yakunin AF, Kuznetsova E, Busso D, Pufan R, ... Kim R, Kim SH. 2004. Structural and Functional Characterization of a Novel Phosphodiesterase from Methanococcus jannaschii. J Biol Chem 279:31854-62.
Methanococcus jannaschii MJ0936 is a hypothetical protein of unknown
function with over 50 homologs found in many bacteria and Archaea. To help
define the molecular (biochemical and biophysical) function of MJ0936, we
determined its crystal structure at 2.4-A resolution and performed a
series of biochemical screens for catalytic activity. The overall fold of
this single domain protein consists of a four-layered structure formed by
two beta-sheets flanked by alpha-helices on both sides. The crystal
structure suggested its biochemical function to be a nuclease,
phosphatase, or nucleotidase, with a requirement for some metal ions.
Crystallization in the presence of Ni(2+) or Mn(2+) produced a protein
containing a binuclear metal center in the putative active site formed by
a cluster of conserved residues. Analysis of MJ0936 against a panel of
general enzymatic assays revealed catalytic activity toward
bis-p-nitrophenyl phosphate, an indicator substrate for phosphodiesterases
and nucleases. Significant activity was also found with two other
phosphodiesterase substrates, thymidine 5'-monophosphate p-nitrophenyl
ester and p-nitrophenylphosphorylcholine, but no activity was found for
cAMP or cGMP. Phosphodiesterase activity of MJ0936 had an absolute
requirement for divalent metal ions with Ni(2+) and Mn(2+) being most
effective. Thus, our structural and enzymatic studies have identified the
biochemical function of MJ0936 as that of a novel phosphodiesterase.
- Chen S, Jancarik J, Yokota H, Kim R, Kim SH. 2004. Crystal structure of a protein associated with cell division from Mycoplasma pneumoniae (GI: 13508053): a novel fold with a conserved sequence motif. Proteins 55:785-91.
UPF0040 is a family of proteins implicated in bacterial cell division.
No structural information is available for the proteins of this family.
We have determined the crystal structure of a
protein from Mycoplasma pneumoniae that belongs to this family using X-ray
crystallography. Structural homology search reveals that this protein has
a novel fold with no significant similarity to any proteins of known
three-dimensional structure. The crystal structures of the protein in
three different crystal forms reveal that the protein exists as an
octameric ring. The conserved protein residues, including a highly conserved
DXXXR motif, are examined on the basis of crystal structure.
- Crooks GE, Hon G, Chandonia JM, Brenner SE. 2004. WebLogo: a sequence logo generator. Genome Res 14:1188-90. [PDF]
WebLogo generates sequence logos, graphical representations of the
patterns within a multiple sequence alignment. Sequence logos provide a
richer and more precise description of sequence similarity than consensus
sequences and can rapidly reveal significant features of the alignment
otherwise difficult to perceive. Each logo consists of stacks of letters,
one stack for each position in the sequence. The overall height of each
stack indicates the sequence conservation at that position (measured in
bits), whereas the height of symbols within the stack reflects the
relative frequency of the corresponding amino or nucleic acid at that
position. WebLogo has been enhanced recently with additional features and
options, to provide a convenient and highly configurable sequence logo
generator. A command line interface and the complete, open WebLogo source
code are available for local installation and customization.
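The stack-height rule described in this abstract can be illustrated with a minimal Python sketch. It computes the conservation of one alignment column in bits and splits that height among the symbols by relative frequency. The function name `column_stack` is hypothetical, and the sketch assumes no small-sample correction and no gap handling; it is not WebLogo's own code.

```python
import math
from collections import Counter

def column_stack(column, alphabet_size=4):
    """Return (stack_height_bits, {symbol: symbol_height_bits}) for one column.

    Illustrative helper only (assumed name). alphabet_size is 4 for nucleic
    acids and 20 for proteins. No small-sample correction is applied.
    """
    counts = Counter(column)
    n = sum(counts.values())
    freqs = {symbol: count / n for symbol, count in counts.items()}
    # Shannon entropy of the observed symbol distribution, in bits.
    entropy = -sum(f * math.log2(f) for f in freqs.values())
    # Conservation: maximum possible entropy minus the observed entropy.
    total = math.log2(alphabet_size) - entropy
    # Each symbol's height is its relative frequency times the stack height.
    return total, {symbol: f * total for symbol, f in freqs.items()}
```

Under these assumptions, a fully conserved DNA column such as "AAAA" gives a stack 2 bits high, while a column with all four bases equally represented gives a stack of height 0.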
- Grosse-Kunstleve RW, Sauter NK, Adams PD. 2004. Numerically stable algorithms for the computation of reduced unit cells. Acta Crystallogr A 60:1-6.
The computation of reduced unit cells is an important building block for a
number of crystallographic applications, but unfortunately it is very easy
to demonstrate that the conventional implementation of cell reduction
algorithms is not numerically stable. A numerically stable implementation
of the Niggli-reduction algorithm of Krivy & Gruber [Acta Cryst. (1976),
A32, 297-298] is presented. The stability is achieved by consistently
using a tolerance in all floating-point comparisons. The tolerance must be
greater than the accumulated rounding errors. A second stable algorithm is
also presented, the minimum reduction, that does not require using a
tolerance. It produces a cell with minimum lengths and all angles acute or
obtuse. The algorithm is a simplified and modified version of the
Buerger-reduction algorithm of Gruber [Acta Cryst. (1973), A29, 433-440].
Both algorithms have been enhanced to generate a change-of-basis matrix
along with the parameters of the reduced cell.
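The idea of using a tolerance consistently in all floating-point comparisons can be sketched in a few lines of Python. This is only an illustration of the comparison step, not the Niggli-reduction algorithm itself; the helper name `make_comparisons` and the tolerance value are assumptions chosen for this example.

```python
def make_comparisons(eps):
    """Build tolerance-aware comparison functions sharing one tolerance eps.

    Hypothetical helper for illustration: two values closer than eps are
    treated as equal, so rounding noise cannot flip a branch decision.
    """
    def lt(x, y):
        # x is "less than" y only if it is smaller by more than eps.
        return x < y - eps

    def gt(x, y):
        return lt(y, x)

    def eq(x, y):
        # Equal means neither value is clearly smaller than the other.
        return not (lt(x, y) or lt(y, x))

    return lt, gt, eq

# Example: 0.1 + 0.2 differs from 0.3 only by rounding error.
lt, gt, eq = make_comparisons(1e-9)
naive_equal = (0.1 + 0.2 == 0.3)       # exact comparison
tolerant_equal = eq(0.1 + 0.2, 0.3)    # tolerance-aware comparison
```

Because every branch of an iterative reduction would use the same three functions, a pair of cell parameters that differ only by rounding error is handled identically everywhere, provided the tolerance exceeds the accumulated rounding error as the abstract requires.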
- Jancarik J, Pufan R, Hong C, Kim SH, Kim R. 2004. Optimum solubility (OS) screening: an efficient method to optimize buffer conditions for homogeneity and crystallization of proteins. Acta Crystallogr D Biol Crystallogr 60:1670-3.
One of the most critical steps in the preparation of protein samples for
structural studies by X-ray crystallography is to obtain biochemically
pure and conformationally homogeneous protein samples. Very often, the
purified sample does not meet these qualifications and therefore does not
crystallize. A screening method, Optimum Solubility Screen, has been
developed that consists of two steps. The first step selects a better
buffer than that used during purification. 24 different buffers ranging
from pH 3 to pH 10 are screened using a vapor-diffusion method and very
small amounts of protein. The solubility of the protein is first
determined by visual examination using a light microscope and those drops
that remain clear after 24 h are further evaluated using dynamic light
scattering. If the results from the first step are still not satisfactory,
a second step explores a variety of chemical additives in order to improve
the monodispersity of the protein sample. In 64% of the cases,
crystallization was successful from proteins that had initially shown high
levels of aggregation. This screen can be configured to perform in an
automated high-throughput mode and can be expanded for additional buffers
and additives.
- Nguyen H, Martinez B, Oganesyan N, Kim R. 2004. An automated small-scale protein expression and purification screening provides beneficial information for protein production. J Struct Funct Genomics 5:23-7.
One of the first key steps in structural genomics is high-throughput
expression and rapid screening to select highly soluble proteins, the
preferred candidates for crystal production. Here we describe the
methodology used at the Berkeley Structural Genomics Center (BSGC) for
automated parallel expression and small-scale purification of fusion
proteins using a 96-well format. Our robotic method includes cell lysis,
soluble fraction separation and purification with affinity resins. For
detection of His-tagged proteins in the soluble fractions and after
affinity resin elution, a dot-blot procedure with an anti-His-antibody is
used. The expression level and molecular mass of recombinant proteins are
checked by SDS-PAGE. With this approach, we are able to obtain beneficial
information to be used for large-scale protein expression and
purification.
- Oganesyan V, Pufan R, DeGiovanni A, Yokota H, Kim R, Kim SH. 2004. Structure of the putative DNA-binding protein SP_1288 from Streptococcus pyogenes. Acta Crystallogr D Biol Crystallogr 60:1266-71.
The crystal structure of the putative DNA-binding protein SP_1288
(gi/15675166, also listed as gi/28895954) from Streptococcus pyogenes has
been determined by X-ray crystallography to a resolution of 2.3 A using
anomalous diffraction data at the Se peak wavelength. SP_1288 belongs to a
family of proteins whose cellular function is associated with the signal
recognition particle; no structural information has been available until
now about the members of the family. Crystallographic analysis revealed
that the overall fold of SP_1288 consists exclusively of alpha-helices and
that 75% of the structure has good similarity to domain 4 of the sigma
subunit of RNA polymerase. This suggests its possible involvement in the
biochemical function of transcription initiation, which includes
interaction with DNA.
- Ranatunga W, Hill EE, Mooster JL, Holbrook EL, Schulze-Gahmen U, ... Brenner SE, Holbrook SR. 2004. Structural studies of the Nudix hydrolase DR1025 from Deinococcus radiodurans and its ligand complexes. J Mol Biol 339:103-16.
We have determined the crystal structure, at 1.4A, of the Nudix hydrolase
DR1025 from the extremely radiation resistant bacterium Deinococcus
radiodurans. The protein forms an intertwined homodimer by exchanging
N-terminal segments between chains. We have identified additional
conserved elements of the Nudix fold, including the metal-binding motif, a
kinked beta-strand characterized by a proline two positions upstream of
the Nudix consensus sequence, and participation of the N-terminal
extension in the formation of the substrate-binding pocket. Crystal
structures were also solved of DR1025 crystallized in the presence of
magnesium and either a GTP analog or Ap(4)A (both at 1.6A resolution). In
the Ap(4)A co-crystal, the electron density indicated that the product of
asymmetric hydrolysis, ATP, was bound to the enzyme. The GTP analog bound
structure showed that GTP was bound almost identically to ATP. Neither
nucleoside triphosphate was further cleaved.
- Sauter NK, Grosse-Kunstleve RW, Adams PD. 2004. Robust indexing for automatic data collection. Journal of Applied Crystallography 37:399-409.
Improved methods for indexing diffraction patterns from
macromolecular crystals are presented. The novel procedures
include a more robust way to verify the position of the incident
X-ray beam on the detector, an algorithm to verify that the deduced
lattice basis is consistent with the observations, and an alternative
approach to identify the metric symmetry of the lattice. These methods
help to correct failures commonly experienced during indexing, and
increase the overall success rate of the process. Rapid indexing,
without the need for visual inspection, will play an important role as
beamlines at synchrotron sources prepare for high-throughput
automation.
- Shi J, Pelton JG, Cho HS, Wemmer DE. 2004. Protein signal assignments using specific labeling and cell-free synthesis. J Biomol NMR 28:235-47.
The goal of structural genomics initiatives is to determine complete sets
of protein structures that represent recently sequenced genomes. The
development of new high throughput methods is an essential aspect of this
enterprise. Residue type and sequential assignments obtained from
specifically labeled samples, when combined with 3D heteronuclear data,
can significantly increase the efficiency and accuracy of the assignment
process, the first step in structure determination by NMR. A protocol for
the design of specifically labeled samples with high information content
is presented along with a description of the experiments used to extract
essential information using 2D versions of 3D heteronuclear experiments.
In vitro protein synthesis methods were used to produce four specifically
labeled samples of the 23.5 kDa protein phosphoserine phosphatase (PSP)
from Methanococcus jannaschii (MJ1594). Each sample contained two
(13)C/(15)N-labeled amino acids and one (15)N-labeled amino acid. The 135
type and 14 sequential assignments obtained from these samples were used
in conjunction with 3D data obtained from uniformly (13)C/(15)N-labeled
and (2)H/(13)C/(15)N-labeled protein to manually assign the backbone
(1)H(N), (15)N, (13)CO, (13)C(alpha), and (13)C(beta) signals. Using an
automated assignment algorithm, 30% more assignments were obtained when
the type and sequential assignments were used in the calculations.
- Shin DH, Choi IG, Busso D, Jancarik J, Yokota H, Kim R, Kim SH. 2004. Structure of OsmC from Escherichia coli: a salt-shock-induced protein. Acta Crystallogr D Biol Crystallogr 60:903-11. [TXT]
The crystal structure of an osmotically inducible protein (OsmC) from
Escherichia coli has been determined at 2.4 A resolution. OsmC is a
representative protein of the OsmC sequence family, which is composed of
three sequence subfamilies. The structure of OsmC provides a view of a
salt-shock-induced protein. Two identical monomers form a cylindrically
shaped dimer in which six helices are located on the inside and two
six-stranded beta-sheets wrap around these helices. Structural comparison
suggests that the OsmC sequence family has a peroxiredoxin function and
has a unique structure compared with other peroxiredoxin families. A
detailed analysis of structures and sequence comparisons in the OsmC
sequence family revealed that each subfamily has unique motifs. In
addition, the molecular function of the OsmC sequence family is discussed
based on structural comparisons among the subfamily members.
- Shin DH, Brandsen J, Jancarik J, Yokota H, Kim R, Kim SH. 2004. Structural analyses of peptide release factor 1 from Thermotoga maritima reveal domain flexibility required for its interaction with the ribosome. J Mol Biol 341:227-39.
We have determined the crystal structure of peptide chain release factor 1
(RF1) from Thermotoga maritima (gi 4981173) at 2.65 Angstrom resolution by
selenomethionine single-wavelength anomalous dispersion (SAD) techniques.
RF1 is a protein that recognizes stop codons and promotes the release of a
nascent polypeptide from tRNA on the ribosome. Selenomethionine-labeled
RF1 crystallized in space group P2(1) with three monomers per asymmetric
unit. It has approximate dimensions of 75 Angstrom x 70 Angstrom x 45
Angstrom and is composed of four domains. The overall fold of each RF1
domain shows almost the same topology as Escherichia coli RF2, except
that the RF1 N-terminal domain is shorter and the C-terminal domain is
longer than that of RF2. The N-terminal domain of RF1 indicates a
rigid-body movement relative to that of RF2 with an angle of approximately
90 degrees. Including these features, RF1 has a tripeptide anticodon PVT
motif instead of the SPF motif of RF2, which confers the specificity
towards the stop codons. The analyses of three molecules in the asymmetric
unit and comparison with RF2 revealed the presence of dynamic movement of
domains I and III, which are anchored to the central domain by hinge
loops. The crystal structure of RF1 elucidates the intrinsic property of
this family of having large domain movements for proper function with the
ribosome.
- Shin DH, Lou Y, Jancarik J, Yokota H, Kim R, Kim SH. 2004. Crystal structure of YjeQ from Thermotoga maritima contains a circularly permuted GTPase domain. Proc Natl Acad Sci U S A 101:13198-203.
We have determined the crystal structure of the GDP complex of the YjeQ
protein from Thermotoga maritima (TmYjeQ), a member of the YjeQ GTPase
subfamily. TmYjeQ, a homologue of Escherichia coli YjeQ, which is known
to bind to the ribosome, is composed of three domains: an N-terminal
oligonucleotide/oligosaccharide-binding fold domain, a central GTPase
domain, and a C-terminal zinc-finger domain. The crystal structure of
TmYjeQ reveals two interesting domains: a circularly permuted GTPase
domain and an unusual zinc-finger domain. The binding mode of GDP in the
GTPase domain of TmYjeQ is similar to those of GDP or GTP analogs in ras
proteins, a prototype GTPase. The N-terminal
oligonucleotide/oligosaccharide-binding fold domain, together with the
GTPase domain, forms the extended RNA-binding site. The C-terminal domain
has an unusual zinc-finger motif composed of Cys-250, Cys-255, Cys-263,
and His-257, with a remote structural similarity to a portion of a
DNA-repair protein, rad51 fragment. The overall structural features of
TmYjeQ make it a good candidate for an RNA-binding protein, which is
consistent with the biochemical data of the YjeQ subfamily in binding to
the ribosome.
- Snell G, Cork C, Nordmeyer R, Cornell E, Meigs G, ... Stevens RC, Earnest T. 2004. Automated sample mounting and alignment system for biological crystallography at a synchrotron source. Structure (Camb) 12:537-45.
High-throughput data collection for macromolecular crystallography
requires an automated sample mounting and alignment system for
cryo-protected crystals that functions reliably when integrated into
protein-crystallography beamlines at synchrotrons. Rapid mounting and
dismounting of the samples increases the efficiency of the crystal
screening and data collection processes, where many crystals can be tested
for the quality of diffraction. The sample-mounting subsystem has random
access to 112 samples, stored under liquid nitrogen. Results of extensive
tests regarding the performance and reliability of the system are
presented. To further increase throughput, we have also developed a sample
transport/storage system based on "puck-shaped" cassettes, which can hold
sixteen samples each. Seven cassettes fit into a standard dry shipping
Dewar. The capabilities of a robotic crystal mounting and alignment system
with instrumentation control software and a relational database allow
automated screening and data collection to be developed.
- Zhang KY, Card GL, Suzuki Y, Artis DR, Fong D, ... Schlessinger J, Bollag G. 2004. A glutamine switch mechanism for nucleotide selectivity by phosphodiesterases. Mol Cell 15:279-86.
Phosphodiesterases (PDEs) comprise a family of enzymes that modulate the
immune response, inflammation, and memory, among many other functions.
There are three types of PDEs: cAMP-specific, cGMP-specific, and
dual-specific. Here we describe the mechanism of nucleotide selectivity on
the basis of high-resolution co-crystal structures of the cAMP-specific
PDE4B and PDE4D with AMP, the cGMP-specific PDE5A with GMP, and the
apo-structure of the dual-specific PDE1B. These structures show that an
invariant glutamine functions as the key specificity determinant by a
"glutamine switch" mechanism for recognizing the purine moiety in cAMP or
cGMP. The surrounding residues anchor the glutamine residue in different
orientations for cAMP and for cGMP. The PDE1B structure shows that in
dual-specific PDEs a key histidine residue may enable the invariant
glutamine to toggle between cAMP and cGMP. The structural understanding of
nucleotide binding enables the design of new PDE inhibitors that may treat
diseases in which cyclic nucleotides play a critical role.
2003:
- Busso D, Kim R, Kim SH. 2003. Expression of soluble recombinant proteins in a cell-free system using a 96-well format. J Biochem Biophys Methods 55:233-40.
For structural and functional genomics programs, new high-throughput
methods to obtain well-expressing and highly soluble proteins are
essential. Here, we describe a rapid procedure to express recombinant
proteins in an Escherichia coli cell-free system using a 96-well format.
The identification of soluble proteins is performed by the Dot Blot
procedure using an anti-His tag antibody. The applications and the
automation of this method are described.
- Hou J, Sims GE, Zhang C, Kim SH. 2003. A global representation of the protein fold space. Proc Natl Acad Sci U S A 100:2386-90.
One of the principal goals of the structural genomics initiative is to
identify the total repertoire of protein folds and obtain a global view of
the "protein structure universe." Here, we present a 3D map of the protein
fold space in which structurally related folds are represented by
spatially adjacent points. Such a representation reveals a high-level
organization of the fold space that is intuitively interpretable. The
shape of the fold space and the overall distribution of the folds are
defined by three dominant trends: secondary structure class, chain
topology, and protein domain size. Random coil-like structures of small
proteins and peptides are mapped to a region where the three trends
converge, offering an interesting perspective on both the demography of
fold space and the evolution of protein structures.
- Kim SH, Shin DH, Choi IG, Schulze-Gahmen U, Chen S, Kim R. 2003. Structure-based functional inference in structural genomics. J Struct Funct Genomics 4:129-35.
The dramatically increasing number of new protein sequences arising from
genomics and proteomics creates a need for methods to rapidly and
reliably infer the molecular and cellular functions of these proteins. One
such approach, structural genomics, aims to delineate the total repertoire
of protein folds in nature, thereby providing three-dimensional folding
patterns for all proteins and to infer molecular functions of the proteins
based on the combined information of structures and sequences. The goal of
obtaining protein structures on a genomic scale has motivated the
development of high throughput technologies and protocols for
macromolecular structure determination that have begun to produce
structures at a greater rate than previously possible. These new
structures have revealed many unexpected functional inferences and
evolutionary relationships that were hidden at the sequence level. Here,
we present samples of structures determined at Berkeley Structural
Genomics Center and collaborators' laboratories to illustrate how
structural information provides and complements sequence information to
deduce the functional inferences of proteins with unknown molecular
functions. Two of the major premises of structural genomics are to
discover a complete repertoire of protein folds in nature and to find
molecular functions of the proteins whose functions are not predicted from
sequence comparison alone. To achieve these objectives on a genomic scale,
new methods, protocols, and technologies need to be developed by
multi-institutional collaborations worldwide. As part of this effort, the
Protein Structure Initiative has been launched in the United States (PSI;
www.nigms.nih.gov/funding/psi.html). Although infrastructure building and
technology development are still the main focus of structural genomics
programs, a considerable number of protein structures have already been
produced, some of them coming directly out of semiautomated structure
determination pipelines. The Berkeley Structural Genomics Center (BSGC)
has focused on the proteins of Mycoplasma or their homologues from other
organisms as its structural genomics targets because of the minimal genome
size of the Mycoplasmas as well as their relevance to human and animal
pathogenicity (http://www.strgen.org). Here we present several protein
examples encompassing a spectrum of functional inferences obtainable from
their three-dimensional structures in five situations, where the
inferences are new and testable, and are not predictable from protein
sequence information alone.
- Kim R, Lai L, Lee HH, Cheong GW, Kim KK, ... Marqusee S, Kim SH. 2003. On the mechanism of chaperone activity of the small heat-shock protein of Methanococcus jannaschii. Proc Natl Acad Sci U S A 100:8151-5.
The small heat-shock protein (sHSP) from Methanococcus jannaschii (Mj
HSP16.5) forms a homomeric complex of 24 subunits and has an overall
structure of a multiwindowed hollow sphere with an external diameter of
approximately 120 A and an internal diameter of approximately 65 A with
six square "windows" of approximately 17 A across and eight triangular
windows of approximately 30 A across. This sHSP has been known to protect
other proteins from thermal denaturation. Using purified single-chain
monellin as a substrate and a series of methods such as protease
digestion, antibody binding, and electron microscopy, we show that the
substrates bind to Mj HSP16.5 at a high temperature (80 degrees C) on the
outside surface of the sphere and are prevented from forming insoluble
substrate aggregates in vitro. Circular dichroism studies suggest that
very little, if any, conformational change occurs in sHSP even at 80
degrees C, but substantial conformational changes of the substrate are
required for complex formation at 80 degrees C. Furthermore, deletion
mutation studies of Mj HSP16.5 suggest that the N-terminal region of the
protein has no structural role but may play an important kinetic role in
the assembly of the sphere by "preassembly condensation" of multiple
monomers before final assembly of the sphere.
- Moshinsky DJ, Bellamacina CR, Boisvert DC, Huang P, Hui T, ... Kim SH, Rice AG. 2003. SU9516: biochemical analysis of cdk inhibition and crystal structure in complex with cdk2. Biochem Biophys Res Commun 310:1026-31.
SU9516 is a 3-substituted indolinone compound with demonstrated potent and
selective inhibition toward cyclin dependent kinases (cdks). Here, we
describe the kinetic characterization of this inhibition with respect to
cdk2, 1, and 4, along with the crystal structure in complex with cdk2. The
molecule is competitive with respect to ATP for cdk2/cyclin A, with a K(i)
value of 0.031 microM. Similarly, SU9516 inhibits cdk2/cyclin E and
cdk1/cyclin B1 in an ATP-competitive manner, although at a 2- to 8-fold
reduced potency. In contrast, the compound exhibited non-competitive
inhibition with respect to ATP toward cdk4/cyclin D1, with a 45-fold
reduced potency. The X-ray crystal structure of SU9516 bound to cdk2
revealed interactions between the molecule and Leu83 and Glu81 of the
kinase. This study should aid in the development of more potent and
selective cdk inhibitors for potential therapeutic agents.
- Oganesyan V, Busso D, Brandsen J, Chen S, Jancarik J, Kim R, Kim SH. 2003. Structure of the hypothetical protein AQ_1354 from Aquifex aeolicus. Acta Crystallogr D Biol Crystallogr 59:1219-23.
The crystal structure of a hypothetical protein AQ_1354 (gi 2983779) from
the hyperthermophilic bacterium Aquifex aeolicus has been determined using
X-ray crystallography. As found in many structural genomics studies, this
protein is not associated with any known function based on its amino-acid
sequence. PSI-BLAST analysis against a non-redundant sequence database
gave 68 similar sequences referred to as 'conserved hypothetical proteins'
from the uncharacterized protein family UPF0054 (accession No. PF02310).
Crystallographic analysis revealed that the overall fold of this protein
consists of one central alpha-helix surrounded by a four-stranded
beta-sheet and four other alpha-helices. Structure-based homology analysis
with DALI revealed that the structure has a moderate to good resemblance
to metal-dependent proteinases such as collagenases and gelatinases, thus
suggesting its possible molecular function. However, experimental tests
for collagenase and gelatinase-type function show no detectable activity
under standard assay conditions. Therefore, we suggest either that the
members of the UPF0054 family have a similar fold but different
biochemical functions to those of collagenases and gelatinases or that
they have a similar function but perform it under different conditions.
- Rubin SM, Pelton JG, Yokota H, Kim R, Wemmer DE. 2003. Solution structure of a putative ribosome binding protein from Mycoplasma pneumoniae and comparison to a distant homolog. J Struct Funct Genomics 4:235-43. [TXT]
The solution structure of MPN156, a ribosome-binding factor A (RBFA)
protein family member from Mycoplasma pneumoniae, is presented. The
structure, solved by nuclear magnetic resonance, has a type II KH fold
typical of RNA binding proteins. Despite only approximately 20% sequence
identity between MPN156 and another family member from Escherichia coli,
the two proteins have high structural similarity. The comparison
demonstrates that many of the conserved residues correspond to conserved
elements in the structures. Compared to a structure based alignment,
standard alignment methods based on sequence alone mispair a majority of
amino acids in the two proteins. Implications of these discrepancies for
sequence based structural modeling are discussed.
- Schulze-Gahmen U, Pelaschier J, Yokota H, Kim R, Kim SH. 2003. Crystal structure of a hypothetical protein, TM841 of Thermotoga maritima, reveals its function as a fatty acid-binding protein. Proteins 50:526-30.
We determined the three-dimensional (3D) crystal structure of protein
TM841, a protein product from a hypothetical open-reading frame in the
genome of the hyperthermophile bacterium Thermotoga maritima, to 2.0 A
resolution. The protein belongs to a large protein family, DegV or COG1307
of unknown function. The 35 kDa protein consists of two separate domains,
with low-level structural resemblance to domains from other proteins with
known 3D structures. These structural homologies, however, provided no
clues for the function of TM841. But the electron density maps revealed
clear density for a bound fatty-acid molecule in a pocket between the two
protein domains. The structure indicates that TM841 has the molecular
function of fatty-acid binding and may play a role in the cellular
functions of fatty acid transport or metabolism.
- Shin DH, Nguyen HH, Jancarik J, Yokota H, Kim R, Kim SH. 2003. Crystal structure of NusA from Thermotoga maritima and functional implication of the N-terminal domain. Biochemistry 42:13429-37.
We report the crystal structure of N-utilizing substance A protein (NusA)
from Thermotoga maritima (TmNusA), a protein involved in transcriptional
pausing, termination, and antitermination. TmNusA has an elongated
rod-shaped structure consisting of an N-terminal domain (NTD, residues
1-132) and three RNA binding domains (RBD). The NTD consists of two
subdomains, the globular head and the helical body domains, that comprise
a unique three-dimensional structure that may be important for interacting
with RNA polymerase. The globular head domain possesses a high content of
negatively charged residues that may interact with the positively charged
flaplike domain of RNA polymerase. The helical body domain is composed of
a three-helix bundle that forms a hydrophobic core with the aid of two
neighboring beta-strands. This domain shows structural similarity with one
of the helical domains of sigma(70) factor from Escherichia coli. One side
of the molecular surface shows positive electrostatic potential suitable
for nonspecific RNA interaction. The RBD is composed of one S1 domain and
two K-homology (KH) domains forming an elongated RNA binding surface.
Structural comparison between TmNusA and Mycobacterium tuberculosis NusA
reveals a possible hinge motion between NTD and RBD. In addition, a
functional implication of the NTD in its interaction with RNA polymerase
is discussed.
- Shin DH, Roberts A, Jancarik J, Yokota H, Kim R, Wemmer DE, Kim SH. 2003. Crystal structure of a phosphatase with a unique substrate binding domain from Thermotoga maritima. Protein Sci 12:1464-72.
We have determined the crystal structure of a phosphatase with a unique
substrate binding domain from Thermotoga maritima, TM0651 (gi 4981173), at
2.2 A resolution by selenomethionine single-wavelength anomalous
diffraction (SAD) techniques. TM0651 is a member of the haloacid
dehalogenase (HAD) superfamily, with sequence homology to
trehalose-6-phosphate phosphatase and sucrose-6(F)-phosphate
phosphohydrolase. Selenomethionine labeled TM0651 crystallized in space
group C2 with three monomers per asymmetric unit. Each monomer has
approximate dimensions of 65 x 40 x 35 A(3), and contains two domains: a
domain of known hydrolase fold characteristic of the HAD family, and a
domain with a new tertiary fold consisting of a six-stranded beta-sheet
surrounded by four alpha-helices. There is one disulfide bond between
residues Cys35 and Cys265 in each monomer. One magnesium ion and one
sulfate ion are bound in the active site. The superposition of active site
residues with other HAD family members indicates that TM0651 is very
likely a phosphatase that acts through the formation of a phosphoaspartate
intermediate, which is supported by both NMR titration data and a
biochemical assay. Structural and functional database searches and the
presence of many aromatic residues in the interface of the two domains
suggest the substrate of TM0651 is a carbohydrate molecule. From the
crystal structure and NMR data, the protein likely undergoes a
conformational change upon substrate binding.
- Sims GE, Kim SH. 2003. Global mapping of nucleic acid conformational space: dinucleoside monophosphate conformations and transition pathways among conformational classes. Nucleic Acids Res 31:5607-16.
A global conformational space of 6253 dinucleoside monophosphate (DMP)
units consisting of RNA and DNA (free and protein/drug-bound) was 'mapped'
using high resolution crystal structures cataloged in the Nucleic Acid
Database (NDB). The torsion angles of each DMP were clustered in a reduced
three-dimensional space using a classical multi-dimensional scaling
method. The mapping of the conformational space reveals nine primary
clusters which distinguish among the common A-, B- and Z-forms and their
various substates, plus five secondary clusters for kinked or bent
structures. Conformational relationships and possible transitional
pathways among the substates are also examined using the conformational
states of DNA and RNA bound with proteins or drugs as potential pathway
intermediates.
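As an illustration of the clustering step described above, the following is a minimal sketch of classical multi-dimensional scaling applied to torsion angles. It is not the authors' code: the function names, the use of NumPy, the periodic distance measure, and the example angle values are all assumptions made for this sketch, since the abstract does not say how distances between DMP units were defined.

```python
import numpy as np

def circular_distance_matrix(angles_deg):
    """Pairwise distances between rows of torsion angles given in degrees.

    Each per-angle difference is wrapped so that 350 and 10 degrees are
    20 degrees apart (an assumed, periodicity-aware distance).
    """
    a = np.asarray(angles_deg, dtype=float)
    diff = np.abs(a[:, None, :] - a[None, :, :]) % 360.0
    diff = np.minimum(diff, 360.0 - diff)
    return np.sqrt((diff ** 2).sum(axis=2))

def classical_mds(dist, n_components=3):
    """Classical (Torgerson) MDS: embed points so that Euclidean distances
    approximate the entries of the distance matrix `dist`."""
    d2 = np.asarray(dist, dtype=float) ** 2
    n = d2.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ d2 @ centering       # double-centered Gram matrix
    eigvals, eigvecs = np.linalg.eigh(gram)        # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    top_vals = np.clip(eigvals[order], 0.0, None)  # drop small negative eigenvalues
    return eigvecs[:, order] * np.sqrt(top_vals)

# Hypothetical torsion-angle rows (degrees); the values are invented for the sketch.
example_angles = [
    [295.0, 175.0, 55.0, 80.0, 205.0, 290.0, 200.0],
    [300.0, 180.0, 50.0, 85.0, 210.0, 285.0, 195.0],
    [298.0, 176.0, 48.0, 140.0, 185.0, 265.0, 250.0],
    [60.0, 190.0, 180.0, 145.0, 260.0, 75.0, 65.0],
]
coords = classical_mds(circular_distance_matrix(example_angles), n_components=3)
```

Each row of coords is a point in the reduced three-dimensional space; any standard clustering method could then be applied to those points.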
- Zhang C, Kim SH. 2003. Overview of structural genomics: from structure to function. Curr Opin Chem Biol 7:28-32.
The unprecedented increase in the number of new protein sequences arising
from genomics and proteomics highlights directly the need for methods to
rapidly and reliably determine the molecular and cellular functions of
these proteins. One such approach, structural genomics, aims to delineate
the total repertoire of protein folds, thereby providing three-dimensional
portraits for all proteins in a living organism and to infer molecular
functions of the proteins. The goal of obtaining protein structures on a
genomic scale has motivated the development of high-throughput
technologies for macromolecular structure determination, which have begun
to produce structures at a greater rate than previously possible. These
new structures have revealed many unexpected functional and evolutionary
relationships that were hidden at the sequence level.
2002:
- Chandonia JM, Walker NS, Lo Conte L, Koehl P, Levitt M, Brenner SE. 2002. ASTRAL compendium enhancements. Nucleic Acids Res 30:260-3. [PDF]
The ASTRAL compendium provides several databases and tools to aid in the
analysis of protein structures, particularly through the use of their
sequences. It is partially derived from the SCOP database of protein
domains, and it includes sequences for each domain as well as other
resources useful for studying these sequences and domain structures.
Several major improvements have been made to the ASTRAL compendium since
its initial release 2 years ago. The number of protein domain sequences
included has doubled from 15 190 to 30 867, and additional databases have
been added. The Rapid Access Format (RAF) database contains manually
curated mappings linking the biological amino acid sequences described in
the SEQRES records of PDB entries to the amino acid sequences structurally
observed (provided in the ATOM records) in a format designed for rapid
access by automated tools. This information is used to derive sequences
for protein domains in the SCOP database. In cases where a SCOP domain
spans several protein chains, all of which can be traced back to a single
genetic source, a 'genetic domain' sequence is created by concatenating
the sequences of each chain in the order found in the original gene
sequence. Both the original-style library of SCOP sequences and a new
library including genetic domain sequences are available. Selected
representative subsets of each of these libraries, based on multiple
criteria and degrees of similarity, are also included. ASTRAL may be
accessed at http://astral.stanford.edu/.
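The 'genetic domain' construction described above can be pictured with a short sketch. This is not ASTRAL code: the ChainSegment type, its gene_start field, and the example sequences are hypothetical, and it assumes the position of each chain in the original gene is already known.

```python
from typing import NamedTuple

class ChainSegment(NamedTuple):
    chain_id: str     # PDB chain identifier
    gene_start: int   # hypothetical: position of the chain's first residue in the gene product
    sequence: str     # amino acid sequence of the chain within the domain

def genetic_domain_sequence(segments):
    """Concatenate chain sequences in the order they occur in the original gene."""
    ordered = sorted(segments, key=lambda s: s.gene_start)
    return "".join(s.sequence for s in ordered)

# Two chains of one domain, listed here out of gene order (invented sequences).
segments = [
    ChainSegment("B", 31, "GIVEQ"),
    ChainSegment("A", 1, "FVNQH"),
]
genetic_sequence = genetic_domain_sequence(segments)
```

Sorting on gene position rather than on chain identifier matters because chain letters in a PDB entry need not follow the order of the gene.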
- Grosse-Kunstleve RW, Adams PD. 2002. Algorithms for deriving crystallographic space-group information. II. Treatment of special positions. Acta Crystallogr A 58:60-5.
Algorithms for the treatment of special positions in three-dimensional
crystallographic space groups are presented. These include an algorithm
for the determination of the site-symmetry group given the coordinates of
a point, an algorithm for the determination of the exact location of the
nearest special position, an algorithm for the assignment of a Wyckoff
letter given the site-symmetry group, and an alternative algorithm for the
assignment of a Wyckoff letter given the coordinates of a point directly.
All algorithms are implemented in ISO C++ and are integrated into the
Computational Crystallography Toolbox. The source code is freely
available.
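To make the first of these tasks concrete, here is a minimal sketch of finding the site-symmetry operations for a point. It is not the published algorithm and not the Computational Crystallography Toolbox API: it only applies the definition (an operation belongs to the site-symmetry group if it maps the point onto itself up to a lattice translation), it uses exact fractional coordinates instead of the tolerance handling a real implementation needs, and all names are hypothetical.

```python
from fractions import Fraction

def apply_op(rot, trans, point):
    """Apply a symmetry operation x' = R x + t in fractional coordinates."""
    return tuple(
        sum(rot[i][j] * point[j] for j in range(3)) + trans[i]
        for i in range(3)
    )

def site_symmetry(ops, point):
    """Return the operations that leave `point` fixed modulo lattice translations."""
    point = tuple(Fraction(c) for c in point)
    fixed = []
    for rot, trans in ops:
        image = apply_op(rot, trans, point)
        # The point is fixed if every coordinate changes by a whole number of cells.
        if all((image[i] - point[i]).denominator == 1 for i in range(3)):
            fixed.append((rot, trans))
    return fixed

# Space group P-1 as a small example: identity and inversion through the origin.
IDENTITY = ((1, 0, 0), (0, 1, 0), (0, 0, 1))
INVERSION = ((-1, 0, 0), (0, -1, 0), (0, 0, -1))
ZERO = (0, 0, 0)
P1BAR = [(IDENTITY, ZERO), (INVERSION, ZERO)]

at_inversion_centre = site_symmetry(P1BAR, (Fraction(1, 2), 0, 0))
at_general_position = site_symmetry(P1BAR, (Fraction(1, 10), Fraction(1, 5), Fraction(3, 10)))
```

In this example the point (1/2, 0, 0) lies on an inversion centre and keeps both operations, while the general position keeps only the identity.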
- Huang L, Hung L, Odell M, Yokota H, Kim R, Kim SH. 2002. Structure-based experimental confirmation of biochemical function to a methyltransferase, MJ0882, from hyperthermophile Methanococcus jannaschii. J Struct Funct Genomics 2:121-7.
We have determined the three-dimensional (3-D) structure of protein
MJ0882, which derives from a hypothetical open reading frame in the genome
of the hyperthermophile Methanococcus jannaschii. The 3-D fold of MJ0882
at 1.8 A highly resembles that of a methyltransferase, despite limited
sequence similarity to any confirmed methyltransferase. The structure has
an S-adenosylmethionine (AdoMet) binding pocket surrounded by motifs with
similarities to those commonly found among AdoMet binding proteins.
Preliminary biochemical experiments show that MJ0882 specifically binds to
AdoMet, which is the essential co-factor for methyltransferases.
- Kim SH, Wang W, Kim KK. 2002. Dynamic and clustering model of bacterial chemotaxis receptors: structural basis for signaling and high sensitivity. Proc Natl Acad Sci U S A 99:11611-5.
Bacterial chemotaxis receptors can detect a small concentration gradient
of attractants and repellents in the environment over a wide range of
background concentration. The clustering of these receptors to form
patches observed in vivo and in vitro has been suspected as a reason for
the high sensitivity, and such wide dynamic range is thought to be due to
the resetting of the receptor sensitivity threshold by
methylation/demethylation of the receptors. However, the mechanisms by
which such high sensitivity is achieved and how the
methylation/demethylation resets the sensitivity are not well understood.
Molecular modeling of an intact bacterial chemotaxis receptor based on
the crystal structures of a cytoplasmic domain and a periplasmic domain
suggests an interesting clustering of three dimeric receptors and a
two-dimensional, close-packed lattice formation of the clusters, where
each receptor dimer contacts two other receptor dimers at the cytoplasmic
domain and two yet different receptor dimers at the periplasmic domain.
This interconnection of the receptors to form a patch of receptor clusters
suggests a structural basis for the high sensitivity of the bacterial
chemotaxis receptors. Furthermore, we present crystallographic data
suggesting that, in contrast to most molecular signaling by conformational
changes and/or oligomerization of the signaling molecules, the changes in
dynamic property of the receptors on ligand binding or methylation may be
the language of the signaling by the chemotaxis receptors. Taken together,
the changes of the dynamic property of one receptor propagating
mechanically to many others in the receptor patch provide a plausible,
simple mechanism for the high sensitivity and the dynamic range of the
receptors.
- Martinez-Cruz LA, Dreyer MK, Boisvert DC, Yokota H, Martinez-Chantar ML, Kim R, Kim SH. 2002. Crystal structure of MJ1247 protein from M. jannaschii at 2.0 A resolution infers a molecular function of 3-hexulose-6-phosphate isomerase. Structure (Camb) 10:195-204.
The crystal structure of the hypothetical protein MJ1247 from
Methanococcus jannaschii at 2 A resolution, a detailed sequence analysis,
and biochemical assays infer its molecular function to be
3-hexulose-6-phosphate isomerase (PHI). In the dissimilatory ribulose
monophosphate (RuMP) cycle, ribulose-5-phosphate is coupled to
formaldehyde by the 3-hexulose-6-phosphate synthase (HPS), yielding
hexulose-6-phosphate, which is then isomerized to fructose-6-phosphate by
the enzyme 3-hexulose-6-phosphate isomerase. MJ1247 is an alpha/beta
structure consisting of a five-stranded parallel beta sheet flanked on
both sides by alpha helices, forming a three-layered alpha-beta-alpha
sandwich. The fold represents the nucleotide binding motif of a flavodoxin
type. MJ1247 is a tetramer in the crystal and in solution and each monomer
has a fold similar to the isomerase domain of glucosamine-6-phosphate
synthase (GlmS).
- Schulze-Gahmen U, Kim SH. 2002. Structural basis for CDK6 activation by a virus-encoded cyclin. Nat Struct Biol 9:177-81.
Cyclin from herpesvirus saimiri (Vcyclin) preferentially forms complexes
with cyclin-dependent kinase 6 (CDK6) from primate host cells. These
complexes show higher kinase activity than host cell CDKs in complex with
cellular cyclins and are resistant to cyclin-dependent inhibitory proteins
(CDKIs). The crystal structure of human CDK6--Vcyclin in an active state
was determined to 3.1 A resolution to better understand the structural
basis of CDK6 activation by viral cyclins. The unphosphorylated CDK6 in
complex with Vcyclin has many features characteristic of
cyclinA-activated, phosphorylated CDK2. There are, however, differences in
the conformation at the tip of the T-loop and its interactions with
Vcyclin. Residues in the N-terminal extension of Vcyclin wrap around the
tip of the CDK6 T-loop and form a short beta-sheet with the T-loop
backbone. These interactions lead to a 20% larger buried surface in the
CDK6--Vcyclin interface than in the CDK2--cyclinA complex and are probably
largely responsible for the specificity of Vcyclin for CDK6 and resistance
of the complex to inhibition by INK-type CDKIs.
- Shin DH, Yokota H, Kim R, Kim SH. 2002. Crystal structure of a conserved hypothetical protein from Escherichia coli. J Struct Funct Genomics 2:53-66.
The crystal structure of a conserved hypothetical protein from Escherichia
coli has been determined using X-ray crystallography. The protein belongs
to the Cluster of Orthologous Group COG1553 (National Center for
Biotechnology Information database, NLM, NIH), for which there was no
structural information available until now. A structural homology search
with the DALI algorithm indicated that this protein has a new fold with no
obvious similarity to those of other proteins with known three-dimensional
structures. The protein quaternary structure consists of a dimer of
trimers, which makes a characteristic cylinder shape. There is a large
closed cavity with approximate dimensions of 16 A x 16 A x 20 A in the
center of the hexameric structure. Six putative active sites are
positioned along the equatorial surface of the hexamer. There are several
highly conserved residues including two possible functional cysteines in
the putative active site. The possible molecular function of the protein
is discussed.
- Shin DH, Yokota H, Kim R, Kim SH. 2002. Crystal structure of conserved hypothetical protein Aq1575 from Aquifex aeolicus. Proc Natl Acad Sci U S A 99:7980-5.
The crystal structure of a conserved hypothetical protein, Aq1575, from
Aquifex aeolicus has been determined by using x-ray crystallography. The
protein belongs to the domain of unknown function DUF28 in the Pfam and
PALI databases for which there was no structural information available
until now. A structural homology search with the DALI algorithm indicates
that this protein has a new fold with no obvious similarity to those of
other proteins of known three-dimensional structure. The protein reveals a
monomer consisting of three domains arranged along a pseudo threefold
symmetry axis. There is a large cleft with approximate dimensions of 10 A
x 10 A x 20 A in the center of the three domains along the symmetry axis.
Two possible active sites are suggested based on the structure and
multiple sequence alignment. There are several highly conserved residues
in these putative active sites. The structure based molecular properties
and thermostability of the protein are discussed.
- Wang W, Cho HS, Kim R, Jancarik J, Yokota H, ... Wemmer DE, Kim SH. 2002. Structural characterization of the reaction pathway in phosphoserine phosphatase: crystallographic "snapshots" of intermediate states. J Mol Biol 319:421-31.
Phosphoserine phosphatase (PSP) is a member of a large class of enzymes
that catalyze phosphoester hydrolysis using a phosphoaspartate-enzyme
intermediate. PSP is a likely regulator of the steady-state d-serine level
in the brain, which is a critical co-agonist of the N-methyl-d-aspartate
type of glutamate receptors. Here, we present high-resolution (1.5-1.9 A)
structures of PSP from Methanococcus jannaschii, which define the open
state prior to substrate binding, the complex with phosphoserine substrate
bound (with a D to N mutation in the active site), and the complex with
AlF3, a transition-state analog for the phospho-transfer steps in the
reaction. These structures, together with those described for the BeF3-
complex (mimicking the phospho-enzyme) and the enzyme with phosphate
product in the active site, provide a detailed structural picture of the
full reaction cycle. The structure of the apo state indicates partial
unfolding of the enzyme to allow substrate binding, with refolding in the
presence of substrate to provide specificity. Interdomain and active-site
conformational changes are identified. The structure with the transition
state analog bound indicates a "tight" intermediate. A striking structural
homology, with significant sequence conservation, among PSP, P-type
ATPases and response regulators suggests that the knowledge of the PSP
reaction mechanism from the structures determined will provide insights
into the reaction mechanisms of the other enzymes in this family.
Click here to go back to the publication index
- Zhang C, Hou J, Kim SH. 2002. Fold prediction of helical proteins using torsion angle dynamics and predicted restraints. Proc Natl Acad Sci U S A 99:3581-5.
We describe a procedure for predicting the tertiary folds of alpha-helical
proteins from their primary sequences. The central component of the
procedure is a method for predicting interhelical contacts that is based
on a helix-packing model. Instead of predicting the individual contacts,
our method attempts to identify the entire patch of contacts that involve
residues regularly spaced in the sequences. We use this component to glue
together two powerful existing methods: a secondary structure prediction
program, whose output serves as the input to the contact prediction
algorithm, and the torsion angle dynamics program, which uses the
predicted tertiary contacts and secondary structural states to assemble
three-dimensional structures. In the final step, the procedure uses the
initial set of simulated structures to refine the predicted contacts for a
new round of structure calculation. When tested against 24 small to
medium-sized proteins representing a wide range of helical folds, the
completely automated procedure consistently generates native-like models
within a limited number of trials.
Click here to go back to the publication index
2001:
- Cave JW, Cho HS, Batchelder AM, Yokota H, Kim R, Wemmer DE. 2001. Solution nuclear magnetic resonance structure of a protein disulfide oxidoreductase from Methanococcus jannaschii. Protein Sci 10:384-96.
The solution structure of the protein disulfide oxidoreductase Mj0307 in
the reduced form has been solved by nuclear magnetic resonance. The
secondary and tertiary structure of this protein from the archaebacterium
Methanococcus jannaschii is similar to the structures that have been
solved for the glutaredoxin proteins from Escherichia coli, although
Mj0307 also shows features that are characteristic of thioredoxin
proteins. Some aspects of Mj0307's unique behavior can be explained by
comparing structure-based sequence alignments with mesophilic bacterial
and eukaryotic glutaredoxin and thioredoxin proteins. It is proposed that
Mj0307, and similar archaebacterial proteins, may be most closely related
to the mesophilic bacterial NrdH proteins. Together these proteins may
form a unique subgroup within the family of protein disulfide
oxidoreductases.
Click here to go back to the publication index
- Cho H, Wang W, Kim R, Yokota H, Damo S, ... Kustu S, Yan D. 2001. BeF(3)(-) acts as a phosphate analog in proteins phosphorylated on aspartate: structure of a BeF(3)(-) complex with phosphoserine phosphatase. Proc Natl Acad Sci U S A 98:8525-30.
Protein phosphoaspartate bonds play a variety of roles. In response
regulator proteins of two-component signal transduction systems,
phosphorylation of an aspartate residue is coupled to a change from an
inactive to an active conformation. In phosphatases and mutases of the
haloacid dehalogenase (HAD) superfamily, phosphoaspartate serves as an
intermediate in phosphotransfer reactions, and in P-type ATPases, also
members of the HAD family, it serves in the conversion of chemical energy
to ion gradients. In each case, lability of the phosphoaspartate linkage
has hampered a detailed study of the phosphorylated form. For response
regulators, this difficulty was recently overcome with a phosphate analog,
BeF(3)(-), which yields persistent complexes with the active site
aspartate of their receiver domains. We now extend the application of this
analog to a HAD superfamily member by solving at 1.5-A resolution the
x-ray crystal structure of the complex of BeF(3)(-) with phosphoserine
phosphatase (PSP) from Methanococcus jannaschii. The structure is
comparable to that of a phosphoenzyme intermediate: BeF(3)(-) is bound to
Asp-11 with the tetrahedral geometry of a phosphoryl group, is coordinated
to Mg(2+), and is bound to residues surrounding the active site that are
conserved in the HAD superfamily. Comparison of the active sites of
BeF(3)(-)·PSP and BeF(3)(-)·CheY, a receiver domain/response regulator,
reveals striking similarities that provide insights into the function not
only of PSP but also of P-type ATPases. Our results indicate that use of
BeF(3)(-) for structural studies of proteins that form phosphoaspartate
linkages will extend well beyond response regulators.
Click here to go back to the publication index
- Dreyer MK, Borcherding DR, Dumont JA, Peet NP, Tsay JT, ... Shen J, Kim SH. 2001. Crystal structure of human cyclin-dependent kinase 2 in complex with the adenine-derived inhibitor H717. J Med Chem 44:524-30.
Cyclin-dependent kinases (CDKs) are regulatory proteins of the eukaryotic
cell cycle. They act after association with different cyclins, the
concentrations of which vary throughout the progression of the cell cycle.
As central mediators of cell growth, CDKs are potential targets for
inhibitory molecules that would allow disruption of the cell cycle in
order to evoke an antiproliferative effect and may therefore be useful as
cancer therapeutics. We synthesized several inhibitory
2,6,9-trisubstituted purine derivatives and solved the crystal structure
of one of these compounds, H717, in complex with human CDK2 at 2.6 A
resolution. The orientation of the C2-p-diaminocyclohexyl portion of the
inhibitor is strikingly different from those of similar moieties in other
related inhibitor complexes. The N9-cyclopentyl ring fully occupies a
space in the enzyme which is otherwise empty, while the C6-N-aminobenzyl
substituent points out of the ATP-binding site. The structure provides a
basis for the further development of more potent inhibitory drugs.
Click here to go back to the publication index
- Du X, Wang W, Kim R, Yokota H, Nguyen H, Kim SH. 2001. Crystal structure and mechanism of catalysis of a pyrazinamidase from Pyrococcus horikoshii. Biochemistry 40:14166-72.
Bacterial pyrazinamidase (PZAase)/nicotinamidase converts pyrazinamide
(PZA) to ammonia and pyrazinoic acid, which is active against
Mycobacterium tuberculosis. Loss of PZAase activity is the major mechanism
of pyrazinamide resistance in M. tuberculosis. We have determined the
crystal structure of the gene product of Pyrococcus horikoshii 999
(PH999), a PZAase, and its complex with zinc ion by X-ray crystallography.
The overall fold of PH999 is similar to that of N-carbamoylsarcosine
amidohydrolase (CSHase) of Arthrobacter sp. and YcaC of Escherichia coli,
a protein with unknown physiological function. The active site of PH999
was identified by structural features that are also present in the active
sites of CSHase and YcaC: a triad (D10, K94, and C133) and a cis-peptide
(between V128 and A129). Surprisingly, a metal ion-binding site was
revealed in the active site and subsequently confirmed by the crystal
structure of PH999 in complex with Zn(2+). The roles of the triad,
cis-peptide, and metal ion in the catalysis are proposed. Because of
extensive homology between PH999 and PZAase of M. tuberculosis (37%
sequence identity), the structure of PH999 provides a structural basis for
understanding PZA resistance in M. tuberculosis harboring PZAase
mutations.
Click here to go back to the publication index
- Du X, Frei H, Kim SH. 2001. Comparison of nitrophenylethyl and hydroxyphenacyl caging groups. Biopolymers 62:147-9.
Nitrophenylethyl (NPE)- and hydroxyphenacyl (HPA)-caged nucleotides were
employed in a time-resolved Fourier transform IR spectroscopy study on
Ras-catalyzed guanosine triphosphate (GTP) hydrolysis. A fast kinetic
component was observed following the photolysis of NPE-caged nucleotides
in the NPE-GTP-Ras complex. However, this kinetic component was not
observed in the HPA-GTP-Ras experiment. This fast kinetic component was
likely due to a chemical reaction between Ras and the detached caging
group, nitrosoacetophenone. This communication serves as a note of caution
in interpreting spectral changes and kinetic behavior of the enzymatic
systems employing NPE-caged compounds.
Click here to go back to the publication index
- Frankenberg RJ, Hsu TS, Yokota H, Kim R, Clark DS. 2001. Chemical denaturation and elevated folding temperatures are required for wild-type activity and stability of recombinant Methanococcus jannaschii 20S proteasome. Protein Sci 10:1887-96.
The 20S proteasome from the extreme thermophile Methanococcus jannaschii
(Mj) was purified and sequenced to facilitate production of the
recombinant proteasome in E. coli. The recombinant proteasome remained in
solution at a purity level of 80-85% (according to SDS PAGE) following
incubation of cell lysates at 70 degrees C. Temperature-activity profiles
indicated that the temperature optima of the wild-type and recombinant
enzymes differed substantially, with optimal activities occurring at 119
degrees C and 95 degrees C, respectively. To ameliorate this discrepancy,
two recombinant enzyme preparations were produced, each of which included
denaturation of the proteasome by 4 M urea followed by high-temperature
(85 degrees C) dialysis. The wild-type temperature optimum was restored,
but only if proteasome subunits were denatured and refolded prior to
assembly (a preparation designated as alpha & beta). In contrast, when
proteasome assembly preceded denaturation (designated alpha + beta) the
optimum temperature was raised to a lesser degree. Moreover, the alpha &
beta and alpha + beta preparations had apparent thermal half-lives at 114
degrees C of 54.2 and 26.2 min, respectively, and the thermostability of
the less stable enzyme was more sensitive to a reduction in pH. Attainment
of wild-type activity and stability thus required the proper folding of
both the alpha- and beta-subunits prior to proteasome assembly. Consistent
with this behavior, differential scanning calorimetry (DSC) measurements revealed
differences in the reassembly efficiency of the two proteasome
preparations. The ability to produce structural conformers with
dramatically different thermal optima and thermostabilities may facilitate
the determination of molecular forces and structural motifs responsible
for enzyme thermostability and high-temperature activity.
Click here to go back to the publication index
- Kim VN, Kataoka N, Dreyfuss G. 2001. Role of the nonsense-mediated decay factor hUpf3 in the splicing-dependent exon-exon junction complex. Science 293:1832-6.
Nonsense-mediated messenger RNA (mRNA) decay, or NMD, is a critical
process of selective degradation of mRNAs that contain premature stop
codons. NMD depends on both pre-mRNA splicing and translation, and it
requires recognition of the position of stop codons relative to exon-exon
junctions. A key factor in NMD is hUpf3, a mostly nuclear protein that
shuttles between the nucleus and cytoplasm and interacts specifically with
spliced mRNAs. We found that hUpf3 interacts with Y14, a component of
post-splicing mRNA-protein (mRNP) complexes, and that hUpf3 is enriched in
Y14-containing mRNP complexes. The mRNA export factors Aly/REF and TAP are
also associated with nuclear hUpf3, indicating that hUpf3 is in mRNP
complexes that are poised for nuclear export. Like Y14 and Aly/REF, hUpf3
binds to spliced mRNAs specifically (approximately 20 nucleotides)
upstream of exon-exon junctions. The splicing-dependent binding of hUpf3
to mRNAs before export, as part of the complex that assembles near
exon-exon junctions, allows it to serve as a link between splicing and NMD
in the cytoplasm.
Click here to go back to the publication index
2000:
- Adler M, Davey DD, Phillips GB, Kim SH, Jancarik J, ... Light DR, Whitlow M. 2000. Preparation, characterization, and the crystal structure of the inhibitor ZK-807834 (CI-1031) complexed with factor Xa. Biochemistry 39:12534-42.
Factor Xa plays a critical role in the formation of blood clots. This
serine protease catalyzes the conversion of prothrombin to thrombin, the
first joint step that links the intrinsic and extrinsic coagulation
pathways. There is considerable interest in the development of factor Xa
inhibitors for intervention in thrombotic diseases. This paper presents
the structure of the inhibitor ZK-807834, also known as CI-1031, bound to
factor Xa and provides the details of the protein purification and
crystallization. Results from mass spectrometry indicate that the factor
Xa underwent autolysis during crystallization and the first EGF-like
domain was cleaved from the protein. The crystal structure of the complex
shows that the amidine of ZK-807834 forms a salt bridge with Asp189 in the
S1 pocket and the basic imidazoline fits snugly into the S4 site. The
central pyridine ring provides a fairly rigid linker between these groups.
This rigidity helps minimize entropic losses during binding. In addition,
the structure reveals new interactions that were not found in the previous
factor Xa/inhibitor complexes. ZK-807834 forms a strong hydrogen bond
between an ionized 2-hydroxy group and Ser195 of factor Xa. There is also
an aromatic ring-stacking interaction between the inhibitor and Trp215 in
the S4 pocket. These interactions contribute to both the potency of this
compound (K(I) = 0.11 nM) and the >2500-fold selectivity against
homologous serine proteases such as trypsin.
Click here to go back to the publication index
- Du X, Frei H, Kim SH. 2000. The mechanism of GTP hydrolysis by Ras probed by Fourier transform infrared spectroscopy. J Biol Chem 275:8492-500.
Time-resolved Fourier transform infrared spectroscopy (FTIR) in
combination with photo-induced release of (18)O-labeled caged nucleotide
has been employed to address mechanistic issues of GTP hydrolysis by Ras
protein. Infrared spectroscopy of Ras complexes with nitrophenylethyl
(NPE)-[alpha-(18)O(2)]GTP, NPE-[beta-(18)O(4)]GTP, or
NPE-[gamma-(18)O(3)]GTP upon photolysis or during hydrolysis afforded a
substantially improved mode assignment of phosphoryl group absorptions.
Photolysis spectra of hydroxyphenacyl-GTP and hydroxyphenacyl-GDP
bound to Ras and several mutants, Ras(Gly(12))-Mn(2+), Ras(Pro(12)),
Ras(Ala(12)), and Ras(Val(12)), were obtained and yielded valuable
information about structures of GTP or GDP bound to Ras mutants. IR
spectra revealed stronger binding of the GDP beta-PO(3)(2-) moiety by Ras
mutants with higher activity, suggesting that the transition state is
largely GDP-like. Analysis of the photolysis and hydrolysis FTIR spectra
of the [beta-nonbridge-(18)O(2), alpha,beta-bridge-(18)O]GTP isotopomer
allowed us to probe for positional isotope exchange. Such a reaction might
signal the existence of metaphosphate as a discrete intermediate, a key
species for a dissociative mechanism. No positional isotope exchange was
observed. Overall, our results support a concerted mechanism, but the
transition state seems to have a considerable amount of dissociative
character. This work demonstrates that time-resolved FTIR is highly
suitable for monitoring positional isotope exchange and advantageous in
many aspects over previously used methods, such as (31)P NMR and mass
spectrometry.
Click here to go back to the publication index
- Du X, Choi IG, Kim R, Wang W, Jancarik J, Yokota H, Kim SH. 2000. Crystal structure of an intracellular protease from Pyrococcus horikoshii at 2-A resolution. Proc Natl Acad Sci U S A 97:14079-84.
The intracellular protease from Pyrococcus horikoshii (PH1704) and PfpI
from Pyrococcus furiosus are members of a class of intracellular proteases
that have no sequence homology to any other known protease family. We
report the crystal structure of PH1704 at 2.0-A resolution. The protease
is tentatively identified as a cysteine protease based on the presence of
cysteine (residue 100) in a nucleophile elbow motif. In the crystal,
PH1704 forms a hexameric ring structure, and the active sites are formed
at the interfaces between three pairs of monomers.
Click here to go back to the publication index
- Falke JJ, Kim SH. 2000. Structure of a conserved receptor domain that regulates kinase activity: the cytoplasmic domain of bacterial taxis receptors. Curr Opin Struct Biol 10:462-9.
Many bacteria are motile and use a conserved class of transmembrane
sensory receptor to regulate cellular taxis toward an optimal living
environment. These conserved receptors are typically stimulated by
extracellular signals, but also undergo adaptation via covalent
modification at specific sites on their cytoplasmic domains. The function
of the cytoplasmic domain is to integrate the extracellular and adaptive
signals, and to use this integrated information to regulate an associated
histidine kinase. The kinase, in turn, triggers a cytoplasmic
phosphorylation pathway of the two-component class. The high-resolution
structure of a receptor cytoplasmic domain has recently been determined by
crystallographic methods and is largely consistent with a structural model
independently generated by chemical studies of the domain in the
full-length, membrane-bound receptor. These results represent an important
step toward a mechanistic understanding of receptor-to-kinase information
transfer.
Click here to go back to the publication index
- Kim SH. 2000. Structural genomics of microbes: an objective. Curr Opin Struct Biol 10:380-3.
A comparison of the genome sequences of more than 20 microorganisms
reveals that a large fraction of the genes have unknown functions.
Determining the structures of the proteins coded by these genes may
provide additional key information in an effort to uncover the molecular
functions of such proteins and new protein fold patterns. Using existing
technology, it is possible to obtain a complete sequence complement and a
near complete structural complement for a small microbial genome. Such
information may provide a comprehensive view of a small organism, which,
in turn, can serve as a platform for understanding more complex organisms.
Click here to go back to the publication index
- Lai L, Yokota H, Hung LW, Kim R, Kim SH. 2000. Crystal structure of archaeal RNase HII: a homologue of human major RNase H. Structure Fold Des 8:897-904.
BACKGROUND: RNases H are present in all organisms and cleave RNAs in
RNA/DNA hybrids. There are two major types of RNases H that have little
similarity in sequence, size and specificity. The structure of RNase HI,
the smaller enzyme and most abundant in bacteria, has been extensively
studied. However, no structural information is available for the larger
RNase H, which is most abundant in eukaryotes and archaea. Mammalian RNase
H participates in DNA replication, removal of the Okazaki fragments and
possibly DNA repair. RESULTS: The crystal structure of RNase HII from the
hyperthermophile Methanococcus jannaschii, which is homologous to mammalian
RNase H, was solved using a multiwavelength anomalous dispersion (MAD)
phasing method at 2 A resolution. The structure contains two compact
domains. Despite the absence of sequence similarity, the large N-terminal
domain shares a similar fold with the RNase HI of bacteria. The active
site of RNase HII contains three aspartates: Asp7, Asp112 and Asp149. The
nucleotide-binding site is located in the cleft between the N-terminal and
C-terminal domains. CONCLUSIONS: Despite a lack of any detectable
similarity in primary structure, RNase HII shares a similar structural
domain with RNase HI, suggesting that the two classes of RNases H have a
common catalytic mechanism and possibly a common evolutionary origin. The
involvement of the unique C-terminal domain in substrate recognition
explains the different reaction specificity observed between the two
classes of RNase H.
Click here to go back to the publication index
- Meijer L, Thunnissen AM, White AW, Garnier M, Nikolic M, ... Kim SH, Pettit GR. 2000. Inhibition of cyclin-dependent kinases, GSK-3beta and CK1 by hymenialdisine, a marine sponge constituent. Chem Biol 7:51-63.
BACKGROUND: Over 2000 protein kinases regulate cellular functions.
Screening for inhibitors of some of these kinases has already yielded some
potent and selective compounds with promising potential for the treatment
of human diseases. RESULTS: The marine sponge constituent hymenialdisine
is a potent inhibitor of cyclin-dependent kinases, glycogen synthase
kinase-3beta and casein kinase 1. Hymenialdisine competes with ATP for
binding to these kinases. A CDK2-hymenialdisine complex crystal structure
shows that three hydrogen bonds link hymenialdisine to the Glu81 and Leu83
residues of CDK2, as observed with other inhibitors. Hymenialdisine
inhibits CDK5/p35 in vivo as demonstrated by the lack of
phosphorylation/down-regulation of Pak1 kinase in E18 rat cortical
neurons, and also inhibits GSK-3 in vivo as shown by the inhibition of
MAP-1B phosphorylation. Hymenialdisine also blocks the in vivo
phosphorylation of the microtubule-binding protein tau at sites that are
hyperphosphorylated by GSK-3 and CDK5/p35 in Alzheimer's disease
(cross-reacting with Alzheimer's-specific AT100 antibodies). CONCLUSIONS:
The natural product hymenialdisine is a new kinase inhibitor with
promising potential applications for treating neurodegenerative disorders.
Click here to go back to the publication index
- Zhang C, Kim SH. 2000. The anatomy of protein beta-sheet topology. J Mol Biol 299:1075-89.
Here, we present a systematic analysis of the open-faced beta-sheet
topologies in a set of non-redundant protein domain structures; in
particular, we focus on the topological diversity of four-stranded
beta-sheet motifs. Of the 96 topologies that are possible for a
four-stranded beta-sheet, 42 were identified in known protein structures.
Of these, four account for 50% of the structures that we have studied. Two
sets of the topologies that were not observed may represent the section of
the topological space that is not readily accessible to proteins on either
thermodynamic or kinetic grounds. The first set contains topologies with
alternating parallel and antiparallel beta-ladders. Their rare occurrence
reflects the expectation that it is energetically unfavorable to match
different hydrogen bonding patterns. The polypeptide chains in the second
set of topologies go through convoluted paths and are expected to
experience great kinetic frustrations during the folding processes. A
knowledge of the potential causes for the topological preference of small
beta-sheets also helps us to understand the topological properties of
larger beta-sheet structures which frequently contain four-stranded
motifs. The notion that protein topologies can only be taken from a
confined and discrete space has important implications for structural
genomics.
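
The count of 96 can be reproduced under one simple counting convention,
which is an assumption made here for illustration and is not taken from
the paper: a topology is the spatial order of the strands across the sheet
plus a parallel or antiparallel flag for each pair of neighboring strands,
and a sheet read from the opposite edge counts as the same topology. The
Python sketch below enumerates those arrangements; the function name is
hypothetical.

```python
from itertools import permutations, product


def count_sheet_topologies(n_strands: int) -> int:
    """Count open-sheet topologies under the assumed convention above."""
    seen = set()
    for order in permutations(range(n_strands)):
        # One flag per pair of spatially adjacent strands:
        # 0 = antiparallel, 1 = parallel.
        for pairing in product((0, 1), repeat=n_strands - 1):
            mirrored = (order[::-1], pairing[::-1])
            # Reading the sheet from the other edge gives the same topology,
            # so keep one canonical representative of each mirror pair.
            seen.add(min((order, pairing), mirrored))
    return len(seen)


print(count_sheet_topologies(4))
```

Under this convention the total is n! x 2^(n-1) / 2, which gives 96 for a
four-stranded sheet.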
Click here to go back to the publication index
- Zhang C, Kim SH. 2000. Environment-dependent residue contact energies for proteins. Proc Natl Acad Sci U S A 97:2550-5.
We examine the interactions between amino acid residues in the context of
their secondary structural environments (helix, strand, and coil) in
proteins. Effective contact energies for an expanded 60-residue alphabet
(20 aa x three secondary structural states) are estimated from the
residue-residue contacts observed in known protein structures. Similar to
the prototypical contact energies for 20 aa, the newly derived energy
parameters reflect mainly the hydrophobic interactions; however, the
relative strength of such interactions shows a strong dependence on the
secondary structural environment, with nonlocal interactions in beta-sheet
structures and alpha-helical structures dominating the energy table.
Environment-dependent residue contact energies outperform existing residue
pair potentials in both threading and three-dimensional contact prediction
tests and should be generally applicable to protein structure prediction.
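
As a rough illustration of the expanded alphabet, the Python sketch below
builds the 60 residue types (20 aa x three secondary structural states)
and turns a list of observed contacts into a log-odds energy table. The
scoring formula (observed versus composition-expected pair counts, with a
pseudocount), the function name, and the toy contact list are all
assumptions made for this example; the abstract does not specify the
estimation procedure, so this is not the authors' method.

```python
import math
from collections import Counter
from itertools import combinations_with_replacement

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
STATES = "HEC"  # helix (H), strand (E), coil (C)

# 20 aa x 3 secondary structural states = 60 residue types, e.g. "LH".
ALPHABET = sorted(aa + ss for aa in AMINO_ACIDS for ss in STATES)


def contact_energies(contacts, pseudocount=1.0):
    """Return {(type_a, type_b): energy} for every unordered pair of types.

    Assumed scoring: energy = -ln(observed / expected), where the expected
    count comes from how often each type appears at either end of a contact.
    """
    pair_counts = Counter(tuple(sorted(pair)) for pair in contacts)
    total_pairs = sum(pair_counts.values())
    if total_pairs == 0:
        raise ValueError("at least one contact is required")

    # How often each residue type takes part in a contact.
    type_counts = Counter()
    for (a, b), n in pair_counts.items():
        type_counts[a] += n
        type_counts[b] += n
    total_ends = 2 * total_pairs

    energies = {}
    for a, b in combinations_with_replacement(ALPHABET, 2):
        freq_a = type_counts[a] / total_ends
        freq_b = type_counts[b] / total_ends
        multiplicity = 1 if a == b else 2  # a-b and b-a are the same pair
        expected = total_pairs * freq_a * freq_b * multiplicity
        observed = pair_counts[(a, b)]
        energies[(a, b)] = -math.log(
            (observed + pseudocount) / (expected + pseudocount)
        )
    return energies


# Toy data (invented): strand Leu-Val contacts and coil Lys-Asp contacts.
toy_contacts = [("LE", "VE")] * 50 + [("KC", "DC")] * 50
table = contact_energies(toy_contacts)
print(len(ALPHABET), len(table))
```

With 60 residue types the table holds 60 x 61 / 2 = 1830 unordered pairs.
Pairs seen more often than composition alone predicts receive negative
(favorable) energies in this sketch.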
Click here to go back to the publication index
- Zhang C, Kim SH. 2000. A comprehensive analysis of the Greek key motifs in protein beta-barrels and beta-sandwiches. Proteins 40:409-19.
The Greek key motifs are the topological signature of many beta-barrels
and a majority of beta-sandwich structures. An updated survey of these
structures integrates many early observations and newly emerging patterns
and provides a better understanding of the unique role of Greek keys in
protein structures. A stereotypical Greek key beta-barrel accommodates
five or six strands and can have 12 possible topologies. All but one of the
six-stranded topologies have been observed, and only one of the five-stranded
topologies has been seen in actual structures. Of the representative
beta-barrel structures analyzed here, half have left-handed Greek keys.
This result challenges the empirical claim of the handedness regularity of
Greek keys in beta-barrels. One of the five-stranded topologies that has
not been observed in beta-barrels comprises two overlapping Greek keys.
The two three-dimensional forms of this topology constitute a structural
unit that is present in a vast majority of known beta-sandwich structures.
Using this unit as the root, we have built a new taxonomy tree for the
beta-sandwich folds and deduced a set of rules that appear to constrain
how other beta-strands adjoin the unit to form a larger double-layered
structure. These rules, though derived from a larger data set, are
essentially the same as those drawn from earlier studies, suggesting that
they may reflect the true topological constraints in the design of
beta-sandwich structures. Finally, a novel variant of the Greek key motif
(defined here as the twisted Greek key) has emerged which introduces loop
crossings into the folded structures.
Click here to go back to the publication index
- Zhang C, Kim SH. 2000. The effect of dynamic receptor clustering on the sensitivity of biochemical signaling. Pac Symp Biocomput:353-64.
Lateral clustering has emerged as a general mechanism used by many
cellular receptors to control their responses to critical changes in the
external environment. Here we derive a general mathematical framework to
characterize the effect of receptor clustering on the sensitivity and
dynamic range of biochemical signaling. In particular, we apply the theory
to the bacterial chemosensory system and show that it can integrate a
large body of experimental observations and provide a unified explanation
of many aspects of chemotaxis. The principles of dynamic receptor
clustering and signal amplification incorporated into this theory may
underlie the design of many cellular networks.
Click here to go back to the publication index
1999:
- Grigoriev IV, Kim SH. 1999. Detection of protein fold similarity based on correlation of amino acid properties. Proc Natl Acad Sci U S A 96:14318-23.
An increasing number of proteins with weak sequence similarity have been
found to assume a similar three-dimensional fold and often have similar or
related biochemical or biophysical functions. We propose a method for
detecting the fold similarity between two proteins with low sequence
similarity based on their amino acid properties alone. The method, the
proximity correlation matrix (PCM) method, is built on the observation
that the physical properties of neighboring amino acid residues in
sequence at structurally equivalent positions of two proteins of similar
fold are often correlated even when amino acid sequences are different.
The hydrophobicity is shown to be the most strongly correlated property
for all protein fold classes. The PCM method was tested on 420 proteins
belonging to 64 different known folds, each having at least three proteins
with little sequence similarity. The method was able to detect fold
similarities for 40% of the 420 sequences. Compared with sequence
comparison and several fold-recognition methods, the method demonstrates
good performance in detecting fold similarities among the proteins with
low sequence identity. Applied to the complete genome of Methanococcus
jannaschii, the method recognized the folds for 22 hypothetical proteins.
Click here to go back to the publication index