A. Analysing a protein structure for errors and interesting features
This part of the workshop will cover two areas that are of interest to both the crystallographer and molecular modeller.
i) Use Text Search to retrieve 2nv6. This is a good-resolution structure of enoyl [acyl-carrier protein] reductase from Mycobacterium tuberculosis. It might be a structure we have refined in house and wish to use as the starting point for a structure-based drug design program against tuberculosis.
ii) This is a Protein Information page. Scroll down the page to the diagram of the ligand that is present in the active site. The ligand is an analog of NADH, which is an important co-factor in many animal oxidoreductase enzymes.
iii) Click on the ligand image to bring up the Ligand Information page.
iv) Click on Show in ReliView. This will bring up the ReliView protein visualiser.
v) We can view water-mediated protein-ligand contacts by clicking on the show links in the Visualise column of the Water-Mediated Protein-Ligand Contacts table in the Ligand Information page. The water-mediated contact will be highlighted in ReliView. For instance, click on show for the row that contains water 24. This water has an unusual coordination pattern. More information on this water can be found by clicking on the 24 link, which reports the deviation of the coordination geometry from the ideal tetrahedral one. Relibase can use this deviation, together with the protein-water contact distances and the reported B-factor for the water, to suggest that the assignment as water may be incorrect (an example is given later). In this case Relibase does not mark this water as being in error.
vi) We will now assess the ligand for geometrical errors. Go back to ReliView and, under calculate in the top level menu, click on Mogul Geometry Check. This allows us to use the Mogul module of the Cambridge Structural Database System (CSDS) to analyse the geometry against the entire Cambridge Structural Database (CSD); this is available to CSDS subscribers only. We need to add hydrogens to the ligand and set consistent bond types before carrying out the geometry check, so click on Edit and then Apply followed by Accept, to use the default Edit protocol. Now click on Continue. An options box comes up that allows us to tailor our search criteria. Here we will do a search that examines all the bond lengths, bond angles and torsion angles in the ligand. However, to speed the search up, we will only look for fragments that match exactly. Click on the radio button next to Only find fragments that match exactly, then click on Search.
vii) A table will come up showing all bonds, angles and torsions in the molecule. The bonds, angles and torsions that are determined to be unusual are coloured in red. Unusual geometric parameters may reflect significant strain in the ligand or alternatively errors in the refinement. The final column in the table gives an assessment of how unusual a parameter is. For bonds and angles this figure is the number of standard deviations from the mean value found in the CSD for similar fragments. Click on the row corresponding to unusual angle AO4* AC1* AN9 (z-score 8.0). The angle is highlighted in ReliView and also on a histogram of corresponding fragmental bond angles in the CSD. Double click on a distance, angle or torsion in the table to view the entry in Mogul. In this case the angle is off the graph! You can view the structures within the CSD that are being used for comparison via the View Structures pane. The geometry of the ribose unit is not standard and this is quite likely to be a refinement issue. Perhaps a poor model has been used in the refinement process.
viii) The final column of the torsions table records the local density. This is the percentage of the entire distribution that lies within ±10° of the query torsion. If it is less than 5%, the torsion is marked as unusual. Click on the unusual torsion AO4* AC4* AC5* AO5*. The value of this torsion angle is well separated from those of similar structures. Is it due to a poor model or is the enzyme straining the ligand? This is a difficult question to answer without further work. However, a crystallographer working on this structure might be advised to modify the ligand model to find an alternative with less strain that fits the electron density at least as well.
ix) Now we will look at some interesting interactions of groups on the ligand with groups on the protein and see how they compare with the corresponding interactions found in the CSD. The ligand has an exocyclic primary carbamoyl group (heavy atom labels NC7, NN7 and NO7; to check atom labels, select an atom, right-click, and select labels and then Label by Atom Label). We will look at how this carbamoyl interacts with the adjacent groups in the protein. For this we will use the IsoStar module of the CSDS (this is available to CSDS subscribers only). This module is a knowledge base containing collated information on intermolecular interactions of different chemical groups as found in the CSD. Click on the backbone N atom of the Ile 194 residue. This is very close to the N atom of the carbamoyl (NN7). With the cursor still in place, right-click, and select IsoStar Intermolecular Contact Database. A new webpage appears which offers a choice of searches associated with selecting an isoleucine residue. Select the peptide link to make this group the Central group in our search. You will now be asked to select a Contact group. From the N-H table look for the amide NH row and select the link 803 in the CSD column (803 is the number of occurrences of this contact in the CSD). A cluster plot is displayed within the IsoStar client window for the amide N-H/peptide interaction. Does the interaction observed in 2nv6 appear to be common? (It may help to reduce the contacts down to those that lie within van der Waals contact for heavy atoms by clicking on vdW overlaps in the IsoStar client. The contact between the Ile 194 backbone N and NN7 lies well within the N-N van der Waals distance.) You can confirm your conclusion by also looking at the scatter plot you get from linking to the corresponding PDB-derived data. You may be able to work out what common mistake has been made in setting up this part of the ligand model.
We will, in part B of this workshop, examine closely related structures to see if we can confirm this error.
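The three numerical criteria used in steps vii to ix can be sketched in a few lines of Python. This is only an illustration of the definitions given above, not the code that Mogul or IsoStar actually run: the function names, the use of the sample standard deviation, the reference values and the nitrogen van der Waals radius of 1.55 Å (the commonly quoted Bondi value) are all assumptions made for this example.

```python
import statistics


def z_score(value, reference):
    """Number of standard deviations between `value` and the mean of `reference`.

    Assumption: the sample standard deviation is used.
    """
    mean = statistics.mean(reference)
    sd = statistics.stdev(reference)
    return abs(value - mean) / sd


def local_density(query, torsions, window=10.0):
    """Percentage of reference torsions within +/- `window` degrees of `query`.

    Torsion angles are periodic, so -178 and 175 degrees are 7 degrees apart.
    """
    def angular_difference(a, b):
        d = abs(a - b) % 360.0
        return min(d, 360.0 - d)

    hits = sum(1 for t in torsions if angular_difference(query, t) <= window)
    return 100.0 * hits / len(torsions)


def vdw_overlap(distance, radius_a, radius_b):
    """True if two atoms are closer than the sum of their van der Waals radii."""
    return distance < radius_a + radius_b


if __name__ == "__main__":
    # Hypothetical reference data, not taken from the CSD or from 2nv6.
    reference_angles = [108.0, 109.0, 110.0, 111.0, 112.0]
    reference_torsions = [-178.0, 170.0, 60.0, -60.0]
    N_RADIUS = 1.55  # assumed van der Waals radius of nitrogen, in angstroms

    print("z-score:", z_score(114.0, reference_angles))
    density = local_density(175.0, reference_torsions)
    print("local density (%):", density, "unusual:", density < 5.0)
    print("N...N overlap at 2.9 A:", vdw_overlap(2.9, N_RADIUS, N_RADIUS))
```

A large z-score or a local density below 5% only indicates that a parameter is rare among comparable CSD fragments; deciding whether the cause is strain or a modelling error still requires inspection of the electron density.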
x) Additional – You may wish to leave this exercise and come back to it if time allows. Use the Text Search functionality in Relibase+ to bring up structure 2bvr. This is a structure of a thrombin protein-ligand complex. Thrombin is a vital constituent of the blood clotting cascade. Click on Show in ReliView to display the protein. Now click on the grey box labelled water information near the base of the form. This gives extensive information on the water structure around the protein. Additionally, waters that may be erroneously assigned are flagged. Here water 421 is marked as dubious. It has a low B-factor (i.e. excess electron density), octahedral coordination and short bonds. Click on the 421 link to bring up information relating to that water. The water can be highlighted in the visualiser by use of the visualise command. It is less than 10 Å from the S1 pocket of the protein active site.
xi) Click on the fourth and lowest ligand diagram displayed in the Protein Information page, 4CP. This ligand is a synthetic inhibitor of thrombin that has been co-crystallised. We will now search for high-resolution protein structures with high homology to this one.
xii) Click on Similar Binding Sites Search at the top of the Ligand Information page. In the resulting Binding Site Analysis page type 1.75 in the Lowest Resolution box. Leave all else at default. Click on Submit.
xiii) 27 structures should be retrieved. We can superimpose them in the same reference frame. Type 11.0 in the Radius of Sphere Around Ligand box, and click on Submit. In the Superposition Analysis page that comes up, a lot of information on the differences between the reference structure (2bvr) and the other structures is presented in table form. Further information is found by following the links in the table. View all the structures in ReliView by clicking on Show in ReliView. Find water 421 in 2bvr (Tip: Turn off all structures, turn on 2bvr and use the waters branch of the tree under 2bvr in the Display pane of ReliView). See whether other structures also have a water assigned in the same space.
xiv) In 14 structures the corresponding residue is correctly assigned as a sodium ion. Seven structures wrongly assign this residue as water. This ion very likely plays an important role in moderating the charge balance and solvation of the S1 pocket of the active site. Modelling the behaviour of thrombin with electrostatic calculations that use 2bvr as the starting structure would therefore likely lead to poor results.
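The reasoning behind the dubious-water flag in step x (low B-factor, more than tetrahedral coordination, short contacts) can be illustrated with a small Python sketch. This is not the algorithm used by Relibase+: the class, the function name and every threshold below are hypothetical, chosen only to show how the three indicators might be combined, and the example values are invented rather than taken from 2bvr.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WaterSite:
    label: str
    b_factor: float                 # in square angstroms
    contact_distances: List[float]  # distances to coordinating atoms, in angstroms


def suspicious_reasons(site, mean_b, max_contacts=4, short_contact=2.6, low_b_ratio=0.5):
    """Return the reasons why a site modelled as water might be a metal ion.

    All thresholds are illustrative assumptions:
    - a water is expected to have at most `max_contacts` (tetrahedral) contacts;
    - a mean contact distance below `short_contact` is treated as short;
    - a B-factor below `low_b_ratio` times the mean B-factor is treated as low.
    """
    reasons = []
    distances = site.contact_distances
    if len(distances) > max_contacts:
        reasons.append("coordination number above 4")
    if distances and sum(distances) / len(distances) < short_contact:
        reasons.append("short mean contact distance")
    if site.b_factor < low_b_ratio * mean_b:
        reasons.append("low B-factor")
    return reasons


if __name__ == "__main__":
    # Invented example sites for demonstration only.
    ion_like = WaterSite("W1", b_factor=8.0, contact_distances=[2.4] * 6)
    ordinary = WaterSite("W2", b_factor=24.0, contact_distances=[2.8, 2.9, 3.0])
    for site in (ion_like, ordinary):
        print(site.label, suspicious_reasons(site, mean_b=25.0))
```

A site that triggers all three indicators, as water 421 does in Relibase+, is a good candidate for comparison against homologous structures in the way steps xii to xiv describe.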
This ends part 1A.

