© 2003 by William H. Sofer
All Rights Reserved

 

Challenge III

May 2005

Waksman Challenge

Protein Folding Problem

Soon after its synthesis on the ribosome, a protein will spontaneously fold into a specific three-dimensional structure following the basic laws of chemistry and thermodynamics. The sequence of its amino acids determines how a protein folds. While we know some of the fundamental principles that drive protein folding, we can't, as of yet, predict the final three-dimensional shape of a protein solely from its sequence. This is referred to as the "protein folding problem".

Three of the fundamental principles of protein folding that determine the final shape of the protein are:

  1. Hydrophobic amino acid side chains are usually buried on the inside of proteins where they are protected from interaction with water.
  2. Charged amino acid side chains are usually found on the surface of proteins where they can interact with water and, sometimes, with other side chains of opposite charge.
  3. Especially for proteins that find themselves outside of cells, pairs of cysteine side chains are often found close to one another, forming covalent disulfide bonds that stabilize the final structure.

In this Challenge, you will explore these fundamental principles that determine how proteins fold. In the Level I Challenge, you will consider an artificial 15-amino-acid protein and how these principles determine its shape. This Challenge combines physical models with a computer visualization tool. To receive the box of physical models needed to participate, your team must first register for the Challenge, and then the team teacher should contact us via email. Please provide the name of your sponsoring teacher and your school's complete address and phone number. A model kit will be sent to your teacher's attention at your school. Even if multiple teams from your school participate in this Challenge, only one kit will be sent to your teacher (one kit per school).


Level I: Folding a 15 Amino Acid Protein
To complete Level I of this Challenge, you will need a four-foot-long Toober (foam-covered wire), 15 magnetic clips, a set of amino acid side chains, a blue and a red end-cap, and red, blue, and yellow dots. The side chains are color-coded on one side to reflect their chemical properties:

  • The hydrophobic side chains are colored yellow.
  • The hydrophilic, non-charged side chains are colored white.
  • The hydrophilic, negatively charged side chains are colored red.
  • The hydrophilic, positively charged side chains are colored blue.
  • The cysteine side chains are colored green.

First, identify the amino acid side chains on the colored chart and then add the colored dots to the gray-colored side to generate the standard CPK coloring scheme (carbon is gray (no dot), oxygen is red, nitrogen is blue and sulfur is yellow). Position each amino acid in its proper location on the round magnetic disk, with the gray, dotted sides all facing the same direction.

1. Using a digital camera, photograph the disk with the amino acids showing the position of the dots on each side chain and send us a ".jpg" image of the disk.

Second, distribute the 15 magnetic clips evenly along the Toober. Add the blue end-cap to one end of the Toober to mark the N-terminal end, and a red end-cap to the other end to mark the C-terminal end of the peptide.

Third, select any six hydrophobic side chains, two basic side chains, two acidic side chains, two cysteine side chains and any three hydrophilic, non-charged side chains, mix them randomly, and add them in any random sequence to the Toober.
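For teams that want to draw a candidate sequence before clipping side chains onto the Toober, the selection above can be sketched in Python. This is only an illustration under stated assumptions: the grouping of one-letter codes below is a common textbook classification, not taken from the kit (the kit's colored chart is the authority, and glycine is left out here because its grouping varies), and the function name `random_peptide` is hypothetical. The sketch also draws with replacement, which a physical kit with a limited number of side chains may not allow.

```python
import random

# Assumed textbook grouping of one-letter amino acid codes.
# Check each residue against the kit's colored chart before relying on it.
GROUPS = {
    "hydrophobic": "AVLIMFWP",   # yellow in the kit
    "basic": "KRH",              # blue
    "acidic": "DE",              # red
    "cysteine": "C",             # green
    "polar_uncharged": "STNQY",  # white
}

# Composition requested in the third step of Level I (15 residues in total).
COMPOSITION = {
    "hydrophobic": 6,
    "basic": 2,
    "acidic": 2,
    "cysteine": 2,
    "polar_uncharged": 3,
}


def random_peptide(rng=random):
    """Return a 15-residue sequence, N-terminus first, in random order."""
    residues = []
    for group, count in COMPOSITION.items():
        # Draw with replacement from the group's one-letter codes.
        residues.extend(rng.choice(GROUPS[group]) for _ in range(count))
    rng.shuffle(residues)
    return "".join(residues)


if __name__ == "__main__":
    print(random_peptide())
```

Passing a seeded generator, for example `random_peptide(random.Random(1))`, makes a draw repeatable so that a team can record how its sequence was chosen.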

Finally, fold your Toober following the basic principles that drive protein folding. You should do this in three stages:

  • First, fold your Toober such that all of the hydrophobic side chains are positioned in the middle of the globular protein.
  • Second, make adjustments to this initial structure as needed such that each acidic side chain is paired up with a basic amino acid. At this stage, your protein's shape should SIMULTANEOUSLY satisfy the first two principles of protein folding.
  • Third, adjust your protein's shape so that the two cysteine side chains are paired and could form a disulfide bond. Once again, all three principles of protein folding should be SIMULTANEOUSLY satisfied by your protein's final shape.

2. Did you succeed in folding your randomly selected amino acid sequence into a structure that simultaneously satisfies all three principles of protein folding? If your initial amino acid sequence did not allow for a satisfactory solution, try again with another random sequence until you find one that can satisfy all three principles. Using a digital camera, photograph your folded Toober and send us a ".jpg" image of it. In addition, list the sequence of 15 amino acids. Remember that your sequence should be written starting at the amino terminus of the peptide and ending at the carboxyl terminus.

3. There are many possible peptide sequences that are 15 amino acids in length.

a) Assuming that a specific amino acid can be present multiple times within a sequence, how many different sequences can be generated that are 15 amino acids long?
b) If you were able to assemble and fold one sequence per second, how long would it take for you to try out all possible sequences of peptides that are 15 amino acids in length?
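The arithmetic behind question 3 can be set up as a short Python sketch. It assumes the 20 standard amino acids and a 365.25-day year; both are assumptions of this example rather than values stated in the Challenge, and the function names are hypothetical.

```python
STANDARD_AMINO_ACIDS = 20                  # assumption: the 20 standard amino acids
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60   # assumption: a 365.25-day year


def count_sequences(length, alphabet_size=STANDARD_AMINO_ACIDS):
    """Each position is chosen independently, so the count is alphabet_size ** length."""
    return alphabet_size ** length


def years_to_try_all(length, sequences_per_second=1):
    """Years needed to try every sequence once at a fixed rate."""
    return count_sequences(length) / sequences_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    print(f"{count_sequences(15):,} sequences")
    print(f"{years_to_try_all(15):.3g} years at one sequence per second")
```

Because repeats are allowed, every one of the 15 positions has the same number of choices, which is why the count is a simple power rather than a factorial.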

4. How many amino acids make up an average human protein? Please reference the source that you used to get this value.

5. If a billion humans were each able to assemble and fold a billion amino acid sequences of the length of an average protein per second, how long would it take for all these humans to try out all possible protein sequences?
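For protein-sized sequences the count is too large for ordinary floating-point numbers, so one way to approach question 5 is to work with base-10 logarithms. The Python sketch below assumes 20 standard amino acids and a 365.25-day year, and the function name is hypothetical. The length of 100 used in the example is a placeholder only, not the answer to question 4; substitute the value you found and cited.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60   # assumption: a 365.25-day year


def log10_years_to_try_all(length, people, per_person_per_second, alphabet_size=20):
    """Return log10 of the years needed to try every sequence of `length` once."""
    log10_sequences = length * math.log10(alphabet_size)
    log10_rate = math.log10(people) + math.log10(per_person_per_second)
    return log10_sequences - log10_rate - math.log10(SECONDS_PER_YEAR)


if __name__ == "__main__":
    placeholder_length = 100   # placeholder, NOT the average human protein length
    exponent = log10_years_to_try_all(placeholder_length, 10**9, 10**9)
    print(f"about 10^{exponent:.1f} years for length {placeholder_length}")
```

Reporting the exponent rather than the number itself keeps the result readable and avoids overflow for long sequences.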

Note: If multiple groups are going to be using the model, make sure that you remove all the dots from the amino acid side chains so that each group can perform the first step of the Challenge. Please leave the 15 magnetic clips on the Toober; they break if pulled on or off multiple times.

Level II: Bioinformatics analysis of your peptide
Using your Web browser, connect to the NCBI BLAST home page http://www.ncbi.nlm.nih.gov/BLAST/. Under the Protein section connect to the link for performing Searches with short, nearly conserved matches and perform a BLAST search with your peptide sequence. (Note: Do not use the Protein-protein BLAST (blastp) link since your peptide sequence is very short and will not identify any matches using the standard program). (Help 1)

1) What is the name of the protein that gives the best match? What is the E-value of the best match?

2) Do you think this match is significant?

3) Indicate which range of residues (e.g. 320-335) in the protein match your peptide.

4) Follow the link for the best match and copy the entire protein sequence. Paste the sequence into your answer sheet.
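Once the full protein sequence is copied, the residue range reported by BLAST for question 3 can be double-checked with a short Python sketch. It assumes an exact, ungapped match (BLAST hits to short peptides are often partial or contain mismatches, in which case this simple search finds nothing), and the sequences shown are made-up placeholders rather than real data. Residue numbers are 1-based and inclusive, as in the example range in question 3.

```python
def find_residue_range(protein, peptide):
    """Return (start, end), 1-based and inclusive, of the first exact match, or None."""
    index = protein.find(peptide)   # 0-based position, -1 if absent
    if index == -1:
        return None
    return index + 1, index + len(peptide)


if __name__ == "__main__":
    # Hypothetical sequences for illustration only.
    protein = "MKTAYIAKQRQISFVKSHFSRQ"
    peptide = "AKQRQ"
    print(find_residue_range(protein, peptide))
```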

Go to the Conserved Domain Database (CDD) at NCBI http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml, paste your sequence into the dialog box and perform the search. (Help 2)

5) What is a protein domain? How is this different from a protein?

6) Does this protein contain any conserved domains? If it does not, go back to the BLAST result in Step 1 of this level and perform the analysis on the next best match. Continue down the list of matches until you find a protein that contains a conserved domain.

7) If the protein contains conserved domains, list these domains and their likely function.

8) Does the match to your peptide sequence fall within any of these domains? If so, indicate which one it is.
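Question 8 amounts to comparing two residue ranges. A minimal Python sketch of that comparison follows; the function name, domain name, and all coordinates are hypothetical placeholders, and ranges are assumed to be 1-based and inclusive as reported by BLAST and the CDD search.

```python
def classify_match(match, domain):
    """Compare two inclusive (start, end) residue ranges.

    Returns "inside" if the match lies entirely within the domain,
    "partial" if the ranges overlap only in part, and "outside" otherwise.
    """
    match_start, match_end = match
    domain_start, domain_end = domain
    if domain_start <= match_start and match_end <= domain_end:
        return "inside"
    if match_start <= domain_end and domain_start <= match_end:
        return "partial"
    return "outside"


if __name__ == "__main__":
    # Hypothetical coordinates for illustration only.
    domains = {"example_domain": (300, 450)}
    match = (320, 334)
    for name, span in domains.items():
        print(name, classify_match(match, span))
```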

Level III: Structure of the peptide and domain

The peptide sequence you are studying was generated by randomly picking amino acids. It is possible that this peptide matches a sequence within a domain. However, it is also quite possible that your peptide does not match within a conserved domain of the protein. If your sequence matches within a domain, perform the steps below in Section A. If your sequence does not match within a domain, perform the steps outlined in Section B.

Section A: Your peptide matches a sequence within a domain
If your peptide sequence matches a sequence in a particular domain of the protein, connect to the link on that specific domain. If a structure of the domain is available, download the Cn3D file and view it in Cn3D. If a structure is not available for the domain, perform the steps in Section B. (Help 3)

1) Examine the sequence alignment shown in the Sequence/Alignment Viewer window and identify the region that contains similarity to your peptide. Highlight the similar sequence in the alignment window. This will produce a yellow strand of the protein in the corresponding region in the protein window. Send us an image of the protein.

2) Examine the structure of the yellow strand.

a) Is this region of the protein buried or solvent exposed?
b) Is it in contact with other regions of the domain?
c) Indicate if it has a particular secondary structure (α-helix, β-strand, random coil). (Note: It helps if you use the Worm option in the Rendering Shortcuts section of the Style pull down menu when the Cn3D graphics window is active.)

3) After clicking on the graphic window in Cn3D, select the Show Selected Residues option in the Show/Hide pull down menu. Send us an image of the isolated peptide.

a) Is the structure of this region similar to the structure you predicted?
b) If not, indicate why you think there may be differences.

Section B: Your peptide does not match a sequence within a domain
If your peptide sequence does not match a sequence in a particular domain of the protein, or if there is no structure for the domain that your peptide matches, choose one of the other domains in the protein and connect to the link on that specific domain. If a structure of the domain is available, download the Cn3D file and view it in Cn3D. If a structure is not available for the domain, select another domain. Indicate which domain you have chosen to analyze. (Help 3)

1) Examine the sequence alignment shown in the Sequence/Alignment Viewer window and identify a 15 amino acid region of the domain that contains some positions that are strongly conserved (red and purple letters) and others that are not (blue and black letters). Highlight the sequence in the alignment window. This will produce a yellow strand of the protein in the corresponding region in the protein window. Send us an image of the protein.

2) Examine the structure of the yellow strand.

a) Is this region of the protein buried or solvent exposed?
b) Is it in contact with other regions of the domain?
c) Indicate if it has a particular secondary structure (α-helix, β-strand, random coil). (Note: It helps if you use the Worm option in the Rendering Shortcuts section of the Style pull down menu when the Cn3D graphics window is active.)

3) After clicking on the graphic window in Cn3D, select the Show Selected Residues option in the Show/Hide pull down menu. Send us an image of the isolated peptide.

a) Do you think this peptide sequence would fold up in the same way if the rest of the domain were not present? Explain your reasoning.