Difference between revisions of "Protein modelling"

From Opengenome.net

Revision as of 09:33, 26 August 2005

http://biome.ngic.re.kr/ProteinModelling/


Problem definition

  • Build 3D structural models of given proteins or proteins of interest
    • Initial input: amino acid sequence of the target protein
    • Final output: its predicted 3D structure

Method

  • Find one or more structural templates of the target protein via sequence homology search against known structures (program: BLASTP or PSI-BLAST, database: PDB)
  • Download template structures in PDB format (web server: RCSB PDB)
  • Align the amino acid sequence of the target protein with that of each template protein (program: CLUSTALW, output: PIR format)
  • Build 3D structural models based on the multiple sequence alignment and the template 3D structure(s) (program: MODELLER, input: ATM, ALI, and TOP files)
    • manual build
      • manually convert file formats from PDB to ATM and PIR to ALI, respectively
      • write a TOP script file
      • run MODELLER
    • automatic build (some PDB and PIR files are not successfully converted to correct ATM and ALI files due to amino-acid sequence mismatch)
      • perl script to generate ATM, ALI, and TOP files from PDB and PIR files: txt, gz
      • command-line arguments in order: base_directory_path pir_file_name target_protein_id(exactly 4 letters) one_or_more_PDB_file_names(exactly 4-letter prefix)
      • run the perl script (e.g. ../bin/makeModellerInput.pl /home/user/Protein3DModelling/KCIP KCIP.pir KCIP 1QJA.pdb)
      • run MODELLER (e.g. mod7v7 KCIP.top)
    • example
      • ATM file: 5fd1.atm
      • ALI file: alignment.ali
      • TOP file: model-default.top
      • command: mod7v7 model-default.top
      • output PDB file (predicted model): 1fdx.B99990001.pdb
    • MODELLER manual: PDF, HTML
  • Display and compare the 3D structures of the model and the template proteins (program: RasMol)
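The automatic build takes its arguments in a fixed order and constrains the target ID and the PDB file prefixes to four characters, so checking them before running anything can save a failed conversion. The following is a minimal Python sketch that only validates the arguments and assembles the two command lines; it runs nothing. The function name build_commands is hypothetical. Two assumptions are made: "4 letters" is read as four alphanumeric characters (the example 1QJA contains a digit), and the generated TOP file is named after the target ID, as the example "mod7v7 KCIP.top" suggests.

```python
import os
import re

# Assumption: "exactly 4 letters" means four alphanumeric characters,
# because the example PDB prefix "1QJA" contains a digit.
ID_PATTERN = re.compile(r"[A-Za-z0-9]{4}")


def build_commands(base_dir, pir_file, target_id, pdb_files,
                   script="../bin/makeModellerInput.pl", modeller="mod7v7"):
    """Return (prepare_cmd, model_cmd) as argument lists; nothing is executed."""
    if not ID_PATTERN.fullmatch(target_id):
        raise ValueError(f"target id must be exactly 4 characters: {target_id!r}")
    if not pdb_files:
        raise ValueError("at least one template PDB file is required")
    for name in pdb_files:
        # The prefix is the file name without directory and extension.
        prefix = os.path.splitext(os.path.basename(name))[0]
        if not ID_PATTERN.fullmatch(prefix):
            raise ValueError(f"PDB file prefix must be exactly 4 characters: {name!r}")
    # Argument order follows the list above:
    # base directory, PIR file, target id, then one or more PDB files.
    prepare_cmd = [script, base_dir, pir_file, target_id, *pdb_files]
    # Assumption: the TOP file produced by the script is <target_id>.top.
    model_cmd = [modeller, f"{target_id}.top"]
    return prepare_cmd, model_cmd


if __name__ == "__main__":
    prepare, model = build_commands(
        "/home/user/Protein3DModelling/KCIP", "KCIP.pir", "KCIP", ["1QJA.pdb"]
    )
    print(" ".join(prepare))
    print(" ".join(model))
```

The returned lists can be passed to subprocess.run one after the other once the perl script and MODELLER are installed; the script path and the mod7v7 executable name are taken from the examples above and may differ on other systems.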