From Opengenome.net

Protein modelling

486 bytes added, 17:51, 12 May 2006
== [[단백질 구조 모델링|Protein structure modelling]] practical guide ==

[http://biome.ngic.re.kr/ProteinModelling/]

=== Problem definition ===
* Build 3D structural models of given proteins or proteins of interest
** Initial input: amino acid sequence of the target protein
** Final output: its predicted 3D structure

=== Method ===
* Find one or more structural templates for the target protein via a sequence homology search against known structures (program: BLASTP or PSI-BLAST; database: PDB)
* Download the template structures in PDB format (web server: RCSB PDB)
* Align the amino acid sequence of the target protein with that of the template protein(s) (program: CLUSTALW; output: PIR format)
* Build 3D structural models based on the multiple sequence alignment and the template 3D structure(s) (program: MODELLER; input: ATM, ALI, and TOP files)
** Manual build
*** Manually convert the file formats from PDB to ATM and from PIR to ALI
*** Write a TOP script file
*** Run MODELLER
** Automatic build (some PDB and PIR files cannot be converted to correct ATM and ALI files because of amino-acid sequence mismatches)
*** Perl script to generate ATM, ALI, and TOP files from PDB and PIR files: [[perl script for ATM|txt]], gz
*** Command-line arguments, in order: base_directory_path pir_file_name target_protein_id (exactly 4 letters) one_or_more_PDB_file_names (exactly 4-letter prefix)
*** Run the Perl script (e.g. ../bin/makeModellerInput.pl /home/user/Protein3DModelling/KCIP KCIP.pir KCIP 1QJA.pdb)
*** Run MODELLER (e.g. mod7v7 KCIP.top)
** Example
*** ATM file: 5fd1.atm
*** ALI file: alignment.ali
*** TOP file: model-default.top
*** Command: mod7v7 model-default.top
*** Output PDB file (predicted model): 1fdx.B99990001.pdb
** MODELLER manual
*** PDF, HTML
* Display and compare the 3D structures of the model and the template proteins (program: RasMol)

=== Sample proteins ===
* Sample protein A: 14-3-3 protein gamma (Protein kinase C inhibitor protein-1; KCIP-1)
 >sw|P61981|143G_HUMAN 14-3-3 protein gamma (Protein kinase C inhibitor protein-1) (KCIP-1)
 VDREQLVQKARLAEQAERYDDMAAAMKNVTELNEPLSNEERNLLSVAYKNVVGARRSSWRVISSIEQKTSADGNEKKIEMVRAYREKIEKELEAVCQDVLSLLDNYLIKNCSETQYESKVFYLKMKGDYYRYLAEVATGEKRATVVESSEKAYSEAHEISKEHMQPTHPIRLGLALNYSVFYYEIQNAPEQACHLAKTAFDDAIAELDTLNEDSYKDSTLIMQLLRDNLTLWTSDQQDDDGGEGNN
* Sample protein B: GAK_HUMAN Cyclin G-associated kinase (GAKH)
 >sw|O14976|GAK_HUMAN Cyclin G-associated kinase
 MSLLQSALDFLAGPGSLGGASGRDQSDFVGQTVELGELRLRVRRVLAEGGFAFVYEAQDVGSGREYALKRLLSNEEEKNRAIIQEVCFMKKLSGHPNIVQFCSAASIGKEESDTGQAEFLLLTELCKGQLVEFLKKMESRGPLSCDTVLKIFYQTCRAVQHMHRQKPPIIHRDLKVENLLLSNQGTIKLCDFGSATTISHYPDYSWSAQRRALVEEEITRNTTPMYRTPEIIDLYSNFPIGEKQDIWALGCILYLLCFRQHPFEDGAKLRIVNGKYSIPPHDTQYTVFHSLIRAMLQVNPEERLSIAEVVHQLQEIAAARNVNPKSPITELLEQNGGYGSATLSRGPPPPVGPAGSGYSGGLALAEYDQPYGGFLDILRGGTERLFTNLKDTSSKVIQSVANYAKGDLDISYITSRIAVMSFPAEGVESALKNNIEDVRLFLDSKHPGHYAVYNLSPRTYRPSRFHNRVSECGWAARRAPHLHTLYNICRNMHAWLRQDHKNVCVVHCMDGRAASAVAVCSFLCFCRLFSTAEAAVYMFSMKRCPPGIWPSHKRYIEYMCDMVAEEPITPHSKPILVRAVVMTPVPLFSKQRSGCRPFCEVYVGDERVASTSQEYDKMRDFKIEDGKAVIPLGVTVQGDVLIVIYHARSTLGGRLQAKMASMKMFQIQFHTGFVPRNATTVKFAKYDLDACDIQEKYPDLFQVNLEVEVEPRDRPSREAPPWENSSMRGLNPKILFSSREEQQDILSKFGKPELPRQPGSTAQYDAGAGSPEAEPTDSDSPPSSSADASRFLHTLDWQEEKEAETGAENASSKESESALMEDRDESEVSDEGGSPISSEGQEPRADPEPPGLAAGLVQQDLVFEVETPAVLPEPVPQEDGVDLLGLHSEVGAGPAVPPQACKAPSSNTDLLSCLLGPPEAASQGPPEDLLSEDPLLLASPAPPLSVQSTPRGGPPAAADPFGPLLPSSGNNSQPCSNPDLFGEFLNSDSVTVPPSFPSAHSAPPPSCSADFLHLGDLPGEPSKMTASSSNPDLLGGWAAWTETAASAVAPTPATEGPLFSPGGQPAPCGSQASWTKSQNPDPFADLGDLSSGLQGSPAGFPPGGFIPKTATTPKGSSSWQTSRPPAQGASWPPQAKPPPKACTQPRPNYASNFSVIGAREERGVRAPSFAQKPKVSENDFEDLLSNQGFSSRSDKKGPKTIAEMRKQDLAKDTDPLKLKLLDWIEGKERNIRALLSTLHTVLWDGESRWTPVGMADLVAPEQVKKHYRRAVLAVHPDKAAGQPYEQHAKMIFMELNDAWSEFENQGSRPLF

=== Available Online Tool Servers ===
# BLASTP:
## [http://www.ncbi.nlm.nih.gov/BLAST/Blast.cgi?CMD=Web&LAYOUT=TwoWindows&AUTO_FORMAT=Semiauto&ALIGNMENTS=250&ALIGNMENT_VIEW=Pairwise&CDD_SEARCH=on&CLIENT=web&DATABASE=nr&DESCRIPTIONS=500&ENTREZ_QUERY=%28none%29&EXPECT=10&FILTER=L&FORMAT_OBJECT=Alignment&FORMAT_TYPE=HTML&I_THRESH=0.005&MATRIX_NAME=BLOSUM62&NCBI_GI=on&PAGE=Proteins&PROGRAM=blastp&SERVICE=plain&SET_DEFAULTS.x=41&SET_DEFAULTS.y=5&SHOW_OVERVIEW=on&END_OF_HTTPGET=Yes&SHOW_LINKOUT=yes&GET_SEQUENCE=yes NCBI BLASTP]
## [http://srs.ebi.ac.uk/srsbin/cgi-bin/wgetz?-page+Launch+-id+1uBRI1Ot9iE+-appl+BlastP+-launchFrom+top EBI SRS BLASTP]
# PSI-BLAST:
## [http://www.ncbi.nlm.nih.gov/BLAST/Blast.cgi?CMD=Web&LAYOUT=TwoWindows&AUTO_FORMAT=Semiauto&ALIGNMENTS=250&ALIGNMENT_VIEW=Pairwise&CLIENT=web&COMPOSITION_BASED_STATISTICS=on&DATABASE=nr&CDD_SEARCH=on&DESCRIPTIONS=500&ENTREZ_QUERY=%28none%29&EXPECT= NCBI PSI-BLAST]
# CLUSTALW:
## [http://www.ebi.ac.uk/clustalw/ EBI ClustalW]
## [http://www.genebee.msu.su/clustal/basic.html Genebee ClustalW]
# SRS servers: NGIC SRS, public SRS servers
# Search engine: Google (search for other online servers yourself)

=== Available Online Databases ===
# PDB amino-acid FASTA file: NCBI BLAST DB (local: pdbaa.gz)
# PDB:
## [http://www.rcsb.org/pdb/ RCSB]
## [http://pdb.ccdc.cam.ac.uk/pdb/ UK PDB mirror]
## [http://pdb.protein.osaka-u.ac.jp/pdb/ Japan PDB mirror]

=== Downloadable programs ===
# BLAST: ftp://ftp.ncbi.nlm.nih.gov/blast/executables/release/ (local: DOS)
# CLUSTALW: ftp://ftp.ebi.ac.uk/pub/software/dos/clustalw/ (local: DOS[WindowsXP])
# MODELLER: http://salilab.org/modeller/ (local: DOS)
# RasMol: http://www.openrasmol.org/ (local: Windows)
# Perl: http://www.perl.org/ (local: DOS[WindowsXP])
# Other utilities: ALZip.exe (for unzip, untar, and ungzip)

=== Solution Example ===
# Found templates (PDB format)
## Sample protein A: [http://biome.ngic.re.kr/ProteinModelling/templates/1QJA.pdb 1QJA.pdb]
## Sample protein B: [http://biome.ngic.re.kr/ProteinModelling/templates/1N4C.pdb 1N4C.pdb]
----
# Multiple sequence alignments (PIR format)
## Sample protein A: [http://biome.ngic.re.kr/ProteinModelling/buildingModels/KCIP/modellerResult/KCIP.pir KCIP.pir]
## Sample protein B: [http://biome.ngic.re.kr/ProteinModelling/buildingModels/GAKH/modellerResult/GAKH.pir.ORG GAKH.pir.ORG]
----
* MODELLER input files
** Sample protein A (automatically generated):
*** [http://biome.ngic.re.kr/ProteinModelling/buildingModels/KCIP/modellerResult/1QJA.atm ATM]
*** [http://biome.ngic.re.kr/ProteinModelling/buildingModels/KCIP/modellerResult/KCIP.ali ALI]
*** [http://biome.ngic.re.kr/ProteinModelling/buildingModels/KCIP/modellerResult/KCIP.top TOP]
** Sample protein B (manually edited; C-terminal only):
*** [http://biome.ngic.re.kr/ProteinModelling/buildingModels/GAKH/modellerResult/1N4C.atm ATM]
*** [http://biome.ngic.re.kr/ProteinModelling/buildingModels/KCIP/modellerResult/KCIP.ali ALI]
*** [http://biome.ngic.re.kr/ProteinModelling/buildingModels/GAKH/modellerResult/GAKH.top TOP]
* MODELLER output files
** Sample protein A: [http://biome.ngic.re.kr/ProteinModelling/buildingModels/KCIP/modellerResult/KCIP.pdb KCIP.pdb]
** Sample protein B: [http://biome.ngic.re.kr/ProteinModelling/buildingModels/GAKH/modellerResult/GAKH.pdb GAKH.pdb]
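
The automatic build step under Method warns that some PDB and PIR files fail to convert because of amino-acid sequence mismatches. The following is a minimal Python sketch of how one might check for such a mismatch before running the Perl script. It assumes standard fixed-column ATOM records in the PDB file and a template row copied from the PIR alignment. The function names (ca_sequence, first_mismatch) are hypothetical and are not part of makeModellerInput.pl or MODELLER; non-standard residues are simply reported as X.

```python
# Hypothetical helpers for illustration; not part of MODELLER or the Perl script.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def ca_sequence(pdb_lines, chain_id=None):
    """Return the one-letter sequence read from CA atoms of ATOM records.

    Uses the fixed PDB columns: atom name 13-16, residue name 18-20,
    chain 22, residue number plus insertion code 23-27.
    """
    residues = []
    seen = set()
    for line in pdb_lines:
        if not line.startswith("ATOM") or len(line) < 27:
            continue
        if line[12:16].strip() != "CA":
            continue
        chain = line[21]
        if chain_id is not None and chain != chain_id:
            continue
        key = (chain, line[22:27])
        if key in seen:
            # Same residue listed again (alternate location); count it once.
            continue
        seen.add(key)
        residues.append(THREE_TO_ONE.get(line[17:20], "X"))
    return "".join(residues)


def first_mismatch(structure_seq, aligned_row):
    """Compare a structure sequence with an alignment row.

    Gaps ('-'), the PIR terminator ('*'), and whitespace are removed from
    the alignment row first. Returns the 0-based index of the first
    difference, or None when the two sequences are identical.
    """
    expected = "".join(aligned_row.split()).replace("-", "").replace("*", "")
    for i, (a, b) in enumerate(zip(structure_seq, expected)):
        if a != b:
            return i
    if len(structure_seq) != len(expected):
        # One sequence is a prefix of the other.
        return min(len(structure_seq), len(expected))
    return None
```

For example, one could read the lines of 1QJA.pdb, call ca_sequence on them for the chain used as the template, and pass the result to first_mismatch together with the 1QJA row of KCIP.pir. A return value other than None points at the first residue where the structure file and the alignment disagree.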