Difference between revisions of "Protein modelling"
Revision as of 09:49, 26 August 2005
http://biome.ngic.re.kr/ProteinModelling/
Problem definition
- Build 3D structural models of given proteins or proteins of interest
- Initial input: amino acid sequence of the target protein
- Final output: its predicted 3D structure
Method
- Find one or more structural templates of the target protein via sequence homology search against known structures (program: BLASTP or PSI-BLAST, database: PDB)
- Download template structures in PDB format (web server: RCSB PDB)
- Align the amino acid sequence of the target protein with the sequence(s) of the template protein(s) (program: CLUSTALW, output: PIR format)
- Build 3D structural models based on the multiple sequence alignment and the template 3D structure(s) (program: MODELLER, input: ATM, ALI, and TOP files)
  - manual build
    - manually convert the PDB file to ATM format and the PIR file to ALI format
    - write a TOP script file
    - run MODELLER
  - automatic build (some PDB and PIR files fail to convert to correct ATM and ALI files because of amino-acid sequence mismatches)
    - Perl script to generate ATM, ALI, and TOP files from PDB and PIR files: txt, gz
      - command-line arguments, in order: base_directory_path pir_file_name target_protein_id (exactly 4 letters) one_or_more_PDB_file_names (exactly 4-letter prefix)
    - run the Perl script (e.g. ../bin/makeModellerInput.pl /home/user/Protein3DModelling/KCIP KCIP.pir KCIP 1QJA.pdb)
    - run MODELLER (e.g. mod7v7 KCIP.top)
  - example
    - ATM file: 5fd1.atm
    - ALI file: alignment.ali
    - TOP file: model-default.top
    - command: mod7v7 model-default.top
    - output PDB file (predicted model): 1fdx.B99990001.pdb
  - MODELLER manual
    - PDF, HTML
- Display and compare the 3D structures of the model and the template proteins (program: RasMol)
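The automatic build steps above can be sketched as a small Python helper that checks the arguments before anything is run. This is only an illustration: the function name `build_commands` is hypothetical, "exactly 4 letters" is assumed to mean four alphanumeric characters (PDB IDs such as 1QJA begin with a digit), and the TOP file is assumed to be named after the target protein ID, as in the `mod7v7 KCIP.top` example.

```python
from pathlib import Path


def build_commands(base_dir, pir_file, target_id, pdb_files,
                   script="../bin/makeModellerInput.pl", modeller="mod7v7"):
    """Return (convert_cmd, model_cmd) as argument lists after basic checks.

    Hypothetical helper: it only assembles the two command lines described
    in the Method section; it does not run them.
    """
    # Assumption: "exactly 4 letters" means four alphanumeric characters.
    if len(target_id) != 4 or not target_id.isalnum():
        raise ValueError(f"target ID must be exactly 4 characters: {target_id!r}")
    if not pdb_files:
        raise ValueError("at least one template PDB file is required")
    for name in pdb_files:
        prefix = Path(name).stem  # "1QJA.pdb" -> "1QJA"
        if len(prefix) != 4 or not prefix.isalnum():
            raise ValueError(f"PDB file prefix must be exactly 4 characters: {name!r}")

    # Argument order follows the list above:
    # base directory, PIR file, target ID, then one or more PDB files.
    convert_cmd = [script, base_dir, pir_file, target_id, *pdb_files]
    # Assumption: the generated TOP file is named <target_id>.top.
    model_cmd = [modeller, f"{target_id}.top"]
    return convert_cmd, model_cmd


if __name__ == "__main__":
    convert_cmd, model_cmd = build_commands(
        "/home/user/Protein3DModelling/KCIP", "KCIP.pir", "KCIP", ["1QJA.pdb"]
    )
    print(" ".join(convert_cmd))
    print(" ".join(model_cmd))
```

The returned lists can be passed to `subprocess.run` one after the other. Checking the 4-character rule up front gives a clearer error than a failed conversion later.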
Sample proteins
- Sample protein A: 14-3-3 protein gamma (Protein kinase C inhibitor protein-1; KCIP-1)
>sw|P61981|143G_HUMAN 14-3-3 protein gamma (Protein kinase C inhibitor protein-1) (KCIP-1)
VDREQLVQKARLAEQAERYDDMAAAMKNVTELNEPLSNEERNLLSVAYKNVVGARRSSWR
VISSIEQKTSADGNEKKIEMVRAYREKIEKELEAVCQDVLSLLDNYLIKNCSETQYESKV
FYLKMKGDYYRYLAEVATGEKRATVVESSEKAYSEAHEISKEHMQPTHPIRLGLALNYSV
FYYEIQNAPEQACHLAKTAFDDAIAELDTLNEDSYKDSTLIMQLLRDNLTLWTSDQQDDD
GGEGNN
- Sample protein B: GAK_HUMAN Cyclin G-associated kinase (GAKH)
>sw|O14976|GAK_HUMAN Cyclin G-associated kinase
MSLLQSALDFLAGPGSLGGASGRDQSDFVGQTVELGELRLRVRRVLAEGGFAFVYEAQDV
GSGREYALKRLLSNEEEKNRAIIQEVCFMKKLSGHPNIVQFCSAASIGKEESDTGQAEFL
LLTELCKGQLVEFLKKMESRGPLSCDTVLKIFYQTCRAVQHMHRQKPPIIHRDLKVENLL
LSNQGTIKLCDFGSATTISHYPDYSWSAQRRALVEEEITRNTTPMYRTPEIIDLYSNFPI
GEKQDIWALGCILYLLCFRQHPFEDGAKLRIVNGKYSIPPHDTQYTVFHSLIRAMLQVNP
EERLSIAEVVHQLQEIAAARNVNPKSPITELLEQNGGYGSATLSRGPPPPVGPAGSGYSG
GLALAEYDQPYGGFLDILRGGTERLFTNLKDTSSKVIQSVANYAKGDLDISYITSRIAVM
SFPAEGVESALKNNIEDVRLFLDSKHPGHYAVYNLSPRTYRPSRFHNRVSECGWAARRAP
HLHTLYNICRNMHAWLRQDHKNVCVVHCMDGRAASAVAVCSFLCFCRLFSTAEAAVYMFS
MKRCPPGIWPSHKRYIEYMCDMVAEEPITPHSKPILVRAVVMTPVPLFSKQRSGCRPFCE
VYVGDERVASTSQEYDKMRDFKIEDGKAVIPLGVTVQGDVLIVIYHARSTLGGRLQAKMA
SMKMFQIQFHTGFVPRNATTVKFAKYDLDACDIQEKYPDLFQVNLEVEVEPRDRPSREAP
PWENSSMRGLNPKILFSSREEQQDILSKFGKPELPRQPGSTAQYDAGAGSPEAEPTDSDS
PPSSSADASRFLHTLDWQEEKEAETGAENASSKESESALMEDRDESEVSDEGGSPISSEG
QEPRADPEPPGLAAGLVQQDLVFEVETPAVLPEPVPQEDGVDLLGLHSEVGAGPAVPPQA
CKAPSSNTDLLSCLLGPPEAASQGPPEDLLSEDPLLLASPAPPLSVQSTPRGGPPAAADP
FGPLLPSSGNNSQPCSNPDLFGEFLNSDSVTVPPSFPSAHSAPPPSCSADFLHLGDLPGE
PSKMTASSSNPDLLGGWAAWTETAASAVAPTPATEGPLFSPGGQPAPCGSQASWTKSQNP
DPFADLGDLSSGLQGSPAGFPPGGFIPKTATTPKGSSSWQTSRPPAQGASWPPQAKPPPK
ACTQPRPNYASNFSVIGAREERGVRAPSFAQKPKVSENDFEDLLSNQGFSSRSDKKGPKT
IAEMRKQDLAKDTDPLKLKLLDWIEGKERNIRALLSTLHTVLWDGESRWTPVGMADLVAP
EQVKKHYRRAVLAVHPDKAAGQPYEQHAKMIFMELNDAWSEFENQGSRPLF
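The sample entries above are in FASTA format: a header line starting with `>` followed by the sequence wrapped over several lines. Before pasting such an entry into BLASTP or CLUSTALW, it can help to check it with a few lines of Python. The sketch below is a minimal reader; the function names `parse_fasta` and `split_header` are hypothetical, and the `db|accession|entry_name description` header layout is assumed from the two sample headers only.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def split_header(header):
    """Split a header such as 'sw|P61981|143G_HUMAN 14-3-3 protein gamma'."""
    ident, _, description = header.partition(" ")
    parts = ident.split("|")
    if len(parts) != 3:
        raise ValueError(f"unexpected header layout: {header!r}")
    database, accession, entry_name = parts
    return {
        "database": database,
        "accession": accession,
        "entry_name": entry_name,
        "description": description,
    }


if __name__ == "__main__":
    # Truncated excerpt of sample protein A (first 40 residues, re-wrapped
    # to 20 per line for brevity); not the full sequence.
    demo = """>sw|P61981|143G_HUMAN 14-3-3 protein gamma
VDREQLVQKARLAEQAERYD
DMAAAMKNVTELNEPLSNEE
"""
    for header, sequence in parse_fasta(demo):
        info = split_header(header)
        print(info["accession"], info["entry_name"], len(sequence))
```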
Available Online Tool Servers
- BLASTP: 'NCBI BLASTP', 'EBI SRS BLASTP'
- PSI-BLAST: 'NCBI PSI-BLAST'
- CLUSTALW: 'EBI ClustalW', 'Genebee ClustalW'
- SRS servers: 'NGIC SRS', 'Public SRS servers'
- Search engine: Google (search for other online servers yourself)
Available Online Databases
- PDB amino-acid FASTA file: NCBI BLAST DB (local: pdbaa.gz)
- PDB: RCSB, 'UK PDB mirror', 'Japan PDB mirror'
Downloadable programs
- BLAST: ftp://ftp.ncbi.nlm.nih.gov/blast/executables/release/ (local: DOS)
- CLUSTALW: ftp://ftp.ebi.ac.uk/pub/software/dos/clustalw/ (local: DOS[WindowsXP])
- MODELLER: http://salilab.org/modeller/ (local: DOS)
- RasMol: http://www.openrasmol.org/ (local: Windows)
- Perl: http://www.perl.org/ (local: DOS[WindowsXP])
- Other utilities: ALZip.exe (for extracting zip, tar, and gzip archives)
Solution Example
- Multiple sequence alignments (PIR format)
- Sample protein A: KCIP.pir
- Sample protein B: GAKH.pir.ORG
- MODELLER input files
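
For orientation, the ALI file that MODELLER reads is a PIR-style alignment. Each entry has a `>P1;code` line, a description line of ten colon-separated fields (entry type, code, first residue, first chain, last residue, last chain, then four optional fields), and the aligned sequence ending in `*`. The Python sketch below writes one such entry. The function name `format_ali_entry` is hypothetical and the toy alignment string is made up, so check the exact field requirements against the manual for the MODELLER version in use.

```python
def format_ali_entry(code, aligned_seq, kind="sequence",
                     first_res="", first_chain="",
                     last_res="", last_chain="", width=75):
    """Format one PIR-style entry for a MODELLER alignment (ALI) file.

    kind is "sequence" for the target or e.g. "structureX" for a template.
    The last four description fields (name, source, resolution, R-factor)
    are left empty in this sketch.
    """
    fields = [kind, code, str(first_res), first_chain,
              str(last_res), last_chain, "", "", "", ""]
    # The aligned sequence (gaps written as '-') is terminated by '*'.
    body = aligned_seq + "*"
    # Slice manually; textwrap may break lines at the '-' gap characters.
    chunks = [body[i:i + width] for i in range(0, len(body), width)]
    lines = [f">P1;{code}", ":".join(fields), *chunks]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Toy aligned fragment for illustration only, not a real alignment.
    print(format_ali_entry("1fdx", "AYV--INDSC", width=5), end="")
```

For a template entry, pass `kind="structureX"` together with the residue range and chain of the template.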