Bioinformatics. A Practical Guide to the Analysis of Genes and Proteins - Baxevanis A.D.

Baxevanis A.D. Bioinformatics. A Practical Guide to the Analysis of Genes and Proteins - New York, 2001. - 493 p.
ISBN 0-471-22392-1

Sequences are submitted to PredictProtein either by sending an E-mail message or by using a Web front end. Several options are available for sequence submission: the query sequence can be submitted in single-letter amino acid code or by its SWISS-PROT identifier. In addition, a multiple sequence alignment in FASTA format or as a PIR alignment can be submitted for secondary structure prediction.
[Figure 11.4: the flavodoxin query sequence aligned with the predictions from nnpredict, PredictProtein, PSIPRED, GOR, Levin, DPM, SOPMA, and the CNRS Consensus, with the PDB assignment for 1OFV on the bottom line; the assigned elements are labeled Beta 1 to Beta 5 and Alpha 1 to Alpha 6.]
Figure 11.4. Comparison of secondary structure predictions by various methods. The sequence of flavodoxin was used as the query and is shown on the first line of the alignment. For each prediction, H denotes an α-helix, E a β-strand, and T a β-turn; all other positions are assumed to be random coil. Correctly assigned residues are shown in inverse type. The methods used are listed along the left side of the alignment and are described in the text. At the bottom of the figure is the secondary structure assignment given in the PDB file for flavodoxin (1OFV, Smith et al., 1983).
The input message, sent to predictprotein@embl-heidelberg.de, takes the following form:
Joe Buzzcut
National Human Genome Research Institute, NIH
buzzcut@baldguys.org
do NOT align
# FASTA list homeodomain proteins
>ANTP
---KRGRQTYTRYQTLELEKEFHFNRYLTRRRRIEIAHALSLTERQIKIWFQNRRMKWKK
>HDD
MDEKRPRTAFSSEQLARLKREFNENRYLTERRRQQLSSELGLNEAQIKIWFQNKRAKIKK
>DLX
-KIRKPRTIYSSLQLQALNHRFQQTQYLALPERAELAASLGLTQTQVKIWFQNKRSKFKK
>FTT
---RKRRVLFSQAQVYELERRFKQQKYLSAPEREHLASMIHLTPTQVKIWFQNHRYKMKR
>Pax6
--LQRNRTSFTQEQIEALEKEFERTHYPDVFARERLAAKIDLPEARIQVWFSNRRAKWRR
Above is an example of a FASTA-formatted multiple sequence alignment of homeodomain proteins submitted for secondary structure prediction. After the name, affiliation, and address lines, the # sign signals to the server that a sequence in one-letter code follows. The sequence format is essentially FASTA, except that blanks are not allowed. For this alignment, the phrase do NOT align, placed before the line starting with #, ensures that the alignment will not be realigned. Nothing is allowed to follow the sequence. The output, sent as an E-mail message, is copious but contains a large amount of pertinent information. The results can instead be retrieved from an ftp site by adding the qualifier return no mail on any line before the line starting with #. This can be useful for E-mail services that have difficulty handling very large output files. The output file can be plain text or HTML, with or without PHD graphics.
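The layout rules just described can be sketched in a few lines of Python. This is only an illustration of the message format shown in the example, not code supplied by the PredictProtein server: the function name build_alignment_message and its parameters are hypothetical, the equal-length check is an assumption about what a valid alignment looks like, and the sketch only builds the message text without sending it.

```python
def build_alignment_message(name, affiliation, email, title, records,
                            realign=False, return_by_mail=True):
    """Assemble an e-mail body for a FASTA-formatted alignment.

    records is a list of (label, aligned_sequence) pairs.
    All names here are hypothetical; only the line layout follows the text.
    """
    if not records:
        raise ValueError("at least one sequence is required")
    # Assumption: rows of an alignment should all have the same length.
    if len({len(seq) for _, seq in records}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    # The text states that blanks are not allowed in the sequence.
    for label, seq in records:
        if any(ch.isspace() for ch in seq):
            raise ValueError(f"blanks are not allowed in sequence {label!r}")

    lines = [name, affiliation, email]
    if not return_by_mail:
        # Qualifier goes on any line before the line starting with '#'.
        lines.append("return no mail")
    if not realign:
        lines.append("do NOT align")
    lines.append(f"# FASTA list {title}")
    for label, seq in records:
        lines.append(f">{label}")
        lines.append(seq)
    # Nothing may follow the sequence, so the body ends here.
    return "\n".join(lines) + "\n"


message = build_alignment_message(
    "Joe Buzzcut",
    "National Human Genome Research Institute, NIH",
    "buzzcut@baldguys.org",
    "homeodomain proteins",
    [("ANTP", "---KRGRQTYTRYQTLELEKEFHFNRYLTRRRRIEIAHALSLTERQIKIWFQNRRMKWKK"),
     ("HDD", "MDEKRPRTAFSSEQLARLKREFNENRYLTERRRQQLSSELGLNEAQIKIWFQNKRAKIKK")],
)
print(message)
```

Passing return_by_mail=False adds the return no mail qualifier ahead of the # line, which corresponds to the ftp retrieval option mentioned above.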
The results of the MaxHom search are returned, complete with a multiple alignment that may be of use in further study, such as profile searches or phylogenetic studies. If the submitted sequence has a known homolog in PDB, the PDB identifiers are furnished. Information on the method itself comes next, followed by the actual prediction. In a recent release, the output can also be customized by specifying available options. Unlike nnpredict, PredictProtein returns a "reliability index of prediction" for each position, ranging from 0 to 9, with 9 indicating the maximum confidence that a secondary structure assignment has been made correctly. The results returned by the server for this particular sequence, as compared with those obtained by other methods, are shown in modified form in Figure 11.4.
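One way to use a per-position reliability index is to keep only the assignments that meet a chosen confidence level. The sketch below assumes the prediction and the reliability indices have already been extracted from the server output as two strings of equal length; that representation, the function name mask_low_confidence, and the default threshold of 5 are all illustrative assumptions rather than part of the PredictProtein output format.

```python
def mask_low_confidence(prediction, reliability, threshold=5, placeholder="."):
    """Replace predicted states whose reliability index is below threshold.

    prediction  : string of per-residue states (e.g. H, E, T, or -)
    reliability : string of digits 0-9, one per residue (hypothetical layout)
    """
    if len(prediction) != len(reliability):
        raise ValueError("prediction and reliability must have equal length")
    if any(ch not in "0123456789" for ch in reliability):
        raise ValueError("reliability indices must be digits from 0 to 9")
    return "".join(
        state if int(score) >= threshold else placeholder
        for state, score in zip(prediction, reliability)
    )


# Made-up values for illustration only, not taken from Figure 11.4.
masked = mask_low_confidence("HHHEE-", "981273", threshold=5)
print(masked)
```

Raising the threshold trades coverage for confidence: fewer positions keep an assignment, but those that remain are the ones the server rated most reliable.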