PAML primate HKY85¶
This example just uses PAML to estimate parameters of an HKY85 process and compute a log likelihood, for a small alignment and phylogenetic tree.
Newick tree¶
1 2 | 9 1
(TamarinEDN, ((MacaqueECP, (OrangutanECP, (ChimpanzeeECP, GorillaECP))), (MacaqueEDN, (OrangutanEDN, (ChimpanzeeEDN, GorillaEDN)))));
|
Fasta file¶
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 | >OrangutanECP
ATGGTTCCAAAACTGTTCACTTCCCAAATTTGTCTGCTTCTTCTGTTGGGGCTTAGTGGTGTGGGGGGCTCACTCCATGCCAAACCCCGACAGTTTACGAGGGCTCAGTGGTTTGCCATCCAGCACGTCAGTCTGAACCCTCCTCAATGCACCACTGCAATGCGGGTAATTAACAATTATCAACGGCGTTGCAAAGACCAAAATACTTTTCTTCGTACAACTTTTGCTAATGTAGTTAATGTTTGTGGTAACCCAAATATAACCTGTCCTCGTAACAGAACTCTCCACAATTGTCATCGGAGTAGATTCCAGGTGCCTTTACTCCACTGTAACCTCACAGGTCAGAATATTTCAAACTGCAAGTATGCAGACAGAACAGAAAGGAGGTTCTATGTAGTTGCATGTGACAACAGAGATCCACGGGATTCTCCACGGTATCCTGTGGTTCCAGTTCACCTGGATACCACCATCTAA
>OrangutanEDN
ATGGTTCCAAAACTGTTCACTTCTCAAATTTCCCTGCTTCTTCTGTTGGGGCTTCTGGCTGTGGACGGCTCACTCCATGTCAAACCTCCACAGTTTACCTGGGCTCAATGGTTTGAAACCCAGCACATCAATATGACCTCCCAGCAATGCAACAATGCAATGCAGGTCATTAACAATTTTCAACGGCGTTGCAAAAACCAAAATACTTTTCTGCGTACAACTTTTGCTAATGTAGTTAATGTTTGTGGTAACCCAAATATAACCTGTCCTAGTAACAGAAGTCGCAACAATTGTCATCATAGTGGAGTCCAGGTGCCTTTAATCCACTGTAACCTCACAAGTCAGAATATTTCAAACTGCAGGTATGCGCAGACACCAGCAAACATGTTCTATATAGTTGCATGTGACAACAGGGATCCACGGGACCCTCCACAGTATCCGGTGGTTCCAGTTCACCTGGATAGAATCATCTAA
>MacaqueECP
ATGGTTCCAAAACTGTTCACTTCCCAAATTTGTCTGCTTCTTCTGTTGGGGCTTATGGGTGTGGAGGGCTCACTCCATGCCAGACCCCCACAGTTTACAAAGGCTCAGTGGTTTGCCATCCAGCACATCAATGTGAACCCCCCTCGATGCACCATTGCAATGCGGGTAATAAATAATTATCAACGGCGTTGCAAAAACCAAAATACTTTTCTTCGTACAACTTTTGCATATACAGCTAATGTTTGTCGTAACGAACGTATACGCTGCCCTCGTAACAGAACTCTCCACAATTGTCATCGTAGTAGATACCGGGTGCCTTTACTCCACTGTGACCTCACAGGTCAGAATATTTCAACCTGCAGGTATGCAGACAGACCAGGACGGAGGTTCTATGTAGTTGCATGTGAAAGCAGAGATCCACGGGATTCTCCACGGTATCCAGTGGTTCCAGTTCACCTGGATACCACCATCTAA
>ChimpanzeeEDN
ATGGTTCCAAAACTGTTCACTTCCCAAATTTGTCTGCTTCTTCTGTTGGGGCTTCTGGCTGTGGAGGGCTCACTCCATGTCAAACCTCCACAGTTTACCTGGGCTCAATGGTTTGAAACCCAGCACATCAATATGACATCCCAGCAATGCACCAATGCAATGCAGGTCATTAACAATTATCAACGGCGATGCAAAAACCAAAATACTTTCCTTCTTACAACTTTTGCTAACGTAGTTAATGTTTGTGGTAACCCAAATATGACCTGTCCTAGTAACAAAACTCGCAAAAATTGTCATCAAAGTGGAAGCCAGGTGCCTTTAATCCACTGTAACCTCACAAGTCAGAATATTTCAAACTGCAGGTATGCGCAGACACCAGCAAACATGTTCTATATAGTTGCATGTGACAACAGAGATCAACGGGACCCTCCACAGTATCCGGTGGTTCCAGTTCACCTGGATAGAATCATCTAA
>MacaqueEDN
ATGGTTCCAAAACTGTTCACTTCCCAAATTTGTCTGCTTCTTCTGTTGGGGCTTATGGGTGTGGAAGGCTCACTTCATGCCAAACCCGGACAATTTACCTGGGCTCAGTGGTTTGAAATCCAGCATATAAATATGACCTCTGGCCAATGCACCAATGCAATGCAGGTCATTAACAATTATCAACGGCGATGCAAAAATCAAAATACTTTTCTTCTTACAACTTTTGCTGATGTAGTTCATGTCTGTGGTAACCCAAGCATGCCCTGCCCTAGCAACACAAGTCTCAACAATTGTCATCATAGTGGAGTCCAGGTGCCTTTAATCCACTGTAACCTCACAAGTCGAAGGATTTCAAATTGCAGGTATACACAGACAACAGCAAACAAGTACTACATAGTTGCATGTAACAACAGCGATCCACGGGACCCTCCACAGTATCCAGTGGTTCCAGTTCACCTGGATAGAATCATCTAA
>GorillaEDN
ATGGTTCCAAAACTGTTCACTTCCCAAATTTGTCTGCTTCTTCTGTTGGGGCTTCTGGCAGTGGAGGGCTCACTCCATGTCAAACCTCCACAGTTTACCTGGGCTCAATGGTTTGAAACCCAGCACATCAATATGACCTCCCAGCAATGCACCAATGCAATGCGGGTCATTAACAATTATCAACGGCGATGCAAAAACCAAAATACTTTCCTTCTTACAACTTTTGCTAACGTAGTTAATGTTTGTGGTAACCCAAATATGACCTGTCCTAGTAACAAAACTCGCAAAAATTGTCATCACAGTGGAAGCCAGGTGCCTTTAATCCACTGTAACCTCACAAGTCAGAATATTTCAAACTGCAGGTATGCGCAGACACCAGCAAACATGTTCTATATAGTTGCATGTGACAACAGAGATCAACGGGACCCTCCACAGTATCCAGTGGTTCCAGTTCACCTGGATAGAATCATCTAA
>GorillaECP
ATGGTTCCAAAACTGTTCACTTCCCAAATTTGTCTGCTTCTTCTGTTGGGGCTTATGGGTGTGGAGGGCTCACTCCATGCCAGACCCCCACAGTTTACGAGGGCTCAGTGGTTTGCCATCCAGCACATCAGTCTGAACCCCCCTCGATGCACCATTGCAATGCGGGTAATTAACAATTATCGATGGCGTTGCAAAAACCAAAATACTTTTCTTCGTACAACTTTTGCTAATGTAGTTAATGTTTGTGGTAACCAAAGTATACGCTGCCTTCATAACAGAACTCTCAACAATTGTCATCGGAGTAGATTCCGGGTGCCTTTACTCCACTGTGACCTCACAGGTCAGAATATTTCAAACTGCAGGTATGCAGACAGACCAGGAAGGAGGTTCTATGTAGTTGCATGTGACAACAGAGATCCACAGGATTCTCCACGGTATCCTGTGGTTCCTGTTCACCTGGATACCACCATCTAA
>TamarinEDN
ATGGTTCCAAAACTGTTCACTTCCCAAATTTGCGTGCTTCTTCTTTTCGGGCTTTTGAGTGTGGAGGTCTCACTCCAGGTCAAACCCCAGCAGTTTTCCTGGGCTCAGTGGTTTAGCATCCAGCACATCCAAACAACTCCCCTCCACTGCACCTCTGCAATGCGGGCAATTAACAGGTATCAACCTCGATGCAAAAACCAAAATACTTTTCTTCATACAACTTTTGCTAATGTAGTTAATGTTTGTGGTAACACAAATATCACCTGCCCTCGTAATGCATCTCTCAACAATTGTCATCACAGTGGAGTCCAGGTGCCTTTAACCTACTGTAACCTCACAGGTCAGACTATTTCAAACTGTGTGTATTCCTCGACTCAGGCAAACATGTTCTATGTAGTTGCATGTGACAACAGAGATCCACGGGATCCTCCACAGTATCCAGTGGTCCCGGTTCACCTGGATACCACCATCTAA
>ChimpanzeeECP
ATGGTTCCAAAACTGTTCACTTCCCAAATTTGTCTGCTTCTTCTGTTGGGGCTTATGGGTGTGGAGGGCTCACTCCATGCCAGACCCCCACAGTTTACGAGGGCTCAGTGGTTTGCCATCCAGCACATCAGTCTGAACCCCCCTCGATGCACCATTGCAATGCGGGTAATTAACAATTATCGATGGCGTTGCAAAAACCAAAATACTTTTCTTCGTACAACTTTTGCTAATGTAGTTAATGTTTGTGGTAACCAAAGTATACGCTGCCCTCATAACAGAACCCTCAACAATTGTCATCAGAGTAGATTCCGGGTGCCTTTACTCCACTGTGACCTCACAGGTCAGAATATTTCAAACTGCGGGTATGCAGACAGACCAGGAAGGAGGTTCTATGTAGTTGCATGTGACAACAGAGATCCACGGGATTCTCCACGGTATCCTGTGGTTCCAGTTCACCTGGATACCACCATCTAA
|
Baseml control file¶
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 | seqfile = for.paml.ECP_EDN.fasta
treefile = for.paml.newick.tree
outfile = mlb * main result file name
noisy = 3 * 0,1,2,3,9: how much rubbish on the screen
verbose = 0 * 0: concise; 1: detailed, 2: too much
runmode = 0 * 0: user tree; 1: semi-automatic; 2: automatic
model = 4 * 0:JC69, 1:K80, 2:F81, 3:F84, 4:HKY85, 5:T92, 6:TN93, 7:REV
* 8:UNREST, 9:REVu; 10:UNRESTu
Mgene = 1 * 0:rates, 1:separate; 2:diff pi, 3:diff kappa, 4:all diff
clock = 0 * 0:no clock, 1:clock; 2:local clock; 3:CombinedAnalysis
fix_kappa = 0 * 0: estimate kappa; 1: fix kappa at value below
kappa = 2.5 * initial or fixed kappa
fix_alpha = 1 * 0: estimate alpha; 1: fix alpha at value below
alpha = 0. * initial or fixed alpha, 0:infinity (constant rate)
ncatG = 5 * # of categories in the dG, AdG, or nparK models of rates
fix_rho = 1 * 0: estimate rho; 1: fix rho at value below
rho = 0 * initial or fixed rho, 0:no correlation
nparK = 0 * rate-class models. 1:rK, 2:rK&fK, 3:rK&MK(1/K), 4:rK&MK
nhomo = 1 * 0 & 1: homogeneous, 2: kappa for branches, 3:N1, 4:N2, 5:user
getSE = 0 * 0: don't want SEs of estimates, 1: want SEs
RateAncestor = 0 * (0,1,2): rates (alpha>0) or ancestral states
method = 0 * 0: simultaneous; 1: one branch at a time
Small_Diff = 1e-6
cleandata = 0 * remove sites with ambiguity data (1:yes, 0:no)?
* fix_blength = 1 * 0: ignore, -1: random, 1: initial, 2: fixed
|
Output file (mlb)¶
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 | BASEML (in paml version 4.6, August 2012) for.paml.ECP_EDN.fasta HKY85
Frequencies..
T C A G
OrangutanECP 0.27848 0.25105 0.26582 0.20464
OrangutanEDN 0.27637 0.25105 0.28692 0.18565
MacaqueECP 0.27004 0.25949 0.26371 0.20675
ChimpanzeeEDN 0.26793 0.24895 0.30169 0.18143
MacaqueEDN 0.27004 0.24684 0.29536 0.18776
GorillaEDN 0.26582 0.25316 0.29958 0.18143
GorillaECP 0.28270 0.24895 0.25738 0.21097
TamarinEDN 0.28692 0.27426 0.25949 0.17932
ChimpanzeeECP 0.27637 0.25316 0.25738 0.21308
Homogeneity statistic: X2 = 0.02231 G = 0.02218
Average 0.27496 0.25410 0.27637 0.19456
# constant sites: 318 (67.09%)
ln Lmax (unconstrained) = -1446.712071
Distances:HKY85 (kappa) (alpha set at 0.00)
This matrix is not used in later m.l. analysis.
OrangutanECP
OrangutanEDN 0.1471( 1.4771)
MacaqueECP 0.0965( 2.3785) 0.1857( 1.7205)
ChimpanzeeEDN 0.1501( 1.6664) 0.0459( 1.2841) 0.1889( 1.8947)
MacaqueEDN 0.1784( 2.3576) 0.1171( 2.6393) 0.2106( 2.1212) 0.1174( 2.8899)
GorillaEDN 0.1467( 1.4737) 0.0509( 1.9570) 0.1821( 1.8540) 0.0111( 1.5375) 0.1173( 3.1303)
GorillaECP 0.0632( 4.5261) 0.1617( 1.9359) 0.0631( 1.8565) 0.1648( 2.1473) 0.1926( 2.5604) 0.1611( 1.9297)
TamarinEDN 0.1955( 1.7006) 0.1872( 1.8382) 0.2381( 1.9132) 0.1805( 1.8140) 0.2177( 2.3613) 0.1735( 1.7645) 0.2117( 1.9055)
ChimpanzeeECP 0.0636( 5.4030) 0.1557( 1.9072) 0.0634( 2.1638) 0.1588( 2.1223) 0.1863( 2.5434) 0.1552( 1.9020) 0.0133(10.5181) 0.2012( 1.8041)
TREE # 1: (8, ((3, (1, (9, 7))), (5, (2, (4, 6))))); MP score: 214.00
lnL(ntime: 16 np: 20): -1727.569162 +0.000000
10..8 10..11 11..12 12..3 12..13 13..1 13..14 14..9 14..7 11..15 15..5 15..16 16..2 16..17 17..4 17..6
0.075656 0.041862 0.069114 0.042439 0.006624 0.043808 0.008219 0.006325 0.006377 0.029546 0.072769 0.024503 0.025128 0.018754 0.004233 0.006750 2.121452 0.252198 0.254579 0.287265
tree length = 0.48211
(TamarinEDN, ((MacaqueECP, (OrangutanECP, (ChimpanzeeECP, GorillaECP))), (MacaqueEDN, (OrangutanEDN, (ChimpanzeeEDN, GorillaEDN)))));
(TamarinEDN: 0.075656, ((MacaqueECP: 0.042439, (OrangutanECP: 0.043808, (ChimpanzeeECP: 0.006325, GorillaECP: 0.006377): 0.008219): 0.006624): 0.069114, (MacaqueEDN: 0.072769, (OrangutanEDN: 0.025128, (ChimpanzeeEDN: 0.004233, GorillaEDN: 0.006750): 0.018754): 0.024503): 0.029546): 0.041862);
Detailed output identifying parameters
kappa under HKY85: 2.12145
Base frequencies:
0.25220 0.25458 0.28726 0.20596
|