Kakusan4 is a nucleotide substitution model selection script written in Perl for multi-partitioned data sets. It is designed and optimized for multi-core processors and multi-processor computers.


This is a workflow diagram of Kakusan4. The blocks enclosed by dotted lines are optional.

Kakusan4 versions 4.0.2010.09.28 to 4.0.2010.11.12 contain a bug in the function that generates configuration files for RAxML. If you used one of these versions, please update to the latest version and re-analyse your data.



On Windows and MacOS X, no other programs are required, because the binary distributions for these platforms contain all requirements except PAUP*. If you use the source distribution, the requirements are listed below.


Note that the included commands of ReadSeq, PHYLIP, PAML, and Treefinder are distributed under their own licenses.

Distributed files

You can download former versions of Kakusan from here.

Automatic Installer for Windows, MacOS X, Debian GNU/Linux, and Ubuntu Linux

Public release

If you encounter any errors or have questions or comments, please let me know. Please do not ask the authors of PHYLIP, PAML, ReadSeq, Treefinder, or PAUP* for support.

User manual

Sample data sets and analysis results

Comparable models at each partition

nparam: the number of parameters, ngene: the number of genes (=partitions)

Substitution rate matrix

to \ from   A         C         G         T
A           -         πA rCA    πA rGA    πA rTA
C           πC rAC    -         πC rGC    πC rTC
G           πG rAG    πG rCG    -         πG rTG
T           πT rAT    πT rCT    πT rGT    -

πX is the frequency parameter of nucleotide X, and πX rYX is the substitution rate from nucleotide Y to X. In time-reversible models, rYX is equal to rXY. Kakusan4 and most phylogenetic analysis software can handle time-reversible models only.
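The relationship between the frequency and rate parameters can be sketched in Python. This is only an illustration, not code from Kakusan4: the function name `build_rate_matrix` and the example values are hypothetical, and the diagonal is filled in under the common assumption that the rates leaving each nucleotide sum to zero.

```python
NUCLEOTIDES = "ACGT"
# Order of the six exchangeability parameters: rCA, rGA, rTA, rGC, rTC, rTG.
RATE_ORDER = ("CA", "GA", "TA", "GC", "TC", "TG")


def build_rate_matrix(pi, rates):
    """Return m[x][y], the substitution rate from nucleotide y to x.

    pi    : dict mapping each nucleotide to its frequency.
    rates : six values in the order given by RATE_ORDER.
    Time reversibility (rYX == rXY) is enforced by storing each rate
    under an unordered pair of nucleotides.
    """
    r = {frozenset(pair): value for pair, value in zip(RATE_ORDER, rates)}
    m = {x: {y: 0.0 for y in NUCLEOTIDES} for x in NUCLEOTIDES}
    for x in NUCLEOTIDES:
        for y in NUCLEOTIDES:
            if x != y:
                m[x][y] = pi[x] * r[frozenset((x, y))]
    # Assumption: each diagonal entry balances the rates leaving that
    # nucleotide, so every "from" column sums to zero.
    for y in NUCLEOTIDES:
        m[y][y] = -sum(m[x][y] for x in NUCLEOTIDES if x != y)
    return m


# HKY85-like example: pattern (a b a a b a) with a = 1 and b = 2.
example_pi = {"A": 0.1, "C": 0.2, "G": 0.3, "T": 0.4}
example_matrix = build_rate_matrix(example_pi, (1.0, 2.0, 1.0, 1.0, 2.0, 1.0))
```

Rows are indexed by the target nucleotide and columns by the source nucleotide, matching the layout of the matrix above.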

nparam  equal frequency (nparam ±= 0)   unequal frequency (nparam += 3)   rCA, rGA, rTA, rGC, rTC, rTG
0       JC69 (Jukes and Cantor, 1969)   F81 (Felsenstein, 1981)           (a a a a a a)
1       K80/K2P (Kimura, 1980)          HKY85 (Hasegawa et al., 1985)     (a b a a b a)
2       TN93ef                          TN93 (Tamura and Nei, 1993)       (a b a a e a)
2       K81/K3P (Kimura, 1981)          K81uf/K3Puf                       (a b c c b a)
3       J1ef                            J1 (Jobb, 2008)                   (a b c a e c)
3       J2ef                            J2 (Jobb, 2008)                   (a b a d e d)
3       TIMef                           TIM (Posada, 2003)                (a b c c e a)
4       TVMef                           TVM (Posada, 2003)                (a b c d b e)
5       SYM (Zharkikh, 1994)            GTR (Tavaré, 1986)                (a b c d e f)

K81/K3P and K81uf/K3Puf are not compared in tf mode because Treefinder cannot apply these models.
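The nparam column can be reproduced from the rate patterns. The sketch below assumes that one rate class serves as the reference, so that the number of free rate parameters is the number of distinct letters minus one; this rule is inferred from the table, and the function name is hypothetical.

```python
def count_free_parameters(rate_pattern, unequal_frequencies):
    """Count the free parameters of a substitution model.

    rate_pattern        : string such as "a b a a b a", giving the rate class
                          of rCA, rGA, rTA, rGC, rTC and rTG in that order.
    unequal_frequencies : True if nucleotide frequencies are estimated
                          (adds 3 parameters), False if they are equal.
    """
    rate_classes = set(rate_pattern.split())
    # Assumption: one rate class is the reference, so it is not counted.
    nparam = len(rate_classes) - 1
    if unequal_frequencies:
        nparam += 3
    return nparam


k80_nparam = count_free_parameters("a b a a b a", unequal_frequencies=False)
gtr_nparam = count_free_parameters("a b c d e f", unequal_frequencies=True)
```

With these inputs K80/K2P has 1 parameter and GTR has 5 + 3 = 8, in agreement with the table.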

Rate heterogeneity among sites

baseml mode

Note that the autocorrelated discrete gamma and N-GAM heterogeneity models are very time-consuming and can lead to unstable results.

tf mode

PAUP* mode

How to run

To analyze multilocus data, put each locus in a separate file and make sure that all OTU names are exactly the same across the files. If your data are protein-coding, the input file name (excluding the extension) or the partition name must end with "_P" (underscore and P), for example "ND5_P.fas", "Cyt-b_P.nex", or "EF1alpha_P.gbk".
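Both input requirements can be checked before running an analysis. The following Python sketch is not part of Kakusan4; the function names are hypothetical, the OTU names are assumed to have been read from the files already, and the suffix check assumes that "_P" is case-sensitive.

```python
import os


def is_protein_coding(file_name):
    """Return True if the file name without its extension ends with "_P"."""
    stem = os.path.splitext(os.path.basename(file_name))[0]
    return stem.endswith("_P")


def otu_names_match(otu_names_by_file):
    """Return True if every file lists exactly the same set of OTU names.

    otu_names_by_file : dict mapping a file name to its list of OTU names.
    """
    name_sets = [set(names) for names in otu_names_by_file.values()]
    return all(names == name_sets[0] for names in name_sets[1:])


loci = {
    "ND5_P.fas": ["taxonA", "taxonB", "taxonC"],
    "16S.fas": ["taxonA", "taxonB", "taxonC"],
}
protein_coding_files = [name for name in loci if is_protein_coding(name)]
names_are_consistent = otu_names_match(loci)
```

Here only "ND5_P.fas" is treated as protein-coding, and the two files pass the OTU name check.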

on Windows

Run Kakusan4 from the shortcut icon in the Start menu, on the desktop, or in the Quick Launch bar. When Kakusan4 requests an input file name, drop a file into the window. After the full path of the file appears at the prompt, press Enter to proceed to the next step. To input multiple files, repeat this operation. Alternatively, you can run Kakusan4 from the "SendTo" shortcut in the right-click menu. On Windows Vista you cannot drop a file into the window, but you can use the "Copy as Path" command in the Shift+right-click menu and paste the path instead.

on MacOS X

Run "Kakusan4" in the extracted folder. When Kakusan4 requests an input file name, drop a file into the window. After the full path of the file appears at the prompt, press Enter to proceed to the next step. To input multiple files, repeat this operation.

in console, terminal or command prompt

Show help message

Please type the following in a command prompt, console, or terminal, and read the displayed messages carefully.

INSTALLDIR\kakusan4 --help (Windows version)
perl INSTALLDIR/Kakusan4.app/Contents/MacOS/kakusan4.pl --help (MacOS X version)
perl INSTALLDIR/kakusan4.pl --help

Run model selection

Please type the following in command prompt, console, or terminal.

INSTALLDIR\kakusan4 command_line_options input_file_name (Windows version)
perl INSTALLDIR/Kakusan4.app/Contents/MacOS/kakusan4.pl command_line_options input_file_name (MacOS X version)
perl INSTALLDIR/kakusan4.pl command_line_options input_file_name

Recommended settings

target software                          MrBayes         PAUP*           PHYML                RAxML           Treefinder
likelihood calculation software          Treefinder      PAUP*           PAUP* or Treefinder  baseml          Treefinder
all-loci nonpartitioned analysis         low need        cannot disable  cannot disable       low need        low need
all-loci partitioned analysis            cannot disable (for all target software)
codon position nonpartitioned analyses   low need        cannot disable  cannot disable       cannot disable  low need
codon position partitioned analyses      cannot disable  low need        low need             cannot disable  cannot disable

Suggested citation

You should also cite one or two of the following.


How to get Perl?

If you want to run the script version on Windows, please get ActivePerl. If you use MacOS X, Perl is already installed. If you use another operating system, please ask its distributor.

How to install ReadSeq?

If you have a Java runtime environment (MacOS X includes Java), put "readseq.jar" into the directory that contains this script. If you do not have a Java runtime environment but can install one, download and install the Java Runtime Environment. If you cannot install Java, put the C version of the readseq executable into a directory listed in the PATH environment variable or into the directory that contains this script.

Which directories are listed in the PATH environment variable?

If you use Windows, type the following in the command prompt to display them.

echo %PATH%

If you use MacOS X or another operating system, type the following in a console or terminal.

echo $PATH

How to install Treefinder?

Please read the Treefinder manual. If you use Windows, after installing Treefinder you must copy "tf.exe" into a directory listed in the PATH environment variable, or into the directory that contains this script or the data files.

How to get command line version of PAUP*?

In the Windows version, "win-paup4b10-console.exe" in the PAUP* installation directory is the command line executable. In the portable version, it is "paup4b10-*".

How to install command line version of PAUP*?

Put the executable file into a directory listed in the PATH environment variable, or into the directory that contains this script or the data files. Then rename it to "paup.exe" (on Windows) or "paup" (on MacOS X and other operating systems). On MacOS X and other operating systems, also set its permission to executable.
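To confirm that an executable such as "paup" or "tf" can be found, the search can be sketched in Python with the standard library function `shutil.which`. This is an independent check, not how Kakusan4 itself locates programs; the helper name `find_executable` and the `extra_dirs` argument (standing in for the script or data directory) are hypothetical.

```python
import os
import shutil


def find_executable(name, extra_dirs=()):
    """Return the full path of an executable, or None if it is not found.

    name       : program name, for example "paup" or "tf".
    extra_dirs : additional directories searched before those in PATH,
                 for example the directory containing the data files.
    """
    directories = list(extra_dirs) + [os.environ.get("PATH", "")]
    search_path = os.pathsep.join(directories)
    return shutil.which(name, path=search_path)


# Example: look for "paup" in the current directory and then in PATH.
paup_location = find_executable("paup", extra_dirs=[os.getcwd()])
```

A result of None means that the file is missing from the searched directories, has a different name, or is not marked as executable.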

Which criterion should I use?

In ML analysis, I recommend AICc. In Bayesian MCMC analysis, I recommend BIC.

Kakusan4 calculates 6 AICcs and BICs. Which should I use?

Frankly speaking, I do not have a definite answer. However, AICc4/BIC4 (where the sample size is the number of sites) is the one usually used. The correspondence between the AICc/BIC number and the sample size is shown below.

  1. the minimum number of substitutions over the tree (parsimonious tree length)
  2. the sum of the minimum number of substitutions at each site
  3. the sum of the minimum number of character states at each site
  4. the number of sites (alignment length)
  5. the number of variable sites
  6. the number of all of the characters (Sites x OTUs)
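The effect of the sample size can be seen from the standard definitions AICc = -2 lnL + 2k + 2k(k + 1)/(n - k - 1) and BIC = -2 lnL + k ln(n), where k is the number of parameters and n is the sample size. The Python sketch below applies these textbook formulas and derives sample sizes 4, 5 and 6 from a toy alignment. It is not Kakusan4's own code, the function names are hypothetical, and gaps and ambiguity codes are treated as ordinary characters for simplicity.

```python
import math


def aicc(log_likelihood, nparam, sample_size):
    """Akaike information criterion with small-sample correction."""
    aic = -2.0 * log_likelihood + 2.0 * nparam
    correction = (2.0 * nparam * (nparam + 1)) / (sample_size - nparam - 1)
    return aic + correction


def bic(log_likelihood, nparam, sample_size):
    """Bayesian information criterion."""
    return -2.0 * log_likelihood + nparam * math.log(sample_size)


def sample_sizes(alignment):
    """Return sample sizes 4, 5 and 6 for a list of aligned sequences."""
    n_otus = len(alignment)
    n_sites = len(alignment[0])
    # A site is variable if its column holds more than one character state.
    n_variable = sum(1 for column in zip(*alignment) if len(set(column)) > 1)
    return {4: n_sites, 5: n_variable, 6: n_sites * n_otus}


toy_alignment = ["ACGT", "ACGA", "ACTA"]
sizes = sample_sizes(toy_alignment)
# Same log-likelihood and parameter count, different sample size definitions.
scores = {number: bic(-100.0, 2, n) for number, n in sizes.items()}
```

Because the penalty terms depend on n, the same likelihood and parameter count can rank models differently under the six definitions.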

When I input multi-partitioned data or multiple files for mixed models, Kakusan4 generates many configuration files. Which should I use?

Open and read the comparison results among nonpartitioned, proportional and separate models.

Published works citing Kakusan

  1. Kiyoshi, T., Takahashi, J., Yamanaka, T., Tanaka, K., Hamasaki, K., Tsuchida, K. and Tsubaki, Y., 2011, "Taxonomic uncertainty of a highly endangered brook damselfly, Copera tokyoensis Asahina, 1948 (Odonata: Platycnemididae), revealed by the mitochondrial gene genealogy", Conservation Genetics, doi:10.1007/s10592-011-0189-x.
  2. Toju, H. and Fukatsu, T., 2011, "Diversity and infection prevalence of endosymbionts in natural populations of the chestnut weevil: relevance of local climate and host plants", Molecular Ecology, 20, 853-868, doi:10.1111/j.1365-294X.2010.04980.x.
  3. Hoso, M., Kameda, Y., Wu, S.-P., Asami, T., Kato, M. and Hori, M., 2010, "A speciation gene for left-right reversal in snails results in anti-predator adaptation", Nature Communications, 1, 133, doi:10.1038/ncomms1133.
  4. Tsubaki, R., Kameda, Y. and Kato, M., 2010, "Pattern and process of diversification in an ecologically diverse epifaunal bivalve group Pterioidea (Pteriomorphia, Bivalvia)", Molecular Phylogenetics and Evolution, doi:10.1016/j.ympev.2010.11.014.
  5. Kômoto, N., Yukihiro, K., Ueda, K. and Tomita, S., 2010, "Exploring the molecular phylogeny of phasmids with whole mitochondrial genome sequences", Molecular Phylogenetics and Evolution, doi:10.1016/j.ympev.2010.10.013.
  6. Lamb, T., Biswas, S. and Bauer, A. M., 2010, "A phylogenetic reassessment of African fossorial skinks in the subfamily Acontinae (Squamata: Scincidae): evidence for parallelism and polyphyly", Zootaxa, 2657, 33-46.
  7. Hayashi, M. and Sota, T., 2010, "Identification of elmid larvae (Coleoptera: Elmidae) from Sanin District of Honshu, Japan, based on mitochondrial DNA sequences", Entomological Science, doi:10.1111/j.1479-8298.2010.00404.x.
  8. Sano, K., Kawaguchi, M., Yoshikawa, M., Iuchi, I. and Yasumasu, S., 2010, "Evolution of the teleostean zona pellucida gene inferred from the egg envelope protein genes of the Japanese eel, Anguilla japonica", FEBS Journal, doi:10.1111/j.1742-4658.2010.07874.x.
  9. Bailey, A. L., Brewer, M. S., Hendrixson, B. E. and Bond, J. E., 2010, "Phylogeny and Classification of the Trapdoor Spider Genus Myrmekiaphila: An Integrative Approach to Evaluating Taxonomic Hypotheses", PLoS ONE, 5, e12744, doi:10.1371/journal.pone.0012744.
  10. Kawaguchi, M., Hiroi, J., Miya, M., Nishida, M., Iuchi, I. and Yasumasu, S., 2010, "Intron-loss evolution of hatching enzyme genes in Teleostei", BMC Evolutionary Biology, 10, 260, doi:10.1186/1471-2148-10-260.
  11. Naher, M., Motohashi, K., Watanabe, H., Chikuo, Y., Senda, M., Suga, H., Brasier, C. and Kageyama, K., in press, "Phytophthora chrysanthemi sp. nov., a new species causing root rot of chrysanthemum in Japan", Mycological Progress, doi:10.1007/s11557-010-0670-9.
  12. Watanabe, K., Motohashi, K. and Ono, Y., 2010, "Description of Pestalotiopsis pallidotheae: a new species from Japan", Mycoscience, 51, 182-188, doi:10.1007/s10267-009-0025-z.
  13. Yoshikawa, N., Matsui, M. and Nishikawa, K., 2010, "Allozymic Variation and Phylogeography of Two Genetic Types of Onychodactylus japonicus (Amphibia: Caudata: Hynobiidae) Sympatric in the Kinki District, Japan", Zoological Science, 27, 344-355, doi:10.2108/zsj.27.344.
  14. Lin, L.-H., Ji, X., Diong, C.-H., Du, Y. and Lin, C.-X., 2010, "Phylogeography and population structure of the Reevese's Butterfly Lizard (Leiolepis reevesii) inferred from mitochondrial DNA sequences", Molecular Phylogenetics and Evolution, 56, 601-607, doi:10.1016/j.ympev.2010.04.032.
  15. Matsui, M., Hamidy, A., Murphy, R. W., Khonsue, W., Yambun, P., Shimada, T., Ahmad, N., Belabut, D. M. and Jiang, J.-P., 2010, "Phylogenetic relationships of megophryid frogs of the genus Leptobrachium (Amphibia, Anura) as revealed by mtDNA gene sequences", Molecular Phylogenetics and Evolution, 56, 259-272, doi:10.1016/j.ympev.2010.03.014.
  16. Suzuki, M., Hashimoto, T., Nakayama, T. and Yoshizaki, M., 2010, "Morphology and molecular relationships of Leptofauchea rhodymenioides (Rhodymeniales, Rhodophyta), a new record for Japan", Phycological Research, 58, 116-131, doi:10.1111/j.1440-1835.2010.00569.x.
  17. Kurniawan, N., Islam, M. M., Djong, T. H., Igawa, T., Daicus, M. B., Yong, H. S., Wanichanon, R., Khan, M. M. R., Iskandar, D. T., Nishioka, M. and Sumida, M., 2010, "Genetic divergence and evolutionary relationship in Fejervarya cancrivora from Indonesia and other Asian countries inferred from allozyme and mtDNA sequence analyses", Zoological Science, 27, 222-233, doi:10.2108/zsj.27.222.
  18. Snijman, D. A. and Meerow, A. W., 2010, "Floral and macroecological evolution within Cyrtanthus (Amaryllidaceae): Inferences from combined analyses of plastid ndhF and nrDNA ITS sequences", South African Journal of Botany, 76, 217-238, doi:10.1016/j.sajb.2009.10.010.
  19. Toju, H., Hosokawa, T., Koga, R., Nikoh, N., Meng, X. Y., Kimura, N. and Fukatsu, T., 2010, ""Candidatus Curculioniphilus buchneri," a Novel Clade of Bacterial Endocellular Symbionts from Weevils of the Genus Curculio", Applied and Environmental Microbiology, 76, 275-282, doi:10.1128/AEM.02154-09.
  20. Makino, W., Knox, M. and Duggan, I. C., 2010, "Invasion, genetic variation and species identity of the calanoid copepod Sinodiaptomus valkanovi", Freshwater Biology, 55, 375-386, doi:10.1111/j.1365-2427.2009.02287.x.
  21. Meerow, A. W., Noblick, L., Borrone, J. W., Couvreur, T. L. P., Mauro-Herrera, M., Hahn, W. J., Kuhn, D. N., Nakamura, K., Oleas, N. H. and Schnell, R. J., 2009, "Phylogenetic Analysis of Seven WRKY Genes across the Palm Subtribe Attaleinae (Arecaceae) Identifies Syagrus as Sister Group of the Coconut", PLoS ONE, 4, e7353, doi:10.1371/journal.pone.0007353.
  22. Toju, H. and Sota, T., 2009, "Do arms races punctuate evolutionary stasis? Unified insights from phylogeny, phylogeography and microevolutionary processes", Molecular Ecology, 18, 3940-3954, doi:10.1111/j.1365-294X.2009.04340.x.
  23. Makino, W. and Tanabe, A. S., 2009, "Extreme population genetic differentiation and secondary contact in the freshwater copepod Acanthodiaptomus pacificus in the Japanese Archipelago", Molecular Ecology, 18, 3699-3713, doi:10.1111/j.1365-294X.2009.04307.x.
  24. Nagata, N., Kubota, K., Takami, Y. and Sota, T., 2009, "Historical divergence of mechanical isolation agents in the ground beetle Carabus arrowianus as revealed by phylogeographical analyses", Molecular Ecology, 18, 1408-1421, doi:10.1111/j.1365-294X.2009.04117.x.
  25. Yokogawa, T. and Yahara, T., 2009, "Mitochondrial phylogeny certified PGL (Paternal Genome Loss) is of single origin and haplodiploidy sensu stricto (arrhenotoky) did not evolve from PGL in the scale insects (Hemiptera: Coccoidea)", Genes & Genetic Systems, 84, 57-66, doi:10.1266/ggs.84.57.
  26. Wang, D., Fan, W., Han, G. Z. and He, C. Q., 2009, "The selection pressure analysis of chicken anemia virus structural protein gene VP1", Virus Genes, 38, 259-262, doi: 10.1007/s11262-008-0316-z.
  27. Verbruggen, H. and Theriot, E., 2008, "Building trees of algae: some advances in phylogenetic and evolutionary analysis", European Journal of Phycology, 43, 229-252, doi:10.1080/09670260802207530.
  28. Kawai, H., Hanyuda, T., Lindeberg, M. and Lindstrom, S. C., 2008, "Morphology and molecular phylogeny of Aureophycus aleuticus gen. et sp. nov. (Laminariales, Phaeophyceae) from the Aleutian Islands", Journal of Phycology, 44, 1013-1021, doi: 10.1111/j.1529-8817.2008.00548.x.
  29. Han, G. Z., Liu, X. P. and Li, S. S., 2008, "Cross-species recombination in the haemagglutinin gene of canine distemper virus", Virus Research, 136, 198-201, doi: 10.1016/j.virusres.2008.04.022.
  30. Sato, H. and Murakami, N., 2008, "Reproductive isolation among cryptic species in the ectomycorrhizal genus Strobilomyces: Population-level CAPS marker-based genetic analysis", Molecular Phylogenetics and Evolution, 48, 326-334, doi: 10.1016/j.ympev.2008.01.033.
  31. Nagata, N., Kubota, K., Yahiro, K. and Sota, T., 2007, "Mechanical barriers to introgressive hybridization revealed by mitochondrial introgression patterns in Ohomopterus ground beetle assemblages", Molecular Ecology, 16, 4822-4836, doi: 10.1111/j.1365-294X.2007.03569.x.




Tenth public release. A bug of multiple input files mode was fixed.


Ninth public release. RAxML calling option was changed.


Eighth public release. A function of optimizing log-likelihoods by RAxML was added.


Seventh public release. A bug of the function of generating configuration files for RAxML was fixed. (The configuration files for separate models generated by previous version cannot apply separate models.)


Sixth public release. Messages of interactive mode were improved. I renamed "single model" and "codonshared model" to "nonpartitioned model" and "codonnonpartitioned model", respectively.


Fifth public release. Executable binary of Treefinder was integrated into binary distribution for Windows and MacOS X. Messages of interactive mode were improved. Default settings were changed.


Fourth public release. Output options for PHYML and RAxML were added (but these functions are still experimental).


Third public release. Spaces were supported in input file path.


Second public release of 4.0 branch. A bug of treating noncoding data was fixed.


First public release of 4.0 branch. Wild cards were supported in input file specification. Messages of interactive mode were refined. Default settings were changed.


Fifth release candidate of 4.0 branch. Mac binaries were updated for Tiger. Messages of interactive mode and default settings were changed.


Fourth release candidate of 4.0 branch. Messages of interactive mode and command line options were changed.


Third release candidate of 4.0 branch. Mac binaries were updated for Leopard. Error messages were refined.


Second release candidate of 4.0 branch. A bug of calculating BIC was fixed. The code was refactored.


First release candidate of 4.0 branch. Bugs of analysing single protein-coding data and no variation data were fixed.


Third testing release of 3.5 branch. The code was refactored.


Second testing release of 3.5 branch. Messages were refined.


First testing release of 3.5 branch. A function to compare single, proportional and separate models was added.


Third testing release of 3.1 branch. A function of outputting TL files to compare single, proportional and separate models was added. Pre-selection sorting of candidate models was disabled in default setting because this test may be too sensitive on long sequence partition.


Second testing release of 3.1 branch. Default settings were changed.


First testing release of 3.1 branch. A function of specifying starting trees was added.

