Genome Manuscript

Decoding the genome of pigeonpea (Cajanus cajan), an orphan legume crop of resource-poor smallholder farmers in Asia and Africa

Rajeev K Varshney1,2, Wenbin Chen3, Yupeng Li4, Arvind K Bharti5, Rachit K Saxena1, Jessica A Schlueter6, Mark TA Donoghue7, Sarwar Azam1, Guangyi Fan3, Adam M Whaley6, Andrew D Farmer5, Jaime Sheridan6, Aiko Iwata4, Reetu Tuteja1,7, R Varma Penmetsa8, Wei Wu9, Hari D Upadhyaya1, Shiaw-Pyng Yang9, Trushar Shah1, KB Saxena1, Todd Michael9, W Richard McCombie10, Bicheng Yang3, Gengyun Zhang3, Huanming Yang3, Jun Wang3,11, Charles Spillane7, Douglas R Cook8, Gregory D May5, Xun Xu3,12, Scott A Jackson4

1International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru 502324, India
2CGIAR Generation Challenge Programme (GCP), c/o CIMMYT, 06600 Mexico DF, Mexico
3BGI-Shenzhen, Shenzhen, 518083, China
4University of Georgia, 111 Riverbend Rd., Athens, GA 30605, USA
5National Center for Genome Resources (NCGR), Santa Fe, New Mexico 87505, USA
6University of North Carolina, Charlotte, North Carolina 28223, USA
7National University of Ireland Galway (NUIG), Botany and Plant Science, C306 Aras de Brun, University Road, Galway, Ireland
8University of California, 354 Hutchison Hall, One Shields Avenue, Davis, CA 95616-8680, USA
9Monsanto Company, 800 North Lindbergh Blvd., Creve Coeur, Missouri 63167, USA
10Cold Spring Harbor Laboratory (CSHL), One Bungtown Road, Cold Spring Harbor, New York 11724, USA
11Department of Biology, University of Copenhagen, DK-2100, Denmark
12BGI-Americas, Cambridge, MA, 02142, USA
*Correspondence should be addressed to R.K.V. (


Pigeonpea is an important legume food crop grown primarily by resource-poor farmers in the semi-arid tropical regions of the world. Next-generation sequencing (Illumina) was used to generate 237.2 Gb of sequence which, along with Sanger-based BAC-end sequences and a genetic map, was assembled into scaffolds representing 72.7% (605.78 Mb) of the 833.07 Mb pigeonpea genome. Genome analysis predicted 48,680 genes for pigeonpea and also pointed to a potential role for certain gene families, such as drought tolerance-related genes, during evolution and domestication. Although a few segmental duplication events were found, no recent genome-wide duplication events, such as those seen in soybean, were observed. This pigeonpea reference genome sequence will facilitate the identification of the genetic basis of important traits and accelerate the development of improved pigeonpea varieties.

Click here to download "Assembly and annotation data" for pigeonpea genome

Supplementary material:

Supplementary Table 1 Construction of libraries, generation and filtering of sequencing data used for genome assembly

Supplementary Table 2 Statistics of the final genome assembly

Supplementary Table 3 Genome sequence assembly into chromosome-level pseudomolecules

Supplementary Table 4 Estimation of pigeonpea genome size based on K-mer statistics

Supplementary Table 5 Assessment of the transcript coverage with the transcript contig data

Supplementary Table 6 General statistics of gene prediction and predicted protein-coding genes for pigeonpea

Supplementary Table 7 Statistics of CEGMA evaluation

Supplementary Table 8 Comparison of protein coding genes between pigeonpea and soybean

Supplementary Table 9 Functional annotation of predicted genes for pigeonpea

Supplementary Table 10 Identification of non-coding RNA genes in the pigeonpea genome

Supplementary Table 11 Summary of synteny blocks between pigeonpea and other legume genomes: Medicago, soybean and Lotus

Supplementary Table 12 Functional analysis (Gene Ontology and InterPro) of genes specific to pigeonpea, Phaseoleae and legume genomes

Supplementary Table 13 Comparison of sequence and structural features of ORFans and non-ORFans in the pigeonpea genome

Supplementary Table 14 Details on the identification of SSRs, their distribution and primer design for developing genetic markers

Supplementary Table 15 Primer sequences for the SSR markers

Supplementary Table 16 SNP mining in different crossing parents

Supplementary Table 17 SNP information across 12 pigeonpea genotypes

Supplementary Table 18 Heterozygosity estimated in the Asha genotype

Supplementary Table 19 GO analysis of drought responsive genes in pigeonpea

Supplementary Figure 1 Comparative GC content distributions in the genome sequence data of pigeonpea along with the soybean (G. max), poplar (P. trichocarpa) and castor (R. communis) genomes.

Supplementary Figure 2 A scheme showing how scaffolds were linked, using marker loci mapped on linkage groups, to define the chromosome-level pseudomolecules.

Supplementary Figure 3 K-mer analysis for estimating the pigeonpea genome size.

Supplementary Figure 4 Supporting evidence from the different gene prediction methods used to define the final gene set for pigeonpea.

Supplementary Figure 5 FISH (fluorescence in situ hybridization) image of rRNA genes on a nucleus (left) and mitotic chromosomes (right).

Supplementary Figure 6 Whole-genome dot-plot between pigeonpea linkage groups (x-axis) and soybean chromosome arms without the pericentromeric regions (y-axis).

Supplementary Figure 7 Whole-genome dot-plot between pigeonpea linkage groups (x-axis) and Medicago truncatula chromosome arms (y-axis).

Supplementary Figure 8 Whole-genome dot-plot between pigeonpea linkage groups (x-axis) and Lotus japonicus chromosome arms (y-axis).

Supplementary Figure 9 Syntenic relationships of individual pigeonpea chromosomes with the soybean genome.

Supplementary Figure 10 Fragmentary genome duplications in the pigeonpea genome.

Supplementary Figure 11 Occurrence of the ancient and legume-family duplication events in the pigeonpea genome.

Supplementary Figure 12 Gene Ontology (GO) terms in three groups, pigeonpea, four legumes (pigeonpea, soybean, Medicago and Lotus), and grape.