
1 Introduction

Rubber is a polymer material obtained either from the exudates of certain tropical plants (natural rubber) or derived from petroleum and natural gas (synthetic rubber). From rubber bands to catheters, condoms and latex threads, rubber goes into more than 50,000 products. Natural rubber (NR) is produced by over 7500 plant species (Compagnon 1986), confined to 300 genera of seven families, viz., Euphorbiaceae, Apocynaceae, Asclepiadaceae, Asteraceae, Moraceae, Papaveraceae and Sapotaceae (Cornish et al. 1993). At least two fungal species are also known to make NR (Stewart et al. 1955). Hevea brasiliensis (Willd. ex A. de Juss.) Müll. Arg. is the major contributor of NR produced worldwide (Greek 1991). Hevea trees descended from seedlings that were transplanted from Brazil to South and South East Asia, and that have since undergone several cycles of breeding, are now the prime source of the modern world’s NR.

The latex found in the latex vessels of the inner bark of H. brasiliensis is obtained by tapping (shaving the bark with a sharp knife) and collecting the latex in cups (Fig. 1). Addition of an acid, such as formic acid, coagulates the rubber by altering the ionic balance. The solidified rubber is then pressed between twin rollers to remove excess water and form sheets, the primary produce. The sheets are commonly graded visually and packed in bales for shipping. NR is also commonly transported in the form of concentrated latex for specific purposes. The history of Hevea rubber is documented in ancient religious documents from Mexico dating back to 600 AD (Serier 1993), while Columbus made the first written mention of rubber in 1496. De la Condamine, an astronomer, was the first to describe the elastic substance scientifically and to send samples of it, called ‘caoutchouc’ (the French word meaning “weeping wood”), from Peru to France in 1736, with full details of the habit, habitat and processing procedures (Dijkman 1951; Baker 1996). The English chemist Joseph Priestley coined the name ‘rubber’ in 1770 when he found that it could rub out pencil marks. The botanist Fusée Aublet described the genus Hevea in 1775. Rubber derived from Hevea brasiliensis consists predominantly of cis-1,4-polyisoprene, (C5H8)n, where n may range from 150 to 2,000,000. Carbonyl groups have also been detected, and these contribute significantly to the degree of cross-linking and storage hardening (Pushparajah 2001). Many reviews have highlighted the importance of genic diversity and its utility in breeding programmes (Priyadarshan and Clément-Demange 2004; Priyadarshan et al. 2008; Priyadarshan 2015, 2016; Gireesh et al. 2017). This review deals with genic conservation in Hevea and its vital role in breeding, in tune with recent developments in the era of genomics.

Fig. 1

General appearance of a Hevea plantation of a latex and timber clone (inset: a tapped panel of Hevea tree)

2 Introduction of Rubber to South East Asia

The trio of Clement Markham of the British India Office, Joseph Hooker, Director of Kew Botanic Gardens, and Henry Wickham, naturalist, played a key role in the early exploration of the species. Henry Ridley (Director of Singapore Botanic Gardens) and R. M. Cross (a Kew gardener), together with Kew Botanic Gardens, also played a pivotal role in the procurement and distribution of rubber. On Markham’s directions, Wickham collected 70,000 seeds from the Rio Tapajos region of the Upper Amazon (Boim District) and transported them to Kew Botanic Gardens during June 1876 (Wycherley 1968; Schultes 1977; Baulkwill 1989). Of the 2700 seeds that germinated, 1911 seedlings were sent to the Botanical Gardens, Ceylon (now Sri Lanka), during 1876, and 90% of them survived. During September 1877, 100 Hevea plants specified as ‘Cross material’ were also sent to Ceylon. Earlier, in June 1877, 22 seedlings not specified as either Wickham or Cross material were sent from Kew to Singapore; these were distributed in Malaya and formed the prime source of the 1000 tappable trees found by Ridley during 1888. An admixture of Cross and Wickham materials might have occurred, as the 22 seedlings were unspecified (Baulkwill 1989). One such parent tree planted during 1877 was still standing in Malaysia 100 years later (Schultes 1987). Seedlings from the Wickham collection in Ceylon were also distributed worldwide. Rubber trees covering millions of hectares in South East Asia are believed to be derived from very few plants of Wickham’s original stock from the banks of the Tapajos (Imle 1978). After reviewing the history of the domestication of the rubber tree in East Asia, Thomas (2001) reported that modern clones invariably originated from the 1911 seedlings sent to Ceylon during 1876. In addition, Charles Farris transported some seedlings to Kolkata (erstwhile Calcutta), India, during 1873. Hence, the contention that the modern clones were derived from “22 seedlings” is debatable. Moreover, if the modern clones are derived from 1911 seedlings, then the argument that they originated from a ‘narrow genetic base’, as believed even now, needs to be reviewed (Thomas 2002).

Rubber was first introduced to India from Ceylon during 1879, when 28 Hevea plants were planted in the Nilambur Valley of Kerala State in South India (Haridasan and Nair 1980). During 1880–1882, plantations on an experimental scale were raised in different parts of South India and the Andaman Islands. Hevea was first introduced to Vietnam in 1897 by the French, but its cultivation was revived only after 1975 because of the long-lasting war (Priyadarshan et al. 2005). Developments in the domestication of rubber after 1880 commenced in the Singapore Botanic Gardens, one of the world’s finest in terms of both aesthetic appeal and the quality of its botanical collection. Approximately 3000 species of tropical and subtropical plants and a herbarium of about 500,000 preserved specimens are the hallmarks of this garden. Under the direction of Henry N. Ridley, who took over as superintendent in 1888, the garden became a centre for research on Hevea brasiliensis. Ridley developed an improved method of tapping rubber trees that gave better latex yield and revolutionized the region’s economy. His persistence resulted in the establishment of the first rubber estate in 1896, and from then on the rubber industry grew into one of the economic mainstays of the Malay states.

In summary, yield improvement through breeding was initiated with very strict mass selection among trees at the beginning of the twentieth century. With the introduction of bud-grafting, ‘generative’ and ‘vegetative’ selection methodologies simultaneously resulted in the development of seedlings and grafted clones (Dijkman 1951). Around 1950, the advantages of grafted clones over genetically improved seedlings in yield potential proved overwhelming, and the focus shifted to the derivation of clones for latex productivity. With all these cultural developments, Hevea brasiliensis soon ousted many other rubber-producing species: Castilla, Manihot glaziovii (ceara or manicoba rubber tree), Ficus elastica, and Landolphia and Clitandra vines (African rubber).

Once Hevea had been successfully transplanted to South East Asia, the rubber plantation industry developed rapidly, and a considerable quantity of NR-based commodities was available in the market by 1910. Factors such as the availability of labour and favourable soil and climate contributed to this development. With the steady increase in global demand, the total area of plantation in the East amounted to 5000 acres by 1900. In 1910, it was one million acres, and in 1920, four million acres. After the end of World War II in 1945, the total acreage exceeded nine million, and by the mid-1960s, it was 11.5 million. According to the Food and Agriculture Organization of the United Nations, in 1996 the total harvested area of NR in Asia amounted to about 15.6 million acres. Rubber produced from Hevea in Asian countries, ranging from the Philippines to Sri Lanka, accounted for almost 95% of the world’s NR supply (9.2 million tons from 9.5 million ha, IRSG 2015). Worldwide, there was a 37% increase in yield from 1995 to 2007 (IRSG 2015). The prices of oil and NR have always been correlated. World economic recessions have also contributed to falls in the price of NR. An extensive survey of the history and development of NR is beyond the scope of this review; interested readers may refer to Baulkwill (1989) for an extensive account.

3 Narrow Genetic Base

The extent of variability and access to a varied gene pool are the prime requirements for the improvement of any crop species. Utilization of Amazonian germplasm and in situ conservation strategies for wild germplasm are added advantages in Hevea rubber. Conservation and utilization of allied gene resources are vital for the improvement of crop species, and NR has been noted as an undeniably beneficial commodity for the past 100 years (Priyadarshan and Goncalves 2003). Progress in yield improvement over the past 70 years has resulted in primary and hybrid clones with exceptional yielding abilities. The basic philosophy of a genetic improvement programme for any crop lies in enhancing the variability of its gene pool. The NR improvement programme is no exception: it needs new variability and diversity to meet future challenges, especially in an era of changing climate. Adaptability is the capacity for genetic response to selection that results in adaptation (Simmonds 1962). Genetic diversity remains the fuel for all crop improvement activities to tide over new challenges. The importance of NR has been recognized for over 450 years since its historically known domestication. Later, the accidental discovery of vulcanization by Goodyear in 1839, followed by value addition and utilization driven by industries in the developed world, increased its importance manifold.

While the primary introduction of H. brasiliensis is believed to have originated from the Para region of the Amazon rain forest of Brazil, it does not represent the entire genetic resources available in the tropical rain forests south of the Amazon River (Schultes 1977). The second Amazon rubber boom, driven by World War II, and the Japanese invasion of Malaysia and Indonesia during 1942 brought tremendous changes in the world’s NR supply. The wide species diversity available in the region was not covered fully during the initial exploration and collection. About 70,000 seeds collected by Sir Henry Wickham ended up as 22 seedlings at Singapore Botanic Gardens through a series of shipments and transfers from the Amazon to Kew Botanic Gardens and then to Ceylon (now Sri Lanka). Most of the initial collection was restricted to H. brasiliensis because of the predominance of the species in the Para region. Thus, the ‘Wickham’ gene pool (Simmonds 1989) is by origin itself regarded as ‘very narrow’. Genetic improvement of the species has brought tremendous progress in many important agronomic traits, particularly dry rubber yield, which rose from a meagre 250 to 3000 kg ha−1. But the theoretical yield is estimated at 7000–12,000 kg ha−1 (Pardekooper 1989). This remains a distant goal, mainly because of constraints such as the long breeding cycle, strong G × E interactions for useful traits (Meenakumari et al. 2018), the limited availability of land for laying out widely spaced multilocation trials and the absence of variability for resistance to major diseases and abiotic stresses. Fortunately, additional variation has been introduced through the conserved wild germplasm, which has facilitated the introgression of genes into the Wickham gene pool, albeit at a slow pace.
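The gap between realized and theoretical yield quoted above can be made explicit with a little arithmetic. The following is a minimal Python sketch that uses only the figures given in the text; the function and variable names are illustrative assumptions, not part of any cited analysis.

```python
# Dry rubber yields quoted in the text, in kg per hectare.
INITIAL_YIELD = 250                 # unselected material
CURRENT_YIELD = 3000                # best improved clones
THEORETICAL_YIELD = (7000, 12000)   # lower and upper estimates (Pardekooper 1989)


def fold_increase(start, end):
    """Return how many times larger `end` is than `start`."""
    return end / start


def realized_fraction(current, potential):
    """Return the share of the potential yield that is currently realized."""
    return current / potential


fold = fold_increase(INITIAL_YIELD, CURRENT_YIELD)
fractions = [realized_fraction(CURRENT_YIELD, p) for p in THEORETICAL_YIELD]

print(f"Breeding gain so far: {fold:.0f}-fold")
for potential, share in zip(THEORETICAL_YIELD, fractions):
    print(f"Share of a {potential} kg/ha potential realized: {share:.0%}")
```

On these figures, breeding has delivered a 12-fold gain, yet the best clones realize only about a quarter to two-fifths of the estimated potential.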

The genetic variability among the present cultivated clones and other wild species could be improved considerably through limited interspecific hybridization (Figs. 2 and 3). The canopy and branching pattern of the rubber tree are important traits that often determine tree-level and stand-level productivity (McCrady and Jokela 1996; Cilas et al. 2004) and vulnerability to wind damage (Clément-Demange et al. 1995). Retention of tree stands over a long period is affected by many factors, such as wind and planting density. It is presumed that genotypes with a compact canopy can escape wind damage to a certain extent and can contribute to the optimal utilization of land through suitable intercropping systems. Canopy variability, which is evident in the present gene pool (Fig. 4), also occurs, though rarely, in mutants spotted in seedling and segregating populations (Gireesh and Mydin 2014).

Fig. 2

The picture depicts the set fruit after hybridization by hand pollination

Fig. 3

Seeds of different species of the genus Hevea and their comparative morphology, characterized by unique mottling patterns

Fig. 4

This picture depicts different and distinct canopy patterns observed in H. brasiliensis

Molecular markers have played a crucial role in delineating the genetics of germplasm lines. DNA fingerprinting using RFLP and ribosomal DNA variation has been used to assess the genetic variability among H. brasiliensis genetic resources. A genetic analysis of 168 individuals (73 cultivated Wickham clones and 95 wild clones) using random nuclear probes and isozymes (Besse et al. 1993a, b, 1994) revealed a relatively high level of polymorphism in the cultivated clones, despite their narrow genetic base and high level of inbreeding. Molecular studies also indicated that the higher level of mtDNA conservation in modern clones arose because the cytoplasm was contributed by only two clones (PB 56 and Tjir 1), whereas the nuclear DNA polymorphism was due to breeding and selection under varied geo-climates (Priyadarshan and Goncalves 2003).

Gouvêa et al. (2010) observed heterozygosity ranging between 0.05 and 0.96, based on multivariate techniques and microsatellite marker studies in a group of clones. Both high total diversity (HT′ = 0.58) and high gene differentiation (Gst′ = 0.61) were observed, indicating high genetic variation among the 60 genotypes that may be useful in breeding programmes. These results agree with those obtained from crosses within the Wickham gene pool, where selections showed promising yield improvement over their parents (Priyadarshan and Clément-Demange 2004). However, the long breeding cycle and genotype-by-environment interactions of multigenic traits often mislead the selection of superior clones, which therefore needs to be confirmed through multilocation trials so that traits introgressed from wild sources can be expressed. Genotype-by-environment interactions in perennial species have received considerable attention recently through the use of progeny tests set up in different locations. Depending on whether they are controlled or not, such interactions can lead to gains or losses in tree breeding programmes (Zobel and Talbert 1984). Studies have indicated that the use of two sites is more profitable when the gains in efficiency of selection are greater than 10% (Costa et al. 2000).
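To make the diversity statistics quoted above more concrete, the sketch below shows how expected heterozygosity and gene differentiation are commonly computed under Nei's definitions. It is an illustration only: it assumes the plain relation Gst = (HT − HS)/HT, whereas the primed values reported by Gouvêa et al. (2010) may be corrected estimators, and the function names are hypothetical.

```python
def expected_heterozygosity(allele_freqs):
    """Nei's gene diversity at one locus: 1 minus the sum of squared allele frequencies."""
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in allele_freqs)


def gst(h_total, h_within):
    """Share of the total diversity that lies between groups."""
    return (h_total - h_within) / h_total


def within_group_diversity(h_total, g_st):
    """Rearranged form of the same relation: HS = HT * (1 - Gst)."""
    return h_total * (1.0 - g_st)


# Values quoted in the text, treated here as uncorrected HT and Gst (an assumption).
H_TOTAL = 0.58
G_ST = 0.61
h_within = within_group_diversity(H_TOTAL, G_ST)

print(f"Implied within-group diversity HS: {h_within:.4f}")
print(f"Two equally frequent alleles give He = {expected_heterozygosity([0.5, 0.5]):.2f}")
```

Under this assumption, most of the diversity (61%) lies between groups of genotypes rather than within them, which is why crosses between groups are attractive to breeders.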

4 Addition of New Amazonian Germplasm

Collection of new genotypes from the centre of origin has been under way since 1890. Through many planned expeditions, Malaysia received 1614 seedlings belonging to five species during the 1952 collection (Brookson 1956; Tan 1987). Tan (1987) reported combined efforts by USDA, IAN (Brazil) and Liberia between 1957 and 1959, which led to more accessions reaching Malaysia.

The 1981 international collection of wild genetic resources of H. brasiliensis, organized by IRRDB in the three Amazonian states of Acre, Rondonia and Mato Grosso with the essential partnership of Brazil, was the predominant operation aimed at enlarging the genetic base of the cultivated species since the beginning of rubber cultivation a century earlier (Clément-Demange et al. 2000). Directional selection for commercial yield followed by vegetative propagation severely eroded the genetic diversity that then existed in the gene pool. Polyclonal seedling populations are also continuously diminishing and need in situ conservation, at least in the secondary centre of origin. The International Board for Plant Genetic Resources (IBPGR 1984) identified rubber as a priority crop for conservation of the entire gene pool. Amazonian populations introduced to the main rubber-growing countries are still at the experimental stage. The performance of hybrids generated by artificial hybridization between Wickham and Amazonian accessions (from Rondonia, Mato Grosso and Acre) is being evaluated in various trials (Chandrasekhar et al. 2004), with the wild accessions reported to have superior siring ability as male parents. The pollination success rate ranged from nil to 14.2% in W × A crosses and from 0.8 to 19% in Wickham × Wickham crosses. The success rate of 12.1% obtained with the wild accession MT 2226 as male parent was the highest, which concurs with earlier reports (Morris 1929; Ross 1960; Rao 1961; Attanayake and Dharmaratnae 1984; Olapade and Omokhafe 1990; Gireesh and Pravitha 2009). The current rubber yield of the undomesticated Amazonian population is around 12% of that of domesticated Wickham clones, although the population shows fairly high resistance to Microcyclus or Corynespora (Clément-Demange et al. 2001).

During the last two decades, India has made many attempts to characterize the IRRDB 1981 germplasm at the field level. Breeders have incorporated wild germplasm accessions through many crossing programmes. Mercy et al. (2014) documented most of the hybridization campaigns; those carried out during 1990, involving two Rondonian accessions (RO 87 and RO 1420) with RRIM 600 as the female parent, identified 26 F1 hybrids superior to RRII 105 in growth parameters. The highest yield was observed in the family RRII 105 × MT 196. From another breeding programme, using Rondonian clones (RO 24, RO 26, RO 34, RO 87 and RO 132) and MT 196 as male parents with RRII 105, seven hybrid clones recently reached on-farm trials. In 1997, RRIM 600 and RRII 105 were used as female parents, with five Mato Grosso clones (MT 1021, MT 999, MT 1027, MT 1014 and MT 1005), three Acre clones (AC 495, AC 498 and AC 817) and one Rondonian clone (RO 380) as male parents. Five promising preliminary selections were identified from the hybridization programmes of 2007, 2009, 2011, 2012 and 2013, which used the high-yielding Wickham clones RRII 105, RRII 429 and RRII 414 as female parents and drought-tolerant wild accessions as male parents in four cross combinations. The resultant hybrids are at different stages of evaluation.

Seguin et al. (2003) proposed a general organization of Hevea brasiliensis germplasm with six genetic groups: group 1 made up with the two districts AC/T (Tarauaca) and AC/F (Feijo) in the western part of Acre, and with the Calima component of the Schultes collection; group 2 made up with the three districts AC/B (Brasileia), AC/S (Sena Madureira) and AC/X (Xapuri) in the eastern part of Acre; group 3 made up with the six following districts of Rondonia: RO/A (Ariquemenes), RO/C (Calama), RO/CM (Costa Marques), RO/J (Jaru), RO/JP (Jiparana) and RO/OP (Ouro Preto), the district MT/VB (Vila Bella) of Mato Grosso and accessions MDF (Madre de Dios Firestone) from the Firestone collection in Peru; group 4 made up with three districts MT/A (Aracatuba), MT/C (Juruena) and MT/IT (Itauba) of Mato Grosso and the district RO/PB (Pimenta Bueno) of Rondonia; group 5 made up with the Palmira component of the Schultes collection and group 6 made up with the domesticated Wickham population. Even if no prediction can be made about the progenies of crosses between these groups, they can be used as a base for managing the genetic variability in the long term and organizing the recombination process.
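The six-group organization described above lends itself to a simple lookup table when planning crosses between groups. The sketch below is one possible encoding in Python, offered as an illustration rather than as a tool used by the cited authors. The group membership is transcribed from the text; the helper functions, and the assumption that an accession code begins with its state and district fields (as in RO/C/8/9), are hypothetical.

```python
# Genetic groups as listed in the text (after Seguin et al. 2003).
GENETIC_GROUPS = {
    1: ["AC/T", "AC/F", "Calima"],
    2: ["AC/B", "AC/S", "AC/X"],
    3: ["RO/A", "RO/C", "RO/CM", "RO/J", "RO/JP", "RO/OP", "MT/VB", "MDF"],
    4: ["MT/A", "MT/C", "MT/IT", "RO/PB"],
    5: ["Palmira"],
    6: ["Wickham"],
}

# Reverse index: district or population code -> group number.
DISTRICT_TO_GROUP = {
    district: group
    for group, districts in GENETIC_GROUPS.items()
    for district in districts
}


def group_of_accession(accession):
    """Return the genetic group of an accession, or None if its origin is not listed.

    Assumes codes such as 'RO/C/8/9', where the first two fields give state and district.
    """
    fields = accession.strip().split("/")
    return DISTRICT_TO_GROUP.get("/".join(fields[:2]))


def is_intergroup_cross(female, male):
    """True when the two parents belong to different genetic groups."""
    female_group = group_of_accession(female)
    male_group = group_of_accession(male)
    if female_group is None or male_group is None:
        raise ValueError("parent of unknown genetic group")
    return female_group != male_group


print(group_of_accession("RO/C/8/9"))
print(is_intergroup_cross("Wickham", "RO/C/8/9"))
```

A table of this kind only records group membership; as the text notes, it does not predict how the progeny of a given cross will perform.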

5 Nuclear vs. Cytoplasmic Genetic Diversity

Most breeding and selection activities carried out in progenies are based on phenotypic values expressed by the nuclear genome. However, little effort has been made to explore the influence of the cytoplasmic genome on phenotype, particularly on sterility, incompatibility and the reproductive biology of the species. Previous studies in Hevea made use of molecular markers such as RAPD, RFLP, microsatellites (SSR) and SSCP for clone identification and assessment of genetic variability (Besse et al. 1994; Varghese et al. 1997; Venkatachalam et al. 2002; Mathew et al. 2005; Luo et al. 1995; Roy et al. 2004; Saha et al. 2005; Lekawipat et al. 2003; Priyadarshan 2003). All these efforts focused mainly on the nuclear genome. Besse et al. (1994) categorized accessions from Brazilian Amazonia according to their geographic origin (Acre, Rondonia and Mato Grosso) based on nuclear DNA polymorphism. In contrast, the cultivated clones exhibited a relatively high level of polymorphism, despite their narrow genetic base and continuous assortative mating and selection. As expected, polymorphism is very evident among allied species of Hevea. A comparison of isozyme analysis (Lebrun and Chevallier 1990) with DNA markers revealed much similarity (Besse et al. 1994). All Wickham clones could be identified with 13 probes in association with the restriction enzyme EcoRI (Besse et al. 1993b). The cultivated clones are genetically close to the Mato Grosso genotypes. Rondonia and Mato Grosso clones are more polymorphic according to RFLP data (Besse et al. 1994). A Rondonia clone (RO/C/8/9) showed eight specific restriction fragments and a unique malate dehydrogenase (MDH) allele, indicating an interspecific origin for this clone. Such molecular markers are useful in rubber tree improvement since no distinct morphological traits exist. In a major study of mitochondrial DNA (mtDNA) polymorphism in 345 Amazonian accessions, 50 Wickham clones and two allied species (H. benthamiana and H. pauciflora) by Luo et al. (1995), variation among wild accessions was evident, while the cultivated clones formed only two clusters.

5.1 Potentiality of Organelle Genome

mtDNA: In the initial attempts to evolve modern clones, selections were based on nuclear DNA polymorphism and expressed phenotypic traits. The geographic specificity of nuclear and mtDNA (Fig. 5) polymorphisms is due to a high level of genetic structuring among natural populations in the Amazon forests in relation to the hydrographic network (Luo et al. 1995). In wild accessions, seed dispersal and selection follow the environmental conditions. Most of the phenotypic variation that arose in the natural habitat was lost through the natural selection pressure that prevailed during evolution. The wild accessions were not introgressed for developing high-yielding clones after they were introduced to South East Asian countries. In contrast, the Wickham clones exhibited high nuclear DNA polymorphism, perhaps due to directional breeding under different agroclimates. The nuclear genome thus played the main role in enhancing variation among clones in response to the diverse climatic conditions that prevailed in the newly introduced areas. Only certain primary clones provided the common lineage for the development of new genetic stocks, which resulted in lower genetic variation in improved Wickham clones.

Fig. 5

Annotated map of mitochondrial genome (outer circle) of Hevea brasiliensis. Grey arches indicate the mapping of each pair of the Illumina paired-end sequence data (inner circle). Direct repeats are shown as blue arches and inverted repeats as orange arches. (Adopted with permission from Shearman et al. 2014)

In general, primary clones such as PB 56 or Tjir 1 were the cytoplasmic donors for most of the improved clones. For example, PB 56 is the cytoplasmic donor of PB 5/51, while Tjir 1 is the donor of RRII 105, RRIM 600 and RRIM 605. In most conventional breeding systems of Hevea, the best parents were selected for the next cycle of breeding (Simmonds 1989), which is evidently why the mtDNA profiles of the clones developed fall into only two clusters (Priyadarshan and Goncalves 2003). A possible explanation for the greater mtDNA polymorphism of wild accessions is that many of them might have evolved through interspecific hybridization. More investigations using molecular marker approaches are needed to study mtDNA polymorphism across wild accessions and to ascertain the relatedness between wild accessions and domesticated clones.

Mitochondrial genome expansion in land plants is primarily due to large intergenic regions, repeated segments, intron expansion and incorporation of foreign DNA such as plastid and nuclear DNA (Turmel et al. 2003; Bullerwell and Gray 2004). Accumulation of repetitive sequences in plant mitochondrial genomes causes frequent recombination events and dynamic genome rearrangements within a species (Chang et al. 2011; Allen et al. 2007). Several mutations in mitochondrial genes have been found to be associated with cytoplasmic male sterility (CMS). Examples include the T-urf13 gene in maize (Dewey et al. 1986), the pcf gene (a fusion of atp9 and cox2 portions) in petunia (Young and Hanson 1987), cox1 in rice (Wang 2006) and mutations in ATPase subunits in sunflower (Laver et al. 1991) and Brassica (Landgren et al. 1996). RNA processing also plays an important role in controlling CMS, as evidenced by orf355/orf77 (atp9) and T-urf13 in maize (Gallagher et al. 2002; Dill et al. 1997). Clone BPM 24 exhibits cytoplasmic male sterility inherited from GT 1. Shearman et al. (2014) identified a unique rearrangement in the mitochondrial genome of BPM 24 that codes for a novel transcript containing a portion of atp9. This rearrangement led to a slight reduction in the efficiency of ATP production, thereby causing cytoplasmic male sterility. With the development of next-generation sequencing (NGS) technologies, new strategies have been used to obtain plant mitochondrial genomes. A combination of shotgun and paired-end NGS from non-enriched whole-genome DNA libraries has been used successfully to obtain mitochondrial genomes (Rahman et al. 2013; Shearman et al. 2014).

cpDNA: Chloroplast genomes are sufficiently large and complex to include structural and point mutations that are useful for evolutionary studies from the intraspecific to the interspecific level (Neale et al. 1988; McCauley 1992; Graham and Olmstead 2000; Provan et al. 2001). Since the first report of the complete chloroplast (cp) genome sequence of a liverwort (Marchantia polymorpha) in 1986 (Ohyama et al. 1986), more than 150 chloroplast genomes have been sequenced and characterized, disclosing an enormous amount of evolutionary and functional information on chloroplasts (Fig. 6). Tangphatsornruang et al. (2010, 2011) reported the complete chloroplast genome sequence of the rubber tree, 161,191 bp in length, including a pair of inverted repeats of 26,810 bp separated by a small single-copy region of 18,362 bp and a large single-copy region of 89,209 bp. It contains 112 unique genes, 16 of which are duplicated in the inverted repeat. Of the 112 unique genes, 78 are predicted protein-coding genes, four are ribosomal RNA genes and 30 are tRNA genes. Relative to other plant chloroplast genomes, Tangphatsornruang et al. (2011) observed a unique rearrangement, a 30-kb inversion between trnE(UUC)-trnS(GCU) and trnT(GGU)-trnR(UCU). A comparison between the rubber tree chloroplast genes and cDNA sequences revealed 51 RNA editing sites, of which 48 were located in 26 protein-coding genes and the remaining three in introns.
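The quadripartite structure described above can be checked with simple bookkeeping: the two single-copy regions plus two copies of the inverted repeat should account for the whole genome, and the gene classes should add up to the number of unique genes. The short Python sketch below performs these consistency checks using only the figures quoted in the text; the variable names are illustrative, and the total number of gene copies is a derived value that assumes each duplicated gene is present exactly twice.

```python
# Region lengths in base pairs, as quoted in the text.
GENOME_LENGTH = 161_191
LARGE_SINGLE_COPY = 89_209
SMALL_SINGLE_COPY = 18_362
INVERTED_REPEAT = 26_810   # present twice (IRa and IRb)

# Gene content, as quoted in the text.
UNIQUE_GENES = 112
PROTEIN_CODING = 78
RRNA_GENES = 4
TRNA_GENES = 30
DUPLICATED_IN_IR = 16

# RNA editing sites, as quoted in the text.
EDITING_SITES = 51
SITES_IN_CODING = 48
SITES_IN_INTRONS = 3

# The four regions of the circular genome should sum to its full length.
assembled_length = LARGE_SINGLE_COPY + SMALL_SINGLE_COPY + 2 * INVERTED_REPEAT
ir_share = 2 * INVERTED_REPEAT / GENOME_LENGTH

# Derived value (assumption: each duplicated gene occurs exactly twice).
total_gene_copies = UNIQUE_GENES + DUPLICATED_IN_IR

print(f"LSC + SSC + 2 x IR = {assembled_length} bp")
print(f"Inverted repeats cover {ir_share:.1%} of the genome")
print(f"Gene classes sum to {PROTEIN_CODING + RRNA_GENES + TRNA_GENES} unique genes")
print(f"Total gene copies, counting duplicates: {total_gene_copies}")
```

The reported figures are internally consistent: the regions sum exactly to 161,191 bp, and the inverted repeats account for about one-third of the genome.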

Fig. 6

Annotated map of the chloroplast genome of Hevea brasiliensis. Thick lines indicate the extent of the inverted repeats (IRa and IRb), which separate the genome into small and large single-copy regions. Genes on the outside of the map are transcribed clockwise and those on the inside of the map are transcribed counterclockwise. Genes containing introns and pseudogenes are marked with * and #, respectively. Arrows indicate the positions of a 30-kb unique rearrangement relative to the cassava chloroplast genome. (Adopted with permission from Tangphatsornruang et al. 2011)

This information on the plastome can be used to determine phylogenetic relationships among angiosperms. Phylogenetic analysis of chloroplast genes using 144 orthologous clusters from 17 sequenced plant genomes showed that Hevea is most closely related to Manihot esculenta within the family and to Populus trichocarpa outside Euphorbiaceae (Rahman et al. 2013). The latest report on the chloroplast genome of H. benthamiana, a close relative of H. brasiliensis that is resistant to South American leaf blight (SALB), also illustrates this close relationship, indicating the possibility of using H. benthamiana in breeding programmes to obtain SALB-resistant H. brasiliensis clones (Niu et al. 2020). Taken together, these diversity studies show that different genetic markers give closely concordant pictures of the relationships. Even if the contribution of isozymes is important in itself, molecular markers provided important clarifications for distinguishing between groups. There appears to be no barrier to the migration of Hevea genes within the Amazonian basin. However, the vastness of the area and the limited dispersal of Hevea seeds enabled the preservation of the current structure, which is assumed to have resulted from the fragmentation of the Amazonian forest during the Pleistocene, according to the refuge theory presented by Haffer (1982). Moreover, the genetic structure of Hevea germplasm is clearly geographically structured in relation to the hydrographic network of the Amazonian forest, which confirms the role of rivers and inundated zones in the transport of seeds and dissemination of the species (Besse et al. 1993a; Luo et al. 1995; Seguin et al. 1996).

6 Perspectives for New Germplasm

The objective of germplasm utilization is the introgression of genes into the present domesticated resources of Hevea. Wild accessions can be included in research programmes to improve or develop superior clones with high latex yield potential and tolerance to biotic and abiotic stresses. Recently, IRRDB initiated the sharing of frontline improved genetic resources among member countries. In situ conservation of species in their natural habitats and the creation of field gene banks are considered the most appropriate and easily manageable ways of conserving biodiversity. Protecting the areas where populations of a species exist naturally is difficult for an introduced crop like Hevea brasiliensis. In situ conservation is nevertheless necessary for preserving genetic diversity at the ecosystem level. A many-fold increase in the global demand for latex and dry rubber products is expected in the coming years (Smith and Burger 1992), driven mainly by the automobile industry in emerging economies. To bridge the gap between production and consumption, cultivation has to be extended to marginal areas with limiting climatic conditions, which has forced breeders to develop new clones with tolerance to environmental constraints such as drought, cold, wind, high altitude, moisture deficit and diseases. Vast areas have been identified in major rubber-producing countries like India, Thailand, Vietnam, China, Brazil, Cote d’Ivoire and other African countries. IPR issues and related restrictions may limit access to and transfer of germplasm in the coming years, especially the collection of new entries from the centre of origin. It is therefore essential to safeguard the precious germplasm accessions available in each country for further use in breeding programmes. Mydin and Gireesh (2016) attempted to improve diversity and heterosis in the present germplasm by creating new combinations through cross-breeding, which yielded hybrids with improved yield and other secondary traits. Heterosis breeding looks promising compared with other breeding methods in Hevea. Such new combinations have great potential to contribute to future breeding and may ultimately lead to the pyramiding of economic traits in at least a few genotypes. Studies on the ploidy levels of Hevea germplasm accessions have been very few (Chen et al. 1979). Unfortunately, there have been no follow-up studies exploiting this system, such as the production of dihaploids, for crop improvement in Hevea.

7 Genic Diversity and Clone Development

The early genetic resources are high yielding and highly adapted to current cropping systems in Asia, Africa and non-SALB areas of Latin America. Breeders believe that the reduced genetic variability of the Wickham base is insufficient to meet the demand for the improvement of growth and latex yield traits. Cultivated Hevea brasiliensis originated from a minuscule germplasm base, about 22 seedlings recovered from a consignment sent from Singapore to Malaysia during 1876, popularly known as the Wickham germplasm. Extensive use of these seedlings gave rise to the present-day cultivars. Breeders selected elite progenies based mainly on latex yield and eventually sidelined other traits of importance; the resulting directional selection appears to be the major reason for the genetic erosion. The fact that populations were subjected to several rounds of controlled crossing further narrowed the diversity. Above all, the strategy of selecting only the desirable genotypes and rejecting the unwanted ones (without assessing their utility other than yield) is the main reason for the reduced diversity. The strategy for utilization of the wild gene pool involves gene introgression from germplasm, pyramiding of favourable genes and gene combinations and improvement of the cultivated clones using the IRRDB wild collection. The plant breeders' meeting conducted by IRRDB in Malaysia (2019) reviewed the progress of the bilateral clone exchange programme between member countries and suggested intensifying the exchange of germplasm between all stakeholders to maximize genetic diversity and improve the gene pool.

The possibility of genetic diversification using other species as sources of polyisoprene also needs to be explored. Apart from latex, the availability and inheritance pattern of wood production have not been studied seriously in either the cultivated or the wild Amazonian gene pool. Related species, genera and wild accessions from the original habitat available in all rubber-growing nations can also be integrated and networked into breeding programmes on a global scale to overcome the narrow genetic base. Utilizing allied species as excellent resources of timber and possible pre-breeding activity are the other options for widening the genetic base of rubber.

Classic plant breeding programmes under changing climatic conditions mainly depend upon evaluation of phenotypes in varying environmental conditions. Selection of genotypes that are resistant to environmental variations is basically done by monitoring their physiological responses under adverse environmental conditions. In breeding programmes, the common physiological parameters employed for selecting tolerant varieties include photosynthesis, content of photosynthetic pigments (chlorophyll a and b and carotenoids) and chlorophyll fluorescence (Holá et al. 2010; Sterling et al. 2019). Since photosynthesis is the key process that determines plant productivity, it is important to assess photosynthetic capacity under limiting environmental conditions as a marker for tolerance. Recent developments in genome-based data and molecular markers have made the selection of such climate-resilient genotypes much easier, which can significantly reduce the time and cost required to identify or develop them. Several genomic prediction models incorporating genotype × environment (G × E) interactions have been developed for application in genomic selection in tree breeding programmes (Lopez-Cruz et al. 2015; Cuevas et al. 2016; Souza et al. 2019; Cros et al. 2019; Meuwissen et al. 2001). It is essential to employ physiological markers like gas exchange parameters to validate genetic markers and thereby identify potential markers associated with abiotic stress tolerance in Hevea.
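The core idea behind genomic prediction can be illustrated with a minimal sketch: marker effects are estimated jointly by ridge regression in a training set and then used to predict unphenotyped individuals. The sketch below is only an illustration on simulated data; it omits the G × E terms used in the cited models, and the function names, the 0/1/2 genotype coding, the penalty value and all simulated numbers are assumptions, not taken from any of the cited studies.

```python
import numpy as np

def ridge_marker_effects(X, y, lam=1.0):
    """Estimate marker effects by ridge regression (all markers shrunk equally)."""
    X_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - X_mean                      # centre genotype columns
    # Solve (Xc'Xc + lam * I) beta = Xc'(y - mean(y))
    A = Xc.T @ Xc + lam * np.eye(X.shape[1])
    beta = np.linalg.solve(A, Xc.T @ (y - y_mean))
    return beta, X_mean, y_mean

def predict(X_new, beta, X_mean, y_mean):
    """Predict phenotypes of new individuals from their marker genotypes."""
    return y_mean + (X_new - X_mean) @ beta

# Simulated data (hypothetical, for illustration only)
rng = np.random.default_rng(0)
n_ind, n_snp = 200, 50
X = rng.integers(0, 3, size=(n_ind, n_snp)).astype(float)  # genotypes coded 0/1/2
true_effects = rng.normal(0.0, 1.0, n_snp)
y = X @ true_effects + rng.normal(0.0, 1.0, n_ind)

# Train on the first 150 individuals, predict the remaining 50
train, test = np.arange(150), np.arange(150, 200)
beta, X_mean, y_mean = ridge_marker_effects(X[train], y[train], lam=1.0)
y_hat = predict(X[test], beta, X_mean, y_mean)

# Predictive ability: correlation between predicted and observed phenotypes
accuracy = np.corrcoef(y_hat, y[test])[0, 1]
print(f"predictive correlation in held-out individuals: {accuracy:.2f}")
```

In a real breeding data set the number of markers usually far exceeds the number of phenotyped trees, which is why some form of shrinkage such as the ridge penalty is needed in the first place.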

8 Molecular Advances in Hevea Research

Conventional genetic analysis in Hevea is cumbersome and time consuming because of its perennial nature, long breeding cycles and difficulties in raising F2 progenies. The development of molecular markers like isozymes, restriction fragment length polymorphisms (RFLPs), amplified fragment length polymorphisms (AFLPs), microsatellites or simple sequence repeats (SSRs) and expressed sequence tags (ESTs) has helped to understand the genetic diversity in Hevea. Recent advances in genetic engineering and in vitro culture methods have increased the possibility of developing new genotypes with improved latex yield, tolerance to various diseases and tapping panel dryness (TPD) syndrome, growth rate and wood quality, or with reduced undesirable traits (Supriya and Priyadarshan 2019). Gene editing is a viable option for developing genotypes with altered gene expression that facilitates expression of the desired traits. In Hevea, CRISPR/Cas9-based targeted mutagenesis was first reported by Fan et al. (2020), who successfully targeted the FLOWERING LOCUS T (FT) and TERMINAL FLOWER1 (TFL1) genes with five sgRNAs through one transformation step in protoplasts. This recently developed system of genome editing, with Cas9 nuclease directed by a target-specifying single-guide RNA (sgRNA), has proven efficient in functional analysis of endogenous genes and breeding of new varieties (Cong et al. 2013; Francis et al. 2017). The approach minimizes off-target effects and avoids genomic integration of foreign DNA, without the need for codon optimization or species-specific promoters. This study opens up the possibility of large-scale production of DNA-free genome-edited plants of Hevea with specific traits from protoplasts. Conventional breeding in combination with the recent molecular techniques can be best utilized to develop Hevea clones for new environments (Saha and Priyadarshan 2012).

Many studies in Hevea have made use of RAPD, RFLP, microsatellite (SSR) and SSCP markers for clone identification and genetic variability studies (Besse et al. 1994; Varghese et al. 1997; Venkatachalam et al. 2002; Mathew et al. 2005; Luo et al. 1995; Roy et al. 2004; Saha et al. 2005; Lekawipat et al. 2003). RAPD markers were also used to study the genetic diversity of a vast collection of germplasm gathered from the Amazonian region under the aegis of the International Rubber Research and Development Board (IRRDB) during 1981 and to classify the accessions into five groups, which would further facilitate crosses that utilize heterosis and the abundant genetic diversity (Lam et al. 2009). Though RAPD markers were used for genotyping, they are less preferred than microsatellite markers because of their low reproducibility and low discrimination power (García et al. 2011). RFLP markers, in contrast, are considered efficient in terms of genetic polymorphism and are evenly distributed across the Hevea genome, but their application to genotyping of large progenies is expensive in terms of time and cost (Seguin et al. 2008). While AFLP markers allow rapid identification and mapping of hundreds of markers, they are not as informative as RFLPs. Microsatellite markers or simple sequence repeats (SSRs) are considered the most efficient because of properties like their co-dominant, multi-allelic nature and random distribution throughout the genome with high polymorphism (Saha et al. 2005). They combine the advantages of RFLPs (polymorphism, abundance in the genome and locus specificity) and AFLPs (PCR-based high-throughput genotyping) and are dependable as well as highly efficient (Seguin et al. 2008).
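The higher informativeness of multi-allelic co-dominant markers can be made concrete with two standard summary statistics, expected heterozygosity (He = 1 − Σp_i²) and the polymorphism information content (PIC). The following is a minimal sketch; the allele frequencies are hypothetical and chosen only to contrast a biallelic locus with a four-allele SSR locus.

```python
def expected_heterozygosity(freqs):
    """He = 1 - sum(p_i^2) for the allele frequencies at one locus."""
    return 1.0 - sum(p * p for p in freqs)

def pic(freqs):
    """Polymorphism information content:
    PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2
    """
    cross = 0.0
    for i in range(len(freqs)):
        for j in range(i + 1, len(freqs)):
            cross += 2.0 * freqs[i] ** 2 * freqs[j] ** 2
    return expected_heterozygosity(freqs) - cross

# Hypothetical allele frequencies (illustration only)
biallelic = [0.5, 0.5]                  # e.g. a two-allele locus
multiallelic = [0.25, 0.25, 0.25, 0.25]  # e.g. an SSR locus with four alleles

print(pic(biallelic))     # 0.375
print(pic(multiallelic))  # 0.703125
```

With equally frequent alleles, a two-allele locus cannot exceed a PIC of 0.375, whereas the four-allele locus reaches about 0.70, which is one way to express why SSRs discriminate clones well with few loci.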

Many attempts have been made to create a dense SSR map of Hevea for genetic mapping and to locate QTLs (obtained from different mapping projects) on chromosomes. The pioneering studies of genetic mapping for agronomic traits include SALB resistance (Seguin et al. 1996; Lespinasse et al. 2000b; Le Guen et al. 2007) and latex yield and growth (Clément-Demange et al. 2006, 2008). These studies confirmed the diploid nature of the H. brasiliensis and H. benthamiana genomes, with a limited number of duplicated chromosome segments. Initial studies with 108 Hevea accessions identified SSR M574 as highly polymorphic (Lekawipat et al. 2003). Four markers (HMAC4, HMAC5, HMCT1 and HMCT5) could discriminate 27 Hevea clones (Saha et al. 2005), while the HMGR marker was found suitable for studying genetic variability in wild Hevea (Saha et al. 2007). In a similar attempt, microsatellite markers were used to analyze the genetic diversity of 307 clonally propagated individuals from 19 different collection areas, revealing moderate differentiation among a population of 220 individuals and a larger genetic distance in Mato Grosso populations (Le Guen et al. 2009).

The focus of subsequent research shifted from anonymous markers (AFLPs and microsatellite markers) to single-nucleotide polymorphisms (SNPs), which are abundant in plant genomes. Initial attempts contributed to the identification of SNP loci that could be utilized in the construction of high-density (10 SNP loci belonging to 5 loci) genetic maps (Pootakham et al. 2011; Mantello et al. 2014). The subsequent high-density genetic map developed by Pootakham et al. (2015a) had one SNP marker per 0.89 cM. Genetic variation plays a significant role in maintaining heterosis in Hevea. Saturated genetic maps are essential to identify the genomic regions harbouring major genes and QTLs that control important agronomic traits for use in breeding programmes. The development of next-generation sequencing (NGS) technologies provided ample opportunities to extract large numbers of SNP markers without the need for a reference genome sequence. The advantages of NGS are higher sensitivity to detect low-frequency variants, faster turnaround time for high sample volumes, comprehensive genomic coverage, higher throughput with sample multiplexing and the ability to sequence hundreds to thousands of genes or gene regions simultaneously (Shendure and Ji 2008; Schuster 2008).

Salgado et al. (2014) reported the H. brasiliensis transcriptome, covering a wide range of tissues and organs, leading to the production of the first developed SNP markers. Silva et al. (2014) identified 172 SNPs that are suitable for breeding studies from a set of 17,166 contigs developed from cDNA of cold-stressed and different tissues of Hevea. This study identified two highly upregulated cold stress-responsive ESTs from 5025 unigenes and 912 novel ESTs, along with 169 novel EST-SSR markers and 43 SNP markers in 13 ESTs associated with stress response, latex biosynthesis and developmental processes. During 2015, two reports were published that used advanced technology like genotyping-by-sequencing (GBS) to develop high-density genetic maps. Shearman et al. (2015) reported the first high-density genetic map of rubber with an average marker density of 1.90 cM. They employed the identified SNPs and indels to genotype 149 progenies from a cross between RRIM 600 and RRII 105 and included 20,143 genes to construct the linkage map. In a study of gene expression and SNP and indel discovery for yield heterosis in an F1 hybrid population (RRIM 600 vs. PR 107) by Li et al. (2016), higher yield in the F1 population was found to be positively associated with higher genome heterozygosity. This study indicated the role played by SNP variation in genic regions in the manifestation of heterosis that contributes to better yield in F1 populations. Haplotype analysis of about 25 SNPs in three genes, farnesyl diphosphate synthase (FPPS), hydroxymethylglutaryl-CoA synthase (HMGS) and cis-prenyl transferase (CPT), in various Hevea clones of diverse parentage identified seven FPPS, eight HMGS and eight CPT haplotypes (Uthup et al. 2013, 2016, 2018). The segregation and linkage disequilibrium analysis revealed recombination events, rather than point mutations, as the reason for the generation of allelic diversity. These studies emphasized the major role of SNPs in the evolution of candidate genes coding for functional traits in plants. This information can also be used to trace the effect of phenotypic selection on changes at the genome level and further across successive generations of crosses (Chow et al. 2020).

A first study on genome-wide association mapping of latex yield and girth of 170 Amazonian accessions grown under suboptimal environments (limited rainfall and lengthy dry season) using 14,155 high-quality filtered SNPs (from transcripts) was reported by Chanroj et al. (2017). With a mixed linear model, they could detect three significant SNPs in three candidate genes associated with plant adaptation to drought stress, individually explaining 12.7–15.7% of phenotypic variance. This study identified one SNP marker (SNP 53285) associated with stem growth and two SNP (6672 and 14857) markers associated with yield. De Souza et al. (2018) obtained a total of 77,660 and 21,283 SNPs from 626 genotypes (including 368 germplasm accessions and 254 individuals from a mapping population, respectively) through GBS approach. The previously mapped mapping population constructed with 1062 SNPs had only 576 SNPs from GBS approach thus reducing the average interval between markers to 4.4 cM (de Souza et al. 2018).
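When many SNPs are tested one at a time, as in the association study above, the significance threshold has to be adjusted for multiple testing. A minimal sketch of the simplest (Bonferroni) correction for 14,155 SNPs follows; the choice of Bonferroni and of a family-wise error rate of 0.05 are assumptions made here for illustration, and the source does not state which threshold the cited study applied.

```python
import math

n_snps = 14155   # number of filtered SNPs reported in the text
alpha = 0.05     # assumed family-wise error rate (illustrative choice)

# Bonferroni: each individual test must pass alpha / number_of_tests
threshold = alpha / n_snps
neg_log10_threshold = -math.log10(threshold)

print(f"per-SNP p-value threshold: {threshold:.2e}")
print(f"-log10(p) cut-off on a Manhattan plot: {neg_log10_threshold:.2f}")
```

Because linked SNPs are not independent tests, Bonferroni is conservative; less stringent alternatives such as false discovery rate control are also widely used.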

8.1 Saturated Genetic Linkage Mapping and QTLs

A genetic linkage map describes the linear order of markers, whether genes or other short DNA sequences, in their respective linkage groups, depicting their relative chromosomal locations as inferred from their pattern of inheritance (Priyadarshan 2017). It enhances our understanding of the specific segments of the genome associated with a trait. Based on segregation analysis, Lespinasse et al. (2000b) assembled the first saturated genetic map of Hevea, encompassing 717 loci (covering 18 chromosomes), which eventually served as a reference genetic map. Initially, SSR loci/markers were identified both from published ESTs and through transcriptome sequencing. Rattanawong et al. (2008) constructed a genetic linkage map with 229 SSR and 198 AFLP markers. They also identified a QTL directly associated with rubber yield (Hbg16a131) and another (Hbg3a312) associated with girth of the trunk. Triwitayakorn et al. (2011) developed EST-SSR markers through transcriptome sequencing of shoot apical tissue to construct a linkage map (97 loci with a mean interval of 11.9 cM) and to identify traits of commercial interest. The subsequent molecular genetic map obtained by Souza et al. (2013) comprised 284 markers distributed among 23 linkage groups with a total length of 2688.8 cM. The length of each group ranged from 2.7 to 228.7 cM. They also detected a total of 18 QTLs for growth traits during summer and winter seasons. Nine QTLs for summer (7 for height and 2 for girth), five for winter (2 for height and 3 for girth) and four for both seasons (2 for height and 2 for girth) were detected in 11 linkage groups. The average distance between adjacent markers on these maps was between 8 and 11.9 cM, which was too large for downstream marker-trait association analyses. This led to research on high-density linkage maps that directly analyse sequence variation based on single-nucleotide polymorphisms (SNPs). Identification and genotyping of common SNPs became possible with the invention of genotyping-by-sequencing (GBS), a technique that combines next-generation sequencing, genome complexity reduction and barcoding. Using this technique, Pootakham et al. (2015b) constructed genetic linkage maps for two populations comprising 1704 and 1719 markers with a coverage of 2041 and 1874 cM, respectively. The average marker densities ranged between 1.23 and 1.25 cM. This saturated map showed an improved SNP marker density of 0.89 cM. However, the average distances between adjacent markers were still large and demanded a genetic linkage map with still higher density.
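Map distances in centimorgans are derived from recombination fractions observed between marker pairs in the segregating progeny. The two most widely used conversions, the Haldane and Kosambi mapping functions, can be sketched as below; which function each cited map used is not stated in the text, so the example makes no claim about any particular study.

```python
import math

def haldane_cm(r):
    """Haldane mapping function (assumes no crossover interference).
    r is the recombination fraction, 0 <= r < 0.5; returns distance in cM."""
    return -50.0 * math.log(1.0 - 2.0 * r)

def kosambi_cm(r):
    """Kosambi mapping function (allows for crossover interference).
    r is the recombination fraction, 0 <= r < 0.5; returns distance in cM."""
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))

# For small r both functions are close to 100 * r; they diverge as r grows
for r in (0.01, 0.10, 0.30):
    print(f"r = {r:.2f}: Haldane {haldane_cm(r):6.2f} cM, Kosambi {kosambi_cm(r):6.2f} cM")
```

For tightly linked markers, as in the high-density SNP maps, the choice of mapping function makes little difference because both approach 100 × r.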

In a similar study by Shearman et al. (2015), a high-density genetic map constructed from 149 progenies of RRIM 600 and RRII 105 generated 12,326 SNPs from 4244 contigs with an average marker density of 1.90 cM. In a study by Chanroj et al. (2017), two SNP markers (SNP7772 and SNP14857) associated with yield and one SNP marker (SNP53285) associated with stem growth were identified. Hou et al. (2017) identified 180 primer pairs of EST-SSR markers (based on 38,815 EST sequences) that exhibited polymorphism. In 2018, the first ultra-high-density genetic linkage map of Hevea, constructed by Xia et al. (2018), identified 571,267 SNPs and 134,184 indels and placed 6940 SNP markers with a density of 0.30 cM and a recombination rate of 0.97 cM/Mb. They obtained 17 QTLs for dry latex yield (DLY), among which QTLs qDFY-10 and qDFY-18-4 could explain up to 38.3% and 33.3% of the phenotypic variation, respectively. This study also identified highly promising QTL candidate genes, such as thioredoxin h, plastin-like protein, calmodulin-binding protein, cytochrome c oxidase and methylglutaconyl-CoA hydratase, that have significant association with yield. QTL mapping for stem diameter, tree height and number of whorls was reported from a sibling population of a GT1 × RRIM 701 cross (Conson et al. 2018). The base map with 18 linkage groups was constructed using 225 SSR and 186 SNP markers, and the dense linkage map with 671 SNPs generated through genotyping by sequencing. This study investigated key dominant genes associated with growth of Hevea under water stress conditions and identified 24 QTLs for stem diameter, seven for tree height and seven for number of whorls. Rosa et al. (2018) constructed a saturated multipoint integrated genetic map containing 354 microsatellite and 151 SNP markers to identify QTLs for growth and latex production in a full-sib population (progenies of RRIM 600 × PB 217) cultivated under suboptimal conditions. In a recent study, An et al. (2019) genotyped 206 F1 progenies of CATAS 8-79 and MT/C/119/67 through specific-locus amplified fragment sequencing (SLAF-seq) using 268,592 SNPs to construct a high-density genetic map with 4543 SNPs covering 2670.27 cM of the whole rubber genome with an average marker density of 0.59 cM. They detected 11 QTLs for stem growth and 12 QTLs for latex yield distributed in 15 linkage groups. The QTLs for stem growth and latex yield (qzSG-8-2 and qLD-8-3, respectively) were closely linked, with a genetic distance of 3.0 cM.
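The "average marker density" quoted for these maps is the total map length divided by the number of mapped markers. As a small worked check, the figures reported above for the map of An et al. (2019) reproduce the stated value; the function name is an illustrative choice.

```python
def average_marker_spacing(map_length_cm, n_markers):
    """Average spacing in cM per marker: total map length / number of markers."""
    return map_length_cm / n_markers

# Figures quoted in the text for An et al. (2019)
spacing = average_marker_spacing(2670.27, 4543)
print(f"{spacing:.2f} cM per marker")  # 0.59 cM per marker
```

Some authors instead divide by the number of marker intervals (markers minus linkage groups), which gives a slightly larger value, so reported densities are not always directly comparable between studies.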

The additional information generated through the recently developed ultra-high-density genetic linkage maps has immensely enriched the database on genome-wide EST-SSR markers, SNP markers and genomic regions containing major genes and QTLs controlling agronomically important traits, and thus has direct application in molecular breeding through marker-assisted selection (MAS). Since the beginning of Hevea cultivation, attaining higher yield potential has been the major breeding objective, pursued by improving yield traits like trunk biomass, bark thickness and number of latex vessel rows (Priyadarshan 2003). Recent reports have identified a few QTLs for stem growth and height (Rattanawong et al. 2008; Souza et al. 2013; Chanroj et al. 2017; Rosa et al. 2018; An et al. 2019) and yield (Rattanawong et al. 2008; Rosa et al. 2018; Xia et al. 2018; An et al. 2019) which can now be used for MAS in Hevea. The winter stress-responsive QTLs that influence phenotypic variation in Hevea can be employed for MAS to develop cold stress-tolerant clones. The map constructed by Conson et al. (2018) provides information on genomic regions related to growth under water stress, which can be used for MAS for drought tolerance.

The first ultra-high-density genetic linkage map, constructed by Xia et al. (2018), identified the 17 most reliable QTLs for dry yield, among which two QTLs explained about 33 and 38% of the phenotypic variation. The additional QTLs found in this study, associated with defence mechanisms, energy metabolism and rubber biosynthesis pathways, also revealed significant association with dry yield (Fig. 7). Traits like latex yield and stem growth are complex and are polygenically or quantitatively controlled (Simmonds 1989). This was addressed by An et al. (2019), who identified six QTLs for stem growth and seven QTLs for latex yield by conditional QTL mapping, and five and six QTLs for stem growth and latex yield, respectively, by conventional mapping. More studies are being carried out on this aspect as well as on QTLs for biotic and abiotic stress tolerance. All the information generated (Table 1) is a precious resource that breeders can explore for use in MAS programmes in Hevea.

Fig. 7

The candidate QTL genes in rubber biosynthesis pathway as described by Xia et al. (2018)

Table 1 Details of published genetic linkage maps of Hevea brasiliensis

Apart from their use as markers for growth, yield and tolerance to biotic and abiotic stresses, microsatellite markers are also used to study genetic variance in Hevea (Hou et al. 2017; Antwi-Wiredu et al. 2019). Mantello et al. (2011) identified 69 microsatellite loci from 384 clones and reported the existence of highly conserved flanking regions of microsatellites between species of Hevea. In a similar study, Souza et al. (2009) characterized 27 polymorphic microsatellite loci in a GA-CA enriched genomic library of Hevea and obtained cross-species amplification in wild Hevea species (H. guianensis, H. rigidifolia, H. nitida, H. pauciflora, H. benthamiana and H. camargoana), thus revealing the highly conserved nature of the loci/markers and the complete colinearity of the Hevea genome among its species (Souza et al. 2009; Prapan et al. 2006). There were many attempts to study genetic diversity using EST-SSR markers. Cubry et al. (2014) characterized 164 polymorphic EST-SSRs for diversity and SALB resistance. A genetic diversity study genotyping 1117 accessions with 13 microsatellite markers identified a total of 408 alleles, among which 319 were shared between groups while 89 were confined to individual groups of accessions (de Souza et al. 2015). A high frequency of gene flow between groups with high genetic distance was observed, and a total of 99 accessions with maximum genetic diversity were further identified for genomic breeding. In a similar study, Feng et al. (2008) identified 799 SSR loci from Hevea ESTs and studied genetic diversity in 12 clones and four related species. Analysis of 74 alleles using 30 randomly selected primer pairs indicated medium polymorphism of the EST-SSRs. This study also provided evidence for cross-species/genera transferability of the EST-SSR markers developed.
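The partition of alleles into those shared between groups and those confined to a single group (319 + 89 = 408 in the study above) can be sketched with simple set operations. The group names and allele labels below are hypothetical toy data, not values from the cited study.

```python
from collections import Counter

def shared_and_private(alleles_by_group):
    """Split alleles into those seen in more than one group (shared)
    and those seen in exactly one group (private to that group)."""
    counts = Counter(
        allele for alleles in alleles_by_group.values() for allele in set(alleles)
    )
    shared = {a for a, c in counts.items() if c > 1}
    private = {
        group: {a for a in alleles if counts[a] == 1}
        for group, alleles in alleles_by_group.items()
    }
    return shared, private

# Hypothetical toy data: allele labels are "<marker>_<fragment size>"
toy = {
    "group_A": {"M1_180", "M1_184", "M2_210"},
    "group_B": {"M1_180", "M2_210", "M2_214"},
    "group_C": {"M1_180", "M2_218"},
}

shared, private = shared_and_private(toy)
n_total = len(set().union(*toy.values()))
n_private = sum(len(v) for v in private.values())
print(f"total {n_total}, shared {len(shared)}, private {n_private}")
```

Private alleles are often used to prioritise accessions for conservation, since they represent variation that would be lost if the group were not maintained.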

8.2 Transcriptome Studies by High-Throughput Sequencing

High-throughput sequencing technology and various genomic tools have hugely facilitated data acquisition in several crops in general, and rubber in particular, over the last two decades (Table 2). High-throughput genomic techniques are associated with innovative bioinformatics tools that assume much importance in rubber tree breeding programmes. Recent technological developments have made RNA sequencing a cost-effective tool in gene expression profiling, providing qualitative and quantitative information at the whole-genome level compared to other conventional methods (Zhang et al. 2010; Xia et al. 2011). Some of the earlier studies in Hevea dealt with ESTs from the latex transcriptome, e.g. Ko et al. (2003) identified 1176 ESTs from the latex transcriptome. Chow et al. (2007) sequenced 10,040 ESTs and obtained 3441 unique transcripts. With the advent of transcriptome sequencing technologies, large-scale EST studies followed, which led to the development of plenty of EST-derived SSR markers for rubber (Xia et al. 2011; Triwitayakorn et al. 2011; Pootakham et al. 2011; Li et al. 2012). Investigations on ESTs of various stress and latex biosynthesis-related cDNA libraries have revealed a set of genes that can be employed as markers for important traits. A systematic analysis of the bark transcriptome was reported by Li et al. (2012), using RNA from the high yielding genotype RY7-33-97 (CATAS, China) and Illumina paired-end sequencing technology; the study identified a total of 22,756 unigenes and predicted 39,257 EST-SSRs.
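Prediction of EST-SSRs from assembled transcripts amounts to scanning each sequence for short motifs repeated in tandem. A minimal regular-expression sketch is given below; the thresholds (motifs of 2 to 6 bp, at least five tandem copies), the function name and the test sequence are assumptions for illustration and are not the parameters or software used in the cited studies.

```python
import re

def find_ssrs(seq, min_repeats=5, min_motif=2, max_motif=6):
    """Return (start, motif, n_repeats) for perfect tandem repeats in seq.

    The lazy quantifier makes the regex try the shortest motif first, so
    'GAGAGAGAGA' is reported as (GA)5 rather than (GAGA)2.
    """
    seq = seq.upper()
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq):
        motif = m.group(1)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs such as AAAAAAAAAA
        n_repeats = len(m.group(0)) // len(motif)
        hits.append((m.start(), motif, n_repeats))
    return hits

# Hypothetical sequence containing a (GA)6 and an (ATG)5 repeat
example = "TTC" + "GA" * 6 + "CCT" + "ATG" * 5 + "C"
print(find_ssrs(example))  # [(3, 'GA', 6), (18, 'ATG', 5)]
```

Dedicated SSR-mining tools additionally handle compound and interrupted repeats and design flanking primers, which this sketch does not attempt.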

Table 2 List of genome and transcriptome studies using new generation sequencing technologies

Liang et al. (2009) reported the constitutive expression of a translationally controlled tumour protein, HbTCTP in latex, leaves and barks upon ethylene (ET) treatment. Deng et al. (2012) further analyzed the gene structure and developed molecular markers of HbTCTP. Li et al. (2013) cloned HbTCTP1, another TCTP gene in Hevea. HbTCTP1 is expressed in different tissues throughout the developmental stages of leaves and is regulated by drought, low temperature, high salt, ethylene stimulation, wounding, H2O2 and methyl-jasmonate (Deng et al. 2012). Isolation and molecular characterization of 1-aminocyclopropane-1-carboxylic acid synthase (ACS) genes in Hevea brasiliensis revealed the presence of nine ACS-like genes (Zhu et al. 2015). Structure and expression profile of sucrose synthase gene family in Hevea and their role in stress response and sucrose utilization in the laticifers was discussed by Xiao et al. (2013).

Latex and Rubber Biosynthetic Pathway: The first report on the ethylene-stimulated induction of laticifer-specific expression of genes was by Kush et al. (1990). Later, Miao and Gaynor (1993) isolated the MnSOD gene, which was found to be expressed in leaf, petiole, root, latex and callus, and at higher levels in younger leaves. Thanseem et al. (2003, 2005) cloned and characterized the β-1,3-glucanase gene from various clones of Hevea. Priya et al. (2006) characterized the promoter sequence of the REF (rubber elongation factor) gene from Hevea. This was followed by reports on a full-length cDNA encoding cysteine protease, HbCP1 (Peng et al. 2008), the full-length aquaporin cDNAs HbPIP2;1 and HbTIP1;1 (Tungngoen et al. 2009) and coronatine-insensitive 1 protein (HbCOI1) (Peng et al. 2009). Chow et al. (2007) obtained more than 10,000 ESTs from latex (clone RRIM 600), from which 3441 unique transcripts were identified. Through quantitative PCR (qPCR), they reported REF and SRPP (small rubber particle protein) as the most abundant transcripts in latex. In a similar study, Han et al. (2000) identified 245 ESTs in which REF and SRPP were the most abundant, followed by defence and protein metabolism-related genes. Sando et al. (2008) reported full-length cDNAs of genes encoding all the enzymes of the MVA pathway, which were abundant in latex. Yield stimulation through ethrel application diluted the latex, and a positive correlation was found between expression of the aquaporin HbPIP2;1 and latex yield. The same study also observed that HbTIP1;1, an aquaporin gene normally downregulated in laticifers, was upregulated under ethrel treatment (Tungngoen et al. 2009). Promoter sequences of these genes were found to harbour ethylene, auxin, copper and sulphur responsive elements that are known to be associated with yield.

The transcriptome profiling by Wei et al. (2015) observed downregulation of almost all genes associated with energy metabolism indicating the long-time latex flowing trees as more energy intensive than normal flowing types. To support the long-term flow of latex (more intensive tapping), it is necessary to improve energy supplies (oxidative phosphorylation and photosynthesis process to generate more ATP). They also observed the possibility of using copper ion of latex as a standard to estimate tapping intensity as copper (Cu) is an essential micronutrient that is involved in essential functions, metalloenzymes, non-electron transfer reactions and ethylene-responsive receptor. Availability of copper ions has been observed as one of the control mechanisms of rubber tree latex isoprene biosynthesis metabolism (Fan and Yang 1991). Mantello et al. (2014) identified rubber biosynthesis-related ESTs from bark samples of GT1 and PR255 clones that are known as high yielders and tolerant to cold and wind. They validated a total of 78 SNPs of MVA and MEP pathways of rubber biosynthesis in 36 genotypes. Comparative analysis of latex transcriptome between clone PR107 (low yielder) and CATAS8-79 (high yielder) was performed to uncover the molecular mechanism behind the regulation of latex regeneration and duration of latex flow and to identify differential expression of several genes related to rubber biosynthesis, cellulose and lignin biosynthesis (Chao et al. 2015). They observed the influence of higher level expression of unigenes such as HbLOX and HbOPDR in the JA biosynthesis pathway, HbSUT, HbBAM and HbPK mediating carbohydrate metabolism, HbHMGR1 in MVA pathway and HbHRT2 in rubber biosynthesis pathway with enhanced latex regeneration. In turn, prolonged latex flow upregulated HbACO-mediated ethylene biosynthesis (Chao et al. 2015).

The first genome sequencing in Hevea, reported by Rahman et al. (2013), identified 12 distinct sub-metabolic pathways represented by 417 genes involved in the carbon assimilating steps in rubber biosynthesis (Fig. 8). The study lists the genes involved in metabolic pathways that contribute to the production of the isoprenoid precursors required for the rubber biosynthesis pathway (fructan metabolism, 54 genes; starch metabolism, 38 genes; glycolysis, 52 genes; alternate pentose phosphate pathway, 14 genes; acetyl CoA biosynthetic pathway, 120 genes). The genome sequencing report by Lau et al. (2016) presented a comprehensive genome-wide analysis of clone RRIM600 and attributed H. brasiliensis's capacity to produce higher levels of latex to the expansion of rubber biosynthesis-related genes in its genome and their enhanced expression. They observed higher levels of MVA pathway genes (acetyl CoA acetyltransferase, HMG-CoA synthase and mevalonate diphosphate decarboxylase gene families) in latex than in leaf and bark. They also observed that the MEP pathway, which contributes intermediates required for rubber biosynthesis, is not preferentially expressed in latex. A high-quality genome assembly of H. brasiliensis by Tang et al. (2016) reported 84 rubber biosynthesis genes from 20 families (of which 18 and 22 genes belonged to the cytosolic mevalonate pathway and the plastidic MEP pathway, respectively) that contribute to IPP synthesis. They also identified 15 initiator synthesis genes in the cytosol and 39 genes involved in rubber elongation on rubber particles. The REF/SRPP family of Hevea is the largest, with 18 members, and its proteins are the most abundant in latex. Of the 24 MEP genes, only two 1-deoxy-d-xylulose 5-phosphate synthase genes (DXS7 and DXS10) were found to be preferentially expressed in latex.

Fig. 8

Schematic representation of the metabolic pathway leading to natural rubber biosynthesis. The route from import of sucrose to biosynthesis of rubber involves 12 sub-metabolic pathways, represented in the large boxes. The number of enzymes and associated proteins in each individual pathway is shown in the small white boxes and the number of orthologs in Hevea in the grey boxes. (Adapted from Rahman et al. 2013)

Wu et al. (2018) analysed the relationship between rubber yield and expression of nine major latex metabolism-related genes, i.e. HMG-CoA synthase (HMGS), HMG-CoA reductase (HMGR), diphosphomevalonate decarboxylase (PMD), farnesyl diphosphate synthase (FPS), cis-prenyltransferase (CPT), rubber elongation factor (REF), small rubber particle protein (SRPP), dihydroxy acid dehydratase (DHAD) and actin depolymerizing factor (ADF), and found most of the genes to express predominantly except HbHMGR1, HbPMD and HbDHAD. Most of these genes (except HbDHAD) were greatly influenced by both ethephon (ETH) and methyl-jasmonate (MeJA). This study also found positive correlation between the expression level of HbCPT, HbFPS, HbHMGS, HbHMGR1 and HbDHAD and rubber yield and/or yield characteristics while HbREF exhibited a negative correlation. HbSUT3, a sucrose transporter gene, was observed to express predominantly in latex (Tang et al. 2010, 2013). Expression of HbSUT3 got induced by tapping, ethrel stimulation and jasmonic acid and was positively correlated with latex yield. Efficient sucrose loading in laticifer cells is important for improved rubber production in Hevea, and this function is carried out mainly by sucrose transporters encoded by SUT (sucrose transporter) genes. A study by Sui et al. (2017) suggested the role of SWEET proteins in sugar efflux transport as response to plant development and stress response based on a genome-wide analysis for the first time. From a total of 127 SWEET genes studied, HbSWEET1a, 2e, 2f and 3b proteins were involved in phloem loading while HbSWEET10a and 16b in laticifer sugar transport and HbSWEET9a in nectary-specific sugar transport. SWEET proteins also have versatile role in plant development, stress response and plant–microbe interactions and transport sugar substrates like sucrose, fructose and glucose selectively.

Jasmonate (JA) signalling regulates secondary laticifer differentiation (Hao and Wu 2000) and NR biosynthesis in Hevea through jasmonate ZIM-domain (JAZ) proteins, which are the master regulators of jasmonate signalling. In a recent study, Loh et al. (2019) observed the influence of JA on secondary laticifer differentiation through the Bel5-GA2 oxidase 1-KNOTTED-like homeobox complex, based on 450 genes unique to jasmonic acid (JA) and linolenic acid (LA). Chao et al. (2019) identified 18 JAZ proteins in Hevea, among which HbJAZ5.0 and HbJAZ10.0b were presumed to be associated with laticifer differentiation while HbJAZ8.0b negatively regulates rubber biosynthesis. Latex yield depends upon the duration of latex flow after tapping and the capability of latex regeneration between two consecutive tappings (d’Auzac et al. 1997). Latex regeneration involves numerous biological processes including transcription and translation, intracellular trafficking and signalling pathways, and movement of water and biochemical intermediates across the laticifer cells (Tang et al. 2013). Elaborate studies using NGS technology have generated plenty of information on rubber biosynthesis genes and their regulatory mechanisms in Hevea. Almost all the genes involved in the general metabolic pathways leading to rubber biosynthesis have been identified in latex and bark, and some have even been cloned and further characterized. More studies are emerging on what exactly determines a high yielder. For example, a detailed study by Tang et al. (2013) in a super high yielding rubber tree attributed the high yield to improved sugar loading capability, rubber biosynthesis-preferred sugar utilization, enhanced general metabolism and timely stress alleviation. More investigations along these lines are needed to elucidate the yield-enhancing factors and to increase productivity.

Ethylene Stimulation and Tapping Panel Dryness: Ethephon, an ethylene releaser applied to the tapping panel, stimulates latex production by inducing several biochemical changes in laticifers, such as sucrose loading, water uptake and nitrogen assimilation, and the expression of ethylene-responsive genes/defence proteins (Putranto et al. 2015). Initially, Kush et al. (1990) reported laticifer-specific genes induced by ethylene in H. brasiliensis. Tapping-induced endogenous ethylene production and ethephon-induced exogenous ethylene are the major sources of oxidative stress in Hevea that result in the development of the tapping panel dryness (TPD) syndrome. Though ethylene is important for optimum latex yield, it becomes deleterious at excess levels. The ethylene signal is perceived through ethylene response factors (ERFs), which belong to the AP2/ERF superfamily. ERFs are trans-acting factors that bind to GCC or DRE/CRT cis-acting elements in the promoter region of target genes (Shoji et al. 2013) and have been reported as biotic and abiotic stress responsive (Mizoi et al. 2012). In the high yielding clone PB 217, ethylene stimulation induces expression of the sugar transporter HbSUT1B more strongly in latex than in inner bark tissues, while the reverse was observed in clone PB 260, a low yielder. Tang et al. (2010) found that HbSUT3, a functional sugar transporter predominantly expressed in the latex of Hevea, is ethylene responsive and positively correlated with yield.

Putranto et al. (2015) studied the expression of abiotic stress and hormone responsive ERF genes in latex and identified 21 HbERF genes regulated by ethylene, methyl-jasmonate and dehydration. This study made an important observation that tapping induces expression of mostly dehydration-responsive genes except for few wounding-related genes. In addition, they also found that most of the ethephon-induced ERFs were responsive to osmotic stress influenced dehydration indicating the occurrence of tapping-induced osmotic stress. Twenty differentially expressed genes belonging to ERF groups I, II, IV, VI, VIII, IX and X were identified as stress responsive and thus can be considered as expression marker genes. Study by Putranto et al. (2015) found various cis-acting regulatory elements in promoters of HbERF-IX that could be induced by ethylene, JA, auxin, cytokinin, gibberellins, abscisic acid and oxidative stress. Moreover, these genes were intron free (Nakano et al. 2006) and had better chance for their rapid and constitutive expression required under stress (Jeffares et al. 2008).

Transcriptome analysis in ethylene-induced cDNA library (Liu et al. 2016) indicated the upregulation of certain regulatory enzymes in the glycolytic pathway as well as genes in the carbon fixation pathway (Calvin cycle). It was presumed that rapid acceleration of glycolytic pathway required to supply precursors for the biosynthesis of IPP, and subsequently NR instead of rubber biosynthesis per se may be responsible for ethylene stimulation of latex in Hevea. Tang et al. (2016) reported 509 ethylene-responsive genes in latex which included 53 transcription factors and 31 protein kinase both of which are essential for ethylene-responsive signal transduction in higher plants. A total of 225 ethylene-responsive element (AP2/ERF) binding factors were also observed. The transcription factors involved in ethylene signalling pathways like AP2/ERF domain-containing transcription factors and ethylene-responsive transcription factor RAP2 had direct association with latex yield (Xia et al. 2018). Ethylene biosynthesis and signalling genes were also identified in Hevea (Kuswanhadi et al. 2010; Piyatrakul et al. 2012; Duan et al. 2013). Three MADS-box genes from Hevea, HbMADS1, HbMADS2 and HbMADS3, were cloned and characterized by Li et al. (2011).

Molecular studies in bark and latex of tapping panel dryness (TPD)-affected rubber trees indicated downregulation of the HbMyb1 transcription factor (Chen et al. 2002). Initially, TPD-associated genes were identified by suppression subtractive hybridization (Sathik et al. 2006; Venkatachalam et al. 2007; Li et al. 2010). A subsequent mRNA differential display study revealed the association of HbTOM20 with TPD (Venkatachalam et al. 2009). A comparative transcriptome analysis between healthy and TPD-affected rubber trees indicated differential expression of genes related to rubber biosynthesis and jasmonate synthesis (Liu et al. 2015). In another study, transcriptome analysis between healthy and TPD trees revealed differential expression of genes associated with ROS metabolism, jasmonate and ethylene biosynthesis, the ubiquitin proteasome pathway (UPP), programmed cell death (PCD) and rubber biosynthesis (Li et al. 2016).

Abiotic Stress: Abiotic stresses such as drought, low temperature, wind and hailstorms are the major factors that affect the yield of Hevea in major rubber-producing countries. Drought and low temperature stresses hinder plant growth and development by impairing several metabolic processes including stomatal conductance, nutrient uptake and photosynthetic assimilation, which eventually results in significant yield loss (Buttery and Boatman 1976; Sethuraj et al. 1984; Sreelatha et al. 2007, 2011; Shinozaki et al. 2003). Plants have evolved multilevel stress perception, signalling and acclimation mechanisms to cope with such limiting conditions. To mitigate the adverse effects of stress, plants often respond with changes at the physiological, biochemical and gene expression levels. Drought tolerance is a complex phenomenon that involves multiple genes and their interaction with environmental factors (Zandalinas et al. 2020). Developing resilience to multiple stresses in crops is the real challenge; it can be achieved by identifying the key genes and signal transduction pathways associated with such traits and incorporating them in breeding programmes to develop varieties with those specific traits without compromising on yield (Bailey-Serres et al. 2019).

Initial studies on cold stress indicated a significant increase in the expression of cold-responsive genes such as carbonic anhydrase, glutathione peroxidase, metallothionein, chloroplastic Cu/Zn SOD, serine/threonine protein kinase, transcription factor and DNA-binding protein (Saha et al. 2010). In a similar study, gene expression analysis in cold-stressed leaf samples revealed higher expression of LEA 5 protein, NAC tf and peroxidase and suggested an association of LEA 5 protein, peroxidase, ETR1, ETR2 and NAC transcription factor with cold tolerance (Sathik et al. 2012). In a transcriptome study by Sathik et al. (2018), ethylene-responsive transcription factor (ERF) was found to be strongly associated with cold tolerance. Cheng et al. (2015) functionally characterized the CRT/DRE binding factor 1 (HbCBF1) gene of H. brasiliensis: transgenic Arabidopsis overexpressing HbCBF1 showed enhanced cold tolerance. Expression analysis of stress-responsive transcripts and their association with drought tolerance in different Hevea clones (Thomas et al. 2012) showed stronger upregulation of NAC tf in a drought-tolerant genotype (RRIM 600) than in a relatively drought-susceptible one (RRII 105), indicating its strong association with drought tolerance (Luke et al. 2017). Differential gene expression analysis in several Hevea genotypes with varying levels of drought tolerance indicated the association of MAP kinase, Myb tf, CRT/DRE binding factor and NFYA with drought tolerance (Luke et al. 2015). In another study, expression of ferritin, DNA-binding protein, NAC tf and aquaporin was strongly associated with drought tolerance (Sathik et al. 2018). Significant upregulation of genes related to energy biosynthesis and ROS scavenging systems, including HbCuZnSOD, HbMnSOD, HbAPX, HbCAT, HbCOA, HbATP and HbACAT, was also observed under drought stress (Wang et al. 2014).

Cheng et al. (2016) found that the expression of ErbB-3 binding protein 1 (EBP1) changes in response to cold, drought stress and ABA treatment in Hevea. Overexpression of EBP1 in Arabidopsis enhanced resistance to freezing and drought stress. Long et al. (2015) identified four members of the glucose-6-phosphate dehydrogenase (HbG6PDH) gene family in Hevea that respond to temperature and drought stresses in root, bark and leaves and are involved in redox balance maintenance and defence against oxidative stress. Gong et al. (2018) identified 1457 and 2328 DEGs from leaves of the cold-tolerant clone Reyan 7-33-97 in response to 2 and 24 h of cold stress treatment, respectively. The main KEGG pathways represented among the significantly expressed genes were flavonoid biosynthesis, phenylpropanoid biosynthesis, plant hormone signal transduction, cutin, suberin and wax biosynthesis, pentose and glucuronate interconversions, phenylalanine metabolism, and starch and sucrose metabolism. This study also observed differential expression of 239 cold-responsive transcription factors (TFs). These included TFs viz., ARR-B, B3, BES1, bHLH, C2H, CO-like, Dof, ERF, AR1, G2-like, GRAS, GRF, HD-ZIP, HSF, LBD, MIKC-MADS, M-type MADS, MYB, MYB-related, NAC, RAV, SRS, TALE, TCP, Trihelix, WOX, WRKY, YABBY and ZF-HD. ATP-binding cassette proteins (ABC proteins) in the laticifers of H. brasiliensis were first characterized by Zhiyi et al. (2015). All eight plant ABC protein paralog subfamilies were identified in latex, among which ABCB, ABCG and ABC1 were the most abundant. Expression of HbABCB15, HbABCB19, HbABCD1 and HbABCG21 was significantly regulated by ethylene, JA and wounding.

In plants, gene expression is regulated by non-coding small RNAs (Storz 2002) like microRNA (miRNA), small interfering RNA (siRNA), small nucleolar RNA (snoRNA), small rDNA-derived RNA (srRNA), small nuclear RNA (U-RNA), tRNA-derived small RNA (tsRNA) in response to development, biotic and abiotic stresses (Meister et al. 2013). miRNAs play main role in negatively regulating gene expression by binding with the complementary mRNAs to either cleave or repress translation (Chinnusamy et al. 2007). Role of miRNAs in Hevea was first reported by Zeng et al. (2010) followed by Gebelin et al. (Gébelin et al. 2012; Gebelin et al. 2013a, b) and Lertpanyasampatha et al. (2012). Gébelin et al. (2012) reported drought-responsive miRNAs by deep sequencing (48 conserved and 10 novel miRNA families) while Lertpanyasampatha et al. (2012) identified 115 miRNAs belonging to 56 families and 20 novel miRNAs through high-throughput sequencing in high and low yielding clones (PB 260 and PB 217, respectively). miRNAs that regulate MYB transcription factor, auxin-responsive factor (ARF) and type III HD-Zip transcription factors were abundantly expressed in leaves. In response to cold stress and JA, MIR159b was found enhanced in leaves and bark (Gebelin et al. 2013a). Enhanced expression of eMIR159b gene was observed in TPD-affected trees (Gebelin et al. 2013b). Four miRNAs (miR482, miR164, miR167 and HbmiRn_42) were found drought responsive while miR482 had strong association with drought tolerance (Kuruvilla et al. 2016). A differential expression analysis of cold-responsive miRNAs indicated miR169, miR159 and miR482 to have strong association with cold tolerance (Kuruvilla et al. 2017). Under cold stress, miR169 got downregulated while its corresponding target mRNA (NF-YA) associated with stress tolerance got upregulated (Luke et al. 2015). 
A novel miRNA (HbmiRn_42) that targets HMG-CoA reductase was found to be highly upregulated under drought conditions in tolerant clones, which probably indicates suppressed rubber biosynthesis under drought (Kuruvilla et al. 2016). These studies also indicate the possibility of using miRNAs as markers for stress tolerance and in crop improvement programmes (Sathik and Kuruvilla 2017), although more investigations are required to study the relationship between miRNAs and their corresponding mRNAs in the context of biotic and abiotic stress tolerance. A recent report by Leclercq et al. (2020) indicates a maximum level of post-transcriptional regulation in the latex biosynthetic pathway and hormonal signalling in latex, through mRNA cleavage by miRNAs and ta-siRNAs (trans-acting small interfering RNAs).

Biotic Stress: Hevea is vulnerable to many fungal pathogens. Oidium heveae, Colletotrichum spp., Phytophthora spp., Corynespora cassiicola and Microcyclus ulei are some of the fungal pathogens that cause leaf diseases. Corticium salmonicolor causes a stem infection called pink disease. South American leaf blight (SALB), caused by M. ulei, is the most devastating disease in Brazil. It has not yet reached Asia, but there is always a possibility that it may, and breeders should be ready with SALB-resistant/tolerant clones to tackle the situation. Corynespora leaf disease, caused by Corynespora cassiicola, is one of the major diseases leading to significant yield loss in Hevea. Through a transcriptome sequencing approach, Roy et al. (2019) identified transcripts related to defence response and response to stimulus and stresses from RRII 105 (susceptible) and GT 1 (moderately resistant). They found enhanced expression of defence-related genes, receptor-like kinases and transcription factors (TFs). TFs such as NAC, ERF, MYB, GATA, WRKY, LEA and bZIP are important upstream regulatory proteins that play a crucial role in regulating plant responses to stresses and in enhancing disease resistance against pathogens. NAC TFs play a critical role in plant immune responses. This study also identified a total of 32,323 SSRs with a relative abundance of 503.59 SSRs/Mb, and SSR polymorphism between clones RRII 105 and GT 1 in control and challenged conditions. Information on these SSRs and the SNPs would be a valuable resource for the construction of linkage maps, QTL mapping, genetic diversity studies and MAS breeding. RAPD markers were used to identify an Oidium-resistance gene (Chen et al. 1994, 2003). In a study on powdery mildew, Li et al. (2016) reported 78 differentially expressed genes from RRIC 52 (an Oidium-tolerant clone).
ESTs expressed at least 78 h post infection were found to be Oidium responsive; among them were ESTs related to membrane pathway, transcription factors, signal transduction, phytoalexin biosynthesis and other metabolic pathways. Increased chitinase and β-1,3-glucanase activity was observed in both tolerant and partially susceptible clones.
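The SSR abundance reported by Roy et al. (2019) also implies how much sequence was scanned, since density is simply count divided by length. The short Python sketch below works through that arithmetic using only the two figures quoted above; the variable names are illustrative and not taken from the original study.

```python
# Figures quoted from Roy et al. (2019) in the text above.
total_ssrs = 32_323      # number of SSRs identified
ssrs_per_mb = 503.59     # relative abundance, SSRs per Mb

# Density = count / length, so length = count / density.
scanned_mb = total_ssrs / ssrs_per_mb

# Average spacing between neighbouring SSRs, in kb (1 Mb = 1000 kb).
kb_per_ssr = 1000 / ssrs_per_mb

print(f"Sequence scanned: about {scanned_mb:.1f} Mb")
print(f"Roughly one SSR every {kb_per_ssr:.2f} kb")
```

On these figures the SSRs were found in roughly 64 Mb of sequence, or about one SSR every 2 kb, which gives a sense of the marker density available for linkage and QTL mapping.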

Páez et al. (2015) identified 86 differentially expressed SALB-responsive genes in clone FX3864 (a cross between H. brasiliensis PB 86 and H. brasiliensis B 38) infected with M. ulei for 48 h and observed induction of SA-dependent defence genes that mediate resistance to SALB. The results indicated downregulation of seven putative members of the ethylene-dependent AP2/ERF family and upregulation of three SA-associated genes involved in cell wall synthesis. Fang et al. (2016) identified 3905 differentially expressed genes from leaves of clone Reyan 7-33-97 at four developmental stages and revealed distinct expression patterns of genes associated with cyanogenic metabolism, lignin and anthocyanin biosynthesis, and their correlation with the change in SALB resistance between the stages of leaf development. Three defence-responsive genes such as ascorbate oxidase, NADPH oxidase and lipoxygenase and six genes coding for ROS scavenging such as thioredoxins, glutaredoxin, caleosin-related peroxygenase, polyamine oxidase and thioredoxin were found to be differentially expressed in latex (Tang et al. 2016). More such disease resistance genes have to be identified from Hevea using high-throughput sequencing for use in breeding programmes for disease tolerance.

DNA Methylation: Under biotic and abiotic stress conditions, plants undergo DNA methylation to alter gene expression in numerous biochemical pathways associated with stress acclimatization and molecular adaptation (Finnegan et al. 1998). Genome hypomethylation induced by cold stress (Chinnusamy et al. 2008) and drought stress (Labra et al. 2002) has shown that it is a well-synchronized strategy of plants to alter gene expression in order to cope with the changing environment (Uthup et al. 2011). In Hevea, abiotic stress-influenced methylation was observed in the promoter region of rubber biosynthesis genes and a disease-resistance gene (Uthup et al. 2011), while the non-stressed plants (clone RRII 105) remained unmethylated. The clones RRIM 600 and PB 260 were found to be epigenetically more stable. Methylation of the coronatine-insensitive 1 (COI1) gene, a defence-related gene (Xie et al. 1998), in clones RRII 105 and RRIM 600 indicates the suppression of its expression, thus making the plant susceptible to pathogen attack. This study indicates that methylation can be used as a parameter to identify environmentally stable clones for molecular breeding (Uthup et al. 2011).

9 Genome Sequencing

Constructing the Hevea genome and generating a multi-transcriptome database can accelerate genome analysis and facilitate genome data-based marker identification for application in crop improvement programmes. To date, four draft Hevea genomes have been published, between 2013 and 2017. A comparison of these sequencing efforts is given in Table 3. The first draft genome sequence of Hevea was made available by a Malaysian team (Rahman et al. 2013) for RRIM 600, a progeny of genotypes Tjir1 and PB 86. The whole-genome shotgun (WGS) approach generated ~43× coverage from Roche/454, Illumina and SOLiD platforms. The assembly covered about 1.1 Gb of the estimated 2.15 Gb haploid genome, and 68,955 genes were predicted, of which 12.7% were unique. Based on 154 microsatellite markers, they anchored 143 scaffolds and associated 1325 genes with the 18 H. brasiliensis linkage groups. They identified 9516 clusters of genes common to Euphorbiaceae, among which 2708 clusters comprising 8748 genes are specific to Hevea. Expression analysis indicated that the plastidic mevalonate-independent (MEP) pathway (29 genes) contributes an alternative source of the IPP precursor for rubber biosynthesis.

Table 3 A comparison of four published draft genome data of Hevea brasiliensis

The second genome sequence was reported by the Rubber Research Institute, CATAS, China (Tang et al. 2016) for genotype Reyan 7-33-97, based on WGS and pooled BAC clones. This assembly has a length of 1.37 Gb, covering about 94% of the predicted genome size (1.46 Gb). The study found Hevea to have the largest genome size and greatest repeat content (71%) among five genera of Malpighiales (Hevea, Manihot, Ricinus, Populus and Linum). It identified 84,241 unique transcripts and 43,792 protein-coding genes (39,919 were found in NCBI), among which 84 rubber biosynthetic genes from 20 families were found. Eighteen genes belonging to the cytosolic mevalonate (MVA) pathway and 22 genes of the plastidic MEP pathway were found to contribute to IPP synthesis. The study reported latex-biased expression of at least one gene each of the MVA pathway towards rubber biosynthesis. The 18-member REF/SRPP family of Hevea was found to be the largest when compared with the other 17 sequenced plants, indicating the larger representation of this gene family in Hevea. Ethylene-responsive DGE analysis in latex indicated expression of 342 annotated DEGs, 53 transcription factors and 31 protein kinases (signal transduction and response). Among the eight ETRs expressed in response to ethylene, four were highly upregulated (ETR1, CTR1, EIN2 and EIN3/EIL1). Higher levels of EIN2 and EIL1 indicate the occurrence of active ethylene signalling in laticifers. A total of 225 AP2/ERF members of ethylene-responsive element binding factors were also identified. Ethylene-triggered downstream biochemical cascades related to latex production, such as sugar loading and its catabolism, water uptake, energy availability, cytosolic alkalinization, nitrogen assimilation and defence responses, were also identified. Five out of six DEGs related to ROS production and scavenging were also reported to be highly upregulated. While ascorbate oxidase and NADPH oxidase were expressed at higher levels, lipoxygenase was downregulated.

The third report on Hevea genome sequencing was by Lau et al. (2016), a joint venture by Malaysia (Centre for Chemical Biology, University of Malaysia) and Japan (RIKEN Centre for sustainable Resource Science) in clone RRIM 600. This assembly has a total length of 1.55 Gb with 72.5% repetitive DNA sequences and represented more than 93.7% of the transcriptome sequences. They could predict a total of 84,450 high-confidence protein-coding genes. This study also revealed the presence of highest number (483) of disease-resistance–related genes. Proteome comparison indicated sharing of 12,406 gene families among Euphorbiaceae species such as Ricinus communis, Manihot esculenta and Jatropha curcas. Among the 3142 putative transcription factors (tfs) identified in this study, MYB tfs were the most abundant group which is mainly involved in hormonal signal transduction, disease resistance and secondary metabolism. The CAGE analysis (cap analysis gene expression) indicated preferential expression of MVA pathway genes in latex at higher levels than the MEP pathway.

In the following year, the fourth report on Hevea genome assembly was published by Pootakham et al. (2017) from the National Centre for Genetic Engineering and Biotechnology and the Rubber Authority of Thailand. The genotype BPM 24, which is resistant to major fungal pathogens such as Phytophthora and Corynespora, was sequenced using three platforms (Roche 454 GS FLX+, Illumina HiSeq 2000 and PacBio RSII). The final assembly had a length of 1.26 Gb (N50 = 96.8 kb) and contained 69.2% repetitive sequences with a GC content of 34.31%. Apart from a total of 43,868 predicted protein-coding genes, 623 tRNA, 274 rRNA, 282 snoRNA, 164 snRNA and 193 miRNA genes were also reported. This study found 2308 markers (out of the 2321 markers of the GBS-SNP–based linkage map reported by Pootakham et al. 2015a, b) unambiguously located on the genome assembly and could anchor 862 scaffolds spanning approximately 310 Mb on the 18 linkage groups of Hevea. With an additional 708 scaffolds from the literature, a total of 1568 anchored scaffolds were found to be associated with about two-thirds of the predicted coding genes.

The assembly by Pootakham et al. (2017) is slightly larger than the 1.1 Gb reported by Rahman et al. (2013) for RRIM 600 but smaller than the 1.37 Gb assembly of Reyan 7-33-97 by Tang et al. (2016) and the 1.55 Gb of RRIM 600 reported by Lau et al. (2016). The percentage of repeat sequences (69.2%) reported by Pootakham et al. (2017) is also similar to the previous reports (71, 72 and 72.15%) by Rahman et al. (2013), Tang et al. (2016) and Lau et al. (2016), respectively. This study also revealed the common ancestry of the Hevea and Manihot genomes (Pootakham et al. 2017). Hevea was also found to have a large number of nucleotide-binding site (NBS) and leucine-rich repeat (LRR) containing R genes, similar to its close relatives such as Manihot and Ricinus. An important finding of this study is the identification of 164 syntenic blocks containing 2951 paralogous gene pairs in Hevea, which point to a paleotetraploidization event. The distribution of these paralogous gene pairs across the 18 linkage groups revealed one-to-one synteny between five pairs of homeologous linkage groups that might have evolved from a recent whole-genome duplication event. The presence of conserved syntenic genes between Hevea and Manihot indicates that the paleotetraploidy event possibly occurred prior to the divergence of these two genera.
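The size comparison above can be laid out as a small calculation. The following Python sketch is illustrative only: it uses the assembly lengths quoted in the text, includes an estimated genome size only where the text states one, and the helper name `coverage_percent` is an assumption introduced here.

```python
# Assembly sizes in Gb as quoted in the text. The estimated genome size is
# given only where the text states it; None means "not quoted here".
assemblies = [
    # (reference, clone, assembly_gb, estimated_genome_gb)
    ("Rahman et al. 2013", "RRIM 600", 1.10, 2.15),
    ("Tang et al. 2016", "Reyan 7-33-97", 1.37, 1.46),
    ("Lau et al. 2016", "RRIM 600", 1.55, None),
    ("Pootakham et al. 2017", "BPM 24", 1.26, None),
]


def coverage_percent(assembly_gb, genome_gb):
    """Assembled length as a percentage of the estimated genome size."""
    return 100.0 * assembly_gb / genome_gb


# List the assemblies from smallest to largest.
ranked = sorted(assemblies, key=lambda a: a[2])
for ref, clone, size, estimate in ranked:
    if estimate is None:
        note = "genome estimate not quoted"
    else:
        note = f"{coverage_percent(size, estimate):.0f}% of estimated genome"
    print(f"{ref:<22} {clone:<14} {size:.2f} Gb  {note}")
```

The 1.37 Gb assembly against a 1.46 Gb estimate reproduces the "about 94%" stated for Reyan 7-33-97, whereas 1.1 Gb against 2.15 Gb corresponds to roughly half of the genome size estimated in the first draft.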

Makita et al. (2018) created a Hevea genome and multi-transcriptome–based database with an aim to help Hevea researchers and breeders make use of vast information on genome and transcriptome, gene expression and precise gene structure such as transcription start sites (TSSs) and isoforms of Hevea. They employed cap analysis gene expression (CAGE) method that captures 5′ end of the transcribed and capped mRNAs to study the major expressed TSS in different tissues. Gene annotation becomes very difficult in non-model plants with genes of unknown functions and to overcome this problem (especially for the 22,991 Hevea specific genes), Makita et al. (2018) carried out co-expressed gene analysis to predict functionally related gene groups that are regulated by the same transcription factor (TF) based on the fact that genes regulated by the same TF often display similar pattern of expression. They prepared an NR biosynthetic pathway by this method to clearly illustrate information on the genes involved (Fig. 9) by integrating three types of transcriptome data such as RNA-Seq, Full-length cDNA (FL-cDNA) and CAGE.
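The idea behind co-expression analysis, that genes controlled by the same TF show similar expression profiles across samples, can be sketched in a few lines of Python. This is a minimal illustration, not the procedure of Makita et al. (2018): Pearson correlation is assumed here as the similarity measure, and the gene names, expression values and the 0.9 threshold are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical expression values across five tissues (arbitrary units).
expression = {
    "TF_A":   [1.0, 2.0, 8.0, 9.0, 3.0],
    "gene_1": [2.0, 4.1, 15.8, 18.2, 6.1],  # rises and falls with TF_A
    "gene_2": [9.0, 8.0, 2.0, 1.0, 7.0],    # opposite pattern to TF_A
    "gene_3": [5.0, 5.2, 4.9, 5.1, 5.0],    # nearly flat
}


def coexpressed_with(bait, data, threshold=0.9):
    """Genes whose profile correlates with the bait at or above threshold."""
    return sorted(
        gene
        for gene, profile in data.items()
        if gene != bait and pearson(data[bait], profile) >= threshold
    )


print(coexpressed_with("TF_A", expression))
```

In this toy data only `gene_1` groups with `TF_A`. For a gene of unknown function, membership in such a group is the kind of evidence from which a putative function can be inferred.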

Fig. 9

Database for rubber biosynthesis pathway search. The genes involved in each step are shown. For easy access to each gene, the reader can visit the link provided in the original paper. (Adapted from Makita et al. 2018)

Currently, four draft genome assemblies are available for three clones of Hevea brasiliensis, viz. RRIM 600 (Rahman et al. 2013; Lau et al. 2016), Reyan 7-33-97 (Tang et al. 2016) and BPM 24 (Pootakham et al. 2017). Though advanced sequencing platforms like PacBio, which can generate reads of about 10–15 kb, have made high-quality genome assembly possible, the raw reads may suffer from a high error rate. Pootakham et al. (2017) addressed this by employing long-range Chicago data along with PacBio reads to scaffold contigs, which improved sequence contiguity considerably. Further, the Hi-C–based sequencing approach (Lieberman-Aiden et al. 2009; van Berkum et al. 2010) has made it possible to decipher the relationship between chromosome organization and genome activity and thereby to understand genomic processes such as transcription and replication. This approach does not require population data for map construction and can be carried out in a relatively short time and with ease (Cheng et al. 2020). As sequencing becomes cheaper, it is also possible to detect finer interactions between enhancers, silencers and insulators (Lieberman-Aiden et al. 2009). A similar approach is required in Hevea to generate higher quality, chromosome-level genome data with a contig N50 of more than 5 Mb, a scaffold N50 of about 20 Mb and fewer than 100 scaffolds, of which 95% should be anchored and ordered onto the whole set of 18 chromosomes (Cheng et al. 2020). Data from these efforts should make possible whole-genome association studies, identification of potential markers, the study of marker–trait relationships at the whole-genome level, genetic modification of Hevea for traits of interest along with high yield, and improvement of germplasm.
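The contiguity targets quoted above are expressed in terms of N50, the sequence length at which sequences of that length or longer contain at least half of the assembled bases. A minimal Python sketch of the statistic and of a check against those targets follows. The function names are introduced here for illustration, the reading of "about 20 Mb" as "at least 20 Mb" is an assumption, and the lengths in the usage lines are hypothetical.

```python
def n50(lengths):
    """Return the N50 of a list of sequence lengths.

    Sequences are taken from longest to shortest until their cumulative
    length reaches half of the total; the length reached is the N50.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must not be empty")


def meets_targets(contig_lengths, scaffold_lengths, anchored_fraction):
    """Compare an assembly with the chromosome-level targets in the text."""
    return {
        "contig N50 > 5 Mb": n50(contig_lengths) > 5_000_000,
        # Assumption: "about 20 Mb" is treated as "at least 20 Mb".
        "scaffold N50 >= 20 Mb": n50(scaffold_lengths) >= 20_000_000,
        "fewer than 100 scaffolds": len(scaffold_lengths) < 100,
        "95% of scaffolds anchored": anchored_fraction >= 0.95,
    }


# Hypothetical toy assembly, for illustration only.
toy_contigs = [8_000_000, 6_000_000, 4_000_000, 2_000_000]
toy_scaffolds = [30_000_000, 25_000_000, 20_000_000, 10_000_000]
report = meets_targets(toy_contigs, toy_scaffolds, anchored_fraction=0.96)
for criterion, passed in report.items():
    print(f"{criterion}: {'yes' if passed else 'no'}")
```

For comparison, the N50 of 96.8 kb reported for the BPM 24 assembly is well short of these targets, which illustrates the gap that chromosome-level scaffolding is expected to close.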

10 Conclusions and Future Perspectives

Unlike most other crops, Hevea was domesticated only around 150 years ago, and the progress attained in Hevea breeding has been considerable, especially since the 1950s. Through conventional breeding alone, yield has been improved from a mere 456 kg ha−1 year−1 to more than 2500 kg ha−1 year−1, and in exceptional cases to more than 3500 kg ha−1 year−1, which is still far below the theoretical yield potential of 7000–12,000 kg ha−1 year−1. The enormous efforts of breeders deserve appreciation: they raised the yield potential to such levels through recombination within a stock of a few dozen seedlings from the Wickham base that survived the journey to Asia. The subsequent IRRDB-initiated germplasm collection, conducted during the 1980s at the primary centre of origin, injected a vast collection of new Amazonian germplasm into the breeding programmes, the utilization of which is still in its infancy. Much of the variation could not be utilized for want of techniques that can transfer traits into the existing superior clones. The absence of a feasible juvenile selection process makes breeding more cumbersome and time consuming. The surge in molecular techniques and the information generated from molecular investigations in Hevea have opened the way to rapid progress in crop improvement. Genome data along with transcriptome data enable us to investigate gene expression related to rubber biosynthesis, disease and abiotic stress tolerance, etc. Further, incorporating proteomic and metabolomic data can provide a comprehensive picture. Markers are expected to play the main role in early selection, and this property can best be utilized to develop early selection methods for accurate prediction of the mature phenotype at the juvenile stage itself, which is the main objective of Hevea breeding. 
Incorporating genomics into breeding programmes to identify high yielding genotypes would minimize the requirement of both space and time.

Genomic selection (GS), a new approach that uses whole-genome molecular markers, has the potential to improve quantitative traits in a shorter time in a large population. With genomic data available, genotype × environment (G × E) interaction can be modelled as marker × environment (M × E) interaction when marker effects vary among environments or groups of environments, and these effects may be correlated. GS is fast emerging as an alternative genome-wide marker-based method to predict future genetic responses, allowing the genetic value of a large number of selection candidates to be estimated at an early stage. G × E interactions need to be accounted for when developing prediction models. The phenotypic data collected over decades can be used for genomic selection to develop superior clones of Hevea. The responsibility for crop improvement lies with the breeder, who alone can glean the apt information from the vast data available to develop superior Hevea clones with most of the required traits incorporated. The coming decade will probably witness such a multidimensional approach in crop breeding programmes to develop new varieties of Hevea augmented with all the economically as well as ecologically important traits.
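To make the M × E idea concrete, one common formulation (assumed here; the text does not specify a model) splits each marker effect into a main effect shared across environments and a deviation specific to each environment. The predicted genetic value of a genotype in an environment is then the sum, over markers, of the allele dosage multiplied by the combined effect. The Python sketch below uses entirely hypothetical clone names, genotypes and effect sizes; in practice the effects would be estimated from phenotyped and genotyped training populations.

```python
# Hypothetical marker genotypes, coded as allele dosage (0, 1 or 2).
genotypes = {
    "clone_A": [0, 1, 2],
    "clone_B": [2, 1, 0],
}

# Hypothetical main marker effects, shared by all environments.
main_effect = [0.5, -0.2, 0.3]

# Hypothetical environment-specific deviations of each marker effect.
env_deviation = {
    "env_dry": [0.1, 0.0, -0.3],
    "env_wet": [-0.1, 0.0, 0.3],
}


def predicted_value(markers, env):
    """Sum of dosage x (main effect + environment-specific deviation)."""
    return sum(
        dosage * (main + deviation)
        for dosage, main, deviation in zip(markers, main_effect, env_deviation[env])
    )


for env in env_deviation:
    ranking = sorted(
        genotypes, key=lambda c: predicted_value(genotypes[c], env), reverse=True
    )
    values = ", ".join(
        f"{c}={predicted_value(genotypes[c], env):.1f}" for c in ranking
    )
    print(f"{env}: {values}")
```

In this toy case the ranking of the two clones reverses between the two environments, which is the kind of G × E interaction a prediction model must capture if clones are to be recommended for specific regions.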