Pan-angiosperm analysis of the CLE signaling peptide family unveils paths, patterns, and predictions of paralog diversification
Gentile, I.; Santo Domingo, M.; Zebell, S. G.; Fitzgerald, B.; Lippman, Z.
Abstract
The compositions of conserved gene families often vary widely between species, complicating predictions and experimental tests of shared versus distinct functions, especially in families shaped by extensive duplication, redundancy, and paralog diversification. The plant CLV3/EMBRYO-SURROUNDING REGION (CLE) small-signaling peptide family exemplifies these challenges. Although genetic studies in model systems have identified shared roles for a few CLE genes and species-specific redundancies, an evolutionary analysis of the entire family over deep time could empower predictive and experimental dissections of functions obscured by redundancy. We developed a scanning pipeline that de novo annotated CLE genes from 2,000 genomes representing 1,000 species, uncovering thousands of previously undetected family members and producing a comprehensive phylogenetic reconstruction and tracing of the family's evolution and sequence diversification over 140 million years. Computational modeling of coding and cis-regulatory regions predicted lineage-specific asymmetries in paralog redundancy, stemming from ancestral amino acids in the functional core of the dodecapeptide and partial conservation of promoter elements. We tested these predictions using two genome-editing strategies in Solanaceae. Base-editing of deeply conserved residues in the CLV3 dodecapeptide and its paralogs across three species confirmed their critical roles in repressing stem-cell proliferation, and multiplex CRISPR knockouts of the 52 tomato CLE genes resolved pairwise and higher-order redundancies, revealing previously uncharacterized regulators of shoot architecture and plant size. These findings show how both peptide and cis-regulatory erosion shape CLE redundancy and provide a framework for detecting and translating deep evolutionary signals into testable genetic hypotheses across compositionally complex gene families.