Breakthrough in Sugar Chemistry: Unravelling Synthetic Carbohydrate via Statistical Analysis and Machine Learning

Carbohydrates, which are widely distributed on cell membranes, mediate much of the signal transduction between cells as well as infection by bacteria and viruses. Tumor cells display abundant abnormal glycan sequences, and bacterial capsular polysaccharides differ greatly from mammalian glycoconjugates, which makes tumor-associated carbohydrates and capsular polysaccharides promising vaccine candidates. However, the development of carbohydrate-based vaccines and medicines has been severely limited by the absence of a reliable guideline for glycosylation, the core reaction of carbohydrate synthesis. Without efficient and stable control over the stereoselectivity and yield of the glycosylation reaction, mass production of carbohydrate-based vaccines and medicines is impractical.

Recently, Dr. Cheng-Chung Wang, an associate research fellow at the Institute of Chemistry, Academia Sinica, Dr. Chi-Huey Wong, a former president of Academia Sinica, and their research teams integrated laboratory experiments, quantitation, big data analysis and machine learning algorithms to establish a program, “GlycoComputer”, and a website, “GlycoComputer: Explorer for chemical glycosylation”, that enable precise prediction of the glycosylation reaction. An acceptor nucleophilicity constant (Aka), which summarizes steric, electronic and structural effects, was developed to quantify the reactivity of hydroxyl groups, providing a connection between synthetic experiments and computer algorithms. This discovery was published in Angewandte Chemie International Edition in February 2021.

At least eleven factors, spanning the chemical participants and the reaction environment, are involved in the conditions of a glycosylation reaction. A subtle change in the building blocks can greatly influence the stereoselectivity and yield. Optimization of this reaction therefore often comes down to trial and error, which has kept the mass production and manufacturing of complicated carbohydrate molecules out of reach. The GlycoComputer, established by Dr. Wang and Dr. Wong, applies the concept of computer-aided synthesis to accurately predict the stereoselectivity and yield of a glycosylation reaction before any work is done at the bench, and is expected to greatly facilitate the production of oligosaccharides and carbohydrate-based vaccines and medicines.

Dr. Wang remarked, “Conventional carbohydrate synthesis is a trial-and-error process, and empirical rules rely heavily on human judgment and are often misled by it. Big data analysis and machine learning provide a platform to evaluate the different factors in the glycosylation reaction and unravel potential parameters.” In establishing the GlycoComputer program, the teams analyzed and studied a diverse range of glycosylation donors and acceptors with well-defined reactivity, as well as promoters. The applicability of the program was further validated by the synthesis of a carbohydrate antigen, which showed that the stereoselectivity and yield can be accurately estimated without sophisticated computational processing. Integrating this program is expected to greatly simplify the production of carbohydrate molecules in the future.

Dr. Chun-Wei Chang is the first author of this study. The corresponding authors, Dr. Cheng-Chung Wang and Dr. Chi-Huey Wong, gratefully acknowledge financial support from Academia Sinica and the Ministry of Science and Technology, Taiwan.

Article Link: https://onlinelibrary.wiley.com/doi/full/10.1002/anie.202013909
