First look! MegaRobo's DeeCamp 2022 competition topics revealed!
The DeeCamp 2022 artificial intelligence training camp and innovation challenge, jointly organized by MegaRobo Technologies, Sinovation Ventures, and the Institute for AI Industry Research (AIR) of Tsinghua University, officially opened on April 25. Students from universities at home and abroad have already signed up enthusiastically, including the University of the Chinese Academy of Sciences, Tsinghua University, Peking University, Nanyang Technological University, and Johns Hopkins University.
This year's 17 innovation competition topics cover AI and life sciences, clinical diagnosis, industrial manufacturing, and other popular fields. As a strategic partner, MegaRobo Technologies set five competition topics across the three major tracks of the competition. We have also secured an exclusive preview for students and are disclosing the details of these five topics here first.
Come and choose the topic that interests you and join us! This may be your starting point for changing the world.
Life and health track
Genome-wide expression prediction based on representative gene sets
gene expression profile
Large-scale gene expression profiling is widely used to identify cell states under various disease conditions and genetic perturbations. Although the cost of measuring a genome-wide expression profile has fallen to a very low level, sequencing thousands of samples is still expensive. Recognizing that the expression levels of many genes are highly correlated, researchers in the NIH LINCS program in the United States developed a panel (L1000) of about 1,000 genes and used the expression of these genes to predict the expression of the remaining genes, thereby obtaining the full expression profile.
However, the algorithm used in early LINCS work is simple linear regression, which cannot capture complex nonlinear relationships. More recently, deep learning algorithms have been applied to this computational problem and have achieved higher accuracy. Furthermore, the L1000 genes were chosen based on key genes in small-molecule drug screening, so they are not necessarily the most suitable panel for transcriptome prediction. Redesigning a more compact or more representative panel for predicting genome-wide expression profiles is therefore both a good computational problem and of great practical value.
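The linear baseline described above can be sketched as follows. This is a minimal illustration and not the LINCS implementation: it fits one target gene from two landmark genes by ordinary least squares on made-up numbers, and the function names (`fit_linear`, `predict`) are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    # Work on an augmented copy so the inputs are not modified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_linear(landmarks, target):
    """Least-squares weights (bias first) from the normal equations."""
    rows = [[1.0] + list(sample) for sample in landmarks]
    d = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(d)] for i in range(d)]
    xty = [sum(r[i] * y for r, y in zip(rows, target)) for i in range(d)]
    return solve_linear_system(xtx, xty)


def predict(weights, sample):
    """Predict the target gene's expression for one sample."""
    return weights[0] + sum(w * v for w, v in zip(weights[1:], sample))


# Toy data (invented): 5 samples x 2 landmark genes.
# The target gene is exactly linear here: 1 + 2*g1 - 0.5*g2.
landmarks = [[1.0, 2.0], [2.0, 0.5], [3.0, 1.5], [0.5, 3.0], [4.0, 1.0]]
target = [1.0 + 2.0 * g1 - 0.5 * g2 for g1, g2 in landmarks]

weights = fit_linear(landmarks, target)
prediction = predict(weights, [2.0, 2.0])
```

Because the toy target is exactly linear, the fit recovers it; a target that depends on the landmark genes nonlinearly would leave a residual that this model cannot remove, which is the limitation the deep learning approaches address.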
- The goal of selecting new panel genes is to find genes that are more representative for transcriptome prediction. The ideal panel achieves higher prediction accuracy with fewer genes.
- Various methods can be tried, including combinatorial optimization and graph neural networks. Gene selection and the subsequent transcriptome prediction can be treated either as two independent steps or as a single inseparable one.
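One simple way to treat panel selection as a combinatorial problem is a greedy heuristic: repeatedly add the gene that most improves how well every gene is covered by its best-correlated panel member. The sketch below is an assumed toy formulation, not the competition's reference method; the gene names and expression values are invented.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def greedy_panel(expr, k):
    """Pick k genes greedily.

    expr maps gene name -> expression values across samples.
    The objective is the sum, over all genes, of the highest absolute
    correlation between that gene and any gene already in the panel.
    """
    genes = sorted(expr)
    corr = {g: {h: abs(pearson(expr[g], expr[h])) for h in genes} for g in genes}
    panel = []
    best_cover = {g: 0.0 for g in genes}
    for _ in range(k):
        def coverage_if_added(candidate):
            return sum(max(best_cover[g], corr[candidate][g]) for g in genes)

        candidates = [g for g in genes if g not in panel]
        choice = max(candidates, key=coverage_if_added)
        panel.append(choice)
        for g in genes:
            best_cover[g] = max(best_cover[g], corr[choice][g])
    return panel


# Toy data (invented): genes A, B, C move together, as do D and E.
expr = {
    "A": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "B": [2.0, 4.0, 6.0, 8.0, 10.0, 12.5],
    "C": [1.1, 1.9, 3.2, 3.9, 5.1, 6.0],
    "D": [5.0, 1.0, 4.0, 2.0, 6.0, 3.0],
    "E": [10.0, 2.0, 8.0, 4.0, 12.0, 6.5],
}
panel = greedy_panel(expr, 2)
```

With a budget of two genes, the heuristic takes one gene from each correlated group. Correlation coverage is only a proxy for prediction accuracy; a joint approach would instead score each candidate panel by the error of the downstream predictor.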
Subramanian A, et al. A Next Generation Connectivity Map: L1000 Platform and the First 1,000,000 Profiles. Cell. 2017. 171(6): 1437–1452.
Chen Y, et al. Gene expression inference with deep learning. Bioinformatics. 2016. 32(12): 1832–1839.
Small-size defect detection on chip surfaces
Defect detection / Industrial
AI technology is widely used in industrial defect detection and has achieved good results on products with relatively uniform backgrounds. On products with complex backgrounds, however, such as semiconductor chips, wafers, and display panels, conventional AI algorithms are prone to false detections and missed detections because the defect area differs little in contrast from the background and resembles it in shape. In addition, because display panels and semiconductor chips are high-precision products, most of their defects are small. Combined with the industry's sample imbalance problem, this makes defect detection challenging. A set of model structures and methods that solves small-size defect detection against complex backgrounds would improve product yield across the industry and raise the value of products made in China.
- This competition focuses on raising the detection rate for small-size defects against complex backgrounds, lowering the false detection rate, and making the algorithm run as efficiently as possible.
- Teams can focus on optimizing algorithm accuracy, on improving algorithm speed, or on both.
- The submitted model must be runnable, and its results on the test set must be reported.
- More pan-semiconductor industry data will be provided privately to participating teams before the competition.
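To make the two headline metrics concrete, the sketch below matches predicted boxes to ground-truth defects by intersection over union (IoU) and reports a detection rate and a false detection rate. The competition's exact metric definitions are not given here, so the ones used below (recall, and the fraction of predictions that match no defect) are assumptions, as are the box coordinates.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def evaluate(predictions, ground_truth, iou_threshold=0.5):
    """Greedily match each prediction to at most one unmatched defect."""
    matched = set()
    true_pos = 0
    for pred in predictions:
        best_idx, best_iou = None, iou_threshold
        for idx, gt in enumerate(ground_truth):
            if idx in matched:
                continue
            score = iou(pred, gt)
            if score >= best_iou:
                best_idx, best_iou = idx, score
        if best_idx is not None:
            matched.add(best_idx)
            true_pos += 1
    false_pos = len(predictions) - true_pos
    return {
        "true_positives": true_pos,
        "false_positives": false_pos,
        "missed": len(ground_truth) - true_pos,
        # Assumed definitions: recall and (1 - precision).
        "detection_rate": true_pos / len(ground_truth) if ground_truth else 0.0,
        "false_detection_rate": false_pos / len(predictions) if predictions else 0.0,
    }


# Toy boxes (invented), in pixels. The second defect is only 3x3.
ground_truth = [(10, 10, 14, 14), (50, 50, 53, 53), (80, 20, 84, 24)]
predictions = [(10, 10, 14, 15), (51, 51, 54, 54), (30, 30, 34, 34)]

strict = evaluate(predictions, ground_truth, iou_threshold=0.5)
lenient = evaluate(predictions, ground_truth, iou_threshold=0.25)
```

On this toy data, a one-pixel diagonal shift of the 3x3 box drops its IoU below 0.3, so the second prediction counts as a false detection at the 0.5 threshold and as a hit at 0.25. This sensitivity is one reason small defects are hard to evaluate as well as to detect.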
AI chemistry track
Exploration of key technologies for AI chemical synthesis
AI chemistry/reaction similarity retrieval/path planning and search
As artificial intelligence plays a growing role in empowering scientific research, AI technology has in the past two years begun to help make synthetic chemistry automated and intelligent.
There are countless chemical reactions, and they vary endlessly under different conditions. Planning a simple and feasible synthetic route is a difficult problem for chemists. With big data and artificial intelligence algorithms, researchers can efficiently search and optimize reaction pathways and carry out reliable retrosynthetic analysis, which is expected to greatly improve research efficiency.
In cheminformatics, similarity retrieval for single molecules has mature solutions. However, because a chemical reaction usually involves several compounds, each of which undergoes structural transformation and electron transfer during the reaction, similarity retrieval for reactions is much harder than for molecules. An efficient reaction retrieval algorithm would have a large positive impact on the chemical synthesis industry.
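One common idea for comparing reactions rather than molecules is to fingerprint what changes between reactants and products and then compare those changes. The sketch below assumes a deliberately simplified representation in which each molecule is a list of named substructure features; the feature names, the `reaction_fingerprint` helper, and the equal weighting of lost and gained features are all illustrative assumptions, not an established scheme.

```python
from collections import Counter


def reaction_fingerprint(reactants, products):
    """Return (lost, gained) feature counts for one reaction.

    Each molecule is a list of feature names. Counter subtraction keeps
    only positive counts, so unchanged features cancel out.
    """
    before = Counter()
    for mol in reactants:
        before.update(mol)
    after = Counter()
    for mol in products:
        after.update(mol)
    return before - after, after - before


def tanimoto(a, b):
    """Count-based Tanimoto similarity of two Counters."""
    keys = set(a) | set(b)
    inter = sum(min(a[k], b[k]) for k in keys)
    union = sum(max(a[k], b[k]) for k in keys)
    return inter / union if union else 1.0


def reaction_similarity(fp1, fp2):
    """Average the similarity of lost features and of gained features."""
    return 0.5 * (tanimoto(fp1[0], fp2[0]) + tanimoto(fp1[1], fp2[1]))


# Toy reactions (invented feature lists).
ester_a = reaction_fingerprint(
    reactants=[["carboxylic_acid", "phenyl"], ["hydroxyl", "methyl"]],
    products=[["ester", "phenyl", "methyl"]],
)
ester_b = reaction_fingerprint(
    reactants=[["carboxylic_acid", "ethyl"], ["hydroxyl", "propyl"]],
    products=[["ester", "ethyl", "propyl"]],
)
amide = reaction_fingerprint(
    reactants=[["carboxylic_acid", "phenyl"], ["amine", "methyl"]],
    products=[["amide", "phenyl", "methyl"]],
)

same_type = reaction_similarity(ester_a, ester_b)
different_type = reaction_similarity(ester_a, amide)
```

The two esterifications score as identical even though their substituents differ, because unchanged features cancel. That captures the reaction type but ignores the structural skeleton, so a practical retrieval score would also need a term for scaffold similarity.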
This is an open topic. Participating teams can choose one or more directions to focus on, including but not limited to:
- Given a one-step reaction model, predict suitable reaction pathways more effectively. This can be treated as a path planning problem, approached from the perspective of reinforcement learning, or framed as a generalized search problem. Teams are free to draw on multiple disciplines and perspectives.
- Design a reaction similarity retrieval algorithm such that the retrieved reactions match the query as closely as possible in reaction mechanism and functional group types, and have a sufficiently similar structural skeleton.
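The first direction, framed as a generalized search problem, can be sketched with a best-first search over a one-step model. Everything below is a toy under stated assumptions: `toy_one_step` stands in for the given one-step reaction model and simply looks up invented precursor suggestions with invented probabilities, and a route's cost is taken to be the sum of negative log probabilities of its steps.

```python
import heapq
import itertools
import math


def plan_route(target, one_step, stock, max_expansions=100):
    """Best-first search for the most probable route to purchasable molecules.

    one_step(mol) returns a list of (precursors, probability) suggestions.
    Returns (cost, route) or None, where route is a list of
    (molecule, precursors) steps and cost is the summed -log(probability).
    """
    tie_breaker = itertools.count()  # keeps heap entries comparable
    heap = [(0.0, next(tie_breaker), (target,), [])]
    seen = set()
    expansions = 0
    while heap and expansions < max_expansions:
        cost, _, open_mols, route = heapq.heappop(heap)
        pending = [m for m in open_mols if m not in stock]
        if not pending:
            return cost, route
        state = frozenset(pending)
        if state in seen:
            continue
        seen.add(state)
        expansions += 1
        mol = pending[0]  # every pending molecule must be solved eventually
        for precursors, prob in one_step(mol):
            new_open = tuple(pending[1:]) + tuple(precursors)
            step = (mol, tuple(precursors))
            new_cost = cost - math.log(prob)
            heapq.heappush(heap, (new_cost, next(tie_breaker), new_open, route + [step]))
    return None


# Hypothetical one-step model: molecule -> [(precursors, probability), ...].
TOY_MODEL = {
    "target": [(("A", "B"), 0.9), (("C",), 0.5)],
    "A": [(("s1", "s2"), 0.8)],
    "B": [(("s3",), 0.1)],
    "C": [(("s1", "s3"), 0.9)],
}
STOCK = {"s1", "s2", "s3"}


def toy_one_step(mol):
    return TOY_MODEL.get(mol, [])


result = plan_route("target", toy_one_step, STOCK)
```

The first disconnection with probability 0.9 looks best locally, but it leads to a low-probability step for `B`, so the search ends up preferring the route through `C`. A reinforcement learning approach would replace the fixed cost with a learned value estimate for each partial route.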
Duvenaud, D. K. et al. Convolutional networks on graphs for learning molecular fingerprints. In Proc. 28th International Conference on Neural Information Processing Systems Vol. 2, 2224–2232 (NIPS, 2015).
Wei, J. N., Duvenaud, D. & Aspuru-Guzik, A. Neural networks for the prediction of organic chemistry reactions. ACS Cent. Sci. 2, 725–732 (2016).
Sandfort, F., Strieth-Kalthoff, F., Kühnemund, M., Beecks, C. & Glorius, F. A
structure-based platform for predicting chemical reactivity. Chem 6,