A full publication list can be found on Google Scholar.
* indicates equal contribution.
In biochemical modeling, some foundational systems, such as cellular signaling pathway models, can exhibit sudden and profound behavioral shifts: physiological responses react promptly to environmental changes, producing steep changes in the dynamic model trajectories. These steep changes are one of the major challenges in biochemical modeling governed by nonlinear differential equations. One promising way to tackle this challenge is to convert the input data from the time domain to the frequency domain through Fourier Neural Operators, which enhances the ability to analyze data periodicity and regularity. However, the effectiveness of these Fourier-based methods diminishes in scenarios with complex abrupt switches. To address this limitation, this paper proposes a Multiscale Attention Wavelet Neural Operator (MAWNO) method, which combines the attention mechanism with wavelet transforms to capture these abrupt switches. Specifically, the wavelet transform analyzes data across multiple scales to extract the characteristics of abrupt signals into wavelet coefficients, while the self-attention mechanism enhances the wavelet coefficients of the high-frequency signals, which better characterize the abrupt switches. Experimental results substantiate MAWNO’s superior accuracy on three classical biochemical models featuring periodic and steep trajectories. Code is available at https://github.com/SUDERS/MAWNO.
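As a rough illustration of why wavelet coefficients suit abrupt switches, the sketch below applies a plain Haar transform to a toy step trajectory. It is not the MAWNO implementation: the Haar basis, the function names `haar_step` and `haar_decompose`, and the toy signal are all assumptions chosen for brevity, and the attention stage is omitted.

```python
import math

def haar_step(signal):
    """One level of the Haar wavelet transform on an even-length list."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_decompose(signal, levels):
    """Return the coarsest approximation and the detail bands, finest first."""
    details = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details

# Toy trajectory (assumption): an abrupt switch from 0 to 1 at index 7.
trajectory = [0.0] * 7 + [1.0] * 9
approx, details = haar_decompose(trajectory, levels=2)

# The finest detail band is zero everywhere except at the pair that
# straddles the switch, so the jump is localized in a single coefficient.
finest = details[0]
switch_positions = [i for i, d in enumerate(finest) if abs(d) > 1e-12]
print("non-zero finest-scale coefficients at:", switch_positions)
```

A Fourier basis would spread the same jump across many frequencies, whereas here it shows up in one high-frequency coefficient per scale; an attention layer could then re-weight such coefficients.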
[Background] Alzheimer's disease (AD) is a heterogeneous, multifactorial neurodegenerative disorder characterized by three neurobiological factors: beta-amyloid, pathologic tau, and neurodegeneration. There are no effective treatments for AD at a late stage, which underscores the need for early detection and prevention. However, existing statistical inference approaches in neuroimaging studies of AD subtype identification do not take pathological domain knowledge into account, which can lead to ill-posed results that are sometimes inconsistent with essential neurological principles. [Purpose] Integrating systems biology modeling with machine learning, the study aims to assist clinical AD prognosis by providing a subpopulation classification in accordance with essential biological principles, neurological patterns, and cognitive symptoms. [Methods] We propose a novel pathology-steered stratification network (PSSN) that incorporates established domain knowledge in AD pathology through a reaction-diffusion model, where we consider non-linear interactions between major biomarkers and diffusion along the brain structural network. Trained on longitudinal multimodal neuroimaging data, the biological model predicts long-term evolution trajectories that capture individual characteristic progression patterns, filling the gaps between the sparse imaging data available. A deep predictive neural network is then built to exploit spatiotemporal dynamics, link neurological examinations with clinical profiles, and generate subtype assignment probabilities on an individual basis. We further identify an evolutionary disease graph to quantify subtype transition probabilities through extensive simulations. [Results] Our stratification achieves superior performance in both inter-cluster heterogeneity and intra-cluster homogeneity of various clinical scores.
Applying our approach to enriched samples of aging populations, we identify six subtypes spanning the AD spectrum, where each subtype exhibits a distinctive biomarker pattern that is consistent with its clinical outcome. [Conclusions] The proposed PSSN (i) reduces neuroimage data to low-dimensional feature vectors, (ii) combines AT[N]-Net based on real pathological pathways, (iii) predicts long-term biomarker trajectories, and (iv) stratifies subjects into fine-grained subtypes with distinct neurological underpinnings. PSSN provides insights into pre-symptomatic diagnosis and practical guidance on clinical treatments, and it may be further generalized to other neurodegenerative diseases.
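To make the idea of a reaction-diffusion model on a structural network concrete, here is a minimal sketch for a single biomarker. It is not the PSSN model: the logistic reaction term, the three-node path graph, the parameter values, and the names `laplacian` and `euler_step` are all assumptions made for illustration.

```python
def laplacian(adjacency):
    """Graph Laplacian L = D - A for a symmetric adjacency matrix."""
    n = len(adjacency)
    return [[(sum(adjacency[i]) if i == j else 0.0) - adjacency[i][j]
             for j in range(n)] for i in range(n)]

def euler_step(x, lap, diffusion, growth, dt):
    """One explicit Euler step of dx/dt = -diffusion * L x + growth * x * (1 - x)."""
    n = len(x)
    new_x = []
    for i in range(n):
        spread = -diffusion * sum(lap[i][j] * x[j] for j in range(n))
        reaction = growth * x[i] * (1.0 - x[i])  # assumed logistic reaction term
        new_x.append(x[i] + dt * (spread + reaction))
    return new_x

# Toy "brain network" (assumption): three regions connected in a path 0-1-2.
adjacency = [[0.0, 1.0, 0.0],
             [1.0, 0.0, 1.0],
             [0.0, 1.0, 0.0]]
lap = laplacian(adjacency)

# The biomarker burden starts in region 0 only and spreads along the edges.
state = [0.5, 0.0, 0.0]
for _ in range(500):
    state = euler_step(state, lap, diffusion=0.1, growth=1.0, dt=0.01)
print([round(v, 3) for v in state])
```

The diffusion term carries the burden from the seeded region to its neighbours, while the reaction term amplifies it locally, so regions farther from the seed lag behind.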
AMP-activated protein kinase (AMPK), a cellular energy sensor that responds to changes in the ADP/ATP ratio, has been shown not only to play roles in metabolic actions but also to hold pivotal roles in neurodegenerative diseases such as Alzheimer’s disease (AD), Parkinson’s disease, and Lewy body dementia. However, the downstream effectors regulated by AMPK and its specific contributions to the pathology of these diseases remain elusive. In this study, we combine the power of systems biology and neuroscience to understand the dynamics and regulatory effects of AMPK during the progression of AD. In particular, our study develops a regulatory protein network that seeks to demystify this intricate protein-protein interaction, with emphasis on the role of AMPK in the pathology of AD. By focusing on two important cellular activities, mRNA translation and autophagy, our model explores the downstream effects of a change in AMPK activity and concludes that eEF2-controlled translation is more sensitive to the perturbation. Additionally, we adjust various kinetic parameters to restore proteins affected by the disease to their baseline levels, with the goal of identifying potential new drug targets. The model accurately captures the regulatory effect of AMPK, provides the temporal dynamics of key regulators in AD progression, contributes to the understanding of the disease’s pathophysiology, and potentially generalizes the function of AMPK to other neurodegenerative diseases.
Understanding the interfacial behaviors of biomolecules is crucial to applications in biomaterials and nanoparticle-based biosensing technologies. In this work, we utilized autoencoder-based graph clustering to analyze discontinuous molecular dynamics (DMD) simulations of lysozyme adsorption on a graphene surface. Our high-throughput DMD simulations, integrated with a Go̅-like protein–surface interaction model, make it possible to explore protein adsorption at a large temporal scale with sufficient accuracy. The graph autoencoder extracts a low-dimensional feature vector from a contact map. The sequence of extracted feature vectors is then clustered, so that the evolution of the protein molecule structure during the adsorption process is segmented into stages. Our study demonstrated that the residue–surface hydrophobic interactions and the π–π stacking interactions play key roles in the five-stage adsorption. Upon adsorption, the tertiary structure of lysozyme collapsed, and the secondary structure was also affected. The folding stages obtained by autoencoder-based graph clustering were consistent with detailed analyses of the protein structure. The combination of machine learning analysis and efficient DMD simulations developed in this work could be an important tool to study biomolecules’ interfacial behaviors.
Alzheimer's disease (AD) is a heterogeneous, multifactorial neurodegenerative disorder, where beta-amyloid (A), pathologic tau (T), neurodegeneration ([N]), and structural brain network (Net) are four major indicators of AD progression. Most current studies on AD rely on a single-source modality and ignore complex biological interactions at the molecular level. In this study, we propose a novel multimodal spatiotemporal stratification network (MSSN) that is built upon the fusion of multiple data modalities and the combined power of systems biology and deep learning. Altogether, our stratification approach could (1) ameliorate limitations caused by insufficient longitudinal imaging data, (2) extract important spatiotemporal feature vectors from imaging data, (3) exploit the subject-specific longitudinal prediction of a holistic biomarker set, and (4) generate symptom-related fine-grained subtype classification.
Nowadays, more and more cloud testing platforms provide enterprise developers with solutions for cloud device debugging and automatic testing. It is a great challenge for these platforms to schedule arriving requests onto specific smart device resources efficiently and in real time. Traditional scheduling algorithms adapt poorly to application interface call requests that differ vastly in volume and behaviour. To solve this problem, we integrate these smart devices into a device cloud and propose a method for measuring the service capability of a single device in the group. We then build an adaptive scheduling algorithm model based on the service capability of each device to improve the scheduling efficiency of the group. Practice shows that the adaptive scheduling algorithm can effectively control network traffic. Finally, through analysis and optimization, we derive a method for obtaining the optimal parameter combination for the algorithm.
The mobile app ecosystem has achieved great success in providing services on mobile devices that facilitate almost all aspects of daily life. However, whole-package installation and dramatically increasing package sizes now deter users from trying more apps. To address this issue, many lightweight frameworks have emerged that enable instant service consumption, where apps are small and no installation is needed to consume the services they provide. In this paper, we conduct the first empirical study on instant service consumption on mobile devices. We focus on one of the most popular frameworks, quick apps, which are proposed and supported by nine mainstream mobile phone manufacturers in China. Quick apps are implemented with Web-based technologies and run as native apps without the need for installation. We find that quick apps are much smaller and provide only a limited set of services compared with their corresponding native apps. Then, we characterize the performance differences between quick apps and native apps in terms of launch time, data drain, and network connections when the two kinds of apps provide the same services. Our observations reveal that quick apps perform better than native apps thanks to their much smaller size and fewer functionalities in a single page. Finally, we propose a machine learning based approach to help developers construct a quick app from an existing native app.
Symbolic regression (SR) is a powerful technique for discovering analytical mathematical expressions from data, and it finds various applications in the natural sciences because its results are readily interpretable. However, existing methods face scalability issues when dealing with complex equations involving multiple variables. To address this challenge, we propose SRCV, a novel neural symbolic regression method that leverages control variables to enhance both accuracy and scalability. The core idea is to decompose multi-variable symbolic regression into a set of single-variable SR problems, which are then combined in a bottom-up manner. The proposed method involves a four-step process. First, we learn a data generator from observed data using deep neural networks (DNNs). Second, the data generator is used to generate samples for a certain variable by controlling the input variables. Third, single-variable symbolic regression is applied to estimate the corresponding mathematical expression. Finally, we repeat steps 2 and 3, adding variables one by one until completion. We evaluate the performance of our method on multiple benchmark datasets. Experimental results demonstrate that the proposed SRCV significantly outperforms state-of-the-art baselines in discovering mathematical expressions with multiple variables. Moreover, it can substantially reduce the search space for symbolic regression. The source code will be made publicly available upon publication.
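The control-variable idea in the four steps above can be sketched on a toy problem. This is not the SRCV code: a known function stands in for the learned DNN data generator, a least-squares fit over a tiny hypothetical library stands in for the neural single-variable SR, and every name below is an assumption.

```python
import math

# Hypothetical candidate library; a real SR method searches a far larger space.
LIBRARY = {
    "x": lambda x: x,
    "x^2": lambda x: x * x,
    "sin(x)": math.sin,
    "exp(x)": math.exp,
}

def data_generator(x1, x2):
    """Stand-in for the learned DNN generator (assumption): y = x2 * sin(x1)."""
    return x2 * math.sin(x1)

def fit_affine(features, targets):
    """Least-squares fit of targets ~ a * features + b; returns (a, b, sse)."""
    n = len(features)
    mean_f = sum(features) / n
    mean_t = sum(targets) / n
    var_f = sum((f - mean_f) ** 2 for f in features)
    cov = sum((f - mean_f) * (t - mean_t) for f, t in zip(features, targets))
    a = cov / var_f
    b = mean_t - a * mean_f
    sse = sum((a * f + b - t) ** 2 for f, t in zip(features, targets))
    return a, b, sse

def single_variable_sr(xs, ys):
    """Pick the library form f that minimizes the error of ys ~ a * f(xs) + b."""
    best = None
    for name, fn in LIBRARY.items():
        a, b, sse = fit_affine([fn(x) for x in xs], ys)
        if best is None or sse < best[3]:
            best = (name, a, b, sse)
    return best  # (form name, a, b, sse)

# Steps 2 and 3: hold x2 at a control value, vary x1, fit a one-variable form.
x1_grid = [0.5 + 0.25 * i for i in range(12)]
controls = [1.0, 2.0, 3.0, 4.0]
stage_one = []
for c in controls:
    ys = [data_generator(x1, c) for x1 in x1_grid]
    stage_one.append(single_variable_sr(x1_grid, ys))

# Step 4: release x2 and regress the fitted coefficient a against it.
# The intercept b is close to zero in this toy case, so it is not regressed.
forms_x1 = {name for name, _, _, _ in stage_one}
coeffs = [a for _, a, _, _ in stage_one]
form_x2, slope, intercept, _ = single_variable_sr(controls, coeffs)
print(forms_x1, form_x2, slope, intercept)
```

Each fit only ever searches over one variable: the first stage recovers the dependence on `x1`, and the second stage recovers how the fitted coefficient depends on `x2`, which together give the product `x2 * sin(x1)`.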
This study presents a novel machine learning algorithm, the Chemical Environment Graph Neural Network (ChemGNN), designed to accelerate materials property prediction and advance new materials discovery. Graphitic carbon nitride (g-C3N4) and its doped variants have gained significant interest for their potential as optical materials. Accurate prediction of their band gaps is crucial for practical applications; however, traditional quantum simulation methods are computationally expensive, which makes it challenging to explore the vast space of possible doped molecular structures. The proposed ChemGNN leverages the learning ability of current graph neural networks (GNNs) to capture the characteristics of atoms' local chemical environments underlying complex molecular structures. Our benchmark results demonstrate more than 100% improvement in band gap prediction accuracy over existing GNNs on g-C3N4. Furthermore, the general ChemGNN model can accurately predict the band gaps of various doped g-C3N4 structures, making it a valuable tool for high-throughput prediction in materials design and development.