Machine learning offers a range of methods for turning data into predictive models and for capturing complex relationships that are difficult to describe with traditional analytical approaches.
We focus on deep neural networks that take structural information as input and predict:
- physicochemical properties
- functionally relevant features
- structural aspects not present in the original input
A particular challenge is developing transferable machine learning models in domains where little training data is available.
To improve transferability, we are developing structural embeddings and transfer learning strategies.
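The general idea behind combining an embedding with transfer learning can be illustrated with a minimal sketch. It is not the method used in the software listed below: PCA stands in for a pretrained deep embedding, a ridge regression stands in for the task-specific network head, and the data are synthetic. All function names (`fit_embedding`, `embed`, `fit_head`, `predict`, `demo`) are hypothetical. The embedding is learned once from a large unlabeled source set, frozen, and then reused to fit a small model on only 20 labeled target examples.

```python
import numpy as np


def fit_embedding(X_source, dim):
    """Pretraining stand-in (assumption): PCA on a large unlabeled source set."""
    mean = X_source.mean(axis=0)
    # Rows of Vt are principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(X_source - mean, full_matrices=False)
    return mean, Vt[:dim].T  # shapes: (n_features,), (n_features, dim)


def embed(X, mean, components):
    """Project raw features onto the frozen embedding."""
    return (X - mean) @ components


def fit_head(E, y, alpha=1e-3):
    """Ridge regression head (with bias column) fitted on frozen embeddings."""
    E1 = np.hstack([E, np.ones((E.shape[0], 1))])
    return np.linalg.solve(E1.T @ E1 + alpha * np.eye(E1.shape[1]), E1.T @ y)


def predict(E, w):
    """Apply the fitted head to embedded inputs."""
    return np.hstack([E, np.ones((E.shape[0], 1))]) @ w


def demo(seed=0):
    """Return test R^2 of a head trained on 20 labeled target samples."""
    rng = np.random.default_rng(seed)
    k, d = 3, 50                      # latent and observed dimensions (synthetic)
    A = rng.normal(size=(k, d))       # latent-to-feature map
    w_true = rng.normal(size=k)       # latent-to-property map

    def sample(n):
        z = rng.normal(size=(n, k))
        return z @ A + 0.01 * rng.normal(size=(n, d)), z @ w_true

    X_src, _ = sample(2000)           # data-rich source domain, labels unused
    X_train, y_train = sample(20)     # data-poor target domain
    X_test, y_test = sample(200)

    mean, comps = fit_embedding(X_src, dim=k)
    w = fit_head(embed(X_train, mean, comps), y_train)
    pred = predict(embed(X_test, mean, comps), w)

    ss_res = np.sum((y_test - pred) ** 2)
    ss_tot = np.sum((y_test - y_test.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    print(f"test R^2 with 20 labeled samples: {demo():.3f}")
```

In this toy setting the head has only four parameters because the embedding has already compressed the 50 input features, which is why so few labeled examples suffice. With real structural data the embedding would be a trained neural network rather than a linear projection.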
Software and modeling resources:
- ProteinStructureEmbedding: structural embedding for proteins, with applications to predicting molecular properties and pKa shifts
- PolII-mutants: analysis of RNA polymerase II mutation data via autoencoders
