Data scientist / ML (Cheminformatics)
We have 30 years of expertise in designing and building custom software systems. We provide software development services focusing on complex high-load applications, AI and BI solutions, and mobile apps.
The project focuses on virtual developability/progressability analysis for small molecules.
The goal is to improve hit triaging and decision-making in early-stage medicinal chemistry by leveraging machine learning, active learning, and statistical approaches. The team works closely with scientists to identify the most informative compounds for experimental testing, improve ADME prediction models, and rank chemical series based on their potential.
The scope of work includes improvement of existing ML models, design of training datasets, active learning workflows, uncertainty estimation, analysis of chemical data, and automation of ML pipelines. A single analysis may include up to 100K compounds.
Analysis functionality will be integrated with existing molecular design and data-driven decision-making platforms and may require development of simple Streamlit/Dash applications.
Required skills
- Strong Python (4+ years of experience).
- Experience with AI/ML model development and analysis (RF, GPs, LGBM, GNNs, etc.).
- Strong understanding of statistics and model evaluation.
- Experience working with:
  - small datasets;
  - imbalanced datasets;
  - uncertainty quantification;
  - active learning approaches.
- Understanding of in-distribution vs out-of-distribution (OOD) data.
- Basic understanding of Bayesian statistics and probabilistic modeling.
- Basic knowledge of medicinal chemistry concepts:
  - molecular structures;
  - functional groups;
  - chemical series;
  - structure-activity relationships (SAR).
- Understanding of ADME/ADMET concepts and common activity metrics (pIC50, pAC50).
- Experience with Docker.
- Experience with Git.
Would be a plus
- Experience in the chemistry, cheminformatics, or drug discovery domain.
- Experience working with RDKit or other chemistry packages.
- Experience with Bayesian optimization.
- Experience with Airflow or Kubernetes.
- Experience with Streamlit/Dash.
- Experience designing active learning or experiment selection workflows.
Our offer as your future employer
- Collaboration through either a B2B contract, with payments in USD or EUR at your preference, or a labor contract.
- Flexible work schedule.
- Possibility to work remotely.
- Opportunities for professional growth.
- A company laptop to ensure a comfortable and efficient work setup.