Scientific Data Engineer (Pharma R&D)

Software Country is a technology company focused on providing software development services to clients worldwide. Our technical knowledge, coupled with deep industry expertise, allows us to create effective, high-quality solutions. We have been helping enterprises scale engineering capacity and deliver efficient software since 1993.

We are looking for a Data Engineer with strong dual competency in data engineering and life sciences to build and scale data pipelines and curated datasets that power scientific analytics, AI/ML, and data products across R&D.

You will work in a multidisciplinary environment, partnering closely with scientists, data scientists, data architects, and product owners to translate scientific workflows into reliable, reusable, and governed data assets.

This role is hands-on and delivery-focused: you will design, develop, and operate data ingestion and transformation pipelines, optimize performance and reliability, and ensure data is discoverable and trusted through strong metadata, lineage, and access controls, all within the scientific data and AI ecosystem for development and real-world evidence (RWE) assets.

Availability between 8:00 AM and 2:00 PM EST is required.

Responsibilities

  • Design and implement scalable ETL/ELT pipelines for diverse scientific datasets using Databricks and Snowflake.
  • Develop production-grade workflows with robust testing, monitoring, and performance optimization.
  • Build curated data layers and structures to support downstream analytics and AI products.
  • Implement data quality checks, metadata, and lineage practices to ensure data is discoverable and trusted.
  • Collaborate with scientists, architects, and security teams to align pipelines with governance and operational standards.
  • Contribute to architectural decisions, balancing short-term delivery with long-term platform sustainability.

Requirements

  • 5+ years of experience in data engineering with a strong engineering background.
  • Deep knowledge of the Pharma R&D space, including experience with clinical or RWE datasets.
  • Expertise in Python, SQL, and collaborative workflows (Databricks/Snowflake).
  • Proven ability to build and operate production-grade data pipelines and transformation layers.
  • Experience with metadata management, data cataloging, and FAIR data principles.
  • Degree in Computer Science, Bioinformatics, Life Sciences, or a related field.
  • English level: B2 (Upper-Intermediate) or higher.

Nice to have

  • Experience with semantic data, ontologies, or knowledge graphs.
  • Work experience in regulated environments (GxP, data privacy).

Our offer as your future employer

  • Flexible work schedule.
  • The ability to work remotely.
  • Opportunities for professional growth.
  • Medical insurance.
  • Relocation bonus for candidates who relocate.