Member of Engineering – Pre-training, Data Engineering

Work from home | Full-time role | Hiring

Job Description:

  • Build and maintain high-performance pipelines for trillions of tokens.
  • Deliver diverse, high-quality datasets for pre-training foundation models.
  • Work closely with other teams, such as Pretraining, Posttraining, Evals, and Product, to ensure alignment on the quality of the models delivered.

Requirements:

  • Strong background in building production-grade, distributed data systems for machine learning, with experience in:
      • Orchestration: Slurm, Airflow, or Dagster
      • Observability & Reliability: CI/CD, Grafana, Prometheus, etc.
      • Infra: Git, Docker, k8s, cloud managed services
      • Batched inference (e.g., vLLM)
  • Performance obsession, especially with large-scale GPU clusters and distributed pipelines
  • Expert-level Python knowledge and the ability to write clean, maintainable code
  • Strong algorithmic foundations
  • Proficiency with libraries like Polars, Dask, or PySpark
  • Nice to have:
      • Experience in building trillion-scale SOTA pretraining datasets
      • Experience translating research to production at scale
      • Experience with OCR, web crawling, or evals
      • Prior experience pre-training LLMs

Benefits:

  • Fully remote work & flexible hours
  • 37 days/year of vacation & holidays
  • Health insurance allowance for you and dependents
  • Company-provided equipment
  • Wellbeing, always-be-learning and home office allowances
  • Frequent team get-togethers
  • Great, diverse, and inclusive people-first culture
