Part-Time Benchmarking Engineer (Remote)

Work from home Part-time role Hiring

Join Our Team and Shape the Future of AI Benchmarking! We're seeking a skilled and motivated Part-Time Benchmarking Engineer to play a key role in maintaining and improving our public LLM benchmarks. As a remote team member, you'll enjoy the freedom and flexibility to work from anywhere, with a competitive salary and the opportunity to grow with our company.

About the Role: As a Benchmarking Engineer, you will be responsible for creating new datasets, running benchmarks against new models, and analyzing results to provide valuable insights. You will have significant ownership of our benchmarking site and the opportunity to propose new benchmarks based on your ideas and hypotheses.

Key Responsibilities:

  • Create new, private datasets in conjunction with our data annotators and partner groups
  • Run existing benchmarks against new models and compile results
  • Write free-text analyses of raw quantitative results to answer key questions about model performance
  • Create social media posts to share our findings with the community
  • Maintain and improve scripts used to run benchmarks against our datasets

Requirements:

  • Deep experience with Python
  • Strong communication and writing skills
  • Experience working in teams, including development sprints and Git
  • Availability of approximately 20 hours per week

Nice to Haves:

  • Familiarity with LLM methods and developments
  • Experience in an ML research setting or data science

About Us: At Vals AI, we're building the enterprise benchmark for LLMs and LLM apps on real-world business tasks. Our mission is to create the infrastructure and certification to automatically audit LLM applications, verifying they are ready for consumption. We're a team of talented individuals with a passion for AI and a drive to make a difference.

What We Offer:

  • Competitive salary
  • Option to work from our SF office
  • Opportunity to grow into a full-time role
  • Collaborative and dynamic work environment

How to Apply: If you're a motivated and skilled individual with a passion for AI, we encourage you to apply for this exciting opportunity. Don't worry if you don't meet every single requirement - we value a great attitude and a willingness to learn above all. Apply now and join our team!

