Crossing Hurdles

$15/hour

United States

Remote

Posted May 6, 2026

English

Polish

Crossing Hurdles Is Hiring an AI Evaluation Specialist

We are seeking a dedicated LLM – AI Quality Analyst (Personalization) – Polish for a short-term contract role, perfect for those passionate about AI and language evaluation. 🌍

Position Overview

Type: Short-Term Contract
Location: Remote (Global)
Commitment: 20-40 hours/week, including 4 hours of overlap with PST
Engagement Length: 1 month
Start Date: Immediate

Role Responsibilities 🎯

Design multi-turn conversational prompts based on personal context
Evaluate personalized AI responses for relevance, grounding, and helpfulness
Assess correct and incorrect use of personal data in model outputs
Perform side-by-side (SxS) evaluation and ranking of AI responses
Identify grounding errors, poor inferences, and forced personalization
Write clear, structured rationales referencing specific conversation turns
Extract and verify model debug information and data source usage
Maintain strict data hygiene by deleting evaluation conversations

Requirements ✅

Polish fluency (reading and writing) is mandatory, as Polish is the focus language for this project
Experience in data annotation, AI quality evaluation, content moderation, or related roles is strongly preferred
Strong analytical thinking and attention to detail
Ability to evaluate nuanced and ambiguous AI responses
Comfortable using a primary personal Google account with data sources enabled
BS/BA degree or equivalent experience in a relevant analytical field
Strong written communication and structured feedback skills
Self-motivated and able to work independently in a remote setting
Reliable desktop/laptop with stable internet connection
