$15/hour · United States · Remote · Posted May 6, 2026
English · Polish
Crossing Hurdles Is Hiring an AI Evaluation Specialist
We are seeking a dedicated LLM – AI Quality Analyst (Personalization) – Polish for a short-term contract role, perfect for those passionate about AI and language evaluation. 🌍
Position Overview
- Type: Short-Term Contract
- Location: Remote (Global)
- Commitment: 20-40 hours/week, including 4 hours of overlap with PST
- Engagement Length: 1 month
- Start Date: Immediate
Role Responsibilities 🎯
- Design multi-turn conversational prompts based on personal context
- Evaluate personalized AI responses for relevance, grounding, and helpfulness
- Assess correct and incorrect use of personal data in model outputs
- Perform side-by-side (SxS) evaluation and ranking of AI responses
- Identify grounding errors, poor inferences, and forced personalization
- Write clear, structured rationales referencing specific conversation turns
- Extract and verify model debug information and data source usage
- Maintain strict data hygiene by deleting evaluation conversations
Requirements ✅
- Polish fluency (reading and writing) is mandatory, as Polish is the focus language for this project
- Experience in data annotation, AI quality evaluation, content moderation, or related roles is strongly preferred
- Strong analytical thinking and attention to detail
- Ability to evaluate nuanced and ambiguous AI responses
- Comfortable using a primary personal Google account with data sources enabled
- BS/BA degree or equivalent experience in a relevant analytical field
- Strong written communication and structured feedback skills
- Self-motivated and able to work independently in a remote setting
- Reliable desktop/laptop with stable internet connection