Requirements: English
Company: Experis Manpower Group
Region: Wysokie Mazowieckie, Podlaskie Voivodeship
Work scheduled according to US Eastern or Central time zones

Responsibilities:
- Design and implement scalable QA strategies for validating LLM-generated outputs, including meeting summaries, task extraction, document search results, and contextual content
- Evaluate the effectiveness and reliability of GenAI features across variable formatting preferences (e.g., Robert's Rules, anonymous vs. named notes, bullet vs. narrative)
- Create prompt-scoring and output confidence models to evaluate changes in behavior or regressions from previous prompt iterations
- Partner with product teams to align GenAI output validation to real-world use cases and customer expectations
- Collaborate within a modern Microsoft-based environment, including .NET Core (C#), Azure, microservices, and Cypress
- Leverage Azure OpenAI, Azure AI Search, Recall.AI, Windsurf IDE, and RAG (Retrieval-Augmented Generation) concepts to identify edge cases and test data relevancy
- Where appropriate, use or configure AI-based tooling to assist in regression testing, test generation, or PR review workflows
- Document QA methodologies for AI testing and build playbooks for repeatability and internal enablement
- Contribute to evolving the definition of quality in GenAI, shifting from traditional pass/fail to output value, usability, and trust
- Support broader QA initiatives (when needed), including Cypress test maintenance, smoke test maturity, and test coverage improvements
- Help reduce cycle time and deployment friction by improving overall test reliability and structure

Requirements:

Minimum Qualifications:
- 2+ years in QA, test engineering, or quality automation within a software product environment
- Proven experience validating generative AI/LLM outputs (e.g., OpenAI, Claude, Cohere, Anthropic, etc.)
- Deep understanding of prompt engineering, tuning, and the challenges of hallucinations and inconsistent LLM behavior
- Familiarity with techniques like prompt scoring, fuzzy matching, domain validation, and output consistency testing
- Experience with both manual and automated test strategy design in dynamic, prompt-based systems
- Ability to work independently in remote settings, delivering structure within ambiguity
- Excellent communication skills, with the ability to document results, process, and rationale clearly

Preferred Qualifications:
- Experience testing GenAI features in a B2B SaaS, enterprise, or regulated environment (e.g., education, healthcare, financial services)
- Working knowledge of:
  - Azure OpenAI, Azure AI Search, Recall.AI, Windsurf IDE
  - Cypress, .NET Core/C#, Vue.js (basic familiarity only needed)
  - Vector databases, RAG pipelines, or AI-enhanced search functions
- Experience using GenAI for QA acceleration (e.g., writing test cases, automating regression checks)
- Familiarity with Agile/DevOps environments, including CI/CD pipelines and shift-left QA practices
- Experience working with or supporting offshore/nearshore QA teams

Our offer:
- 100% remote work
- MultiSport Plus
- Group insurance
- Medicover Premium
- e-learning platform