
Job Details

Senior AI Reliability Engineer

2025-07-14     T-Mobile     Frisco, TX
Description:

As a Senior AI Reliability Engineer, you will play a critical role in ensuring the operational excellence, scalability, and performance of AI-powered platforms and services at T-Mobile. This role requires strong SRE fundamentals, experience managing LLM-based services and APIs, and the ability to drive observability and reliability for generative AI systems across cloud environments. We pride ourselves on encouraging a culture of innovation, advocating for agile methodologies, and promoting transparency in all that we do. Join us in embodying the spirit of the 'Un-carrier' and make a tangible impact! Our team is dynamic, no day is the same, and we are diverse, inclusive, and passionate about growth.

Job Responsibilities:

Implement observability tools, dashboards, and SLO frameworks for LLM-based services and inference pipelines. Monitor and improve the health, latency, and throughput of AI infrastructure in multi-cloud (primarily Azure) and hybrid environments. Manage incident detection, ...


