
Job Details

Sr. Site Reliability Engineer, Security

  2026-04-16     CentralReach     all cities,AK  
Description:

CentralReach is a leading provider of autism and IDD care software for Applied Behavior Analysis (ABA), multidisciplinary therapy, and special education. Trusted by more than 200,000 users, we enable therapy providers, educators, and employers to scale the way they deliver ABA and related therapies with innovative technology, market-leading industry expertise, and world-class customer satisfaction.

The Engineering Operations group at CentralReach builds the underlying technologies that power our Public and Private Cloud Platforms worldwide. The group is responsible for storage, data infrastructure, IT, observability systems, DevOps, SRE, provisioning, compute, orchestration platform, internal tools, internal platforms (laptops, networks, systems, etc.), and services - all the components that make up the CentralReach Platform.

If you have a passion for the future, enjoy and thrive in an agile, fast-moving, ever-changing startup environment, welcome and take on technical challenges of all shapes and sizes, have excellent interpersonal skills and a sense of humor, and enjoy rolling up your sleeves and jumping in, then read on!

As a Sr. SRE, you will work closely with the key stakeholders in Software Engineering to drive adoption of modern reliability practices like SLOs, error budget policies, actionable alerts, incident retrospectives, chaos testing, and end-to-end ownership.

Key Accountabilities:
- Responsible for availability, latency, performance, efficiency, monitoring/observability, emergency response, capacity planning, setting and maintaining SLOs, SLIs, and Error Budgets, and creating dashboards.
- Analyze, troubleshoot, and resolve operational challenges contributing to defined SLOs.
- Manage site stability, performance, and reliability, and maintain uptime for production environments.
- Develop a fully automated multi-environment observability stack based on the existing system and extend it to predict capacity needs based on usage patterns.
- Strive for automation to reduce toil and increase development velocity.
- Perform application-specific production support, incident management, change management, problem management, RCAs, and service restoration as needed.
- Identify changes for the product architecture from the reliability, performance, and availability perspective with a data-driven approach.
- Document resolution runbooks and standard operating procedures.
- Actively look for opportunities to improve the availability and performance of the system by applying the learnings from monitoring and observation.
- Collaborate with software development teams in the release management process and to shape the future roadmap and establish strong operational readiness across teams.
- Implement reliability and observability tools (such as New Relic, Prometheus, Grafana, etc.).
- Collaborate with the Security team and other platform engineering teams to build reliable, maintainable, and scalable solutions that improve our security posture.

Desired Skills and Experience:
- Strong background as an SRE supporting a 24x7 highly available production environment for a SaaS or cloud service provider.
- Solid experience with Monitoring/APM/Observability tools (Splunk, New Relic, etc.).
- Experience implementing observability plans around logs, metrics, and traces.
- Experience in an agile development team developing software.
- Experience with cloud infrastructure environments, preferably AWS, and Infrastructure as Code (Terraform, CloudFormation).
- Extensive experience with Docker, Kubernetes, Helm, CI/CD, and config management tools like Ansible and Chef.
- Strong experience with containerization technology and/or Kubernetes.
- Experience with release automation, system administration, and configuration management.
- Experience with programming languages (Java, Python, Go, etc.).
- Strong understanding of Linux, Windows, software development, systems, networking, and cloud concepts.
- Strong interpersonal and teaming skills - ability to set and enforce process and influence engineers who are not direct reports.
- Strong analytical and programming skills (Python, Go, Java, etc.).
- Deep understanding of best practices for modern cloud security.
- Proven experience building observability for security concerns, such as privilege escalations and bot detection.

Base Salary Range: $160,000 - $180,000 USD

Backed by Roper Technologies, Inc. (Nasdaq: ROP), and led by award-winning CEO Chris Sullens, CentralReach is entering an exciting phase of growth, innovation, and scale. Recognized as one of the best places to work over 10 times by organizations such as Inc, Built In, and NJBIZ, our culture is centered around impact, inclusion, and flexibility. As a hybrid company with collaborative offices in Ft. Lauderdale, FL; Holmdel, NJ; and Verona, Italy, we foster a workplace where top talent can thrive and make a real difference in the lives of those we serve.

We offer competitive compensation, comprehensive health benefits, generous PTO, 401(k) matching, and paid parental leave. Our team members also enjoy hybrid work schedules, career development support, wellness programs, and opportunities to give back through CR Cares™, our community engagement initiative.

Be part of a market leader driving the future of care. Explore opportunities at centralreach.com/careers.


