
DevOps Engineer

A1 MM|Posted 8 days ago

Skills and experience

Role: DevOps engineer
Other roles: Cloud engineer, Site reliability engineer (SRE)
Experience in role: 4+ years
Language proficiency: English
Must-have skills:
    Ruby on Rails
Nice-to-have skills:
    Docker
    AWS
    Kubernetes

Location and salary

Remote policy: Hybrid
Location of job: Pretoria, South Africa
Visa requirements: Authorised to work in South Africa, with status of citizen/passport holder or permanent resident
Visa sponsorship: Unable to sponsor visa
Employment type: Contract

Role description

DevOps Engineer - Chumo Live Streaming Platform

About the Project

Chumo is a high-performance live streaming platform built with Ruby on Rails, handling real-time video streaming at scale. The platform leverages LiveKit for WebRTC streaming, Google Cloud Storage for video assets, and requires robust infrastructure to support thousands of concurrent viewers across multiple live events.

Position Overview

We're seeking an experienced DevOps Engineer to architect, implement, and maintain scalable infrastructure for our live streaming platform. You'll be responsible for ensuring high availability, optimizing performance, implementing monitoring solutions, and building automation to support our rapid growth.

Technical Requirements

Essential Skills

• Container Orchestration: Production experience with Kubernetes, Docker, and container registry management

• Cloud Platforms: Deep expertise with Google Cloud Platform (GCP) or AWS

• Infrastructure as Code: Proficiency with Terraform, CloudFormation, or similar IaC tools

• CI/CD: Experience with GitHub Actions, GitLab CI, or similar pipeline tools

• Monitoring & Observability: Expertise with Prometheus, Grafana, ELK stack, or similar tools

• Database Management: PostgreSQL optimization, replication, and high availability

• Caching & Message Queues: Redis clustering, performance tuning, and monitoring

Required Experience

• 4+ years of DevOps/SRE experience with production systems

• Experience scaling video streaming or real-time communication platforms

• Strong understanding of CDN configuration and video delivery optimization

• Experience with load balancing and auto-scaling for variable traffic patterns

• Security best practices for cloud infrastructure and data protection

• On-call experience and incident response management

Nice to Have

• Experience with LiveKit or WebRTC infrastructure

• Knowledge of video transcoding pipelines (AWS MediaConvert, FFmpeg)

• ClickHouse or time-series database experience for analytics

• Experience with multi-region deployments and edge computing

• Familiarity with Ruby on Rails deployment patterns

• Experience with cost optimization for cloud resources

Key Responsibilities

Infrastructure Architecture

• Design and implement scalable Kubernetes infrastructure for the platform

• Architect multi-region deployment strategy for global low-latency streaming

• Implement auto-scaling policies based on viewer traffic and streaming load

• Design disaster recovery and high availability solutions

Video Streaming Infrastructure

• Optimize Google Cloud Storage for video asset delivery

• Implement CDN strategy for global video distribution

• Configure LiveKit clusters for WebRTC streaming at scale

• Design video transcoding pipeline for multiple quality streams

Platform Reliability

• Achieve 99.9%+ uptime for live streaming services

• Implement comprehensive monitoring and alerting systems

• Build automated failover and recovery mechanisms

• Conduct regular disaster recovery drills

Security & Compliance

• Implement security best practices for cloud infrastructure

• Manage secrets and credentials using HashiCorp Vault or similar

• Ensure GDPR/privacy compliance for video storage

• Conduct regular security audits and vulnerability assessments

Performance Optimization

• Optimize PostgreSQL for high-throughput operations

• Configure Redis clustering for session management

• Implement caching strategies for API and video metadata

• Optimize database queries and indexing strategies

Developer Experience

• Build and maintain CI/CD pipelines for zero-downtime deployments

• Create development environments that mirror production

• Implement infrastructure automation and self-service tools

• Write documentation and runbooks for common operations

Technical Stack

Current Infrastructure:

• Docker containers with multi-stage builds

• Render.com deployment (current)

• PostgreSQL primary database

• Redis for caching and Action Cable

• Google Cloud Storage for video assets

• AWS S3 for LiveKit recordings

Target Infrastructure:

• Kubernetes (GKE/EKS) for container orchestration

• Terraform for infrastructure as code

• ArgoCD for GitOps deployments

• Prometheus + Grafana for monitoring

• ClickHouse for analytics data

• Cloudflare/Fastly for CDN

Monitoring & Operations:

• Honeybadger for error tracking

• New Relic for APM

• Rails Performance monitoring

• Custom dashboards for streaming metrics

Specific Challenges

Scaling Requirements

• Handle 10,000+ concurrent viewers per event

• Support 100+ simultaneous live streams

• Sub-3 second stream latency globally

• 99.9% uptime during peak events

Cost Optimization

• Optimize video storage costs (currently using GCS)

• Implement intelligent caching strategies

• Right-size compute resources based on demand

• Monitor and control data transfer costs

Technical Debt

• Migrate from Render.com to Kubernetes

• Implement proper staging environments

• Set up comprehensive integration testing

• Automate security scanning and updates

What You'll Build

• Kubernetes Platform: Design and deploy production Kubernetes infrastructure with auto-scaling, monitoring, and GitOps

• Global CDN: Implement multi-region video delivery with intelligent routing

• Observability Platform: Build comprehensive monitoring covering application metrics, infrastructure health, and streaming quality

• CI/CD Pipeline: Zero-downtime deployments with automated testing and rollback capabilities

• Disaster Recovery: Multi-region backup and failover strategies

• Cost Management: Automated resource optimization and cost tracking

Development Process

• Infrastructure changes reviewed through pull requests

• Terraform for all infrastructure changes

• Automated testing for infrastructure code

• Documentation-first approach

• On-call rotation participation

Ideal Candidate

You're passionate about building reliable, scalable infrastructure for demanding real-time applications. You have battle-tested experience with video streaming or similar high-bandwidth applications. You understand the unique challenges of live streaming and can architect solutions that balance performance, reliability, and cost. You're comfortable working with developers to understand application needs and translate them into infrastructure solutions.

About A1 MM

0 employees

What we do

A1 Marketing and Media is a leading media production company based in Centurion, South Africa, specializing in innovative digital marketing solutions. We are proud to be the first company in Africa to deliver holographic technology through our exclusive partnership with ARHT Media, enabling clients to engage audiences globally without the need for travel.

Why work for us

Joining A1 Marketing and Media means being part of a dynamic team that values creativity and innovation. We offer opportunities to work with cutting-edge technology, including holograms and high-quality live streaming, while fostering a collaborative environment that encourages professional growth and client-driven customization.

Our culture

At A1 Marketing and Media, we prioritize results-based relationships that create a supportive atmosphere for clients and employees alike. Our culture emphasizes teamwork, transparency, and a commitment to achieving shared goals, ensuring that everyone contributes to the success of our projects.

Our engineering process

We leverage a range of technologies, including Apache and WordPress, to deliver seamless user experiences and high-quality content. Our engineering process is centered around collaboration and innovation, allowing us to efficiently manage production workflows and provide tailored solutions for our clients.

Our hiring process

Our hiring process is designed to identify candidates who align with our values and vision. We conduct thorough interviews that assess both technical skills and cultural fit, ensuring that new team members are well-equipped to contribute to our mission of delivering exceptional media solutions.

Tech Stack

application and data

Ruby
Redis
