DevOps Engineer
A1 MM|Posted 8 days ago
Role description
DevOps Engineer - Chumo Live Streaming Platform
About the Project
Chumo is a high-performance live streaming platform built with Ruby on Rails, handling real-time video streaming at scale. The platform leverages LiveKit for WebRTC streaming, Google Cloud Storage for video assets, and requires robust infrastructure to support thousands of concurrent viewers across multiple live events.
Position Overview
We're seeking an experienced DevOps Engineer to architect, implement, and maintain scalable infrastructure for our live streaming platform. You'll be responsible for ensuring high availability, optimizing performance, implementing monitoring solutions, and building automation to support our rapid growth.
Technical Requirements
Essential Skills
• Container Orchestration: Production experience with Kubernetes, Docker, and container registry management
• Cloud Platforms: Deep expertise with Google Cloud Platform (GCP) or AWS
• Infrastructure as Code: Proficiency with Terraform, CloudFormation, or similar IaC tools
• CI/CD: Experience with GitHub Actions, GitLab CI, or similar pipeline tools
• Monitoring & Observability: Expertise with Prometheus, Grafana, ELK stack, or similar tools
• Database Management: PostgreSQL optimization, replication, and high availability
• Caching & Message Queues: Redis clustering, performance tuning, and monitoring
Required Experience
• 4+ years of DevOps/SRE experience with production systems
• Experience scaling video streaming or real-time communication platforms
• Strong understanding of CDN configuration and video delivery optimization
• Experience with load balancing and auto-scaling for variable traffic patterns
• Security best practices for cloud infrastructure and data protection
• On-call experience and incident response management
Nice to Have
• Experience with LiveKit or WebRTC infrastructure
• Knowledge of video transcoding pipelines (AWS MediaConvert, FFmpeg)
• ClickHouse or time-series database experience for analytics
• Experience with multi-region deployments and edge computing
• Familiarity with Ruby on Rails deployment patterns
• Experience with cost optimization for cloud resources
Key Responsibilities
Infrastructure Architecture
• Design and implement scalable Kubernetes infrastructure for the platform
• Architect multi-region deployment strategy for global low-latency streaming
• Implement auto-scaling policies based on viewer traffic and streaming load
• Design disaster recovery and high availability solutions
Video Streaming Infrastructure
• Optimize Google Cloud Storage for video asset delivery
• Implement CDN strategy for global video distribution
• Configure LiveKit clusters for WebRTC streaming at scale
• Design video transcoding pipeline for multiple quality streams
Platform Reliability
• Achieve 99.9%+ uptime for live streaming services
• Implement comprehensive monitoring and alerting systems
• Build automated failover and recovery mechanisms
• Conduct regular disaster recovery drills
Security & Compliance
• Implement security best practices for cloud infrastructure
• Manage secrets and credentials using HashiCorp Vault or similar
• Ensure GDPR/privacy compliance for video storage
• Conduct regular security audits and vulnerability assessments
Performance Optimization
• Optimize PostgreSQL for high-throughput operations
• Configure Redis clustering for session management
• Implement caching strategies for API and video metadata
• Optimize database queries and indexing strategies
Developer Experience
• Build and maintain CI/CD pipelines for zero-downtime deployments
• Create development environments that mirror production
• Implement infrastructure automation and self-service tools
• Write documentation and runbooks for common operations
Technical Stack
Current Infrastructure:
• Docker containers with multi-stage builds
• Render.com deployment (current)
• PostgreSQL primary database
• Redis for caching and Action Cable
• Google Cloud Storage for video assets
• AWS S3 for LiveKit recordings
Target Infrastructure:
• Kubernetes (GKE/EKS) for container orchestration
• Terraform for infrastructure as code
• ArgoCD for GitOps deployments
• Prometheus + Grafana for monitoring
• ClickHouse for analytics data
• Cloudflare/Fastly for CDN
Monitoring & Operations:
• Honeybadger for error tracking
• New Relic for APM
• Rails Performance monitoring
• Custom dashboards for streaming metrics
Specific Challenges
Scaling Requirements
• Handle 10,000+ concurrent viewers per event
• Support 100+ simultaneous live streams
• Sub-3 second stream latency globally
• 99.9% uptime during peak events
Cost Optimization
• Optimize video storage costs (currently using GCS)
• Implement intelligent caching strategies
• Right-size compute resources based on demand
• Monitor and control data transfer costs
Technical Debt
• Migrate from Render.com to Kubernetes
• Implement proper staging environments
• Set up comprehensive integration testing
• Automate security scanning and updates
What You'll Build
• Kubernetes Platform: Design and deploy production Kubernetes infrastructure with auto-scaling, monitoring, and GitOps
• Global CDN: Implement multi-region video delivery with intelligent routing
• Observability Platform: Build comprehensive monitoring covering application metrics, infrastructure health, and streaming quality
• CI/CD Pipeline: Zero-downtime deployments with automated testing and rollback capabilities
• Disaster Recovery: Multi-region backup and failover strategies
• Cost Management: Automated resource optimization and cost tracking
Development Process
• Infrastructure changes reviewed through pull requests
• Terraform for all infrastructure changes
• Automated testing for infrastructure code
• Documentation-first approach
• On-call rotation participation
Ideal Candidate
You're passionate about building reliable, scalable infrastructure for demanding real-time applications. You have battle-tested experience with video streaming or similar high-bandwidth applications. You understand the unique challenges of live streaming and can architect solutions that balance performance, reliability, and cost. You're comfortable working with developers to understand application needs and translate them into infrastructure solutions.
About A1 MM
What we do
A1 Marketing and Media is a leading media production company based in Centurion, South Africa, specializing in innovative digital marketing solutions. We are proud to be the first company in Africa to deliver holographic technology through our exclusive partnership with ARHT Media, enabling clients to engage audiences globally without the need for travel.
Why work for us
Joining A1 Marketing and Media means being part of a dynamic team that values creativity and innovation. We offer opportunities to work with cutting-edge technology, including holograms and high-quality live streaming, while fostering a collaborative environment that encourages professional growth and client-driven customization.
Our culture
At A1 Marketing and Media, we prioritize results-based relationships that create a supportive atmosphere for clients and employees alike. Our culture emphasizes teamwork, transparency, and a commitment to achieving shared goals, ensuring that everyone contributes to the success of our projects.
Our engineering process
We leverage a range of technologies, including Apache and WordPress, to deliver seamless user experiences and high-quality content. Our engineering process is centered around collaboration and innovation, allowing us to efficiently manage production workflows and provide tailored solutions for our clients.
Our hiring process
Our hiring process is designed to identify candidates who align with our values and vision. We conduct thorough interviews that assess both technical skills and cultural fit, ensuring that new team members are well-equipped to contribute to our mission of delivering exceptional media solutions.

