About the Team
We are building a new-generation Platform Operations team that reimagines the traditional CloudOps/SRE model. Instead of siloed specialists, every engineer on this team is expected to operate across Cloud Infrastructure, Databases, Networking, Operating Systems, and Data Pipelines, using Generative AI as a force multiplier.
Over time, you will develop deeper expertise in one or more areas, but the baseline expectation is full-stack operational capability from day one.
Role Summary
As a Platform Operations Engineer I, you will manage and maintain customer cloud infrastructure across multiple technology layers. You will leverage AI-assisted tools and workflows to diagnose issues, automate toil, and deliver reliable infrastructure at scale. This is an ideal role for a curious, fast-learning graduate who wants broad exposure to modern infrastructure and AI-augmented operations.
Key Responsibilities
- Monitor, troubleshoot, and resolve incidents across customer cloud environments (GCP / Azure / AWS) using AI-assisted diagnostic workflows.
- Perform routine infrastructure operations: provisioning, patching, backup verification, certificate renewals, and capacity reviews.
- Support database operations (relational and NoSQL), including health checks, query performance triage, and backup/restore procedures.
- Assist with network configurations such as VPNs, firewalls, DNS, load balancers, and connectivity troubleshooting.
- Manage OS-level tasks across Linux and Windows servers: log analysis, service management, and disk and memory diagnostics.
- Contribute to data pipeline monitoring and basic ETL/ELT troubleshooting across platforms such as Databricks and Snowflake, in collaboration with data engineering stakeholders.
- Use Generative AI tools such as Google Gemini, Anthropic Claude, OpenAI models, and open-source LLMs to accelerate problem-solving, write automation scripts, and generate runbooks.
- Write and maintain Infrastructure-as-Code (Terraform, CloudFormation, or equivalent) with AI-assisted code generation; support DevOps practices including CI/CD pipelines and GitOps workflows.
- Participate in on-call rotations and follow incident management best practices (triage, communication, post-mortems).
- Document procedures and solutions clearly and concisely, and write knowledge base articles.
Required Qualifications
- Bachelor's degree in Computer Science, Information Technology, or a related field (recent graduates welcome).
- Foundational knowledge of at least one major cloud platform (GCP, Azure, or AWS): core compute, storage, networking, and IAM concepts.
- Basic understanding of operating systems (Linux preferred), including command-line proficiency, process management, and file systems.
- Familiarity with networking fundamentals: TCP/IP, DNS, HTTP/S, subnets, firewalls, and basic routing.
- Introductory knowledge of relational databases (SQL) and at least an awareness of NoSQL paradigms.
- Exposure to data engineering concepts: data pipelines, batch vs. stream processing, and common platforms and tools (e.g., Databricks, Snowflake, Spark, Airflow, or equivalent).
- Basic understanding of AI/ML concepts: what LLMs are, prompt engineering basics, and willingness to use AI tools (e.g., Google Gemini, Anthropic Claude, OpenAI models, open-source LLMs) daily in operational work.
- Familiarity with at least one programming or scripting language (Python, Bash, Go, or JavaScript) and with configuration formats such as YAML.
- Strong written and verbal communication skills; ability to explain technical issues to both technical and non-technical stakeholders.
Preferred Qualifications
- Any cloud certification (e.g., Google Cloud Digital Leader, Microsoft Azure Fundamentals (AZ-900), AWS Certified Cloud Practitioner).
- Personal projects or coursework involving DevOps practices, Infrastructure-as-Code, CI/CD, or containerisation (Docker/Kubernetes).
- Exposure to monitoring and observability tools (Datadog, Prometheus, Grafana, CloudWatch, or similar).
- Demonstrated curiosity in Generative AI: prompt engineering experiments, AI-powered side projects, or relevant coursework.
What We Offer
- A supportive environment to kick-start your career with broad exposure to modern infrastructure and AI-powered operations.
- Structured mentorship and a clear path to specialisation (Cloud, DBA, Network, OS, or Data Engineering).
- Hands-on work in multi-cloud, multi-technology customer environments.
- Investment in certifications and continuous learning.