Site Reliability Engineer

Compass IoT

Date: 4 days ago

City: Sydney, New South Wales

Contract type: Full time

We're looking for a Site Reliability Engineer - Data Infrastructure/Platform to join our team.

About Compass

Compass IoT helps transport professionals make data-backed decisions about our roads and infrastructure. Founded in 2018, we aggregate connected vehicle data that is used by state governments, local councils, and private orgs across Australia, New Zealand, the UK, and North America.

We're a team of 26 people who value collaboration, energy, ambition, and building things that solve genuine problems.

About the Role

We process trillions of vehicle events and billions of trips to power real-time geospatial analytics for our customers. To scale cost-effectively, we're moving portions of our managed cloud services toward self-hosted, open-source infrastructure, and we're looking for an SRE to help with that shift: setting up the necessary telemetry and making the self-hosted stack fast and reliable.

You'll be the person who makes self-hosted distributed data systems run cleanly in production: ClickHouse for analytics and Apache Flink for stream processing, both on Kubernetes on GCP. This is a high-ownership role at the centre of a major infrastructure transition, with a direct line from your work to our reliability.

What You'll Do

Operate, scale, and harden self-hosted distributed data systems (ClickHouse, Flink) on GKE
Help with the migration of high-cost managed workloads (BigQuery, Dataflow) to self-hosted alternatives — owning reliability and cost the whole way
Build the observability, alerting, and capacity planning needed to run systems at trillions-of-events scale
Drive down cloud spend through right-sizing, autoscaling, commitment planning, and eliminating waste
Tune ingestion, storage, and query performance under real load (merges, memory, sharding, checkpointing, backpressure)
Partner with data and backend engineers so new systems are operable and efficient by design, not after the fact

What we look for

Hands-on experience running self-hosted distributed data systems in production — ideally ClickHouse and/or Flink specifically; experience with Kafka, Spark, Pinot, StarRocks, or similar is also valued
Strong Kubernetes operations (GKE a plus) and infrastructure-as-code
Comfort debugging and tuning stateful, high-throughput systems under pressure
Solid distributed-systems fundamentals: replication, consistency, partitioning, checkpointing, backpressure
Comfortable with change, adaptable. You've scaled before or worked in a startup environment. You've lived through the ambiguity and the flexibility of a fast-moving company: the pivots and constant iterations. That excites you, not exhausts you. You hold plans loosely and your outcomes tightly.
Positive energy, no brilliant jerks. You lead with empathy and kindness and show up in a way that makes people want to work with you. We're casual, but we care deeply about our people, our product, and the problems we're solving. No one here is too senior to do a coffee run or too junior to have the best idea in the room.

Nice to have

Operating GCP at scale (BigQuery, Dataflow, Pub/Sub, GCS)
Go (our primary language)
Geospatial, time-series, or IoT data experience
FinOps / cloud cost-engineering background

Don't tick every box? If you've got the right attitude and most of the experience, we'd still love to hear from you.

Why you’ll love it here

Real scale, real problems: trillions of events and billions of trips, not a toy dataset
High ownership: You'll largely define how we run self-hosted infrastructure; it's mostly greenfield
ESOP: We have no investors. We want the people that build our stuff to directly benefit from its success.
Competitive Pay: Sydney is expensive, and we want to remunerate fairly for someone's experience. We review salaries regularly, including bonuses and promotions.
Other perks: Offsites, free coffee, and a well-stocked kitchen.
High care, high performance culture: Big ambitions, real support. We push each other to be brilliant and look after each other while we do it.
Genuine impact: You'll be in a team at a company that's changing how the world thinks about traffic and transport data.

How to Apply

Send your resume and profile to ***email_hidden***

Include:

Something you've shipped that you're proud of (code, design, data or product). Just tell us what it was, your part in it, the problem you were trying to solve, the outcomes you were aiming for, and why it mattered.
Why this role, and why Compass?
A GitHub or portfolio, if you have one

We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analysing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Our final hiring decisions are ultimately made by humans. Tell us why you want one of our orange hoodies in your application; our company colours are definitely not purple. While we don't care if you use AI to help you, we do care deeply about effort beyond a quick LLM dump.
