[{"last_updated":1780520540,"legal":"API Terms of Service: Please link back (with follow, and without nofollow!) to the URL on Remote OK and mention Remote OK as a source, so we get traffic back from your site. If you do not we'll have to suspend API access.\n\nPlease don't use the Remote OK logo without written permission as it's a registered trademark, please DO use our name Remote OK though."},{"slug":"remote-senior-database-reliability-engineer-cloudlinux-1131613","id":"1131613","epoch":1778947203,"date":"2026-05-16T16:00:03+00:00","company":"Cloudlinux","company_logo":"","position":"Senior Database Reliability Engineer","tags":["senior","dba","postgres","reliability","engineer","devops","ansible","mongo","redis","grafana"],"description":"<p>CloudLinux \/ TuxCare is a remote-first infrastructure and security company. More than 300 engineers build and operate products used by hosting providers, enterprises, and internal service teams worldwide. Our Infrastructure Department runs the platforms behind CloudLinux OS, Imunify, KernelCare, TuxCare ELS, and our engineering systems.<\/p><p>We are hiring a Senior Database Reliability Engineer to join the Infrastructure DBA cell. This is a hands-on production ownership role, not a narrow ticket-processing DBA position. You will keep critical database services reliable, automate repeated work, support engineering teams, and reduce single-person dependency in our PostgreSQL, ClickHouse, MongoDB, and Redis operations.<\/p><p>PostgreSQL is the main requirement. ClickHouse experience is a strong plus, but it is not a day-one blocker. 
We need a senior engineer with enough database, Linux, automation, and incident-response depth to learn our ClickHouse environment quickly and operate it safely.<\/p><p><\/p><p>Your Responsibilities:<\/p><ul><li>Own production PostgreSQL reliability: HA design, Patroni, PgBouncer, replication, failover, upgrades, vacuum\/bloat control, query tuning, locks, indexes, capacity, backups, PITR, and restore validation.<\/li><li>Improve disaster recovery and operational evidence: tested restores, documented recovery paths, measurable RTO\/RPO targets, runbooks, and safe maintenance plans.<\/li><li>Support the wider database estate: ClickHouse, MongoDB, and Redis. You will troubleshoot incidents, review access and data-safety changes, improve monitoring, and learn the production ClickHouse patterns already in use.<\/li><li>Automate DBA workflows with Ansible, Terraform\/OpenTofu, GitLab CI\/CD, scripts, and reproducible runbooks for provisioning, grants, backups, restores, health checks, and ownership metadata.<\/li><li>Help build DBaaS-style self-service capabilities so engineering teams can request databases, access, credentials, and operational checks with less manual DBA intervention.<\/li><li>Improve observability and incident response through Grafana, metrics, logs, SLOs, alert rules, Opsgenie routing, and clear communication during production issues.<\/li><\/ul><p><\/p><p>What Success Looks Like:<\/p><ul><li>PostgreSQL clusters have tested backup and restore paths, useful dashboards, clear ownership, and documented failover procedures.<\/li><li>Repeated DBA tickets become automation or self-service workflows.<\/li><li>ClickHouse operational knowledge is no longer a single-person dependency.<\/li><li>Database incidents have owners, runbooks, evidence, and measurable recovery paths.<\/li><li>Product and engineering teams get database help faster without sacrificing safety, auditability, or reliability.<\/li><\/ul><p><\/p><p><\/p><p>Why CloudLinux?<\/p><ul><li>You will 
work on real production infrastructure used across CloudLinux and TuxCare products. <\/li><li>You will have a direct impact on reliability, incident response, developer experience, and operational resilience. <\/li><li>You will also work in an AI-assisted engineering culture where automation, documentation, Claude, Codex, and careful human verification are part of the daily operating model.<\/li><\/ul>\n<p>What We Expect From You:<\/p><ul><li>Deep hands-on PostgreSQL experience in business-critical production environments, typically 5+ years or equivalent depth.<\/li><li>Strong understanding of PostgreSQL internals and operations: MVCC, WAL, transactions, locks, indexes, query planning, replication, autovacuum, bloat, major upgrades, backups, PITR, and restore testing.<\/li><li>Proven experience with highly available databases and the ability to reason about quorum, split-brain risk, failover, rollback, and recovery.<\/li><li>Strong Linux and infrastructure fundamentals: systemd, networking, storage, filesystems, CPU\/memory\/disk bottlenecks, TLS, DNS, firewalls, and root-cause troubleshooting.<\/li><li>Automation skills with Ansible and scripting. Terraform\/OpenTofu, GitLab CI\/CD, and merge-request based delivery are strong advantages.<\/li><li>Ability to support more than one database engine. You do not need to be a ClickHouse expert on day one, but you must be ready to learn it quickly and take responsibility for it.<\/li><li>Practical use of AI engineering assistants such as Claude and Codex. 
We expect you to use them to improve speed and quality, while personally verifying generated SQL, commands, scripts, and operational conclusions.<\/li><li>Clear written English for asynchronous work in Jira, Slack, GitLab, Slite, and runbooks.<\/li><\/ul><p><\/p><p>Nice to Have:<\/p><ul><li>ClickHouse operations: replication, Keeper\/ZooKeeper, MergeTree engines, distributed DDL, grants, row policies, backups, query troubleshooting, and cluster recovery.<\/li><li>MongoDB replica sets and Percona Backup for MongoDB.<\/li><li>Redis\/Sentinel and broker\/cache failure modes.<\/li><li>Database observability, SLOs, golden signals, alert tuning, and executable incident runbooks.<\/li><li>Building internal platforms, self-service portals, or DBaaS workflows for engineering teams.<\/li><\/ul>\n<p>What's in it for you?<\/p><ul><li>A focus on professional development.<\/li><li>Interesting and challenging projects.<\/li><li>Fully remote work with flexible working hours, which allows you to schedule your day and work from any location worldwide.<\/li><li>Paid 24 days of vacation per year, 10 days of national holidays, and unlimited sick leaves.<\/li><li>Compensation for private medical insurance.<\/li><li>Co-working and gym\/sports reimbursement.<\/li><li>Budget for education.<\/li><li>The opportunity to receive a reward for the most innovative idea that the company can patent.<\/li><\/ul><p><\/p><p>By applying for this position, you agree with\u00c2\u00a0CloudLinux Privacy Policy\u00c2\u00a0and give us your consent to maintain and process your personal data with this respect. Please read our Privacy Policy for more information.<\/p><br\/><br\/>Please mention the word **UNDERSTANDABLE** and tag ROjox when applying to show you read the job post completely (#ROjox). This is a beta feature to avoid spam applicants. 
Companies can search these words to find applicants that read this and see they're human.","location":"","apply_url":"https:\/\/remoteOK.com\/remote-jobs\/remote-senior-database-reliability-engineer-cloudlinux-1131613","salary_min":0,"salary_max":0,"logo":"","url":"https:\/\/remoteOK.com\/remote-jobs\/remote-senior-database-reliability-engineer-cloudlinux-1131613"},{"slug":"remote-senior-site-reliability-engineer-pave-bank-1131263","id":"1131263","epoch":1776844817,"date":"2026-04-22T08:00:17+00:00","company":"Pave Bank","company_logo":"","position":"Senior Site Reliability Engineer","tags":["senior","engineer","reliability","devops","gcp","kubernetes","docker","python","go","grafana"],"description":"<p style=\"min-height:1.5em\"><strong>The Role<\/strong><\/p><p style=\"min-height:1.5em\">Pave Bank is building the future of programmable banking \u2014 combining traditional banking with digital assets under a single, regulated platform. We\u2019re looking for a <strong>Site Reliability Engineer (SRE)<\/strong> to ensure our core systems are highly available, scalable, and performant as we grow.<\/p><p style=\"min-height:1.5em\">As an SRE at Pave Bank, you\u2019ll work closely with Engineering, Product, Security and Operations teams to build robust infrastructure, automate operations, and maintain reliability across all services. 
Your work will directly impact the safety, performance, and scalability of our banking platform, helping our customers trust Pave Bank with their finances.<\/p><p style=\"min-height:1.5em\"><\/p><p style=\"min-height:1.5em\"><strong>What You\u00e2\u0080\u0099ll Be Doing<\/strong><\/p><ul style=\"min-height:1.5em\"><li><p style=\"min-height:1.5em\">Monitor, maintain, and improve the reliability, availability, and performance of production systems and services.<\/p><\/li><li><p style=\"min-height:1.5em\">Build and maintain infrastructure as code (IaC), deployment pipelines, and automation to support continuous delivery, scalability, and disaster recovery.<\/p><\/li><li><p style=\"min-height:1.5em\">Respond to incidents, perform root-cause analysis, and drive postmortems to ensure lessons learned are applied.<\/p><\/li><li><p style=\"min-height:1.5em\">Implement and enforce operational best practices: observability, logging, metrics, alerting, capacity planning, failover strategies, and backups.<\/p><\/li><li><p style=\"min-height:1.5em\">Collaborate with Engineering, Product, Compliance, and Operations teams to ensure infrastructure meets reliability, compliance, and security standards.<\/p><\/li><li><p style=\"min-height:1.5em\">Support service scaling, database operations, cloud infrastructure (GCP preferred), networking, and microservices orchestration.<\/p><\/li><li><p style=\"min-height:1.5em\">Document operational runbooks, on-call procedures, and system architecture to support maintenance, knowledge sharing, and compliance.<\/p><\/li><\/ul><p style=\"min-height:1.5em\"><\/p><p style=\"min-height:1.5em\"><strong>What You\u00e2\u0080\u0099ll Bring<\/strong><\/p><p style=\"min-height:1.5em\"><strong>Technical Skills and Experience<\/strong><\/p><ul style=\"min-height:1.5em\"><li><p style=\"min-height:1.5em\">Strong programming or scripting skills (Go, Python, Bash, or similar) for automation, tooling, and operational tasks.<\/p><\/li><li><p 
style=\"min-height:1.5em\">Hands-on experience with cloud infrastructure, ideally Google Cloud Platform (GCP).<\/p><\/li><li><p style=\"min-height:1.5em\">Familiarity with containerization and orchestration (Docker, Kubernetes, or equivalent).<\/p><\/li><li><p style=\"min-height:1.5em\">Experience with infrastructure-as-code tools (Terraform, Cloud Deployment Manager, or similar).<\/p><\/li><li><p style=\"min-height:1.5em\">Experience with either FluxCD or ArgoCD for GitOps-based delivery.<\/p><\/li><li><p style=\"min-height:1.5em\">Solid understanding of distributed systems, microservices architecture, and reliability patterns.<\/p><\/li><li><p style=\"min-height:1.5em\">Experience setting up monitoring, logging, alerting, and observability (e.g., Prometheus, Grafana, ELK, distributed tracing).<\/p><\/li><li><p style=\"min-height:1.5em\">Strong troubleshooting skills and ability to respond to incidents under pressure.<\/p><\/li><li><p style=\"min-height:1.5em\">Knowledge of backup and disaster recovery strategies, database management, and secure operations.<\/p><\/li><\/ul><p style=\"min-height:1.5em\"><\/p><p style=\"min-height:1.5em\"><strong>Other Skills<\/strong><\/p><ul style=\"min-height:1.5em\"><li><p style=\"min-height:1.5em\">Ownership mindset: proactive, responsible, and committed to system reliability.<\/p><\/li><li><p style=\"min-height:1.5em\">Strong communication skills \u00e2\u0080\u0094 able to coordinate across technical and non-technical stakeholders.<\/p><\/li><li><p style=\"min-height:1.5em\">Comfortable working in a fast-paced, early-stage startup environment.<\/p><\/li><li><p style=\"min-height:1.5em\">High integrity, attention to detail, and passion for fintech and programmable banking systems.<\/p><\/li><\/ul><p style=\"min-height:1.5em\"><\/p><p style=\"min-height:1.5em\"><strong>Nice to Have<\/strong><\/p><ul style=\"min-height:1.5em\"><li><p style=\"min-height:1.5em\">Prior experience in fintech, banking, or other highly regulated 
industries.<\/p><\/li><li><p style=\"min-height:1.5em\">Familiarity with compliance, security, and data protection best practices.<\/p><\/li><li><p style=\"min-height:1.5em\">Experience with high-availability, high-throughput systems, or financial infrastructure.<\/p><\/li><li><p style=\"min-height:1.5em\">Exposure to blockchain or crypto systems integrated with banking.<\/p><\/li><li><p style=\"min-height:1.5em\">Experience optimizing cloud infrastructure for cost and performance under rapid growth.<\/p><\/li><\/ul><p style=\"min-height:1.5em\"><\/p><p style=\"min-height:1.5em\"><strong>Why Pave Bank?<\/strong><\/p><ul style=\"min-height:1.5em\"><li><p style=\"min-height:1.5em\">Work alongside a founding team from Monzo and BigPay, bringing top-tier fintech expertise.<\/p><\/li><li><p style=\"min-height:1.5em\">Tackle real-world reliability challenges in a regulated, fast-growing fintech environment.<\/p><\/li><li><p style=\"min-height:1.5em\">Learn from and collaborate with experienced engineers while developing your SRE career.<\/p><\/li><li><p style=\"min-height:1.5em\">Competitive salary and meaningful equity with room for growth.<\/p><\/li><li><p style=\"min-height:1.5em\">Be part of a well-funded startup shaping the future of programmable banking.<\/p><\/li><\/ul><br\/><br\/>Please mention the word **CONGRATULATIONS** and tag ROjox when applying to show you read the job post completely (#ROjox). This is a beta feature to avoid spam applicants. 
Companies can search these words to find applicants that read this and see they're human.","location":"Kuala Lumpur","apply_url":"https:\/\/remoteOK.com\/remote-jobs\/remote-senior-site-reliability-engineer-pave-bank-1131263","salary_min":0,"salary_max":0,"logo":"","url":"https:\/\/remoteOK.com\/remote-jobs\/remote-senior-site-reliability-engineer-pave-bank-1131263"},{"slug":"remote-senior-backend-engineer-grafana-k6-uk-1130569","id":"1130569","epoch":1772254812,"date":"2026-02-28T05:00:12+00:00","company":"","company_logo":"","position":"Senior Backend Engineer Grafana k6 UK","tags":["senior","backend","engineer","grafana"],"description":"Grafana Labs is a remote-first, open-source powerhouse. There are more than 20M users of Grafana, the open source visualization tool, around the globe, monitoring everything from beehives to climate change in the Alps. The instantly recognizable dashboards have been spotted everywhere from a NASA launch and Minecraft HQ to Wimbledon and the Tour de France. Grafana Labs also helps more than 3,000 companies -- including Bloomberg, JPMorgan Chase, and eBay -- manage their observability strategies with the Grafana LGTM Stack, which can be run fully managed with Grafana Cloud or self-managed with the Grafana Enterprise Stack, both featuring scalable metrics (Grafana Mimir), logs (Grafana Loki), and traces (Grafana Tempo).\n\nWe're scaling fast and staying true to what makes us different: an open-source legacy, a global collaborative culture, and a passion for meaningful work. Our team thrives in an innovation-driven environment where transparency, autonomy, and trust fuel everything we do.\n\nYou may not meet every requirement, and that's okay. 
If this role excites you, we'd love you to raise your hand for what could be a truly career-defining opportunity.\n\n&nbsp;\n\nThis is a remote opportunity and we would be interested in applicants in the UK.\n\nSenior Backend Engineer - Grafana k6&nbsp;\n\nThe Opportunity:&nbsp;\n\nWe are the team behind Grafana k6, Grafana Cloud k6, and...<br\/><br\/>Please mention the word **DIVINE** and tag ROjox when applying to show you read the job post completely (#ROjox). This is a beta feature to avoid spam applicants. Companies can search these words to find applicants that read this and see they're human.","location":"United Kingdom","apply_url":"https:\/\/remoteOK.com\/remote-jobs\/remote-senior-backend-engineer-grafana-k6-uk-1130569","salary_min":0,"salary_max":0,"logo":"","url":"https:\/\/remoteOK.com\/remote-jobs\/remote-senior-backend-engineer-grafana-k6-uk-1130569"},{"slug":"remote-senior-site-reliability-engineer-senior-manager-accenture-federal-services-1130102","id":"1130102","epoch":1770328808,"date":"2026-02-05T22:00:08+00:00","company":"Accenture Federal Services","company_logo":"","position":"Senior Site Reliability Engineer Senior Manager","tags":["senior","reliability","engineer","manager","devops","grafana"],"description":"At Accenture Federal Services, nothing matters more than helping the US federal government make the nation stronger and safer and life better for people.\u00e2\u0080\u00afOur 13,000+ people are united in a shared purpose to pursue the limitless potential of technology and ingenuity for clients across defense, national security, public safety, civilian, and military health organizations.\n\nJoin Accenture Federal Services, a technology company and part of global Accenture, to do work that matters in a collaborative and caring community, where you feel like you belong and are empowered to grow, learn and thrive through hands-on experience, certifications, industry training and more.\n\nJoin us to drive positive, lasting change that moves 
missions and the government forward!\n\nYou Are:\n\nWe are seeking a\u00c2\u00a0Senior Site Reliability Engineer (SRE) with deep expertise in building and maintaining reliable, scalable systems and a passion for optimizing the performance, reliability, and efficiency of technical infrastructure. The ideal candidate will have a strong background in site reliability engineering principles, extensive experience with automation, and a proven ability to collaborate across teams to ensure seamless service delivery.\n\nThe Work:\n\n\u00e2\u0080\u00a2 Design, build, and maintain reliable, scalable, and high-performance infrastructure and services to support business needs.\n\u00e2\u0080\u00a2 Implement and advocate for SRE best practices, including automation, CI\/CD pipelines, monitoring, and incident management.\n\u00e2\u0080\u00a2 Collaborate with cross-functional teams to develop systems that meet high availability, performance, and reliability standards.\n\u00e2\u0080\u00a2 Drive incident management processes, including root cause analysis, mitigation strategies, and long-term preventive measures.\n\u00e2\u0080\u00a2 Establish, monitor, and refine service level objectives (SLOs), service level agreements (SLAs), and key performance indicators (KPIs) to ensure systems adhere to reliability and performance targets.\n\u00e2\u0080\u00a2 Automate repetitive tasks to improve operational efficiency and reduce manual intervention.\n\u00e2\u0080\u00a2 Build and maintain robust monitoring, logging, and alerting systems to ensure visibility into system performance and reliability.\n\u00e2\u0080\u00a2 Provide technical mentorship and guidance to team members, fostering a culture of knowledge sharing and continuous improvement.\n\u00e2\u0080\u00a2 Act as a technical leader by driving solutions to complex challenges, ensuring alignment with organizational goals.\n\u00e2\u0080\u00a2 Prepare and deliver performance and reliability reports to stakeholders, offering insights and 
recommendations for improvements.\n\nHere's What You Need:\n\n\u2022 Proven experience in site reliability engineering or a similar role, with a focus on application and infrastructure scalability, reliability, and performance.\n\u2022 Strong knowledge of ITSM principles and incident management processes.\n\u2022 Expertise in automation tools, scripting, and infrastructure-as-code (IaC) technologies.\n\u2022 Proficiency with monitoring and observability tools (e.g., Prometheus, Grafana, Datadog, Splunk).\n\u2022 Experience with cloud platforms (e.g., AWS, Azure, GCP) and container technologies (e.g., Docker, Kubernetes).\n\u2022 Strong analytical and problem-solving skills, with the ability to troubleshoot complex systems.\n\u2022 Excellent communication and collaboration abilities, with a focus on cross-team partnerships.\n\u2022 A passion for continuous learning, innovation, and driving imp<br\/><br\/>Please mention the word **APPRECIATES** and tag ROjox when applying to show you read the job post completely (#ROjox). This is a beta feature to avoid spam applicants. 
Companies can search these words to find applicants that read this and see they're human.","location":"Washington, DC","apply_url":"https:\/\/remoteOK.com\/remote-jobs\/remote-senior-site-reliability-engineer-senior-manager-accenture-federal-services-1130102","salary_min":0,"salary_max":0,"logo":"","url":"https:\/\/remoteOK.com\/remote-jobs\/remote-senior-site-reliability-engineer-senior-manager-accenture-federal-services-1130102"},{"slug":"remote-senior-site-reliability-engineer-cloudbeds-1129921","id":"1129921","epoch":1769911215,"date":"2026-02-01T02:00:15+00:00","company":"Cloudbeds","company_logo":"","position":"Senior Site Reliability Engineer","tags":["senior","engineer","reliability","aws","kubernetes","devops","grafana","cloud"],"description":"<p><strong><span style=\"color: #1d81bb;\">What Makes Us Unique&nbsp;<\/span><\/strong><\/p>\n<p>At Cloudbeds, we're not just building software, we\u2019re transforming hospitality. Our intelligently designed platform powers properties across 150 countries, processing billions in bookings annually. From independent properties to hotel groups, we help hoteliers transform operations and uplevel their commercial strategy through a unified platform that integrates with hundreds of partners. And we do it with a completely remote team. Imagine working alongside global innovators to build AI-powered solutions that solve hoteliers' biggest challenges. Since our founding in 2012, we've become the World's Best Hotel PMS Solutions Provider and landed on Deloitte's Technology Fast 500 again in 2024 \u2013 but we're just getting started.&nbsp;<\/p>\n<p>&nbsp;<\/p><p>&nbsp;<\/p>\n<p>As a Sr. Site Reliability Engineer, you'll be the guardian of our platform's reliability and performance, ensuring millions of hospitality transactions flow seamlessly across the globe. 
You'll architect and implement scalable AWS cloud solutions that keep the most ambitious hotels running 24\/7, while fostering a culture of automation, resilience, and continuous improvement across our engineering teams.<\/p>\n<p><strong><span style=\"color: rgb(29, 129, 187);\">Our SRE Team:<\/span><\/strong><\/p>\n<p>We're a bottom-up, collaborative team that thrives on healthy debate and shared ownership of our infrastructure. You'll have endless opportunities to influence architecture decisions while working with cutting-edge cloud technologies at scale. We believe the best solutions come from engineers who are empowered to innovate, experiment, and challenge the status quo.<\/p>\n<p><span style=\"color: rgb(29, 129, 187);\"><strong>What You Bring to the Team:<\/strong><\/span><\/p>\n<ul>\n<li>Design and implement reliable and scalable AWS architecture to meet the needs of the organization.<\/li>\n<li>Maintain and support highly loaded Kubernetes (EKS) clusters and infrastructure-related components.<\/li>\n<li>Support the CI\/CD process with ArgoCD and GitOps.<\/li>\n<li>Automate the platform deployments with Terraform infrastructure-as-code.<\/li>\n<li>Develop and continuously improve product observability and monitoring systems based on Grafana, Prometheus, Datadog, and CloudWatch.<\/li>\n<li>Respond to and participate in Incident Management and Root Cause Analysis, ensuring minimal impact on services.<\/li>\n<li>Optimize system performance and troubleshoot issues as they arise.<\/li>\n<li><br\/><br\/>Please mention the word **INSIGHTFUL** and tag ROjox when applying to show you read the job post completely (#ROjox). This is a beta feature to avoid spam applicants. 
Companies can search these words to find applicants that read this and see they're human.","location":"Argentina","apply_url":"https:\/\/remoteOK.com\/remote-jobs\/remote-senior-site-reliability-engineer-cloudbeds-1129921","salary_min":0,"salary_max":0,"logo":"","url":"https:\/\/remoteOK.com\/remote-jobs\/remote-senior-site-reliability-engineer-cloudbeds-1129921"},{"slug":"remote-site-reliability-engineer-ii-restaurant365-1129863","id":"1129863","epoch":1769691610,"date":"2026-01-29T13:00:10+00:00","company":"Restaurant365","company_logo":"","position":"Site Reliability Engineer II","tags":["reliability","engineer","devops","cloud","aws","gcp","kubernetes","ansible","python","grafana"],"description":"<p>Restaurant365 is a SaaS company disrupting the restaurant industry! Our cloud-based platform provides a&nbsp;unique, centralized solution for accounting and back-office operations for restaurants. Restaurant365\u2019s culture is focused on empowering team members to produce top-notch results while elevating their skills. We\u2019re constantly evolving and improving to make sure we are and always will be \u201cBest in Class\u201d ... and we want that for you too!<\/p><p><br><\/p><p><span style=\"font-size: 10pt;\">The&nbsp;<\/span><b style=\"font-size: 10pt;\">Site Reliability Engineer II<\/b><span style=\"font-size: 10pt;\">&nbsp;will&nbsp;be responsible for&nbsp;supporting, enhancing, and maintaining Restaurant365\u2019s cloud infrastructure and applications. Qualified candidates will&nbsp;demonstrate&nbsp;growing&nbsp;expertise&nbsp;in site reliability practices, with skills in incident response, system monitoring, automation, and performance troubleshooting. 
You will collaborate with DevOps, development, and infrastructure teams to resolve moderately complex issues, propose improvements, and strengthen the reliability, scalability, and security of our SaaS platform.&nbsp;<\/span><\/p>\\n<p><\/p><p><br><\/p><b>How you'll add value: <\/b><ul><li><b>Execution &amp; Collaboration<\/b>&nbsp;<\/li><li>Respond to production incidents, perform triage and troubleshooting, and contribute to post-incident analysis.&nbsp;<\/li><li>Identify&nbsp;and automate manual processes to improve efficiency and reduce risk.&nbsp;<\/li><li>Enhance and evolve monitoring tools and platforms to improve observability.&nbsp;<\/li><li>Promote and apply best practices for reliability, scalability, and performance across engineering.&nbsp;<\/li><li>Implement  and support cloud automation using Terraform, Ansible, or CloudFormation.&nbsp;<\/li><li>Work within change management protocols to provide&nbsp;maximum&nbsp;uptime for production systems.&nbsp;<\/li><li>Participate in on-call rotation, providing 24x7 support for incidents and contributing to root cause analysis.&nbsp;<\/li><li>Partner with developers, architects, vendors, and IT teams to ensure reliable system operations.&nbsp;<\/li><li>Research and remediate vulnerabilities in coordination with security teams.&nbsp;<\/li><li>Maintain documentation of infrastructure, monitoring, runbooks, and incident response procedures.&nbsp;<\/li><\/ul><div><br><\/div><ul><li><b>Standards &amp; Process<\/b>&nbsp;<\/li><li>Apply company policies and procedures when handling operational tasks and incidents.&nbsp;<\/li><li>Suggest and implement improvements to operational processes and monitoring practices.&nbsp;<\/li><li>Contribute to technical diagrams, documentation, and runbooks for system reliability.&nbsp;<\/li><\/ul><div><br><\/div><ul><li><b>Learning &amp; Growth<\/b>&nbsp;<\/li><li>Expand&nbsp;expertise&nbsp;in cloud services (Azure, AWS, or GCP) and container platforms (EKS, ECS, 
AKS).&nbsp;<\/li><li>Build&nbsp;proficiency&nbsp;with observability and monitoring tools (Prometheus, Grafana, ELK, Site24x7, Nagios).&nbsp;<\/li><li>Develop scripting and automation skills using Python, Bash, PowerShell, or similar.&nbsp;<\/li><li>Participate in planning discussions by contributing technical input on system stability and reliability.&nbsp;<\/li><\/ul><p><br><\/p><b>What you'll need to be successful in this role: <\/b><ul><li>BS in Computer Science, Information Systems, or related field (or equivalent experience).&nbsp;<\/li><li>2\u00e2\u0080\u00934  years of experience in site reliability engineering, DevOps, or cloud operations.&nbsp;<\/li><li>Experience with cloud platforms (Azure or AWS), including services such as AKS, ECS\/EKS, Functions\/Lambda, S3, and Blob storage.&nbsp;<\/li><li>Proficiency&nbsp;with infrastructure-as-code and automation (Terraform, Ansible, YAML, Python, Bash, PowerShell).&nbsp;<\/li><li>Strong Linux engineering skills; working knowledge of Windows administration.&nbsp;<\/li><li>Experience supporting production environments and&nbsp;participating&nbsp;in on-call rotations.&nbsp;<\/li><li>Familiarity with web servers and middleware (Nginx, Apache Tomcat).&nbsp;<\/li><li>Experience with CI\/CD tools (GitLab, Git, or similar).&nbsp;<\/li><li>Strong written, oral, and interpersonal communication skills.&nbsp;<\/li><\/ul><div><b>Preferred Qualifications<\/b>&nbsp;<\/div><ul><li>Experience with monitoring tools (Prometheus, Grafana, ELK, Site24x7, Nagios).&nbsp;<\/li><li>Knowledge of performance analysis and system vulnerability remediation.&nbsp;<\/li><li>Cloud certification (AWS or Azure)&nbsp;preferred.&nbsp;<\/li><li>Familiarity  with restaurant industry SaaS platforms and customer-facing applications.&nbsp;<\/li><\/ul><p><br><\/p><b>R365 Team Member Benefits &amp; Compensation<\/b><ul><li>This position has a salary range of $98,583-$138,016 annually. The above range represents the expected salary range for this position. 
The actual salary may vary based upon several factors, including, but not limited to, relevant skills\/experience, time in the role, business line, and geographic location. Restaurant365 focuses on equitable pay for our team and aims for transparency with our pay practices. <\/li><li>Comprehensive medical benefits, 100% paid for employee<\/li><li>401k + matching<\/li><li>Equity Option Grant<\/li><li>Unlimited PTO + Company holidays<\/li><li>Wellness initiatives<\/li><\/ul><div><br><\/div><div>#BI-Remote<\/div><p><br><\/p><p><\/p>\n<p><span style=\"font-size: 14.6667px;\">DYN365, Inc d\/b\/a Restaurant365 is an equal opportunity employer.<\/span><\/p><br\/><br\/>Please mention the word **SERENITY** and tag ROjox when applying to show you read the job post completely (#ROjox). This is a beta feature to avoid spam applicants. Companies can search these words to find applicants that read this and see they're human.","location":"Remote","apply_url":"https:\/\/remoteOK.com\/remote-jobs\/remote-site-reliability-engineer-ii-restaurant365-1129863","salary_min":0,"salary_max":0,"logo":"","url":"https:\/\/remoteOK.com\/remote-jobs\/remote-site-reliability-engineer-ii-restaurant365-1129863"},{"slug":"remote-support-engineer-blue-cube-services-1129773","id":"1129773","epoch":1769443223,"date":"2026-01-26T16:00:23+00:00","company":"Blue Cube Services","company_logo":"","position":"Support Engineer","tags":["support","engineer","blockchain","sql","grafana","ops"],"description":"We\u2019re expanding our Engineering Operations team and seeking a diligent Support Engineer to help us scale effectively. In this role, you\u2019ll tackle production issues, address technical queries from customers, and ensure the blockchain application is performing optimally through careful analysis and maintenance.\n\nYour contributions will directly impact our ability to provide a seamless user experience and sustain our rapid growth. 
If you\u2019re proactive, analytical, and have a solid technical foundation, join us in making a difference in the recruiting software landscape.\n\n### Responsibilities\n- Investigate and help resolve customer issues\n- Troubleshoot technical issues or questions reported by customers\n- Perform root cause analysis for production errors and recommend improvements\n- Develop scripts to automatically verify end-to-end operation of integrations\n- Implement and execute data imports\/exports for customers\n- Maintain and perform operations related to third-party integrations\n\n### Requirements and skills\n- At least one year of experience in software development, technical support, or quality assurance\n- Diligence, a focus on quality, and analytical skills\n- Proactive in contributing to organizational success\n- Excellent communication skills and team collaboration\n- Working knowledge of databases and SQL\n- Experience with observability tools such as Grafana and Datadog\n- Degree in Computer Science or relevant engineering field\n- Willingness to learn Kotlin\n\n### Compensation and Perks\n- Competitive compensation, in cryptocurrency.\n- The opportunity to progress for high performers.\n- Remote working<br\/><br\/>Please mention the word **MARVELLOUS** and tag ROjox when applying to show you read the job post completely (#ROjox). This is a beta feature to avoid spam applicants. 
Companies can search these words to find applicants that read this and see they're human.","location":"Philippines","apply_url":"https:\/\/remoteOK.com\/remote-jobs\/remote-support-engineer-blue-cube-services-1129773","salary_min":0,"salary_max":0,"logo":"","url":"https:\/\/remoteOK.com\/remote-jobs\/remote-support-engineer-blue-cube-services-1129773"},{"slug":"remote-senior-web-scraping-engineer-python-oxylabs-1129243","id":"1129243","epoch":1765900801,"date":"2025-12-16T16:00:01+00:00","company":"Oxylabs ","company_logo":"","position":"Senior Web Scraping Engineer Python","tags":["python","senior","engineer","backend","git","docker","kubernetes","redis","elasticsearch","grafana"],"description":"<p><span style=\"font-size: 11pt;\">We\u2019re a team of 500+ professionals who develop cutting-edge proxy and web data scraping solutions for thousands of the world\u2019s best known businesses, including Fortune 500 companies.&nbsp;<\/span><\/p><p><br><\/p><p><b><span style=\"font-size: 11pt;\">What\u2019s in store for you:<\/span><\/b><\/p><p><span style=\"font-size: 11pt;\">You\u2019ll be solving complex challenges and maintaining our own infrastructure with 60PB+ monthly data traffic. Here are its scale and maturity in numbers:<\/span><\/p><p><br><\/p><p>\t- 6PB+ Ceph storage<\/p><p>\t- 60PB+ monthly data traffic through our systems<\/p><p>\t- 300k+ service requests\/sec processed<\/p><p>\t- 500k+ Kafka messages\/sec streamed<\/p><p><br><\/p><p><span style=\"font-size: 11pt;\">A word from the team:<\/span><\/p><p><br><\/p><p><span style=\"font-size: 11pt;\">We run one of the most advanced and largest scraping and parsing products in the world. We serve thousands of requests per second with a very high success rate. Our scrapers and parsers are used by leading e-commerce, market intelligence, and AI industry players making the work challenging and truly global. 
The team is a blend of different interesting personalities from different walks of life and nationalities. Here you can find people who are experts in gaming, playing guitar, riding bicycles, and other areas. We, as a team, will support you in learning how to build your own scrapers and will share all the tips, tricks and hacks we know to ensure that you are onboard in no time.<\/span><\/p>\\n<p><\/p><p><br><\/p><b>Your day-to-day:<\/b><ul><li>Develop scalable scrapers.<\/li><li>Define resilient scraping strategies, unblock websites for scraping.<\/li><li>Improve observability in the system.<\/li><li>Develop back-end solutions for scraping &amp; parsing problems of various magnitudes.<\/li><li>Maintain the current system and develop new features related to scraping &amp; parsing.<\/li><\/ul><p><br><\/p><b>Your skills &amp; experience:<\/b><ul><li>Experience working with Python.<\/li><li>Understanding of computer science, including data structures, algorithms, computability and complexity.<\/li><li>Version Control skills using Git.<\/li><li>Knowledge on how to unblock websites for scraping.<\/li><li>Is able to use different scraping techniques &amp; open-source tools to build scrapers.<\/li><li>Is comfortable with using Dev Tools.<\/li><li>Network (TLS\/SSL) knowledge.<\/li><li>Worked with browser automations.<\/li><li>Knows their way around asynchronous programming.<\/li><\/ul><div><br><\/div><div><b>Nice to have<\/b>:<\/div><div><br><\/div><ul><li>Web development knowledge.<\/li><li>Knows how to use CSS Selectors \/ XPaths for parsing.<\/li><li>Experience working with Go &amp; C++.<\/li><li>Worked on browser source code.<\/li><li>Knowledge of any front-end framework.<\/li><li>Experience working with Pydantic, FastAPI, SQLAlchemy.<\/li><li>Has experience working with Redis, MySQL, Docker, Kubernetes, Elasticsearch, Kibana and monitoring tools like Grafana, Prometheus.<\/li><li>Experience with machine learning that is scraping domain-specific.<\/li><li>Has 
experience in building scalable systems.<\/li><\/ul><div><br><\/div><div><br><\/div><p><br><\/p><b>Salary:<\/b><ul><li>Gross salary: from 5600 EUR\/month. Keep in mind that we are open to discussing a different salary based on your skills and experience.<\/li><\/ul><div><br><\/div><p><br><\/p><p><\/p>\\n<p><b>Up for the challenge? Let\u2019s talk! <\/b><\/p><br\/><br\/>Please mention the word **PROUD** and tag ROjox when applying to show you read the job post completely (#ROjox). This is a beta feature to avoid spam applicants. Companies can search these words to find applicants that read this and see they're human.","location":"Poland","apply_url":"https:\/\/remoteOK.com\/remote-jobs\/remote-senior-web-scraping-engineer-python-oxylabs-1129243","salary_min":0,"salary_max":0,"logo":"","url":"https:\/\/remoteOK.com\/remote-jobs\/remote-senior-web-scraping-engineer-python-oxylabs-1129243"},{"slug":"remote-senior-infrastructure-engineer-core-systems-quicknode-1129232","id":"1129232","epoch":1765825201,"date":"2025-12-15T19:00:01+00:00","company":"Quicknode","company_logo":"","position":"Senior Infrastructure Engineer Core Systems","tags":["senior","engineer","devops","cloud","linux","ansible","grafana","python","go","blockchain"],"description":"<p style=\"min-height:1.5em\">Quicknode is a cloud-based infrastructure company that powers the blockchain ecosystem.<\/p><p style=\"min-height:1.5em\">Our mission is to be the indispensable utility that empowers companies and innovators globally to build next-generation, Web3 enabled businesses &amp; applications using blockchain technology. Quicknode is backed by some of the world's best investors including Tiger Global, Y Combinator, SoftBank, and the Seven Seven Six Fund. 
The Quicknode team has over 120 people maintaining high performance global data infrastructure for amazing customers serving billions of requests daily.<\/p><p style=\"min-height:1.5em\">We are a global remote company with an HQ in Miami, Florida.<\/p><h2><strong>The Role<\/strong><\/h2><p style=\"min-height:1.5em\">As a Senior Infrastructure Engineer at QuickNode, you\u2019ll play a pivotal role in architecting, developing, and implementing our next-generation infrastructure platforms across both private and public clouds. This role focuses on building robust, scalable systems that go beyond traditional cloud deployments\u2014developing hybrid environments that integrate bare-metal, virtualization, and orchestration technologies. You\u2019ll drive automation, standardization, and operational excellence across QuickNode\u2019s infrastructure ecosystem, ensuring performance, reliability, and scalability as we expand our Web3 infrastructure footprint globally.<\/p><p style=\"min-height:1.5em\"><\/p><h2><strong>What You\u2019ll Do<\/strong><\/h2><ul style=\"min-height:1.5em\"><li><p style=\"min-height:1.5em\">Research, architect, and deploy complex infrastructure systems across bare-metal servers, hypervisors, orchestrators, virtual machines, and containerized environments.<\/p><\/li><li><p style=\"min-height:1.5em\">Design and implement automation using Infrastructure-as-Code and Configuration Management principles (Terraform, Ansible, Consul) to ensure reproducibility, speed, and consistency in infrastructure deployment.<\/p><\/li><li><p style=\"min-height:1.5em\">Establish and maintain infrastructure standards, documentation, and architecture diagrams to support scale, reliability, and compliance across environments.<\/p><\/li><li><p style=\"min-height:1.5em\">Partner with Technical Operations, CloudOps, and Platform teams to ensure smooth integration of systems, improve deployment efficiency, and enhance 
observability.<\/p><\/li><li><p style=\"min-height:1.5em\">Optimize resource allocation and infrastructure cost efficiency while maintaining performance and uptime goals.<\/p><\/li><li><p style=\"min-height:1.5em\">Continuously improve resilience and scalability through proactive capacity planning, disaster recovery testing, and infrastructure modernization initiatives.<\/p><\/li><\/ul><h2><strong>What You\u2019ll Bring<\/strong><\/h2><ul style=\"min-height:1.5em\"><li><p style=\"min-height:1.5em\">Minimum 5 years of experience in Systems Administration, Datacenter Operations, or Infrastructure Engineering, with deep expertise in Linux\/Unix systems<\/p><\/li><li><p style=\"min-height:1.5em\">Proven success designing and managing hybrid cloud and traditional infrastructure platforms at scale (bare-metal, virtualized, and cloud-native).<\/p><\/li><li><p style=\"min-height:1.5em\">Hands-on experience with automation, configuration management, and CI\/CD tools such as Terraform, Ansible, Consul, and Jenkins.<\/p><\/li><li><p style=\"min-height:1.5em\">Proficiency with observability and monitoring platforms (Grafana, ELK, VictoriaMetrics, DataDog).<\/p><\/li><li><p style=\"min-height:1.5em\">Programming experience in one or more languages such as Python, Go, or JavaScript.<\/p><\/li><li><p style=\"min-height:1.5em\">Strong understanding of infrastructure architecture principles \u2014 scalability, redundancy, fault tolerance, and cost optimization.<\/p><\/li><li><p style=\"min-height:1.5em\">Familiarity with containerization (Docker, Kubernetes) and networking fundamentals across cloud and datacenter environments.<\/p><\/li><li><p style=\"min-height:1.5em\">A proactive, analytical mindset and the ability to deliver under pressure while fostering a culture of continuous improvement and reliability.<\/p><\/li><li><p style=\"min-height:1.5em\">Excellent communication and documentation skills to facilitate alignment across cross-functional 
engineering teams.<\/p><\/li><\/ul><p style=\"min-height:1.5em\">International ranges, in local currency, will be discussed during the hiring process with applicable candidates. This role is eligible for a quarterly bonus tied to company and individual goal achievement. We consider years of experience, level of proficiency in job function, the technical competencies required and location when determining base salary ranges for positions and levels.<\/p><p style=\"min-height:1.5em\">The Quicknode compensation philosophy includes pillars to ensure fair and unbiased compensation for all employees. To design and deliver total reward offerings that are employee-centric. To offer a competitive benefit package in all locations where we operate. To prioritize attracting and retaining the best talent globally. To maintain a high-performing and flexible way of working.<\/p><p style=\"min-height:1.5em\">During the hiring process, we are committed to discussing compensation openly and honestly. We encourage candidates to share their salary expectations and requirements early, allowing for an individualized discussion. We know that our total rewards practices impact the lives and wellbeing of our employees. Therefore, we will never stop learning about the market, our business, your needs, and how best to achieve our goals through thoughtful and data-driven practices. 
If you have any questions or require further information about the compensation for this position, please don't hesitate to reach out to your Recruiter.\u00a0<\/p><p style=\"min-height:1.5em\">We at Quicknode are an equal opportunity employer and all qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity or expression, pregnancy, age, national origin, disability status, genetic information, protected veteran status, or any other characteristic protected by law.<\/p><br\/><br\/>Please mention the word **INTELLIGENT** and tag ROjox when applying to show you read the job post completely (#ROjox). This is a beta feature to avoid spam applicants. Companies can search these words to find applicants that read this and see they're human.","location":"Lisbon","apply_url":"https:\/\/remoteOK.com\/remote-jobs\/remote-senior-infrastructure-engineer-core-systems-quicknode-1129232","salary_min":0,"salary_max":0,"logo":"","url":"https:\/\/remoteOK.com\/remote-jobs\/remote-senior-infrastructure-engineer-core-systems-quicknode-1129232"}]