Senior Site Reliability Engineer

Full Time
Boston, MA 02114
Job description

We are looking for a talented Site Reliability Engineer (SRE) with a deep interest in distributed systems, cloud computing, and the architecture of large-scale systems. The SRE lead will ensure our InsightIDR services have the ultra-high reliability and uptime necessary to meet our customers’ needs. As an SRE, you will work closely with our engineering team and partner teams throughout Rapid7 to help solve extremely challenging problems at a massive scale.

About the Team

InsightIDR helps identify and address key cybersecurity risks to our customers. We apply machine learning, threat intelligence, and business intelligence to event sources, including desktops, servers, network switches, firewalls, cloud services, directory servers, DHCP servers, and SIEMs, in order to distill hundreds or thousands of daily events per customer into the few real, high-priority threats that need attention. Our systems ingest large amounts of data and need to be highly available and performant at all times.

Some of the technologies we use include: Java, Python, Cassandra, DynamoDB, MySQL/RDS, Redis, Elasticsearch, AWS (EC2, S3, CloudFormation, etc.), Terraform, and Jenkins.

At Rapid7, we value intellectual curiosity, problem-solving ability, initiative, and team spirit.

About the Role

In this role, you will:

  • Establish a new Site Reliability Engineering function within Engineering

  • Work closely with Engineering, Architecture, Infrastructure, and Product teams to improve the lifecycle of the InsightIDR services, from inception through design, deployment, operations, monitoring, security, upgrade, and maintenance

  • Support services before they go live through activities such as design, deployment, migration strategy, monitoring, and playbook reviews

  • Maintain services once they are live by measuring and monitoring availability, latency, and overall system health

  • Scale systems through automation and by driving service and infrastructure improvements

  • Troubleshoot production issues and liaise with relevant Engineering, product deployment, and platform teams to reach a resolution

  • Manage and participate in on-call support and incident response follow-ups such as post-mortems

  • Mentor and coach team members

The skills you’ll bring include:

  • Previous experience in a lead engineering role

  • 5+ years of experience scaling SaaS services and infrastructure

  • Expert knowledge of developing, scaling, automating, and troubleshooting large-scale systems

  • Expert knowledge of deployment and monitoring frameworks

  • Ability to debug and optimize code and to automate routine tasks

  • Advanced understanding of system performance and tuning

  • Strong knowledge of NoSQL and SQL concepts

  • Strong knowledge of OOP languages such as Java

  • Experience with scripting languages such as shell and Python

  • Extensive experience with database operation and optimization

  • Strong knowledge of RESTful architectures

  • Understanding of Unix/Linux operating systems

  • Proficiency in AWS services, including EC2, RDS, S3, streaming data, etc.

  • Systematic problem-solving approach

  • Excellent communication & influencing skills

  • Strong technical writing skills

We know that the best ideas and solutions come from multi-dimensional teams that reflect a variety of backgrounds and professional experiences. If you are excited about this role and feel your experience can make an impact, please don’t be shy - apply today.

About Rapid7

Rapid7 is creating a more secure digital future for all by helping organizations strengthen their security programs in the face of accelerating digital transformation. Our portfolio of best-in-class solutions empowers security professionals to manage risk and eliminate threats across the entire threat landscape, from apps to the cloud to traditional infrastructure to the dark web. We foster open source communities and cutting-edge research, using these insights to optimize our products and arm the global security community with the latest in attacker methods. Trusted by more than 10,000 customers worldwide, our industry-leading solutions and services help businesses stay ahead of attackers, ahead of the competition, and future-ready for what’s next.

All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability or protected veteran status.
