Site Reliability Engineer

7 months ago

Leusden, Netherlands HSD Campus Full-time

Information is available in abundance, yet many organizations struggle to turn this information into actionable intelligence. That is why we, at Pandora Intelligence, are dedicated to helping our customers reveal the narrative in data. We designed a set of 12 Elementary Scenario Components (ESC12) enabling us to unveil the story hidden within any type and form of data. The application of those components is made possible through our software platform, which uses Artificial Intelligence (AI) to automate complex analytics tasks, reduce the need for manual work, and empower our customers to make timely and efficient decisions.

We are an innovative and continuously growing startup with a passionate team of technologists who love solving problems. Made up of a varied mix of skills, experience, and backgrounds, our teams are looking for new colleagues to extend our expertise and knowledge further.

SITE RELIABILITY ENGINEER

Do you like using cutting-edge technology to ensure availability, reliability and scalability of systems and (Cloud) infrastructures? Do you enjoy being a key player, working hand-in-hand with software development and operations teams? Then you will love being our new Site Reliability Engineer.

WHAT WILL YOU DO AS A SITE RELIABILITY ENGINEER?

The Site Reliability Engineer is mainly responsible for ensuring the availability and reliability of the systems that power Pandora Intelligence’s platform. The major aspects of the job therefore lie in developing and implementing tooling to automate processes, and in testing and monitoring the production environments by analyzing logs. As an expert in systems and infrastructure, the Site Reliability Engineer is also responsible for responding to incidents and fixing issues.

By ensuring the reliability and availability of the systems, the Site Reliability Engineer effectively builds the bridge between development and operations, and thereby plays a crucial role in guaranteeing that service levels are delivered as per customer expectations.

Typical tasks of the Site Reliability Engineer are to:

Identify the technical requirements and dependencies to ensure availability and reliability of the systems (based on SLA);
Support the teams by promoting best-practices to ensure software and systems are built considering security, availability and scalability;
Collaborate with peers from the software development teams (Platform and Pilots & Solutions), to ensure clarity in requirements;
Identify risks and bottlenecks, and propose workarounds to preserve systems availability and performance;
Think ahead, identify potential improvements and submit them to the team manager;
Investigate, fix and resolve issues related to application deployment and operational usage;
Build automation tools to streamline and scale applications in the production environment with minimum impact on operations;
Develop tooling to automate the deployment of the Pandora Intelligence platform on Cloud or on-premises infrastructures;
Keep up to date with industry trends, developments and the newest tools, and discuss the improvements they could offer;
Monitor production environments and act proactively to avoid downtime;
Secure applications and infrastructures (manage certificates, assess encryption and authentication mechanisms, manage credentials, etc.).

WHAT DO WE EXPECT FROM A SITE RELIABILITY ENGINEER?

At least a BSc or equivalent, preferably an MSc in a relevant field;
At least 4 years of experience in a similar position;
Proven track record in the above (reference from previous position);
Problem solving mindset, ability to take initiative and attention to detail;
Strong knowledge of TCP/IP networking and infrastructure;
Familiarity with automation and provisioning tools like Helm and Ansible;
Expertise with cloud services such as MS Azure, AWS and GCP;
Knowledge of Identity and Access Management mechanisms such as OpenID;
Experience with Docker and Kubernetes;
Knowledge of monitoring systems such as Graylog, Grafana, Prometheus;
Experience with CI/CD tooling, preferably Azure DevOps CI;
Ability to take ownership and support the team through all situations;
Perfect verbal and written English; Dutch would be a great addition.

WHAT CAN YOU EXPECT FROM PANDORA INTELLIGENCE?

In addition to a challenging role, we also offer:

The opportunity to be part of a fast-growing company where you can help make the world safer;
A flexible remote working policy, where the balance between a pleasant office space and working from home provides the comfort that suits you;
A role in a company building high-impact solutions, with a drive for technology and innovation;
A competitive salary, depending on your experience;
A role within a multidisciplinary team where we consider wellbeing and personal development important;
A dynamic working environment where we like to have fun and love to organize office drinks and other events.

Due to the nature of our work, background checks are part of the application process.