Site Reliability/DevOps Engineer
The Site Reliability/DevOps Engineer (SRE) is responsible for developing innovative and effective solutions in an energetic and highly collaborative environment to support the organization's vision and strategic goals.
The SRE will help build, manage, maintain, and improve solutions across multiple clouds. As part of the team, the SRE will help automate and streamline operations and processes, build and maintain tools, deploy and monitor systems, and troubleshoot and resolve issues in our production and non-production environments. This position requires skills in development, networking, security, and system administration.
ESSENTIAL DUTIES AND RESPONSIBILITIES:
- Implement and maintain Oracle Database, Oracle Data Integration, Apache Tomcat, Linux, and related technologies in support of Ellucian Banner and Oracle Integration.
- Design and manage AWS components such as VPCs, EC2, S3, CloudFormation, and Lambda.
- Automate the deployment and management of cloud-based infrastructure and software
- Ensure cloud-based architectures meet availability, security, performance, and recoverability requirements
- Architect and implement cloud-based monitoring, alerting, and reporting
- Drive innovations that improve availability, resiliency and performance of the service
- Develop and conduct component and system tests
- Maintain technical currency in related products and technologies with the goal of improving approaches, methodologies, and components
- Install and maintain databases and applications in accordance with established best practices.
- Document processes and procedures.
OTHER DUTIES AND RESPONSIBILITIES:
- Serve as the primary liaison for all integration and system-related activities involving Banner and other SIS platforms
- Participate in an occasional on-call rotation supporting the infrastructure
- Engage with corporate and institutional technical staff to ensure alignment on, and execution of, technical directions and methods.
- May perform other duties and responsibilities that management may deem necessary from time to time.
EDUCATION AND/OR EXPERIENCE:
- Bachelor of Science Degree in Computer Science or related discipline; commensurate experience is also accepted.
- Oracle Relational Database 12c experience, including SQL, PL/SQL, and Recovery Manager (RMAN)
- Demonstrable knowledge of container technologies
- Proven knowledge of the TCP/IP stack, latency considerations, internet routing, and load balancing
- Experience managing AWS or Google Cloud infrastructure
- Experience with central logging and monitoring systems (ELK, Datadog, Sensu)
- Experience with Infrastructure as Code and/or configuration management tools (Ansible, Terraform, CloudFormation).
- 5+ years of production application support experience in a highly available, highly utilized environment
- 5+ years of UNIX/Linux administration experience, including diagnosis of performance issues, package management, load estimation, kernel tuning, networking configuration, etc.
- Experience working to deadlines and managing competing priorities
- Strong problem solving and analytical skills.
- Multilingual (Spanish and Portuguese) preferred, but not required
Compensation is commensurate with experience, with a target range of $95-110 annually.