What you’ll do
Areas of focus for this role include but are not limited to:
- Assist with the design, implementation, and administration of a Private Cloud Infrastructure.
- Code and test scripting and/or programming related to the automation and support of Private Cloud Infrastructure.
- Work closely with Engineering and Systems Reliability teams to deliver highly available systems.
- Assist in troubleshooting and resolving system outages or performance impact, and communicate findings for ongoing improvement.
- Analyze and improve the efficiency, reliability, and scalability of the platform.
- Work collaboratively with others to achieve goals.
- Participate in an on-call rotation supporting the compute platform and handle escalated issues.
What you’ll need to succeed
The successful candidate will demonstrate experience or capability in the following areas:
- Advanced experience with VMware is a must, preferably in a large enterprise environment. Experience with other virtualization technologies such as KVM and Xen is a plus.
- Experience with automation and configuration management tools such as Puppet, Ansible, Salt, Chef, or equivalent.
- Experience with scripting/programming languages such as Python, Perl, Bash, or PowerShell.
- Knowledge of software version control systems (e.g. Git) and release management processes.
- Experience with large-scale infrastructure system administration: working with thousands of hosts and the practices required to support and scale storage and network infrastructure.
- Experience with designing, running, and/or consuming customer-facing cloud technologies such as AWS, Azure, OpenStack, or Google Cloud Platform is a plus.
- Experience with complex IP networking capabilities in large production environments is a plus.
- Experience with Unix/Linux system administration.
- Understanding of the ITIL framework and concepts.
About our ideal candidate:
- Passionate about making things better and about using the latest technology to improve current processes.
- Eager to learn about infrastructure operations. We want someone who does not have a preference for one tool but rather looks for the right tool for the job.
- Analytical, with the ability to troubleshoot complex system problems, assess and resolve performance issues, and weigh every option to find the best solution.
- Demonstrated ability to turn design, compliance, and support references/content into documentation, presentations, and training.
- Ability to work hard as a member of a small team that is asked to do a lot while always remaining customer focused.
- Excellent written and oral communication skills and the ability to work well with remote teams.
- Ability to effectively prioritize and execute tasks in a cross-functional and collaborative team of developers and operations.
- Ability to manage time properly, with commitments being communicated and then met.
- Understanding of broad business processes and how they work together to drive an organization’s success.
- Able to think holistically and excel at systems thinking.