- Experienced Sr. DevOps/Systems Engineer with a demonstrated history of automation and Infrastructure as Code (IaC). Committed to maintaining cutting-edge technical skills and up-to-date industry knowledge, with over 17 years of IT experience including Linux Administration, DevOps, Configuration Management, Application Deployment and Reliability, Virtualization, Hyperconverged Infrastructure, Cloud Deployments, Automation, Backup/Recovery and DR Planning, Infrastructure Monitoring, Server Administration, Batch Scheduling, and Storage Administration.
Applications/Software: Chef, Ansible, Kubernetes, Docker, ESXi, vSAN, Hyper-V, Azure, AWS, EC2, VPC, SNS, SQS, Route 53, S3, Redshift, RDS, DynamoDB, IAM, ElastiCache, CloudFormation, CloudWatch, EFS, Git, CircleCI, JIRA, WebLogic, Tomcat, Apache, NGINX, Zabbix, Elastic Stack, Prometheus, Jenkins, OpenStack, Bash scripting, PowerShell, Memcached, Solr, New Relic, Aspera, Confluence, Bitbucket, CDN, Microservices, DRM, Footprints, BMC Performance Manager, BMC Performance Manager Portal, BMC Patrol, BMC ProactiveNet Performance Management, BMC Control-M, Symantec NetBackup, NetBackup PureDisk, Symantec OpsCenter, NetBackup Appliance, Legato NetWorker, BMC Service Desk Express, Crystal Reports, Active Directory, Microsoft SQL Server, Oracle, Sybase, Altiris Deployment Console, MongoDB
Operating Systems: Linux/Unix, CentOS, RHEL, Ubuntu, Solaris, Windows Server
Storage: EMC VNX, EMC CLARiiON, EMC Isilon, NetApp 7-Mode, NetApp Cloud Volumes, S3, EFS, NFS Server, vSAN
Networking: SAN, Brocade, Cisco, F5
Sr DevOps Engineer
- Perform and automate system administration services including installation, configuration, maintenance, and disaster recovery of thousands of EC2 instances in 10+ regions.
- Control the lifecycle of EC2 instances and other AWS resources.
- Respond to production incidents, troubleshoot, resolve, and document.
- Analyze, troubleshoot, and resolve system, software, network, and storage failures for a globally distributed cloud infrastructure.
- Configure, test, deploy, and upgrade software for production EC2 servers in AWS.
- Author and recommend settings for applications, operating systems, networks, and cloud services to improve performance, security, and reliability.
- Design and develop monitoring and 100% uptime solutions for critical systems.
- Analyze cloud spend and implement cost-cutting solutions.
- Manage Kubernetes clusters and automated CI/CD for microservices.
- Automate and manage the deployment of AWS resources including but not limited to EC2, Route 53, CloudFormation, CloudWatch, S3, ElastiCache, EFS, EKS, ECR, VPC, RDS, Redshift, DynamoDB, SQS, SNS, IAM, and CloudFront.
Principal Systems Engineer
- Participate in design and architecture phases of software development projects. Ensure the software design conforms to high availability architecture of the production infrastructure.
- Supervise and automate code implementation through Chef, Ansible, and Kubernetes in development and production, provide post-release support, and deliver operational documentation. Ensure automation is properly version-controlled in Git.
- Lead and manage infrastructure projects as assigned. Ensure the design and implementation of new or incremental infrastructure adheres to corporate high availability and security standards. Document and train other staff members on the operation of new infrastructure.
- Deploy and maintain WebLogic domains and associated JVMs, data sources, and deployments.
- Perform incident management on development and production environments. Assume ownership of problems and work with other team members, developers, and third parties to identify and resolve root causes. Share 24/7 on-call duty in a rotating shift with other team members.
- Deploy and manage hybrid cloud infrastructure spanning on-premises (Hyper-V/ESXi) and public cloud (Azure/AWS).
- Write Bash scripts to automate routine tasks that are outside the scope of automation tools.
- Configure and analyze monitoring of applications and logs through New Relic, Zabbix, and ELK.
- Deploy and configure web servers (Apache/NGINX) and enforce SSL security standards on them.
- Maintain HA and performance of services through proper configuration and use of load balancers (F5/NGINX).
- Create security-hardened gold images of CentOS and Windows servers for deployment. Tune and patch operating systems. Troubleshoot and resolve OS-related issues.
- Led the Performance Management, Storage, and Backup team.
- Delegated tasks to the team's System Administrators and ensured their completion.
- Conducted performance reviews for team members.
- Maintained and ensured the proper functioning of Performance Monitoring products used in the environment, including BMC ProactiveNet Performance Management (BPPM), BMC Performance Manager Portal, BMC Performance Manager for Servers, BMC Performance Manager for Exchange, BMC Performance Manager for Databases, and Patrol 7 architecture, as well as the Apache web server, JBoss application server, and Oracle database.
- Ensured security standards for SSL communications to monitoring and backup web servers.
- Maintained Solaris, Red Hat, Linux appliance, and Windows servers for the monitoring and backup environments.
- Monitored and backed up all Solaris, Red Hat, and Windows servers and load balancers.
- Performed security remediations on Unix/Linux and Windows systems.
- Managed VMware ESXi servers.
- Managed Symantec NetBackup for daily system backups.
- Maintained backup hardware, including NetBackup Appliance PureDisk, Amazon S3 storage, Amazon Storage Gateway, VTL, Tape Libraries, NAS, and SAN.
- Created Disaster Recovery (DR) procedures and participated in semiannual DR tests.
- Administered and zoned director-level Fibre Channel switches.