Cloud/Systems Engineer Production Support Resume
Dallas, TX
SUMMARY:
- 10+ years of solid experience in Unix, Linux, VMware, and AWS Cloud Administration.
- Large-scale migration experience: Data Center to Data Center and/or Data Center to Cloud
- Installed, Administered, Configured and Deployed Red Hat Enterprise Linux 4.x/5.x/6.x/7.x, CentOS 6.x/7.x, and SUSE 11.x/12.x operating systems
- Significant experience designing, deploying, and supporting production cloud environments in the AWS ecosystem (including EC2, VPC, RDS, ELB, EBS, S3, and Route 53)
- Responsible for 24x7 production support of mission-critical applications both on premises and in AWS as the applications migrate to AWS.
- Serving as the single point of contact for all L3 production issues.
- Patching, O/S Upgrades, Clustering, Migrations and Major Hardware Activities
- Worked on the end-to-end server life cycle, including Server Builds, Provisioning, Production Break-fix, Deployments, Datacenter Operations, Hardening and Production
- Unix System Administration (Linux, VMware, Solaris, VERITAS Volume Manager & VERITAS Clusters, Red Hat Clusters, Pacemaker)
- Upgrading Mellanox and Solarflare cards as per the requirements.
- Using Kickstart to automate Linux installations and build out Linux environments
- Experienced with Oracle RAC clustering
- Working on cloud infrastructure deployment: design, operate, and optimize new and existing production environments.
- Working with IT security tools like CyberArk, MFA, PowerBroker, and VPN.
- Experience in Server Builds, Data Center Operations, Installs, Upgrades, Patches, Backup, Recovery, Performance Monitoring and Fine-tuning on CentOS/Red Hat Linux systems.
- Experience in Package Management using CentOS/Red Hat RPM/YUM and Red Hat Satellite server.
- Upgrading kernels on Red Hat Linux, SUSE Linux, and CentOS servers
- Deployment experience using common AWS technologies like VPC, Elastic Load Balancing, Application Load Balancing, and regionally distributed EC2 instances.
- System troubleshooting and Maintenance using tools such as Ansible, Bash and Python scripting.
- Scripting experience in Bash, Python, and ksh.
- Able to write shell scripts to automate repetitive, day-to-day jobs
- Responsible for providing advanced engineering support to production support teams for complex application performance and infrastructure issues.
- Develop legacy-to-AWS migration plans, including approach, roadmap, tooling, and costing; present migration plans to customers.
- Automating infrastructure with tools like BladeLogic, Ansible, Puppet, and Docker.
- Install, configure & deploy all Red Hat operating systems
- Worked as an on-site Coordinator for Major Activities like Storage Migrations, Data Centre Maintenance/Migration and VMware VC set up for ESX servers
- Coordinating with data center technicians for all hardware replacements
- Remote installation of physical servers via Dell DRAC, HP iLO, and IBM, and of virtual servers via VMware ESX 3.5, 4.0 (RAID, LVM)
- Involved in UNIX Architectural decisions & experience in designing, implementing and supporting UNIX Server technology solutions.
- Installed/Configured/Administered VMware ESX 4.1, 5.1, 5.5 & 6.0 and migrated existing servers into VMware Infrastructure
- Migration of ESX servers from one version to another version and also from one hardware to another
- Managing VMware infrastructure/vSphere clusters in production and UAT/Deployment environments
- Extensive knowledge of ESX/vSphere/vCenter operations in VMware environments
- Experience with NIC bonding configuration on Linux and UNIX systems to increase bandwidth or redundancy based on application requirements.
- Installing & configuring multi-node VCS clusters; administration of VCS clusters, such as growing clustered file systems and performing failover & failback between nodes
TECHNICAL SKILLS:
Operating systems: Red Hat Enterprise Linux 5.x, 6.x, 7.x, SUSE Linux 11, 12.1, Ubuntu 11.x, 12.x, 13.x and 14.04, Windows
Virtualization/Hypervisor: VMware, Linux Virtualization / ESXi
Cluster: VCS, Red Hat, Pacemaker, CRM
Configuration Management tools: Chef, Puppet, Docker, Ansible, Jenkins
Integration: Active Directory, LDAP, HMC, IVM, DRAC, HP ILO
Storage Array/NAS: IBM SVC, V7000, EMC CLARiiON, NetApp filers
Hardware: IBM Power series P4, P5, P6, P7, P8 (P6 520, P6 550, P6 570, P710, P740, P780, P795), IBM Blades PS701, PS702, HP Blade server C3000, C7000, IBM X Series, HP ProLiant Servers, Dell Edge servers, Cisco UCS servers
PROFESSIONAL EXPERIENCE:
Confidential, Dallas, TX
Cloud/Systems Engineer Production Support
Role & Responsibilities:
- Managing Server Patching & troubleshooting all the issues on 30,000+ critical Trading production, Contingency and development servers.
- Patching, O/S Upgrades, Clustering, Migrations and Major Hardware Activities.
- Expertise in VCS/Redhat/Pacemaker Clusters
- Solaris/Linux (RHEL 5, RHEL 6 and RHEL 7) Patching, O/S Upgrades, and Migrations
- Remediating vulnerabilities by applying monthly patches and performing thorough Qualys scans to address open & existing vulnerabilities
- Single point of escalation for all L3 production-impacting issues.
- Upgrading Azul versions and Solarflare and Mellanox drivers periodically.
- Experience with Unix/Linux servers for production-impacting issues, maintenance, and troubleshooting, as well as handling daily service requests, routine server maintenance, firmware upgrades, failures, hardening and server consolidation
- Migrate existing services from physical data centers to AWS cloud
- Significant experience designing, deploying, and supporting production cloud environments in the AWS ecosystem (including EC2, VPC, RDS, ELB, EBS, S3, and Route 53)
- Develop infrastructure as code on Amazon Web Services (AWS) with best practices and implementations for non-production and production environments.
- Build automation tools and frameworks for on-demand deployment environments, application definition, availability, security, and performance monitoring and alerting.
- Integration of cloud services with on-premises systems
- Collaborate with Architects, DevOps Engineers, Engineering Managers, Product Managers and Engineers across the organization to deliver a comprehensive solution
- Define, create, test, and execute operations procedures.
- Install OS updates/hotfixes/patches/service packs
- Maintain and enhance the Continuous Integration & Deployment environment
- Experience with high availability and scalability in AWS
- Strategize, plan and manage all processes related to continuous integration, continuous delivery, and process automation
- Experience with configuration management / automation tools (Ansible, Puppet, Chef, Docker)
- Hands-on experience leading the design, development and deployment of business software at scale, along with current hands-on experience in technology infrastructure, network, compute, storage, and virtualization
- Monitor AWS maintenance and outages, assess impact, and develop strategies to minimize impact
- Experience working with Docker and containerized applications
- Experience working with and developing enterprise monitoring/tooling solutions like Grafana, Kibana, Splunk, Nagios, and Elasticsearch
- Working with SQL and NoSQL databases (Oracle, MySQL, MongoDB and Cassandra) on their production and performance issues.
- Supporting DB/Applications like TIBCO, DB2, Sybase, Oracle, IBM MQ, JBoss and Middleware Technologies and working closely with their teams on day-to-day issues.
- Experienced in working with Backup and Recovery Solutions in the Public/Private/Hybrid Cloud and Virtual environments.
- Worked extensively on planning the migration of production database & application servers to new storage and servers
- Working on BladeLogic for all Automated Patching and successfully implemented BL Jobs & Roles.
- Setting up jobs in BMC BladeLogic Server Automation: adding servers, creating/maintaining BLCLI or NSH scripts, and creating BL packages, jobs, and snapshots for bulk patching/upgrades
- Responsible for providing advanced engineering support to production support teams for complex application performance, network, and infrastructure issues.
- Implemented MFA, CyberArk, and PowerBroker for infrastructure.
- Monitor the Infrastructure with IBM Tivoli agents, Autosys
- Handle escalations related to AWS infrastructure that were coming from L1/L2 teams
- Provides consultative guidance to clients to solve complex business problems including Design and Architecture consulting services to bring total cost of ownership reductions, automation, and ease of management to our clients requiring Amazon Web Services.
- Applying monthly Patches to Remediate the security Vulnerabilities
- Actively participating in a separate project to migrate all physical servers to the APP HOST Cloud by 2018.
- Troubleshooting various issues reported by Application/DBA teams.
- Patch Upgrades, Installations, Configurations, Storage Migrations.
- Administration of Veritas Volume Manager and Veritas Cluster issues
- Analyzing the root cause of server crashes; Performance Monitoring & Tuning of servers
- Following up with Maintech Vendor for Hardware Replacements
- Upgrading firmware to the latest versions and arranging downtime with server owners
- Attending weekly calls with LOB and business stakeholders for smoother operations
- Excellent ability to handle significant workload with experience in managing multiple projects
- Experience in UNIX shell scripting and knowledge in Perl
- Analyzing the root cause of hardware failures and working with vendors.
- Leading the team, preparing reports for all CTO leads, preparing SOPs, and attending strategic meetings with leads on future automation.
- Closely coordinated with SAN team for storage, fabric issues. Upgrading Emulex cards, firmware for HBA to the current revision levels.
- Hands on installation and configuration of management tool Puppet
- Troubleshooting VERITAS Cluster Suite, Volume Manager, ESX 5.X.
- ESX Component Administration, Troubleshooting ESX 5.X, vCenter issues.
Confidential
Analyst11- Sys admin
Role & Responsibilities:
- Installing, Managing & troubleshooting all the issues on 30,000+ critical Trading production, development and QA servers.
- Hands-on experience in handling all kinds of hardware issues on Sun, HP, IBM and Dell.
- Interacting directly with clients to obtain work windows for maintenance and answering any questions they raise when a window is requested.
- Expertise on installation of various applications, packages and patches on Linux servers
- Applying OS patches and upgrades on all Linux and Solaris servers quarterly.
- Verifying load balancing of MySQL servers using HAProxy
- Troubleshooting NFS, VAS & LDAP user authentication
- Replacement of H/W (Motherboard, Memory, Media Drives, NIC Cards, PSUs, Mellanox cards, Fusion cards, HBAs, etc.) by coordinating with onsite team/vendor.
- Working with various software and hardware vendors to fix various issues and regularly following up with them for RCA and updating the same to clients.
- Pushing configurations and setting up large jobs via BladeLogic
- Troubleshooting various issues reported by Application/DBA teams, which come in through either the Maximo tool or email.
- Troubleshooting problems pertaining to Performance Tuning, Network Administration, and System Bugs
- Installing & upgrading various Linux, Solaris, VxVM, NetBackup, Legato, DNT and AutoSys packages.
- Performing various firmware upgrades such as UEFI, IMM, OBP, LOM and HBAs
- Administration of Veritas Volume Manager and Veritas Cluster issues.
- Troubleshooting VCS cluster nodes, switching nodes, and freezing.
- Worked on 2,000 Red Hat cluster servers and Pacemaker clusters
- Adding/removing LUNs and creating new file systems or growing existing file systems on these LUNs in Solaris and Linux.
- Root cause analysis after Panic/Dump/Crash events on Linux and Solaris, coordinating with vendors for permanent resolution
- Setting up passwordless SSH (familiar with OpenSSH and commercial SSH) for application teams and fixing any issues
- Closely coordinated with SAN team for storage, fabric issues. Upgrading Emulex cards, firmware for HBA to the current revision levels.
- Actively participating in a high uptime server reboots project.
- Currently working on an automation project that places a server in maintenance mode before it is rebooted (working with the Mips team)
- De-Commissioning servers according to Bank standards
- Attending weekly calls with LOBs as one of the L2 team representatives, cascading information to colleagues when required, collecting their issues, and getting clarification from onsite managers
- Identifies and implements opportunities to reduce problems and optimize support.
- Documentation of all setup procedures and system-related policies (SOPs).
- Excellent ability to handle significant workload with experience in managing multiple projects concurrently in a demanding environment.
Confidential
Analyst11- Sys admin
Role & Responsibilities:
- Providing Application support for various projects.
- Monitoring application and Linux Samba, FTP, NFS, and Apache services and participating in quick recovery actions
- Supporting their hosting and data center operations 24/7
- As a system administrator, day-to-day administration of Red Hat Linux, including installation, upgrades, installing packages & patches, controlling the system logging services, and examining system log files for all system events.
- Creating, modifying and deleting user accounts and assigning permissions to them, managing disk quotas, and supporting users.
- Configuring new disks, creating file systems and mounting, file system maintenance and repair, process management, setting up crontab, at, and batch jobs, password management, and related UNIX administration work.
- Configuration of network services like SFTP, NFS, Samba, DHCP, SSH & DNS, etc.
- Packages and Patches Management
- Delegating calls to team engineers and collecting feedback on call status while troubleshooting issues such as router installations, servers (RHEL 5, Windows 2000 & 2003), RAID and LVM installation & failure, SCSI hard disk failure, and power failure.
- Moving the volumes between the Disk Groups.
- Creation and removal of Volume and mirrors
- Extensive support towards Linux desktop troubleshooting.
- Handled Configuration & Troubleshooting of LaserJet & network printer.
- Successfully resolved IT-related issues, maintained network connectivity, and troubleshot desktop & server hardware.
- Handling user administration, including adding, changing and deleting user accounts and groups and managing user passwords.
- Responsible for User-level security and classifying ways of controlling root access on the system
- Maintaining the users with effective permission as well as Special permission
- Responsible for monitoring LAN/WAN architecture at the customer site on a network based on Windows 98/2K/XP, RHEL 5, and Windows 2K Server, and maintaining server and desktop systems & the Norton AntiVirus server.