System/Software Engineer Resume
Palo Alto, CA
SUMMARY
- Around 9 years of experience as a Linux System/DevOps Engineer, with strong experience building, supporting, and automating Linux infrastructure at Confidential.
- Good experience with configuration management tools such as Puppet and Ansible, and with Hudson/Jenkins for continuous integration and end-to-end deployments.
- Experienced in product design, integration, and architecting solutions to run SAP HANA appliances.
- Experienced in OS installations at Confidential using PXE and Kickstart, and in automating post-install configuration.
- Experience configuring and managing public/private clouds on AWS, including EC2, S3, Elastic Load Balancing, and other AWS services.
- Experience implementing continuous integration and deployment with CI/CD tools such as Jenkins and Hudson and configuration management tools such as Puppet and Ansible, in cloud and container environments including Amazon EC2, S3, CloudFront, CloudWatch, and Docker.
- Exposure to building, configuring, and supporting servers/clusters for various technologies such as Splunk, Oracle, Oracle RAC, MongoDB, Cassandra, Hadoop, Vertica, TimesTen, Couchbase, etc.
- Extensive experience installing, deploying, and upgrading Hadoop clusters and pushing configurations using Puppet.
- Experience with build tools like Ant and Maven.
- Experience working with Git repositories and monitoring tools such as Nagios, Splunk, New Relic, etc.
- Able to deploy, configure, and administer Splunk clusters at Confidential; also help various teams onboard Splunk and configure dashboards, alerts, and reports accordingly.
- Exposure to configuring SAP HANA appliances based on scale-up and scale-out architectures.
- Experienced in working with different kinds of IBM and HP hardware and firmware.
- Able to handle various filesystem requirements involving RAID, LVM, NFS, SAN, etc.
- Exposure to provisioning LUNs from 3PAR storage and implementing SAN zoning.
- Experienced in configuring Yum servers and managing Yum repositories.
- Can tune various system and kernel parameters according to application or vendor requirements.
- Experience configuring network bonds, DHCP, DNS, NFS, FTP, and HTTP (a bonding sketch follows this summary).
- Experience creating and managing the Confidential virtual environment based on Oracle VM, KVM, VMware, and CloudStack.
- Exposure to containerizing the HANA database using Docker.
- Experience building tools using Bash and Python.
- Exposure to tools/systems such as Git, HP OpenView, Nagios, Ganglia, Espresso (ticketing), and ServiceNow (ticketing).
- Experience working in data center and hardware management.
- Ability to manage and coordinate projects, including requirement gathering, building, configuring, and supporting.
- Experience coordinating and working with different infrastructure teams such as DBA, SAN, and network, along with vendors such as IBM, HP, Splunk, and Oracle.
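Below is a minimal sketch of the kind of bonding configuration referenced above, assuming RHEL-style ifcfg network scripts, an active-backup bond named bond0 built from eth0/eth1, and a placeholder IP address; none of these values come from a real environment.

```bash
#!/bin/bash
# Illustrative only: create an active-backup bond (bond0) from eth0/eth1
# using RHEL 6/7-style ifcfg files. Interface names and the IP are placeholders.
set -euo pipefail

cat > /etc/sysconfig/network-scripts/ifcfg-bond0 <<'EOF'
DEVICE=bond0
TYPE=Bond
BONDING_MASTER=yes
BONDING_OPTS="mode=active-backup miimon=100"
IPADDR=192.0.2.10
NETMASK=255.255.255.0
ONBOOT=yes
BOOTPROTO=none
EOF

for slave in eth0 eth1; do
  cat > /etc/sysconfig/network-scripts/ifcfg-${slave} <<EOF
DEVICE=${slave}
MASTER=bond0
SLAVE=yes
ONBOOT=yes
BOOTPROTO=none
EOF
done

# Restart networking and confirm the bond came up with both slaves
service network restart
cat /proc/net/bonding/bond0
```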
TECHNICAL SKILLS
- RHEL 5/6/7, OEL 5/6/7, CentOS 5/6/7
- Ubuntu
- SUSE 11/12
- Mac OS X
- IBM x3630 M3/M4, x3550 M3/M4, HS22/HS23 blades
- HP DL360, DL380, DL560, DL580, DL980
- HP C7000 G7/G8 blade enclosures
- 3PAR Storage
- Bonding, DHCP, DNS, FTP, Apache, Nginx, PXE, NFS
- Oracle VM/Xen
- KVM
- VMWare
- Cloudstack
- AWS
- Docker
- Kubernetes
- Bash
- Python
- Expect
- Puppet
- Chef, Ansible
- Jenkins
- Hudson
- Ant
- Maven
- Git, SVN
- HP OpenView
- Nagios
- Espresso (ticketing)
- Service Now (ticketing)
PROFESSIONAL EXPERIENCE
Confidential, Confidential, CA
System/Software Engineer
Responsibilities:
- Primarily responsible for designing, certifying, and building SAP HANA in-memory database appliances from development through manufacturing. These appliances are high-performing ProLiants with Skylake and Broadwell E7 processors, up to 4 TB of memory, and 4 sockets, running SUSE and RHEL with optional VMware virtualization, including high availability using Serviceguard, replication, and backup/recovery.
- Provide performance testing and fstest/fsperf certification of HPE hardware solutions for HANA.
- Set up and work on build/release and continuous integration tools and frameworks such as Jenkins, Git, and Maven.
- Defining branching strategies and creating release and hotfix branches.
- Merging code from the Dev branch to the Integration and Release branches and deploying the builds.
- Deploy system configurations and applications using Ansible; modify and write Ansible playbooks.
- Worked on core AWS services: setting up new EC2 instances, configuring security groups, assigning Elastic IPs, and configuring Auto Scaling (see the provisioning sketch at the end of this role).
- Hosting applications on the AWS cloud.
- Get the various appliance components racked and cabled according to the scale-up/scale-out architecture and CPU-to-RAM ratio.
- Configure HPE 2900/5900 switches for CS500 and CS900 systems.
- Install and configure SUSE or RHEL on DL580s/DL560s according to the scale-up or scale-out design.
- Configure 3PAR storage, provision LUNs, and implement SAN zoning.
- Install and configure Serviceguard on DL580s/DL560s/DL380s and implement HA according to the appliance solution.
- Install and configure the HANA database according to the scale-up/scale-out architecture and the number of HANA nodes involved.
- Perform various functional, stress, destructive, and HA test cases on the servers and database to achieve KPI (key performance indicator) targets.
- Created prep media to automate applying BIOS settings, updating storage controller firmware, and applying the disk layout on the storage controller using the HPE Scripting Toolkit.
- Use SSDs for the SmartCache mechanism and manage its properties.
- To assist automated installation, created AutoYaST profiles for SLES and Kickstart profiles for RHEL. These profiles detect the boot drive (local disk, SAN, VM disk, etc.), install only the required packages, and apply the required patches and tunings based on the type of appliance.
- Create custom SPPs for applying firmware and drivers to selected categories of hardware.
- Configure the appliance and perform tests for various HANA versions to get the appliance certified by SAP.
- Part of development projects with many remote employees using Python, Bash, Git, and other frameworks for tools development.
- Developed a Python framework for creating storage, applying network configuration, and mapping storage onto distributed scale-up and scale-out nodes, reducing the server build to a single step.
- Implemented a Python tool for generating detailed configuration reports on distributed scale-up and scale-out appliances.
- Containerize the HANA database using Docker (see the container sketch at the end of this role).
- Deployed the generated builds to the web and app servers in all environments using the continuous integration process. Coordinated with teams across the globe to deploy different builds to different environments during parallel development for multiple regions.
- Implemented, tested, and documented HPE and SAP technologies with SAP HANA, including disk encryption, high availability, multi-tenancy, multi-SID, and replication.
- Review, monitor, and troubleshoot HANA cases raised from various client sites.
- Support Linux servers hosting customer support portals.
- Provide consulting to HPE sales, technical services, and factory integration on the design and integration of hardware and software.
Environment: SUSE 11.x/12.x, RHEL 6.x/7.x, SAP HANA, HPE Serviceguard for HA, HPE 3PAR storage, HPE servers/blades, HPE switches, Ansible, Jenkins, Maven, AWS
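A minimal sketch of the EC2 provisioning plus Ansible deployment flow described in this role, assuming the AWS CLI and Ansible are installed and that the default VPC assigns a public IP; the AMI ID, key pair, security group, and playbook name (site.yml) are placeholders, not actual values.

```bash
#!/bin/bash
# Illustrative only: launch an EC2 instance, wait for it to run, then apply
# an Ansible playbook to it. All IDs and names below are placeholders.
set -euo pipefail

AMI_ID="ami-0123456789abcdef0"   # placeholder AMI
KEY_NAME="deploy-key"            # placeholder key pair
SG_ID="sg-0123456789abcdef0"     # placeholder security group

# Launch the instance and capture its ID
INSTANCE_ID=$(aws ec2 run-instances \
  --image-id "$AMI_ID" \
  --instance-type t2.medium \
  --key-name "$KEY_NAME" \
  --security-group-ids "$SG_ID" \
  --query 'Instances[0].InstanceId' --output text)

# Wait until the instance is running, then look up its public IP
aws ec2 wait instance-running --instance-ids "$INSTANCE_ID"
PUBLIC_IP=$(aws ec2 describe-instances --instance-ids "$INSTANCE_ID" \
  --query 'Reservations[0].Instances[0].PublicIpAddress' --output text)

# Configure the new host with a playbook; the trailing comma makes Ansible
# treat the value as an inline host list rather than an inventory file
ansible-playbook -i "${PUBLIC_IP}," -u ec2-user site.yml
```

The same pattern extends to security-group rules, Elastic IP association, and Auto Scaling configuration driven from the CLI or from playbooks.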
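The Docker work above can be pictured with the sketch below; the image name, data path, published port, and ulimit value are placeholders rather than the actual HANA container details.

```bash
#!/bin/bash
# Illustrative only: run a database image with persistent storage and a
# published SQL port. Image name, port, and host paths are placeholders.
set -euo pipefail

docker pull example/hana-db:latest   # placeholder image

# Keep database files on the host so container rebuilds preserve the data
mkdir -p /data/hana

docker run -d --name hana-dev \
  -v /data/hana:/hana/data \
  -p 39017:39017 \
  --ulimit nofile=1048576:1048576 \
  example/hana-db:latest

docker logs -f hana-dev              # follow startup progress
```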
Confidential, Charlotte, NC.
Linux/Splunk System Engineer
Responsibilities:
- Install and configure Linux virtual machines and standalone servers for various applications via Kickstart and PXE.
- Support an existing Linux environment of around 9,000 servers, including blades, virtual machines, and standalone servers.
- Work with engineering and operations teams to establish standards and repeatable processes for managing changes and upgrades across the hosting environment.
- Involved in automation and deployments using Ansible, Jenkins, and Maven.
- Set up the Ansible environment and involved in writing Ansible playbooks.
- Worked on building SCM strategies such as branching, merging, etc.
- Involved in editing the existing Ant/Maven files in case of errors or changes in project requirements.
- Managed Amazon Web Services such as EC2, S3 buckets, RDS, EBS, ELB, Auto Scaling, AMIs, and IAM through the AWS console and API integration with Puppet code.
- Resolve issues related to user/group management, access, and sudoers.
- Experience with Logical Volume Manager: managing physical volumes, volume groups, and logical volumes.
- Scan LUNs and work with multipath.
- Resolve routine issues/tasks related to RAID, filesystems, NFS shares, etc.
- Manage packages through RPM and Yum.
- Configure and manage Ethernet bonds, kernel parameters, ulimits, etc.
- Automate tasks using shell scripts and Python.
- Assist application teams during various release cycles and upgrades.
- Coordinate with various teams such as DBAs, application developers, network, and SAN to complete projects and resolve issues.
- As the primary for Splunk support, supported more than 10,000 forwarders across different operating systems, including AIX, Solaris, and Windows (see the forwarder onboarding sketch at the end of this role).
- Deployed, configured, and upgraded 12 search heads and 20 indexers.
- Was responsible for managing data inputs, app creation, objects, and views in Splunk.
- Assisted in the upgrade of Splunk from version 5 to version 6.
- Set up index clustering, search head pooling, and distributed peers.
- Troubleshoot and resolve Splunk performance, search pooling, and log monitoring issues; handle role mapping and dashboard creation.
- Assisted clients in onboarding Splunk by installing the relevant apps.
- Perform daily health checks covering license usage, indexer filesystem usage, etc.
- Work according to ITIL, following procedures for incidents, service requests, and change requests.
- Raise cases with vendors such as Splunk, IBM, HP, Red Hat, and Oracle to resolve issues or ship required parts.
Environment: RHEL 5.x/6.x, OEL 5.x/6.x, Solaris, VMware, Splunk 5.x/6.x, IBM/HP hardware, Jenkins/Hudson, Maven, Ant, Ansible, Puppet, AWS
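A minimal sketch of onboarding a host as a Splunk universal forwarder, as referenced above; it assumes the forwarder package is already installed under /opt/splunkforwarder, and the indexer address, index name, and admin credentials are placeholders.

```bash
#!/bin/bash
# Illustrative only: point a universal forwarder at an indexer and monitor a
# log file. Hostname, index, and credentials below are placeholders.
set -euo pipefail

SPLUNK=/opt/splunkforwarder/bin/splunk

# First start, accepting the license non-interactively
"$SPLUNK" start --accept-license --answer-yes --no-prompt

# Send data to the indexer and monitor syslog into a placeholder index
"$SPLUNK" add forward-server idx01.example.com:9997 -auth admin:changeme
"$SPLUNK" add monitor /var/log/messages -index os_linux -sourcetype syslog

# Start the forwarder at boot and verify the forwarding target is active
"$SPLUNK" enable boot-start
"$SPLUNK" list forward-server -auth admin:changeme
```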
Confidential, Sunnyvale, CA
Linux System Engineer
Responsibilities:
- Build new Linux infrastructure and support an existing Linux environment of 20K+ servers, including virtual machines.
- Install the OS on Confidential using PXE and Kickstart.
- Build and configure servers/clusters for various technologies such as Splunk, Oracle, Oracle RAC, Cassandra, Couchbase, TimesTen, etc.
- Plan and execute different kinds of upgrade and migration projects, such as OS upgrades, PCI/SOX migrations, VLAN migrations, etc.
- Monitored software, hardware, and/or middleware updates utilizing technologies like Jenkins/Hudson, Maven, Ant, and Subversion.
- Partially involved in deploying WARs/EARs (backend) through the Apache Tomcat application server console.
- Install/deploy Hadoop clusters and push various Hadoop properties via Puppet.
- Write Puppet manifests and modules for onboarding new Hadoop clusters.
- Upgrade Hadoop clusters to various versions such as 1.3, 1.3.2, 2.0.6, 2.1.2, etc.
- Kerberize Hadoop clusters.
- Manage other Hadoop tasks related to Hadoop users, data sharing, adding new nodes, and hardware issues on nodes.
- Manage drives using hardware RAID (MegaRAID / HP RAID utilities).
- Manage/troubleshoot filesystem requirements and issues involving LVM, NFS, and SAN.
- Manage/troubleshoot user/group and access/sudoers-related issues.
- Configure Yum servers and manage Yum repositories (see the repository sketch at the end of this role).
- Set various system and kernel parameters according to application requirements.
- Create and manage virtual environments involving Oracle VM, VMware, and CloudStack.
- Update firmware and handle hardware-related issues by coordinating with the relevant vendors.
- Manage configuration files and common scripts/tools for OVM/RHEL/OEL through Subversion.
- Push system configurations for various virtual environments using Puppet/Ansible.
- Wrote scripts to create virtual machines and perform post-install configuration of VMs.
- Automate system admin tasks with Python and Bash.
- Manage and coordinate projects, including requirement gathering, building, configuring, and supporting.
- Coordinate with different infrastructure teams such as database, network, and storage to fulfill new requirements and troubleshoot current issues.
- Document, create processes, and plan change activities for existing projects to streamline day-to-day operations.
- Handle routine service requests, change requests, and other priority tickets using Espresso.
- Work in a 24/7 environment supporting the current infrastructure.
Environment: RHEL 5.x/6.x, OEL 5.x/6.x, Oracle VM, VMware, CloudStack, Hortonworks, IBM/HP hardware
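A minimal sketch of the Yum server and repository management referenced above, assuming the repository is served by Apache from /var/www/html; the paths, hostnames, and repo names are placeholders.

```bash
#!/bin/bash
# Illustrative only: build a local Yum repository from staged RPMs and point
# a client at it. Paths, hostnames, and repo names are placeholders.
set -euo pipefail

# On the repository server: publish RPMs and generate repodata
REPO_DIR=/var/www/html/repos/custom/el6/x86_64
mkdir -p "$REPO_DIR"
cp /stage/rpms/*.rpm "$REPO_DIR"/
createrepo "$REPO_DIR"

# On each client: drop a .repo file and refresh the metadata cache
cat > /etc/yum.repos.d/custom.repo <<'EOF'
[custom]
name=Custom internal packages
baseurl=http://yumserver.example.com/repos/custom/el6/x86_64
enabled=1
gpgcheck=0
EOF

yum clean all
yum repolist
```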
Confidential
Operations Engineer
Responsibilities:
- Monitor AIX, Solaris, Linux, and OS X servers, network, and storage devices, and handle alerts based on priority.
- Coordinate with SEs and HL for critical issues.
- Handle alerts related to disk space, CPU, swap, host down, service down, etc. (see the health-check sketch at the end of this role).
- Monitor alerts for various application teams and coordinate with them to get issues addressed.
- Manage users and access on OS X hosts.
- Implement user quotas and handle filesystem and NFS issues on OS X.
- Manage server reboot and crash issues on OS X hosts.
- Install/renew SSL certificates on OS X servers.
- Handle AutoSys administration and related issues.
- Manage and troubleshoot SAP PD2/PD3/PD6-related issues.
- Perform weekly maintenance activities for SAP PD2/PD3/PD6 involving AIX servers.
- Install and configure TSM clients on various UNIX/Linux machines.
- Manage/troubleshoot routine and scheduled backup-related issues using TSM.
- NFS, FTP, PXE, and SSH configuration and management.
- Take yearly archives on AIX and Solaris using TSM.
- Add monitoring of servers and applications.
- Coordinate with site services for various hardware installation and maintenance activities.
- Handle service requests, change requests, and other tickets using Espresso.
- Wrote shell and Python scripts to automate various repeatable tasks.
- Worked in a 24/7 environment.
Environment: Oracle Linux, AIX, Solaris, HP OpenView, TSM.
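A minimal sketch of the disk-space and swap check behind the alert handling above; the 90% threshold is a placeholder, and in practice the warnings would feed the monitoring or ticketing tooling rather than stdout.

```bash
#!/bin/bash
# Illustrative only: report filesystems above a usage threshold and low swap.
# The threshold is a placeholder value.
set -euo pipefail

THRESHOLD=90   # percent used before we warn

# Check mounted filesystems, skipping tmpfs/devtmpfs pseudo-filesystems
df -hP | awk -v limit="$THRESHOLD" '
  NR > 1 && $1 !~ /tmpfs/ {
      use = $5; sub("%", "", use)
      if (use + 0 >= limit + 0)
          printf "WARNING: %s is at %s\n", $6, $5
  }'

# Warn when less than 10% of swap is free (Swap: total used free)
free | awk '
  /^Swap:/ && $2 > 0 && ($4 / $2) < 0.10 {
      print "WARNING: swap space running low"
  }'
```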
Confidential
Linux Support Engineer
Responsibilities:
- Install the OS through virtual media/Kickstart.
- Monitor and manage various game URLs and processes representing the various game rooms.
- Create/modify user accounts and grant/modify access to game room servers.
- Create and manage filesystems and implement user quotas.
- Create logical volumes and extend existing ones as required.
- Troubleshoot issues related to file permissions/ownership.
- Monitor the performance of various game servers.
- Administer package installation and upgrades using Yum.
- Manage and troubleshoot access issues.
- Involved in quarterly upgrades/patching of servers.
- Configure NFS shares and export them to the relevant game servers (see the export sketch at the end of this role).
- Run daily backups and generate reports using NetBackup.
- Manage daily alerts related to filesystems, CPU, swap, etc.
- Set various kernel parameters as per application requirements.
- Troubleshoot day-to-day issues related to unresponsive hosts and processes.
- Coordinate with hardware vendors on hardware-related issues.
- Use the CA ticketing tool to carry out daily tasks.
- Document changes and upgrades and generate root cause analysis reports.
- Wrote shell scripts for backups, upgrades, and post-install configuration.
- Worked in a 24/7 environment.
Environment: RHEL 4.x/5.x, CentOS 4.x/5.x, Veritas NetBackup, Dell hardware.
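A minimal sketch of the NFS export workflow referenced above, assuming a RHEL/CentOS 5-era init script named nfs; the export path and client subnet are placeholders.

```bash
#!/bin/bash
# Illustrative only: export a directory over NFS to a client subnet.
# The path and subnet are placeholders.
set -euo pipefail

EXPORT_DIR=/exports/game_data
CLIENTS="192.0.2.0/24"    # placeholder game-server subnet

mkdir -p "$EXPORT_DIR"

# Append the export if it is not already listed, then re-export everything
grep -qs "^${EXPORT_DIR} " /etc/exports || \
  echo "${EXPORT_DIR} ${CLIENTS}(rw,sync,no_root_squash)" >> /etc/exports

service nfs restart     # RHEL/CentOS 5-era service name
exportfs -ra            # re-read /etc/exports
showmount -e localhost  # verify the export is visible
```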