Sr System Engineer Resume
SUMMARY:
- A collaborative engineering professional with substantial experience designing and executing systems for complex business problems involving large-scale data warehousing, real-time analytics, and reporting solutions.
- Known for using the right tools when and where they make sense and creating an intuitive architecture that helps organizations effectively analyze and process terabytes of structured and unstructured data.
- Design large-scale, fault-tolerant Hadoop clusters for scalability and high data throughput.
- Expertise in installation, configuration, support, and management of Hadoop clusters running Hortonworks and Cloudera distributions.
- Hadoop cluster capacity planning, performance tuning, monitoring, and troubleshooting.
- Excellent command of creating backup, recovery, and disaster recovery procedures, and of implementing backup and recovery strategies for offline and online backups.
- Experience in benchmarking and fine tuning Hadoop clusters, for optimum performance.
- Making Hadoop clusters ready for development team working on POCs.
- Extensive experience upgrading Hadoop and Hadoop ecosystem components.
- Expertise in analyzing Hadoop and Linux system log files to investigate root causes.
- Experience in commissioning, decommissioning, balancing, and managing nodes, and in tuning servers for optimal cluster performance.
- Experience in importing and exporting data with Sqoop between HDFS and relational database systems.
- Hands-on experience in provisioning and managing multi-tenant Hadoop clusters on a public cloud environment, Amazon Web Services (AWS) EC2, and on private cloud infrastructure, the OpenStack cloud platform.
- Setting up automated 24x7 monitoring and escalation infrastructure for Hadoop cluster using Nagios and Cloudera/Hortonworks alerts.
- Ability to prepare documents including Technical Design, testing strategy, and supporting documents.
- Proficient in understanding and deploying security mechanisms such as Kerberos, data encryption, and SSL.
- Expert in configuring, provisioning, and deploying new AWS EC2 virtual servers in the cloud. Managing the entire life cycle of AWS servers ranging from Micro to Large systems.
- Managing volume creation, deletion, and modification, and creating volumes from snapshots.
- Backup and snapshot management, including recovering AWS EC2 instances from snapshots.
- Identify network security rules: firewall rules and configuration for zone-wise AWS cloud servers.
- Ability to convert business requirements to cloud solutions. Build and test cloud apps for new and existing backend services to help facilitate development team migrations.
- End-to-end cloud data solutioning and data stream design, with experience in the tools of the trade: Hadoop, Storm, Hive, Pig, AWS (EMR, Redshift, S3, etc.).
PROFESSIONAL EXPERIENCE:
Confidential
Sr System Engineer
Responsibilities:
- Discuss, plan, and build Linux system architectures for web servers, application servers, and the Hadoop big data platform.
- Document and categorize all SOPs (standard operating procedures) to ensure the onshore/offshore model has maximum reliability.
- Planning, installing, configuring, maintaining, and monitoring Hadoop clusters and various components including HDFS, YARN, Hive, Impala, Flume, ZooKeeper, Oozie, etc.
- Set up high availability for the NameNode and the YARN ResourceManager.
- Integrate the HA Hadoop cluster with Confidential’s proprietary analytical tool ‘Signal Hub’.
- Extensive usage of cloud server instances and storage from Amazon AWS to enable client web applications to scale horizontally and vertically, and to provide an on-demand, scalable dev environment.
- Documentation of system design, remote monitoring/alerting of critical client services, and root cause analysis and resolution of issue(s).
- Designed and deployed an in-house Spacewalk server to serve as an internal yum server.
- Designed and implemented an authentication method that uses Active Directory domain accounts on all Linux servers via the winbind daemon. To fix the SID-to-uid/gid mapping issue, used an LDAP server that stores all the uid/gid mappings in its LDIF files.
- Tuning of various kernel parameters, including the virtual memory subsystem, using sysctl to improve system performance.
- Configure and tune LVM volumes.
- Designed and deployed web reverse proxy solutions using the mod_proxy and mod_jk modules in Apache.
- Performance monitoring using sar, iotop, iostat, top, vmstat, mpstat, glances, pmap, and iptraf.
- Crash dump analysis using the crash utility on dump files generated by kdump, diskdump, and netdump.
- Tuning of system performance through the system control variables exposed under /proc/sys.
- Filesystem tuning to reduce I/O latency by disabling mount options such as atime and by tuning bdflush.
- Designed and deployed monitoring solutions for all dev and prod Linux servers using tools such as Nagios, Sargraph, Graylog, and OpenNMS.
- Deployed the open-source inventory tool OCSInventory.
- Work with application developers to deploy new Tomcat builds and discuss possible solutions for requirement changes or upcoming project needs on Linux systems.
- Co-ordinate with enterprise backup teams to ensure failed backup jobs are taken care of and perform RCA activities.
- Perform any host-level tasks needed to add a new SAN volume on a given Linux server.
- Assist the presales team with client meetings and calls to address any system- or application-related queries.
Senior System Administrator
Confidential
Responsibilities:
- Installation and configuration of Linux servers for dev environment.
- Troubleshooting performance issues using sar, iotop, iostat, top, vmstat, mpstat, pmap, and iptraf.
- Responsible for performing emergency and scheduled failovers for a number of mission-critical applications like Confidential whenever required, so as to maintain high availability.
- Performing weekly and monthly maintenance change tasks, including patch updates.
- Configure and tune LVM volumes.
- Monitoring and managing web servers (Apache) and FTP servers.
- Meeting the SLA defined for the pre-categorized Severity 1, 2, 3, and 4 issues.
- Hosting and Attending Change meetings for corresponding Line of Businesses.
- Following SOPs (Standard Operating Procedures) to troubleshoot a variety of issues, and performing emergency failover of apps on production servers.
- Coordinating with different support groups and the respective LOB availability managers to conduct meetings for root cause analysis of Sev 1 and 2 issues.
Specialist
Confidential
Responsibilities:
- User administration: creating new accounts on Directory Server, managing group access.
- Provisioning sudo access.
- Troubleshooting performance issues and seeking help from level 3 system admins for further support.
- Package management: using RPMs and yum, compiling from source, installing Python packages, and compiling Python binaries from source.