Data Center Engineer Resume
SUMMARY
- Experience in software development, deployment, and maintenance of applications across various stages.
- Experience with major components of the Hadoop ecosystem, including Hadoop MapReduce, HDFS, Hive, Pig, ZooKeeper, Sqoop, Oozie, Kafka, YARN, Spark, Scala, and Avro.
- Experience applying current development approaches, including building Spark applications in Scala to compare Spark performance against Hive and SQL/Oracle.
- Hands-on experience coding MapReduce/YARN programs in Java and Scala for analyzing big data.
- Solid experience working in Agile development environments, including the Scrum methodology.
- Hands-on experience in shell scripting for automation and monitoring.
- Experience migrating data with Sqoop between HDFS and relational database systems according to client requirements.
- Good knowledge of cluster coordination services through ZooKeeper and Kafka.
- Experience with object-relational mapping and persistence mechanisms using Hibernate ORM.
- Worked with version control tools such as CVS, Git, and SVN.
- Expertise in implementing and maintaining Apache Tomcat/MySQL/PHP, LDAP, and LAMP web service environments.
- Self-starter inclined to learn new technologies; team player with strong communication, organizational, and interpersonal skills.
- Experience in all phases of Software development life cycle (SDLC).
- Experienced in installing and configuring Cloudera Manager, and in installing CDH, managed services, and the Cloudera Manager Agent software on cluster hosts.
- Expertise in setting up Hadoop security: authentication with Kerberos, data encryption with TLS/SSL, and authorization with Apache Sentry.
- Experience setting up Encryption Zones in Hadoop and working on data retention.
- Practical knowledge of the function of every Hadoop daemon, the interactions between them, their resource utilization, and dynamic tuning to keep the cluster available and efficient.
- Experience understanding Hadoop security requirements and integrating with Kerberos authentication and authorization infrastructure.
- Experience in deployment of Big Data solutions and the underlying infrastructure of Hadoop Cluster using Cloudera.
- Knowledge of SQL Server performance tuning, backup and recovery methods.
- Strong knowledge in configuring Name Node High Availability and Name Node Federation.
- Tested all available commands for Hive, Impala, HDFS, and HBase and analyzed their audit records in Cloudera Navigator; collaborated with Cloudera Support to fix auditing issues with specific commands.
- Monitored workload, job performance and capacity planning using Cloudera Manager.
- Administered, installed, upgraded, and managed Hadoop distributions (CDH4 with Cloudera Manager), Hive, and HBase.
- Hands-on experience with data extraction, transformation, and loading in Hive, Pig, and HBase.
- Experience creating DStreams from sources such as Flume and Kafka and applying Spark transformations and actions to them.
- Experience in integrating Apache Kafka with Apache Storm and created Storm data pipelines for real time processing.
- Worked on improving performance and optimizing existing algorithms in Hadoop using Spark Context, Spark SQL, DataFrames, RDDs, and Spark on YARN.
- Procedural knowledge in cleansing and analyzing data using HiveQL, Pig Latin and custom MapReduce programs in Java.
- Strong experience in RDBMS technologies like MySQL, Oracle, Postgres and DB2.
- Involved in configuring and working with Flume to load the data from multiple sources directly into HDFS.
- Experience with predictive analytics and maintenance of Spark Streaming applications using Conviva and Spark MLlib.
- Experience installing, upgrading, and configuring Red Hat Linux 4.x, 5.x, and 6.x using Kickstart servers.
- Expert in Java 1.8 lambdas, streams, and type annotations.
- Experience with MPP databases such as HP Vertica and Impala.
- Hands-on experience with SVN and GitHub.
- Hands-on experience delivering projects under both Agile and Waterfall methodologies.
- Performed analytics in Hive on various file formats such as JSON, ORC, and Parquet.
- Experience with design patterns, Java, JSP, Servlets, JavaScript, HTML, jQuery, AngularJS, XML, WebLogic, SQL, Apache Tomcat, and Linux.
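The shell-based automation and monitoring mentioned above can be illustrated with a minimal sketch; the threshold value and the echo-based alert are illustrative assumptions, not taken from any actual deployment:

```shell
#!/bin/sh
# Minimal disk-usage monitor sketch; THRESHOLD and the alert action are placeholders.
THRESHOLD=90
df -P | awk 'NR > 1 { gsub(/%/, "", $5); print $5, $6 }' |
while read -r pct mount; do
  # Flag any filesystem above the illustrative threshold.
  if [ "$pct" -gt "$THRESHOLD" ]; then
    echo "ALERT: $mount is at ${pct}% capacity"
  fi
done
```

In practice a script like this would be wired to cron and a paging or mail hook rather than a plain echo.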
TECHNICAL SKILLS
Big Data Ecosystem: Hadoop, MapReduce, Pig, Hive, YARN, Kafka, Sqoop, Oozie, Zookeeper, Spark, DB2, and Snappy.
Hadoop Distributions: Cloudera (CDH3, CDH4, and CDH5)
Languages: SQL, HTML, Scala, JavaScript, XML and C/C++
NoSQL Databases: HBase
Methodology: Agile, Waterfall
Development / Build Tools: Eclipse, Maven
DB Languages: MySQL, Oracle
RDBMS: Oracle 9i, 10g, 11g, MySQL and DB2
Operating systems: UNIX, LINUX, Mac OS and Windows Variants
PROFESSIONAL EXPERIENCE
Data Center Engineer
Confidential
Responsibilities:
- Wrote a Java API for AWS Lambda to manage several AWS services.
- Managed AWS EC2 instances using Auto Scaling, Elastic Load Balancing, and Glacier for QA environments, as well as infrastructure servers for Git and Chef.
- Designed and developed AWS CloudFormation templates to create EC2 instances, custom-sized VPCs, and subnets. Used CloudFront to deliver content from AWS edge locations to users, further reducing load on front-end servers.
- Wrote shell scripts to automate adding SSH keys and generating passwords, and integrated the scripts into Jenkins.
- Created functions and assigned roles in AWS Lambda to run Python scripts, and wrote Java-based Lambda functions for event-driven processing. Created Lambda jobs and configured roles using the AWS CLI.
- Used AWS Elastic Beanstalk to deploy and scale web applications and services developed in Python and Ruby on familiar servers such as Apache and IIS.
- Built and configured a virtual data center in the AWS cloud to support enterprise data warehouse hosting, including VPC, public and private subnets, security groups, route tables, and Elastic Load Balancer.
- Managed infrastructure resources ranging from physical machines to VMs and Docker containers.
- Configured and managed an ELK stack (Elasticsearch, Logstash, Kibana) to collect, search, and analyze log files across servers; evaluated system logs with the ELK stack.
- Configured and implemented a suite of VMware virtualization technologies and concepts.
- Performed installation, configuration, and maintenance of the VMware virtual infrastructure, including but not limited to vCenter.
- Performed advanced troubleshooting and root-cause analysis to expedite incident resolution.
- Analyzed and troubleshot VMware vSphere physical hardware and virtual machine performance.
- Migrated servers to AWS and Oracle Cloud platforms.
- Managed and cleaned up Active Directory accounts.
- Worked in a production data center environment requiring 100% uptime.
- Managed daily backups and handled VM failures.
- Managed storage arrays.
- Worked professionally with vendors and other service providers.
- Worked after hours and on weekends to meet customer maintenance windows, including projects in different time zones.
- Performed installation, documentation, and quality control for projects.
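The SSH-key and password automation described above can be sketched as follows; the account name and key directory are hypothetical placeholders, and a real version would push the results into Jenkins credentials rather than print them:

```shell
#!/bin/sh
# Sketch of the Jenkins-integrated SSH-key/password automation.
# ACCOUNT and KEYDIR are illustrative placeholders.
ACCOUNT="${1:-deploy}"
KEYDIR="${KEYDIR:-./generated-keys}"
mkdir -p "$KEYDIR"
# Generate a passphrase-less keypair for the service account.
ssh-keygen -q -t ed25519 -N "" -C "$ACCOUNT" -f "$KEYDIR/$ACCOUNT"
# Generate a random 16-character alphanumeric password.
PASSWORD=$(LC_ALL=C tr -dc 'A-Za-z0-9' < /dev/urandom | head -c 16)
printf '%s\n' "key: $KEYDIR/$ACCOUNT  password length: ${#PASSWORD}"
```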
Big Data Administrator
Confidential
Responsibilities:
- Designed and deployed Hadoop clusters and Big Data analytics tools including Pig, Hive, Oozie, Sqoop, Kafka, Spark, and Impala with the Cloudera distribution.
- Performed backup and disaster recovery of NameNode metadata and sensitive data residing on the cluster.
- Performed cluster backups using DistCp, Cloudera Manager BDR, and parallel ingestion.
- Configured Cloudera Manager Agent heartbeat intervals and timeouts.
- Monitored the Hadoop cluster through Cloudera Manager and implemented alerts based on error messages; provided cluster usage metrics reports to management.
- Expertise in setting up in-memory layers such as Spark (1.6 and 2.x) and Impala, and in maintaining them, including resolving out-of-memory issues and balancing load across daemons.
- Installed and configured Hadoop distributions including CDH 5.7 and HDP 2.2 and higher.
- Worked on Hadoop security with MIT Kerberos and Ranger with LDAP.
- Collected log data from various sources and integrated it into HDFS using Flume.
- Responsible for generating actionable insights from complex data to drive real business results for various application teams.
- Created Kafka applications that monitor consumer lag within Apache Kafka clusters.
- Implemented MapReduce counters to gather metrics on good and bad records.
- Used Kafka to transfer data from different data systems to HDFS.
- Good experience with Ab Initio for designing ETL jobs to process data.
- Experience processing large volumes of data and running processes in parallel using Ab Initio functionality.
- Configured ZooKeeper to coordinate servers in clusters and maintain data consistency.
- Worked on Agile Methodology projects extensively.
- Involved in Hadoop Cluster environment administration that includes adding and removing cluster nodes, cluster capacity planning, performance tuning, cluster monitoring, Troubleshooting.
- Experience designing and executing time driven and data driven Oozie workflows.
- Setting up Kerberos principals and testing HDFS, Hive, Pig, and MapReduce access for the new users.
- Experienced with the Spark ecosystem, using Scala and Hive queries on data formats such as text files and Parquet.
- Used the Log4j framework for logging debug, info, and error data.
- Collected log data from web servers and integrated it into HDFS using Flume.
- Experience importing data from sources such as mainframes, Teradata, Oracle, and DB2 into HDFS using Sqoop.
- Responsible for installing, configuring, supporting, and managing of Hadoop Clusters.
- Installed Cloudera Manager and CDH, installed the JCE policy file, created a Kerberos principal for the Cloudera Manager Server, and enabled Kerberos using the wizard.
- Enabled Active Directory/LDAP for Cloudera Manager, Cloudera Navigator, and Hue.
- Worked on Cloudera Hadoop upgrades and patches and on installing ecosystem products through Cloudera Manager, along with Cloudera Manager upgrades.
- Performed Hadoop cluster tasks such as adding and removing nodes without affecting running jobs or data.
- Built and delivered a Hadoop cluster in pseudo-distributed mode with the NameNode, Secondary NameNode, JobTracker, and TaskTracker running successfully, ZooKeeper installed and configured, and Apache Accumulo (a NoSQL store modeled on Google's Bigtable) stood up in a single-VM environment.
- Maintained all services in the Hadoop ecosystem using ZooKeeper.
- Configured, monitored, and optimized Flume agents to capture web logs from the VPN server into the Hadoop data lake.
- Wrote automated HBase test cases for data-quality checks using HBase command-line tools.
- Imported and exported data between HDFS and an Oracle 10.2 database using Sqoop.
- Experience in working with Hadoop clusters using Cloudera distributions.
- Worked in Agile development environment in sprint cycles of two weeks by dividing and organizing tasks. Participated in daily scrum and other design related meetings.
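The DistCp-based cluster backup mentioned above typically boils down to a single command; the namenode hosts and paths below are hypothetical, and the command is only printed here since running it requires a live cluster:

```shell
#!/bin/sh
# Sketch of a DistCp-style cross-cluster backup; SRC/DST URIs are placeholders.
SRC="hdfs://prod-nn:8020/data/warehouse"
DST="hdfs://dr-nn:8020/backups/warehouse"
# -update copies only files that changed; -p preserves permissions and timestamps.
CMD="hadoop distcp -update -p $SRC $DST"
echo "$CMD"
```

Cloudera Manager BDR schedules essentially the same copy, adding Hive metadata replication and retry handling on top.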
Environment: Hadoop, Hive, MapReduce, Sqoop, Kafka, Impala, Spark, YARN, Pig, Oozie, Shell scripting, Scala, Maven, Java, JUnit, Agile methodologies, Cloudera, Ab Initio, Teradata, MySQL.
AWS Cloud BigData Administrator
Confidential
Responsibilities:
- Worked with Red Hat OpenShift Container Platform for Docker and Kubernetes; used Kubernetes to manage containerized applications via nodes, ConfigMaps, selectors, and Services, and deployed application containers as Pods.
- Built Jenkins pipelines to push all microservice builds to the Docker registry and deploy them to Kubernetes; created and managed pods with Kubernetes. Deployed various databases and applications, including Redis and Nginx, using Kubernetes cluster management.
- Created additional Docker slave nodes for Jenkins using custom Docker images and pushed them to ECR. Exposure to Mesos, Marathon, and ZooKeeper cluster environments for application and Docker deployments.
- Worked on code deployment and orchestration with tools such as Chef and CloudFormation, with automated validation using Test Kitchen, Vagrant, Ansible, and Terraform. Built Jenkins jobs to create AWS infrastructure from GitHub repos containing Terraform code. Assisted asset managers in monitoring funding and payments in line with the Terraform power policy.
- Used Ansible to orchestrate software updates and verify functionality. Wrote Ansible playbooks, with Python SSH as the wrapper, to manage configurations of AWS nodes, and tested the playbooks on AWS instances using Python. Ran Ansible scripts to provision dev servers.
- Architected, built, and maintained highly available, secure multi-zone AWS cloud infrastructure using Chef with AWS CloudFormation; wrote Chef recipes for various applications and deployed them to AWS using Terraform.
- Deployed and maintained Chef role-based application servers, including Apache, JBoss, Nginx, and Tomcat.
- Developed Java API to interact with the Amazon SQS used in sending bulk emails.
- Implemented the Log4j API for exception handling and for logging errors, warnings, messages, stack traces, and debug output throughout the code.
- Wrote Chef Cookbooks and recipes in Ruby to provision several pre-prod environments consisting of Cassandra DB installations, WebLogic domain creations and several proprietary middleware installations.
- Installed Jenkins on a Linux server and created a master-slave configuration to run multiple parallel builds through a build farm. Managed several plugins to automate tasks such as code coverage, the AWS EC2 plugin, and job creation. Strong experience using Jenkins for enterprise-scale infrastructure configuration and application deployments.
- Developed build and deploy scripts using Ant and Maven as build tools in Jenkins to move from on-premise environments to the cloud. Managed the Maven repository using Nexus to automate the build process and share snapshots and releases of internal projects.
- Implemented CI/CD pipelines for various DEV/QA teams in Multi Family group.
- Developed Git hooks for local repository code commits and remote repository code pushes, and worked with GitHub. Configured Git with Jenkins and scheduled jobs using the Poll SCM option.
- Used Splunk for log analysis and for improving server performance; wrote several custom Splunk queries for monitoring and alerting. Used Nagios for application and hardware resource monitoring.
- Maintained Python scripts to automate the build and deployment process and created web applications using the Django framework.
- Hands-on experience with JIRA for creating bug tickets, storyboarding, pulling dashboard reports, planning sprints, and handling DCRs (Defect Change Requests) and MRs (Maintenance Requests); used Confluence for documentation.
- Worked with Bash, Python, Ruby, and Groovy scripting. Installed and configured network infrastructure using routing and switching strategies. Worked with team members on automation and release components.
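The image build-and-push step from the Jenkins pipelines above can be sketched in shell; the registry, service name, and tag are illustrative placeholders, and the actual docker commands are left commented since they need a daemon and registry credentials:

```shell
#!/bin/sh
# Sketch of a Jenkins build-and-push step; all names below are placeholders.
REGISTRY="${REGISTRY:-123456789012.dkr.ecr.us-east-1.amazonaws.com}"
SERVICE="${SERVICE:-orders-api}"
TAG="${TAG:-snapshot}"
IMAGE="$REGISTRY/$SERVICE:$TAG"
echo "image: $IMAGE"
# The real pipeline step would run (credentials and Docker daemon required):
#   docker build -t "$IMAGE" . && docker push "$IMAGE"
```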
Environment: AWS (EC2, S3, VPC, ELB, RDS, EBS, CloudFormation, CloudWatch, CloudTrail, Route 53, AMI, SQS, SNS, Lambda, CLI, CDN), Azure, Terraform, ELK, Docker, Ansible, Chef, Jenkins, Ant, Maven, Git, SVN, Jira, Bash, Shell, Perl, Python, Ruby, Tomcat, WebLogic, Auto Scaling, DNS, Nagios, RHEL.
