
Sr. Hadoop Engineer Resume


Charlotte, NC

SUMMARY

  • 8 years of overall IT experience, including 4 years of comprehensive experience as an Apache Hadoop Engineer. Expertise in writing Hadoop jobs for analyzing data using Hive, Pig and Oozie.
  • Good knowledge of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, Spark and MapReduce concepts.
  • Experience in developing MapReduce programs on Hadoop for working with Big Data.
  • Experience in analyzing data using HiveQL, Pig Latin and custom MapReduce programs in Java.
  • Experience in importing and exporting data using Sqoop between HDFS and relational database systems.
  • Working experience in designing and implementing complete end-to-end Hadoop infrastructure, including Pig, Hive, Sqoop, Oozie, Spark, Flume and ZooKeeper.
  • Experience in supporting data analysts in running Pig and Hive queries.
  • Developed MapReduce programs to perform analysis.
  • Performed Importing and exporting data into HDFS and Hive using Sqoop.
  • Experience in writing shell scripts to dump sharded data from MySQL servers to HDFS.
  • Experience in designing both time driven and data driven automated workflows using Oozie.
  • Experience in setting up an InfiniBand network and building Hadoop clusters to improve MapReduce performance.
  • Experience in performance tuning Hadoop clusters by gathering and analyzing metrics from the existing infrastructure.
  • Experience in automating Hadoop installation and configuration and maintaining clusters using tools such as Puppet.
  • Experience in setting up monitoring infrastructure for Hadoop cluster using Nagios and Ganglia.
  • Experience in working with Flume to load log data from multiple sources directly into HDFS.
  • Strong debugging and problem solving skills with excellent understanding of system development methodologies, techniques and tools.
  • Good knowledge of NoSQL databases (HBase, Cassandra) and of the Hortonworks distribution.
  • Worked across the complete Software Development Life Cycle (analysis, design, development, testing, implementation and support) in different application domains, with technologies ranging from object-oriented programming to Internet programming, on Windows NT, Linux and UNIX/Solaris platforms, following RUP methodologies.
  • Familiar with RDBMS concepts; worked on Oracle 8i/9i, SQL Server 7.0 and DB2 8.x/7.x.
  • Involved in writing shell scripts and Ant scripts on UNIX for application deployments to the production region.
  • Exceptional ability to quickly master new concepts; capable of working in a group as well as independently, with excellent communication skills.

TECHNICAL SKILLS

Hadoop/Big Data: HDFS, MapReduce, Hive, Pig, Sqoop, Scala, Flume, Spark, Kafka, Hortonworks, Oozie and ZooKeeper.

NoSQL Databases: HBase, Cassandra, MongoDB

Languages: C, C++, Java, J2EE, PL/SQL, Pig Latin, HiveQL, Python, UNIX shell scripts

Java/J2EE Technologies: Applets, Swing, JDBC, JNDI, JSON, JSTL, RMI, JMS, JavaScript, JSP, Servlets, EJB, JSF, jQuery

Frameworks: MVC, Struts, Spring, Hibernate

Operating Systems: Sun Solaris, HP-UX, Red Hat Linux, Ubuntu Linux and Windows XP/Vista/7/8

Web Technologies: HTML, DHTML, XML, AJAX, WSDL, SOAP

Web/Application servers: Apache Tomcat, WebLogic, JBoss

Databases: Oracle 9i/10g/11g, DB2, SQL Server, MySQL, Teradata

PROFESSIONAL EXPERIENCE

Sr. Hadoop Engineer

Confidential - Charlotte, NC

Responsibilities:

  • Responsible for building scalable distributed data solutions using Hadoop
  • Responsible for cluster maintenance, adding and removing cluster nodes, cluster monitoring and troubleshooting, managing and reviewing data backups and Hadoop log files
  • Continuously monitored and managed the Hadoop cluster through Cloudera Manager
  • Upgraded the Hadoop cluster from CDH3 to CDH4, set up a high-availability cluster and integrated Hive with existing applications
  • Analyzed the data by performing Hive queries and running Pig scripts to understand user behavior
  • Installed Oozie workflow engine to run multiple Hive and Pig jobs
  • Handled importing of data from various data sources, performed transformations using Hive, MapReduce, loaded data into HDFS and extracted data from Teradata into HDFS using Sqoop
  • Worked extensively with Sqoop for importing metadata from Oracle
  • Experience migrating MapReduce programs into Spark transformations using Spark and Scala
  • Configured Sqoop and developed scripts to extract data from MySQL into HDFS
  • Hands-on experience productionizing Hadoop applications, including administration, configuration management, monitoring, debugging and performance tuning
  • Created HBase tables to store various data formats of PII data coming from different portfolios
  • Provided cluster coordination services through ZooKeeper
  • Used Spark Streaming to collect this data from Kafka in near real time and perform the necessary processing
  • Installed and configured Hive and wrote Hive UDFs in Java and Python; a minimal Java UDF sketch follows this list
  • Helped with the sizing and performance tuning of the Cassandra cluster
  • Involved in the process of Cassandra data modeling and building efficient data structures.
  • Trained and mentored the analyst and test teams on the Hadoop framework, HDFS, MapReduce concepts and the Hadoop ecosystem
  • Worked on installing and configuring EC2 instances on Amazon Web Services (AWS) for establishing clusters on cloud
  • Responsible for architecting Hadoop clusters
  • Wrote shell scripts and Python scripts for job automation
  • Assisted with the addition of Hadoop processing to the IT infrastructure
  • Performed data analysis using Hive and Pig
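
A minimal sketch of the kind of Hive UDF referenced above, written in Java against the classic org.apache.hadoop.hive.ql.exec.UDF API; the class name and masking rule are hypothetical and only illustrate the pattern, not the project's actual logic.

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hypothetical UDF that masks all but the last four characters of a PII field.
public class MaskPii extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null;
        }
        String value = input.toString();
        int keep = Math.min(4, value.length());
        StringBuilder masked = new StringBuilder();
        for (int i = 0; i < value.length() - keep; i++) {
            masked.append('*');
        }
        masked.append(value.substring(value.length() - keep));
        return new Text(masked.toString());
    }
}
```

After packaging the class into a JAR, it would typically be registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION before being used in a query.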

Environment: Hadoop, MapReduce, HDFS, Hive, Java, SQL, Cloudera Manager, Scala, Cassandra, Pig, Sqoop, Oozie, ZooKeeper, Teradata, PL/SQL, MySQL, Windows, Hortonworks, HBase

Hadoop/ Big data Engineer

Confidential, MD

Responsibilities:

  • Developed simple and complex MapReduce programs in Java for Data Analysis on different data formats.
  • Responsible for Installation and configuration of Hive, Pig, Sqoop, Flume and Oozie on the Hadoop cluster
  • Developed workflows using Oozie to automate the tasks of loading the data into HDFS and pre-processing with Pig
  • Implemented scripts to transmit sysprin information from Oracle to HBase using Sqoop.
  • Worked on partitioning the Hive table and running the scripts in parallel to reduce their run time.
  • Involved in the pilot of Hadoop cluster hosted on Amazon Web Services (AWS)
  • Loaded log data into HDFS using Flume and Kafka
  • Optimized MapReduce jobs to use HDFS efficiently by using various compression mechanisms
  • Analyzed the data by performing Hive queries and running Pig scripts to study the data
  • Implemented business logic by writing Pig UDFs in Java and used various UDFs from Piggybank and other sources; a minimal Java UDF sketch follows this list.
  • Continuously monitored and managed the Hadoop cluster using Ganglia.
  • Worked with application teams to install operating system, Hadoop updates, patches, version upgrades as required.
  • Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Supported setting up the QA environment and updating configurations for implementing scripts with Pig and Sqoop.
  • Implemented testing scripts to support test driven development and continuous integration.
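
A minimal sketch of a Pig UDF in Java, as mentioned above; the class name and the trimming/upper-casing rule are hypothetical stand-ins for the project's actual business logic.

```java
import java.io.IOException;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

// Hypothetical EvalFunc that normalizes a chararray field: trims whitespace and upper-cases it.
public class NormalizeField extends EvalFunc<String> {
    @Override
    public String exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0 || input.get(0) == null) {
            return null;
        }
        return input.get(0).toString().trim().toUpperCase();
    }
}
```

In a Pig script the UDF would typically be registered with REGISTER and invoked inside a FOREACH ... GENERATE statement.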

Environment: Hadoop, MapReduce, HDFS, Hive, Pig, Java, SQL, Ganglia, Amazon Web Services (AWS), Kafka, Sqoop, Flume, Oozie, Maven, Eclipse.

Hadoop and Java Developer

Confidential - Schaumburg, IL

Responsibilities:

  • Installed and configured Hadoop MapReduce and HDFS, and developed multiple MapReduce jobs in Java for data cleansing and preprocessing; a mapper sketch follows this list
  • Importing and exporting data into HDFS and Hive using Sqoop
  • Proactively monitored systems and services, architecture design and implementation of Hadoop deployment, configuration management, backup, and disaster recovery systems and procedures
  • Extracted files from CouchDB through Sqoop and placed in HDFS and processed
  • Used Flume to collect, aggregate, and store the web log data from different sources like web servers, mobile and network devices and pushed to HDFS
  • Developed Puppet scripts to install Hive, Sqoop, etc. on the nodes
  • Loaded and transformed large sets of structured, semi-structured and unstructured data
  • Supported MapReduce programs running on the cluster
  • Wrote shell scripts to monitor the health of Hadoop daemon services and respond to any warning or failure conditions
  • Involved in loading data from UNIX file system to HDFS, configuring Hive and writing Hive UDFs
  • Utilized Java and MySQL from day to day to debug and fix issues with client processes
  • Used Java and J2EE application development skills with Object-Oriented Analysis, and was extensively involved throughout the Software Development Life Cycle (SDLC)
  • Hands-on experience with Sun ONE Application Server, WebLogic Application Server, WebSphere Application Server, WebSphere Portal Server and J2EE application deployment technology
  • Gained very good business knowledge on health insurance, claim processing, fraud suspect identification, appeals process, etc.
  • Monitoring Hadoop cluster using tools like Nagios, Ganglia and Cloudera Manager
  • Wrote automation scripts to monitor HDFS and HBase through cron jobs
  • Developed a high-performance cache, making the site stable and improving its performance
  • Created a complete processing engine based on Cloudera's distribution, enhanced for performance
  • Provided administrative support for parallel computation research on a 24-node Fedora/Linux cluster
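
A minimal sketch of a data-cleansing mapper of the kind described above, written against the Hadoop new (mapreduce) API; the field count and delimiter are hypothetical, and a real job would also define a driver and, if needed, a reducer.

```java
import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Hypothetical mapper that drops empty or malformed CSV records and emits the rest unchanged.
public class CleansingMapper extends Mapper<LongWritable, Text, Text, NullWritable> {

    private static final int EXPECTED_FIELDS = 5; // assumed schema width for illustration

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String line = value.toString().trim();
        if (line.isEmpty()) {
            return; // skip blank lines
        }
        String[] fields = line.split(",", -1);
        if (fields.length == EXPECTED_FIELDS) {
            context.write(new Text(line), NullWritable.get());
        }
        // malformed records are silently dropped; a counter could track them instead
    }
}
```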

Environment: Hadoop, MapReduce, HDFS, Hive, CouchDB, Flume, Oracle 11g, Java, J2EE, Struts, Servlets, JDBC, JNDI, HTML, XML, SQL, JUnit, Maven, Tomcat 6, Eclipse.

Java Developer

Confidential

Responsibilities:

  • Actively involved in Analysis, Detail Design, Development, System Testing and User Acceptance Testing.
  • Developed an intranet web application using J2EE architecture, with JSP to design the user interfaces, JSP tag libraries to define custom tags, and JDBC for database connectivity.
  • Implemented the Struts (MVC) framework: developed the ActionServlet and ActionForm beans, configured the struts-config descriptor and implemented the Validator framework.
  • Extensively involved in database design work with Oracle Database and in building the application in a J2EE architecture.
  • Integrated messaging with MQSeries classes for JMS, which provide an XML message-based interface; the application uses the JMS publish-and-subscribe model (see the publisher sketch after this list).
  • Developed the EJB session bean that acts as a facade and accesses the business entities through their local home interfaces.
  • Evaluated and worked with EJB's Container-Managed Persistence strategy.
  • Used web services (WSDL and SOAP) to get loan information from a third party, and used SAX and DOM XML parsers for data retrieval.
  • Wrote the DTD for document-exchange XML; generated, parsed and displayed the XML in various formats using XSLT and CSS.
  • Used XPath 1.0 for selecting nodes and XQuery to extract and manipulate data from XML documents.
  • Coded, tested and deployed the web application using RAD 7.0 and WebSphere Application Server 6.0.
  • Used JavaScript for validating client-side data.
  • Wrote unit tests for the implemented bean code using JUnit.
  • Extensively worked in a UNIX environment.
  • Data is exchanged in XML format, which helps in interoperability with other software applications.
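
A minimal sketch of a JMS publisher in the publish-and-subscribe model referenced above, using the standard javax.jms API; the class name and JNDI names are hypothetical placeholders, since the actual MQSeries-backed configuration is not shown here.

```java
import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.MessageProducer;
import javax.jms.Session;
import javax.jms.TextMessage;
import javax.jms.Topic;
import javax.naming.InitialContext;

// Hypothetical publisher that sends an XML payload to a JMS topic.
public class LoanEventPublisher {

    public void publish(String xmlPayload) throws Exception {
        InitialContext ctx = new InitialContext();
        // JNDI names below are assumptions; real names come from the application server configuration.
        ConnectionFactory factory = (ConnectionFactory) ctx.lookup("jms/LoanConnectionFactory");
        Topic topic = (Topic) ctx.lookup("jms/LoanEventsTopic");

        Connection connection = factory.createConnection();
        try {
            Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
            MessageProducer producer = session.createProducer(topic);
            TextMessage message = session.createTextMessage(xmlPayload);
            producer.send(message);
        } finally {
            connection.close();
        }
    }
}
```

Subscribers would register a MessageListener on the same topic to receive the XML messages asynchronously.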

Environment: Struts 2, Rational Rose, JMS, EJB, JSP, RAD 7.0, WebSphere Application Server 6.0, XML parsers, XSL, XQuery, XPath 1.0, HTML, CSS, JavaScript, IBM MQSeries, Ant, JUnit, JDBC, Oracle, Unix, SVN.

Java/J2EE Developer

Confidential

Responsibilities:

  • Involved in various phases of the Software Development Life Cycle (SDLC), such as design, development and unit testing.
  • Developed and deployed UI layer logics of sites using JSP, XML, JavaScript, HTML/DHTML, and Ajax.
  • CSS and JavaScript were used to build rich internet pages.
  • The Agile Scrum methodology was followed for the development process.
  • Prepared design specifications for application development, covering both front end and back end, using design patterns.
  • Developed prototype test screens in HTML and JavaScript.
  • Involved in developing JSPs for client data presentation and client-side data validation within the forms.
  • Developed the application by using the Spring MVC framework.
  • Used the Collections framework to transfer objects between the different layers of the application.
  • Developed data mapping to create a communication bridge between various application interfaces using XML and XSL.
  • Used Spring IoC to inject the values for dynamic parameters.
  • Developed JUnit tests for unit-level testing.
  • Actively involved in code review and bug fixing for improving the performance.
  • Documented application for its functionality and its enhanced features.
  • Created connections through JDBC and used JDBC callable statements to invoke stored procedures, as in the sketch after this list.
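
A minimal sketch of calling a stored procedure through JDBC, as described above; the connection URL, credentials, procedure name and parameters are hypothetical, since the actual procedures are not named here.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Hypothetical DAO method that invokes an Oracle stored procedure via a CallableStatement.
public class CustomerDao {

    public void updateStatus(long customerId, String status) throws SQLException {
        // URL and credentials are placeholders for the project's actual data source.
        try (Connection connection = DriverManager.getConnection(
                "jdbc:oracle:thin:@//dbhost:1521/ORCL", "app_user", "app_password");
             CallableStatement call = connection.prepareCall("{call update_customer_status(?, ?)}")) {
            call.setLong(1, customerId);
            call.setString(2, status);
            call.execute();
        }
    }
}
```

The same pattern handles OUT parameters by registering them with registerOutParameter before execute.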

Environment: Spring MVC, Oracle 11g, J2EE, Java, JDBC, Servlets, JSP, XML, Design Patterns, CSS, HTML, JavaScript 1.2, JUnit, Apache Tomcat, SQL Server 2008.
