Hadoop Developer Resume
Deerfield, IL
SUMMARY
- Overall 6 years of IT experience across a variety of industries, including hands-on experience as a Hadoop developer.
- Expertise with tools in the Hadoop ecosystem, including Pig, Hive, HDFS, MapReduce, Sqoop, Flume, Spark, HBase, YARN, Oozie, and ZooKeeper.
- Excellent knowledge of Hadoop ecosystem components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and the MapReduce programming paradigm.
- Experience designing and developing applications in Spark using Scala and comparing the performance of Spark with Hive and SQL/Oracle.
- Strong experience writing applications using Python, Scala, and MySQL.
- Experience in manipulating/analyzing large datasets and finding patterns and insights within structured and unstructured data.
- Strong experience with Hadoop distributions such as Cloudera, MapR, and Hortonworks.
- Good understanding of NoSQL databases and hands-on experience writing applications on NoSQL databases such as HBase.
- Experienced in writing complex MapReduce programs that work with different file formats such as Text, SequenceFile, XML, Parquet, and Avro.
- Experience using the Oozie workflow scheduler to manage Hadoop jobs as Directed Acyclic Graphs (DAGs) of actions with control flows.
- Experience migrating data between HDFS and relational database systems using Sqoop.
- Extensive experience importing and exporting data using streaming data-ingestion tools such as Flume.
- Very good experience in the complete project life cycle (design, development, testing, and implementation) of client-server and web applications.
- Excellent Java development skills using J2EE, J2SE, and web services.
- Strong experience in Object-Oriented Design, Analysis, Development, Testing and Maintenance.
- Excellent implementation knowledge of enterprise, web, and client-server applications using Java and J2EE.
- Worked in large and small teams on systems requirements, design, and development.
- Prepared standard coding guidelines and analysis and testing documentation.
- Experience working with Hadoop in standalone, pseudo-distributed, and fully distributed modes.
- Good knowledge of cloud computing with Amazon Web Services, such as EC2 and S3, which provide fast and efficient processing of big data.
TECHNICAL SKILLS
Big Data/Hadoop: HDFS, MapReduce, ZooKeeper, Hive, Pig, Sqoop, Flume, Oozie, Spark, HBase, and Apache Kafka
Cloud Computing: Amazon Web Services.
Languages: Java/J2EE, Python, Scala, and MySQL
Databases: Oracle (SQL & PL/SQL), MySQL, HBase.
IDE: Eclipse
XML Related and Others: XML, DTD, XSD, XSLT, JAXB, JAXP, CSS, AJAX, JavaScript.
PROFESSIONAL EXPERIENCE
Confidential, Deerfield, IL
Hadoop Developer
Responsibilities:
- Analyzed data in the Hadoop cluster using different big data analytic tools, including Pig, Hive, and MapReduce.
- Managed the fully distributed Hadoop cluster as an additional assigned responsibility.
- Trained to take over the responsibilities of a Hadoop administrator, including managing the cluster and handling upgrades and installation of tools that use the Hadoop ecosystem.
- Installed and configured ZooKeeper to coordinate and monitor cluster resources.
- Implemented test scripts to support test driven development and continuous integration.
- Worked on POCs with Apache Spark using Scala to introduce Spark into the project.
- Consumed data from Kafka using Apache Spark (see the PySpark sketch after this list).
- Loaded and transformed large sets of structured, semi-structured, and unstructured data.
- Involved in loading data from the Linux file system into HDFS.
- Imported and exported data into HDFS and Hive using Sqoop.
- Implemented partitioning, dynamic partitions, and bucketing in Hive.
- Provided daily production support to monitor and troubleshoot Hadoop/Hive jobs.
- Created HBase tables to load large sets of semi-structured data coming from various sources.
- Extended Hive and Pig core functionality with custom User Defined Functions (UDFs), User Defined Table-Generating Functions (UDTFs), and User Defined Aggregate Functions (UDAFs) written in Python (see the Hive TRANSFORM sketch after this list).
- Ran Hadoop Streaming jobs to process terabytes of XML-format data.
- Scheduled the Oozie workflow engine to run multiple Hive and Pig jobs.
- Developed Oozie workflows to automate loading data into HDFS and pre-processing it with Pig.
- Responsible for loading data files from various external sources, such as MySQL, into the staging area in MySQL databases.
- Executed Hive queries on Parquet tables stored in Hive to perform data analysis meeting the business requirements.
- Actively involved in code reviews and bug fixing to improve performance.
- Handled data manipulation using Python scripts.
- Involved in developing, building, testing, and deploying applications to the Hadoop cluster in distributed mode.
- Created Linux shell scripts to automate the daily ingestion of IVR data.
- Created HBase tables to store incoming data in various formats from different portfolios.
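For illustration, a minimal PySpark sketch of the Kafka consumption described in the responsibilities above (the project implementation used Scala; the broker address, topic name, and HDFS paths below are hypothetical placeholders):

```python
# Minimal sketch: consume a Kafka topic with Spark Structured Streaming and land it on HDFS.
# Broker, topic, and paths are placeholders, not values from this project.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("kafka-consumer-sketch")
         .getOrCreate())

# Subscribe to a Kafka topic; requires the spark-sql-kafka connector package.
events = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker-1:9092")  # placeholder broker
          .option("subscribe", "ivr_events")                   # placeholder topic
          .option("startingOffsets", "latest")
          .load())

# Kafka delivers key/value as binary; cast the value to a string for downstream parsing.
messages = events.selectExpr("CAST(value AS STRING) AS value")

# Write the raw messages to HDFS; a checkpoint location is required for streaming sinks.
query = (messages.writeStream
         .format("parquet")
         .option("path", "hdfs:///data/raw/ivr_events")              # placeholder path
         .option("checkpointLocation", "hdfs:///checkpoints/ivr_events")
         .start())

query.awaitTermination()
```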
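Likewise, a minimal sketch of one common way to plug Python logic into Hive: a streaming script invoked through TRANSFORM. The script name and column layout (call_id, raw_xml) are hypothetical:

```python
#!/usr/bin/env python
# clean_payload.py - minimal sketch of a Python streaming script used from Hive via TRANSFORM.
# Hive pipes the selected columns to stdin as tab-separated lines; transformed rows are
# written to stdout. The column layout here is a hypothetical example.
import sys

for line in sys.stdin:
    call_id, raw_payload = line.rstrip("\n").split("\t", 1)
    cleaned = raw_payload.strip().lower()  # example normalization step
    print("\t".join([call_id, cleaned, str(len(cleaned))]))
```

On the Hive side such a script would be wired in roughly as `ADD FILE clean_payload.py; SELECT TRANSFORM(call_id, raw_xml) USING 'python clean_payload.py' AS (call_id, payload, payload_len) FROM calls;` (table and column names are likewise hypothetical).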
Environment: Hadoop, HDFS, Pig, Apache Hive, Sqoop, Apache Spark, Shell Scripting, HBase, Python, Zookeeper, MySQL.
Confidential, TX
Hadoop Developer
Responsibilities:
- Worked with application teams to install operating system and Hadoop updates, patches, and version upgrades as required.
- Configured Sqoop jobs to import data from RDBMS into HDFS using Oozie workflows.
- Involved in creating Hive internal and external tables, loading data, and writing Hive queries that run internally as MapReduce jobs.
- Created batch analysis job prototypes using Hadoop, Pig, Oozie and Hive.
- Assisted with data capacity planning and node forecasting.
- Integrated Oozie with the rest of the Hadoop stack supporting several types of Hadoop jobs out of the box (such as Map-Reduce, Pig, Hive, and Sqoop) as well as system specific jobs (such as Java programs and shell scripts).
- Involved in analyzing system failures, identifying root causes, and recommending courses of action.
- Documented system processes and procedures for future reference.
- Monitored workload, job performance and capacity planning using Cloudera Manager.
- Performed CRUD operations in HBase (see the sketch after this list).
- Developed Hive queries to process the data.
- Handled monitoring, performance tuning, job-performance screening, and capacity planning for Hadoop clusters; monitored Hadoop cluster connectivity and security; managed and reviewed Hadoop log files.
- Loaded and transformed large sets of structured, semi-structured, and unstructured data.
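For illustration, a minimal sketch of the HBase CRUD operations mentioned above. The resume does not name a client, so this assumes the Python happybase library talking to the HBase Thrift gateway; the host, table, and column-family names are placeholders:

```python
# Sketch of basic HBase CRUD using the happybase client over the Thrift gateway.
# Host, table name, and column family are placeholder assumptions, not project details.
import happybase

connection = happybase.Connection("hbase-thrift-host")   # placeholder Thrift host
table = connection.table("customer_events")              # placeholder table

# Create / update: put a row keyed by customer id.
table.put(b"cust-0001", {b"cf:status": b"active", b"cf:region": b"IL"})

# Read: fetch a single row, or scan a key range by prefix.
row = table.row(b"cust-0001")
for key, data in table.scan(row_prefix=b"cust-"):
    print(key, data)

# Delete: remove the row (specific columns can be targeted with the `columns` argument).
table.delete(b"cust-0001")

connection.close()
```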
Environment: Hadoop, MapReduce, HDFS, Hive, Java, SQL, Cloudera Manager, Pig, Sqoop, Oozie, HBase, Linux, Cluster Management
Confidential
Hadoop Engineer
Responsibilities:
- Responsible for analyzing large data sets and deriving customer usage patterns by developing new MapReduce programs.
- Wrote MapReduce code to parse data from various sources and store the parsed data in HBase and Hive.
- Worked on creating combiners, partitioners, and distributed cache to improve the performance of MapReduce jobs.
- Developed shell scripts to perform data profiling on the ingested data with the help of Hive bucketing.
- Responsible for debugging and optimizing Hive scripts and implementing de-duplication logic in Hive using a rank function (see the sketch after this list).
- Wrote Hive validation scripts used in the validation framework for daily analysis through graphs presented to business users.
- Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
- Developed workflows in Oozie to automate the tasks of loading data into HDFS and pre-processing it with Pig and Hive.
- Imported customer-specific personal data into Hadoop using Sqoop from various relational databases such as Netezza and Oracle.
- Used Impala to read, write and query the Hadoop data in HDFS and HBase.
- Worked with BI teams to generate reports and design ETL workflows in Tableau.
- Developed testing scripts in Python, prepared test procedures, analyzed test result data, and suggested improvements to the system and software.
- Streamed log data using Flume and performed data analytics using Hive.
- Extracted data from RDBMSs (Oracle, MySQL, and Teradata) into HDFS using Sqoop.
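For illustration, a minimal sketch of the Hive de-duplication pattern mentioned above, keeping the most recent record per business key with a rank-style window function. The PyHive client and the table/column names are assumptions for illustration only:

```python
# Sketch of de-duplication in Hive: keep the latest record per key using ROW_NUMBER().
# The PyHive client, host, and table/column names (customer_stage, customer_clean,
# customer_id, load_ts) are placeholder assumptions, not project details.
from pyhive import hive

DEDUP_QUERY = """
INSERT OVERWRITE TABLE customer_clean
SELECT customer_id, name, load_ts
FROM (
    SELECT customer_id, name, load_ts,
           ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY load_ts DESC) AS rnk
    FROM customer_stage
) ranked
WHERE rnk = 1
"""

conn = hive.connect(host="hive-server-host")  # placeholder HiveServer2 host
cursor = conn.cursor()
cursor.execute(DEDUP_QUERY)
cursor.close()
conn.close()
```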
Environment: Hadoop, MapReduce, HDFS, Pig, HiveQL, HBase, ZooKeeper, Oozie, Flume, Impala, Cloudera, MySQL, UNIX Shell Scripting, Tableau, Python, Spark.