Hadoop Developer Resume Hoffman Estates, IL - Hire IT People

SUMMARY

Over 3 years of hands-on experience working with Apache Hadoop ecosystem components such as MapReduce, Sqoop, Flume, Pig, Hive, HBase, Oozie, Kafka, and ZooKeeper.
Excellent understanding of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, ResourceManager, and MapReduce.
Experience in analyzing data using HiveQL and Pig Latin.
Knowledge of job workflow scheduling and monitoring tools such as Oozie.
Experience with different Hadoop distributions such as Cloudera 5.3 (CDH 4, CDH 5) and Hortonworks Data Platform (HDP).
Strong end-to-end experience in Hadoop Development.
Experience in importing and exporting data between HDFS and relational database systems using Sqoop.
Experience in data transformations using MapReduce, Hive, and Pig scripts for different file formats.
Expertise in analyzing data using Hive and writing custom UDFs in Java to extend Hive and Pig core functionality.
Hands-on experience in configuring and administering Hadoop clusters.
Good understanding of HDFS design, daemons, and HDFS high availability (HA).
Experience with scripting languages such as Linux/Unix shell and Python.
Experience in understanding and managing Hadoop Log Files.
Experience in managing the Hadoop infrastructure with Cloudera Manager.
Experienced with data warehousing and ETL processes.
Expert knowledge of data warehousing concepts, with hands-on experience developing ETL applications in a dimensional data mart/data warehouse environment.
Involved in building an MVC architecture using Java, validation files, and the Struts framework.
Participated in reviews of the test plan, test cases, and test scripts prepared by the system integration testing team.
Monitored the performance and identified performance bottlenecks in ETL code.
Strong experience in client interaction and understanding business application, business data flow and data relations.
Self-starter who adapts quickly to new software applications and products, with excellent communication skills and a good understanding of business workflow.

TECHNICAL SKILLS

Scripting Languages: Unix shell scripting, Python

Big Data Technologies: HDFS, MapReduce, Hive, HiveQL, Pig, Sqoop, Flume, Spark, ZooKeeper, Oozie, Kafka

RDBMS: MySQL, Oracle, Teradata, MSSQL

Programming Languages: Python, SQL, Java

IDEs: NetBeans, Eclipse

Tools: Maven

Virtual Machines: VMware, VirtualBox

OS: CentOS 5.5, Unix, Red Hat Linux, Windows 7, Debian, Kali

WORK EXPERIENCE

Hadoop Developer

Confidential - Hoffman Estates, IL

Responsibilities:

Implemented advanced procedures such as text analytics and processing using the in-memory computing capabilities of Apache Spark, written in Python.
Developed and executed shell scripts to automate the jobs.
Wrote complex Hive queries and UDFs.
Worked on reading multiple data formats on HDFS using PySpark.
Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs in Python.
Developed multiple POCs using PySpark, deployed them on the YARN cluster, and compared the performance of Spark with Hive and SQL/Teradata.
Analyzed the SQL scripts and designed solutions to implement them using PySpark.
Involved in loading data from the UNIX file system to HDFS.
Extracted the data from Teradata into HDFS using Sqoop.
Imported data from various data sources, performed transformations using Hive, MapReduce, and Spark, and loaded the data into HDFS.
Managed and reviewed Hadoop log files.
Involved in the analysis, design, and testing phases, and responsible for documenting technical specifications.
Used Kafka to ingest data into Hadoop.
Applied partitioning and bucketing concepts in Hive and designed both managed and external tables to optimize performance.
Worked extensively with Spark's HiveContext and SQLContext.
Experienced in running Hadoop streaming jobs to process terabytes of data.
Involved in importing real-time data into Hadoop using Kafka and implemented an Oozie job for daily imports.

Environment: Hadoop, HDFS, Hive, Python, Spark, SQL, Teradata, YARN, Sqoop, Kafka, UNIX Shell Scripting.

Hadoop Developer

Confidential - Columbus, OH

Responsibilities:

Worked with Spark SQL and Spark Streaming, reading and writing data in JSON, text, and Parquet file formats and as SchemaRDDs.
Worked extensively with Hive DDL and Hive Query Language (HQL).
Developed Pig Latin scripts for handling business transformations.
Responsible for writing Pig scripts and Hive queries for data processing.
Developed Pig Latin scripts using operators such as LOAD, STORE, DUMP, FILTER, DISTINCT, FOREACH, GENERATE, GROUP, COGROUP, ORDER, LIMIT, UNION.
Involved in implementing job workflows and scheduling for end-to-end application processing.
Wrote Spark programs in Python and ran Spark jobs on YARN.
Worked with HBase databases for non-relational data storage and retrieval on enterprise use cases.
Wrote MapReduce jobs using the Java API and Pig Latin.
Loaded the data from Teradata to HDFS using Teradata Hadoop connectors.
Issued SQL queries via Impala to process the data stored in HDFS and HBase.
Involved in developing Impala scripts for extraction, transformation, and loading of data into the data warehouse.
Used Flume to collect, aggregate and store the web log data onto HDFS.
Wrote Pig scripts to run ETL jobs on the data in HDFS.
Used Hive to do analysis on the data and identify different correlations.
Wrote numerous Pig UDFs to process complex data.
Imported and exported data between Oracle/DB2 and HDFS/Hive using Sqoop.
Configured a MySQL database to store Hive metadata.
Used Sqoop to import data from MySQL into HDFS on a regular basis.
Wrote Hive queries for data analysis to meet business requirements.
Used Oozie workflows to automate all jobs that pull data from the FTP server and load it into Hive tables.
Involved in creating Hive tables and working with them using HiveQL.
Loaded data into the cluster from dynamically generated files using Flume and from relational database management systems using Sqoop.
Utilized Agile Scrum Methodology to help manage and organize a team of 4 developers with regular code review sessions.
Held weekly meetings with technical collaborators and actively participated in code review sessions with senior and junior developers.

Environment: Hadoop, MapReduce, HDFS, Pig, Hive, Spark, YARN, Sqoop, Flume, ZooKeeper, CDH 5.4, Oozie, ETL, MySQL, Agile, Windows, UNIX Shell Scripting, Teradata.

DataStage Developer

Confidential - Columbus, OH

Responsibilities:

Designed jobs involving various cross-reference lookups and joins, and shared containers that can be reused across multiple jobs.
Created job-level sequencers that include multiple jobs, and a layer-level sequence that includes all job-level sequences.
Made extensive use of DataStage Director to validate, run, schedule, and monitor jobs, and followed the job logs carefully to debug them.
Monitored performance statistics closely and fine-tuned jobs to improve processing time.
Involved in developing UNIX scripts to call DataStage jobs.
Involved in fine-tuning, troubleshooting, bug fixing, defect analysis, and enhancement of the DataStage jobs for multiple admin systems.
Involved in designing data marts and dimension and fact tables.

Environment: DataStage 7.5, Teradata, mainframe system.

Java Developer

Confidential - Kalamazoo, MI

Responsibilities:

The application was developed in J2EE using an MVC based architecture.
Implemented the MVC design using the Struts 1.3 framework, JSP custom tag libraries, and various in-house custom tag libraries for the presentation layer.
Created tile definitions, struts-config files, validation files, and resource bundles for all modules using the Struts framework.
Wrote prepared statements and called stored procedures using callable statements in MySQL.
Executed SQL queries to verify that customer records were updated correctly.
Used Apache Tomcat as the application server for deployment.
Used Web services for transmission of large blocks of XML data over HTTP.

Environment: Java/J2EE, JSP, MySQL, Struts 1.3, Apache Tomcat, Eclipse, XML.
