Hadoop Developer Resume Plymouth, MN - Hire IT People

SUMMARY:

Six years of professional experience, including development, deployment, maintenance, and support of various projects in large organizations.
Strong experience with Big Data and Hadoop technologies with excellent knowledge of Hadoop ecosystem: Hive, Spark, Kafka, Sqoop, Pig, HBase, Oozie, and Talend.
Deep knowledge of Hadoop architecture (HDFS, YARN, MapReduce) and insight into their internal operations.
Worked with Big Data Hadoop distributions like MapR, Cloudera, and Hortonworks.
Experience in AWS cloud environment.
Hands-on experience with VPC, EC2, EMR, S3, Redshift, CloudWatch, and SNS.
Experienced with Spark; performed various transformations and actions on large datasets using RDDs.
Experienced in using Spark to improve the performance of existing Hadoop algorithms, working with SparkContext, Spark SQL, DataFrames, and Spark on YARN.
Experience in capturing data and importing it to HDFS using Kafka for semi-structured data and Sqoop for existing relational databases.
Experience importing real-time data into Hadoop using Kafka and implementing Oozie jobs for daily imports.
Analyzed large data sets using Hive queries and Pig Scripts.
Strong understanding of partitioning and bucketing concepts in Hive.
Experienced in job workflow scheduling and monitoring tools like Oozie.
Worked with Talend Open Studio data integration, Big Data integration, and data preparation tools. Designed and ran ETL jobs using Talend Open Studio.
Imported and exported data between HDFS and RDBMS using Sqoop.
Exposure to file formats such as SequenceFile, ORC, Parquet, and JSON.
Worked on NoSQL databases including HBase.
Good understanding of HDFS design, daemons, federation, and high availability (HA).
Well versed in designing and implementing MapReduce jobs using Java in Eclipse to solve real-world scaling problems.
Excellent Java development skills using J2EE, J2SE, Servlets, JSP, EJB, JDBC.
Basic knowledge of UNIX and shell scripting.
Involved in story-driven agile development methodology and actively participated in daily scrum meetings.
Excellent interpersonal and communication skills, creative, research-minded, technically competent and result-oriented with problem solving and leadership skills.

TECHNICAL SKILLS:

Hadoop/Big Data: HDFS, Spark, Scala, Kafka, MapReduce, HBase, Pig, Hive, Sqoop, Oozie, Talend.

Java & J2EE Technologies: Core Java, Servlets, JSP, JDBC, HTML, CSS.

IDEs and Tools: Eclipse, SVN, Apache Ant, Log4j, Maven, JUnit, WinSCP.

NoSQL: HBase.

DB Languages: SQL.

Application Server: Tomcat

Programming languages: C, Java, shell scripting.

Operating Systems: Linux, Windows XP, Windows 7, MS-DOS.

PROFESSIONAL EXPERIENCE:

Confidential, Plymouth, MN

Hadoop Developer

Responsibilities:

Ingested data into HDFS using Sqoop and scheduled an incremental load to HDFS.
Implemented Kafka for streaming data, then filtered and processed the data.
Developed a data pipeline using Kafka, Sqoop, and Hive to ingest transactional data into HDFS for analysis.
Developed an ingestion framework to read mainframe files and create Hive snapshot tables on EDP.
Created Hive tables based on business requirements. Wrote Hive queries and UDFs, and implemented partitioning and bucketing for efficient data access.
Created Hive tables in Parquet and ORC file formats with Snappy and Gzip compression.
Developed Spark code using Scala and Spark SQL for faster processing. Responsible for ingesting data into EDP.
Developed Oozie workflows to automate tasks.
Involved in QA, test data creation, and unit testing activities.
Involved in the design, development, and testing phases of the software development life cycle.
Utilized Agile Scrum Methodology to help manage and organize a team with regular code review sessions.

Environment: Hadoop, Spark, Scala, Kafka, YARN, Hive, Oozie, Sqoop, Hortonworks.

Confidential, Eden Prairie, MN

Hadoop Developer

Responsibilities:

Analyzed and wrote Hadoop MapReduce jobs using the Java API, Pig, and Hive.
Responsible for building scalable distributed data solutions using Hadoop.
Loaded data from the edge node into HDFS using shell scripts.
Implemented partitioning, dynamic partitioning, and bucketing in Hive.
Developed Pig scripts in Pig Latin.
Handled importing data from web logs, MySQL, and various other data sources using Sqoop.
Designed and created ETL jobs in Talend to load large volumes of data into HBase, the Hadoop ecosystem, and relational databases.
Developed a test automation framework using Talend for record count checks, duplicate checks, field-level validation, and SCD Type 2 validation.
Developed Spark and Spark SQL code to extract data from the data lake into our tenant, replicating Talend functionality.
Used Spark and Spark SQL to read Parquet data and create tables in Hive using the Scala API.
Implemented Spark jobs in Scala, using the DataFrame and Spark SQL APIs for faster data processing.
Wrote shell scripts to automate jobs.
Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
Responsible for cluster maintenance, adding and removing cluster nodes, cluster monitoring and troubleshooting, managing and reviewing data backups, and managing and reviewing Hadoop log files.
Supported setup of the QA environment and updated configurations for implementing scripts with Pig and Sqoop.

Environment: Apache Hadoop, Apache Spark, Scala, Spark SQL, MapReduce, HDFS, Hive, Java, Pig, HBase, Teradata, Talend, Linux, XML, MySQL, MySQL Workbench, Java 6, Eclipse, PL/SQL, SQL connector, MapR.

Confidential, Basking Ridge, NJ

Hadoop Developer

Responsibilities:

Analyzed data from various sources and created metafiles and control files to ingest the data into the data lake.
Configured batch jobs to ingest source files into the data lake.
Created several jobs in the Talend ETL tool to transform source files.
Used Pig to transform data in HDFS to fit the requirements.
Created several Pig UDFs for the enrichment engine, which were used to enrich the data.
Developed Hive queries to load data into HBase.
Leveraged Hive queries to create ORC tables.
Created ORC tables to improve reporting performance.
Worked extensively with Hive to create, alter, and drop tables, and wrote Hive queries.
Created and altered HBase tables on top of data residing in the data lake.
Designed and developed reference table engine frameworks on Talend using Hadoop tools such as HDFS, Hive, HBase, and MapReduce.
Experience with Talend components for transformation, file processing, Java, Unix, databases, and logging.
Worked closely with system analysts and architects to design and develop Talend jobs to fit the business requirements.
Experience in scheduling jobs in Talend.
Followed agile methodology using Rally.

Environment: Hadoop, MapReduce, YARN, Hive, Pig, HBase, Sqoop, MapR, Talend, Core Java, Eclipse, Linux.

Confidential, Burlington, MA

Hadoop Developer

Responsibilities:

Involved in migrating data from slough to AWS using ETL.
Responsible for creating Hive tables based on business requirements.
Developed simple to complex MapReduce jobs using Hive and Pig.
Worked in the AWS cloud environment.
Hands-on experience with VPC, EC2, S3, EMR, Redshift, Data Pipeline, CloudWatch, and SNS.
Demonstrated analytical and problem-solving skills, particularly those that apply to a "Big Data" environment.
Developed scripts that improved project performance by automating data management end to end, with embedded monitoring logic using CloudWatch and SNS.
Used EMR to convert raw data into derived formats and to transfer data from one server to another.
Used SQL Workbench to load and aggregate data from S3 into Redshift.
Imported and exported data into HDFS and Hive using Flume.
Tested the performance of a Tableau dashboard by measuring its response time.
Expert knowledge of developing and debugging in Java/J2EE.
Worked hands-on with ETL processes using Python and Java.
Migrated all on-premises data from Salo, Oracle, and MySQL to Amazon Redshift using Python and the Attunity tool on an Amazon EC2 instance.
Developed data pipelines to process data from the source systems directly into the Redshift database.
Wrote MapReduce jobs and integrated them with Oozie workflows for batch processing of large datasets.
Implemented partitioning, dynamic partitioning, and bucketing in Hive.
Exported the result set from Hive to MySQL using Sqoop after processing the data.
Utilized Agile Scrum Methodology to help manage and organize a team with regular code review sessions and daily stand-ups.

Environment: Hadoop, HDFS, Hue, MapReduce, Hive, Pig, Sqoop, AWS, VPC, EC2, S3, EMR, Redshift, Data Pipeline, CloudWatch, SNS, Splunk, SQL Server, MySQL, HBase, MongoDB, UNIX shell scripting.

Confidential, Roseville, CA

Hadoop Developer

Responsibilities:

Responsible for building data solutions in Hadoop using the Cascading framework.
Responsible for cluster maintenance, adding and removing cluster nodes, cluster monitoring and troubleshooting, managing and reviewing data backups, and managing and reviewing Hadoop log files.
Worked hands-on with the ETL process.
Upgraded the Hadoop cluster from CDH3 to CDH4. Integrated Hive with existing applications.
Configured Ethernet bonding for all nodes to double the network bandwidth.
Imported data from various sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from Teradata into HDFS using Sqoop.
Used Python and shell scripts to automate the end-to-end ELT process.
Analyzed data with Hive queries and Pig scripts to understand user behavior.
Continuously monitored and managed the Hadoop cluster through Cloudera Manager.
Installed Oozie workflow engine to run multiple Hive and Pig jobs.
Developed Hive queries to process the data and generate data cubes for visualization.

Environment: Hadoop, MapReduce, HDFS, Hive, Java, SQL, Teradata, Cloudera Manager, Pig, Sqoop, Oozie, Python.

Confidential, Dallas, TX

Java/J2EE Developer

Responsibilities:

Involved in designing and developing modules on both the client and server sides.
Developed the UI using JSP, JavaScript and HTML.
Responsible for validating the data at the client side using JavaScript.
Interacted with external services to get the user information using SOAP web service calls.
Developed web components using JSP, Servlets and JDBC.
Designed the controller using Servlets.
Accessed the Oracle backend database using JDBC.
Developed UNIX shell scripts to automate various tasks.
Developed user and technical documentation.

Environment: Java, Servlets, JSP, JavaScript, JDBC, Unix Shell scripting, HTML, Eclipse, Oracle 8i, WebLogic.
