Hadoop Developer Resume
Charlotte, NC
PROFESSIONAL SUMMARY:
- 8+ years of IT experience in software development, big data management, data modeling, data integration, and the implementation and testing of enterprise-class systems spanning big data frameworks, advanced analytics, and Java/J2EE technologies.
- 3+ years of hands-on experience with Hadoop components and MapReduce programming for parsing and populating tables with terabytes of data.
- Extensive use of Sqoop, Flume, and Oozie for data ingestion into HDFS and the Hive warehouse.
- Hands-on performance tuning for data processing in Hive, Impala, Spark, Pig, and MapReduce, using techniques including dynamic partitioning, bucketing, and file compression.
- Experience in data processing tasks such as collecting, aggregating, and moving data from various sources using Apache Flume and Kafka.
- Expertise in ingesting data from HBase into Solr.
- Experienced in importing data from different sources using StreamSets.
- Experience with Cloudera, Hortonworks & MapR Hadoop distributions.
- Hands-on experience with Spark SQL for various business use cases.
- Used Spark SQL and Scala APIs to query and transform data residing in Hive (see the sketch following this summary).
- Used Python for Spark SQL jobs to speed up data processing.
- Replaced existing MapReduce jobs with Spark Streaming and Spark transformations for more efficient data processing.
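A minimal Scala sketch of the Spark SQL and dynamic-partitioning work summarized above; the table and column names (transactions, txn_summary, customer_id, amount, txn_date) are hypothetical stand-ins, not taken from any actual project:

```scala
import org.apache.spark.sql.SparkSession

// Sketch: query a Hive table with Spark SQL and write the aggregate back
// into a Hive table partitioned dynamically by date.
object HiveSparkSqlSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("hive-spark-sql-sketch")
      .enableHiveSupport() // requires hive-site.xml on the classpath
      .getOrCreate()

    // Enable Hive dynamic partitioning so each distinct txn_date value
    // becomes its own partition at insert time
    spark.sql("SET hive.exec.dynamic.partition=true")
    spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")

    spark.sql(
      """CREATE TABLE IF NOT EXISTS txn_summary (customer_id STRING, total_amount DOUBLE)
        |PARTITIONED BY (txn_date STRING) STORED AS PARQUET""".stripMargin)

    // Aggregate in Spark SQL; the partition column goes last in the SELECT list
    spark.sql(
      """INSERT OVERWRITE TABLE txn_summary PARTITION (txn_date)
        |SELECT customer_id, SUM(amount) AS total_amount, txn_date
        |FROM transactions
        |GROUP BY txn_date, customer_id""".stripMargin)

    spark.stop()
  }
}
```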
CORE COMPETENCIES:
- Hadoop Development & Troubleshooting
- Data Analysis
- Data Visualization & Reporting in Tableau
- Real-time Streaming using Spark
- MapReduce Programming
- Performance Tuning of Hive & Impala
- Ingesting data from HBase to Solr
- Data import using StreamSets
TECHNICAL SKILLS:
Hadoop Ecosystem: HDFS, MapReduce, YARN, Hive, Pig, Sqoop, Oozie, Flume, Spark, ZooKeeper, Solr, StreamSets.
Apache Spark: Spark, Spark SQL, Spark Streaming, Scala.
ETL Tools: Informatica with Hadoop connector, Pentaho, Alteryx
Programming & Scripting Languages: Java, C, Scala, SQL, Unix Shell Scripting, Python
Java Technologies: JQuery, JSP, Servlets.
SQL Databases: Oracle, SQL Server 2012, SQL Server 2008 R2, DB2, Teradata
NoSQL Databases: MongoDB, HBase.
Development tools: Maven, Eclipse, IntelliJ, PyCharm
PROFESSIONAL EXPERIENCE:
Confidential, Charlotte, NC
Hadoop Developer
Responsibilities:
- Created external Hive tables to move data from different sources into the Cloudera cluster.
- Tracked the data after each load and kept it updated on daily and weekly schedules.
- Performed SQL joins across Hive tables to consolidate them into a single table.
- Ingested data from Hive to HBase and from HBase to Solr using Spark.
- Ingested data into Hive from different sources, such as JDBC origins, using StreamSets and Sqoop jobs.
- Imported data from Hive to Solr using StreamSets.
- Set up near-real-time indexing into Solr as an automated, scheduled process.
- Worked on a POC to pull in third-party data, using Spark SQL to create a SchemaRDD, load it into Hive tables, and structure the data.
- Developed a Flume ETL job handling data from an HTTP source with an HDFS sink.
- Worked closely with administrators to set up Kerberos authentication.
- Worked closely with web developers on application usage and on pulling data from Solr and HBase to populate the front end.
- Used Spark SQL to load JSON data, create a SchemaRDD, load it into Hive tables, and handle the structured data (see the sketch after this position).
- Implemented Spark SQL queries in Python for faster data processing.
Environment: Hive, HDFS, HBase, Solr, StreamSets, Spark, Kafka, Scala, IntelliJ, Python, PyCharm.
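A minimal Scala sketch of the JSON-to-Hive flow described above; the input path, field names, and table name are illustrative assumptions:

```scala
import org.apache.spark.sql.SparkSession

// Sketch: load JSON into a DataFrame (the successor to the Spark 1.x
// SchemaRDD), query it with Spark SQL, and persist it into a Hive table.
object JsonToHiveSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("json-to-hive-sketch")
      .enableHiveSupport()
      .getOrCreate()

    // Load JSON from HDFS; Spark infers the schema from the documents
    val events = spark.read.json("hdfs:///data/incoming/events/*.json")

    // Handle the structured data with plain Spark SQL
    events.createOrReplaceTempView("events_raw")
    val cleaned = spark.sql(
      "SELECT event_id, event_type, event_ts FROM events_raw WHERE event_id IS NOT NULL")

    // Persist into a Hive table for downstream consumers
    cleaned.write.mode("append").saveAsTable("staging.events")

    spark.stop()
  }
}
```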
Confidential, Charlotte, NC
Hadoop Developer
Responsibilities:
- Developed data pipelines using Sqoop and Flume to store data in HDFS for further processing in Spark.
- Created Hive tables with periodic backups and wrote complex Hive/Impala queries to run on Impala.
- Implemented partitioning and bucketing in Hive and applied file formats and compression techniques as optimizations.
- Created Hive generic UDFs to process business logic that varies by policy.
- Customized the MapReduce framework at different levels, including input formats, data types, custom SerDes, and partitioners.
- Pushed data to a Windows mount location for Tableau to import for reporting.
- Continuously monitored and managed the Hadoop cluster using Cloudera Manager.
- Migrated MapReduce jobs to Spark RDD transformations and streamed data using Spark Streaming (see the sketch after this position).
- Developed Spark scripts in Scala with Spark SQL to access Hive tables for faster data processing.
- Configured build scripts for multi-module projects with Maven.
- Automated workflow scheduling using Oozie and AutoSys.
Environment: Hadoop, Cloudera, HDFS, Hive, Spark, Sqoop, Flume, Java, Scala, Shell scripting, Impala, Eclipse, Tableau, MySQL.
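A minimal Scala sketch of the MapReduce-to-Spark-Streaming migration described above, watching the HDFS directory a Flume sink lands files in; the path and record layout are illustrative assumptions:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Sketch: replace a batch MapReduce count with RDD-style transformations
// over micro-batches of newly landed files.
object StreamingMigrationSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("streaming-migration-sketch")
    val ssc = new StreamingContext(conf, Seconds(60)) // 60-second micro-batches

    // New files written by the Flume HDFS sink become a DStream of lines
    val lines = ssc.textFileStream("hdfs:///data/flume/landing")

    // The same map/reduce logic as the old MR job, as DStream transformations
    val counts = lines
      .map(line => (line.split(",")(0), 1L)) // key on the first CSV field
      .reduceByKey(_ + _)

    counts.print() // replace with a real sink (HDFS, Hive, HBase) in production

    ssc.start()
    ssc.awaitTermination()
  }
}
```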
Confidential
Java Developer
Responsibilities:
- Involved in requirements analysis, design, coding, and testing.
- Developed the application following Agile Scrum.
- Developed and implemented the MVC architectural pattern using the Struts framework, including JSP, Servlets, EJB, and Action classes.
- Performed object-oriented analysis and design using UML, including class, sequence, and state diagrams, created in Microsoft Visio.
- Wrote client-side validations using JavaScript and CSS.
- Designed and developed the UI using Struts view components, HTML, CSS, and JavaScript.
- Developed messaging components using the J2EE JMS API.
- Used Oracle as the database and Toad for query execution; wrote SQL scripts and PL/SQL code for procedures and functions.
- Designed test plans and test cases and performed overall unit testing of the system.
- Prepared documentation and helped prepare the user manual for the application.
Environment: Java, jQuery, JUnit, Servlets, Spring 2.0, WebLogic, Eclipse, JSP, Windows XP, HTML, CSS, JavaScript, XML.