
Hadoop Developer Resume


Charlotte, NC

PROFESSIONAL SUMMARY:

  • 8+ years of IT experience in software development, big data management, data modeling, data integration, and the implementation and testing of enterprise-class systems spanning big data frameworks, advanced analytics, and Java/J2EE technologies.
  • 3+ years of hands-on experience with Hadoop components and MapReduce programming for parsing and populating tables holding terabytes of data.
  • Extensive use of Sqoop, Flume, and Oozie for data ingestion into HDFS and the Hive warehouse.
  • Hands-on performance improvement techniques for data processing in Hive, Impala, Spark, Pig, and MapReduce, including dynamic partitioning, bucketing, and file compression (see the sketch after this list).
  • Experience collecting, aggregating, and moving data from various sources using Apache Flume and Kafka.
  • Expertise in ingesting data into Solr from HBase.
  • Experienced in importing data from different sources using StreamSets.
  • Experience with the Cloudera, Hortonworks, and MapR Hadoop distributions.
  • Hands-on experience with Spark SQL for various business use cases.
  • Used Spark SQL and Scala APIs for querying and transforming data residing in Hive.
  • Used Python for Spark SQL jobs to speed up data processing.
  • Replaced existing MapReduce jobs with Spark Streaming and Spark transformations for more efficient data processing.
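
The partitioning and bucketing techniques called out above can be illustrated with a minimal Scala sketch that drives Hive DDL through Spark SQL; the database, table, and column names are hypothetical stand-ins.

    import org.apache.spark.sql.SparkSession

    object PartitionedTableSetup {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("partitioned-table-setup")
          .enableHiveSupport() // route DDL/DML through the Hive metastore
          .getOrCreate()

        // Partitioning by load date lets queries prune whole directories;
        // bucketing by customer_id speeds up joins on that key; ORC adds
        // columnar storage with built-in compression.
        spark.sql("""
          CREATE TABLE IF NOT EXISTS warehouse.transactions (
            txn_id      STRING,
            customer_id STRING,
            amount      DOUBLE
          )
          PARTITIONED BY (load_date STRING)
          CLUSTERED BY (customer_id) INTO 32 BUCKETS
          STORED AS ORC
        """)

        // Dynamic partitioning: one INSERT fans rows out to many partitions.
        spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")
        spark.sql("""
          INSERT INTO warehouse.transactions PARTITION (load_date)
          SELECT txn_id, customer_id, amount, load_date
          FROM staging.transactions_raw
        """)
      }
    }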

CORE COMPETENCIES:

  • Hadoop Development & Troubleshooting
  • Data Analysis
  • Data Visualization & Reporting in Tableau
  • Real-time Streaming using Spark
  • MapReduce Programming
  • Performance Tuning of Hive & Impala
  • Ingesting data from HBase to Solr
  • Data import using StreamSets

TECHNICAL SKILLS:

Hadoop Ecosystem: HDFS, MapReduce, YARN, Hive, Pig, Sqoop, Oozie, Flume, Spark, ZooKeeper, Solr, StreamSets

Apache Spark: Spark, Spark SQL, Spark Streaming, Scala

ETL Tools: Informatica with Hadoop connector, Pentaho, Alteryx

Programming & Scripting Languages: Java, C, Scala, SQL, Unix Shell Scripting, Python

Web & Java Technologies: jQuery, JSP, Servlets

SQL Databases: Oracle, SQL Server 2012, SQL Server 2008 R2, DB2, Teradata

NoSQL Databases: MongoDB, HBase

Development Tools: Maven, Eclipse, IntelliJ, PyCharm

PROFESSIONAL EXPERIENCE:

Confidential, Charlotte, NC

Hadoop Developer

Responsibilities:

  • Created external Hive tables to move data from different sources into the Cloudera cluster.
  • Kept track of the data after each load, with updates on daily and weekly schedules.
  • Performed SQL joins across Hive tables to consolidate them into a single table.
  • Ingested data from Hive to HBase and from HBase to Solr using Spark (a sketch of the Hive-to-HBase step follows this role's environment listing).
  • Ingested data from JDBC sources into Hive using StreamSets and Sqoop jobs.
  • Imported data from Hive to Solr using StreamSets.
  • Set up near-real-time indexing into Solr as an automated, scheduled job.
  • Worked on a POC to pull in third-party data, using Spark SQL to create a schema RDD and load the structured data into Hive tables.
  • Developed a Flume ETL job with an HTTP source and an HDFS sink.
  • Worked closely with admins to set up Kerberos authentication.
  • Interacted closely with web developers on application usage and on pulling data from Solr and HBase into the front end.
  • Used Spark SQL to load JSON data into a schema RDD, persisted it to Hive tables, and handled structured data with Spark SQL (see the sketch after this list).
  • Implemented Spark SQL queries in Python for faster data processing.
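
A minimal sketch of the JSON-to-Hive flow above, assuming a hypothetical events feed (paths, view, and column names are illustrative). In Spark 2.x the schema RDD of older releases is the DataFrame API; the same query could equally be submitted from PySpark, per the Python bullet.

    import org.apache.spark.sql.SparkSession

    object JsonToHive {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("json-to-hive")
          .enableHiveSupport()
          .getOrCreate()

        // Spark infers a schema from the JSON documents, yielding a
        // structured DataFrame (the successor of the schema RDD).
        val events = spark.read.json("hdfs:///data/incoming/events/*.json")

        // Plain Spark SQL over the inferred schema before persisting.
        events.createOrReplaceTempView("raw_events")
        val cleaned = spark.sql(
          "SELECT event_id, event_type, ts FROM raw_events WHERE event_id IS NOT NULL")

        // Persist the result as a managed Hive table.
        cleaned.write.mode("overwrite").saveAsTable("warehouse.events")
      }
    }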

Environment: Hive, HDFS, HBase, Solr, StreamSets, Spark, Kafka, Scala, IntelliJ, Python, PyCharm.
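
One plausible shape for the Hive-to-HBase ingestion in this role, using Spark with the stock HBase TableOutputFormat; the table, column family, and columns are hypothetical.

    import org.apache.hadoop.hbase.HBaseConfiguration
    import org.apache.hadoop.hbase.client.Put
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable
    import org.apache.hadoop.hbase.mapreduce.TableOutputFormat
    import org.apache.hadoop.hbase.util.Bytes
    import org.apache.hadoop.mapreduce.Job
    import org.apache.spark.sql.SparkSession

    object HiveToHBase {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("hive-to-hbase")
          .enableHiveSupport()
          .getOrCreate()

        // Hypothetical Hive source table.
        val rows = spark.sql("SELECT id, name, city FROM warehouse.customers")

        // Standard MapReduce output wiring, reused by Spark.
        val conf = HBaseConfiguration.create()
        conf.set(TableOutputFormat.OUTPUT_TABLE, "customers")
        val job = Job.getInstance(conf)
        job.setOutputFormatClass(classOf[TableOutputFormat[ImmutableBytesWritable]])

        // One Put per row; the Hive id column becomes the HBase row key.
        rows.rdd.map { row =>
          val put = new Put(Bytes.toBytes(row.getString(0)))
          put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("name"), Bytes.toBytes(row.getString(1)))
          put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("city"), Bytes.toBytes(row.getString(2)))
          (new ImmutableBytesWritable, put)
        }.saveAsNewAPIHadoopDataset(job.getConfiguration)
      }
    }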

Confidential, Charlotte, NC

Hadoop Developer

Responsibilities:

  • Developed data pipelines using Sqoop and Flume to land data in HDFS for further processing with Spark.
  • Created Hive tables with periodic backups and wrote complex queries to run on Impala.
  • Implemented partitioning and bucketing in Hive, along with file-format and compression optimizations.
  • Created Hive generic UDFs to apply business logic that varies by policy (see the UDF sketch after this list).
  • Customized the MapReduce framework at different levels, including input formats, data types, custom SerDes, and partitioners.
  • Pushed data to a Windows mount location for Tableau to import for reporting.
  • Continuously monitored and managed the Hadoop cluster using Cloudera Manager.
  • Migrated MapReduce jobs to Spark RDD transformations and processed streaming data with Spark Streaming (a migration sketch follows this role's environment listing).
  • Developed Spark scripts in Scala with Spark SQL to access Hive tables for faster data processing.
  • Configured build scripts for multi-module projects with Maven.
  • Automated workflow scheduling using Oozie and Autosys.
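
A minimal Scala sketch of a Hive generic UDF of the kind mentioned above; the masking rule and function name stand in for the actual policy-dependent logic, which the original does not spell out.

    import org.apache.hadoop.hive.ql.exec.UDFArgumentException
    import org.apache.hadoop.hive.ql.udf.generic.GenericUDF
    import org.apache.hadoop.hive.ql.udf.generic.GenericUDF.DeferredObject
    import org.apache.hadoop.hive.serde2.objectinspector.{ObjectInspector, PrimitiveObjectInspector}
    import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory

    // Hypothetical rule: mask a policy number except for its last four characters.
    class MaskPolicyNumber extends GenericUDF {
      private var inputOI: PrimitiveObjectInspector = _

      override def initialize(arguments: Array[ObjectInspector]): ObjectInspector = {
        if (arguments.length != 1)
          throw new UDFArgumentException("mask_policy expects exactly one argument")
        inputOI = arguments(0).asInstanceOf[PrimitiveObjectInspector]
        PrimitiveObjectInspectorFactory.javaStringObjectInspector
      }

      override def evaluate(arguments: Array[DeferredObject]): AnyRef = {
        val raw = arguments(0).get()
        if (raw == null) return null
        val s = inputOI.getPrimitiveJavaObject(raw).toString
        if (s.length <= 4) s else "*" * (s.length - 4) + s.takeRight(4)
      }

      override def getDisplayString(children: Array[String]): String =
        s"mask_policy(${children.mkString(", ")})"
    }

Once packaged into a jar, such a class would be registered in Hive with ADD JAR followed by CREATE TEMPORARY FUNCTION mask_policy AS '<class name>'.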

Environment: Hadoop, Cloudera, HDFS, Hive, Spark, Sqoop, Flume, Java, Scala, Shell scripting, Impala, Eclipse, Tableau, MySQL.
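
A sketch of the MapReduce-to-Spark migration pattern from this role: the mapper's (key, 1) emission and the reducer's summation collapse into two RDD transformations. The input path and key-extraction rule are hypothetical.

    import org.apache.spark.sql.SparkSession

    object LogLevelCounts {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("log-level-counts").getOrCreate()
        val sc = spark.sparkContext

        sc.textFile("hdfs:///data/app/logs/*.log")   // input, as the old MR job read it
          .map(line => (line.split(" ")(0), 1))      // map phase: key each line by log level
          .reduceByKey(_ + _)                        // reduce phase: sum counts per key
          .saveAsTextFile("hdfs:///data/app/log-level-counts")

        spark.stop()
      }
    }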

Confidential

Java Developer

Responsibilities:

  • Involved in requirement analysis, design, coding, and testing.
  • Developed the application using the Agile Scrum methodology.
  • Developed and implemented the MVC architectural pattern using the Struts framework, including JSP, Servlets, EJB, and Action classes.
  • Performed object-oriented analysis and design using UML, producing class, sequence, and state diagrams in Microsoft Visio.
  • Wrote client-side validations in JavaScript and styled them with CSS.
  • Designed and developed the UI using Struts view components, HTML, CSS, and JavaScript.
  • Developed messaging components using the J2EE JMS API (see the sketch after this role's environment listing).
  • Used Oracle as the database and Toad for query execution; wrote SQL scripts and PL/SQL procedures and functions.
  • Involved in designing test plans and test cases and in overall unit testing of the system.
  • Prepared documentation and participated in writing the user manual for the application.

Environment: Java, jQuery, JUnit, Servlets, Spring 2.0, WebLogic, Eclipse, JSP, Windows XP, HTML, CSS, JavaScript, and XML.
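
The JMS work in this role could look like the classic javax.jms point-to-point send below, written in Scala for consistency with the other sketches; the JNDI names and message payload are hypothetical.

    import javax.jms.{Connection, ConnectionFactory, Session, TextMessage}
    import javax.naming.InitialContext

    object OrderEventSender {
      def main(args: Array[String]): Unit = {
        // JNDI lookups; the names depend on the app server configuration.
        val ctx = new InitialContext()
        val factory = ctx.lookup("jms/ConnectionFactory").asInstanceOf[ConnectionFactory]
        val queue = ctx.lookup("jms/OrderQueue").asInstanceOf[javax.jms.Queue]

        val connection: Connection = factory.createConnection()
        try {
          // Non-transacted session with automatic acknowledgement.
          val session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE)
          val producer = session.createProducer(queue)
          val message: TextMessage = session.createTextMessage("order-created:12345")
          producer.send(message)
        } finally connection.close()
      }
    }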
