
Hadoop Developer Resume


Hartford, CT

PROFESSIONAL SUMMARY:

  • Overall 8 years of experience in the IT industry, including 4+ years of Big Data experience implementing end-to-end Hadoop solutions.
  • Working experience developing applications involving Big Data technologies such as MapReduce, HDFS, Hive, Sqoop, Pig, Oozie, HBase, NiFi, Spark, Scala, Kafka, and ZooKeeper.
  • Strong experience in data analytics using Hive and Pig, including writing custom UDFs.
  • Experience importing and exporting data between HDFS/Hive and Relational Database Systems (RDBMS) using Sqoop.
  • Good experience with Talend (ETL tool), developing and leading end-to-end implementations of Big Data projects, with comprehensive experience as a Hadoop Developer.
  • Good knowledge of using NiFi to automate data movement between different Hadoop systems.
  • Designed and implemented custom NiFi processors to react to and process data in the pipeline.
  • Explored various Spark modules and worked with DataFrames, RDDs, and SparkContext.
  • Experience writing MapReduce programs in Java for data cleansing and preprocessing (see the sketch after this list).
  • Excellent understanding of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, and DataNode.
  • Experience in managing and reviewing Hadoop log files.
  • Implemented Hadoop-based data warehouses and integrated Hadoop with Enterprise Data Warehouse systems.
  • Worked extensively with dimensional modeling, data migration, data cleansing, data profiling, and ETL processes for data warehouses.
  • In-depth knowledge of writing MapReduce code in Java per business requirements.
  • Expertise in Core Java, J2EE, multithreading, JDBC, and shell scripting; proficient in using Java APIs for application development.
  • Well-versed in Agile and other SDLC methodologies; able to coordinate with product owners and SMEs.
  • Strong knowledge of Software Development Life Cycle (SDLC).
  • Experienced in preparing and executing Unit Test Plan and Unit Test Cases after software development.
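The following is a minimal sketch of such a data-cleansing MapReduce job in Java. It is illustrative only: the five-field comma-delimited layout and the validation rules are hypothetical, not taken from the projects below.

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    // Map-only job: emit trimmed, well-formed records and drop the rest,
    // counting rejects so data quality can be reviewed from job counters.
    public class CleanseJob {

      public static class CleanseMapper
          extends Mapper<Object, Text, NullWritable, Text> {

        @Override
        protected void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
          String[] fields = value.toString().split(",", -1);
          // Hypothetical rule: keep rows with exactly 5 non-empty fields.
          if (fields.length != 5) {
            context.getCounter("cleanse", "bad_field_count").increment(1);
            return;
          }
          StringBuilder out = new StringBuilder();
          for (int i = 0; i < fields.length; i++) {
            String f = fields[i].trim();
            if (f.isEmpty()) {
              context.getCounter("cleanse", "empty_field").increment(1);
              return;
            }
            if (i > 0) out.append(',');
            out.append(f);
          }
          context.write(NullWritable.get(), new Text(out.toString()));
        }
      }

      public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "cleanse");
        job.setJarByClass(CleanseJob.class);
        job.setMapperClass(CleanseMapper.class);
        job.setNumReduceTasks(0); // map-only: no aggregation needed
        job.setOutputKeyClass(NullWritable.class);
        job.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }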

TECHNICAL EXPERTISE:

Big Data Ecosystems: Hadoop, MapReduce, HDFS, Hive, Pig, Sqoop, HBase, ZooKeeper, Oozie, NiFi, Spark, Scala, Kafka

Programming Languages: Core Java, Python, JSP, JDBC, PL/SQL

Scripting Languages: Shell Script, JSP & Servlets, JavaScript

Databases: Oracle 11g/10g/9i, MySQL, DB2

Tools:  Eclipse, JDeveloper, JUnit, Ant, MS Visual Studio

ETL Tools: Informatica, Talend

Methodologies:  Agile, Scrum and Waterfall

Operating Systems: Windows, UNIX, Linux

PROFESSIONAL EXPERIENCE:

Hadoop Developer

Confidential, Hartford, CT

Responsibilities:

  • Worked on analyzing the Hadoop cluster using different big data analytic tools, including Pig, Hive, and MapReduce.
  • Coordinated with business customers to gather business requirements and interacted with technical peers to derive technical requirements.
  • Developed data pipelines using Pig, Sqoop, and Java MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis.
  • Implemented File Transfer Protocol operations using Talend Studio to transfer files between network folders.
  • Developed a data pipeline using Kafka to store data in HDFS.
  • Experience loading and transforming large sets of structured, semi-structured, and unstructured data.
  • Converted SQL queries into Spark transformations using Spark RDDs and Scala, and performed map-side joins on RDDs (see the broadcast-join sketch after this list).
  • Experience performing transformations and actions on RDDs and on Spark Streaming data.
  • Proficient in developing data transformation and other analytical applications in Spark and Spark SQL using the Scala programming language.
  • Ingested structured data into appropriate schemas and tables to support rules and analytics.
  • Developed custom user-defined functions (UDFs) in Hive to transform large volumes of data per business requirements (see the UDF sketch after this list).
  • Evaluated Hortonworks NiFi (HDF 2.0) and recommended a solution to ingest data from multiple data sources into HDFS and Hive using NiFi.
  • Loaded data from different sources (databases and files) into Hive using Talend.
  • Worked with developer teams to move data into HDFS through HDF (NiFi).
  • Developed NiFi workflows to pick up multiple retail files from an FTP location and move them to HDFS on a daily basis.
  • Worked with developer teams on a NiFi workflow to pick up data from a REST API server, the data lake, and an SFTP server and send it to Kafka brokers.
  • Involved in loading data from edge nodes to HDFS using shell scripting.
  • Developed Pig scripts, Pig UDFs, Hive scripts, and Hive UDFs to load data files.
  • Managed Hadoop jobs using the Oozie workflow scheduler for MapReduce, Hive, Pig, and Sqoop actions.
  • Troubleshot, debugged, and resolved Talend-specific issues while maintaining the health and performance of the ETL environment.
  • Managed and reviewed Hadoop log files.
  • Used the Oozie workflow engine to run multiple Hive and Pig jobs.
  • Analyzed large data sets to determine the optimal way to aggregate and report on them.
  • Responsible for managing test data coming from different sources.
  • Responsible for developing batch processes using UNIX shell scripting.
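Below is a minimal sketch of the map-side join technique referenced above. The actual work used Scala; this version uses Spark's Java API for consistency with the other examples here, and the customer/transaction file layouts are hypothetical. The small dataset is collected to the driver and broadcast, so the join happens inside a map transformation with no shuffle.

    import java.util.HashMap;
    import java.util.Map;

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.broadcast.Broadcast;
    import scala.Tuple2;

    public class MapSideJoin {
      public static void main(String[] args) {
        JavaSparkContext sc = new JavaSparkContext(
            new SparkConf().setAppName("map-side-join"));

        // Small lookup side: customerId,segment (hypothetical layout).
        // Copied into a plain HashMap so the broadcast value is serializable.
        Map<String, String> segments = new HashMap<>(
            sc.textFile(args[0])
              .mapToPair(line -> {
                String[] f = line.split(",");
                return new Tuple2<>(f[0], f[1]);
              })
              .collectAsMap());
        Broadcast<Map<String, String>> lookup = sc.broadcast(segments);

        // Large side: customerId,amount records, enriched map-side, no shuffle.
        JavaRDD<String> enriched = sc.textFile(args[1]).map(line -> {
          String[] f = line.split(",");
          return line + "," + lookup.value().getOrDefault(f[0], "UNKNOWN");
        });

        enriched.saveAsTextFile(args[2]);
        sc.stop();
      }
    }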
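And a minimal sketch of a Java Hive UDF of the kind referenced above, assuming the classic org.apache.hadoop.hive.ql.exec.UDF API; the function name and normalization rule are hypothetical.

    import org.apache.hadoop.hive.ql.exec.Description;
    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Hypothetical UDF: normalize free-text codes to upper case, no whitespace.
    @Description(name = "normalize_code",
                 value = "_FUNC_(str) - trims and upper-cases a code")
    public final class NormalizeCode extends UDF {
      public Text evaluate(Text input) {
        if (input == null) return null;
        String cleaned = input.toString().trim()
            .replaceAll("\\s+", "").toUpperCase();
        return new Text(cleaned);
      }
    }

Such a UDF would be registered in Hive with ADD JAR followed by CREATE TEMPORARY FUNCTION normalize_code AS 'NormalizeCode' before use in queries.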

Environment: Hadoop, MapReduce, YARN, Hive, Pig, HBase, Oozie, Sqoop, NiFi, Spark, Scala, AWS, Talend, Oracle 11g, Core Java, Cloudera, HDFS, Eclipse, UNIX, Linux

Hadoop Developer

Confidential, Las Vegas, NV

Responsibilities:

  • Installed, configured, and maintained Apache Hadoop clusters for application development, along with Hadoop ecosystem tools like Hive, Pig, HBase, ZooKeeper, and Sqoop.
  • Installed and configured Hadoop, MapReduce, and HDFS (Hadoop Distributed File System), and developed multiple MapReduce jobs in Java for data cleaning.
  • Worked on installing the cluster, commissioning and decommissioning DataNodes, NameNode recovery, capacity planning, and slots configuration.
  • Worked extensively on Hive, Sqoop, Pig, and Python.
  • Implemented partitioning, dynamic partitions, and bucketing in Hive.
  • Scheduled cron jobs to run the shell scripts.
  • Worked on moving VSAM files from mainframes into Hadoop.
  • Developed Pig functions to convert fixed-width files to delimited files.
  • Migrated complex MapReduce programs to in-memory Spark processing using transformations and actions (see the sketch after this list).
  • Used Scala to store streaming data to HDFS and implemented Spark for faster data processing.
  • Developed shell scripts to perform incremental loads.
  • Used Hive join queries to join multiple source-system tables and load them into Elasticsearch.
  • Developed Pig Latin scripts to extract data from web server output files and load it into HDFS.
  • Involved in data migration from one cluster to another.
  • Worked on Talend for loading and extracting data from Oracle and SQL databases.
  • Managed Hadoop jobs and the logs of all the scripts.
  • Created Hive tables from source field names and data types.
  • Performed data validations and gathered requirements from the business.
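A minimal sketch of the MapReduce-to-Spark migration referenced above: a mapper/reducer aggregation rewritten as Spark transformations plus one action, shown with Spark's Java API and a hypothetical accountId,amount input layout.

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaPairRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import scala.Tuple2;

    // A mapper+reducer pair rewritten as two Spark transformations and
    // one action; the input layout (accountId,amount) is hypothetical.
    public class SumByAccount {
      public static void main(String[] args) {
        JavaSparkContext sc = new JavaSparkContext(
            new SparkConf().setAppName("sum-by-account"));

        JavaPairRDD<String, Double> totals = sc.textFile(args[0])
            // "map phase": parse each record into (key, value)
            .mapToPair(line -> {
              String[] f = line.split(",");
              return new Tuple2<>(f[0], Double.parseDouble(f[1]));
            })
            // "reduce phase": in-memory shuffle-and-sum, no reducer class
            .reduceByKey((a, b) -> a + b);

        // action: materialize the results back to HDFS
        totals.saveAsTextFile(args[1]);
        sc.stop();
      }
    }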

Environment: Hadoop, MapReduce, Talend, Hive, HDFS, Pig, Sqoop, Oozie, Flume, HBase, ZooKeeper, AWS, Oracle, Python, and UNIX.

Hadoop Developer

Confidential, Miami, FL

Responsibilities:

  • Installed, configured, and maintained Apache Hadoop clusters for application development, along with Hadoop ecosystem tools like Hive, Pig, HBase, ZooKeeper, and Sqoop.
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Responsible for cluster maintenance, adding and removing cluster nodes, cluster monitoring and troubleshooting, and managing and reviewing data backups and Hadoop log files.
  • Continuously monitored and managed the Hadoop cluster through Cloudera Manager.
  • Upgraded the Hadoop cluster from CDH3 to CDH4, set up a high-availability cluster, and integrated Hive with existing applications.
  • Analyzed the data by running Hive queries and Pig scripts to understand user behavior.
  • Installed the Oozie workflow engine to run multiple Hive and Pig jobs.
  • Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from Teradata into HDFS using Sqoop.
  • Worked extensively with Sqoop for importing metadata from Oracle.
  • Migrated MapReduce programs into Spark transformations using Spark and Scala.
  • Configured Sqoop and developed scripts to extract data from MySQL into HDFS.
  • Hands-on experience productionizing Hadoop applications: administration, configuration management, monitoring, debugging, and performance tuning.
  • Created HBase tables to store various data formats of PII data coming from different portfolios.
  • Provided cluster coordination services through ZooKeeper.
  • Pushed data as delimited files into HDFS using Talend Big Data Studio.
  • Used Spark Streaming to collect this data from Kafka in near real time and perform the necessary transformations (see the streaming sketch after this list).
  • Installed and configured Hive and wrote Hive UDFs in Java and Python.
  • Helped with sizing and performance tuning of the Cassandra cluster.
  • Involved in Cassandra data modeling and building efficient data structures.
  • Trained and mentored analyst and test teams on the Hadoop framework, HDFS, MapReduce concepts, and the Hadoop ecosystem.
  • Worked on installing and configuring EC2 instances on Amazon Web Services (AWS) to establish clusters in the cloud.
  • Responsible for architecting Hadoop clusters.
  • Wrote shell and Python scripts for job automation.
  • Assisted with adding Hadoop processing to the IT infrastructure.
  • Performed data analysis using Hive and Pig.
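A minimal sketch of the Spark Streaming ingestion referenced above, assuming the spark-streaming-kafka-0-10 integration; the broker, topic, consumer group, and output path are all hypothetical.

    import java.util.Arrays;
    import java.util.HashMap;
    import java.util.Map;

    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.common.serialization.StringDeserializer;
    import org.apache.spark.SparkConf;
    import org.apache.spark.streaming.Durations;
    import org.apache.spark.streaming.api.java.JavaInputDStream;
    import org.apache.spark.streaming.api.java.JavaStreamingContext;
    import org.apache.spark.streaming.kafka010.ConsumerStrategies;
    import org.apache.spark.streaming.kafka010.KafkaUtils;
    import org.apache.spark.streaming.kafka010.LocationStrategies;

    public class KafkaToHdfs {
      public static void main(String[] args) throws InterruptedException {
        JavaStreamingContext jssc = new JavaStreamingContext(
            new SparkConf().setAppName("kafka-to-hdfs"),
            Durations.seconds(30));

        Map<String, Object> kafkaParams = new HashMap<>();
        kafkaParams.put("bootstrap.servers", "broker1:9092"); // hypothetical
        kafkaParams.put("key.deserializer", StringDeserializer.class);
        kafkaParams.put("value.deserializer", StringDeserializer.class);
        kafkaParams.put("group.id", "hdfs-ingest");           // hypothetical
        kafkaParams.put("auto.offset.reset", "latest");

        JavaInputDStream<ConsumerRecord<String, String>> stream =
            KafkaUtils.createDirectStream(
                jssc,
                LocationStrategies.PreferConsistent(),
                ConsumerStrategies.<String, String>Subscribe(
                    Arrays.asList("events"), kafkaParams)); // hypothetical topic

        // Land each micro-batch in HDFS under a per-batch directory.
        stream.map(ConsumerRecord::value)
              .foreachRDD((rdd, time) -> rdd.saveAsTextFile(
                  "hdfs:///data/raw/events/" + time.milliseconds()));

        jssc.start();
        jssc.awaitTermination();
      }
    }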

Environment: Hadoop, MapReduce, HDFS, Hive, Java, SQL, Cloudera Manager, Scala, Cassandra, Pig, Sqoop, Oozie, ZooKeeper, Teradata, PL/SQL, MySQL, Windows, Hortonworks, HBase

Java/ J2EE Developer

Confidential

Responsibilities:

  • Created use case diagrams, sequence diagrams, functional specifications, and user interface diagrams using StarUML.
  • Developed an understanding of the business functionality and the business rules to be followed during execution of the projects.
  • Involved in creating sequence diagrams using UML, and in coding.
  • Involved in the design, development, testing, and deployment phases.
  • Used Spring IoC and AOP modules to integrate with the application.
  • Implemented Maven build scripts to build and deploy the application, and wrote shell scripts for system automation.
  • Designed and developed several Servlets, JSPs, and Java classes for the presentation layer (see the servlet sketch after this list).
  • Used Oracle 10g as the backend database; created stored procedures and prepared and maintained the scripts for each custom service.
  • Used Hibernate as the persistence layer by mapping classes to tables.
  • Involved in writing propagation scripts to move content from one environment to another.
  • Created web pages using AJAX with jQuery, Dojo, and Ext JS for widgets and event handling.
  • Used MVC architecture and developed code using Spring and JSP for the view.
  • Used NetUI tags for better UI (user interface) implementation.
  • Used JSTL to remove scriptlets from JSPs.
  • Created tables, relationships, triggers, and indexes to enforce business rules.
  • Communicated with other internal applications via JMS messages, EJBs, and web services.
  • Implemented client-side validations using JavaScript and developed front-end code with AJAX, HTML, and CSS.
  • Used Struts validation for server-side validation and Struts internationalization.
  • Optimized query performance by modifying T-SQL queries, removing unnecessary columns, and eliminating redundant and inconsistent data.
  • Tested the code using JUnit test scripts and supported system testing.
  • Used SVN for version repository maintenance.
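A minimal sketch of the servlet-based presentation-layer pattern referenced above; the servlet name, request parameter, and JSP path are hypothetical.

    import java.io.IOException;

    import javax.servlet.ServletException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    // Hypothetical controller-style servlet: fetch data, stash it in the
    // request scope, and forward to a JSP for rendering (no scriptlets).
    public class AccountSummaryServlet extends HttpServlet {
      @Override
      protected void doGet(HttpServletRequest req, HttpServletResponse resp)
          throws ServletException, IOException {
        String accountId = req.getParameter("accountId");
        // In the real application this would call a service/DAO layer.
        req.setAttribute("accountId", accountId);
        req.getRequestDispatcher("/WEB-INF/jsp/accountSummary.jsp")
           .forward(req, resp);
      }
    }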

Environment: Java, J2EE, Servlets, JSP, Struts 2.0, Spring 2.0, Hibernate 3.0, JavaScript, Linux, AJAX, Beehive NetUI, SQL, Maven, SiteMinder, SOAP, XML, UML, SVN, Oracle 10g, Eclipse 3.3, Windows.

Java/ J2EE Developer

Confidential

Responsibilities:

  • Interacting with the client on a regular basis to gather requirements.
  • Gathered project requirements from Business users.
  • Used Struts framework to develop action classes and form beans.
  • Used the Spring framework for dependency injection, security features, and application development.
  • Used SOAP in Web Services for data communications.
  • Used JSP, JSTL to develop web module.
  • Designed and implemented design patterns like Singleton, Factory, Session Façade, and DAO (see the DAO sketch after this list).
  • Used PL/SQL for storing, managing and distributing data.
  • AJAX was used to exchange small amounts of data with the server so that the entire web page does not have to be reloaded each time the user requests a change.
  • Configured the database through XML using Hibernate.
  • Created build/deploy scripts using Ant for applications hosted on WebSphere.
  • Followed the client development standards and methodologies.
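A minimal sketch of the DAO pattern referenced above, with a hypothetical entity and a JDBC-backed implementation (shown with try-with-resources for brevity).

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;

    import javax.sql.DataSource;

    // DAO pattern: callers depend on the interface, not on JDBC details.
    interface CustomerDao {
      String findNameById(long id) throws SQLException;
    }

    class JdbcCustomerDao implements CustomerDao {
      private final DataSource dataSource;

      JdbcCustomerDao(DataSource dataSource) {
        this.dataSource = dataSource;
      }

      @Override
      public String findNameById(long id) throws SQLException {
        // Table and column names are hypothetical.
        String sql = "SELECT name FROM customers WHERE id = ?";
        try (Connection con = dataSource.getConnection();
             PreparedStatement ps = con.prepareStatement(sql)) {
          ps.setLong(1, id);
          try (ResultSet rs = ps.executeQuery()) {
            return rs.next() ? rs.getString("name") : null;
          }
        }
      }
    }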

Environment: Java 5, J2EE, Core Java, Eclipse 3, DB2, Web services, Spring 2.0, JSP, Servlets, Struts, SOAP, design patterns, Hibernate, JavaScript, XML, HTML, XSL, XSLT, JDBC, JUnit, AJAX, PL/SQL, UML, UNIX.
