
Hadoop Developer Resume


Richardson, TX

SUMMARY

  • IT professional with over 7 years of experience implementing various technologies across all phases of the software development life cycle, delivering cutting-edge solutions to clients
  • Worked extensively as a Hadoop Developer for 3 years, implementing Big Data applications using components such as Pig, Hive, MapReduce, Spark, and Spark SQL
  • More than 4 years of experience in enterprise application development (back end and front end) with Java/J2EE technologies
  • Experience working in diverse domains such as Healthcare, Telecom, and Banking
  • Experience working on the Cloudera, Hortonworks, and MapR Hadoop distributions
  • Expertise in developing custom functionality to meet business needs by creating UDFs in Pig and Hive (a Hive UDF sketch follows this list)
  • Strong knowledge of Hadoop architecture and its daemons, such as NameNode, DataNode, Secondary NameNode, JobTracker, TaskTracker, and YARN
  • Expertise in using Talend integrated with the Cloudera distribution of Hadoop for ETL and data lake workloads
  • Worked on Spark to improve the performance of existing Hadoop algorithms, using SparkContext, DataFrames, RDDs, and Spark on YARN
  • Experience in developing Pig scripts for data flow and transformation activities
  • Expertise in creating managed tables, external tables, and views in Hive, and in analyzing large datasets using HiveQL
  • Stored data in tabular formats using Hive tables and Hive SerDes
  • Working knowledge of NoSQL databases such as HBase, MongoDB, and Cassandra
  • Implemented Sqoop to load data from RDBMS sources (SQL Server, Oracle) into HDFS
  • Ingested data into HDFS and the Big Data lake from various streaming systems using Spark Streaming and Flume
  • Good knowledge of Oozie and ZooKeeper for workflow scheduling and monitoring on the cluster
  • Computed indexed views for data exploration using Apache Solr
  • Experience implementing a distributed messaging queue integrated with Cassandra using Apache Kafka and ZooKeeper
  • Expert in designing and creating data ingest pipelines using technologies such as Spring Integration and Apache Storm with Kafka
  • Working knowledge of Python and shell scripting
  • Implemented combiners in MapReduce to cut shuffle volume and tune job performance
  • Expertise in optimizing Hive tables using techniques such as static partitioning, dynamic partitioning, and bucketing
  • Expertise in debugging and resolving issues in batches and scripts on a Hadoop cluster
  • Worked with SequenceFile, RCFile, and ORC formats for loading data into HDFS, and with the Parquet format for Spark SQL
  • Hands-on development experience with RDBMSs, including writing SQL queries, PL/SQL, views, stored procedures, triggers, and cursors
  • Used frameworks and technologies such as Spring, Hibernate, Servlets, JSP, EJB, JMS, JDBC, SOA, and web services (SOAP, REST)
  • Experience using the Eclipse, NetBeans, IDLE, and Anaconda IDEs for Java and Python programming
  • Worked extensively with, and implemented best practices in, software development life cycles such as Waterfall and Agile Scrum
  • Good knowledge of application design using Unified Modeling Language (UML): sequence diagrams, class diagrams, data flow diagrams, and entity relationship diagrams
  • Excellent communication and interpersonal skills, along with strong motivation for optimal project delivery
  • Strong problem-solving skills, with a notable ability to organize and prioritize tasks; a recognized team player
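
To illustrate the Hive UDF work referenced above, here is a minimal sketch in Java; the package, class, and function names are placeholders rather than artifacts of any listed project:

    package com.example.hive.udf; // placeholder package name

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Upper-cases a string column. Registered in Hive with, e.g.:
    //   ADD JAR udfs.jar;
    //   CREATE TEMPORARY FUNCTION to_upper AS 'com.example.hive.udf.ToUpper';
    public final class ToUpper extends UDF {
        public Text evaluate(final Text input) {
            if (input == null) return null;
            return new Text(input.toString().toUpperCase());
        }
    }

Hive locates the evaluate method by reflection, so overloads for other column types can live in the same class.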

TECHNICAL SKILLS

Hadoop Ecosystem: Hadoop 1.x/2.x (YARN), HDFS, MapReduce, Hive, Pig, Sqoop, Flume, Kafka, Spark, Storm, ZooKeeper, Oozie, Tez, Impala, Mahout, Solr

Programming Languages: Java, Python, PL/SQL, Pig Latin, HiveQL, Spark SQL, C

Hadoop Distributions: Cloudera, Hortonworks, MapR

IDE: Eclipse, NetBeans, IDLE, SQL Developer, IntelliJ IDEA, Anaconda

Relational Databases: SQL Server, MySQL, Oracle 10g/11g

NoSQL Databases: HBase, MongoDB, Cassandra

Web Technologies: HTML5, CSS3, JavaScript, jQuery, AJAX, Servlets, JSP, JSON, XML, JSF

Application Frameworks: Hibernate, Spring, Struts, JMS, EJB, JUnit, MRUnit

Web Services: SOAP, REST, WSDL, JAXB, and JAXP

Application Servers: Tomcat, WebLogic, WebSphere

Scripting Languages: Python, Shell

Visualization Tools: Tableau, R, MS Excel

Methodologies: Waterfall, Agile

Operating Systems: Microsoft Windows, Linux, UNIX, Ubuntu

PROFESSIONAL EXPERIENCE

Confidential - Richardson, TX

Hadoop Developer

Environment: Cloudera distribution of Hadoop, HDFS, MapReduce, Hive, Pig, HBase, Sqoop, Flume, Cassandra, Storm, Solr, Scala, Spark, Oozie, Kafka, Linux, Java (JDK), Tableau, Eclipse, MySQL

Responsibilities:

  • Responsible for managing data coming from different sources; involved in HDFS maintenance and in loading structured and unstructured data
  • Imported and exported data between RDBMS systems and Hive using Sqoop
  • Partitioned Hive tables with both static and dynamic partitions as needed
  • Optimized Hive analytics queries to improve job performance
  • Created and ran Sqoop jobs with incremental loads to populate Hive external tables
  • Developed Pig scripts in areas where extensive hand-written code could be reduced
  • Loaded and transformed large structured, semi-structured, and unstructured datasets into HDFS
  • Wrote Pig scripts to transform raw data from several data sources into baseline data
  • Used Flume to collect, aggregate, and store log data from different web servers
  • Created HBase tables to store data arriving in variable formats from different portfolios
  • Developed MapReduce programs to load data from system-generated log files into HBase
  • Analyzed data by running Hive queries (HiveQL) and Pig scripts (Pig Latin) to study customer behavior for targeted advertising
  • Applied a strong understanding of Hive partitioning and bucketing, designing both managed and external tables to optimize performance
  • Worked on migrating MapReduce programs to Spark transformations using Spark and Scala
  • Designed the technical solution for real-time analytics on streaming data using Kafka and Spark Streaming (see the sketch after this list)
  • Helped the team grow the cluster from 20 to 45 nodes, with configuration of the additional DataNodes managed through Puppet
  • Solved performance issues in Hive and Pig scripts by understanding how joins, grouping, and aggregation translate into MapReduce jobs
  • Designed a conceptual model with Spark for performance optimization
  • Developed Oozie workflows for scheduling and orchestrating the ETL process
  • Developed MapReduce programs to parse raw data and store the refined data in tables
  • Analyzed data with Hive, Pig, and Hadoop Streaming
  • Created the Cassandra data model from the existing Oracle data model
  • Used CQL to run queries against data persisted in the Cassandra cluster
  • Used the Hive data warehouse tool and developed Hive queries to analyze data migrated to HDFS
  • Used Tableau for visualization and report generation
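
A minimal sketch of the kind of Kafka-to-Spark Streaming pipeline described above, using the Spark 1.x Java API with the direct Kafka connector; the broker address, topic name, and batch interval are placeholder values:

    import java.util.Collections;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.Set;

    import kafka.serializer.StringDecoder;
    import org.apache.spark.SparkConf;
    import org.apache.spark.streaming.Durations;
    import org.apache.spark.streaming.api.java.JavaPairInputDStream;
    import org.apache.spark.streaming.api.java.JavaStreamingContext;
    import org.apache.spark.streaming.kafka.KafkaUtils;

    public class StreamingJob {
        public static void main(String[] args) throws Exception {
            SparkConf conf = new SparkConf().setAppName("kafka-streaming");
            // 10-second micro-batches; the interval is a placeholder
            JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(10));

            Map<String, String> kafkaParams = new HashMap<String, String>();
            kafkaParams.put("metadata.broker.list", "broker1:9092"); // placeholder broker
            Set<String> topics = Collections.singleton("events");    // placeholder topic

            JavaPairInputDStream<String, String> stream = KafkaUtils.createDirectStream(
                    jssc, String.class, String.class,
                    StringDecoder.class, StringDecoder.class,
                    kafkaParams, topics);

            // A per-batch event count stands in for the real analytics logic
            stream.count().print();

            jssc.start();
            jssc.awaitTermination();
        }
    }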

Confidential - Dallas, TX

Hadoop Developer

Environment: Cloudera Distribution, Hive, MapReduce, Pig, Impala, Tableau, HDFS, Kafka, Sqoop, Flume, HBase, Oozie, Java, AWS

Responsibilities:

  • Worked with MapReduce, Kafka, Sqoop, Oozie, Flume, HBase, Pig, Hive, and Impala in a multi-node Hadoop cloud environment
  • Configured the Hadoop environment in the cloud through Amazon Web Services (AWS) to provide a scalable, distributed data solution
  • Installed Kafka on the Hadoop cluster, used it for streaming and cleansing raw data, extracted useful information with Hive, and stored the results in HBase
  • Developed Kafka producers that compress and bundle many small files into larger Avro and SequenceFile outputs before writing to HDFS, to make the best use of the Hadoop block size
  • Implemented MapReduce jobs to parse raw web logs into delimited records, handling files in various formats such as JSON, XML, and plain text
  • Improved MapReduce job performance with combiners, partitioning, and the distributed cache (a combiner example follows this list)
  • Gained exposure to iterative processing in Spark
  • Created partitioned tables in Hive for better performance and faster querying
  • Used Sqoop scripts to import data from various database sources into HBase, loading customer transaction data incrementally by date
  • Used Flume to move log files generated from various sources into Amazon S3 for data processing
  • Performed extensive data analysis using Hive and Pig
  • Produced simple as well as complex results using Hive, improving performance and reducing query time through partitioned tables
  • Created Oozie workflows to automate loading data into Amazon S3 and preprocessing it with Pig; used Oozie for data scrubbing and processing
  • Developed and deployed scripts to pre-process data before moving it to HDFS
  • Worked on a proof of concept for Impala
  • Used Synergy for version control and ClearQuest for logging and tracking defects and tasks
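
To illustrate the combiner bullet above: a word-count style sketch in which the reducer doubles as a combiner, so partial sums are merged map-side and shuffle volume drops; the class names are illustrative:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class LogCount {
        public static class TokenMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);
            private final Text word = new Text();
            @Override
            protected void map(LongWritable key, Text value, Context ctx)
                    throws IOException, InterruptedException {
                // Emit (token, 1) for every whitespace-separated token
                for (String token : value.toString().split("\\s+")) {
                    word.set(token);
                    ctx.write(word, ONE);
                }
            }
        }

        public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
            @Override
            protected void reduce(Text key, Iterable<IntWritable> values, Context ctx)
                    throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable v : values) sum += v.get();
                ctx.write(key, new IntWritable(sum));
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "log-count");
            job.setJarByClass(LogCount.class);
            job.setMapperClass(TokenMapper.class);
            job.setCombinerClass(SumReducer.class); // summing is associative, so the reducer is safe as a combiner
            job.setReducerClass(SumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }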

Confidential - Jacksonville, FL

Hadoop Developer

Environment: Hortonworks distribution of Hadoop, HDFS, MapReduce, Hive, Flume, HBase, Sqoop, Pig, Java (JDK 1.6), Eclipse, MySQL, Ubuntu, ZooKeeper, Oozie, Apache Kafka, Apache Storm

Responsibilities:

  • Developed multiple MapReduce jobs in Java for data cleaning and pre-processing
  • Developed simple to complex MapReduce jobs using Hive and Pig
  • Involved in creating Hive tables, loading them, and analyzing data using Hive queries
  • Involved in running Hadoop jobs to process millions of records of text data
  • Responsible for managing data from multiple sources
  • Implemented best-income logic using Pig scripts
  • Exported analyzed data to relational databases using Sqoop, for visualization and for generating reports for the BI team
  • Involved in loading data from the UNIX file system into HDFS
  • Created HBase tables to store different data formats (see the HBase client sketch after this list)
  • Managed and reviewed Hadoop log files
  • Analyzed large datasets to determine the optimal way to aggregate and report on them
  • Supported setting up the QA environment and updated configurations for implementing Pig and Sqoop scripts
  • Established connections to ingest data into and out of HDFS
  • Monitored jobs with the Informatica monitoring tool
  • Fetched data from Oracle and wrote it into HDFS
  • Used Hive connections to analyze data from Oracle
  • Debugged MapReduce programs and Hive UDFs using Eclipse
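
A minimal sketch of writing a record into an HBase table with the Java client API of that era; the table name, row key, and column family/qualifier are placeholders, not names from the project:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    public class HBaseWriter {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create(); // reads hbase-site.xml from the classpath
            HTable table = new HTable(conf, "web_logs");       // placeholder table name
            Put put = new Put(Bytes.toBytes("host1|2014-06-01T00:00:01")); // placeholder row key
            // Store one cell in column family "d", qualifier "status"
            put.add(Bytes.toBytes("d"), Bytes.toBytes("status"), Bytes.toBytes("200"));
            table.put(put);
            table.close();
        }
    }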

Confidential - St. Louis, MO

Java Developer

Environment: Java/J2EE, JSP, CSS, HTML, PHP, JavaScript, AJAX, Hibernate, Spring 2.5, XML, Web Services, Oracle 9i

Responsibilities:

  • Involved in software development and production support for web-based front-end applications
  • Involved in developing the CSV files used by the data load process
  • Implemented the Data Access Object (DAO) adapter pattern to connect the business layer to the database using Spring's HibernateTemplate class (a DAO sketch follows this list)
  • Responsible for database design and for writing back-end procedures in SQL and PL/SQL on the Oracle database
  • Utilized WSDL and SOAP to implement web services, optimizing performance for remote-model applications
  • Developed the service layer forming the business logic of the MVC-based Spring architecture
  • Responsible for all aspects of CMS development and administration using OpenText
  • Updated and maintained all CMS content
  • Handled CMS development, code customization, and administration of all web servers
  • Involved in configuration and deployment of the front-end application on RAD
  • Involved in developing JSPs for the graphical user interface
  • Developed the UI using JSP, PHP, HTML, and JavaScript
  • Implemented code to validate input fields and display error messages
  • Performed unit testing using JUnit test cases
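
A minimal sketch of the DAO pattern over Spring's HibernateTemplate mentioned above (Spring 2.x era); Customer is a hypothetical mapped entity with a state property, not a class from the original project:

    import java.util.List;
    import org.springframework.orm.hibernate3.support.HibernateDaoSupport;

    // Hypothetical mapped entity (the real project's entities are unknown)
    class Customer {
        private Long id;
        private String state;
        // getters/setters omitted for brevity
    }

    // DAO delegating persistence to the injected HibernateTemplate
    public class CustomerDao extends HibernateDaoSupport {

        public void save(Customer customer) {
            getHibernateTemplate().saveOrUpdate(customer);
        }

        @SuppressWarnings("unchecked")
        public List<Customer> findByState(String state) {
            // HQL query against the hypothetical Customer entity
            return getHibernateTemplate().find("from Customer c where c.state = ?", state);
        }
    }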

Confidential - Minneapolis, MN

SQL/Java Developer

Environment: Java/J2EE, JSP, CSS, JavaScript, AJAX, Hibernate, Spring 3.0, XML, Web Services, SOAP, REST, Maven, Rational Rose, HTML, Log4j, JBoss 4

Responsibilities:

  • Analyzed and developed an understanding of business requirements
  • Developed views and controllers for the client and manager modules using Spring MVC 3.0 and Spring Core 3.0 (see the controller sketch after this list)
  • Implemented business logic using Spring Core 3.0 and Hibernate
  • Performed data operations using Spring ORM wired to Hibernate, and implemented HibernateTemplate and the Criteria API for querying the database
  • Developed an exception handling framework and used Log4j for logging
  • Developed web services using XML messages over SOAP
  • Developed web services for payment transaction and payment release
  • Developed RESTful web services
  • Created the WSDL and the SOAP envelope
  • Developed and modified database objects per requirements
  • Involved in unit and integration testing, bug fixing, acceptance testing with test cases, and code reviews
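
For the Spring MVC 3.0 work above, a minimal annotated-controller sketch; the PaymentService interface, URL paths, and view name are illustrative placeholders:

    import java.util.List;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.stereotype.Controller;
    import org.springframework.ui.Model;
    import org.springframework.web.bind.annotation.PathVariable;
    import org.springframework.web.bind.annotation.RequestMapping;
    import org.springframework.web.bind.annotation.RequestMethod;

    // Hypothetical service interface standing in for the real business layer
    interface PaymentService {
        List<String> findByClient(long clientId);
    }

    @Controller
    @RequestMapping("/payments")
    public class PaymentController {

        @Autowired
        private PaymentService paymentService;

        // GET /payments/{clientId} -> renders the logical view "payments/list"
        @RequestMapping(value = "/{clientId}", method = RequestMethod.GET)
        public String listPayments(@PathVariable("clientId") long clientId, Model model) {
            model.addAttribute("payments", paymentService.findByClient(clientId));
            return "payments/list"; // resolved by the configured ViewResolver
        }
    }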

Confidential

Java Developer

Environment: Java 6.0, J2EE, Eclipse IDE, JSP 2.0, JDBC 3.0, Servlets, JavaScript, Spring, Struts, Ajax, HTML, jQuery, ClearCase, ClearQuest, Windows XP

Responsibilities:

  • Worked on requirement analysis and design, interacting with the business teams
  • Implemented a database-driven left-navigation tree menu for the admin module using the Ajax4jsf framework
  • Developed a validation framework to show custom validation on JSF screens
  • Participated in the entire SDLC of the project
  • Developed UI screens using HTML, JSPs, CSS, jQuery, and Ajax
  • Wrote extensive core Java within the application
  • Developed the business layer using Spring, Hibernate, and DAOs
  • Performed JavaScript and jQuery validation, and used JDBC for all database interactions
  • Used Code Collaborator for code reviews
  • Created server-side Java architecture using Java Servlets (a minimal servlet sketch follows this list)
  • Developed and deployed EJBs, Servlets, and JSPs on WebLogic Server
  • Used MySQL as the database product
  • Used Eclipse as the IDE for development
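
A minimal servlet sketch of the server-side layer described above; the request attribute and JSP path are placeholders, and for this Servlet 2.x era stack the URL mapping would be declared in web.xml:

    import java.io.IOException;
    import javax.servlet.ServletException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    // Handles a GET request and forwards to a JSP for rendering
    public class AdminMenuServlet extends HttpServlet {
        @Override
        protected void doGet(HttpServletRequest req, HttpServletResponse resp)
                throws ServletException, IOException {
            req.setAttribute("menuTitle", "Admin"); // placeholder model data
            req.getRequestDispatcher("/WEB-INF/jsp/adminMenu.jsp").forward(req, resp);
        }
    }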
