Sr. Hadoop Developer Resume

SUMMARY

  • Around 9 years of professional experience across IT sectors such as healthcare, finance, insurance and retail, including 4 years of experience with Big Data and the Hadoop ecosystem.
  • Extensive development experience with Hadoop ecosystem components such as Spark, Hive, Kafka, Impala, HBase, MapReduce, Pig, Sqoop, YARN and Oozie.
  • Strong programming experience using Java, Scala, Python and SQL.
  • Strong fundamental understanding of Distributed Systems Architecture and parallel processing frameworks.
  • Strong experience designing and implementing end-to-end data pipelines running on terabytes of data.
  • Expertise in developing production-ready Spark applications using the Spark Core, DataFrames, Spark SQL, Spark ML and Spark Streaming APIs.
  • Strong experience troubleshooting failures in Spark applications and fine-tuning them for better performance.
  • Experience using DStreams in Spark Streaming, accumulators, broadcast variables, various caching levels and other optimization techniques in Spark (a short sketch follows this summary).
  • Strong experience working with data ingestion tools Sqoop and Kafka.
  • Good knowledge and development experience with using MapReduce framework.
  • Hands-on experience writing ad-hoc queries for moving data from HDFS to Hive and analyzing data using HiveQL.
  • Proficient in creating Hive DDLs and writing custom Hive UDFs.
  • Knowledge of job workflow management and monitoring tools like Oozie and Rundeck.
  • Experience designing, implementing and managing secure authentication to Hadoop clusters with Kerberos.
  • Experience working with NoSQL databases such as HBase, Cassandra and MongoDB.
  • Experience with ETL processes covering data sourcing, transformation, mapping, conversion and loading.
  • Good knowledge of creating ETL jobs in Talend to load large volumes of data into the Hadoop ecosystem and relational databases.
  • Experience working with Cloudera, Hortonworks and Amazon AWS EMR distributions.
  • Good experience developing applications using Java, J2EE, JSP, MVC, EJB, JMS, JSF, Hibernate, AJAX and web-based development tools.
  • Strong experience in RDBMS technologies like MySQL, Oracle and Teradata.
  • Strong expertise in creating Shell-Scripts, Regular Expressions and Cron Job Automation.
  • Good knowledge of web services, SOAP programming, WSDL and XML parsers such as SAX and DOM, as well as AngularJS and responsive design with Bootstrap.
  • Experience with various version control systems such as CVS, TFS, SVN.
  • Worked with geographically distributed and culturally diverse teams, in roles involving direct interaction with clients and team members.
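
As a brief illustration of the DataFrame, broadcast-variable and caching techniques mentioned above, a minimal Scala/Spark sketch (paths, table and column names are hypothetical):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.broadcast

object EnrichmentSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("enrichment-sketch").getOrCreate()

    // A large fact table and a small dimension table (paths are hypothetical)
    val events   = spark.read.parquet("hdfs:///data/events")
    val profiles = spark.read.parquet("hdfs:///data/customer_profiles")

    // Broadcasting the small table lets the join avoid shuffling the large one
    val enriched = events.join(broadcast(profiles), Seq("customer_id"), "left")

    // Cache because the result feeds several downstream aggregations
    enriched.cache()
    enriched.groupBy("event_type").count().show()

    spark.stop()
  }
}
```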

TECHNICAL SKILLS

Big Data Ecosystem: Hadoop, HDFS, MapReduce, Hive, Pig, Impala, HBase, Sqoop, Spark, Spark Streaming, ZooKeeper, Oozie, Kafka, Flume, Hue, Cloudera Manager, Amazon AWS, Hortonworks

Java/J2EE & Web Technologies: J2EE, JMS, JSF, Servlets, HTML, CSS, XML, XHTML, AJAX, AngularJS, JSP, JSTL

Languages: C, C++, Core Java, Shell Scripting, PL/SQL, Python, Pig Latin, Scala

Scripting Languages: JavaScript, UNIX Shell Scripting, Python

Operating system: Windows, MacOS, Linux and Unix

Design: UML, Rational Rose, Microsoft Visio, E-R Modelling

DBMS / RDBMS / NoSQL: Oracle 11g/10g/9i, Microsoft SQL Server 2012/2008, MySQL, DB2, Teradata, MongoDB, Cassandra, HBase

IDE and Build Tools: Eclipse, NetBeans, Microsoft Visual Studio, Ant, Maven, JIRA, Confluence

Version Control: SVN, CVS, GitHub

Security: Kerberos

Web Services: SOAP, RESTful, JAX-RS

Web Servers: WebLogic, WebSphere, Apache Tomcat, Jetty

PROFESSIONAL EXPERIENCE

Confidential

Sr. Hadoop Developer

Responsibilities:

  • Responsible for ingesting large volumes of user behavioral data and customer profile data into the analytics data store.
  • Developed custom multi-threaded Java based ingestion jobs as well as Sqoop jobs for ingesting from FTP servers and data warehouses.
  • Developed Scala-based Spark applications to perform data cleansing, event enrichment, aggregation, de-normalization and data preparation for consumption by the machine learning and reporting teams.
  • Troubleshot Spark applications to make them more fault tolerant.
  • Fine-tuned Spark applications to improve overall pipeline processing time.
  • Wrote Kafka producers to stream data from external REST APIs to Kafka topics.
  • Wrote Spark Streaming applications to consume data from Kafka topics and write the processed streams to HBase (see the first sketch after this list).
  • Handled large datasets using Spark's in-memory capabilities, broadcast variables, efficient joins, transformations and other optimizations.
  • Worked extensively with Sqoop for importing data from Oracle.
  • Worked with EMR clusters in the AWS cloud and with S3.
  • Involved in creating Hive tables and loading and analyzing data using Hive scripts.
  • Implemented partitioning, dynamic partitions and bucketing in Hive (see the second sketch after this list).
  • Set up continuous integration for the application using Bamboo.
  • Used reporting tools such as Tableau, connected to Impala, to generate daily data reports.
  • Collaborated with the infrastructure, network, database, application and BA teams to ensure data quality and availability.
  • Documented and tracked operational problems in JIRA, following team standards and procedures.
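
A minimal sketch of the Kafka-to-HBase streaming pattern described in this list, using the DStream-based direct Kafka stream; broker addresses, topic, table and column-family names are hypothetical:

```scala
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Put}
import org.apache.hadoop.hbase.util.Bytes
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent

object ClickstreamToHBase {
  def main(args: Array[String]): Unit = {
    val ssc = new StreamingContext(new SparkConf().setAppName("clickstream-to-hbase"), Seconds(30))

    val kafkaParams = Map[String, Object](
      "bootstrap.servers"  -> "broker1:9092",            // hypothetical brokers
      "key.deserializer"   -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id"           -> "clickstream-consumers",
      "auto.offset.reset"  -> "latest")

    val stream = KafkaUtils.createDirectStream[String, String](
      ssc, PreferConsistent, Subscribe[String, String](Seq("user-clicks"), kafkaParams))

    stream
      .filter(_.key != null)                             // assumes keyed messages; drop unkeyed ones
      .map(record => (record.key, record.value))
      .foreachRDD { rdd =>
        rdd.foreachPartition { partition =>
          // One HBase connection per partition, not per record
          val conn  = ConnectionFactory.createConnection(HBaseConfiguration.create())
          val table = conn.getTable(TableName.valueOf("analytics:clicks"))  // hypothetical table
          partition.foreach { case (key, value) =>
            val put = new Put(Bytes.toBytes(key))
            put.addColumn(Bytes.toBytes("e"), Bytes.toBytes("payload"), Bytes.toBytes(value))
            table.put(put)
          }
          table.close()
          conn.close()
        }
      }

    ssc.start()
    ssc.awaitTermination()
  }
}
```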

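A minimal sketch of the Hive dynamic-partition loading mentioned above, driven from Spark SQL; the database, table and column names are hypothetical (bucketing is omitted here since it was defined on the Hive side):

```scala
import org.apache.spark.sql.SparkSession

object LoadDailyPartitions {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("load-daily-partitions")
      .enableHiveSupport()
      .getOrCreate()

    // Required for dynamic-partition inserts
    spark.sql("SET hive.exec.dynamic.partition = true")
    spark.sql("SET hive.exec.dynamic.partition.mode = nonstrict")

    spark.sql("""
      CREATE TABLE IF NOT EXISTS analytics.user_events (
        user_id STRING,
        event_type STRING,
        payload STRING
      )
      PARTITIONED BY (event_date STRING)
      STORED AS PARQUET
    """)

    // Hive routes each row to its event_date partition
    spark.sql("""
      INSERT OVERWRITE TABLE analytics.user_events PARTITION (event_date)
      SELECT user_id, event_type, payload, date_format(event_ts, 'yyyy-MM-dd') AS event_date
      FROM analytics.user_events_staging
    """)

    spark.stop()
  }
}
```
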
Environment: Hadoop 2.x, Spark, Scala, Hive, Sqoop, Oozie, Kafka, Amazon EMR, ZooKeeper, Impala, YARN, JIRA, Kerberos, Amazon AWS, Shell Scripting, SBT, GitHub, Maven.

Confidential

Hadoop Developer

Responsibilities:

  • Involved in requirement analysis, design, coding and implementation phases of the project.
  • Loaded the data from Teradata to HDFS using Teradata Hadoop connectors.
  • Converted existing MapReduce jobs into Spark transformations and actions using the Spark RDD, DataFrame and Spark SQL APIs (see the first sketch after this list).
  • Wrote new Spark jobs in Scala to analyze customer data and sales history.
  • Used Kafka to ingest data from multiple streaming sources into HDFS.
  • Involved in collecting and aggregating large amounts of log data using Apache Flume and staging data in HDFS for further analysis.
  • Used Hive partitioning, bucketing and collections, and performed different types of joins on Hive tables.
  • Created Hive external tables to perform ETL on data generated on a daily basis.
  • Wrote HBase bulk-load jobs to load processed data into HBase tables by converting it to HFiles.
  • Performed validation on the ingested data to filter and cleanse it in Hive.
  • Created Sqoop jobs to handle incremental loads from RDBMS into HDFS and applied Spark transformations.
  • Loaded data from Spark into Hive tables using the Parquet columnar format (see the second sketch after this list).
  • Developed Oozie workflows to automate and productionize the data pipelines.
  • Developed Sqoop import Scripts for importing reference data from Netezza.
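
A rough illustration of the MapReduce-to-Spark conversion mentioned in this list: a map/reduce-style aggregation expressed first with RDD transformations and actions, then with the DataFrame/Spark SQL API (paths and column names are hypothetical):

```scala
import org.apache.spark.sql.SparkSession

object SalesAggregation {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("sales-aggregation").getOrCreate()

    // RDD version: roughly the old mapper (emit (customer_id, amount)) and reducer (sum)
    val totalsRdd = spark.sparkContext
      .textFile("hdfs:///data/sales/*.csv")                 // hypothetical input
      .map(_.split(","))
      .filter(_.length >= 3)
      .map(fields => (fields(0), fields(2).toDouble))       // (customer_id, amount)
      .reduceByKey(_ + _)

    // DataFrame / Spark SQL version of the same aggregation
    val sales = spark.read.option("header", "true").csv("hdfs:///data/sales")
    sales.createOrReplaceTempView("sales")
    val totalsDf = spark.sql(
      "SELECT customer_id, SUM(amount) AS total_amount FROM sales GROUP BY customer_id")

    totalsRdd.take(5).foreach(println)
    totalsDf.show(5)

    spark.stop()
  }
}
```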

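A minimal sketch of loading Spark output into a Parquet-backed Hive table, as in the bullets above; the staging path, partition column and table name are hypothetical:

```scala
import org.apache.spark.sql.SparkSession

object StageToHive {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("stage-to-hive")
      .enableHiveSupport()
      .getOrCreate()

    // Data landed in HDFS by the Sqoop incremental jobs (path is hypothetical)
    val orders = spark.read.option("header", "true").csv("hdfs:///staging/orders_delta")

    orders
      .filter("order_id IS NOT NULL")            // simple validation before loading
      .write
      .mode("append")
      .format("parquet")
      .partitionBy("order_date")                 // assumed partition column
      .saveAsTable("sales.orders_history")       // hypothetical Hive database.table
  }
}
```
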
Environment: HDFS, Hadoop, Pig, Hive, HBase, Sqoop, Kafka, Teradata, MapReduce, Oozie, Java 6/7, Oracle 10g, YARN, UNIX Shell Scripting, Amazon Web Services, Maven, Agile Methodology, JIRA, Linux.

Confidential

Hadoop Developer

Responsibilities:

  • Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
  • Developed custom MapReduce programs and custom user-defined functions (UDFs) in Hive to transform large volumes of data per business requirements (a sample UDF sketch follows this list).
  • Wrote MapReduce jobs using Java API and Pig Latin.
  • Extracted data from flat files and RDBMS databases into a staging area and ingested it into Hadoop.
  • Involved in creating Hive tables, loading them with data and writing Hive queries that run internally as MapReduce jobs.
  • Migrated tables from RDBMS into Hive using Sqoop and later generated visualizations using Tableau.
  • Developed numerous Pig batch programs for both implementation and optimization needs.
  • Used HBase in conjunction with Hive and Pig as required.
  • Created various Pig scripts and wrapped them as shell commands to provide aliases for common operations in the project business flow.
  • Loaded data into HDFS from data sources such as Oracle and DB2 using Sqoop, then loaded it into Hive tables.
  • Integrated the Hive warehouse with HBase for information sharing among teams.
  • Developed complex Hive UDFs to work with sequence files.
  • Designed and developed Pig Latin scripts and Pig command line transformations for data joins and custom processing of MapReduce outputs.
  • Created dashboards in Tableau to create meaningful metrics for decision making.
  • Performed rule checks on multiple file formats like XML, JSON, CSV and compressed file formats.
  • Monitored system health and logs and responded to any warning or failure conditions.
  • Worked with application teams to install operating system, Hadoop updates, patches, version upgrades as required.
  • Worked with the Avro data serialization system to handle JSON data formats.
  • Implemented counters for diagnosing problems in jobs, for quality control and for application-level statistics.
  • Performed end-to-end performance tuning of Hadoop clusters and MapReduce routines against very large datasets.
  • Optimized MapReduce jobs to use HDFS efficiently through various compression mechanisms (see the MapReduce sketch after this list).
  • Defined job flows in Oozie to schedule and manage Apache Hadoop jobs as directed acyclic graphs (DAGs) of actions with control flows.
  • Involved in Agile methodologies, daily Scrum meetings, Sprint planning.
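
A minimal sketch of a custom Hive UDF of the kind mentioned above; the class name and normalization rules are hypothetical (the production UDFs also handled sequence-file data):

```scala
import org.apache.hadoop.hive.ql.exec.UDF
import org.apache.hadoop.io.Text

// Hypothetical UDF: map free-text codes to canonical values
class NormalizeCode extends UDF {
  private val canonical = Map(
    "m" -> "MALE", "male" -> "MALE",
    "f" -> "FEMALE", "female" -> "FEMALE")

  def evaluate(input: Text): Text = {
    if (input == null) return null
    val key = input.toString.trim.toLowerCase
    new Text(canonical.getOrElse(key, "UNKNOWN"))
  }
}
```

Once packaged into a JAR, such a function would typically be registered in Hive with ADD JAR followed by CREATE TEMPORARY FUNCTION normalize_code AS 'NormalizeCode'.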

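A minimal sketch combining the counter and output-compression techniques mentioned above, written as a map-only cleansing job; class names, the expected field count and paths are hypothetical:

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import org.apache.hadoop.io.compress.SnappyCodec
import org.apache.hadoop.io.{LongWritable, NullWritable, Text}
import org.apache.hadoop.mapreduce.{Job, Mapper}
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat

// Map-only job: drop malformed records, count them, write compressed output
class CleanRecordsMapper extends Mapper[LongWritable, Text, NullWritable, Text] {
  override def map(key: LongWritable, value: Text,
                   context: Mapper[LongWritable, Text, NullWritable, Text]#Context): Unit = {
    val fields = value.toString.split(",")
    if (fields.length == 5) {                    // expected field count is hypothetical
      context.write(NullWritable.get(), value)
    } else {
      // The counter surfaces data-quality problems in the job history UI and logs
      context.getCounter("DataQuality", "MALFORMED_RECORDS").increment(1)
    }
  }
}

object CleanRecordsJob {
  def main(args: Array[String]): Unit = {
    val conf = new Configuration()
    conf.setBoolean("mapreduce.map.output.compress", true)    // compress intermediate output

    val job = Job.getInstance(conf, "clean-records")
    job.setJarByClass(classOf[CleanRecordsMapper])
    job.setMapperClass(classOf[CleanRecordsMapper])
    job.setNumReduceTasks(0)
    job.setOutputKeyClass(classOf[NullWritable])
    job.setOutputValueClass(classOf[Text])

    FileInputFormat.addInputPath(job, new Path(args(0)))
    FileOutputFormat.setOutputPath(job, new Path(args(1)))
    FileOutputFormat.setCompressOutput(job, true)             // compress final output
    FileOutputFormat.setOutputCompressorClass(job, classOf[SnappyCodec])

    System.exit(if (job.waitForCompletion(true)) 0 else 1)
  }
}
```
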
Environment: HDFS, HBase, MapReduce, Cassandra, Hive, Pig, Sqoop, Tableau, NoSQL, Shell Scripting, Oozie, Avro, HDP Distribution, Eclipse, Log4j, JUnit, Linux.

Confidential

Java Developer

Responsibilities:

  • Involved in the analysis, design, development and testing phases of the Software Development Life Cycle (SDLC).
  • Used Rational Rose for developing Use case diagrams, Activity flow diagrams, Class diagrams and Object diagrams in the design phase.
  • Used Spring for cross-cutting concerns and IoC for dependency injection.
  • Implemented application-level persistence using Hibernate and Spring.
  • Consumed and exposed various web services using JAX-RS to different systems such as NPI validation and address validation.
  • Implemented core Java logic for inventory cost calculations.
  • Developed complex web services and tailored the JAX-RS API to suit requirements.
  • Developed UI modules using HTML, JSP, JavaScript, AJAX, Web Link and CSS.
  • Wrote custom JavaScript and CSS to maintain a user-friendly look and feel.
  • Wrote jQuery functions while implementing various UI screens across the web application.
  • Wrote application-level code to perform client-side validation using jQuery and JavaScript.
  • Primarily focused on Spring components such as Spring MVC, dispatcher servlets, controllers, model and view objects, and view resolvers.
  • Wrote complex named SQL queries using Hibernate.
  • Generated POJO classes with JPA Annotations using Reverse Engineering.
  • Developed the application using IntelliJ IDE.
  • Used Log4j and JUnit for debugging, testing and maintaining system state.
  • Used SoapUI for testing the web services.
  • Used SVN to maintain source and version management.
  • Used JIRA to manage issues and the project workflow.
  • Applied SOLID design principles throughout development of the project.
  • Unit tested all classes using JUnit at both the class and method level.

Environment: Java/Java EE 5, JSP 2.1, Spring 2.5, Spring MVC, Hibernate 3.0, Web Services, JAX-RS, Rational Rose, WADL, SoapUI, HTML, CSS, JavaScript, AJAX, JSON, jQuery, Maven, JMS, Log4j, Jenkins, JPA, Oracle, MySQL, SQL Developer, JIRA, SVN, PL/SQL, WebLogic 10.3, IntelliJ, UNIX.

Confidential

Java Developer

Responsibilities:

  • Involved in Requirement Analysis, Design, Development and Testing of the risk workflow system.
  • Worked with open-source frameworks and debugged using the Eclipse IDE.
  • Utilized the Spring Framework, encouraging application architectures based on the MVC design paradigm (J2EE design patterns).
  • Implemented RESTful API Web Services.
  • Implemented asynchronous server calls from the UI using AJAX and jQuery.
  • Configured the Hibernate configuration and mapping files for the project.
  • Used Bootstrap to implement the responsive design of the web pages.
  • Created wireframes to design the structure of the application.
  • Involved in system design and development in core java using Collections, multithreading and exception handling.
  • Designed user interface using HTML, CSS, Servlet, JSP.
  • Implemented templates for different rules for accessing different applications.
  • Performed client-side validations using JavaScript.
  • Developed Web Pages using HTML, DHTML and CSS.
  • Actively involved in the integration of different use cases, code reviews and refactoring.
  • Used Log4j to maintain user-defined logs on the system.
  • Created unit test cases using JUnit as part of end-to-end testing.
  • Actively worked with the client to collect requirements for the project.
  • Involved in the implementation of the Software Development Life Cycle (SDLC), including development, testing, implementation and maintenance support.

Environment: Spring, Core Java, HTML, DHTML, Log4j, UNIX OS, CSS, JavaScript, AJAX, jQuery, Eclipse IDE, RESTful Web Services, Maven, UML, JavaMail API, Hibernate, MVC, JSP, JUnit, wireframes.
