
Hadoop/Spark Developer Resume


NYC, NY

SUMMARY:

  • 7+ years of IT experience in the development and implementation of business requirements using Java, Big Data (Hadoop) and Spark.
  • 3+ years of experience implementing big data solutions on the Cloudera distribution.
  • Experience in ingestion, storage, querying, processing and analysis of big data.
  • Excellent hands-on experience with Apache Hadoop ecosystem technologies such as Hive, Sqoop, Spark with Scala, Impala and HBase.
  • Good understanding of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, Secondary NameNode, MapReduce and YARN.
  • Expertise in importing and exporting data with Sqoop between HDFS and relational database systems.
  • Experience in developing Hadoop integrations for data ingestion, data mapping and data processing.
  • Expertise in writing shell scripts, HiveQL queries and MapReduce programs.
  • Experience in ETL processes using Spark (Scala).
  • Hands-on experience with Spark SQL.
  • Hands-on experience with streaming data tools such as Flume and Kafka.
  • Worked in Agile environments.
  • Worked with semi-structured data such as JSON and XML files.
  • Extensive experience in developing web applications using Core Java, JDBC, SQL, Servlets, Hibernate, the Eclipse IDE and Oracle RDBMS.
  • Very good experience across the complete project life cycle (design, development, testing and implementation) of web applications.
  • Hands-on experience with VPN and PuTTY.
  • Exceptional ability to learn new concepts.
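
The MapReduce programs mentioned above follow the classic map-then-reduce pattern; as a minimal illustration (not code from any of the projects below), the canonical word-count can be sketched with plain Java streams, where the map step splits lines into words and the reduce step sums counts per key:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Word-count, the canonical MapReduce example, in plain Java streams.
public class WordCount {
    public static Map<String, Long> count(List<String> lines) {
        return lines.stream()
            // map step: one line -> many lowercase words
            .flatMap(line -> Arrays.stream(line.toLowerCase().split("\\s+")))
            .filter(w -> !w.isEmpty())
            // reduce step: group identical words and count them per key
            .collect(Collectors.groupingBy(w -> w, Collectors.counting()));
    }
}
```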

TECHNICAL SKILLS:

Hadoop Ecosystem: HDFS, MapReduce, Hive, Sqoop, Flume, HBase, Impala, Kafka

NoSQL: HBase

Databases: MySQL, Oracle 10g

Languages: Scala, Java, J2EE, SQL, HiveQL

Operating Systems: Windows XP/10, Linux

IDEs & Utilities: Eclipse, Hue

PROFESSIONAL EXPERIENCE:

Confidential, NYC, NY

Hadoop/Spark Developer

Responsibilities:

  • Involved in requirements gathering, design, development and testing.
  • Involved in running Spark jobs over YARN.
  • Experience with Cloudera distributions (CDH5).
  • Developed Sqoop scripts to import data from MySQL and Oracle into Hadoop.
  • Configured and used Kafka for data ingestion from various data sources containing different file formats like JSON and XML.
  • Involved in writing HDFS CLI commands.
  • Loaded data into Spark RDDs and performed in-memory computation using Spark SQL DataFrames.
  • Created external Hive tables to store Spark output.
  • Knowledge of the different layers of the data flow.
  • Performed advanced procedures, such as text analytics and processing, in Spark using Scala.
  • Worked with different file formats such as Avro, RC, ORC and Parquet, and with compression techniques in Hive storage.
  • Created Hive managed tables, external tables and views to store the processed data, and integrated them with the Tableau reporting tool.
  • Involved in writing Hive join queries to extract data as per business requirements.
  • Involved in performance tuning for skewed data in Hive.
  • Automated jobs using Oozie workflows.
  • Worked on ad hoc requests by providing data based on user requirements.
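
The parameterized Sqoop import scripts described above might be assembled along these lines; the connection URL, credentials, table and target directory here are placeholders, not the project's actual values:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that builds a Sqoop import command line from
// parameters, as a shell automation script around "sqoop import" would.
public class SqoopCommandBuilder {
    public static List<String> importCommand(String jdbcUrl, String user,
                                             String table, String targetDir,
                                             int mappers) {
        List<String> cmd = new ArrayList<>();
        cmd.add("sqoop");
        cmd.add("import");
        cmd.add("--connect");     cmd.add(jdbcUrl);      // source RDBMS
        cmd.add("--username");    cmd.add(user);
        cmd.add("--table");       cmd.add(table);        // table to import
        cmd.add("--target-dir");  cmd.add(targetDir);    // HDFS destination
        cmd.add("--num-mappers"); cmd.add(Integer.toString(mappers));
        return cmd;
    }
}
```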

Environment: Cloudera Hadoop distribution (CDH 5.10), Apache Hadoop 2.6.0, Hive 1.1.0, Sqoop 1.4.6, Kafka, Spark 1.6.0, Hue 3.5.0, Linux, Oozie, MySQL, Oracle, Windows

Confidential

Hadoop Developer

Responsibilities:

  • Configured and used Flume for data ingestion.
  • Involved in writing UNIX shell scripts to automate Sqoop jobs.
  • Experience loading and transforming large sets of structured and semi-structured data.
  • Wrote Apache Pig scripts to parse XML data into a structured format.
  • Implemented best-offer logic using Pig scripts.
  • Used Piggybank functions to process XML files.
  • Used Pig diagnostic operators to evaluate the step-by-step execution of Pig statements.
  • Used Eval operators in Pig for arithmetic calculations on the data, based on business need.
  • Wrote Hive join queries as per requirements.
  • Worked with different file formats such as Avro, RC, ORC and Parquet, and with compression techniques in Hive storage.
  • Created managed tables and views to store processed data.
  • Experience with Cloudera distributions (CDH5).
  • Involved in Code reviews.
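
The XML-to-columns flattening that the Pig scripts performed (one delimited row per record, like a FOREACH ... GENERATE) can be sketched in plain Java with the JDK's DOM parser; the <offer> element and its fields are invented stand-ins for the actual schema:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Sketch: parse an XML payload and emit one tab-delimited row per record.
public class XmlFlattener {
    public static List<String> toRows(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList offers = doc.getElementsByTagName("offer");
            List<String> rows = new ArrayList<>();
            for (int i = 0; i < offers.getLength(); i++) {
                Element e = (Element) offers.item(i);
                // id attribute + text content become two delimited columns
                rows.add(e.getAttribute("id") + "\t" + e.getTextContent().trim());
            }
            return rows;
        } catch (Exception ex) {
            throw new RuntimeException("failed to parse XML", ex);
        }
    }
}
```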

Environment: Cloudera Hadoop distribution (CDH 5.10), Apache Hadoop 2.3.0, Hive 1.1.0, Sqoop 1.4.2, Flume 1.3.0, Pig 0.2.0, Hue 3.5.0, Windows

Confidential

Software Engineer (J2EE)

Responsibilities:

  • Involved in various phases of the Software Development Life Cycle (SDLC) of the application, including requirements gathering, analysis, design and code development.
  • Developed a prototype of the application and demonstrated it to business users to verify the application's functionality.
  • Integrated Spring with the Hibernate framework and created Hibernate annotations to map an object-oriented domain model to a traditional relational database.
  • Built a data synchronization application using Java multithreading to help the client maximize profits.
  • Responsible for implementing RESTful web services.
  • Developed Object-Relational (O/R) mapping using Hibernate 3.0 and developed the Data Access Object (DAO) persistence layer using Hibernate 3.0.
  • Conversant with tools such as Eclipse and IntelliJ.
  • Involved in the design and implementation of various business scenarios in the trading flow using Spring.
  • Used Hibernate as the persistence framework, mapping ORM objects to tables; developed HQL and SQL queries.
  • Created various PL/SQL stored procedures, views, functions and temporary tables to supply data to Crystal Reports.
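
The DAO persistence layer mentioned above typically exposes save/find operations behind a plain interface; as an illustrative sketch only, here is that shape with an invented Trade entity and an in-memory map standing in for the real Hibernate-mapped tables:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Illustrative DAO shape; the Trade entity and in-memory store are
// stand-ins for ORM-mapped tables behind the same interface.
public class TradeDao {
    public static class Trade {
        public final long id;
        public final String symbol;
        public final double price;
        public Trade(long id, String symbol, double price) {
            this.id = id; this.symbol = symbol; this.price = price;
        }
    }

    private final Map<Long, Trade> store = new HashMap<>();

    // Persist (or overwrite) an entity by its primary key.
    public void save(Trade t) { store.put(t.id, t); }

    // Look up an entity; Optional.empty() when no row matches.
    public Optional<Trade> findById(long id) {
        return Optional.ofNullable(store.get(id));
    }
}
```

The point of the pattern is that callers depend only on `save`/`findById`, so the Hibernate-backed implementation can be swapped for a test double without touching business code.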

Environment: Core Java, Servlets, JSP, Struts 1.3, Oracle, Java, J2EE, HTML, Hibernate, DHTML, CSS, JavaScript, XML, EJB, SQL Server 2008 R2, Linux, Windows 7/Vista/XP

Confidential

Software Engineer (J2EE)

Responsibilities:

  • Actively participated in requirements gathering, analysis and design and testing phases.
  • Responsible for use case diagrams, class diagrams and sequence diagrams using Rational Rose in the Design phase.
  • Involved in Analysis, design and coding on J2EE Environment.
  • Implemented connectivity with MySQL and Oracle databases.
  • Involved in writing the database integration code.
  • Provided web application support and maintenance using Java/J2EE (Struts 1.2), Oracle, MS SQL Server and MySQL.
  • Used JDBC for data retrieval from the database for various inquiries.
  • Involved in writing database connection classes for interacting with the Oracle database.
  • Worked with J2EE and core Java concepts such as OOP, GUI and networking.
  • Created quality, working J2EE code to implement use cases on schedule and within cost.
  • Followed the Agile development methodology throughout and tested the application in each iteration.
  • Prepared the design document based on requirements and sent weekly project status reports.
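
The MySQL and Oracle connectivity above hinges on the JDBC connection URL; a small helper like the following shows the standard URL shapes (hosts, ports and schema names here are placeholders, not the project's actual values):

```java
// Hypothetical helper illustrating standard JDBC URL formats for the
// two databases; the returned strings would be passed to
// DriverManager.getConnection along with credentials.
public class JdbcUrls {
    // MySQL Connector/J URL: jdbc:mysql://host:port/database
    public static String mysql(String host, int port, String db) {
        return "jdbc:mysql://" + host + ":" + port + "/" + db;
    }

    // Oracle thin-driver URL in SID form: jdbc:oracle:thin:@host:port:sid
    public static String oracleThin(String host, int port, String sid) {
        return "jdbc:oracle:thin:@" + host + ":" + port + ":" + sid;
    }
}
```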

Environment: Java, J2EE, JSP, JDBC 3.0, Servlets 3.0, SQL 2000, MySQL 5.1, Oracle 10g, Apache Tomcat 6.0
