Hadoop/Spark Developer Resume
NYC, NY
SUMMARY:
- 7+ years of IT experience in the development and implementation of business solutions using Java, Big Data (Hadoop), and Spark.
- 3+ years of dedicated experience in Big Data implementations on the Cloudera distribution.
- Experience in ingestion, storage, querying, processing and analysis of big data.
- Excellent hands-on experience with Apache Hadoop technologies such as Hive, Sqoop, Spark with Scala, Impala, and HBase.
- Good understanding of Hadoop architecture and its components, including HDFS, JobTracker, TaskTracker, NameNode, DataNode, Secondary NameNode, MapReduce, and YARN concepts.
- Expertise in importing and exporting data with Sqoop between HDFS and relational database systems.
- Experience in developing Hadoop integration for data ingestion, data mapping and data processing capabilities.
- Expertise in writing shell scripts, HiveQL queries, and MapReduce programs.
- Experience in ETL processes using Spark (Scala).
- Hands on experience in Spark SQL.
- Hands on experience in streaming data tools like Flume and Kafka.
- Worked in Agile Environment.
- Worked with semi-structured data such as JSON and XML files.
- Extensive experience in developing web applications using Core Java, JDBC, SQL, Servlets, Hibernate, the Eclipse IDE, and Oracle RDBMS.
- Very good experience with the complete project life cycle (design, development, testing, and implementation) of web applications.
- Hands-on experience with VPN and PuTTY.
- Exceptional ability to learn new concepts.
TECHNICAL SKILLS:
Hadoop Eco System: HDFS, MapReduce, Hive, Sqoop, Flume, HBase, Impala, Kafka
NoSQL: HBase
Databases: MySQL, Oracle 10g
Languages: Scala, Java, J2EE, SQL, HiveQL
Operating Systems: Windows XP/10, Linux
IDEs & Utilities: Eclipse, Hue
PROFESSIONAL EXPERIENCE:
Confidential, NYC, NY
Hadoop/Spark Developer
Responsibilities:
- Involved in gathering the requirements, designing, development and testing.
- Involved in running Spark jobs over YARN.
- Experience with Cloudera distributions (CDH5).
- Developed Sqoop scripts to import data from MySQL and Oracle into Hadoop.
- Configured and used Kafka for data ingestion from various data sources containing different file formats like JSON and XML.
- Involved in writing HDFS CLI commands.
- Loaded data into Spark RDDs and performed in-memory computation using Spark SQL DataFrames.
- Created external Hive tables to store Spark output.
- Knowledgeable about the different layers of the data flow.
- Performed advanced procedures like text analytics and processing, in Spark using Scala.
- Worked with different file formats such as Avro, RC, ORC, and Parquet, and with compression techniques in Hive storage.
- Created Hive managed tables, external tables, and views to store the processed data, and integrated them with the reporting tool Tableau.
- Involved in writing Hive join queries to extract data per business requirements.
- Involved in performance tuning for skewed data in Hive.
- Automated jobs using Oozie workflows.
- Worked on ad-hoc requests, providing data based on user requirements.
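The ingestion-and-computation flow described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the paths, table, and column names are hypothetical, and it uses the SparkSession entry point from Spark 2.x for readability (on Spark 1.6, the equivalent entry point was HiveContext with registerTempTable).

```scala
import org.apache.spark.sql.SparkSession

// Minimal sketch of the flow above: load semi-structured JSON from HDFS,
// compute in memory with Spark SQL DataFrames, and write Parquet output
// backing an external Hive table. All paths and names are hypothetical.
object EventLoad {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("EventLoad")
      .enableHiveSupport()
      .getOrCreate()

    // Load JSON landed on HDFS into a DataFrame
    val events = spark.read.json("hdfs:///data/raw/events/")

    // In-memory aggregation with Spark SQL
    events.createOrReplaceTempView("events")
    val daily = spark.sql(
      "SELECT event_date, COUNT(*) AS cnt FROM events GROUP BY event_date")

    // Write Parquet to the HDFS location backing an external Hive table
    daily.write.mode("overwrite").parquet("hdfs:///data/processed/daily_events/")

    spark.stop()
  }
}
```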
Environment: Hadoop Cloudera Distribution (CDH 5.10), Apache Hadoop 2.6.0, Hive 1.1.0, Sqoop 1.4.6, Kafka, Spark 1.6.0, Hue 3.5.0, Linux, Oozie, MySQL, Oracle, Windows
Confidential
Hadoop Developer
Responsibilities:
- Configured and used Flume for data ingestion.
- Involved in writing UNIX shell scripts to automate Sqoop jobs.
- Experience loading and transforming large sets of structured and semi-structured data.
- Wrote Apache Pig scripts to process XML data and parse it into a structured format.
- Implemented best-offer logic using Pig scripts.
- Used PiggyBank functions to process XML files.
- Used Pig diagnostic operators to evaluate step-by-step execution of Pig statements.
- Used Eval functions in Pig to perform arithmetic calculations on data based on business needs.
- Wrote Hive join queries per the requirements.
- Worked with different file formats such as Avro, RC, ORC, and Parquet, and with compression techniques in Hive storage.
- Created managed tables and views to store processed data.
- Experience with Cloudera distributions (CDH5).
- Involved in Code reviews.
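The XML-processing step above was implemented in Pig with PiggyBank functions; a roughly equivalent parse can be sketched in Scala, the language used elsewhere in this resume. The `<offer>` schema here is a made-up example, not the project's actual data.

```scala
import scala.xml.XML

// Sketch of parsing semi-structured XML into flat, structured records,
// analogous to the Pig/PiggyBank flow described above. The <offer>
// element and its fields are hypothetical. Requires the scala-xml
// module on Scala 2.12 and later.
object OfferParser {
  case class Offer(id: String, price: Double)

  def parse(raw: String): Seq[Offer] = {
    val doc = XML.loadString(raw)
    (doc \\ "offer").map { node =>
      Offer((node \ "@id").text, (node \ "price").text.toDouble)
    }
  }

  def main(args: Array[String]): Unit = {
    val sample =
      """<offers><offer id="1"><price>9.99</price></offer></offers>"""
    parse(sample).foreach(println) // prints each parsed Offer record
  }
}
```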
Environment: Hadoop Cloudera Distribution (CDH 5.10), Apache Hadoop 2.3.0, Hive 1.1.0, Sqoop 1.4.2, Flume 1.3.0, Pig 0.2.0, Hue 3.5.0, Windows
Confidential
Software Engineer (J2EE)
Responsibilities:
- Involved in various phases of Software Development Life Cycle (SDLC) of the application like Requirement gathering, Design, Analysis and Code development.
- Developed a prototype of the application and demonstrated to business users to verify the application functionality.
- Integrated Spring with Hibernate framework and created Hibernate annotations for mapping an object-oriented domain model to traditional relational database.
- Built a data synchronization application using Java multithreading to help the client maximize profits.
- Responsible for implementing RESTful web services.
- Developed Object-Relational (O/R) mapping using Hibernate 3.0 and developed the Data Access Object (DAO) persistence layer using Hibernate 3.0.
- Conversant with tools like Eclipse and IntelliJ IDEA.
- Involved in the design and implementation of various business scenarios in the trading flow using Spring.
- Used Hibernate as the persistence framework, mapping ORM objects to tables; developed HQL and SQL queries.
- Expert in creating various PL/SQL stored procedures, views, functions, and temporary tables for data input to Crystal Reports.
Environment: Core Java, Servlets, JSP, Struts 1.3, Oracle, J2EE, HTML, Hibernate, DHTML, CSS, JavaScript, XML, EJB, SQL Server 2008 R2, Linux, Windows 7/Vista/X
Confidential
Software Engineer (J2EE)
Responsibilities:
- Actively participated in the requirements gathering, analysis, design, and testing phases.
- Responsible for use case diagrams, class diagrams and sequence diagrams using Rational Rose in the Design phase.
- Involved in analysis, design, and coding in a J2EE environment.
- Implemented connectivity with MySQL and Oracle databases.
- Involved in writing the database integration code.
- Provided web application support and maintenance using Java/J2EE (Struts 1.2), Oracle, MS SQL Server, and MySQL.
- Used JDBC for data retrieval from the database for various inquiries.
- Involved in writing database connection classes for interacting with Oracle database.
- Worked with J2EE and core Java concepts such as OOP, GUI, and networking in Java.
- Wrote quality J2EE code to implement use cases within the design, schedule, and cost constraints.
- Followed the Agile development methodology throughout and tested the application in each iteration.
- Prepared the design document based on requirements and sent project status reports on a weekly basis.
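The JDBC retrieval pattern described above can be sketched as follows, written in Scala for consistency with the other examples in this document; the connection URL, credentials, table, and column names are placeholders, not values from the actual project.

```scala
import java.sql.DriverManager
import scala.collection.mutable.ListBuffer

// Sketch of JDBC-based data retrieval for an inquiry, as described above.
// The URL, credentials, and query are hypothetical placeholders.
object InquiryDao {
  def fetchActiveCustomers(url: String, user: String, pass: String): List[String] = {
    val conn = DriverManager.getConnection(url, user, pass)
    try {
      // Parameterized query avoids SQL injection
      val stmt = conn.prepareStatement(
        "SELECT name FROM customers WHERE active = ?")
      stmt.setInt(1, 1)
      val rs = stmt.executeQuery()
      val names = ListBuffer[String]()
      while (rs.next()) names += rs.getString("name")
      names.toList
    } finally conn.close() // always release the connection
  }
}
```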
Environment: Java, J2EE, JSP, JDBC 3.0, Servlets 3.0, SQL 2000, MySQL 5.1, Oracle 10g, Apache Tomcat 6.0