Hadoop/Spark Developer Resume, MA
SUMMARY:
- 6+ years of IT industry experience with strong emphasis on Big Data/Hadoop, Apache Spark, Java/J2EE, Scala and Python.
- Around 4 years of experience working with the Hadoop ecosystem, including MapReduce, HDFS, Hive, Pig, Spark SQL, Spark Streaming, YARN, Kafka, HBase, MongoDB, Cassandra, ZooKeeper, Sqoop, Flume, Impala, Oozie and Storm.
- Strong experience in writing complex Pig scripts, Hive data modeling and MapReduce jobs.
- Extended Hive and Pig core functionality by writing custom UDFs.
- Experience with different Hadoop distributions: Cloudera (CDH3 & CDH4), Hortonworks (HDP) and Amazon Elastic MapReduce (EMR).
- Experience handling different file formats, including Parquet, Apache Avro, SequenceFile, JSON, spreadsheets, text files, XML and flat files.
- Expertise in writing shell scripts, cron automation and regular expressions.
- Good knowledge of AWS services such as EMR and EC2, which provide fast and efficient processing of Big Data.
- Experience developing shell and Python scripts for system management.
- Comprehensive knowledge of debugging, optimizing and performance tuning of DB2, Oracle and MySQL databases.
- Working knowledge of ETL tools such as Informatica PowerCenter, plus Oracle 9i, SDLC, QA/UAT and technical documentation.
- Data visualization using Tableau, Talend, Pentaho and the Google Visualization API.
- Well versed in software development methodologies such as Rapid Application Development (RAD), Agile and Scrum.
TECHNICAL SKILLS:
Big Data Technologies: Hadoop, MapReduce, Pig, HBase, Hive, Storm, Kafka, Spark, Spark SQL, Spark Streaming, Flume, YARN, Sqoop, Cassandra and MongoDB.
Databases: Oracle 10g/11g, MS SQL Server, DB2, MySQL, HBase, Cassandra, MongoDB.
Languages: Java/J2EE, Scala, Python, C, PL/SQL.
Java Frameworks: Spring MVC, JDBC, JSP, JSON, JSTL, RMI, Servlets, EJB, Hibernate, Struts
Web Technologies: HTML, XML, CSS, JavaScript, JQuery, AJAX.
Build Tools: Maven 2.x, 3.x.
Operating System: Windows, Unix, Linux and Ubuntu.
App/Web Server: WebSphere, WebLogic, JBoss, Apache Tomcat 5.1 and Java Web Server 2.0.
Tools and IDEs: Eclipse, Scala IDE, IntelliJ IDEA.
Version Control Systems: SVN, Git, CVS, VSS and ClearCase.
Methodologies: Agile, SDLC.
WORK EXPERIENCE:
Confidential
Hadoop/Spark Developer
Responsibilities:
- Designed and deployed a Hadoop cluster and various Big Data analytic tools, including Pig, Hive, HBase, Oozie, ZooKeeper, Sqoop, Flume, Spark, Impala and Cassandra, on the Hortonworks distribution.
- Performed real-time streaming of data using Spark with Kafka (see the sketch after this list).
- Responsible for developing a data pipeline using Flume, Sqoop and Pig to extract data from weblogs and store it in HDFS.
- Experienced with batch processing of data sources using Apache Spark and Elasticsearch.
- Experienced in implementing Spark RDD transformations and actions to carry out business analysis.
- Implemented Spark using Scala and Spark SQL for faster testing and processing of data.
- Loaded data into the cluster from dynamically generated files using Flume and from relational database management systems using Sqoop.
- Integrated user data from Cassandra with data in HDFS; integrated Cassandra with Storm for real-time user attribute lookups.
- Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs and Scala (see the Spark SQL sketch below).
- Involved in importing real-time data to Hadoop using Kafka and implemented Oozie jobs for daily imports.
- Automated the process for extraction of data from warehouses and weblogs by developing work-flows and coordinator jobs in Oozie.
- Created Hive tables, loaded data into them and wrote Hive UDFs.
- Worked with Talend Open Studio for ETL processes.
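A minimal sketch of the Spark-with-Kafka streaming pattern referenced in the list above, using the spark-streaming-kafka-0-10 direct-stream API in Java. The broker address, consumer group and "weblogs" topic are hypothetical placeholders, not details of this engagement.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka010.ConsumerStrategies;
import org.apache.spark.streaming.kafka010.KafkaUtils;
import org.apache.spark.streaming.kafka010.LocationStrategies;

public class WeblogStream {
    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf().setAppName("WeblogStream");
        // Process the stream in 5-second micro-batches.
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(5));

        Map<String, Object> kafkaParams = new HashMap<>();
        kafkaParams.put("bootstrap.servers", "broker1:9092");      // hypothetical broker
        kafkaParams.put("key.deserializer", StringDeserializer.class);
        kafkaParams.put("value.deserializer", StringDeserializer.class);
        kafkaParams.put("group.id", "weblog-consumers");           // hypothetical group

        // Direct stream: one Spark partition per Kafka partition, no receivers.
        JavaInputDStream<ConsumerRecord<String, String>> stream =
            KafkaUtils.createDirectStream(
                jssc,
                LocationStrategies.PreferConsistent(),
                ConsumerStrategies.<String, String>Subscribe(
                    Collections.singletonList("weblogs"), kafkaParams));

        // Pull out the message payloads and print a sample of each batch.
        stream.map(ConsumerRecord::value).print();

        jssc.start();
        jssc.awaitTermination();
    }
}
```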
Environment: Hadoop, MapReduce, HDFS, Hive, Pig, Spark, Spark Streaming, Spark SQL, Apache Kafka, Sqoop, Java, Scala, CDH4, CDH5, AWS, Eclipse, Oracle, Git, Shell Scripting and Cassandra.
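Likewise, a sketch of the Hive-to-Spark-SQL conversion mentioned in the list, shown here with the newer SparkSession API for brevity (the work itself used Scala and Spark RDDs). The weblogs table and its url column are hypothetical.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class HiveToSparkSql {
    public static void main(String[] args) {
        // enableHiveSupport() lets Spark SQL query existing Hive tables directly.
        SparkSession spark = SparkSession.builder()
            .appName("HiveToSparkSql")
            .enableHiveSupport()
            .getOrCreate();

        // A Hive aggregation re-expressed as a Spark SQL statement...
        Dataset<Row> hits = spark.sql(
            "SELECT url, COUNT(*) AS hits FROM weblogs GROUP BY url");

        // ...and the same result via the DataFrame API instead of SQL text.
        Dataset<Row> hitsApi = spark.table("weblogs").groupBy("url").count();

        hits.show();
        hitsApi.show();
    }
}
```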
Confidential, MA
Hadoop Developer
Responsibilities:
- Responsible for Installation and configuration of Hive, Pig, Sqoop, Flume and Oozie on the Hadoop cluster.
- Imported data from MySQL and Oracle into HDFS using Sqoop.
- Worked extensively with log files, copying them into HDFS using Flume.
- Developed simple and complex MapReduce programs in Java for data analysis on different data formats (see the first sketch after this list).
- Proficient in Extract, Transform, Load (ETL) using BI tools: extracting data from one or more source systems, applying rules or functions to transform it, and loading it into the end target or destination.
- Optimized MapReduce jobs to use HDFS efficiently through various compression mechanisms.
- Developed workflows using Oozie to automate the tasks of loading the data into HDFS and pre-processing with Pig.
- Developed Pig scripts for data analysis and extended Pig's functionality by developing custom UDFs.
- Implemented partitioning, dynamic partitions and buckets in Hive.
- Developed shell, Perl and Python scripts to automate and provide control flow to Pig scripts.
- Worked on designing NoSQL schemas in HBase.
- Wrote shell scripts to monitor the health of the Hadoop daemon services and respond to any warning or failure conditions.
- Worked on Apache Kafka for real-time processing of data; sound knowledge of Kafka's producer, consumer and broker concepts (see the second sketch below).
- Involved in loading data from UNIX file system to HDFS, configuring Hive and writing Hive UDFs.
- Converted Avro data into Parquet format in Impala for faster query processing.
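A first sketch of the kind of Java MapReduce program described in the list above: counting HTTP status codes in web-server logs. The class name, the assumption that the status code is the ninth space-separated field, and the input/output paths are all hypothetical.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class StatusCodeCount {

    // Map: emit (statusCode, 1) for each log line.
    public static class CodeMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text code = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context ctx)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split(" ");
            if (fields.length > 8) {           // common log format: code is field 9
                code.set(fields[8]);
                ctx.write(code, ONE);
            }
        }
    }

    // Reduce (and combine): sum the counts per status code.
    public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context ctx)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) sum += v.get();
            ctx.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "status-code-count");
        job.setJarByClass(StatusCodeCount.class);
        job.setMapperClass(CodeMapper.class);
        job.setCombinerClass(SumReducer.class);   // safe: summing is associative
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```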
Environment: Hadoop, HDFS, Hive, Pig, Linux, UNIX, Shell Scripting, Kafka, Python, HBase, ZooKeeper, MapReduce, Java, Sqoop, YARN, Parquet and Avro.
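And a second sketch, for the Kafka producer concept noted in the list: a minimal Java producer built on the standard kafka-clients API. The broker address, "events" topic, key and payload are hypothetical.

```java
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class EventProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092");            // hypothetical broker
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        // The producer batches records and ships them to the brokers that
        // lead each topic partition; the key controls partition assignment.
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("events", "host-01", "disk_usage=91"));
            producer.flush();
        }
    }
}
```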
Confidential, Lewisville, Texas
Java/Hadoop Developer
Responsibilities:
- Experienced in writing MapReduce jobs using Pig Latin scripts and Pig UDFs in Java to discover trends in data usage by users (see the UDF sketch after this list).
- Imported log files into HDFS using Flume and loaded them into Hive tables to query the data.
- Coordinated cluster services using Zookeeper.
- Involved in loading data from Linux file system to HDFS.
- Worked with NoSQL databases like HBase, creating HBase tables to load large sets of semi-structured data coming from various sources.
- Created partitioned Hive tables and worked on them using HiveQL.
- Designed and implemented multi-level partitioning and buckets in Hive.
- Worked on importing and exporting data from Oracle and DB2 into HDFS and Hive using Sqoop.
- Experienced in writing shell scripts.
- Hands-on experience working with SequenceFile and Avro file formats.
- Used Oozie job scheduler to automate the job flows.
- Developed Spring MVC classes for handling requests received from front end logic such as JSP pages.
- Involved in the design and development of UI components using AngularJS, JavaScript, HTML5, CSS and Bootstrap.
- Performed unit testing of the application using the JUnit framework.
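A minimal sketch of a Java Pig UDF of the sort referenced in the first bullet of this list; ToLowerUdf is a hypothetical example that normalizes a chararray field. After REGISTER-ing its jar, a script could invoke it as, e.g., FOREACH logs GENERATE ToLowerUdf(url).

```java
import java.io.IOException;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

// Hypothetical UDF: lower-cases a chararray field (e.g., to normalize URLs).
public class ToLowerUdf extends EvalFunc<String> {
    @Override
    public String exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0 || input.get(0) == null) {
            return null;    // Pig convention: propagate nulls rather than fail
        }
        return ((String) input.get(0)).toLowerCase();
    }
}
```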
Environment: Java, Struts, JSP, JDBC, Spring, XML, Hadoop, HDFS, Pig, Hive, HBase, MapReduce using Java, Oozie, ZooKeeper, Linux, UNIX Shell Scripting, Flume, AngularJS, JavaScript, jQuery, HTML5, CSS, JUnit.
Confidential
Java Developer
Responsibilities:
- Involved in the complete software development life cycle (SDLC) of the application, including requirement analysis, design, development and testing.
- Used Spring transactions for handling rollbacks and batched prepared statements for batch loads/updates to improve performance.
- Developed and implemented the MVC architectural pattern using the Struts framework, including JSP, Servlets, EJB, Form Beans and Action classes.
- Implemented the application using Spring, Spring IoC, Spring annotations, Spring MVC, Spring transactions, Hibernate 3.0, SQL, IBM WebSphere 8 and JBoss (see the controller sketch after this list).
- Developed DAOs (Data Access Objects) using Hibernate as the ORM to interact with the Oracle DBMS.
- Used the Toad database tool to develop Oracle queries.
- Involved in writing SQL queries and PL/SQL stored procedures, functions, triggers, cursors, object types, sequences, indexes, etc.
- Developed the UI using JavaScript, JSP, HTML, CSS and AngularJS for interactive cross-browser functionality and a complex user interface.
- Developed the XML schema and web services for data maintenance and structures. Wrote JUnit test cases for unit testing of classes.
- Used REST and SOAP Web Services to exchange information.
- Implemented the Agile development process across the software development life cycle.
- Actively participated in daily SCRUM meetings to produce quality deliverables on time.
- Developed the web-based presentation layer using JSP, AJAX and Servlets, implemented with the Struts framework.
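A minimal sketch of the Spring MVC request handling described above: an annotated controller that serves a GET request and forwards to a JSP view. AccountController, the /accounts path and the model attribute are hypothetical, and the view name assumes a conventional JSP ViewResolver.

```java
import org.springframework.stereotype.Controller;
import org.springframework.ui.Model;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestMethod;

@Controller
public class AccountController {

    // Handles GET /accounts and forwards to the "accounts" JSP view.
    @RequestMapping(value = "/accounts", method = RequestMethod.GET)
    public String listAccounts(Model model) {
        model.addAttribute("title", "Account Summary");   // hypothetical model data
        return "accounts";   // resolved to e.g. /WEB-INF/jsp/accounts.jsp
    }
}
```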
Environment: Java, Spring MVC, Struts, Hibernate, HTML, JavaScript, JSP, AJAX, IBM WebSphere, Apache Tomcat, Oracle 10g, SQL, PL/SQL, XML, UML, REST, SOAP, Eclipse.
Confidential
Jr. Java Developer
Responsibilities:
- Involved in the analysis, design, development and testing phases of the Software Development Life Cycle (SDLC).
- Used JSP and JSTL tag libraries for developing user interface components.
- Designed and developed dynamic web pages and worked extensively on the user interface for a few modules using HTML, JSP and JavaScript.
- Used Frames and Cascading Style Sheets (CSS) to give a better view to the Web Pages.
- Designed and developed framework components and was involved in designing the MVC pattern using the Struts and Spring frameworks.
- Developed the Action classes and ActionForm classes, created JSPs using Struts tag libraries and configured them in the struts-config.xml and web.xml files.
- Used SOAP for exchanging XML based messages.
- Wrote various SQL queries for accessing data from the database and used JDBC for Java database connectivity (see the sketch below).
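A minimal sketch of the JDBC pattern from the last bullet: a parameterized query against an Oracle database. The connection URL, credentials and EMP table are hypothetical placeholders.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class EmployeeLookup {
    public static void main(String[] args) throws SQLException {
        String url = "jdbc:oracle:thin:@dbhost:1521:orcl";      // hypothetical DB
        try (Connection conn = DriverManager.getConnection(url, "app_user", "secret");
             PreparedStatement ps = conn.prepareStatement(
                 "SELECT ename FROM emp WHERE deptno = ?")) {
            ps.setInt(1, 10);   // bind the department number parameter
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    System.out.println(rs.getString("ename"));
                }
            }
        }
    }
}
```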
Environment: Java, J2EE, JSP, HTML, Servlets, Struts, Spring Framework, JavaScript, XML, JDBC, Oracle 9i, PL/SQL, MVC, WebLogic, Eclipse.