
Hadoop / Spark Developer Resume


Charlotte, NC

PROFESSIONAL SUMMARY:

  • Around 8 years of professional IT experience, including 4+ years of Big Data ecosystem experience in the ingestion, querying, processing, and analysis of big data.
  • Experience using Hadoop ecosystem components such as MapReduce, HDFS, HBase, ZooKeeper, Hive, Sqoop, Pig, Flume, and Spark, and in working with Hadoop clusters using the Cloudera and Hortonworks distributions.
  • Experience includes requirements gathering, design, development, integration, documentation, testing, and build.
  • Knowledge and experience in Spark using Python and Scala.
  • Knowledge of the big data database HBase and the NoSQL databases MongoDB and Cassandra.
  • Experience building Spark applications in Scala to ease Hadoop transitions.
  • Extended Hive and Pig core functionality by writing custom UDFs (a minimal sketch follows this list).
  • Solid knowledge of Hadoop architecture and core components: NameNode, DataNodes, JobTracker, TaskTrackers, Oozie, Scribe, Hue, Flume, HBase, etc.
  • Extensively worked on the development and optimization of MapReduce programs, Pig scripts, and Hive queries to create structured data for data mining.
  • Ingested data from RDBMSs, performed data transformations, and exported the transformed data to Cassandra per business requirements.
  • Provisioned and managed multi-tenant Hadoop clusters in a public cloud environment, Amazon Web Services (AWS), and on private cloud infrastructure, the OpenStack cloud platform.
  • Loaded some of the data into Cassandra for fast retrieval.
  • Worked with both Scala and Java, and created frameworks for processing data pipelines through Spark.
  • Implemented a batch-processing solution for large volumes of unstructured data using the Hadoop MapReduce framework.
  • Very good experience with partitioning and bucketing concepts in Hive; designed both managed and external Hive tables to optimize performance.
  • Knowledge of ETL methods for data extraction, transformation, and loading in corporate-wide ETL solutions, and of data warehouse tools for data analysis.
  • Experience in database design, data analysis, and SQL programming.
  • Experience in extending Hive and Pig core functionality with custom user-defined functions.
  • Experience in writing custom classes, functions, procedures, problem management, library controls, and reusable components.
  • Working knowledge of Oozie, a workflow scheduler system used to manage jobs that run Pig, Hive, and Sqoop.
  • Experienced in integrating Java-based web applications in a UNIX environment.
  • Experience working with Red Hat Enterprise Linux.
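
Below is a minimal sketch of the kind of custom Hive UDF described above, using the classic org.apache.hadoop.hive.ql.exec.UDF API. The class name and the normalization rule are illustrative assumptions, not code from the projects listed later.

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Illustrative Hive UDF: normalizes free-text state names to two-letter codes.
// Hive resolves the evaluate() method by reflection at query time.
public class NormalizeState extends UDF {
    public Text evaluate(Text input) {
        if (input == null) return null;
        String s = input.toString().trim().toLowerCase();
        if (s.contains("north carolina")) return new Text("NC");
        if (s.contains("south carolina")) return new Text("SC");
        return new Text(s.toUpperCase());   // assume the value is already a code
    }
}

After packaging into a JAR, such a UDF would be registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION before being called from HiveQL like any built-in function.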

TECHNICAL SKILLS:

Hadoop/Big Data: HDFS, MapReduce, Hive, Pig, Sqoop, Flume, Oozie, and Spark (Python and Scala)

NoSQL Databases: HBase, Cassandra, MongoDB

Languages: C, C++, Java, J2EE, PL/SQL, Pig Latin, HiveQL, Unix shell scripts, R

Java/J2EE Technologies: Applets, Swing, JDBC, JSON, JSTL, JMS, JavaScript, JSP, Servlets, EJB, JSF, jQuery

Frameworks: MVC, Struts, Spring, Hibernate

ETL: IBM WebSphere / Oracle

Operating Systems: Sun Solaris, UNIX, Red Hat Linux, Ubuntu Linux and Windows XP/Vista/7/8

Web Technologies: HTML, DHTML, XML, WSDL, SOAP

Web/Application servers: Apache Tomcat, WebLogic, JBoss

Databases: Oracle, SQL Server, MySQL

Tools and IDEs: Eclipse, NetBeans, JDeveloper, DbVisualizer

Network Protocols: TCP/IP, UDP, HTTP, DNS

PROFESSIONAL EXPERIENCE:

Confidential, Charlotte, NC

Hadoop / Spark Developer

Responsibilities:

  • Worked with Hadoop ecosystem components such as HBase, Sqoop, ZooKeeper, Oozie, Hive, and Pig on the Cloudera Hadoop distribution.
  • Developed Pig and Hive UDFs in Java to extend Pig and Hive functionality, and wrote Pig scripts for sorting, joining, filtering, and grouping the data.
  • Developed Spark programs for the application to process data faster than standard MapReduce programs.
  • Developed Spark programs in Scala, created Spark SQL queries, and developed Oozie workflows for Spark jobs.
  • Developed Oozie workflows with Sqoop actions to migrate data from relational databases such as Oracle and Teradata to HDFS.
  • Used Hadoop FS actions to move data from upstream locations to local data locations.
  • Wrote extensive Hive queries to transform the data used by downstream models.
  • Developed MapReduce programs as part of predictive analytical model development.
  • Developed Hive queries to analyze the data and generate end reports for business users.
  • Worked on scalable distributed computing systems, software architecture, data structures, and algorithms using Hadoop, Apache Spark, and Apache Storm, and ingested streaming data into Hadoop using Spark, the Storm framework, and Scala.
  • Gained experience with NoSQL databases such as MongoDB.
  • Extensively used SVN as the code repository and VersionOne to manage the day-to-day agile development process and track issues and blockers.
  • Wrote Spark Python code for the model integration layer.
  • Implemented Spark jobs in Scala and Java, using DataFrames and the Spark SQL API for faster data processing.
  • Developed Spark code and Spark SQL/Streaming jobs for faster testing and processing of data.
  • Used Spark for interactive queries, streaming data processing, and integration with a popular NoSQL database for huge data volumes.
  • Developed a data pipeline using Kafka, HBase, Spark on Mesos, and Hive to ingest, transform, and analyze customer behavioral data (a sketch follows this job entry).

Environment: Hadoop, Hive, Impala, Oracle, Spark, Scala, Python, Pig, Sqoop, Oozie, MongoDB, MapReduce, SVN.
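
The ingest leg of the Kafka-to-Hive pipeline mentioned above could look like the following hedged sketch using Spark Structured Streaming in Java; the broker address, topic name, and HDFS paths are illustrative assumptions, and the HBase and Mesos pieces of the original pipeline are omitted.

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class ClickstreamIngest {
    public static void main(String[] args) throws Exception {
        SparkSession spark = SparkSession.builder()
            .appName("clickstream-ingest")
            .getOrCreate();

        // Subscribe to the raw customer-event topic (names are illustrative).
        Dataset<Row> events = spark.readStream()
            .format("kafka")
            .option("kafka.bootstrap.servers", "broker1:9092")
            .option("subscribe", "customer-events")
            .load();

        // Kafka delivers the payload as binary; cast it to a string and
        // land it as Parquet in HDFS for Hive to query downstream.
        events.selectExpr("CAST(value AS STRING) AS event_json", "timestamp")
            .writeStream()
            .format("parquet")
            .option("path", "hdfs:///data/customer_events")
            .option("checkpointLocation", "hdfs:///checkpoints/customer_events")
            .start()
            .awaitTermination();
    }
}

Hive external tables defined over the Parquet output would then support the analysis of customer behavioral data described in the bullets.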

Confidential, McLean, VA

Hadoop Developer

Responsibilities:

  • Designed the Hadoop jobs that create product recommendations using collaborative filtering.
  • Designed the COSA pretest utility framework using MVC, JSF validation, tag libraries, and JSF backing beans.
  • Integrated the Order Capture system with Sterling OMS using a JSON web service.
  • Configured and implemented Jenkins, Maven, and Nexus for continuous integration.
  • Mentored the team on and implemented test-driven development (TDD) strategies.
  • Loaded the data from Oracle into HDFS using Sqoop.
  • Developed the data transformation scripts using Hive and MapReduce (see the MapReduce sketch after this job entry).
  • Developed Pig scripts using Pig Latin.
  • Exported data from HDFS to Teradata on a regular basis using Sqoop.
  • Developed scripts and batch jobs to schedule various Hadoop programs.
  • Wrote Hive queries for data analysis to meet business requirements, and designed and developed user-defined functions (UDFs) for Hive.
  • Created Hive tables and worked on them using HiveQL.
  • Experienced in defining job flows.
  • Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
  • Designed and implemented a MapReduce-based large-scale parallel relation-learning system.
  • Wrote the MapReduce code for the flow from Hadoop Flume to ES Head.

Environment: Hadoop, MapReduce, Hortonworks, HDFS, Hive, Java, Jenkins, Maven, MVC, Cloudera, Pig, Linux, XML, MySQL, MySQL Workbench, Java 6, Eclipse, SQL connector.
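
As a hedged illustration of the MapReduce transformation work above, the job below counts orders per product from pipe-delimited input; the class names and the column layout are hypothetical, not taken from the actual project.

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Illustrative MapReduce job: counts orders per product from delimited rows.
public class ProductOrderCount {
    public static class OrderMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text productId = new Text();
        @Override
        protected void map(LongWritable key, Text value, Context ctx)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split("\\|");
            if (fields.length > 2) {      // skip malformed rows
                productId.set(fields[1]); // column position is hypothetical
                ctx.write(productId, ONE);
            }
        }
    }
    public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context ctx)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) sum += v.get();
            ctx.write(key, new IntWritable(sum));
        }
    }
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "product-order-count");
        job.setJarByClass(ProductOrderCount.class);
        job.setMapperClass(OrderMapper.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}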

Confidential, Washington, DC

Hadoop / Spark Developer

Responsibilities:

  • Monitored and managed daily jobs processing around 200k files per day, tracking them through RabbitMQ and an Apache dashboard application.
  • Monitored workload, job performance, and capacity using InsightIQ storage performance monitoring and storage analytics; experienced in defining job flows.
  • Analyzed the Hadoop cluster using different big data analytics tools, including Pig, Hive, and MapReduce.
  • Designed and implemented a Cassandra-based database and related web services for storing unstructured data.
  • Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
  • Very good understanding of partitioning and bucketing concepts in Hive; designed both managed and external Hive tables for optimized performance (a sketch follows this job entry).
  • Used Sqoop to efficiently transfer data between databases and HDFS, and used Flume to stream web log data from servers and sensors.
  • Developed MapReduce programs to cleanse data in HDFS obtained from heterogeneous data sources, making it suitable for ingestion into a Hive schema for analysis.
  • Developed Spark programs for the application to process data faster than standard MapReduce programs.
  • Created reports for the BI team, using Sqoop to export data into HDFS and Hive.
  • Involved in collecting and aggregating large amounts of log data using Apache Flume and staging the data in HDFS for further analysis.
  • Developed Pig Latin scripts to extract data from web server output files and load it into HDFS.
  • Worked on MapReduce joins to query multiple semi-structured datasets per analytic needs.
  • Involved in loading data in different formats (Avro, Parquet) from the Unix file system into HDFS, creating indexes and tuning SQL queries in Hive, and handling database connections using Sqoop.
  • Set up high availability for GPHD 2.2 with ZooKeeper and quorum journal nodes.
  • Automated data extraction from warehouses and weblogs by developing workflows and coordinator jobs in Oozie.
  • Worked in an AWS environment on the development and deployment of custom Hadoop applications.
  • Worked with and learned a great deal from Amazon Web Services (AWS) cloud offerings such as EC2, S3, EBS, RDS, and VPC.
  • Provisioned and managed multi-tenant Hadoop clusters in a public cloud environment, Amazon Web Services (AWS), and on private cloud infrastructure, the OpenStack cloud platform.
  • Strong experience working with Elastic MapReduce and setting up environments on Amazon EC2 instances.
  • Scheduled and managed cron jobs, and wrote shell scripts to generate alerts.

Environment: Hadoop, AWS, MapReduce, HDFS, Hive, Pig, Spark, Python, Java 1.6 & 1.7, Linux, Eclipse, Cassandra, Zookeeper
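
The partitioned external Hive table design mentioned above might look like the following hedged sketch, here driven through Spark SQL in Java; the table name, columns, and HDFS locations are illustrative assumptions.

import org.apache.spark.sql.SparkSession;

public class WebLogTables {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
            .appName("weblog-tables")
            .enableHiveSupport()
            .getOrCreate();

        // External table over Parquet weblog files already landed in HDFS;
        // partitioning by load date keeps queries from scanning the full history.
        spark.sql("CREATE EXTERNAL TABLE IF NOT EXISTS weblogs ("
            + " host STRING, url STRING, status INT)"
            + " PARTITIONED BY (load_date STRING)"
            + " STORED AS PARQUET LOCATION '/data/weblogs'");

        // Register a newly landed partition, then run a sample report query.
        spark.sql("ALTER TABLE weblogs ADD IF NOT EXISTS PARTITION (load_date='2016-01-01')");
        spark.sql("SELECT status, COUNT(*) FROM weblogs"
            + " WHERE load_date = '2016-01-01' GROUP BY status").show();

        spark.stop();
    }
}

Keeping the table external means dropping it leaves the underlying HDFS data intact, while the date partition confines typical report queries to a single directory of files.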

Confidential, New York, NY

Java/J2EE Developer

Responsibilities:

  • Worked with Java, J2EE, Struts, web services, and Hibernate in a fast-paced development environment.
  • Followed agile methodology, interacted directly with the client on features, implemented optional solutions, and tailored the application to customer needs.
  • Used Apache POI for reading Excel files.
  • Developed the user interface using JSP and JavaScript to view all online trading transactions.
  • Designed and developed Data Access Objects (DAOs) to access the database (a sketch follows this job entry).
  • Coded JavaServer Pages for the dynamic front-end content, backed by Servlets and EJBs.
  • Coded HTML pages using CSS for static content generation, with JavaScript for validations.
  • Used the JDBC API to connect to the database and carry out database operations.
  • Used JSP and JSTL tag libraries to develop user interface components.
  • Performed code reviews.
  • Performed unit testing, system testing, and integration testing.
  • Involved in building and deploying the application in a Linux environment.

Environment: Java, J2EE, JDBC, Struts, SQL, Hibernate, Eclipse, Apache POI, CSS.
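
A DAO of the kind described above could take the following shape with plain JDBC; the Trade-related names, table, and column are hypothetical stand-ins for the real trading schema.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import javax.sql.DataSource;

// Illustrative DAO: one lookup method against a hypothetical trades table.
public class TradeDao {
    private final DataSource dataSource;

    public TradeDao(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    public double findAmount(long tradeId) throws SQLException {
        String sql = "SELECT amount FROM trades WHERE trade_id = ?";
        try (Connection con = dataSource.getConnection();
             PreparedStatement ps = con.prepareStatement(sql)) {
            ps.setLong(1, tradeId);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getDouble("amount") : 0.0;
            }
        }
    }
}

Prepared statements keep the SQL parameterized, and closing the connection, statement, and result set in all paths avoids leaking pooled resources.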

Confidential, New York, NY

Java/J2EE Developer

Responsibilities:

  • Responsible for understanding the scope of the project and gathering requirements.
  • Developed the web tier using JSP and Struts MVC to show account details and summaries.
  • Created and maintained the configuration of the Spring application framework.
  • Implemented various design patterns: Singleton, Business Delegate, Value Object, and Spring DAO.
  • Used Spring JDBC to write DAO classes that interact with the database to access account information (a sketch follows this job entry).
  • Mapped business objects to the database using Hibernate.
  • Involved in writing Spring configuration XML files containing bean declarations and their dependencies.
  • Used the Tomcat web server for development purposes.
  • Involved in creating test cases for unit testing.
  • Used Oracle as the database and Toad for query execution; wrote SQL scripts and PL/SQL code for procedures and functions.
  • Used CVS and Perforce as configuration management tools for code versioning and releases.
  • Developed the application using Eclipse and used Maven as the build and deployment tool.
  • Used Log4j to print logging, debugging, warning, and info messages on the server console.

Environment: Java, J2EE, JSON, Linux, XML, XSL, CSS, JavaScript, Eclipse
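
The Spring JDBC DAO work mentioned above could follow this hedged sketch built on JdbcTemplate; the AccountDao name, table, and query are hypothetical.

import java.util.List;
import javax.sql.DataSource;
import org.springframework.jdbc.core.JdbcTemplate;

// Illustrative Spring JDBC DAO for account lookups; schema is hypothetical.
public class AccountDao {
    private final JdbcTemplate jdbc;

    public AccountDao(DataSource dataSource) {
        this.jdbc = new JdbcTemplate(dataSource);
    }

    public List<String> findAccountNumbers(String customerId) {
        return jdbc.queryForList(
            "SELECT account_no FROM accounts WHERE customer_id = ?",
            String.class, customerId);
    }
}

JdbcTemplate handles connection acquisition, statement cleanup, and exception translation, which is why such DAO classes stay this short compared with plain JDBC.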
