Hadoop / Spark Developer Resume

Charlotte, NC

PROFESSIONAL SUMMARY:

  • Around 8 years of professional IT experience, including 4+ years of Big Data ecosystem experience in ingestion, querying, processing and analysis of big data.
  • Experience in using Hadoop ecosystem components like MapReduce, HDFS, HBase, ZooKeeper, Hive, Sqoop, Pig, Flume, Spark and Cloudera.
  • Experience in meeting expectations with Hadoop clusters using Cloudera and Hortonworks.
  • Experience includes Requirements Gathering, Design, Development, Integration, Documentation, Testing and Build.
  • Knowledge and experience in Spark using Python and Scala.
  • Knowledge of the big-data database HBase and the NoSQL databases MongoDB and Cassandra.
  • Experience in Spark applications using Scala for easy Hadoop transitions.
  • Extending Hive and Pig core functionality by writing custom UDFs.
  • Solid knowledge of Hadoop architecture and core components: NameNode, DataNodes, JobTracker, TaskTrackers, Oozie, Scribe, Hue, Flume, HBase, etc.
  • Extensively worked on development and optimization of MapReduce programs, Pig scripts and Hive queries to create structured data for data mining.
  • Ingested data from RDBMS, performed data transformations, and then exported the transformed data to Cassandra as per the business requirement.
  • Worked in provisioning and managing multi-tenant Hadoop clusters on a public cloud environment (Amazon Web Services, AWS) and on private cloud infrastructure (OpenStack cloud platform).
  • Loaded some of the data into Cassandra for fast retrieval of data.
  • Worked with both Scala and Java; created frameworks for processing data pipelines through Spark.
  • Implemented a batch-processing solution for large volumes of unstructured data using the Hadoop MapReduce framework.
  • Very good experience with partitioning and bucketing concepts in Hive; designed both managed and external tables in Hive to optimize performance.
  • Knowledge of ETL methods for data extraction, transformation and loading in corporate-wide ETL Solutions and Data warehouse tools for data analysis.
  • Experience in Database design, Data analysis, Programming SQL.
  • Experience in extending Hive and Pig core functionality by using custom User Defined Functions.
  • Experience in writing custom classes, functions, procedures, problem management, library controls and reusable components.
  • Working knowledge of Oozie, a workflow scheduler system to manage the jobs that run on Pig, Hive and Sqoop.
  • Experienced in integrating Java-based web applications in a UNIX environment.
  • Experience working with Red Hat Enterprise Linux.

TECHNICAL SKILLS:

Hadoop/Big Data: HDFS, MapReduce, Hive, Pig, Sqoop, Flume, Oozie and Spark (Python and Scala)

NoSQL Databases: HBase, Cassandra, MongoDB

Languages: C, C++, Java, J2EE, PL/SQL, Pig Latin, HiveQL, Unix shell scripts, R

Java/J2EE Technologies: Applets, Swing, JDBC, JSON, JSTL, JMS, JavaScript, JSP, Servlets, EJB, JSF, jQuery

Frameworks: MVC, Struts, Spring, Hibernate.

ETL: IBM WebSphere/Oracle

Operating Systems: Sun Solaris, UNIX, Red Hat Linux, Ubuntu Linux and Windows XP/Vista/7/8

Web Technologies: HTML, DHTML, XML, WSDL, SOAP

Web/Application servers: Apache Tomcat, WebLogic, JBoss

Databases: Oracle, SQL Server, MySQL

Tools and IDE: Eclipse, NetBeans, JDeveloper, DB Visualizer.

Network Protocols: TCP/IP, UDP, HTTP, DNS

PROFESSIONAL EXPERIENCE:

Confidential, Charlotte, NC

Hadoop / Spark Developer

Responsibilities:

  • Worked with Hadoop ecosystem components like HBase, Sqoop, ZooKeeper, Oozie, Hive and Pig with the Cloudera Hadoop distribution.
  • Developed Pig and Hive UDFs in Java for extended use of Pig and Hive, and wrote Pig scripts for sorting, joining, filtering and grouping the data.
  • Developed programs in Spark based on the application for faster data processing than standard MapReduce programs.
  • Developed Spark programs using Scala, created Spark SQL queries and developed Oozie workflows for Spark jobs.
  • Developed Oozie workflows with Sqoop actions to migrate data from relational databases like Oracle and Teradata to HDFS.
  • Used Hadoop FS actions to move the data from upstream location to local data locations.
  • Wrote extensive Hive queries to do transformations on the data to be used by downstream models.
  • Developed MapReduce programs as part of predictive analytical model development.
  • Developed Hive queries to do analysis of the data and to generate the end reports to be used by business users.
  • Worked on scalable distributed computing systems, software architecture, data structures and algorithms using Hadoop, Apache Spark and Apache Storm, and ingested streaming data into Hadoop using Spark, the Storm framework and Scala.
  • Gained experience with NoSQL databases like MongoDB.
  • Extensively used SVN as a code repository and VersionOne for managing the day-to-day agile project development process and keeping track of issues and blockers.
  • Wrote Spark code in Python for the model integration layer.
  • Implemented Spark applications using Scala and Java, utilizing DataFrames and the Spark SQL API for faster processing of data.
  • Developed Spark code using Spark SQL and Spark Streaming for faster testing and processing of data.
  • Used Spark for interactive queries, processing of streaming data and integration with popular NoSQL databases for huge volumes of data.
  • Developed a data pipeline using Kafka, HBase, Mesos, Spark and Hive to ingest, transform and analyze customer behavioral data.

Environment: Hadoop, Hive, Impala, Oracle, Spark, Scala, Python, Pig, Sqoop, Oozie, MongoDB, MapReduce, SVN.

Confidential, McLean, VA

Hadoop Developer

Responsibilities:

  • Designed the Hadoop jobs to create product recommendations using collaborative filtering.
  • Designed the COSA pretest utility framework using MVC, JSF validation, tag libraries and JSF backing beans.
  • Integrated the Order Capture system with Sterling OMS using a JSON web service.
  • Configured and implemented Jenkins, Maven and Nexus for continuous integration.
  • Provided mentoring on and implemented test-driven development (TDD) strategies.
  • Loaded the data from Oracle to HDFS (Hadoop) using Sqoop.
  • Developed the data transformation scripts using Hive and MapReduce.
  • Developed Pig scripts using Pig Latin.
  • Exported data using Sqoop from HDFS to Teradata on a regular basis.
  • Developed scripts and batch jobs to schedule various Hadoop programs.
  • Wrote Hive queries for data analysis to meet the business requirements, and designed and developed User Defined Functions (UDFs) for Hive.
  • Created Hive tables and worked on them using HiveQL.
  • Experienced in defining job flows.
  • Involved in creating Hive tables, loading them with data and writing Hive queries that run internally as MapReduce jobs.
  • Designed and implemented a MapReduce-based large-scale parallel relation-learning system.
  • Wrote the MapReduce code for the flow from Hadoop Flume to ES Head.

Environment: Hadoop, MapReduce, Hortonworks, HDFS, Hive, Java, Jenkins, Maven, MVC, Cloudera, Pig, Linux, XML, MySQL, MySQL Workbench, Java 6, Eclipse, SQL connector.

Confidential, Washington, DC

Hadoop / Spark Developer

Responsibilities:

  • Monitored and managed daily jobs, processing around 200k files per day, and monitored them through RabbitMQ and the Apache Dashboard application.
  • Monitored workload, job performance and capacity planning using InsightIQ storage performance monitoring and storage analytics; experienced in defining job flows.
  • Worked on analyzing the Hadoop cluster using different big data analytic tools including Pig, Hive and MapReduce.
  • Designed and implemented a Cassandra-based database and related web services for storing unstructured data.
  • Involved in creating Hive tables, loading them with data and writing Hive queries that run internally as MapReduce jobs.
  • Very good understanding of partitioning and bucketing concepts in Hive; designed both managed and external tables in Hive for optimized performance.
  • Used Sqoop to efficiently transfer data between databases and HDFS and used Flume to stream the web log data from servers/sensors.
  • Developed MapReduce programs to cleanse the data in HDFS obtained from heterogeneous data sources to make it suitable for ingestion into Hive schema for analysis.
  • Developed programs in Spark based on the application for faster data processing than standard MapReduce programs.
  • Created reports for the BI team using Sqoop to export data into HDFS and Hive.
  • Involved in collecting and aggregating large amounts of log data using Apache Flume and staging data in HDFS for further analysis.
  • Developed Pig Latin scripts to extract the data from the web server output files to load into HDFS.
  • Worked on MapReduce joins for querying multiple semi-structured data sets as per analytic needs.
  • Loaded data from the Unix file system into HDFS in different formats (Avro, Parquet), created indexes and tuned SQL queries in Hive, and set up database connections using Sqoop.
  • Worked on setting up High Availability for GPHD 2.2 with ZooKeeper and quorum journal nodes.
  • Automated the extraction of data from warehouses and weblogs by developing workflows and coordinator jobs in Oozie.
  • Worked in an AWS environment for development and deployment of custom Hadoop applications.
  • Worked with and learned a great deal from Amazon Web Services (AWS) cloud services like EC2, S3, EBS, RDS and VPC.
  • Worked in provisioning and managing multi-tenant Hadoop clusters on a public cloud environment (Amazon Web Services, AWS) and on private cloud infrastructure (OpenStack cloud platform).
  • Strong experience in working with Elastic MapReduce and setting up environments on Amazon AWS EC2 instances.
  • Scheduled and managed cron jobs and wrote shell scripts to generate alerts.

Environment: Hadoop, AWS, MapReduce, HDFS, Hive, Pig, Spark, Python, Java 1.6 & 1.7, Linux, Eclipse, Cassandra, ZooKeeper

Confidential, New York, NY

Java/J2EE Developer

Responsibilities:

  • Worked with Java, J2EE, Struts, web services and Hibernate in a fast-paced development environment.
  • Followed agile methodology, interacted directly with the client on the features, implemented optimal solutions and tailored the application to customer needs.
  • Used Apache POI for reading Excel files.
  • Developed the user interface using JSP and JavaScript to view all online trading transactions.
  • Designed and developed Data Access Objects (DAO) to access the database.
  • Coded JavaServer Pages for the dynamic front-end content that uses Servlets and EJBs.
  • Coded HTML pages using CSS for static content generation, with JavaScript for validations.
  • Used JDBC API to connect to the database and carry out database operations.
  • Used JSP and JSTL Tag Libraries for developing User Interface components.
  • Performed code reviews.
  • Performed unit testing, system testing and integration testing.
  • Involved in building and deployment of application in Linux environment.

Environment: Java, J2EE, JDBC, Struts, SQL, Hibernate, Eclipse, Apache POI, CSS.

Confidential, New York, NY

Java/J2EE Developer

Responsibilities:

  • Responsible for understanding the scope of the project and requirement gathering.
  • Developed the web tier using JSP, Struts MVC to show account details and summary.
  • Created and maintained the configuration of the Spring Application Framework.
  • Implemented various design patterns - Singleton, Business Delegate, Value Object and Spring DAO.
  • Used Spring JDBC to write some DAO classes which interact with the database to access account information.
  • Mapped business objects to database using Hibernate.
  • Involved in writing Spring configuration XML files that contain declarations and other dependent object declarations.
  • Used Tomcat web server for development purpose.
  • Involved in creation of Test Cases for Unit Testing.
  • Used Oracle as the database and Toad for query execution; involved in writing SQL scripts and PL/SQL code for procedures and functions.
  • Used CVS and Perforce as configuration management tools for code versioning and release.
  • Developed the application using Eclipse and used Maven as the build and deploy tool.
  • Used Log4J to print logging, debugging, warning and info messages to the server console.

Environment: Java, J2EE, JSON, Linux, XML, XSL, CSS, JavaScript, Eclipse
