
Hadoop Developer Resume


Miami, FL

SUMMARY

  • 8+ years of experience in analysis, architecture, design, development, testing, maintenance, and user training of software applications.
  • Experience in developing MapReduce programs using Apache Hadoop to analyze big data as per requirements.
  • Good working knowledge of data transformations and loading using export and import.
  • Hands-on experience using Sqoop to import data into HDFS from RDBMS and vice versa (sample commands after this list).
  • Used different Hive SerDes, such as the Regex SerDe and the HBase SerDe.
  • Experience in analyzing data using Hive, Pig Latin, and custom MR programs in Java.
  • Hands-on experience writing Spark SQL scripts.
  • Sound knowledge of programming Spark using Scala.
  • Good understanding of real-time data processing using Spark.
  • Hands-on experience with Kafka for messaging, and with job scheduling and monitoring tools such as Oozie and ZooKeeper.
  • Developed small distributed applications in our projects using ZooKeeper and scheduled the workflows using Oozie.
  • Used Pig as an ETL tool for transformations, event joins, filtering, and some pre-aggregation.
  • Clear understanding of Hadoop architecture and its various components, such as HDFS, JobTracker and TaskTracker, NameNode and DataNode, Secondary NameNode, and MapReduce programming.
  • Expertise in writing custom UDFs to extend Hive and Pig core functionality.
  • Hands-on experience extracting data from log files and copying it into HDFS using Flume.
  • Experience with NoSQL databases such as HBase.
  • Experience in Hadoop administration activities such as installation and configuration of clusters using the Apache and Cloudera distributions.
  • Knowledge on installing, configuring and using Hadoop components like Hadoop Map Reduce (MR1), YARN (MR2), HDFS, Hive, Pig, Flume and Sqoop.
  • Experience in analyzing, designing, and developing ETL strategies and processes, writing ETL specifications, and Informatica development.
  • Extensively used Informatica Power Center for Extraction, Transformation and Loading process.
  • Experience in Dimensional Data Modeling using Star and Snow Flake Schema.
  • Worked on reusable code known as tie-outs to maintain data consistency.
  • More than 4 years of experience in Java, J2EE, Web Services, SOAP, HTML, and XML technologies, demonstrating strong analytical and problem-solving skills, computer proficiency, and the ability to follow projects through from inception to completion.
  • Extensive experience working with Oracle, DB2, SQL Server, and MySQL databases, and with core Java concepts such as OOP, multithreading, collections, and I/O.
  • Hands-on experience with JAX-WS, JSP, Servlets, Struts, WebLogic, WebSphere, Hibernate, Spring, JBoss, JDBC, RMI, JavaScript, Ajax, jQuery, Linux, UNIX, WSDL, XML, HTML, AWS, Scala, and Vertica.
  • Developed applications using Java, RDBMS, and Linux shell scripting.
  • Experience in complete project life cycle of Client Server and Web applications.
  • Good understanding of Data Mining and Machine Learning techniques.
  • Good interpersonal and communication skills, strong problem-solving skills, the ability to explore and adopt new technologies with ease, and a good team player.
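
For illustration, a typical Sqoop import/export pair of the kind described above; the connection strings, credentials, tables, and directories are hypothetical:

    # pull an RDBMS table into HDFS with 4 parallel mappers
    sqoop import \
      --connect jdbc:mysql://dbhost:3306/sales \
      --username etl_user -P \
      --table orders \
      --target-dir /data/raw/orders \
      --num-mappers 4

    # push aggregated results back out to the warehouse
    sqoop export \
      --connect jdbc:mysql://dbhost:3306/reports \
      --username etl_user -P \
      --table daily_summary \
      --export-dir /data/out/daily_summary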

TECHNICAL SKILLS

Big Data/Hadoop Framework: HDFS, MapReduce, Pig, Hive, Sqoop, Oozie, ZooKeeper, Flume, HBase and Spark

Databases: MongoDB, Microsoft SQL Server, MySQL, Oracle, Cassandra, ODI

Languages: Scala, Java, Python, C, C++, SQL, T-SQL, Pig Latin, HiveQL

Web Technologies: JSP, JavaBeans, JDBC, XML

Operating Systems: Windows, Unix and Linux

Front-End: HTML/HTML5, CSS3, JavaScript/jQuery

Development Tools: Microsoft SQL Studio, Toad, Eclipse, NetBeans, MySQL Workbench, Tableau

Reporting Tool: SSRS, Succeed

Office Tools: Microsoft Office Suite

Development Methodologies: Agile/Scrum, Waterfall

PROFESSIONAL EXPERIENCE

Confidential, Miami, FL

Hadoop Developer

Responsibilities:

  • Involved in end-to-end data processing: ingestion, transformation, quality checks, and splitting.
  • Streamed data in real time using Spark Streaming with Kafka (see the sketch after this list).
  • Developed Spark scripts in Scala as per requirements.
  • Loaded data into Spark RDDs and performed in-memory computation to generate the output response.
  • Performed different types of transformations and actions on RDDs to meet business requirements.
  • Developed a data pipeline using Kafka, Spark, and Hive to ingest, transform, and analyze data.
  • Also worked on analyzing the Hadoop cluster and different big data analytic tools, including Pig, HBase, and Sqoop.
  • Involved in loading data from UNIX file system to HDFS.
  • Created HBase tables to store variable data formats of PII data coming from different portfolios.
  • Implemented best offer logic using Pig scripts and Pig UDFs.
  • Responsible for managing data coming from various sources.
  • Installed and configured Hive and wrote Hive UDFs.
  • Experience loading and transforming large sets of structured, semi-structured, and unstructured data.
  • Provided cluster coordination services through ZooKeeper.
  • Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Analyzed large data sets to determine the optimal way to aggregate and report on them.
  • Responsible for setting up QA environment and updating configurations for implementing scripts with Pig and Sqoop.
  • Installed and configured Hadoop MapReduce and HDFS.
  • Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
  • Installed and configured Pig.
  • Involved in managing and reviewing Hadoop log files.
  • Imported data using Sqoop to load data from MySQL into HDFS on a regular basis.
  • Developed scripts and batch jobs to schedule various Hadoop programs.
  • Responsible for writing Hive queries for data analysis to meet business requirements.
  • Responsible for creating Hive tables and working with them using HiveQL (example after this list).
  • Responsible for importing and exporting data into HDFS and Hive using Sqoop.
  • Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
  • Designed and implemented a MapReduce-based large-scale parallel relation-learning system.
  • Involved in scheduling the Oozie workflow engine to run multiple Hive jobs.
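
As a sketch of the Spark Streaming ingestion described above, assuming the spark-streaming-kafka-0-10 integration; the broker, topic, group ID, and output path are hypothetical:

    import org.apache.kafka.common.serialization.StringDeserializer
    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka010.{ConsumerStrategies, KafkaUtils, LocationStrategies}

    object StreamIngest {
      def main(args: Array[String]): Unit = {
        // 10-second micro-batches
        val ssc = new StreamingContext(new SparkConf().setAppName("StreamIngest"), Seconds(10))

        val kafkaParams = Map[String, Object](
          "bootstrap.servers" -> "broker1:9092",            // hypothetical broker
          "key.deserializer" -> classOf[StringDeserializer],
          "value.deserializer" -> classOf[StringDeserializer],
          "group.id" -> "ingest-group",                     // hypothetical consumer group
          "auto.offset.reset" -> "latest"
        )

        // direct stream over a hypothetical "events" topic
        val stream = KafkaUtils.createDirectStream[String, String](
          ssc,
          LocationStrategies.PreferConsistent,
          ConsumerStrategies.Subscribe[String, String](Seq("events"), kafkaParams))

        // basic quality check, then land each micro-batch in HDFS
        stream.map(_.value)
          .filter(_.nonEmpty)
          .foreachRDD { rdd =>
            if (!rdd.isEmpty()) rdd.saveAsTextFile(s"/data/stream/events-${System.currentTimeMillis}")
          }

        ssc.start()
        ssc.awaitTermination()
      }
    }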
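A minimal HiveQL sketch of the table creation and aggregation work described above; the table names, schema, and dates are illustrative, and under classic Hive execution the INSERT ... SELECT compiles to MapReduce:

    -- external table over raw files landed in HDFS
    CREATE EXTERNAL TABLE IF NOT EXISTS orders_raw (
      order_id    BIGINT,
      customer_id BIGINT,
      amount      DOUBLE,
      order_ts    STRING
    )
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
    LOCATION '/data/raw/orders';

    -- partitioned table for reporting
    CREATE TABLE IF NOT EXISTS orders_daily (
      customer_id  BIGINT,
      total_amount DOUBLE
    )
    PARTITIONED BY (order_date STRING);

    -- daily aggregation into one partition
    INSERT OVERWRITE TABLE orders_daily PARTITION (order_date = '2015-06-01')
    SELECT customer_id, SUM(amount)
    FROM orders_raw
    WHERE to_date(order_ts) = '2015-06-01'
    GROUP BY customer_id;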

Environment: Hadoop, MapReduce, Hive, Pig, Sqoop, Java, Oozie, HBase, Kafka, Spark, Scala, Eclipse, Linux, Oracle, Teradata.

Confidential, Jersey City, NJ

Hadoop Developer

Responsibilities:

  • Involved in review of functional and non-functional requirements.
  • Facilitated knowledge transfer sessions.
  • Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and pre-processing.
  • Importing and exporting data into HDFS and Hive using Sqoop.
  • Experienced in defining job flows.
  • Experienced in managing and reviewing Hadoop log files.
  • Extracted files from RDBMS through Sqoop, placed them in HDFS, and processed them.
  • Experienced in running Hadoop streaming jobs to process terabytes of XML-format data.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data.
  • Responsible for managing data coming from various sources.
  • Gained good experience with NoSQL databases such as HBase.
  • Supported MapReduce programs running on the cluster.
  • Involved in loading data from UNIX file system to HDFS.
  • Installed and configured Hive and wrote Hive UDFs.
  • Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
  • Gained very good business knowledge of health insurance, claim processing, fraud suspect identification, the appeals process, etc.
  • Developed a custom file system plugin for Hadoop so it can access files on the Data Platform.
  • This plugin allows Hadoop MapReduce programs, HBase, Pig, and Hive to work unmodified and access files directly.
  • Designed and implemented a MapReduce-based large-scale parallel relation-learning system.
  • Wrote Spark programs in Scala, using RDDs for transformations and performing actions on them (a short sketch follows this list).
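
A minimal sketch of the RDD-style Spark/Scala work mentioned above; the HDFS paths and record layout are assumptions for illustration:

    import org.apache.spark.{SparkConf, SparkContext}

    object EventCounts {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("EventCounts"))

        // load raw tab-separated records from HDFS
        val lines = sc.textFile("hdfs:///data/raw/events")

        // transformations are lazy: parse, drop malformed rows, key by the first field
        val counts = lines
          .map(_.split('\t'))
          .filter(_.length >= 2)
          .map(fields => (fields(0), 1L))
          .reduceByKey(_ + _)

        // the action triggers execution and writes results back to HDFS
        counts.saveAsTextFile("hdfs:///data/out/event_counts")
        sc.stop()
      }
    }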

Environment: Java 6 (JDK 1.6), Eclipse, Red Hat Linux, MapReduce, HDFS, Hive, Spark, Oracle 11g/10g, PL/SQL, SQL*Plus, Toad 9.6, Windows NT, UNIX Shell Scripting.

Confidential, IL

Hadoop Developer

Responsibilities:

  • Installation and Configuration of Hadoop Cluster
  • Worked with the Cloudera support team to fine-tune the cluster
  • Worked closely with the SA team to make sure all hardware and software were properly set up for optimum use of resources
  • Developed a custom file system plugin for Hadoop so it can access files on the Hitachi Data Platform
  • The plugin allows Hadoop MapReduce programs, HBase, Pig and Hive to work unmodified and access files directly
  • The plugin also provided data locality for Hadoop across host nodes and virtual machines
  • Wrote data ingesters and MapReduce programs
  • Developed MapReduce jobs to analyze data and provide heuristic reports
  • Good experience writing data ingesters and complex MapReduce jobs in Java for data cleaning and preprocessing, and fine-tuning them per data set
  • Performed extensive data validation using Hive and wrote Hive UDFs
  • Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs
  • Wrote extensive scripts (Python and shell) to provision and spin up virtualized Hadoop clusters
  • Added, decommissioned and rebalanced nodes (sample commands after this list)
  • Created a POC to store server log data in Cassandra to identify system alert metrics
  • Rack-aware configuration
  • Configuring Client Machines
  • Configuring, Monitoring and Management Tools
  • HDFS Support and Maintenance
  • Cluster HA Setup
  • Applying patches and performing version upgrades
  • Incident management, problem management and change management
  • Performance management and reporting
  • Recovery from NameNode failure
  • Scheduling MapReduce jobs with the FIFO and Fair schedulers
  • Installation and configuration of other open-source software like Pig, Hive, HBase, Flume and Sqoop
  • Integration with RDBMS using Sqoop and JDBC connectors
  • Worked with the dev team to tune jobs; knowledge of writing Hive jobs
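
For illustration, decommissioning and rebalancing a DataNode typically looks like the following; the hostname and exclude-file path are hypothetical, and dfs.hosts.exclude in hdfs-site.xml is assumed to already point at the exclude file:

    # add the host to the HDFS exclude file
    echo "datanode07.example.com" >> /etc/hadoop/conf/dfs.exclude

    # tell the NameNode to re-read its include/exclude lists;
    # the node drains its blocks and eventually shows as Decommissioned
    hdfs dfsadmin -refreshNodes

    # after adding or removing nodes, rebalance block placement
    # (threshold = allowed % deviation in disk usage between DataNodes)
    hdfs balancer -threshold 10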

Environment: Windows 2000/2003, UNIX, Linux, Java, Apache Hadoop (HDFS, MapReduce), Pig, Hive, HBase, Flume, Sqoop, Cassandra, NoSQL

Confidential

Hadoop Developer

Responsibilities:

  • Worked on analyzing Hadoop cluster and different Big Data analytic tools including Pig, Hive, HBase and Sqoop.
  • Installed Hadoop, MapReduce, and HDFS, and developed multiple MapReduce jobs in Pig and Hive for data cleaning and pre-processing.
  • Coordinated with business customers to gather business requirements, interacted with other technical peers to derive technical requirements, and delivered the BRD and TDD documents.
  • Involved in loading the created HFiles into HBase for faster access to a large customer base without taking a performance hit.
  • Created HBase tables to store various data formats of PII data coming from different portfolios.
  • Extensively involved in Design phase and delivered Design documents.
  • Involved in Testing and coordination with business in User testing.
  • Importing and exporting data into HDFS and Hive using Sqoop.
  • Written Hive jobs to parse the logs and structure them in tabular format to facilitate effective querying on the log data.
  • Involved in creating Hive tables, loading data, and writing queries that run internally as MapReduce jobs.
  • Involved in processing ingested raw data using MapReduce, Apache Pig and HBase.
  • Involved in developing Pig Scripts for change data capture and delta record processing between newly arrived data and already existing data in HDFS.
  • Populated HDFS with huge amounts of data using Apache Kafka.
  • Involved in scheduling the Oozie workflow engine to run multiple Hive and Pig jobs (a minimal workflow definition appears after this list).
  • Used Hive to analyze the partitioned and bucketed data to compute various metrics for reporting.
  • Performed ongoing POC work using Spark and Kafka for real-time processing.
  • Designed a technical solution for real-time analytics using Kafka and HBase.
  • Implemented test scripts to support test-driven development and continuous integration.
  • Used Pig as an ETL tool for transformations, event joins, and some pre-aggregation before storing the data in HDFS (see the Pig example after this list).
  • Developed Hadoop streaming MapReduce jobs using Python.
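
A minimal Oozie workflow definition of the kind referenced above, with a single Hive action; the workflow name, script, and schema versions are illustrative assumptions:

    <workflow-app name="daily-hive" xmlns="uri:oozie:workflow:0.5">
      <start to="hive-node"/>
      <action name="hive-node">
        <hive xmlns="uri:oozie:hive-action:0.5">
          <job-tracker>${jobTracker}</job-tracker>
          <name-node>${nameNode}</name-node>
          <script>daily_agg.hql</script> <!-- hypothetical Hive script -->
        </hive>
        <ok to="end"/>
        <error to="fail"/>
      </action>
      <kill name="fail">
        <message>Hive job failed: [${wf:errorMessage(wf:lastErrorNode())}]</message>
      </kill>
      <end name="end"/>
    </workflow-app>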
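And a short Pig Latin sketch of the ETL pattern described above, i.e. filter, join, and pre-aggregate before storing back to HDFS; the paths and schemas are hypothetical:

    -- load two raw feeds
    clicks = LOAD '/data/raw/clicks' USING PigStorage('\t')
             AS (user_id:chararray, url:chararray, ts:long);
    users  = LOAD '/data/raw/users' USING PigStorage('\t')
             AS (user_id:chararray, segment:chararray);

    -- filter out bad rows, join on the user key, and pre-aggregate
    valid   = FILTER clicks BY user_id IS NOT NULL;
    joined  = JOIN valid BY user_id, users BY user_id;
    grouped = GROUP joined BY users::segment;
    agg     = FOREACH grouped GENERATE group AS segment, COUNT(joined) AS click_count;

    STORE agg INTO '/data/out/clicks_by_segment' USING PigStorage('\t');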

Environment: Hadoop, Hive, MapReduce, Pig, MongoDB, Oozie, Sqoop, Kafka, Cloudera, Spark, HBase, HDFS, Scala, Solr, ZooKeeper.

Confidential

Java Developer

Responsibilities:

  • Developed the complete company website from scratch and deployed it.
  • Involved in requirements gathering.
  • Designed and developed user interface using HTML, CSS and JavaScript.
  • Designed HTML screens with JSP for the front-end.
  • Involved in Database Design by creating Data Flow Diagram (Process Model) and ER Diagram (Data Model).
  • Designed, created, and maintained the database using MySQL.
  • Made JDBC calls from the Servlets to the database to store user details (sketch after this list).
  • Used JavaScript for client-side validation.
  • Used Servlets as controllers and Entity/Session Beans for business logic.
  • Used Eclipse for project builds.
  • Participated in user review meetings and used Test Director to periodically log development issues, production problems, and bugs.
  • Used WebLogic to deploy the application to local and development environments.
  • Debugged and fixed errors.
  • Implemented and supported the project through development and unit testing into the production environment.
  • Involved in documenting the application.
  • Involved in designing stored procedures to extract and calculate billing information, connecting to Oracle.
  • Formatted results from the database as HTML reports for the client.
  • Used PVCS Version Manager for source control and PVCS Tracker for change control management.
  • Implemented test-first unit testing using JUnit.
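
A small sketch of the Servlet-to-database JDBC call pattern mentioned above; the connection URL, credentials, and table are hypothetical, and try-with-resources assumes Java 7+ (older code would close resources in finally blocks):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;

    public class UserDao {
        // connection details are illustrative
        private static final String URL = "jdbc:mysql://localhost:3306/appdb";

        public void saveUser(String name, String email) throws SQLException {
            // open a connection and insert one row with a parameterized statement
            try (Connection con = DriverManager.getConnection(URL, "appuser", "secret");
                 PreparedStatement ps = con.prepareStatement(
                         "INSERT INTO users (name, email) VALUES (?, ?)")) {
                ps.setString(1, name);
                ps.setString(2, email);
                ps.executeUpdate();
            }
        }
    }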

Environment: Java, JSP, Servlets, JDBC, Java Script, HTML, CSS, WebLogic, Eclipse and Test Director
