
Hadoop Developer Resume


North Quincy, MA

SUMMARY:

  • 8+ years of overall IT industry experience, including implementation, development, and maintenance of Big Data technologies and web-based applications using Java and J2EE technologies.
  • Hands-on experience implementing solutions and analyzing data with Hadoop ecosystem components such as HDFS, MapReduce, YARN, Spark, Sqoop, Hive, Pig, Flume, Kafka, Impala, Oozie (including the Oozie coordinator), Zookeeper, Cassandra, and HBase.
  • Hands-on experience with Hadoop distributions including Cloudera (CDH3, CDH4, and CDH5), Hortonworks Data Platform (HDP), and MapR.
  • Experience performing in-memory data processing and real-time streaming analytics using Apache Spark with Scala, Java, and Python.
  • Experience using Sqoop to import data from external sources into Hadoop ecosystem components such as HDFS, HBase, and Hive, and to export data from Hadoop back to external systems.
  • Experience with the Spark Streaming API and Kafka to feed live streaming data into HDFS, optimized using Spark concepts such as DataFrames, partitioning, bucketing, parallel execution, and map-side joins via broadcast joins.
  • Experience with advanced procedures such as text analytics and processing, using the in-memory computing capabilities of Spark with Scala.
  • Experience with Spark SQL, processing large volumes of data by applying Spark RDD transformations, actions, and DataFrames to the required input data.
  • Experience in data analysis and in developing Pig Latin scripts and HiveQL queries, with custom UDFs written in Java.
  • Experience using Ambari for provisioning, managing, monitoring, and securing Apache Hadoop clusters.
  • Experience analyzing data in Cassandra for quick processing, sorting, and grouping through CQL.
  • Experience managing and scheduling jobs on a Hadoop cluster using Oozie, and using Zookeeper for cluster coordination services.
  • Experience configuring different topologies in Storm to import and process information on the fly from multiple sources and collect it into Hadoop as a central repository.
  • Good working experience with file formats such as JSON, Avro, and Parquet, and compression codecs such as Snappy and bzip2.
  • Hands-on experience in data warehousing with ETL tools such as Informatica, Ab Initio, and IBM DataStage for various loading and transformation processes.
  • Good working experience and knowledge of NoSQL databases such as HBase, Cassandra, MongoDB, and CouchDB.
  • Intensive working experience with Amazon Web Services (AWS), using S3 for storage and EC2 for computing, as well as RDS and EBS.
  • Experience with the NiFi workflow scheduler, managing Hadoop jobs as directed acyclic graphs (DAGs) of actions with control flows.
  • Experience with IDEs, source control repositories, and build tools such as Eclipse, IntelliJ, GitHub, Maven, and SBT.
  • Experience preparing Tableau reports on analyzed data from Excel sheets, flat files, and CSV files.
  • Experience with data sources such as Oracle, Netezza, SQL Server, and PostgreSQL.
  • Experience with J2EE technologies such as Java Servlets, JSP, EJB, and JDBC.
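The map-side (broadcast) join optimization mentioned above can be sketched in plain Python: the small table is shipped to every mapper as an in-memory dictionary, so rows of the large table are joined without a shuffle. Table contents and field names here are illustrative, not taken from any actual project.

```python
# Sketch of a map-side (broadcast) join: the small dimension table is
# broadcast to every mapper as a dict, so no shuffle of the large table
# is needed. All data below is made up for illustration.

small_table = {1: "Books", 2: "Electronics"}  # dimension: item_id -> category

large_table = [  # fact rows streaming through a mapper
    {"item_id": 1, "price": 10.0},
    {"item_id": 2, "price": 25.5},
    {"item_id": 1, "price": 7.25},
]

def map_side_join(rows, broadcast):
    """Join each fact row against the broadcast dict (inner-join semantics)."""
    for row in rows:
        category = broadcast.get(row["item_id"])
        if category is not None:
            yield {**row, "category": category}

joined = list(map_side_join(large_table, small_table))
```

In Spark the same idea is expressed with `broadcast()` on the small DataFrame; the sketch only shows why the technique avoids shuffling the large side.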

TECHNICAL SKILLS

Hadoop Ecosystem: HDFS, MapReduce, Spark, Hive, Beeline, Sqoop, Flume, Oozie, Impala, Pig, Kafka, Zookeeper, NiFi, Spark Streaming, Spark SQL, Cloudera, Hortonworks.

Programming Languages: C, C++, Java, Scala, Python.

Databases: Oracle, SQL Server, MySQL.

NoSQL Databases: HBase, Cassandra, MongoDB.

Operating Systems: Linux, Windows, Ubuntu, Unix.

Source Control and Build Tools: Maven, Git, Jenkins.

IDEs: IntelliJ, Eclipse, NetBeans.

Hadoop Distributions: Cloudera Enterprise and Hortonworks.

Web Technologies: HTML, CSS, JavaScript, jQuery, Ajax, JSON.

PROFESSIONAL EXPERIENCE:

Hadoop Developer

Confidential - North Quincy, MA

Roles and Responsibilities:

  • Involved in installation, configuration, and maintenance of Hadoop clusters for application development on the Cloudera distribution.
  • Developed end-to-end scalable distributed data pipelines that receive data from the distributed messaging system Kafka and persist it into HDFS with Apache Spark, using Scala.
  • Implemented Spark applications in a data processing project to handle data from various sources, creating DStreams and DataFrames on input data received from streaming services such as Kafka.
  • Involved in performance tuning of Spark jobs, using caching and taking full advantage of the cluster environment.
  • Performed advanced operations such as text analytics and processing, using the in-memory computing capabilities of Spark with Scala.
  • Queried data using Spark SQL and implemented Spark RDDs in Scala.
  • Implemented Spark RDD transformations, actions, DataFrames, and case classes on the required data using Spark Core.
  • Worked on partitioning, bucketing, parallel execution, and map-side joins to optimize Hive queries.
  • Used HiveQL to create Hive tables and wrote Hive queries to perform data analysis.
  • Developed Kafka consumer APIs in Scala for consuming data from Kafka topics.
  • Collected log data from web servers and pushed it to HDFS using Flume, and to the NoSQL database Cassandra.
  • Used Oozie workflows to manage and schedule jobs on a Hadoop cluster, and used Zookeeper for cluster coordination services.
  • Implemented an Apache Airflow DAG to find popular items in Redshift and ingest them into the main PostgreSQL database via a web service call.
  • Used NiFi for the transformation of data between different components of the Big Data ecosystem.
  • Worked with data sources such as Oracle, Netezza, MySQL, and flat files, and with AWS components such as Amazon EC2 instances, S3 buckets, and CloudFormation templates.
  • Used Jira for bug tracking and Bitbucket for code hosting and code review.
  • Used Qlik Sense to build customized interactive reports, worksheets, and dashboards.
  • Involved in managing and organizing developers through regular code review sessions, using Agile and Scrum methodologies.

Environment: MapReduce, HDFS, Hive, Spark, Spark SQL, Sqoop, Apache Kafka, Cassandra, Scala, Oozie, Eclipse, Qlik Sense, Cloudera.
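The Hive bucketing mentioned above assigns each row to a bucket by hashing the clustering key modulo the bucket count, which is what makes bucketed joins and sampling cheap. A minimal sketch, using Python's built-in `hash` purely for illustration (Hive uses its own hash function):

```python
# Sketch of Hive-style bucketing: hash the clustering key modulo the
# bucket count. The key values and bucket count are illustrative.

NUM_BUCKETS = 4

def bucket_for(user_id: int, num_buckets: int = NUM_BUCKETS) -> int:
    # Hive: CLUSTERED BY (user_id) INTO 4 BUCKETS uses the same idea.
    return hash(user_id) % num_buckets

rows = [101, 202, 303, 404, 505]
buckets = {b: [] for b in range(NUM_BUCKETS)}
for uid in rows:
    buckets[bucket_for(uid)].append(uid)
```

Because both sides of a join bucketed on the same key land matching rows in the same bucket number, a bucketed map-side join can match bucket-to-bucket without a full shuffle.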

Hadoop Developer

Confidential - St. Louis, MO

Roles and Responsibilities:

  • Involved in collection and aggregation of large amounts of weblog data from web servers using Spark Streaming with the distributed messaging system Kafka, storing the data in HDFS for analysis.
  • Migrated MapReduce programs into Spark transformations using Spark with Scala.
  • Developed Spark SQL jobs implementing Spark RDD transformations, actions, and DataFrames to generate statistical summaries and filtering operations on specific input data.
  • Used Spark and Spark SQL with the Scala API to read Parquet data and create tables in Hive; also used Spark to load JSON data, create SchemaRDDs, and load them into Hive tables.
  • Used HiveQL to create Hive tables and wrote Hive queries to perform data analysis.
  • Hands-on experience with Hive partitioning and bucketing, performing joins on Hive tables, and implementing regex, JSON, and Avro formats.
  • Performed Oozie operational services for batch processing and dynamic workflow scheduling.
  • Involved in implementing business logic by writing Pig UDFs in Java, and used various UDFs from Piggybank and other sources.
  • Used Impala to analyze data ingested into Hive tables.
  • Worked on data serialization formats, converting complex objects into byte sequences using Avro, Parquet, JSON, and CSV formats.
  • Hands-on experience with the Amazon EMR framework, transferring data to EC2 servers.
  • Experience with Amazon Web Services EC2 and S3: creating buckets and configuring them with permissions, logging, tagging, and versioning.
  • Used the IntelliJ IDE to develop Scala scripts for Spark jobs.
  • Experience loading data into HBase using both bulk and non-bulk loads.
  • Knowledge of ETL methods for data extraction, transformation, and loading in corporate-wide ETL solutions, and of data warehouse tools for reporting and data analysis.
  • Involved in implementing a continuous integration and deployment framework using Jenkins and Maven.
  • Implemented the project using Agile methodologies and attended daily Scrum meetings.

Environment: MapReduce, HDFS, Hive, Pig, HBase, SQL, Sqoop, Oozie, Apache Kafka, Eclipse, Spark, Cloudera, Maven, Git.
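The row-oriented serialization formats mentioned above (JSON and CSV) can be sketched with only the standard library; Avro and Parquet require third-party libraries and are out of scope here. The record and its fields are made up for illustration:

```python
# Sketch of serializing one record to JSON and CSV using only the
# standard library. The record is illustrative, not from any real dataset.
import csv
import io
import json

record = {"id": 7, "name": "widget", "price": 3.5}

# JSON: text format, self-describing field names in every record.
json_bytes = json.dumps(record, sort_keys=True).encode("utf-8")

# CSV: positional format, field names appear once in the header row.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["id", "name", "price"])
writer.writeheader()
writer.writerow(record)
csv_text = buf.getvalue()
```

The trade-off the sketch shows is the usual one: JSON repeats schema per record, while CSV (like Avro/Parquet) factors the schema out of the data rows.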

Hadoop Developer

Confidential - New Kensington, PA

Roles and Responsibilities:

  • Installed and configured Hadoop clusters using the Cloudera distribution.
  • Used Flume to collect and aggregate web log data from web servers and network devices and store it in HDFS.
  • Worked on exporting analyzed data to relational databases using Sqoop to generate reports for the Business Intelligence team.
  • Implemented MapReduce jobs using Java and Pig scripts.
  • Worked on a large-scale Hadoop cluster for distributed data processing and analysis using Hive and HBase.
  • Analyzed large datasets using Hive queries, which run internally as MapReduce jobs.
  • Developed MapReduce jobs for data cleaning and data processing.
  • Implemented custom UDFs, joins, and groups for cleaning and optimization processes in Pig scripts.
  • Developed MapReduce pipeline jobs to process data and create new HFiles when necessary.
  • Migrated ETL operations into the Hadoop system using Pig Latin scripts for joins, filtering, and transformations.
  • Involved in loading data from the UNIX file system into HDFS using shell scripting.
  • Used Oozie for job workflow scheduling, managing, and monitoring.
  • Produced various reports using Tableau and Power BI based on client requirements.

Environment: Hadoop, MapReduce, YARN, Hive, HBase, Oozie, Sqoop, Flume, Pig, Linux, Maven, Git.
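The MapReduce pattern behind the data-cleaning jobs above can be sketched in-process: map emits `(key, 1)` pairs, the shuffle groups pairs by key, and reduce sums each group. The log lines below are invented for illustration:

```python
# Minimal in-process sketch of MapReduce: map -> shuffle (sort + group)
# -> reduce, here counting requests per path. Input data is illustrative.
from itertools import groupby

log_lines = ["GET /home", "GET /home", "POST /login", "GET /about"]

def mapper(line):
    """Map phase: emit a (path, 1) pair per log line."""
    method, path = line.split(" ", 1)
    yield (path, 1)

# Shuffle phase: sorting brings equal keys together so groupby can group them.
pairs = sorted(kv for line in log_lines for kv in mapper(line))

# Reduce phase: sum the counts within each key group.
counts = {path: sum(c for _, c in group)
          for path, group in groupby(pairs, key=lambda kv: kv[0])}
```

In real Hadoop the shuffle happens across the network between mapper and reducer tasks; the sketch just compresses the three phases into one process.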

Java Developer

Confidential

Responsibilities:

  • Involved in each phase of Software Development Life Cycle (SDLC) models: requirement gathering and analysis, design, implementation, testing, deployment, and maintenance.
  • Developed login, policy, and claims screens for customers using HTML5, CSS3, JavaScript, AJAX, JSP, and jQuery.
  • Used Core Java to develop business logic.
  • Involved in the development of business module applications using J2EE technologies such as Servlets and JSP.
  • Designed and developed the web tier using the JSP and Servlets framework.
  • Used Core Java concepts such as multi-threading, exception handling, and the Collections API to implement various features and enhancements.
  • Strong experience in the design and development of applications using Java/J2EE components such as Java Server Pages (JSP).
  • Developed EJB MDBs and message queues using JMS technology.
  • Used EJB session beans to process requests from the user interface and CMP entity beans to interact with the persistence layer.
  • Developed stored procedures, triggers, and queries using PL/SQL in SQL Server.
  • Used Spring MVC as the framework and JavaScript for the client-side view; used frameworks for client-side data validation and for creating dynamic web pages with Ajax and jQuery. Developed model classes based on the forms to be displayed on the UI.
  • Implemented various design patterns in the project, including Business Delegate, Data Transfer Object, Data Access Object, Service Locator, and Singleton.
  • Used SQL statements and procedures to fetch data from the database.
  • Developed test cases and performed unit testing using the JUnit framework.
  • Used CVS for version control and Ant scripts to fetch, build, and deploy the application to the development environment.

Environment: Java, HTML, CSS, JavaScript, MySQL, Struts, EJB, Spring MVC.
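Of the design patterns listed above, Singleton is the simplest to sketch. The original context was Java; the version below is in Python for brevity, and the `ServiceLocator` name is only an illustrative stand-in:

```python
# Sketch of the Singleton pattern: the class creates its sole instance
# on first construction and returns that same instance thereafter.
# Shown in Python for brevity; the resume's original context was Java.

class ServiceLocator:
    _instance = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance.services = {}  # name -> service object registry
        return cls._instance

a = ServiceLocator()
b = ServiceLocator()  # same object as `a`
```

In Java the same effect is usually achieved with a private constructor plus a static `getInstance()` method (with synchronization if thread safety is required).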

Jr Java Developer

Confidential

Responsibilities:

  • Performed analysis, design, development, integration, and testing of application modules.
  • Implemented an application prototype for the presentation layer using JSP, Servlets, and JDBC.
  • Developed new web page designs and the project presentation layer using HTML, JavaScript, JSF, and Ajax, and implemented CSS for the user interface and better appearance.
  • Implemented database queries using SQL and PL/SQL to perform data analysis, extraction, and various functions.
  • Implemented an application using the Struts framework, which leverages the classical Model-View-Controller (MVC) architecture.
  • Involved in fixing bugs and unit testing with test cases using JUnit.
  • Worked with software development methodologies such as Agile and Waterfall to speed up project development.

Environment: Java, HTML, CSS, JavaScript, JSON, JSP, JDBC and SQL, PL/SQL.
