Senior Hadoop Developer Resume

Kenilworth, NJ

SUMMARY

  • 8 years of overall experience in the IT industry, including hands-on experience in software development.
  • 3+ years of experience in Big Data environments, including Spark, MapReduce, HDFS, HBase, ZooKeeper, Hive, Sqoop, Pig, Cassandra, and Kafka.
  • Good understanding of the software development lifecycle (SDLC) and its phases, such as analysis, design, development, and testing.
  • Exposed to the Agile method of software development (Scrum).
  • Excellent understanding of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and the MapReduce programming paradigm.
  • Hands-on development and implementation experience in a Big Data Management Platform (BMP) using HDFS, MapReduce, Hive, Pig, and other Hadoop ecosystem tools as data storage and retrieval systems.
  • Experienced in using Spark to improve the performance of and optimize existing Hadoop algorithms with SparkContext, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
  • Exported aggregated data to Oracle using Sqoop for reporting on the Tableau dashboard.
  • Experience in analyzing data using HiveQL, Pig Latin, HBase, and custom MapReduce programs in Java.
  • Experience working with NoSQL databases, including HBase and MongoDB, using Python.
  • Hands-on experience in application development using Java, RDBMS, and Linux shell scripting.
  • Created ALTER, INSERT, and DELETE queries involving lists, sets, and maps in DataStax Cassandra.
  • Good knowledge of integrating various data sources such as RDBMS, spreadsheets, text files, JSON, and XML files.
  • Good experience with version control tools such as CVS, SVN, VSS, ClearCase, and Git.
  • Working knowledge of SQL, stored procedures, functions, packages, database triggers, and indexes.
  • Experience using various tools and IDEs for design and development, such as NetBeans, EditPlus, Notepad++, Eclipse, Sublime Text, Dreamweaver, and RADD.
  • Effective leadership with good skills in strategy, business development, client management, and project management.
  • Good interpersonal skills and ability to work as part of a team. Exceptional ability to learn and master new technologies and to deliver on short deadlines. Strong analytical and problem-solving skills.
  • Good at prioritizing along the critical path and meeting project milestones and deliverable dates.
  • Active participation in presenting project concepts, status, plans, and progress.
  • Excellent communication skills; experienced in working with senior-level managers and developers across multiple disciplines.

PROFESSIONAL EXPERIENCE

Confidential

Senior Hadoop Developer

Environment: Hadoop, HDFS, MapReduce, Pig, Hive, Sqoop, Kafka, Spark Streaming/SQL, Java, SQL scripting, Linux shell scripting.

Responsibilities:

  • Designed and deployed a Hadoop cluster and various Big Data analytics tools, including Pig, Hive, HBase, Apache ZooKeeper, Sqoop, Spark, and Cassandra, on the Hortonworks distribution. Used the Spark API over Hortonworks Hadoop YARN to perform analytics on data in Hive.
  • Explored Spark for improving the performance of and optimizing existing Hadoop algorithms using SparkContext, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
  • Developed Spark code using Scala and Spark SQL/Streaming for faster testing and processing of data.
  • Configured the Hadoop environment in the cloud through Amazon Web Services (AWS) to provide a scalable distributed data solution.
  • Installed Kafka on the Hadoop cluster and used it for streaming and cleansing raw data; extracted useful information using Hive, stored the results in HBase, and enabled clients to review the results in Tableau by connecting through the IP address provided by AWS.
  • Installed Hadoop, MapReduce, and HDFS on AWS and developed multiple MapReduce jobs in Pig and Hive for data cleaning and pre-processing.
  • Integrated user data from Cassandra with data in HDFS. Integrated Cassandra with Storm for real-time user attribute lookups.
  • Created HBase (NoSQL) tables to store data in variable formats (Avro, JSON) coming from different portfolios.
  • Imported data from different sources such as HDFS and HBase into Spark RDDs.
  • Configured, deployed, and maintained multi-node Dev and Test Kafka clusters.
  • Experienced in Microsoft Azure using SSH and PuTTY.
  • Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs, Python, and Scala.
  • Performed transformations, cleaning, and filtering on imported data using Hive and MapReduce, and loaded the final data into HDFS.
  • Loaded data into Spark RDDs and performed in-memory computation to generate the output response.
  • Expertise in data modeling and in data warehouse design and development.
  • Worked on AWS clusters.

Confidential, Kenilworth, NJ

Senior Hadoop Developer

Environment: Apache Hadoop, HDFS, Java MapReduce, Eclipse, Hive, Pig, Sqoop, SQL, Oracle 11g.

Responsibilities:

  • Loaded all data from relational databases into Hive using Sqoop. Received four flat files from different vendors, each in a different format (e.g., text, EDI, and XML).
  • Created tables in Hive and loaded data from HDFS into Hive.
  • Wrote Hive join queries to fetch information from multiple tables.
  • Wrote multiple MapReduce jobs to collect output from Hive.
  • Corrected duplicate data in incoming files by writing a MapReduce job.
  • Involved in developing the MapReduce framework, writing queries, and scheduling MapReduce jobs over large data sets in parallel across the Hadoop cluster for pre-processing.
  • Developed code for importing and exporting data into HDFS and Hive using Sqoop.
  • Installed and configured Hadoop; responsible for maintaining the cluster and for managing and reviewing Hadoop log files.
  • Worked on configuring and managing disaster recovery and backup for Cassandra data.
  • Performed file system management and monitoring on Hadoop log files.
  • Implemented partitioning, dynamic partitions, and buckets in Hive.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data.
  • Supported MapReduce programs running on the cluster.
  • Wrote shell scripts to monitor the health of Hadoop daemon services and respond to any warning or failure conditions.
  • Involved in loading data from the UNIX file system to HDFS, configuring Hive, and writing Hive UDFs.
  • Used Java and MySQL day to day to debug and fix issues with client processes.
  • Managed and reviewed log files.
  • Developed custom classes for serialization and deserialization in Hadoop.
  • Optimized MapReduce jobs for effective usage of HDFS through compression techniques.
  • Developed shell, Perl, and Python scripts to automate and provide control flow to Pig scripts.
  • Analyzed large data sets to determine the optimal way to aggregate and report on them.
  • Involved in migrating data from existing RDBMSs (Oracle and SQL Server) to Hadoop using Sqoop for processing.
  • Performed analysis of data using Hive queries and Pig scripts.
  • Extracted data onto HDFS using Kafka.
  • Worked on large sets of structured, semi-structured and unstructured data.
  • Used Sqoop to import and export data between HDFS and Oracle RDBMS.
  • Developed Pig Latin scripts to explore and manipulate the data.

Confidential, Plano TX

Senior Hadoop Developer

Environment: Apache Hadoop, HDFS, Java MapReduce, Eclipse, Hive, Pig, Sqoop, SQL, Oracle 11g.

Responsibilities:

  • Developed and executed shell scripts to automate jobs.
  • Wrote complex Hive queries.
  • Worked on reading multiple data formats on HDFS using Scala.
  • Involved in converting Cassandra/Hive/SQL queries into Spark transformations using Spark RDDs and Scala.
  • Developed multiple POCs using Scala, deployed them on the YARN cluster, and compared the performance of Spark with Cassandra and SQL.
  • Analyzed the Cassandra/SQL scripts and designed the solution to implement using Scala.
  • Involved in loading data from the UNIX file system to HDFS.
  • Extracted data from databases into HDFS using Sqoop.
  • Handled importing of data from various data sources, performed transformations using Hive and Spark, and loaded data into HDFS.
  • Managed and reviewed Hadoop log files. Implemented a lambda architecture as a solution to a problem.
  • Involved in the analysis, design, and testing phases; responsible for documenting technical specifications.
  • Very good understanding of partitioning and bucketing concepts and of managed and external tables in Hive to optimize performance.
  • Worked extensively on the Core and Spark SQL modules of Spark.
  • Experienced in running Hadoop streaming jobs to process terabytes of data.

Confidential, Sacramento CA

Senior Hadoop Developer

Environment: Apache Hadoop, HDFS, Java MapReduce, Eclipse, Hive, Pig, Sqoop, SQL.

Responsibilities:

  • Responsible for business logic using Java and JavaScript, with JDBC for querying the database.
  • Worked with the application teams to install Hadoop updates, patches, and version upgrades as required.
  • Development experience on UNIX, Linux, and Windows (Vista, XP, NT, 2000, 95) and cloud-based virtual platforms.
  • Expertise in web-based GUI architecture and development using HTML, CSS, AJAX, jQuery, AngularJS, and JavaScript.
  • Experience in client-side technologies such as HTML, CSS, JavaScript, and jQuery.
  • Experience in developing user interfaces using HTML, CSS, AJAX, and JavaScript.
  • Worked on analyzing and writing Hadoop MapReduce jobs using the Java API, Pig, and Hive.
  • Very good understanding of SQL, ETL, and data warehousing technologies.
  • Involved in configuring core-site.xml and mapred-site.xml for the multi-node cluster environment.
  • Experience with Hadoop ecosystem components: HDFS, MapReduce, Hive, Pig, Sqoop, and HBase.
  • Involved in peer- and lead-level design and code reviews.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data from the Hadoop environment into Oracle using Sqoop for reporting on the dashboard.
  • Strong knowledge of HDFS, MapReduce, and NoSQL databases such as HBase.
  • Responsible for writing Hive queries to analyze terabytes of customer data from HBase and write the results to an output file.
  • Incrementally loaded data from the HBase NoSQL database.
  • Designed business classes and used design patterns such as Data Access Object and MVC.
  • Developed the Bootstrap UI with AngularJS.
  • Responsible for the overall layout design and color scheme of the website using HTML, Bootstrap, and CSS3.
  • Created the server side of a project management application using MongoDB.

Confidential - Dallas, TX

MySQL DBA

Environment: MySQL, Linux, shell scripting

Responsibilities:

  • Migrated on-premises servers to the AWS cloud. Resolved concurrency and performance issues by configuring thread pooling at the MySQL level. Benchmarked various types of AWS instances with differing configurations. Set up XtraBackup scripts to regularly send backups to S3.
  • Worked in a heavily sharded environment (120 shards), with client data split across multiple shards. Wrote a shell script to regularly audit schema configurations across shards. Also wrote scripts for pruning reporting tables across schemas.
  • Set up Percona clusters for various applications. Migrated native MySQL nodes to Percona clusters. Sized the gcache appropriately to avoid SST initiation. Used the RSU and TOI methods to perform schema changes on clusters.
  • Dealt with a swappiness problem in production by changing swappiness to 10 to prevent swapping.
  • Extensively used the sys schema to troubleshoot performance issues, along with the Performance Schema statement digest and event summary tables.
  • Set up Grafana and Prometheus to monitor 100+ MySQL instances.
  • Worked with application teams to release database changes to production systems
  • Increased database performance through MySQL configuration changes, multiple instances, and hardware upgrades.
  • Assisted with sizing, query optimization, buffer tuning, backup and recovery, installations, upgrades, security, and other administration functions as part of the profiling plan.
  • Ensured production data was replicated from the processing databases into the data warehouse without any data anomalies.
  • Worked with the engineering team to implement new database designs used by the company.
  • Configured MySQL replication as part of the HA solution.
  • Designed databases for referential integrity and was involved in the logical design plan.
  • Performed daily performance tuning to prevent issues and supported capacity planning using MySQL Enterprise Monitor.
  • Developed stored procedures and triggers in MySQL to reduce traffic between servers and clients.
  • Carried out network-level security tasks such as blocking/unblocking TCP/IP ports through the firewall on both Linux and Windows and blocking/unblocking remote access to the MySQL server.
  • Proficient in Unix/Linux shell commands.
  • Created and deleted users and groups, set up restrictive permissions, and configured the sudo files.
  • Created data extracts as part of data analysis and shared them with internal staff.
  • Set up and administered MySQL replication in master-slave and master-master configurations.
  • Documented all servers and databases.
  • Followed an SDLC pattern for database engineering, covering requirement analysis, design, development, testing, and deployment.
  • Used shell scripts for data migration and back-end work management.
  • Supported management with database-related decisions.
  • Handled release management and user acceptance.

Confidential - San Ramon, CA

Java/UI Developer

Environment: JSP, HTML, Servlets, Hibernate, Spring Framework, JavaScript, XML, JDBC, Oracle9i, PL/SQL, WebSphere, Eclipse, JUnit

Responsibilities:

  • Involved in the design of JSPs and Servlets for navigation among the modules.
  • Developed various EJBs for handling business logic and data manipulation from the database.
  • Managed connectivity using JDBC for querying, inserting, and data management, including triggers and stored procedures.
  • Developed SQL queries and stored procedures using PL/SQL to retrieve data from and insert data into multiple database schemas.
  • Developed the XML Schema and web services for data maintenance and structures.
  • Wrote test cases in JUnit for unit testing of classes.
  • Worked with the DOM and DOM functions using Firefox and the IE Developer Toolbar.
