
Sr. Hadoop Developer Resume


Jacksonville, FL

SUMMARY

  • Result-driven IT Professional with 8+ years of experience in Software Development & Requirement Analysis in Agile work environments, including 4+ recent years of Big Data ecosystem experience in ingestion, storage, querying, processing and analysis of Big Data.
  • Excellent understanding of Hadoop architecture and core components such as NameNode, DataNode, ResourceManager, NodeManager and other distributed components in the Hadoop platform.
  • Hands-on experience in writing ad-hoc queries for moving data from HDFS to Hive and analyzing the data using HiveQL (a Spark/Hive sketch follows this list)
  • Experience in dealing with Apache Hadoop components like HDFS, MapReduce, Hive, HBase, Pig, Sqoop, Oozie, Mahout, Spark and Cassandra
  • Experience in the Spark framework for both batch and real-time data processing
  • Experience managing NoSQL databases on large Hadoop distributions such as Cloudera and Hortonworks
  • Skilled in developing Hadoop integrations for data ingestion, data mapping and data processing capabilities
  • Hands on experience in installing, configuring, and using Hadoop ecosystem components like Hadoop MapReduce, HDFS, HBase, Hive, Sqoop, Pig, Zookeeper, Storm, Spark, Kafka and Flume
  • Strong understanding of Data Modeling and experience with Data Cleansing, Data Profiling and Data analysis
  • Experience in ETL (DataStage) analysis, designing, developing, testing and implementing ETL processes, including performance tuning and query optimization of databases
  • Experience in extracting source data from sequential files, XML files and Excel files, then transforming and loading it into the target data warehouse
  • Strong background in writing test plans and performing Unit Testing, User Acceptance Testing, Integration Testing and System Testing
  • Good knowledge of other SQL and NoSQL databases like MySQL, MS SQL, MongoDB, HBase and Cassandra
  • Good knowledge of Data Visualization, having created multiple dashboards using Tableau
  • Skilled in using version control software such as GIT
  • Experience in importing and exporting data using Sqoop from Relational Database Systems to HDFS and vice-versa
  • Excellent analytical and problem-solving skills; a motivated team player with excellent interpersonal skills.
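
As an illustration of the ad-hoc HiveQL querying over HDFS data described above, the following is a minimal Scala sketch that assumes a Spark deployment with Hive support; the database, table and column names (sales_db.orders, region) are hypothetical placeholders, not taken from any project in this resume.

    import org.apache.spark.sql.SparkSession

    object AdHocHiveQuery {
      def main(args: Array[String]): Unit = {
        // Hive-enabled session so HiveQL can run against tables backed by HDFS
        val spark = SparkSession.builder()
          .appName("AdHocHiveQuery")
          .enableHiveSupport()
          .getOrCreate()

        // Ad-hoc HiveQL: aggregate rows already landed in HDFS and exposed as a Hive table
        val summary = spark.sql(
          """SELECT region, COUNT(*) AS order_cnt
            |FROM sales_db.orders
            |GROUP BY region""".stripMargin)

        summary.show()
        spark.stop()
      }
    }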

TECHNICAL SKILLS

Analytical Tools: SQL, Jupyter Notebook, Tableau, Zeppelin, IntelliJ, Talend, Erwin

Programming: Scala, Java

Big Data: Spark, Pig, Hive, Sqoop, HBase, Hadoop, HDFS, MapReduce

NOSQL: Cassandra, HBase

Methodologies: Agile and Waterfall Model

PROFESSIONAL EXPERIENCE

Confidential - Jacksonville, FL

Sr. Hadoop Developer

Responsibilities:

  • Performed data Ingestion from various sources into Hadoop Data Lake using Kafka
  • Used Sqoop to transfer data between RDBMS and HDFS
  • Involved in collecting and aggregating large amounts of streaming data into HDFS using Flume and defined channel selectors to multiplex data into different sinks
  • Designed and implemented custom writables, custom input formats, custom partitioners and custom comparators in MapReduce
  • Responsible for troubleshooting issues in the execution of MapReduce jobs by inspecting and reviewing log files
  • Converted existing SQL queries into HiveQL queries
  • Implemented UDFs, UDAFs and UDTFs in Python for Hive to process the data that can't be handled using Hive built-in functions
  • Effectively used Oozie to develop automatic workflows of Sqoop, MapReduce and Hive jobs
  • Exported the analyzed data into relational databases using Sqoop for visualization and to generate reports for the BI team
  • Loaded and transformed large sets of structured, semi structured and unstructured data in various formats like text, zip, XML and JSON
  • Extensively used ETL methodology for performing Data Profiling, Data Migration, Extraction, Transformation and Loading using Talend, and designed data conversions from a wide variety of source systems including Netezza, Oracle, DB2, SQL Server, Teradata and Hive
  • Worked extensively on the Spark Core and Spark SQL modules of Spark using programming languages like Scala and Python
  • Developed Pig UDFs for manipulating the data according to business requirements and also worked on developing custom Pig loaders
  • Worked on developing ETL processes (DataStage Open Studio) to load data from multiple data sources to HDFS using Flume and Sqoop, and performed structural modifications using MapReduce and Hive
  • Knowledge of handling Hive queries using Spark SQL that integrates with the Spark environment
  • Involved in creating POCs to ingest and process streaming data using Spark Streaming and Kafka (a sketch follows this list)
  • Worked in an aggressive Agile environment and participated in daily stand-ups/Scrum meetings
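
A minimal sketch of how such a Kafka-to-HDFS streaming POC could look, written here with Spark Structured Streaming in Scala; the broker address, topic name and HDFS paths are placeholders, and the spark-sql-kafka connector is assumed to be available on the classpath.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.col

    object KafkaIngestPoc {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("KafkaIngestPoc")
          .getOrCreate()

        // Subscribe to a Kafka topic as an unbounded streaming DataFrame
        val stream = spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker1:9092")
          .option("subscribe", "events")
          .load()

        // Kafka delivers key/value as binary, so cast the payload to string before processing
        val events = stream.select(col("value").cast("string").as("payload"))

        // Land the payloads in the data lake as Parquet; streaming sinks require a checkpoint location
        val query = events.writeStream
          .format("parquet")
          .option("path", "/data/lake/events")
          .option("checkpointLocation", "/data/lake/_checkpoints/events")
          .start()

        query.awaitTermination()
      }
    }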

Environment: Hadoop, CDH4, MapReduce, HDFS, Pig, Hive, Impala, Oozie, Python, Spark, Kafka, Flume, Storm, Knox, Linux, Scala

Confidential - Richards, TX

Hadoop Developer

Responsibilities:

  • Worked on analyzing Hadoop cluster and different big data analytic tools including Pig, HBase database and Sqoop
  • Responsible for building scalable distributed data solutions using Hadoop
  • Worked on installing cluster, commissioning & decommissioning of data node, name node recovery, capacity planning, and slots configuration
  • Implemented a script to transfer data from Oracle to HBase using Sqoop
  • Extensively worked on different file formats like PARQUET, AVRO & ORC
  • Created Hive schemas using performance techniques like partitioning and bucketing (see the partitioning sketch after this list)
  • Worked with application teams to install operating system, Hadoop updates, patches, version upgrades as required
  • Load and transform large sets of structured, semi structured and unstructured data
  • Developed Spark code using Scala for faster processing of data
  • Migrated complex MapReduce programs and Hive scripts into Spark RDD transformations and actions
  • Installed Oozie workflow engine to run multiple Hive and Pig jobs
  • Performed data analysis with Cassandra using Hive External tables
  • Analyzed large amounts of data sets to determine optimal way to aggregate and report on it
  • Supported in setting up QA environment and updating configurations for implementing scripts with Pig and Sqoop
  • Implemented custom interceptors for flume to filter data as per requirement
  • Created internal and external Hive tables and defined static and dynamic partitions for optimized performance
  • Used Hive and Pig to analyze data in HDFS to identify issues and behavioral patterns
  • Involved in deploying code into version control GIT
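
A hedged sketch of the external/internal Hive table layout with static and dynamic partitions mentioned above, expressed as HiveQL submitted through a Hive-enabled SparkSession in Scala; all database, table and path names are hypothetical.

    import org.apache.spark.sql.SparkSession

    object PartitionedHiveTables {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("PartitionedHiveTables")
          .enableHiveSupport()
          .getOrCreate()

        // External table over raw delimited files already landed in HDFS
        spark.sql(
          """CREATE EXTERNAL TABLE IF NOT EXISTS staging.tx_raw (
            |  tx_id STRING, amount DOUBLE, tx_date STRING)
            |ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
            |LOCATION '/data/landing/transactions'""".stripMargin)

        // Managed (internal) table partitioned by date and stored as ORC
        spark.sql(
          """CREATE TABLE IF NOT EXISTS warehouse.tx (tx_id STRING, amount DOUBLE)
            |PARTITIONED BY (tx_date STRING)
            |STORED AS ORC""".stripMargin)

        // Static partition: the partition value is fixed in the statement
        spark.sql(
          """INSERT OVERWRITE TABLE warehouse.tx PARTITION (tx_date = '2017-01-01')
            |SELECT tx_id, amount FROM staging.tx_raw WHERE tx_date = '2017-01-01'""".stripMargin)

        // Dynamic partitions: Hive derives the partition value from the selected tx_date column
        spark.sql("SET hive.exec.dynamic.partition=true")
        spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")
        spark.sql(
          """INSERT OVERWRITE TABLE warehouse.tx PARTITION (tx_date)
            |SELECT tx_id, amount, tx_date FROM staging.tx_raw""".stripMargin)

        spark.stop()
      }
    }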

Environment: Hadoop, HDFS, Hive, Impala, Pig, Sqoop, HBase, Shell Scripting, Python

Confidential - Boston, MA

Hadoop Developer

Responsibilities:

  • Loaded high volumes of data into the Hadoop cluster and analyzed the data through the Hadoop ecosystem
  • Responsible for building scalable distributed data solutions using Hadoop
  • Installed and configured Hive, Pig, Oozie, and Sqoop on Hadoop cluster
  • Developed simple to complex MapReduce jobs using the Java programming language that were implemented using Hive and Pig.
  • Supported MapReduce programs that were running on the cluster.
  • Handled the importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS and extracted data from MySQL into HDFS using Sqoop
  • Analyzed the data by performing Hive queries (HiveQL) and running Pig scripts (Pig Latin)
  • Installed Oozie workflow engine to run multiple Hive and Pig jobs
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team
  • Generated reports and dashboards using Tableau
  • Generated Tableau reports with trend lines and used filters, sets and calculated fields on the reports
  • Worked on loading and transforming of large sets of structured, semi structured and unstructured data
  • Worked on NoSQL database including MongoDB, Cassandra and HBase
  • Developed a NoSQL database using CRUD, Indexing, Replication and Sharding in MongoDB, and sorted the data by using indexing
  • Analyzed large amounts of data sets to determine optimal way to aggregate and report on it
  • Wrote multiple MapReduce programs in Java for data extraction, transformation and aggregation from multiple file formats including XML, JSON, CSV and other compressed file formats (a Spark-based sketch of the multi-format loading idea follows this list)
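
The jobs in this role were Java MapReduce; as an illustrative analogue in Scala (the Spark language used elsewhere in this resume), the sketch below loads JSON and CSV feeds and normalizes them to Parquet. All paths are placeholders; reading XML would additionally require an external Spark connector.

    import org.apache.spark.sql.SparkSession

    object MultiFormatLoad {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("MultiFormatLoad")
          .getOrCreate()

        // Semi-structured JSON: Spark infers the schema from the records
        val clicks = spark.read.json("/data/landing/clicks/*.json")

        // Structured CSV with a header row and inferred column types
        val customers = spark.read
          .option("header", "true")
          .option("inferSchema", "true")
          .csv("/data/landing/customers/*.csv")

        // Normalize both feeds to a columnar format for downstream Hive/BI queries
        clicks.write.mode("overwrite").parquet("/data/lake/clicks")
        customers.write.mode("overwrite").parquet("/data/lake/customers")

        spark.stop()
      }
    }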

Environment: Hadoop, HDFS, Hive, Impala, Pig, Sqoop, HBase, Shell Scripting, Python

Confidential - Grand Rapids, MI

Oracle PL/SQL Developer

Responsibilities:

  • Created new database objects like procedures, functions, packages, triggers and views using T-SQL in development and production environments for SQL Server.
  • Worked on Business Intelligence Process for Data Integration and Migration services
  • Performed performance tuning and optimization of SQL queries to improve query performance and to reduce runtime
  • Used database triggers and PL/SQL for development of the system
  • Extensively used PL/SQL to write Cursors for acquiring information to be used in calculations written in stored procedures
  • Developed interfaces using PL/SQL packages, stored procedures, functions, Object Types, Cursors, Pipelined functions, Oracle queues and used Collections and Bulk Collects
  • Worked extensively on tuning of SQL and PL/SQL code using various tools like explain plan, TKPROF, SQL Tuning Advisor and SQL trace to enhance the performance
  • Worked on various projects involving gap and map analysis, design and customization as per the business rules, and worked on custom application development
  • Used Data Transformation Services (DTS), an Extract, Transform, Load (ETL) tool of SQL Server, to populate data from various data sources, creating packages for different data loading operations for the application
  • Involved in tuning database & application performance using explain plan and TKPROF, resolving lock contentions, and identifying & killing sessions
  • Worked with java developers to repair and enhance current base of PL/SQL packages to fix production issues and build new functionality and improve processing time through code optimizations and indexes
  • Involved in testing, bug fixing and documenting the work for the project at each phase

Environment: Oracle 10g, SQL, PL/SQL, UNIX, SQL* Plus, Reports 6i, Forms 6i/4.5, SQL* LOADER, TOAD, Discoverer 6i, Apex 4.2

Confidential - Charlotte, NC

MS SQL Server Developer

Responsibilities:

  • Gathered requirements from business analysts.
  • Developed physical data models and created DDL scripts to create database schema and database objects
  • Designed data models using Erwin
  • Wrote user requirement documents based on functional specification
  • Created new tables, wrote stored procedures and triggers for application developers, and created some user-defined functions. Created SQL scripts for tuning and scheduling
  • Involved in performing data conversions from flat files into a normalized database structure
  • Developed source to target specifications for Data Transformation Services
  • Developed functions, views and triggers for automation
  • Extensively used joins and sub-queries to simplify complex queries involving multiple tables, and optimized the procedures and triggers to be used in production
  • Provided disaster recovery procedures and policies for backup and recovery of databases
  • Performed performance tuning in SQL Server 2000 using SQL Profiler and data loading
  • Installed SQL Server client-side utilities and tools for all the front-end developers/programmers
  • Created DTS packages to schedule the jobs for batch processing
  • Involved in performance tuning to optimize SQL queries using query analyzer
  • Created indexes, Constraints and rules on database objects for optimization
  • Creation/ Maintenance of Indexes for various fast and efficient reporting processes

Environment: MS SQL 2005, SSRS, SSAS, SSIS, SQL, MS Access, MS Excel, MS Word

Confidential

Software Engineer

Responsibilities:

  • Supported the European market application of General Motors by developing and maintaining applications and enhancements, and designing and implementing new code as per the business requirements.
  • Prepared user requirements and functional requirements documents for different modules by analyzing the business requirements
  • Implemented an architecture with JSP as the View, Action classes as the Controller, and a combination of EJBs and Java classes as the Model
  • Involved in coding session beans and entity beans to implement the business logic
  • Prepared SQL scripts for database creation and migrating existing data to the higher version of the application
  • Developed different Java Beans and helper classes to support Server-side programs
  • Involved in development of backend code for email notifications to admin users with multi-sheet Excel attachments generated using XML
  • Modified the existing backend code for different levels of enhancements
  • Designed error-handling and error-logging flows

Environment: COBOL, JCL, PAC BASE, CA7, Endevor, VSAM, IDMS, DB2, ZOs, CICS, Easytrieve, SPUFI, QMF
