Senior Hadoop Developer Resume Charlotte, NC - Hire IT People

SUMMARY

Over 8 years of experience in the analysis, design, development, and implementation of web-based distributed applications.
4+ years of experience with Hadoop, HDFS, MapReduce, Sqoop, Pig, Hive, HBase, Linux, and UNIX.
Worked on a Hadoop Cluster with current size of 56 Nodes and 896 Terabytes capacity.
Hands-on experience with messaging, job scheduling, and coordination tools such as Kafka, Oozie, and ZooKeeper.
Strong techno-functional skills for translating business scenarios into Big Data use cases.
In-depth understanding of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and MapReduce concepts.
Capable of Designing and Architecting Hadoop Applications and recommending the right solutions and technologies for the application.
Developed many MapReduce programs in Java for data cleansing, data filtering, and data aggregation.
Re-engineered many legacy mainframe applications into Hadoop using the MapReduce API to reduce mainframe MIPS and storage costs.
Developed UDFs in Java as and when necessary to use in PIG and HIVE queries.
Created Hive tables, loaded them with data, and wrote Hive queries that run internally as MapReduce jobs.
Good knowledge of Hortonworks Data Platform 2.2.
Experience with Oozie Workflow Engine in running workflow jobs with actions that run Hadoop MapReduce and Pig jobs.
Experience working with the NoSQL databases MongoDB and Apache Cassandra.
Experience in managing and reviewing Hadoop log files.
Developed tools using Python, shell scripting, and XML to automate some of the menial tasks.
Extended Hive and Pig core functionality by writing custom UDFs.
Good experience working with distributions such as MapR, Hortonworks, and Cloudera.
Good knowledge of Amazon AWS services such as EMR and EC2, which provide fast and efficient processing of Big Data.
Experienced in the integration of various data sources like Java, RDBMS, Shell Scripting, Spreadsheets, and Text files.
Experience in working with Oracle and DB2.
Experience in Web Services using XML, HTML and SOAP.
Excellent Java development skills using J2EE, J2SE, Servlets, JUnit, JSP, JDBC.
Familiarity with popular frameworks like Struts, Hibernate, Spring, MVC, and AJAX.
Implemented Proofs of Concept on Hadoop stack and different big data analytic tools.
Experience in migration from different databases (e.g., VSAM, DB2, PL/SQL, and MySQL) to Hadoop.
Experience in NoSQL databases like HBase, Cassandra and MongoDB.
Committed to timely, quality work; a quick learner who adapts readily to new technologies and works well both within a team and across teams.
Defined and Developed ETL process to automate the data conversions, catalog uploading, error handling and auditing using Talend.
Highly motivated and a self-starter with effective communication and organizational skills, combined with attention to detail and business process improvements.

TECHNICAL SKILLS:

Technology: Hadoop Ecosystem/ J2SE/ J2EE/ Oracle.

Operating Systems: Windows Vista/XP/NT/2000 Series, UNIX/Linux (Ubuntu, CentOS, Red Hat), AIX, Solaris.

DBMS/Databases: DB2, MySQL, SQL, PL/SQL.

Programming Languages: C, C++, JSE, XML, JSP/Servlets, Struts, Spring, HTML, JavaScript, jQuery, Web services.

Big Data Ecosystem: HDFS, MapReduce, Oozie, Hive, Pig, Sqoop, Flume, ZooKeeper, HBase, Storm, Kafka, Spark, Scala.

Methodologies: Agile, Waterfall.

NOSQL Databases: Cassandra, MongoDB, HBase.

Version Control Tools: SVN, CVS, VSS, PVCS.

PROFESSIONAL EXPERIENCE:

Confidential, Charlotte, NC

Senior Hadoop Developer

Responsibilities:

Handled Hive queries using Spark SQL, integrated with a Spark environment implemented in Scala.
Used the Spark Streaming API with Kafka to build live dashboards; worked on transformations and actions on RDDs, Spark Streaming, pair RDD operations, checkpointing, and SBT.
Implemented a POC to migrate MapReduce jobs into Spark RDD transformations using Scala IDE for Eclipse.
Creating Hive tables to import large data sets from various relational databases using Sqoop and export the analyzed data back for visualization and report generation by the BI team.
Installing and configuring Hive, Sqoop, Flume, Oozie on the Hadoop clusters.
Involved in scheduling Oozie workflow engine to run multiple Hive and Pig jobs.
Developed a process for batch ingestion of CSV files and Sqoop imports from different sources, and for generating views on the data sources, using shell scripting and Python.
Integrated a shell script to create collections/morphlines and Solr indexes on top of table directories using the MapReduce Indexer Tool within the batch ingestion framework.
Implemented partitioning, dynamic partitions, and buckets in Hive.
Configured the Message Driven Beans (MDB) for messaging to different clients and agents who are registered with the system.
Developed Hive Scripts to create the views and apply transformation logic in the Target Database.
Involved in the design of Data Mart and Data Lake to provide faster insight into the Data.
Used the StreamSets Data Collector tool to create data flows for one of the streaming applications.
Experienced in using Kafka as a data pipeline between JMS (Producer) and Spark Streaming Application (Consumer)
Involved in developing a Spark Streaming application for one of the data sources, applying transformations using Scala and Spark.
Developed web services and web service clients using both SOAP and REST implementations.
Designed and developed web-based applications using Hibernate, XML, EJB, and SQL to set up new web services.

Environment: Hadoop, HDFS, Hive, HBase, ZooKeeper, Oozie, Impala, Java (jdk1.6), Cloudera, Oracle, Teradata, SQL Server, UNIX Shell Scripting, Flume, Scala, Spark, Sqoop, Python.

Confidential, Charlotte, NC

Sr. Hadoop Developer.

Responsibilities:

Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
Importing and exporting data into HDFS and Hive using Sqoop.
Involved in loading and transforming large sets of structured, semi structured and unstructured data from relational databases into HDFS using Sqoop imports.
Ingested data into HDFS from various mainframe DB2 tables using Sqoop.
Populated HDFS and Cassandra with huge amounts of data using Apache Kafka.
Migrated Existing Map Reduce programs to Spark Models using Python.
Automated the Spark Streaming process using Kafka.
Used RDDs to perform transformations on datasets as well as actions like count, reduce, and first.
Good knowledge of Spark platform parameters like memory, cores, and executors.
Developed simple and complex MapReduce programs in Java for Data Analysis on different data formats.
Tested data for conformance with standard or customized patterns using Talend.
Extracted files from CouchDB through Sqoop, placed them in HDFS, and processed them.
Experienced in running Hadoop Streaming jobs to process terabytes of XML-format data.
Administered, installed, upgraded, and managed distributions of Hadoop, Hive, and HBase.
Loading data into HBase tables using Java MapReduce.
Used AWS cloud infrastructure to manage product development and implementation.
Involved in performance troubleshooting and tuning of Hadoop clusters.
Created Hive tables, loaded data, and wrote Hive queries that run internally as MapReduce jobs.
Implemented business logic by writing Hive UDFs in Java.
Gained good experience with NoSQL databases.
Involved in loading data from UNIX file system to HDFS.
Installed and configured Hive and wrote Hive UDFs.
Created Hive tables, loaded them with data, and wrote Hive queries that run internally as MapReduce jobs.
Wrote XML scripts to build Oozie functionality.
Used Oozie Operational Services for batch processing and scheduling workflows dynamically.
Extensively worked on creating End-End data pipeline orchestration using Oozie.
Evaluated the suitability of Hadoop and its ecosystem for the current project, implementing and validating various proof-of-concept (POC) applications with a view to adopting them as part of the Big Data Hadoop initiative.
Used Sqoop and mongodump to move data between MongoDB and HDFS.
Developed workflows using custom MapReduce.
Developed a suite of unit test cases for Mapper, Reducer, and Driver classes using the MR testing library.

Environment: Java 6, Python, Linux, Hadoop, HBase, Sqoop, Kafka, Pig, Hive, Cloudera Hadoop Distribution, HDFS, MapReduce, MongoDB, Shell scripting, Flume, Spark.

Confidential, Frederick, MD

Sr. Hadoop Developer.

Responsibilities:

Involved in low-level design of MR, Hive, and shell scripts to process data.
Worked extensively in creating MapReduce jobs to power data for search and aggregation.
Designed a data warehouse using Hive.
Created partitioned tables and Hive queries for ad hoc access.
Worked extensively with Sqoop for importing metadata from Oracle.
Extensively used Pig for data cleansing.
Imported real-time data into Hadoop using Kafka and implemented the corresponding Oozie job.
Optimized MapReduce Jobs to use HDFS efficiently by using various compression mechanisms.
Worked on data serialization, converting complex objects into sequences of bits using Avro, Parquet, JSON, and CSV formats.
Administration, installing, upgrading, and managing distributions of Hadoop, Hive, HBase.
Advanced knowledge in performance troubleshooting and tuning Hadoop clusters.
Implemented business logic by writing Pig UDFs in Java and used various UDFs from Piggybank and other sources.
Used Oozie Operational Services for batch processing and scheduling workflows dynamically.
Worked extensively on Hive data stores for text, Avro, and RC storage formats.
Worked on populating analytical data stores for data science team.
Created tools using Java for performing balance tests.
Worked with architects to build efficient Oozie workflows with coordinators.
Evaluated and reconfigured the company's Unix/Linux/Oracle setup, including reallocating SAN disk space, to engineer a robust, scalable solution.
Integrated the hive warehouse with HBase.
Wrote Python scripts to parse XML documents and load the data into the database.
Wrote custom Hive UDFs in Java where the required functionality was too complex for HiveQL alone.
Implemented partitioning, dynamic partitions, and buckets in Hive.
Generated final reporting data using Tableau for testing by connecting to the corresponding Hive tables using the Hive ODBC connector.
Maintained System integrity of all sub-components (primarily HDFS, MR, HBase, and Hive).

Environment: Hadoop, MapReduce, HDFS, Hive, Java (jdk1.6), Hortonworks Hadoop distribution, Oozie, Core Java, Pig, Sqoop, Shell scripting, Kafka, Linux, HBase, Oracle.

Confidential, Hillsboro, OR

Hadoop Developer

Responsibilities:

Wrote MapReduce jobs using Java API.
Importing and exporting data into HDFS and Hive using Sqoop.
Defined a logical architecture of the layers and components of the Apache Spark solution and selected the right products to implement a big data solution.
Reliability and Ease of Scalability over traditional MSMQ.
Expertise in monitoring and administration of Spark applications. Involved in writing PySpark scripts.
Involved heavily in writing complex SQL queries based on the given requirements on Teradata platform.
Extracted data from different source systems such as Oracle, MySQL, and SQL Server databases for application development and report maintenance.
Worked on several BTEQ scripts to transform the data and load into the Teradata database.
Performed Data analysis and prepared the physical database based on the requirements.
Involved in creating the Unix shell scripts and wrapper scripts used to run BTEQ and other Teradata jobs from the Control Panel tool.
Developed entire frontend and backend modules using Python on Django Web Framework.
Involved in active communication and interaction with offshore support team during the development, Testing and production implementation phases of the project.
Worked on Tuning, and troubleshooting Teradata system at various levels. Performed unit testing, regression testing and Integration testing.
Assisted the Testing team in developing SQL/PLSQL scripts for Automated Testing.
Maintain System integrity of all sub-components related to Hadoop.
Involved in Cassandra database schema design; pushed data to Cassandra databases using the bulk load utility.

Environment: Hadoop, Hive, Spark, Spark SQL, Sqoop, Kafka, Teradata Viewpoint, UNIX, UNIX Shell Scripting, TPT, FastExport, BTEQ, GitHub, Framework, Pig, Oracle, MySQL.

Confidential, Irving, TX

Java/Hadoop Developer.

Responsibilities:

Creating class diagrams, sequence diagrams, Data Model and Object Model using Rational Rose and MS-Visio.
Used JSF Framework to develop the application.
Responsible for building scalable distributed data solutions using Hadoop.
Implemented Spring Quartz jobs for generating feeds to various downstream applications.
Used Rational Rose to draw UML diagrams and to develop the Use cases, Domain model and Design Model
Implemented the functionality using Java, J2EE, JSP, AJAX, and Servlets.
Generated dynamic charts using the JFreeChart API in Java.
Developed Java batch jobs and applied multithreading concepts for performance improvements.
Involved in Database programming in DB2.
Created the Stored Procedures, functions and triggers using PL/SQL.
Implemented the Struts MVC framework with Tiles and validators.
Application UI development using AJAX, HTML, JSP, XML and CSS.
Developed an automated mail notification system using the JavaMail API and FTP programming in Java.
Involved in database programming in Oracle 10g.
Worked as a module/tech lead for various modules like GCSP, ORNIS of the application.

Environment: Java, J2EE, JSP, MVC, Eclipse, web services, SOAP, WSDL, UDDI, JavaScript, MTG, AJAX, JDBC, WAS 5.1, Oracle 10g, PL/SQL, HTML, DHTML, XML.

Confidential

Software Engineer

Responsibilities:

Conducted requirements gathering sessions with the business user to collect business requirements (BRDs), data requirement, and user interface requirements.
Responsible for the initiation, planning, execution, control and completion of the project
Worked alongside the Development team in solving critical issues during the development.
Responsible for developing management reporting using the Cognos reporting tool.
Conducted user interviews and documented reconciliation workflows.
Worked with Business and System Analyst to complete the development in time.
Implemented the presentation layer with HTML, CSS and JavaScript.
Developed web components using JSP, Servlets and JDBC.
Implemented secured cookies using Servlets.
Wrote complex SQL queries and stored procedures.
Implemented the persistence layer using the Hibernate API.
Implemented Search queries using Hibernate Criteria interface.
Conducted detailed analysis of current processes and developed new process flow, data flow, and workflow models and use cases using Rational Rose and MS Visio.
Maintained responsibility for database design, implementation, and administration.
Tested the functionality and behavioral aspects of the software.

Environment: UNIX, Windows, Core Java, SQL, JDBC, JavaScript, HTML, JSP, Servlets, Oracle, J2EE, JCL, DB2, CICS.
