Sr Hadoop Developer Resume Kansas, MO - Hire IT People

SUMMARY:

4 years of strong experience as an application developer responsible for building REST services, multi-threaded applications, and I/O programming using Java.
Strong hands-on experience with major components of the Hadoop ecosystem such as Spark, MapReduce, Hive, Pig, HBase, Sqoop, Splunk, Oozie, Flume and Kafka.
Hands-on experience developing and debugging YARN (MR2) jobs to process large datasets.
Excellent knowledge and understanding of Distributed Computing and Parallel processing frameworks.
Strong experience with developing end-to-end Spark applications in Scala.
Worked extensively on troubleshooting memory management and resource management issues within Spark applications.
Strong knowledge of fine-tuning Spark applications and Hive scripts.
Wrote complex MapReduce jobs to perform various data transformations on large-scale datasets.
Experience in installing, configuring, and monitoring Hadoop clusters both in-house and in the cloud (AWS).
Extended Hive and Pig core functionality by writing custom UDFs for data analysis.
Imported data from various data sources, performed transformations, and developed and debugged MR2 jobs to process large datasets.
Experienced in writing MapReduce programs and UDFs for both Hive and Pig in Java.
Experience using Splunk and Apache Flume for collecting, aggregating, and moving large amounts of data from application servers.
Used Sqoop extensively for ingesting data from relational databases.
Good knowledge of Kafka for streaming real-time feeds from external REST applications to Kafka topics.
Strong knowledge of the entire SDLC: requirement gathering and analysis, planning, design, development, testing and implementation.
Involved in the design and development of technical specifications using Hadoop ecosystem tools.
Used NoSQL technologies such as HBase and MongoDB for data extraction and for storing huge volumes of data.
Experience with the Oozie workflow engine in running workflow jobs with actions that run Hadoop MapReduce, Hive, Sqoop and Spark jobs.
Expertise in writing MapReduce jobs in native Java, and in using Pig and Hive for data processing.
Used an SVN repository for version control of the developed code.
Experience working with NoSQL databases including Cassandra and HBase.
Major strengths include familiarity with multiple software systems and the ability to learn new technologies quickly and adapt to new environments; a self-motivated, focused and adaptive team player with excellent interpersonal, technical and communication skills.
Strong oral and written communication, initiative, interpersonal, learning and organizational skills, matched with the ability to manage time and people effectively.

TECHNICAL SKILLS:

Big Data Ecosystem: HDFS, MapReduce, Hive, Pig, HBase, Spark, Spark Streaming, Spark SQL, Kafka, Cloudera CDH4, CDH5, Hortonworks, Hadoop Streaming, Splunk, ZooKeeper, Oozie, Sqoop, Flume, Impala, Solr, and Ranger.

Database: Oracle 10g/11g, SQL Server 2005/2008 R2, MySQL, DB2, HBase, MongoDB, Cassandra.

Framework: Struts, Spring, Hibernate

Operating Systems: Windows 2008, 2003, 2000 Server, Windows 95/98/XP/Vista/7, DOS, Red Hat Linux, Macintosh OSX.

Database Tools: SQL Enterprise Manager, SQL Profiler, Query Analyzer, SQL Server Setup, Security Manager, Service Manager, DTS, Import Export Data, Bulk Insert, SQL Server Reporting Services (SSRS)

Programming Languages: Java, Scala, Python, SQL

Script Languages: JavaScript, jQuery, Shell Script (Bash)

Methodologies: Waterfall, Iterative, Agile/Scrum

PROFESSIONAL EXPERIENCE:

Confidential, Kansas, MO

Sr Hadoop Developer

Responsibilities:

Developed Spark applications in Scala using the DataFrame and Spark SQL APIs for faster data processing.
Developed highly optimized Spark applications to perform various data cleansing, validation, transformation and summarization activities according to the requirements.
Built a data pipeline consisting of Spark, Hive, Sqoop and custom-built input adapters to ingest, transform and analyze operational data.
Developed Spark jobs and Hive jobs to summarize and transform data.
Used Spark for interactive queries, processing of streaming data and integration with a popular NoSQL database to handle huge volumes of data.
Involved in converting Hive/SQL queries into Spark transformations using Spark DataFrames and Scala.
Analyzed the SQL scripts and designed a solution to implement them in Scala.
Streamed data in real time using Spark with Kafka.
Created Kafka-based applications that monitor consumer lag within Apache Kafka clusters; used in production by multiple report suites.
Ingested syslog messages, parsed them and streamed the data to Apache Kafka.
Imported data from different data sources into HDFS using Sqoop, performed transformations using Hive and MapReduce, and loaded the results into HDFS.
Exported the analyzed data to relational databases using Sqoop for visualization and report generation by the BI team.
Collected and aggregated large amounts of log data using Flume and staged the data in HDFS for further analysis.
Analyzed the data by running Hive queries (HiveQL) to study customer behavior.
Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
Developed Hive scripts in HiveQL to de-normalize and aggregate the data.
Created HBase tables and column families to store the user event data.
Scheduled and executed workflows in Oozie to run various jobs.

Environment: Hadoop, HDFS, HBase, Spark, Scala, Hive, MapReduce, Sqoop, ETL, Java, PL/SQL, Oracle 11g, Unix/Linux

Confidential

Hadoop Developer

Responsibilities:

Created end-to-end Spark applications in Scala to perform various data cleansing, validation, transformation and summarization activities on user behavioral data.
Developed a custom input adapter using the HDFS FileSystem API to ingest clickstream log files from an FTP server into HDFS.
Developed an end-to-end data pipeline using the FTP adapter, Spark, Hive and Impala.
Implemented Spark and used Spark SQL heavily for faster development and data processing.
Explored Spark to improve the performance of and optimize existing Hadoop algorithms using SparkContext, Spark SQL, DataFrames and pair RDDs.
Involved in converting Hive/SQL queries into Spark transformations using Spark with Scala.
Used the Scala collections framework to store and process complex consumer information.
Implemented a prototype for real-time data streaming using Spark Streaming with Kafka.
Imported other enterprise data from different data sources into HDFS using Sqoop, performed transformations using Hive and MapReduce, and loaded the results into HBase tables.
Exported the analyzed data to relational databases using Sqoop for visualization and report generation by the BI team.
Collected and aggregated large amounts of log data using Flume and staged the data in HDFS for further analysis.
Analyzed the data by running Hive queries (HiveQL) and Pig scripts (Pig Latin) to study customer behavior.
Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
Created components such as Hive UDFs to provide functionality missing from Hive for analytics.
Worked on various performance optimizations such as using the distributed cache for small datasets, partitioning and bucketing in Hive, and map-side joins.
Created, validated and maintained scripts to load data manually using Sqoop.
Created Oozie workflows and coordinators to automate weekly and monthly Sqoop jobs.
Uploaded and processed more than 30 terabytes of data from various structured and unstructured sources into HDFS (AWS cloud) using Sqoop and Flume.
Used Oozie and Oozie coordinators to deploy end-to-end data processing pipelines and schedule the workflows.
Continuously monitored and managed the Hadoop cluster.
Developed interactive shell scripts for scheduling various data cleansing and data loading processes.
Experience with data wrangling and creating workable datasets.

Environment: HDFS, Pig, Hive, Sqoop, Flume, Spark, Scala, MapReduce, Oozie, Oracle 11g, YARN, UNIX Shell Scripting, Agile Methodology

Confidential, Warren, NJ

Big Data/Hadoop Developer

Responsibilities:

Led a team of three developers that built a scalable distributed data solution using Hadoop on a 30-node cluster in the AWS cloud to analyze 25+ terabytes of customer usage data.
Developed several complex MapReduce programs to analyze and transform the data and uncover insights into customer usage patterns.
Used MapReduce to index large amounts of data for easy access to specific records.
Performed ETL using Pig, Hive and MapReduce to transform transactional data into de-normalized form.
Configured periodic incremental imports of data from DB2 into HDFS using Sqoop.
Exported data from HDFS to Teradata on a regular basis using Sqoop.
Developed ETL scripts for data acquisition and transformation using Informatica and Talend.
Installed and configured Flume, Hive, Pig, Sqoop and HBase on the Hadoop cluster.
Exported analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
Supported setting up the QA environment and updating configurations for implementing scripts with Pig and Sqoop.
Worked extensively on importing metadata into Hive and migrated existing tables and applications to Hive and the AWS cloud.
Wrote Pig and Hive UDFs to analyze complex data and find specific user behavior.
Used the Oozie workflow engine to schedule multiple recurring and ad hoc Hive and Pig jobs.
Created HBase tables to store various data formats coming from different portfolios.
Created Python scripts to automate the workflows.
Extracted feeds from social media sites such as Facebook and Twitter using Python scripts.
Designed and implemented Hive and Pig UDFs in Python for evaluating, filtering, loading and storing data.
Developed simple to complex MapReduce streaming jobs in Python that are implemented using Hive and Pig.
Used TIBCO Jaspersoft for embedding BI reports.
Wrote Python scripts for automated jobs.
Assisted the team responsible for cluster maintenance, adding and removing cluster nodes, cluster monitoring and troubleshooting, managing and reviewing data backups and Hadoop log files.
Formulated the conversion of Teradata and other RDBMS sources in Hadoop backlog files.
Worked actively with various teams to understand and gather data from different sources based on the business requirements.
Worked with the testing teams to fix bugs and ensure smooth and error-free code.

Environment: Hadoop, MapReduce, HDFS, Hive, Java, SQL, Cloudera Manager, Pig, Sqoop, Oozie, HBase, ZooKeeper, PL/SQL, MySQL, DB2, Teradata.

Confidential, Salt Lake City, UT

Hadoop Developer

Responsibilities:

Developed efficient MapReduce programs on the AWS cloud to process more than 20 years' worth of claim data and to detect and separate fraudulent claims.
Developed MapReduce programs of medium to high complexity from scratch.
Uploaded and processed more than 30 terabytes of data from various structured and unstructured sources into HDFS (AWS cloud) using Sqoop and Flume.
Played a key role in setting up a 40-node Hadoop cluster using Apache MapReduce, working closely with the Hadoop administration team.
Worked with the advanced analytics team to design fraud detection algorithms and then developed MapReduce programs to run the algorithms efficiently on huge datasets.
Developed Java programs to perform data scrubbing for unstructured data.
Responsible for designing and managing the Sqoop jobs that uploaded the data from Oracle to HDFS and Hive.
Created Hive tables to import large datasets from various relational databases using Sqoop and exported the analyzed data back for visualization and report generation by the BI team.
Used Flume to collect log data containing error messages from across the cluster.
Designed and maintained Oozie workflows to manage the flow of jobs in the cluster.
Played a key role in the installation and configuration of various Hadoop ecosystem tools such as Hive, Pig and HBase.
Successfully loaded files from Teradata into HDFS, and from HDFS into Hive.
Used ZooKeeper and Oozie for coordinating the cluster and scheduling workflows.
Installed the Oozie workflow engine and scheduled it to run data- and time-dependent Hive and Pig jobs.
Designed and developed dashboards for analytical purposes using Tableau.
Analyzed the Hadoop log files using Pig scripts to monitor errors.
Provided upper management with daily updates on project progress, including the classification levels in the data.

Environment: Java, Hadoop, MapReduce, Hive, Pig, Sqoop, Flume, HBase, Teradata

Confidential, VA

Java/J2EE Developer

Responsibilities:

Played an effective role in the team by interacting with welfare business analysts and program specialists and transforming business requirements into system requirements.
Involved in developing the application on the Java/J2EE platform. Implemented the Model-View-Controller (MVC) structure using Struts.
Enhanced the portal UI using HTML, JavaScript, XML, JSP, Java and CSS as per the requirements, providing client-side JavaScript validations and server-side validation with the Bean Validation framework (JSR 303).
Developed a web services component using XML, WSDL and SOAP with a DOM parser to transfer and transform data between applications.
Developed analysis level documentation such as Use Case, Business Domain Model, Activity, Sequence and Class Diagrams.
Handled design reviews and technical reviews with other project stakeholders.
Implemented services using Core Java.
Developed and deployed UI-layer logic of sites using JSP.
Used Spring MVC to implement business model logic.
Used SoapUI to test the web services by sending SOAP requests.
Used an AJAX framework for server communication and a seamless user experience.
Created a test framework on Selenium and executed web testing in Chrome, IE and Mozilla Firefox through WebDriver.
Worked with Struts MVC objects such as the ActionServlet, controllers, validators, web application context, handler mappings and message resource bundles, and used JNDI to look up J2EE components.
Developed dynamic JSP pages with Struts.
Employed built-in and custom Struts interceptors and validators.
Developed the XML data object used to generate PDF documents and reports.
Employed Hibernate, DAO and JDBC for data retrieval and modification in the database.
Handled web service messaging and interaction using SOAP.
Developed JUnit test cases for unit testing, as well as system and user test scenarios.

Environment: Struts, Hibernate, Spring MVC, SOAP, WSDL, WebLogic, Java, JDBC, JavaScript, Servlets, JSP, JUnit, XML, UML, Eclipse, Windows.

Confidential

Java Developer

Responsibilities:

Involved in designing the Project Structure, System Design and every phase in the project.
Responsible for developing platform-related logic and resource classes, and controller classes to access the domain and service classes.
Developed UI using HTML, JavaScript, and JSP, and developed Business Logic and Interfacing components using Business Objects, XML, and JDBC.
Designed the user interface and implemented validation checks using JavaScript.
Managed database connectivity using JDBC for querying, inserting and data management, including triggers and stored procedures.
Involved in Technical Discussions, Design, and Workflow.
Participated in requirement gathering and analysis.
Developed unit test cases using the JUnit framework.
Implemented the data access using Hibernate and wrote the domain classes to generate the Database Tables.
Involved in the design of JSPs and Servlets for navigation among the modules.
Designed the cascading style sheets and XML parts of the Order Entry and Product Search modules and implemented client-side validations with JavaScript.
Involved in the implementation of view pages based on XML attributes using plain Java classes.
Involved in integration of App Builder and UI modules with the platform.

Environment: Hibernate, Java, JAXB, JUnit, XML, UML, Oracle 11g, Eclipse, Windows XP.

Confidential

Java Developer

Responsibilities:

Involved in designing the Project Structure, System Design and every phase in the project.
Responsible for developing platform-related logic and resource classes, and controller classes to access the domain and service classes.
Developed UI using HTML, JavaScript, and JSP, and developed Business Logic and Interfacing components using Business Objects, XML, and JDBC.
Designed the user interface and implemented validation checks using JavaScript.
Managed database connectivity using JDBC for querying, inserting and data management, including triggers and stored procedures.
Involved in Technical Discussions, Design, and Workflow.
Participated in requirement gathering and analysis.
Developed unit test cases using the JUnit framework.
Implemented the data access using Hibernate and wrote the domain classes to generate the Database Tables.
Involved in the design of JSPs and Servlets for navigation among the modules.
Designed the cascading style sheets and XML parts of the Order Entry and Product Search modules and implemented client-side validations with JavaScript.
Involved in the implementation of view pages based on XML attributes using plain Java classes.
Involved in integration of App Builder and UI modules with the platform.

Environment: Hibernate, Java, JAXB, JUnit, XML, UML, Oracle 11g, Eclipse, Windows XP.
