Senior Hadoop Developer Resume
Medford, MA
SUMMARY
- 8 years of overall experience across a variety of industries, including 3+ years in Big Data technologies (the Apache Hadoop stack and Apache Spark), 4+ years in Java technologies, and 1+ year in .NET technologies.
- Hands-on experience with Hadoop ecosystem components such as Hadoop, Spark, HDFS, MapReduce, YARN, Tez, Hive, Sqoop, Flume, Pig, Impala, Oozie, ZooKeeper, HBase, and Kafka.
- Hands-on experience working in multiple domains such as Manufacturing, Healthcare, and Finance & Banking.
- Experience working with Cloudera, Hortonworks, Amazon Web Services, and Microsoft Azure HDInsight Hadoop distributions.
- In-depth knowledge of Apache Hadoop architecture (1.x and 2.x) and Apache Spark 1.x architecture.
- Experience in implementing OLAP multi-dimensional cube functionality using AtScale.
- Responsible for ingesting structured data residing in traditional back-end databases into Hadoop and Hive using Sqoop.
- Hands-on experience in writing MapReduce jobs to perform data cleaning and preprocessing using Java and Python.
- Experience working with SQL on Hadoop using Apache Hive.
- Hands-on experience in writing Apache Spark SQL and Spark Streaming programs in Scala and Python.
- Experienced in transporting and processing real-time event streams using Spark Streaming and Kafka.
- Experience in writing Hive UDFs to incorporate complex business logic into Hive queries (an illustrative sketch follows this summary).
- Responsible for modifying and performance-tuning Hive scripts, resolving automation job failure issues, and reloading data into the Hive data warehouse when needed.
- Experience in writing Spark transformations and actions using Spark SQL (RDDs and DataFrames) in Scala by converting Hive/SQL queries.
- Strong knowledge of Hadoop architecture and daemons such as HDFS, JobTracker, TaskTracker, NameNode, and DataNode, and of MapReduce concepts.
- Hands-on experience in writing MapReduce programs in Java, Pig, and Python to handle diverse data sets with Map and Reduce tasks.
- Developed multiple Map Reduce jobs to perform data cleaning and preprocessing.
- Involved in designing the data model in Hive for migrating the ETL process into Hadoop, and wrote Pig scripts to load data into the Hadoop environment.
- Designed Hive queries and Pig scripts to perform data analysis, data transfer, and table design.
- Expertise in writing Hive UDFs and GenericUDFs to incorporate complex business logic into Hive queries.
- Experienced in optimizing Hive queries by tuning configuration parameters.
- Implemented Sqoop for large dataset transfers between Hadoop and RDBMS.
- Extensively used Apache Flume to collect logs and error messages across the cluster.
- Experience in implementing real-time streaming and analytics using Spark Streaming and Kafka.
- Experience in data ingestion using Sqoop from RDBMS to HDFS and Hive, and vice versa.
- Proficient in Java/J2EE technologies - Core Java, JSP, JavaBeans, Java Servlets, Ajax, JDBC, ODBC, Web Services, Swing, Hibernate, Spring, Struts, XML, and XSLT.
- Proficient in .NET technologies - C#, ASP.NET, Entity Framework, WCF, Ajax, and MVC.
- Good experience with MVC architecture using the Spring, Struts, and ASP.NET frameworks.
- Performed data analysis using MySQL, SQL Server Management Studio, and Oracle.
- Experience with ETL tools including Informatica, Talend, and SSIS.
- Experience working with Cloudera (CDH3, CDH4, and CDH5) and Hortonworks Hadoop distributions.
- Hands-on experience with AWS infrastructure services: Amazon Simple Storage Service (S3) and Amazon Elastic Compute Cloud (EC2).
- Worked with Oozie and Zookeeper to manage the flow of jobs and coordination in the cluster
- Experience in performance tuning and monitoring of Hadoop clusters by gathering and analyzing data on the existing infrastructure using Cloudera Manager.
- Experience with configuration of Hadoop ecosystem components: MapReduce, Hive, HBase, Pig, Sqoop, Oozie, ZooKeeper, Flume, Storm, Spark, YARN, and Tez.
- Experience with Restful Services and Amazon Web Services
- Hands-on experience with Amazon EC2, EMR, and S3.
- Conversant with web/application servers - Tomcat, WebSphere, WebLogic, and IIS.
- Experience in writing Maven and SBT scripts to build and deploy Java and Scala Applications
- Implemented unit testing with JUnit and MRUnit.
- Expertise in web application development with JSP, HTML, CSS, JavaScript, ASP.NET, C#, and jQuery.
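As referenced above, a minimal, hypothetical sketch of a Hive UDF of the kind described in this summary. The package, class name, and masking rule are illustrative assumptions rather than actual project code; Hive's UDF base class resolves the evaluate() method by reflection.

    // Hypothetical Hive UDF carrying simple business logic:
    // mask an account number, keeping only the last four characters visible.
    package com.example.hive.udf;                      // illustrative package name

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    public final class MaskAccountNumber extends UDF {
        public Text evaluate(final Text account) {
            if (account == null) {
                return null;                           // propagate SQL NULLs
            }
            String value = account.toString();
            int visible = Math.min(4, value.length());
            String masked = "****" + value.substring(value.length() - visible);
            return new Text(masked);
        }
    }

Once packaged in a JAR, such a function would typically be registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION mask_account AS 'com.example.hive.udf.MaskAccountNumber', and then used like any built-in function in a query.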
TECHNICAL SKILLS
Big Data Technologies: Hadoop, MapReduce, HDFS, Hive, Pig, ZooKeeper, Sqoop, Oozie, Flume, Impala, HBase, Kafka, Storm
Big Data Frameworks: HDFS, YARN, Spark
Hadoop Distributions: Cloudera (CDH3, CDH4, CDH5), Hortonworks, Amazon EMR
Programming Languages: Java, C, C++, Shell Scripting, Scala
Databases: RDBMS (MySQL, Oracle, Microsoft SQL Server, Teradata, DB2), PL/SQL, Cassandra, MongoDB
IDE and Tools: Eclipse, NetBeans, Tableau
Operating Systems: Windows XP/Vista/7, Linux/Unix
Frameworks: Spring, Hibernate, JSF, EJB, JMS
Scripting Languages: JSP & Servlets, JavaScript, XML, HTML, Python
Application Servers: Apache Tomcat, WebSphere, WebLogic, JBoss
Methodologies: Agile, SDLC, Waterfall
Web Services: Restful, SOAP
ETL Tools: Talend, Informatica
Others: Solr, Elasticsearch
PROFESSIONAL EXPERIENCE
Confidential, Medford, MA
Senior Hadoop Developer
Responsibilities:
- Experience working with the Hortonworks distribution of Hadoop.
- Experience in implementing OLAP multi-dimensional cube functionality using AtScale.
- Responsible for building scalable distributed data solutions using Hadoop.
- Responsible for ingesting structured ERP system data residing in a traditional back-end Microsoft SQL Server database onto the Hadoop data platform using Sqoop.
- Experience in writing Azure PowerShell scripts to copy or move data from the local file system to HDFS (Azure Blob storage).
- Involved in creating Hive tables, and loading and analyzing data using Hive queries.
- Analyzed large data sets by running Hive queries and Pig scripts
- Developed Simple to complex Map Reduce Jobs using Hive and Pig
- Involved in running Hadoop jobs for processing millions of records of text data
- Developed multiple MapReduce jobs in Java for data cleaning and preprocessing (a minimal sketch follows this list of responsibilities).
- Involved in loading data from LINUX file system to HDFS
- Responsible for managing data from multiple sources
- Extracted files from relational databases using Sqoop, placed them in HDFS, and processed them.
- Experienced in running Hadoop Streaming jobs to process terabytes of XML-format data.
- Loaded and transformed large sets of structured, semi-structured, and unstructured data.
- Assisted in exporting analyzed data to relational databases using Sqoop
- Managed and reviewed Hadoop Log files
- Loaded log data into HDFS using Flume.
- Developed Pig Latin scripts to extract the data from the web server output files to load into HDFS
- Used JDBC for database connectivity with MySQL Server
- Extensive work in the ETL process consisting of data sourcing, transformation, mapping, conversion, and loading using Talend.
- Experience working with Spark Streaming.
- Wrote Spark SQL queries using DataFrames.
- Experience in importing and exporting data into HDFS and Hive using Sqoop.
- Hands on experience in defining, partitioning, bucketing, compressing Hive tables to meet business requirement.
- Experience in performance tuning of Hive queries.
- Implemented Ad-hoc query using Hive to perform analytics on structured data.
- Worked extensively with Hive DDL and Hive Query Language (HQL), and implemented business logic using Hive UDFs to perform ad-hoc queries on structured data.
- Implemented Optimized joins to perform analysis on different data sets using Map Reduce programs.
- Written Hive queries for data analysis to meet the business requirements.
- Hands on experience in working with Impala.
- Hands on experience in writing MapReduce programs to meet business needs.
- Hands on experience in writing Linux/Unix Shell scripting.
- Experienced in transporting and processing real-time event streams using Kafka.
- Experienced in defining cron job flows.
- Able to understand and migrate ETL and BI code across multiple ETL and BI tools such as Talend.
- Experienced in analyzing data using HiveQL, Pig Latin, and custom MapReduce programs in Java.
- Diverse experience in utilizing Java and Python tools in business, web, and client-server environments, including the Java platform, JSP, Servlets, JavaBeans, JSTL, JSP custom tags, EL, JSF, and JDBC.
- Deep JVM knowledge and heavy experience with functional programming languages such as Scala.
- Involved in converting Hive/SQL queries into Spark transformations and actions using Spark SQL (RDDs and DataFrames) in Python and Scala.
- Implemented Spark SQL queries in Scala for faster testing and processing of data.
- Implemented Spark Streaming to read real-time data from Kafka in parallel, process it in parallel, and save the results in Parquet format in Hive.
- Built an analytics POC to analyze outpatient details with R and SparkR (using a logistic regression algorithm).
- Installed Zeppelin in Cloudera Dev environment and executed Spark programs
- Developed applications using Eclipse
- Used Hadoop Streaming to write jobs in Python.
- Expertise in writing shell scripts to monitor Hadoop jobs.
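As referenced in the data-cleaning bullet above, a minimal sketch of the kind of MapReduce cleaning job described here. The field count, delimiter, and class names are illustrative assumptions, not the actual project code.

    // Hypothetical map-only job: drop malformed CSV records and trim fields.
    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class CleanRecordsJob {

        public static class CleanMapper
                extends Mapper<LongWritable, Text, NullWritable, Text> {

            private static final int EXPECTED_FIELDS = 5;   // assumed record width

            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                String[] fields = value.toString().split(",", -1);
                if (fields.length != EXPECTED_FIELDS) {
                    context.getCounter("clean", "malformed").increment(1);
                    return;                                  // drop malformed rows
                }
                StringBuilder cleaned = new StringBuilder();
                for (int i = 0; i < fields.length; i++) {
                    if (i > 0) cleaned.append(',');
                    cleaned.append(fields[i].trim());        // normalize whitespace
                }
                context.write(NullWritable.get(), new Text(cleaned.toString()));
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "clean-records");
            job.setJarByClass(CleanRecordsJob.class);
            job.setMapperClass(CleanMapper.class);
            job.setNumReduceTasks(0);                        // map-only cleaning pass
            job.setOutputKeyClass(NullWritable.class);
            job.setOutputValueClass(Text.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }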
Environment: Hadoop, MapReduce, HDFS, Pig, Hive, HBase, Sqoop, Flume, Java, Python, Oracle 10g, MySQL, Ubuntu, Agile, XML, SQL Server, YARN, Cloudera, Teradata, Talend, UNIX Shell Scripting, Oozie, Scala, Spark, R, Maven, SBT, Zeppelin, Eclipse, IntelliJ
Confidential, Santa Barbara, CA
Sr. Hadoop Developer.
Responsibilities:
- Worked on analyzing Hadoop cluster using different big data analytic tools including Pig, Hive, and MapReduce
- Experience with Cloudera distribution of Hadoop.
- Experience in implementing applications on Spark frameworks using Scala.
- Wrote Spark SQL queries using DataFrames.
- Developed Spark code in Scala in the IntelliJ IDEA IDE, using SBT as the build tool.
- Experience in writing Spark transformations and actions using Spark SQL (RDDs and DataFrames) in Scala by converting Hive/SQL queries (see the sketch after this list of responsibilities).
- Experience in Importing and exporting data into HDFS and Hive using Sqoop.
- Hands on experience in defining, partitioning, bucketing, compressing Hive tables to meet business requirement.
- Experience in performance tuning of Hive queries.
- Implemented Ad-hoc query using Hive to perform analytics on structured data.
- Worked extensively with Hive DDL and Hive Query Language (HQL), and implemented business logic using Hive UDFs to perform ad-hoc queries on structured data.
- Written Hive queries for data analysis to meet the business requirements.
- Hands on experience in working with Impala.
- Collecting and aggregating large amounts of log data using Apache Flume and staging data in HDFS for further analysis
- Used Pig as ETL tool to do transformations, event joins, filtering and some pre-aggregations before storing the data onto HDFS.
- Hands on experience in writing, executing pig scripts.
- Hands on experience in writing Pig UDFs.
- Configured Oozie workflows to automate data flow, preprocessing, and cleaning tasks using Hadoop actions.
- Performed daily monitoring of cluster status and health, including the DataNode, JobTracker, TaskTracker, and NameNode daemons.
- Experience with configuration of Hadoop Ecosystem components: Map Reduce, Hive, HBase, Pig, Sqoop, Oozie, Zookeeper, Flume, Storm, Spark, Yarn, Tez.
- Experience with the CDH distribution and Cloudera Manager to manage and monitor Hadoop clusters.
- Knowledge on rendering and delivering reports in desired formats by using reporting tools such as Tableau.
- Worked on debugging, performance tuning of Hive & Pig Jobs
- Worked on tuning the performance of Pig queries.
- Involved in loading data from LINUX file system to HDFS
- Importing and exporting data from different relational databases into HDFS and Hive using Sqoop and performed transformations using MapReduce and Hive
- Analyzed data by performing Hive Queries and running the Pig Scripts to study the behavior in a particular aspect
- Experience working on processing unstructured data using Pig and Hive
- Used UDFs to implement business logic in Hadoop
- Supported MapReduce Programs those are running on the cluster
- Gained experience in managing and reviewing Hadoop log files
- Created HBase tables to store various data formats coming from different applications.
- Developed ETL Scripts for Data acquisition and Transformation using Talend
- Extensive experience with Talend source & connections configuration, credentials management, context management
- Implemented and assisted with Talend installations and Talend server setup, including the MDM server.
- Implemented a proof of concept to analyze streaming data using Apache Spark with Scala and Python; used Maven and SBT to build and deploy the Spark programs.
- Involved in scheduling the Oozie workflow engine to run multiple Hive and Pig jobs.
- Developed simple to complex MapReduce jobs using Java, Pig, and Hive.
- Developed applications using Eclipse and used Maven as the build and deployment tool.
- Exported the analyzed data to the relational databases using Sqoop for visualization
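As referenced above, a minimal sketch of converting a Hive/SQL aggregate query into DataFrame transformations. The project work was done in Scala; this illustration uses Spark's Java Dataset API (assuming a Spark 2.x-style SparkSession), and the table and column names are hypothetical.

    // Hypothetical conversion of a HiveQL aggregate into DataFrame operations.
    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;

    import static org.apache.spark.sql.functions.col;
    import static org.apache.spark.sql.functions.sum;

    public class HiveQueryToDataFrame {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                    .appName("hive-query-conversion")
                    .enableHiveSupport()                 // read Hive-managed tables
                    .getOrCreate();

            // Original HiveQL (illustrative):
            //   SELECT region, SUM(amount) AS total
            //   FROM sales
            //   WHERE year = 2016
            //   GROUP BY region;
            Dataset<Row> totals = spark.table("sales")
                    .filter(col("year").equalTo(2016))
                    .groupBy(col("region"))
                    .agg(sum(col("amount")).alias("total"));

            totals.show();
            spark.stop();
        }
    }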
Environment: Hadoop, MapReduce, HDFS, Pig, Hive, Sqoop, Flume, Java, Oracle 10g, MySQL, SQL Server, Ubuntu, Agile, YARN, Spark, Hortonworks, Teradata, Talend, UNIX Shell Scripting, Oozie, Maven, Eclipse
Confidential, Austin, Texas
Hadoop Developer/ Java
Responsibilities:
- Worked with application teams to install operating system, Hadoop updates, patches, version upgrades as required
- Manipulated, transformed, and analyzed data from various types of databases
- Worked extensively in creating Map Reduce jobs to power data for search and aggregation
- Extensively used Pig for data cleansing with Tez
- Created HBase tables to store various data formats coming from different applications.
- Designed a data warehouse using Hive
- Have strong understanding of Dynamic Partitioning in Hive
- Created partitioned and bucketed tables in Hive to provide representative samples during predictive modeling.
- Worked with business teams and created Hive queries for ad hoc access
- Created several UDFs in Pig and Hive to give additional support for the project
- Performed analytics with Hive queries.
- Implemented counters on HBase data to count total records across different tables.
- Experienced in handling Avro data files by passing the schema into HDFS using Avro tools and MapReduce.
- Worked on custom Pig Loaders and Storage classes to work with a variety of data formats such as JSON, Compressed CSV, etc.
- Implemented secondary sorting to sort reducer output globally in MapReduce.
- Implemented a data pipeline by chaining multiple mappers using ChainMapper (a brief sketch follows this list of responsibilities).
- Experience with Hortonworks distribution of Hadoop.
- Worked on a Hadoop cluster that ranged from 5-8 nodes during the pre-production stage and was sometimes extended up to 26 nodes during production.
- Experience in Importing and exporting data into HDFS and Hive using Sqoop.
- Developed Pig program for loading and filtering the streaming data into HDFS using Flume.
- Experienced in handling data from different data sets, join them and pre-process using Pig join operations.
- Moved bulk data into HBase using MapReduce integration.
- Worked extensively with Sqoop for importing data from Oracle and Netezza
- Created ETL Scripts for Data acquisition and Transformation using Talend
- Able to understand and migrate ETL and BI code across multiple ETL and BI tools such as Talend.
- Developed applications using Eclipse and used Maven as the build and deployment tool.
- Evaluated usage of command line Oozie/Hue for Workflow Orchestration
- Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
- Created tables, partitions, and buckets, and performed analytics using ad-hoc Hive queries.
- Provided batch processing solution to certain unstructured and large volume of data by using Hadoop MapReduce framework
- Experience with configuration of Hadoop Ecosystem components: Map Reduce, Hive, HBase, Pig, Sqoop, Oozie, Zookeeper, Flume, Storm, Spark, Yarn, Tez.
- Mentored analysts and the test team in writing Hive queries.
- Used R for analytics, predictive modeling and regression analysis
- Implemented test scripts to support test driven development and continuous integration
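As referenced in the ChainMapper bullet above, a minimal sketch of chaining two mappers into a single map phase. The stage names and filtering rule are illustrative assumptions, not the actual project logic.

    // Hypothetical pipeline: normalize records, then filter them, in one map task.
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.lib.chain.ChainMapper;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class ChainedPipelineJob {

        // Stage 1: lower-case every record.
        public static class NormalizeMapper
                extends Mapper<LongWritable, Text, LongWritable, Text> {
            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws java.io.IOException, InterruptedException {
                context.write(key, new Text(value.toString().toLowerCase()));
            }
        }

        // Stage 2: keep only records matching a business rule.
        public static class FilterMapper
                extends Mapper<LongWritable, Text, LongWritable, Text> {
            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws java.io.IOException, InterruptedException {
                if (value.toString().contains("status=active")) {   // assumed rule
                    context.write(key, value);
                }
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "chained-mapper-pipeline");
            job.setJarByClass(ChainedPipelineJob.class);

            // The output of NormalizeMapper feeds directly into FilterMapper.
            ChainMapper.addMapper(job, NormalizeMapper.class,
                    LongWritable.class, Text.class, LongWritable.class, Text.class,
                    new Configuration(false));
            ChainMapper.addMapper(job, FilterMapper.class,
                    LongWritable.class, Text.class, LongWritable.class, Text.class,
                    new Configuration(false));

            job.setNumReduceTasks(0);                    // map-only chain for the sketch
            job.setOutputKeyClass(LongWritable.class);
            job.setOutputValueClass(Text.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }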
Environment: Hadoop, MapReduce, HDFS, Pig, Hive, Sqoop, HBase, Flume, Java, Oracle 10g, Netezza, MySQL, Ubuntu, Agile, Cloudera, UNIX Shell Scripting, Oozie, Maven, Eclipse
Confidential, Carson City, NV
Java/ J2EE Developer.
Responsibilities:
- Worked as a senior developer for the project
- Created UML class diagrams that depict the code’s design and its compliance with the functional requirements
- Analysis, Design, Development and Unit Testing of the modules
- Used Java Mail notification mechanism to send confirmation email to applied companies
- Also involved in writing JSPs, JavaScript, and Servlets to generate dynamic web pages and web content.
- Developed various Java classes, SQL queries and procedures to retrieve and manipulate the data from backend Oracle database using JDBC
- Used Enterprise Java Beans as a middleware in developing a three-tier distributed application
- Developed session beans and entity beans for business and data processing.
- Implemented Web Services with REST
- Developed user interface using HTML, CSS, JSPs and AJAX
- Performed client-side validation with JavaScript and jQuery, and applied server-side validation to the web pages as well.
- Developed the application leveraging the Model-View-Controller (MVC) architecture, with Maven and Ant as build tools.
- Used JIRA for bug tracking of the web application.
- Wrote Spring Core and Spring MVC configuration to associate DAOs with the business layer (a brief sketch follows this list of responsibilities).
- Worked with HTML, DHTML, CSS, and JavaScript in UI pages.
- Wrote Web Services using SOAP for sending and getting data from the external interface.
- Extensively worked with JUnit framework to write JUnit test cases to perform unit testing of the application
- Implemented JDBC modules in JavaBeans to access the database.
- Designed the architecture and tables for the back-end Oracle database.
- Application hosted on WebLogic and developed using the Eclipse IDE.
- Used XSL/XSLT for transforming and displaying reports. Developed Schemas for XML.
- Involved in writing the ANT scripts to build and deploy the application.
- Developed a web-based reporting for monitoring system with HTML and Tiles using Struts framework.
- Implemented field-level validations with AngularJS, JavaScript, and jQuery.
- Preparation of unit test scenarios and unit test cases
- Branding the site with CSS
- Code review and unit testing the code
- Involved in unit testing using JUnit.
- Implemented Log4j to trace logs and to track information.
- Involved in project discussions with clients and analyzed complex project requirements as well as prepared design documents
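As referenced in the Spring Core/Spring MVC bullet above, a minimal sketch of how a controller can delegate to a business-layer service that wraps a DAO. All class, bean, and view names are hypothetical; the actual wiring would live in the project's Spring configuration.

    // Hypothetical Spring MVC controller -> service -> DAO association.
    import java.util.List;

    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.stereotype.Controller;
    import org.springframework.stereotype.Service;
    import org.springframework.ui.Model;
    import org.springframework.web.bind.annotation.RequestMapping;
    import org.springframework.web.bind.annotation.RequestMethod;

    @Controller
    public class ApplicationController {

        @Autowired
        private ApplicationService applicationService;    // business layer

        @RequestMapping(value = "/applications", method = RequestMethod.GET)
        public String listApplications(Model model) {
            model.addAttribute("applications", applicationService.findAll());
            return "applicationList";                      // logical JSP view name
        }
    }

    @Service
    class ApplicationService {

        @Autowired
        private ApplicationDao applicationDao;             // data-access layer

        public List<String> findAll() {
            return applicationDao.findAll();
        }
    }

    // DAO contract; a JDBC- or Hibernate-backed implementation would be a Spring bean.
    interface ApplicationDao {
        List<String> findAll();
    }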
Environment: Java, JSP, EJB, JMS, JavaScript, JSF, XML, JBoss, WebSphere, WebLogic, Hibernate, Spring, SQL, PL/SQL, CSS, Log4j, JUnit, Eclipse, Oracle 11g, LoadRunner, TFS
Confidential
Java Developer
Responsibilities:
- Interacted with clients to gather functional requirements such as SEO requirements, Captcha implementation, and consultation form implementation.
- Involved in Analysis, design, development and testing of the modules
- Developed master pages and static pages
- Developed consultation form with Captcha functionality and mailing functionality
- Developed Services with AngularJS
- Worked with the Shibboleth Identity Provider and Service Provider.
- Used IIS and Apache as web servers.
- Developed analysis level documentation such as Use Case Model, Activity, Sequence and Class Diagrams.
- Developed the application using Struts MVC for the web layer.
- Developed UI layer logics of the application using JSP, JavaScript, HTML/DHTML, and CSS.
- Implemented URL Rewrite and Redirection using URLRewriteFilter
- Implemented English to French Toggling functionality
- Extensively used Core Java, Servlets, JSP and XML.
- Used Struts 1.2 in presentation tier.
- Involved in writing JSP and JSF components. Used the JSTL tag library (Core, Logic, Nested, Bean, and HTML taglibs) to create standard dynamic web pages.
- Application was based on MVC architecture with JSP serving as presentation layer, Servlets as controller and Hibernate in business layer to access to Oracle Database.
- Developed the DAO layer for the application using Spring's Hibernate Template support (see the sketch after this list of responsibilities).
- Implemented Google Analytics for all pages by using Google Scripts
- Implemented Log4j Logging in the application
- Hosted the web application in Testing Environment and Supported network team to host the same in Live
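As referenced in the DAO-layer bullet above, a minimal sketch of a DAO built on Spring's HibernateTemplate support. The entity, property, and class names are illustrative assumptions; the session factory and mapping would be defined in the project's Spring and Hibernate configuration.

    // Hypothetical DAO for consultation-form submissions using HibernateDaoSupport.
    import java.util.List;

    import org.springframework.orm.hibernate3.support.HibernateDaoSupport;

    public class ConsultationRequestDao extends HibernateDaoSupport {

        // Persist a new consultation-form submission.
        public void save(ConsultationRequest request) {
            getHibernateTemplate().save(request);
        }

        // Fetch all submissions, newest first (HQL names are assumptions).
        @SuppressWarnings("unchecked")
        public List<ConsultationRequest> findAll() {
            return (List<ConsultationRequest>) getHibernateTemplate()
                    .find("from ConsultationRequest order by createdDate desc");
        }
    }

    // Minimal entity stub so the sketch compiles; getters/setters omitted.
    class ConsultationRequest {
        private Long id;
        private String email;
        private java.util.Date createdDate;
    }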
Environment: ASP .Net, C# .Net, SQL Server, SSIS, SharePoint, Entity Framework, Outlook, SMTP Mailing, HTML, JavaScript, JSON, JQuery, CSS, Visual Studio, SQL Server Management Studio, Team Foundation Server, XML, and IIS