
Hadoop Developer Resume


Framingham, MA

SUMMARY:

  • 8 years of experience in developing, implementing, testing, and maintaining web applications using Java, including 4 years of experience across all phases of Hadoop and Big Data development
  • Experience in Software Development Life Cycle (SDLC) methodologies such as Agile, Scrum, and Waterfall
  • Experience in using Hive to analyze partitioned and bucketed data and compute various metrics for reporting
  • Experience in using Pig as an ETL tool to perform testing on transformations, event joins, and some pre-aggregations before storing the data in HDFS
  • Good experience with MapReduce (MR), Hive, Pig, HBase, Sqoop, Spark, and Scala for data extraction, processing, storage, and analysis
  • Experience writing HiveQL queries and Pig Latin scripts for ETL
  • Expertise in processing and analyzing archived and real-time data using Spark Core, Spark SQL, and Spark Streaming (see the Spark SQL sketch after this list)
  • Good knowledge of Amazon Web Services (AWS) concepts such as EMR and EC2, which provide fast and efficient processing for Teradata big data analytics workloads
  • Expertise in data development on the Hortonworks HDP platform and Hadoop ecosystem tools such as HDFS, Spark, Zeppelin, Hive, HBase, Sqoop, Flume, Atlas, Solr, Pig, Falcon, Oozie, Hue, Tez, Apache NiFi, and Kafka
  • Expertise in JavaScript, JavaScript MVC patterns, object-oriented JavaScript design patterns, and AJAX; developed core modules in large cross-platform applications using Java, JSP, Servlets, JDBC, JavaScript, XML, and HTML
  • Experience with multiple Hadoop distributions such as Cloudera, Hortonworks and AWS
  • Experience with VMWare, VirtualBox, Docker and Vagrant
  • Experience with Java SE 8 and Java EE frameworks such as Spring and Spring MVC 4.0
  • In-depth understanding of Spark Architecture including Spark Core, Spark SQL, Data Frames, Spark Streaming, Spark MLlib
  • Hands on experience in application development using Java, RDBMS, and UNIX shell scripting
  • Experience working with BI teams to translate big data requirements into Hadoop-centric solutions
  • Hands-on experience on development tools like Eclipse, IntelliJ, RAD, MyEclipse
  • Good knowledge of Hadoop architecture and its components, such as YARN, HDFS, NodeManager, ResourceManager, JobTracker, TaskTracker, NameNode, DataNode, and MapReduce concepts
  • Experienced with scripting languages such as Python and shell scripts
  • Good data warehouse experience with MS SQL
  • Solid SQL skills; able to write complex SQL queries, functions, triggers, and stored procedures for backend, database, and end-to-end testing
  • Good experience in Linux, UNIX, Windows, and macOS environments
  • Experience working with both small and large groups, meeting new technical challenges, and finding solutions that meet customer needs
  • Strong understanding of Agile and Waterfall SDLC methodologies
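
As a brief illustration of the Spark SQL work summarized above, the following is a minimal PySpark sketch that reads a partitioned Hive table and computes a simple reporting metric. The table and column names (sales, order_date, region, amount) are hypothetical placeholders, and a Spark installation with Hive support enabled is assumed.

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    # Assumes the cluster's Spark build is configured with Hive support.
    spark = (SparkSession.builder
             .appName("partitioned-metrics")
             .enableHiveSupport()
             .getOrCreate())

    # Hypothetical Hive table "sales", partitioned by order_date; the filter
    # on the partition column lets Spark prune partitions.
    daily_totals = (spark.table("sales")
                    .where(F.col("order_date") >= "2018-01-01")
                    .groupBy("order_date", "region")
                    .agg(F.sum("amount").alias("total_amount"),
                         F.count("*").alias("order_count")))

    daily_totals.show()
    spark.stop()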

PROFESSIONAL EXPERIENCE:

Confidential, Framingham, MA

Hadoop Developer

Responsibilities:

  • Worked on Hadoop ecosystem components including Hive, Sqoop, and Kafka on the Hortonworks Data Platform (HDP)
  • Involved in the end-to-end lifecycle of Hadoop jobs that used technologies such as Sqoop, Pig, Hive, and shell scripts (for job scheduling)
  • Handled importing of data from various data sources, performed transformations using Hive and Pig, and loaded the data into HDFS
  • Extracted files from CouchDB, placed them into HDFS using Sqoop, and pre-processed the data for analysis
  • Handled importing of data from various data sources like MySQL, Oracle, DB2
  • Exported the result set from Hive to MySQL using Sqoop after processing the data
  • Participated in managing and reviewing Hadoop log files
  • Developed Python code for testing and QA of ETL jobs
  • Unit tested and tuned SQL and ETL code for better performance
  • Developed Python scripts to find vulnerabilities in SQL queries through SQL injection testing
  • Used Python for pattern matching in build logs to format warnings and errors (see the log-scan sketch after this list)
  • Ran multiple MapReduce jobs through Pig and Hive for data cleaning and pre-processing
  • Involved in HDFS maintenance and loading of structured and unstructured data
  • Created custom Python/shell scripts to import data from Oracle databases via Sqoop (see the Sqoop sketch after this list)
  • Used Python scripts to update content in the database and manipulate files
  • Utilized standard and third-party Python modules such as csv, ConfigParser, ibm_db, and cx_Oracle
  • Developed test plans, test scripts, and test procedures from the specification document in Python and automated them to run in the real-time environment
  • Developed and designed an automation framework using Python and shell scripting
  • Developed the project in a Linux environment
  • Knowledge of JSON and simplejson-based web services
  • Utilized the Agile Scrum methodology to coordinate with developers, held regular code review sessions, and set up a high-availability cluster to integrate Hive with existing applications
  • Imported data from and exported the analyzed data to relational databases using Sqoop
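
The build-log pattern matching mentioned above (formatting warnings and errors with Python) can be sketched as follows; the log file name and the warning/error message formats are hypothetical placeholders.

    import re

    # Hypothetical patterns for warning and error lines in a build log.
    WARN_RE = re.compile(r"\b(warning|warn)\b[:\s]+(.*)", re.IGNORECASE)
    ERROR_RE = re.compile(r"\b(error|fatal)\b[:\s]+(.*)", re.IGNORECASE)

    def summarize_build_log(path):
        """Scan a build log and print formatted warning/error lines."""
        warnings, errors = [], []
        with open(path) as log:
            for lineno, line in enumerate(log, start=1):
                if ERROR_RE.search(line):
                    errors.append((lineno, line.strip()))
                elif WARN_RE.search(line):
                    warnings.append((lineno, line.strip()))
        for lineno, msg in errors:
            print("ERROR line %5d: %s" % (lineno, msg))
        for lineno, msg in warnings:
            print("WARN  line %5d: %s" % (lineno, msg))
        return len(errors), len(warnings)

    if __name__ == "__main__":
        summarize_build_log("build.log")  # hypothetical log file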
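
The custom Python/Sqoop import scripts followed a pattern along these lines. This is a minimal sketch, assuming the sqoop client is installed on the node running the script; the JDBC URL, credentials file, table name, and HDFS target directory are hypothetical placeholders.

    import subprocess

    def sqoop_import(table, target_dir):
        """Build and run a Sqoop import from Oracle into HDFS (sketch)."""
        cmd = [
            "sqoop", "import",
            "--connect", "jdbc:oracle:thin:@//db-host:1521/ORCL",  # hypothetical
            "--username", "etl_user",                              # hypothetical
            "--password-file", "/user/etl/.oracle_pwd",            # hypothetical
            "--table", table,
            "--target-dir", target_dir,
            "--num-mappers", "4",
            "--as-textfile",
        ]
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            raise RuntimeError("Sqoop import failed: " + result.stderr)
        return result.stdout

    if __name__ == "__main__":
        sqoop_import("CUSTOMERS", "/data/raw/customers")  # hypothetical names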

Confidential, Minneapolis, MN

Hadoop Developer

Responsibilities:

  • Followed the Agile methodology, specifically the Scrum software development process, throughout the project
  • Involved in bi-weekly sprint meetings with business analysts and business managers to drive testing efforts and implement elegant solutions to the tasks
  • Worked with Hadoop ecosystem components such as HDFS, HBase, Sqoop, Hive, Spark (Scala), and Pig on the Hortonworks Hadoop distribution
  • Prepared positive and negative test cases by understanding user stories/requirements for different interfaces
  • Prepared the system test plan covering testing scope, requirements, environment, approach, risks, and issues
  • Involved in designing test plans and test cases and in the overall unit and integration testing of the system
  • Performed comparisons between DDLs and table structures
  • Handled importing of data from various data sources such as SQL, mainframes, and Oracle DB2 (AAH)
  • Validated the data between the data lake tables and the target tables in Teradata, Oracle DB2 (AAH), and SQL
  • Performed data validation between source and target tables using Hive and Pig (see the validation sketch after this list)
  • Involved in creating Hive tables and loading data into dynamically partitioned tables
  • Gained hands-on experience with Hive CDC and SCD logic
  • Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting
  • Used Pig as an ETL tool to perform testing on transformations, event joins, and some pre-aggregations before storing the data in HDFS
  • Insert-overwrote the Hive data with HBase data daily to get fresh data every day
  • Used the Zena component to trigger files from the source and load the data into the target tables
  • Monitored and verified Hadoop Zena jobs, analyzed the data, and generated reports to meet business requirements
  • As part of audit testing, performed duplicate-file and zero-byte-file checks through Zena
  • Performed data quality checks and data quality threshold checks on the HDFS files received from the source
  • Wrote test scripts in HiveQL and Pig Latin to validate the data in tables
  • Involved in the defect management process: logged bugs in JIRA and ALM for development and business review
  • Involved in defect triage meetings with business and developers, explaining the severity and risk of the issues found
  • Worked on Sqoop to import data from various relational data sources
  • Worked on strategizing Sqoop jobs to parallelize data loads from source systems
  • Participated in daily stand-up calls and updated progress to all stakeholders
  • Involved in unit testing and delivered unit test plans and results documents using JUnit and MRUnit
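
The source-to-target validation described above can be illustrated with a short Python sketch that compares row counts between a data lake table and its target table over HiveServer2. The PyHive dependency, host name, and table names are assumptions for illustration; in the project the equivalent checks were also written directly in HiveQL and Pig Latin.

    from pyhive import hive  # assumes PyHive is installed on the edge node

    def row_count(cursor, table):
        """Return the row count of a fully qualified Hive table."""
        cursor.execute("SELECT COUNT(*) FROM {0}".format(table))
        return cursor.fetchone()[0]

    def validate_counts(source_table, target_table):
        """Compare source and target row counts; True when they match."""
        conn = hive.Connection(host="hive-gateway", port=10000,   # hypothetical
                               username="qa_user", database="default")
        cursor = conn.cursor()
        try:
            src = row_count(cursor, source_table)
            tgt = row_count(cursor, target_table)
            print("source=%d target=%d match=%s" % (src, tgt, src == tgt))
            return src == tgt
        finally:
            cursor.close()
            conn.close()

    if __name__ == "__main__":
        validate_counts("datalake.claims_raw", "edw.claims")  # hypothetical tables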

Confidential, Birmingham, AL

Hadoop Developer

Responsibilities:

  • Performed performance tuning and troubleshooting of MapReduce jobs by analyzing and reviewing Hadoop log files
  • Involved in low-level design for MR, Hive, Impala, and shell scripts to process data
  • Involved in the complete big data flow of the application, from upstream data ingestion into HDFS to processing and analyzing the data in HDFS
  • Knowledge of handling Hive queries using Spark SQL, integrated with the Spark environment and implemented in Scala
  • Used the Spark Streaming API with Kafka to build live dashboards; worked on transformations and actions on RDDs, Spark Streaming, pair RDD operations, checkpointing, and SBT (a Python illustration follows this list)
  • Implemented a POC to migrate MapReduce jobs to Spark RDD transformations using the Scala IDE for Eclipse
  • Created Hive tables to import large data sets from various relational databases using Sqoop and exported the analyzed data back for visualization and report generation by the BI team
  • Installed and configured Hive, Sqoop, Flume, and Oozie on the Hadoop clusters
  • Involved in scheduling the Oozie workflow engine to run multiple Hive and Pig jobs
  • Developed a process for batch ingestion of CSV files and Sqoop loads from different sources, and generated views on the data sources using shell scripting and Python
  • Integrated a shell script to create collections/morphlines and Solr indexes on top of table directories using the MapReduce Indexer Tool within the batch ingestion framework
  • Implemented partitioning, dynamic partitions, and buckets in Hive
  • Developed Hive Scripts to create the views and apply transformation logic in the Target Database
  • Involved in the design of Data Mart and Data Lake to provide faster insight into the Data
  • Used the StreamSets Data Collector tool and created data flows for one of the streaming applications
  • Used Kafka as a data pipeline between JMS (producer) and the Spark Streaming application (consumer)
  • Involved in developing a Spark Streaming application for one of the data sources using Scala and Spark, applying the required transformations
  • Developed a Scala script to read all the Parquet tables in a database and write them out as JSON files, and another script to expose them as structured tables in Hive
  • Designed and Maintained Oozie workflows to manage the flow of jobs in the cluster
  • Configured ZooKeeper for cluster coordination services
  • Developed a unit test script to read a Parquet file for testing PySpark on the cluster (a minimal test sketch follows this list)
  • Involved in exploring new technologies such as AWS, Apache Flink, and Apache NiFi that can increase business value
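
The Kafka-to-dashboard pipeline above was built in Scala on the DStream-based Spark Streaming API. As a rough Python illustration of the same idea, using the newer Structured Streaming API instead, the sketch below counts events per minute from a Kafka topic; the broker address and topic name are hypothetical, and the spark-sql-kafka connector package is assumed to be available on the cluster.

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = (SparkSession.builder
             .appName("live-dashboard-feed")
             .getOrCreate())

    # Hypothetical Kafka source; the Kafka reader provides a "timestamp" column.
    events = (spark.readStream
              .format("kafka")
              .option("kafka.bootstrap.servers", "broker1:9092")
              .option("subscribe", "dashboard-events")
              .load())

    # Events-per-minute metric; the console sink stands in for the dashboard feed.
    per_minute = (events
                  .withWatermark("timestamp", "2 minutes")
                  .groupBy(F.window("timestamp", "1 minute"))
                  .agg(F.count("*").alias("events_per_minute")))

    query = (per_minute.writeStream
             .outputMode("update")
             .format("console")
             .option("truncate", "false")
             .start())
    query.awaitTermination()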
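
The PySpark unit test mentioned above was essentially a smoke test that a Parquet data set on the cluster can be read. A minimal sketch follows; the HDFS path and the expected column name are hypothetical placeholders.

    import unittest
    from pyspark.sql import SparkSession

    class ParquetReadTest(unittest.TestCase):
        """Smoke test: a Parquet data set is readable through PySpark."""

        @classmethod
        def setUpClass(cls):
            cls.spark = (SparkSession.builder
                         .appName("parquet-read-test")
                         .getOrCreate())

        @classmethod
        def tearDownClass(cls):
            cls.spark.stop()

        def test_parquet_is_readable(self):
            # Hypothetical HDFS location of one of the ingested Parquet tables.
            df = self.spark.read.parquet("/data/curated/events/")
            self.assertGreater(df.count(), 0)
            self.assertIn("event_id", df.columns)  # hypothetical column

    if __name__ == "__main__":
        unittest.main()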

Confidential

Java Developer

Responsibilities:

  • Attended various meetings with users to go through and understand the client requirements
  • Developed the application on the Spring 4.x framework, utilizing features such as Spring dependency injection, Spring Beans, Spring JDBC, and Spring Web Flow with Spring MVC
  • Worked on a Spring MVC application with XML configurations and annotations
  • Used the DispatcherServlet to route incoming requests, controllers to handle requests, and the model to send values to the user interface
  • Used Agile principles to implement the projects, with two-week sprints, planning meetings, daily stand-ups, grooming, estimation, and retrospectives
  • Developed a portal application from scratch to interact with a third-party application via a token-exchange model for authentication and retrieve the needed data, using Spring MVC to handle incoming requests and RESTful web services (implementing the JAX-RS API) with the Jackson parser to send data in JSON format on web service calls
  • Participated in Scrum meetings and project planning and coordinated the status sessions
  • Developed the presentation layer using Servlets, HTML5, CSS3, JavaScript, JSPs, JSON, and XML
  • Developed Data Access Layer using Hibernate ORM framework
  • Used the Hibernate named queries concept to retrieve data from the database and integrated with Spring MVC to interact with the backend persistence system (Oracle 11g)
  • Extensively involved in creating complex SQL queries and calling Stored Procedures
  • Maintained high-quality RESTful services and implemented REST services using Spring MVC and JAX-RS
  • Used Maven to build the application and deploy it onto the JBoss Application Server
  • Used the JIRA tracking tool to manage and track issues reported by QA, prioritizing and acting on them based on severity
  • Used GitHub extensively as a version control tool and Maven for automated project builds
  • Involved in analyzing and identifying performance issues in DAO classes
  • Extensively used Log4j to log regular debug and exception statements, and was involved in design, analysis, and architecture meetings
  • Implemented unit testing using JUnit and was involved in integration testing with the database layer.

Confidential

Java Developer

Responsibilities:

  • Involved in SDLC requirements gathering, analysis, design, development, and testing of an application developed using the Agile methodology
  • Developed the application using the Spring Framework, which leverages the classical Model-View-Controller (MVC) architecture
  • Worked on the different parts of the MVC pattern, such as the DispatcherServlet, handler mappings, controllers, models, and views
  • Used Spring Core for the business layer
  • Used Hibernate in conjunction with Spring functionality to implement object-relational mapping in the persistence layer
  • Created and consumed Web Services using REST and SOAP
  • Created web pages using HTML5, CSS3, and JavaScript
  • Made asynchronous calls and preloaded data using AJAX
  • Worked on Complex SQL queries and created stored procedures for different business functionalities
  • Used the Sonar tool to maintain code quality compliance
  • Performed unit testing for various modules using JUnit
  • Used Splunk to retrieve debug logs
