Lead Hadoop Analyst Resume
Richmond, VA
PROFESSIONAL SUMMARY:
- Over 7 years of progressive experience in analysis, design, development, and testing of Big Data, Core Java/J2EE, and Scala applications.
- 3+ years of hands-on experience building Big Data applications using Hadoop.
- In-depth understanding of Hadoop architecture and core components such as HDFS and the MapReduce programming paradigm
- Extensive knowledge of Hadoop ecosystem components including MapReduce, Spark, Hive, Impala, Sqoop, HBase, Oozie, and HUE
- Excellent understanding and knowledge of the NoSQL database HBase
- Good experience in scheduling jobs using Oozie
- Experience in using different file formats (text, Parquet, JSON, XML)
- Worked with different version control tools such as Git, Bitbucket, SVN, and ClearCase
- Responsible for managing data coming from various sources.
- Involved in loading data from UNIX file system to HDFS
- Developed scripts to schedule various Hadoop jobs
- Experience with Cloudera Distribution Hadoop (CDH) clusters
- Sound exposure to the Health Care and Retail domains and to ETL and licensing frameworks, with strong analytical, design, and problem-solving skills as an added advantage.
- Worked in Agile/Scrum and Waterfall based delivery environments
- Experience in managing and reviewing Hadoop log files.
- Created MRUnit and JUnit test cases to test MapReduce and Java applications
- Good hands-on experience with Apache Tomcat, JBoss, WebSphere, Data Junction, FileZilla FTP Client, WinSCP, PuTTY, SQuirreL SQL Client, Microsoft SQL Server, and IBM DB2
- Excellent communication and presentation skills.
- Basic knowledge of Python, Scala, StreamSets, and AWS
- Worked as Onsite and Offshore team lead
TECHNICAL SKILLS:
Big Data tools: Hadoop, MapReduce, Spark, Hive, Impala, HBase, Oozie, Sqoop, HUE
Languages: Core Java/J2EE, UNIX scripting, SQL, PL/SQL
Databases: Microsoft SQL Server, IBM DB2
Frameworks: MVC, Spring MVC, ETL, Pega PRPC Certification and Licensing Framework (CLF)
BPM Tools: PEGA PRPC 7.1, PEGA PRPC 6.3
Development Tools: IBM Rational Application Developer, Rational Software Architect, Eclipse
Servers: Apache Tomcat, JBoss, WebSphere
Build Tools: Maven, Ant
Version management tools: Git, IBM Rational ClearCase, SVN, Bitbucket
Tools: Data Junction, FileZilla FTP Client, WinSCP, PuTTY, SQuirreL SQL Client, Control-M
Software Development Methodology: Agile/Scrum, Waterfall
Unit Testing Frameworks: MRUnit, JUnit
PROFESSIONAL EXPERIENCE:
Confidential, Richmond/VA
Lead Hadoop Analyst
Responsibilities:
- Responsible for data loading and data processing
- Leading onsite team in an Agile development model
- Developed Spark programs to process and transform data
- Created and joined multiple data frames to process data and wrote the results to HDFS (illustrated in the sketch below)
- Developed bash scripts to create DDLs and import data from Teradata
- Involved in automation using Control-M
- Scheduled and monitored cyclic and ad hoc jobs through Control-M
- Involved in loading data from Unix file system to HDFS
- Created external and internal Hive tables.
- Involved in analyzing and validating data using Hive, Impala and HUE
- Handled Parquet file formats
- Managed and reviewed Hadoop/Spark log files
- Developed bash scripts to call Spark jobs
- Developed Impala/Hive queries to fetch data from Data Lake.
- Built and deployed application using Maven
- Involved in Agile/Scrum methodology
- Used Git/Bitbucket for version control
- Used Bamboo for CI/CD
Environment: Java 1.8, Eclipse, Scala, HDFS, Spark, Hive, Impala, Maven, HUE, Control-M, UNIX scripting, PuTTY, FileZilla, Teradata SQL Assistant, Bamboo, Jira
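Below is a minimal, illustrative sketch of the kind of Spark data-frame join and HDFS write performed in this role; the input/output paths, data set names, and the join column (account_id) are hypothetical placeholders rather than actual project details.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class JoinAndWrite {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("join-and-write")
                .getOrCreate();

        // Source data frames already landed on HDFS as Parquet (hypothetical paths)
        Dataset<Row> accounts = spark.read().parquet("hdfs:///data/raw/accounts");
        Dataset<Row> transactions = spark.read().parquet("hdfs:///data/raw/transactions");

        // Join the two data frames on a shared key and drop the duplicated key column
        Dataset<Row> joined = accounts
                .join(transactions, accounts.col("account_id").equalTo(transactions.col("account_id")))
                .drop(transactions.col("account_id"));

        // Write the transformed result back to HDFS for Hive/Impala consumption
        joined.write().mode("overwrite").parquet("hdfs:///data/curated/account_transactions");

        spark.stop();
    }
}
```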
Confidential
Lead Hadoop Developer
Responsibilities:
- Responsible for ETL (Extract Transform and Load) data to Data Lake
- Led the offshore team in an Agile development cycle
- Developed MapReduce programs to clean and transform data
- Developed Spark jobs for cleaning and loading data to HDFS for better performance
- Developed bash scripts to load data
- Involved in automation using automation console and JSON
- Involved in loading data from Unix file system to HDFS
- Created external and internal Hive tables with dynamic partitions on the Data Lake.
- Created external Hive stage tables on top of loaded data in HDFS as part of ETL
- Finally loaded data from stage tables to final tables
- Developed MapReduce programs with map-side and reduce-side joins (illustrated in the sketch below)
- Used the distributed cache for better performance
- Involved in analyzing and validating data using Hive, Impala and HUE
- Involved in optimization of Hive/Impala queries
- Handled Text and Parquet file formats
- Managed and reviewed Hadoop log files
- Developed bash scripts to call Impala/Hive queries and to transfer the results to TIBCO middleware using SCP
- Solely responsible for fetching data from a common Data Lake based on various user requests submitted through the UI
- Developed Impala/Hive queries to fetch data from Data Lake.
- Built and deployed application using Maven
- Involved in Agile/Scrum methodology
- Used Git/Bitbucket for version control
- Supported AWS migration
Environment: Java 1.7, Eclipse, MapReduce, HDFS, Spark, Hive, Impala, Maven, HUE, UNIX scripting, PuTTY, FileZilla
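The sketch below illustrates the map-side join pattern with the distributed cache referenced in this role. It is not project code: the cache file path, symlink name, and comma-separated record layout (join key in the first field) are assumptions.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MapSideJoinJob {

    public static class JoinMapper extends Mapper<LongWritable, Text, Text, Text> {
        private final Map<String, String> lookup = new HashMap<>();

        @Override
        protected void setup(Context context) throws IOException {
            // Read the small reference file shipped via the distributed cache;
            // "lookup.txt" matches the symlink fragment in the cache URI below.
            try (BufferedReader reader = new BufferedReader(new FileReader("lookup.txt"))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    String[] parts = line.split(",", 2);
                    if (parts.length == 2) {
                        lookup.put(parts[0], parts[1]);
                    }
                }
            }
        }

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split(",", 2);
            String match = fields.length == 2 ? lookup.get(fields[0]) : null;
            if (match != null) {
                // Emit the enriched record; the join happens entirely on the map side
                context.write(new Text(fields[0]), new Text(fields[1] + "," + match));
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "map-side-join");
        job.setJarByClass(MapSideJoinJob.class);
        job.setMapperClass(JoinMapper.class);
        job.setNumReduceTasks(0);                  // map-only: no reducer needed for this join
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        job.addCacheFile(new URI("/data/reference/lookup.txt#lookup.txt")); // hypothetical HDFS path
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```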
Confidential, New York
Lead Hadoop Developer
Responsibilities:
- Developed various map-only and MapReduce jobs for various rules.
- Created sequences of MapReduce jobs in which each subsequent job consumes the output of the previous job as its input (illustrated in the sketch below).
- Imported and exported data between SQL databases and HDFS using Sqoop at regular intervals
- Led the offshore team in an Agile development process
- Developed Hive and Impala queries for the application
- Code optimization and performance enhancements
- Involved in analysis, design, enhancement and maintenance of the application.
- Involved in dynamic Oozie workflow generation and executed multiple jobs in parallel using fork and join
- Prepared MRUnit test cases and involved in unit testing of the application
- Created custom partitions for better performance
- Used the distributed cache for Hive/Impala query results
- Involved in creating REST web services using JSON
- Managed and reviewed Hadoop log files
- Built and deployed application using Maven
- Involved in Agile/Scrum methodology
- Used Git/Bitbucket for version control
Environment: Hive, Impala, Sqoop, MapReduce, Oozie, Java 1.7, Maven, FileZilla, PuTTY
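The driver sketch below illustrates chaining MapReduce jobs so that the output directory of one job feeds the next, as described in this role. The HDFS paths are placeholders and Hadoop's identity Mapper/Reducer stand in for the actual rule logic.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class RuleChainDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Path input = new Path(args[0]);
        Path intermediate = new Path(args[1]); // output of job 1 and input of job 2
        Path output = new Path(args[2]);

        // Job 1: map-only pass (identity Mapper stands in for the first rule's logic)
        Job first = Job.getInstance(conf, "rule-1");
        first.setJarByClass(RuleChainDriver.class);
        first.setMapperClass(Mapper.class);
        first.setNumReduceTasks(0);
        first.setOutputKeyClass(LongWritable.class);
        first.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(first, input);
        FileOutputFormat.setOutputPath(first, intermediate);
        if (!first.waitForCompletion(true)) {
            System.exit(1); // stop the chain if the first rule fails
        }

        // Job 2: reads the intermediate directory produced by job 1
        // (identity Mapper/Reducer stand in for the second rule's logic)
        Job second = Job.getInstance(conf, "rule-2");
        second.setJarByClass(RuleChainDriver.class);
        second.setMapperClass(Mapper.class);
        second.setReducerClass(Reducer.class);
        second.setOutputKeyClass(LongWritable.class);
        second.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(second, intermediate);
        FileOutputFormat.setOutputPath(second, output);
        System.exit(second.waitForCompletion(true) ? 0 : 1);
    }
}
```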
Confidential
Java/Hadoop/Pega Developer
Responsibilities:
- Responsible for ETL (Extract, Transform, and Load) of data
- Developed various MapReduce jobs for loading data.
- Created HBase tables.
- Inserted data into HBase tables
- Fetched data from HBase tables (illustrated in the sketch below)
- Prepared MRUnit test cases and involved in unit testing of the application
- Managed and reviewed Hadoop log files
- Created and used static Oozie workflow
- Built and deployed application using Maven
- Used SVN for version control
Environment: MapReduce, HBase, Oozie, Java 1.7, Maven, FileZilla, PuTTY
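A short, illustrative sketch of the HBase table writes and reads listed in this role, using the standard HBase client API; the table name, column family, row key, and values are hypothetical.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseReadWrite {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf("customer_events"))) {

            // Insert one row: row key -> column family "d", qualifier "status"
            Put put = new Put(Bytes.toBytes("row-001"));
            put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("status"), Bytes.toBytes("ACTIVE"));
            table.put(put);

            // Fetch the same row back and read the stored value
            Result result = table.get(new Get(Bytes.toBytes("row-001")));
            byte[] status = result.getValue(Bytes.toBytes("d"), Bytes.toBytes("status"));
            System.out.println("status = " + Bytes.toString(status));
        }
    }
}
```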
Confidential, Sacramento/CA
Pega System Architect
Responsibilities:
- Participated in all phases of PERL (Pega Enterprise Licensing)
- Responsible for enabling and configuring all modules, business scenarios, and use cases for online license and certification applications across Public Health.
- Built out business rules, configured routing, and designed and implemented UIs and harnesses to support the business requirements.
- Participated in unit testing and all other phases of testing to deploy within agreed timelines
- Involved in PERL Detailed Resource Task plan creation
- Prepared the PERL application profile document.
- Prepared the test strategy for PERL.
- Actively involved in the creation of the below documents for PERL:
- Kick-off presentation
- High-level development solution document
Environment: PRPC 7.1, PostgreSQL
Confidential, Indianapolis/IN
Java Developer
Responsibilities:
- Involved in planning, estimation, task allocation, technical support and team management.
- Prepared necessary documents such as estimation, schedule, and design documents.
- Prepared test plans and involved in testing of the application.
- Constructed Java code for the enhancements of the application
- Developed PL/SQL queries and stored procedures for the application
- Analyzed requirements directly from the client.
- Wrote and modified DB2 procedures for database manipulation.
- Involved in Unit testing, System Testing, and Integration Testing.
- Used the Log4j framework for application logging (illustrated in the sketch below)
- Used Data Junction to translate data using XML mapping
- Prepared JUnit test cases and involved in unit testing of the application
- Designed application using MVC architecture
- Involved in analysis, design, enhancement and maintenance of the application.
- Prepared WPSR, metrics, and weekly task trackers.
- Code optimization and performance enhancements
- Knowledge Transfer
- Built and deployed application using Ant
- Involved in Waterfall methodology
- Used SVN for version control
Environment: Java 1.5, Ant, FileZilla, PuTTY, DB2, Data Junction, JBoss, PL/SQL, JSP, JS, HTML, SQL, WAS
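A minimal sketch of the Log4j-based application logging used in this role; the class and method names are illustrative only.

```java
import org.apache.log4j.Logger;

public class EnhancementService {
    // Class-level Log4j logger (log4j.properties supplies appenders and levels)
    private static final Logger LOG = Logger.getLogger(EnhancementService.class);

    public void process(String requestId) {
        LOG.info("Processing request " + requestId);
        try {
            // ... enhancement logic ...
        } catch (Exception e) {
            LOG.error("Failed to process request " + requestId, e);
        }
    }
}
```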