Lead Hadoop Analyst Resume

Richmond, VA

PROFESSIONAL SUMMARY:

  • Over 7 years of progressive experience in analysis, design, development, and testing of Big Data, Core Java/J2EE, and Scala applications.
  • 3+ years of hands-on experience building Big Data applications on Hadoop.
  • In-depth understanding of Hadoop architecture and its components, such as HDFS and the MapReduce programming paradigm.
  • Extensive knowledge of Hadoop ecosystem components including MapReduce, Spark, Hive, Impala, Sqoop, HBase, Oozie, and HUE.
  • Excellent understanding and knowledge of the NoSQL database HBase.
  • Good experience in scheduling jobs using Oozie.
  • Experience with different file formats (text, Parquet, JSON, XML).
  • Worked with version control tools such as Git, Bitbucket, SVN, and ClearCase.
  • Responsible for managing data coming from various sources.
  • Involved in loading data from UNIX file systems to HDFS.
  • Developed scripts to schedule various Hadoop jobs.
  • Experience with Cloudera Distribution of Hadoop (CDH) clusters.
  • Sound exposure to the Health Care and Retail domains and to ETL and licensing frameworks, with strong analytical, design, and problem-solving skills.
  • Worked in Agile/Scrum and Waterfall-based delivery environments.
  • Experience in managing and reviewing Hadoop log files.
  • Created MRUnit and JUnit test cases to test MapReduce and Java applications.
  • Good hands-on experience with Apache Tomcat, JBoss, WebSphere, Data Junction, FileZilla FTP Client, WinSCP, PuTTY, SQuirreL SQL Client, Microsoft SQL Server, and IBM DB2.
  • Excellent communication and presentation skills.
  • Basic knowledge of Python, Scala, StreamSets, and AWS.
  • Worked as onsite and offshore team lead.

TECHNICAL SKILLS:

Big Data Tools: Hadoop, MapReduce, Spark, Hive, Impala, HBase, Oozie, Sqoop, HUE

Languages: Core Java/J2EE, UNIX scripting, SQL, PL/SQL

Databases: Microsoft SQL Server, IBM DB2

Frameworks: MVC, Spring MVC, ETL, Pega PRPC Certification and Licensing Framework (CLF)

BPM Tools: PEGA PRPC 7.1, PEGA PRPC 6.3

Development Tools: IBM Rational Application Developer, Rational Software Architect, Eclipse

Servers: Apache Tomcat, JBoss, WebSphere

Build Tools: Maven, Ant

Version Control Tools: Git, IBM Rational ClearCase, SVN, Bitbucket

Tools: Data Junction, FileZilla FTP Client, WinSCP, PuTTY, SQuirreL SQL Client, Control-M

Software Development Methodology: Agile/Scrum, Waterfall

Unit Testing Frameworks: MRUnit, JUnit

PROFESSIONAL EXPERIENCE:

Confidential, Richmond, VA

Lead Hadoop Analyst

Responsibilities:

  • Responsible for data loading and data processing
  • Led the onsite team in an Agile development model
  • Developed Spark programs to process and transform data
  • Created DataFrames and joined multiple DataFrames to process data, writing the results to HDFS (see the sketch after this list)
  • Developed bash scripts to create DDLs and import data from Teradata
  • Involved in automation using Control-M
  • Scheduled and monitored cyclic and ad hoc jobs through Control-M
  • Involved in loading data from the Unix file system to HDFS
  • Created external and internal Hive tables
  • Involved in analyzing and validating data using Hive, Impala and HUE
  • Handled Parquet file formats
  • Managed and reviewed Hadoop/Spark log files
  • Developed bash scripts to call Spark jobs
  • Developed Impala/Hive queries to fetch data from Data Lake.
  • Built and deployed application using Maven
  • Involved in Agile/Scrum methodology
  • Used Git/Bitbucket for version control
  • Used Bamboo for CI/CD
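
A minimal sketch of the kind of Spark DataFrame join and HDFS write described in this list, fitting the Java 1.8/Spark stack noted in the environment below; the dataset names, join key, and HDFS paths are hypothetical placeholders, not the project's actual schema:

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SaveMode;
import org.apache.spark.sql.SparkSession;

public class AccountTransactionJoin {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("AccountTransactionJoin")
                .getOrCreate();

        // Read the source datasets from HDFS (Parquet, the format handled on this project)
        Dataset<Row> accounts = spark.read().parquet("hdfs:///data/raw/accounts");
        Dataset<Row> transactions = spark.read().parquet("hdfs:///data/raw/transactions");

        // Join the two DataFrames on the (hypothetical) account key
        Dataset<Row> joined = transactions.join(
                accounts, transactions.col("account_id").equalTo(accounts.col("id")));

        // Write the transformed result back to HDFS
        joined.write()
                .mode(SaveMode.Overwrite)
                .parquet("hdfs:///data/curated/account_transactions");

        spark.stop();
    }
}
```

In practice a job like this would be packaged with Maven and invoked through the bash wrapper scripts and Control-M scheduling mentioned above.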

Environment: Java 1.8, Eclipse, Scala, HDFS, Spark, Hive, Impala, Maven, HUE, Control-M, UNIX scripting, PuTTY, FileZilla, Teradata SQL Assistant, Bamboo, Jira

Confidential

Lead Hadoop Developer

Responsibilities:

  • Responsible for ETL (Extract, Transform, and Load) of data into the Data Lake
  • Led the offshore team in an Agile development cycle
  • Developed MapReduce programs to clean and transform data
  • Developed Spark jobs for cleaning and loading data to HDFS for better performance
  • Developed bash scripts to load data
  • Involved in automation using an automation console and JSON
  • Involved in loading data from Unix file system to HDFS
  • Created external and internal Hive tables with dynamic partitions on the Data Lake
  • Created external Hive stage tables on top of the data loaded into HDFS as part of ETL
  • Loaded data from the stage tables into the final tables (see the sketch after this list)
  • Developed MapReduce programs with map-side and reduce-side joins
  • Used the distributed cache for better performance
  • Involved in analyzing and validating data using Hive, Impala and HUE
  • Involved in optimization of Hive/Impala queries
  • Handled Text and Parquet file formats
  • Managed and reviewed Hadoop log files
  • Developed bash scripts to run Impala/Hive queries and push the results to TIBCO middleware using SCP
  • Solely responsible for fetching data from a common Data Lake based on user requests submitted through the UI
  • Developed Impala/Hive queries to fetch data from Data Lake.
  • Built and deployed application using Maven
  • Involved in Agile/Scrum methodology
  • Used Git/Bitbucket for version control
  • Supported AWS migration
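
A minimal sketch of the stage-to-final Hive load with dynamic partitions described in this list, expressed here through Spark's SQL interface (Spark and Hive both appear in this project's environment); the table names, columns, and HDFS location are hypothetical placeholders:

```java
import org.apache.spark.sql.SparkSession;

public class StageToFinalLoad {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("StageToFinalLoad")
                .enableHiveSupport()
                .getOrCreate();

        // External stage table defined on top of files already landed in HDFS
        spark.sql("CREATE EXTERNAL TABLE IF NOT EXISTS stg_orders ("
                + " order_id STRING, amount DOUBLE, order_date STRING)"
                + " STORED AS PARQUET"
                + " LOCATION 'hdfs:///data/stage/orders'");

        // Final table, partitioned so queries can prune by date
        spark.sql("CREATE TABLE IF NOT EXISTS final_orders ("
                + " order_id STRING, amount DOUBLE)"
                + " PARTITIONED BY (order_date STRING)"
                + " STORED AS PARQUET");

        // Dynamic partitioning must be enabled before the INSERT
        spark.sql("SET hive.exec.dynamic.partition=true");
        spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict");

        // Load from the stage table into the partitioned final table
        spark.sql("INSERT OVERWRITE TABLE final_orders PARTITION (order_date)"
                + " SELECT order_id, amount, order_date FROM stg_orders");

        spark.stop();
    }
}
```

The same DDL and INSERT statements could equally be run directly in Hive or Impala through the bash wrapper scripts mentioned above.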

Environment: Java 1.7, Eclipse, MapReduce, HDFS, Spark, Hive, Impala, Maven, HUE, UNIX scripting, PuTTY, FileZilla

Confidential, New York

Lead Hadoop Developer

Responsibilities:

  • Developed map-only and full MapReduce jobs for various rules.
  • Created sequences of chained MapReduce jobs in which each subsequent job takes the output of the previous job as its input (see the sketch after this list).
  • Imported and exported data between the SQL database and HDFS using Sqoop at regular intervals
  • Led the offshore team in an Agile development process
  • Developed Hive and Impala queries for the application
  • Code optimization and performance enhancements
  • Involved in analysis, design, enhancement and maintenance of the application.
  • Involved in dynamic Oozie workflow generation and executed multiple jobs in parallel using fork and join
  • Prepared MRUnit test cases and involved in unit testing of the application
  • Created custom partitions for better performance
  • Used the distributed cache for Hive/Impala query results
  • Involved in creating REST web services using JSON
  • Managed and reviewed Hadoop log files
  • Built and deployed application using Maven
  • Involved in Agile/Scrum methodology
  • Used Git/Bitbucket for version control
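
A minimal sketch of the chained MapReduce pattern described in this list, where each job reads the output path of the previous one; identity Mapper/Reducer classes stand in for the actual rule logic, and the paths and job names are hypothetical:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class ChainedJobsDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Path input = new Path(args[0]);
        Path intermediate = new Path(args[1]);   // output of job 1, input of job 2
        Path output = new Path(args[2]);

        // Job 1: first rule step (identity Mapper/Reducer as placeholders)
        Job job1 = Job.getInstance(conf, "rule-step-1");
        job1.setJarByClass(ChainedJobsDriver.class);
        job1.setMapperClass(Mapper.class);
        job1.setReducerClass(Reducer.class);
        job1.setOutputKeyClass(LongWritable.class);
        job1.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job1, input);
        FileOutputFormat.setOutputPath(job1, intermediate);
        if (!job1.waitForCompletion(true)) {
            System.exit(1);                      // stop the chain if the first job fails
        }

        // Job 2: consumes the output of job 1 as its input
        Job job2 = Job.getInstance(conf, "rule-step-2");
        job2.setJarByClass(ChainedJobsDriver.class);
        job2.setMapperClass(Mapper.class);
        job2.setReducerClass(Reducer.class);
        job2.setOutputKeyClass(LongWritable.class);
        job2.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job2, intermediate);
        FileOutputFormat.setOutputPath(job2, output);
        System.exit(job2.waitForCompletion(true) ? 0 : 1);
    }
}
```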

Environment: Hive, Impala, Sqoop, MapReduce, Oozie, Java 1.7, Maven, FileZilla, PuTTY

Confidential

Java/Hadoop/Pega Developer

Responsibilities:

  • Responsible for ETL of data
  • Developed various MapReduce jobs for loading data.
  • Created HBase tables.
  • Inserted data into HBase tables
  • Fetched data from HBase tables (see the sketch after this list)
  • Prepared MRUnit test cases and involved in unit testing of the application
  • Managed and reviewed Hadoop log files
  • Created and used static Oozie workflow
  • Built and deployed application using Maven
  • Used SVN for version control
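
A minimal sketch of the HBase writes and reads described in this list, using the HBase Java client's Connection/Table API; the table name, row key, column family, and values are hypothetical placeholders:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseReadWrite {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf("customer_events"))) {

            // Insert one row: row key plus a single column in the "cf" column family
            Put put = new Put(Bytes.toBytes("cust-1001"));
            put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("status"), Bytes.toBytes("ACTIVE"));
            table.put(put);

            // Fetch the same row back and read the column value
            Get get = new Get(Bytes.toBytes("cust-1001"));
            Result result = table.get(get);
            String status = Bytes.toString(
                    result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("status")));
            System.out.println("status = " + status);
        }
    }
}
```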

Environment: MapReduce, HBase, Oozie, Java 1.7, Maven, FileZilla, PuTTY

Confidential, Sacramento, CA

Pega System Architect

Responsibilities:

  • Participated in all phases of PERL (Pega Enterprise Licensing)
  • Responsible for enabling and configuring all modules, business scenarios, and use cases for online license and certification applications across Public Health.
  • Built out business rules, configured routing, and designed and implemented UIs and harnesses to support the business requirements.
  • Participated in unit testing and all other phases of testing to deploy within the agreed timelines
  • Involved in creating the PERL detailed resource task plan
  • Prepared the PERL application profile document.
  • Prepared the test strategy for PERL.
  • Actively involved in creating the following documents for PERL:
  • Kick-off presentation
  • High-level development solution document

Environment: PRPC 7.1, PostgreSQL

Confidential, Indianapolis, IN

Java Developer

Responsibilities:

  • Involved in planning, estimation, task allocation, technical support and team management.
  • Prepared necessary documents like Estimation, schedule and design.
  • Prepared test plans and involved in testing of the application.
  • Constructed Java code for the enhancements of the application
  • Developed PL/SQL queries and stored procedures for the application
  • Analyzed requirements directly from the client.
  • Wrote and modified DB2 stored procedures for database manipulation.
  • Involved in Unit testing, System Testing, and Integration Testing.
  • Used the Log4j framework for application logging (see the sketch after this list)
  • Used Data Junction to translate data using XML mapping
  • Prepared JUnit test cases and involved in unit testing of the application
  • Designed application using MVC architecture
  • Involved in analysis, design, enhancement and maintenance of the application.
  • Prepared WPSR, metrics, and weekly task trackers.
  • Code optimization and performance enhancements
  • Conducted knowledge transfer
  • Built and deployed application using Ant
  • Involved in Waterfall methodology
  • Used SVN for version control
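
A minimal sketch of the Log4j usage mentioned in this list (Log4j 1.x, consistent with the Java 1.5 stack in the environment below); the class and method names are hypothetical:

```java
import org.apache.log4j.Logger;

public class EnhancementService {
    private static final Logger LOG = Logger.getLogger(EnhancementService.class);

    public void process(String requestId) {
        LOG.info("Processing request " + requestId);
        try {
            // ... enhancement logic would go here ...
            LOG.debug("Request " + requestId + " processed successfully");
        } catch (Exception e) {
            LOG.error("Failed to process request " + requestId, e);
        }
    }
}
```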

Environment: Java 1.5, Ant, FileZilla, PuTTY, DB2, Data Junction, JBoss, PL/SQL, JSP, JavaScript, HTML, SQL, WAS
