Hadoop Senior Consultant Resume
Ashburn, VA
PROFESSIONAL SUMMARY:
- Over 9 years of IT experience across multiple domains with the Hadoop ecosystem, Core Java, and SQL/PL-SQL technologies, with hands-on project experience in verticals including financial services and trade compliance.
- Extensive hands-on experience in Spark Core, Spark SQL, Spark Streaming, and Spark machine learning using the Scala and Python programming languages.
- Solid understanding of RDD operations in Apache Spark, i.e., transformations and actions, persistence (caching), accumulators, broadcast variables, and broadcast optimization.
- In-depth understanding of Apache Spark job execution components such as the DAG, lineage graph, DAG scheduler, task scheduler, stages, and tasks.
- Experience in exposing Apache Spark as web services.
- Good understanding of the driver, executors, and the Spark web UI.
- Experience in submitting Apache Spark and MapReduce jobs to YARN.
- Experience in real-time processing using Apache Spark with Flume and Kafka.
- Migrated Python machine learning modules to scalable, high-performance, fault-tolerant distributed systems such as Apache Spark.
- Strong experience in Spark SQL UDFs, Hive UDFs, and Spark SQL performance tuning. Hands-on experience working with input file formats such as ORC, Parquet, JSON, and Avro.
- Good expertise in coding in Python, Scala, and Java.
- Good understanding of the MapReduce framework architectures (MRv1 and YARN).
- Good knowledge and understanding of Hadoop architecture and the various components of the Hadoop ecosystem: HDFS, MapReduce, Pig, Sqoop, and Hive.
- Developed various MapReduce applications to perform ETL workloads on metadata and terabytes of data.
- Hands-on experience in cleansing semi-structured and unstructured data using Pig Latin scripts.
- Good working knowledge of creating Hive tables and using HQL for data analysis to meet business requirements.
- Experience in managing and reviewing Hadoop log files.
- Good working experience with NoSQL databases such as HBase, Cassandra, and MongoDB.
- Responsible for managing data coming from different sources; involved in HDFS maintenance and loading of structured and unstructured data.
- Experience in importing and exporting data using Sqoop between HDFS and relational database systems/mainframes.
- Experience in working with Flume to load log data from multiple sources directly into HDFS.
- Experience in scheduling time-driven and data-driven Oozie workflows.
- Used ZooKeeper on a distributed HBase cluster for configuration and management.
- Worked with the Avro data serialization system.
- Experience in fine-tuning MapReduce jobs for better scalability and performance.
- Experience in writing shell scripts to dump shared data from landing zones to HDFS.
- Experience in performance tuning the Hadoop cluster by gathering and analyzing the existing infrastructure.
- Expertise in client-side design and validation using HTML and JavaScript.
- Excellent communication and interpersonal skills; detail-oriented, analytical, time-bound, responsible team player with the ability to coordinate in a team environment, a high degree of self-motivation, and quick learning.
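The transformations-and-actions model noted above can be sketched in plain Python, using lazy generators to stand in for Spark's deferred RDD evaluation (a hedged illustration only; `spark_map`, `spark_filter`, and `collect` are made-up names, not Spark APIs, and no Spark dependency is used):

```python
# Sketch of Spark-style lazy transformations vs. eager actions using plain
# Python generators. Names are illustrative, not real Spark APIs.

def spark_map(func, data):
    """Lazy 'transformation': builds the pipeline, computes nothing yet."""
    return (func(x) for x in data)

def spark_filter(pred, data):
    """Lazy 'transformation': extends the lineage, still defers work."""
    return (x for x in data if pred(x))

def collect(data):
    """Eager 'action': forces evaluation of the whole pipeline."""
    return list(data)

records = range(10)
pipeline = spark_filter(lambda x: x % 2 == 0, spark_map(lambda x: x * x, records))
result = collect(pipeline)   # evaluation happens only here, as with rdd.collect()
```

As in Spark, chaining the first two calls does no work; only the final action materializes results.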
TECHNICAL SKILLS:
Confidential Frameworks: Hadoop, Hive, Kafka, AWS, Cassandra, HBase, Flume, Pig, Sqoop, MapReduce, Cloudera, MongoDB, Spark, Scala
Confidential distributions: Cloudera, Amazon EMR
Programming languages: Oracle PL/SQL, Core Java, Scala, Python, Shell Scripting
Operating Systems: Windows, Linux (Ubuntu)
Databases: Oracle 10g, MySQL, Netezza, SQL Server, Teradata, Postgres
Development Tools: Eclipse, PL/SQL Developer, Toad, PuTTY
Development methodologies: Agile, Waterfall
Messaging Services: ActiveMQ, Kafka, JMS
Version Control: PVCS, SVN, CVS, Git
Analytics: Tableau, SPSS, SAS EM, SAS JMP
PROFESSIONAL EXPERIENCE:
Confidential, Ashburn, VA
Hadoop Senior Consultant
Responsibilities:
- Write Spark jobs to read data into a DataFrame and apply various transformations and actions to filter and shape the data into the required format.
- Build scalable frameworks using Spark's advanced APIs.
- Write Spark jobs to write final data into HDFS and RDBMS.
- Manage and monitor the Hadoop cluster.
- Manage the ETL team and motivate/educate them to learn and work efficiently.
- Migrated Hive queries into Spark SQL to improve performance.
- Executed Spark RDD transformations and actions as per business analysis needs.
- Imported data from MySQL to HDFS using Sqoop and managed Hadoop log files.
- Fully automated job scheduling, monitoring, and cluster management without human intervention.
- Used Sqoop to import and export data among HDFS, MySQL, and Hive.
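The read-filter-transform-write pattern in the responsibilities above can be sketched in plain Python (the actual jobs used Spark DataFrames; the column names and validity rule below are invented for illustration):

```python
import csv
import io

# Minimal sketch of an ETL pass: read rows, keep valid ones, reshape them,
# and write the result. Column names ("id", "amount", "status") are made up.

raw = "id,amount,status\n1,250,OK\n2,-30,BAD\n3,75,OK\n"

rows = list(csv.DictReader(io.StringIO(raw)))

# transformation: filter out bad rows and cast into the required format
clean = [{"id": int(r["id"]), "amount": float(r["amount"])}
         for r in rows if r["status"] == "OK"]

# write phase: serialize the cleaned rows (stand-in for HDFS/RDBMS output)
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["id", "amount"])
writer.writeheader()
writer.writerows(clean)
final = out.getvalue()
```

In Spark the same shape becomes `spark.read.csv(...)`, a `filter`/`select` chain, and a `write` call, but the pipeline structure is identical.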
Environment: Linux, Hadoop, Spark Core, Spark SQL, Scala, Hive
Confidential (CITI Bank), USA, Jul 2015 to Jun 2017
Hadoop Consultant
Responsibilities:
- Load and transform large sets of structured, semi-structured, and unstructured data coming from different source systems and a variety of portfolios.
- Used Spark DataFrames to read text, CSV, and image data from HDFS, S3, and Hive.
- Worked closely with data scientists to build predictive models using Spark.
- Cleaned input text data using Spark machine learning feature extraction APIs.
- Migrated Hive queries into Spark SQL to improve performance.
- Involved in migrating code from Hive to Apache Spark and Scala using Spark SQL and RDDs.
- Trained models using historical data stored in HDFS and Amazon S3.
- Used Spark Streaming to load the trained model and predict on real-time data from Kafka.
- Executed Spark RDD transformations and actions as per business analysis needs.
- Imported data from MySQL to HDFS using Sqoop and managed Hadoop log files.
- Fully automated job scheduling, monitoring, and cluster management without human intervention.
- Created Hive tables and involved in metadata loading and writing Hive UDFs.
- Used Sqoop to import and export data among HDFS, MySQL, and Hive.
- Migrated Python scikit-learn machine learning models to DataFrame-based Spark ML algorithms.
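The streaming-scoring pattern above (a trained model applied to messages arriving from Kafka) can be sketched in plain Python; the in-memory queue stands in for a Kafka topic, and the threshold function is a stand-in for the real Spark ML model, not the actual one used:

```python
from queue import Queue

# Hedged sketch: messages arrive on a queue (stand-in for a Kafka topic) and
# a previously trained model scores each one. The threshold "model" below is
# illustrative only.

def predict(features, threshold=0.5):
    # stand-in for model.transform(): score is the mean of the feature vector
    score = sum(features) / len(features)
    return 1 if score >= threshold else 0

topic = Queue()
for msg in ([0.9, 0.8], [0.1, 0.2], [0.6, 0.5]):
    topic.put(msg)

predictions = []
while not topic.empty():          # micro-batch style consumption
    predictions.append(predict(topic.get()))
```

In Spark Streaming the queue becomes a Kafka DStream/structured stream and `predict` becomes the loaded model's `transform`, but the consume-then-score loop is the same.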
Environment: Spark Core, Spark SQL, Spark Streaming, Spark machine learning, Scala, DataFrames, Datasets, AWS, Kafka, Hive, Sqoop, HBase, GitHub, Webflow, Amazon S3, Amazon EMR.
Confidential
Hadoop Associate Consultant
Responsibilities:
- Created various MapReduce jobs for performing ETL transformations on the transactional and application-specific data sources.
- Imported data from our relational data stores to Hadoop using Sqoop.
- Wrote Pig scripts and executed them using the Grunt shell.
- Worked on converting existing MapReduce batch applications for better performance.
- Confidential analysis using Pig and user-defined functions (UDFs).
- Worked on loading tables into Impala for faster retrieval using different file formats.
- The system was initially developed in Java; the Java filtering program was restructured to put the business rule engine in a JAR that can be called from both Java and Hadoop.
- Created reports and dashboards using structured and unstructured data.
- Upgraded the operating system and/or Hadoop distribution as new versions were released, using Puppet.
- Performed joins, group-bys, and other operations in MapReduce using Java and Pig.
- Processed the output from Pig and Hive and formatted it before writing it to the Hadoop output file.
- Used Hive definitions to map the output file to tables.
- Set up and benchmarked Hadoop/HBase clusters for internal use.
- Wrote data ingesters and MapReduce programs.
- Reviewed HDFS usage and system design for future scalability and fault tolerance.
- Wrote MapReduce/HBase jobs.
- Worked with HBase, a NoSQL database.
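The map/shuffle/reduce flow behind the MapReduce jobs above can be sketched in plain Python, using word count as the canonical example (a hedged illustration; the real jobs were Java/Pig, and the function names here are invented):

```python
from collections import defaultdict

# Plain-Python sketch of the three MapReduce phases.

def map_phase(line):
    """Mapper: emit a (key, value) pair per word."""
    return [(word, 1) for word in line.split()]

def shuffle(pairs):
    """Shuffle: group all values by key, as the framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Reducer: aggregate the grouped values per key."""
    return {key: sum(values) for key, values in grouped.items()}

lines = ["big data big", "data pipelines"]
pairs = [p for line in lines for p in map_phase(line)]
counts = reduce_phase(shuffle(pairs))
```

Joins and group-bys in MapReduce follow the same shape: the mapper emits the join/group key, the shuffle co-locates matching records, and the reducer combines them.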
Environment: Apache Hadoop 2.x, MapReduce, HDFS, Hive, Pig, HBase, Sqoop, Flume, Linux, Java 7, Eclipse, NoSQL.
Confidential
Hadoop Associate Consultant
Responsibilities:
- Developed Confidential solutions that enabled the business and technology teams to make data-driven decisions on the best ways to acquire customers and provide them with business solutions.
- Installed and configured Apache Hadoop, Hive, and HBase.
- Worked on a Hortonworks cluster, which was used to process the Confidential.
- Developed multiple MapReduce jobs in Java for data cleaning and pre-processing.
- Used Sqoop to pull data into HDFS from RDBMS and vice versa.
- Defined workflows using Oozie.
- Created partitions on Hive tables and analyzed this data to compute various metrics for reporting.
- Created data models for Hive tables.
- Managed and reviewed Hadoop log files.
- Used Pig as an ETL tool for transformations, joins, and pre-aggregations before loading data onto HDFS.
- Worked on large sets of structured, semi-structured, and unstructured data.
- Responsible for managing data coming from different sources.
- Installed and configured Hive and developed Hive UDFs to extend Hive's core functionality.
- Responsible for loading data from UNIX file systems to HDFS.
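The Hive partitioning noted above can be sketched in plain Python: rows are grouped by a partition key so a query scans only one group, the way Hive prunes partition directories (column names below are invented for illustration):

```python
from collections import defaultdict

# Sketch of partitioning by a key (like a Hive table PARTITIONED BY (dt)).
# Each distinct key value gets its own bucket, mirroring one HDFS directory
# per partition.

rows = [
    {"dt": "2016-01-01", "clicks": 10},
    {"dt": "2016-01-02", "clicks": 7},
    {"dt": "2016-01-01", "clicks": 5},
]

partitions = defaultdict(list)
for row in rows:
    partitions[row["dt"]].append(row)   # one "directory" per partition value

# metric computed by scanning only one partition (partition pruning)
jan1_clicks = sum(r["clicks"] for r in partitions["2016-01-01"])
```

A Hive query with `WHERE dt = '2016-01-01'` touches only that partition's files, which is what makes partitioned reporting queries cheap.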
Environment: Apache Hadoop 2.x, MapReduce, HDFS, Hive, HBase, Pig, Oozie, Linux, Java 7, Eclipse.
Confidential
Associate Consultant
Responsibilities:
- Full life cycle experience including requirements analysis, high level design, detailed design, data model design, coding, testing and creation of functional and technical design documentation.
- Extensively involved in writing stored procedures, functions, and packages as per the business requirements.
- Redesigned existing procedures and packages to enhance performance.
- Debugged Pro*C and PL/SQL code blocks of stored procedures.
- Generated ad-hoc reports using SQL and stored procedures.
- Involved in continuous enhancements and fixing of production problems.
- Analyzed CRs raised from UAT and production.
- Coordinated with the UAT and production teams as well as with the users.
- Used bulk collections for better performance and easier retrieval of data by reducing context switching between the SQL and PL/SQL engines.
- Wrote SQL, PL/SQL, and SQL*Plus programs to retrieve data using cursors and exception handling.
- Involved in SIT and UAT support for solving critical issues.
- Involved in the requirements, design, coding, and testing phases of the functionality.
- Created and maintained database objects.
- Performed end-to-end functional testing for the entire application.
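The bulk-collection idea above (batching rows to cut per-row engine round trips) has a direct analogue in Python's standard `sqlite3` module, sketched here as an illustration; in PL/SQL the same role is played by BULK COLLECT / FORALL:

```python
import sqlite3

# Batching analogue of PL/SQL bulk collections: one executemany() call
# inserts all rows instead of one execute() per row, reducing per-row
# overhead between the driver and the SQL engine.

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")

rows = [(i, i * 10.0) for i in range(1, 6)]
conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)  # one batched call
conn.commit()

total = conn.execute("SELECT SUM(amount) FROM orders").fetchone()[0]
```

The design point is the same in both environments: crossing the SQL/procedural boundary once per batch is far cheaper than once per row.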
Environment: Oracle 11g, SQL, PL/SQL, Pro*C, PuTTY, Sun Solaris.
Confidential
Software Engineer
Responsibilities:
- Involved in writing stored procedures, functions, and packages as per the business requirements.
- Developed Pro*C programs for flat file generation.
- Worked on Requests for Change (RFC) and Production Problem Resolutions (PPR).
- Provided support across the various phases of the project.
- Prepared and executed unit test cases.
Environment: Oracle 11g, SQL, PL/SQL, Pro*C, PuTTY, Sun Solaris