Big Data Lead Consultant Resume Profile

San Jose, CA

Professional Summary

  • Experience in designing and implementing large-scale data processing, data storage, and distributed data systems.
  • Expertise in working with large data sets using the Hadoop Distributed File System (HDFS), MapReduce, Hive, Pig, Sqoop, Flume, and Oozie to build robust Big Data solutions.
  • Proficient in building Hive and Pig scripts on data captured via Sqoop from relational databases that provide SQL interfaces.
  • Strong understanding of Big Data Analytics platforms and ETL in the context of Big Data.
  • Designed scalable Hadoop clusters and tuned various configuration parameters to arrive at values for optimal cluster performance.
  • Expertise in data cleansing and data mining; proficient in building Hive and Pig scripts to load datasets into Hive for ETL (Extract, Transform, Load) operations.
  • Developed a free-text search solution with Hadoop and Solr.
  • Strong technical background with the ability to perform business analysis, understand business requirements, and contribute extensively to client deliverables.
  • Involved in technical and functional documentation regardless of project requirements.
  • Extensive ETL development and deployment experience in the Banking, Insurance, and Infrastructure domains, with a strong understanding of data analytics, loading data using Informatica and SSIS.
  • Flexible, highly motivated, and effective leader with excellent analytical, troubleshooting, and problem-solving skills for developing creative solutions to challenging client requirements.

Technical Skills

  • Hadoop: HDFS, MapReduce, Hive, Pig, Flume, Mahout, Avro, Oozie, ZooKeeper, and Solr
  • Dev. Tools: CentOS, Ubuntu, Linux, Eclipse, Xcode
  • Languages : Java, Python, C / C
  • ETL Tools: SSIS 2012/2008/2005, Informatica 7.1
  • Databases: Oracle 11g/8i, SQL Server 2012/2008
  • Knowledge of NoSQL: HBase, MongoDB, and Cassandra
  • Knowledge of Hortonworks, Cloudera, Tez, Ambari, Spark, and Storm

Work Experience


Role: Big Data Lead Consultant

Environment: Hadoop, MapReduce, Flume, Hive, Pig, Sqoop, Oozie, Ubuntu, and CentOS


  • Coordinated with business stakeholders to understand analytics requirements.
  • Extracted log files from different sources and loaded them into HDFS for analysis using Flume.
  • Created Pig scripts implementing the required logic and computations.
  • Automated jobs that pull data from different sources and load it into HDFS tables using Oozie workflows.
  • Interfaced with SMEs, the analytics team, account managers, and domain architects to review the to-be-developed solution.
  • Converted the high-level solution into deliverables by generating a controllable and manageable activity list.
  • Collaborated with the infrastructure, network, database, application, and BI teams to ensure data quality and availability.

Use Case - 1

  • Integrated home loan credit data by loading it into staging using ETL.
  • Fetched the ETL staging data from the RDBMS and moved it into HDFS using Sqoop; ran the automated process daily using Oozie workflows.
  • Created Pig tasks and implemented the logic to load the data and store the refined data in the EDW.
  • Automated jobs that pull data from HDFS into Pig using Oozie workflows.
  • Involved in the administration of Hadoop, Hive, and Pig.

Use Case - 2

  • Used Sqoop to import the data into HDFS.
  • Set up HBase tables to load large sets of structured and semi-structured data.
  • Implemented the business logic and created reports for the BI team.
  • Moved claim data from the source system into Hadoop daily using Sqoop with scheduled Oozie workflows.
  • Collaborated with the infrastructure, network, database, application and BI teams to ensure data quality and availability.
  • Implemented business logic for categorizing claims using Pig.
  • Moved the refined data from HDFS to a relational database daily using Sqoop with scheduled Oozie workflows.


Role: ETL Database Architect

Environment: SSIS, SQL Server 2008/2005, Remedy 7.6.04, ESM Lotus Notes, Cognos 8.3, and SSRS


  • Performed data mapping and analyzed service levels across various ITIL areas against business requirements.
  • Responsible for DBA support to multiple regions globally.
  • Integrated data from various sources such as Remedy, Avaya, CMDB, Survey Central, Lotus Notes, and flat files.
  • Implemented and maintained database security: created and maintained users and roles, and assigned privileges.
  • Created SSIS packages covering data loading, package execution scheduling, data error logging, and triggering of automated emails.
  • Deployed SSIS packages, including a master package that executes child packages. The packages use a variety of tasks and transformations such as Execute SQL Task, Script Task, Execute Package Task, File Connection, Derived Column, and Foreach Loop.
  • Coordinated with customers, other teams, and vendors to resolve issues.
  • Participated in the detailed requirement analysis for the design of data marts and schemas.
  • Rebuilt and monitored indexes at regular intervals for better performance.
  • Implemented numerous components of end-to-end BI solutions, including data marts, ETL design, and SQL queries.
  • Responsible for monitoring and making recommendations for performance improvement in hosted databases. This involved index creation, removal, modification, file group modifications and adding scheduled jobs to re-index and update statistics in databases.
  • Designed a metric summary page that provides a high-level view of account health. Graphical visualizations cover Incident Management, Problem Management, Change Management, Service Desk, and Configuration Management.
