Hadoop Consultant Resume

SUMMARY

6 years of professional experience in upgrades and in providing analytical solutions to clients in the Hadoop space.
Worked with various Hadoop distributions, including Cloudera, Hortonworks, Apache and EMR
Experienced in designing and working with pipelines using Lambda Architecture
Introduced and proposed new open source tools such as the Hadoop file system, Zeppelin, Apache Spark, Maven, Apache Storm, Hue, Ambari and YARN to clients
Strong analytical skills with the ability to understand clients' business needs.
Experience working with Agile methodologies
Good knowledge of Hadoop cluster architecture and cluster monitoring.
Experience working with Hadoop ecosystem components such as HDFS, Spark, Storm, Ambari, MapReduce, HiveQL, HBase, Pig, Sqoop, Oozie, Cassandra and MongoDB, and with Big Data analytics.
Experienced in processing Big Data applications on Apache Hadoop and MapReduce Frameworks
Experience in analyzing data using Pig Latin and HiveQL
Very good experience in the complete project life cycle (design, development, testing and implementation) of client-server and web applications.
Worked with databases such as Oracle, MySQL, MongoDB and HBase.
Hands-on experience in application development using Java, Scala, SQL and UNIX shell scripting.
Expertise in creating Conceptual Data Models, Process/Data Flow Diagrams, Use Case Diagrams and State Diagrams.
Experience with web-based development using HTML

PROFESSIONAL EXPERIENCE

Confidential

Hadoop Consultant

Responsibilities:

Worked on setting up a Hadoop cluster on the Cloudera platform
Worked on cluster planning, design and setup
Installed and configured the latest version of Cloudera (5.14)
Responsible for maintaining the health and heartbeat of the nodes
Experienced in setting up Kerberos security on the Hadoop cluster
Installed, deployed and configured Cloudera parcels on the cluster
Experienced working with Sqoop, NiFi, Spark, Storm, Kafka and RabbitMQ
Ran benchmark and stress tests to analyze the performance of the cluster
Suggested new tools on the Hadoop platform that suit the client's needs
Set up and built custom pipelines for data migration onto Hadoop
Worked on building live streamed analytics for the client
Migrated terabytes of data from various tools onto HDFS
Responsible for scheduling jobs and allocating resources
Worked with Sqoop to migrate data from RDBMS

Confidential, VA

Hadoop Consultant

Responsibilities:

Worked on POCs using NiFi with MongoDB, Redshift and EMR in Hortonworks environments
Solved complex analytics and business problems of data loss and maintenance
Designed the data flow and architecture for the client team according to the need
Installed, configured and maintained HDF and HDP on production clusters
Checked and maintained the health of the cluster
Oversaw cluster security by restricting user access and enforcing two-way authentication for login
Developed scripts to automate the processors in NiFi
Migrated the data and upgraded the cluster from HDP 2.2 to 2.6
Worked with emerging tools like Ambari, NiFi, Ranger, Knox, EMR, AWS, Zeppelin, Hue, etc.
Developed, tested and deployed code in Java and Scala for Spark Streaming, Storm and HDFS applications
Used SQL queries to get data from RDBMS onto Hadoop
Wrote Hive scripts using partitioning and bucketing
Wrote and executed UNIX shell scripts for installation, configuration, maintenance and granting user permissions
Experience securing PII and sensitive data
Scheduled background jobs to migrate the data on a timely basis

Confidential, DC

Hadoop Consultant

Responsibilities:

Can easily adapt to custom tools/software and build environments
Solved complex analytics and business problems of data loss
Designed the data flow and architecture for the Analytics team according to the business need
Presented POCs using combinations of Open source Messaging brokers and Streaming applications
Presented POCs and tested the flow of Data from various Data sources to the reporting tools like Zeppelin.
Installed Zeppelin and wrote notebooks in it
Successfully passed streaming data from Spark Streaming into MongoDB
Created databases and inserted data into MongoDB as documents
Worked with Oracle database to export data onto HDFS
Worked with SQL queries to query the data
Proposed Lambda architecture for the Streaming Data
Proposed the finalized Architecture and data flow using Big data/Hadoop tools
Built the architecture in AWS and successfully tested the data Flow
Experience working with Chef recipes on the AWS machines
Designed Java and Scala application connectors for the data flow from the data sources to the messaging brokers and from the messaging brokers to the streaming applications.
Experience working with RabbitMQ and Kafka Messaging Brokers
Created custom applications for creating the queues in RabbitMQ broker
Experience working with Storm and Spark streaming applications
Experience working with Maven, GitHub and AWS
Experienced working with EMR
Created branches for deploying the code in GitHub
Experience working with the MongoDB database
Designed, tested and built POCs comparing Storm/Spark Streaming and RabbitMQ/Kafka applications.
Successfully passed data from data sources to the messaging broker and from the broker to Spark, and stored it in the MongoDB database.
Maintained and administered the Big Data clusters on AWS using Ambari.
Designed the cluster architecture.
Installed and configured the clusters using Hortonworks distribution.
Responsible for handling system failures and data loss on the cluster.
Responsible for storing logs on S3 storage.
Responsible for restricting team members' access.
Designed Spark applications to consume streaming data on the fly and perform complex computations
Worked with data transformations such as JSON to RDD and JSON to BSON
Worked closely with, and conducted meetings with, the Analytics, DevOps, Business, Database, Development and Testing teams to gather information about the data transformations
Worked with data transfers from custom tools to the Hadoop file system.
Designed an application to store the raw data on HDFS.
Worked with REST API calls to access data from different open source tools
Suggested the best practices and security settings for the data.
Identified the Roles for Analytic users and Business users to analyze the data.
Documented the steps performed in designing the client-specific applications.

Confidential, IL

Hadoop Consultant

Responsibilities:

Worked on analyzing Hadoop cluster using different big data analytic tools.
Collected and aggregated large amounts of log data using Apache Flume
Worked on debugging and performance tuning.
Created HBase tables to store various data formats of PII data
Implemented test scripts to support test driven development and continuous integration.
Worked on performance tuning Pig/Hive queries.
Imported and exported data into HDFS and Hive using Sqoop.
Experience working on processing unstructured data using Pig and Hive.
Implemented Partitioning, Dynamic Partitions, Buckets in Hive.
Experienced in running Hadoop streaming jobs to process terabytes of data.
Gained experience in managing and reviewing Hadoop log files.
Involved in scheduling Oozie workflows to run multiple Hive and Pig jobs.
Developed Pig Latin scripts to extract data from the web server output files to load into HDFS.
Extensively used Pig for data cleansing.
Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
Strong experience with Apache server configuration.
Worked extensively with Kafka and Storm.
Implemented SQL, PL/SQL Stored Procedures.
Actively involved in code review and bug fixing for improving the performance.
Developed screens using JSP, DHTML, CSS, AJAX, JavaScript, Struts, Spring, Java and XML

Confidential, TX

Hadoop Consultant

Responsibilities:

Worked with business partners to gather business requirements.
Responsible for building scalable distributed data solutions using Hadoop.
Developed Pig Latin scripts to extract the data from the web server output files to load into HDFS.
Developed Pig UDFs to pre-process the data for analysis.
Implemented multiple MapReduce jobs in Java for data cleansing and pre-processing.
Developed job workflows in Oozie to automate the tasks
Involved in creating Hive partitioning and bucketing.
Performed transformations using Hive, MapReduce, loaded data into HDFS and extracted data from Teradata into HDFS using Sqoop.
Configured Sqoop and developed scripts to extract data from MySQL Server into HDFS.
Expertise in exporting analyzed data to relational databases using Sqoop.
Implemented fair scheduling on the JobTracker to share the resources of the cluster
Maintained Cluster co-ordination services through ZooKeeper.
Responsible for running Hadoop streaming jobs to process terabytes of data.
Managed and reviewed Hadoop log files

Confidential, TN

Hadoop Consultant

Responsibilities:

Installed and configured Hadoop MapReduce, HDFS.
Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
Imported and exported data into HDFS and Hive using Sqoop.
Experienced in defining job flows.
Experienced in running Hadoop streaming jobs to process terabytes of data.
Loaded and transformed large sets of structured, semi-structured and unstructured data.
Responsible for managing data coming from different sources.
Supported MapReduce programs running on the cluster.
Involved in loading data from different file systems onto HDFS.
Involved in creating Hive tables, loading them with data and writing Hive queries that run internally as MapReduce jobs.
