Hadoop Consultant Resume

SUMMARY

  • 6 years of professional experience performing upgrades and providing analytical solutions to clients in the Hadoop space.
  • Worked with various Hadoop distributions, including Cloudera, Hortonworks, Apache and EMR
  • Experienced in designing and working with pipelines based on the Lambda Architecture
  • Proposed and introduced open source tools such as HDFS, Zeppelin, Apache Spark, Maven, Apache Storm, Hue, Ambari and YARN to clients
  • Strong analytical skills with the ability to understand clients' business needs.
  • Experience working with Agile methodologies
  • Good knowledge of Hadoop cluster architecture and cluster monitoring.
  • Experience working with Hadoop ecosystem components and related tools such as HDFS, Spark, Storm, Ambari, MapReduce, HiveQL, HBase, Pig, Sqoop, Oozie, Cassandra and MongoDB, and with Big Data analytics.
  • Experienced in running Big Data applications on the Apache Hadoop and MapReduce frameworks
  • Experience in analyzing data using Pig Latin and HiveQL
  • Strong experience across the complete project life cycle (design, development, testing and implementation) of client-server and web applications.
  • Worked with databases such as Oracle, MySQL, MongoDB and HBase.
  • Hands-on experience in application development using Java, Scala, SQL and UNIX shell scripting.
  • Expertise in creating Conceptual Data Models, Process/Data Flow Diagrams, Use Case Diagrams and State Diagrams.
  • Experience with web-based development using HTML

PROFESSIONAL EXPERIENCE

Confidential

Hadoop Consultant

Responsibilities:

  • Set up a Hadoop cluster on the Cloudera platform
  • Took part in cluster planning, design and setup
  • Installed and configured Cloudera 5.14 (the latest version at the time)
  • Responsible for monitoring node health and heartbeats
  • Set up Kerberos security for the Hadoop cluster
  • Installed, deployed and configured Cloudera parcels on the cluster
  • Worked with Sqoop, NiFi, Spark, Storm, Kafka and RabbitMQ
  • Ran benchmark and stress tests to analyze cluster performance
  • Recommended new Hadoop platform tools suited to the client's needs
  • Set up and built custom pipelines for migrating data onto Hadoop
  • Built live-streaming analytics for the client
  • Migrated terabytes of data from various tools onto HDFS
  • Responsible for scheduling jobs and allocating resources
  • Used Sqoop to migrate data from RDBMS sources

Confidential, VA

Hadoop Consultant

Responsibilities:

  • Built POCs using NiFi to move data into MongoDB, Redshift and EMR in Hortonworks environments
  • Solved complex analytics and business problems related to data loss and maintenance
  • Designed the data flow and architecture for the client team according to their needs
  • Installed, configured and maintained HDF and HDP on production clusters
  • Checked and maintained cluster health
  • Oversaw cluster security by restricting user access and enabling two-way authentication for login
  • Developed scripts to automate NiFi processors
  • Migrated data and upgraded the cluster from HDP 2.2 to 2.6
  • Worked with emerging tools such as Ambari, NiFi, Ranger, Knox, EMR, AWS, Zeppelin and Hue
  • Developed, tested and deployed Java and Scala code for Spark Streaming, Storm and HDFS applications
  • Used SQL queries to pull data from RDBMS sources into Hadoop
  • Wrote Hive scripts using partitioning and bucketing
  • Wrote and executed UNIX shell scripts for installation, configuration, maintenance and granting user permissions
  • Secured PII and other sensitive data
  • Scheduled background jobs to migrate data at regular intervals

Confidential, DC

Hadoop Consultant

Responsibilities:

  • Adapted quickly to custom tools, software and build environments
  • Solved complex analytics and business problems related to data loss
  • Designed the data flow and architecture for the analytics team according to business needs
  • Presented POCs using combinations of open source message brokers and streaming applications
  • Presented POCs and tested the flow of data from various data sources to reporting tools such as Zeppelin.
  • Installed Zeppelin and wrote notebooks in it
  • Streamed data from Spark Streaming into MongoDB
  • Created databases and inserted data as MongoDB documents
  • Exported data from Oracle databases to HDFS
  • Queried the data using SQL
  • Proposed a Lambda architecture for the streaming data
  • Proposed the finalized architecture and data flow using Big Data/Hadoop tools
  • Built the architecture in AWS and successfully tested the data flow
  • Worked with Chef recipes on AWS machines
  • Designed Java and Scala application connectors for the data flow from data sources to message brokers and from message brokers to streaming applications.
  • Worked with the RabbitMQ and Kafka message brokers
  • Created custom applications for creating queues in the RabbitMQ broker
  • Worked with Storm and Spark Streaming applications
  • Worked with Maven, GitHub and AWS
  • Worked with EMR
  • Created branches in GitHub for deploying code
  • Worked with MongoDB
  • Designed, tested and built POCs comparing Storm with Spark Streaming and RabbitMQ with Kafka.
  • Successfully passed data from data sources to the message broker and from the broker to Spark, and stored it in MongoDB.
  • Maintained and administered the Big Data clusters on AWS using Ambari.
  • Designed the cluster architecture.
  • Installed and configured the clusters using the Hortonworks distribution.
  • Responsible for handling system failures and data loss on the cluster.
  • Responsible for storing logs on S3.
  • Responsible for managing access restrictions for team members.
  • Designed Spark applications to consume streaming data on the fly and perform complex computations
  • Worked with data transformations such as JSON to RDD and JSON to BSON
  • Worked closely and held meetings with the analytics, DevOps, business, database, development and testing teams to gather information about the data transformations
  • Worked on data transfers from custom tools to the Hadoop file system.
  • Designed an application to store raw data on HDFS.
  • Used REST API calls to access data from different open source tools
  • Suggested best practices and security settings for the data.
  • Identified roles for analytics users and business users to analyze the data.
  • Documented the steps performed in designing the client-specific applications.

Confidential, IL

Hadoop Consultant

Responsibilities:

  • Analyzed the Hadoop cluster using various Big Data analytics tools.
  • Collected and aggregated large amounts of log data using Apache Flume.
  • Worked on debugging and performance tuning.
  • Created HBase tables to store PII data in various formats.
  • Implemented test scripts to support test-driven development and continuous integration.
  • Tuned the performance of Pig and Hive queries.
  • Imported and exported data to and from HDFS and Hive using Sqoop.
  • Processed unstructured data using Pig and Hive.
  • Implemented partitioning, dynamic partitions and buckets in Hive.
  • Ran Hadoop streaming jobs to process terabytes of data.
  • Managed and reviewed Hadoop log files.
  • Scheduled Oozie workflows to run multiple Hive and Pig jobs.
  • Developed Pig Latin scripts to extract data from web server output files and load it into HDFS.
  • Used Pig extensively for data cleansing.
  • Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
  • Strong experience with Apache server configuration.
  • Worked extensively with Kafka and Storm.
  • Implemented SQL and PL/SQL stored procedures.
  • Actively involved in code reviews and bug fixing to improve performance.
  • Developed screens using JSP, DHTML, CSS, AJAX, JavaScript, Struts, Spring, Java and XML

Confidential, TX

Hadoop Consultant

Responsibilities:

  • Worked with business partners to gather business requirements.
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Developed Pig Latin scripts to extract data from web server output files and load it into HDFS.
  • Developed Pig UDFs to pre-process the data for analysis.
  • Implemented multiple MapReduce jobs in Java for data cleansing and pre-processing.
  • Developed job workflows in Oozie to automate tasks.
  • Implemented Hive partitioning and bucketing.
  • Performed transformations using Hive and MapReduce, loaded data into HDFS and extracted data from Teradata into HDFS using Sqoop.
  • Configured Sqoop and developed scripts to extract data from a MySQL server into HDFS.
  • Exported analyzed data to relational databases using Sqoop.
  • Implemented fair scheduling on the JobTracker to share cluster resources.
  • Maintained cluster coordination services through ZooKeeper.
  • Ran Hadoop streaming jobs to process terabytes of data.
  • Managed and reviewed Hadoop log files

Confidential, TN

Hadoop Consultant

Responsibilities:

  • Installed and configured Hadoop MapReduce and HDFS.
  • Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
  • Imported and exported data to and from HDFS and Hive using Sqoop.
  • Defined job flows.
  • Ran Hadoop streaming jobs to process terabytes of data.
  • Loaded and transformed large sets of structured, semi-structured and unstructured data.
  • Responsible for managing data coming from different sources.
  • Supported MapReduce programs running on the cluster.
  • Loaded data from different file systems onto HDFS.
  • Created Hive tables, loaded them with data and wrote Hive queries, which run internally as MapReduce jobs.
