Hadoop Consultant Resume
SUMMARY
- 6 years of professional experience in upgrading and providing analytical solutions to clients in the Hadoop space.
- Worked on various Hadoop distributions such as Cloudera, Hortonworks, Apache and EMR
- Experienced in designing and working with pipelines using Lambda Architecture
- Proposed and introduced open source tools such as Hadoop File System, Zeppelin, Apache Spark, Maven, Apache Storm, Hue, Ambari and YARN to clients
- Strong analytical skills with the ability to understand clients' business needs.
- Experience working with Agile methodologies
- Good knowledge of Hadoop cluster architecture and cluster monitoring.
- Experience working with Hadoop ecosystem components and related tools such as HDFS, Spark, Storm, Ambari, MapReduce, HiveQL, HBase, Pig, Sqoop, Oozie, Cassandra and MongoDB for Big Data analytics.
- Experienced in processing Big Data applications on Apache Hadoop and MapReduce Frameworks
- Experience in analyzing data using Pig Latin and HiveQL
- Very good experience in complete project life cycle (design, development, testing and implementation) of Client Server and Web applications.
- Worked with databases such as Oracle, MySQL, MongoDB and HBase.
- Hands-on experience in application development using Java, Scala, SQL and UNIX shell scripting.
- Expertise in creating Conceptual Data Models, Process/Data Flow Diagram, Use Case Diagrams and State Diagrams.
- Experience with web-based development using HTML
PROFESSIONAL EXPERIENCE
Confidential
Hadoop Consultant
Responsibilities:
- Worked on setting up Hadoop cluster in Cloudera platform
- Experienced in cluster planning, design and setup
- Installed and configured Cloudera 5.14, the latest version at the time
- Responsible for maintaining the health and heartbeat of the nodes
- Experienced in setting up Kerberos security for the Hadoop cluster
- Installed, deployed and configured Cloudera parcels on the cluster
- Experienced working with Sqoop, NiFi, Spark, Storm, Kafka and RabbitMQ
- Ran benchmark and stress tests to analyze the performance of the cluster
- Recommended new tools on the Hadoop platform that suit the client's needs
- Set up and built custom pipelines for data migration to Hadoop
- Built live-streaming analytics for the client
- Migrated terabytes of data from various tools to HDFS
- Responsible for scheduling jobs and allocating resources
- Worked with Sqoop to migrate data from RDBMS
Confidential, VA
Hadoop Consultant
Responsibilities:
- Built POCs using NiFi to move data into MongoDB, Redshift and EMR in Hortonworks environments
- Solved complex analytics and business problems involving data loss and maintenance
- Designed the data flow and architecture for the client team according to their needs
- Installed, configured and maintained HDF and HDP on production clusters
- Checked and maintained the health of the cluster
- Oversaw cluster security by restricting user access and enforcing two-way authentication for login
- Developed scripts to automate processors in NiFi
- Migrated the data and upgraded the cluster from HDP 2.2 to 2.6
- Worked with emerging tools such as Ambari, NiFi, Ranger, Knox, EMR, AWS, Zeppelin and Hue
- Developed, tested and deployed code in Java and Scala for Spark Streaming, Storm and HDFS applications
- Used SQL queries to load data from RDBMS into Hadoop
- Wrote Hive scripts using partitioning and bucketing
- Wrote and executed UNIX shell scripts for installation, configuration, maintenance and granting user permissions
- Experience securing PII and other sensitive data
- Scheduled background jobs to migrate the data on a timely basis
Confidential, DC
Hadoop Consultant
Responsibilities:
- Adapts easily to custom tools/software and build environments
- Solved complex analytics and business problems involving data loss
- Designed the data flow and architecture for the Analytics team according to the business need
- Presented POCs using combinations of open source messaging brokers and streaming applications
- Presented POCs and tested the flow of data from various data sources to reporting tools such as Zeppelin.
- Installed Zeppelin and wrote notebooks in it
- Successfully passed streaming data from Spark Streaming to MongoDB
- Created databases and inserted data as MongoDB documents
- Exported data from Oracle database to HDFS
- Used SQL queries to query the data
- Proposed Lambda architecture for the streaming data
- Proposed the finalized architecture and data flow using Big Data/Hadoop tools
- Built the architecture in AWS and successfully tested the data flow
- Experience working with Chef recipes on the AWS machines
- Designed Java and Scala application connectors for the data flow from the data sources to the messaging brokers and from the messaging brokers to the streaming applications.
- Experience working with RabbitMQ and Kafka messaging brokers
- Created custom applications for creating the queues in RabbitMQ broker
- Experience working with Storm and Spark streaming applications
- Experience working with Maven, GitHub and AWS
- Experienced working with EMR
- Created branches for deploying the code in GitHub
- Experience working with MongoDB database
- Designed, tested and built POCs comparing Storm/Spark Streaming and RabbitMQ/Kafka applications.
- Successfully passed data from data sources to the messaging broker and from the broker to Spark, and stored it in a MongoDB database.
- Maintained and administered the Big Data clusters on AWS using Ambari.
- Designed the cluster architecture.
- Installed and configured the clusters using Hortonworks distribution.
- Responsible for handling system failures and data loss on the cluster.
- Responsible for storing logs on S3 storage.
- Responsible for restricting team members' access.
- Designed Spark applications to consume streaming data on the fly and perform complex computations
- Worked with data transformations such as JSON to RDD and JSON to BSON
- Worked closely with and conducted meetings with the Analytics, DevOps, Business, Database, Development and Testing teams to gather information on the data transformations
- Worked on data transfers from custom tools to the Hadoop file system.
- Designed an application to store raw data on HDFS.
- Worked with REST API calls to access data from different open source tools
- Suggested best practices and security settings for the data.
- Identified the roles for analytics users and business users to analyze the data.
- Documented the steps performed in designing the client-specific applications.
Confidential, IL
Hadoop Consultant
Responsibilities:
- Worked on analyzing Hadoop cluster using different big data analytic tools.
- Collected and aggregated large amounts of log data using Apache Flume
- Worked on debugging and performance tuning.
- Created HBase tables to store various data formats of PII data
- Implemented test scripts to support test driven development and continuous integration.
- Worked on performance tuning Pig/Hive queries.
- Imported and exported data to and from HDFS and Hive using Sqoop.
- Processed unstructured data using Pig and Hive.
- Implemented Partitioning, Dynamic Partitions, Buckets in Hive.
- Experienced in running Hadoop streaming jobs to process terabytes of data.
- Gained experience in managing and reviewing Hadoop log files.
- Scheduled Oozie workflows to run multiple Hive and Pig jobs.
- Developed Pig Latin scripts to extract data from the web server output files to load into HDFS.
- Extensively used Pig for data cleansing.
- Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
- Strong experience with Apache server configuration.
- Extensively worked with Kafka and Storm.
- Implemented SQL, PL/SQL Stored Procedures.
- Actively involved in code review and bug fixing for improving the performance.
- Developed screens using JSP, DHTML, CSS, AJAX, JavaScript, Struts, Spring, Java and XML
Confidential, TX
Hadoop Consultant
Responsibilities:
- Worked with business partners to gather business requirements.
- Responsible for building scalable distributed data solutions using Hadoop.
- Developed Pig Latin scripts to extract the data from the web server output files to load into HDFS.
- Developed Pig UDFs to pre-process the data for analysis.
- Implemented multiple MapReduce jobs in Java for data cleansing and pre-processing.
- Developed job workflows in Oozie to automate the tasks
- Created Hive partitioning and bucketing.
- Performed transformations using Hive, MapReduce, loaded data into HDFS and extracted data from Teradata into HDFS using Sqoop.
- Configured Sqoop and developed scripts to extract data from MySQL Server into HDFS.
- Expertise in exporting analyzed data to relational databases using Sqoop.
- Implemented fair scheduling on the JobTracker to share the resources of the cluster
- Maintained Cluster co-ordination services through ZooKeeper.
- Responsible for running Hadoop streaming jobs to process terabytes of data.
- Managed and reviewed Hadoop log files
Confidential, TN
Hadoop Consultant
Responsibilities:
- Installed and configured Hadoop MapReduce, HDFS.
- Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
- Imported and exported data to and from HDFS and Hive using Sqoop.
- Experienced in defining job flows.
- Experienced in running Hadoop streaming jobs to process terabytes of data.
- Loaded and transformed large sets of structured, semi-structured and unstructured data.
- Responsible for managing data coming from different sources.
- Supported MapReduce programs running on the cluster.
- Loaded data from different file systems into HDFS.
- Created Hive tables, loaded them with data and wrote Hive queries that run internally as MapReduce jobs.
