
Hadoop Consultant Resume

San Francisco, CA


  • Being in and around the IT industry my entire life has provided me with a learning experience that one cannot gain in a classroom.
  • I am constantly gaining exposure and expertise from the clients I work with.
  • 7+ years of IT experience gained working alongside IT experts: 4+ years with the Hadoop ecosystem and related tools such as MapReduce, Hive, Pig, Sqoop, Cassandra, and Hortonworks, and 3+ years as a Java developer using tools such as Eclipse IDE, NetBeans, and Oracle JDeveloper
  • Creating solutions and providing efficient systems for my clients has always been my priority
  • Ability to think strategically about business, product, and technical challenges in an enterprise environment
  • Understanding of database and analytical technologies in the industry including MPP databases, NoSQL storage, Data Warehouse design, BI reporting and Dashboard development
  • Experienced in Python-based MVC frameworks: Django, Flask
  • Architected, designed, and maintained high-performing ETL/ELT processes
  • Tuning and Monitoring Hadoop jobs and clusters in a production environment
  • Extensive experience in developing Pig Latin Scripts for transformations and using Hive Query Language for data analytics
  • Solid experience in Core Java; hands-on experience with JEE technologies such as JDBC, Servlets, JSP, JNDI, and JMX
  • Built high-performing, scalable, and robust Java applications


Languages & DevOps: Java/JSP, C, Python, Linux/UNIX Shell Scripts, XML/HTML, Pig, Puppet, Ansible, Vagrant, VMware vSphere (ESXi), OpenStack VMs

Databases & SQLs: Hive, HBase, Cassandra, MongoDB, Presto, MySQL, PostgreSQL, Oracle, Teradata, SQL Server

Operating Systems: Linux, MacOS, Windows, Solaris 2.x, HP-UX 11.x

Web Technologies: HTML, XHTML, XML, XSL, CSS, JavaScript

SDLC Methodology: Agile (SCRUM), Waterfall

Web Tools: Apache Tomcat, JDBC/ODBC, Weblogic, Netscape, iPlanet


Hadoop Consultant

Confidential, San Francisco, CA


  • Experience in installing, configuring, and monitoring HDP stacks 2.1, 2.2, and 2.3
  • Experience in cluster planning, performance tuning, monitoring, and troubleshooting the Hadoop cluster
  • Experience in planning the cluster, preparing the nodes, pre-installation and testing
  • Responsible for cluster HDFS maintenance tasks: commissioning and decommissioning nodes, balancing the cluster, and rectifying failed disks
  • Experience in using Flume to get log files into the Hadoop cluster
  • Experience in configuring MySQL to store the Hive metadata
  • Used Flume to collect, aggregate and store the web log data onto HDFS
  • Experience in administration of NoSQL databases including HBase and MongoDB
  • Addressing and troubleshooting issues on a daily basis
  • Manage and review Hadoop log files
  • Used Sqoop to efficiently transfer data between databases and HDFS and used Flume to stream the log data from servers/sensors
  • Worked on a text-mining project with Kafka
  • Weekly meetings with technical collaborators and active participation in code review sessions with senior and junior developers
  • Wrote Python applications that interact with the MySQL database using the Spark SQLContext and access Hive tables using the HiveContext
  • Good experience in the Amazon Web Services (AWS) environment and good knowledge of services such as Elastic Compute Cloud (EC2), Elastic Load Balancers, Elastic Container Service (Docker containers), S3, Elastic Beanstalk, CloudFront, Elastic File System, RDS, DMS, VPC, Route 53, CloudWatch, CloudTrail, CloudFormation, and IAM
  • Involved in developing Hive DDLs to create, alter and drop Hive tables
  • Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into the Hadoop Distributed File System (HDFS), and moved data between MySQL and HDFS in both directions using Sqoop
  • Created and maintained technical documentation for launching Hadoop clusters and for executing Pig scripts
  • Worked with NoSQL databases such as HBase and MongoDB for proof-of-concept (POC) purposes

Environment: HDFS, HBase, MongoDB, AWS, MySQL, NoSQL, Java, Sqoop, Oozie, Flume, Pig, Oracle, UNIX, Eclipse, Python, Kafka, Storm

Hadoop Consultant

Confidential, Cherry Hill, NJ


  • Writing Hadoop jobs using Pig, Hive, Sqoop and shell scripting. Responsible for building scalable distributed data solutions using Hadoop
  • Developed with and configured Hive, Pig, Sqoop, Flume, ZooKeeper, YARN, Oozie, Kafka, Elasticsearch, Spark, and Storm on Hadoop
  • Supported setup of the QA environment and updated configurations for implementing scripts with Pig and Sqoop
  • Importing data from Oracle DB and third party vendors using Sqoop and FTP
  • Creating Hive tables using partitioning and bucketing techniques. Experience ingesting data from HDFS, Oracle, and MS SQL Server into Netezza and vice versa
  • Installation and configuration of HA environment using Sun or VERITAS Cluster
  • Imaged machines using JumpStart/Kickstart to install Solaris 10 and Red Hat Enterprise Linux
  • Led large-scale global data warehousing and analytics projects
  • Implementation and tuning experience with Amazon Elastic MapReduce (EMR) and tools such as Pig and Hive
  • Track record of implementing AWS services in a variety of distributed computing and enterprise environments
  • Implementation and tuning experience with Apache Hadoop and tools such as Pig and Hive, alongside database and analytical technologies including MPP databases, NoSQL storage, Data Warehouse design, BI reporting, and Dashboard development
  • Provide support, maintenance, monitoring, troubleshooting, and resolution for data warehouse and database processes
  • Build and unit test production deployment packages in a shared environment
  • Working knowledge of TCP/IP-based tools and protocols: RSH, SSH, RCP, SCP
  • Maintained DNS, NFS, DHCP, printing, mail, web, and FTP services for the enterprise
  • Automating system tasks using Puppet
  • Created several complex job flows with a number of jobs in each job flow

Environment: Hortonworks Data Platform, HDFS, Hive, HBase, Data Lake, Java, Pig, Linux, Oozie, YARN, Hue, Solr, Storm, Kafka, Elasticsearch, Redis, Flume, Sqoop, XML, SQL Navigator, IBM Tivoli, Eclipse, PL/SQL, SQL connector

Hadoop Consultant

Confidential, Houston, TX


  • Installing & configuring a multi-node Hortonworks platform
  • Performance comparison between Cassandra and HBase for benchmarking
  • Monitor cluster using Ambari and periodically review Hadoop log files
  • Setup and optimize Standalone-System/Pseudo-Distributed/Distributed Clusters
  • Tune/Modify SQL for batch and online processes
  • Manage cluster through performance tuning and security enhancements using Knox/Ranger
  • Develop CRON jobs in Linux (RHEL) to monitor the system and system resources
  • Install, configure, and operate Hive, Cassandra, HBase, Pig, Sqoop, Zookeeper, & Mahout
  • Maintain Hadoop, Hadoop ecosystems, third party software, and database(s) with updates/upgrades, performance tuning and monitoring
  • Manage/implement LFS and HDFS security using tools such as PBIS and the Kerberos authentication protocol
  • Support/Troubleshoot/Schedule jobs or MapReduce programs running in the Production cluster using Oozie workflow scheduler
  • Develop scripts to automate routine DBA tasks (e.g. refresh, backups, vacuuming)
  • Configure NameNode High Availability for the HAWQ database and NameNode failover using QJM

Environment: HDFS, NoSQL, Hortonworks, Cassandra, HBase, Hive, Ambari, SQL, YARN, MapReduce, Pig, Oozie, Shell Scripting, Solr, Java, Puppet, Navigator, Kerberos, Mahout

JAVA Developer



  • Used JDBC, SQL, and PL/SQL programming for storing, retrieving, and manipulating data
  • Responsible for creation of the project structure, development of the application with Java, J2EE and management of the code
  • Responsible for the Design and management of database in DB2
  • Integrated a third-party plug-in for data tables with dynamic data using jQuery
  • Responsible for the deployment of the application on the server using IBM WebSphere and PuTTY
  • Developed the application in an Agile environment with the constant changes in the application scope and deadlines
  • Involved in designing and development of the ecommerce site using JSP, Servlets, JavaScript and JDBC
  • Involved in client interaction and support for the application testing at the client location
  • Used AJAX for interactive user operations and client-side validations. Used XSL transforms on certain XML data
  • Performed an active role in the Integration of various systems present in the application
  • Responsible for providing services for mobile requests based on user requests
  • Logged all debug, error, and warning messages at the code level using Log4j
  • Involved in the UAT phase and production phase to provide continuous support to the onsite team
  • Used the HP Quality Center tool to actively resolve bugs logged in any of the testing phases
  • Used XML to define ORM mappings between the Java classes and the database
  • Developed Ant scripts for compilation and deployment. Performed unit testing using JUnit
  • Used Subversion as the version control system. Extensively used Log4j for logging

Environment: Java, J2EE, PL/SQL, JSP, HTML, AJAX, JavaScript, JDBC, XSL, XML, JMS, UML, JUnit

JAVA Developer



  • Developed the applications using Java, J2EE, Struts, JDBC
  • Built applications for scale using JavaScript
  • Used SOAP UI Pro version for testing the Web Services
  • Involved in preparing the High Level and Detail level design of the system using J2EE
  • Created struts form beans, action classes, JSPs following Struts framework standards
  • Implemented the database connectivity using JDBC with Oracle database as backend
  • Involved in the development of the underwriting process, which involves communications with outside systems using IBM MQ and JMS
  • Created a deployment procedure utilizing Jenkins CI to run the unit tests
  • Worked with JMS Queues for sending messages in point-to-point mode
  • Used PL/SQL stored procedures for applications that needed to execute as part of a scheduling mechanism
  • Developed SOAP based XML web services
  • Used JAXB to manipulate XML documents
  • Created XML documents using the StAX XML API to pass the XML structure to Web Services
  • Used Rational ClearCase for version control and JUnit for unit testing

Environment: JSP 1.2, JasperReports, JMS, XML, SOAP, JDBC, JavaScript, UML, HTML, JNDI, Apache Tomcat, Ant, and JUnit
