Hadoop Administrator / Hadoop Infrastructure Architect Resume

Hoffman Estates, IL


  • 13 years of professional IT experience, including 6 years with Hadoop technologies and extensive experience with Linux distributions.
  • Experience implementing data warehousing/ETL solutions across the financial, telecom, loyalty, retail, and insurance verticals.
  • Experience operating and managing large clusters with 650+ nodes and 4+ petabytes of storage.
  • Hands-on experience setting up and configuring Hadoop ecosystem components such as MapReduce, HDFS, HBase, Impala, Oozie, Hive, Sqoop, Pig, Spark, Flume, Kafka, Sentry, and Ranger.
  • Experience planning, implementing, testing, and documenting performance benchmarks for the Hadoop platform.
  • Helped plan, develop, and architect the Hadoop ecosystem.
  • Experience in both on-premises and cloud environments: AWS, GCP, and Azure.
  • Experience with securing Hadoop clusters by implementing Kerberos KDC installation, LDAP Integration, data transport encryption with TLS, and data-at-rest encryption with Cloudera Navigator Encrypt.
  • Experience designing, configuring, and managing backup and disaster recovery using data replication, snapshots, and Cloudera BDR utilities.
  • Experience with multi-cluster environments and setting up the Hortonworks Hadoop ecosystem.
  • Implemented role-based authorization for HDFS, Hive, and Impala using Apache Sentry.
  • Good knowledge of implementing and using cluster monitoring tools such as Cloudera Manager, Ganglia, and Nagios.
  • Experienced in implementing and supporting auditing tools like Cloudera Navigator.
  • Experience implementing high-availability features for services such as NameNode, Hue, and Impala.
  • Installed and configured fully distributed, multi-node Hortonworks Hadoop clusters with a large number of nodes.
  • Hands-on experience deploying and using automation tools such as Puppet for cluster configuration management.
  • Experience creating cookbooks/playbooks and documentation for installation, upgrade, and support projects.
  • Participated in application on-boarding meetings with application owners and architects, helping them identify and review the technology stack and use case and estimate resource requirements.
  • Experience in documenting standard practices and compliance policies.
  • Resolved issues by working with dependent and support teams, logging cases according to priority.
  • Experience creating and upgrading ETL and relational database frameworks.
  • Experienced in setting up Hortonworks clusters and installing all ecosystem components both through Ambari and manually from the command line.
  • Assisted in performance tuning and monitoring of the Hadoop ecosystem.
  • Hands-on experience performing functional testing and helping application teams and users integrate third-party tools with the Hadoop environment.
  • Experience analyzing log files and failures, identifying root causes, and taking or recommending corrective action.
  • Experience in Data Warehousing and ETL processes.
  • Implemented and configured a high-availability Hadoop cluster using the Hortonworks distribution.
  • Knowledge of integration with reporting tools such as Tableau, MicroStrategy, and Datameer.
  • Performed performance tuning and troubleshooting of MapReduce jobs by analyzing and reviewing job counters and application log files.
  • Experience setting up Hortonworks clusters and installing all ecosystem components through Ambari.
  • Experience in tuning performance for various services
  • Experience with job scheduling and monitoring tools such as Control-M, Nagios, and Ganglia.
  • Additional responsibilities include interacting with the offshore team daily, communicating requirements, delegating tasks to offshore/on-site team members, and reviewing their deliverables.
  • Good Experience in managing Linux platform servers
  • Effective problem-solving skills and ability to learn and use new technologies/tools quickly
  • Good knowledge of Bash shell scripting.
  • Experience working with ITIL tools such as JIRA, SUCCEED, and ServiceNow for change management and support processes.
  • Excellent communication and interpersonal skills, which contribute to completing project deliverables ahead of schedule.
  • Experience providing 24x7x365 on-call and weekend production support.


Operating Systems/Platforms: UNIX & Linux (CentOS 6 & RHEL 6), CentOS, Ubuntu 14.x, AIX, Windows

Programming Languages: C, C++, Java, Pig Latin, SQL, HQL

Cloud Computing Services: VMware, AWS, Google Cloud, Microsoft Azure

CRM Package: Siebel 7.x, Siebel 8.x

SQL & NoSQL Data Storage: PostgreSQL, MySQL, Cassandra, MongoDB, Teradata, Oracle

Big Data Ecosystem: Hadoop, MapReduce, HDFS, HBase, ZooKeeper, Hive, Pig, Sqoop, Oozie

Management Tool: Cloudera Manager, Ambari

Application Servers: WebLogic 11g, 12c, Tomcat 5.x and 6.x

ETL Tool: Informatica 8.x and 9.x, BODS 4.0/4.1, Talend

Reporting tools: BI Publisher, Web Intelligence, Tableau, MicroStrategy, Datameer

SCM Tools: Perforce, TeamTrack, VSS, Harvest, SVN, HP Quality Center, Jira

Methodology: Agile SDLC, UML

Scripting language: Bash, Perl, Pig, Python, Puppet

Security: Kerberos, Sentry, LDAP, AD, SSL/TLS, Data-at-Rest Encryption

Protocols: TCP/IP, UDP, SNMP, Socket Programming, Routing Protocol


Hadoop Administrator / Hadoop Infrastructure Architect

Confidential, Hoffman Estates, IL


  • Created new project solutions based on the company's technology direction and ensured that infrastructure services were planned according to current standards.
  • Implemented HA for NameNode and Hue using Cloudera Manager.
  • Created and configured cluster monitoring services: Activity Monitor, Service Monitor, Reports Manager, Event Server, and Alert Publisher.
  • Created cookbooks/playbooks and documentation for special tasks.
  • Configured HAProxy for the Impala service.
  • Wrote desktop scripts to synchronize data within and across clusters.
  • Created snapshots for in-cluster backup of data.
  • Created Sqoop scripts to ingest data from transactional systems into Hadoop.
  • Regularly used JIRA, ServiceNow, and other internal issue trackers for project development.
  • Worked independently with Cloudera and Hortonworks support on issues and concerns with the Hadoop cluster.
  • Conducted technology evaluation sessions on Big Data, data governance, Hadoop, Amazon Web Services, Tableau, R, data analysis, statistical analysis, and data-driven business decisions.
  • Integrated Tableau, Teradata, DB2, and Oracle with Hadoop via ODBC/JDBC drivers.
  • Worked with application teams to install the operating system, Hadoop updates, patches, version upgrades as required.
  • Applied Hortonworks configuration parameters to new environments such as Integration and Production.
  • Created scripts to automate data balancing across the cluster using the HDFS balancer utility.
  • Created a POC implementing a streaming use case with Kafka and HBase.
  • Created and maintained MySQL databases, set up users, and maintained database backups.
  • Implemented Kerberos authentication for the existing cluster.
  • Integrated existing LLE and Production clusters with LDAP.
  • Implemented TLS for CDH Services and for Cloudera Manager.
  • Worked with data delivery teams to set up new Hadoop users, including creating Linux users, setting up Kerberos principals, and testing HDFS and Hive access.
  • Managed the backup and disaster recovery for Hadoop data. Coordinated root cause analysis efforts to minimize future system issues
  • Served as lead technical infrastructure Architect and Big Data subject matter expert.
  • Deployed Big Data solutions in the cloud. Built, configured, monitored and managed end to end Big Data applications on Amazon Web Services (AWS)
  • Monitored Hadoop cluster job performance and performed capacity planning.
  • Spun up clusters in Azure using Cloudera Director as a POC for the cloud migration project.
  • Leveraged AWS cloud services such as EC2, auto-scaling and VPC to build secure, highly scalable and flexible systems that handled expected and unexpected load bursts
  • Defined the migration strategy to move the application to the cloud. Developed architecture blueprints and detailed documentation. Created a bill of materials covering required cloud services (such as EMR, EC2, and S3) and tools. Scheduled cron jobs on EMR.
  • Frequently created Bash scripts as project requirements dictated.
  • Worked on GCP cloud architecture design patterns.
  • Guided application teams in choosing the right Hadoop file formats (Text, Avro, Parquet) and compression codecs (Snappy, bzip2, LZO).
  • Improved communication between teams in the matrix environment, which led to an increase in the number of simultaneous projects and average billable hours.
  • Substantially improved all areas of the software development life cycle for the company products, introducing frameworks, methodologies, reusable components and best practices to the team
  • Implemented VPC, Auto Scaling, S3, EBS, ELB, CloudFormation templates, and CloudWatch services on AWS.

Environment: Over 1500 nodes, approximately 5 PB of data, Cloudera's Distribution of Hadoop (CDH) 5.5, HA NameNode, MapReduce, YARN, Hive, Impala, Pig, Sqoop, Flume, Cloudera Navigator, Control-M, Oozie, Hue, White Elephant, Ganglia, Nagios, HBase, Cassandra, Kafka, Storm, Cobbler, Puppet
