Hadoop Administrator Resume
Houston, TX
PROFESSIONAL SUMMARY:
- 8.5 years of overall IT experience across a variety of industries, including hands-on experience with Big Data technologies.
- Experience in installing, configuring, supporting, and managing Hadoop clusters using Apache, Cloudera (CDH 5.x, 6.x), and HDP distributions.
- Proficient in working with Hadoop 2.x (HDFS, MapReduce, YARN) architectures.
- Experience in deploying and managing multi-node development, testing, and production Hadoop clusters with different Hadoop components (Hive, HBase, ZooKeeper, Pig, Sqoop, Oozie, Flume) using Cloudera Manager.
- Experienced in Performance Monitoring, Troubleshooting, Maintenance, and Support of production systems.
- Experience in configuring Zookeeper to provide High Availability and cluster service co-ordination.
- Troubleshooting cluster issues, performing root cause analysis, and preparing run books.
- Experience in importing and exporting data between HDFS and different databases, including MySQL, Oracle, and Teradata, using Sqoop (see the Sqoop sketch after this list).
- Experience in performing minor and major upgrades, commissioning and decommissioning of data nodes on Hadoop cluster.
- Responsible for cluster maintenance, Monitoring, Troubleshooting, Manage and review log files.
- Worked on importing and exporting data from Oracle into HDFS and Hive using Sqoop.
- Experience in supporting data analysis projects using Elastic MapReduce (EMR) on the Amazon Web Services (AWS) cloud, including exporting and importing data to and from S3.
- Experience with NoSQL databases (Cassandra and HBase) for enterprise production support.
- Implemented Real-Time streaming and analytics using various technologies including Spark Streaming and Kafka.
- Hands on experience in writing Hive scripts using HiveQL and analyzing data using HiveQL.
- Implemented enterprise level security using LDAP, Knox, Kerberos, Ranger and Sentry.
- Configured encryption using TLS/SSL.
- Experience in loading data into Hive partitions and creating buckets in Hive.
- Import/export data from HDFS to RDBMS and vice versa using Sqoop.
- Hands-on experience in application design, development, and testing using Java.
- Knowledge of Oracle database architecture, configuration, administration, and backup & recovery.
- Excellent written, verbal, and interpersonal communication skills.
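The following is a minimal, illustrative sketch of the kind of Sqoop import/export work referenced above; the connection strings, table names, and HDFS paths are hypothetical.

    # Import a MySQL table into HDFS in parallel (hypothetical host, database, and table names)
    sqoop import \
      --connect jdbc:mysql://dbhost:3306/sales \
      --username etl_user -P \
      --table orders \
      --target-dir /data/raw/orders \
      --num-mappers 4

    # Export analyzed results from HDFS back to the RDBMS for BI reporting
    sqoop export \
      --connect jdbc:mysql://dbhost:3306/reporting \
      --username etl_user -P \
      --table order_metrics \
      --export-dir /data/processed/order_metrics

Splitting the import across several mappers parallelizes the transfer, while the export step pushes analyzed results back to the relational database for reporting.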
TECHNICAL SKILLS:
Operating Systems: Windows 98/NT/2000/XP, Windows Vista, Windows 7, Windows 8, Linux, UNIX, Mac OS, Windows Server 2000/2003/2008.
Hadoop ECO Systems: Hadoop, MapReduce, HDFS, HBase, Hive, Pig, Sqoop, Kafka, ZooKeeper.
NoSQL: MongoDB, Cassandra
Languages: C, C++, JDK 1.4/1.5/1.6, Core Java, J2EE (Servlets, JSP, JDBC, JavaBeans, EJB), Python, C#, SQL, PL/SQL, HiveQL, Pig Latin, Scala
Web Technologies: HTML, CSS, XML, JavaScript.
Hadoop Distributions: Cloudera, Hortonworks, MapR
Scripting: JavaScript, UNIX Shell, Bash, Korn
Databases: ORACLE 8i/9i/10g/11g/12c, Teradata, MS-Access, MySQL, SQL-Server 2000/2005/2008/2012
SQL Server Tools: SQL Server Management Studio, Enterprise Manager, Query Analyzer, Profiler, Export & Import (DTS).
IDE: Eclipse, IntelliJ, NetBeans
Web Services: RESTful, SOAP
Tools: Bugzilla, QuickTestPro (QTP) 9.2, Selenium, Quality Center, Test Link, TWS
EMPLOYMENT HISTORY:
Confidential, Houston, TX
Hadoop Administrator
Responsibilities:
- Responsible for installing, configuring, supporting and managing of Hadoop clusters.
- Managed and reviewed Log files as a part of administration for troubleshooting purposes.
- Monitored workload, job performance and capacity planning using Cloudera Manager.
- Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from MySQL into HDFS using Sqoop.
- Accountable for creating users and groups through LDAP and granting the required permissions to the respective users.
- Performed HDFS cluster support and maintenance tasks such as adding and removing nodes without impacting running nodes or data (see the node-decommission sketch after this list).
- Monitored and controlled local file system disk space usage and log files, including cleaning up old log files.
- As a Hadoop admin, monitored cluster health status on a daily basis, tuned performance-related configuration parameters, and backed up configuration XML files.
- Created Hive queries that helped market analysts spot emerging trends by comparing fresh data with EDW tables and historical metrics.
- Evaluated business requirements and prepared detailed specifications, following project guidelines, for program development.
- Responsible for building scalable distributed data solutions using Hadoop.
- Analyzed large data sets to determine the optimal way to aggregate and report on them.
- Leveraged and configured the use of Fair and Capacity schedulers.
- Optimized MapReduce jobs to use HDFS efficiently by using various compression mechanisms.
- Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
- Worked on Unix shell scripts for business processes and for loading data from different interfaces into HDFS.
- Developed a data pipeline using Kafka to store data into HDFS.
- Created partitioned tables in Hive (see the Hive sketch after this list).
- Migrated ETL processes from Oracle to Hive to test easier data manipulation.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
- Configured highly available Hadoop clusters on YARN (2.x).
- Installed and configured Pig and wrote Pig Latin scripts.
- Developed Pig Latin scripts to extract the data from the web server output files to load into HDFS.
- Load and transform large sets of structured, semi-structured, and unstructured data.
- Worked with application teams to install operating system, Hadoop updates, patches, version upgrades as required.
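A brief sketch of the node-decommission workflow referenced above; the hostname and exclude-file path are hypothetical and vary by cluster layout.

    # Add the host to the HDFS exclude file and ask the NameNode to re-read it
    echo "worker05.example.com" >> /etc/hadoop/conf/dfs.exclude
    hdfs dfsadmin -refreshNodes

    # Watch the decommission status and verify block replication health afterwards
    hdfs dfsadmin -report | grep -A 3 "Decommission"
    hdfs fsck / -blocks -locations > /tmp/fsck_report.txt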
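A minimal sketch of the partitioned, bucketed Hive tables described above; the database, table, column names, and HDFS paths are hypothetical.

    # Create a partitioned, bucketed Hive table and load one partition
    hive -e "
      CREATE TABLE IF NOT EXISTS sales.orders_part (
        order_id BIGINT,
        amount   DOUBLE
      )
      PARTITIONED BY (order_date STRING)
      CLUSTERED BY (order_id) INTO 16 BUCKETS
      STORED AS ORC;

      LOAD DATA INPATH '/data/raw/orders/2016-01-01'
      INTO TABLE sales.orders_part PARTITION (order_date='2016-01-01');
    "

Partitioning by date keeps queries that filter on recent data from scanning the full table, and bucketing on the key column helps downstream joins and sampling.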
Environment: Hadoop, MapReduce, Cloudera, HDFS, Hive, ETL, Pig, Java, Shell Script, SQL, Sqoop, Eclipse.
Confidential, Alpharetta, GA
Hadoop Administrator
Responsibilities:
- Involved in the end-to-end process of Hadoop cluster setup, including installation, configuration, and monitoring of the cluster.
- Monitoring systems and services, architecture design and implementation of Hadoop deployment, configuration management, backup, and disaster recovery systems and procedures.
- Experienced in defining job flows with Oozie.
- Loading log data directly into HDFS using Flume (see the Flume sketch after this list).
- Experienced in managing and reviewing Hadoop log files.
- Involved in analyzing system failures, identifying root causes, and recommending courses of action. Documented system processes and procedures for future reference.
- Monitored multiple Hadoop clusters environments using Nagios. Monitored workload, job performance and capacity planning using Cloudera Manager.
- Worked on analyzing Hadoop cluster using different big data analytic tools including Pig, Hive, and Map Reduce.
- Worked on debugging, performance tuning of Hive & Pig Jobs.
- Hands on experience with ETL process.
- Implemented test scripts to support test driven development and continuous integration.
- Handled importing of data from various data sources, performed transformations using Hive, MapReduce, loaded data into HDFS and extracted the data from MySQL into HDFS using Sqoop.
- Worked on Pig Scripts to create file dumps for downstream analytical processing.
- Automated all jobs, from pulling data from databases to loading data into SQL Server, using shell scripts.
- Worked on Performance tuning related to Pig queries.
- Used UDFs to implement business logic in Hadoop.
- Involved in loading data from LINUX file system to HDFS.
- Worked with application teams to install operating system and Hadoop updates, patches, and version upgrades as required.
- Supported MapReduce programs running on the cluster.
- Created schemas for the Hive external tables and also built a few managed tables for the intermediate-stage transformations.
- Created partitions as per the requirements and loaded the data into those tables.
- Performed data validations by writing queries on the loaded tables as per business needs.
- Involved in Hadoop cluster tasks such as adding and removing nodes without affecting running jobs or data.
- Worked on customer tickets based on the priority and complexity of the issue.
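A minimal sketch of loading log data into HDFS with Flume, as referenced above; the agent name, log path, and HDFS directory are hypothetical.

    # Minimal Flume agent that tails a web-server log into HDFS
    cat > /etc/flume-ng/conf/weblog-agent.conf <<'EOF'
    weblog.sources  = tailsrc
    weblog.channels = memch
    weblog.sinks    = hdfssink

    weblog.sources.tailsrc.type = exec
    weblog.sources.tailsrc.command = tail -F /var/log/httpd/access_log
    weblog.sources.tailsrc.channels = memch

    weblog.channels.memch.type = memory
    weblog.channels.memch.capacity = 10000

    weblog.sinks.hdfssink.type = hdfs
    weblog.sinks.hdfssink.hdfs.path = /data/logs/web/%Y-%m-%d
    weblog.sinks.hdfssink.hdfs.fileType = DataStream
    weblog.sinks.hdfssink.hdfs.useLocalTimeStamp = true
    weblog.sinks.hdfssink.channel = memch
    EOF

    # Start the agent with the configuration above
    flume-ng agent --name weblog --conf /etc/flume-ng/conf \
      --conf-file /etc/flume-ng/conf/weblog-agent.conf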
Environment: Hadoop, HDFS, Hortonworks, Pig, Hive, MapReduce, ETL, Sqoop, LINUX Shell Script, and Big Data
Confidential, Warren, NJ
Big Data Engineer
Responsibilities:
- Solid understanding of Hadoop HDFS, MapReduce, and other ecosystem components.
- Installation and Configuration of Hadoop cluster
- Worked with the Cloudera Support Team to fine-tune the cluster.
- Worked with a plugin that allows Hadoop MapReduce programs, HBase, Pig, and Hive to work unmodified and access files directly; the plugin also provided data locality for Hadoop across host nodes and virtual machines.
- Developed MapReduce jobs to analyze data and provide heuristic reports.
- Adding, decommissioning, and rebalancing nodes.
- Rack Aware Configuration
- Configuring Client Machines
- Configuring monitoring and management tools using Cloudera Manager.
- Cluster high-availability setup.
- Reviewed and managed the Hadoop log files for troubleshooting purposes and escalated issues in a timely manner.
- Incident Management, Problem Management and Change Management
- Performance Management and Reporting
- Recovery from NameNode failures.
- Scheduled MapReduce jobs using the FIFO and Fair Share schedulers.
- Installation and configuration of other open-source software such as Pig, Hive, HBase, Flume, and Sqoop.
- Configured Spark Streaming to receive real-time data from Kafka and store the stream data in HDFS using Scala (see the spark-submit sketch after this list).
- Integration with RDBMS using Sqoop and JDBC connectors.
- Installed Kerberos and set up permissions
- Managed the Amazon Web Services (AWS) infrastructure with automation and configuration management tools such as Ansible, Puppet and Chef.
- Assisted the application teams to install operating system, Hadoop updates, version upgrades and patches when required.
- Highly involved in analyzing system failures, performing root cause analysis, and recommending courses of action.
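An illustrative sketch of the Spark Streaming setup referenced above; the topic name, broker list, application class, and jar path are hypothetical, and the streaming job itself was written in Scala.

    # Create the source topic (the kafka-topics.sh path varies by distribution)
    kafka-topics.sh --create --zookeeper zk1:2181 --topic clickstream \
      --partitions 6 --replication-factor 3

    # Submit the Scala streaming job to YARN; it consumes from Kafka and writes to HDFS
    spark-submit \
      --master yarn --deploy-mode cluster \
      --class com.example.streaming.KafkaToHdfs \
      --num-executors 4 --executor-memory 4G \
      /opt/jobs/kafka-to-hdfs.jar \
      broker1:9092,broker2:9092 clickstream /data/streaming/clickstream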
Environment: RHEL, Puppet, CDH 3 distribution, Tableau, Datameer, HBase, CDH Manager, YARN, Hive, Flume, Kafka
Confidential, Rocky Hill, CT
Oracle DBA
Responsibilities:
- Installation, Administration, Maintenance and Configuration of Oracle Databases
- Created and maintained different database entities, including tablespaces, data files, redo log files, and rollback segments, and renamed/relocated data files on servers.
- Provided on-call support regarding production-related database problems.
- Used RMAN for the backup and recovery strategy (see the RMAN sketch after this list).
- Tuning of databases, including server-side tuning and application tuning.
- Capacity planning of the databases / applications.
- Upgraded databases from Oracle 10g to 11g/12c.
- Refreshing test and development environments (Oracle) from production databases as and when required.
- Optimized different SQL queries to ensure faster response times.
- Worked with Oracle Enterprise Manager.
- Worked with Oracle Support to resolve different database-related issues.
- Data replication using Golden Gate (bi-directional and multi-directional replication) across all prod databases for consistency across all sites.
- DDL replication through Golden Gate; analyzed the SQL and tuned it if required before releasing the code.
- Troubleshooting the Data Replication issues with Golden Gate in Prod and non-prod environments.
- Used transportable tablespaces, Data Pump, and RMAN for database migrations between different platforms, such as Solaris to AIX.
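A minimal sketch of the RMAN backup strategy and Data Pump refreshes referenced above; the backup location, schema name, and directory object are hypothetical.

    # Full database backup plus archived logs, with obsolete backups removed per the retention policy
    rman target / <<'EOF'
    RUN {
      ALLOCATE CHANNEL ch1 DEVICE TYPE DISK FORMAT '/backup/orcl/%U';
      BACKUP DATABASE PLUS ARCHIVELOG;
      DELETE NOPROMPT OBSOLETE;
      RELEASE CHANNEL ch1;
    }
    EOF

    # Logical schema export with Data Pump for refreshing test environments
    # (prompts for the SYSTEM password; DP_DIR is a hypothetical Oracle directory object)
    expdp system schemas=SALES directory=DP_DIR dumpfile=sales_%U.dmp logfile=sales_exp.log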
Environment: Oracle 10g/11g, RH Linux, Solaris, Installation, Migration, Upgrade, OEM, RAC, RMAN, ADDM, AWR, Statspack.
Confidential, Columbus, OH
Java/J2EE Developer
Responsibilities:
- Understanding and analyzing the project requirements.
- Analysis and Design with UML and Rational Rose.
- Created Class Diagrams, Sequence diagrams and Collaboration Diagrams
- Used the MVC architecture.
- Worked on Jakarta Struts open framework.
- Developed Servlets to handle requests for account activity.
- Developed Controller Servlets and Action Servlets to handle the requests and responses.
- Developed Servlets and created JSP pages for viewing on an HTML page.
- Developed the front end using JSP.
- Developed various EJB's to handle business logic.
- Designed and developed numerous Session Beans deployed on the WebLogic Application Server.
- Implemented Database interactions using JDBC with back-end Oracle.
- Worked on Database designing, Stored Procedures, and PL/SQL.
- Created triggers and stored procedures using PL/SQL.
- Written queries to get the data from the Oracle database using SQL.
Environment: J2EE, Servlets, JSP, Struts, Spring, Hibernate, Oracle, TOAD, WebLogic Server