Sr Hadoop Administrator Resume
Greenville, SC
SUMMARY
- 10+ years of professional IT experience in the design, development, implementation, and management of Oracle and SQL Server databases for client/web applications and in production support.
- 5+ years of experience installing and upgrading Hadoop and its related components in single-node as well as multi-node cluster environments using Apache and Cloudera distributions.
- Hands-on experience with major components of the Hadoop ecosystem, including HDFS, the MapReduce framework, YARN, HBase, Hive, Pig, Sqoop, and ZooKeeper.
- Experience managing and administering Linux servers (especially Ubuntu).
- Established standards and processes for Hadoop-based application design and implementation.
- Exposure to implementing and maintaining Hadoop security and Hive security.
- Experience in database administration, performance tuning, backup and recovery, and troubleshooting in large-scale, customer-facing environments.
- Expertise in commissioning and decommissioning cluster nodes, backup configuration, and recovery from a NameNode failure.
- Good working knowledge of importing and exporting data between databases such as MySQL and HDFS/Hive using Sqoop (a minimal example follows this list).
- Very good knowledge of YARN (Hadoop 2.x) concepts and high-availability Hadoop clusters.
- Experience analyzing log files for Hadoop and ecosystem services to find the root cause.
- As an administrator, involved in balancing server load and tuning servers for optimal cluster performance.
- Expertise in MS Office (Word, PowerPoint, Excel, Outlook).
- Strong interpersonal and communication skills and ability to work effectively with a wide range of constituencies in a diverse community.
- Outstanding analytical and technical problem-solving skills.
- Experience in the analysis, design, and development of MVC-pattern applications based on Struts 1.2, Hibernate 3, Spring 2, and Flex frameworks.
- Good understanding of SOA concepts and implementation using web services.
- Experience writing SQL queries and stored procedures for accessing and managing databases such as Oracle 8i/9i/10g and SQL Server 2005/2008.
- Good experience with high-volume transactional systems running on Unix/Linux and Windows.
- Implemented unit testing and integration testing during projects.
- Collaborated with technical team members to resolve back-end/front-end integration issues.
- Excellent analytical and problem-solving skills with attention to detail, persistence, teamwork, and communication.
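A minimal sketch of the Sqoop import referenced above, assuming a MySQL source; the host, database, table, credentials, and target paths are illustrative placeholders, not details from any actual engagement.

```sh
# Hypothetical Sqoop import: pull a MySQL table into HDFS and register it in Hive.
# Connection string, credentials, table, and directories are placeholders.
sqoop import \
  --connect jdbc:mysql://dbhost.example.com:3306/salesdb \
  --username etl_user -P \
  --table orders \
  --target-dir /user/etl/orders \
  --hive-import --hive-table default.orders \
  --num-mappers 4
```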
TECHNICAL SKILLS
Languages: SQL, PL/SQL, HTML, XML, WSDL
Hadoop Ecosystem: Hadoop MapReduce, HDFS, HBase, Hive, Pig, Sqoop, YARN
Hadoop Distributions: Apache Hadoop, Cloudera (CDH), Hortonworks
Open Source: Hibernate 3.2/3.0/2.1, Spring IoC, Spring MVC, Spring Web Flow, Spring AOP, Ant 1.2/1.7, Maven 1.0
Database: Oracle 8i/9i/10g/11g, Microsoft SQL Server 2008/2005
Operating Systems: Ubuntu (Linux), Windows Server 2003/2008/2012, Windows 8, Windows 10
Methodologies: Agile, Scrum, Waterfall, Iterative, Spiral
PROFESSIONAL EXPERIENCE
Confidential, Greenville, SC
Sr Hadoop Administrator
Responsibilities:
- Worked on a Hadoop cluster with a current size of 56 nodes and 896 TB of capacity.
- Wrote MapReduce jobs, HiveQL queries, and Pig scripts.
- Imported data from an existing SQL Server database into Hive and HBase using Sqoop.
- Supported code/design analysis, strategy development and project planning.
- Created reports for the BI team using Sqoop to export data into HDFS and Hive.
- Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
- Involved in Requirement Analysis, Design, and Development.
- Exported and imported data between HDFS, HBase, and Hive using Sqoop.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Worked closely with the business and analytics teams to gather system requirements.
- Loaded and transformed large sets of structured and semi-structured data.
- Loaded data into HBase tables using Java MapReduce.
- Loaded data into partitioned Hive tables (a minimal example follows the Environment line below).
Environment: CDH, HDFS, Core Java, MapReduce, Hive, Pig, Flume, Storm, Elasticsearch, Scala, Spark, Kibana, Shell scripting, UNIX.
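A minimal sketch of creating and loading a partitioned Hive table as mentioned above; the table name, columns, and HDFS paths are illustrative assumptions.

```sh
# Hypothetical example: create a partitioned Hive table and load one day's data.
# Table name, columns, and HDFS paths are placeholders.
hive -e "
CREATE TABLE IF NOT EXISTS web_logs (
  user_id STRING,
  url     STRING,
  ts      BIGINT
)
PARTITIONED BY (dt STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';

LOAD DATA INPATH '/user/etl/web_logs/2015-06-01'
INTO TABLE web_logs PARTITION (dt='2015-06-01');
"
```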
Confidential, CA
Hadoop Administrator
Responsibilities:
- Worked with the Linux administration team to prepare and configure the systems to support Hadoop deployment.
- Worked on configuration tasks including networking and iptables, hostname resolution, user accounts and file permissions, HTTP, FTP, and passwordless SSH login.
- Implemented authentication and authorization using the Kerberos protocol.
- Benchmarked the Hadoop cluster using different benchmarking tools.
- Optimized Hadoop cluster performance by tuning hdfs-site.xml, core-site.xml, and mapred-site.xml.
- Deployed a network file system (NFS) mount for NameNode metadata backup.
- Performed a POC on cluster backup using DistCp, Cloudera Manager BDR, and parallel ingestion.
- Configured and deployed the Hive metastore using MySQL and the Thrift server.
- Monitored clusters using Nagios and Ganglia.
- Installed and configured Spark ecosystem components (Spark SQL, Spark Streaming) and worked on building BI reports in Tableau with Spark using Spark SQL.
- Implemented Spark jobs using Scala and Spark SQL for faster testing and processing of data.
- Performed commissioning and decommissioning of DataNodes.
- Used the Fair Scheduler on the JobTracker to allocate a fair share of resources to small jobs.
- Upgraded the Hadoop cluster from CDH3 to CDH4 and from CDH4 to CDH5.1.
- Implemented automatic failover with ZooKeeper and the ZooKeeper Failover Controller.
- Developed Pig scripts for handling raw data for analysis.
- Audited and maintained existing clusters and built new clusters for testing purposes using Cloudera Manager.
- Deployed and configured Flume agents to stream log events into HDFS for analysis.
- Configured Oozie for workflow automation and coordination.
- Wrote custom monitoring scripts for Nagios to monitor the daemons and cluster status (a minimal example follows the Environment line below).
- Wrote custom shell scripts to automate repetitive tasks on the cluster.
- Installed and monitored a MongoDB cluster.
- Upgraded MongoDB from 2.4 to 2.6, implementing the new security features.
- Worked with BI teams in generating the reports and designing ETL workflows on Pentaho.
- Involved in loading data from UNIX file system to HDFS.
- Defined time-based Oozie workflows to copy data from different sources to Hive as it became available.
- Configured Ganglia, which included installing the gmond and gmetad daemons that collect metrics across the distributed cluster and present them in real-time dynamic web pages, further helping with debugging and maintenance.
Environment: MapReduce, HDFS, Hive, Pig, Flume, Sqoop, UNIX Shell Scripting, Nagios, Kerberos.
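The custom Nagios monitoring scripts mentioned above could look roughly like the sketch below; the daemon list is an assumption, and the exit codes follow the standard Nagios plugin convention (0 = OK, 2 = CRITICAL).

```sh
#!/bin/bash
# Hypothetical Nagios-style check: verify that key Hadoop daemons are running
# on this host. The daemon list is an illustrative assumption.
DAEMONS="NameNode ResourceManager"
MISSING=""

for d in $DAEMONS; do
  # jps lists running JVM processes by class name
  if ! jps | grep -qw "$d"; then
    MISSING="$MISSING $d"
  fi
done

if [ -n "$MISSING" ]; then
  echo "CRITICAL: missing daemons:$MISSING"
  exit 2
fi

echo "OK: all monitored Hadoop daemons are running"
exit 0
```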
Confidential
Hadoop Administrator
Responsibilities:
- Involved in all phases of development activities from requirements collection to production support.
- Migrated data from different RDBMS systems and focused on migrating from the Cloudera distribution to Amazon EMR to reduce project cost.
- Worked with different data feeds such as JSON, CSV, and XML, and implemented the data lake concept.
- Analyzed the current system to identify the different sources of data.
- Assisted in acquiring, managing, and analyzing customer data using SQL and R.
- Supported predictive analytics to monitor inventory levels and ensure product availability.
- Performed Batch processing of logs from various data sources using MapReduce.
- Gained exposure to Amazon Web Services (AWS) cloud computing, including the EMR, EC2, and S3 services.
- Created a load balancer on AWS EC2 for an unstable cluster.
- Maintained data import scripts using Hive and MapReduce jobs.
- Worked on Hive data warehouse modeling to interface with BI tools including Jaspersoft, QlikView, and Tableau.
- Administered Hive permissions and user access with Kerberos authentication.
- Developed and maintained several batch jobs to run automatically based on business requirements.
- Imported and exported data between environments such as MySQL and HDFS and deployed to production (a minimal example follows the Environment line below).
- Connected Tableau from the client side to AWS IP addresses to view the end results.
- Developed dashboards, reports, ad hoc views, and domains in Jaspersoft Server and Tableau for business stakeholders.
- Led project teams on big data analytics using the Hadoop framework to analyze massive volumes of unstructured biological data and build predictive tools that enabled data scientists to make real-time decisions.
Environment: EMR, Hive, Pig, Datameer, HDFS, Solr, Quartz, Java MapReduce, Maven, Core Java, Git, Jenkins, UNIX, R, MySQL, Eclipse, Oozie, Sqoop, Flume, Jaspersoft, Tableau, QlikView, Cloudera, EC2, S3, Amazon Data Pipeline.
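A minimal sketch of the export direction of the MySQL/HDFS data movement referenced above; the connection string, credentials, table, and directory are illustrative placeholders.

```sh
# Hypothetical Sqoop export: push processed results from HDFS back into MySQL.
# Connection string, credentials, table, and directory are placeholders.
sqoop export \
  --connect jdbc:mysql://dbhost.example.com:3306/reporting \
  --username etl_user -P \
  --table daily_summary \
  --export-dir /user/etl/daily_summary \
  --input-fields-terminated-by '\t' \
  --num-mappers 2
```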
Confidential
Oracle PL/SQL Developer
Responsibilities:
- Provided application support, which involved installing, customizing, writing reports for, and maintaining PowerApp-ERP at Confidential Limited across all of their units in a multi-org setup. Implementation was carried out in four phases across their four units: IPL SEMBIUM, IPL MM Nagar, IP Pins and Liners, and the IPL Ring Blank Unit. Identified bugs and fixed them when they were at our level, or escalated them to the appropriate levels and coordinated the fix.
- Gathered the business requirements of the application from the client.
- Documented and analyzed the business rules to eliminate redundant and inefficient processes and practices. Reviewed the functional and technical design documents prepared by the offshore team.
- Prepared the high-level design document, which gives a brief description of how the requirements are to be implemented.
- Prepared the low-level design document, which contains a detailed description of the logic for how each requirement is to be implemented.
- Performed database administration throughout the testing phase of the project.
- Involved in coding enhancements, review of code changes, test plans, and other deliverables.
- Involved in setting up, installing, and maintaining the database used by the offshore development team.
- Involved in server monitoring and taking import/export database backups (weekly/monthly), as sketched after the Environment line below.
- Handled errors using system-defined exceptions such as INVALID_NUMBER and NO_DATA_FOUND, and user-defined exceptions associated with PRAGMA EXCEPTION_INIT.
Environment: Oracle 9i, Toad 9.2, PL/SQL, SQL Developer, SQL*Plus, PuTTY, SharePoint, Windows 7, VB 6.0, Classic ASP
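The weekly/monthly export backups mentioned above might be scripted roughly as follows using Oracle's classic exp utility (appropriate for the Oracle 9i environment listed); the schema name, credentials, and backup directory are assumptions.

```sh
#!/bin/bash
# Hypothetical weekly export backup of an application schema using Oracle's
# classic exp utility. Schema, credentials, and paths are placeholders.
BACKUP_DIR=/backup/oracle/weekly
STAMP=$(date +%Y%m%d)
mkdir -p "$BACKUP_DIR"

exp app_owner/app_password@ORCL \
    owner=app_owner \
    file="$BACKUP_DIR/app_owner_$STAMP.dmp" \
    log="$BACKUP_DIR/app_owner_$STAMP.log" \
    statistics=none

# Compress the dump file to save space
gzip "$BACKUP_DIR/app_owner_$STAMP.dmp"
```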
Confidential
Oracle Forms, PL/SQL Developer
Responsibilities:
- Actively participated in Requirement gathering for the project.
- Prepared LLD documents and unit test cases.
- Created database procedures, functions, and triggers.
- Customized forms and reports per the business requirements.
- Troubleshot and debugged complex issues.
- Developed loader control scripts for different interfaces using SQL*Loader (a minimal example follows the Environment line below).
- Fixed issues that occurred in month-end and year-end accounts runs.
- Prepared technical documents from the functional specifications.
- Optimized queries to improve the performance of the application.
Environment: Oracle 9i, PL/SQL, Developer 2000, SQL*Loader, Windows, PVCS
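A minimal sketch of a SQL*Loader control script of the kind referenced above; the staging table, columns, credentials, and file paths are illustrative assumptions.

```sh
#!/bin/bash
# Hypothetical SQL*Loader run: load a delimited interface file into a staging table.
# Table, columns, credentials, and paths are placeholders.
cat > /tmp/load_invoices.ctl <<'EOF'
LOAD DATA
INFILE '/data/interfaces/invoices.csv'
APPEND INTO TABLE stg_invoices
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
(invoice_no, invoice_date DATE "YYYY-MM-DD", amount)
EOF

sqlldr app_owner/app_password@ORCL \
       control=/tmp/load_invoices.ctl \
       log=/tmp/load_invoices.log \
       bad=/tmp/load_invoices.bad
```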