Big Data/Hadoop Administrator Resume
Lincoln, RI
PROFESSIONAL SUMMARY:
- Over 7 years of professional IT experience in requirement gathering, design, development, testing, implementation, and maintenance, with progressive experience in all phases of the iterative Software Development Life Cycle (SDLC)
- Excellent understanding of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and the MapReduce programming paradigm.
- Experience building and maintaining multiple Hadoop clusters (production, development, etc.) of different sizes, including configuring and setting up rack topology for large clusters
- Hands-on experience installing, configuring, and using Hadoop ecosystem components such as Apache Hadoop MapReduce, HDFS, HBase, Hive, Sqoop, Pig, ZooKeeper, and Flume on Cloudera (CDH3, CDH4) and YARN distributions, with Nagios and Ganglia for monitoring.
- Experience optimizing MapReduce jobs using combiners and partitioners to deliver the best results
- Experience in Hadoop cluster capacity planning, performance tuning, monitoring, and troubleshooting.
- In-depth understanding of data structures and algorithms.
- Experience benchmarking Hadoop/HBase cluster file systems with various batch jobs and workloads
- Extensively used ETL methodology to support data extraction, transformation, and loading using Informatica.
- Excellent understanding and knowledge of NoSQL databases such as MongoDB, HBase, and Cassandra.
- Experience in importing and exporting data using Sqoop from HDFS to Relational Database Systems and vice-versa.
- Good experience designing, configuring, and managing backup and disaster recovery for Hadoop data.
- Hands-on experience analyzing log files for Hadoop and ecosystem services and finding root causes.
- Experience commissioning, decommissioning, balancing, and managing nodes, and tuning servers for optimal cluster performance (a representative command sequence appears at the end of this summary).
- Experience preparing Hadoop clusters for development teams working on POCs.
- Experience with minor and major upgrades of Hadoop and the Hadoop ecosystem.
- Expert knowledge of key data structures and algorithms (indexing, hash tables, joins, aggregations)
- Experience monitoring and troubleshooting issues with Linux memory, CPU, OS, storage and network.
- Experience in Object-Oriented Analysis and Design (OOAD) and software development using UML methodology; good knowledge of J2EE and Core Java design patterns.
- Experience administering Linux (Ubuntu): installation, configuration, troubleshooting, security, backup, performance monitoring, and fine-tuning.
- Scripting to deploy monitors and checks and to automate critical system administration functions.
- Hands on experience in application development using Java, RDBMS, and Linux shell scripting.
- Ability to adapt to evolving technology, strong sense of responsibility and accomplishment.
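By way of illustration, a typical node decommission-and-rebalance sequence on a CDH-style cluster is sketched below; the hostname, exclude-file path, and balancer threshold are placeholders, not details from any specific engagement.

```sh
# Decommission a DataNode and rebalance the cluster (illustrative values).
# 1. Add the retiring host to the exclude file referenced by dfs.hosts.exclude.
echo "datanode07.example.com" >> /etc/hadoop/conf/dfs.exclude

# 2. Ask the NameNode to re-read its include/exclude lists.
hadoop dfsadmin -refreshNodes

# 3. Once the node's blocks are re-replicated, rebalance the remaining DataNodes.
#    The threshold is the tolerated disk-usage deviation between nodes, in percent.
hadoop balancer -threshold 10
```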
TECHNICAL SKILLS:
Big Data Ecosystem: HDFS, HBase, Impala, Hadoop MapReduce, Zookeeper, Hive, Pig, Sqoop, Flume, Oozie, Cassandra, Datameer, Pentaho
Programming Languages and Scripting: SQL, PL/SQL, C, C++, PHP, Python, Core Java, JavaScript, Shell Script, Perl script
Data Bases: Oracle 9i/10g/11g (SQL & PL/SQL), Sybase ASE 12.5, DB2, MS SQL Server, MySQL
Web Technologies: HTML, XML, AJAX, SOAP, ODBC, JDBC, Java Beans, EJB, MVC, JSP, Servlets, Java Mail, Struts, Junit
Frameworks: MVC, Spring, Struts, Hibernate, .NET
Data warehousing and NoSQL Databases: Netezza, Hbase
Methodologies: Agile, V-model.
Configuration Management Tools: TFS, CVS
IDE / Testing Tools: Eclipse.
Operating System: Windows, UNIX, Linux
Software Products: Putty, Eclipse, Toad 9.1, DB Visualizer, Comptel's AMD 6.0.3 & 4.0.3, InterConnecT v7.1 & 6.0.7, MS Project 2003, HP Quality Center, MS Management studio, MS SharePoint
PROFESSIONAL EXPERIENCE:
Big Data/Hadoop Administrator
Confidential, Lincoln, RI
Roles & Responsibilities:
- Installation and configuration of Apache Hadoop clusters in pseudo-distributed and fully distributed modes
- Configure and modify Hadoop XML parameters according to hardware/storage requirements
- Working with the development team on various updates and implementing changes to the systems and the MapReduce framework
- Manage databases ranging from 40 TB to 100 TB on the Hadoop cluster
- Importing data from RDBMS to HDFS using Sqoop as and when required (a representative command appears after this list).
- Tuning XML parameters and monitoring performance using CDH
- Extensively working on Oracle 11g, Derby, MySQL, and PostgreSQL databases
- Developed a free-text search solution with Hadoop and Solr.
- Scheduling the jobs using Cloudera Manager
- Provide recommendations for Big Data design.
- Experience running Oozie workflows and Hadoop jobs, and migrating applications from development/test environments to production, including customization and network protocols
- Set up and maintain documentation, policies, and standards.
- Mentor junior members of the Big Data team.
- Constantly learning various Big Data tools and providing strategic direction per development requirements
- Skills used: a wide range of services around Big Data, NoSQL, and Hadoop, including the following: Hadoop installation, performance tuning, and scalability
- Data services using Flume, HBase, HCatalog, Hive, Mahout, Sqoop, and Avro (data serialization)
- Operational services using Cloudera Manager, Oozie, and Ambari; MapReduce v2.0 (YARN) development
- NoSQL solutions: MongoDB, Cassandra, HBase
- RDBMS solutions: Oracle 12c/11g, OEM, MySQL
- Storage solutions: EMC, SAN, NAS, EC2
- Clustering solutions: Oracle RAC, JBOD
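A representative Sqoop import of the kind described above might look like the following sketch; the JDBC connection string, table name, and target directory are placeholders.

```sh
# Import one RDBMS table into HDFS with Sqoop (illustrative values).
sqoop import \
  --connect jdbc:oracle:thin:@dbhost:1521:ORCL \
  --username etl_user -P \
  --table CUSTOMER_ORDERS \
  --target-dir /data/raw/customer_orders \
  --num-mappers 4
```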
Environment: Sun Solaris 9/10, RedHat Linux AS 4, Oracle Linux/UNIX, Hive, Pig, HBase, ZooKeeper, Sqoop, Java, JDBC, JNDI, Struts, Maven, Trac, Subversion, JUnit, SQL, Spring, Hibernate, Oracle, XML, PuTTY, and Eclipse.
Systems Admin
Confidential, San Jose, CA
Roles & Responsibilities:
- Involved in review of functional and non-functional requirements.
- Provided technical expertise across multiple environments (Development, Integration, QA, UAT, Production, and DR)
- Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
- Importing and exporting data into HDFS and Hive using Sqoop.
- Experience in defining job flows.
- Worked with IT managers and application developers to resolve critical database issues
- Technical mentoring of the team (SME) and cross-platform training
- Experience in managing and reviewing Hadoop log files.
- Extracted files from CouchDB through Sqoop, placed them in HDFS, and processed them.
- Experience running Hadoop streaming jobs to process terabytes of XML-format data.
- Gained good experience with NoSQL databases.
- Supported MapReduce programs running on the cluster.
- Involved in loading data from UNIX file system to HDFS.
- Involved in setting up MySQL Nagios alarms and creating usage reports written in Python.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Replaced the default Derby metastore for Hive with MySQL (a representative configuration fragment appears after this list).
- Executed queries using Hive and developed Map-Reduce jobs to analyze data.
- Developed Pig Latin scripts to extract data from web server output files and load it into HDFS.
- Developed Pig UDFs to preprocess the data for analysis.
- Developed Hive queries for the analysts.
- Involved in loading data from LINUX and UNIX file system to HDFS.
- Developed a custom FileSystem plug-in for Hadoop so it can access files on the Data Platform.
- Designed and implemented a MapReduce-based large-scale parallel relation-learning system
- Setup and benchmarked Hadoop/HBase clusters for internal use
- Set up a Hadoop cluster on Amazon EC2 using Whirr for a POC.
- Projects include using a RedHat Satellite server.
- Wrote a recommendation engine using Mahout.
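A minimal sketch of the hive-site.xml fragment involved in swapping the embedded Derby metastore for MySQL is shown below; the database host, schema name, and credentials are placeholders, and the <property> elements belong inside the existing <configuration> block.

```sh
# Print the hive-site.xml fragment that points the Hive metastore at MySQL
# (illustrative host, schema, and credentials).
cat <<'EOF'
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://metastore-db:3306/hive_metastore?createDatabaseIfNotExist=true</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hive</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>REPLACE_ME</value>
</property>
EOF
```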
Environment: Java (JDK 1.6), Eclipse, Oracle 11g/10g, Subversion, Hadoop (Hortonworks and Cloudera distributions), Hive, HBase, MapReduce, HDFS, Pig, Cassandra, DataStax, IBM DataStage 8.1, PL/SQL, SQL*Plus, Toad 9.6, Windows NT, Linux, UNIX shell scripting
Hadoop Administrator
Confidential, Dallas, TX
Roles & Responsibilities:
- Installed and configured MapReduce, Hive, and HDFS; implemented a CDH3 Hadoop cluster on CentOS. Assisted with performance tuning and monitoring.
- Created HBase tables to load large sets of structured, semi-structured, and unstructured data coming from UNIX, NoSQL, and a variety of portfolios (a representative table definition appears after this list).
- Configure and modify Hadoop XML parameters according to hardware/storage requirements
- Manage databases ranging from 40 TB to 100 TB on the Hadoop cluster
- Tuning XML parameters and monitoring performance using CDH
- Developed workflows using custom MapReduce, Pig, Hive, and Sqoop
- Supported code/design analysis, strategy development and project planning.
- Created reports for the BI team, using Sqoop to move data into HDFS and Hive.
- Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
- Assisted with data capacity planning and node forecasting.
- Collaborated with the infrastructure, network, database, application and BI teams to ensure data quality and availability.
- Administered Pig, Hive, and HBase, installing updates, patches, and upgrades.
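A representative HBase table definition for the kind of mixed structured/unstructured feed described above is sketched below; the table and column-family names are placeholders.

```sh
# Create an HBase table with two column families via the HBase shell
# (illustrative table and column-family names).
echo "create 'portfolio_events', {NAME => 'raw', VERSIONS => 1}, {NAME => 'meta'}" | hbase shell
```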
Environment: Hive, Pig, HBase, ZooKeeper, Sqoop, Java, JDBC, JNDI, Struts, Maven, Trac, Subversion, JUnit, SQL, Spring, Hibernate, Oracle, XML, PuTTY, and Eclipse.
Systems Admin
Confidential, St. Louis, MO
Roles & Responsibilities:
- Installation and configuration of Sun Solaris 8, 9, and 10 and Linux AS 3.0 and 4.0
- Installation and configuration of Zones and Containers in Solaris 10 environment
- Performing day-to-day system administrator activities, troubleshooting, and storage assignments
- Installing and monitoring VERITAS Volume Manager (VxVM) and Solaris Volume Manager in the Sun Solaris environment.
- File system tuning, growing, and shrinking with Veritas File System (VxFS) 3.5/4.x and VxVM
- Upgrading Solaris 8/9 to Solaris 10 using Live Upgrade, manual upgrade, and JumpStart servers
- Experience in using VERITAS Cluster Server 3.x & 4.x and Sun Cluster 2.5 & 3.1 in SAN environment.
- Creating metadevice databases (metadb), soft partitions, and RAID levels using Sun Solaris Volume Manager
- Creating and monitoring pools, RAID levels, snapshots and clones using ZFS
- Regular disk management: adding/replacing hot-swappable drives on existing servers, partitioning according to requirements, creating new file systems or growing existing ones, managing file systems, and adding virtual swap space
- User management: handling day-to-day user management tasks and issues related to user access, and granting root permission to local users (sudoers) on demand
- Maintaining Patches and Packages to keep the servers up to date with latest OS versions
- Developing shell scripts to validate backups from tape.
- Strong knowledge of shell scripts, Perl scripts, and the ESP job scheduler.
- Automation of jobs through crontab and AutoSys (a representative crontab entry appears after this list).
- Installing and configuring SQL Server for Windows and Linux connectivity.
- Worked with the database team to improve database performance.
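A representative crontab entry for the backup-validation automation described above is sketched below; the script path, schedule, and log location are placeholders (verify_tape_backup.sh is a hypothetical script name).

```sh
# Append a nightly backup-validation job to the current user's crontab
# (illustrative script path, schedule, and log file).
( crontab -l 2>/dev/null; \
  echo "30 1 * * * /opt/admin/bin/verify_tape_backup.sh >> /var/log/backup_verify.log 2>&1" ) | crontab -
```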
Environment: Sun Solaris 10/9, Linux 5/4, AIX, Sun servers, Sun Fire, JumpStart, Veritas Volume Manager (VxVM), LDAP, shell scripting, EMC SAN storage, Veritas Cluster Server (VCS), and Apache.
Jr. Systems Admin
Confidential, Newtown, CT
Roles & Responsibilities:
- Responsible for providing the desktop system administration and support to the network.
- Interact with the clients to resolve the queries, issues and problems.
- Responsible for providing help, support and assistance in initial installation of the system, setup and maintenance of the user account, data recovery, etc.
- Responsibilities include analysis, installation, maintenance and modification of storage area networks and computing system.
- Determine and ensure the compatibility of the system's hardware and software.
- Responsible for the evaluation and recommendation of the new hardware and software.
- Assist team in resolution of hardware, software and system issues.
- Analyze system performance and ensure performance objectives and availability requirements are met
Environment: Sun Solaris, Linux, AIX, Sun servers, Sun Fire.