Sr. Big Data (Hadoop) Administrator Resume
Austin, TX
EXPERIENCE SUMMARY:
- 8+ years of comprehensive experience: 4+ years in Hadoop and Scala development and administration, and 4+ years in Java/J2EE enterprise application design, development and maintenance.
- Extensive experience implementing Big Data solutions using various distributions of Hadoop and its ecosystem tools.
- Hands-on experience in installing, configuring and monitoring HDFS clusters (on-premise and cloud).
- In-depth understanding of MapReduce concepts and their critical role in analyzing huge, complex datasets.
- Expertise in developing MapReduce programs to scrub, sort, filter, join and query data.
- Implemented innovative solutions using Hadoop ecosystem tools such as Pig, Hive, Impala, Sqoop, Flume, Kafka, Oozie, HBase, ZooKeeper and Cassandra.
- Experience developing Pig Latin and HiveQL scripts for data analysis and ETL, and extending their default functionality by writing User Defined Functions (UDFs) for data-specific processing (a minimal sketch appears after this summary).
- Experience migrating data between RDBMS/unstructured sources and HDFS using Sqoop and Flume.
- Hands-on experience developing workflows that execute MapReduce, Sqoop, Flume, Hive and Pig scripts using Oozie.
- Managed Hadoop clusters, including monitoring, alerts and notifications.
- Deployed upgrades, updates and patches.
- Provided 24x7 tier-3 troubleshooting and break-fix support for production services.
- Well-versed in database development: SQL data types, indexing, joins, views, transactions, large objects and performance tuning.
- Good knowledge of Data warehousing concepts and ETL processes.
- Experience writing Shell scripts in Linux OS and integrating them with other solutions.
- In-depth experience developing enterprise solutions using Java, J2EE, Servlets, JSP, JDBC, Struts, Spring, Hibernate, JavaBeans, JSF and MVC.
- Fluent in core Java concepts such as I/O, multi-threading, exceptions, RegEx, collections, data structures and serialization.
- Excellent problem-solving, analytical, communication, presentation and interpersonal skills that make me a core member of any team.
- Experience mentoring and working with offshore and distributed teams.
- Worked as the coordination point for all engineering support activities.
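As referenced in the summary above, the following is a minimal sketch of the kind of Pig UDF described, written in Java. The class name NormalizeText, the normalization rule, and the field it would be applied to are illustrative assumptions, not taken from a specific project:

    import java.io.IOException;
    import org.apache.pig.EvalFunc;
    import org.apache.pig.data.Tuple;

    // Hypothetical UDF: normalize a free-text field before analysis.
    public class NormalizeText extends EvalFunc<String> {
        @Override
        public String exec(Tuple input) throws IOException {
            if (input == null || input.size() == 0 || input.get(0) == null) {
                return null;
            }
            // Lower-case and collapse whitespace so joins and group-bys match cleanly.
            return input.get(0).toString().trim().toLowerCase().replaceAll("\\s+", " ");
        }
    }

Such a UDF is packaged into a jar, made available to a script with REGISTER, and then applied inside a FOREACH ... GENERATE.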
TECHNICAL SKILLS:
Hadoop Ecosystem: HDFS, MapReduce, Sqoop, Flume, Hive, Pig, HBase, Impala, HUE, ZooKeeper, Oozie, Cloudera Manager, Ambari, Cassandra
Hadoop Distributions: Apache Hadoop, CDH3, CDH4, Hortonworks.
Languages/Technologies: Java, C, C++, Python, PHP, Scala, J2EE, JSP, Servlets, HTML, XHTML, CSS3, JavaScript, jQuery, AJAX
Scripting Languages: JavaScript
Development Tools: Eclipse IDE, MS Visual Studio 2010, Amazon Web Services, OpenStack
Version Control Tools: Git
Operating Systems: Linux, UNIX, Windows 2008
RDBMS: Oracle 10g/11g, MySQL, PostgreSQL
Application/Web Servers: Apache, Tomcat, MS IIS, Splunk
Oracle utilities: EXP, IMP, EXPDP, AWR, ADDM
WORK EXPERIENCE:
Confidential, Austin, TX
Sr. Big Data (Hadoop) Administrator
Responsibilities:
- Led a team of 3 developers that built a scalable distributed data solution using Hadoop on a 30-node cluster to run analysis on 25+ terabytes of customer usage data.
- Developed several new MapReduce programs to analyze and transform the data to uncover insights into the customer usage patterns.
- Altered existing Scala programs to enhance performance and obtain partitioned results.
- Used MapReduce to index large volumes of data for fast access to specific records.
- Performed ETL using Pig, Hive and MapReduce to transform transactional data to de-normalized form.
- Configured periodic incremental imports of data from DB2 into HDFS using Sqoop.
- Worked extensively on importing metadata into Hive using Scala and migrated existing tables and applications to Hive.
- Wrote Pig and Hive UDFs to analyze complex data and find specific user behavior (a minimal Hive UDF sketch follows this list).
- Used Oozie workflow engine to schedule multiple recurring and ad-hoc Hive and Pig jobs.
- Responsible for maintaining and implementing code versioning for the entire project using Git.
- Created HBase tables to store various data formats coming from different portfolios.
- Utilized cluster coordination services through ZooKeeper.
- Assisted the team responsible for cluster maintenance, adding and removing cluster nodes, cluster monitoring and troubleshooting, managing and reviewing data backups and Hadoop log files.
- Worked with teams in various locations nationwide and internationally to understand and accumulate data from different sources.
- Worked with the testing teams to fix bugs and ensure smooth and error-free code.
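As referenced above, a minimal sketch of an old-style (CDH-era) Hive UDF in Java. The class name UsageBand and the banding thresholds are illustrative assumptions, not the project's actual logic:

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Hypothetical UDF: bucket raw usage minutes into coarse bands for reporting.
    public class UsageBand extends UDF {
        public Text evaluate(Long minutes) {
            if (minutes == null) return null;
            if (minutes < 60)  return new Text("LIGHT");
            if (minutes < 600) return new Text("MEDIUM");
            return new Text("HEAVY");
        }
    }

After ADD JAR and CREATE TEMPORARY FUNCTION, such a function can be called from HiveQL like any built-in.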
Environment: Hadoop, MapReduce, HDFS, Hive, Java, SQL, Cloudera Manager, Pig, Sqoop, Oozie, HBase, ZooKeeper, PL/SQL, MySQL, DB2.
Confidential, South Brunswick, NJ
Big Data (Hadoop) Engineer
Responsibilities:
- Responsible for developing efficient MapReduce programs for more than 20 years’ worth of claim data to detect and separate fraudulent claims.
- Uploaded and processed more than 30 terabytes of data from various structured and unstructured sources into HDFS using Sqoop and Flume.
- Played a key role in setting up a 40-node Apache Hadoop cluster, working closely with the Hadoop administration team.
- Played a key role in providing L3 engineering support for Hadoop and Datameer to the L1/L2 operational support organizations.
- Worked with the advanced analytics team to design fraud detection algorithms, then developed MapReduce programs to run them efficiently on the huge datasets (a sketch of the job shape follows this list).
- Developed Scala programs to perform data scrubbing for unstructured data.
- Responsible for designing and managing the Sqoop jobs that uploaded the data from Oracle to HDFS and Hive.
- Helped troubleshoot Scala issues while working with MicroStrategy to produce illustrative reports and dashboards along with ad-hoc analysis.
- Diagnosed installation and configuration issues.
- Diagnosed cluster management issues.
- Diagnosed performance issues.
- Performed job scheduling, monitoring, debugging and troubleshooting.
- Played a key role in installing and configuring Hadoop ecosystem tools such as Solr, Kafka, Sqoop, Flume, Pig, HBase and Cassandra.
- Used Flume to collect log data containing error messages from across the cluster.
- Designed and maintained Oozie workflows to manage the flow of jobs in the cluster.
- Provided upper management with daily updates on project progress, including the classification levels achieved on the data.
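As referenced above, a sketch of the general shape such a MapReduce job takes, using the standard Hadoop (new) API. The actual fraud detection logic came from the analytics team; the key choice, field layout and class names here are illustrative assumptions only:

    import java.io.IOException;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;

    // Mapper: emit (claimKey, 1) per record; the composite key is hypothetical.
    public class ClaimKeyMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text claimKey = new Text();

        @Override
        protected void map(LongWritable offset, Text line, Context ctx)
                throws IOException, InterruptedException {
            String[] fields = line.toString().split("\t");
            if (fields.length < 3) return;             // skip malformed rows
            claimKey.set(fields[0] + "|" + fields[2]); // e.g. member ID | procedure code
            ctx.write(claimKey, ONE);
        }
    }

    // Reducer: keys seen more than once become candidates for manual review.
    class DuplicateClaimReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> counts, Context ctx)
                throws IOException, InterruptedException {
            int total = 0;
            for (IntWritable c : counts) total += c.get();
            if (total > 1) ctx.write(key, new IntWritable(total));
        }
    }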
Environment: Java, Hadoop, Hive, Pig, Sqoop, Flume, HBase, Oracle 10g
Confidential, Denver, CO
Hadoop Engineer/Hadoop Admin
Responsibilities:
- Responsible for architecting Hadoop clusters with CDH3.
- Involved in the installation of CDH3 and the upgrade from CDH3 to CDH4.
- Worked as a developer on the Big Data team with Hadoop and its ecosystem.
- Installed and configured Hadoop, MapReduce and HDFS.
- Used HiveQL to analyze the data and identify different correlations.
- Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
- Installed and configured Pig and wrote Pig Latin scripts.
- Wrote MapReduce jobs using Scala.
- Strong understanding of the REST architectural style and its application to high-performing web sites for global usage.
- Analyzed the health of the Hadoop clusters and coordinated with operations on necessary tuning or stabilization changes; applied Kerberos knowledge for Hadoop security.
- Developed and maintained HiveQL and Pig Latin scripts, Scala code and MapReduce jobs.
- Worked on the RDBMS system using PL/SQL to create packages, procedures, functions, triggers as per the business requirements.
- Involved in ETL, Data Integration and Migration.
- Worked on Talend to run ETL jobs on the data in HDFS.
- Used Kafka to load data from Oracle into HDFS on a regular basis.
- Developed scripts and batch jobs to schedule various Hadoop programs.
- Wrote Hive queries for data analysis to meet business requirements.
- Created Hive tables and worked on them using HiveQL.
- Imported and exported data between Oracle Database and HDFS using Sqoop.
- Experienced in defining job flows.
- Experience with the NoSQL database HBase (a brief client sketch follows this list).
- Worked on a hybrid implementation combining Hadoop with Oracle.
- Wrote and modified stored procedures to load and modify data according to business rule changes.
- Involved in creating Hive tables, loading data and writing Hive queries that run internally as MapReduce jobs.
- Diagnose and troubleshoot L3 engineering support requests and provide resolutions.
- Developed a custom file system plugin that allows Hadoop MapReduce programs, HBase, Pig and Hive to access files on the data platform directly.
- Extracted feeds from social media sites such as Facebook, Twitter using Python scripts.
- Organized and benchmarked Hadoop/HBase clusters for internal use.
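As referenced above, a brief sketch of the era-appropriate (HBase 0.90/0.94) Java client API. The table name, column family and values are illustrative assumptions:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.util.Bytes;

    public class HBaseClientSketch {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            HTable table = new HTable(conf, "events"); // hypothetical table name

            // Write one cell: rowkey -> family:qualifier = value
            Put put = new Put(Bytes.toBytes("row-001"));
            put.add(Bytes.toBytes("d"), Bytes.toBytes("status"), Bytes.toBytes("OK"));
            table.put(put);

            // Read it back
            Result result = table.get(new Get(Bytes.toBytes("row-001")));
            System.out.println(Bytes.toString(
                    result.getValue(Bytes.toBytes("d"), Bytes.toBytes("status"))));
            table.close();
        }
    }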
Environment: Hadoop, HDFS, HBase, Pig, Hive, MapReduce, Sqoop, Flume, ETL, REST, Java, Python, PL/SQL, Oracle 11g, Unix/Linux, CDH3, CDH4.
Confidential, Atlanta, GA
Java Developer
Responsibilities:
- Developed an end-to-end vertical slice for a JEE-based application using popular frameworks (Spring, Hibernate, JSF, Facelets, XHTML, Maven2 and AJAX), applying OO design concepts, JEE and GoF design patterns.
- Designed the logical and physical data model, generated DDL scripts, and wrote DML scripts for Oracle 9i database.
- Tuned SQL statements, Hibernate mappings and the WebSphere application server to improve performance and consequently meet the SLAs.
- Collected business requirements and wrote functional specifications and detailed design documents.
- Detected and fixed transactional issues caused by incorrect exception handling and concurrency issues caused by unsynchronized blocks of code.
- Employed MVC Struts framework for application design.
- Assisted in designing, building, and maintaining database to analyze life cycle of checking and debit transactions.
- Used WebSphere to develop JAX-RPC web services.
- Developed unit test cases and used JUnit for unit testing of the application (a sample test follows this list).
- Involved in the design team for designing the Java Process Flow architecture.
- Worked with QA, business and architecture teams to resolve various defects in time to meet deadlines.
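As referenced above, a sample JUnit 4 test in the style used on the project. FeeCalculator and its fee rule are hypothetical stand-ins, not the project's actual classes:

    import static org.junit.Assert.assertEquals;
    import org.junit.Test;

    // Hypothetical class under test: debit transactions under $50 carry no fee.
    class FeeCalculator {
        double feeFor(String type, double amount) {
            return ("DEBIT".equals(type) && amount < 50.0) ? 0.0 : 1.50;
        }
    }

    public class FeeCalculatorTest {
        @Test
        public void debitFeeIsWaivedUnderThreshold() {
            assertEquals(0.0, new FeeCalculator().feeFor("DEBIT", 25.00), 0.001);
        }

        @Test
        public void largeDebitCarriesStandardFee() {
            assertEquals(1.50, new FeeCalculator().feeFor("DEBIT", 75.00), 0.001);
        }
    }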
Environment: Spring, Hibernate, Struts MVC, AJAX, WebSphere, Maven2, Java, JavaScript, JUnit, XHTML, HTML, DB2, SQL, UML, Oracle, Eclipse, Windows.
Confidential, Chattanooga, TN
Java/J2EE Developer
Responsibilities:
- Played an effective role in the team by interacting with welfare business analysts/program specialists and transforming business requirements into system requirements.
- Developed analysis level documentation such as Use Case, Business Domain Model, Activity, Sequence and Class Diagrams.
- Handled design reviews and technical reviews with other project stakeholders.
- Implemented services using Core Java.
- Developed and deployed UI layer logics of sites using JSP.
- Used Spring MVC to implement business model logic (a minimal controller sketch follows this list).
- Worked with Struts MVC objects such as ActionServlet, controllers, validators, web application context, handler mappings and message resource bundles, and used JNDI lookups for J2EE components.
- Developed dynamic JSP pages with Struts.
- Employed built-in and custom Struts interceptors and validators.
- Developed the XML data objects to generate PDF documents and reports.
- Employed Hibernate, DAOs and JDBC for data retrieval and modification from the database.
- Handled web service messaging and interaction using SOAP.
- Developed JUnit test cases for unit, system and user test scenarios.
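As referenced above, a minimal sketch of the Spring MVC controller pattern used: the controller takes request input, populates the model and selects a JSP view. CaseController, the URL and the view name are illustrative assumptions:

    import org.springframework.stereotype.Controller;
    import org.springframework.ui.Model;
    import org.springframework.web.bind.annotation.RequestMapping;
    import org.springframework.web.bind.annotation.RequestParam;

    @Controller
    public class CaseController {

        @RequestMapping("/case/search")
        public String search(@RequestParam("id") String caseId, Model model) {
            // In the real application a service layer would be injected here;
            // inlined to keep the sketch self-contained.
            model.addAttribute("caseId", caseId);
            return "caseDetail"; // resolved by a view resolver to a JSP page
        }
    }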
Environment: Struts, Hibernate, Spring MVC, SOAP, WSDL, WebLogic, Java, JDBC, JavaScript, Servlets, JSP, JUnit, XML, UML, Eclipse, Windows.
Confidential
Junior Java Developer
Responsibilities:
- Involved in designing the project structure and system design, and in every phase of the project.
- Responsible for developing platform related logic and resource classes, controller classes to access the domain and service classes.
- Involved in Technical Discussions, Design, and Workflow.
- Participated in requirements gathering and analysis.
- Employed JAXB to unmarshal XML into Java objects (a small sketch follows this list).
- Developed unit test cases using the JUnit framework.
- Implemented the data access using Hibernate and wrote the domain classes to generate the Database Tables.
- Implemented view pages based on XML attributes using plain Java classes.
- Involved in integration of App Builder and UI modules with the platform.
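As referenced above, a small sketch of JAXB unmarshalling. The Customer element and its fields are hypothetical; the real schema was project-specific:

    import java.io.StringReader;
    import javax.xml.bind.JAXBContext;
    import javax.xml.bind.Unmarshaller;
    import javax.xml.bind.annotation.XmlElement;
    import javax.xml.bind.annotation.XmlRootElement;

    public class JaxbSketch {

        // Hypothetical bound class for a <customer> element.
        @XmlRootElement(name = "customer")
        public static class Customer {
            @XmlElement public String name;
            @XmlElement public String city;
        }

        public static void main(String[] args) throws Exception {
            String xml = "<customer><name>Jane</name><city>Austin</city></customer>";
            Unmarshaller u = JAXBContext.newInstance(Customer.class).createUnmarshaller();
            Customer c = (Customer) u.unmarshal(new StringReader(xml));
            System.out.println(c.name + " / " + c.city);
        }
    }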
Environment: Hibernate, Java, JAXB, JUnit, XML, UML, Oracle 11g, Eclipse, Windows XP.