Hadoop Developer / Hadoop Admin Resume
SF
SUMMARY
- 8+ years of professional experience in IT, including 3+ years in Big Data, Hadoop development, and ecosystem analytics across the Banking, Food & Beverage, Healthcare, Insurance and State Project sectors.
- Well versed in installing, configuring, supporting, and managing Hadoop clusters and their underlying Big Data infrastructure.
- Hands-on experience with AWS services including VPC, EC2, S3, Redshift, and CloudWatch.
- Hands-on experience with major Hadoop ecosystem components such as MapReduce, HDFS, Hive, Impala, Pig, HBase, Sqoop, Oozie, Flume, and Avro.
- Experience in managing and reviewing Hadoop Log files.
- Experience with advanced analytics techniques such as K-Means clustering and high-dimensional data visualization.
- Experience with the Oozie Workflow Engine, running workflow jobs with actions that execute Hadoop Map/Reduce and Pig jobs (see the submission sketch at the end of this summary).
- Experience in importing and exporting data between HDFS and relational database systems/mainframes using Sqoop.
- Experience in setting up automated monitoring and escalation infrastructure for Hadoop clusters using Cloudera Manager.
- Experience in installing, configuring, supporting, and managing Cloudera's Hadoop platform, including CDH3 and CDH4 clusters.
- Experience in developing and extending serialization frameworks like Avro.
- Good knowledge of Hadoop architecture, administration, the HDFS file system, and the Streaming API, along with data warehousing concepts.
- Working experience with Hadoop Clusters using Cloudera (CDH) distribution.
- Work experience with cloud infrastructure like Amazon Web Services (AWS).
- Experienced in integrating data from various sources such as RDBMS, spreadsheets, and text files using Java and shell scripting.
- Experience in Web Services using XML, HTML and SOAP.
- Excellent Java development skills using J2EE, J2SE, Servlets, JUnit, JSP, JDBC.
- Experience with middleware architectures built on Sun Java technologies such as J2EE, JSP, and Servlets, and with application servers such as WebSphere and WebLogic.
- Good knowledge in integration of various data sources like RDBMS, Spreadsheets, Text files and XML files.
- Basic Knowledge of UNIX and shell scripting.
- Working experience with testing tools such as JUnit.
- Familiarity working with popular frameworks like Struts, Hibernate and Spring MVC.
- Worked with software development models including the Waterfall model and the Agile methodology.
- Ability to blend technical expertise with strong conceptual, business, and analytical skills to deliver quality solutions, backed by results-oriented problem solving and leadership.
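As an illustration of the Oozie experience above, the following is a minimal sketch of submitting a workflow job through the Oozie Java client API; the Oozie URL, HDFS application path, and cluster properties are placeholders, and a workflow.xml with Map/Reduce and Pig actions is assumed to already be deployed to HDFS.

    import java.util.Properties;
    import org.apache.oozie.client.OozieClient;
    import org.apache.oozie.client.WorkflowJob;

    public class SubmitWorkflow {
        public static void main(String[] args) throws Exception {
            // placeholder Oozie server URL
            OozieClient client = new OozieClient("http://oozie-host:11000/oozie");

            Properties conf = client.createConfiguration();
            // placeholder HDFS path to a deployed workflow.xml containing Map/Reduce and Pig actions
            conf.setProperty(OozieClient.APP_PATH, "hdfs://namenode:8020/user/etl/workflows/log-pipeline");
            conf.setProperty("nameNode", "hdfs://namenode:8020");
            conf.setProperty("jobTracker", "jobtracker-host:8021");

            String jobId = client.run(conf);             // submit and start the workflow
            WorkflowJob job = client.getJobInfo(jobId);  // check workflow status
            System.out.println(jobId + " -> " + job.getStatus());
        }
    }

The same submission can also be done with the Oozie command-line client; the Java client is shown here only because the rest of the stack is Java-based.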
TECHNICAL SKILLS
Hadoop Ecosystem: HDFS, MapReduce, Hive, Impala, Pig, HBase, ZooKeeper, Sqoop, Oozie, and Flume.
Languages: Java, J2EE, Python, XML, Unix Shell scripting, HTML, C/C++, SQL.
Web Technologies: Servlets, JSP, JDBC, XML, JPA, JavaScript, JSF, Ajax, jQuery, Java Swing, Java Beans, Hibernate, Spring, JSON, EJB, JMS, HTML, CSS
Methodologies: Agile, UML, Design Patterns (Core Java and J2EE)
IDE: Eclipse, RSA, VMware, Apache
GUI: Visual Basic 5.0, Oracle, MS Office (Word, Excel, Outlook, PowerPoint, Access).
Databases: Oracle 11g/10g, DB2, MS SQL Server, MySQL, MS Access, Hadoop
Testing Tools: JUnit
NoSQL Databases: HBase
Application Servers: WebLogic, WebSphere, Apache Tomcat
Monitoring & Reporting Tools: Cloudera Manager, Ganglia, Custom Shell scripts
Data Analytics, Business Intelligence & Reporting: Zoomdata, X15, NGData Lily
PROFESSIONAL EXPERIENCE
Confidential, SF
Hadoop Developer / Hadoop Admin
Environment: Hadoop 1.x, Hadoop 2.x, HDFS, MapReduce, Hive, Impala, Pig, Sqoop, HBase, AWS (Amazon Web Services), Cloudera (CDH) distribution, Jira, JUnit, Lucene, Unix, SQL, Shell Scripting, Linux
Responsibilities:
- Developed Pig scripts to parse and analyze logs (application logs and backend message logs) generated in the IRD Mall project, capturing errors, warnings, and other events that could suspend or halt the automated cron jobs.
- Created tables in Hive to store the captured data and analyzed it using Hive Query Language (HQL); see the sketch after this list.
- Worked on a Proof of Concept (POC) on Cloudera Impala to compare Impala and Hive, and to look at how Impala's response time improves on Hive's for large batch processing.
- Carried out a Proof of Concept with Zoomdata: installed Zoomdata, configured it to connect to various data sources, and developed core analytics dashboards for the IRD Mall project. We were among the first teams to work with Zoomdata, billed as the fastest Big Data exploration, visualization, and analytics platform. I connected Zoomdata to data sources including Oracle DB, SQL Server, and Impala, and designed separate dashboards for each mall (Americana and Grove) to analyze the facets, metrics, and fields present in the database schemas.
- Worked with a team of system engineers to install NGData's Lily on our SF-Devs Hadoop cluster; as a prerequisite, upgraded the SF-Devs cluster from CDH 4.4.0 to CDH 5.1.2 in our lab environment.
- Managed the fully distributed Hadoop cluster as an additional responsibility; trained to take over Hadoop administrator duties, including cluster management, Hadoop ecosystem upgrades, Cloudera Manager upgrades, and installation of tools that use the Hadoop ecosystem.
- Daily responsibilities included developing Oracle and SQL scripts to generate reports on various ongoing projects and providing production support for the Mall project.
- Migrated data from the production database (Oracle) into the lab-environment Hadoop cluster using Sqoop.
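As a rough illustration of the Hive work above (not the project's actual schema), the sketch below creates a Hive table over the parsed log output and aggregates errors and warnings through the HiveServer2 JDBC driver; the connection URL, table name, and columns are placeholders.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class LogErrorSummary {
        public static void main(String[] args) throws Exception {
            // HiveServer2 JDBC driver (hive-jdbc on the classpath)
            Class.forName("org.apache.hive.jdbc.HiveDriver");
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:hive2://hive-host:10000/default", "hive", "");
                 Statement stmt = conn.createStatement()) {

                // table over the Pig-parsed log output (illustrative layout)
                stmt.execute("CREATE TABLE IF NOT EXISTS mall_log_events ("
                        + "event_time STRING, log_level STRING, component STRING, message STRING) "
                        + "ROW FORMAT DELIMITED FIELDS TERMINATED BY '\\t'");

                // HQL aggregation: count errors and warnings per component
                ResultSet rs = stmt.executeQuery(
                        "SELECT component, log_level, COUNT(*) AS cnt "
                        + "FROM mall_log_events WHERE log_level IN ('ERROR','WARN') "
                        + "GROUP BY component, log_level");
                while (rs.next()) {
                    System.out.printf("%s %s %d%n",
                            rs.getString(1), rs.getString(2), rs.getLong(3));
                }
            }
        }
    }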
Confidential, Medford, MA
Hadoop Developer
Environment: Hadoop 1.x, HDFS, MapReduce, Hive 0.10, Pig, Sqoop, Impala, HBase, Shell Scripting
Responsibilities:
- Proactively monitored systems and services; involved in the architecture design and implementation of the Hadoop deployment, configuration management, backup, and disaster recovery systems and procedures.
- Analyzed system failures, identified root causes, and recommended courses of action.
- Documented system processes and procedures for future reference.
- Worked with systems engineering team to plan and deploy new Hadoop environments and expand existing Hadoop clusters.
- Monitored workload, job performance and capacity planning using Cloudera Manager.
- Worked with application teams to install operating system and Hadoop updates, patches, and version upgrades as required.
- Installed and configured Flume, Hive, Pig, Sqoop and Oozie on the Hadoop cluster.
- Used Flume to collect, aggregate, and store web log data from different sources such as web servers and mobile and network devices, and pushed it to HDFS.
- Developed MapReduce programs to transform the log data into a structured form and derive user location, age group, and time spent (see the sketch after this list).
- Analyzed the web log data using HiveQL to extract unique visitors per day, page views, visit duration, and the most purchased product on the website.
- Exported the analyzed data to relational databases using Sqoop so our BI team could visualize it and generate reports.
- Responsible for managing data coming from different sources.
- Integrated Oozie with the rest of the Hadoop stack, supporting several types of Hadoop jobs out of the box (such as MapReduce, Pig, Hive, and Sqoop) as well as system-specific jobs (such as Java programs and shell scripts).
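A minimal sketch of the kind of MapReduce log analysis described above; the class, the tab-delimited field layout (timestamp, userId, url, ...), and the input/output paths are illustrative assumptions, not the actual job.

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class PageViewsPerUser {

        // Mapper: emit (userId, 1) for each tab-delimited web log line
        public static class LogMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
            private static final LongWritable ONE = new LongWritable(1);
            private final Text userId = new Text();

            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                String[] fields = value.toString().split("\t");
                if (fields.length > 2) {          // assumed layout: timestamp, userId, url, ...
                    userId.set(fields[1]);
                    context.write(userId, ONE);
                }
            }
        }

        // Reducer (also used as combiner): sum page views per user
        public static class SumReducer extends Reducer<Text, LongWritable, Text, LongWritable> {
            @Override
            protected void reduce(Text key, Iterable<LongWritable> values, Context context)
                    throws IOException, InterruptedException {
                long sum = 0;
                for (LongWritable v : values) {
                    sum += v.get();
                }
                context.write(key, new LongWritable(sum));
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "page views per user");
            job.setJarByClass(PageViewsPerUser.class);
            job.setMapperClass(LogMapper.class);
            job.setCombinerClass(SumReducer.class);
            job.setReducerClass(SumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(LongWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }

The driver would be submitted with hadoop jar, passing the HDFS input and output directories as the two arguments.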
Confidential, Reno, NV
Hadoop Developer
Environment: Hadoop Cluster, HDFS, Hive, Pig, Sqoop, Linux, Hadoop MapReduce, HBase, Shell Scripting.
Responsibilities:
- Responsible for building scalable distributed data solutions using Hadoop.
- Responsible for Cluster maintenance, adding and removing cluster nodes, Cluster Monitoring and Troubleshooting, Manage and review data backups and log files.
- Analyzed data using Hadoop components Hive and Pig.
- Worked hands on with ETL process.
- Responsible for running Hadoop streaming jobs to process terabytes of XML data.
- Loaded and transformed large sets of structured, semi-structured, and unstructured data using Hadoop/Big Data concepts.
- Involved in loading data from the UNIX file system into HDFS (see the sketch after this list).
- Responsible for creating Hive tables, loading data, and writing Hive queries.
- Handled importing data from various data sources, performed transformations using Hive and MapReduce, and loaded the data into HDFS.
- Extracted data from Teradata into HDFS using Sqoop.
- Exported the patterns analyzed back to Teradata using Sqoop.
- Installed the Oozie workflow engine to run multiple Hive and Pig jobs that trigger independently based on time and data availability.
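A small sketch of loading a local (UNIX) file into HDFS with the Hadoop FileSystem Java API, as referenced in the list above; the NameNode address and both paths are placeholders.

    import java.net.URI;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class LoadToHdfs {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // placeholder NameNode URI; a real cluster picks this up from core-site.xml
            try (FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:8020"), conf)) {
                Path local = new Path("/data/incoming/feed.xml");   // illustrative local file
                Path target = new Path("/user/etl/raw/feed.xml");   // illustrative HDFS target
                fs.copyFromLocalFile(false, true, local, target);   // keep source, overwrite target
            }
        }
    }

The same load can be done from the shell with hadoop fs -put.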
Confidential - Carson City, NV
J2EE Developer
Environment: Java/J2EE, Oracle 11g, SQL, JSP, Struts 1.2, Hibernate 3, WebLogic 10.0, HTML, AJAX, JavaScript, JDBC, XML, JMS, UML, JUnit, log4j, WebSphere, MyEclipse
Responsibilities:
- Utilized Agile Methodologies to manage full life-cycle development of the project.
- Implemented MVC design pattern using Struts Framework.
- Used Struts Form and Action classes to implement the routing logic and call different services (see the Action sketch after this list).
- Created Tiles definitions, struts-config files, validation files, and resource bundles for all modules using the Struts framework.
- Developed the web application using JSP custom tag libraries, Struts Action classes, and Action Forms.
- Designed Java Servlets and Objects using J2EE standards.
- Used JSP for the presentation layer; developed a high-performance object/relational persistence and query service for the entire application using Hibernate.
- Developed the XML Schema and Web services for the data maintenance and structures.
- Used Web Sphere Application Server to develop and deploy the application.
- Worked with style sheets, including Cascading Style Sheets (CSS), and coded JUnit test cases.
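A minimal Struts 1.2 Action sketch in the spirit of the work above; the form bean, field names, forward names, and the authentication check are hypothetical placeholders rather than the project's actual classes.

    // LoginForm.java -- hypothetical ActionForm bean (field names are placeholders)
    import org.apache.struts.action.ActionForm;

    public class LoginForm extends ActionForm {
        private String username;
        private String password;
        public String getUsername() { return username; }
        public void setUsername(String username) { this.username = username; }
        public String getPassword() { return password; }
        public void setPassword(String password) { this.password = password; }
    }

    // LoginAction.java -- routes the request and picks a forward defined in struts-config.xml
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;
    import org.apache.struts.action.Action;
    import org.apache.struts.action.ActionForm;
    import org.apache.struts.action.ActionForward;
    import org.apache.struts.action.ActionMapping;

    public class LoginAction extends Action {
        @Override
        public ActionForward execute(ActionMapping mapping, ActionForm form,
                                     HttpServletRequest request, HttpServletResponse response)
                throws Exception {
            LoginForm loginForm = (LoginForm) form;
            // hypothetical check; a real action would delegate to a business service
            boolean authenticated = "demo".equals(loginForm.getUsername());
            return mapping.findForward(authenticated ? "success" : "failure");
        }
    }

The action mapping, the form bean, and the success/failure forwards would be declared in struts-config.xml.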
Java/J2EE Developer
Environment: Spring MVC, Oracle 11g, J2EE, Java, JDBC, Servlets, JSP, XML, Design Patterns, CSS, HTML, JavaScript 1.2, JUnit, Apache Tomcat, MS SQL Server 2008.
Responsibilities:
- Involved in various phases of the Software Development Life Cycle (SDLC), such as design, development, and unit testing.
- Developed and deployed UI layer logic for the sites using JSP, XML, JavaScript, HTML/DHTML, and Ajax.
- CSS and JavaScript were used to build rich internet pages.
- Followed the Agile Scrum methodology for the development process.
- Created design specifications for front-end and back-end application development using design patterns.
- Developed prototype test screens in HTML and JavaScript.
- Developed JSPs for client data presentation and client-side data validation within the forms.
- Developed the application by using the Spring MVC framework.
- Used the Java Collections framework to transfer objects between the different layers of the application.
- Developed data mapping to create a communication bridge between various application interfaces using XML, and XSL.
- Used Spring IoC to inject values for dynamic parameters.
- Developed JUnit tests for unit-level testing.
- Actively involved in code review and bug fixing for improving the performance.
- Documented application for its functionality and its enhanced features.
- Created connections through JDBC and used JDBC callable statements to invoke stored procedures (see the sketch below).
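A minimal sketch of a JDBC stored-procedure call of the kind referenced above; the connection URL, credentials, procedure name, and parameters are hypothetical placeholders.

    import java.sql.CallableStatement;
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Types;

    public class StoredProcCall {
        public static void main(String[] args) throws Exception {
            // placeholder Oracle connection details
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:oracle:thin:@db-host:1521:ORCL", "app_user", "secret")) {
                // hypothetical procedure: GET_ACCOUNT_STATUS(IN account_id, OUT status)
                try (CallableStatement cs = conn.prepareCall("{call GET_ACCOUNT_STATUS(?, ?)}")) {
                    cs.setLong(1, 1001L);
                    cs.registerOutParameter(2, Types.VARCHAR);
                    cs.execute();
                    System.out.println("Status: " + cs.getString(2));
                }
            }
        }
    }

In the Spring MVC application, a call like this would typically sit behind a DAO wired in through Spring IoC.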