Hadoop Developer Resume
Scottsdale, AZ
SUMMARY
- 6 years of professional IT experience, including experience in Big Data ecosystem technologies.
- Excellent understanding of Hadoop architecture and its components, such as HDFS, Job Tracker, Task Tracker, Name Node, Data Node and the MapReduce programming paradigm.
- Hands-on experience in installing, configuring, and using Hadoop ecosystem components like Hadoop MapReduce, HDFS, HBase, Oozie, Hive, Sqoop, Pig, Zookeeper, Flume, Storm and Kafka.
- Good exposure to Apache Hadoop MapReduce programming, Pig scripting, distributed applications and HDFS.
- Experience in algorithm analysis using the R language and in implementing regression models on big data using Spark MLlib by converting scripts to Scala.
- Developed Oozie coordinator for scheduling and orchestrating the ETL process.
- Implemented ETL process with Pentaho and Talend.
- Experience in developing customized UDFs in Java to extend Hive and Pig Latin functionality.
- Experience in NoSQL column-oriented databases like HBase and their integration with Hive and Pig.
- Implemented Pig scripts, integrated them into Oozie workflows and performed integrated testing.
- Experience in managing and reviewing Hadoop log files.
- Experience in importing streaming logs and aggregating the data to HDFS through Flume.
- Experience in importing and exporting data using Sqoop from HDFS to Relational Database Systems and vice-versa.
- Excellent understanding of Virtualization, with experience of setting up a POC multi-node virtual cluster by leveraging underlying Bridge Networking and NAT technologies.
- Strong Java development skills using J2EE, J2SE, Servlets, JSP, EJB, JDBC.
- Experience in Extraction, Transformation and Loading (ETL) of data from multiple sources.
- Extensive use of Teradata, Oracle, SQL Server, Teradata SQL Assistant and BTEQ.
- Excellent experience in creating join indexes, partitioned indexes, and adding collect statistics for better query performance.
- Proficient in using Teradata Utilities (BTEQ, FastLoad, MultiLoad, FastExport, and TPump) for development.
- Extensive experience in developing applications using Java and multi-threading.
- Experience in designing and building web applications using Core Java and Java Enterprise Technologies- JSP, Servlets and JDBC.
- Good knowledge of Apache Spark and Scala.
- Experience in converting PL/SQL packages to Scala as part of a client requirement.
- Experienced in building projects using Ant and Maven.
- Detailed understanding of Software Development Life Cycle (SDLC) and sound knowledge of project implementation methodologies including Waterfall and Agile.
TECHNICAL SKILLS
Big Data Technologies: Hadoop, Spark, HDFS, Hive, MapReduce, Pig, Sqoop, Flume, Zookeeper, Crunch, Oozie, Hue, Spark MLlib, Kafka, Spark SQL, Spark streaming, Hadoop distribution of Cloudera CDH3, Hadoop distribution of Hortonworks HDP2, Crunch API, HCatalog, Tez and HBase
Scripting/ Web Languages: JavaScript, Perl, Python, HTML, XML, SQL, Shell Scripting
Programming Languages: C, C++, R, Java and Scala
Java/J2EE Technologies: Java, Java Beans, J2EE (JSP, Servlets, EJB), JDBC, SOLR, Pentaho.
Frameworks: Hibernate 2.x/3.x, Spring 2.x/3.x/4.x
Databases/ RDBMS: MySQL, PL/SQL, PostgreSQL, MS-SQL Server 2005/2008, Oracle 9i/10g/11g, HBase, Cassandra.
Statistical Programming: Programming in R, SAS, H2O
Operating Systems: Unix, Windows XP/7/NT/8/2003, MS DOS, Mac, Linux (SUSE, RHEL, Ubuntu)
Software Life Cycles: SDLC, Waterfall and Agile models
Office Tools: MS-OFFICE - Excel, Word, and PowerPoint
Utilities/ Tools: Eclipse, Tomcat, NetBeans, TOAD, JUnit, SQL, SVN, Log4j, Tiles, Developer, SQL*PLUS, Advanced REST client, ANT, Maven, Visio, Mule ESB and MRUnit, RStudio, Talend
Cloud Platforms: AWS EC2, VPC, Redshift, EMR, S3
PROFESSIONAL EXPERIENCE
Confidential, Scottsdale, AZ
Hadoop Developer
Environment: Cloudera, Hadoop, HDFS, Map Reduce, Hive, Pig, HBase, Linux Shell Scripting, Oracle MySQL, Java.
Responsibilities:
- Installed and configured Hadoop, developed multiple Map Reduce jobs in Java for data cleaning and processing.
- Installed and configured Pig for ETL jobs.
- Troubleshot the cluster by reviewing Hadoop log files.
- Imported data using Sqoop from Teradata using Teradata connector.
- Used Oozie to orchestrate the workflow.
- Created Hive tables and worked on them for data analysis to meet the business requirements.
- Good experience with NoSQL databases.
- Designed and implemented Map Reduce-based large-scale parallel relation-learning system.
- Installed and benchmarked Hadoop / HBase clusters for internal use.
- Wrote HBase client programs in Java and web services.
- Modeled, serialized, and manipulated data in multiple forms (XML).
- Supported post production enhancements.
- Experience with data modeling concepts: star schema, dimensional modeling and relational design (ER).
Confidential, Cary, NC
Hadoop Developer
Environment: Cloudera, Hadoop, MapReduce, Hive, HDFS, PIG, Sqoop, Oozie, Flume, HBase, ZooKeeper, Oracle, NoSQL, MySQL and Unix/Linux.
Responsibilities:
- Installed, configured and maintained Apache Hadoop clusters for application development and Hadoop tools like Hive, Pig, HBase, Zookeeper and Sqoop for POC.
- Created HDFS (Hadoop Distributed File System) and MapReduce jobs in Java.
- Implemented NameNode backup using NFS for High availability.
- Used Pig as an ETL tool to perform transformations, event joins and some aggregations before storing the data in HDFS.
- Developed a data pipeline using Flume, Sqoop and Pig to extract the data from weblogs and store it in HDFS.
- Used Oozie workflow engine to run multiple Hive and Pig Jobs.
- Used Sqoop to import and export data from HDFS to RDBMS and vice-versa.
- Created Hive tables and was involved in loading data and writing Hive UDFs.
- Exported the analyzed data to the relational database MySQL using Sqoop for visualization and report generation.
- Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
- Automated workflows using shell scripts to pull data from various databases into Hadoop.
Confidential, NJ
Hadoop Developer
Environment: Cloudera, Eclipse, Hadoop, HDFS, MapReduce, Yarn, Hive, Pig, HBase, Oozie, Sqoop, Flume, Core Java.
Responsibilities:
- Involved in installing and configuring the Hadoop ecosystem and Cloudera Manager using the CDH distribution.
- Involved in creating Hive tables, loading the data and writing Hive queries that run internally as MapReduce jobs.
- Involved in writing Map Reduce jobs.
- Responsible for developing a data pipeline using Flume, Sqoop and Pig to extract the data from weblogs and store it in HDFS.
- Installed and configured Hive and wrote Hive UDFs.
- Involved in emitting processed data from Hadoop to relational databases or external file systems using Sqoop, HDFS GET or CopyToLocal.
- Developed a data pipeline using Flume, Sqoop, Pig and Java MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis.
- Experienced in managing and reviewing Hadoop log files.
- Used Pig to perform transformations, event joins, bot traffic filtering and aggregations before storing the data in HDFS.
- Wrote Hive queries to meet the business requirements.
- Imported and exported data into HDFS and Hive using Sqoop.
- Worked on tuning the performance of Pig queries.
- Involved in developing Pig Scripts for data change capture and delta record processing between newly arrived data and already existing data in HDFS.
Confidential
Java Developer
Environment: JDK, J2EE, Eclipse IDE, ANT, JDBC, Servlets, JSP, EJB, Struts, XML and Oracle.
Responsibilities:
- Developed Stateless Session Beans in the model layer to implement business logic for the application.
- Developed Action Classes for workflow control and Data Access Object for getting database connections from connection pool.
- Extensively used the Jakarta Struts Framework.
- Implemented user session management using Http Sessions.
- Used JDBC to access Oracle Database and used Stored Procedures.
- Developed JSP pages and made them accessible to the client using WebLogic Application Server.
- Extensively used complex SQL statements including joins and nested queries.
- Developed Stored Procedures.
- Extensively used XPath to find information in XML documents by navigating through their elements and attributes.
- Coded JSP pages and used JavaScript for client side validations and to achieve other client-side functionality.
- Extensively worked on AJAX.
- Used ANT scripts for building the application.
- Developed Java Helper classes for updating Customer Accounts and Customer information.
- Adopted Sun's coding and documentation standards.
Confidential
Java Developer
Environment: Java, Eclipse, Oracle, HTML, JSP, Tomcat
Responsibilities:
- Involved in design, development, testing and integration of the application.
- Involved in development of user interface modules using HTML, CSS and JSP.
- Involved in writing SQL queries.
- Involved in coding, maintaining, and administering Servlets and JSP components deployed on Apache Tomcat application servers.
- Used JDBC for database access, including calls to stored procedures.
- Worked on bug fixing and enhancements on change requests.
- Coordinated tasks with clients, support groups and development team.
- Worked with the QA team on test automation using QTP.
- Participated in weekly design reviews and walkthroughs with project manager and development teams.
