Sr. Hadoop Developer Lead Resume

St. Louis, MO

SUMMARY

  • Over 10 years of experience in software development, building scalable, high-performance Big Data applications with specialization in the Apache Hadoop stack, NoSQL databases, distributed computing, and Java/J2EE technologies
  • Expertise working across all phases of the SDLC: requirements gathering, system design, development, enhancement, maintenance, testing, deployment, production support, and documentation
  • Expertise with Hadoop architecture and its components such as HDFS, YARN, JobTracker, TaskTracker, NameNode, DataNode, and the MapReduce programming paradigm
  • Good experience in using data processing tools such as MapReduce, Pig, and Hive for performing business transformations, data validations, and metadata-driven data parsing routines
  • Hands-on experience in installing, configuring, and using Hadoop ecosystem components like Pig, HBase, ZooKeeper, Oozie, Hive, Sqoop, and Flume
  • Working experience in creating complex data ingestion pipelines, data transformations, data management and data governance in a centralized enterprise data hub
  • Experience with cloud computing technologies such as Windows Azure HDInsight, AWS EMR, direct Hadoop on EC2 (non-EMR), and Cloudera Manager
  • Working experience with pushing data as delimited files into HDFS using Talend Big Data tool
  • Strong expertise in writing complex MapReduce jobs, Pig scripts, and Hive queries for data modeling
  • Good experience in HDFS design, daemons, federation and HDFS high availability (HA)
  • Familiarity with Hadoop Security projects like Apache Knox Gateway, Sentry, Ranger, and Project Rhino
  • Very good working knowledge of Apache Cassandra, MongoDB, and Flume
  • Experienced in Integrated Data Warehousing and MDM projects
  • Experience in working with BI team in transforming big data requirements into Hadoop centric design and solutions
  • Experience in performance tuning the Hadoop cluster by gathering information and analyzing the existing infrastructure
  • Good working experience using Apache Sqoop to import data into HDFS from RDBMS and vice-versa
  • Adept in creating real-time data streaming solutions using Apache Spark/Spark Streaming, Kafka, and Flume (see the first sketch following this summary)
  • Adept in extending Hive and Pig core functionality by writing custom UDFs (see the UDF sketch following this summary)
  • Good understanding of Data Mining and Machine Learning techniques
  • Developed various MapReduce applications to perform ETL workloads on terabytes of data
  • Strong work ethic with desire to succeed and make significant contributions to the organization
  • Experienced in Java Application Development, Client/Server Applications, Internet/Intranet based applications using Core Java, J2EE patterns, Web Services, Oracle, SQL Server, and DB2
  • Experience in building, deploying and integrating with Ant, Maven and Jenkins
  • Extensive work experience with different SDLC approaches such as Waterfall and Agile methodologies
  • Strong interpersonal and communication skills with an ability to grasp new things quickly
  • Ability to successfully work under tight deadlines
  • Experience in leading small teams with Onshore and Offshore model
  • Ability to identify and resolve problems quickly and independently
  • A team player with the ability to communicate effectively with people at all levels of the organization, including technical staff, management, and customers
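
As an illustration of the real-time streaming work noted above, the following is a minimal sketch of consuming a Kafka topic with Spark Streaming's Java API (the receiver-based 0.8 connector); the application name, topic, ZooKeeper quorum, and consumer group are hypothetical:

    import java.util.HashMap;
    import java.util.Map;

    import org.apache.spark.SparkConf;
    import org.apache.spark.streaming.Durations;
    import org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream;
    import org.apache.spark.streaming.api.java.JavaStreamingContext;
    import org.apache.spark.streaming.kafka.KafkaUtils;

    public final class ClickStreamJob {
        public static void main(String[] args) throws InterruptedException {
            SparkConf conf = new SparkConf().setAppName("ClickStreamJob");
            JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(10));

            // Topic to consume and the number of receiver threads for it
            // ("clickstream" is a hypothetical topic name).
            Map<String, Integer> topics = new HashMap<String, Integer>();
            topics.put("clickstream", 1);

            // Receiver-based Kafka stream; each record is a (key, message) pair.
            JavaPairReceiverInputDStream<String, String> stream =
                    KafkaUtils.createStream(jssc, "zk1:2181", "clickstream-group", topics);

            // Trivial transformation: count the records in each 10-second batch.
            stream.count().print();

            jssc.start();
            jssc.awaitTermination();
        }
    }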
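
Likewise, a minimal sketch of the kind of custom Hive UDF referenced above, written against the classic org.apache.hadoop.hive.ql.exec.UDF base class; the class name and masking logic are illustrative assumptions:

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Masks all but the last four characters of a string column.
    public final class MaskUdf extends UDF {
        public Text evaluate(Text input) {
            if (input == null) {
                return null;                  // preserve SQL NULL semantics
            }
            String s = input.toString();
            if (s.length() <= 4) {
                return input;                 // nothing to mask
            }
            StringBuilder masked = new StringBuilder();
            for (int i = 0; i < s.length() - 4; i++) {
                masked.append('*');
            }
            masked.append(s.substring(s.length() - 4));
            return new Text(masked.toString());
        }
    }

Once packaged into a JAR, a UDF like this is registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION before being used in queries.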

TECHNICAL SKILLS

Big Data Framework and Ecosystems: Hadoop, MapReduce, HBase, Hive, Pig, HDFS, Zookeeper, Sqoop, Cassandra, MongoDB, Apache Kafka, Oozie, Flume, Elasticsearch 2.x, MRUnit, Spark on Scala

J2EE Technologies: Servlets, JSP, JDBC, JUnit

Languages: Java, Ruby, C, SQL, PL/SQL

ETL Tools: Talend Open Studio, Pentaho Data Integration (PDI /Kettle)

Middleware: Hibernate 3.x

Web Technologies: CSS, HTML, XHTML, AJAX, XML, XSLT

Databases: Oracle 8i/9i/10g, MySQL, MS Access

IDE: Eclipse 3.x, 4.x, Eclipse RCP, NetBeans 6, STS 2.0, EditPlus, Notepad++

Design Methodologies: UML, Rational Rose

Version Control Tools: CVS, SVN

Operating Systems: Windows XP/Vista/7, Linux, UNIX, CentOS

Tools: Ant, Maven, Putty

PROFESSIONAL EXPERIENCE

Confidential, St. Louis, MO

Sr. Hadoop Developer Lead

Responsibilities:

  • Gathered business requirements from the business analysts and subject matter experts
  • Involved in installing Hadoop ecosystem components on a 50-node production environment
  • Installed, configured, and maintained Hortonworks Hadoop clusters for application development, along with Hadoop tools like YARN, Hive, Pig, HBase, Zookeeper, and Sqoop
  • Installed and configured Hadoop security and access controls using Kerberos, and Active Directory
  • Responsible for managing data coming from different sources into HDFS using Sqoop and Flume
  • Responsible for troubleshooting and monitoring Hadoop services using Cloudera Manager
  • Monitored and fine-tuned MapReduce programs running on the cluster
  • Involved in HDFS maintenance and loading of structured and unstructured data
  • Developed several MapReduce programs for data pre-processing (see the mapper sketch following this list)
  • Loaded data from MySQL to HDFS and vice-versa on regular basis using Sqoop Import and Export commands
  • Wrote Hive queries for data analysis to meet the business requirements
  • Designed and implemented jobs and transformations. Loaded the data sequentially and in parallel for initial and incremental loads
  • Implemented various Pentaho Data Integration steps in cleansing and loading the data as per the business needs
  • Configured the Pentaho Data Integration server to run the jobs in local, remote server, and cluster modes
  • Prepared System Design document with all functional implementations
  • Involved in data modeling sessions to develop models for Hive tables
  • Interpreted the existing enterprise data warehouse setup to understand the design, and provided design and architecture suggestions for converting to Hadoop using MapReduce, Hive, Sqoop, Flume, and Pig Latin
  • Converted existing ETL logic to Hadoop mappings
  • Used Hadoop file system (FS shell) commands extensively for file handling operations
  • Worked on SequenceFiles, map-side joins, bucketing, and partitioning for Hive performance enhancement and storage improvement
  • Worked on parsing XML files using MapReduce to extract related attributes and store them in HDFS
  • Performed unit testing using the MRUnit testing framework (see the test sketch following this list)
  • Involved in building TBuild scripts to import data from Teradata using the Teradata Parallel Transporter (TPT) API
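
A minimal sketch of the kind of map-only pre-processing job described in the list above; the delimiter, column count, and class name are assumptions made for illustration:

    import java.io.IOException;

    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // Map-only pre-processing step: trims the fields of a pipe-delimited
    // record and drops rows that do not have the expected column count.
    public class CleanseMapper extends Mapper<LongWritable, Text, NullWritable, Text> {

        private static final int EXPECTED_COLUMNS = 12;
        private final Text cleansed = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split("\\|", -1);
            if (fields.length != EXPECTED_COLUMNS) {
                context.getCounter("cleanse", "bad_records").increment(1);
                return;                       // skip malformed rows
            }
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < fields.length; i++) {
                if (i > 0) {
                    sb.append('|');
                }
                sb.append(fields[i].trim());
            }
            cleansed.set(sb.toString());
            context.write(NullWritable.get(), cleansed);
        }
    }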
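
And a corresponding MRUnit test in the style referenced above, exercising the hypothetical mapper from the previous sketch:

    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mrunit.mapreduce.MapDriver;
    import org.junit.Before;
    import org.junit.Test;

    // MRUnit test exercising the CleanseMapper from the previous sketch.
    public class CleanseMapperTest {

        private MapDriver<LongWritable, Text, NullWritable, Text> mapDriver;

        @Before
        public void setUp() {
            mapDriver = MapDriver.newMapDriver(new CleanseMapper());
        }

        @Test
        public void trimsFieldsOfWellFormedRecord() throws Exception {
            // Twelve columns, matching EXPECTED_COLUMNS in the mapper
            String raw = " a |b|c|d|e|f|g|h|i|j|k| l ";
            String expected = "a|b|c|d|e|f|g|h|i|j|k|l";
            mapDriver.withInput(new LongWritable(0), new Text(raw))
                     .withOutput(NullWritable.get(), new Text(expected))
                     .runTest();
        }
    }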

Environment: CDH 5, Hadoop, HDFS, MapReduce, Hive, Sqoop, Pig, XML, Cloudera Manager, Teradata and Pentaho (PDI / Kettle)

Confidential, Denver, CO

Hadoop Developer

Responsibilities:

  • Built a scalable distributed data solution using Hadoop to perform analysis on 25+ terabytes of customer usage data using Cloudera Distribution
  • Created Pig and Hive UDFs to analyze the complex data to find specific user behavior
  • Configured periodic incremental imports of data from DB2 into HDFS using Sqoop
  • Worked extensively on importing metadata into Hive using Sqoop and migrated existing tables and applications to run on Hive
  • Used Oozie workflow engine to schedule multiple recurring Hive and Pig jobs
  • Created HBase tables to store various formats of data coming from different portfolios (see the sketch following this list)
  • Created Hive tables to store the processed results in a tabular format
  • Utilized cluster coordination services through Zookeeper
  • Extensively used HiveQL, Pig Latin and Spark on Scala
  • Assisted in cluster maintenance, adding and removing cluster nodes, cluster monitoring and troubleshooting, managing and reviewing data backups and Hadoop log files
  • Generated various marketing reports using Tableau with Hadoop as the source for data
  • Created relationships, actions, data blending, filters, parameters, hierarchies, calculated fields, sorting, groupings, live connections, and in-memory extracts in Tableau
  • Created customized reports using various chart types such as text tables, bar, pie, and donut charts, funnel charts, heat maps, line charts, area charts, scatter plots, and dual combination charts in Tableau
  • Blended data from multiple databases into one report by selecting primary keys from each database for data validation
  • Created high-level dashboards and stories in Tableau for Business and Product owners
  • Applied a thorough understanding of ETL tools and how they can be leveraged in a Big Data environment
  • Performed unit testing using the MRUnit testing framework
  • Involved in troubleshooting, performance tuning of reports and resolving issues within Tableau Server and Reports
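
A minimal sketch of the HBase table creation described in the list above, using the classic (CDH4-era) admin API; the table and column family names are hypothetical:

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.client.HBaseAdmin;

    // Creates a table with a single column family for incoming portfolio data.
    public class PortfolioTableSetup {
        public static void main(String[] args) throws IOException {
            Configuration conf = HBaseConfiguration.create();
            HBaseAdmin admin = new HBaseAdmin(conf);
            try {
                HTableDescriptor table = new HTableDescriptor("portfolio_events");
                HColumnDescriptor family = new HColumnDescriptor("d");
                family.setMaxVersions(1);     // keep only the latest cell value
                table.addFamily(family);
                if (!admin.tableExists("portfolio_events")) {
                    admin.createTable(table);
                }
            } finally {
                admin.close();
            }
        }
    }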

Environment: CDH 4, Hadoop, HDFS, Hive, Java, SQL, Cloudera Manager, Pig, Sqoop, Oozie, Zookeeper, PL/SQL, Tableau 8.0.x, Scala, Spark, DB2, UNIX Shell, YARN

Confidential, St. Louis, MO

Java Developer

Responsibilities:

  • Worked with Business Analysts and helped represent the business domain details in technical specifications
  • Actively involved in setting coding standards and writing related documentation
  • Developed the Java Code using Eclipse as IDE
  • Developed JSPs and Servlets to dynamically generate HTML and display the data on the client side (see the servlet sketch following this list)
  • Developed applications on Struts MVC architecture utilizing Action Classes, Action Forms and validations
  • Used Tiles as an implementation of the Composite View pattern
  • Was responsible for implementing various J2EE Design Patterns like Service Locator, Business Delegate, Session Façade, and Factory Pattern
  • Generated a few dynamic tag libraries and implemented the MVC design pattern using Struts
  • Performed code review and debugging using Eclipse Debugger
  • Was responsible for developing and deploying EJBs (Session Beans and Message-Driven Beans)
  • Configured queues in WebLogic Server to which messages were published using the JMS API
  • Consumed web services (WSDL, SOAP, UDDI) from third parties for authorizing payments to/from customers
  • Performed unit testing using JUnit testing framework and used Log4j to monitor the error log
  • Wrote complex database queries
  • Built web applications using Maven as build tool
  • Used CVS for version control
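
A minimal sketch of a servlet generating HTML dynamically, as referenced in the list above; the class name, URL parameter, and markup are hypothetical:

    import java.io.IOException;
    import java.io.PrintWriter;

    import javax.servlet.ServletException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    // Renders a simple HTML page from a request parameter.
    public class AccountSummaryServlet extends HttpServlet {

        @Override
        protected void doGet(HttpServletRequest req, HttpServletResponse resp)
                throws ServletException, IOException {
            String accountId = req.getParameter("accountId");
            resp.setContentType("text/html");
            PrintWriter out = resp.getWriter();
            out.println("<html><body>");
            out.println("<h1>Account summary</h1>");
            out.println("<p>Account: " + (accountId == null ? "unknown" : accountId) + "</p>");
            out.println("</body></html>");
        }
    }

A servlet like this would be mapped in web.xml and deployed to WebLogic alongside the JSPs.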

Environment: Java/J2EE, Eclipse, WebLogic Application Server, Oracle, JSP, HTML, JavaScript, JMS, Servlets, UML, XML, Struts, Web Services, WSDL, SOAP, UDDI

Confidential, St. Louis, MO

Java/ J2EE Application Developer

Responsibilities:

  • Responsible for gathering and analyzing requirements and converting them into technical specifications
  • Used Rational Rose for creating sequence and class diagrams
  • Developed presentation layer using JSP, Java, HTML and JavaScript
  • Used Spring Core annotations for dependency injection (see the sketch following this list)
  • Designed and developed ‘Convention Based Coding’ utilizing Hibernate’s persistence framework and O-R mapping capability to enable dynamic fetching and displaying of various table data with JSF tag libraries
  • Designed and developed the Hibernate configuration and a session-per-request design pattern for establishing database connectivity and accessing the session within database transactions; used SQL for fetching and storing data in databases
  • Participated in the design and development of database schema and entity-relationship diagrams of the backend Oracle database tables for the application
  • Implemented web services using Apache Axis
  • Designed and developed stored procedures and triggers in Oracle to cater to the needs of the entire application
  • Developed complex SQL queries for extracting data from the database
  • Designed and implemented SOAP web service interfaces in Java
  • Used Apache Ant for the build process
  • Used ClearCase for version control and ClearQuest for bug tracking
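
A minimal sketch of constructor injection with Spring core annotations, as referenced in the list above; the service and DAO names are hypothetical, and the DAO body is a placeholder:

    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.stereotype.Repository;
    import org.springframework.stereotype.Service;

    // Constructor injection via Spring core annotations.
    @Service
    public class PaymentService {

        private final PaymentDao paymentDao;

        @Autowired
        public PaymentService(PaymentDao paymentDao) {
            this.paymentDao = paymentDao;
        }

        public boolean isPaid(long invoiceId) {
            return "PAID".equals(paymentDao.findStatus(invoiceId));
        }
    }

    // Placeholder DAO; the real implementation would query the database
    // through Hibernate.
    @Repository
    class PaymentDao {
        String findStatus(long invoiceId) {
            return "PAID";
        }
    }

With component scanning enabled in the application context (e.g., <context:component-scan/>), Spring wires these beans together at startup.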

Environment: Java, JDK 1.5, Servlets, Hibernate, Ajax, Oracle 10g, Eclipse, Apache Ant, Web Services (SOAP), Apache Axis, WebLogic Server, JavaScript, HTML, ClearCase, ClearQuest

Confidential

Junior Java Developer

Responsibilities:

  • Developed the user interface screens using Swing for accepting various system inputs such as contractual terms, monthly data pertaining to production, inventory and transportation
  • Involved in designing database connections using JDBC (see the sketch following this list)
  • Involved in design and development of UI using HTML, JavaScript and CSS
  • Created tables and stored procedures for data manipulation and retrieval in SQL Server 2005, and performed database modifications in Oracle using SQL, PL/SQL, stored procedures, triggers, and views
  • Developed the business components (in core Java) for the calculation module (calculating various entitlement attributes)
  • Involved in the logical and physical database design and implemented it by creating suitable tables, views and triggers
  • Created the related procedures and functions used by JDBC calls in the above components
  • Involved in fixing bugs and minor enhancements for the front-end modules
  • Successfully migrated model database from Oracle to DB2
  • Created UNIX build script for Enterprise Data Translator
  • Effectively used Log4j for logging, Bugzilla for bug tracking and JUnit for unit testing
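
A minimal sketch of a plain JDBC lookup in the style of the work described above; the connection URL, credentials, and table and column names are hypothetical:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;

    // Plain JDBC lookup against Oracle, pre-JDBC-4 style.
    public final class InventoryDao {

        private static final String URL = "jdbc:oracle:thin:@dbhost:1521:ORCL";

        static {
            try {
                // Explicit driver registration, required before JDBC 4.0
                Class.forName("oracle.jdbc.driver.OracleDriver");
            } catch (ClassNotFoundException e) {
                throw new ExceptionInInitializerError(e);
            }
        }

        public int quantityOnHand(String itemCode) throws SQLException {
            String sql = "SELECT qty_on_hand FROM inventory WHERE item_code = ?";
            Connection conn = DriverManager.getConnection(URL, "app_user", "secret");
            try {
                PreparedStatement ps = conn.prepareStatement(sql);
                ps.setString(1, itemCode);
                ResultSet rs = ps.executeQuery();
                return rs.next() ? rs.getInt(1) : 0;
            } finally {
                conn.close();
            }
        }
    }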

Environment: Java, HTML, JavaScript, CSS, Oracle, JDBC, Swing and Eclipse
