
Sr. Big Data/Hadoop - Consultant Resume


Louisville, KY

SUMMARY:

  • 9+ years of work experience in the IT industry in Analysis, Design, Development and Maintenance of various software applications, mainly in Hadoop - Cloudera, Hortonworks, Oracle Business Intelligence, Business Objects, Oracle (SQL, PL/SQL), Crystal Xcelsius, Salesforce (SFDC) and Database Administration in UNIX and Windows environments, in industry verticals like Banking, Financial, Pharmacy, Financial Assets, Fixed Income, Equities, Telecom & Health Insurance.
  • 5+ years of experience in Hadoop 1.x/2.x, CDH3U6, HDFS, HBase, Spark, Sqoop 2.x, Scala, Hive 0.7.1, Kafka, Flume, Java 1.6, Linux, Eclipse Juno, security - Kerberos, Impala, XML, JSON, Maven, SVN, NiFi, SaaS, Amazon Redshift & Azure.
  • Worked with data in multiple file formats including Avro, ORC and Text/CSV. Developed Spark code using Scala and Spark SQL for faster testing and data processing.
  • Experienced in Extraction, Transformation and Loading (ETL) processes based on business needs, using Falcon and Oozie workflows to execute multiple Java, Hive, Shell and SSH actions.
  • Good understanding of NoSQL databases like HBase and Cassandra.
  • Hands on experience in Stream processing frameworks such as Storm, Spark Streaming.
  • Solid understanding and extensive experience in working with different databases such as Oracle, SQL Server, MySQL and writing Stored Procedures, Functions, Joins and Triggers for different Data Models.
  • Excellent Java development skills using J2EE, Servlets, Junit and familiar with popular frameworks such as Spring, MVC and AJAX.
  • Extensive experience in PL/SQL, developing stored procedures with optimization techniques.
  • Adept at Web Development and experience in developing front end applications using JavaScript, CSS and HTML.
  • Experienced with distributed message brokers (such as Kafka).
  • Excellent team player, with pleasant disposition and ability to lead a team and a proven track record.
  • Solid experience in Agile Methodology - stories, sprints, Kanban, Scrum & tasks

TECHNICAL SKILLS:

Big Data Technology: Apache Hadoop, Hadoop Clusters, Hadoop Common, Hadoop Distributed File System, Replication, Cloudera Cluster, Apache Pig, MapReduce, Cassandra, NoSQL, MongoDB, Scala, Kafka, Storm, Storm Streaming, Flume, NiFi, SaaS, Mongoose, Tableau, Predixion Insight, Informatica, relational, hierarchical and graph databases, distributed data file systems, data federation and query optimization

RDBMS: Oracle 11g/10g, DB2 8.0/7.0 & MS SQL Server 2005

Data Modeling: Dimensional Data Modeling, Star Join Schema Modeling, Snowflake Schema Modeling, FACT and Dimension Tables, Physical and Logical Data Modeling, Erwin 3.5.2/3.x & Toad

Programming: UNIX Shell Scripting, SQL, PL/SQL, VB & C.

Operating Systems: Windows 2000, UNIX AIX.

PROFESSIONAL EXPERIENCE:

Confidential, Louisville, KY

Sr. Big Data/Hadoop - Consultant

Responsibilities:

  • Loading and transforming large sets of structured, semi-structured and unstructured data from relational databases into HDFS using Sqoop imports.
  • Developing Sqoop scripts to import and export data from relational sources and handling incremental loads of customer and transaction data by date.
  • Migrating existing Java applications to microservices using Spring Boot and Spring Cloud.
  • Working knowledge of IDEs such as Eclipse and Spring Tool Suite.
  • Working knowledge of using GIT, ANT/Maven for project dependency / build / deployment.
  • Developing simple and complex MapReduce programs in Java for Data Analysis on different data formats.
  • Delivering high availability and performance
  • Contributing in all phases of the development lifecycle
  • Writing well-designed, efficient, and testable code
  • Conducting software analysis, programming, testing, and debugging
  • Managing Java and Java EE application development
  • Developing Spark code using Scala and Spark-SQL/Streaming for faster testing and processing of data.
  • Working as a part of AWS build team.
  • Creating, configuring and managing S3 buckets (storage).
  • Experience with AWS EC2, EMR, Lambda and CloudWatch.
  • Importing data from different sources such as HDFS and HBase into Spark RDDs.
  • Experienced with batch processing of data sources using Apache Spark and Elasticsearch.
  • Implementing Spark RDD transformations and actions to carry out business analysis.
  • Migrating HiveQL queries on structured data into Spark SQL to improve performance.
  • Optimizing MapReduce Jobs to use HDFS efficiently by using various compression mechanisms.
  • Partitioning Hive tables and running the scripts in parallel to reduce script run-time.
  • Working on data serialization formats (Avro, Parquet, JSON, CSV) for converting complex objects into byte sequences (a Parquet write sketch follows this list).
  • Responsible for analyzing and cleansing raw data by performing Hive/Impala queries and running Pig scripts on data.
  • Administering, installing, upgrading and managing distributions of Hadoop, Hive and HBase.
  • Involved in troubleshooting and performance tuning of Hadoop clusters.
  • Creating Hive tables, loading data and writing Hive queries that run internally as MapReduce jobs.
  • Using Spark Streaming APIs to perform transformations and actions on the fly for building the common learner data model, which receives data from Kafka in near real time (see the streaming sketch after this list).
  • Deploying and maintaining multi-node Dev and Test Kafka clusters.
  • Developing Spark scripts in the Python (PySpark) shell as required to read and write JSON files.
  • Optimizing existing algorithms in Hadoop using SparkContext, Spark SQL, DataFrames and pair RDDs to read and write JSON files.
  • Developing a Spark ingestion process to extract 1 TB of data on a daily basis.
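The following is a minimal sketch, using the Spark Java API, of the JSON-to-Parquet serialization and partitioned-write pattern described in the bullets above; the paths, the txn_date column and the job name are hypothetical placeholders rather than actual project code.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SaveMode;
import org.apache.spark.sql.SparkSession;

public class JsonToParquetJob {
    public static void main(String[] args) {
        // SparkSession; on the cluster this would typically run under YARN.
        SparkSession spark = SparkSession.builder()
                .appName("json-to-parquet")
                .getOrCreate();

        // Read raw JSON records from HDFS (path is hypothetical).
        Dataset<Row> transactions = spark.read().json("hdfs:///data/raw/transactions/*.json");

        // Write as Parquet, partitioned by an assumed date column, so downstream
        // Hive/Impala queries can prune partitions instead of doing full scans.
        transactions.write()
                .mode(SaveMode.Overwrite)
                .partitionBy("txn_date")
                .parquet("hdfs:///data/curated/transactions");

        spark.stop();
    }
}
```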
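Similarly, a minimal sketch of the Kafka-to-Spark-Streaming flow mentioned above, using the spark-streaming-kafka-0-10 direct stream; the broker address, topic, consumer group and output path are assumptions for illustration only.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka010.ConsumerStrategies;
import org.apache.spark.streaming.kafka010.KafkaUtils;
import org.apache.spark.streaming.kafka010.LocationStrategies;

public class LearnerEventStream {
    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf().setAppName("learner-event-stream");
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(10));

        Map<String, Object> kafkaParams = new HashMap<>();
        kafkaParams.put("bootstrap.servers", "broker1:9092");       // hypothetical broker
        kafkaParams.put("key.deserializer", StringDeserializer.class);
        kafkaParams.put("value.deserializer", StringDeserializer.class);
        kafkaParams.put("group.id", "learner-model");                // hypothetical group
        kafkaParams.put("auto.offset.reset", "latest");

        // Direct stream over an assumed "learner-events" topic.
        JavaInputDStream<ConsumerRecord<String, String>> stream =
                KafkaUtils.createDirectStream(
                        jssc,
                        LocationStrategies.PreferConsistent(),
                        ConsumerStrategies.<String, String>Subscribe(
                                Collections.singletonList("learner-events"), kafkaParams));

        // Transform each micro-batch and persist it; the real job applied the
        // common-learner-data-model mapping before writing to HDFS.
        stream.map(ConsumerRecord::value)
              .foreachRDD((rdd, time) ->
                      rdd.saveAsTextFile("hdfs:///data/streaming/learner-events/" + time.milliseconds()));

        jssc.start();
        jssc.awaitTermination();
    }
}
```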

Environment: Hadoop 2.x, HDFS, HBase 2.x/0.90.x, Amazon AWS, Spark, Flume 0.9.3, Impala, NiFi, security - Kerberos, Sqoop 2.x, Hive 0.7.1 & Tableau 9.3 (Online, Desktop, Public, Vizable).

Confidential, Mount Laurel, NJ

Sr. Hadoop Developer

Responsibilities:

  • Worked on a live 52 node Hadoop Cluster running Hortonworks Data Platform (HDP 2.2).
  • Developed Spark SQL jobs to load tables into HDFS and run select queries on top (see the Spark SQL sketch after this list).
  • Used Spark Streaming to divide streaming data into batches as input to the Spark engine for batch processing.
  • Engineered ETL standards for Hadoop Data Pipelines and Automated end to end data ingestion using Falcon, Sqoop and Oozie.
  • Led design and implementation of Store Traffic Data Analysis - an ETL solution to consolidate customer traffic data, sales data and employee workforce data to compute store close rate for about 1800 stores on a daily basis.
  • Migrated existing MapReduce programs to Spark models using Python. Developed predictive analytics using Apache Spark Scala APIs.
  • Implemented various Data Quality rules to ensure traffic data meets quality standards as outlined by analytics stakeholders.
  • Implemented Apache Storm spouts and bolts to process data by creating topologies.
  • Developed Imputation Models in Java using Apache Crunch to substitute values for missing or improper traffic data.
  • Evaluated various initiatives from the Apache Software Foundation, vetted new frameworks and built proofs of concept.
  • Developed workflows to cleanse and transform raw data into useful information and load it to a Kafka queue, from which it is loaded into HDFS and a NoSQL database (see the Kafka producer sketch after this list).
  • Developed Sqoop jobs to import data into HDFS from relational database management systems such as Teradata and DB2, and to export data from HDFS to Teradata.
  • Developed workflows for the complete end-to-end ETL process: getting data into HDFS, validating and applying business logic, storing clean data in Hive external tables, exporting data from Hive to RDBMS sources for reporting, and escalating data quality issues.
  • Built scalable distributed data solutions using Hadoop. Developed MapReduce jobs written in Java to apply the business logic.
  • Developed Pig functions to preprocess the data for analysis. Developed Spark scripts by using Scala shell commands as per the requirement.
  • Created Oozie workflows to sqoop the data from source to HDFS and then to target tables.
  • Created HBase tables to store different formats of data as a backend for user portals.
  • Analyzed system failures, identified their root causes and recommended courses of action.
  • Functioned as the point of contact for tracking issues and communicating them to vendors and all other stakeholders. Experienced with batch processing of data sources using Apache Spark and Elasticsearch.
  • Developed utilities in Python to be used by ingestion workflows as part of Data Ingestion Process.
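A minimal Java sketch of the Spark SQL work described above: a Hive-enabled SparkSession loads a table and runs a select over it. The store_traffic_daily table, its columns and the output path are hypothetical and chosen only to echo the store close-rate use case.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class StoreTrafficQuery {
    public static void main(String[] args) {
        // Hive support lets Spark SQL read external tables populated by the ingestion workflows.
        SparkSession spark = SparkSession.builder()
                .appName("store-traffic-query")
                .enableHiveSupport()
                .getOrCreate();

        // Table and column names are hypothetical.
        Dataset<Row> closeRate = spark.sql(
                "SELECT store_id, SUM(sales) / SUM(traffic) AS close_rate " +
                "FROM store_traffic_daily GROUP BY store_id");

        // Persist the aggregated result for downstream reporting.
        closeRate.write().mode("overwrite").parquet("hdfs:///reports/close_rate");
        spark.stop();
    }
}
```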
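And a minimal sketch of publishing a cleansed record to a Kafka queue, as in the cleanse-and-load workflow above; the broker, topic name and payload are placeholders.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class CleansedRecordPublisher {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092");              // hypothetical broker
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // In the real workflow the value would be a cleansed traffic record;
            // the topic name and JSON payload here are placeholders.
            producer.send(new ProducerRecord<>("cleansed-traffic", "store-0042",
                    "{\"storeId\":42,\"traffic\":135,\"sales\":18}"));
            producer.flush();
        }
    }
}
```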

Environment: Hadoop, Hive, Crunch, Falcon, Kafka, Oozie, Sqoop, Pig, HBase, Spark, Oracle, Teradata, Scala, Java, Python, SQL Navigator, Spark Streaming, Eclipse IDE.

Confidential

Systems Analyst

Responsibilities:

  • Understood and analyzed business requirements to develop a credit check module using Servlets, JSP and Core Java components on the WebLogic application server.
  • Developed new screens/menus based on business requirements. Translated application storyboards and use cases into functional applications.
  • Designed, built, and maintained efficient, reusable, and reliable Java code.
  • Ensured the best possible performance, quality, and responsiveness of the applications.
  • Identified bottlenecks and bugs, and devised solutions to these problems.
  • Helped maintain code quality, organization, and automation.
  • Responsible for designing Rich user Interface Applications using JavaScript, CSS, HTML.
  • Extensively worked with XSD, XSL/XSLT and XML to navigate XML documents, and SAX to process and parse XML files.
  • Used the JUnit framework for unit testing of all the Java classes and performed system and integration testing (a JUnit sketch follows this list).
  • Worked on an AJAX implementation for retrieving content and displaying it without reloading the existing page.
  • Enhanced existing functionality to improve performance and fixed bugs.
  • Gathered requirements, created user stories for the Business Requirement Document and prepared a Functional Specification document.
  • Provided round-the-clock on-call support.
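A minimal JUnit 4 sketch of the kind of unit test described above; CreditCheckService and its eligibility threshold are hypothetical stand-ins, not the credit check module's actual classes.

```java
import static org.junit.Assert.assertFalse;
import static org.junit.Assert.assertTrue;

import org.junit.Test;

public class CreditCheckServiceTest {

    // Minimal stand-in for the credit check module; the real class,
    // threshold and rule set were defined by the business requirements.
    static class CreditCheckService {
        boolean isEligible(int creditScore) {
            return creditScore >= 650;
        }
    }

    private final CreditCheckService service = new CreditCheckService();

    @Test
    public void approvesScoreAboveThreshold() {
        assertTrue(service.isEligible(720));
    }

    @Test
    public void rejectsScoreBelowThreshold() {
        assertFalse(service.isEligible(540));
    }
}
```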

Environment: Java 1.6, JDBC, XML, AJAX, Oracle, Microsoft Office 2007, MS Outlook 2007, SharePoint.

Confidential

Java/J2EE Developer

  • Designed use cases, sequence and class diagrams, activities, states, objects and components. Used UML (MS Visio) for software design.
  • Developed the presentation layer with JSPs, HTML5, JavaScript, CSS3, jQuery, JSON, AJAX, Spring form tags, JSTL tags, etc.
  • Designed and developed XML processing components for dynamic menus in the application.
  • Developed the application using the Spring MVC architecture with Hibernate as the ORM framework.
  • Developed SQL queries for retrieving data used to generate the reports.
  • Developed Stored Procedures and Triggers on Oracle Database.
  • Used AJAX and jQuery for developing asynchronous web applications on the client side.
  • Used Hibernate, an object/relational mapping (ORM) solution, to map data between the MVC model and the Oracle relational data model with a SQL-based schema (see the entity mapping sketch after this list).
  • Created SOAP Web Services using WSDL, XML and SOAP for transferring data.
  • Wrote complex SQL queries implementing demanding business logic.
  • Developed web services using RESTful and SOAP frameworks.
  • Worked with Quality Assurance team in tracking and fixing bugs.
  • Developed JUnit test cases for all use cases and executed them.
  • Took various initiatives to optimize existing applications for better performance and efficiency
  • Used Log4j for application logging and debugging.
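A minimal sketch of the Hibernate/JPA annotation mapping referenced above; the CustomerAccount entity, table and columns are hypothetical and do not reflect the application's actual Oracle schema.

```java
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;
import javax.persistence.Table;

// Hypothetical entity: maps an MVC model object to an Oracle table via Hibernate/JPA annotations.
@Entity
@Table(name = "CUSTOMER_ACCOUNT")
public class CustomerAccount {

    @Id
    @GeneratedValue(strategy = GenerationType.SEQUENCE)
    @Column(name = "ACCOUNT_ID")
    private Long id;

    @Column(name = "ACCOUNT_NAME", nullable = false, length = 100)
    private String name;

    @Column(name = "BALANCE")
    private Double balance;

    public Long getId() { return id; }
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public Double getBalance() { return balance; }
    public void setBalance(Double balance) { this.balance = balance; }
}
```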

Environment: Java 1.6, J2EE, Servlets, JSP 2.5, JUnit, Spring 2.5.6/3.0, Spring ORM, Spring form tags, JSTL, Hibernate 3.0, Oracle 11g, Apache, SOA, Eclipse IDE 3.7, Log4J, AJAX, SOAP, PL/SQL, HTML, CSS, jQuery, JSON.

Confidential

Java Developer

  • Involved in analysis, design and development of an e-bill payment system as well as an account transfer system, and developed specs that include use cases, class diagrams, sequence diagrams and activity diagrams.
  • Involved in designing the user interfaces using JSPs.
  • Developed custom tags, JSTL to support custom User Interfaces.
  • Developed the application using the Struts framework with Model View Controller (MVC) architecture.
  • Implemented the persistence layer using Hibernate, which uses POJOs to represent the persistent database tables. These POJOs are serializable Java classes that do not contain business logic.
  • Implemented Hibernate using the Spring Framework (created the SessionFactory).
  • Implemented the application using the concrete principles laid down by several design patterns such as MVC, Business Delegate, Data Access Object, Singleton and Factory.
  • Deployed the applications on BEA WebLogic Application Server.
  • Developed JUnit test cases for all the developed modules.
  • Used CVS for version control across common source code used by developers.
  • Used Log4J to capture the log that includes runtime exceptions.
  • Used JDBC for database connectivity to Oracle and to invoke stored procedures (a stored procedure call sketch follows this list).
  • Refactored the code to migrate from Hibernate 2.x to Hibernate 3.x (i.e., moved from XML mappings to annotations) and implemented Hibernate filters and Hibernate validators.
  • Implemented DAOs and Hibernate transactions using the Spring framework.
  • Used AJAX and JavaScript for validations and for integrating server-side business components with the client side within the browser.
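A minimal JDBC sketch of invoking an Oracle stored procedure, as described above; the connection URL, credentials and the GET_ACCOUNT_BALANCE procedure are illustrative assumptions, not the application's actual schema objects.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Types;

public class AccountTransferDao {

    // Connection details and the GET_ACCOUNT_BALANCE procedure are hypothetical;
    // the real procedures lived in the Oracle schema used by the application.
    public double fetchBalance(long accountId) throws SQLException {
        try (Connection conn = DriverManager.getConnection(
                     "jdbc:oracle:thin:@dbhost:1521:ORCL", "app_user", "secret");
             CallableStatement stmt = conn.prepareCall("{call GET_ACCOUNT_BALANCE(?, ?)}")) {

            stmt.setLong(1, accountId);                  // IN parameter: account id
            stmt.registerOutParameter(2, Types.NUMERIC); // OUT parameter: balance
            stmt.execute();
            return stmt.getDouble(2);
        }
    }
}
```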

Environment: Java, J2EE, JSP, JNDI, Oracle 10g, DHTML, ANT, Rational Rose, Eclipse 3.1, Unix, WebLogic Application Server, Hibernate 3.0, Struts, Log4J, CVS.
