
Hadoop Technical Lead (Contract) Resume

SUMMARY:

  • 8+ years of experience in software development, predominantly in data analytics, ETL and data warehousing across the BFSI, Pharma, Payments and Telecom domains.
  • 4+ years of experience in the Big Data ecosystem.
  • Experienced in slicing, dicing and aggregating Big Data to surface insights that drive business growth.
  • Expertise in Java, aligned with Agile methodologies and engineering best practices such as Test-Driven Development (TDD) and clean code principles.
  • Working experience in setting up, configuring and monitoring Hadoop clusters on the Apache, Cloudera and Hortonworks distributions.
  • Experience and expertise with the Big Data ecosystem: HDFS, Hive, Impala, MapReduce, Kafka, NoSQL databases, Spark, Sqoop and Oozie.
  • Knowledge of Scala, Apache Flume, Apache Storm and Apache Drill.
  • Experience with shell scripting on Linux/Unix.
  • Served as Technical Lead/Scrum Master and handled 3 service pack (SP) releases of the CEM product.
  • Knowledge of cloud computing infrastructure and considerations for scalable, distributed systems.
  • Involved in all phases of the software life cycle, including requirements gathering, design, development, testing and debugging of Big Data applications.
  • Acted as onsite anchor and as a bridge between the business and technology development carried out by an offshore team.
  • Worked extensively with the product/business teams (requirements gathering and analysis) and IT teams, spanning organizations and geographies throughout the project development life cycle and issue resolution.

TECHNICAL SKILLS:

Languages: Java (Core Java), Scala

Big Data Distributions: Apache, Cloudera

Big Data Ecosystem: Hadoop, MapReduce, Hive, HDFS, Spark, Sqoop & Kafka

Frameworks: JDBC, JSQLParser & Spring IoC

Databases: Oracle SQL, PostgreSQL

Schedulers: Autosys, D-Series & Oozie

Methodologies: Waterfall, Agile, Test Driven Development

Reporting Tools: Tableau, SAP BO

Build Tools: Maven, Ant

DevOps Tools: Jenkins, Bamboo & Jira

Version Control Tools: SVN, Git

IDEs: Eclipse and IntelliJ IDEA

Operating Systems: Windows 7/8/10 and Unix/Linux

Design Tools: Rational Rose, UML

PROFESSIONAL EXPERIENCE:

Confidential

Hadoop Technical Lead (Contract)

Responsibilities:

  • Performed a broad role, wearing the analyst, developer and lead hats for the Confidential team.
  • Predominantly worked on creating Big Data insights for the Fasenra product across the pre-launch, launch and post-launch phases.
  • Developed a data lake on the Cloudera platform that gives complete insights into oncology data to all the key businesses responsible for the new product launch.
  • Worked on ingesting a variety of data from multiple vendors, such as LAAD, APLD OCS, NPA MD and AMS Affiliation data, into the Cloudera data lake (EDH) hosted on AWS.
  • Worked on maintaining patients' entire clinical journeys and derived valuable insights that drive business growth.
  • Acted as onsite anchor and as a bridge between the business and technology development carried out by the offshore team.
  • Worked on transformation, de-normalization and mashing of terabyte-scale oncology data, optimized for deriving insights and visualizations.
  • Worked on creating ETL workflows using core Java, shell scripts and HQL.
  • Imported and exported data between HDFS and relational database systems using Sqoop.
  • Worked on optimizing Impala queries using shuffle and broadcast join techniques, and Hive queries using SerDe and file-format techniques.
  • Created a business-critical analytics service-layer data model and implemented it using Hive and Impala.
  • Developed business-critical Java UDFs in Hive as needed for complex querying (a minimal sketch of the pattern follows this list).
  • Worked on integrating Spark Streaming with Apache Kafka to bring in real-time analytics (see the streaming sketch after this list).
  • Applied partitioning and bucketing techniques in Hive for performance improvement.
  • Worked on optimizations for dynamic ingestion and schema evolution using Spark and Avro.
  • Used Oozie and Autosys for workflow management of batch jobs.
  • Worked with the production support team to resolve production issues.
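
As a minimal, hypothetical sketch of the Hive UDF pattern referenced above (class, function and column names are illustrative and not taken from the project), a simple masking UDF compatible with Hive 1.1.0 might look like this:

    package com.example.hive.udf;

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    /**
     * Masks an identifier, keeping only the last four characters visible.
     * Registered in Hive with, for example:
     *   ADD JAR hdfs:///udfs/mask-udf.jar;
     *   CREATE TEMPORARY FUNCTION mask_id AS 'com.example.hive.udf.MaskIdUDF';
     */
    public final class MaskIdUDF extends UDF {

        public Text evaluate(final Text input) {
            if (input == null) {
                return null;
            }
            String value = input.toString();
            if (value.length() <= 4) {
                return new Text(value);
            }
            // Replace every character except the last four with '*'.
            String masked = value.substring(0, value.length() - 4).replaceAll(".", "*")
                    + value.substring(value.length() - 4);
            return new Text(masked);
        }
    }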
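
The Spark Streaming/Kafka integration mentioned above could follow the direct-stream pattern available in Spark 1.6 (the version listed in the environment); the broker, topic and output path below are hypothetical:

    import java.util.Collections;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.Set;

    import org.apache.spark.SparkConf;
    import org.apache.spark.streaming.Durations;
    import org.apache.spark.streaming.api.java.JavaPairInputDStream;
    import org.apache.spark.streaming.api.java.JavaStreamingContext;
    import org.apache.spark.streaming.kafka.KafkaUtils;

    import kafka.serializer.StringDecoder;

    public final class KafkaStreamingSketch {
        public static void main(String[] args) throws Exception {
            SparkConf conf = new SparkConf().setAppName("kafka-streaming-sketch");
            JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(30));

            Map<String, String> kafkaParams = new HashMap<String, String>();
            kafkaParams.put("metadata.broker.list", "broker1:9092"); // hypothetical broker
            Set<String> topics = Collections.singleton("events");    // hypothetical topic

            // Direct (receiver-less) stream: one RDD partition per Kafka partition.
            JavaPairInputDStream<String, String> stream = KafkaUtils.createDirectStream(
                    jssc, String.class, String.class,
                    StringDecoder.class, StringDecoder.class,
                    kafkaParams, topics);

            // Persist each micro-batch to HDFS for downstream Hive/Impala queries.
            stream.map(record -> record._2())
                  .foreachRDD((rdd, time) ->
                          rdd.saveAsTextFile("/data/streaming/events/" + time.milliseconds())); // hypothetical path

            jssc.start();
            jssc.awaitTermination();
        }
    }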

Environment: Cloudera 5.9.1, Autosys, Hadoop 2.6.0, Hive 1.1.0, Hue 3.9.0, Oozie 4.1.0, Parquet-format 2.1.0, Spark 1.6.0, shell script, Impala 2.7.0.

Confidential

Big Data Lead Developer

Responsibilities:

  • Developed a data lake on the Cloudera platform that gives complete insights into all the services provided by Confidential, such as Confidential Checkout and the Confidential Token service.
  • Used YARN for distributed computation via the ResourceManager and NodeManagers.
  • Worked with DistCp to migrate data between highly critical, low-latency Hadoop clusters.
  • Worked on a compliance-critical encryption and decryption module and onboarded new applications onto it.
  • Worked on creating ETL workflows using core Java, shell scripts and HQL.
  • Worked on moving the entire DMPD reporting model onto Spark for better performance and latency.
  • Imported and exported data between HDFS and relational database systems using Sqoop.
  • Loaded semi-structured and unstructured data, such as customer contact-center data, into HBase using Flume.
  • Created a POC using Kafka to stream data from an Oracle database into HDFS (see the producer sketch after this list).
  • Optimized MapReduce jobs to use HDFS efficiently through various compression mechanisms (a configuration sketch also follows this list).
  • Created tables and loaded data into Hive tables using HiveQL.
  • Developed business-critical UDFs in Hive as needed for complex querying.
  • Involved in various phases of the Software Development Life Cycle (SDLC), including requirements gathering, design, analysis, coding and deployment.
  • Applied partitioning and bucketing techniques in Hive for performance improvement.
  • Used Oozie and D-Series for workflow management of batch jobs.
  • Developed JUnit test cases for DAO and service-layer methods to follow a TDD approach.
  • Developed a data validation module using Spark SQL and Scala.
  • Thoroughly involved in business-side requirements collection and analysis.
  • Involved in writing script files for processing data and loading it to HDFS, and in writing HDFS CLI commands.
  • Worked with different file formats such as text files, SequenceFiles, RCFile, ORC, Avro and Parquet.
  • Wrote SQL scripts to modify existing Oracle schemas and introduce new features into them.
  • Followed Agile methodology with daily Scrum stand-ups during application development.
  • Created and maintained low-level design documents after studying the high-level design documents.
  • Worked with the production support team to resolve production issues.
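
A hypothetical sketch of the Oracle-to-Kafka side of the POC mentioned above, assuming the newer (0.9+) Java producer API; the connection string, credentials, topic and query are illustrative, and landing the messages on HDFS (for example via a Flume or consumer-based sink) is out of scope here:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;
    import java.util.Properties;

    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public final class OracleToKafkaPoc {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker1:9092"); // hypothetical broker
            props.put("key.serializer",
                    "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer",
                    "org.apache.kafka.common.serialization.StringSerializer");

            try (KafkaProducer<String, String> producer = new KafkaProducer<String, String>(props);
                 Connection conn = DriverManager.getConnection(
                         "jdbc:oracle:thin:@//dbhost:1521/ORCL", "app_user", "secret"); // hypothetical DSN
                 Statement stmt = conn.createStatement();
                 ResultSet rs = stmt.executeQuery("SELECT txn_id, amount FROM transactions")) {

                // Publish each row as a message; a downstream consumer lands them on HDFS.
                while (rs.next()) {
                    String key = rs.getString("txn_id");
                    String value = key + "," + rs.getBigDecimal("amount");
                    producer.send(new ProducerRecord<String, String>("transactions", key, value)); // hypothetical topic
                }
                producer.flush();
            }
        }
    }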
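
The compression settings mentioned above can be illustrated with a minimal driver sketch for Hadoop 2.x; the job name, codec choice and paths are assumptions, and the mapper/reducer classes are omitted:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.SequenceFile;
    import org.apache.hadoop.io.compress.CompressionCodec;
    import org.apache.hadoop.io.compress.SnappyCodec;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;

    public final class CompressedJobDriver {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Compress intermediate map output to cut shuffle I/O.
            conf.setBoolean("mapreduce.map.output.compress", true);
            conf.setClass("mapreduce.map.output.compress.codec", SnappyCodec.class, CompressionCodec.class);

            Job job = Job.getInstance(conf, "compressed-etl");
            job.setJarByClass(CompressedJobDriver.class);
            // Mapper/Reducer classes omitted; this sketch only shows the compression settings.

            // Block-compressed SequenceFile output keeps HDFS usage and downstream scans efficient.
            job.setOutputFormatClass(SequenceFileOutputFormat.class);
            FileOutputFormat.setCompressOutput(job, true);
            FileOutputFormat.setOutputCompressorClass(job, SnappyCodec.class);
            SequenceFileOutputFormat.setOutputCompressionType(job, SequenceFile.CompressionType.BLOCK);

            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }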

Environment: Java, Spring, Cloudera 5.8.2, Hadoop 2.6, MapReduce, Sqoop 1.4.6, Hive 1.1.0, Flume 1.6.0, D-series, CA Automation, JSP, JavaScript, PuTTY, SOAP, DB2, Eclipse IDE, XML & TOAD.

Confidential

Technical Lead for Big Data

Responsibilities:

  • Developed Data Lake on Cloudera &Apache platform which gives complete insights into all the services provided by CSP.
  • Worked on software development kit for CEM, implemented automated installation using Core-Java, predominantly used collections & Executor (Multi-threading) frame work.
  • Lead a team of 6 for development and enhancement of south bound modules in customer quality insight project.
  • Developed a new feature, service experience which gives deep insights into customer experience with Data services.
  • Architected and developed complex hive UDFs to suffice the business use cases for CEM.
  • Lead a team of 6 for development and enhancement of Lean CI Jenkins setup for Customer quality insight project.
  • Worked on bringing the entire DMPD reporting model on Spark for a better performance and latency.
  • Part of Confidential 's clean code community which chants clean code principles and imbibes the same into the team.
  • Architected ETL work flows for various telecom interfaces like Gb, IuPs and S11.
  • Prepared data model for dimensions which caters to plethora of use cases giving deep dive insights into customer services.
  • Involved in various phases of Software Development Life Cycle (SDLC) of the application development like Requirement gathering, Design, Analysis, Coding and Deployment.
  • Developed data access layer, an interface between SAP BO/Tableau and Hadoop.
  • Worked with the customer support team in resolving many of the onsite production Issues.
  • Architected and implemented common data-model which gives a holistic view on all services using Lambda architecture.
  • Worked on performance tuning post performance test and was part of fixing issues that arose in system and integration testing.
  • Worked on Data access layer and developed the same using JSQL parser.
  • Worked as scrum master for CQI project line, handled service pack releases for Orange and Safaricom customer.
  • Implemented migration of customer experience management project from Shark to Spark SQL.
  • Was part of tools team, worked on automation of data generation for system testing according to the data model.
  • Worked on staging layer which was built using Apache Storm.
  • Part of CEM predictive Analytics team where in various projections are produced based complex algorithms and theorems using R.
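
A minimal, hypothetical sketch of the Executor-framework pattern used for the automated installer referenced above; the node list and per-node work are illustrative:

    import java.util.Arrays;
    import java.util.List;
    import java.util.concurrent.Callable;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;
    import java.util.stream.Collectors;

    public final class ParallelInstaller {

        /** Installs one component on one node; the real work is omitted here. */
        private static Callable<String> installTask(final String node) {
            return () -> {
                // ... copy artefacts, render config templates, restart services ...
                return node + ": OK";
            };
        }

        public static void main(String[] args) throws Exception {
            List<String> nodes = Arrays.asList("node01", "node02", "node03"); // hypothetical hosts
            ExecutorService pool = Executors.newFixedThreadPool(4);
            try {
                List<Future<String>> results = pool.invokeAll(
                        nodes.stream().map(ParallelInstaller::installTask).collect(Collectors.toList()));
                for (Future<String> result : results) {
                    System.out.println(result.get()); // re-throws any per-node failure
                }
            } finally {
                pool.shutdown();
            }
        }
    }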
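
The JSQLParser-based data access layer could start from the library's standard parsing entry points; a minimal sketch, assuming a JSqlParser 0.9.x-style API, with the query and any routing logic purely illustrative:

    import java.util.List;

    import net.sf.jsqlparser.parser.CCJSqlParserUtil;
    import net.sf.jsqlparser.statement.Statement;
    import net.sf.jsqlparser.statement.select.Select;
    import net.sf.jsqlparser.util.TablesNamesFinder;

    public final class QueryInspector {
        public static void main(String[] args) throws Exception {
            String sql = "SELECT msisdn, cell_id FROM service_experience WHERE day = '2016-01-01'"; // hypothetical query

            // Parse the incoming report query and list the tables it touches,
            // e.g. to route it to the right Hive/Impala schema.
            Statement statement = CCJSqlParserUtil.parse(sql);
            TablesNamesFinder finder = new TablesNamesFinder();
            List<String> tables = finder.getTableList((Select) statement);
            System.out.println("Tables referenced: " + tables);
        }
    }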

Environment: Java, J2EE, Spring, Hibernate, Apache Hadoop Platform, Hadoop 2.2, MapReduce, Sqoop 1.4.6, Hive 1.1.0, Flume 1.6.0, Oozie 3.3.0, HBase 0.94.11, JSP, JavaScript, AngularJS 1.0.x, CSS, jQuery 2.x, AJAX, EJB, WAS, Tomcat, PuTTY, SOAP, DB2, Eclipse IDE.

Confidential

Software Developer

Responsibilities:

  • Developed an execution cost analytics tool using core Java, Spring and PL/SQL for the APAC region to calculate the execution cost of trades and derive business projections from it.
  • Worked closely with the RCG (Regulatory Compliance Group) and the business analyst group, providing various reports (for example: Large Trader (daily, monthly, yearly), Market Access Percentage Away (aggressive trading), Wash Trades, Advertise Autex Volume and more) through analysis of trade and market data.
  • Developed cost models calculating the cost incurred in executing trades in the US and EU markets.
  • Developed a worm module for real-time loading of data from various trading platforms into the data warehouse using core Java, predominantly the Executor framework.
  • Developed and maintained highly critical compliance reports (OATS) submitted to FINRA.
  • Developed a POC on the Hadoop platform using Hadoop, Hive and Sqoop for the US region.
  • Involved in unit and integration testing of the application and provided support during system testing.
  • Involved in writing Oracle stored procedures and PL/SQL for the back end, used to apply business logic on scheduled timers; used views and functions on the Oracle database side.
  • Extensively used JDBC drivers for retrieving data from the Oracle database (see the DAO sketch after this list).
  • Implemented new modules and change requests, and fixed defects identified in pre-production and production environments.
  • Used SVN for version control and ANT for dependency management and project structure.
  • Performed unit and integration testing and checked code coverage.
  • Generated explain plans for SQL statements to check their costs and optimized them.
  • Involved in coding the Spring configuration XML file containing bean declarations and other dependent object declarations.
  • Worked on the execution cost analytics project, which catered to multiple business stakeholders across the globe (US, UK and Singapore).
  • Worked on the reporting layer for execution cost and trade surveillance, developed using Adobe Flex and Spring Core.
  • Developed various business reports using Shark SQL for faster report generation.
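
A hypothetical sketch of the JDBC access pattern referenced above; the connection string, credentials, table, columns and procedure name are illustrative:

    import java.sql.CallableStatement;
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;

    public final class ExecutionCostDao {

        private static final String URL = "jdbc:oracle:thin:@//dbhost:1521/ORCL"; // hypothetical DSN

        /** Reads per-trade execution cost rows for one business date. */
        public void printCosts(java.sql.Date businessDate) throws Exception {
            try (Connection conn = DriverManager.getConnection(URL, "app_user", "secret");
                 PreparedStatement ps = conn.prepareStatement(
                         "SELECT trade_id, venue, cost_bps FROM execution_cost WHERE business_date = ?")) {
                ps.setDate(1, businessDate);
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        System.out.printf("%s %s %.2f%n",
                                rs.getString("trade_id"), rs.getString("venue"), rs.getDouble("cost_bps"));
                    }
                }
            }
        }

        /** Invokes a scheduled PL/SQL procedure that refreshes the cost aggregates. */
        public void refreshAggregates(java.sql.Date businessDate) throws Exception {
            try (Connection conn = DriverManager.getConnection(URL, "app_user", "secret");
                 CallableStatement cs = conn.prepareCall("{call refresh_cost_aggregates(?)}")) {
                cs.setDate(1, businessDate);
                cs.execute();
            }
        }
    }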

Environment: Java, J2EE, Spring 3.0, Web services, Flex 3.5, Drools, JUnit, Struts 2.0, Oracle 11g, PL/SQL, Shell Script, Tomcat 6.0, Autosys, CAST, TOAD, Eclipse and Adobe Flash Builder 4.0.

Confidential

Developer Intern

Responsibilities:

  • Developed the TEACH protocol using Java and Spring for selecting the best-suited node as cluster head.
  • Developed an energy distribution model using Servlets and JSP.
  • Prepared graphs comparing the TEACH and LEACH protocols on energy savings.
  • Developed JSP pages and Servlets to implement the business functionality (a minimal servlet sketch follows this list).
  • Developed regression test cases and performed unit and system testing.
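
A minimal, hypothetical sketch of the Servlet/JSP pattern used here; the class, attribute and JSP names are illustrative, and the servlet would be mapped in web.xml (Servlet 2.5 on Tomcat 6):

    import java.io.IOException;

    import javax.servlet.ServletException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    public class EnergyReportServlet extends HttpServlet {

        @Override
        protected void doGet(HttpServletRequest request, HttpServletResponse response)
                throws ServletException, IOException {
            // In the real module this value would come from the energy-distribution model.
            double residualEnergy = 0.82; // hypothetical value
            request.setAttribute("residualEnergy", residualEnergy);
            // Delegate rendering to a JSP view.
            request.getRequestDispatcher("/WEB-INF/energyReport.jsp").forward(request, response);
        }
    }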

Environment: Java, Servlets, JSP, MySQL, PL/SQL, Shell Script, Tomcat 6.0 and Eclipse.
