
Sr. Big Data Engineer Resume


Austin, TX

SUMMARY:

  • Overall 8+ years of experience across a variety of industries, including 3 years in Big Data technologies (the Apache Hadoop stack and Apache Spark) and 5 years in Java and web technologies.
  • Hands-on experience working in multiple domains such as Retail and Healthcare.
  • Experience working with the Cloudera, Hortonworks, and Microsoft Azure HDInsight distributions.
  • In-depth knowledge of Apache Hadoop architecture (1.x and 2.x) and Apache Spark 1.x architecture.
  • Hands-on experience with Hadoop ecosystem components such as Hadoop, Spark, HDFS, YARN, Tez, Hive, Sqoop, Flume, MapReduce, Scala, Pig, Oozie, Kafka, NiFi, Storm, and HBase.
  • Experience importing and exporting data between RDBMSs and Hadoop/Hive using Sqoop.
  • Experience transporting and processing real-time streaming data using NiFi, Kafka, and Storm.
  • Experience working with NoSQL databases such as HBase.
  • Experience in implementing OLAP multi-dimensional cube functionality using Azure SQL Data Warehouse.
  • Experience in writing Spark transformations and actions using Spark SQL in Scala (a brief Scala sketch follows this summary).
  • Experience in working with Restful APIs.
  • Hands-on experience writing MapReduce jobs in Java.
  • Experience writing HQL queries against the Hive data warehouse.
  • Modified and performance-tuned Hive scripts, resolved automated job failures, and reloaded data into the Hive data warehouse when needed.
  • Experience processing streaming data using Flume and Pig.
  • Developed Oozie workflows to schedule Sqoop, Hive, and Spark jobs.
  • Experience in performance tuning and monitoring of the Hadoop cluster, gathering and analyzing metrics on the existing infrastructure using Cloudera Manager and the Ambari UI.
  • Expertise in Web Application Development with JSP, HTML, CSS, JavaScript.
  • Experience writing Maven and SBT scripts to build and deploy Java and Scala applications.
  • Able to work on own initiative; highly proactive, self-motivated, and resourceful.
  • Adequate knowledge of and working experience with Agile and SDLC methodologies.
  • Experience working with onsite and offshore team members, mentoring junior team members, and working effectively in a team.
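
As a brief illustration of the Spark bullet above, the following is a minimal Scala sketch combining Spark transformations, a Spark SQL query, and an action. It assumes a Spark 2.x-style SparkSession with Hive support; the retail_db.orders table and its column names are hypothetical placeholders, not taken from a specific project.

  import org.apache.spark.sql.SparkSession

  object OrdersSummary {
    def main(args: Array[String]): Unit = {
      val spark = SparkSession.builder()
        .appName("OrdersSummary")
        .enableHiveSupport()
        .getOrCreate()
      import spark.implicits._

      // Transformation: lazily filter a hypothetical Hive table; nothing executes yet.
      val completed = spark.table("retail_db.orders")
        .filter($"order_status" === "COMPLETE")

      // Expose the filtered data to Spark SQL through a temporary view.
      completed.createOrReplaceTempView("completed_orders")
      val dailyTotals = spark.sql(
        """SELECT order_date, SUM(order_amount) AS total_amount
          |FROM completed_orders
          |GROUP BY order_date""".stripMargin)

      // Action: show() triggers execution of the whole lineage above.
      dailyTotals.show(20)

      spark.stop()
    }
  }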

TECHNICAL SKILLS:

Big Data Technologies: Hadoop, MapReduce, HDFS, Hive, Pig, ZooKeeper, Sqoop, Oozie, Flume, Impala, HBase, Kafka, ORC, Parquet, Zeppelin, R, Hue, Tez, NiFi, Storm, Solr

Big Data Frameworks: HDFS, YARN, Spark

Hadoop Distributions: Cloudera, Hortonworks, Azure HDInsight

Programming and Scripting Languages: Java, Scala, R, Linux Shell Scripting, Azure PowerShell

Databases: RDBMS, MySQL, Oracle, Microsoft SQL Server, Netezza, SAP HANA

IDE and Tools: Eclipse, Tableau, IntelliJ, R Studio, SSMS, Maven, SBT, MS-Project, GitHub

Methodologies: Agile/Scrum, SDLC

Web services: REST, SOAP

PROFESSIONAL EXPERIENCE:

Confidential, Austin, TX

Sr. Big Data Engineer

Responsibilities:

  • Working with the Hortonworks Distribution of Hadoop.
  • Played a lead role in developing the Confidential data lake and in building the Confidential data cube on a Microsoft Azure HDInsight cluster.
  • Responsible for managing data coming from disparate data sources.
  • Ingested incremental updates from structured ERP systems residing on a Microsoft SQL Server database onto the Hadoop data platform using Sqoop.
  • Implemented OLAP multi-dimensional cube functionality using Azure SQL Data Warehouse.
  • Responsible for transporting and processing real-time streaming data sourced from Magento and Form site APIs for inventory management using NiFi, Kafka, and Storm (see the Kafka consumer sketch after this list).
  • Worked with RESTful APIs.
  • Created HBase tables to store various data formats coming from different applications.
  • Developed Linux shell scripts for extracting EDI POS sales data from an SFTP server and processing it in the Hive data warehouse.
  • Implemented a proof of concept to analyze streaming data using Apache Spark with Scala; used Maven/SBT to build and deploy the Spark programs.
  • Responsible for building the Confidential data cube on the Spark framework by writing Spark SQL queries in Scala, improving data-processing efficiency and reporting query response times (see the cube-aggregation sketch after this list).
  • Developed Spark code in Scala in the IntelliJ IDE, built with SBT.
  • Performance-tuned Sqoop, Hive, and Spark jobs.
  • Responsible for modification of ETL data load scripts, scheduling automated jobs and resolving production issues (if any) on time.
  • Wrote Azure PowerShell scripts to copy or move data from the local file system to HDFS backed by Azure Blob storage.
  • Developed Oozie workflows to automate the ETL process by scheduling multiple Sqoop, Hive, and Spark jobs.
  • Monitored cluster status and health daily using the Ambari UI.
  • Experience in rendering and delivering reports in desired formats by using reporting tools such as Tableau.
  • Maintained technical documentation for launching and executing jobs on Hadoop clusters.
  • Involved in story-driven agile development methodology and actively participated in daily scrum meetings.
  • Responsible for programming code independently for intermediate to complex modules following development standards.
  • Planned and conducted code reviews for changes and enhancements that ensure standards compliance and systems interoperability.
  • Responsible for modifying the code, debugging, and testing the code before deploying on the production cluster.
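
To illustrate the Kafka leg of the streaming work described above, here is a minimal Scala sketch of a consumer loop using the standard kafka-clients API (assumes kafka-clients 2.x for poll(Duration)). The broker address, consumer group, and inventory-events topic are hypothetical, and the actual pipeline routed records through NiFi and Storm rather than a standalone consumer.

  import java.time.Duration
  import java.util.Properties
  import scala.collection.JavaConverters._
  import org.apache.kafka.clients.consumer.KafkaConsumer

  object InventoryStreamReader {
    def main(args: Array[String]): Unit = {
      val props = new Properties()
      props.put("bootstrap.servers", "broker1:9092")   // hypothetical broker
      props.put("group.id", "inventory-consumers")      // hypothetical consumer group
      props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
      props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")

      val consumer = new KafkaConsumer[String, String](props)
      consumer.subscribe(java.util.Collections.singletonList("inventory-events")) // hypothetical topic

      try {
        while (true) {
          // Poll for new inventory events; in the real pipeline these fed downstream processing.
          val records = consumer.poll(Duration.ofMillis(500))
          for (record <- records.asScala) {
            println(s"offset=${record.offset} key=${record.key} value=${record.value}")
          }
        }
      } finally {
        consumer.close()
      }
    }
  }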
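The cube-building bullet above can likewise be sketched with Spark SQL in Scala. This is a minimal sketch assuming a Spark 2.x SparkSession with Hive support; the sales_db.pos_sales table, its columns, and the use of cube() stand in for whatever dimensional model the actual Confidential cube used.

  import org.apache.spark.sql.SparkSession
  import org.apache.spark.sql.functions._

  object SalesCubeBuilder {
    def main(args: Array[String]): Unit = {
      val spark = SparkSession.builder()
        .appName("SalesCubeBuilder")
        .enableHiveSupport()
        .getOrCreate()

      // Hypothetical Hive table of POS sales populated by the ingestion jobs.
      val sales = spark.table("sales_db.pos_sales")

      // cube() pre-aggregates every combination of the grouping columns,
      // which is the Spark-side analogue of a multi-dimensional cube.
      val salesCube = sales
        .cube(col("store_id"), col("product_id"), col("sale_date"))
        .agg(sum("quantity").as("total_qty"), sum("amount").as("total_amount"))

      // Persist the aggregates back to Hive so reporting queries hit a compact table.
      salesCube.write.mode("overwrite").saveAsTable("sales_db.pos_sales_cube")

      spark.stop()
    }
  }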

Environment: Hadoop Stack, Java, Sqoop, Hive, AtScale, Oozie, Microsoft SQL Server, NiFi, Kafka, Storm, Ubuntu, HBase, YARN, Hortonworks, UNIX Shell Scripting, Azure PowerShell, CRON, Scala, Spark, R, Maven, SBT, IntelliJ, Tableau, Microsoft Azure HDInsight, SSMS, Azure Data Factory, Azure Data Warehouse, SAP HANA.

Confidential, Collegeville, PA

Sr. Hadoop/Spark Developer

Responsibilities:

  • Experience with Cloudera distribution of Hadoop.
  • Responsible for managing data coming from disparate data sources.
  • Experience importing and exporting data into HDFS and Hive using Sqoop.
  • Managed and reviewed Hadoop log files; developed Pig programs for loading and filtering the streaming data ingested into HDFS via Flume.
  • Experience in performance tuning of Hive queries.
  • Worked extensively with HIVE DDLs and Hive Query language (HQLs) for analytical processing.
  • Written Hive queries for data analysis to meet the business requirements.
  • Hands-on experience defining, partitioning, bucketing, and compressing Hive tables to meet business requirements (see the sketch after this list).
  • Hands-on experience working with Impala.
  • Experience implementing applications on the Spark framework using Scala.
  • Developed Spark code in Scala in the IntelliJ IDE, built with SBT.
  • Hands-on experience writing MapReduce jobs in Java to perform data cleansing and preprocessing.
  • Hands-on experience writing Linux/Unix shell scripts.
  • Solid understanding of REST architecture and its application to high-performing websites for global usage.
  • Implemented test scripts to support test driven development and continuous integration.
  • Configured Oozie workflows to automate data flow, preprocessing, and cleaning tasks using Hadoop actions.
  • Experience with CDH distribution and Cloudera Manager to manage and monitor Hadoop clusters.
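
As a rough illustration of the partitioning and bucketing work above, the following Scala sketch writes a partitioned, bucketed, ORC table from Spark. The claims_db tables, column names, staging path, and the choice of 32 buckets are assumptions for illustration; the original work defined such tables directly with Hive DDL rather than through the DataFrame writer shown here.

  import org.apache.spark.sql.SparkSession

  object ClaimsTableLoader {
    def main(args: Array[String]): Unit = {
      val spark = SparkSession.builder()
        .appName("ClaimsTableLoader")
        .enableHiveSupport()
        .getOrCreate()

      // Hypothetical staging data already landed in HDFS by the ingestion jobs.
      val claims = spark.read.parquet("/data/staging/claims")

      // Partition by year so queries prune whole directories, and bucket by member_id
      // so joins and sampling on that column avoid full shuffles.
      claims.write
        .partitionBy("claim_year")
        .bucketBy(32, "member_id")
        .sortBy("member_id")
        .format("orc")
        .mode("overwrite")
        .saveAsTable("claims_db.claims")

      spark.stop()
    }
  }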

Environment: Hadoop, Hive, MapReduce, HDFS, Pig, Sqoop, Java, Eclipse, Oracle 10g, PL/SQL, Linux, flat files, Microsoft SQL Server, Spark, Scala, cron, Impala, Cloudera, Cloudera Manager, Netezza, Tableau, Agile.

Confidential

Java Developer

Responsibilities:

  • Worked as a senior developer for the project
  • Created UML class diagrams that depict the code’s design and its compliance with the functional requirements
  • Analysis, Design, Development and Unit Testing of the modules
  • Used the JavaMail notification mechanism to send confirmation emails to the companies applied to
  • Also involved in writing JSPs/JavaScript to generate dynamic web pages and web content
  • Developed various Java classes, SQL queries, and procedures to retrieve and manipulate data from the backend Oracle database using JDBC (see the sketch after this list)
  • Used Enterprise JavaBeans as middleware in developing a three-tier distributed application
  • Developed Session Beans and Entity Beans for business and data processing
  • Implemented web services with REST
  • Implemented field-level validations with AngularJS, JavaScript, and jQuery
  • Preparation of unit test scenarios and unit test cases
  • Branding the site with CSS
  • Code review and unit testing the code
  • Involved in unit testing using JUnit
  • Implemented Log4J to trace logs and to track information
  • Involved in project discussions with clients and analyzed complex project requirements as well as prepared design documents
  • Organized and presented technical sessions for a group of 30 project members
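
The JDBC bullet above can be illustrated with a short sketch. The original work was in Java; the snippet below uses the same java.sql API from Scala to keep all examples in one language, and the connection URL, credentials, applicants table, and column names are hypothetical.

  import java.sql.DriverManager

  object ApplicantLookup {
    def main(args: Array[String]): Unit = {
      // Hypothetical Oracle connection details for illustration only.
      val url = "jdbc:oracle:thin:@//dbhost:1521/ORCL"
      val conn = DriverManager.getConnection(url, "app_user", "app_password")
      try {
        // A parameterized query keeps user input out of the SQL text.
        val stmt = conn.prepareStatement(
          "SELECT applicant_id, email FROM applicants WHERE status = ?")
        stmt.setString(1, "APPLIED")
        val rs = stmt.executeQuery()
        while (rs.next()) {
          println(s"${rs.getLong("applicant_id")} -> ${rs.getString("email")}")
        }
        rs.close()
        stmt.close()
      } finally {
        conn.close()
      }
    }
  }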

Environment: Java, JSP, EJB, JMS, JavaScript, JSF, XML, JBoss, WebSphere, WebLogic, SQL, PL/SQL, CSS, Log4j, JUnit, Eclipse, Oracle 11g, LoadRunner, TFS

Confidential

Java Developer

Responsibilities:

  • Involved in various SDLC phases such as development, deployment, testing, documentation, implementation, and maintenance of application software.
  • Extensively used Core Java, Servlets/JSPs and XML.
  • Used XML and JSON for transferring/retrieving data between different Applications.
  • Used jQuery for creating JavaScript behaviors.
  • Collaborated with technical architects to ensure that the design meets the requirements.
  • Implemented the JBoss server logging configuration which is represented by the logging subsystem.
  • Responsible for developing JUnit test cases using EasyMock and DbUnit for unit and integration testing.
  • Used Maven script for building and deploying the application.
  • Assisted in development and improvement of application maintenance plans, processes, procedures, standards and priorities.
  • Used a Microsoft SQL Server database to store the system data.
  • Used the Apache Log4j logging framework for trace logging and auditing.
  • Responsible for modifying, debugging, and testing the code before deploying to production.

Environment: Java, JSP, JavaScript, CSS, JDBC, JBoss, REST, Web Services, Microsoft SQL Server, HTML.

Confidential

Associate Software Professional

Responsibilities:

  • Involved in various SDLC phases such as development, deployment, testing, documentation, implementation, and maintenance of application software.
  • Used IIS and Apache as web servers.
  • Developed analysis-level documentation such as Use Case Model, Activity, Sequence, and Class diagrams.
  • Also involved in writing JSPs/JavaScript to generate dynamic web pages and web content.
  • Developed business components of the applications using EJB.
  • Developed test plans and test cases and executed them in the TEST and Stage environments.
  • Involved in requirements gathering and converting them into specifications.
  • Developed unit test cases using JUnit for testing functionalities and performed integration testing of the application.
  • Supported UAT and production environments and resolved issues with other deployment and testing groups.
  • Implemented REST web services for Deployment and Publishing purposes.
  • Wrote SQL queries and procedures that store and retrieve information from the Microsoft SQL Server database.
  • Developed UI using frontend technologies such as HTML, CSS, and JavaScript.
  • Planned and conducted code reviews for changes and enhancements that ensure standards compliance and systems interoperability.

Environment: Java, JSP, JavaScript, CSS, JDBC, JBoss, REST, Web Services, Microsoft SQL Server, HTML.
