
Big Data Engineer Resume

VA

SUMMARY

  • A competent professional with 10 years of experience in the design, development, and implementation phases of the Software Development Life Cycle (SDLC).
  • 2 years of experience working with Apache Hadoop components such as HDFS, MapReduce, Hive, Pig, Sqoop, Oozie, HBase, Spark, Kafka, Scala and Big Data analytics.
  • 8 years of experience in database architecture, administration, system analysis, design, development and support of Oracle (SQL, PL/SQL), MySQL, Teradata, SAP Crystal Reports, Informatica and shell scripting.
  • Strong shell scripting skills.
  • Hands-on experience in installing, configuring, and using Hadoop components such as Hadoop MapReduce, HDFS, HBase, Hive, Sqoop, Pig and Flume.
  • Responsible for managing data coming from different sources; involved in HDFS maintenance and loading of structured and unstructured data.
  • Experience in analyzing data using HiveQL, Pig Latin and custom MapReduce programs in Java.
  • Worked on the backend using Scala and Spark to implement various aggregation logic.
  • Experienced in working with Spark DataFrames and in optimizing jobs to meet SLAs.
  • Experience in importing and exporting data using Sqoop between HDFS and relational database systems.
  • Good understanding of Hadoop Architecture and underlying Hadoop framework including Storage Management.
  • Expert working experience in Oracle PL/SQL development using various Oracle key components such as Stored Procedures, Functions, Packages, DB Triggers, Views, Materialized Views, DB Links, Exception Handling, Oracle-Supplied Packages, Collections, PL/SQL Types, External Tables, MERGE Statements, Autonomous Transactions, Global Temporary Tables (GTT), Bulk Load, Cursors, Ref Cursors, Partitioned Tables, Dynamic SQL, SQL*Loader, Data Pump, UTL_FILE, etc.
  • Expertise in Creating and Maintaining Database objects like Tables, Views, Indexes, Constraints, Materialized Views, Synonyms, and Sequences.
  • Proficient in writing and tuning complex SQL statements, complex joins, correlated sub-queries and SQL statements with analytic functions (ROW_NUMBER, RANK, DENSE_RANK, LAG, LEAD, CONNECT BY LEVEL, etc.); a brief sketch appears after this summary.
  • Experience in writing dynamic SQL statements using EXECUTE IMMEDIATE and DBMS_SQL.
  • Worked extensively on query-level performance tuning, using the Explain Plan utility to pinpoint time-consuming SQL and tuning it by creating indexes and forcing specific plans.
  • Working knowledge of newer partitioning techniques such as Interval, Reference and Extended Composite (List-List, List-Range) partitioning.
  • Expertise in loading data from flat files into Oracle database tables using SQL*Loader and external tables.
  • Proficient in writing SQL statements with Window Aggregate Functions using ROWS or RANGE clause.
  • Strong experience with Oracle data warehouses, the ETL process, data analysis for ODS, Online Transaction Processing (OLTP), and data warehouse logical/physical, relational and multi-dimensional modeling (Star Schema, Snowflake Schema), optimization, partitioning, archiving and capacity planning.
  • Good understanding of RDBMS and Oracle database architecture and design; have performed DBA duties such as table partitioning and Export/Import.
  • Experience in using Oracle concepts such as table partitioning, optimizer hints and materialized views (snapshots).
  • Good understanding of users, roles, privileges, schema and object management, and session monitoring.
  • Expertise in Transaction Management like Commit, Rollback in Oracle Database.
  • Expert working knowledge of UNIX shell scripting and scheduling cron jobs for automation, and of tools such as WinSCP, PuTTY and FTP/SFTP.
  • Responsible for Query Optimization, troubleshooting, debugging, problem solving and Tuning for improving performance of the applications.
  • Experience in relational and dimensional data modeling, normalization, denormalization, data architecture, planning, testing, data migration and data conversion.
  • Expertise in data modeling tools such as Erwin Data Modeler 7.x and ER/Studio Data Architect 8.x.
  • Excellent communication, problem-solving and logical skills; work well in a team environment; self-motivated, quick learner, able to work under tight deadlines and rapidly changing priorities.
  • Mentored new team members and trained them in domain knowledge and technologies.
  • Flexible and versatile in adapting to new environments and technologies.
  • Strong communication, interpersonal, learning and organizing skills, matched with the ability to manage stress and time effectively.
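
Illustrative example (a minimal sketch only; the table and column names below, such as emp and dept_id, are hypothetical placeholders, not from any client engagement):

  -- Analytic functions: rank employees by salary within each department.
  SELECT emp_id,
         dept_id,
         salary,
         ROW_NUMBER() OVER (PARTITION BY dept_id ORDER BY salary DESC) AS salary_rank,
         LAG(salary)  OVER (PARTITION BY dept_id ORDER BY salary)      AS prev_salary
  FROM   emp;

  -- Dynamic SQL: truncate a staging table whose name is resolved at run time.
  BEGIN
    EXECUTE IMMEDIATE 'TRUNCATE TABLE stg_emp';
  END;
  /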

TECHNICAL SKILLS

Big Data technologies: Hadoop (1.x), YARN, Hive, Pig, Sqoop, Spark, Kafka

RDBMS: Oracle 10g/11g/12c (SQL, PL/SQL), MySQL (5.x), Teradata (13)

Operating Systems: Windows, Red Hat Linux; Distributed File System: HDFS

Languages: Java

GUI: Toad for MySQL, Toad for Oracle, SQL Developer

ETL Tools: Informatica PowerCenter 9.x

Reporting Tools: SAP Crystal Reports

NoSQL: HBase

Version control tools: TFS, SVN, SharePoint

PROFESSIONAL EXPERIENCE

Big Data Engineer

Confidential, VA

Responsibilities:

  • Collaborated with internal and client business analysts to understand requirements and architect the data flow system.
  • Developed complete end-to-end big data processing in the Hadoop ecosystem.
  • Optimized Hive scripts to use HDFS efficiently by applying various compression mechanisms.
  • Developed Spark code using Scala and Spark-SQL/Streaming for faster processing of data.
  • Used Spark API over Cloudera Hadoop YARN to perform analytics on data in Hive.
  • Developed Scala scripts and UDFs, using both DataFrames/SQL and RDDs/MapReduce in Spark, for data aggregation, queries and writing data back into the RDBMS through Sqoop.
  • Loaded data into Spark RDDs and performed in-memory computation to generate the output response.
  • Migrated complex MapReduce programs and Hive scripts into Spark RDD transformations and actions.
  • Wrote UDFs and MapReduce jobs depending on the specific requirement.
  • Tested all month-end changes in the DEV, SIT and UAT environments and obtained business approvals before performing the same in production.
  • Wrote shell scripts to schedule the Hadoop jobs.
  • Wrote Spark SQL scripts to optimize query performance.
  • Worked extensively on code reviews and code remediation to meet coding standards.
  • Wrote Sqoop scripts to import and export data between HDFS and various RDBMS systems.
  • Wrote Pig scripts to process unstructured data and make it available for processing in Hive.
  • Created Hive schemas using performance techniques such as partitioning and bucketing (see the sketch after this list).
  • Used SFTP to send and receive files from various upstream and downstream systems.
  • Developed Oozie workflow jobs to execute Hive, Pig, Sqoop and MapReduce actions.
  • Involved in the complete end-to-end code deployment process in production.
  • Prepared automated scripts to deploy month-end code changes in all environments.
  • Exported data from Hive tables into the Teradata database.
  • Worked with Hadoop administration team for configuring servers at the time of cluster migration.
  • Responsible for communicating monthly job schedules and change requirements to business and clients and for validating the data.
  • Responsible for meeting all SLA times to ensure the Hadoop jobs ran on schedule.
  • Coordinated with the offshore team to explain business requirements and prepare code changes for every month-end release.
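
Illustrative example (a minimal HiveQL/Spark SQL sketch only; the table and column names, such as txn_monthly and account_id, are hypothetical placeholders, not from any client engagement):

  -- Hive DDL: monthly transaction data partitioned by load month and
  -- bucketed by account id, stored as compressed ORC.
  CREATE TABLE txn_monthly (
    account_id BIGINT,
    txn_amount DECIMAL(18,2),
    txn_ts     TIMESTAMP
  )
  PARTITIONED BY (load_month STRING)
  CLUSTERED BY (account_id) INTO 32 BUCKETS
  STORED AS ORC
  TBLPROPERTIES ('orc.compress' = 'SNAPPY');

  -- Spark SQL aggregation that prunes to a single partition.
  SELECT load_month,
         account_id,
         SUM(txn_amount) AS total_amount
  FROM   txn_monthly
  WHERE  load_month = '2016-01'
  GROUP BY load_month, account_id;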

Environment: HDFS, Spark, Pig, Hive, Beeline, Sqoop, MapReduce, Oozie, PuTTY, HaaS (Hadoop as a Service), Java 6/7, SQL Server 2012, Subversion, Toad, Teradata, Oracle 10g, YARN, UNIX Shell Scripting, Autosys, Agile Methodology, JIRA, VersionOne

Technical Lead

Confidential

Responsibilities:

  • Involved in requirements walkthroughs to identify scope and feasibility.
  • Worked with the ETL team and architect to build the data model for the staging area.
  • Created various PL/SQL objects such as stored procedures, functions, packages and triggers as per business requirements.
  • Involved in identifying and fixing bugs and data issues.
  • Worked on loading data into custom tables and interface tables and validated custom data through PL/SQL custom packages.
  • Developed custom interface programs that ran daily, weekly and monthly to update data using PL/SQL, SQL*Loader, and the export and import utilities.
  • Coordinated with the onshore team to complete requests.
  • Extensively used BULK COLLECT, bulk binds, temporary tables and external tables for DML operations (see the sketch after this list).
  • Participated in Data Quality meetings with DW Team, where the data integrity issues were discussed and resolved.
  • Fixed defects and improved the performance of existing database objects using PL/SQL.
  • Developed stored procedures and triggers to facilitate consistent data entry into the database.
  • Used Oracle's Explain Plan to analyze execution and improved the performance of SQL statements by tuning them and reducing their cost.
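
Illustrative example (a minimal PL/SQL sketch of the bulk-processing pattern only; the table names stg_orders and orders are hypothetical placeholders, not from any client engagement):

  -- Bulk load from a staging table into a target table of the same shape.
  DECLARE
    TYPE t_order_tab IS TABLE OF stg_orders%ROWTYPE;
    l_orders t_order_tab;
    CURSOR c_stg IS SELECT * FROM stg_orders;
  BEGIN
    OPEN c_stg;
    LOOP
      FETCH c_stg BULK COLLECT INTO l_orders LIMIT 1000;  -- fetch in batches
      EXIT WHEN l_orders.COUNT = 0;
      FORALL i IN 1 .. l_orders.COUNT                     -- bulk-bind the inserts
        INSERT INTO orders VALUES l_orders(i);
    END LOOP;
    CLOSE c_stg;
    COMMIT;
  END;
  /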

Environment: Oracle 11g, SQL, PL/SQL, SQL*Plus, SQL Server, SVN, Notepad++, ANSI SQL, XML, Windows 10, WinSCP.

Sr. Database Developer

Confidential

Responsibilities:

  • Involved in requirements gathering for multiple database object developments.
  • Created various PL/SQL objects such as stored procedures, functions, packages and triggers as per business requirements.
  • Involved in identifying and fixing bugs and data issues.
  • Worked on loading data into custom tables and interface tables and validated custom data through PL/SQL custom packages.
  • Developed custom interface programs that ran daily, weekly and monthly to update data using PL/SQL, SQL*Loader, and the export and import utilities.
  • Extensively used BULK COLLECT, bulk binds, temporary tables and external tables for DML operations.
  • Participated in Data Quality meetings with DW Team, where the data integrity issues were discussed and resolved.
  • Fixed defects and improved the performance of existing database objects using PL/SQL.
  • Developed stored procedures and triggers to facilitate consistent data entry into the database.
  • Participated in system analysis and data modeling, which included creating tables, views, indexes, synonyms, triggers, functions, procedures, cursors and packages.
  • Developed new reports and customized existing reports as per client requirements.
  • Used Oracle's Explain Plan to analyze execution and improved the performance of SQL statements by tuning them and reducing their cost (see the sketch after this list).
  • Used advanced bulk techniques (FORALL, BULK COLLECT) to improve performance.
  • Tuned database SQL statements and procedures by monitoring run times and system statistics. Inserted hints and rewrote code as required.
  • Performed functional testing for the procedures and packages.
  • Developed and customized Discoverer reports for custom applications based on user requirements
  • Reviewed peer-developed code and prepared documentation for it.
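
Illustrative example (a minimal sketch of the Explain Plan based tuning workflow only; the query, table and index names are hypothetical placeholders, not from any client engagement):

  -- Explain the statement and inspect its plan for costly operations.
  EXPLAIN PLAN FOR
    SELECT order_id, total_amount
    FROM   orders
    WHERE  customer_id = :cust_id;

  SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

  -- If the plan shows a full table scan on orders, an index on the
  -- filtering column typically reduces the cost.
  CREATE INDEX idx_orders_customer ON orders (customer_id);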

Environment: Oracle 10g, SQL, PL/SQL, SQL*Plus, TOAD 12.10, SVN, Notepad++, ANSI SQL, XML, Windows 10, WinSCP, Erwin Data Modeler 7.x, PuTTY.

Sr. ETL Developer

Confidential

Responsibilities:

  • Analysed report requirements and understood the functional specifications and business logic.
  • Participated in team meetings and contributed input from the initial stages of report generation.
  • Designed and developed reports using drill-down, drill-through and drop-down menu options, as well as parameterised and linked options.
  • Involved in fixing defects (design-related issues) and identifying remedies.
  • Reviewed unit test scripts, unit test cases and BI reports.
  • Attended onsite/offshore team meetings.
  • Mentored the new team members.

Environment: Oracle 11g, SQL, PL/SQL, SQL*Plus, TOAD 12.10, SVN, Notepad++, SAP Crystal Reports and Informatica.
