
Sr. Data/Hadoop Engineer Resume

Seattle, WA

PROFESSIONAL SUMMARY:

  • Over 9 years of experience in the information technology industry, delivering strong business solutions with excellent technical, communication, and customer service expertise.
  • Over 4 years of extensive experience in Hadoop, Spark, and Big Data technologies.
  • Strong experience using Hadoop ecosystem components such as HDFS, MapReduce, Pig, Hive, Sqoop, Oozie, and Impala.
  • Strong experience with AWS cloud services such as EC2 and S3.
  • Strong working experience in monitoring data using Splunk.
  • Strong experience in guiding and building CI/CD pipelines.
  • Strong coding experience in Python, Unix shell, and PowerShell.
  • Expertise in creating, debugging, scheduling, and monitoring jobs using Airflow and Oozie.
  • Experience in creating, scheduling, and debugging Spark jobs using Python.
  • Excellent understanding of Spark and its benefits in Big Data analytics.
  • Strong experience in creating, debugging, and successfully running jobs on EMR clusters.
  • Extensive experience working with data files in AWS S3.
  • Experience working with Amazon RedShift database.
  • In-depth knowledge of working with both Parquet and Avro data files.
  • In-depth experience using Sqoop to move data sets of widely varying sizes from Oracle to the Hadoop environment.
  • Strong experience parsing both structured and unstructured data files using DataFrames in PySpark (see the PySpark sketch at the end of this summary).
  • Strong debugging skills using the Cloudera Resource Manager.
  • Strong command of performance improvement techniques that help Hadoop jobs run faster.
  • Strong knowledge of NoSQL databases such as DocumentDB, Cassandra, and HBase.
  • Experience building ETL scripts in a range of technologies, including PL/SQL, Informatica, Hive, Pig, and PySpark.
  • Expert-level knowledge of PL/SQL programming in Oracle 10g/11g.
  • Expert-level knowledge in the design and development of PL/SQL packages, procedures, functions, triggers, views, sequences, indexes, and other database objects, as well as SQL performance tuning.
  • Design PL/SQL implementations; optimize and troubleshoot existing PL/SQL packages.
  • Demonstrated experience using Oracle collections, bulk-processing techniques, and partitioning to increase performance.
  • Very strong in normalized (OLTP) data modeling techniques.
  • Expertise in performance tuning of Informatica mappings and workflows.
  • Provide metrics and project planning updates for development efforts on Agile projects.
  • Strong knowledge and use of development methodologies, standards, and procedures.
  • Strong leadership qualities with excellent written and verbal communications skills.
  • Ability to multi-task and provide expertise for multiple development teams across concurrent project tasks.
  • Good time management and strong problem-solving skills.
  • Exposure to all phases of software development life cycle (SDLC)
  • Excellent interpersonal skills, an innate ability to motivate others, and openness to new and innovative ideas for the best possible solution.
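
As a minimal illustration of the PySpark DataFrame work referenced above, the sketch below reads semi-structured JSON into a DataFrame, applies light shaping, and writes it out as partitioned Parquet. The bucket, paths, and column names are hypothetical placeholders, not code from the projects described in this resume.

```python
# Minimal PySpark sketch: parse semi-structured JSON and persist it as Parquet.
# Bucket, paths, and column names are illustrative placeholders only.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, to_date

spark = (SparkSession.builder
         .appName("json-to-parquet-example")
         .getOrCreate())

# Read semi-structured JSON; Spark infers a schema from the records.
orders = spark.read.json("s3://example-bucket/raw/orders/")

# Select and cast the fields downstream jobs expect, and derive a date column.
cleaned = (orders
           .select(col("order_id").cast("long"),
                   col("customer_id").cast("long"),
                   col("order_ts").cast("timestamp"),
                   col("amount").cast("double"))
           .withColumn("order_date", to_date(col("order_ts"))))

# Write columnar Parquet, partitioned so queries can prune by date.
(cleaned.write
        .mode("overwrite")
        .partitionBy("order_date")
        .parquet("s3://example-bucket/curated/orders/"))

spark.stop()
```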

TECHNICAL SKILLS:

OPERATING SYSTEMS: Sun Solaris 5.6, UNIX, Red Hat Linux 3, Windows NT/95/98/2000/XP

LANGUAGES: C, C++, PL/SQL, Shell Scripting, HTML, XML, Java, Python, HQL, Pig, U-SQL, PowerShell

DATABASES: Oracle 7.3, 8, 8i, 9i, 10g, 11g, SQL Server CE, HBase, Cassandra, DocumentDB

TOOLS: TOAD, SQL Developer, SQL Navigator, Erwin, SQL*Plus, PL/SQL Editor, SQL*Loader, Informatica, Autosys, Airflow, Subversion, Bitbucket, Jenkins, Visual Studio Enterprise 2015

HADOOP DISTRIBUTIONS: Cloudera, Amazon Web Services, Hortonworks, Azure

PROFESSIONAL EXPERIENCE:

Confidential, Seattle, WA

Sr. Data/Hadoop Engineer

Roles & Responsibilities:

  • Developed Sqoop scripts to migrate data from Oracle to the big data environment.
  • Migrated the functionality of Informatica jobs to HQL scripts in Hive.
  • Developed ETL jobs using Pig, Hive, and Spark.
  • Extensively worked with Avro and Parquet files and converted data between the two formats.
  • Parsed semi-structured JSON data and converted it to Parquet using DataFrames in PySpark.
  • Created Python UDFs for use in Spark.
  • Created Hive DDL on Parquet and Avro data files residing in both HDFS and S3 buckets (a DDL sketch follows the environment line below).
  • Created Airflow scheduling scripts in Python (see the DAG sketch at the end of this list).
  • Worked extensively with Sqoop to ingest a wide range of data sets.
  • Worked extensively in a Sentry-enabled environment that enforces data security.
  • Involved in file movements between HDFS and AWS S3
  • Extensively worked with S3 bucket in AWS
  • Created Oozie workflows for scheduling
  • Created tables and views in RedShift Database
  • Imported data from S3 buckets to Redshift
  • Created data partitions on large data sets in S3 and DDL on partitioned data.
  • Converted all Hadoop jobs to run in EMR by configuring the cluster according to the data size.
  • Independently drove multiple small projects to completion with quality output.
  • Extensively used Stash/Bitbucket for source control.
  • Monitored and troubleshot Hadoop jobs using the YARN Resource Manager.
  • Monitored and troubleshot EMR job logs using Genie.
  • Provided mentorship to fellow Hadoop developers.
  • Provided solutions to technical issues in big data.
  • Explained issues in layman's terms to help BSAs understand.
  • Worked simultaneously on multiple tasks.
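
For reference, a minimal sketch of the Airflow scheduling pattern mentioned above, using Airflow 2.x import paths. The DAG id, schedule, and shell/Spark commands are hypothetical placeholders, not the original project's code.

```python
# Minimal Airflow DAG sketch (Airflow 2.x); ids, schedule, and commands are illustrative only.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash import BashOperator

default_args = {
    "owner": "data-eng",
    "retries": 2,
    "retry_delay": timedelta(minutes=10),
}

with DAG(
    dag_id="oracle_to_hadoop_daily",
    default_args=default_args,
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    # Step 1: land the day's extract (e.g., a Sqoop or ingestion wrapper script).
    ingest = BashOperator(
        task_id="ingest_raw_data",
        bash_command="sh /opt/jobs/ingest_raw_data.sh {{ ds }}",
    )

    # Step 2: transform the landed files with a PySpark job.
    transform = BashOperator(
        task_id="transform_to_parquet",
        bash_command="spark-submit /opt/jobs/transform_to_parquet.py --run-date {{ ds }}",
    )

    # Run the transform only after ingestion succeeds.
    ingest >> transform
```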

Environment: Sqoop, ETL, Pig, Hive, Spark, Python, HDFS, AWS S3, Airflow, Redshift, EMR, Bitbucket, YARN, Genie, Unix.
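
Below is a sketch of the Hive DDL pattern described above (an external, partitioned table over Parquet files in S3), issued through PySpark's SQL interface. The database, table, column, and bucket names are hypothetical placeholders.

```python
# Sketch of external-table DDL over partitioned S3 Parquet, via spark.sql.
# Database, table, columns, and bucket are illustrative placeholders only.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("hive-ddl-example")
         .enableHiveSupport()
         .getOrCreate())

# External table over Parquet files that already live in S3.
spark.sql("""
    CREATE EXTERNAL TABLE IF NOT EXISTS analytics.orders (
        order_id    BIGINT,
        customer_id BIGINT,
        amount      DOUBLE
    )
    PARTITIONED BY (order_date STRING)
    STORED AS PARQUET
    LOCATION 's3://example-bucket/curated/orders/'
""")

# Register the partition directories that exist under the table location.
spark.sql("MSCK REPAIR TABLE analytics.orders")

spark.stop()
```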

Confidential, Los Angeles, CA

Sr. Data Engineer

Roles & Responsibilities:

  • Gathered requirements and system specifications from the business users.
  • Developed PL/SQL Packages, Procedures, Functions, Triggers, Views, Indexes, Sequences and Synonyms.
  • Developed complex Informatica workflows and mappings.
  • Worked on tuning Informatica mappings using partitioning techniques.
  • Extensively involved in tuning slow performing queries, procedures and functions.
  • Extensively worked in OLAP environment.
  • Coordinated between OLTP and OLAP systems and teams.
  • Extensively used collections and collection types to improve data upload performance.
  • Coordinated with the QA team regularly on test scenarios and functionality.
  • Organized knowledge-sharing sessions with the PS team.
  • Identified and created missing DB links and indexes, and analyzed tables, which helped improve the performance of poorly running SQL queries.
  • Involved in both logical and physical model design.
  • Extensively worked with DBA Team for refreshing the pre-production databases.
  • Created index organized tables
  • Simultaneously worked on multiple applications.
  • Involved in estimating the effort required for the database tasks
  • Involved in fixing production bugs both within and outside assigned projects.
  • Explained issues in layman's terms to help BSAs understand.
  • Executed jobs in the Unix environment.
  • Involved in learning Hadoop technologies and coding a couple of Hadoop scripts.
  • Hands-on experience developing, installing, configuring, and using Hadoop and its ecosystem components, such as MapReduce, HDFS, Hive, Sqoop, Pig, Flume, Kafka, and Spark.
  • Involved in many dry-run activities to ensure smooth production releases.
  • Involved extensively in creating a release plan during the project Go-Live

Environment: PL/SQL, SQL, Informatica, Oracle 11g, OLTP, OLAP, Unix.

Confidential, Dallas, TX

ETL Developer

Roles & Responsibilities:

  • Involved in all phases of software development life cycle.
  • Interacted with data architects and business users to develop data profiles and analyzed source data.
  • Involved in requirements gathering and analysis to define functional specifications.
  • Created various logical and physical data models interacting with the business team, leads and developers using ERWIN.
  • Interacted with report users and business analysts to understand and document requirements and translate them into technical specifications for designing the Informatica mappings.
  • Involved in extracting source data from Oracle, SQL Server 2000, and flat files on different systems.
  • Extensively used Informatica Client tools - Source Analyzer, Warehouse Designer, Mapping Designer, Mapplet Designer, Informatica Repository Manager and Informatica Workflow Manager.
  • Developed number of Informatica mappings based on business requirements using various transformations like Dynamic Lookup, Connected and Unconnected lookups, Filter, Stored procedure, Update Strategy, Joiner, Aggregator, Expression, Router, Sequence generator and Normalizer.
  • Created Connected, Unconnected and Dynamic lookup transformation for better performance and increased the cache file size based on the size of the lookup data.
  • Extensively tested the mappings by running the queries against Source and Target tables and by using Break points in the Debugger.
  • Used Informatica features to implement Type 1 and Type 2 changes in slowly changing dimension tables.
  • Used version control to check in and checkout versions of objects.
  • Used Informatica Designer to create reusable transformations to be used in Informatica mappings and mapplets.
  • Moved the Informatica objects to global repository after finalizing the design in the local repository.
  • Created and scheduled sessions and jobs (on demand, run on time, and run only once) using Workflow Manager.
  • Worked on fixing poorly designed mappings, workflows, worklets, sessions, and target data loads for better performance.
  • Extensively worked on performance tuning of sources, targets, mappings, and SQL queries in the transformations.
  • Wrote PL/SQL stored procedures, functions, packages, and triggers to implement business rules in the application.
  • Responsible for SQL tuning and optimization using Explain Plan, TKPROF utility and optimizer hints.
  • Wrote UNIX shell scripts to work with flat files, to define parameter files and to create pre and post session commands.
  • Used pmcmd commands in non-Windows environments.
  • Worked as part of a team and provided 24x7 support when required.
  • Participated in Enhancements meeting to distinguish between bugs and enhancements.

Environment: Informatica PowerCenter 8.5, Erwin, Oracle 10g, SQL, PL/SQL, TOAD, Flat Files, UNIX Shell Scripting, Sun Solaris 5.8 and Windows 2000.

Confidential, Dallas, TX

Oracle Developer

Roles & Responsibilities:

  • Gathered requirements and system specifications from the business users.
  • Developed PL/SQL Packages, Procedures, Functions, Triggers, Views, Indexes, Sequences and Synonyms.
  • Extensively involved in tuning slow performing queries, procedures and functions.
  • Extensively used collections and collection types to improve the data upload performance.
  • Involved in working with the ETL team to load data from Oracle 10g into Teradata.
  • Coordinated with the QA team regularly on test scenarios and functionality.
  • Organized knowledge-sharing sessions with the PS team.
  • Identified and created missing DB links and indexes, and analyzed tables, which helped improve the performance of poorly running SQL queries.
  • Involved in both logical and physical model design.
  • Extensively worked with DBA Team for refreshing the pre-production databases.
  • Worked closely with JBOSS team in providing the data needs.
  • Worked on the APEX tool, which is used to create and store customer store information.
  • Created index organized tables
  • Closely worked with SAP systems.
  • Simultaneously worked on multiple applications.
  • Involved in estimating the effort required for the database tasks
  • Involved in fixing production bugs both within and outside assigned projects.
  • Explained issues in layman's terms to help BSAs understand.
  • Executed jobs in the Unix environment.
  • Involved in many dry-run activities to ensure smooth production releases.
  • Involved extensively in creating a release plan during the project Go-Live
  • Coordinated with the DBA team to gather Statspack reports for a given time frame, showing the database load and activity during that period.

Environment: PL/SQL, SQL, ETL, Oracle 10g, JBoss, APEX, SAP, Unix.

Confidential

Oracle Developer

Roles & Responsibilities:

  • Worked on designing the content and delivering the solutions based on understanding the requirements.
  • Wrote a web service client for order-tracking operations that accesses the web services API and is utilized in our web application.
  • Developed the user interface using JavaScript, jQuery, and HTML.
  • Used the AJAX API for intensive user operations and client-side validations.
  • Worked with Java, J2EE, SQL, JDBC, XML, JavaScript, and web servers.
  • Utilized servlets for the controller layer and JSP with JSP tags for the interface.
  • Worked on Model View Controller Pattern and various design patterns.
  • Worked with designers, architects, developers for translating data requirements into the physical schema definitions for SQL sub-programs and modified the existing SQL program units.
  • Designed and Developed SQL functions and stored procedures.
  • Involved in debugging and bug fixing of application modules.
  • Efficiently dealt with exceptions and flow control.
  • Worked on Object Oriented Programming concepts.
  • Added Log4j to log the errors.
  • Used Eclipse for writing code and SVN for version control.
  • Installed and used MS SQL Server 2008 database.
  • Spearheaded coding for site management which included change of requests for enhancing and fixing bugs pertaining to all parts of the website.

Environment: Java, JDK 1.8, Apache Tomcat 7, JavaScript, JSP, JDBC, Servlets, MS SQL Server, XML, Windows XP, Ant, Red Hat Linux, Eclipse Luna, SVN.
