
Sr. Data/Hadoop Engineer Resume


Beaverton, OR

SUMMARY

  • Over 9 years of experience in the information technology industry, delivering strong business solutions with excellent technical, communication, and customer service skills.
  • Over 3 years of experience with Hadoop, Spark, and big data technologies.
  • Strong experience with Hadoop ecosystem components such as HDFS, MapReduce, Pig, Hive, Sqoop, Oozie, and Impala.
  • Strong experience with AWS cloud services such as EC2 and S3.
  • Expertise in creating, debugging, scheduling, and monitoring jobs using Airflow and Oozie (see the Airflow sketch after this list).
  • Strong experience with Python and Unix shell scripts.
  • Experience creating, scheduling, and debugging Spark jobs in Python.
  • Excellent understanding of Spark and its benefits for big data analytics.
  • Strong experience creating, debugging, and successfully running jobs on EMR clusters.
  • Extensive experience working with data files in AWS S3.
  • Experience working with the Amazon Redshift database.
  • In-depth knowledge of both Parquet and Avro data files.
  • In-depth experience using Sqoop to move data sets of widely varying sizes from Oracle into the Hadoop environment.
  • Strong experience parsing structured and unstructured data files using DataFrames in PySpark.
  • Strong debugging skills using the Cloudera resource manager.
  • Strong command of performance-tuning techniques that make Hadoop jobs run faster.
  • Good knowledge of NoSQL databases such as Cassandra and HBase.
  • Experience building ETL scripts in several technologies, including PL/SQL, Informatica, Hive, Pig, and PySpark.
  • Expert-level knowledge of Oracle PL/SQL programming in Oracle 10g/11g.
  • Expert-level knowledge of the design and development of PL/SQL packages, procedures, functions, triggers, views, sequences, indexes, and other database objects, as well as SQL performance tuning.
  • Design PL/SQL implementations; optimize and troubleshoot existing PL/SQL packages.
  • Demonstrated experience using Oracle collections, bulk techniques, and partitioning to increase performance.
  • Very strong in data modeling techniques for normalized (OLTP) models.
  • Expertise in creating complex Informatica mappings and workflows.
  • Expertise in performance tuning of Informatica workflows.
  • Provide metrics and project planning updates for development efforts on Agile projects.
  • Strong experience working on projects that use Agile methodologies.
  • Strong knowledge and use of development methodologies, standards, and procedures.
  • Strong leadership qualities with excellent written and verbal communication skills.
  • Ability to multi-task and provide expertise to multiple development teams across concurrent project tasks.
  • Good time management and strong problem-solving skills.
  • Successfully coordinated and delivered several projects for Confidential.
  • Exposure to all phases of the software development life cycle (SDLC).
  • Involved in module integration, integration testing, and finalizing unit test cases.
  • Excellent interpersonal skills, an innate ability to motivate, and openness to new and innovative ideas for the best possible solution.
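
A minimal sketch of the kind of Airflow scheduling script referenced above, assuming a nightly spark-submit wrapped in a BashOperator on Airflow 1.x; the DAG id, owner, schedule, and script path are hypothetical placeholders, not taken from the resume:

    # Minimal Airflow DAG that schedules a nightly PySpark job.
    # DAG id, owner, schedule, and script path are hypothetical.
    from datetime import datetime, timedelta

    from airflow import DAG
    from airflow.operators.bash_operator import BashOperator

    default_args = {
        "owner": "data-eng",                    # hypothetical owner
        "retries": 2,                           # retry failed runs twice
        "retry_delay": timedelta(minutes=10),
    }

    with DAG(
        dag_id="nightly_pyspark_load",          # hypothetical DAG name
        default_args=default_args,
        start_date=datetime(2017, 1, 1),
        schedule_interval="0 2 * * *",          # run daily at 02:00
        catchup=False,
    ) as dag:
        # Submit the PySpark job; the spark-submit path is an assumption.
        run_spark_job = BashOperator(
            task_id="run_spark_job",
            bash_command="spark-submit --master yarn /opt/jobs/parse_events.py",
        )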

TECHNICAL SKILLS

Operating Systems: Sun Solaris 5.6, UNIX, Red Hat Linux 3, Windows NT/95/98/2000/XP

Languages: C, C++, PL/SQL, Shell Scripting, HTML, XML, Java, Python, HQL, Pig Latin

Databases: Oracle 7.3, 8, 8i, 9i, 10g, 11g, SQL Server CE, HBase, Cassandra

Tools & Utilities: TOAD, SQL Developer, SQL Navigator, Erwin, SQL*Plus, PL/SQL Editor, SQL*Loader, Informatica, Autosys, Airflow, Subversion, Bitbucket, Jenkins

PROFESSIONAL EXPERIENCE

Confidential, Beaverton, OR

Sr. Data/Hadoop Engineer

Responsibilities:

  • Developed Sqoop scripts to migrate data from Oracle to the big data environment.
  • Migrated the functionality of Informatica jobs to HQL scripts using Hive.
  • Developed ETL jobs using Pig, Hive, and Spark.
  • Worked extensively with Avro and Parquet files, converting data between the two formats.
  • Parsed semi-structured JSON data and converted it to Parquet using DataFrames in PySpark (see the sketch after this list).
  • Created Python UDFs for use in Spark.
  • Created Hive DDL over Parquet and Avro data files residing in both HDFS and S3 buckets.
  • Created Airflow scheduling scripts in Python.
  • Worked extensively on Sqooping a wide range of data sets.
  • Worked extensively in a Sentry-enabled system that enforces data security.
  • Involved in file movement between HDFS and AWS S3.
  • Worked extensively with S3 buckets in AWS.
  • Created Oozie workflows for scheduling.
  • Created data partitions on large data sets in S3 and DDL over the partitioned data.
  • Converted all Hadoop jobs to run on EMR, configuring each cluster according to the data size.
  • Self-directed multiple small projects, delivering quality output.
  • Extensively used Bitbucket (Stash) for source control.
  • Monitored and troubleshot Hadoop jobs using the YARN resource manager.
  • Monitored and troubleshot EMR job logs using Genie.
  • Mentored fellow Hadoop developers.
  • Provided solutions to technical issues in big data.
  • Explained issues in layman's terms to help BSAs understand them.
  • Worked on multiple tasks simultaneously.
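
A minimal sketch of the JSON-to-Parquet flow, Python UDF, and Hive DDL described above, assuming Spark 2.x with Hive support on EMR; the bucket, database, table, and column names are hypothetical placeholders:

    # Parse semi-structured JSON, apply a Python UDF, and write
    # partitioned Parquet to S3 with a Hive external table over it.
    # All paths and names below are hypothetical.
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import udf
    from pyspark.sql.types import StringType

    spark = (SparkSession.builder
             .appName("json_to_parquet")
             .enableHiveSupport()
             .getOrCreate())

    # Read semi-structured JSON; Spark infers the schema.
    events = spark.read.json("s3://example-bucket/raw/events/")

    # A simple Python UDF, usable on DataFrame columns in Spark.
    normalize_region = udf(lambda r: (r or "unknown").strip().lower(),
                           StringType())
    events = events.withColumn("region", normalize_region(events["region"]))

    # Write Parquet partitioned by region.
    (events.write.mode("overwrite")
           .partitionBy("region")
           .parquet("s3://example-bucket/curated/events/"))

    # Expose the partitioned data to Hive as an external table.
    spark.sql("""
        CREATE EXTERNAL TABLE IF NOT EXISTS analytics.events (event_id STRING)
        PARTITIONED BY (region STRING)
        STORED AS PARQUET
        LOCATION 's3://example-bucket/curated/events/'
    """)
    spark.sql("MSCK REPAIR TABLE analytics.events")  # discover partitions

Registering the curated location as an external table lets Hive HQL and downstream Spark jobs query the same S3 files in place, without copying data.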

Confidential, Beaverton, OR

Sr. Data Engineer

Responsibilities:

  • Gathered requirements and system specifications from business users.
  • Developed PL/SQL packages, procedures, functions, triggers, views, indexes, sequences, and synonyms.
  • Developed complex Informatica workflows and mappings.
  • Tuned Informatica mappings using partitioning techniques.
  • Extensively involved in tuning slow-performing queries, procedures, and functions.
  • Worked extensively in an OLAP environment.
  • Coordinated between OLTP and OLAP systems and teams.
  • Extensively used collections and collection types to improve data upload performance (see the sketch after this list).
  • Coordinated regularly with the QA team on test scenarios and functionality.
  • Organized knowledge-sharing sessions with the PS team.
  • Identified and created missing database links and indexes and analyzed tables, which improved the performance of poorly performing SQL queries.
  • Involved in both logical and physical model design.
  • Worked extensively with the DBA team to refresh the pre-production databases.
  • Created index-organized tables.
  • Worked on multiple applications simultaneously.
  • Involved in estimating the effort required for database tasks.
  • Involved in fixing production bugs, both within and outside assigned projects.
  • Explained issues in layman's terms to help BSAs understand them.
  • Executed jobs in a Unix environment.
  • Learned Hadoop technology and coded a couple of Hadoop scripts.
  • Involved in many dry-run activities to ensure smooth production releases.
  • Involved extensively in creating the release plan for the project go-live.
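
The bulk-upload technique cited above is PL/SQL collections with bulk binds (BULK COLLECT / FORALL); as a rough analogue in Python, consistent with the other sketches in this document, the same batching idea looks like this with cx_Oracle's executemany. The DSN, credentials, and staging table are hypothetical placeholders:

    # Batched inserts into Oracle: one round trip per batch instead of
    # one per row, mirroring the benefit of PL/SQL bulk binds.
    # Connection details and table name are hypothetical.
    import cx_Oracle

    conn = cx_Oracle.connect("scott", "tiger", "dbhost/ORCLPDB1")
    cur = conn.cursor()

    rows = [(i, "item-%d" % i) for i in range(100000)]  # sample payload

    batch_size = 5000
    for start in range(0, len(rows), batch_size):
        cur.executemany(
            "INSERT INTO stage_items (item_id, item_name) VALUES (:1, :2)",
            rows[start:start + batch_size],
        )
    conn.commit()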

Confidential, Beaverton, OR

Oracle Developer

Responsibilities:

  • Gathered requirements and system specifications from business users.
  • Developed PL/SQL packages, procedures, functions, triggers, views, indexes, sequences, and synonyms.
  • Extensively involved in tuning slow-performing queries, procedures, and functions.
  • Extensively used collections and collection types to improve data upload performance into ATLAS.
  • Worked with the ETL team to load data from Oracle 10g into Teradata.
  • Coordinated regularly with the QA team on test scenarios and functionality.
  • Organized knowledge-sharing sessions with the PS team.
  • Identified and created missing database links and indexes and analyzed tables, which improved the performance of poorly performing SQL queries.
  • Involved in both logical and physical model design.
  • Worked extensively with the DBA team to refresh the pre-production databases.
  • Worked closely with the JBoss team to meet their data needs.
  • Worked on the APEX tool used to create and store customer store information.
  • Created index-organized tables (see the sketch after this list).
  • Worked closely with SAP systems.
  • Worked on multiple applications simultaneously.
  • Involved in estimating the effort required for database tasks.
  • Involved in fixing production bugs, both within and outside assigned projects.
  • Explained issues in layman's terms to help BSAs understand them.
  • Executed jobs in a Unix environment.
  • Involved in many dry-run activities to ensure smooth production releases.
  • Involved extensively in creating the release plan for the project go-live.
  • Coordinated with the DBA team to gather Statspack reports for specific time frames, showing the database load and activity during those periods.
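
A minimal sketch of the index-organized table mentioned above, executed through cx_Oracle for consistency with the other sketches here (in practice such DDL would typically run from SQL*Plus or TOAD); the table and connection details are hypothetical:

    # An index-organized table stores rows in the primary-key B-tree
    # itself, so primary-key lookups avoid a separate table access.
    # Connection details and table definition are hypothetical.
    import cx_Oracle

    conn = cx_Oracle.connect("scott", "tiger", "dbhost/ORCLPDB1")
    cur = conn.cursor()

    cur.execute("""
        CREATE TABLE store_lookup (
            store_id   NUMBER PRIMARY KEY,
            store_name VARCHAR2(100)
        ) ORGANIZATION INDEX
    """)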
