Sr. Data/ Hadoop Engineer Resume
Beaverton, OR
SUMMARY
- Over 9 years of experience in the information technology industry, providing strong business solutions with excellent technical, communication, and customer service expertise.
- Over 3 years of extensive experience in Hadoop, Spark, and Big Data technologies
- Strong experience using Hadoop ecosystem components such as HDFS, MapReduce, Pig, Hive, Sqoop, Oozie, and Impala.
- Strong Experience in AWS Cloud services like EC2 and S3
- Expertise in Creating, Debugging, Scheduling and Monitoring jobs using Airflow and Oozie.
- Strong experience working with Python and Unix shell scripts.
- Experience in Creating, Scheduling and Debugging Spark jobs using Python.
- Excellent understanding of Spark and its benefits in Big Data analytics
- Strong experience creating, debugging, and successfully running jobs on EMR clusters.
- Strong and Extensive experience in dealing with data files in AWS S3
- Experience working with Amazon RedShift database.
- In-depth knowledge working with both Parquet and Avro data files
- In-depth experience Sqooping a wide range of data volumes from Oracle into the Hadoop environment
- Strong Experience in Parsing both structured and unstructured data files using Data Frames in PySpark
- Strong debugging skills using the Cloudera Resource Manager
- Strong performance improvement techniques that help Hadoop jobs run faster.
- Good knowledge of NoSQL databases like Cassandra and HBase
- Unique experience building ETL scripts across tools and languages such as PL/SQL, Informatica, Hive, Pig, and PySpark.
- Expert-level knowledge of Oracle PL/SQL programming in Oracle 10g/11g
- Expert-level knowledge in design and development of PL/SQL Packages, Procedures, Functions, Triggers, Views, Sequences, Indexes and other DB objects, SQL Performance Tuning.
- Designed PL/SQL implementations; optimized and troubleshot existing PL/SQL packages
- Demonstrated experience using Oracle Collections, bulking techniques, partition utilization to increase performance.
- Very strong in data modeling techniques in normalized (OLTP) modeling
- Expertise in creating complex Informatica mappings and workflows.
- Expertise in Performance tuning Informatica workflows
- Provide metrics and project planning updates for the development effort in Agile Projects.
- Strong experience working in projects involving Agile Methodologies
- Strong knowledge and use of development methodologies, standards, and procedures.
- Strong leadership qualities with excellent written and verbal communications skills.
- Ability to multi-task and provide expertise for multiple development teams across concurrent project tasks.
- Good time management skills & Strong problem solving skills
- Successfully coordinated & delivered several projects for Confidential
- Exposure to all phases of software development life cycle (SDLC)
- Involved in module integration, integration testing, and finalizing unit test cases
- Excellent interpersonal skills, an innate ability to motivate others, and openness to new and innovative ideas for the best possible solution.
TECHNICAL SKILLS
Operating Systems: Sun Solaris 5.6, UNIX, Red Hat Linux 3, Windows NT/95/98/2000/XP
Languages: C, C++, PL/SQL, Shell Scripting, HTML, XML, Java, Python, HQL, PIG
Databases: Oracle 7.3, 8, 8i, 9i, 10g, 11g, SQL Server CE, HBase, Cassandra
Tools & Utilities: TOAD, SQL Developer, SQL Navigator, Erwin, SQL*Plus, PL/SQL Editor, SQL*Loader, Informatica, Autosys, Airflow, Subversion, Git-Bucket, Jenkins
PROFESSIONAL EXPERIENCE
Confidential, Beaverton, OR
Sr. Data/ Hadoop Engineer
Responsibilities:
- Developed Sqoop scripts to migrate data from Oracle into the Big Data environment (see the Sqoop sketch after this list)
- Migrated the functionality of Informatica jobs to HQL scripts using HIVE
- Developed ETL jobs using PIG, HIVE and SPARK
- Extensively worked with Avro and Parquet files and converted data between the two formats
- Parsed semi-structured JSON data and converted it to Parquet using DataFrames in PySpark (see the PySpark sketch after this list).
- Created Python UDFs for use in Spark (see the UDF sketch after this list)
- Created Hive DDL on Parquet and Avro data files residing in both HDFS and S3 buckets (see the DDL sketch after this list)
- Created Airflow scheduling scripts (DAGs) in Python (see the Airflow sketch after this list)
- Worked extensively on Sqooping a wide range of data sets
- Extensively worked in Sentry Enabled system which enforces data security
- Involved in file movement between HDFS and AWS S3 (see the distcp sketch after this list)
- Extensively worked with S3 bucket in AWS
- Created Oozie workflows for scheduling
- Created data partitions on large data sets in S3 and DDL on partitioned data.
- Converted all Hadoop jobs to run in EMR by configuring the cluster according to the data size.
- Independently drove multiple small projects to completion with quality output
- Extensively used Stash Git-Bucket for Code Control
- Monitored and troubleshot Hadoop jobs using the YARN Resource Manager
- Monitored and troubleshot EMR job logs using Genie
- Provided mentorship to fellow Hadoop developers
- Provided Solutions to technical issues in Big data
- Explained issues in layman's terms to help BSAs understand
- Worked simultaneously on multiple tasks.
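A minimal sketch of the kind of Sqoop import referenced above, invoked from a Python wrapper; the JDBC URL, credentials file, table, and target directory are placeholder assumptions, not actual project values.

```python
# Hypothetical Sqoop import from Oracle into HDFS, run from Python.
# Connection string, password file, table, and target directory are assumed.
import subprocess

subprocess.check_call([
    "sqoop", "import",
    "--connect", "jdbc:oracle:thin:@//oracle-host:1521/ORCL",
    "--username", "etl_user",
    "--password-file", "/user/etl/.oracle_pass",  # keeps the password off the command line
    "--table", "SALES.ORDERS",
    "--target-dir", "/data/raw/orders",
    "--num-mappers", "4",
    "--as-avrodatafile",                           # land the data as Avro files
])
```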
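A minimal PySpark sketch of parsing semi-structured JSON with DataFrames and writing Parquet, as mentioned above; the S3 paths and column names are illustrative assumptions.

```python
# Hedged sketch: read semi-structured JSON into a DataFrame and write Parquet.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("json_to_parquet").getOrCreate()

# multiLine handles JSON records that span multiple lines
raw_df = spark.read.option("multiLine", "true").json("s3a://example-bucket/raw/events/")

# flatten a nested field and keep a few top-level columns (names assumed)
flat_df = raw_df.selectExpr("id", "event_type", "payload.user_id AS user_id")

flat_df.write.mode("overwrite").parquet("s3a://example-bucket/curated/events/")
```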
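A minimal sketch of a Python UDF registered for use from both Spark SQL and the DataFrame API; the normalization rule and names are assumed for illustration.

```python
# Hedged sketch of a Python UDF used in Spark; the cleanup rule is an assumed example.
from pyspark.sql import SparkSession
from pyspark.sql.functions import udf
from pyspark.sql.types import StringType

spark = SparkSession.builder.appName("udf_example").getOrCreate()

def normalize_code(value):
    """Trim and upper-case a free-text code; pass nulls through unchanged."""
    return value.strip().upper() if value is not None else None

# register for Spark SQL and wrap for DataFrame use
spark.udf.register("normalize_code", normalize_code, StringType())
normalize_udf = udf(normalize_code, StringType())

df = spark.createDataFrame([(" ab12 ",), (None,)], ["code"])
df.select(normalize_udf("code").alias("code_clean")).show()
```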
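A minimal sketch of the kind of Hive DDL created over partitioned Parquet data in S3, issued here through spark.sql; the database, table, columns, and bucket are assumptions.

```python
# Hedged sketch: external Hive table over partitioned Parquet data in S3.
# Database, table, column, and bucket names are placeholder assumptions.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("ddl_example").enableHiveSupport().getOrCreate()

spark.sql("""
    CREATE EXTERNAL TABLE IF NOT EXISTS analytics.events (
        id         STRING,
        user_id    STRING,
        event_type STRING
    )
    PARTITIONED BY (event_date STRING)
    STORED AS PARQUET
    LOCATION 's3a://example-bucket/curated/events/'
""")

# pick up partitions already laid out as event_date=YYYY-MM-DD/ prefixes
spark.sql("MSCK REPAIR TABLE analytics.events")
```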
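A minimal Airflow DAG sketch of the Python scheduling scripts mentioned above; the DAG id, schedule, owner, and spark-submit command are assumed placeholders.

```python
# Hedged sketch of an Airflow DAG that schedules a daily PySpark job.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash_operator import BashOperator  # Airflow 1.x import path

default_args = {
    "owner": "data-eng",
    "retries": 1,
    "retry_delay": timedelta(minutes=10),
}

dag = DAG(
    dag_id="daily_events_load",
    default_args=default_args,
    start_date=datetime(2017, 1, 1),
    schedule_interval="@daily",
    catchup=False,
)

run_pyspark_job = BashOperator(
    task_id="run_pyspark_job",
    bash_command="spark-submit --master yarn /opt/jobs/json_to_parquet.py",
    dag=dag,
)
```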
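A minimal sketch of moving files between HDFS and S3; hadoop distcp is one common approach, invoked here from Python, with source and destination URIs assumed.

```python
# Hedged sketch: copy a directory tree between HDFS and S3 with hadoop distcp.
# The source and destination URIs are illustrative assumptions.
import subprocess

src = "hdfs:///data/curated/events/"
dst = "s3a://example-bucket/curated/events/"

# distcp runs as a distributed MapReduce copy; -update skips files already present
subprocess.check_call(["hadoop", "distcp", "-update", src, dst])
```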
Confidential, Beaverton, OR
Sr. Data Engineer
Responsibilities:
- Gathered requirements and system specifications from business users.
- Developed PL/SQL Packages, Procedures, Functions, Triggers, Views, Indexes, Sequences and Synonyms.
- Developed complex Informatica workflows and mappings.
- Worked on tuning Informatica mappings using partitioning techniques
- Extensively involved in tuning slow performing queries, procedures and functions.
- Extensively worked in an OLAP environment.
- Coordinated between OLTP and OLAP systems and teams.
- Extensively used collections and collection types to improve the data upload performance
- Coordinated with the QA team regularly on test scenarios and functionality.
- Organized knowledge sharing sessions with PS team.
- Identified and created missing DB links and indexes, and analyzed tables, which helped improve the performance of poorly running SQL queries.
- Involved in both logical and physical model design.
- Extensively worked with DBA Team for refreshing the pre-production databases.
- Created index organized tables
- Simultaneously worked on multiple applications.
- Involved in estimating the effort required for the database tasks
- Involved in fixing production bugs both within and outside assigned projects
- Explained issues in layman's terms to help BSAs understand
- Executed Jobs in Unix Environment
- Involved in learning Hadoop technology and coding a couple of Hadoop scripts.
- Involved in many dry-run activities to ensure smooth production releases
- Involved extensively in creating a release plan during the project Go-Live
Confidential, Beaverton, OR
Oracle Developer
Responsibilities:
- Gathered requirements and system specifications from business users.
- Developed PL/SQL Packages, Procedures, Functions, Triggers, Views, Indexes, Sequences and Synonyms.
- Extensively involved in tuning slow performing queries, procedures and functions.
- Extensively used collections and collection types to improve the data upload performance into ATLAS.
- Worked with the ETL team on loading data from Oracle 10g into Teradata
- Coordinated with the QA team regularly on test scenarios and functionality.
- Organized knowledge sharing sessions with PS team.
- Identified and created missing DB links and indexes, and analyzed tables, which helped improve the performance of poorly running SQL queries.
- Involved in both logical and physical model design.
- Extensively worked with DBA Team for refreshing the pre-production databases.
- Worked closely with the JBoss team to support their data needs.
- Worked on the APEX tool, which is used to create and store Customer Store information.
- Created index organized tables
- Closely worked with SAP systems.
- Simultaneously worked on multiple applications.
- Involved in estimating the effort required for the database tasks
- Involved in fixing production bugs both within and outside assigned projects
- Explained issues in layman's terms to help BSAs understand
- Executed Jobs in Unix Environment
- Involved in many dry-run activities to ensure smooth production releases
- Involved extensively in creating a release plan during the project Go-Live
- Coordinated with the DBA team to gather Statspack reports for a given time frame, showing the database load and activity during that window.