
Lead Data Engineer Resume


Richmond, VA

SUMMARY

  • 130 months (over 10 years) of experience in the design, development, and implementation of ETL (extract, transform, and load) strategies for high-volume data warehousing projects using the Ab Initio ETL tool and Agile methodology.
  • 2 years of experience in the AWS ecosystem, including Amazon S3, RDS, EC2 instances and Auto Scaling, EMR, rehydration, Jenkins, GitHub, Snowflake, crontab, CloudFormation templates, Amazon Machine Images (AMIs), MapReduce, ServiceNow, Arow, S3 Glacier, Lambda, SNS, CloudWatch, etc.
  • Continuously learning and implementing innovative ideas at the application and environment level, automating manual work to reduce effort and turnaround time.
  • Extensive programming skills in the Ab Initio ETL tool, Oracle, and UNIX shell scripting.
  • Worked with batch graphs and Conduct>It plans in Ab Initio.
  • Set up sandboxes and sandbox parameters, checked code in and out of the EME, and performed troubleshooting and debugging.
  • Demonstrated ability to grasp new concepts (both technical and business), with strong analytical and problem-solving skills.
  • Effective communicator with good customer rapport and a commitment to application success and support.
  • Practical experience working across multiple environments, including production, development, and testing.
  • Experienced in using plans, PDLs, psets, Query>It, Express>It, and meta-programming functions.
  • Worked with serial files and multifiles.
  • Well versed in Ab Initio parallelism techniques and in implementing Ab Initio graphs using data parallelism and MFS techniques.
  • Expertise in various Ab Initio component groups such as Partition, De-partition, Database, Dataset, Transform, Normalize, Sort, and Miscellaneous.
  • Provided quick solutions for issues in batch process flows in the production environment.
  • Performance tuning of Ab Initio graphs and SQL queries.
  • Experienced in handling platform-related issues.
  • Expertise in understanding requirements, designing modules per requirements and standards, and delivering modules on time.
  • Extensively worked in Agile and iterative SDLC methodologies.
  • Assisted in project scoping, planning, estimating, scheduling, and drafting procedures.
  • Acted as delivery manager for the Jira stories created for each cycle/release across all technologies.
  • Led activities such as new system/process development, enhancements to existing systems, and designing and developing business/technical documents.
  • Provided extensive support to the reporting team by creating the required source tables loaded as part of ETL.
  • Assisted in creating business requirement documents (BRDs), mini design documents (MDDs), and technical design documents (TDDs).
  • Responsible for coordinating and setting up secure transfer protocols (SFTP/NDM) between upstream and downstream servers.

PROFESSIONAL EXPERIENCE:

Confidential

Lead Data Engineer

Responsibilities:

  • Created and manipulated DataFrames using PySpark.
  • Performed field-level, file-level, and table-level manipulation using DataFrames.
  • Used Databricks to load data into Snowflake databases (see the sketch after this section).
  • Worked on the EC2/EMR rehydration activity.
  • Post-rehydration, onboarded applications onto the new EC2/EMR instances.
  • Updated security groups and maintained IAM roles for EC2, S3, etc.
  • Created data pipelines from source systems to databases such as Snowflake, PostgreSQL, and Redshift.
  • Worked on automation and innovation to reduce manual effort.
  • Performed SQL operations on DataFrames in the Spark environment.
  • Worked on AWS DevOps activities across services such as EC2, EMR, S3, and RDS clusters.
  • Implemented file-level and field-level tokenization using PySpark.
  • Worked in different environments (CAT1, CAT2, and CAT3) to maintain data privacy across regions.
  • Created HLD/DLD design documents.
  • Designed project architecture for new automation and innovation initiatives.
  • Worked on data conversion for Capital One partners such as Kohl's, Ridge, BJ's, Williams Sonoma, etc.
  • Performed data quality fixes, troubleshot Ab Initio jobs, raised change orders (COs), etc.
  • Worked in an Agile environment using the Kanban process.

Environment: UNIX shell scripting, Snowflake, Teradata, PySpark, Databricks, AWS services (S3, S3 Glacier, EMR, EC2), CFT, Git, Jenkins, DataFrames, etc.
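A minimal PySpark sketch of the DataFrame manipulation and Snowflake load described above, assuming a Databricks cluster with the Spark-Snowflake connector available; the S3 path, table, account details, and tokenization column are illustrative placeholders, not the actual project code.

```python
from pyspark.sql import SparkSession, functions as F

# On Databricks a SparkSession already exists as `spark`; getOrCreate() returns it.
spark = SparkSession.builder.appName("partner-load").getOrCreate()

# Field-level manipulation on a landed extract (hypothetical S3 path and columns).
df = (
    spark.read.option("header", "true").csv("s3://example-bucket/landing/partner/")
    .withColumn("load_dt", F.current_date())
    .withColumn("acct_id", F.trim(F.col("acct_id")))
    # Hash-based stand-in for field-level tokenization; a real pipeline would
    # call the enterprise tokenization service instead.
    .withColumn("acct_token", F.sha2(F.col("acct_id"), 256))
)

# SQL operations against the DataFrame.
df.createOrReplaceTempView("partner_stg")
daily_counts = spark.sql(
    "SELECT load_dt, COUNT(*) AS rec_cnt FROM partner_stg GROUP BY load_dt"
)
daily_counts.show()

# Write to Snowflake through the connector bundled with Databricks.
sf_options = {
    "sfURL": "example_account.snowflakecomputing.com",  # placeholder account
    "sfUser": "ETL_USER",
    "sfPassword": "********",
    "sfDatabase": "EDW",
    "sfSchema": "STAGE",
    "sfWarehouse": "LOAD_WH",
}
(
    df.write.format("snowflake")
    .options(**sf_options)
    .option("dbtable", "PARTNER_STG")
    .mode("append")
    .save()
)
```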

Confidential

Senior Data Engineer

Responsibilities:

  • Developed generic Ab Initio graphs.
  • Created and modified existing XFRs, applying transformation rules to map new hierarchies.
  • Provided support for enterprise information warehouse applications.
  • Provided support for enterprise data warehouse applications.
  • Provided incident and data quality support to ensure data loads completed by end of day.
  • Worked with end users, clients, and platform teams to meet time-to-market goals.
  • Provided 24x7 support for Ab Initio and Informatica batch jobs.
  • Involved in code deployment and knowledge transition for data warehouse applications.
  • Provided 24x7 integrated production support.
  • Involved in requirement gathering, analysis, unit testing, performance testing, etc.

Environment: UNIX shell scripting, Teradata, Oracle, Informatica, Ab Initio, ServiceNow, Control-M

Confidential, Richmond, VA

Senior Data Engineer

Responsibilities:

  • As part of the AWS cloud platform, performed rehydration of EC2/EMR instances every 60 days by picking up the latest AMI ID and verifying that applications ran without compliance issues (a sketch of this step follows the list).
  • Performed security group updates for cloud applications across the enterprise and kept production healthy.
  • Performed code promotion using GitHub/Jenkins for environment-level changes such as EC2 instance types (m4, m5, t3) in large, medium, and small configurations.
  • Performed the rehydration activity using CloudFormation templates, Terraform, etc.
  • Upon successful completion of rehydration, onboarded application code onto the new EC2/EMR instances.
  • Deployed code onto the new EC2/EMR instances, including crontab entries.
  • Triggered agents for the Arow, Control-M, and crontab scheduling tools on the new EC2/EMR instances; validated application status for 10 consecutive runs to confirm everything ran fine.
  • Post validation/rehydration, terminated the older EC2/EMR instances and made sure the environment followed standards.
  • Performed cleanup for the different applications creating files on the EC2/EMR instances.
  • Configured auto scaling for EC2/EMR in production, setting the minimum and maximum instance counts.
  • Launched additional EC2/EMR instances for catch-up activity and verified that applications ran fine in production.
  • Set up S3 bucket policy updates for different regions in production and made sure data replication completed across regions.
  • Performed S3 bucket versioning and tag updates in the production environment.
  • Created/updated S3 buckets in production via change orders, upon proper approval from the application/business teams.
  • Created additional EMR/EC2 instances to perform catch-up activity for streaming and non-streaming applications.
  • Updated tags in the CFT, tested in the development and QA regions, and migrated the changes to production upon successful testing.
  • Updated security groups every quarter (Q1, Q2, Q3, and Q4).
  • Worked with data analysts to ensure the integrity of data loads in response to data quality tickets.
  • Worked with source teams when incorrect or corrupted files were sent.
  • Coordinated with multiple source POCs to resolve missing-file issues, e.g., during holidays.
  • Made temporary code fixes to complete catch-up activity.
  • Resolved/closed Snowflake failures using the ServiceNow ticketing tool.
  • Effectively implemented I/O exception functionality in the core code to handle arithmetic operations, tested it successfully in the development region, and promoted the code to prod.
  • Successfully tested the Java and Python version upgrades and implemented them in prod.
  • Implemented AWS proxy environment changes; successfully tested and deployed them in the production environment.
  • Implemented error/exception handling functionality and worked on an Avro-to-JSON file conversion framework using Python (see the conversion sketch below).
  • Worked with different file formats: Avro, Parquet, JSON, XML, ASCII, etc.
  • Performed data quality fixes on Snowflake, Teradata, Redshift clusters, and Presto via the AWS CLI.
  • Worked on AWS Lambda, AWS CloudWatch, CloudTrail, Kafka, Flume, Oozie, and ZooKeeper.
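A minimal boto3 sketch of the 60-day rehydration step described above, assuming the EC2 instance is managed by a CloudFormation stack; the AMI name pattern, stack name, and parameter keys are hypothetical placeholders, not the actual enterprise template.

```python
import boto3

# Hypothetical names; the real AMI naming convention, stack name, and
# parameter keys depend on the enterprise CloudFormation templates.
AMI_NAME_PATTERN = "hardened-amzn2-*"
STACK_NAME = "app-etl-ec2-stack"

ec2 = boto3.client("ec2")
cfn = boto3.client("cloudformation")

# 1. Look up the most recently published hardened AMI.
images = ec2.describe_images(
    Owners=["self"],
    Filters=[{"Name": "name", "Values": [AMI_NAME_PATTERN]}],
)["Images"]
latest_ami = max(images, key=lambda img: img["CreationDate"])["ImageId"]

# 2. Update the stack, overriding only the AMI parameter so CloudFormation
#    replaces the EC2 instance with one built from the latest image.
cfn.update_stack(
    StackName=STACK_NAME,
    UsePreviousTemplate=True,
    Parameters=[
        {"ParameterKey": "AmiId", "ParameterValue": latest_ami},
        {"ParameterKey": "InstanceType", "UsePreviousValue": True},
    ],
    Capabilities=["CAPABILITY_NAMED_IAM"],
)

# 3. Wait for the rehydration to finish before validating the application.
cfn.get_waiter("stack_update_complete").wait(StackName=STACK_NAME)
```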

Environment: AWS, Unix, Python, Snowflake, MS Office, Arow, crontab, Control-M, Kafka, Spark Streaming
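A minimal sketch of the Avro-to-JSON conversion mentioned above, assuming the fastavro library; the real framework wraps a core loop like this with error/exception handling, schema checks, and file-level controls, and the file names here are placeholders.

```python
import json
from fastavro import reader  # assumes the fastavro package is installed

def avro_to_json(avro_path: str, json_path: str) -> int:
    """Convert an Avro file to newline-delimited JSON; return the record count."""
    count = 0
    with open(avro_path, "rb") as src, open(json_path, "w") as dst:
        for record in reader(src):  # fastavro yields each record as a dict
            dst.write(json.dumps(record, default=str) + "\n")
            count += 1
    return count

if __name__ == "__main__":
    # Hypothetical file names for illustration.
    n = avro_to_json("events.avro", "events.json")
    print(f"converted {n} records")
```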

Confidential

Responsibilities:

  • Analyzed different applications in the existing environment.
  • Optimized code for the new environment.
  • Developed generic Ab Initio graphs.
  • Created and modified existing XFRs, applying transformation rules to map new hierarchies.
  • Worked in a sandbox environment while interacting extensively with the EME to maintain version control on objects, using sandbox check-in and checkout features.
  • Analyzed the data and obtained clarifications from the sourcing teams.
  • Modified existing tables and views as per business requirements.
  • Worked with all major transformation components in Ab Initio and loaded data into Snowflake.
  • Helped junior team members with coding and with resolving issues while developing graphs and plans.
  • Performed unit testing for the developed graphs.
  • Created tags and save files for promotion and migrated code from DEV to UAT.
  • Interacted with the QA team during releases and resolved tickets.
  • Coordinated offshore-onsite calls for status updates.
  • Coordinated with the customer to get more information on defects.
  • Involved in creating high-level and detailed design documents for Ab Initio graphs.
  • Raised Confluence tickets for ECE platform-related issues to Level 3 teams.
  • Worked on Jira stories in a collaborative team and assigned tasks.
  • Raised Confluence tickets and interacted with the Ab Initio support team to resolve impediments.
  • Raised requests with the TDM team to get files copied to lower environments; was assigned tickets raised by users in the global support model and worked on those user requests.
  • Worked on an Ab Initio lift-and-shift project from on-premise to cloud servers.
  • Worked on connectivity issues, such as various database connections, making sure their ports were open.
  • Worked on inbound/outbound operations and interactions with third-party vendors, remote servers, etc.
  • Worked with landing files in AWS S3 buckets and checked VPN and security group related issues (see the sketch after this list).
  • Worked on FTP/SFTP-related issues and securely sent files to third-party vendors.

Environment: Ab Initio, ECE - Cloud, AWS, UNIX, Windows, Oracle, SQL Server, Teradata, Snowflake
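A minimal boto3 sketch of the S3 landing-file check mentioned above; the bucket and key names are hypothetical. When the check fails with an access error rather than a plain 404, that usually points to the bucket policy, VPN, or security group issues noted in the bullets.

```python
import boto3
from botocore.exceptions import ClientError

def landing_file_arrived(bucket: str, key: str) -> bool:
    """Return True if the expected landing file exists in the S3 bucket."""
    s3 = boto3.client("s3")
    try:
        s3.head_object(Bucket=bucket, Key=key)
        return True
    except ClientError as err:
        # A 404 means the file has not landed yet; anything else (e.g. 403)
        # typically indicates a permissions or connectivity problem.
        if err.response["Error"]["Code"] == "404":
            return False
        raise

if __name__ == "__main__":
    # Hypothetical bucket and key for illustration.
    print(landing_file_arrived("example-landing-bucket", "inbound/vendor/file_20240101.dat"))
```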

Confidential

Responsibilities:

  • Developed and implemented extraction, transformation, and loading of data from legacy systems using Ab Initio, replacing on-premise ETL transformations with cloud ETL transformations (ECE modernized system).
  • Applied knowledge of Amazon Web Services to load the target S3 buckets and Snowflake tables.
  • Involved in creating the Snowflake loader script used to load data from S3 buckets into Snowflake tables (a sketch follows this list).
  • Created detailed data flows with source-to-target mappings and converted data requirements into low-level design templates.
  • Responsible for cleansing data from source systems using Ab Initio components such as Join, Dedup Sorted, Denormalize, Normalize, Reformat, Filter by Expression, and Rollup.
  • Worked with de-partition components such as Concatenate, Gather, Interleave, and Merge to de-partition and repartition data from multifiles.
  • Worked with partition components such as Partition by Key, Partition by Expression, and Partition by Round-robin to partition data from serial files.
  • Worked on generic graphs for data validation and data transformation.
  • Involved in system and integration testing of the project.
  • Tuned Ab Initio graphs for better performance.
  • Used phases/checkpoints to avoid deadlocks and improve efficiency.
  • Gathered knowledge of existing operational sources for future enhancements and performance optimization of graphs.
  • Used UNIX environment variables in all Ab Initio graphs to specify the locations of source and target files.
  • Developed shell scripts to automate file manipulation and data loading.
  • Replicated operational tables into staging tables, then transformed and loaded data into warehouse tables using Ab Initio GDE.

Environment: Ab Initio (ACE, BRE), Hadoop, AWS, Unix shell scripting, Snowflake, MS Office, Control-M, Tidal
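A minimal sketch of what such a Snowflake loader can look like in Python, assuming an external stage already points at the landing S3 bucket; the account, credentials, stage, and table names are hypothetical placeholders rather than the actual project script.

```python
import snowflake.connector  # assumes the snowflake-connector-python package

# Connection details below are illustrative placeholders.
conn = snowflake.connector.connect(
    account="example_account",
    user="ETL_USER",
    password="********",
    warehouse="LOAD_WH",
    database="EDW",
    schema="STAGE",
)

# COPY INTO pulls files from the external stage (which maps to the S3 bucket)
# into the target table.
COPY_SQL = """
COPY INTO STAGE.CUSTOMER_STG
FROM @STAGE.S3_LANDING_STAGE/customer/
FILE_FORMAT = (TYPE = CSV FIELD_OPTIONALLY_ENCLOSED_BY = '"' SKIP_HEADER = 1)
ON_ERROR = 'ABORT_STATEMENT'
"""

try:
    cur = conn.cursor()
    cur.execute(COPY_SQL)
    for row in cur.fetchall():
        # Each row reports a loaded file, rows parsed/loaded, and any errors.
        print(row)
finally:
    conn.close()
```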

Confidential

Responsibilities:

  • Understood and analyzed user requirements to provide permanent solutions.
  • Built graphs to incorporate transformation logic per business requirements.
  • Developed generic Ab Initio graphs.
  • Created and modified existing XFRs, applying transformation rules to map new hierarchies.
  • Worked in a sandbox environment while interacting extensively with the EME to maintain version control on objects, using sandbox check-in and checkout features.
  • Modified existing tables and views as per business requirements.
  • Worked with all major transformation components in Ab Initio.
  • Performed unit testing for the developed graphs.
  • Created tags and save files for promotion and migrated code from DEV to UAT.
  • Provided L3-level support and monitored production jobs.
  • Coordinated offshore-onsite calls for status updates.
  • Coordinated with the customer to get more information on defects.
  • Involved in creating high-level and detailed design documents for Ab Initio graphs.
  • Responsible for code reviews of graphs developed by other developers.
  • Analyzed and resolved tickets within the timeline.
  • Worked on on-call support.
  • Experienced in raising change orders and emergency code changes, and in working through the various approvals with LOB teams for reviewing and promoting code to higher environments.
  • Experienced as the POC for incidents caused during production implementation, as part of the support team.
  • Raised various requests, such as reverting tags, running ad hoc jobs, and copying masked files from the prod server to lower environments.

Environment: AWS, Ab Initio (ACE, BRE), Hadoop, Windows, Unix, SQL, Unix shell scripting, Oracle, SQL Server, DB2, Teradata, Snowflake, MS Office, Control-M, Control-M Desktop, Arow, Tidal

Confidential

Responsibilities:

  • Analyzed different applications in the existing environment.
  • Optimized code for the new environment.
  • Made sure data lineage existed in all graphs and within the application.
  • Performed dependency analysis, ran profiles, and loaded profile results into the Metadata Hub.
  • Created generic graphs and plans to load regular and history data.
  • Used Hadoop and Hive commands to load the data (see the sketch after this list).
  • Helped junior team members with coding and with resolving issues while developing graphs and plans.
  • Interacted with the QA team during releases and resolved tickets.
  • Analyzed the data and obtained clarifications from the sourcing teams.
  • Involved in creating high-level and detailed design documents for Ab Initio graphs.
  • Responsible for code reviews of graphs developed by other developers; used FTP components to migrate data from different servers to facilitate subsequent transformation and loading processes.
  • Created automation scripts in Unix.
  • Raised incident tickets for prod code changes.

Environment: AWS, Ab Initio, Hadoop, Unix, SQL, Unix shell scripting, Snowflake, MS Office, Control-M
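A minimal sketch of the kind of Hive load referenced above, expressed through Spark SQL (the same statements can be run from the Hive CLI); the database, table, and HDFS path names are hypothetical.

```python
from pyspark.sql import SparkSession

# SparkSession with Hive support so Hive databases and tables are visible.
spark = (
    SparkSession.builder
    .appName("hive-load")
    .enableHiveSupport()
    .getOrCreate()
)

# LOAD DATA simply moves the landed files into the table's warehouse location,
# so the table is declared with a layout matching the pipe-delimited extract.
spark.sql("""
    CREATE TABLE IF NOT EXISTS edw.customer_stg (
        cust_id   STRING,
        cust_name STRING,
        load_dt   STRING
    )
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
    STORED AS TEXTFILE
""")

spark.sql("LOAD DATA INPATH '/data/landing/customer/' INTO TABLE edw.customer_stg")

# Quick sanity check on the load.
spark.sql("SELECT COUNT(*) AS rec_cnt FROM edw.customer_stg").show()
```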
