Data Engineer Resume Owings Mills, MD - Hire IT People

SUMMARY:

Proven track record as a Data Engineer working on Amazon cloud services, Big Data/Hadoop applications, and product development.
Well versed in big data on AWS cloud services, including EC2, S3, Glue, Athena, DynamoDB, and Redshift.
Experience in job/workflow scheduling and monitoring tools such as Oozie, AWS Data Pipeline, and AutoSys.
Defined and deployed monitoring, metrics, and logging systems on AWS.
Experience creating and running Docker images with multiple microservices.
Orchestrated Docker containers using ECS, ALB, and Lambda.
Experience with Unix/Linux systems with scripting experience and building data pipelines
Experience with cloud databases and data warehouses (SQL Azure and Amazon Redshift/RDS).
Played a key role in migrating Cassandra and Hadoop clusters on AWS and defined different read/write strategies.
Strong SQL development skills including writing Stored Procedures, Triggers, Views, and User Defined functions.
Expert in developing SSIS/DTS Packages to extract, transform and load (ETL) data into data warehouse/data marts from heterogeneous sources.
Good understanding of software development methodologies, including Agile (Scrum).
Expertise in developing reports and dashboards using a variety of Tableau visualizations.
Hands-on experience with programming languages such as Java, Python, R, and SAS.
Experience using Hadoop ecosystem components such as HDFS, YARN, MapReduce, Spark, Pig, Sqoop, Hive, Impala, HBase, Kafka, and Crontab tools.
Expert in creating Hive UDFs in Java to analyze data sets for complex aggregate requirements.
Experience developing ETL applications on large volumes of data using MapReduce, Spark-Scala, PySpark, Spark SQL, and Pig.
Experience using Sqoop to import and export data between RDBMS and HDFS/Hive.
Created user-friendly GUIs and web pages using HTML, CSS, and JSP.
Experience with MS SQL Server, including SSRS, SSIS, and T-SQL.

PROFESSIONAL EXPERIENCE:

Confidential, Owings Mills, MD

Data Engineer

Responsibilities:

Created on-demand tables over S3 files using Lambda functions and AWS Glue, written in Python and PySpark.
Coordinated with the team to develop a framework that generates daily and ad hoc reports and extracts from enterprise data, automated with Oozie.
Worked on cloud deployments using Maven, Docker, and Jenkins.
Coordinated with the Data Science team to design and implement advanced analytical models on a Hadoop cluster over large datasets.
Created monitors, alarms, notifications, and logs for Lambda functions, Glue jobs, and EC2 hosts using CloudWatch.
Used AWS Glue for data transformation, validation, and cleansing.
Used the Python Boto3 library to configure AWS Glue, EC2, and S3.

Confidential, Madison, SD

Data Engineer

Responsibilities:

Wrote scripts and an indexing strategy for a migration to Amazon Redshift from SQL Server and MySQL databases.
Used the AWS Glue Data Catalog with a crawler to get data from S3 and run SQL queries against it.
Worked on AWS Data Pipeline to configure data loads from S3 into Redshift.
Used JSON schema to define table and column mapping from S3 data to Redshift
Wrote indexing and data distribution strategies optimized for sub-second query response
Developed a statistical model using artificial neural networks for ranking the students to better assist the admission process.
Designed and developed schema data models.
Performed data cleaning and preparation on XML files.
Automated data cleaning and preparation in Python (robotic process automation).
Built analytical dashboards to track the student records and GPAs across the board.
Used deep learning frameworks such as MXNet, Caffe2, TensorFlow, Theano, CNTK, and Keras to help clients build deep learning models.
Participated in requirements meetings and data mapping sessions to understand business needs.

Confidential

Data Engineer

Responsibilities:

Designed and built a multi-terabyte, end-to-end data warehouse infrastructure from the ground up on Amazon Redshift, handling millions of records every day.
Worked with big data on AWS cloud services, including EC2, S3, EMR, and DynamoDB.
Managed security groups on AWS, focusing on high availability, fault tolerance, and auto scaling using Terraform templates, along with continuous integration and continuous deployment using AWS Lambda and AWS CodePipeline.
Implemented and managed ETL solutions and automated operational processes.
Optimized and tuned the Redshift environment, enabling queries to run up to 100x faster for Tableau and SAS Visual Analytics.
Wrote various data normalization jobs for new data ingested into Redshift
Advanced knowledge of Amazon Redshift and MPP database concepts.
Migrated the on-premises database structure to an Amazon Redshift data warehouse.
Performed ETL and data validation using SQL Server Integration Services.
Defined and deployed monitoring, metrics, and logging systems on AWS.
Implemented Workload Management (WLM) in Redshift to prioritize basic dashboard queries over more complex, longer-running ad hoc queries. This allowed for a more reliable and faster reporting interface, giving sub-second query response for basic queries.
Published interactive data visualizations, dashboards, and reports/workbooks on Tableau and SAS Visual Analytics.
Expert knowledge of Hive SQL, Presto SQL, and Spark SQL for ETL jobs, choosing the right technology for each job.

Confidential

Data Analyst

Responsibilities:

Developed stored procedures in MS SQL to fetch data from different servers using FTP and processed the files to update tables.
Responsible for logical and physical data modeling for various data sources on Amazon Redshift.
Designed and developed ETL jobs to extract data from a Salesforce replica and load it into a data mart in Redshift.
Experience building data pipelines in Python/PySpark/Hive SQL/Presto/BigQuery and building Python DAGs in Apache Airflow.
Created an ETL pipeline using Spark and Hive to ingest data from multiple sources.
Used SAP and SAP SD module transactions to handle the client's customers and generate sales reports.
Coordinated with clients directly to get data from different databases.
Worked on MS SQL Server, including SSRS, SSIS, and T-SQL.
Designed and developed schema data models.
Documented business workflows for stakeholder review.

Confidential, IND

Application Developer

Responsibilities:

Worked on developing the product “Ecommerce”, a web-based application that relies on SAP (ERP), using Java, JSPs, HTML, CSS, and JavaScript.
Developed reports for the business using the Google Charts API.
Wrote SQL queries to build reports for pre-sales and secondary sales estimations.
Used JavaScript and jQuery for client-side validations.
Created user-friendly GUIs and web pages using HTML, CSS, and JSP.
Established connection between portal and SAP using JCo Connectors.
Designed and developed Session Beans for implementing Business logic.
Troubleshot and debugged the code for “Ezcommerce”, a web-based application that relies on SAP (ERP), and provided support to the client.
Created complex SQL queries and used JDBC connectivity to access the database.
