Data Engineer Resume
SUMMARY
- 10+ years of experience in AI/machine learning, data engineering, and data warehousing.
- Expertise and experience performing a variety of roles and responsibilities as a Data Warehouse Analyst/Developer
- Very comfortable with Python programming for building APIs, web services, machine learning models, data transformations, and data engineering pipelines
- Familiar with building DevOps pipelines for CI/CD
- Familiar with Snowflake as an enterprise data warehouse.
- Comfortable with Docker and YAML files for containerized deployment on Google App Engine
- Well versed in Java and advanced Java programming.
- Familiar with distributed processing technologies like Apache Spark.
- Familiar with stream processing on Apache Kafka, Apache Flink, and Spark Streaming (see the sketch after this list)
- Very comfortable with complex SQL scripts
- Have extensively used Informatica products like PowerCenter
- Familiar with AWS cloud solutions
- Have worked with most traditional databases such as SQL Server and Oracle, with working knowledge of NoSQL databases such as MongoDB and DynamoDB
- Extensive experience in requirements gathering and data discovery
- Experience with Amazon Redshift and Amazon Athena
- Familiar with setting up ETL jobs on AWS Data Pipeline and AWS Glue/PySpark
- Familiarity with machine learning on TensorFlow, used to build basic ML programs for text processing.
- Worked on setting up a data lake/data catalog on AWS Glue
- Experience in performing data validations on large enterprise warehouse data to ensure data quality
- Excellent verbal and written communication skills; proven to be highly effective in interacting with business and technical groups
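A minimal sketch of the kind of stream processing listed above: reading JSON events from a Kafka topic with Spark Structured Streaming. The broker address, topic name, and event schema are illustrative assumptions, not an actual feed, and the job needs the spark-sql-kafka connector package on the Spark classpath.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import DoubleType, StringType, StructField, StructType

# Requires the spark-sql-kafka-0-10 connector on the Spark classpath.
spark = SparkSession.builder.appName("kafka-stream-sketch").getOrCreate()

# Illustrative schema for the incoming JSON events (an assumption, not a real feed).
schema = StructType([
    StructField("event_id", StringType()),
    StructField("amount", DoubleType()),
])

# Read the raw Kafka stream; broker and topic names are placeholders.
raw = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "localhost:9092")
    .option("subscribe", "events")
    .load()
)

# Parse the JSON payload into typed columns.
parsed = (
    raw.select(from_json(col("value").cast("string"), schema).alias("e"))
       .select("e.*")
)

# Write the parsed stream to the console for demonstration purposes.
query = parsed.writeStream.format("console").outputMode("append").start()
query.awaitTermination()
```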
PROFESSIONAL EXPERIENCE
Confidential
Data Engineer
Responsibilities:
- Interfacing with business customers, gathering requirements, and creating data sets to be used by business users for visualization
- Setting up DevOps pipelines for CI/CD on Git, Jenkins, and the Nexus repository
- Developed and deployed machine learning models for document category prediction and for recommending content to the Legal team on Google Cloud's App Engine
- Developed code for converting scanned PDFs into searchable documents and then into metadata using rules
- Set up Jenkins pipelines for CI/CD
- Implemented several machine learning algorithms: Named Entity Recognition (spaCy), Random Forest classifier, logistic regression, and linear regression (a minimal sketch follows this job entry)
- Packaged the application into a Docker container for deployment using Docker and YAML config files
- Used Python programming for data processing, transformation and machine learning modules
- Worked in an Agile methodology
Tools/Technologies: Google Cloud App Engine, JetBrains PyCharm, Nexen API, Python Flask, Jira, GitLab, AWS S3, Jenkins, Microsoft SQL Server, Snowflake.
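A minimal sketch of the document-processing pieces described above, combining spaCy Named Entity Recognition with a TF-IDF + Random Forest category predictor. The sample texts, labels, and the en_core_web_sm model choice are illustrative assumptions, not the production setup.

```python
import spacy
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline

# Named Entity Recognition over extracted document text
# (requires: python -m spacy download en_core_web_sm).
nlp = spacy.load("en_core_web_sm")
doc = nlp("The agreement was signed by Acme Corp in New York on 1 March 2020.")
print([(ent.text, ent.label_) for ent in doc.ents])

# Toy training data for document category prediction; real texts and labels
# would come from the Legal document corpus.
train_texts = [
    "This non-disclosure agreement is entered into by the parties below.",
    "Invoice number 1042 for consulting services rendered in June.",
    "This lease agreement covers the premises located at 12 Main Street.",
    "Invoice total is due within 30 days of receipt.",
]
train_labels = ["nda", "invoice", "lease", "invoice"]

# TF-IDF features feeding a Random Forest classifier.
category_model = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
])
category_model.fit(train_texts, train_labels)

# Predict the category of a new document.
print(category_model.predict(["Please remit payment for invoice 2203 by Friday."]))
```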
Confidential
Data Engineer
Responsibilities:
- Interfacing with business customers, gathering requirements, and creating data sets to be used by business users for visualization
- Experience in migrating enterprise data (Trust data) and staging procedures from Microsoft SQL Server to AWS Redshift using AWS Glue and S3
- Setting up data models and creating the data lake on AWS Athena from S3 for visualization in AWS QuickSight.
- Setting up AWS Glue jobs for ETL.
- Setting up trigger jobs for ingesting files from various vendor partners of Confidential &T bank
- Creating data transformations in PySpark on AWS Glue (see the Glue job sketch after this entry)
- Wrote high-quality, maintainable, and robust code, often in SQL and PL/SQL.
- Demonstrated expertise in data modeling, ETL development, and data warehousing per project requirements.
- Designed the source-to-target data mapping and got it approved by the vendors and data modelers.
- Designed data models, databases, and workflows for handling big data volumes, such as maintaining Trust Accounting data.
- Experience providing technical leadership and mentoring other engineers on best practices for project requirements.
- Experience working with business stakeholders and developers to groom and elaborate user stories
- Experience working in an Agile Scrum environment.
- Provided support during the QA cycle in fixing bugs.
- Prepared release notes and scripts required for production deployment.
- Worked on data analysis for various source systems.
- Worked on integrating an Angular-based application with the database using Node.js.
Tools/Technologies: Informatica PowerCenter 10, SQL Workbench, Microsoft SQL Development Studio, Eclipse, AWS Glue, AWS Redshift, AWS Athena, AWS Data Pipeline
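A minimal sketch of the kind of AWS Glue PySpark ETL job described above: reading a table from the Glue Data Catalog, applying a column mapping, and writing Parquet to S3 so it can be queried from Athena and visualized in QuickSight. The database, table, column names, and S3 path are placeholder assumptions.

```python
import sys

from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.transforms import ApplyMapping
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

# Standard Glue job bootstrap.
args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read the source table from the Glue Data Catalog; database/table names are placeholders.
source = glue_context.create_dynamic_frame.from_catalog(
    database="trust_db",
    table_name="staging_accounts",
)

# Rename/retype columns on the way through (illustrative mapping only).
mapped = ApplyMapping.apply(
    frame=source,
    mappings=[
        ("acct_id", "string", "account_id", "string"),
        ("bal_amt", "double", "balance_amount", "double"),
    ],
)

# Write Parquet to S3 so the data is queryable from Athena/QuickSight.
glue_context.write_dynamic_frame.from_options(
    frame=mapped,
    connection_type="s3",
    connection_options={"path": "s3://example-trust-data-lake/accounts/"},
    format="parquet",
)

job.commit()
```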
