AWS Data Engineer Resume
Lehi, UT
SUMMARY
- 7+ years of experience in IT as a Data Analyst and Data Engineer, transforming raw data into actionable knowledge that gives insight into business processes and guides strategic and tactical decision-making.
- Worked on cloud technologies, with experience in Amazon EC2 and S3 supporting both development and production environments.
- Experience working with RDBMSs including Oracle, DB2, SQL Server, PostgreSQL 9.x, MS Access, and Teradata, as well as with data stored on HDFS.
- Extensive experience applying data mining solutions to business problems and generating data visualizations using Tableau, Power BI, and Sisense.
- Strong understanding of data warehousing principles: fact tables, dimension tables, and star and snowflake schema modeling.
- Experience working with business intelligence and data warehouse software, including SSAS/SSRS/SSIS, Business Objects, Amazon Redshift, Azure Data Warehouse, and Teradata.
- Working experience in data analysis using Python libraries such as NumPy, Pandas, and SciPy, and Python visualization libraries such as Seaborn and Matplotlib.
- Worked on big data technologies such as Hadoop/HDFS, Spark, MapReduce, Pig, Hive, and Sqoop to extract and load data from heterogeneous sources (Oracle, flat files, XML, and other streaming sources) into the EDW and transform it for analysis (ETL/ELT).
- Experience in Agile (Scrum) methodology, participating in daily scrum meetings and actively involved in sprint planning and product backlog creation.
TECHNICAL SKILLS
Programming Languages: Python, UNIX shell scripting
Tools: SQL Workbench, PuTTY, SSIS, SSAS, SAP Crystal Reports, TOAD, SQL Developer
Big Data Technologies: HDFS, Sqoop, Flume, PySpark, Data Lake, Redshift
Data Analysis libraries: Pandas, NumPy, SciPy, Scikit-learn, NLTK, Plotly, Matplotlib
Data Modeling Tools: Toad Data Modeler, SQL Server Management Studio, MS Visio, SAP PowerDesigner, Erwin 9.x
Databases: MySQL, Oracle12c/11g, MS Access 2016/2010, Hive, SQL Server 2014/2016, Amazon Redshift, Azure SQL Database
Reporting Tools: Crystal Reports XI/2013, SSRS, Business Objects 5.x/6.x, Tableau, Informatica PowerCenter
Cloud Technologies: Amazon Web Services (AWS), Microsoft Azure (familiar), Amazon EC2
Analytics: Sisense, Tableau, Power BI, MS Excel
Project Execution Methodologies: Agile, Scrum, Lean Six Sigma, Ralph Kimball and Bill Inmon data warehousing methodologies
BI Tools: Alteryx, Tableau, Power BI, Sisense
Operating Systems: Windows Server 2012 R2/2016, UNIX, CentOS
PROFESSIONAL EXPERIENCE
Confidential, Lehi, UT
AWS Data Engineer
Responsibilities:
- Gathered and translated business requirements into detailed technical specifications, creating robust data models using Erwin Data Modeler and Visio.
- Wrote backend AWS Lambda functions in Python.
- Designed and developed ETL processes in AWS Glue to migrate campaign data from external sources such as S3 (ORC, Parquet, and text files) into Amazon Redshift (a PySpark sketch follows this list).
- Architected and guided project teams through full-lifecycle project phases: requirements gathering, end-user interaction, data analysis, data profiling, extraction and transformation, data modeling, and reporting/dashboarding, leveraging the appropriate data integration, modeling, and visualization tools.
- Utilized Apache Spark with Python to develop and execute big data analytics and machine learning applications, implementing machine learning use cases with Spark ML and MLlib.
- Worked on machine learning over large datasets using Spark and MapReduce.
- Developed Spark (Scala and Python) code for a regular expression (regex) project in the Hadoop/Hive environment on Linux and Windows big data platforms.
- Analyzed requirements and proposed Sisense BI solutions.
- Created ad hoc reports for users in Sisense by connecting various data sources.
- Prepared dashboards in Sisense using calculated fields, parameters, calculations, groups, sets, and hierarchies.
- Worked on Spark Streaming with Apache Kafka for real-time data processing and implemented Oozie jobs for daily imports.
- Extracted, aggregated, and consolidated Adobe data within AWS Glue using PySpark.
- Created external tables with partitions using AWS Athena and Redshift.
- Loaded data into Amazon Redshift and used Amazon CloudWatch to collect metrics and monitor AWS RDS instances within Confidential.
- Migrated on-premises MySQL databases to AWS using Amazon RDS and DynamoDB.
- Used the AWS CLI to suspend an AWS Lambda function processing an Amazon Kinesis stream and later resume it.
- Wrote Python scripts to move JSON data from S3 buckets into MySQL tables (a short loader sketch appears after the Environment line below).
- Wrote UNIX shell scripts to automate jobs and scheduled them as cron jobs using crontab.
- Created Python scripts that integrated with AWS APIs to control instance operations.
- Created and maintained tables and views in Snowflake.
- Imported and exported data from Snowflake, Oracle, and DB2 into HDFS and Hive using Sqoop for analysis, visualization, and report generation.
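A minimal AWS Glue (PySpark) sketch of the S3-to-Redshift campaign load described above, shown for illustration only; the bucket path, Glue catalog connection name, database, and table names are placeholders rather than details from this engagement.

```python
import sys

from awsglue.context import GlueContext
from awsglue.dynamicframe import DynamicFrame
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from pyspark.sql import functions as F

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read raw campaign files from S3 (Parquet here; ORC/CSV sources would use format="orc"/"csv").
campaigns = glue_context.create_dynamic_frame.from_options(
    connection_type="s3",
    connection_options={"paths": ["s3://campaign-data-bucket/raw/"]},  # placeholder path
    format="parquet",
)

# Aggregate impressions and clicks per campaign and day with Spark SQL functions.
daily = (
    campaigns.toDF()
    .groupBy("campaign_id", "event_date")
    .agg(F.sum("impressions").alias("impressions"), F.sum("clicks").alias("clicks"))
)

# Write the aggregate to Redshift through a Glue catalog JDBC connection.
glue_context.write_dynamic_frame.from_jdbc_conf(
    frame=DynamicFrame.fromDF(daily, glue_context, "daily_campaigns"),
    catalog_connection="redshift-analytics",            # placeholder connection name
    connection_options={"dbtable": "public.campaign_daily", "database": "analytics"},
    redshift_tmp_dir="s3://campaign-data-bucket/tmp/",   # staging area Glue uses for COPY
)
job.commit()
```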
Environment: Redshift, HDFS, UNIX, SQL Workbench, Python, BI, DWH, AWS, S3, Sisense, MySQL, AWS Glue, KNIME, AWS EC2, AWS Kinesis, AWS DynamoDB, Snowflake.
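A short Python sketch of the S3-to-MySQL JSON load referenced above. It assumes each S3 object holds a JSON array of flat records; the bucket, table, and connection settings shown in the example call are placeholders.

```python
import json

import boto3
import pymysql

s3 = boto3.client("s3")

def load_json_object_into_mysql(bucket: str, key: str, table: str, conn_kwargs: dict) -> int:
    """Read one JSON file (a list of flat records) from S3 and bulk-insert it into MySQL."""
    body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
    records = json.loads(body)
    if not records:
        return 0

    # Table and column names come from trusted configuration; row values are parameterized.
    columns = list(records[0].keys())
    placeholders = ", ".join(["%s"] * len(columns))
    insert_sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"

    connection = pymysql.connect(**conn_kwargs)
    try:
        with connection.cursor() as cursor:
            cursor.executemany(insert_sql, [tuple(r[c] for c in columns) for r in records])
        connection.commit()
    finally:
        connection.close()
    return len(records)

# Example invocation with placeholder values:
# load_json_object_into_mysql(
#     bucket="campaign-data-bucket",
#     key="exports/customers.json",
#     table="customers_staging",
#     conn_kwargs={"host": "db-host", "user": "etl", "password": "...", "database": "analytics"},
# )
```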
Confidential
Data Engineer
Responsibilities:
- Automated the data flow using Control-M.
- Extensively involved in data analysis, data cleansing, requirements gathering, data mapping, functional and technical design documents, and process flow diagrams.
- Involved in extensive data validation using SQL queries and back-end testing.
- Leveraged driver tables to pull data from Teradata and Redshift; worked on Django APIs for accessing the database.
- Connected Tableau to Redshift to extract live data for real-time analysis.
- Created Tableau dashboards of the top key performance indicators for senior management by connecting various data sources such as Excel, flat files, and SQL databases.
- Created Tableau scorecards and dashboards with stacked bars, bar graphs, scatter plots, geographic maps, and Gantt charts using the Show Me functionality.
- Developed views and templates with Python using Django's view controller and template language to build the website interface (a minimal sketch follows this list).
- Designed and created backend data access modules using Oracle PL/SQL stored procedures.
- Wrote SQL queries and implemented stored procedures, functions, packages, tables, views, cursors, and triggers.
- Used Python collections to manipulate and iterate over user-defined objects.
- Developed and executed the User Acceptance Testing portion of the test plan.
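A minimal sketch of the Django view-and-template pattern mentioned above; the Kpi model, field names, and template path are hypothetical placeholders, not artifacts from this project.

```python
# views.py -- a minimal Django view that queries the ORM and renders a template
from django.shortcuts import render

from .models import Kpi  # hypothetical model backing the reporting page

def kpi_dashboard(request):
    """Fetch the most recent KPI rows and render them with reports/kpi_dashboard.html."""
    kpis = Kpi.objects.order_by("-reported_on")[:20]
    return render(request, "reports/kpi_dashboard.html", {"kpis": kpis})
```

In the corresponding template, the rows would be iterated with Django's template language, e.g. `{% for kpi in kpis %} ... {% endfor %}`, and the view would be mapped to a URL with `path()` in the app's URL configuration.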
Environment: Teradata, Redshift, Tableau, SQL, UNIX, Cygwin, Control-M, PuTTY, WinSCP, Python, PyCharm, AWS, MS SQL Server, Excel.
