Sr. Data Engineer Resume
Charlotte, NC
SUMMARY
- 7+ years of professional experience in IT working with various legacy database systems, including hands-on experience with Big Data technologies.
- Hands-on experience with the complete Software Development Life Cycle (SDLC) using Agile and hybrid methodologies.
- Expert in using Python for data engineering and modeling.
- Extensive experience in data management, including data analysis, gap analysis and data mapping.
- Excellent knowledge of Microsoft Azure services and Amazon Web Services (AWS).
- Transformed and retrieved data using Pig, Hive, SSIS and MapReduce.
- Extensive experience interacting with system users to gather business requirements and develop projects.
- Imported and exported data between HDFS and relational database systems using Sqoop.
- Excellent knowledge of and extensive experience with NoSQL databases (HBase).
- Excellent knowledge of the Ralph Kimball and Bill Inmon approaches to data warehousing.
- Responsible for designing and building a Data Lake using Hadoop and its ecosystem components.
- Strong background in various Data modeling tools using ERWIN, ER Studio and Power Designer.
- Solid in-depth understanding of Information security concepts, Data modeling and RDBMS concepts.
- Hands-on experience writing queries, stored procedures, functions and triggers.
- Good knowledge of converting complex RDBMS (Oracle, MySQL & Teradata) queries into Hive Query Language.
- Good knowledge in OLAP, OLTP, Business Intelligence and Data Warehousing concepts with emphasis on ETL and Business Reporting needs.
- Developed generic SQL procedures and complex T-SQL statements for report generation.
- Experience in Performance Tuning and query optimization techniques in transactional and Data Warehouse Environments.
- Hands on experience on data modeling with Star schema and Snowflake schema.
- Performed structural modifications using MapReduce and Hive and analyzed data using visualization/reporting tools.
- Experience developing customized UDFs in Python to extend Hive and Pig Latin functionality.
- Experience in extracting source data from Sequential files, XML files, CSV files, transforming and loading it into the target Data warehouse.
- Excellent knowledge of Microsoft Office with an emphasis on Excel.
- Experience in Data mining with querying and mining large datasets to discover transition patterns and examine financial reports.
- Extensive experience generating ad-hoc reports using Crystal Reports.
- Hands-on experience with data integrity processes and data modeling concepts.
- Effectively planned and managed project deliverables in an onsite/offshore model, improving client satisfaction.
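As an illustration of the Python UDFs for Hive mentioned above, here is a minimal sketch of a streaming cleanup script usable via Hive's TRANSFORM clause; the field layout and the phone-normalization rule are hypothetical, chosen only for the example:

```python
import sys

def normalize_phone(raw):
    """Keep only digits and return the last 10, zero-padded on the left.
    The cleanup rule itself is illustrative, not from any real project."""
    digits = "".join(ch for ch in raw if ch.isdigit())
    return digits[-10:].rjust(10, "0") if digits else ""

if __name__ == "__main__" and not sys.stdin.isatty():
    # Hive streams tab-separated rows through stdin when the script is
    # invoked via SELECT TRANSFORM (...) USING 'python clean_phone.py'
    for line in sys.stdin:
        cols = line.rstrip("\n").split("\t")
        cols[-1] = normalize_phone(cols[-1])  # assume phone is last column
        print("\t".join(cols))
```

The same function can be registered for Pig via a streaming step, which is what makes Python a convenient common denominator for Hive and Pig Latin extensions.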
TECHNICAL SKILLS
Big Data Tools: HBase 1.2, Hive 2.3, Pig 0.17, HDFS, Sqoop 1.4, Kafka 1.0.1, Oozie 4.3, Hadoop 3.0
Data Modeling Tools: Erwin Data Modeler 9.8, ER Studio v17, and Power Designer 16.6.
Databases: Oracle 12c, Teradata R15, MS SQL Server 2016, DB2.
Cloud Platform: AWS, Azure, Google Cloud, Cloud Stack/Open Stack
Programming Languages: SQL, PL/SQL, UNIX shell Scripting, Python, Spark.
Methodologies: JAD, System Development Life Cycle (SDLC), Agile, Waterfall Model.
Operating Systems: Windows, UNIX
ETL/BI Tools: Informatica 9.6/9.1, Tableau
PROFESSIONAL EXPERIENCE
Confidential, Charlotte, NC
Sr. Data Engineer
Responsibilities:
- As a Sr. Data Engineer, provided technical expertise and aptitude in Hadoop technologies as they relate to the development of analytics.
- Implemented data pipelines in Python.
- Followed SDLC methodology throughout the project.
- Participated in JAD sessions for design optimizations related to data structures as well as ETL processes.
- Loaded and transformed large sets of structured, semi-structured and unstructured data using Hadoop/Big Data concepts.
- Used Windows Azure SQL Reporting Services to create reports with tables, charts and maps.
- Developed a star-schema data model for the sales data mart using Erwin.
- Extracted data using Sqoop import queries from multiple databases and ingested it into Hive tables.
- Developed SQL scripts for creating tables, sequences, triggers, views and materialized views.
- Performed several ad-hoc data analyses on the Azure Databricks platform, tracked on a Kanban board.
- Used Azure reporting services to upload and download reports.
- Worked on debugging and identifying unexpected real-time issues in production-server SSIS packages.
- Loaded real-time data from various data sources into HDFS using Kafka.
- Developed MapReduce jobs for data cleanup in Python.
- Defined extract-transform-load (ETL) and extract-load-transform (ELT) processes for the data lake.
- Participated in the integration of the MDM (Master Data Management) hub and data warehouses.
- Wrote Pig scripts to load processed data from HDFS into MongoDB.
- Developed simple to complex MapReduce jobs using Hive and Pig.
- Extracted the needed data from the server into HDFS and bulk-loaded the cleaned data into HBase.
- Designed the data marts in dimensional data modeling using star and snowflake schemas.
- Worked on reading multiple data formats in HDFS using Python.
- Involved in database development, creating Oracle PL/SQL functions, procedures and collections.
- Worked with the Oozie workflow scheduler to manage Hadoop jobs with control flows.
- Prepared Tableau reports and dashboards with calculated fields, parameters, sets, groups and bins, and published them to the server.
- Translated business requirements into SAS code for use within internal systems and models.
- Migrated ETL processes from RDBMS to Hive to test easier data manipulation.
Environment: Erwin 9.8, Big Data 3.0, Hadoop 3.0, Azure, SQL, Sqoop 1.4, ETL, HDFS, Kafka 1.1, Python, MapReduce, SSIS, MDM, Pig 0.17, MongoDB, Hive 2.3, HBase 1.2, Oracle 12c, PL/SQL, Tableau, Oozie 4.3, SAS.
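The Python MapReduce data-cleanup jobs mentioned above can be sketched as a mapper/reducer pair; the record layout (order_id, amount, date) and the dedup rule are illustrative assumptions, and the tiny local harness only mimics the Hadoop shuffle for testing:

```python
from itertools import groupby

def map_clean(line):
    """Mapper: emit (order_id, record) for well-formed rows, drop malformed ones.
    The three-column CSV layout is hypothetical."""
    parts = line.strip().split(",")
    if len(parts) != 3:
        return []          # malformed row: wrong column count
    order_id, amount, date = parts
    try:
        float(amount)
    except ValueError:
        return []          # malformed row: non-numeric amount
    return [(order_id, line.strip())]

def reduce_dedup(key, records):
    """Reducer: keep a single record per order_id."""
    return [records[0]]

def run_local(lines):
    """Local stand-in for the MapReduce shuffle: sort mapper output by key,
    group, and feed each group to the reducer."""
    pairs = sorted(p for line in lines for p in map_clean(line))
    out = []
    for key, grp in groupby(pairs, key=lambda kv: kv[0]):
        out.extend(reduce_dedup(key, [v for _, v in grp]))
    return out
```

On a real cluster the same two functions would run under Hadoop Streaming, with the framework performing the sort/group step between them.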
Confidential, Las Vegas, NV
Data Engineer / Data Modeler
Responsibilities:
- As a Sr. Data Engineer, I was responsible for all data-management aspects of the project.
- Used the Agile Scrum methodology across the phases of the software development life cycle.
- Key player in maintaining data pipelines and flows.
- Conducted JAD sessions, wrote meeting minutes and documented the requirements.
- Worked on data modeling and advanced SQL with columnar databases on AWS.
- Imported data from RDBMS to HDFS and Hive using Sqoop on a regular basis.
- Performed Data Analysis using procedures and functions in PL/SQL.
- Developed stored procedures in SQL Server to standardize DML transactions such as insert, update and delete from the database.
- Wrote T-SQL procedures to generate DML scripts that modified database objects dynamically based on user inputs.
- Worked on normalization and de-normalization techniques for OLTP systems.
- Analyzed data using Hadoop components Hive and Pig.
- Extensively designed and developed the HBase target schema.
- Used Sqoop to import data into HDFS and Hive from other data systems.
- Generated DDL statements for the creation of new ER/Studio objects such as tables, views, indexes, packages and stored procedures.
- Generated ad-hoc SQL queries using joins, database connections and transformation rules to fetch data from Teradata database.
- Used reverse engineering to connect to existing database and create graphical representation (E-R diagram).
- Managed database design and implemented a comprehensive snowflake schema with shared dimensions.
- Created Hive tables to store data and wrote Hive queries.
- Designed Mapping Documents and Mapping Templates for ETL developers
- Analyzed the SQL scripts and designed the solution for implementation in Python.
- Facilitated in developing testing procedures, test cases and User Acceptance Testing (UAT).
- Designed and Developed Oracle PL/SQL and Shell Scripts, Data Import/Export, Data Conversions and Data Cleansing.
- Presented data scenarios via ER/Studio logical models and Excel mockups to better visualize the data.
- Created graphical representations in the form of entity-relationship diagrams to elicit more information.
Environment: ER/Studio, Agile, Hadoop 3.0, AWS, HDFS, SQL, PL/SQL, Hive 2.3, Sqoop 1.4, T-SQL, OLTP, Pig 0.17, HBase 1.2, Teradata 15, ETL, Oracle 12c.
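The DDL generation for new tables mentioned above can be sketched as a small Python helper; the table name, column spec and rendering style are purely illustrative, not taken from any real model:

```python
def build_ddl(table, columns, pk=None):
    """Render a CREATE TABLE statement from a simple (name, type) column spec.
    A sketch of scripted DDL generation; real tools like ER/Studio emit far
    richer DDL (storage clauses, constraints, grants)."""
    lines = [f"    {name} {ctype}" for name, ctype in columns]
    if pk:
        lines.append(f"    PRIMARY KEY ({', '.join(pk)})")
    return f"CREATE TABLE {table} (\n" + ",\n".join(lines) + "\n);"

# Hypothetical dimension table for a sales data mart
ddl = build_ddl(
    "dim_customer",
    [("customer_id", "INTEGER"), ("name", "VARCHAR(100)")],
    pk=["customer_id"],
)
```

Keeping DDL generation in code makes it easy to regenerate consistent scripts when the logical model changes.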
Confidential, San Diego, CA
Data Analyst/Engineer
Responsibilities:
- Worked on AWS Redshift and RDS, implementing models and data on RDS and Redshift.
- Used the Agile method with daily scrums to discuss project-related information.
- Performed data analysis and data profiling using complex SQL on various source systems, including Oracle.
- Worked with the MDM systems team on technical aspects and report generation.
- Created HBase tables to load large sets of structured, semi-structured and unstructured data coming from UNIX, NoSQL and a variety of portfolios.
- Worked on normalization and de-normalization techniques for both OLTP and OLAP systems.
- Implemented a dimensional model for the data mart and was responsible for generating DDL scripts using Erwin.
- Involved in the complete SSIS life cycle: creating, building, deploying and executing SSIS packages in all environments.
- Created SQL scripts for database modification and performed multiple data modeling tasks concurrently under tight schedules.
- Loaded data into Hive tables from the Hadoop Distributed File System (HDFS) to provide SQL-like access to Hadoop data.
- Created mappings, mapplets, sessions and workflows using Informatica to replace existing stored procedures.
- Developed data mapping, data governance, transformation and cleansing rules for data management.
- Designed and created data marts as part of a data warehouse.
- Developed Hive queries to process data for visualization.
- Repopulated the static tables in the data warehouse using PL/SQL procedures and SQL*Loader.
- Translated business concepts into XML vocabularies by designing XML schemas with UML.
- Involved in the design and development of user interfaces and customization of reports using Tableau.
- Created the data model for the subject area in the Enterprise Data Warehouse (EDW).
- Performed data profiling on multiple sources and answered complex business questions by providing data to business users.
- Developed reports for users in different departments using SQL Server Reporting Services (SSRS).
- Performed dicing and slicing on data using pivot tables to identify churn-rate patterns and prepared reports as required.
Environment: Erwin 9.7, Redshift, Agile, MDM, Oracle 12c, SQL, HBase 1.1, UNIX, NoSQL, OLAP, OLTP, SSIS, Informatica, HDFS, Hive, XML, PL/SQL, Tableau, SSRS.
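The churn-rate pivoting mentioned above can be sketched in plain Python; the (segment, churned) record shape and segment names are hypothetical stand-ins for the real data:

```python
from collections import defaultdict

def churn_rate_by_segment(rows):
    """Dice (segment, churned) rows into a per-segment churn rate,
    the kind of aggregation a spreadsheet pivot table performs."""
    totals = defaultdict(int)    # customers seen per segment
    churned = defaultdict(int)   # churned customers per segment
    for segment, is_churned in rows:
        totals[segment] += 1
        churned[segment] += int(is_churned)
    return {seg: churned[seg] / totals[seg] for seg in totals}

# Illustrative sample: two gold customers (one churned), two silver (both churned)
rates = churn_rate_by_segment(
    [("gold", True), ("gold", False), ("silver", True), ("silver", True)]
)
```

The same grouping could be pushed to SQL (`GROUP BY segment`) or a pandas `pivot_table`; the dict-based version keeps the mechanics visible.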
Confidential, Amarillo, Texas
Data Analyst/Data Modeler
Responsibilities:
- Effectively served in a Data Analyst/Data Modeler role, reviewing business requirements and composing source-to-target data mapping documents.
- Used the Sybase PowerDesigner tool for relational database and dimensional data warehouse designs.
- Extensively used star schema methodologies in building and designing logical and physical data models into dimensional models.
- Designed and developed databases for OLTP and OLAP applications.
- Used SQL Profiler and Query Analyzer to optimize DTS package queries and stored procedures.
- Created SSIS packages to populate data from various data sources.
- Extensively performed data analysis using Python pandas.
- Extensively involved in the modeling and development of a reporting data warehousing system.
- Developed reports and visualizations using Tableau Desktop per requirements.
- Extensively used Informatica tools: Source Analyzer, Warehouse Designer and Mapping Designer.
- Involved in the migration and conversion of reports from SSRS.
- Migrated data from SQL Server to Oracle and uploaded it as XMLType in Oracle tables.
- Performed data dictionary mapping and extensive database modeling (relational and star schema) utilizing Sybase PowerDesigner.
- Involved in performance tuning and monitoring of both T-SQL and PL/SQL blocks.
- Worked on creating DDL and DML scripts for the data models.
- Created volatile and global temporary tables to load large volumes of data into the Teradata database.
- Involved in designing Business Objects universes and creating reports.
- Provided production support to resolve user issues for applications using Excel VBA.
- Worked with Tableau in the analysis and creation of dashboards and user stories.
- Involved in extensive data validation using SQL queries and back-end testing.
- Performed gap analysis of the current state versus the desired state and documented requirements to control the identified gaps.
Environment: Sybase PowerDesigner, SQL, PL/SQL, Teradata 14, Oracle 11g, XML, Tableau, OLAP, OLTP, SSIS, Informatica, SSRS, Python, T-SQL, Excel.
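The SQL-based data validation and back-end testing mentioned above can be sketched with a row-count reconciliation between a staging and a target table; the table names are placeholders and the in-memory SQLite database only stands in for the real source/target systems:

```python
import sqlite3

def validate_row_counts(conn, source, target):
    """Compare row counts between a source and target table, a basic
    back-end validation check after a load. Table names are assumed to
    come from a trusted mapping document (they are interpolated as-is)."""
    cur = conn.cursor()
    src = cur.execute(f"SELECT COUNT(*) FROM {source}").fetchone()[0]
    tgt = cur.execute(f"SELECT COUNT(*) FROM {target}").fetchone()[0]
    return src == tgt, src, tgt

# Illustrative in-memory setup standing in for real staging/warehouse tables
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stg_orders (id INTEGER);
    CREATE TABLE dw_orders (id INTEGER);
    INSERT INTO stg_orders VALUES (1), (2), (3);
    INSERT INTO dw_orders VALUES (1), (2), (3);
""")
ok, src, tgt = validate_row_counts(conn, "stg_orders", "dw_orders")
```

In practice the same check extends to column-level sums and hash comparisons, but a count reconciliation is the usual first gate.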
