Sr. Data Engineer Resume
Dallas, Texas
SUMMARY
- 8+ years of professional software development experience creating enterprise-level software solutions, building continuous integration and continuous delivery (CI/CD) pipelines, and version-controlling development work throughout the development life cycle.
- Hands-on experience with PySpark, data science, Databricks, machine learning, Salesforce, deep learning, and ETL development.
- Experience in configuring, deploying, and supporting cloud services on Amazon Web Services (AWS).
- Strong knowledge of and experience with AWS cloud services such as EC2, S3, EBS, RDS, VPC, and IAM.
- Excellent understanding of both traditional statistical modeling and machine learning techniques and algorithms, including regression, clustering, ensembling (random forest, gradient boosting), and deep learning (neural networks).
- Worked with application and architecture teams to conduct proofs of concept (POC) and implement designs in production environments on AWS.
- Proficient in deploying and managing AWS services, including but not limited to VPC, Route 53, ELB, EBS, EC2, and S3.
- Sound knowledge of big data, Hadoop, MapReduce, Hive, Splunk, NoSQL databases, and other emerging technologies.
- Hands-on experience migrating on-premises ETL workloads to GCP using cloud-native tools such as BigQuery, Cloud Dataproc, Google Cloud Storage, and Cloud Composer.
- Experience moving data between GCP services such as Cloud Functions and BigQuery.
- Built data pipelines in Airflow on GCP for ETL jobs using different Airflow operators.
- Expertise in Snowflake data modeling and ETL using Snowflake SQL and SnowSQL, implementing complex stored procedures and best practices for data warehousing and ETL.
- Demonstrated strength in SQL, aggregate functions, PySpark scripting, data modeling, and data warehousing.
- Experienced in working with applications such as Snowflake SQL, SQL Server Management Studio (SSMS), MySQL, Oracle PL/SQL, MongoDB (NoSQL), SAP HANA, SQL Developer, SQL*Plus, and DBeaver for development and customization.
- Worked on cloud technologies such as AWS, Azure, and GCP (Google Cloud Platform), supporting data science, machine learning, and deep learning teams.
- Expertise in writing Packages, Stored Procedures, Functions, Views, and Database Triggers using SQL and PL/SQL.
- Expertise in SSIS (SQL Server Integration Services), SSRS (SQL Server Reporting Services), SSAS (SQL Server Analysis Services), Power BI, Power Pivot, Tabular Models, and DAX and MDX queries.
- Extensive experience using SQL Server Management Studio and SQL Server Business Intelligence solutions such as DTS (Data Transformation Services), SSRS (SQL Server Reporting Services), Crystal Reports, and SSIS (SQL Server Integration Services) packages.
- Experience using the pandas Python library during the development life cycle, and experience with Python development on the Linux operating system.
- Skilled in Python and in adopting new tools and technical developments (libraries: NumPy, SciPy, PySide, pandas DataFrames, NetworkX, urllib2, PyChart, Highcharts) to drive improvements throughout the entire SDLC.
- Working experience with Spark for data processing, aggregation, and transformation, including unit testing and the design of data processing pipelines.
- Hands-on experience building big data applications and utilities using Python and launching applications in a distributed environment.
- Applied exception handling and performance optimization techniques in Python scripts using Spark DataFrames.
- Experience with Agile methodologies, Scrum stories, and sprints in a Python-based environment, along with data analytics, data wrangling, and Excel data extracts.
- Hands-on experience with the Snowflake Connector for Python, which provides an interface for developing Python applications that connect to Snowflake and perform all standard operations.
- Extensive experience in Tableau Desktop and Tableau Server in various versions of Tableau as a Developer and Administrator.
- Extensive experience with the Tableau ecosystem, with in-depth knowledge of Tableau Desktop, Tableau Reader, Tableau Public, and Tableau Server.
- Created Excel reports and dashboards and performed data validation using VLOOKUP, HLOOKUP, macros, formulas, INDEX/MATCH, and slicers (with Pivot Tables, GETPIVOTDATA, and dashboards), as well as Power View, Power Map, and heat maps.
- Designed several business dashboards using Tableau Desktop, Tableau Server, Tableau Reader, Tableau Online, and Tableau Prep.
- Extensive experience creating reports such as drill-down, drill-through, parameterized, linked, and cascading reports, as well as heat maps, dual-axis charts, and action dashboards, using Tableau.
- Strong advanced Tableau developer skills, including complex calculations, table calculations, parameters, geographic mapping, data blending, and extract optimization.
- Experienced in database optimization and in developing stored procedures, triggers, cursors, joins, views, and SQL on MySQL and Oracle 10g, including the OMWB tool.
- Experience in high-level design of ETL DTS and SSIS packages for integrating data over OLE DB connections from heterogeneous sources (Excel, CSV, flat-file, and text-format data), using SSIS transformations such as Data Conversion, Conditional Split, Bulk Insert, Merge, and Union All.
- Expertise in Agile/Scrum methodology, using tools such as JIRA, Rally, SharePoint, and Confluence for project collaboration.
- Experience gathering business requirements from business users and creating process flows and data flow diagrams (DFD).
- Adept at creating use cases, test cases, and GUI (graphical user interface) documentation using MS Word.
- Experience with developing software using Object-oriented programming, and a solid understanding of core fundamental concepts of Data Structures and Algorithms.
- Hands-on experience and have in-depth knowledge of using different AWS services for software development.
- Experience with Data Visualization and Data Analysis techniques to gain insights into data.
- Strong communication and time management skills; able to work under tight deadlines.
TECHNICAL SKILLS
Software/Scripting languages: SQL, Python, Tableau Server, Hadoop, UML 2.0, CSS, HTML.
Databases/Cloud: AWS, Azure, Snowflake, Databricks, MySQL, SAP HANA, Oracle, Teradata, Microsoft SQL Server, MS Access.
Tools: PySpark, Office 365, Apex, Informatica, ALM, Atlassian JIRA, Bugzilla, Confluence, Tableau Desktop 2020.2.x, Tableau Server 2020.2.x, SQL Server Reporting Services (SSRS), SQL Server Analysis Services (SSAS), DBeaver, ETL.
Methodologies: SDLC, Agile.
PROFESSIONAL EXPERIENCE
Confidential, Dallas, Texas
Sr. Data Engineer
Responsibilities:
- Developed efficient custom SQL statements for reporting and analyzed data to identify and fix data inconsistencies.
- Strong knowledge of PySpark/EMR environments.
- Proficient in programming languages (Python, PySpark) used for automation.
- Developed and implemented predictive models of user behavior data on websites, URL categorization, social network analysis, social mining, and search content based on large-scale machine learning.
- Involved in processing CSV files in Data Science Experience.
- Optimized machine learning algorithms based on need.
- Developed SQL scripts on relational and non-relational databases, and worked on query optimization, and data modeling.
- Owned end-to-end deployment for projects on AWS, including Python scripting for automation, scalability, and build promotion from staging to production.
- Experience moving data between GCP services such as Cloud Functions and BigQuery.
- Used multiple machine learning algorithms, including random forests, boosted trees, SVM, SGD, neural networks, and deep learning with TensorFlow.
- Strong analytical experience with databases, including writing complex queries, query optimization, debugging, user-defined functions, views, indexes, etc.
- Have experience interacting with business stakeholders and customers directly.
- Developed highly scalable classifiers and tools by leveraging machine learning, Apache Spark, and deep learning.
- Created SQL scripts in Query Analyzer to generate .tab and .csv files for Excel reports.
- Created SQL queries/reports to pull metrics from Snowflake and SQL Server needed for executive management team.
- Conducted data blending and data preparation using SQL for Tableau consumption, and published dashboards to Tableau Server.
- Used Tableau Desktop to analyze and obtain insights into large data sets using groups, bins, hierarchies, sorts, sets, and filters.
- Configured and maintained MySQL and Tableau database servers and wrote new database queries.
- Created SQL stored procedures, CTEs, and triggers to validate, extract, transform, and load data into the data warehouse and data mart.
- Built, deployed, and maintained complex Python code for automation.
- Used Python tools to clean and massage data for exploratory data analysis (EDA).
- Performed calculations on data using NumPy.
- Developed, deployed, and maintained Python scripts for ETL processes.
- Responsible for enhancing and developing APIs for smart start ignition interlock devices.
- Enhanced database performance by designing new APIs that handle memory latencies by performing selective deletion on the database.
- Handled bug fixes for various ignition interlock APIs and tested new code changes.
- Performed root cause analysis to determine the effectiveness of applications, applied resolutions, and created documentation before and after fixing issues.
- Maintained coding standards and designed customized applications with high availability and low latency to meet business goals.
- Monitored and performed tuning on the database server.
- Ensured that current systems, applications, and processes strategically meet current and future design objectives.
- Experience with version control and code review tools such as SVN, Git, and Fisheye.
- Experience working with semi-structured files such as JSON, CSV, and XML.
- Participated in Agile ceremonies such as program increment planning, iteration planning, backlog refinement and grooming, retrospectives, sprint reviews, and daily standups.
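The reporting bullets above describe generating .csv files from SQL query results for Excel reports. A minimal, hypothetical sketch of that pattern is below; it uses an in-memory SQLite table as a stand-in for the production SQL Server/Snowflake source, and the `sales` table, its columns, and `report.csv` are invented for illustration:

```python
import csv
import sqlite3  # stand-in for the production SQL Server / Snowflake connection

# Hypothetical example: build a small in-memory table, run a reporting
# query against it, and export the result set to a CSV file for Excel.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("East", 120.0), ("West", 75.5), ("East", 30.0)])

# Aggregate per region, as a reporting query would.
rows = conn.execute(
    "SELECT region, SUM(amount) AS total FROM sales GROUP BY region ORDER BY region"
).fetchall()

# Write a header row plus the result rows so Excel opens it cleanly.
with open("report.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["region", "total"])
    writer.writerows(rows)
```

In practice the connection line would be replaced by the appropriate driver (e.g. the Snowflake Connector for Python), with the rest of the export logic unchanged.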
Confidential, NY
Data Engineer
Responsibilities:
- Designed an API layer using tools such as Apigee, AWS API Gateway, API security, and API frameworks.
- Documented user requirements, translated requirements into system solutions, and developed implementation plans and schedules.
- Created functions, triggers, views, and stored procedures using MySQL.
- Participated in all phases of research including data collection, data cleaning, data mining, developing models, and visualizations.
- Created database objects such as views, stored procedures, CTEs, triggers, and functions using T-SQL in SSMS to store data and maintain the database efficiently.
- Wrote complex SQL queries using inner joins, left joins, and CASE WHEN expressions on tables and views to retrieve data for reporting purposes.
- Gathered and converted SQL/SAS/Excel data into Tableau reports and dashboards.
- Created and managed Tableau sites, projects, and workbooks, groups, data views, data sources and data connections.
- Maintained and scheduled Tableau data extracts using Tableau Server and the Tableau command-line utility.
- Developed ETL procedures to ensure conformity and compliance with standards, eliminate redundancy, and translate business rules.
- Extracted data from Excel files and high-volume data sets from data files, Oracle, DB2, and SFDC using Informatica ETL mappings, SQL, and UNIX scripts, and loaded it into the data staging area.
- Performed Data analysis using Python and built the functions for manipulating the data.
- Strong knowledge of Data warehousing, SQL, Stored Procedures, and UNIX/Linux environment.
- Designed and implemented ETL for data loads from source to target databases, including Fact tables and Slowly Changing Dimensions (SCD) Type 1, Type 2, and Type 3 to capture changes.
- Participated in all phases of the development life cycle with extensive involvement in the definition and design meetings and functional and technical walkthroughs.
- Implemented custom error handling in Talend jobs and worked on different logging methods.
- Created UNIX script to automate the process for long-running jobs and failure jobs status reporting.
- Developed a high-level data dictionary of ETL data mapping and transformations from a series of complex Talend data integration jobs.
- Developed mappings to load Fact and Dimension tables, SCD Type 1 and SCD Type 2 dimensions, and Incremental loading and unit tested the mappings.
- Expertise in interacting with end users and functional analysts to identify and develop Business Requirement Documents (BRD) and Functional Specification Documents (FSD).
- Prepared ETL mapping documents for every mapping, plus data migration documents, for smooth project transfer from development to testing and then to production.
- Designed an ETL process using SAS to extract data from flat files, Excel, and CSV.
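The Slowly Changing Dimension handling described above can be sketched in plain Python. This is a simplified illustration of SCD Type 2 with invented record fields (`key`, `attr`); the real implementations ran in ETL tools against the warehouse:

```python
from datetime import date

def apply_scd_type2(dimension, incoming, today=date(2024, 1, 1)):
    """SCD Type 2: expire the current row when a tracked attribute
    changes and insert a new current row, preserving history."""
    by_key = {row["key"]: row for row in dimension if row["current"]}
    for rec in incoming:
        cur = by_key.get(rec["key"])
        if cur is None:
            # New key: insert as the current row.
            dimension.append({**rec, "valid_from": today,
                              "valid_to": None, "current": True})
        elif cur["attr"] != rec["attr"]:
            # Changed attribute: close out the old row, add a new one.
            cur["valid_to"] = today
            cur["current"] = False
            dimension.append({**rec, "valid_from": today,
                              "valid_to": None, "current": True})
        # Unchanged records are left as-is.
    return dimension

# Usage: one existing dimension row; the incoming batch changes key 1
# and introduces key 2, yielding three rows total (one expired).
dim = [{"key": 1, "attr": "Dallas", "valid_from": date(2020, 1, 1),
        "valid_to": None, "current": True}]
dim = apply_scd_type2(dim, [{"key": 1, "attr": "Plano"},
                            {"key": 2, "attr": "Austin"}])
```

Type 1 would instead overwrite `attr` in place, and Type 3 would keep the prior value in an extra column; Type 2 is the variant that preserves full history.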
Confidential
ETL Developer
Responsibilities:
- Updated investment analysis for pension schemes, including Ghana Oil, Ghana Water, and others.
- Created incremental refreshes for data sources on Tableau server.
- Combined MS SQL and Tableau to create and modify Tableau worksheets and dashboards, performing table-level calculations such as window functions with varied analytics including reference lines, average lines, forecasting, trend analysis, and distribution bands.
- Created clear PowerPoint presentations with data visualization results from Tableau and presented them to the customer service team.
- Designed and implemented ETL for data loads from source to target databases, including Fact tables and Slowly Changing Dimensions (SCD) Type 1, Type 2, and Type 3 to capture changes.
- Participated in all phases of the development life cycle with extensive involvement in the definition and design meetings and functional and technical walkthroughs.
- Implemented custom error handling in Talend jobs and worked on different logging methods.
- Manipulated tables in Excel and imported them to SQL using Enterprise Business Intelligence.
- Created Excel-based analyses and reports using VLOOKUP, pivot tables, and complex Excel formulas per business needs.
- Developed ETL mappings for XML, CSV, and TXT sources and loaded data from these sources into relational tables with Talend; developed Joblets for reusability and to improve performance.
- Created UNIX script to automate the process for long-running jobs and failure jobs status reporting.
Confidential
Data Analyst
Responsibilities:
- Applied business rules and met with different pharmaceutical departments to understand and obtain the regulatory information necessary to construct quality requirement documentation.
- Visited client sites to analyze patient records and collect the data required to audit compliance with government health measures; recorded and documented all member and provider outreach activity.
- Prepared a high-level document describing the detailed artifacts required to perform a Bronze Level Compatibility review for an Electronic Data Capture (EDC) product.
- Assisted in creating detailed technical designs after understanding the 'AS IS' business process and the new requirements.
- Technical designs were reviewed internally and sent to the client for approval.
- Facilitated meetings with product owners, the Scrum Master, the project coordinator, SMEs, and the development and testing teams.
- Conducted meetings with user groups and internal stakeholders to gather, understand, and document product/service requirements.
- Developed acceptance criteria to test user stories in sprint planning meetings; assessed the state of the organization to determine the application scope for gap analysis.
- Developed ETL for data extraction, data mapping, and data conversion using SQL.
- Extensively used ETL to load data from source systems such as flat files and Excel files into staging tables, and loaded the data into the target Oracle database.
- Created multiple visualization reports/dashboards using histograms, filled maps, bubble charts, bar charts, line charts, tree maps, box-and-whisker plots, stacked bars, etc.
- Developed Tableau workbooks involving advanced analysis techniques such as sets and filters, data grouping/visual grouping, computed sort features, and maps.
- Built reports using SQL Server 2008 Reporting Services (SSRS).
- Enabled drill options such as drill down, drill up, merged dimensions, sections, filters, hiding sections/rows/tables, and set values and rules.
- Maintained metrics that provide visibility to stakeholders on program-level plans, progress and quality to ensure commitment to the quality.
- This information may take the form of various Scrum artifacts, from backlogs to burn-down charts.
- Prepared documentation for all entities, attributes, data relationships, primary and foreign key structures, allowed values, codes, business rules, and glossary items as they evolved and changed during the project.
- Maintained the quality of data analysis, researched output and reporting, and ensured that all deliverables met specified requirements.
- Created mapping specification spreadsheets to document the transformation logic.
- Analyzed and interpreted trends and patterns in complex data sets.
- Performed data analysis and data profiling using SQL queries on various sources.
- Created source to target mapping documents of data mart for all sources.
- Assisted Project Manager in documenting Project Charter using MS Project and actively involved in planning phase (WBS, Schedule and Resource Planning) of Project Management.
Environment: Windows, IE8, UML, Agile Methodology, Informatica, SSRS, Tableau, MS SQL Server 2012, Oracle, Java, HTML, MS Office Suite (Word, Excel, PPT, Visio)