Sr. Snowflake Developer Resume
Phoenix, AZ
SUMMARY
- 9 years of extensive IT experience, especially in Data Warehousing and Business Intelligence applications across the Financial, Retail, Telecom, Insurance, Healthcare, and Technology Solutions industries.
- Experience creating 2D drawings using AutoCAD.
- Experienced in creating Hive, Pig, and custom MapReduce programs for analyzing data.
- Experience in validating and analyzing Hadoop log files.
- Experience in loading multiple large datasets into HDFS and processing them using Hive and Pig.
- Familiar with data architecture, including data ingestion pipeline design, Hadoop information architecture, data modelling, data mining, machine learning, and advanced data processing.
- Data intake is handled through Sqoop, and ingestion through MapReduce and HBase.
- Created implicit, local, and global context variables in jobs. Worked on Talend Administration Console (TAC) for scheduling jobs and adding users.
- Worked on various Talend components such as tMap, tFilterRow, tAggregateRow, tFileExist, tFileCopy, tFileList, tDie etc.
- Experience in validating tables with partitions and bucketing, and loading data into Hive tables.
- Created Talend ETL jobs to receive attachment files from POP email using tPop, tFileList, and tFileInputMail, then loaded data from the attachments into a database and archived the files.
- Organized and managed the technical data team of data architects, data modelers, and engineers, providing leadership, guidance, and mentoring.
- Developed conceptual and logical information models within the context of the enterprise and line-of-business information architecture.
- Oversaw the mapping of data sources, data movements, interfaces, and analytics with the goal of ensuring data quality.
- Deploying, managing and operating scalable, highly available, and fault tolerant systems on AWS.
- Worked on the Snowflake Connector for Python for developing Python applications (a minimal connector sketch follows this summary).
- Installed Python connectors and the Python connector API.
- Hands-on experience with the Hadoop technology stack (HDFS, MapReduce, Hive, HBase, Pig, Cassandra, Flume, Kafka, and Spark).
- 3+ years of experience using Talend Data Integration/Big Data Integration (6.1/5.x) / Talend Data Quality.
- Performed research for lead architect on specifications, materials, building codes
- Well versed with Talend Big Data, Hadoop, and Hive, and used Talend Big Data components like tHDFSInput, tHDFSOutput, tPigLoad, tPigFilterRow, tPigFilterColumn, tPigStoreResult, tHiveLoad, tHiveInput, tHbaseInput, tHbaseOutput, tSqoopImport, and tSqoopExport.
- Strong experience in migrating other databases to Snowflake.
- Participate in design meetings for creation of the Data Model and provide guidance on best data architecture practices.
- Analyzed the source data to know the quality of the data using Talend Data Quality.
- Broad design, development, and testing experience with Talend Integration Suite and knowledge of performance tuning of mappings.
- Developed jobs in Talend Enterprise Edition from stage to source, intermediate, conversion, and target.
- Participated in the development, improvement, and maintenance of Snowflake database applications.
- Evaluate Snowflake Design considerations for any change in the application
- Build the Logical and Physical data model for snowflake as per the changes required
- Define roles, privileges required to access different database objects.
- Define virtual warehouse sizing for Snowflake for different type of workloads.
- Design and code required Database structures and components.
- Developed jobs in the Matillion ETL tool for Redshift.
- Worked with the cloud architect to set up the environment.
- Ensure incorporation of best practices and lessons learned from prior projects.
- Code stored procedures and triggers.
- Implement performance tuning where applicable.
- Design batch cycle procedures on major projects using scripting and Control.
- Develop SQL queries in SnowSQL.
- Develop transformation logic using Snowpipe.
- Optimize and fine tune queries
- Performance tuning of Big Data workloads.
- Good knowledge of ETL and hands-on experience in ETL.
- Used ETL methodologies and best practices to create Talend ETL jobs. Followed and enhanced programming and naming standards.
- Write highly tuned and performant SQL on various DB platforms, including MPPs.
- Develop highly scalable, fault-tolerant, maintainable ETL data pipelines to handle vast amounts of data.
- Build high quality, unit testable code.
- Operationalize data ingestion, data transformation and data visualization for enterprise use.
- Define architecture, best practices and coding standards for the development team.
- Provide expertise in all phases of the development lifecycle, from concept and design to testing and operation.
- Work with domain experts, engineers, and other data scientists to develop, implement, and improve upon existing systems.
- Deep dive into data quality issues and provide corrective solutions.
- Interface with business customers, gathering requirements and delivering complete data engineering solutions.
- Ensure proper data governance policies are followed by implementing or validating Data Lineage, Quality checks, classification, etc.
- Deliver quality and timely results.
- Mentor and train junior team members and ensure coding standards are followed across the project.
- Help talent acquisition team in hiring quality engineers.
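Connector sketch referenced in the summary above: a minimal, illustrative example of sizing a Snowflake virtual warehouse for a workload and running a SnowSQL-style query through the Snowflake Connector for Python. The account, credentials, warehouse, and table names are placeholders, not details from any engagement.

```python
# Minimal sketch: size a virtual warehouse and run a query via the
# Snowflake Connector for Python. All identifiers below are placeholders.
import snowflake.connector

conn = snowflake.connector.connect(
    account="xy12345.us-east-1",   # placeholder account locator
    user="ETL_USER",
    password="********",
    role="SYSADMIN",
)

try:
    cur = conn.cursor()
    # Create a warehouse sized for a reporting workload; sizing values are illustrative.
    cur.execute(
        "CREATE WAREHOUSE IF NOT EXISTS REPORTING_WH "
        "WAREHOUSE_SIZE = 'MEDIUM' AUTO_SUSPEND = 300 AUTO_RESUME = TRUE"
    )
    cur.execute("USE WAREHOUSE REPORTING_WH")
    # Example query against a hypothetical table.
    cur.execute("SELECT COUNT(*) FROM ANALYTICS.PUBLIC.SALES_FACT")
    print(cur.fetchone()[0])
finally:
    conn.close()
```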
TECHNICAL SKILLS
Cloud Technologies: Snowflake, SnowSQL, Snowpipe, AWS.
Big Data Ecosystem: Spark, Hive, LLAP, Beeline, HDFS, MapReduce, Pig, Sqoop, HBase, Oozie, Flume
Reporting Systems: Splunk
Hadoop Distributions: Cloudera, Hortonworks
Programming Languages: Scala, Python, Perl, Shell scripting.
Data Warehousing: Snowflake, Redshift, Teradata
DBMS: Oracle, SQL Server, MySQL, Db2
Operating System: Windows, Linux, Solaris, Centos, OS X
IDEs: Eclipse, Netbeans.
Servers: Apache Tomcat
PROFESSIONAL EXPERIENCE
Confidential, Phoenix, AZ
Sr. Snowflake Developer
Responsibilities:
- Perform unit and integration testing and document test strategy and results
- Created data sharing between two Snowflake accounts (Prod to Dev); see the sketch after this list.
- Migrated 500+ tables and views from Redshift to Snowflake.
- Redesigned views in Snowflake to increase performance.
- Unit tested the data between Redshift and Snowflake.
- Creating Reports in Looker based on Snowflake Connections
- Proactively support team building and onboarding efforts via mentoring contributions.
- White-boarding and planning.
- Imported JSON data from S3 into Redshift using Matillion.
- Offloaded historical data to Redshift Spectrum using Matillion, Python, and AWS Glue.
- Working in distributed computing.
- Performed daily admin tasks of user registration, security configuration and usage monitoring
- Worked on letter generation programs using C and UNIX shell scripting.
- Validated the MapReduce, Pig, and Hive scripts by pulling the data from Hadoop and validating it with the data in the files and reports.
- Utilized Sqoop, Kafka, Flume, and Hadoop File System APIs for implementing data ingestion pipelines from heterogeneous data sources.
- Experience with data analytics, data reporting, ad-hoc reporting, graphs, scales, PivotTables, and OLAP reporting.
- As a Spotfire Admin, performed upgrades, hot fixes, server installations & webplayer installations.
- Created Talend jobs using the dynamic schema feature.
- Loaded and transformed large sets of structured data from Oracle and SQL Server into HDFS using Talend Big Data Studio.
- Developed Spark applications in Python (PySpark) on a distributed environment to load a large number of CSV files with different schemas into Hive ORC tables.
- Worked on reading and writing multiple data formats like JSON, ORC, Parquet on HDFS using PySpark.
- Involved in end-to-end migration of 800+ objects (4 TB) from SQL Server to Snowflake.
- Moved data from SQL Server to an Azure-backed Snowflake internal stage and then into Snowflake tables with COPY options.
- Created roles and access-level privileges and took care of Snowflake admin activities end to end.
- Converted 230 view queries from SQL Server for Snowflake compatibility.
- Publishing customized interactive reports and dashboards, report scheduling using Tableau server.
- Administered user, user groups, and scheduled instances for reports in Tableau.
- Involved in installation of Tableau desktop 8.1, Tableau server Application software
- Worked on snowflake connector for developing python applications.
- Installing python connectors and python connector API.
- Knowledge of Azure Site Recovery and Azure Backup; installed and configured the Azure Backup agent and virtual machine backup, enabled Azure virtual machine backup from the vault, and configured Azure Site Recovery (ASR).
- Implemented a CI/CD pipeline using Azure DevOps (VSTS, TFS) in both cloud and on-premises environments with Git, MSBuild, Docker, and Maven, along with Jenkins plugins.
- Experience in writing Infrastructure as Code (IaC) in Terraform, Azure Resource Manager, and AWS CloudFormation. Created reusable Terraform modules in both Azure and AWS cloud environments.
- Retrofitted 500 Talend jobs from SQL Server to Snowflake.
- Experience in validating MapReduce jobs to support distributed processing using Java, Hive, and Pig.
- Responsible for making sure that team members maintain technical skills, keeping track of risks, issues, and other required documentation, on-time delivery, coordination with other DBT teams, participating in all relevant meetings, and preparing project reports.
- Experience writing Hive Queries for analyzing data in Hive warehouse using Hive Query Language (HQL).
- Worked on SnowSQL and Snowpipe
- Utilized admin tools Object Manager, Command Manager, Enterprise Manager, System Manager, and Operations Manager in day-to-day BI administration operations.
- Expertise in all areas of data warehousing.
- Involved in DBT estimates, IQA, EQA.
- Participate in developing and documenting technical standards and best practices for the BI organization.
- Converted Talend Joblets to support the Snowflake functionality.
- Performed debugging, troubleshooting, modifications, and unit testing of data integration solutions.
- Analyzed the SQL scripts and redesigned them using PySpark SQL for faster performance.
- Knowledge of monitoring, logging, and cost management tools that integrate with AWS.
- Validation of Looker report wif Redshift database.
- Created data shares out of Snowflake with consumers.
- Validated the data from SQL Server to Snowflake to make sure it has an apples-to-apples match.
- Consulted on Snowflake Data Platform solution architecture, design, development, and deployment, focused on bringing a data-driven culture across enterprises.
- Used different AWS Data Migration Services and the Schema Conversion Tool along with the Matillion ETL tool.
- Drove the replacement of other data platform technologies with Snowflake at the lowest TCO, with no compromise on performance, quality, and scalability.
- Built solutions once for all with no band-aid approach.
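Sketch for the Prod-to-Dev data sharing noted above: illustrative Snowflake secure data sharing DDL issued from the producer account through the Python connector. The database, schema, share, and consumer account names are hypothetical.

```python
# Illustrative only: share a Prod database with a Dev account using Snowflake
# secure data sharing. SALES_DB, SALES_SHARE, and DEV_ACCT are hypothetical names.
import snowflake.connector

conn = snowflake.connector.connect(
    account="prod_acct.us-east-1",  # placeholder producer (Prod) account
    user="SHARE_ADMIN",
    password="********",
    role="ACCOUNTADMIN",
)

share_ddl = [
    "CREATE SHARE IF NOT EXISTS SALES_SHARE",
    "GRANT USAGE ON DATABASE SALES_DB TO SHARE SALES_SHARE",
    "GRANT USAGE ON SCHEMA SALES_DB.PUBLIC TO SHARE SALES_SHARE",
    "GRANT SELECT ON ALL TABLES IN SCHEMA SALES_DB.PUBLIC TO SHARE SALES_SHARE",
    "ALTER SHARE SALES_SHARE ADD ACCOUNTS = DEV_ACCT",  # add the consumer (Dev) account
]

try:
    cur = conn.cursor()
    for stmt in share_ddl:
        cur.execute(stmt)
    # On the consumer side, the Dev account would then run something like:
    # CREATE DATABASE SALES_DB_SHARED FROM SHARE <prod_account>.SALES_SHARE;
finally:
    conn.close()
```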
Environment: Snowflake, Redshift, SQL server, BI Architect, AWS, AZURE, TALEND, JENKINS and SQL
Confidential, Secaucus, New Jersey
Snowflake Developer
Responsibilities:
- The objective of this project was to develop a data warehouse to fulfill reporting needs and develop new marketing strategies. The data comes from source systems such as life applications to feed the business intelligence system; loading this data into the data warehouse was the requirement to be designed and implemented. Worked on the Hue interface for loading data into HDFS and querying it.
- Designed and created Hive external tables using a shared metastore instead of Derby, with partitioning, dynamic partitioning, and buckets.
- Wrote scripts and indexing strategy for a migration to Confidential Redshift from SQL Server and MySQL databases
- Used Spark SQL to create SchemaRDDs, loaded them into Hive tables, and handled structured data using Spark SQL.
- Worked on multiple projects with different business units, including insurance actuaries.
- Gathered the various reporting requirements from the business analysts.
- Gathered all the sales analysis report prototypes from the business analysts belonging to different business units; participated in JAD sessions involving the discussion of various reporting needs.
- Reverse engineered the reports and identified the Data Elements (in the source systems), Dimensions, Facts and Measures required for new enhancements of reports.
- Conducted design discussions and meetings to arrive at the appropriate data mart at the lowest level of grain for each of the dimensions involved.
- Designed a star schema for the detailed data marts and plan data marts involving conformed dimensions.
- Created and maintained the Data Model repository as per company standards.
- Conducted design reviews with the business analysts and content developers to create a proof of concept for the reports.
- Worked on and resolved production data issues arising from the migration of the data warehouse from Teradata to Netezza for NFS.
- Worked extensively with Teradata utilities like BTEQ, FastExport, and all the load utilities.
- Ensured the feasibility of the logical and physical design models.
- Collaborated with the reporting team to design monthly summary-level cubes to support the further aggregated level of detailed reports.
- Worked on snowflaking the dimensions to remove redundancy.
- Designed Sales Hierarchy dimensions to handle sales hierarchy reporting historically and dynamically
- Worked with the implementation team to ensure a smooth transition from the design to the implementation phase.
- Worked closely with the ETL SSIS developers to explain the complex data transformation logic.
- Created ETL jobs and custom transfer components to move data from Oracle source systems to SQL Server using SSIS.
- Developed Mappings, Sessions, and Workflows to extract, validate, and transform data according to the business rules using Informatica.
- Worked with various HDFS file formats like Avro and SequenceFile, and various compression formats like Snappy and Gzip.
- Worked on data ingestion from Oracle to Hive.
- Involved in fixing various issues related to data quality, data availability and data stability.
- Worked in determining various strategies related to data security.
- Performance monitoring and Optimizing Indexes tasks by using Performance Monitor, SQL Profiler, Database Tuning Advisor, and Index tuning wizard.
- Acted as point of contact to resolve locking/blocking and performance issues.
- Worked on AWS Data Pipeline to configure data loads from S3 into Redshift.
- Used a JSON schema to define the table and column mapping from S3 data to Redshift (see the sketch after this list).
- Wrote indexing and data distribution strategies optimized for sub-second query response.
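Sketch for the S3-to-Redshift JSON mapping noted above: an illustrative COPY that uses a JSONPaths file to map JSON attributes to table columns, issued here through psycopg2. The cluster endpoint, bucket, table, and IAM role are placeholders.

```python
# Sketch only: load JSON from S3 into Redshift, mapping attributes to columns
# with a JSONPaths file. Endpoint, bucket, table, and role ARN are placeholders.
import psycopg2

COPY_SQL = """
    COPY analytics.customer_events (event_id, customer_id, event_ts, payload)
    FROM 's3://example-bucket/events/2020/'
    IAM_ROLE 'arn:aws:iam::123456789012:role/redshift-copy-role'
    FORMAT AS JSON 's3://example-bucket/jsonpaths/customer_events.json'
    TIMEFORMAT 'auto'
    REGION 'us-east-1';
"""

conn = psycopg2.connect(
    host="example-cluster.abc123.us-east-1.redshift.amazonaws.com",
    port=5439,
    dbname="analytics",
    user="etl_user",
    password="********",
)
try:
    # The connection context manager commits the COPY on successful exit.
    with conn, conn.cursor() as cur:
        cur.execute(COPY_SQL)
finally:
    conn.close()
```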
Environment: Snowflake, Redshift, SQL server, BI Architect, AWS, AZURE, TALEND, JENKINS and SQL
Confidential, Torrance CA
Snowflake Data Engineer
Responsibilities:
- Created data pipelines for several ingestion and aggregation events and loaded consumer response data from an AWS S3 bucket into Hive external tables.
- Worked on Oracle databases, Redshift, and Snowflake.
- Core technologies: end-to-end data platform with Snowflake, Matillion, Power BI, Qubole, Databricks, Tableau, Looker, Python, Dataiku, and R.
- Created Lambda and SQS queues to build event-based triggers for Matillion (see the sketch after this list).
- Unloaded data to AWS Athena using Glue and Matillion.
- Implemented a serverless architecture using API Gateway, Lambda, and DynamoDB, and deployed AWS Lambda code from Amazon S3 buckets. Created a Lambda deployment function and configured it to receive events from the S3 bucket.
- Create conceptual, logical and physical models for OLTP, Data Warehouse Data Vault and Data Mart Star/Snowflake schema implementations.
- Used Alteryx for Data Preparation and then Tableau for Visualization and Reporting.
- Processed data in Alteryx to create TDE for tableau reporting.
- Working experience with Kimball methodology and Data Vault modeling.
- Define virtual warehouse sizing for Snowflake for different type of workloads.
- Experience in Tableau Server administration tasks including Tableau server optimization and performance tuning.
- Extensively used Autosys for scheduling the UNIX jobs.
- Major challenges of the system were integrating and accessing many systems spread across South America; creating a process to involve third-party vendors and suppliers; and creating authorization for various department users with different roles.
- Experienced in building a Talend job outside of Talend Studio as well as on the TAC server.
- CSM certified; worked in a Scaled Agile (SAFe) environment as System/DBT QA, with hands-on experience with Rally and JIRA.
- Involved in design and development of GLS application developed in C/C++ on HP UNIX.
- Set up full CI/CD pipelines so that each commit a developer makes goes through the standard software lifecycle process and is tested well enough before it can make it to production.
- Validated the data load process for Hadoop using HiveQL queries.
- Evaluate Snowflake Design considerations for any change in the application
- Build the Logical and Physical data model for snowflake as per the changes required
- Define roles, privileges required to access different database objects.
- Design and code required Database structures and components
- Developed the PySpark code for AWS Glue jobs and for EMR.
- Worked on scalable distributed data systems using the Hadoop ecosystem on AWS EMR and the MapR distribution.
- Used many more Talend components, a few to mention: tJava, tOracle, tXMLMap, delimited-file, tLogRow, and logging components, in many of my job designs.
- Worked on Joblets (reusable code) & Java routines in Talend
- Implemented Talend POC to extract data from Salesforce API as an XML Object & .csv files and load data into SQL Server Database.
- Worked with the cloud architect to set up the environment.
- Educated developers on how to commit their work and how they can make use of the CI/CD pipelines that are in place.
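Sketch for the event-based Matillion trigger noted above: an illustrative AWS Lambda handler that relays S3 object-created events to the SQS queue a Matillion job listens on. The queue URL is a placeholder.

```python
# Sketch only: Lambda handler that forwards S3 object-created events to an SQS
# queue consumed by a Matillion job. The queue URL below is a placeholder.
import json
import boto3

sqs = boto3.client("sqs")
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/matillion-trigger"


def lambda_handler(event, context):
    """Forward each S3 record in the incoming event to the Matillion trigger queue."""
    records = event.get("Records", [])
    for record in records:
        message = {
            "bucket": record["s3"]["bucket"]["name"],
            "key": record["s3"]["object"]["key"],
        }
        sqs.send_message(QueueUrl=QUEUE_URL, MessageBody=json.dumps(message))
    return {"forwarded": len(records)}
```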
Environment: Snowflake, SQL server, AWS and SQL
Confidential, Dallas, TX
Senior Data Engineer
Responsibilities:
- Performance monitoring and Optimizing Indexes tasks by using Performance Monitor, SQL Profiler, Database Tuning Advisor and Index tuning wizard.
- Acted as point of contact to resolve locking/blocking and performance issues.
- Wrote scripts and indexing strategy for a migration to Confidential Redshift from SQL Server and MySQL databases
- Worked on AWS Data Pipeline to configure data loads from S3 into Redshift.
- Used JSON schema to define table and column mapping from S3 data to Redshift
- Helped individual teams set up their repositories in Bitbucket and maintain their code, and helped them set up jobs that can make use of the CI/CD environment.
- Wrote indexing and data distribution strategies optimized for sub-second query response.
- Primarily worked on a project to develop internal ETL product to handle complex and large volume healthcare claims data. Designed ETL framework and developed number of packages.
- Performed data source investigation, developed source-to-destination mappings, and performed data cleansing while loading the data into staging/ODS regions.
- Involved in various Transformation and data cleansing activities using various Control flow and data flow tasks in SSIS packages during data migration
- Applied various data transformations like Lookup, Aggregate, Sort, Multicasting, Conditional Split, Derived column etc.
- Developed Mappings, Sessions, and Workflows to extract, validate, and transform data according to the business rules using Informatica.
- Designed the data models to be used in data intensive AWS Lambda applications which are aimed to do complex analysis creating analytical reports for end-to-end traceability, lineage, definition of Key Business elements from Aurora.
- Worked on scalable distributed data system using Hadoop ecosystem in AWS EMR and MapR (MapR data platform).
- Querying, creating stored procedures, and writing complex queries and T-SQL joins to address various reporting operations as well as ad hoc data requests.
Environment: SQL Server 2016, SSRS, SSIS, T-SQL, Snowflake, ELT Maestro, Oracle 11g, SQL * Loader, Visual Studio 2015, XML, Excel.
Confidential
ETL Developer
Responsibilities:
- Extensively used Oracle ETL process for address data cleansing.
- Developed and tuned all the affiliations received from data sources using Oracle and Informatica and tested with high volumes of data.
- Responsible for developing, supporting, and maintaining the ETL (Extract, Transform, and Load) processes using Oracle and Informatica PowerCenter.
- Created common reusable objects for the ETL team and oversaw coding standards.
- Reviewed high-level design specification, ETL coding and mapping standards.
- Designed new database tables to meet business information needs. Designed Mapping document, which is a guideline to ETL Coding.
- Used ETL to extract files for the external vendors and coordinated that effort.
- Migrated mappings from Development to Testing and from Testing to Production.
- Performed Unit Testing and tuned for better performance.
- Created various Documents such as Source-to-Target Data mapping Document, and Unit Test Cases Document.
- Developed Logical and Physical data models that capture current state/future state data elements and data flows using Erwin 4.5.
- Responsible for designing and building the data mart as per the requirements.
- Extracted Data from various sources like Data Files, different customized tools like Meridian and Oracle.
- Extensively worked on views, stored procedures, triggers, and SQL queries, and on loading data (staging) to enhance and maintain the existing functionality.
- Analyzed the source, requirements, and existing OLTP system, and identified the required dimensions and facts from the database.
- Created Data acquisition and Interface System Design Document.
- Designed the dimensional model of the data warehouse; confirmed source data layouts and needs.
Environment: MS SQL Server 2012/2008R2, SQL server integration services (SSIS) 2012, SQL server reporting services (SSRS) 2012, and TSQL, MS Office, Notepad++