ETL Developer Resume
East Setauket, NY
SUMMARY
- 5 years of experience in Business Intelligence, including SQL Server, ETL (SSIS), reporting (SSRS), and data visualization (Power BI), across industries such as finance, mortgage, and funds.
- Exposure to the full SDLC, including requirement gathering, development, testing, debugging, deployment, production, and maintenance.
- Expertise in developing T-SQL objects such as stored procedures, user-defined functions, views, triggers, common table expressions (CTEs), and complex queries involving nested joins and subqueries.
- Proficiency in developing SSIS packages that implement ETL (Extract, Transform, Load) processes to migrate data from various sources into a data warehouse.
- Strong experience in data profiling, data validation, and data cleansing using T-SQL and SSIS.
- Rich experience developing C# scripts to work with files, including checking, downloading, uploading, transforming, and archiving files from file systems and FTP servers.
- Developed varied reports using drill-down, drill-through, cascading parameters, sub-reports, and charts with grouping, variables, expressions, and functions.
- Competence in report development using SSRS in both Native mode and SharePoint mode.
- Extensive experience in performance tuning and troubleshooting of T-SQL queries, stored procedures, SSIS packages, and SSRS reports.
- Hands-on experience with Hadoop ecosystem components such as HDFS, MapReduce, YARN, Hive, Pig, Sqoop, HBase, and Flume for data analytics.
- Fundamental knowledge and academic background in object-oriented programming, data analysis, web crawling, and machine learning algorithms using Python, C#, and Java.
- Fundamental knowledge and academic background in financial markets, including primary and secondary markets, portfolio management, asset valuation, fixed income, and exotic options.
TECHNICAL SKILLS
Database: MS SQL Server (2008, 2012, 2014, 2016)
Integration Services: SSIS
Reporting Services: SSRS
Programming languages: C#, C++, Java, Python, MATLAB, R
Data visualization: Power BI, Tableau
Source control: Team Foundation Server (TFS)
Collaboration tool: SharePoint
Data warehousing: Inmon, Kimball, star schema, snowflake schema
Cloud service: Azure SQL Database
Big data tools: Hadoop (Hive, HDFS, Pig, MapReduce)
SDLC: Agile, Scrum
PROFESSIONAL EXPERIENCE
Confidential, East Setauket, NY
ETL developer
Environment: SQL Server 2016, Oracle Database 11g, DB2, SSMS, T-SQL, C#, SSIS, SSRS, SSDT, SQL Azure, JIRA, TFS
Responsibilities:
- Organized joint application design (JAD) meetings with business end users and business analysts to collect business requirements for the ETL project.
- Created and reviewed business requirement documents (BRDs), functional specification documents, and data mapping documents to support the ETL project.
- Wrote and maintained T-SQL scripts to create and modify database objects such as stored procedures, user-defined functions, regular views, and schema-bound views to satisfy business requirements.
- Wrote T-SQL scripts to import/export XML data in different formats to and from SQL Server tables.
- Utilized T-SQL transactions to ensure ACID properties during the ETL process.
- Created different types of indexes, such as clustered, non-clustered, covering, and columnstore indexes, on SQL tables to improve overall data retrieval time.
- Optimized slow-running T-SQL queries using SQL execution plans and the Database Engine Tuning Advisor.
- Wrote C# scripts with the System.IO and LINQ libraries to automatically check, manage, download, zip/unzip, and archive landing files from an FTP server on a daily basis.
- Performed data profiling, data validation, and data cleansing by writing T-SQL stored procedures and SSIS data flow transformations such as Derived Column, Conditional Split, and Script Component.
- Created and maintained over 60 SSIS packages implementing ETL (extraction, transformation, load) from the OLTP systems of different financial institutions (Bloomberg, CQG, Moody’s) into a staging database for both initial and incremental loads.
- Implemented slowly changing dimension (SCD) Type 1 and Type 2 logic for incremental loading into dimension tables using the Lookup transformation in SSIS data flows.
- Created custom logging tables to record critical information such as time duration, row counts, and error messages during package execution using Execute SQL Tasks in SSIS event handlers.
- Established error-handling procedures, including logging to files, sending email notifications, and redirecting error output, in SSIS event handlers, control flows, and data flows.
- Created checkpoints on time-consuming tasks to avoid reloading data after a failure.
- Optimized SSIS packages using techniques such as favoring non-blocking, synchronous transformations, executing packages in parallel, and adjusting memory buffer sizes to keep as much data as possible in cache.
- Performed unit testing on SSIS packages by creating sample test data and test scripts.
- Deployed SSIS packages and set up the environment on the test server.
- Created scheduled jobs using SQL Server Agent to automate SSIS package execution, alerts, and notifications.
- Prepared implementation documents detailing the SSIS packages for the research team manager and DBAs.
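The SCD Type 2 incremental load described above was implemented with the SSIS Lookup transformation; as a language-neutral illustration only, the same expire-and-insert logic can be sketched in Python (all row and column names here are hypothetical, not the actual dimension schema):

```python
from datetime import date

def apply_scd2(dimension, incoming, key, tracked, load_date):
    # dimension: list of dict rows carrying "is_current", "start_date", "end_date"
    # incoming:  staged source rows keyed by `key`
    # tracked:   attributes whose change triggers a new row version
    current = {row[key]: row for row in dimension if row["is_current"]}
    for rec in incoming:
        match = current.get(rec[key])
        if match is None:
            # unseen business key: plain insert
            dimension.append({**rec, "is_current": True,
                              "start_date": load_date, "end_date": None})
        elif any(match[col] != rec[col] for col in tracked):
            # a tracked attribute changed: expire the old version, insert a new one
            match["is_current"] = False
            match["end_date"] = load_date
            dimension.append({**rec, "is_current": True,
                              "start_date": load_date, "end_date": None})
    return dimension
```

In the SSIS packages themselves, the no-match output of the Lookup drove the inserts and the match-with-change path drove the expire/insert pair.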
Confidential
Senior BI developer
Responsibilities:
- Cooperated with business stakeholders, business analysts, and data architects to determine the essential elements and requirements to include in business reports.
- Wrote and maintained various T-SQL scripts, stored procedures, and user-defined functions to import and manipulate datasets from SQL Server for different business reports.
- Utilized SSRS to create, maintain, and display summarized data on the Balance Sheet, Income Statement, and Other Comprehensive Income (OCI) of individual companies.
- Developed and maintained over 30 SSRS reports using techniques such as drill-down, drill-through, cascading parameters, linked reports, sub-reports, matrices, and charts.
- Used snapshot and caching options to improve report server performance.
- Deployed SSRS reports to SharePoint and the report server for all environments.
- Created subscriptions to automatically deliver reports in specific formats such as PDF, Excel, and XML via email and Windows file share.
Confidential
BI developer
Environment: SQL Server 2012, T-SQL, SSIS, SSDT, Oracle Database, Python 3.6, Hadoop (Hive, HDFS, MapReduce, Sqoop), Windows, Linux
Responsibilities:
- Teamed up with business analysts and conducted JRD sessions for requirement gathering.
- Developed normalized logical and physical database models of an OLTP system for equity transactions using ER/Studio.
- Wrote required T-SQL scripts to create and modify database objects such as stored procedures, user-defined functions, views, and triggers.
- Established error handling in T-SQL using try…catch blocks and the @@ERROR approach.
- Designed SSIS packages to extract data from different sources (SQL Server databases, Oracle databases, flat files, Excel), transform it using data flow transformations such as Sort, Aggregate, Derived Column, and Conditional Split, and finally load it into the data warehouse.
- Monitored the performance of SQL queries using SQL execution plans and optimized slow-performing queries by restructuring T-SQL, utilizing indexes, and implementing vertical/horizontal table partitioning to enable parallel data loading.
- Created clustered/non-clustered indexes on critical columns to improve overall data retrieval performance and managed fragmentation through index rebuilds and reorganizations.
- Worked with the production management team to troubleshoot production issues, mostly fixing pricing issues within a short time window.
- Built web crawlers to extract stock market data from various stock trading forums using Python packages including urllib2, requests, SQLite, and Beautiful Soup.
- Performed data profiling and data cleansing using Python libraries such as pandas and NumPy.
- Created internal Hive tables and utilized Sqoop with a JDBC driver to load large sets of structured data from different sources (SQL Server databases, DB2, flat files) into Hive.
- Collected online users’ comments on the stock market from various stock forums and loaded the data into Hadoop Hive tables using Sqoop.
- Performed dynamic partitioning and bucketing on Hive tables to improve the performance of data loading and HQL queries.
- Involved in back-testing financial models using NLP (natural language processing), a Naïve Bayes classifier, and ARMA-GARCH time series approaches for public sentiment analysis, using Python scripts with packages such as NLTK, scikit-learn, and SciPy.
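The pandas-based profiling and cleansing mentioned above applied rules of roughly this shape; the snippet below is a minimal sketch with hypothetical column names and filters, not the production code:

```python
import pandas as pd

def clean_quotes(df):
    # illustrative cleansing rules for scraped market data:
    # 1. drop duplicate ticker/date rows kept by the crawler
    # 2. coerce price text to numbers, turning junk strings into NaN
    # 3. discard rows whose price is missing or non-positive
    df = df.drop_duplicates(subset=["ticker", "trade_date"])
    df = df.assign(price=pd.to_numeric(df["price"], errors="coerce"))
    return df[df["price"] > 0].reset_index(drop=True)
```

`errors="coerce"` is what lets malformed strings like "n/a" fall out as NaN instead of raising, so a single dropna-style filter removes them.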
Confidential, Indianapolis, Indiana
Database/BI developer
Environment: SQL Server 2008, Oracle Database 11g, T-SQL, SSMS, SSRS, SharePoint, SSIS, TFS
Responsibilities:
- Worked with business stakeholders, application developers, production teams, and cross-functional units to identify business needs and discuss solution options.
- Created SQL stored procedures, temp tables, and views for the development of reports.
- Created complex T-SQL queries for different business requirements using joins, subqueries, common table expressions (CTEs), ranking functions such as ROW_NUMBER(), window functions such as LEAD/LAG, and advanced aggregation such as ROLLUP and CUBE.
- Performed performance tuning of SQL queries by creating various clustered, non-clustered, and columnstore indexes according to table structure.
- Developed over 20 SSRS reports with visualization features including (stacked) bar charts, pie charts, line charts, and conditional formatting.
- Applied drill-down, drill-through, cross-tab, and cascading-parameter techniques to highlight general information while keeping detailed records available to users.
- Customized each SSRS report with user-defined parameters, filtering, grouping, and conditional formatting, driven from both free-text entry and drop-down lists.
- Applied SSRS subscriptions to deliver amortization schedule, refinance, and cash-out reports to each mortgage borrower, and compliance reports such as Real Estate Settlement Procedures Act (RESPA) and Home Ownership and Equity Protection Act (HOEPA) reports to different government regulatory institutions, in formats such as Excel, CSV, and PDF on daily, weekly, and monthly schedules.
- Used snapshot and caching options to improve Report Server performance.
- Configured snapshot replication for the reporting, user acceptance test (UAT), and development servers.
- Involved in troubleshooting and performance tuning of different reports in the production environment.
- Developed SSIS packages to extract raw data from the data warehouse, save it as Excel files, and send it to the internal audit department on a daily basis.
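For context, the amortization schedule reports above rest on the standard fixed-rate annuity calculation; the sketch below shows that arithmetic in Python purely for illustration (the production reports were built in T-SQL/SSRS):

```python
def monthly_payment(principal, annual_rate, years):
    # standard fixed-rate mortgage annuity formula
    r = annual_rate / 12
    n = years * 12
    return principal / n if r == 0 else principal * r / (1 - (1 + r) ** -n)

def amortization_schedule(principal, annual_rate, years):
    # yields (period, payment, interest, principal_paid, remaining_balance) rows,
    # splitting each level payment into its interest and principal portions
    r = annual_rate / 12
    payment = monthly_payment(principal, annual_rate, years)
    balance = principal
    schedule = []
    for period in range(1, years * 12 + 1):
        interest = balance * r
        principal_paid = payment - interest
        balance -= principal_paid
        schedule.append((period, round(payment, 2), round(interest, 2),
                         round(principal_paid, 2), round(max(balance, 0.0), 2)))
    return schedule
```

For example, a $100,000 loan at 6% over 30 years carries a payment of about $599.55 per month, with $500.00 of the first payment going to interest.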