Senior Data Engineer Resume
NC
SUMMARY:
- Over 10 years of experience in data engineering, including deep expertise in statistical data analysis: transforming business requirements into analytical models, designing algorithms, and building strategic solutions that scale across massive volumes of data.
- Big Data/Hadoop, data analysis, and data modeling professional with applied information technology experience.
- Strong experience working with HDFS, MapReduce, Spark, Hive, Sqoop, Flume, Kafka, Oozie, Pig and HBase.
- IT experience spanning Big Data technologies, Spark, and database development.
- Good experience with Amazon Web Services (AWS) offerings such as EMR and EC2, which provide fast and efficient processing for Teradata big data analytics.
- Expertise in transforming business requirements into analytical models, designing algorithms, building models, and developing data mining, data acquisition, data preparation, data manipulation, feature engineering, machine learning, validation, visualization, and reporting solutions that scale across massive volumes of structured and unstructured data.
- Experience with Hadoop distributions such as Cloudera and Hortonworks.
- Excellent experience designing, developing, documenting, and testing ETL jobs and mappings (Server and Parallel jobs) in DataStage to populate tables in data warehouses and data marts.
- Experience in Apache Spark, Spark Streaming, Spark SQL, and NoSQL databases such as HBase, Cassandra, and MongoDB.
- Establish and execute data quality governance frameworks, including end-to-end processes and data quality checks for assessing decisions that ensure the suitability of data for its intended purpose.
- Expert in designing Server jobs using various types of stages like Sequential file, ODBC, Hashed file, Aggregator, Transformer, Sort, Link Partitioner and Link Collector.
- Proficiency in Big Data Practices and Technologies like HDFS, MapReduce, Hive, Pig, HBase, Sqoop, Oozie, Flume, Spark, Kafka.
- Expert in designing Parallel jobs using various stages like Join, Merge, Lookup, remove duplicates, Filter, Dataset, Lookup file set, Complex flat file, Modify, Aggregator, XML.
- Expertise in configuring monitoring and alerting tools, such as AWS CloudWatch, according to requirements.
- Extensive experience in Text Analytics, generating data visualizations using R, Python and creating dashboards using tools like Tableau.
- Excellent knowledge of analyzing data dependencies using metadata stored in the repository and preparing batches for existing sessions to facilitate scheduling of multiple sessions.
- Utilized analytical applications like SPSS, Rattle and Python to identify trends and relationships between different pieces of data, draw appropriate conclusions and translate analytical findings into risk management and marketing strategies that drive value.
- Extensive experience in loading and analyzing large datasets with Hadoop framework (MapReduce, HDFS, PIG, HIVE, Flume, Sqoop, SPARK, Impala, Scala), NoSQL databases like MongoDB, HBase, Cassandra.
- Integrated Kafka with Spark Streaming for real-time data processing (see the illustrative sketch after this list).
- Skilled in data parsing, data manipulation, and data preparation, including methods for describing data contents.
- Strong experience in the analysis, design, development, testing, and implementation of Business Intelligence solutions using data warehouse/data mart design, ETL, BI, and client/server applications, and in writing ETL scripts using regular expressions and tools such as Informatica, Pentaho, and SyncSort.
- Experienced with the Hadoop ecosystem and Big Data components including Apache Spark, Scala, Python, HDFS, MapReduce, and Kafka.
- Implemented Hadoop based data warehouses, integrated Hadoop with Enterprise Data Warehouse systems.
- Hands on experience with big data tools like Hadoop, Spark, Hive, Pig, Impala, Pyspark, Spark SQL.
- Good knowledge in Database Creation and maintenance of physical data models with Oracle, Teradata, Netezza, DB2, MongoDB, HBase and SQL Server databases.
- Deep understanding of MapReduce with Hadoop and Spark. Good knowledge of Big Data ecosystem like Hadoop 2.0 (HDFS, Hive, Pig, Impala), Spark (SparkSQL, Spark MLLib, Spark Streaming).
- Experienced in writing complex SQL queries, stored procedures, triggers, joins, and subqueries.
- Interpret business problems and provide solutions using data analysis, data mining, optimization tools, machine learning techniques, and statistics.
- Built and supported large-scale Hadoop environments, including design, configuration, installation, performance tuning, and monitoring.
- Good experience with the programming languages Python and Scala.
- Experience with Data Analytics, Data Reporting, Ad-hoc Reporting, Graphs, Scales, PivotTables and OLAP reporting.
- Ability to work with managers and executives to understand the business objectives and deliver as per the business needs and a firm believer in teamwork.
- Good understanding and hands on experience with AWS S3 and EC2.
- Experience and domain knowledge in various industries such as healthcare, insurance, retail, banking, media and technology. Moreover, working closely with customers, cross-functional teams, research scientists, software developers, and business teams in an Agile/Scrum work environment to drive data model implementations and algorithms into practice.
- Excellent track record of building and publishing customized interactive reports and dashboards with custom parameters and user filters, including tables, graphs, and listings, using Tableau.
- Experience in developing Map Reduce Programs using Apache Hadoop for analyzing the big data as per the requirement. Practical understanding of the Data modeling (Dimensional & Relational) concepts like Star-Schema Modeling, Snowflake Schema Modeling, Fact and Dimension tables.
- Strong written and oral communication skills for giving presentations to non-technical stakeholders.
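Illustrative sketch of the Kafka-to-Spark Streaming integration referenced above (a minimal example only; the broker address, topic name, and event schema are hypothetical, not taken from any specific engagement):

```python
# Minimal sketch: consume JSON events from a Kafka topic with Spark Structured
# Streaming and print the parsed records. Requires the spark-sql-kafka connector
# package on the classpath. Broker, topic, and schema below are assumptions.
from pyspark.sql import SparkSession
from pyspark.sql.functions import from_json, col
from pyspark.sql.types import StructType, StringType, DoubleType

spark = SparkSession.builder.appName("kafka-stream-sketch").getOrCreate()

event_schema = (StructType()
                .add("event_id", StringType())
                .add("amount", DoubleType()))

raw = (spark.readStream
       .format("kafka")
       .option("kafka.bootstrap.servers", "broker:9092")  # assumed broker address
       .option("subscribe", "loan_events")                 # assumed topic name
       .load())

parsed = (raw.selectExpr("CAST(value AS STRING) AS json")
          .select(from_json(col("json"), event_schema).alias("e"))
          .select("e.*"))

query = (parsed.writeStream
         .outputMode("append")
         .format("console")
         .start())
query.awaitTermination()
```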
TECHNICAL SKILLS:
Databases: Oracle, MySQL, SQLite, NoSQL, RDBMS, SQL Server 2014, HBase 1.2, MongoDB 3.2, Teradata, Netezza, Cassandra; Alation, Data Governance
Database Tools: PL/SQL Developer, Toad, SQL Loader, Erwin.
Web Programming: HTML, CSS, XML, JavaScript.
Programming Languages: R, Python, SQL, Scala, UNIX, C, Java, Tableau
DWH/BI Tools: DataStage 9.1/11.5, Tableau Desktop
Data Visualization: Dataiku, Tableau 9.4/9.2.
Big Data Frameworks: HDFS, MapReduce, Pig, Hive, Sqoop, Oozie, ZooKeeper, Flume, HBase, Amazon EC2, S3, Redshift, Spark, Storm, Impala, Kafka.
Scheduling Tools: Autosys, Control-M.
Operating Systems: AIX, LINUX, UNIX.
Environment: AWS, Azure, Databricks.
Reporting Tools: SSIS/SSRS/SSAS.
PROFESSIONAL EXPERIENCE:
Confidential, NC
Senior Data Engineer
RESPONSIBILITIES:
- Developed Spark programs with Python, applying functional programming principles to process complex structured data sets.
- Work in a fast-paced agile development environment to quickly analyze, develop, and test potential use cases for the business.
- Responsible for the design and development of high-performance data architectures supporting data warehousing, real-time ETL, and batch big data processing.
- This project focused mainly on reporting detailed commercial loan information to the Federal department while applying Data Governance controls to it.
- Worked with Hadoop infrastructure to store data in HDFS and used Spark/Hive SQL to migrate the underlying SQL codebase to AWS.
- Converted Hive/SQL queries into Spark transformations using Spark RDDs and PySpark.
- Analyzed SQL scripts and designed solutions to implement them using PySpark.
- Exported tables from Teradata to HDFS using Sqoop and built tables in Hive.
- Loaded and transformed large sets of structured, semi structured and unstructured data using Hadoop/Big Data concepts.
- Used Spark SQL to load JSON data, create schema RDDs, and load them into Hive tables, and handled structured data using Spark SQL (see the illustrative PySpark sketch after this list).
- Worked with Hadoop ecosystem and Implemented Spark using Scala and utilized DataFrames and Spark SQL API for faster processing of data.
- Developed Spark code using Scala and Spark-SQL/Streaming for faster processing of data.
- Developed RDDs/DataFrames in Spark and applied multiple transformations to load data from Hadoop data lakes.
- Filtered and cleaned data using Scala code and SQL queries.
- Working experience with financial reports 2052A, FR-Y9C, 14Q, and 10K-Q.
- Involved as primary on-site ETL Developer during the analysis, planning, design, development, and implementation stages of projects using IBM Web Sphere software (Quality Stage v9.1, Web Service, Information Analyzer, Profile Stage)
- Prepared data mapping documents (DMDs) and designed the ETL jobs based on the DMDs with the required tables in the dev environment.
- Implemented Installation and configuration of multi-node cluster on Cloud using Amazon Web Services (AWS) on EC2.
- Designed and developed architecture for data services ecosystem spanning Relational, NoSQL, and Bigdata technologies. Extracted Mega Data from Amazon Redshift, AWS, and Elastic Search engine using SQL Queries to create reports.
- Actively participated in decision-making and QA meetings and regularly interacted with the Business Analysts and development team to gain a better understanding of the business process, requirements, and design.
- Used DataStage as an ETL tool to extract data from source systems and load it into the Oracle database.
- Designed and developed DataStage jobs to extract data from heterogeneous sources, applied transformation logic to the extracted data, and loaded it into data warehouse databases.
- Used Talend for Big data Integration using Spark and Hadoop.
- Created DataStage jobs using different stages like Transformer, Aggregator, Sort, Join, Merge, Lookup, Data Set, Funnel, Remove Duplicates, Copy, Modify, Filter, Change Data Capture, Change Apply, Sample, Surrogate Key, Column Generator, Row Generator, Etc.
- Generated metadata and created Talend ETL jobs and mappings to load the data warehouse and data lake.
- Designed and Developed Real time Stream processing Application using Spark, Kafka, Scala and Hive to perform Streaming ETL and apply Machine Learning.
- Involved in Relational and Dimensional Data modeling for creating Logical and Physical Design of Database and ER Diagrams with all related entities and relationship with each entity based on the rules provided by the business manager using ERWIN r9.6.
- Experienced in developing parallel jobs using various Development/debug stages (Peek stage, Head & Tail Stage, Row generator stage, Column generator stage, Sample Stage) and processing stages (Aggregator, Change Capture, Change Apply, Filter, Sort & Merge, Funnel, Remove Duplicate Stage)
- Extensively worked with Join, Look up (Normal and Sparse) and Merge stages.
- Applied data modeling and data design between the staging and target layers to create views.
- Responsible for loading data into the Enterprise Data Warehouse using the ETL tool IBM DataStage, versions 9.1 and 11.5.
- Responsible for requirements gathering, system analysis, design, development, testing, and deployment; manipulated HTML5 and CSS3 with jQuery and provided dynamic functionality using AJAX, XML, and JSON.
- Extensive knowledge on applying Data Governance controls.
- Worked with the RSR team to accomplish tasks in a timely manner and reported to stakeholders weekly, including presentations.
- Worked on SQL Server concepts: SSIS (SQL Server Integration Services), SSAS (Analysis Services), and SSRS (Reporting Services). Used Informatica, SSIS, SPSS, and SAS to extract, transform, and load source data from transaction systems.
- Wrote scripts against Oracle, SQL Server, and Netezza databases to extract data for reporting and analysis; imported and cleansed high-volume data from sources such as DB2, Oracle, and flat files into SQL Server.
- Evaluated big data technologies and prototyped solutions to improve our data processing architecture; performed data modeling, development, and administration of relational and NoSQL databases (BigQuery, Elasticsearch).
- Utilized Spark, Scala, Hadoop, HBase, Cassandra, MongoDB, Kafka, and Spark Streaming, along with a broad variety of machine learning methods including classification, regression, and dimensionality reduction.
- Used Informatica PowerCenter for ETL (extraction, transformation, and loading) of data from heterogeneous source systems, and studied and reviewed application of the Kimball data warehouse methodology and the SDLC across various industries to work successfully with data-handling scenarios.
- Experience with data analytics, data reporting, ad-hoc reporting, graphs, scales, PivotTables, and OLAP reporting.
- Extensive knowledge of the AQT tool; classified domains in the Enterprise Data Warehouse.
- Worked on ERwin for developing data model using star schema methodologies and collaborated with other data modeling team members to ensure design consistency and integrity.
- Hands-on experience with the Dataiku visualization tool.
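Illustrative PySpark sketch of the Spark SQL JSON-to-Hive loading described above (a minimal example only; the HDFS path, database, table, and column names are assumptions):

```python
# Minimal sketch: load semi-structured JSON with Spark SQL, express a SQL-style
# query as a Spark transformation, and persist the result as a Hive table.
# Paths and table/column names below are illustrative assumptions.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("json-to-hive-sketch")
         .enableHiveSupport()
         .getOrCreate())

# Read JSON from HDFS; Spark infers the schema.
loans = spark.read.json("hdfs:///data/landing/commercial_loans/")  # assumed path

loans.createOrReplaceTempView("loans_stg")

# Example of converting a SQL query into a Spark DataFrame.
summary = spark.sql("""
    SELECT loan_id, SUM(outstanding_balance) AS total_balance
    FROM loans_stg
    GROUP BY loan_id
""")

# Persist to a Hive-managed table for downstream reporting.
summary.write.mode("overwrite").saveAsTable("edw.loan_balance_summary")  # assumed db.table
```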
ENVIRONMENT: IBM InfoSphere DataStage 9.1/11.5, Oracle 11g, flat files, Autosys, UNIX, Erwin, TOAD, MS SQL Server database, XML files, AWS, MS Access database.
Confidential, Plantation, FL
Sr. Data Engineer
RESPONSIBILITIES:
- This project focused on customer clustering. Used the DataStage Director to schedule and run ETL jobs, test and debug their components, and monitor performance statistics.
- Installed Hadoop, Map Reduce, HDFS, AWS and developed multiple Map Reduce jobs in PIG and Hive for data cleaning and pre-processing.
- Architected, designed, and developed business applications and data marts for reporting. Involved in different phases of the development life cycle, including analysis, design, coding, unit testing, integration testing, review, and release as per the business requirements.
- Implemented Spark GraphX application to analyze guest behavior for data science segments.
- Worked on batch processing of data sources using Apache Spark and Elasticsearch.
- Developed Big Data solutions focused on pattern matching and predictive modeling.
- Collaborated with the EDW team on high-level design documents for the extract, transform, validate, and load (ETL) process, including data dictionaries, metadata descriptions, file layouts, and flow diagrams.
- Developed an estimation model for various bundled product and service offerings to optimize and predict gross margin.
- Designed OLTP system environment and maintained documentation of Metadata. Used forward engineering approach for designing and creating databases for OLAP model.
- Explored Spark for improving the performance and optimization of existing algorithms in Hadoop using Spark Context, Spark SQL, DataFrames, Pair RDDs, and Spark on YARN.
- A highly immersive Data Science program involving Data Manipulation & Visualization, Web Scraping, Machine Learning, Python programming, SQL, GIT, Unix Commands, NoSQL, MongoDB, Hadoop.
- Worked on migrating PIG scripts and Map Reduce programs to Spark Data frames API and Spark SQL to improve performance.
- Involved in creating UNIX shell scripts for database connectivity and executing queries in parallel job execution.
- Worked closely with the ETL Developers in designing and planning the ETL requirements for reporting, as well as with business and IT management in the dissemination of project progress updates, risks, and issues.
- Performed scoring and financial forecasting for collection priorities using Python, and SAS.
- Handled importing data from various data sources, performed transformations using Hive, MapReduce, and loaded data into HDFS
- Worked in AWS environment for development and deployment of custom Hadoop applications.
- Managed existing team members and led the recruiting and onboarding of a larger data science team to address analytical knowledge requirements.
- Developed predictive causal model using annual failure rate and standard cost basis for the new bundled services.
- Design and develop analytics, machine learning models, and visualizations that drive performance and provide insights, from prototyping to production deployment and product recommendation and allocation planning.
- Worked with the sales and marketing teams and collaborated with a cross-functional team to frame and answer important data questions.
- Participated in Normalization /De-normalization, Normal Form and database design methodology. Expertise in using data modeling tools like MS Visio and Erwin Tool for logical and physical design of databases.
- Prototyped and experimented with ML algorithms and integrated them into the production system for different business needs.
- Successfully implemented pipeline and partitioning parallelism techniques and ensured load balancing of data. Deployed different partitioning methods like Hash by column, Round Robin, Entire, Modulus, and Range for bulk data loading and for performance boost.
- Worked on multiple datasets containing 2 billion values of structured and unstructured data about web application usage and online customer surveys.
- Designed, built, and deployed a set of Python modeling APIs for customer analytics that integrate multiple machine learning techniques for user behavior prediction and support multiple marketing segmentation programs.
- Designed, developed and maintained data integration programs in Hadoop and RDBMS environment with both RDBMS and NoSQL data stores for data access and analysis.
- Used all major ETL transformations to load the tables through Informatica mappings.
- Created Hive queries and tables that helped line of business identify trends by applying strategies on historical data before promoting them to production.
- Worked on Data modeling, Advanced SQL with Columnar Databases using AWS.
- Extensively used Apache Sqoop for efficiently transferring bulk data between Apache Hadoop and relational databases (Oracle) for product level forecast. Extracted the data from Teradata into HDFS using Sqoop.
- Used classification techniques including Random Forest and Logistic Regression to quantify the likelihood of each user referring (see the illustrative Spark MLlib sketch after this list).
- Designed and implemented end-to-end systems for Data Analytics and Automation, integrating custom visualization tools using R, Tableau, and Power BI.
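Illustrative Spark MLlib sketch of the referral-likelihood classification described above (a minimal example only; the feature table, feature columns, and label column are assumptions):

```python
# Minimal sketch: logistic regression on user features with Spark MLlib to
# score the likelihood of referral. Table and column names are assumptions;
# the label column 'referred' is assumed to be a 0/1 numeric flag.
from pyspark.sql import SparkSession
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.classification import LogisticRegression
from pyspark.ml.evaluation import BinaryClassificationEvaluator

spark = (SparkSession.builder
         .appName("referral-model-sketch")
         .enableHiveSupport()
         .getOrCreate())

df = spark.table("analytics.user_features")  # assumed feature table

assembler = VectorAssembler(
    inputCols=["visits", "tenure_days", "avg_order_value"],  # assumed features
    outputCol="features")

train, test = assembler.transform(df).randomSplit([0.8, 0.2], seed=42)

model = LogisticRegression(labelCol="referred", featuresCol="features").fit(train)

# Evaluate on the held-out split.
auc = BinaryClassificationEvaluator(labelCol="referred").evaluate(model.transform(test))
print(f"Test AUC: {auc:.3f}")
```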
ENVIRONMENT: IBM DataStage, Python, Spark framework, AWS, Redshift, MS Excel, NoSQL, Tableau, T-SQL, ETL, RNN, LSTM, MS Access, XML, MS Office 2007, Outlook, MS SQL Server.
Confidential, San Jose, CA
Sr. Data Engineer
RESPONSIBILITIES:
- Responsible for analyzing large data sets to develop multiple custom models and algorithms to drive innovative business solutions.
- Involved in designing data warehouses and data lakes on both relational (Oracle, SQL Server) and high-performance big data (Hadoop Hive and HBase) databases. Performed data modeling and designed, implemented, and deployed high-performance custom applications at scale on Hadoop/Spark.
- Generated ad-hoc SQL queries using joins, database connections and transformation rules to fetch data from legacy DB2 and SQL Server database systems.
- Translated business requirements into working logical and physical data models for OLTP and OLAP systems.
- Created BTEQ, FastExport, MultiLoad, TPump, and FastLoad scripts for extracting data from various production systems.
- Reviewed stored procedures for reports and wrote test queries against the source system (SQL Server/SSRS) to match the results with the actual report against the data mart (Oracle).
- Performed data profiling and preliminary data analysis; handled anomalies such as missing values, duplicates, outliers, and irrelevant data, using imputation where appropriate. Removed outliers using proximity-distance and density-based techniques.
- Involved in Analysis, Design and Implementation/translation of Business User requirements.
- Used supervised, unsupervised and regression techniques in building models.
- Performed market basket analysis to identify groups of assets moving together and advised the client on their associated risks.
- Developed ETL (Extraction, Transformation and Loading) procedures and Data Conversion Scripts using Pre-Stage, Stage, Pre-Target and Target tables.
- Created data pipelines using state-of-the-art Big Data frameworks and tools.
- Experience extracting appropriate features from datasets in order to handle bad, null, and partial records using Spark SQL.
- Worked on storing DataFrames into Hive as tables using Python (PySpark).
- Experienced in ingesting data into HDFS from relational databases such as Teradata using Sqoop and exporting data back to Teradata for storage.
- Hands on experience in developing apache SPARK applications using Spark tools like RDD transformations, Spark core, Spark MLlib, Spark Streaming and Spark SQL.
- Experience in developing various Spark applications using the Spark shell (Scala).
- Implemented Partitioning, Dynamic Partitions, Buckets in Hive
- Executed Hive queries on ORC tables stored in Hive to perform data analysis to meet the business requirements
- Used Spark to get data from HDFS, process it, and store it back into HDFS.
- Developed a Python script to load CSV files into S3 buckets; created AWS S3 buckets, performed folder management in each bucket, and managed logs and objects within each bucket (see the illustrative boto3 sketch after this list).
- Created Airflow Scheduling scripts in Python to automate the process of Sqooping wide range of data sets.
- Involved in file movements between HDFS and AWS S3 and extensively worked with S3 bucket in AWS
- Developed Spark SQL scripts using Python for faster data processing
- Used Sqoop to extract data from the warehouse and SQL Server and load it into Hive.
- Used Spark framework to transform the data for final consumption of analytical applications
- Involved in scheduling Oozie workflow engine to run multiple Hive jobs, developed workflow in Oozie to automate the tasks of loading the data into HDFS and pre-processing with Spark
- Performed Exploratory Data Analysis using R. Also involved in generating various graphs and charts for analyzing the data using Python Libraries.
- Implemented dynamic SQL Server functionality on the website using the SQL Developer tool; gained experience with continuous integration and automation using Jenkins; implemented Service-Oriented Architecture (SOA) using JMS for sending and receiving messages while creating web services.
- Involved in the execution of multiple business plans and projects, ensuring business needs were met and interpreting data to identify trends that carry across future data sets.
- Simultaneously worked on a pilot project to move the environment to Amazon EMR, a cloud-based Hadoop distribution, and other available Amazon cloud solutions.
- Developed interactive dashboards, created various Ad Hoc reports for users in Tableau by connecting various data sources.
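Illustrative boto3 sketch of the CSV-to-S3 loading and folder management described above (a minimal example only; the bucket name, local directory, and key layout are assumptions):

```python
# Minimal sketch: upload local CSV extracts to an S3 bucket with boto3, keeping
# a simple folder-per-dataset layout. Bucket, directory, and key layout are
# illustrative assumptions, not details from any specific engagement.
import os
import boto3

BUCKET = "company-datalake-landing"   # assumed bucket name
LOCAL_DIR = "/data/exports"           # assumed local export directory

s3 = boto3.client("s3")

for name in os.listdir(LOCAL_DIR):
    if not name.endswith(".csv"):
        continue
    dataset = name.split("_")[0]          # e.g. orders_2020.csv -> orders
    key = f"raw/{dataset}/{name}"         # folder-per-dataset prefix in the bucket
    s3.upload_file(os.path.join(LOCAL_DIR, name), BUCKET, key)
    print(f"uploaded s3://{BUCKET}/{key}")
```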
ENVIRONMENT: Python, SQL server, Hadoop, HDFS, HBase, MapReduce, Hive, Impala, Pig, Sqoop, Mahout, LSTM, RNN, Spark MLLib, MongoDB, AWS, Tableau, Unix/Linux.
Confidential, Phoenix, AZ
Data Engineer
RESPONSIBILITIES:
- Involved in Analysis, Design and Implementation/translation of Business User requirements.
- Worked on collecting large data sets using Python scripting and Spark SQL.
- Worked on large sets of Structured and Unstructured data.
- Worked on creating DL algorithms using LSTM and RNN.
- Actively involved in designing and developing data ingestion, aggregation, and integration in Hadoop environment.
- Developed Sqoop scripts to import and export data from relational sources and handled incremental loading of customer and transaction data by date.
- Developed SQOOP scripts to migrate data from Oracle to Big data Environment.
- Extensively worked with Avro and Parquet files and converted data between the two formats; parsed semi-structured JSON data and converted it to Parquet using DataFrames in Spark (see the illustrative PySpark sketch after this list).
- Converted all Hadoop jobs to run in EMR by configuring the cluster according to the data size
- Implemented Spring Security to guard against SQL injection and manage user access privileges; used various Java/J2EE design patterns such as DAO, DTO, and Singleton.
- Experience in creating Hive Tables, Partitioning and Bucketing.
- Performed data analysis and data profiling using complex SQL queries on various source systems including Oracle 10g/11g and SQL Server 2012.
- Identified inconsistencies in data collected from different sources.
- Participated in requirement gathering and worked closely with the architect in designing and modeling.
- Designed object model, data model, tables, constraints, necessary stored procedures, functions, triggers, and packages for Oracle Database.
- Wrote Spark applications for Data validation, cleansing, transformations and custom aggregations.
- Imported data from various sources into Spark RDD for processing.
- Developed custom aggregate functions using Spark SQL and performed interactive querying.
- Worked on installing cluster, commissioning & decommissioning of Data node, Name node high availability, capacity planning, and slots configuration.
- Developed Spark applications for the entire batch processing by using Scala.
- Automatically scaled up EMR instances based on the data volume.
- Stored the time-series transformed data from the Spark engine built on top of a Hive platform to Amazon S3 and Redshift.
- Facilitated deployment of a multi-clustered environment using AWS EC2 and EMR, apart from deploying Docker containers for cross-functional deployment.
- Visualized the results using Tableau dashboards; the Python Seaborn library was used for data interpretation in deployment.
- Created PDF reports using Golang and XML documents to send to all customers at the end of each month.
- Worked with business owners/stakeholders to assess Risk impact, provided solution to business owners.
- Experienced in determining trends and significant data relationships using advanced statistical methods.
- Carried out specified data processing and statistical techniques such as sampling, estimation, hypothesis testing, time series, correlation, and regression analysis using R.
- Applied various data mining techniques: Linear Regression & Logistic Regression, classification, clustering.
- Took personal responsibility for meeting deadlines and delivering high quality work.
- Strived to continually improve existing methodologies, processes, and deliverable templates.
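Illustrative PySpark sketch of parsing semi-structured JSON and persisting it as Parquet, as described above (a minimal example only; the paths, timestamp field, key column, and partition column are assumptions):

```python
# Minimal sketch: read semi-structured JSON, derive a partition date, drop
# duplicate records, and write the result as partitioned Parquet with Spark
# DataFrames. Paths and column names are illustrative assumptions.
from pyspark.sql import SparkSession
from pyspark.sql.functions import to_date, col

spark = SparkSession.builder.appName("json-to-parquet-sketch").getOrCreate()

events = spark.read.json("hdfs:///data/raw/transactions/")   # assumed source path

cleaned = (events
           .withColumn("txn_date", to_date(col("txn_ts")))   # assumed timestamp field
           .dropDuplicates(["txn_id"]))                      # assumed key column

(cleaned.write
 .mode("overwrite")
 .partitionBy("txn_date")
 .parquet("hdfs:///data/curated/transactions_parquet/"))     # assumed target path
```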
ENVIRONMENT: R, SQL server, Oracle, HDFS, HBase, AWS, MapReduce, Hive, Impala, Pig, Sqoop, NoSQL, Tableau, RNN, LSTM, Unix/Linux, Core Java.
Confidential
Data Analyst/Engineer
RESPONSIBILITIES:
- Worked on different data flow and control flow tasks, For Loop containers, Sequence containers, Script tasks, Execute SQL tasks, and package configuration.
- Created new procedures to handle complex business logic and modified existing stored procedures, functions, views, and tables for new project enhancements and to resolve existing defects.
- Loaded data from various sources such as OLE DB and flat files into the SQL Server 2012 database using SSIS packages, and created data mappings to load the data from source to destination.
- Created batch jobs and configuration files to create automated process using SSIS.
- Created SSIS packages to pull data from SQL Server and exported to Excel Spreadsheets and vice versa.
- Built SSIS packages to fetch files from remote locations such as FTP and SFTP, decrypt them, transform them, load them into the data warehouse and data marts, and provide proper error handling and alerting.
- Extensive use of Expressions, Variables, and Row Count in SSIS packages.
- Performed data validation and cleansing of staged input records before loading into the data warehouse.
- Automated the process of extracting various files, such as flat and Excel files, from sources like FTP and SFTP (Secure FTP).
- Deploying and scheduling reports using SSRS to generate daily, weekly, monthly and quarterly reports.
ENVIRONMENT: MS SQL Server 2005 & 2008, SQL Server Business Intelligence Development Studio, SSIS-2008, SSRS-2008, Report Builder, Office, Excel, Flat Files, .NET, T-SQL.