Sr. Big Data Engineer Resume
Renton, WA
SUMMARY:
- 11+ years of overall experience in client-server application development and administration of SQL Server databases, including ETL, data warehousing, and reporting, on Linux and Windows Server platforms
- 4+ years of experience in big data analytics using Hadoop, Hive, Azure, Hive SQL, Pig, and Python, with hands-on experience in data ingestion, ETL, data analysis, and data visualization
TECHNICAL SKILLS:
Hadoop/Big Data: HDFS, MapReduce, HBase, Pig, Hive, Sqoop, Oozie, Spark
Languages: Python, Java, SQL, XML, C++, C, WSDL, XHTML, HTML, CSS, JavaScript, AJAX, PL/SQL.
Database Modeling: Logical and physical database design for both transactional and decision-support systems using Erwin.
Data warehouse: Extensive hands-on and managerial experience in the design, construction and deployment of data warehouse systems.
SQL backups: Redgate
Database Tools: SQL Profiler, DTA, Data Transformation Services (DTS), SSIS, Upgrade Wizard, Query Builder, job and task creation and scheduling, BCP, Bulk Insert, XML, OLE DB, ODBC, SQL Server Management Studio, SSRS
Databases: Oracle (10g/9i/8i), SQL Server 2000/2005/2008/2012
Utilities & Tools: SQL*Plus, T-SQL, SSIS, SQL*Loader, Tivoli Monitoring and Lumigent Audit tool
Operating System: Win XP, Win 2003/2008/2012
GUIs & Languages: .NET, PL/SQL, XML, HTML, Oracle Client, Visual Basic, Visual Studio
Modeling Tool: Erwin 3.5/4.1/7.0 and ER Studio
Version Control: VSS
Deployment Tools: Octopus & AutoPilot
PROFESSIONAL EXPERIENCE:
Confidential, Renton, WA
Sr. Big Data Engineer
Responsibilities:
- Gather requirements for various clickstream events for web page clicks
- Collaborate on internal applications and tools for big data development
- Ingest and load data, write Python scripts and modules, and register them as Jython (jar files)
- Prepare storage (Avro) and Pig for data retrieval and aggregations
- Work with large data sets; build metadata in the Azure Data Warehouse environment; dimensional modeling
- Work with SQL Azure Data Warehouse and schedule ‘copy data load’ of on-premises data
- Create and maintain Azure Gateways for Data and deployments, linked services and storage
- Prepare schema and metadata for their internal warehouse
- Prepare tests and system integration for internally developed modules
- Evaluate procedures and process flow for end-to-end solutions
- Wrote ETL scripts using Python modules and APIs
- Performed data modeling (third normal form), designed databases, modeled ETL processes, and designed SSAS cubes
- Develop data pipelines using Azure Data Factory datasets
- Build data pipelines to move on-premises data, using the management gateway and copy data, and deploy them
- Work with developers and partner end users on customer-facing product integration
- Wrote SQL queries using derived tables and joins to extract data from the database.
- Built enterprise-level Power BI dashboards and reports.
- Defined documentation, dimensional modeling, and data mart building from various data sources
- Worked on big data technologies: Hive SQL, Sqoop, Hadoop, and MapReduce
Environment: Hadoop, HDInsight, SQL Server 2005/2008 R2/2012, T-SQL, Python, Hive, SQL, HTML, Big Data, Hive SQL, ETL, Pig, Avro, website APIs
Confidential, Redmond, WA
Sr. Big Data Engineer
Responsibilities:
- Analyze business requirements and pipeline data management
- Produce high-level and detailed designs, and script using Python.
- Design and Develop Parsers for different file formats (CSV, XML, Binary, ASCII, Text, etc.)
- Develop database schemas and objects: stored procedures, functions, and indexes
- Create database objects such as tables, views, functions, stored procedures, and indexes
- Execute parameterized Pig, Hive, Impala, and UNIX batches in production.
- Manage big data in Hive and Impala (tables, partitioning, ETL, etc.).
- Used MapReduce programming
- Extensive use of Jupyter Notebook on Spark clusters
- Build multiple seasonal pattern, peak, and trend analyses (page clicks)
- Simulate and perform predictive analysis of new peak times over a period
- Correlate with past data and do trend analysis
- Use COSMOS and SCOPE scripts to develop streams
- Processed ETL jobs using SSIS
- Design and develop dashboards in Zoomdata and write complex queries.
- Worked on shell programming and crontab automation.
- Use Microsoft R for statistical analytics, including classification and regression analysis
- Monitored system health and logs and responded accordingly to any warning or failure conditions.
- Experience with full development cycle of a Data Warehouse, including requirements gathering, design, implementation, and maintenance.
- Performed testing and bug fixing.
Environment: Hadoop, Hive, Spark, Windows Server 2003/2008/2012, .NET Framework 3.5, Python, Jupyter, Azure
Confidential, Redmond, WA
Sr. Data Engineer/Analyst
Responsibilities:
- Implemented a Spark Streaming framework that processes data from Kafka and performs analytics on top of it.
- Proposed an automated system using shell scripts to run the Sqoop jobs.
- Wrote SharePoint OData service queries, created SharePoint and Excel workbooks, and created SSIS flows to build data pipelines
- Performed data refresh and validation; used Power Pivot and Power Query for end-user reports
- Built cubes, partitioned cubes, tuned performance, created process flags for data availability, and loaded and processed data.
- Developed a strategy for full and incremental loads using Sqoop.
- Mainly worked on Hive queries to categorize data of different claims.
- Integrated the Hive warehouse with HBase
- Handled integrating and testing the data warehouse for small and large applications
- Generated final reporting data using Tableau for testing by connecting to the corresponding Hive tables using the Hive ODBC connector.
Environment: Apache Hadoop, HDFS, Pig, Hive, Java, Sqoop, Cloudera CDH5, Oracle, MySQL, Tableau, Talend, Elasticsearch, Storm, Data governance implementation
Confidential, Redmond, WA
Sr. Database Engineer
Responsibilities:
- Develop data pipelines using Azure Data Factory datasets
- Build data pipelines to move on-premises data, using the management gateway and copy data, and deploy them
- Develop and build models for regression analysis
- Create and instantiate classes and load data to develop models
- Create and manage cluster resource groups from the accounts
- Create clusters and use Microsoft HDInsight
- Run SQL queries and use Jupyter Notebook
- Detect anomalies, send alerts, and register messages or conversations
- Performed security analysis of existing systems
- Monitored system health and logs and responded accordingly to any warning or failure conditions.
- Use the PySpark kernel; load data into Azure ML, create schemas, register DataFrames, and load tables
Environment: Windows Server 2003/2008/2012, RAID, SAN, Hadoop, Python, HDInsight, SQL, HBase, .NET Framework 3.5, SCOM, Octopus, SQL Server 2005/2008 R2/2012
Confidential, Burbank, CA
Sr. Data Engineer/Analyst
Responsibilities:
- Actively involved in T-SQL programming to implement stored procedures, functions, cursors, and views for different tasks
- Designed dynamic SSIS packages to transfer data across different platforms, validate data during transfer, and archive data files for different DBMSs.
- Created cross reference tables and ETL routines to standardize the data coming from multiple sources
- Involved in ETL architecture; developed the source-to-target data mapping (STDM) document defining transformation logic for SAP data
- Translated business logic to transformation logic to generate pseudo code for the ETL process
- Created SSIS packages for data extraction from flat files, Excel files, and OLE DB sources into SQL Server.
- Used transformations like Lookup, Aggregate, Sort, Derived Column, Union All and OLE DB Command to implement the transformation logic and load the target tables.
- Prepare and study existing row-level data for columnar data
- Use and process a variety of warehouse data distributed across its warehouses
- Implement event-triggered actions for warehouse monitoring and update accordingly
- Aggregates
- Develop streaming model
- Handled aggregation challenges: running and tuning the production pipeline for large-volume streams
Environment: Windows Server 2003/2008, ESX hosts
Confidential, Pittsburgh, PA
Sr. Data Engineer/Analyst
Responsibilities:
- Ingest data, integrate and prepare data for partners
- Load data to process for cases and validate its features
- Apply methods and logic to evaluate appropriate values of results
- Validate results with collected field data
- Build model using above statistics for further analysis
- Upgraded from SQL Server 2000 to SQL Server 2005
- Develop plans and manage and execute complex technical efforts
- Build DTS packages, configure linked servers, work with T-SQL, and write stored procedures
- Deployed and managed reports through SQL Server Reporting Services (SSRS).
- Develop Documentation for all Databases
- Optimized data processing for reports
Environment: Microsoft Windows Server 2000/2003, Microsoft SQL Server 2000/2005/2008, Microsoft SQL Server Reporting Services (SSRS), Microsoft SQL Server Integration Services (SSIS), XML, SQL Server Business Intelligence Development Studio.
Confidential, Mt. Laurel, NJ
Database Engineer
Responsibilities:
- Maintained cluster servers to provide high availability.
- Set up database backup and recovery strategies.
- Worked with development teams in the design of physical databases
- Troubleshot queued-updating transactional replication conflicts
- Provided technical support to internal developers and external clients
- Involved in performing Extraction, Transformation and Loading using DTS.
- Provided 24/7 Support for Production, Development & Test Servers of MS SQL Servers.
- Involved in data integration by identifying information needs within and across functional areas of an enterprise database upgrade, and in scripting/data migration with the SQL Server export utility
- Development and implementation of Backup and Recovery strategies
- Supported data integration, integrated PowerShell, and monitored database health
- Optimized the performance of queries with modifications in T-SQL queries, removed unnecessary columns, eliminated redundant and inconsistent data, normalized tables, established joins and created indexes whenever necessary.
- Run and execute SQL jobs
- Planned a complete backup on the database and restored database from disaster recovery.
- Performed day-to-day activities such as disk backups and restores of databases and transaction logs on production/development servers as per business/IT needs.
Environment: SQL Server 2000/2005, SSRS, Oracle 9i, Windows Server 2003, ASP.NET, JavaScript, XML.
Confidential, Lawrenceville, GA
Sr. SQL Developer/DBA
Responsibilities:
- Installed and configured SQL Server 2005 in a Windows environment
- Created Stored Procedures, Triggers, Views and Functions for the Application.
- Designed security for SQL Server and for individual databases.
- Provided the detailed documentation of database structure and source code to Application Developers.
- Created clustered and non-clustered index data structures on database objects.
- Monitored and modified Performance using execution plans and index tuning.
- Monitored SQL Server performance using Profiler to find performance issues and deadlocks.
- Analyzed long-running queries using SQL Server Profiler and tuned them to optimize application and system performance.
- Migrated data from legacy databases in SQL Server 2000 to SQL Server 2005 using SSIS.
- Managed the migration of SQL Server 2000 databases to SQL Server 2005
- Developed, deployed and monitored SSIS Packages including upgrading DTS to SSIS.
- Performed daily tasks including backup and restore by using SQL Server 2005 tools like SQL Management Studio, SQL Server Profiler, SQL Server Agent, and Database Engine Tuning Advisor
- Developed and optimized database structures, stored procedures, views, triggers and user-defined functions.
- Troubleshot system issues, monitored scheduled jobs and set up maintenance plans for proactively monitoring the performance of SQL Server databases
Environment: SQL Server 2005, MS Analysis Services (SSAS), Business Objects, Erwin 4.5, SQL Server Integration Services (SSIS), Transact-SQL, SQL Server Reporting Services (SSRS), SQL Server Management Studio (SSMS), Business Intelligence Development Studio (BIDS), Excel, .NET, XML, Windows XP
Confidential
Database Developer
Responsibilities:
- Responsible for logical and physical data modeling, database design, star schema design, data analysis, programming, documentation and implementation.
- Monitored SQL Server locking and blocking using Profiler to find performance issues and deadlocks.
- Analyzed long-running queries using SQL Server Profiler and tuned them to optimize application and system performance.
- Migrated data from legacy databases in SQL Server 2000 to SQL Server 2005 using SSIS.
- Worked with various business groups while developing their applications, assisting in database design, phasing from development to QA and to Production environment.
- Used DTS to load data from various formats into a warehouse; created and processed cubes.
- Created DTS packages to upload files and databases in various formats to MS SQL Server.
- Scheduled the DTS packages to run at different intervals to load the data into the data warehouse.
- Coded several Stored Procedures, Database Triggers and Packages.
- Involved in component and integration testing of stored procedures. Wrote test conditions and test scripts for all the stored procedures that I built.
- Involved in performance tuning of the database, including index maintenance, optimizing SQL statements, and monitoring the server.
- Created test data according to the business scenarios and for unit testing.
- Worked closely with Production support team and testing team.
Environment: SQL Server 2000/2005, DTS, T-SQL, DB2, ETL, Erwin 4.0, Visual SourceSafe (VSS 6.0), SQL, PL/SQL, SQL Server Analysis Services (OLAP), Win NT/2000
