Sr. Big Data Engineer Resume
Bellevue, WA
SUMMARY:
- Over 13 years of experience in the IT industry in various roles: Big Data Engineer, ETL Systems Engineer, and Database Administrator.
- Senior Big Data Engineer and Hadoop Developer.
- Excellent understanding of Hadoop architecture and its components: HDFS, JobTracker, TaskTracker, NameNode, DataNode, and the MapReduce programming paradigm.
- Transform and retrieve data using Spark, Impala, Pig, Hive, SSIS, and MapReduce.
- Stream data from cloud (AWS, Azure) and on-premises sources using Spark and Flume.
- Experience developing custom UDFs in Python to extend Hive and Pig Latin functionality (see the sketch at the end of this summary).
- Import and export data with Sqoop between HDFS and relational database systems.
- Extensive use of the open-source languages Perl, Python, Scala, and Java.
- Excellent knowledge of and extensive experience with NoSQL databases (HBase).
- Experience with Hadoop Streaming, writing MapReduce jobs in Perl and Python as well as Java.
- Extensive use of the WebHDFS REST API.
- Experience automating and building CI/CD pipelines with Jenkins and Chef.
- Develop generic SQL procedures and complex T-SQL statements for report generation.
- Hands-on experience in data modeling with star and snowflake schemas.
- Background as a Business Intelligence Systems Engineer and Database Administrator.
- MCP Database Administrator for SQL Server 2005, 2008, 2008 R2, 2012 and 2014.
- Excellent knowledge of the Business Intelligence tools SSIS, SSAS, SSRS, Informatica, and PowerBI.
- Design and implement data distribution mechanisms on SQL Server (transactional, snapshot, and merge replication; SSIS; DTS).
- Design and implement high availability and disaster recovery systems on SQL Server (Always On, mirroring, and log shipping).
- Hands-on experience with SQL Server failover clustering in an active/passive model.
- Database Backup, Database Restore, Data Recovery and Data Protection on SQL Server.
- SQL Server Capacity planning, Space Management, Data Partition and Data Management.
- Excellent knowledge of database/data warehousing concepts such as normalization, entity-relationship modeling, dimensional data modeling, schemas, and metadata.
- Monitor data activities (database status, logs, space utilization, extents, checkpoints, locks, and long transactions) and apply improvements.
- Excellent knowledge of Confidential Azure services, Amazon Web Services, and their management.
- Side-by-side upgrades, in-place upgrades, and data migration.
- Incident Management, SLA Management, TSG Maintenance and FTM Improvement.
- Effectively plan and manage project deliverables in an on-site/offshore model and improve client satisfaction.
- Responsible for team goal setting, timely feedback, and performance improvement.
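Illustrative of the Python-UDF approach above: Hive's TRANSFORM clause streams rows to a script over stdin/stdout, which is how Python extends Hive without writing a Java UDF. A minimal sketch; the script name, column layout, and normalization rule are hypothetical.

```python
#!/usr/bin/env python
# Hypothetical Hive "UDF" via the TRANSFORM clause: Hive streams rows to
# stdin as tab-separated text and reads transformed rows back from stdout.
# Wired into HiveQL roughly as:
#   ADD FILE normalize_udf.py;
#   SELECT TRANSFORM(user_id, raw_phone)
#     USING 'python normalize_udf.py' AS (user_id, phone)
#   FROM raw_contacts;
import sys

for line in sys.stdin:
    user_id, raw_phone = line.rstrip("\n").split("\t")
    # Keep digits only, e.g. "(425) 555-0100" -> "4255550100".
    phone = "".join(ch for ch in raw_phone if ch.isdigit())
    print("%s\t%s" % (user_id, phone))
```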
TECHNICAL SKILLS:
Big Data: Cloudera, Impala, Pig, Hive, Sqoop, Spark, Oozie, Flume, Hue, HDInsight, Zeppelin, Qubole
BI Tools: SSIS, SSAS, SSRS, Informatica PowerCenter, PowerBI, Zoom-Data
Languages: SQL, Perl, Scala, Java, PowerShell, Python, C#, VB.Net
Server Technologies: ASP.Net, ASP, Hibernate, JSP, JavaScript, XML, JSON
SQL Tools: BCP, TableDiff, DTA, SSIS, SQL Profiler
Operating Systems: Red Hat 7, Ubuntu 12.x, Windows 7/8, Windows Server 2003/2008/2012, CentOS 6.0
Tools: Eclipse, IntelliJ, SQL Developer, Toad, VSTF, GIT, JIRA
DevOps Tools: Jenkins, Chef
Databases: HDFS, HBase, SQL Server, SQL Azure, Oracle
PROFESSIONAL EXPERIENCE:
Confidential, Bellevue, WA.
Sr. Big Data Engineer
Responsibilities:
- Create data pipelines that gather, clean, and optimize data using Hive and Spark.
- Gather third-party vendor data stored in AWS S3, optimize it, and join it with internal datasets to derive meaningful information.
- Combine various datasets in Hive to generate business reports.
- Use partitioning and bucketing in Hive to optimize queries (see the sketch after this list).
- Store data in ORC, Parquet, and Avro file formats with compression.
- Move data between cloud and on-premises Hadoop using DistCp and a proprietary ingest framework.
- Schedule workflows, coordinators, and bundles using Oozie.
- Use the Spark DataFrame API in Scala to analyze data.
- Use a Hadoop-on-cloud service (Qubole) to process data in AWS S3 buckets.
- Program in Java and Scala.
- Use Jenkins and Maven as build tools.
- Build a continuous integration and deployment pipeline with Jenkins and Chef.
- Work in an Agile methodology.
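A minimal sketch of the Hive partitioning and bucketing technique referenced above, issued here through a Hive-enabled PySpark session; the table name, columns, and bucket count are hypothetical.

```python
from pyspark.sql import SparkSession

# Hive-enabled session; assumes Spark is configured against the Hive metastore.
spark = (SparkSession.builder
         .appName("hive-partition-bucket-sketch")
         .enableHiveSupport()
         .getOrCreate())

# Partition by event date so queries filtered on a day scan one directory;
# bucket by user_id so joins and aggregations on user_id avoid full scans.
spark.sql("""
    CREATE TABLE IF NOT EXISTS events_opt (
        user_id BIGINT,
        event_type STRING,
        payload STRING
    )
    PARTITIONED BY (event_date STRING)
    CLUSTERED BY (user_id) INTO 32 BUCKETS
    STORED AS ORC
""")

# Pruned read: only the 2019-01-01 partition directory is touched.
daily = spark.sql("SELECT event_type, COUNT(*) AS cnt FROM events_opt "
                  "WHERE event_date = '2019-01-01' GROUP BY event_type")
daily.show()
```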
Environment: Hadoop, Hortonworks, Spark, Zeppelin, Qubole, Oozie, Hive, Sqoop, AWS, Jenkins, Chef, Red Hat Linux and Teradata.
Confidential, Snoqualmie, WA.
Sr. Big Data Engineer
Responsibilities:
- Design and develop data collectors and parsers in Perl and Python.
- Develop custom UDFs in Python to extend Hive and Pig Latin functionality.
- Design and develop parsers for different file formats (CSV, XML, binary, ASCII, text, etc.).
- Import and export data from various sources via scripts and Sqoop.
- Extensive use of Spark for data streaming and transformation for real-time analytics.
- Extensive use of the WebHDFS REST API from Perl scripts (see the sketch after this list).
- Manage big data in Hive and Impala (tables, partitioning, ETL, etc.).
- Extensive use of Hue and other Cloudera tools.
- Extensive use of the NoSQL database HBase.
- Design and create dimension and fact tables according to the KPIs.
- Design and develop dashboards with KPIs according to the metrics.
- Design and develop dashboards in Zoom-Data, writing complex queries and data aggregations.
- Extensive use of the Cloudera Hadoop distribution.
- Shell programming and crontab automation.
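A minimal sketch of the WebHDFS REST calls referenced above, shown in Python for brevity (the same operations were driven from Perl); the NameNode host, port, user, and paths are hypothetical.

```python
import requests

# Hypothetical NameNode endpoint; 50070 is the classic default WebHDFS port.
WEBHDFS = "http://namenode.example.com:50070/webhdfs/v1"
USER = "hdfs"  # hypothetical run-as user

def list_status(path):
    """LISTSTATUS: directory listing, the WebHDFS analogue of `hdfs dfs -ls`."""
    r = requests.get("%s%s" % (WEBHDFS, path),
                     params={"op": "LISTSTATUS", "user.name": USER})
    r.raise_for_status()
    return r.json()["FileStatuses"]["FileStatus"]

def open_file(path):
    """OPEN: read a file's bytes; WebHDFS redirects to a DataNode,
    which requests follows automatically."""
    r = requests.get("%s%s" % (WEBHDFS, path),
                     params={"op": "OPEN", "user.name": USER})
    r.raise_for_status()
    return r.content

for status in list_status("/data/incoming"):
    print(status["pathSuffix"], status["length"])
```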
Environment: Hadoop, Cloudera, Spark, Zeppelin, Hue, Impala, Pig, Hive, Sqoop, Zoom-Data, Red Hat Linux and Oracle.
Confidential, Redmond, WA.
BI Engineer
Responsibilities:
- Transform raw data into meaningful information to support business decisions.
- Develop, deploy, and troubleshoot ETL workflows using Hive, Pig, and Sqoop.
- Develop MapReduce jobs for data cleanup in Python and C# (see the sketch after this list).
- Design dimensions and facts and model data for reporting purposes.
- Design and implement the data mart and improve its performance.
- Identify reports for decision-making and create dashboards using PowerBI.
- Data analysis and management reporting: generate metrics, data discovery, dashboards, and scorecards.
- Write generic SQL procedures and complex T-SQL statements for report generation.
- Create complex graphical representations of service health for decision making.
- Maintain the HDInsight clusters, troubleshoot issues, and coordinate with partners.
- Prepare metrics on team bandwidth utilization, identify the root causes of spikes, and apply improvements.
- Identify process gaps and define processes to optimize support.
- Extensive use of the Azure Portal, Azure PowerShell, storage accounts, certificates, and Azure data management.
- Excellent knowledge of Confidential cloud technology, components, and subscription management.
- Prepare and present team utilization and environment status metrics in PowerBI, PowerPoint, and SQL Azure.
- Responsible for team goal setting, timely feedback, and performance improvement.
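A minimal sketch of a Python data-cleanup mapper of the kind referenced above, run under Hadoop Streaming with no reduce phase; the schema width and validation rules are hypothetical.

```python
#!/usr/bin/env python
# Hypothetical Hadoop Streaming mapper that drops malformed rows.
# Submitted roughly as:
#   hadoop jar hadoop-streaming.jar \
#     -input /raw/logs -output /clean/logs \
#     -mapper clean_mapper.py -file clean_mapper.py \
#     -numReduceTasks 0
import sys

EXPECTED_FIELDS = 5  # hypothetical schema width

for line in sys.stdin:
    fields = line.rstrip("\n").split("\t")
    # Drop rows with the wrong arity or a non-numeric id in column 0.
    if len(fields) != EXPECTED_FIELDS or not fields[0].isdigit():
        continue
    print("\t".join(f.strip() for f in fields))
```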
Environment: Hadoop, Hortonworks, HDInsight 3.1, YARN, Oozie, Pig, Hive, Sqoop, PowerBI, Azure Storage and SQL Server 2014.
Confidential, Redmond, WA.
ETL Systems Engineer / SQL DBA
Responsibilities:
- Deploy and troubleshoot ETL jobs that use SSIS packages.
- Hands-on experience with data integrity processes and data modeling concepts.
- Manage and troubleshoot the multidimensional data cubes developed in SSAS.
- Manage large data movements with partitioning and data management.
- Back up and restore databases and troubleshoot issues (see the sketch after this list).
- Set up Always On, including the Windows cluster, and troubleshoot issues.
- Set up snapshot and transactional replication in an Always On environment.
- Troubleshoot SQL Server performance issues and optimize T-SQL statements.
- Extensive use of the Business Intelligence tools SSIS, SSAS, and SSRS.
- Manage security through logins, users, permissions, certificates, credentials, and encryption schemes as required.
- SQL Server space, storage, and database management for OLAP systems.
- Extensive use of the SQL Server utilities BCP, TableDiff, DTA, DTEXEC, Profiler, and SQLCMD.
- Migrate databases to the SQL Azure cloud platform and handle the associated performance tuning.
- Build and maintain environments on Azure IaaS and PaaS.
- Lead the team and manage project deliverables in an on-site/offshore model.
- Responsible for team goal setting, timely feedback, and performance improvement.
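A minimal sketch of a scripted full backup of the kind referenced above, issued from Python via pyodbc rather than an agent job; the server, database, and backup path are hypothetical.

```python
import pyodbc

# Hypothetical connection; BACKUP DATABASE cannot run inside a transaction,
# so autocommit must be enabled on the connection.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=sqlprod01;DATABASE=master;Trusted_Connection=yes;",
    autocommit=True)

# Compressed, checksummed full backup that overwrites the target file.
conn.execute("""
    BACKUP DATABASE SalesDW
    TO DISK = N'D:\\Backups\\SalesDW_full.bak'
    WITH COMPRESSION, CHECKSUM, INIT
""")
conn.close()
```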
Environment: SQL Server 2005, 2008 R2 & 2012, IIS, Windows 2012, SSIS, SSAS, SSRS, Entity Framework.
Confidential.
Senior Support Analyst
Responsibilities:
- Resolved issues related to the enterprise data warehouse (EDW) and stored procedures in the OLTP system; analyzed, designed, and developed ETL strategies.
- Identified performance issues in existing sources, targets, and mappings by analyzing data flow and evaluating transformations, then tuned accordingly for better performance.
- Worked with heterogeneous sources, extracting data from Oracle databases, XML, and flat files and loading it into a relational Oracle warehouse.
- Troubleshot standard and reusable mappings and mapplets using transformations such as Expression, Aggregator, Joiner, Router, Lookup (connected and unconnected), and Filter.
- Tuned SQL queries and stored procedures for faster data extraction to resolve and troubleshoot issues in the OLTP environment.
- Troubleshot long-running sessions and fixed the related issues.
- Worked with variables and parameters in mappings to pass values between sessions.
- Developed PL/SQL stored procedures, functions, and packages to process business data in the OLTP system (see the sketch after this list).
- Worked with the Services and Portal teams on various occasions to resolve data issues in the OLTP system.
- Worked with the testing team to resolve bugs in day-one ETL mappings before production.
- Created weekly project status reports, tracked task progress against schedule, and reported risks and contingency plans to management and business users.
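A minimal sketch of exercising a PL/SQL procedure of the kind referenced above from Python with cx_Oracle; the connection details, package, procedure name, and parameters are all hypothetical.

```python
import cx_Oracle

# Hypothetical connection details for the OLTP system.
conn = cx_Oracle.connect("etl_user", "secret",
                         "oltp-host.example.com:1521/ORCL")
cur = conn.cursor()

# Call a hypothetical packaged procedure that processes one day of business
# data and reports the number of rows it touched via an OUT parameter.
rows_processed = cur.var(int)
cur.callproc("pkg_orders.process_daily", ["2011-06-01", rows_processed])
print("rows processed:", rows_processed.getvalue())

conn.commit()
cur.close()
conn.close()
```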
Environment: Informatica PowerCenter 8.6.1, Oracle 11g/10g/9i/8i, PL/SQL, SQL Developer 3.0.1, Toad 11.