Software Engineer - Big Data Resume
WA
SUMMARY
- 7+ years of professional IT experience, including 3+ years developing big data processing pipelines using Hadoop ecosystem components and Microsoft big data services such as Azure HDInsight and Cosmos.
- 3+ years of data analytics and engineering experience in extracting, processing, and loading data on the big data platforms Hortonworks, Cloudera, and Cosmos.
- Experience in creating end-to-end big data pipelines on Cloudera, Cosmos, Azure HDInsight, Data Lake Store, Data Lake Analytics, Data Factory, and data warehousing.
- Experience in developing big data applications targeting 30-40 TB of semi-structured and structured data per day using Hive, Spark, U-SQL, Scala, Sqoop, Flume, Impala, and Scope.
- Proven ability in data analytics on large datasets involving large numbers of data points.
- In-depth knowledge of Hadoop architecture and its components, such as HDFS, NameNode, DataNode, JobTracker, TaskTracker, and the MapReduce programming paradigm.
- Good understanding of HBase architecture and its components like HCatalog, Region Servers, Meta Server, HFile, WAL, MemStore.
- Proficient in writing interactive ad hoc queries for analyzing large datasets using Spark with Scala.
- Experience with data ingestion using Sqoop and Flume.
- Experience in efficient pipeline scheduling using Oozie workflows and coordinators.
- Experience in implementing custom MapReduce applications and UDFs in Java.
- Experience in creating and maintaining SSAS ROLAP and MOLAP cubes on Azure SQL Server.
- Experience in implementing error handling, exception management, tracing and logging features.
- Strong database design skills; experienced in writing SQL stored procedures, views, and functions in MS SQL Server.
- Experience in tuning SQL queries for performance and in performance testing using SQL DMVs and SQL Server Profiler.
- Experience in implementing RESTful web services using ASP.NET Web API.
- Knowledge of Software Configuration Management processes (daily build, release, and testing methodology) using tools such as Team Foundation Server (TFS) and Git.
- Well versed in both Agile Scrum and Waterfall methodologies.
TECHNICAL SKILLS
Languages: Java, C#, SQL Server 2016, U-SQL, Hive, Impala, Scope, XML, JSON, XSD, JavaScript, JQuery, ASP.NET, WCF, Web API, ADO.NET.
BigData Technologies: Azure HDInsight, Cloudera Hadoop Distribution CDH5, HDP, HiveQL, Oozie, Map-Reduce, Spark, HDFS, YARN, Sqoop, Flume, Avro, Parquet, Cloudera Manager, Zookeeper, HBase, Cosmos, Scope, Azure Data Lake (ADL), ADL Analytics, Azure Data Factory (ADF).
Databases: SQL Server 2016, Azure Data Warehouse, T-SQL, MySQL.
IDE: Visual Studio 2015, Eclipse, LINQPad, SSMS, TFS, GIT, Janus, NUnit, Hue, MS Excel, SVN, VSS, and NetBeans.
Operating Systems: Windows 2012 Server/R2, Linux, UNIX, Mac OS, Windows 7, 8, 10.
PROFESSIONAL EXPERIENCE
Confidential, WA
Software Engineer - Big Data
Responsibilities:
- Developed big data applications that efficiently extract and process raw or structured data into smaller datasets for insights, using HDP on HDInsight clusters and Cosmos.
- Worked extensively on data engineering activities, including data acquisition, processing, and reporting solutions.
- Core functions: creating scripts in HiveQL, Spark Scala, U-SQL/SCOPE, Impala, and Spark SQL, and writing UDFs (processors, reducers, extractors, dynamic functions, dynamic views, views, modules, and inline helper functions in Java and C#).
- Worked on creating HiveQL scripts to process and load data into Hive tables.
- Proficient in designing data distributions for Hive tables using partitioning, distribution, and clustering.
- Created several Hive views that expose sets of data points by joining multiple large datasets as required.
- Proficient in debugging MapReduce queries for performance and suggesting improvements.
- Experienced in pipeline scheduling and debugging using the Oozie workflow engine and web UI.
- Experienced in performing data quality investigations by writing interactive ad hoc queries in the Spark shell.
- Worked on developing Spark applications for scenarios involving iterative queries.
- Imported and exported data between SQL Server and HDFS using Sqoop.
- Mastered data distribution on virtual clusters and the processing of large datasets for telemetry purposes.
- Created and scheduled ADF pipelines to load processed data into SQL Server on ADW.
- Experienced in creating SSAS ROLAP cubes on Azure that query the ADW databases.
- Worked on T-SQL stored procedures, functions, and views to further process the data for visualization and reporting solutions.
- Worked on developing U-SQL scripts and scheduling Azure Data Lake (ADL) Analytics jobs.
- Experience in making design decisions about what to use and when, applying concepts such as stream adjusters and schema adjusters to support backward compatibility of data points or fields.
- Analyzed the runtime of MapReduce jobs for data skew and ensured efficient processing through even distribution of data.
- Worked on implementing custom tools for job, data stream, and data skew monitoring, and on integrating existing monitoring to address critical problems in maintaining the data processing pipeline.
- Worked on implementing a validation pipeline that runs against the production pipeline to ensure pipeline health.
- Developed and maintained Vizfx reports using HTML, JavaScript, and jQuery for partner and internal consumption.
- Experienced in improving software development efficiency using Agile methodologies.
- Experience working closely with customers to understand requirements and document new features.
Environment: HDInsight, Big Data, ADL Analytics, Hive, Spark, Scala, Sqoop, Scope, Ambari, Impala, YARN, SQL 2016, SQL Server Profiler, U-SQL, Azure Data Warehouse SQL, ADL, Cosmos, Azure Data Factory, C# 6.0, SSAS, Excel, Vizfx, jQuery, JavaScript, Visual Studio 2015, SSMS 2016.
Confidential
Hadoop Developer
Responsibilities:
- Involved in data acquisition, data pre-processing, and data exploration for a telecommunications project in Scala.
- As part of data acquisition, used Sqoop to ingest data from the server into Hadoop using incremental import.
- In the pre-processing phase, used Spark to remove missing data and apply data transformations to create new features.
- In the data exploration stage, used Hive and Impala to gain insights into the customer data.
- Experienced in creating Sqoop jobs with incremental load to populate Hive external tables.
- Used Flume, Sqoop, Hadoop, Spark, and Oozie to build the data pipeline.
- Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and processing.
- Imported and exported data to and from HDFS and Hive using Sqoop.
- Experienced in defining job flows.
- Experienced in managing and reviewing Hadoop log files.
- Experienced in running Hadoop streaming jobs to process terabytes of XML data.
- Loaded and transformed large sets of structured, semi-structured, and unstructured data.
- Responsible for managing data coming from different sources.
- Supported MapReduce programs running on the cluster.
- Provided cluster coordination services through ZooKeeper.
- Involved in loading data from the UNIX file system to HDFS.
- Installed and configured Hive and wrote Hive UDFs.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Used Oozie workflows to automate all jobs that pull data from the FTP server and load it into Hive tables.
Environment: Hadoop, Big Data, HDFS, Map-Reduce, Sqoop, Oozie, Spark, Scala, Hive, Flume, Yarn, LINUX, Java, Cloudera Hadoop, Cloudera Manager, Hue.
Confidential
.Net Developer
Responsibilities:
- Involved in development of detailed design document for allocated use cases.
- Involved in developing the application's security component based on groups, users, and roles (role-based security).
- Developed classes in the Business Layer and in Data Access Layer in C#.
- Developed User Interface using ASP.NET.
- Imported and exported metadata documents using SvcUtil.exe.
- Developed and maintained WCF services.
- Integrated content management functionality into a browser-based solution using REST APIs.
- Created Business Logic Layer & Data Access Layers to implement the MVC architecture.
- Involved in development of Business Logic layer using C#.
- Developed Web services using C# and ADO.NET.
- Implemented the client-side validation using validation controls and JavaScript.
- Generated different kinds of ad-hoc reports using SSRS.
- Responsible for authenticating users using forms-based authentication.
- Used Ajax controls extensively.
- Developed basic reports using Crystal Reports and SQL Server reports, and created SQL Server jobs.
- Developed common functions for client-side validation using JavaScript.
- Implemented web services that call core business-layer methods to expose the core functionality, following the SOA pattern.
- Wrote stored procedures and views.
- Involved in Data Modeling of the existing and new application.
- Involved in peer code review, system regression testing, and unit testing using NUnit.
Environment: ASP.NET 3.5/4.0, MVC 5, ASP.NET Web API, AngularJS, C#, Crystal Reports 11.0, SQL Server 2008, IIS, WinForms, HTML, LINQ, SSRS, JavaScript, CSS, XML, XSLT, XPath, Ajax, jQuery, WCF, Visual Studio .NET 2008/2010.
Confidential
.Net Developer
Responsibilities:
- Involved in requirement gathering and extensive interaction with users.
- Used validation controls, web forms, HTML controls, and web controls with cascading style sheets in the user interface of the client application.
- Designed the UX with an innovative personalized view of content, resources, and support by role and activity.
- Developed web forms using standard ASP.NET model of class inheritance.
- Used various validation controls.
- Implemented client-side validation using jQuery, with ASP.NET MVC validation at the controller level.
- Worked on creating RESTful services using ASP.NET Web API.
- Developed RESTful services that expose the set of resources identifying the target of the interaction with their clients.
- Designed and implemented SQL Server 2008 database.
- Developed connectivity with SQL Server 2008 using ADO.NET.
- Programmed database components, including stored procedures, views, functions, and triggers, in SQL Server 2005.
- Designed and developed multi-tier components in C# to implement business rules. Used object-oriented programming techniques to build modules.
- Implemented AJAX-enabled controls for auto-fill text boxes, and UpdatePanels for content that needed re-fetching or postback on certain sections of the page, along with confirmation of data where needed.
- Used session state, view state, cookies, and query strings to persist and transfer data between pages.
- Designed and developed reports using Crystal Reports.
- Developed the Reporting and Analysis module by web-enabling reports, using HTML and ASP.NET to deploy Crystal Reports so that end users can view, print, and export data for analysis.
Environment: ASP.NET, C#, MVC 3, ADO.NET, .NET Framework 3.5/4.0, Web API, RESTful web services, HTML, CSS, JSON, JavaScript, Crystal Reports.
