Sr. Big Data Engineer Resume Cincinnati, OH - Hire IT People

SUMMARY

10+ years of IT experience, including 4+ years of working experience as a Big Data Engineer/Data Engineer.
5+ years of involvement in Data Analysis and Data Modeling.
Strong experience architecting highly performant databases using PostgreSQL, PostGIS, MySQL and Cassandra.
Extensive experience in using ER modeling tools such as Erwin and ER/Studio.
Hands-on experience in Normalization (1NF, 2NF, 3NF and BCNF) and Denormalization techniques for effective and optimum performance in OLTP and OLAP environments.
Experience in transferring data from AWS S3 to AWS Redshift using Informatica.
Extensive experience in performing ETL on structured and semi-structured data using Pig Latin scripts.
Expertise in moving structured schema data between Pig and Hive using HCatalog.
Excellent working experience in Scrum/Agile framework and Waterfall project execution methodologies.
Solid knowledge of Data Marts, Operational Data Store (ODS), OLAP, and Dimensional Data Modeling with Ralph Kimball Methodology (Star Schema Modeling, Snowflake Modeling for Fact and Dimension tables) using Analysis Services.
Good understanding and exposure to Python programming.
Strong Experience in working with Databases like Teradata and proficiency in writing complex SQL, PL/SQL for creating tables, views, indexes, stored procedures and functions.
Experience in importing and exporting Terabytes of data between HDFS and Relational Database Systems using Sqoop.
Experienced in configuring and administering the Hadoop Cluster using major Hadoop Distributions like Apache Hadoop and Cloudera.
Solid understanding of the architecture and workings of the Hadoop framework, including the Hadoop Distributed File System (HDFS) and its ecosystem components MapReduce, Pig, Hive, HBase, Flume, Sqoop, and Oozie.
Experience in building highly reliable, scalable Big Data solutions on the Hadoop distributions Cloudera, Hortonworks, and AWS EMR.
Good experience in importing and exporting data from HDFS and Hive into Relational Database Systems like MySQL and vice versa using Sqoop.
Good knowledge of NoSQL databases including HBase, MongoDB, and MapR-DB.
Strong experience and knowledge of NoSQL databases such as MongoDB and Cassandra.
Familiar with Amazon Web Services, including provisioning and maintaining AWS resources such as EMR, S3 buckets, EC2 instances, RDS and others.
Expertise in Data Migration, Data Profiling, Data Cleansing, Transformation, Integration, Data Import, and Data Export through the use of multiple ETL tools such as Informatica PowerCenter.
Experience with Client-Server application development using Oracle PL/SQL, SQL*Plus, SQL Developer, TOAD, and SQL*Loader.
Experienced in Data Analysis and proficient in gathering business requirements and handling requirements management.
Experience in migrating data using Sqoop from HDFS and Hive to Relational Database Systems and vice versa according to client requirements.
Experience with RDBMS like SQL Server, MySQL, Oracle and data warehouses like Teradata and Netezza.
Proficient knowledge of and hands-on experience in writing shell scripts in Linux.

TECHNICAL SKILLS

Data Modeling Tools: Erwin R9.7/9.6, ER Studio V17

Big Data & Hadoop Ecosystem: MapReduce, Spark 2.3, HBase 1.2, Hive 2.3, Pig 0.17, Flume 1.8, Sqoop 1.4, Kafka 1.0.1, Oozie 4.3, Hue, Cloudera Manager, Neo4j, Hadoop 3.0, Apache NiFi 1.6, Cassandra 3.11

RDBMS: Microsoft SQL Server 2017, Teradata 15.0, Oracle 12c, and MS Access

OLAP Tools: Tableau 7, SAP BO, SSAS, Business Objects, and Crystal Reports 9

Reporting Tools: SSRS, Power BI, Tableau, SSAS, MS-Excel, SAS BI Platform.

Cloud Platforms: AWS (EC2, S3, Redshift) & MS Azure

BI Tools: Tableau 10, Tableau server 10, Tableau Reader 10, SAP Business Objects, Crystal Reports

Programming Languages: SQL, PL/SQL, UNIX shell Scripting, R

Operating Systems: Microsoft Windows Vista, 7, 8, and 10, UNIX, and Linux.

Methodologies: Agile, RAD, JAD, RUP, UML, System Development Life Cycle (SDLC), Waterfall Model.

PROFESSIONAL EXPERIENCE

Confidential - Cincinnati, OH

Sr. Big Data Engineer

Responsibilities:

Provided technical expertise in Hadoop technologies as they relate to the development of analytics.
Assisted in leading the plan, build, and run states within the Enterprise Analytics Team.
Solved and supported real business issues using knowledge of the Hadoop Distributed File System and open-source frameworks.
Built data pipelines that enable faster, better, data-informed decision-making within the business.
Identified data within different data stores, such as tables, files, folders, and documents, to create a dataset in a pipeline using Azure HDInsight.
Performed detailed analysis of business problems and technical environments and used the findings in designing the solution and maintaining the data architecture.
Used data integration to manage data with speed and scalability using the Apache Spark engine in Azure Databricks.
Involved in various phases of development; analyzed and developed the system following Agile Scrum methodology.
Designed efficient and robust Hadoop solutions for performance improvement and end-user experiences.
Worked on Hadoop ecosystem implementation and administration, installing software patches along with system upgrades and configuration.
Loaded and transformed large sets of structured, semi-structured and unstructured data using Hadoop/Big Data concepts.
Performed data transformations in Hive and used partitions and buckets for performance improvements.
Continuously monitored and managed data pipeline (CI/CD) performance alongside applications from a single console with Azure Monitor.
Ingested data into HDFS using Sqoop and scheduled an incremental load to HDFS.
Worked with Hadoop infrastructure to store data in HDFS and used Hive SQL to migrate the underlying SQL codebase in Azure.
Extensively involved in writing PL/SQL, stored procedures, functions and packages.
Wrote Pig scripts to generate MapReduce jobs and performed ETL procedures on the data in HDFS.
Created partitioned tables in Hive, designed a data warehouse using Hive external tables, and created Hive queries for analysis.
Developed scripts in Pig for transforming data and extensively used event joins, filtering, and pre-aggregations.
Performed data scrubbing and processing with Apache NiFi and used it for workflow automation and coordination.
Developed Pig scripts for change data capture and delta record processing between newly arrived data and already existing data in HDFS.
Developed simple to complex streaming jobs using Python, Hive and Pig.
Optimized Hive queries to extract the customer information from HDFS.
Involved in scheduling the Oozie workflow engine to run multiple Hive and Pig jobs.
Analyzed partitioned and bucketed data using Hive and computed various metrics for reporting.
Built Azure Data Warehouse Table Data sets for Power BI Reports.
Worked on BI reporting with AtScale OLAP for Big Data.
Developed customized classes for serialization and deserialization in Hadoop.
Analyzed large data sets to determine the optimal way to aggregate and report on them.

Environment: Hive 2.3, Pig 0.17, Python, HDFS, Hadoop 3.0, Azure, NoSQL, Sqoop 1.4, Oozie, Power BI, Agile, OLAP.

Confidential

Big Data Engineer

Responsibilities:

Participated in requirements sessions to gather requirements along with business analysts and product owners.
Followed Agile development methodology as an active member in Scrum meetings.
Involved in the design, development and testing phases of the Software Development Life Cycle (SDLC).
Installed and configured Hive, wrote Hive UDFs, and handled cluster coordination services through ZooKeeper.
Architected, Designed and Developed Business applications and Data marts for reporting.
Involved in different phases of the development life cycle including Analysis, Design, Coding, Unit Testing, Integration Testing, Review and Release as per the business requirements.
Developed Big Data solutions focused on pattern matching and predictive modeling.
Worked toward the project objective of building a data lake as a cloud-based solution in AWS using Apache Spark.
Installed and configured Hadoop Ecosystem components.
Worked on implementation and maintenance of Cloudera Hadoop cluster.
Created Hive external tables to stage data and then moved the data from staging to the main tables.
Implemented the Big Data solution using Hadoop, Hive and Informatica to pull/load the data into HDFS.
Pulled data from the data lake (HDFS) and massaged it with various RDD transformations.
Worked with Kafka and built use cases relevant to the environment.
Developed Oozie workflow jobs to execute Hive, Sqoop and MapReduce actions.
Provided thought leadership for the architecture and design of Big Data Analytics solutions for customers, and actively drove Proof of Concept (POC) and Proof of Technology (POT) evaluations to implement a Big Data solution.
Created integration relational 3NF models that functionally relate to other subject areas, and determined the transformation rules accordingly in the Functional Specification Document.
Responsible for developing a data pipeline using Flume, Sqoop and Pig to extract data from weblogs and store it in HDFS.
Imported the data from different sources like HDFS/HBase into Spark RDD and developed a data pipeline using Kafka and Storm to store data into HDFS.
Documented the requirements, including the available code to be implemented using Spark, Hive, HDFS, HBase and Elasticsearch.
Developed Spark code using Scala for faster testing and processing of data.
Installed and configured Apache Hadoop on multiple nodes on AWS EC2.
Developed Pig Latin scripts to migrate the existing legacy process to Hadoop, with the output data fed to AWS S3.
Collaborated with business users on requirement gathering for building Tableau reports per business needs.
Developed a continuous flow of data into HDFS from social feeds using Apache Storm Spouts and Bolts.
Involved in loading data from Unix file system to HDFS.

Environment: Spark, 3NF, Flume 1.8, Sqoop 1.4, Pig 0.17, Hadoop 3.0, YARN, HDFS, HBase 1.2, Kafka, Scala 2.12, NoSQL, Cassandra 3.11, Elasticsearch, UNIX, ZooKeeper 3.4
