Big Data Engineer Resume
North Kansas City, MO
SUMMARY:
- Extensive experience in the IT industry with expertise in Big Data
- Worked in the healthcare & marketing industries
- Worked on several Hadoop distribution platforms, including Cloudera CDH and Hortonworks HDP
- Experienced in programming languages, such as Java, Python, Scala, and SQL
- Proficient in utilizing Big Data Ecosystem including Hadoop, HDFS, YARN, MapReduce, Spark, Hive, Impala, HBase, Sqoop, Flume, Kafka, Oozie and Zookeeper
- Experienced with real-time data processing mechanisms in the Big Data ecosystem, such as Apache Kafka and Spark Streaming
- Proficient in programming Scala to analyze large datasets using Spark Streaming
- Experienced in writing HiveQL and developing Hive UDFs in Java to process and analyze data
- Adept at using Sqoop to migrate data between RDBMS, NoSQL and HDFS
- Worked with RDBMS including MySQL & Oracle SQL
- Worked with NoSQL databases including HBase and Cassandra
- Performed data visualization with Tableau
- Worked with Windows & Linux operating systems for development
- Excellent knowledge of Linux/Unix Shell Commands
- Excellent knowledge of Unit Testing with Pytest, ScalaCheck, ScalaTest, JUnit and MRUnit
- Familiar with software development techniques such as Agile/Scrum and Waterfall
- Involved in building and evolving a reporting framework on top of the Hadoop cluster to facilitate data mining, analytics, and dashboarding
- Support a wide variety of ad-hoc data needs
- Strong ability to prepare and present data in an easy-to-understand and visually appealing manner
- Build high volume real-time data processing applications using Hadoop platform
- Experienced in cloud platforms such as Amazon Web Services (AWS)
- Experience working with large-scale databases such as Oracle 11g and DB2, and with data sources such as XML and MS Excel
- Demonstrated ability to communicate and gather requirements, partner with enterprise architects, business users, analysts and development teams to deliver rapid iterations of complex solutions
- Excellent teamwork, communication, and leadership skills
TECHNICAL SKILLS:
Programming Languages: Java 8, Python 2.7/3.7, Scala, R, MATLAB, and Arduino IDE
Operating Systems: Linux, Windows, and MacOS
Tools: Tableau, Plotly, Microsoft Office (Word, Excel with macros, PowerPoint), and Putty
Big Data Ecosystem: Apache Hadoop 2.5, Spark 1.6/2.3, MapReduce, Hive, HDFS, Kafka, Pig, Oozie, Sqoop
Database Technologies: MySQL, Microsoft Access, Hive, HBase, and Cassandra
Web Technologies: HTML 5, CSS, and JavaScript
PROFESSIONAL EXPERIENCE:
Confidential, North Kansas City, MO
Big Data Engineer
Responsibilities:
- Worked with the Data Science team to gather requirements for various data mining projects.
- Loaded and transformed large sets of structured and semi-structured data
- Utilized Spark SQL to extract and process data, parsing it into Datasets or RDDs in HiveContext and applying transformations and actions (map, flatMap, filter, reduce, reduceByKey)
- Enhanced and optimized production Spark code to aggregate, group, and run data mining tasks using the Spark framework
- Monitored and tuned Spark jobs running on the cluster
- Worked on Flume for efficiently collecting, aggregating and moving large amounts of data
- Installed and configured Zookeeper for Hadoop cluster
- Worked on setting up high availability for cluster and designed automatic failover using Zookeeper
- Worked with application teams to install Hadoop updates, patches, version upgrades, and operating system upgrades as required
Environment: Hadoop, HDFS, Spark 1.6.2, Flume 1.5.0, Zookeeper, Cloudera, MySQL 5.6, Putty, Eclipse
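The RDD-style transformations named above (map, flatMap, filter, reduceByKey) can be sketched locally in plain Python, without a Spark cluster; the log records and field layout here are hypothetical, not taken from the actual project:

```python
# Local, cluster-free sketch of RDD-style transformations
# (flatMap, filter, map, reduceByKey); all data is hypothetical.
lines = ["u1 click view", "u2 view", "u3 click", "u1 click"]

tokens = [t for line in lines for t in line.split()]   # flatMap: line -> tokens
events = [t for t in tokens if not t.startswith("u")]  # filter: drop user ids
pairs = [(t, 1) for t in events]                       # map: token -> (key, 1)

counts = {}                                            # reduceByKey: sum per key
for key, n in pairs:
    counts[key] = counts.get(key, 0) + n
print(counts)
```

In Spark the same pipeline would distribute each step across partitions; the shuffle happens at the reduceByKey stage.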
Confidential, Melbourne, FL
Project Owner / Developer
Responsibilities:
- Developed a processing pipeline including transformations, estimations, evaluation of analytical models
- Performed pre-processing on the dataset prior to training, including standardization and normalization
- Built models by implementing a recurrent neural network and training it on the dataset
- Evaluated model accuracy by dividing data into training and test datasets and computing metrics using evaluators
- Tuned training hyper-parameters by integrating cross-validation into pipelines
- Troubleshot and tuned machine learning algorithm in Spark
Environment: Spark 1.6.2, Spark MLlib, Spark ML, Hive 1.2.1, Flume 1.5.0, HBase 1.1.4, MySQL 5.6, Scala, Shell Scripting, Tableau 9.2
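The train/test evaluation step described above can be illustrated with a minimal stdlib-only Python sketch (the real project used Spark ML evaluators); the data, the threshold model, and the 80/20 split are assumptions for illustration:

```python
import random

# Hypothetical labeled dataset: (feature, label) pairs, label = feature > 0.5
random.seed(42)
data = [(x, int(x > 0.5)) for x in (random.random() for _ in range(100))]

# Divide data into training and test sets, as in the evaluation step above
random.shuffle(data)
train, test = data[:80], data[80:]

# A trivial stand-in "model": predict 1 when the feature exceeds a threshold
threshold = 0.5
predict = lambda x: int(x > threshold)

# Accuracy metric computed on the held-out test set only
correct = sum(predict(x) == y for x, y in test)
accuracy = correct / len(test)
```

Cross-validation, as mentioned in the tuning bullet, would repeat this split k times and average the metric across folds.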
Confidential, Melbourne, FL
Data Analyst
Responsibilities:
- Developed data pipeline using Kafka, Hive and Spark to ingest data into HDFS for analysis
- Involved in creating Hive Tables, loading them with data and writing hive queries for data analysis
- Worked with NoSQL databases like HBase, creating HBase tables to load large sets of semi-structured data coming from various sources
- Explored Spark for improving the performance and optimization of the existing algorithms in Hadoop
- Imported data from different sources like HDFS and HBase into Spark RDDs
- Involved in converting Hive/SQL queries into Spark transformations using Spark RDD and Scala
- Developed functional programs in Scala for connecting the streaming data application and gathering web data
- Configured connectivity between Hive and Tableau using Impala as the BI development tool
Environment: Java 8, Scala, Apache Hadoop 2.5, Kafka, Spark, Hive, HDFS, YARN, MySQL 5.7, Tableau, and Microsoft Excel 2016
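The Hive-to-Spark conversion mentioned above (rewriting a SQL aggregate as key-value transformations) can be sketched in plain Python; the table rows and column names here are hypothetical. The logic mirrors rewriting "SELECT dept, AVG(salary) FROM emp GROUP BY dept" as a per-key combine:

```python
# Hypothetical rows mirroring a Hive table (dept, salary)
rows = [("eng", 100), ("eng", 120), ("ops", 80)]

# Per-key (sum, count) combine, as aggregateByKey would do across partitions
sums = {}
for dept, salary in rows:
    total, count = sums.get(dept, (0, 0))
    sums[dept] = (total + salary, count + 1)

# Final map: (sum, count) -> average per department
averages = {dept: total / count for dept, (total, count) in sums.items()}
```

Carrying (sum, count) pairs rather than raw averages is what makes the combine associative, so it can run distributed before a single final division.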
Confidential, Melbourne, FL
IT Engineer
Responsibilities:
- Managed a project that improved total user activity within the mobile application from 65% to 75% by designing a gamification model integrating elements such as VIP perks, points, badges, user competitions, and leaderboards
- Developed complete end-to-end big data processing in the Hadoop ecosystem
- Exported user engagement data from an RDBMS (MySQL) to CSV files for each month
- Preprocessed user data to detect missing or unrelated records
- Wrote Hive queries for data analysis, filtering out the required data for further processing
- Visualized the data using Tableau software and Microsoft Excel
- Exported results into relational databases and used Tableau to create visualizations and generate reports for the BI team
- Communicated deliverables to the BI team at periodic review meetings
Environment: Java 8, Apache Hadoop 2.5, Apache Sqoop, Hive, HDFS, MySQL 5.6, Tableau, and Microsoft Excel 2015
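The preprocessing step above (detecting missing data in monthly CSV exports) can be sketched with the Python standard library; the column names and sample rows are assumptions, not the project's real schema:

```python
import csv
import io

# Hypothetical monthly engagement export; column names are illustrative only
raw = """user_id,sessions,last_active
u1,12,2015-01-30
u2,,2015-01-12
u3,7,
"""

rows = list(csv.DictReader(io.StringIO(raw)))

# Flag rows with any empty field, as in the preprocessing step above
incomplete = [r["user_id"] for r in rows if not all(r.values())]
```

In practice the flagged rows would be dropped or imputed before the Hive analysis stage.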
Confidential, Grand Rapids, MI
Data Analyst
Responsibilities:
- Handled extraction of data from different databases and transferring into HDFS using Sqoop
- Loaded and transformed large sets of structured and semi-structured data
- Performed transformations using Hive and loaded data into HDFS for aggregations
- Utilized SparkSQL to extract and process data
- Enhanced and optimized production Spark code to aggregate, group, and run data mining tasks using the Spark framework
- Built dynamic dashboards from spreadsheet data within Tableau
Environment: Java 8, Apache Hadoop 2.5, Apache Sqoop, Hive, HDFS, MySQL 5.6, Tableau, and Microsoft Excel 2014