Cloud Big Data Engineer Resume
Charlotte, NC
PROFESSIONAL SUMMARY:
- IT professional with 7+ years of experience in software design, development, deployment, and maintenance of business applications in the healthcare, insurance, banking and financial services (BFSI), retail, and investment sectors.
- 4 years of experience in the Big Data domain using Hadoop ecosystem tools and Spark APIs.
- Solid understanding of the architecture and workings of the Hadoop framework, including the Hadoop Distributed File System (HDFS) and ecosystem components such as MapReduce, Pig, Hive, HBase, Flume, Sqoop, Hue, Ambari, ZooKeeper, Oozie, Storm, Spark, and Kafka.
- Experienced in building highly reliable, scalable big data solutions on Hadoop distributions including Cloudera, Hortonworks, and AWS EMR.
- Expertise in developing Spark applications with the Spark Core, Spark SQL, and Spark Streaming APIs in Scala and deploying them on YARN in client and cluster mode via spark-submit.
- Created RDDs and Datasets in Scala, applied transformations and actions on them, and integrated the applications with the Spark framework using the SBT and Maven build automation tools.
- Experience using DStreams in Spark Streaming, accumulators, broadcast variables, and the various caching/persistence levels.
- Deep understanding of performance tuning and partitioning for optimizing Spark applications.
- Worked on real-time data integration using Kafka pipelines, Spark Streaming, and HBase.
- In-depth understanding of NoSQL databases such as HBase and MongoDB, including their integration with Hadoop clusters.
- Experience in streaming data ingestion using Kafka and stream processing platforms like Spark Streaming.
- Configured and deployed a multi-node Cloudera Hadoop cluster on Amazon EC2 instances and pseudo-distributed clusters on local Linux machines for proofs of concept (POCs).
- Strong working experience in extracting, wrangling, ingesting, processing, storing, querying, and analyzing structured, semi-structured, and unstructured data.
- Solid understanding of Hadoop MRv1 and MRv2 (YARN) architecture.
- Developed, deployed, and supported several MapReduce applications in Java to handle semi-structured and unstructured data (a minimal sketch appears at the end of this summary).
- Sound knowledge of map-side joins, reduce-side joins, shuffle and sort, the distributed cache, compression techniques, and multiple Hadoop input and output formats.
- Solid experience working with CSV, text, SequenceFile, Avro, Parquet, ORC, and JSON data formats.
- Expertise in working with the Hive data warehouse tool: creating tables, distributing data through static and dynamic partitioning and bucketing, and optimizing HiveQL queries.
- Ingested structured data from SQL Server, MySQL, and Teradata into HDFS, Hive, and HBase using Sqoop.
- Extensive experience performing ETL on structured and semi-structured data using Pig Latin scripts.
- Expertise in moving structured schema data between Pig and Hive using HCatalog.
- Designed and implemented Hive and Pig UDFs in Java for evaluating, filtering, loading, and storing data.
- Experience migrating data with Sqoop from HDFS and Hive to relational database systems and vice versa, per client requirements.
- Experience with RDBMS like SQL Server, MySQL, Oracle and data warehouses like Teradata and Netezza.
- Experienced in working with Amazon Web Services (AWS), using EC2 for compute, S3 for storage, and AWS Service Catalog.
- Experienced with workflow schedulers such as Oozie and monitoring tools such as Hue.
- Proficient knowledge and hands-on experience writing shell scripts in Linux.
- Expertise in core Java packages and object-oriented design.
- Developed core modules in large cross-platform applications using Java, JSP, Servlets, Hibernate, RESTful services, JDBC, JavaScript, XML, and HTML.
- Extensive experience in developing and deploying applications on WebLogic, Apache Tomcat, and JBoss.
- Development experience with RDBMS, including writing SQL queries, views, stored procedures, triggers, etc.
- Strong understanding of Software Development Lifecycle (SDLC) and various methodologies (Waterfall, Agile).
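A minimal sketch of the Java MapReduce pattern referenced in the summary above: a mapper that validates semi-structured, pipe-delimited records and tracks malformed ones with a custom counter instead of failing the job. The record layout, class name, and counter names are illustrative assumptions, not drawn from any specific project.

```java
import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Illustrative mapper: parses pipe-delimited records, emits (recordType, 1),
// and uses a custom counter to track malformed lines rather than failing the job.
public class RecordValidationMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

    // Custom counters surfaced in the job history UI after the run.
    public enum RecordQuality { VALID, MALFORMED }

    private static final IntWritable ONE = new IntWritable(1);
    private final Text recordType = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String[] fields = value.toString().split("\\|");
        if (fields.length < 3 || fields[0].isEmpty()) {
            context.getCounter(RecordQuality.MALFORMED).increment(1);
            return; // skip the bad record and keep processing
        }
        context.getCounter(RecordQuality.VALID).increment(1);
        recordType.set(fields[0]);
        context.write(recordType, ONE);
    }
}
```

Paired with a standard sum reducer and a driver that configures input and output formats, this is the same counter-based record-tracking pattern described under the Cloud Big Data Engineer role below.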
TECHNICAL SKILLS:
Big Data Technologies: HDFS, MapReduce, Hive, Pig, Sqoop, Flume, Oozie, Hue, Ambari, ZooKeeper, Kafka, Apache Spark, Storm
Hadoop Distributions: Cloudera, Hortonworks, Apache, AWS EMR
Languages: C, Java, PL/SQL, Pig Latin, HiveQL, Scala
IDE Tools: Eclipse, NetBeans, IntelliJ, Spring Tool Suite (STS)
Web Technologies: HTML, CSS, JavaScript, XML, JSP, RESTful.
Operating Systems: Windows (XP,7,8,10), UNIX, LINUX, Ubuntu, CentOS
Reporting/ETL Tools: Tableau, Power View for Microsoft Excel, Splunk
Databases: Oracle, SQL Server, MySQL, MS Access, NoSQL databases (HBase, MongoDB)
Build Automation tools: SBT, Ant, Maven
PROFESSIONAL EXPERIENCE:
Confidential, Charlotte, NC
Cloud Big Data Engineer
Responsibilities:
- Working as part of the Big Data analytics team on AWS-based big data infrastructure, including the Hadoop ecosystem, HDFS, Spark, and related cloud services; building data pipelines over gigabytes to terabytes of data and triaging the challenges of manipulating such large datasets.
- Working on a large-scale Hadoop YARN cluster for distributed data processing and analysis using Spark and Hive.
- Used various Spark transformations and actions to cleanse the input data.
- Developed shell scripts to generate Hive CREATE TABLE statements from the data and load the data into the tables.
- Wrote MapReduce jobs using the Java API.
- Responsible for performing extensive data validation using Hive.
- Implemented static and dynamic partitioning and bucketing in Hive for efficient data access (see the Hive sketch following this role's Environment line).
- Involved in designing and developing nontrivial ETL processes within Hadoop using tools like Sqoop and Oozie.
- Used DML statements to perform different operations on Hive Tables.
- Developed Hive queries to create foundation tables from stage data.
- Developed custom Java MapReduce counters to track the records processed by each MapReduce job.
- Built Oozie workflows with Java and shell actions to submit Spark jobs.
- Created dashboards in Splunk using AWS CloudWatch logs.
Environment: Hadoop 2.6, Spark, Scala, Hive, MapReduce, MySQL 8.4, SQL Server 2014/2012, Java, Sqoop, Splunk, JSON and Parquet file formats, AWS S3, EMR, CloudWatch, Service Catalog, Hue, Control-M.
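A minimal sketch of the Hive partitioning and bucketing described above, expressed as HiveQL issued from Java through the HiveServer2 JDBC driver; the connection URL, credentials, table names, and columns are illustrative assumptions, and the same statements can equally be run from the Hive CLI or Beeline.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

// Illustrative loader: creates a partitioned, bucketed Hive foundation table and
// populates it from a staging table with a dynamic-partition insert.
public class HiveFoundationLoad {
    public static void main(String[] args) throws Exception {
        // HiveServer2 URL, user, and table/column names are assumptions for this sketch.
        try (Connection conn = DriverManager.getConnection(
                     "jdbc:hive2://hiveserver:10000/analytics", "etl_user", "");
             Statement stmt = conn.createStatement()) {

            // Allow the partition value to come from the SELECT instead of being hard-coded.
            stmt.execute("SET hive.exec.dynamic.partition=true");
            stmt.execute("SET hive.exec.dynamic.partition.mode=nonstrict");
            stmt.execute("SET hive.enforce.bucketing=true"); // needed on older Hive releases

            stmt.execute("CREATE TABLE IF NOT EXISTS claims_foundation ("
                    + " claim_id STRING, member_id STRING, claim_amount DOUBLE)"
                    + " PARTITIONED BY (load_date STRING)"
                    + " CLUSTERED BY (member_id) INTO 32 BUCKETS"
                    + " STORED AS PARQUET");

            // Dynamic-partition insert: one partition per distinct load_date in the stage data.
            stmt.execute("INSERT OVERWRITE TABLE claims_foundation PARTITION (load_date)"
                    + " SELECT claim_id, member_id, claim_amount, load_date FROM claims_stage");
        }
    }
}
```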
Confidential, Springfield, IL
BigData/Hadoop developer
Responsibilities:
- Involved in creating Hive tables and loading and analyzing data using Hive queries.
- Developed simple to complex MapReduce jobs using Hive and Pig.
- Used the Cloudera QuickStart VM to deploy the cluster.
- Developed workflow in Oozie to automate the tasks of loading the data into HDFS and pre-processing with Pig.
- Mentored the analyst and test teams in writing Hive queries.
- Analyzed the data by performing Hive queries and running Pig scripts to validate data.
- Generated datasets and loaded them into the Hadoop ecosystem.
- Assisted in exporting analyzed data to relational databases using Sqoop.
- Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries, Pig scripts, and Sqoop jobs.
- Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
- Worked with Linux systems and RDBMS database on a regular basis in order to ingest data using Sqoop.
- Used Sqoop, Pig, Hive as ETL tools for pulling and transforming data.
- Developed customized UDFs in Java to extend Hive and Pig Latin functionality (see the UDF sketch following this role's Environment line).
Environment: Cloudera, Hadoop, HDFS, Spark, Oozie, Pig, Hive, MapReduce, Sqoop, MongoDB, Linux, Core Java, SOAP, XML, JMS, JBOSS.
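A minimal sketch of a customized Hive UDF in Java of the kind referenced above; the class name and the normalization rule are illustrative assumptions.

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Illustrative Hive UDF: trims and upper-cases a free-text code column.
public final class NormalizeCode extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null; // preserve NULLs so downstream queries behave as expected
        }
        return new Text(input.toString().trim().toUpperCase());
    }
}
```

Registered from Hive with ADD JAR and CREATE TEMPORARY FUNCTION normalize_code AS 'NormalizeCode', after which it can be used like any built-in function in HiveQL; a Pig EvalFunc is packaged and registered along similar lines.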
Confidential
Hadoop Developer
Responsibilities:
- Installed and configured clusters for application development, along with Hadoop tools such as Hive, Pig, HBase, ZooKeeper, and Sqoop.
- Involved in installing and configuring the Cloudera distribution of Hadoop, CDH 3.x and 4.x.
- Involved in setting up a 50-node Hadoop cluster.
- Developed Hive/Pig scripts.
- Worked on Sqoop and Hive tuning activities.
- Worked on cluster upgrades, commissioning and decommissioning of DataNodes, NameNode recovery, capacity planning, and slot configuration.
- Installed and integrated Oozie with the Hadoop stack to run multiple hive and Pig scripts.
- Involved in creating and maintaining Hive tables, loading data into the tables using Hive queries and MapReduce jobs.
- Handled data loads from different UNIX file systems into HDFS (see the HDFS load sketch following this role's Environment line).
- Customized the SSH settings in the Master node.
- Resolved various platform-related issues faced by users.
Environment: CDH, Hadoop, HDFS, MapReduce, Hive, Pig, Sqoop, Flume, Kerberos, Shell script, UNIX
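A minimal sketch of loading files from a local UNIX file system into HDFS with the Hadoop FileSystem Java API, as referenced above; the NameNode URI and paths are illustrative assumptions, and the same load is often scripted with hdfs dfs -put.

```java
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Illustrative HDFS load: copies a local landing directory into an HDFS staging area.
public class HdfsLoad {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // NameNode URI and directory paths are assumptions for this sketch.
        try (FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:8020"), conf)) {
            fs.copyFromLocalFile(
                    new Path("file:///data/landing/2016-01-01/"),
                    new Path("/user/etl/staging/2016-01-01/"));
        }
    }
}
```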
Confidential
Java Developer
Responsibilities:
- Analyzed, designed, and developed the system to meet the business users' requirements.
- Participated in design reviews of the system to perform object analysis and provide the best possible solutions for the application.
- Implemented the presentation tier using HTML, JSP, Servlets, and AJAX frameworks.
- Used AJAX to implement part of the functionality for the Customer Registration and View Customer Information modules.
- Used JavaScript for client-side validation.
- Implemented the Struts MVC framework for developing the J2EE-based web application.
- Used JDBC to connect to and access the database (see the JDBC sketch following this role's Environment line).
- Used IBM WebSphere to deploy J2EE application components.
- The database tier used SQL Server.
- Developed JUnit test cases.
Environment: Java, JSP, Servlets, Struts, HTML, JavaScript, jQuery, SQL Server, WebSphere MQ, JUnit, XML, AJAX, Windows NT, CVS.
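A minimal sketch of the JDBC data-access pattern used in this role; the connection URL, credentials, and customer table schema are illustrative assumptions, and a container-managed DataSource would typically replace DriverManager in the deployed application.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Illustrative DAO method: looks up a customer name with a parameterized query.
public class CustomerDao {
    // URL, user, and password are placeholders; in practice they come from configuration.
    private static final String URL = "jdbc:sqlserver://dbhost:1433;databaseName=customerdb";

    public String findCustomerName(int customerId) throws SQLException {
        String sql = "SELECT customer_name FROM customer WHERE customer_id = ?";
        try (Connection conn = DriverManager.getConnection(URL, "app_user", "app_password");
             PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setInt(1, customerId);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getString("customer_name") : null;
            }
        }
    }
}
```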