Hadoop/Spark Engineer Resume
Charlotte, NC
SUMMARY:
- Active member of Q&A communities in my free time; reference links as follows:
- Stack Overflow.
- Hortonworks Community.
TECHNICAL SKILLS:
Programming Languages: Python, Scala, SQL, Core Java.
Big Data Technologies: Hadoop, HDFS, MapReduce, Hive, Pig, HBase, Impala, Hue, Sqoop, Spark, Oozie, NiFi, Kafka, Zookeeper, Cloudera, Hortonworks.
Databases: Oracle, MySQL, SQL Server, DB2, Teradata, Cloudant (CouchDB), MongoDB.
Scripting and Query Languages: UNIX shell scripting, SQL, PL/SQL, and HiveQL.
Operating Systems: Windows, UNIX, Linux distributions (CentOS, Ubuntu).
PROFESSIONAL EXPERIENCE:
Confidential, Charlotte, NC
Hadoop/Spark Engineer
Responsibilities:
- Designed and created NiFi flows to pull data from RDBMS, NoSQL, S3, HTTP ports, file, and clickstream sources, and migrated all existing Oozie jobs into NiFi.
- Worked with REST APIs to pull data, process it in near real time, and store it in NoSQL databases (see the polling sketch after this list).
- Handled updates in Hive using transactional (ACID) tables with a MERGE strategy, among other approaches (see the LLAP connector sketch after this list).
- Worked on HBase for real-time lookups, built HBase-Hive mapping tables to access HBase from Hive, and loaded HBase tables into Spark for batch analysis.
- Tuned Hive tables by analyzing running jobs and business use cases so they serve queries faster.
- Designed and implemented PySpark scripts to access HBase and Hive tables and to load files over JDBC (sketched after this list).
- Worked with Hive Interactive (LLAP) and the Spark LLAP connector to access Hive transactional tables from Spark.
- Implemented Spark scripts in Python to rebuild complex Hive jobs for better performance and to meet our SLAs.
- Created monitoring flows to identify failed and long-running jobs, disk usage, memory usage, and file counts in directories, sending alerts once a threshold is exceeded (see the monitoring sketch after this list).
- Created Oozie workflows with Sqoop, Hive, and Shell actions, scheduled via an Oozie coordinator to run incrementally.
- Used incremental and delta imports on SQL Server tables that have no primary keys, importing them into Hive for transformations and aggregations.
- Exported data to RDBMS systems using Sqoop and NiFi.
- Applied partitioning, bucketing, map-side joins, and parallel execution to optimize Hive queries and tables, and worked extensively with Avro and ORC formats (see the table-layout sketch after this list).
- Designed and implemented workload distribution in NiFi using Remote Process Groups for more parallelism.
- Used NiFi to schedule, automate, and monitor Hive, Spark, and Shell scripts.
- Migrated Hive jobs to Spark scripts written in Python so they perform better and faster.
- Worked with Kafka and Spark Streaming for near-real-time analysis (see the streaming sketch after this list).
- Worked in an Agile development environment with two-week sprint cycles, dividing and organizing tasks; participated in daily scrums and other design meetings.
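A minimal sketch of the REST-to-NoSQL polling pattern above, assuming a hypothetical endpoint and MongoDB collection; the URL, database, and field names are illustrative, not the actual project values.

```python
import time
import requests
from pymongo import MongoClient

# Hypothetical endpoint and Mongo settings; adjust for the real source.
API_URL = "https://api.example.com/events"
client = MongoClient("mongodb://localhost:27017")
collection = client["ingest"]["events"]

while True:
    resp = requests.get(API_URL, params={"limit": 500}, timeout=30)
    resp.raise_for_status()
    records = resp.json()               # assumes the API returns a JSON array
    if records:
        collection.insert_many(records) # near-real-time append into NoSQL
    time.sleep(10)                      # poll interval
```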
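The transactional-table access described above, sketched with the Hortonworks Spark LLAP connector (Hive Warehouse Connector); the table and column names are hypothetical, and the connector JAR must be on the Spark classpath.

```python
from pyspark.sql import SparkSession
from pyspark_llap import HiveWarehouseSession   # Hortonworks Spark-LLAP connector

spark = SparkSession.builder.appName("hive-acid-access").getOrCreate()
hive = HiveWarehouseSession.session(spark).build()

# Read an ACID (transactional) Hive table into Spark through LLAP.
orders = hive.executeQuery("SELECT * FROM sales.orders")   # illustrative table
orders.show(10)

# Apply updates with a MERGE against the transactional table.
hive.executeUpdate("""
    MERGE INTO sales.orders AS t
    USING sales.orders_staging AS s
    ON t.order_id = s.order_id
    WHEN MATCHED THEN UPDATE SET status = s.status
    WHEN NOT MATCHED THEN INSERT VALUES (s.order_id, s.status, s.amount)
""")
```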
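A PySpark sketch of the HBase/Hive/JDBC access pattern above; the Hive-on-HBase mapping table, JDBC URL, and credentials are placeholders.

```python
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("hbase-hive-jdbc")
         .enableHiveSupport()
         .getOrCreate())

# HBase rows exposed through a Hive-on-HBase mapping table (names illustrative).
profiles = spark.table("lookup.customer_profiles_hbase")

# Plain Hive table for batch analysis.
txns = spark.table("warehouse.transactions")

# Load a SQL Server table over JDBC; URL and credentials are placeholders.
jdbc_df = (spark.read.format("jdbc")
           .option("url", "jdbc:sqlserver://dbhost:1433;databaseName=sales")
           .option("dbtable", "dbo.customers")
           .option("user", "etl_user")
           .option("password", "***")
           .load())

enriched = txns.join(profiles, "customer_id").join(jdbc_df, "customer_id")
enriched.write.mode("overwrite").saveAsTable("warehouse.enriched_transactions")
```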
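One plausible shape for the job-monitoring flow above, polling the YARN ResourceManager REST API; the host, threshold, and alert hook are assumptions.

```python
import requests

RM_URL = "http://rm-host:8088/ws/v1/cluster/apps"   # placeholder ResourceManager host
MAX_RUNTIME_MS = 2 * 60 * 60 * 1000                 # flag jobs running > 2 hours

def send_alert(message):
    # Placeholder: wire this to email/Slack/pager in a real flow.
    print("ALERT:", message)

resp = requests.get(RM_URL, params={"states": "RUNNING"}, timeout=30)
resp.raise_for_status()
apps = resp.json().get("apps") or {}                # "apps" is null when nothing runs
for app in apps.get("app", []):
    if app["elapsedTime"] > MAX_RUNTIME_MS:
        send_alert(f"Long-running job: {app['name']} ({app['id']})")
```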
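A sketch of the table-layout optimizations above, issued as HiveQL through PyHive; the schema, bucket count, and connection details are illustrative.

```python
from pyhive import hive

conn = hive.Connection(host="hive-host", port=10000, username="etl_user")  # placeholders
cur = conn.cursor()

# Partitioned, bucketed ORC table (names illustrative).
cur.execute("""
    CREATE TABLE IF NOT EXISTS warehouse.events (
        event_id BIGINT,
        user_id  BIGINT,
        payload  STRING
    )
    PARTITIONED BY (event_date STRING)
    CLUSTERED BY (user_id) INTO 32 BUCKETS
    STORED AS ORC
""")

# Session-level knobs that enable map-side joins and parallel stage execution.
cur.execute("SET hive.auto.convert.join=true")
cur.execute("SET hive.exec.parallel=true")
```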
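A near-real-time pipeline sketch using Kafka with Spark Structured Streaming (the original work may have used the DStream API); the broker list and topic name are placeholders, and the spark-sql-kafka package must be on the classpath.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("kafka-stream").getOrCreate()

# Subscribe to a Kafka topic (broker list and topic are placeholders).
stream = (spark.readStream.format("kafka")
          .option("kafka.bootstrap.servers", "broker1:9092")
          .option("subscribe", "events")
          .load())

# Kafka delivers key/value as binary; cast to strings for processing.
events = stream.select(col("key").cast("string"), col("value").cast("string"))

query = (events.writeStream
         .format("console")      # swap for a Hive/HBase sink in production
         .outputMode("append")
         .start())
query.awaitTermination()
```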
Environment: Hadoop, MapReduce, HDFS, Pig, Hive, LLAP, Oozie, S3, Java, UNIX, Spark Streaming, Kafka, Python, HBase, Cloudant, MongoDB, Hortonworks, NiFi.
Confidential
SQL and Hadoop Developer
Responsibilities:
- Developed SQL scripts and stored procedures performing joins, subqueries, nested queries, and Insert/Update/Delete operations on MS SQL Server tables (see the pyodbc sketch after this list).
- Wrote PL/SQL, and developed and implemented stored procedures, packages, and triggers.
- Responsible for designing advanced SQL queries, procedures, cursors, and triggers.
- Built data connections to the database using MS SQL Server.
- Worked on a project to extract data from XML files into SQL tables and generate data-file reports using SQL Server 2008 (see the XML-ingest sketch after this list).
- Installed and configured Hadoop tools using Cloudera Manager.
- Created ingestion scripts using Sqoop, with Oozie to schedule and monitor the jobs.
- Created a Hive UDF to convert timestamps from different time zones into GMT/UTC (see the conversion sketch after this list).
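A small illustration of driving the stored procedures and queries above from Python with pyodbc; the connection string, procedure name, and parameters are hypothetical.

```python
import pyodbc

# Placeholder connection string for a SQL Server instance.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=dbhost;DATABASE=sales;UID=etl_user;PWD=***"
)
cur = conn.cursor()

# Execute a stored procedure that upserts a row (name/params illustrative).
cur.execute("{CALL dbo.usp_upsert_customer (?, ?)}", (42, "Acme Corp"))
conn.commit()

# A join against a subquery, in the style of the reporting scripts.
cur.execute("""
    SELECT c.customer_id, c.name, o.total
    FROM dbo.customers c
    JOIN (SELECT customer_id, SUM(amount) AS total
          FROM dbo.orders GROUP BY customer_id) o
      ON o.customer_id = c.customer_id
""")
rows = cur.fetchall()
```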
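A sketch of the XML-to-SQL extraction above using the standard library's ElementTree with pyodbc; the file name, element layout, and target table are assumptions.

```python
import xml.etree.ElementTree as ET
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=dbhost;DATABASE=reporting;UID=etl_user;PWD=***"
)
cur = conn.cursor()

# Parse the source XML (file name and element layout are illustrative).
tree = ET.parse("orders.xml")
for order in tree.getroot().iter("order"):
    cur.execute(
        "INSERT INTO dbo.orders (order_id, customer, amount) VALUES (?, ?, ?)",
        (order.get("id"), order.findtext("customer"), order.findtext("amount")),
    )
conn.commit()
```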
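The time-zone normalization above, shown here with Spark's built-in to_utc_timestamp as an analogue of the custom Hive UDF (Spark 2.4+ accepts a per-row time-zone column); table and column names are illustrative.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import to_utc_timestamp, col

spark = SparkSession.builder.enableHiveSupport().getOrCreate()

# Illustrative table with a local timestamp and its time-zone name.
events = spark.table("warehouse.events_local")

# Normalize every row to UTC/GMT, mirroring the custom Hive UDF's behavior.
utc = events.withColumn(
    "event_ts_utc",
    to_utc_timestamp(col("event_ts"), col("event_tz"))
)
utc.show(5)
```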
Environment: SQL Server, PL/SQL, MySQL, Visual Studio 2000/2005, Cloudera, Sqoop, Oozie, Hive.