Hadoop/Spark Consultant Resume
Dallas, TX
PROFESSIONAL SUMMARY:
- Around 6 years of IT experience in software development and support, with experience developing strategic methods for deploying Big Data technologies to efficiently solve Big Data processing requirements.
- Expertise in Hadoop ecosystem components HDFS, MapReduce, YARN, HBase, Pig, Sqoop, Spark, Spark SQL, Spark Streaming, and Hive for scalability, distributed computing, and high-performance computing.
- Experience in using Hive Query Language for data analytics.
- Capable of processing large sets of structured, semi-structured, and unstructured data and supporting systems application architecture.
- Strong knowledge of NoSQL column-oriented databases such as HBase and Cassandra and their integration with Hadoop clusters.
- Flexible with Unix/Linux and Windows environments, working with operating systems such as CentOS (RHEL), Ubuntu 13/14, and Cosmos.
- Good experience with Kafka and Storm.
- Java developer with extensive experience with various Java libraries, APIs, and frameworks.
- Hands-on development experience with RDBMSs, including writing complex stored procedures and triggers.
- Strong understanding of Agile Scrum and Waterfall SDLC methodologies.
- Strong communication, collaboration, and team-building skills, with proficiency at grasping new technical concepts quickly and utilizing them productively.
- Strong analytical and problem-solving skills.
TECHNICAL SKILLS:
Hadoop/Big Data Technologies: HDFS, MapReduce, Sqoop, Flume, Pig, Hive, Impala, Spark, Kafka, Storm
Databases: Oracle, MySQL; NoSQL: HBase, Cassandra
Monitoring & Reporting: Tableau, Custom Shell Script
Build Tools: Maven, SQL Developer
Programming & Scripting: Java, SQL, Scala, Shell Scripting
Java Technologies: Servlets, Hibernate, Spring, JDBC
Web Dev. Technologies: HTML, XML, JSON, CSS, JavaScript
Operating Systems: Linux, Unix, CentOS, Windows
PROFESSIONAL EXPERIENCE:
Confidential, Dallas, TX
Hadoop/Spark Consultant
Responsibilities:
- Involved in querying and writing data from the OLTP server to the Hadoop file system using Sqoop.
- Optimized existing algorithms in Hadoop using SparkContext, Spark SQL, DataFrames, and pair RDDs.
- Developed Scala scripts and UDFs using both DataFrames/Spark SQL and RDDs/MapReduce in Spark 1.6 for data aggregation and queries, writing data back into the OLTP system through Sqoop (illustrative sketch below).
- Implemented the ELK (Elasticsearch, Logstash, Kibana) stack to collect and analyze the logs produced by the Spark cluster.
- Actively participated in writing Hive and Impala queries to load and process data in the Hadoop file system.
- Developed Spark scripts using Scala shell commands as per requirements.
Environment: Cloudera (CDH 5.4), Hadoop, HBase, HDFS, Spark Core, Spark SQL, Java (JDK 1.7), ELK stack.
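A minimal sketch of the kind of Spark 1.6 Scala aggregation with a UDF described above; the input path, column names, and output path are hypothetical, and the Sqoop export back to the OLTP system would run as a separate step:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.functions.udf

object OrderAggregation {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("OrderAggregation"))
    val sqlContext = new SQLContext(sc)

    // Hypothetical input: order records previously landed on HDFS by Sqoop.
    val orders = sqlContext.read.parquet("/data/orders")

    // Simple UDF normalizing a status code (illustrative only).
    val normalize = udf((s: String) => if (s == null) "UNKNOWN" else s.trim.toUpperCase)

    // DataFrame aggregation; the result set would be exported back to the
    // OLTP system with a Sqoop export in the real pipeline.
    val daily = orders
      .withColumn("status", normalize(orders("status")))
      .groupBy("order_date", "status")
      .count()

    daily.write.parquet("/data/daily_order_counts")
    sc.stop()
  }
}
```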
Confidential, Falls Church, VA
Teaching Associate (Hadoop and Java)
Responsibilities:
- Responsible for teaching the OOAD and Tools and Technology courses and the projects related to those subjects.
- Helped and guided students with any academic problems they came across.
- Introduced students to the Hadoop Distributed File System (HDFS), a distributed system architecture that supports data locality for data-intensive computing, and its trade-offs compared to a separate computation/storage architecture.
- Covered the underlying concepts of the MapReduce programming model and guided students in designing and implementing MapReduce programs to analyze large data sets (word-count sketch below).
- Covered the scalability and performance of MapReduce programs running on HDFS.
Environment: HDFS, MapReduce, Flume, Pig, Sqoop, Hive, HBase.
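A minimal word-count sketch of the MapReduce programming model covered in the course, written in Scala against the Hadoop Java API; the input/output paths and class names are illustrative:

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import org.apache.hadoop.io.{IntWritable, LongWritable, Text}
import org.apache.hadoop.mapreduce.{Job, Mapper, Reducer}
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat
import scala.collection.JavaConverters._

// Mapper: emit (word, 1) for every token in the input split.
class TokenMapper extends Mapper[LongWritable, Text, Text, IntWritable] {
  private val one = new IntWritable(1)
  private val word = new Text()
  override def map(key: LongWritable, value: Text,
                   ctx: Mapper[LongWritable, Text, Text, IntWritable]#Context): Unit =
    value.toString.split("\\s+").filter(_.nonEmpty).foreach { w =>
      word.set(w); ctx.write(word, one)
    }
}

// Reducer: sum the counts emitted for each word.
class SumReducer extends Reducer[Text, IntWritable, Text, IntWritable] {
  override def reduce(key: Text, values: java.lang.Iterable[IntWritable],
                      ctx: Reducer[Text, IntWritable, Text, IntWritable]#Context): Unit =
    ctx.write(key, new IntWritable(values.asScala.map(_.get).sum))
}

object WordCount {
  def main(args: Array[String]): Unit = {
    val job = Job.getInstance(new Configuration(), "word count")
    job.setJarByClass(getClass)
    job.setMapperClass(classOf[TokenMapper])
    job.setReducerClass(classOf[SumReducer])
    job.setOutputKeyClass(classOf[Text])
    job.setOutputValueClass(classOf[IntWritable])
    FileInputFormat.addInputPath(job, new Path(args(0)))
    FileOutputFormat.setOutputPath(job, new Path(args(1)))
    System.exit(if (job.waitForCompletion(true)) 0 else 1)
  }
}
```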
Confidential
Hadoop Developer
Responsibilities:
- Worked on Sqoop to import data from various relational data sources.
- Worked with Flume to bring clickstream data in from front-facing application logs.
- Implemented partitioning, dynamic partitions, and buckets in Hive (partitioning sketch below).
- Exported result sets from Hive to MySQL using shell scripts.
- Used Git for version control.
- Developed Hive queries to process the data and generate data cubes for visualization.
- Developed Kafka producers and consumers, HBase clients, and Spark and Hadoop MapReduce jobs, along with components on HDFS and Hive (producer sketch below).
- Performed data standardization using Pig scripts.
Environment: HDFS, MapReduce, Flume, Pig, Sqoop, Hive, HBase, Shell Scripting.
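A minimal sketch of dynamic partitioning and bucketing in Hive, issued here over the Hive JDBC driver from Scala; the HiveServer2 endpoint, table names, and columns are hypothetical:

```scala
import java.sql.DriverManager

object HivePartitioning {
  def main(args: Array[String]): Unit = {
    Class.forName("org.apache.hive.jdbc.HiveDriver")
    val conn = DriverManager.getConnection("jdbc:hive2://localhost:10000/default", "", "")
    val stmt = conn.createStatement()
    try {
      // Partitioned, bucketed target table (hypothetical schema).
      stmt.execute(
        """CREATE TABLE IF NOT EXISTS clicks_part (user_id STRING, url STRING)
          |PARTITIONED BY (dt STRING)
          |CLUSTERED BY (user_id) INTO 32 BUCKETS
          |STORED AS ORC""".stripMargin)
      // Allow fully dynamic partition inserts.
      stmt.execute("SET hive.exec.dynamic.partition=true")
      stmt.execute("SET hive.exec.dynamic.partition.mode=nonstrict")
      // Dynamic-partition load: dt is taken from the last SELECT column.
      stmt.execute(
        """INSERT OVERWRITE TABLE clicks_part PARTITION (dt)
          |SELECT user_id, url, dt FROM clicks_raw""".stripMargin)
    } finally { stmt.close(); conn.close() }
  }
}
```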
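A minimal Kafka producer sketch in Scala of the kind described above; the broker address, topic name, key, and payload are hypothetical:

```scala
import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

object ClickProducer {
  def main(args: Array[String]): Unit = {
    val props = new Properties()
    props.put("bootstrap.servers", "localhost:9092")
    props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")

    val producer = new KafkaProducer[String, String](props)
    try {
      // Key by user id so one user's events land in the same partition.
      producer.send(new ProducerRecord[String, String]("clickstream", "user-42", """{"url":"/home"}"""))
      producer.flush()
    } finally producer.close()
  }
}
```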
Confidential
Hadoop Developer
Responsibilities:
- Developed multiple MapReduce jobs in Java for data cleaning and preprocessing; involved in importing and exporting data into HDFS and Hive.
- Involved in defining job flows.
- Involved in managing and reviewing Hadoop log files.
- Involved in loading and transforming large sets of structured, semi-structured, and unstructured data.
- Involved in loading data from the UNIX file system to HDFS (loading sketch below).
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
Environment: Java 6 (JDK 1.6), Eclipse, Oracle 10g, Subversion, Hadoop, HDFS, Hive, Cassandra, Linux.
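A minimal sketch of loading files from the UNIX file system into HDFS via the Hadoop FileSystem API from Scala; the local staging directory and HDFS target path are hypothetical:

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

object LoadToHdfs {
  def main(args: Array[String]): Unit = {
    // Picks up core-site.xml/hdfs-site.xml from the classpath.
    val conf = new Configuration()
    val fs = FileSystem.get(conf)
    // Copy files from the local (UNIX) file system into HDFS.
    fs.copyFromLocalFile(new Path("/var/staging/input"), new Path("/data/raw/input"))
    fs.close()
  }
}
```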
Confidential
JAVA/ J2EE Developer
Responsibilities:
- Worked with Java, J2EE, Struts, web services, and Hibernate in a fast-paced development environment.
- Followed agile methodology, interacted directly with the client on features, implemented optimal solutions, and tailored the application to customer needs.
- Involved in design and implementation of web tier using Servlets and JSP.
- Used Apache POI for reading Excel files.
- Developed the user interface using JSP and JavaScript to view all online trading transactions.
- Designed and developed Data Access Objects (DAO) to access the database.
- Used the DAO factory and value object design patterns to organize and integrate the Java objects.
- Coded JavaServer Pages for the dynamic front-end content that uses Servlets and EJBs.
- Coded HTML pages using CSS for static content generation with JavaScript for validations.
- Used the JDBC API to connect to the database and carry out database operations (DAO sketch below).
- Used JSP and JSTL Tag Libraries for developing User Interface components.
- Performed code reviews.
- Performed unit testing, system testing and integration testing.
- Involved in building and deployment of application in Linux environment.
Environment: Java, J2EE, JDBC, Struts, SQL, Hibernate, Eclipse, Apache POI, CSS.
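A minimal DAO sketch over the JDBC API illustrating the data-access pattern described above, shown in Scala for consistency with the other sketches; the value object, table, and connection details are hypothetical:

```scala
import java.sql.{Connection, DriverManager, ResultSet}

// Hypothetical value object carried between the DAO and the web tier.
case class Trade(id: Long, symbol: String, quantity: Int)

// Minimal DAO: one query method behind a plain class boundary.
class TradeDao(url: String, user: String, password: String) {
  private def withConnection[A](f: Connection => A): A = {
    val conn = DriverManager.getConnection(url, user, password)
    try f(conn) finally conn.close()
  }

  def findById(id: Long): Option[Trade] = withConnection { conn =>
    val ps = conn.prepareStatement("SELECT id, symbol, quantity FROM trades WHERE id = ?")
    try {
      ps.setLong(1, id)
      val rs: ResultSet = ps.executeQuery()
      if (rs.next()) Some(Trade(rs.getLong("id"), rs.getString("symbol"), rs.getInt("quantity")))
      else None
    } finally ps.close()
  }
}
```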