
Hadoop/Spark Consultant Resume


Dallas, TX

PROFESSIONAL SUMMARY:

  • Around 6 years of IT experience in software development and support, with experience in developing strategic methods for deploying Big Data technologies to efficiently solve Big Data processing requirements.
  • Expertise in Hadoop ecosystem components HDFS, MapReduce, YARN, HBase, Pig, SQOOP, Spark, Spark SQL, Spark Streaming, and Hive for scalability, distributed computing, and high-performance computing.
  • Experience in using Hive Query Language for data analytics.
  • Capable of processing large sets of structured, semi-structured, and unstructured data and supporting systems application architecture.
  • Strong knowledge of NoSQL column-oriented databases such as HBase and Cassandra and their integration with Hadoop clusters.
  • Flexible with Unix/Linux and Windows environments, working with operating systems such as CentOS (RHEL), Ubuntu 13/14, and Cosmos.
  • Good experience with Kafka and Storm.
  • Java developer with extensive experience with various Java libraries, APIs, and frameworks.
  • Hands-on development experience with RDBMS, including writing complex SQL, stored procedures, and triggers.
  • Strong understanding of Agile (Scrum) and Waterfall SDLC methodologies.
  • Strong communication, collaboration, and team-building skills, with proficiency at grasping new technical concepts quickly and utilizing them productively.
  • Strong analytical and problem-solving skills.

TECHNICAL SKILLS:

Hadoop/Big Data Technologies: HDFS, MapReduce, SQOOP, Flume, Pig, Hive, Impala, Spark, Kafka, Storm

Databases: Oracle, MySQL; NoSQL: HBase, Cassandra

Monitoring & Reporting: Tableau, Custom Shell Scripts

Build Tools: Maven, SQL Developer

Programming & Scripting: Java, SQL, Scala, Shell Scripting

Java Technologies: Servlets, Hibernate, Spring, JDBC

Web Dev. Technologies: HTML, XML, JSON, CSS, JavaScript

Operating Systems: Linux, Unix, CentOS, Windows

PROFESSIONAL EXPERIENCE:

Confidential, Dallas, TX

Hadoop/Spark Consultant

Responsibilities:

  • Involved in querying and moving data from the OLTP server into the Hadoop file system using SQOOP.
  • Optimized existing algorithms in Hadoop using Spark Context, Spark SQL, DataFrames, and pair RDDs.
  • Developed Scala scripts and UDFs using both DataFrames/Spark SQL and RDDs/MapReduce in Spark 1.6 for data aggregation and queries, writing data back into the OLTP system through SQOOP (a minimal sketch follows this list).
  • Implemented the ELK (Elasticsearch, Logstash, Kibana) stack to collect and analyze the logs produced by the Spark cluster.
  • Actively participated in writing Hive and Impala queries to load and process data in the Hadoop file system.
  • Developed Spark scripts using Scala shell commands as per the requirements.
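
A minimal Scala sketch of the Spark 1.6 DataFrame/UDF work described above. The input path, column names, and the specific aggregation are hypothetical, chosen only to illustrate the pattern of reading staged data, applying a UDF, aggregating, and writing results back out for SQOOP export.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.functions._

object SalesAggregation {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("sales-aggregation"))
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._

    // Hypothetical staging data previously imported from the OLTP system via SQOOP.
    val orders = sqlContext.read.parquet("/data/staging/orders")

    // A simple UDF of the kind described: normalize a free-text region code.
    val normalizeRegion = udf((r: String) => if (r == null) "UNKNOWN" else r.trim.toUpperCase)

    // DataFrame aggregation: daily revenue and order counts per region.
    val daily = orders
      .withColumn("region", normalizeRegion($"region"))
      .groupBy($"order_date", $"region")
      .agg(sum($"amount").as("revenue"), count(lit(1)).as("order_count"))

    // Staged for SQOOP to export back into the OLTP system.
    daily.write.mode("overwrite").parquet("/data/export/daily_revenue")
    sc.stop()
  }
}
```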

Environment: Cloudera (CDH 5.4), Hadoop, HBase, HDFS, Spark Core, Spark SQL, Java (JDK 1.7), ELK stack.

Confidential, Falls Church, VA

Teaching Associate (Hadoop and Java)

Responsibilities:

  • Responsible for teaching the OOAD and Tools and Technologies courses and the projects related to those subjects.
  • Helped and guided students with any academic problems they came across.
  • Introduced students to the Hadoop Distributed File System (HDFS), a distributed system architecture that supports data locality for data-intensive computing, and the trade-offs of this architecture compared to separated computation/storage architectures.
  • Taught the underlying concepts of the MapReduce programming model and guided students in designing and implementing MapReduce programs to analyze large data sets (a word-count sketch follows this list).
  • Covered the scalability and performance of MapReduce programs running on HDFS.
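
The word-count job is the canonical illustration of the MapReduce model taught here. Below is a compact sketch against the standard Hadoop MapReduce API, written in Scala to keep this page's examples in one language (coursework of this kind is typically done in Java).

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import org.apache.hadoop.io.{IntWritable, LongWritable, Text}
import org.apache.hadoop.mapreduce.{Job, Mapper, Reducer}
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat

// Map phase: emit (word, 1) for every token in the input split.
class TokenMapper extends Mapper[LongWritable, Text, Text, IntWritable] {
  private val one = new IntWritable(1)
  private val word = new Text()
  override def map(key: LongWritable, value: Text,
                   ctx: Mapper[LongWritable, Text, Text, IntWritable]#Context): Unit =
    value.toString.split("\\s+").filter(_.nonEmpty).foreach { w =>
      word.set(w.toLowerCase)
      ctx.write(word, one)
    }
}

// Reduce phase: sum the counts for each word; also reused as a combiner.
class SumReducer extends Reducer[Text, IntWritable, Text, IntWritable] {
  override def reduce(key: Text, values: java.lang.Iterable[IntWritable],
                      ctx: Reducer[Text, IntWritable, Text, IntWritable]#Context): Unit = {
    var sum = 0
    val it = values.iterator()
    while (it.hasNext) sum += it.next().get()
    ctx.write(key, new IntWritable(sum))
  }
}

object WordCount {
  def main(args: Array[String]): Unit = {
    val job = Job.getInstance(new Configuration(), "word count")
    job.setJarByClass(classOf[TokenMapper])
    job.setMapperClass(classOf[TokenMapper])
    job.setCombinerClass(classOf[SumReducer])
    job.setReducerClass(classOf[SumReducer])
    job.setOutputKeyClass(classOf[Text])
    job.setOutputValueClass(classOf[IntWritable])
    FileInputFormat.addInputPath(job, new Path(args(0)))   // HDFS input directory
    FileOutputFormat.setOutputPath(job, new Path(args(1))) // HDFS output directory
    System.exit(if (job.waitForCompletion(true)) 0 else 1)
  }
}
```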

Environment: HDFS, Map Reduce, Flume, Pig, SQOOP, Hive, HBase.

Confidential

Hadoop Developer

Responsibilities:

  • Worked on SQOOP to import data from various relational data sources.
  • Worked with Flume to bring clickstream data from front-facing application logs.
  • Implemented partitioning, dynamic partitions, and buckets in Hive (a partitioned-table sketch follows this list).
  • Exported result sets from Hive to MySQL using shell scripts.
  • Used Git for version control.
  • Developed Hive queries to process the data and generate data cubes for visualization.
  • Developed Kafka producers and consumers, HBase clients, and Spark and Hadoop MapReduce jobs, along with components on HDFS and Hive (a producer sketch also follows this list).
  • Performed data standardization using Pig scripts.
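
A sketch of the partitioned and bucketed Hive table design described above, submitted here through HiveServer2's JDBC interface from Scala. The host, credentials, and table/column names are hypothetical, and the same statements could equally be run from the Hive shell.

```scala
import java.sql.DriverManager

object HivePartitioning {
  def main(args: Array[String]): Unit = {
    Class.forName("org.apache.hive.jdbc.HiveDriver")
    // Hypothetical HiveServer2 endpoint and credentials.
    val conn = DriverManager.getConnection("jdbc:hive2://hiveserver:10000/default", "etl", "")
    val stmt = conn.createStatement()
    try {
      // Partitioned by day and bucketed by session for efficient sampling and joins.
      stmt.execute(
        """CREATE TABLE IF NOT EXISTS clicks (
          |  session_id STRING, url STRING, ts BIGINT)
          |PARTITIONED BY (event_date STRING)
          |CLUSTERED BY (session_id) INTO 32 BUCKETS
          |STORED AS ORC""".stripMargin)
      // Enable dynamic partitioning so each distinct event_date lands in its own partition.
      stmt.execute("SET hive.exec.dynamic.partition=true")
      stmt.execute("SET hive.exec.dynamic.partition.mode=nonstrict")
      stmt.execute(
        """INSERT OVERWRITE TABLE clicks PARTITION (event_date)
          |SELECT session_id, url, ts, to_date(from_unixtime(ts)) AS event_date
          |FROM raw_clicks""".stripMargin)
    } finally { stmt.close(); conn.close() }
  }
}
```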
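
And a minimal Scala sketch of a Kafka producer of the kind mentioned above, using the standard kafka-clients API. The broker address, topic name, and the idea of keying each record by a tab-delimited session id are assumptions for illustration.

```scala
import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

object ClickstreamProducer {
  def main(args: Array[String]): Unit = {
    val props = new Properties()
    props.put("bootstrap.servers", "broker1:9092") // hypothetical broker
    props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")

    val producer = new KafkaProducer[String, String](props)
    try {
      // Stream log lines from stdin; key each record by its first (session id) field.
      scala.io.Source.stdin.getLines().foreach { line =>
        producer.send(new ProducerRecord("clickstream", line.takeWhile(_ != '\t'), line))
      }
    } finally producer.close()
  }
}
```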

Environment: HDFS, Map Reduce, Flume, Pig, SQOOP, Hive, HBase, Shell Scripting.

Confidential

Hadoop Developer

Responsibilities:

  • Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
  • Involved in importing and exporting data into HDFS and Hive.
  • Involved in defining job flows.
  • Involved in managing and reviewing Hadoop log files.
  • Involved in loading and transforming large sets of structured, semi-structured, and unstructured data.
  • Involved in loading data from the UNIX file system to HDFS (a sketch follows this list).
  • Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
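
A minimal Scala sketch of the UNIX-to-HDFS loading step, using Hadoop's FileSystem API. The local and HDFS paths are hypothetical; the same copy can also be done with `hadoop fs -put`.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

object LocalToHdfs {
  def main(args: Array[String]): Unit = {
    // Picks up fs.defaultFS from core-site.xml on the classpath.
    val fs = FileSystem.get(new Configuration())
    // Hypothetical paths: copy a local staging directory into HDFS.
    fs.copyFromLocalFile(new Path("/var/data/incoming"), new Path("/user/etl/raw"))
    fs.close()
  }
}
```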

Environment: Java 6 (JDK 1.6), Eclipse, Oracle 10g, Subversion, Hadoop, Hive, Cassandra, Linux, HDFS.

Confidential

JAVA/ J2EE Developer

Responsibilities:

  • Worked with Java, J2EE, Struts, web services, and Hibernate in a fast-paced development environment.
  • Followed Agile methodology, interacted directly with the client on features, implemented optimal solutions, and tailored the application to customer needs.
  • Involved in design and implementation of the web tier using Servlets and JSP.
  • Used Apache POI for reading Excel files.
  • Developed the user interface using JSP and JavaScript to view all online trading transactions.
  • Designed and developed Data Access Objects (DAOs) to access the database.
  • Used the DAO Factory and Value Object design patterns to organize and integrate the Java objects (a sketch of the pattern follows this list).
  • Coded Java Server Pages for dynamic front-end content backed by Servlets and EJBs.
  • Coded HTML pages using CSS for static content generation, with JavaScript for validations.
  • Used the JDBC API to connect to the database and carry out database operations.
  • Used JSP and JSTL tag libraries for developing user interface components.
  • Performed code reviews.
  • Performed unit testing, system testing and integration testing.
  • Involved in building and deployment of application in Linux environment.
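
A minimal sketch of the DAO and value-object pattern over plain JDBC described above. The original work was in Java; it is shown in Scala here to keep this page's examples in one language, and the table, columns, and connection details are hypothetical.

```scala
import java.sql.{Connection, DriverManager}

// Hypothetical value object carrying one row of trading data.
case class Trade(id: Long, symbol: String, quantity: Int)

// Data Access Object: hides all JDBC plumbing behind a small, typed interface.
class TradeDao(url: String, user: String, password: String) {
  private def withConnection[A](f: Connection => A): A = {
    val conn = DriverManager.getConnection(url, user, password)
    try f(conn) finally conn.close()
  }

  def findBySymbol(symbol: String): List[Trade] = withConnection { conn =>
    val stmt = conn.prepareStatement(
      "SELECT id, symbol, quantity FROM trades WHERE symbol = ?")
    stmt.setString(1, symbol)
    val rs = stmt.executeQuery()
    val rows = scala.collection.mutable.ListBuffer[Trade]()
    while (rs.next())
      rows += Trade(rs.getLong("id"), rs.getString("symbol"), rs.getInt("quantity"))
    rows.toList
  }
}
```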

Environment: Java, J2EE, JDBC, Struts, SQL, Hibernate, Eclipse, Apache POI, CSS.
