Hadoop Spark Developer Resume
Irvine, CA
SUMMARY
- 7+ years of experience in analysis, architecture, design, development, testing, maintenance, and user training of software applications, including over 3 years in Big Data, Hadoop, and HDFS environments, plus experience in Java/J2EE.
- Hands-on experience with Hadoop (HDFS, MapReduce, Pig, Hive, Sqoop, etc.).
- Hands-on experience with Spark (1.5, 1.6) and Scala as a full-stack developer.
- Seasoned Hadoop/Spark/Scala/Java developer with experience in object-oriented programming.
- Experience installing, configuring, and testing Hadoop ecosystem components on clusters running the major Hadoop distributions: Cloudera (CDH3, CDH4, and CDH5) and Hortonworks.
- Experienced in building highly scalable Big Data solutions on Hadoop across multiple distributions (Cloudera, Hortonworks) and NoSQL platforms (HBase and Cassandra).
- Experience analyzing data using HiveQL and Pig Latin and writing custom MapReduce programs in Java and Python.
- Experienced in converting HiveQL queries into Spark transformations using Spark RDDs and Scala (see the sketch following this summary).
- Hands-on experience integrating Apache Sqoop, Apache Storm, and Apache Hive.
- Hands-on experience with file formats such as TextFile, JSON, Avro, and ORC for Hive querying and processing.
- Experience in data modeling using star and snowflake schemas, as well as working with metadata.
- Experience in the AWS cloud environment, including S3 storage and EC2 instances.
- Experience with Apache Kafka as a message broker and for log aggregation and stream processing.
- Expertise in migrating data from different databases (e.g., Oracle, DB2, Teradata) to HDFS.
- Experience designing and coding web applications using Core Java and web technologies (JSP, Servlets, JDBC), with a solid understanding of the J2EE stack, including frameworks such as Spring and ORM frameworks (Hibernate).
- Good interpersonal and communication skills, strong problem-solving ability, and sound analytical judgment.
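As an illustration of the HiveQL-to-Spark conversion noted above, here is a minimal Scala sketch against the Spark 1.x RDD API; the input path, table layout, and column positions are hypothetical placeholders rather than details of any specific engagement.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object HiveToSparkSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("HiveToSparkSketch"))

    // HiveQL equivalent: SELECT page, SUM(bytes) FROM web_logs GROUP BY page
    val totals = sc.textFile("hdfs:///warehouse/web_logs")  // hypothetical path
      .map(_.split("\t"))
      .filter(_.length >= 3)                  // drop malformed rows
      .map(f => (f(1), f(2).toLong))          // (page, bytes); assumed column order
      .reduceByKey(_ + _)                     // SUM(bytes) ... GROUP BY page

    totals.saveAsTextFile("hdfs:///output/page_totals")
    sc.stop()
  }
}
```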
TECHNICAL SKILLS
Hadoop/Big Data Technologies: HDFS, MapReduce, Hive, Pig, Sqoop, Flume, Oozie, Storm, ZooKeeper, Kafka, Impala, HCatalog, Apache Spark, Spark Streaming, Spark SQL, HBase, Cassandra, AWS, Hortonworks, Cloudera
Web Technologies: JSP, Servlets, JDBC, JavaScript, CSS
Application Servers: IBM WebSphere, Tomcat
Development and BI Tools: TOAD, Visio, Rational Rose, Endure, Informatica 9.1
Databases: Oracle 9i/10g, MySQL 4.x/5.x, HBase, NoSQL
Programming Languages: Java (JDK 5/JDK 6), C/C++, Python, Scala, HTML, SQL
Operating Systems: UNIX, Linux, Windows, Mac OS X
Development Methodologies: Agile (Scrum), Hybrid
PROFESSIONAL EXPERIENCE
Confidential, Irvine, CA
Hadoop Spark Developer
Responsibilities:
- Evaluated business requirements and prepared detailed specifications, in line with project guidelines, for the programs to be developed.
- Analyzed the Hadoop cluster and various Big Data analytics and processing tools, including Pig, Hive, Spark, and Spark Streaming.
- Analyzed large data sets to determine the optimal way to aggregate and report on them.
- Migrated various Hive UDFs and queries to Spark SQL for faster response times.
- Handled importing data from various sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and exported data from HDFS to MySQL using Sqoop.
- Configured Spark Streaming in Scala to receive real-time data from Apache Kafka and store the stream to HDFS (see the sketch following this role).
- Hands-on experience with Spark and Spark Streaming, creating RDDs and applying transformations and actions.
- Used Hive to analyze partitioned and bucketed data and compute various metrics for reporting.
- Experience using Apache Kafka for log aggregation.
- Experience working with different data formats such as flat files, ORC, Avro, and JSON.
- Developed Talend jobs for reading log files.
- Involved in implementing a Cassandra cluster to address HBase limitations.
- Provided design recommendations and thought leadership to sponsors and stakeholders, improving review processes, resolving technical problems, and proposing solutions.
Environment: MapReduce, HDFS, Hive, Pig, Spark, Spark-Streaming, Spark SQL, Apache Kafka, Sqoop, Java, Scala, CDH4, CDH5, AWS, Eclipse, Oracle, Git, Shell Scripting and Cassandra.
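A minimal sketch of the Kafka-to-HDFS pipeline described in this role, assuming the Spark 1.6 direct-stream Kafka integration; the broker address, topic name, and output path are hypothetical placeholders.

```scala
import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

object KafkaToHdfs {
  def main(args: Array[String]): Unit = {
    val ssc = new StreamingContext(new SparkConf().setAppName("KafkaToHdfs"), Seconds(30))

    val kafkaParams = Map("metadata.broker.list" -> "broker1:9092")  // hypothetical broker
    val topics = Set("app-logs")                                     // hypothetical topic

    val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams, topics)

    // Persist each 30-second batch of message values to HDFS as text files
    stream.map(_._2).saveAsTextFiles("hdfs:///data/streams/app-logs/batch")

    ssc.start()
    ssc.awaitTermination()
  }
}
```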
Confidential, Fort Worth, TX
Hadoop Developer
Responsibilities:
- Involved in various stages of Software Development Life Cycle (SDLC) deliverables using the Agile development methodology.
- Imported data from various sources and performed transformations using MapReduce and Hive to load data into HDFS on the Cloudera cluster.
- Loaded data from different sources (Teradata, DB2, Oracle, and flat files) into HDFS using Sqoop and into partitioned Hive tables.
- Created various Pig scripts and wrapped them as shell commands to provide aliases for common operations in the project's business flow.
- Hands-on experience integrating Apache Sqoop, Apache Storm, and Apache Hive as part of the project implementation.
- Experience using Apache Storm to build real-time data integration systems that analyze, clean, normalize, and resolve large volumes of non-unique data points with low latency and high throughput.
- Experience processing log files with Apache Storm.
- Implemented various Hive queries for analysis and invoked them from a Java client engine to run on different nodes (see the JDBC sketch following this role).
- Developed Oozie workflows for daily incremental loads that pull data from Teradata and import it into Hive tables.
- Developed scripts to fetch log files from an FTP server and process them for loading into Hive tables.
- Experience developing Hive UDFs in Java.
- Experience generating log statistics and extracting useful information from them in real time using Apache Storm.
- Moved data from HDFS to HBase using MapReduce with a bulk output format class.
- Experience implementing rack topology scripts for the Hadoop cluster.
- Developed a helper class abstracting the HBase cluster connection to act as a core toolkit.
- Participated in day-to-day meetings and status meetings, and communicated effectively with team members.
Environment: MapReduce, HDFS, Hive, Pig, HBase, Apache Storm, HDP, Sqoop, Java, Eclipse, Oracle, Linux, Shell Scripting, Maven, Git.
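The Hive-queries-from-a-client-engine bullet above could look roughly like this Scala sketch over the standard HiveServer2 JDBC driver; the host, credentials, and query are hypothetical placeholders.

```scala
import java.sql.DriverManager

object HiveQueryClient {
  def main(args: Array[String]): Unit = {
    Class.forName("org.apache.hive.jdbc.HiveDriver")  // HiveServer2 JDBC driver
    val conn = DriverManager.getConnection(
      "jdbc:hive2://hive-host:10000/default", "etl_user", "")  // hypothetical host/user
    try {
      val rs = conn.createStatement()
        .executeQuery("SELECT page, COUNT(*) AS hits FROM web_logs GROUP BY page")
      while (rs.next()) {
        println(s"${rs.getString("page")}\t${rs.getLong("hits")}")
      }
    } finally {
      conn.close()
    }
  }
}
```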
Confidential, Denver, CO
Hadoop Developer
Responsibilities:
- Responsible for architecting Hadoop clusters with CDH3.
- Extensively involved in installing and configuring the Cloudera Distribution of Hadoop (CDH3), including the NameNode, Secondary NameNode, JobTracker, TaskTrackers, and DataNodes.
- Installed and configured Hadoop ecosystem components such as HBase, Flume, Pig, and Sqoop.
- Involved in Hadoop cluster tasks such as adding and removing nodes without affecting running jobs or data.
- Managed and reviewed Hadoop log files.
- Loaded log data into HDFS using Flume, and worked extensively on creating MapReduce jobs to power data for search and aggregation (see the sketch following this role).
- Worked extensively with Sqoop for importing metadata from Oracle.
- Designed a data warehouse using Hive and created partitioned Hive tables.
- Mentored the analyst and test teams in writing Hive queries.
- Installed and configured Hive and wrote Hive UDFs.
- Involved in HDFS maintenance, monitoring it through the web UI and working with it through the Hadoop Java API.
- Imported and exported data between Oracle/DB2 and HDFS/Hive using Sqoop.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Created HBase tables to store various data formats of PII data coming from different portfolios.
- Extensively used Pig for data cleansing.
- Developed Pig Latin scripts to extract data from web server output files and load it into HDFS.
- Developed Oozie workflows to automate loading data into HDFS and pre-processing it with Pig.
Environment: Hadoop, MapReduce, HDFS, Pig, Hive, HBase, Oozie, Java (JDK 1.6), Oracle 11g/10g, PL/SQL, SQL*Plus, Windows NT, UNIX Shell Scripting.
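As a sketch of the search-and-aggregation MapReduce jobs mentioned in this role, here is a minimal page-hit counter written in Scala against the Hadoop Java API; the tab-separated log layout and the column holding the page URL are assumptions.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import org.apache.hadoop.io.{IntWritable, LongWritable, Text}
import org.apache.hadoop.mapreduce.{Job, Mapper, Reducer}
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat
import scala.collection.JavaConverters._

// Mapper: emit (page, 1) for each log line
class PageMapper extends Mapper[LongWritable, Text, Text, IntWritable] {
  private val one = new IntWritable(1)
  private val page = new Text()
  override def map(key: LongWritable, value: Text,
                   ctx: Mapper[LongWritable, Text, Text, IntWritable]#Context): Unit = {
    val fields = value.toString.split("\t")
    if (fields.length > 2) {        // assume the page URL is the third column
      page.set(fields(2))
      ctx.write(page, one)
    }
  }
}

// Reducer: sum the hit counts per page
class PageReducer extends Reducer[Text, IntWritable, Text, IntWritable] {
  override def reduce(key: Text, values: java.lang.Iterable[IntWritable],
                      ctx: Reducer[Text, IntWritable, Text, IntWritable]#Context): Unit = {
    ctx.write(key, new IntWritable(values.asScala.map(_.get).sum))
  }
}

object PageHits {
  def main(args: Array[String]): Unit = {
    val job = Job.getInstance(new Configuration(), "page-hits")
    job.setJarByClass(classOf[PageMapper])
    job.setMapperClass(classOf[PageMapper])
    job.setReducerClass(classOf[PageReducer])
    job.setOutputKeyClass(classOf[Text])
    job.setOutputValueClass(classOf[IntWritable])
    FileInputFormat.addInputPath(job, new Path(args(0)))    // input logs directory
    FileOutputFormat.setOutputPath(job, new Path(args(1)))  // output directory
    System.exit(if (job.waitForCompletion(true)) 0 else 1)
  }
}
```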
Confidential
Java Developer
Responsibilities:
- Involved in various stages of application enhancements, performing the required analysis, development, and testing.
- Prepared the high-level and low-level design documents and implemented digital signature generation.
- Developed the logic and code for registering and validating enrolling customers.
- Extensively used Java multithreading to download files from a URL (see the sketch following this role).
- Extensively used Eclipse IDE for developing, debugging, integrating and deploying the application.
- Developed web-based user interfaces using J2EE Technologies.
- Handled client-side validation using JavaScript.
- Used the Validation Framework for server-side validation.
- Created test cases for unit and integration testing.
- Integrated the front end with the Oracle database via the JDBC API through the JDBC-ODBC bridge driver on the server side.
- Developed required stored procedures and database functions using PL/SQL.
- Developed, tested, and debugged various components on WebLogic Application Server.
- Used XML and XSL for data presentation, report generation, and customer feedback documents.
- Implemented the logging framework using Log4j.
- Involved in code review and documentation review of technical artifacts.
Environment: Java Servlets, JSP, JavaScript, XML, HTML, UML, Apache Tomcat, Eclipse, JDBC, Oracle 11g and other basic office tools.
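The multithreaded file-download work in this role was done in Java on the project; the following minimal sketch uses Scala for consistency with the other sketches in this resume, and the URLs and destination file names are placeholders.

```scala
import java.io.FileOutputStream
import java.net.URL

object ParallelDownloader {
  // Stream one URL to a local file
  def download(url: String, dest: String): Unit = {
    val in = new URL(url).openStream()
    val out = new FileOutputStream(dest)
    try {
      val buf = new Array[Byte](8192)
      Iterator.continually(in.read(buf))
        .takeWhile(_ != -1)
        .foreach(n => out.write(buf, 0, n))
    } finally {
      in.close(); out.close()
    }
  }

  def main(args: Array[String]): Unit = {
    val files = Seq(
      "https://example.com/report1.pdf" -> "report1.pdf",  // placeholder URLs
      "https://example.com/report2.pdf" -> "report2.pdf"
    )
    // One thread per download, started together and joined at the end
    val threads = files.map { case (url, dest) =>
      new Thread(new Runnable { def run(): Unit = download(url, dest) })
    }
    threads.foreach(_.start())
    threads.foreach(_.join())
  }
}
```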