Sr. Hadoop Developer Resume

Bothell, WA

SUMMARY:

  • Over 7 years of professional IT experience, including over 4 years in Big Data ecosystem technologies such as Hadoop, Pig, Hive, Sqoop, HBase and Cassandra, designing and implementing Map/Reduce jobs to support distributed processing of large data sets on Hadoop clusters.
  • Working experience building and supporting large-scale Hadoop environments, including design, configuration, installation, performance tuning and monitoring.
  • Experience in installing, configuring, supporting and managing Cloudera’s Hadoop platform, including CDH3 and CDH4 clusters.
  • Excellent understanding of Hadoop architecture and its components, such as HDFS (Job Tracker, Task Tracker, Name Node, Data Node) and the MapReduce programming paradigm.
  • Hands-on experience installing, configuring and using Hadoop ecosystem components such as Hadoop MapReduce, HDFS, HBase, Oozie, Hive, Sqoop, Pig, Zookeeper, YARN and Lucene.
  • Experience writing Spark core, Spark-SQL and Spark-streaming.
  • Good experience integrating Apache Spark with Cassandra.
  • Good knowledge of Scala.
  • Built Spark Streaming POCs for various applications, including inventory data feeds to FCC.
  • Experienced with Spark as a fast, general engine for large-scale data processing, using Spark Core, Spark SQL and Spark Streaming.
  • Knowledge of Kafka clusters; developed a Spark Streaming application consuming from Kafka to process Hadoop job logs.
  • Analyzed code issues using Splunk logs across all application and web servers.
  • Experienced in monitoring Hadoop cluster environment using Ganglia.
  • Extensive experience with Oozie workflows, bundling Java, Shell, MapReduce and other jobs into one workflow to create end-to-end automated processes.
  • Very good experience in complete project life cycle (design, development, testing and implementation) of Client Server and Web applications.
  • Involved in generating automated scripts (YAML files) for Falcon and Oozie using Ruby.
  • Experience in Object Oriented Analysis, Design (OOAD) and development of software using UML Methodology, good knowledge of J2EE design patterns and Core Java design patterns.
  • Experienced in SQA (Software Quality Assurance) including Manual and Automated testing with tools such as Selenium RC/IDE/WebDriver/Grid and Junit, Load Runner, Jprofile, RFT (Rational Functional Tester).
  • Proficient in deploying applications on J2EE Application servers like Web-Sphere, Web-logic, Glassfish, Tuxedo, JBoss and Apache Tomcat web server.
  • Expertise in developing applications using J2EE Architectures / frameworks like Struts, Spring Framework and SDP (Qwest Communications) Framework.
  • Excellent Java development skills using J2EE, J2SE, Servlets, JSP, EJB, JDBC.
  • Experience in NoSQL data stores (Hbase, Cassandra).
  • Implemented POC’s using Amazon Cloud Components (S3, EC2, Elastic beanstalk and SimpleDB).
  • Experience in database design using PL/SQL to write Stored Procedures, Functions, Triggers and strong experience in writing complex queries for Oracle 8i/9i/10g.
  • Extensive experience using multiple languages/technologies (Java/J2EE, C++, C, Perl, PHP, Ruby) and environments (JVM, Linux, Various Unix, Windows).
  • Ability to adapt to evolving technology, strong sense of responsibility.
  • Ability to meet deadlines and handle multiple tasks, flexible in work schedules and possess good communication skills.
  • Tuned Hadoop runtime parameters to minimize map disk spill, mapper tasks and mapper output.
  • Migrated data between traditional databases and Hadoop using Sqoop.
  • Used Flume to migrate data from various sources into Hadoop (HDFS).
  • Developed Hive queries to process data and generate data cubes for visualization.
  • Worked on Tableau to create dashboards and generate reports.
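The MapReduce work summarized above follows the classic map/shuffle/reduce pattern. As a minimal, hedged sketch (not project code), a Hadoop Streaming-style word count over hypothetical job-log lines can be written in Python:

```python
from itertools import groupby

def mapper(lines):
    # Emit (token, 1) pairs, as a streaming mapper writes to stdout.
    for line in lines:
        for token in line.strip().split():
            yield token, 1

def reducer(pairs):
    # Hadoop sorts mapper output by key; groupby + sum mimics the reduce phase.
    for key, group in groupby(sorted(pairs), key=lambda kv: kv[0]):
        yield key, sum(count for _, count in group)

# Hypothetical job-log lines, not real cluster output.
logs = ["job_1 FAILED", "job_2 SUCCEEDED", "job_3 FAILED"]
counts = dict(reducer(mapper(logs)))
print(counts["FAILED"])
```

On a real cluster the same mapper/reducer pair would run under the Hadoop Streaming jar, reading stdin and writing stdout instead of in-process lists.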

TECHNICAL SKILLS:

JAVA Technologies: Java, JDK 1.2, 1.3, 1.4, 1.5, 1.6.

J2EE Technologies: JSP, Java Bean, Servlets, JDBC, JPA 1.0, EJB 3.0, JNDI, JOLT, Amazon Cloud (S3, EC2)

Languages: C, C++, PL/SQL, and Java.

Big Data Technologies: Hadoop, Hortonworks 2.0, CDH5, CDH4 & CDH3, HDFS, Hive, Pig, Oozie, Sqoop, Map-Reduce, Hbase, Flume, Zookeeper, Spark, Storm

Frame Works: Hadoop (HDFS, Map Reduce, Pig, Hive, HBase, Mahout, Oozie, Zookeeper, YARN, Lucene, Spark, Storm), Struts 1.x and Spring 3.x

Web Technologies: XHTML, JavaScript, AngularJS, AJAX, HTML, XML, XSLT, XPATH, CSS, DOM, WSDL, SOA, Web Services, GWT, JQuery, Perl, VB Script.

Application Servers: WebLogic8.1/9.1/10.x, Web-Sphere5.x/6.x/7.x, Tuxedo server 7.x/9.x, Glass Fish Server 2.x, JBoss4.x/5.x.

Web Servers: Apache Tomcat 4.0/ 5.5, Java Web Server 2.0.

Operating Systems: Windows-XP/2000/NT, UNIX, Linux, and DOS

Database: SQL, Oracle 9i/10g, SQLServer, Hbase and Cassandra.

IDE: Eclipse3.x, My Eclipse 8.x, RAD 7.x and JDeveloper 10.x.

Tools: Adobe, SQL Developer, Flume, Sqoop, Hue, Tableau

Platforms: Windows XP/NT/9x/2000, MS-DOS, UNIX/LINUX/Solaris/AIX, Hortonworks 2.0, CDH5, CDH4, CDH3

Databases: SQL, PL/SQL, Oracle 9i/10g, MYSQL, Microsoft Access, SQLServer.

Version Control: Win CVS, VSS, PVCS, Subversion, Git

PROFESSIONAL EXPERIENCE:

Confidential, Bothell, WA

Sr. Hadoop Developer

Responsibilities:

  • Launched and set up Hadoop-related tools on different sources, including configuring various Hadoop components.
  • Set up Spark Streaming and a Kafka cluster; developed a Spark Streaming application consuming from Kafka to process Hadoop job logs.
  • Designed the schema and data model, and wrote the algorithm to store all validated data in Cassandra using Spring Data Cassandra REST.
  • Built a Kafka-producer-based REST API to collect user action events from the front end and send them to the Spark Streaming application.
  • Worked on the SparkML library for recommendations, coupon recommendations and a rules engine.
  • Using Scala and the Spark ecosystem, enriched the given data with a fashion ontology to validate and normalize it.
  • Performed Cassandra-Hadoop and Cassandra-Spark integrations.
  • Involved in writing a Storm topology for the event-firing mechanism.
  • Migrated from Storm to Spark Streaming for event firing using REST calls.
  • Experience loading data from Cassandra into Spark.
  • Used Spark Streaming on web server logs to analyze near-real-time data.
  • Developed simple to complex MapReduce jobs in Java, implemented alongside Hive and Pig.
  • Supported MapReduce programs running on the cluster.
  • Handled importing data from various sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from MySQL into HDFS using Sqoop.
  • Used Sqoop to connect to SQL Server or Oracle databases and move pivoted data into Hive tables stored as ORC files.
  • Managed the Hive database, including data ingestion and indexing.
  • Exported data from ORC files and indexed documents in SequenceFile or SerDe file formats.
  • Hands-on experience writing custom UDFs as well as custom input and output formats.
  • Scheduled Hive jobs using Tidal via process files.
  • Used wrapper scripts for log aggregation, real-time event processing, monitoring and queueing.
  • Involved in design and architecture of custom Lucene storage handler.
  • Configured and Maintained different topologies in storm cluster and deployed them on regular basis.
  • Used Sqoop to move pivoted data from SQL Server or Oracle into Hive tables stored as ORC files with Snappy compression.
  • Wrote Pig UDFs per requirements and automated workflows using Oozie.
  • Understanding of multi-threaded Java programs for generating HiveQL.
  • Maintained the test mini-cluster using Vagrant and VMware Fusion.
  • Involved in transferring data between traditional databases and Hadoop using Sqoop.
  • Involved in testing data-type mappings between source systems and Hive data types in HDFS.
  • Developed Hive queries to process data and generate data cubes for visualization.
  • Worked on Tableau to create dashboards and generate reports.
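The Hive data cubes mentioned above aggregate a measure across every combination of dimensions. A plain-Python sketch of that rollup follows; the table and column names are invented for illustration, not taken from the project:

```python
from collections import defaultdict
from itertools import combinations

def cube(rows, dims, measure):
    """Aggregate `measure` over every subset of `dims`,
    mimicking Hive's GROUP BY ... WITH CUBE."""
    totals = defaultdict(int)
    for r in range(len(dims) + 1):
        for dim_subset in combinations(dims, r):
            for row in rows:
                key = (dim_subset, tuple(row[d] for d in dim_subset))
                totals[key] += row[measure]
    return dict(totals)

# Hypothetical sales records standing in for the Hive source table.
rows = [
    {"region": "east", "product": "tv", "sales": 10},
    {"region": "east", "product": "radio", "sales": 5},
    {"region": "west", "product": "tv", "sales": 7},
]
totals = cube(rows, ["region", "product"], "sales")
# The grand total is the aggregate over the empty dimension subset.
print(totals[((), ())])
```

In Hive itself the same result would come from a single `GROUP BY ... WITH CUBE` query feeding the Tableau dashboards.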

Environment: Hadoop, Big Data, Hortonworks 2.0, Hive, Pig, Hbase, Sqoop, Oozie, Spark, Storm, Map Reduce, Cassandra, Jira, Bitbucket, Maven, Tableau, Bamboo, J2EE, Guice, AngularJS, Jmockit, Lucene, Ruby, Unix, Sql, ORC.

Confidential, Boston, MA

Sr. Hadoop Developer

Responsibilities:

  • Launched and set up Hadoop-related tools on AWS, including configuring various Hadoop components.
  • Used Sqoop to connect to SQL Server or Oracle databases and move pivoted data into Hive tables stored as RC files.
  • Managed the Hive database, including data ingestion and indexing.
  • Experienced in analyzing data with Hive and Pig.
  • Wrote Pig scripts to process the data.
  • Developed Pig Latin scripts to extract data from web server output files and load it into HDFS.
  • Hands-on experience writing custom UDFs as well as custom input and output formats.
  • Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
  • Used Hive to analyze partitioned and bucketed data and compute various metrics for reporting.
  • Scheduled Hive jobs using Oozie process files.
  • Involved in design and architecture of custom Lucene storage handler.
  • Configured and Maintained different topologies in storm cluster and deployed them on regular basis.
  • Supported in setting up QA environment and updating configurations for implementing scripts with Pig, Hive and Sqoop.
  • Built a Kafka-producer-based REST API to collect user action events from the front end and send them to the Storm streaming application.
  • Along with the infrastructure team, involved in designing and developing a Kafka- and Storm-based data pipeline.
  • Maintained the test mini-cluster using Vagrant and VMware Fusion.
  • Involved in GUI development using JavaScript and AngularJS and Guice.
  • Developed Unit test case using Jmockit framework and automated the scripts.
  • Worked in an Agile environment, using Jira to maintain story points and the Kanban model.
  • Involved in brainstorming JAD sessions to design the GUI.
  • Hands on experience on maintaining the builds in Bamboo and resolved the build failures in Bamboo.
  • Involved in transferring data between traditional databases and Hadoop using Sqoop.
  • Used Flume to migrate data from various sources into Hadoop (HDFS).
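A Sqoop transfer of the kind described in these bullets is driven by a command line. Below is a hedged sketch assembling one in Python; the JDBC URL, table and target path are placeholders, and the flags shown are standard Sqoop 1 import options:

```python
import shlex

def sqoop_import_cmd(jdbc_url, table, target_dir, mappers=4):
    # Credentials would normally come from a protected options file
    # rather than the command line; omitted here for brevity.
    args = [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--table", table,
        "--target-dir", target_dir,
        "--num-mappers", str(mappers),
        "--hive-import",
    ]
    return " ".join(shlex.quote(a) for a in args)

cmd = sqoop_import_cmd("jdbc:oracle:thin:@db-host:1521:orcl",
                       "ORDERS", "/user/etl/orders")
print(cmd)
```

The `--num-mappers` value controls how many parallel map tasks split the source table, which is the main knob for import throughput.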

Environment: Hadoop, Big Data, Hive, Hbase, Sqoop, Accumulo, Oozie, Storm, HDFS, Map Reduce, Hue, Jira, Bitbucket, Maven, Bamboo, J2EE, Guice, AngularJS, Jmockit, Lucene, Ruby, Unix, Sql, AWS (Amazon Web Services).

Confidential, Hoffman Estates, IL

Hadoop Developer

Responsibilities:

  • Involved in design and development phases of Software Development Life Cycle (SDLC) using Scrum methodology.
  • Good understanding and related experience with Hadoop stack - internals, Hive, Pig and Map/Reduce.
  • Wrote MapReduce jobs to discover trends in data usage by users.
  • Developed data pipeline using Cassandra, Pig and Java map reduce to ingest customer profiles and purchase histories into HDFS for analysis.
  • Deployed Hadoop Cluster in Fully Distributed and Pseudo-distributed modes.
  • Used Pig as an ETL tool for transformations, event joins, filtering bot traffic and some pre-aggregations before storing the data in HDFS.
  • Analyzed the data using Pig and wrote Pig scripts to group, join and sort the data.
  • Collected data from distributed sources into data models, applied transformations and standardizations, and loaded the results into HBase for further processing.
  • Applied pattern-matching algorithms to match customers’ spending habits with loyalty points using Hive, and stored the output in HBase.
  • Involved in performance-tuning Hadoop runtime parameters to minimize map disk spill, mapper tasks and mapper output.
  • Used configuration files and command-line arguments to set parameters and balance reducer loads.
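The spill-tuning work above typically translates into map-side sort-buffer settings passed at job submission. A sketch rendering such properties as generic `-D` options follows; the keys are standard Hadoop 2.x configuration names, the values merely illustrative:

```python
def tuning_args(props):
    # Render configuration properties as -D generic options for `hadoop jar`.
    return ["-D{}={}".format(k, v) for k, v in sorted(props.items())]

# A larger sort buffer and a higher spill threshold reduce map-side
# disk spills; the reducer count balances reduce-side load.
props = {
    "mapreduce.task.io.sort.mb": 512,
    "mapreduce.map.sort.spill.percent": 0.9,
    "mapreduce.job.reduces": 12,
}
print(" ".join(tuning_args(props)))
```

The same keys can also be set once in `mapred-site.xml` instead of per job on the command line.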

Environment: Hadoop, Big data, CDH3&4 Distribution, JDK1.6, RHEL, HDFS, Map-Reduce, Hive, Pig, Sqoop, Cassandra, Flume

Confidential, San Jose, CA

Hadoop Developer

Responsibilities:

  • Built a data flow pipeline using flume, java map reduce and pig.
  • Used Flume to capture the streaming mobile sensor data and loaded it to HDFS.
  • Used Java map reduce and Pig scripts to process the data and store the data on HDFS
  • Used Hive scripts to compute aggregates and store them on Hbase for low latency applications.
  • Analyzed the Cassandra database and compared it with other open-source NoSQL databases to determine which best suits the current requirement.
  • Integrated Cassandra as a distributed persistent metadata store to provide metadata resolution for network entities on the network.
  • Used Mahout collaborative-filtering (CF) algorithms to deploy recommendation features.
  • Involved in design and development phases of the Software Development Life Cycle (SDLC) using Scrum methodology.
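Mahout-style item-based CF, as used above, rests on item-item similarity. A minimal cosine-similarity sketch in plain Python follows; the ratings are invented sample data, not project data:

```python
import math

def cosine(a, b):
    # Similarity between two item rating vectors (user -> rating dicts),
    # taking the dot product over the users both items share.
    users = set(a) & set(b)
    if not users:
        return 0.0
    dot = sum(a[u] * b[u] for u in users)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)

# Hypothetical per-item rating vectors keyed by user id.
item_a = {"u1": 5, "u2": 3}
item_b = {"u1": 4, "u2": 2}
print(round(cosine(item_a, item_b), 3))
```

Mahout computes these similarities at scale across the whole item catalog; the formula per item pair is the same.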

Environment: JDK1.6, Red Hat Linux, Big Data, Hive, Pig, Sqoop, Flume, Zookeeper, DB2, HBase, Mahout and Cassandra.

Confidential, Houston, TX

Java/J2EE Developer

Responsibilities:

  • Used the Hibernate ORM tool as the persistence layer, using database and configuration data to provide persistence services (and persistent objects) to the application.
  • Responsible for developing the DAO layer using Spring MVC and configuration XMLs for Hibernate, and for managing CRUD operations (insert, update and delete).
  • Implemented dependency injection using the Spring framework.
  • Developed reusable services using BPEL to transfer data.
  • Created JUnit test cases and developed JUnit classes.
  • Configured log4j to enable/disable logging in application.
  • Developed a rich user interface using HTML, JSP, AJAX, JSTL, JavaScript, jQuery and CSS.
  • Implemented PL/SQL queries and procedures to perform database operations.
  • Wrote UNIX Shell scripts and used UNIX environment to deploy the EAR and read the logs.
  • Implemented Log4j for logging in the application.

Environment: Java, Jest, SOA Suite 10g (BPEL), Struts, Spring, Hibernate, Web services (JAX-WS), JMS, EJB, Web logic 10.1 Server, JDeveloper, Sql Developer, HTML, LDAP, Maven, XML, CSS, JavaScript, JSON, SQL, PL/SQL, Oracle, JUnit, CVS and UNIX/Linux.

Confidential

SQL Server Developer

Responsibilities:

  • Created new database objects like Procedures, Functions, Packages, Triggers, Indexes and Views Using T-SQL in Development and Production environment for SQL Server.
  • Developed Database Triggers to enforce Data integrity and additional Referential Integrity.
  • Developed SQL queries to fetch complex data from different tables in remote databases using joins and database links, formatted the results into reports, and kept logs.
  • Involved in performance tuning and monitoring of both T-SQL and PL/SQL blocks.
  • Wrote T-SQL procedures to generate DML scripts that modified database objects dynamically based on user inputs.

Environment: SQL Server 7.0, Oracle 8i, Windows NT, C++, HTML, T-SQL, PL/SQL, SQL Loader.
