
Hadoop Developer Resume


San Jose, CA

SUMMARY

  • Over 6 years of professional IT experience, including 5 years in the Big Data ecosystem covering ingestion, querying, processing, and analysis of big data.
  • Experience using Hadoop ecosystem components such as MapReduce, HDFS, HBase, ZooKeeper, Hive, Sqoop, Pig, Flume, and the Cloudera distribution.
  • Knowledge of NoSQL databases such as HBase and Cassandra.
  • Experience includes Requirements Gathering, Design, Development, Integration, Documentation, Testing and Build.
  • Experience working with MapReduce programs, Pig scripts, and Hive queries to deliver results.
  • Competent with Big Data frameworks such as Kafka, Neo4j, Hive, Elasticsearch, HDFS, and YARN, as well as data visualization libraries such as D3.js.
  • Extensively worked on development and optimization of MapReduce programs, Pig scripts, and Hive queries to create structured data for data mining.
  • Solid knowledge of Hadoop architecture and daemons such as NameNode, DataNodes, JobTracker, and TaskTracker.
  • Good knowledge of ZooKeeper for cluster coordination.
  • Experience in database design, data analysis, SQL programming, PL/SQL stored procedures, and triggers in Oracle and SQL Server.
  • Experience in extending Hive and Pig core functionality with custom User Defined Functions (UDFs); a minimal sketch follows this list.
  • Experience in writing custom classes, functions, procedures, problem management, library controls and reusable components.
  • Working knowledge of Oozie, a workflow scheduler used to manage Pig, Hive, and Sqoop jobs.
  • Followed test-driven development within an Agile/Scrum methodology to produce high-quality software.
  • Expert in AWS CloudFormation template creation.
  • Experience in AWS EMR cluster configuration.
  • Experience in the AWS cloud environment with S3 storage and EC2 instances.
  • Experience with RStudio, creating visualizations from data files.
  • Experienced in integrating Java-based web applications in a UNIX environment.
  • Developed applications using Java, JSP, Servlets, JDBC, JavaScript, XML, and HTML.
  • Strong analytical skills with the ability to quickly understand clients' business needs; involved in meetings to gather information and requirements from clients.
  • Research-oriented, motivated, proactive, self-starter with strong technical, analytical and interpersonal skills.
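As referenced above, the following is a minimal sketch of a Hive UDF of the kind used to extend Hive's core functionality. The class name and the normalization logic are illustrative only, not taken from a specific project.

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Illustrative UDF: trims and lower-cases a string column before analysis.
public final class NormalizeText extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null;
        }
        return new Text(input.toString().trim().toLowerCase());
    }
}
```

Packaged as a JAR, a function like this would typically be registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION before being used in queries.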

TECHNICAL SKILLS

Hadoop Ecosystem: Kafka, HDFS, MapReduce, Hive, Impala, Pig, Sqoop, Flume, Oozie, ZooKeeper, Ambari, Hue, Spark, Storm, Ganglia

Project Management / Tools / Applications: All MS Office suites (incl. 2003), MS Exchange & Outlook, Lotus Domino Notes, Citrix Client, SharePoint, MS Internet Explorer, Firefox, Chrome, Apache, IIS

Web Technologies: HTML, XML, CSS, JavaScript

NoSQL Databases: HBase, Cassandra

Databases: Oracle 8i/9i/10g, MySQL

Languages: Java, SQL, PL/SQL, Ruby, Shell Scripting

Operating Systems: UNIX (OS X, Solaris), Windows, Linux (CentOS, Fedora, Red Hat)

IDE Tools: Eclipse, NetBeans

Application Server: Apache Tomcat

PROFESSIONAL EXPERIENCE

Confidential - San Jose, CA

Hadoop Developer

Responsibilities:

  • Worked on data science activities and developed scatter plots using RStudio.
  • Created automated Python scripts to validate the data flow through Elasticsearch.
  • Set up projects/tenants with Keystone user roles.
  • Worked in the AWS cloud environment with S3 storage and EC2 instances.
  • Created networks, routers, and subnets.
  • Worked on evaluation and analysis of the Hadoop cluster and various big data analytic tools, including Pig, the HBase database, and Sqoop.
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Involved in loading data from LINUX file system to Hadoop Distributed File System.
  • Created HBase tables to store various formats of PII data coming from different portfolios (a minimal sketch follows this list).
  • Managed and reviewed Hadoop log files.
  • Created instances in OpenStack to set up the environment.
  • Set up the ELK (Elasticsearch, Logstash, Kibana) cluster.
  • Troubleshot Nova and Glance issues in OpenStack, as well as the Kafka and RabbitMQ buses.
  • Performance-tested the environment, creating Python scripts to generate I/O and CPU load.
  • Experience with the OpenStack cloud platform.
  • Provisioned hosts with flavors GP (general purpose), SO (storage optimized), MO (memory optimized), and CO (compute optimized).
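As referenced in the PII-table bullet above, this is a minimal sketch of creating an HBase table through the Java client API. The table name, column family, and the HBase 2.x-style builder calls are assumptions for illustration.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.TableDescriptorBuilder;

public class CreatePiiTable {
    public static void main(String[] args) throws Exception {
        // Reads hbase-site.xml from the classpath for the ZooKeeper quorum, etc.
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Admin admin = conn.getAdmin()) {
            // Hypothetical table and column family names.
            TableName table = TableName.valueOf("pii_records");
            admin.createTable(TableDescriptorBuilder.newBuilder(table)
                    .setColumnFamily(ColumnFamilyDescriptorBuilder.of("d"))
                    .build());
        }
    }
}
```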

Environment: OpenStack, Elasticsearch, Logstash, Ansible, RHEL 7, Python, Kafka, StreamSets, InfluxDB, Sensu, RabbitMQ, Uchiwa, Kibana, Hive, Pig, HBase, Sqoop.

Confidential - Boston, MA

Hadoop Developer

Responsibilities:

  • Handled importing of data from various data sources, performed transformations using Hive and Spark, and loaded the data into HDFS.
  • Extracted/Imported data from/to Databases into HDFS using Sqoop.
  • Worked on reading multiple data formats on HDFS using Scala.
  • Implemented complex Hive queries using joins to optimize performance.
  • Applied partitioning and bucketing concepts in Hive and designed both managed and external tables.
  • Developed and executed shell scripts to automate the jobs.
  • Developed multiple POCs using Scala, deployed them on the YARN cluster, and compared the performance of Spark with Cassandra and SQL.
  • Converted Cassandra/Hive/SQL queries into Spark transformations using Spark RDDs and Scala (an illustrative sketch follows this list).
  • Analyzed the Cassandra/SQL scripts and designed the solution to implement them using Scala.
  • Imported log files using Flume and performed load tests on them.
  • Worked with JSON based REST Web services and Amazon Web Services (AWS).
  • Performed load tests on AWS.
  • Worked on the core and Spark SQL modules of Spark extensively.
  • Ran Hadoop streaming jobs to process terabytes of data.
  • Imported real-time data into Hadoop using Kafka and implemented Oozie jobs for daily imports.
  • Involved in requirement analysis, design, build, testing phases and responsible for documenting technical specifications.
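The Spark work above was done in Scala; purely as an illustration, here is a minimal sketch using Spark's Java API of the pattern described: re-expressing a Hive/SQL query as a Spark SQL transformation and writing the result back to HDFS. The table, column, and path names are hypothetical.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class HiveQueryOnSpark {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("hive-query-on-spark")
                .enableHiveSupport()   // lets Spark read existing Hive tables
                .getOrCreate();

        // A Hive aggregation re-expressed as a Spark SQL transformation.
        Dataset<Row> totals = spark.sql(
                "SELECT customer_id, SUM(amount) AS total_amount "
                + "FROM transactions GROUP BY customer_id");

        // Persist the result to HDFS as Parquet for downstream analysis.
        totals.write().mode("overwrite").parquet("/data/output/customer_totals");

        spark.stop();
    }
}
```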

Environment & Tools: Hadoop, HDFS, AWS, Hive, Scala, Sqoop, Spark, SQL, Cassandra, Oozie, Tableau.

Confidential, Boston, MA

Hadoop Developer

Responsibilities:

  • Installed and configured Pig and wrote Pig Latin scripts.
  • Involved in managing and reviewing Hadoop JobTracker log files and Control-M log files.
  • Scheduled and managed cron jobs and wrote shell scripts to generate alerts.
  • Monitored and managed daily jobs processing around 200k files per day, tracking them through RabbitMQ and an Apache dashboard application.
  • Used Control-m scheduling tool to schedule daily jobs.
  • Administered and maintained a multi-rack Cassandra cluster.
  • Monitored workload, job performance and capacity planning using InsightIQ storage performance monitoring and storage analytics, experienced in defining job flows.
  • Gained good experience with NoSQL databases such as Cassandra and HBase.
  • Created Hive tables, loaded them with data, and wrote Hive queries that run internally as MapReduce jobs.
  • Used Sqoop to efficiently transfer data between databases and HDFS, and used Flume to stream log data from servers/sensors.
  • Developed MapReduce programs to cleanse data in HDFS obtained from heterogeneous data sources and make it suitable for ingestion into a Hive schema for analysis (a minimal sketch follows this list).
  • Used Hive data warehouse tool to analyze the unified historic data in HDFS to identify issues and behavioral patterns.
  • Created internal and external Hive tables per requirements, defined with appropriate static and dynamic partitions for efficiency.
  • Worked on setting up High Availability for GPHD 2.2 with Zookeeper and quorum journal nodes.
  • Used the Control-M scheduling tool to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs (Java MapReduce, Hive, and Sqoop) as well as system-specific jobs.
  • Worked with BI teams in generating the reports and designing ETL workflows on Tableau.
  • Involved in Scrum calls, grooming, and demo meetings; very good experience with Agile methodology.
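The following is a minimal sketch of the kind of map-only cleansing job described above: it drops malformed records and normalizes delimiters so the output can be loaded into a Hive schema. The field count and delimiters are hypothetical.

```java
import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class CleanseMapper extends Mapper<LongWritable, Text, NullWritable, Text> {
    private static final int EXPECTED_FIELDS = 5;   // assumed record layout

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String line = value.toString();
        String[] fields = line.split("\\|", -1);
        if (fields.length != EXPECTED_FIELDS) {
            return; // skip malformed rows instead of passing them on to Hive
        }
        // Re-emit with tab delimiters to match the target Hive table definition.
        context.write(NullWritable.get(), new Text(line.replace('|', '\t')));
    }
}
```

A driver class would set the number of reduce tasks to zero so the cleansed records are written straight back to HDFS.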

Environment: Apache Hadoop 2.3, GPHD 1.2, GPHD 2.2, MapReduce 2.3, HDFS, Hive, Java 1.6 & 1.7, Cassandra, Pig, Spring XD, Linux, Eclipse, RabbitMQ, ZooKeeper, PostgreSQL, Apache Solr, Control-M, Redis, Tableau, QlikView, DataStax.

Confidential, NC

Hadoop Developer

Responsibilities:

  • Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
  • Installed and configured Pig and wrote Pig Latin scripts.
  • Developed Pig scripts using Pig Latin.
  • Involved in managing and reviewing Hadoop log files.
  • Exported data using Sqoop from HDFS to Teradata on a regular basis.
  • Developed scripts and batch jobs to schedule various Hadoop programs.
  • Wrote Hive queries for data analysis to meet the business requirements.
  • Created Hive tables and worked on them using HiveQL (a minimal sketch follows this list).
  • Experienced in defining job flows.
  • Gained good experience with NoSQL databases such as Cassandra.
  • Created Hive tables, loaded them with data, and wrote Hive queries that run internally as MapReduce jobs.
  • Designed and implemented a MapReduce-based large-scale parallel relation-learning system.
  • Setup and benchmarked Hadoop clusters for internal use.
  • Worked with BI teams in generating the reports and designing ETL workflows on Tableau.
  • Monitored the log flow from LM Proxy to ES-Head.
  • Used secportal as the front end of Gracie, where the search operations are performed.
  • Wrote the MapReduce code for the flow from Flume to ES-Head.
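As referenced in the HiveQL bullets above, this is a minimal sketch of creating and querying a Hive table from Java over the HiveServer2 JDBC driver. The connection URL, table, and columns are hypothetical.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveTableSetup {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        Connection conn = DriverManager.getConnection(
                "jdbc:hive2://hiveserver:10000/default", "", "");
        Statement stmt = conn.createStatement();

        // External table over data already staged in HDFS.
        stmt.execute("CREATE EXTERNAL TABLE IF NOT EXISTS web_logs ("
                + " host STRING, request STRING, status INT)"
                + " ROW FORMAT DELIMITED FIELDS TERMINATED BY '\\t'"
                + " LOCATION '/data/web_logs'");

        // Simple analysis query; Hive compiles this into MapReduce jobs.
        ResultSet rs = stmt.executeQuery(
                "SELECT status, COUNT(*) FROM web_logs GROUP BY status");
        while (rs.next()) {
            System.out.println(rs.getInt(1) + "\t" + rs.getLong(2));
        }

        rs.close();
        stmt.close();
        conn.close();
    }
}
```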

Environment: Cloudera Hadoop (CDH 4.4), MapReduce, HDFS, Hive, Java 6, Pig, Cassandra, Linux, XML, MySQL, MySQL Workbench, Eclipse, PL/SQL, SQL Connector, Subversion.

Confidential

Java/J2EE Developer

Responsibilities:

  • Worked with Java, J2EE, Struts, web services, and Hibernate in a fast-paced development environment.
  • Followed Agile methodology, interacted directly with the client on features, implemented optimal solutions, and tailored the application to customer needs.
  • Involved in the design and implementation of the web tier using Servlets and JSP.
  • Used Apache POI for reading Excel files.
  • Developed the user interface using JSP and JavaScript to view all online trading transactions.
  • Designed and developed Data Access Objects (DAOs) to access the database (a minimal sketch follows this list).
  • Used DAO factory and value object design patterns to organize and integrate the Java objects.
  • Coded JavaServer Pages for the dynamic front-end content that uses Servlets and EJBs.
  • Coded HTML pages using CSS for static content generation with JavaScript for validations.
  • Used JDBC API to connect to the database and carry out database operations.
  • Used JSP and JSTL Tag Libraries for developing User Interface components.
  • Performed code reviews.
  • Performed unit testing, system testing, and integration testing.
  • Involved in building and deploying the application in a Linux environment.
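The following is a minimal sketch of the DAO pattern with plain JDBC described above. The entity, table, and column names are hypothetical; in practice such DAOs would be obtained through the DAO factory mentioned in the bullets.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

import javax.sql.DataSource;

// Hypothetical DAO: looks up traded symbols for an account via plain JDBC.
public class TradeDao {
    private final DataSource dataSource;

    public TradeDao(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    public List<String> findSymbolsByAccount(String accountId) throws SQLException {
        List<String> symbols = new ArrayList<String>();
        Connection conn = null;
        PreparedStatement ps = null;
        ResultSet rs = null;
        try {
            conn = dataSource.getConnection();
            ps = conn.prepareStatement(
                    "SELECT symbol FROM trades WHERE account_id = ?");
            ps.setString(1, accountId);
            rs = ps.executeQuery();
            while (rs.next()) {
                symbols.add(rs.getString("symbol"));
            }
        } finally {
            // Close resources explicitly (pre-Java 7 style, matching the stack used).
            if (rs != null) rs.close();
            if (ps != null) ps.close();
            if (conn != null) conn.close();
        }
        return symbols;
    }
}
```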

Environment: Java, J2EE, JDBC, Struts, SQL, Hibernate, Eclipse, Apache POI, CSS.
