
Hadoop Developer Resume


San Jose, CA

SUMMARY

  • Over 6 years of professional IT experience, including 5 years in the Big Data ecosystem covering ingestion, querying, processing, and analysis of big data.
  • Experience using Hadoop ecosystem components such as MapReduce, HDFS, HBase, ZooKeeper, Hive, Sqoop, Pig, Flume, and the Cloudera distribution.
  • Knowledge of NoSQL databases such as HBase and Cassandra.
  • Experience includes Requirements Gathering, Design, Development, Integration, Documentation, Testing and Build.
  • Experience working with MapReduce programs, Pig scripts, and Hive queries to deliver results.
  • Competent with Big Data frameworks such as Kafka, Neo4j, Hive, Elasticsearch, HDFS, and YARN, as well as data visualization libraries such as D3.js.
  • Extensively worked on development and optimization of MapReduce programs, Pig scripts, and Hive queries to create structured data for data mining.
  • Solid knowledge of Hadoop architecture and daemons such as NameNode, DataNodes, JobTracker, and TaskTracker.
  • Good knowledge of ZooKeeper for cluster coordination.
  • Experience in database design, data analysis, SQL programming, PL/SQL stored procedures, and triggers in Oracle and SQL Server.
  • Experience in extending Hive and Pig core functionality with custom User Defined Functions (UDFs); a minimal sketch follows this list.
  • Experience in writing custom classes, functions, procedures, problem management, library controls and reusable components.
  • Working knowledge of Oozie, a workflow scheduler used to manage Pig, Hive, and Sqoop jobs.
  • Followed test-driven development within an Agile/Scrum methodology to produce high-quality software.
  • Expert in AWS CloudFormation template creation.
  • Experience in AWS EMR cluster configuration.
  • Experience in the AWS cloud environment with S3 storage and EC2 instances.
  • Experience with RStudio, creating visualizations from data files.
  • Experienced in integrating Java-based web applications in a UNIX environment.
  • Developed applications using Java, JSP, Servlets, JDBC, JavaScript, XML, and HTML.
  • Strong analytical skills with the ability to quickly understand clients' business needs; involved in meetings to gather information and requirements from clients.
  • Research-oriented, motivated, proactive, self-starter with strong technical, analytical and interpersonal skills.
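As referenced above, the following is a minimal sketch of a Hive UDF of the kind used to extend Hive's core functionality. The class name and the normalization logic are illustrative only, not taken from a specific project.

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Illustrative UDF: trims and lower-cases a string column before analysis.
public final class NormalizeText extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null;
        }
        return new Text(input.toString().trim().toLowerCase());
    }
}
```

Packaged as a JAR, a function like this would typically be registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION before being used in queries.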

TECHNICAL SKILLS

Hadoop Ecosystem: Kafka, HDFS, MapReduce, Hive, Impala, Pig, Sqoop, Flume, Oozie, ZooKeeper, Ambari, Hue, Spark, Storm, Ganglia

Project Management / Tools / Applications: All MS Office suites (incl. 2003), MS Exchange & Outlook, Lotus Domino Notes, Citrix Client, SharePoint, MS Internet Explorer, Firefox, Chrome, Apache, IIS

Web Technologies: HTML, XML, CSS, JavaScript

NoSQL Databases: HBase, Cassandra

Databases: Oracle 8i/9i/10g, MySQL

Languages: Java, SQL, PL/SQL, Ruby, Shell Scripting

Operating Systems: UNIX (OS X, Solaris), Windows, Linux (CentOS, Fedora, Red Hat)

IDE Tools: Eclipse, NetBeans

Application Server: Apache Tomcat

PROFESSIONAL EXPERIENCE

Confidential - San Jose, CA

Hadoop Developer

Responsibilities:

  • Worked on data science activities and developed scatter plots using RStudio.
  • Created automated Python scripts to validate the data flow through Elasticsearch.
  • Set up projects/tenants with Keystone user roles.
  • Worked in the AWS cloud environment with S3 storage and EC2 instances.
  • Created networks, routers, and subnets.
  • Worked on evaluation and analysis of the Hadoop cluster and various big data analytic tools, including Pig, the HBase database, and Sqoop.
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Involved in loading data from LINUX file system to Hadoop Distributed File System.
  • Created HBase tables to store various formats of PII data coming from different portfolios (a minimal sketch follows this list).
  • Managed and reviewed Hadoop log files.
  • Created instances in OpenStack to set up the environment.
  • Set up the ELK (Elasticsearch, Logstash, Kibana) cluster.
  • Troubleshot Nova and Glance issues in OpenStack, as well as the Kafka and RabbitMQ buses.
  • Performance-tested the environment, creating Python scripts to generate I/O and CPU load.
  • Experience with the OpenStack cloud platform.
  • Provisioned hosts with flavors GP (general purpose), SO (storage optimized), MO (memory optimized), and CO (compute optimized).
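As referenced in the PII-table bullet above, this is a minimal sketch of creating an HBase table through the Java client API. The table name, column family, and the HBase 2.x-style builder calls are assumptions for illustration.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.TableDescriptorBuilder;

public class CreatePiiTable {
    public static void main(String[] args) throws Exception {
        // Reads hbase-site.xml from the classpath for the ZooKeeper quorum, etc.
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Admin admin = conn.getAdmin()) {
            // Hypothetical table and column family names.
            TableName table = TableName.valueOf("pii_records");
            admin.createTable(TableDescriptorBuilder.newBuilder(table)
                    .setColumnFamily(ColumnFamilyDescriptorBuilder.of("d"))
                    .build());
        }
    }
}
```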

Environment: OpenStack, Elasticsearch, Logstash, Ansible, RHEL 7, Python, Kafka, StreamSets, InfluxDB, Sensu, RabbitMQ, Uchiwa, Kibana, Hive, Pig, HBase, Sqoop.

Confidential - Boston, MA

Hadoop Developer

Responsibilities:

  • Handled importing of data from various data sources, performed transformations using Hive and Spark, and loaded the data into HDFS.
  • Extracted/Imported data from/to Databases into HDFS using Sqoop.
  • Worked on reading multiple data formats on HDFS using Scala.
  • Implemented complex Hive queries using joins to optimize performance.
  • Applied partitioning and bucketing concepts in Hive and designed both managed and external tables.
  • Developed and executed shell scripts to automate the jobs.
  • Developed multiple POCs using Scala, deployed them on the YARN cluster, and compared the performance of Spark with Cassandra and SQL.
  • Converted Cassandra/Hive/SQL queries into Spark transformations using Spark RDDs and Scala (an illustrative sketch follows this list).
  • Analyzed the Cassandra/SQL scripts and designed the solution to implement them using Scala.
  • Imported log files using Flume and performed load tests on them.
  • Worked with JSON based REST Web services and Amazon Web Services (AWS).
  • Performed load tests on AWS.
  • Worked on the core and Spark SQL modules of Spark extensively.
  • Ran Hadoop streaming jobs to process terabytes of data.
  • Imported real-time data into Hadoop using Kafka and implemented Oozie jobs for daily imports.
  • Involved in requirement analysis, design, build, testing phases and responsible for documenting technical specifications.
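The Spark work above was done in Scala; purely as an illustration, here is a minimal sketch using Spark's Java API of the pattern described: re-expressing a Hive/SQL query as a Spark SQL transformation and writing the result back to HDFS. The table, column, and path names are hypothetical.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class HiveQueryOnSpark {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("hive-query-on-spark")
                .enableHiveSupport()   // lets Spark read existing Hive tables
                .getOrCreate();

        // A Hive aggregation re-expressed as a Spark SQL transformation.
        Dataset<Row> totals = spark.sql(
                "SELECT customer_id, SUM(amount) AS total_amount "
                + "FROM transactions GROUP BY customer_id");

        // Persist the result to HDFS as Parquet for downstream analysis.
        totals.write().mode("overwrite").parquet("/data/output/customer_totals");

        spark.stop();
    }
}
```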

Environment & Tools: Hadoop, HDFS, AWS, Hive, Scala, Sqoop, Spark, SQL, Cassandra, Oozie, Tableau.

Confidential, Boston, MA

Hadoop Developer

Responsibilities:

  • Installed and configured Pig and wrote Pig Latin scripts.
  • Involved in managing and reviewing Hadoop JobTracker log files and Control-M log files.
  • Scheduled and managed cron jobs and wrote shell scripts to generate alerts.
  • Monitored and managed daily jobs processing around 200k files per day, tracking them through RabbitMQ and an Apache dashboard application.
  • Used Control-m scheduling tool to schedule daily jobs.
  • Administered and maintained a multi-rack Cassandra cluster.
  • Monitored workload, job performance and capacity planning using InsightIQ storage performance monitoring and storage analytics, experienced in defining job flows.
  • Gained good experience with NoSQL databases such as Cassandra and HBase.
  • Created Hive tables, loaded them with data, and wrote Hive queries that run internally as MapReduce jobs.
  • Used Sqoop to efficiently transfer data between databases and HDFS, and used Flume to stream log data from servers/sensors.
  • Developed MapReduce programs to cleanse data in HDFS obtained from heterogeneous data sources and make it suitable for ingestion into a Hive schema for analysis (a minimal sketch follows this list).
  • Used Hive data warehouse tool to analyze the unified historic data in HDFS to identify issues and behavioral patterns.
  • Created internal and external Hive tables per requirements, defined with appropriate static and dynamic partitions for efficiency.
  • Worked on setting up High Availability for GPHD 2.2 with Zookeeper and quorum journal nodes.
  • Used the Control-M scheduling tool to manage interdependent Hadoop jobs and to automate several types of Hadoop jobs (Java MapReduce, Hive, and Sqoop) as well as system-specific jobs.
  • Worked with BI teams in generating the reports and designing ETL workflows on Tableau.
  • Involved in Scrum calls, grooming, and demo meetings; very good experience with Agile methodology.
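The following is a minimal sketch of the kind of map-only cleansing job described above: it drops malformed records and normalizes delimiters so the output can be loaded into a Hive schema. The field count and delimiters are hypothetical.

```java
import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class CleanseMapper extends Mapper<LongWritable, Text, NullWritable, Text> {
    private static final int EXPECTED_FIELDS = 5;   // assumed record layout

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String line = value.toString();
        String[] fields = line.split("\\|", -1);
        if (fields.length != EXPECTED_FIELDS) {
            return; // skip malformed rows instead of passing them on to Hive
        }
        // Re-emit with tab delimiters to match the target Hive table definition.
        context.write(NullWritable.get(), new Text(line.replace('|', '\t')));
    }
}
```

A driver class would set the number of reduce tasks to zero so the cleansed records are written straight back to HDFS.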

Environment: Apache Hadoop 2.3, GPHD 1.2, GPHD 2.2, MapReduce 2.3, HDFS, Hive, Java 1.6 & 1.7, Cassandra, Pig, Spring XD, Linux, Eclipse, RabbitMQ, ZooKeeper, PostgreSQL, Apache Solr, Control-M, Redis, Tableau, QlikView, DataStax.

Confidential, NC

Hadoop Developer

Responsibilities:

  • Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
  • Installed and configured Pig and wrote Pig Latin scripts.
  • Developed Pig scripts using Pig Latin.
  • Involved in managing and reviewing Hadoop log files.
  • Exported data using Sqoop from HDFS to Teradata on a regular basis.
  • Developed scripts and batch jobs to schedule various Hadoop programs.
  • Wrote Hive queries for data analysis to meet the business requirements.
  • Created Hive tables and worked on them using HiveQL (a minimal sketch follows this list).
  • Experienced in defining job flows.
  • Gained good experience with NoSQL databases such as Cassandra.
  • Created Hive tables, loaded them with data, and wrote Hive queries that run internally as MapReduce jobs.
  • Designed and implemented a MapReduce-based large-scale parallel relation-learning system.
  • Setup and benchmarked Hadoop clusters for internal use.
  • Worked with BI teams in generating the reports and designing ETL workflows on Tableau.
  • Monitored the log flow from LM Proxy to ES-Head.
  • Used secportal as the front end of Gracie, where the search operations are performed.
  • Wrote the MapReduce code for the flow from Flume to ES-Head.
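As referenced in the HiveQL bullets above, this is a minimal sketch of creating and querying a Hive table from Java over the HiveServer2 JDBC driver. The connection URL, table, and columns are hypothetical.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveTableSetup {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        Connection conn = DriverManager.getConnection(
                "jdbc:hive2://hiveserver:10000/default", "", "");
        Statement stmt = conn.createStatement();

        // External table over data already staged in HDFS.
        stmt.execute("CREATE EXTERNAL TABLE IF NOT EXISTS web_logs ("
                + " host STRING, request STRING, status INT)"
                + " ROW FORMAT DELIMITED FIELDS TERMINATED BY '\\t'"
                + " LOCATION '/data/web_logs'");

        // Simple analysis query; Hive compiles this into MapReduce jobs.
        ResultSet rs = stmt.executeQuery(
                "SELECT status, COUNT(*) FROM web_logs GROUP BY status");
        while (rs.next()) {
            System.out.println(rs.getInt(1) + "\t" + rs.getLong(2));
        }

        rs.close();
        stmt.close();
        conn.close();
    }
}
```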

Environment: Cloudera Hadoop (CDH 4.4), MapReduce, HDFS, Hive, Java 6, Pig, Cassandra, Linux, XML, MySQL, MySQL Workbench, Eclipse, PL/SQL, SQL Connector, Subversion.

Confidential

Java/J2EE Developer

Responsibilities:

  • Worked with Java, J2EE, Struts, web services, and Hibernate in a fast-paced development environment.
  • Followed Agile methodology, interacted directly with the client on features, implemented optimal solutions, and tailored the application to customer needs.
  • Involved in the design and implementation of the web tier using Servlets and JSP.
  • Used Apache POI for reading Excel files.
  • Developed the user interface using JSP and JavaScript to view all online trading transactions.
  • Designed and developed Data Access Objects (DAOs) to access the database (a minimal sketch follows this list).
  • Used DAO factory and value object design patterns to organize and integrate the Java objects.
  • Coded JavaServer Pages for the dynamic front-end content that uses Servlets and EJBs.
  • Coded HTML pages using CSS for static content generation with JavaScript for validations.
  • Used JDBC API to connect to the database and carry out database operations.
  • Used JSP and JSTL Tag Libraries for developing User Interface components.
  • Performed code reviews.
  • Performed unit testing, system testing, and integration testing.
  • Involved in building and deploying the application in a Linux environment.
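The following is a minimal sketch of the DAO pattern with plain JDBC described above. The entity, table, and column names are hypothetical; in practice such DAOs would be obtained through the DAO factory mentioned in the bullets.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

import javax.sql.DataSource;

// Hypothetical DAO: looks up traded symbols for an account via plain JDBC.
public class TradeDao {
    private final DataSource dataSource;

    public TradeDao(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    public List<String> findSymbolsByAccount(String accountId) throws SQLException {
        List<String> symbols = new ArrayList<String>();
        Connection conn = null;
        PreparedStatement ps = null;
        ResultSet rs = null;
        try {
            conn = dataSource.getConnection();
            ps = conn.prepareStatement(
                    "SELECT symbol FROM trades WHERE account_id = ?");
            ps.setString(1, accountId);
            rs = ps.executeQuery();
            while (rs.next()) {
                symbols.add(rs.getString("symbol"));
            }
        } finally {
            // Close resources explicitly (pre-Java 7 style, matching the stack used).
            if (rs != null) rs.close();
            if (ps != null) ps.close();
            if (conn != null) conn.close();
        }
        return symbols;
    }
}
```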

Environment: Java, J2EE, JDBC, Struts, SQL, Hibernate, Eclipse, Apache POI, CSS.
