Hadoop Developer Resume
Plano, TX
SUMMARY:
- 7+ years of professional IT experience in the analysis, testing, documentation, deployment, integration, and maintenance of web-based and client/server applications.
- Qualified Hadoop developer with experience in Hadoop, database management system architecture, core Java, and testing and implementing Big Data solutions.
- Good experience in developing and implementing big data solutions and data mining applications on Hadoop using HDFS, MapReduce, HBase, Pig, Hive, Sqoop, Flume, Kafka, Storm, Spark, Oozie, and ZooKeeper.
- Strong experience in analyzing data using HiveQL, Pig scripts, and custom MapReduce programs in Java (see the sketch at the end of this summary).
- Experience in developing MapReduce programs using Apache Hadoop to analyze big data as per requirements.
- Experience in importing and exporting data between relational database systems and HDFS using Sqoop, and in ingesting log data into HDFS using Flume.
- Good understanding of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and the MapReduce programming paradigm.
- Knowledge of the architecture and functionality of NoSQL databases such as HBase, Cassandra, and MongoDB.
- Proficient in processing data using Apache Tez jobs.
- Expertise in real-time data ingestion into HBase and Hive using Storm.
- Expertise in setting up automated monitoring and escalation infrastructure for Hadoop Cluster using Ganglia and Nagios.
- Good experience in loading unstructured data into HDFS using Flume/Kafka.
- Excellent experience in working with compression codecs such as Snappy and Gzip.
- Expertise in managing and reviewing Hadoop Log files.
- Hands on experience in in-memory data processing with Apache Spark.
- Hands on experience in data cleaning, transformation and pushing data as delimited files into HDFS using Informatica Developer.
- Worked on ETL tools like Talend to extract, transform and load data according to the requirement.
- Extensively used ETL methodologies to support data extraction, transformation, and loading on Hadoop.
- Implemented the ELK (Elasticsearch, Logstash, Kibana) stack to collect and analyze the logs produced by the Storm cluster.
- Implemented Spark using Scala and Spark SQL for faster testing and processing of data.
- Excellent communication, interpersonal, and problem-solving skills; a strong team player with a positive attitude.
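The MapReduce-in-Java experience cited above can be pictured with a minimal, generic sketch: a mapper/reducer pair that counts records per key in tab-delimited input, plus the driver that wires them together. The class names, field layout, and paths are hypothetical and not taken from any project listed below.

    // Minimal MapReduce sketch: count records per key in tab-delimited input.
    // All names (RecordCount, field positions, paths) are illustrative only.
    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class RecordCount {

        // Mapper: emit (first field, 1) for every well-formed line.
        public static class CountMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
            private static final LongWritable ONE = new LongWritable(1);
            private final Text outKey = new Text();

            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                String[] fields = value.toString().split("\t");
                if (fields.length > 0 && !fields[0].isEmpty()) {
                    outKey.set(fields[0]);
                    context.write(outKey, ONE);
                }
            }
        }

        // Reducer: sum the counts emitted for each key.
        public static class CountReducer extends Reducer<Text, LongWritable, Text, LongWritable> {
            @Override
            protected void reduce(Text key, Iterable<LongWritable> values, Context context)
                    throws IOException, InterruptedException {
                long sum = 0;
                for (LongWritable v : values) {
                    sum += v.get();
                }
                context.write(key, new LongWritable(sum));
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "record count");
            job.setJarByClass(RecordCount.class);
            job.setMapperClass(CountMapper.class);
            job.setReducerClass(CountReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(LongWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }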
TECHNICAL SKILLS:
Hadoop/Big Data: Hadoop 1.x/2.x (YARN), HDFS, MapReduce, Spark, Hive, ZooKeeper, Oozie, Tez, Pig, Sqoop, Flume, Kafka, Storm, Ganglia, Nagios.
Development Tools: Eclipse, IBM DB2 Command Editor, SQL Developer, Microsoft Suite (Word, Excel, PowerPoint, Access).
Programming/Scripting Languages: Java, SQL, Unix Shell Scripting, Python.
Databases: Oracle 11g/10g/9i, MySQL, PL/SQL, SQL Server 2005/2008, DB2
NoSQL Databases: HBase, Cassandra, MongoDB
ETL: Informatica, Talend
Web Tools: HTML, JavaScript, XML, XSL, DOM
Methodologies: Agile/ Scrum, Waterfall
Operating Systems: Windows 98/2000/XP/Vista/7/8/10, Mac OS, Unix, Linux, and Solaris.
Monitoring & Reporting Tools: Ganglia, Nagios, Custom shell reports
PROFESSIONAL EXPERIENCE:
Confidential, Plano, TX
Hadoop Developer
Responsibilities:
- Involved in the full project life cycle, from design, analysis, and logical/physical architecture modeling through development, implementation, and testing.
- Developing MapReduce programs to parse the raw data and store the refined data in tables.
- Ingesting, analyzing, and processing data and storing the results in HDFS, Hive, and HBase, using Sqoop for data transfer.
- Responsible for managing data from various sources and their metadata using Hive.
- Working with Hive partitioning and bucketing of data to improve query performance across different kinds of data sources.
- Involved in extracting data from various data sources into HDFS. Used Sqoop to efficiently transfer data between RDBMS and HDFS, and Flume to stream log data from servers.
- Worked extensively in creating MapReduce jobs to power data for search and aggregation.
- Altered existing Scala programs to enhance performance and obtain partitioned results.
- Responsible for analyzing and cleansing raw data by performing Hive queries and running Pig scripts on data.
- Developing Spark code in Scala and Spark SQL for faster testing and processing of data (see the sketch below).
- Exporting analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
- Involved in loading data into the Cassandra NoSQL database.
- Working with Oozie to automate the flow and coordination of jobs in the cluster.
Environment: Hadoop 0.20.2, Hive, HBase, Apache Sqoop, Scala, Pig, Spark, Oozie, Cassandra, Cloudera Manager.
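As a rough illustration of the Spark and Hive partitioning work described in this role, the sketch below reads a delimited feed, derives a partition column, and writes a partitioned Hive table. It uses Spark's Java API so that all sketches in this document stay in one language (the project itself used Scala), and the paths, table name, and column names are assumptions.

    // Hedged sketch: read delimited input, derive a partition column, and write
    // a partitioned Hive table with Spark SQL. Paths, table, and column names
    // are hypothetical.
    import static org.apache.spark.sql.functions.col;
    import static org.apache.spark.sql.functions.to_date;

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SaveMode;
    import org.apache.spark.sql.SparkSession;

    public class PartitionedLoad {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                    .appName("partitioned-load")
                    .enableHiveSupport()                 // required to write Hive tables
                    .getOrCreate();

            // Hypothetical pipe-delimited source landed in HDFS.
            Dataset<Row> raw = spark.read()
                    .option("header", "true")
                    .option("sep", "|")
                    .csv("hdfs:///data/landing/events/");

            // Basic cleansing plus a partition column derived from the event timestamp.
            Dataset<Row> refined = raw
                    .filter(col("event_id").isNotNull())
                    .withColumn("event_date", to_date(col("event_ts")));

            // Partitioned write into the Hive metastore; bucketing could be added
            // with bucketBy() when saving as a managed table.
            refined.write()
                    .mode(SaveMode.Append)
                    .partitionBy("event_date")
                    .saveAsTable("analytics.events_refined");

            spark.stop();
        }
    }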
Confidential, Jersey City, NJ
Hadoop Developer
Responsibilities:
- Involved in complete Implementation lifecycle, specialized in writing custom MapReduce, Pig and Hive programs.
- Worked on a large-scale Hadoop cluster for distributed data processing and analysis using Sqoop, Hive, Pig, and MapReduce.
- Imported data to HDFS from different databases and exported the processed data to Hive, HBase, and RDBMS using Sqoop.
- Performed data analysis in Hive by creating tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Optimized MapReduce algorithms using combiners and partitioners to ensure best results (see the sketch below).
- Used Pig as an ETL tool to do transformations, event joins, and some pre-aggregations before storing the data onto HDFS.
- Loading data into the NoSQL database HBase using Pig.
- Developed a robust data-pipeline to cleanse, filter, aggregate, normalize, and de-normalize the data using Apache Pig and Spark.
- Performed various optimizations such as using the distributed cache for small datasets, partitioning and bucketing in Hive, and map-side joins.
- Coordinated the cluster services using ZooKeeper.
- Developing workflows in Oozie to automate the tasks of loading data into HDFS.
- Actively participated in the collection, analysis, and design of requirements to meet the client's criteria.
- Maintained system integrity of all subcomponents (primarily HDFS, MapReduce, HBase, and Flume).
- Documented all requirements, code, and implementation methodologies for review and analysis purposes.
Environment: HDFS, Hive, Pig, Sqoop, Spark, ZooKeeper, Oozie
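The combiner and partitioner tuning mentioned in this role can be sketched as a custom Partitioner that routes composite keys of the form region|id by their prefix, so all records for one region reach the same reducer; the job settings that register it alongside a combiner are shown as comments. The key format and class names are hypothetical.

    // Hypothetical custom partitioner: route keys of the form "region|id" by
    // region so every record for a region lands on the same reducer.
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Partitioner;

    public class RegionPartitioner extends Partitioner<Text, LongWritable> {
        @Override
        public int getPartition(Text key, LongWritable value, int numPartitions) {
            String region = key.toString().split("\\|", 2)[0];
            // Mask the sign bit so the partition index is always non-negative.
            return (region.hashCode() & Integer.MAX_VALUE) % numPartitions;
        }
    }

    // Registered on the job together with a combiner (reusing the reducer class
    // is safe here only because summation is associative and commutative):
    //
    //     job.setPartitionerClass(RegionPartitioner.class);
    //     job.setCombinerClass(SumReducer.class);   // runs map-side, cuts shuffle volume
    //     job.setReducerClass(SumReducer.class);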
Confidential, Phoenix, AZ
Hadoop Developer
Responsibilities:
- Developed simple and complex MapReduce programs in Java for Data Analysis on different data formats.
- Developed MapReduce programs that filter out bad and unnecessary claim records and find unique records based on account type (see the sketch below).
- Analyzed data by performing Hive queries (HiveQL) and running Pig scripts, Spark SQL, Splunk, and Spark Streaming.
- Used Sqoop to import data into HDFS from MySQL database and vice-versa.
- Handled importing of data from various data sources and performed transformations using Hive, MapReduce, and HBase.
- Extensive experience in writing Pig scripts to transform raw data from several big data sources into a baseline data set.
- Configured Flume to extract data from the web server output files and load it into HDFS.
- Worked on the RDBMS system using PL/SQL to create packages, procedures, functions, triggers as per the business requirements.
- Involved in creating Hive tables, loading the data, and writing Hive queries that run internally as MapReduce jobs.
- Responsible for importing and exporting data between HDFS and Oracle Database using Sqoop.
- Extensively worked with partitioned and bucketed tables in Hive and designed both managed and external tables.
- Created and worked with Sqoop jobs with full-refresh and incremental loads to populate Hive external tables.
- Designing and creating Oozie workflows to schedule and manage Hadoop, Hive, Pig, and Sqoop jobs.
Environment: Hadoop, MapReduce, Pig, Hive, Spark, Splunk, HBase, HDFS, MySQL, Sqoop, Flume, Oozie.
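A rough sketch of the claim filtering and deduplication pattern described in this role: the mapper drops malformed or cancelled claim records and keys the rest by account type and claim id, and the reducer keeps a single record per key. The field layout, status code, and class names are assumptions, not details of the actual application.

    // Hedged sketch: drop bad claim records, then keep one record per
    // (account type, claim id). All names and field positions are illustrative.
    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class ClaimDedup {

        // Mapper: skip bad records, key the rest by accountType|claimId.
        public static class FilterMapper extends Mapper<LongWritable, Text, Text, Text> {
            private final Text outKey = new Text();

            @Override
            protected void map(LongWritable offset, Text line, Context context)
                    throws IOException, InterruptedException {
                String[] f = line.toString().split(",");
                // Assumed layout: claimId, accountType, status, ...
                if (f.length < 3 || f[0].isEmpty() || "CANCELLED".equals(f[2])) {
                    return;                               // drop bad/unneeded records
                }
                outKey.set(f[1] + "|" + f[0]);
                context.write(outKey, line);
            }
        }

        // Reducer: emit only the first record seen for each key (deduplication).
        public static class DedupReducer extends Reducer<Text, Text, Text, NullWritable> {
            @Override
            protected void reduce(Text key, Iterable<Text> records, Context context)
                    throws IOException, InterruptedException {
                for (Text record : records) {
                    context.write(record, NullWritable.get());
                    break;
                }
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "claim dedup");
            job.setJarByClass(ClaimDedup.class);
            job.setMapperClass(FilterMapper.class);
            job.setReducerClass(DedupReducer.class);
            job.setMapOutputKeyClass(Text.class);
            job.setMapOutputValueClass(Text.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(NullWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }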
Confidential, Nashville, TN
ETL Developer
Responsibilities:
- Understood the design requirements.
- Analyzed business process workflows and assisted in the development of ETL procedures for moving data from source to target systems.
- Extensively used ETL to data transfer from different sources like flat files, .csv, XML, VSAM and load the data into the target staging database.
- Designed and implemented appropriate ETL mappings to extract and transform data from various sources to meet requirements.
- Extensively used Informatica transformations such as Source Qualifier, Rank, SQL, Router, Filter, Lookup, Joiner, Aggregator, Normalizer, and Sorter, along with their transformation properties.
- Created sessions, workflows, and post-session email tasks, and performed various workflow monitoring and scheduling tasks.
- Used Informatica Designer to create reusable transformations to be used in Informatica mappings and mapplets.
- Developed slowly changing dimensions according to the data mart schemas.
- Involved in identifying the sources for various dimensions and facts for different data marts according to star schema design pattern.
- Involved in Fine-tuning of sources, targets, mappings and sessions for Performance Optimization.
- Monitored scheduled, running, completed, and failed sessions using the Workflow Monitor, and debugged mappings for failed sessions.
Environment: Informatica PowerCenter 8.5/8.6.1, Oracle 10g, Windows.
Confidential, Atlanta, GA
Java Developer
Responsibilities:
- Involved in various phases of Software Development Life Cycle (SDLC) such as requirements gathering, analysis, design and development.
- Designed and developed the server-side application based on J2EE architecture using the Spring MVC framework (see the sketch below).
- Involved in analysis, design and developing front end/UI using JSP, HTML, DHTML and JavaScript.
- Prepared workflow diagrams using MS Visio and modeled the methods based on OOP methodology.
- Migrated data from flat files, CSV, MS Access, Excel, and OLE DB sources to a SQL database.
- Accountable for guiding projects on the design and execution of data quality initiatives and other data performance measures under established data quality programs.
- Developed the host modules using C++, DB2, and SQL.
- Responsible for creating the front-end code and Java code to suit the business requirements.
- Installed, configured, and administered WebLogic Application Server and deployed JSP, Servlet, and EJB applications.
- Wrote Maven scripts for builds, unit testing, deployment, Checkstyle checks, etc.
Environment: Java, J2EE, JDK, JSP, Eclipse, Maven, HTML, Servlets, SQL, DB2.
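As a minimal illustration of the Spring MVC server-side work described in this role, the sketch below shows a controller that populates a model through a service layer and resolves a JSP view. The URL mapping, view name, and AccountService interface are hypothetical.

    // Hedged sketch of a Spring MVC controller; all names are illustrative.
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.stereotype.Controller;
    import org.springframework.ui.Model;
    import org.springframework.web.bind.annotation.PathVariable;
    import org.springframework.web.bind.annotation.RequestMapping;
    import org.springframework.web.bind.annotation.RequestMethod;

    @Controller
    @RequestMapping("/accounts")
    public class AccountController {

        private final AccountService accountService;   // hypothetical service layer

        @Autowired
        public AccountController(AccountService accountService) {
            this.accountService = accountService;
        }

        // Renders a JSP view (e.g. /WEB-INF/jsp/accountDetail.jsp) with the model.
        @RequestMapping(value = "/{id}", method = RequestMethod.GET)
        public String showAccount(@PathVariable("id") long id, Model model) {
            model.addAttribute("account", accountService.findById(id));
            return "accountDetail";
        }
    }

    // Minimal stand-in for the hypothetical service layer.
    interface AccountService {
        Object findById(long id);
    }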
Confidential
Java Developer
Responsibilities:
- Developed front-end screens using jQuery, JavaScript, Java, and CSS.
- Responsible for developing platform related logic and resource classes, controller classes to access the domain and service classes.
- Designed and developed the server-side application based on J2EE architecture using the Spring MVC framework.
- Involved in development and enhancement of web client. Involved in enhancements and optimization in Business logic.
- Developed web-based user interfaces using the Struts framework.
- Designed the GUI screens using Struts and configured Log4j for application debugging.
- Involved in the development of test cases for the testing phase.
- Performed end-to-end integration testing of online scenarios and unit testing using the JUnit framework (see the sketch below).
Environment: Java, J2EE, JavaScript, JSP, JSF, Oracle, Eclipse, Log4j
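A minimal illustration of the JUnit-style unit testing mentioned in this role. The class under test (PremiumCalculator) and its behaviour are hypothetical and defined inline so the sketch stays self-contained.

    // Hedged JUnit 4 sketch; the calculator and its rules are illustrative only.
    import static org.junit.Assert.assertEquals;

    import org.junit.Before;
    import org.junit.Test;

    public class PremiumCalculatorTest {

        // Hypothetical class under test, defined inline for self-containment.
        static class PremiumCalculator {
            double applyDiscount(double premium, double discountPercent) {
                if (discountPercent < 0 || discountPercent > 100) {
                    throw new IllegalArgumentException("discount out of range");
                }
                return premium * (1 - discountPercent / 100.0);
            }
        }

        private PremiumCalculator calculator;

        @Before
        public void setUp() {
            calculator = new PremiumCalculator();
        }

        @Test
        public void appliesPercentageDiscount() {
            assertEquals(90.0, calculator.applyDiscount(100.0, 10.0), 0.0001);
        }

        @Test(expected = IllegalArgumentException.class)
        public void rejectsNegativeDiscount() {
            calculator.applyDiscount(100.0, -5.0);
        }
    }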