Hadoop Developer Resume
Richardson, TX
SUMMARY
- IT professional with over 7 years of experience implementing a range of technologies across all phases of the software development life cycle, delivering solutions to clients
- Worked extensively as a Hadoop Developer for 3 years, implementing Big Data applications using components such as Pig, Hive, MapReduce, Spark, and Spark SQL
- More than 4 years of experience in Enterprise Application Development (Back end and Front end) in Java/J2EE technologies
- Experience in working in diverse domains like Healthcare, Telecom and Banking
- Experience in working on Cloudera, Hortonworks and MapR Hadoop distributions
- Expertise in developing custom functionality to meet business needs by creating UDFs in Pig and Hive (see the sketch following this summary)
- Strong knowledge of Hadoop architecture and its daemons, including NameNode, DataNode, Secondary NameNode, JobTracker, TaskTracker, and YARN
- Expertise in using Talend integrated with the Cloudera Hadoop distribution for ETL and data lake workloads
- Worked on Spark to improve the performance and optimization of existing Hadoop algorithms using SparkContext, DataFrames, RDDs, and Spark on YARN
- Experience in developing PIG scripts for data flow and transformation activities
- Expertise in creating managed tables, external tables, and views in Hive, and in analyzing large datasets using HiveQL
- Stored data in tabular formats using Hive tables and Hive SerDes
- Working knowledge on using different NoSQL databases such as HBase, MongoDB, Cassandra
- Implemented Sqoop for loading data from RDBMS sources (SQL Server, Oracle) into HDFS
- Imported data into HDFS and the data lake from various streaming systems using Spark Streaming and Flume
- Good knowledge of using Oozie and ZooKeeper for workflow scheduling and monitoring activities on the cluster
- Computed indexed views for data exploration using Apache Solr
- Experience implementing a distributed messaging queue integrated with Cassandra using Apache Kafka and ZooKeeper
- Expert in designing and creating data ingest pipelines using technologies such as Spring Integration and Apache Storm with Kafka
- Working knowledge of Python and shell scripting
- Implemented combiners in MapReduce to increase efficiency and tune job performance
- Expertise in optimizing Hive tables using techniques such as static partitioning, dynamic partitioning, and bucketing
- Expertise in debugging and resolving issues in batches and scripts on a Hadoop cluster
- Worked with SequenceFile, RCFile, and ORC formats for loading data into HDFS, and the Parquet format for Spark SQL
- Hands-on development experience with RDBMS, including writing SQL queries, PL/SQL, views, stored procedures, triggers, and cursors
- Used frameworks and technologies such as Spring, Hibernate, Servlets, JSP, EJB, JMS, JDBC, SOA, web services, SOAP, and REST
- Experience using the Eclipse, NetBeans, IDLE, and Anaconda IDEs for Java and Python programming
- Worked extensively and implemented best practices in different software development lifecycles such as Waterfall and Agile Scrum
- Good knowledge of application design using Unified Modeling Language (UML): sequence diagrams, class diagrams, data flow diagrams, and entity relationship diagrams
- Excellent communication and interpersonal skills, along with strong motivation for optimal project delivery
- Strong problem-solving skills, with the ability to organize and prioritize tasks; a noted team player
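The Hive UDF work noted above generally follows the pattern below; this is a minimal illustrative sketch in Java, and the class name, function name, and normalization rule are hypothetical.

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hypothetical Hive UDF that normalizes phone numbers to digits only.
// Registered in Hive with: CREATE TEMPORARY FUNCTION normalize_phone AS '...';
public final class NormalizePhoneUDF extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null;
        }
        // Strip every non-digit character, e.g. "(972) 555-0101" -> "9725550101"
        return new Text(input.toString().replaceAll("[^0-9]", ""));
    }
}
```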
TECHNICAL SKILLS
Hadoop Ecosystem: Hadoop 1.x/2.x (YARN), HDFS, MapReduce, Hive, Pig, Sqoop, Flume, Kafka, Spark, Storm, ZooKeeper, Oozie, Tez, Impala, Mahout, Solr
Programming Languages: Java, Python, PL/SQL, PIG Latin, HiveQL, Spark SQL, C
Hadoop Distributions: Cloudera, Hortonworks
IDE: Eclipse, NetBeans, IDLE, SQL Developer, Intellij, Anaconda
Relational Databases: SQL Server, MySQL, Oracle 10g/11g
NoSQL Databases: HBase, MongoDB, Cassandra
Web Technologies: HTML5, CSS3, JavaScript, jQuery, AJAX, Servlets, JSP, JSON, XML, JSF
Application Frameworks: Hibernate, Spring, Struts, JMS, EJB, JUnit, MRUnit
Web Services: SOAP, REST, WSDL, JAXB, and JAXP
Application Servers: Tomcat, WebLogic, WebSphere
Scripting Languages: Python, Shell
Visualization Tools: Tableau, R, MS Excel
Methodologies: Waterfall, Agile
Operating Systems: Microsoft Windows, Linux, Unix, Ubuntu
PROFESSIONAL EXPERIENCE
Confidential - Richardson, TX
Hadoop Developer
Environment: Cloudera Distribution Hadoop, MapReduce, Hive, Pig, HBase, Sqoop, Flume, Cassandra, Storm, Solr, Scala, Spark, Oozie, Kafka, Linux, Java (JDK), Tableau, Eclipse, HDFS, MySQL
Responsibilities:
- Responsible for managing data from different sources; involved in HDFS maintenance and loading of structured and unstructured data
- Imported and exported data between RDBMS and Hive using Sqoop
- Partitioned Hive tables with both static and dynamic partitions as needed
- Optimized Hive analytics queries to improve job performance
- Created and ran Sqoop jobs with incremental load to populate Hive external tables
- Developed Pig scripts in areas where extensive hand-written code needed to be reduced
- Loaded and transformed large sets of structured, unstructured, and semi-structured data into HDFS
- Wrote Pig scripts to transform raw data from several data sources into baseline data
- Used Flume to collect, aggregate, and store the log data from different web servers
- Created HBase tables to store data arriving in variable formats from different portfolios
- Developed MapReduce programs to load data from system-generated log files into HBase
- Analyzed the data by performing Hive queries (HiveQL) and running Pig scripts (Pig Latin) to study customer behavior for targeted advertising
- Applied partitioning and bucketing concepts in Hive and designed both managed and external tables to optimize performance
- Migrated MapReduce programs to Spark transformations using Spark and Scala
- Designed the technical solution for real-time analytics on streaming data using Kafka and Spark Streaming (see the sketch following this list)
- Helped the team grow the cluster from 20 to 45 nodes; configuration of the additional data nodes was managed using Puppet
- Solved performance issues in Hive and Pig scripts by understanding joins, grouping, and aggregation and how they translate to MapReduce jobs
- Designed a conceptual model with Spark for performance optimization
- Developed Oozie workflow for scheduling and orchestrating the ETL process
- Developed MapReduce programs to parse the raw data and store the refined data in tables
- Analyzed data with Hive, Pig, and Hadoop Streaming
- Created the Cassandra data model from the existing Oracle data model
- Worked with CQL to execute queries on the data persisted in the Cassandra cluster
- Used the Hive data warehouse tool and developed Hive queries to analyze data migrated to HDFS
- Used Tableau for visualization and report generation
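A minimal Java sketch of a Kafka-to-Spark-Streaming pipeline of the kind referenced above (spark-streaming-kafka-0-10 API); the broker address, topic, and consumer group are hypothetical, and the job simply counts events per batch.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka010.ConsumerStrategies;
import org.apache.spark.streaming.kafka010.KafkaUtils;
import org.apache.spark.streaming.kafka010.LocationStrategies;

public final class ClickStreamJob {
    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf().setAppName("ClickStreamJob");
        JavaStreamingContext ssc = new JavaStreamingContext(conf, Durations.seconds(10));

        Map<String, Object> kafkaParams = new HashMap<>();
        kafkaParams.put("bootstrap.servers", "broker1:9092");            // hypothetical broker
        kafkaParams.put("key.deserializer", StringDeserializer.class);
        kafkaParams.put("value.deserializer", StringDeserializer.class);
        kafkaParams.put("group.id", "clickstream-consumers");            // hypothetical group

        JavaInputDStream<ConsumerRecord<String, String>> stream =
            KafkaUtils.createDirectStream(
                ssc,
                LocationStrategies.PreferConsistent(),
                ConsumerStrategies.<String, String>Subscribe(
                    Collections.singletonList("clickstream"), kafkaParams));

        // Count events per 10-second batch and print a summary to the driver log
        stream.map(ConsumerRecord::value)
              .count()
              .print();

        ssc.start();
        ssc.awaitTermination();
    }
}
```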
Confidential - Dallas, TX
Hadoop Developer
Environment: Cloudera Distribution, Hive, MapReduce, Pig, Impala, HDFS, Kafka, Sqoop, Flume, HBase, Oozie, Tableau, Java, AWS
Responsibilities:
- Worked on a Hadoop environment with MapReduce, Kafka, Sqoop, Oozie, Flume, HBase, Pig, Hive, and Impala on a multi-node cloud deployment
- Configured the Hadoop environment in the cloud through Amazon Web Services (AWS) to provide a scalable, distributed data solution
- Installed Kafka on the Hadoop cluster and used it for streaming and cleansing raw data; extracted useful information using Hive and stored the results in HBase
- Developed Kafka producers that compress and bundle many small files into larger Avro and SequenceFile outputs before writing to HDFS, making the best use of the Hadoop block size
- Implemented MapReduce jobs to parse raw weblogs into delimited records, handling files in various formats such as JSON, XML, and text
- Improved MapReduce job performance by using combiners, partitioning, and the distributed cache (see the sketch following this list)
- Exposure to iterative processing with Spark
- Created partitioned tables in Hive for best performance and faster querying
- Used Sqoop scripts to import data from various database sources into HBase, incrementally loading customer transaction data by date
- Used Flume to move log files generated by various sources into Amazon S3 for data processing
- Performed extensive data analysis using Hive and Pig
- Produced simple as well as complex results using Hive, improving performance and reducing query time by creating partitioned tables
- Created Oozie workflows to automate loading data into Amazon S3 and preprocessing it with Pig; also used Oozie for data scrubbing and processing
- Developed and deployed scripts to preprocess the data before moving it to HDFS
- Worked on a proof of concept with Impala
- Used Synergy for version control and ClearQuest for logging defects and tasks
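A minimal Java sketch of the combiner technique noted above: a word-count-style MapReduce job over weblogs in which the reducer also serves as the combiner, so partial sums are aggregated on the map side before the shuffle. The job name, input layout, and field positions are hypothetical.

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Hypothetical job that counts weblog hits per URL.
public final class HitCountJob {

    public static class HitMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text url = new Text();

        @Override
        protected void map(LongWritable key, Text line, Context ctx)
                throws IOException, InterruptedException {
            String[] fields = line.toString().split("\t");   // assumes tab-delimited logs
            if (fields.length > 0) {
                url.set(fields[0]);
                ctx.write(url, ONE);
            }
        }
    }

    public static class HitReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> counts, Context ctx)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable c : counts) {
                sum += c.get();
            }
            ctx.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "hit-count");
        job.setJarByClass(HitCountJob.class);
        job.setMapperClass(HitMapper.class);
        job.setCombinerClass(HitReducer.class);   // combiner cuts shuffle volume
        job.setReducerClass(HitReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```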
Confidential - Jacksonville, FL
Hadoop Developer
Environment: Hortonworks Distribution Hadoop, HDFS, MapReduce, Hive, Flume, HBase, Sqoop, Pig, Java (JDK 1.6), Eclipse, MySQL, Ubuntu, ZooKeeper, Oozie, Apache Kafka, Apache Storm
Responsibilities:
- Developed multiple MapReduce jobs in Java for data cleaning and pre-processing
- Developed simple to complex MapReduce jobs using Hive and Pig
- Involved in creating Hive tables, loading data, and analyzing it using Hive queries
- Involved in running Hadoop jobs for processing millions of records of text data
- Responsible for managing data from multiple sources
- Implemented business logic using Pig scripts
- Assisted in exporting analyzed data to relational databases using Sqoop
- Involved in loading data from UNIX file system to HDFS
- Created HBase tables to store different data formats (see the sketch following this list)
- Managed and reviewed Hadoop log files
- Exported the analyzed data to relational databases using Sqoop for visualization and report generation by the BI team
- Analyzed large datasets to determine the optimal way to aggregate and report on them
- Supported setting up the QA environment and updating configurations for implementing scripts with Pig and Sqoop
- Established connections to ingest data into and out of HDFS
- Monitored jobs on Informatica monitoring tool
- Fetched data from Oracle and wrote it into HDFS
- Used Hive connections to analyze data from Oracle
- Extensive knowledge of debugging MapReduce programs and Hive UDFs using Eclipse
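A minimal Java sketch of writing a record into an HBase table like those mentioned above; it uses the newer HBase client API, and the table name, column family, and row key scheme are hypothetical.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

// Hypothetical example of writing a parsed log record into an HBase table.
// Table "weblogs" with column family "d" is assumed to exist.
public final class HBaseWriter {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("weblogs"))) {

            // Row key combines user id and timestamp so scans by user stay contiguous
            Put put = new Put(Bytes.toBytes("user123_20150601T120000"));
            put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("url"), Bytes.toBytes("/home"));
            put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("status"), Bytes.toBytes("200"));
            table.put(put);
        }
    }
}
```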
Confidential - St. Louis, MO
Java Developer
Environment: Java/J2EE, JSP, CSS, HTML, PHP, JavaScript, AJAX, Hibernate, Spring 2.5, XML, Web Services, Oracle 9i
Responsibilities:
- Involved in software development and production support for web-based front-end applications
- Involved in developing CSV files for the data load process
- Implemented the Data Access Object (DAO) adapter pattern so the business layer communicates with the database through the HibernateTemplate class (see the sketch following this list)
- Responsible for database design and for writing back-end procedures using SQL and PL/SQL in the Oracle database
- Used WSDL and SOAP to implement web services, optimizing performance for remote model applications
- Developed the service layer forming the business logic of the MVC-based Spring architecture
- Responsible for all aspects of CMS development and administration using OpenText
- Updating and maintaining all CMS content
- CMS development, code customization, and administration of all Web Servers
- Involved in configuration and deployment of front-end application on RAD
- Involved in developing JSPs for the graphical user interface
- Developed the UI using JSP, PHP, HTML and JavaScript
- Implemented code for validating the input fields and displaying the error messages
- Performed unit testing using JUnit test cases
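A minimal sketch of the DAO-over-HibernateTemplate pattern referenced above (Spring 2.5 / Hibernate 3 era). The entity, query, and class names are hypothetical and shown only for illustration.

```java
import java.util.List;
import org.springframework.orm.hibernate3.support.HibernateDaoSupport;

// Hypothetical DAO built on Spring's HibernateTemplate; the Account entity's
// Hibernate mapping is assumed to be configured elsewhere.
public class AccountDao extends HibernateDaoSupport {

    public void saveAccount(Account account) {
        // HibernateTemplate manages the session and translates exceptions
        getHibernateTemplate().saveOrUpdate(account);
    }

    @SuppressWarnings("unchecked")
    public List<Account> findByOwner(String ownerId) {
        // Positional HQL parameter; the template binds it and runs the query
        return (List<Account>) getHibernateTemplate()
                .find("from Account a where a.ownerId = ?", ownerId);
    }

    // Hypothetical mapped entity, included only so the sketch is self-contained
    public static class Account {
        private Long id;
        private String ownerId;
        // getters/setters omitted for brevity
    }
}
```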
Confidential - Minneapolis, MN
SQL/Java Developer
Environment: Java/J2EE, JSP, CSS, JavaScript, AJAX, Hibernate, Spring 3.0, XML, Web Services, SOAP, RESTful, Maven, Rational Rose, HTML, Log4j, JBoss 4
Responsibilities:
- Analysis and understanding of business requirements
- Developed views and controllers for the client and manager modules using Spring MVC 3.0 and Spring Core 3.0 (see the sketch following this list)
- Implemented business logic using Spring Core 3.0 and Hibernate
- Performed data operations using Spring ORM wired with Hibernate; implemented HibernateTemplate and the Criteria API for querying the database
- Developed an exception handling framework and used Log4j for logging
- Developed Web Services using XML messages that use SOAP
- Developed Web Services for Payment Transaction and Payment Release
- Developed RESTful web services
- Created WSDL and the SOAP envelope
- Developed and modified database objects as per the requirements
- Involved in unit and integration testing, bug fixing, acceptance testing with test cases, and code reviews
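A minimal sketch of a Spring MVC 3.0 controller along the lines described above; the request mapping, service contract, and view name are illustrative placeholders.

```java
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Controller;
import org.springframework.ui.Model;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestMethod;

// Hypothetical controller for the client module.
@Controller
@RequestMapping("/clients")
public class ClientController {

    // Placeholder service contract, assumed to be implemented as a Spring bean
    public interface ClientService {
        Object findById(long id);
    }

    private final ClientService clientService;

    @Autowired
    public ClientController(ClientService clientService) {
        this.clientService = clientService;
    }

    @RequestMapping(value = "/{id}", method = RequestMethod.GET)
    public String viewClient(@PathVariable("id") long id, Model model) {
        model.addAttribute("client", clientService.findById(id));
        return "clientDetail";   // logical view name resolved by the configured ViewResolver
    }
}
```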
Confidential
Java Developer
Environment: Java 6.0, J2EE, Eclipse IDE, JSP 2.0, JDBC 3.0, Servlets, JavaScript, Spring, Struts, Ajax, HTML, jQuery, ClearCase, ClearQuest, Windows XP
Responsibilities:
- Worked on requirement analysis and design, interacting with the business teams
- Implemented a database-driven left navigation tree menu for the admin module using the Ajax4jsf framework
- Developed a validation framework to show custom validation on JSF screens
- Participated in the entire SDLC of the project
- Developed UI screens by using HTML, JSPs, CSS, jQuery, Ajax
- Wrote extensive core Java within the application
- Developed the business layer using Spring, Hibernate, and DAOs
- Used JavaScript and jQuery for validation and JDBC for all database interactions
- Used Code Collaborator for code review
- Created server-side Java architecture using Java Servlets (see the sketch following this list)
- Developed and deployed EJBs, Servlets, and JSPs on WebLogic Server
- Used MySQL as a database product
- Used Eclipse as the IDE for the development
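A minimal sketch of the Servlet-plus-JDBC pattern used in this role; the JDBC URL, credentials, table, and column names are placeholders, and a MySQL driver is assumed to be on the classpath.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Hypothetical servlet that looks up a customer name by id over JDBC.
public class CustomerServlet extends HttpServlet {

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        resp.setContentType("text/plain");
        String id = req.getParameter("id");

        String url = "jdbc:mysql://localhost:3306/appdb";   // placeholder connection details
        try (Connection conn = DriverManager.getConnection(url, "appuser", "secret");
             PreparedStatement ps = conn.prepareStatement(
                     "SELECT name FROM customers WHERE id = ?")) {
            ps.setString(1, id);
            try (ResultSet rs = ps.executeQuery();
                 PrintWriter out = resp.getWriter()) {
                out.println(rs.next() ? rs.getString("name") : "not found");
            }
        } catch (SQLException e) {
            throw new ServletException(e);
        }
    }
}
```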