Hadoop Developer Resume
Overland Park, KS
SUMMARY:
- 8+ years of total software development experience with the Hadoop ecosystem, Big Data and data science analytical platforms, Java/J2EE technologies, database management systems and enterprise-level cloud-based computing and applications.
- Around 3 years of experience in the design and implementation of Big Data applications using the Hadoop stack (MapReduce, Hive, Pig, Oozie, Sqoop, Flume, HBase) and NoSQL databases.
- Hands-on experience writing complex MapReduce jobs and Pig scripts and in Hive data modeling.
- Experience creating batch-style distributed computing applications using Apache Spark and Flume.
- Hands-on experience performing analytics using Spark SQL.
- Hands-on experience with, and in-depth understanding of, the Hadoop architecture and its various components.
- Experience and in-depth understanding of analyzing data using HiveQL and Pig.
- Worked extensively with Hive DDL and the Hive Query Language (HQL); developed UDF, UDAF and UDTF functions and used them in Hive queries.
- Good hands-on experience with Pivotal's SQL-on-Hadoop query engine, HAWQ.
- In-depth understanding of NoSQL databases such as HBase.
- Proficient knowledge and hands-on experience writing shell scripts on Linux.
- Working knowledge and experience with Agile and Waterfall methodologies.
- Experience in importing and exporting data using Sqoop from Relational Database Systems to HDFS and vice-versa.
- Have a good understanding of Kafka.
- Experienced with job workflow scheduling and monitoring tools such as Oozie and Cisco Tidal.
- Experience using various Hadoop distributions (Pivotal, Hortonworks, MapR, etc.) to fully implement and leverage new Hadoop features.
- Expertise in Hadoop ecosystem tools including HDFS, YARN, MapReduce, Pig, Hive, Sqoop, Flume, Kafka, Spark, ZooKeeper and Oozie.
- Experienced in requirement analysis, application development, application migration and maintenance using the Software Development Life Cycle (SDLC) and Java/J2EE technologies.
- Experience in client/server and systems software design and development using Java/JDK, JavaBeans and J2EE technologies such as Spring, Struts, Hibernate, Servlets, JSP, JBoss, JavaScript and JDBC, and web technologies such as HTML, CSS, PHP and XML.
- Experienced in backend development using SQL and stored procedures on Oracle 9i, 10g and 11g.
- Worked with various tools and IDEs such as Eclipse, IBM Rational, Visio, Apache Ant, MS Office, PL/SQL Developer and SQL*Plus.
- Expertise in full life-cycle system development, requirement elicitation, and creating use cases, class diagrams and sequence diagrams.
- Conscientious team player and motivated to learn and apply new concepts. Always aspires to exceed client expectations and to effectively collaborate with several cross-functional teams.
- Worked with geographically distributed and culturally diverse teams, in roles that involved interaction with clients and team members.
TECHNICAL SKILLS:
Big Data/Hadoop Technologies: MapReduce, Pig, Hive, Sqoop, Flume, HDFS, Kafka, Oozie, HAWQ
NoSQL Databases: HBase
Real-Time/Stream Processing: Apache Spark, Apache Kafka
Programming Languages: Java, C++, C, SQL, PL/SQL, Python, Scala
Java Technologies: Servlets, JavaBeans, JDBC, JNDI, JTA, JPA, EJB 3.0
Frameworks: JUnit, JTest, LDAP
Databases: Oracle 8i/9i, MySQL, MS SQL Server, PostgreSQL
IDEs & Utilities: Eclipse, NetBeans
Web Dev. Technologies: HTML, XML
Protocols: TCP/IP, HTTP and HTTPS
Operating Systems: Linux, MacOS, Windows 8, Windows 7, Vista, XP, Windows 95/2000 and MS-DOS
PROFESSIONAL EXPERIENCE:
Confidential, Overland Park, KS
Hadoop Developer
Responsibilities:
- Responsible for building scalable distributed data solutions using a Hadoop cluster environment with the Hortonworks distribution.
- Worked with Kafka and REST APIs to collect and load data onto the Hadoop file system, and used Sqoop to load data from relational databases.
- Implemented Talend jobs to load data from Excel sheets and integrated them with Kafka.
- Used Spark Streaming APIs to perform the necessary transformations and actions on the data received from Kafka and persisted the results into a Cassandra database (see the sketch at the end of this list).
- Developed Spark scripts in Scala and Python, writing custom RDD transformations and actions.
- Performed advanced procedures such as text analytics and processing, using the in-memory computing capabilities of Spark with Scala and Python.
- Worked with Python to develop analytical jobs using Spark's lightweight PySpark API.
- Worked with Avro and ORC file formats and compression techniques such as LZO.
- Used Hive to form an abstraction on top of structured data residing in HDFS and implemented partitions, dynamic partitions and buckets on Hive tables.
- Used the Spark API over Hadoop YARN as the execution engine for data analytics with Hive.
- Worked on migrating MapReduce programs into Spark transformations using Scala.
- Designed and developed data integration programs in a Hadoop environment with the NoSQL data store Cassandra for data access and analysis.
- Used the job management scheduler Apache Oozie to execute workflows.
- Used Ambari to monitor node health and job status and to run analytics jobs on the Hadoop clusters.
- Worked on Tableau to build customized interactive reports, worksheets and dashboards.
- Implemented Kerberos for strong authentication to provide data security.
- Involved in performance tuning of Spark jobs by using caching and taking full advantage of the cluster environment.
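A minimal PySpark Streaming sketch of the Kafka-to-Cassandra flow described above (illustrative only: the topic, keyspace, table and host names are hypothetical, and it assumes the Spark 1.6/2.x DStream API with the spark-streaming-kafka integration and the DataStax cassandra-driver package, not the exact production code):

    import json
    from pyspark import SparkContext
    from pyspark.streaming import StreamingContext
    from pyspark.streaming.kafka import KafkaUtils  # spark-streaming-kafka package

    def save_partition(rows):
        """Write one partition of transformed records into Cassandra."""
        from cassandra.cluster import Cluster  # DataStax Python driver
        cluster = Cluster(["cassandra-host"])
        session = cluster.connect("analytics")          # hypothetical keyspace
        insert = session.prepare(
            "INSERT INTO events (event_id, event_type, payload) VALUES (?, ?, ?)")
        for row in rows:
            session.execute(insert, (row["id"], row["type"], json.dumps(row)))
        cluster.shutdown()

    sc = SparkContext(appName="kafka-to-cassandra")
    ssc = StreamingContext(sc, batchDuration=30)

    # Direct (receiver-less) Kafka stream; "events" is a placeholder topic.
    stream = KafkaUtils.createDirectStream(
        ssc, ["events"], {"metadata.broker.list": "kafka-broker:9092"})

    # Parse JSON messages, keep well-formed events, and persist each micro-batch.
    parsed = stream.map(lambda kv: json.loads(kv[1])).filter(lambda e: "id" in e)
    parsed.foreachRDD(lambda rdd: rdd.foreachPartition(save_partition))

    ssc.start()
    ssc.awaitTermination()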
Environment: Hadoop, HDP, Spark, Scala, Python, Kafka, Hive, Sqoop, Ambari, Mesos, Talend, Oozie, Cassandra, Tableau, Jenkins, Hortonworks, Amazon AWS and Red Hat Linux.
Confidential, KS
Hadoop Developer
Responsibilities:
- Built a Python script to extract data from HAWQ tables and generate a ".dat" file for the downstream application.
- Built a generic framework in Python to parse fixed-length raw data, driven by a JSON layout that describes the fixed positions of the fields, and load the data into HAWQ tables (see the sketch at the end of this list).
- Built a generic framework in Python that transforms two or more data sets in HDFS.
- Built generic Python frameworks around Sqoop/HAWQ to load data from SQL Server to HDFS and from HDFS to HAWQ.
- Performed extensive data validation, using HAWQ partitions for efficient data access.
- Built a generic Python framework that allows us to update data in HAWQ tables.
- Coordinated in all testing phases and worked closely with Performance testing team to create a baseline for the new application.
- Created automated workflows in Cisco Tidal that schedule daily data-loading and other transformation jobs.
- Created PostgreSQL functions (stored procedures) to populate the tables on a daily basis.
- Developed functions using PL/Python for various use cases.
- Wrote programs in Scala to support the Play framework and act as code-behind for the frontend application.
- Developed multiple Kafka topics/queues and produced 20 million records using producers.
- Wrote various data types, such as complex JSON, canonical JSON and XML, to Kafka topics.
- Developed the code for data ingestion and acquisition using Spring XD streams to Kafka.
- The data arrives as JSON, is converted into the Avro byte format and is then published to Kafka.
- Prepared technical design documents and production support documents.
- Worked with the SSIS and SSRS tools to aid in decommissioning data from SQL Server to the distributed environment.
- Wrote Python scripts to create automated workflows.
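A minimal sketch of the JSON-layout-driven fixed-width parser mentioned above (the field names, layout file and output path are illustrative, not the original framework; the resulting delimited file can then be loaded into a HAWQ table, for example through gpfdist external tables or a PostgreSQL COPY):

    import json

    def load_layout(path):
        """Layout example: [{"name": "acct_id", "start": 0, "length": 10}, ...]"""
        with open(path) as f:
            return json.load(f)

    def parse_line(line, layout, delimiter="|"):
        """Slice one fixed-width record into fields using the JSON layout."""
        fields = [line[f["start"]:f["start"] + f["length"]].strip() for f in layout]
        return delimiter.join(fields)

    def to_delimited(raw_path, layout_path, out_path):
        layout = load_layout(layout_path)
        with open(raw_path) as src, open(out_path, "w") as dst:
            for line in src:
                dst.write(parse_line(line.rstrip("\n"), layout) + "\n")

    if __name__ == "__main__":
        to_delimited("raw_feed.dat", "layout.json", "parsed_feed.psv")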
Environment: PHD 2.0, HAWQ 1.2, Sqoop 1.4, Python 2.6, SQL, Apache Kafka
Confidential, Philadelphia, PA
Hadoop Developer
Responsibilities:
- Pulled data from the data warehouse using Sqoop and placed it in HDFS.
- Wrote MapReduce jobs to join data from multiple tables and convert it to CSV files (see the sketch at the end of this list).
- Worked with the Play Framework to design the frontend of the application.
- Wrote programs in Scala to support the Play framework and act as code-behind for the frontend application.
- Wrote programs in Java, and at times Scala, to implement intermediate functionality such as event or record counts from HBase.
- Configured multiple remote Akka worker nodes and master nodes from scratch as per the software requirement specifications.
- Wrote Pig scripts to perform ETL transformations on the MapReduce-processed data.
- Involved in review of functional and non-functional requirements.
- Responsible for managing data coming from different sources.
- Wrote shell scripts to pull the necessary fields from huge files generated by MapReduce jobs.
- Converted ORC data from Hive into flat files using MapReduce jobs.
- Created Hive tables and worked on them using HiveQL.
- Supported the existing MapReduce programs running on the cluster.
- Followed agile methodology for the entire project.
- Prepared technical design documents and detailed design documents.
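A minimal Hadoop Streaming sketch of the table-join-to-CSV idea described above (the original jobs were written as Java MapReduce; the table names, field positions and the use of Python streaming here are illustrative assumptions):

    #!/usr/bin/env python
    # mapper.py -- tag each record with its source table, using the input file name
    # that Hadoop Streaming exposes through an environment variable.
    import os
    import sys

    source = os.environ.get("mapreduce_map_input_file",
                            os.environ.get("map_input_file", ""))
    tag = "A" if "orders" in source else "B"   # hypothetical table names

    for line in sys.stdin:
        fields = line.rstrip("\n").split("\t")
        join_key, rest = fields[0], fields[1:]
        print("%s\t%s\t%s" % (join_key, tag, ",".join(rest)))

    #!/usr/bin/env python
    # reducer.py -- join the tagged records per key and emit CSV rows.
    import sys

    def flush(key, a_rows, b_rows):
        for a in a_rows:
            for b in b_rows:
                print("%s,%s,%s" % (key, a, b))

    current_key, a_rows, b_rows = None, [], []
    for line in sys.stdin:
        key, tag, rest = line.rstrip("\n").split("\t", 2)
        if current_key is not None and key != current_key:
            flush(current_key, a_rows, b_rows)
            a_rows, b_rows = [], []
        current_key = key
        (a_rows if tag == "A" else b_rows).append(rest)
    if current_key is not None:
        flush(current_key, a_rows, b_rows)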
Environment: Linux (Ubuntu), Hadoop 1.2.1 (pseudo-distributed mode), HDFS, Hive, Hortonworks, Flume.
Confidential, Los Angeles, CA
Hadoop Developer
Responsibilities:
- Converted the existing relational database model to the Hadoop ecosystem.
- Generated datasets and loaded them into the Hadoop ecosystem.
- Worked with Linux systems and RDBMS database on a regular basis to ingest data using Sqoop.
- Worked with Spark to create structured data from the pool of unstructured data received.
- Managed and reviewed Hadoop and HBase log files.
- Involved in review of functional and non-functional requirements.
- Responsible for managing data coming from various sources.
- Implemented Kafka Java producers, created custom partitions, configured brokers and implemented high-level consumers to build the data platform.
- Loaded CDRs from the relational DB using Sqoop, and data from other sources to the Hadoop cluster using Flume.
- Involved in loading data from the UNIX file system and FTP to HDFS.
- Designed and implemented Hive queries and functions for evaluation, filtering, loading and storing of data.
- Created Hive tables and worked on them using HiveQL.
- Wrote Spark code to convert unstructured data to structured data.
- Developed Hive queries to analyze the output data.
- Developed workflow in Oozie to automate the tasks of loading the data into HDFS and pre-processing with Pig.
- Developed Kafka consumers in Scala for consuming data from Kafka topics (see the sketch at the end of this list).
- Handled cluster coordination services through ZooKeeper.
- Collected log data from web servers and integrated it into HDFS using Flume.
- Used Hive to perform transformations, event joins and some pre-aggregations before storing the data in HDFS.
- Designed and implemented Spark jobs to support distributed data processing.
- Supported the existing MapReduce programs running on the cluster.
- Wrote shell scripts to monitor the health of Hadoop daemon services and respond accordingly to any warning or failure conditions.
- Involved in Hadoop cluster tasks such as adding and removing nodes without any effect on running jobs or data.
- Followed agile methodology for the entire project.
- Installed and configured the Apache Hadoop, Hive and Pig environment.
- Prepared technical design documents and detailed design documents.
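A minimal consumer sketch for the Kafka-topic consumption described above, using the kafka-python package (the original consumers were written in Scala; the topic, broker and group names are placeholders):

    import json
    from kafka import KafkaConsumer

    consumer = KafkaConsumer(
        "cdr-events",                              # hypothetical topic
        bootstrap_servers=["kafka-broker:9092"],
        group_id="cdr-loader",
        auto_offset_reset="earliest",
        value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
    )

    for message in consumer:
        record = message.value
        # Hand each record to the downstream load step (e.g. staging in HDFS).
        print(message.topic, message.partition, message.offset, record.get("id"))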
Environment: Linux (Ubuntu), Hadoop 1.2.1 (pseudo-distributed mode), HDFS, Hive 0.12, Flume, Kafka, Hortonworks, Spark.
Confidential, Atlanta, GA
Hadoop Developer
Responsibilities:
- Utilized Flume to filter the input data so that only the data needed for analytics was retained, by implementing Flume interceptors.
- Used Flume to transport logs to HDFS
- Worked on a Pig script to count the number of times each URL was opened over a given period; comparing the counts across URLs shows the relative popularity of each website among employees (see the sketch at the end of this list).
- Used Hive to pull out additional analytical information.
- Worked with partitioning and bucketing concepts in Hive and designed both managed and external tables for optimized performance.
- Involved in moving all log files generated from various sources to HDFS through Flume for further processing.
- Worked on Hue interface for querying the data.
- Involved in writing MapReduce programs for analytics
- Also used MapReduce to structure the data coming from Flume sinks.
- Managed and scheduled jobs on a Hadoop cluster using Oozie.
- Generated datasets and loaded them into the Hadoop ecosystem.
- Installed, configured and used Hadoop ecosystem components such as MapReduce, HDFS, Pig, Hive, Sqoop, Flume and HBase.
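A minimal Python sketch of the URL-popularity counting logic that the Pig script above implemented (the log path and URL field position are assumptions; the production job ran in Pig over HDFS):

    from collections import Counter

    def url_counts(log_path, url_field=6):
        """Count how often each URL appears in a web/proxy access log."""
        counts = Counter()
        with open(log_path) as log:
            for line in log:
                fields = line.split()
                if len(fields) > url_field:
                    counts[fields[url_field]] += 1
        return counts

    if __name__ == "__main__":
        for url, hits in url_counts("proxy_access.log").most_common(10):
            print("%8d  %s" % (hits, url))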
Environment: Hadoop, Cloudera Manager, MapReduce, Hive, Flume, Pig.
Confidential, Houston, TX
Java Developer
Responsibilities:
- Worked with several clients with day-to-day requests and responsibilities.
- Involved in analyzing system failures, identifying root causes and recommending courses of action.
- Integrated Struts, Hibernate and JBoss Application Server to provide efficient data access.
- Involved in HTML page development using CSS and JavaScript.
- Developed the presentation layer with JSF, JSP and JavaScript technologies.
- Designed table structures and coded scripts to create tables, indexes, views, sequences, synonyms and database triggers. Involved in writing database procedures, triggers and PL/SQL statements for data retrieval.
- Developed the UI components using jQuery and JavaScript.
- Designed the database and coded the PL/SQL stored procedures and triggers required for the project.
- Used the Session and FacesContext JSF objects to pass content from one bean to another.
- Designed and developed Session Beans to implement business logic.
- Tuned SQL statements and the WebSphere application server to improve performance and consequently meet the SLAs.
- Created the EAR and WAR files and deployed the application in different environments.
- Engaged in analyzing requirements, identifying various individual logical components, expressing the system design through UML diagrams using Rational Rose.
- Involved in running shell scripts for regression testing.
- Extensively used HTML and CSS in developing the front-end.
- Designed and Developed JSP pages to store and retrieve information.
Environment: Java, J2EE, JSP, JavaScript, JSF, Spring, XML, XHTML, Oracle 9i, PL/SQL, SOAP web services, WebSphere, JUnit, SVN.
Confidential
Graduate Trainee/Programmer Analyst
Responsibilities:
- Prepared program Specification for the development of PL/SQL procedures and functions.
- Created Custom Staging Tables to handle import data.
- Created custom triggers, stored procedures, packages and functions to populate different databases.
- Developed SQL*Loader scripts to load data into the custom tables.
- Ran batch files to load database tables from flat files using SQL*Loader.
- Created UNIX Shell Scripts for automating the execution process.
- Developed PL/SQL code for updating payment terms.
- Created indexes on tables and optimized stored procedure queries.
- Designed, developed and tested reports using SQL*Plus.
- Modified existing code and developed PL/SQL packages to perform specialized functions/enhancements on the Oracle application.
- Created Indexes and partitioned the tables to improve the performance of the query.
- Involved in preparing documentation and user support documents.
- Involved in preparing test plans, unit testing, system integration testing, implementation and maintenance.
Environment: Oracle 9i/10g, PL/SQL, SQL*Loader, SQL Navigator, SQL*Plus, UNIX, Windows NT, Windows 2000.
