Hadoop/Big Data Developer Resume
SUMMARY:
- Around 8 years of IT experience as a developer, designer, and quality reviewer, with cross-platform integration experience using Hadoop and Python.
- Good understanding of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, and DataNode.
- Strong understanding of Hadoop daemons and MapReduce concepts.
- Hands-on experience in installing, configuring, and using Hadoop ecosystem components such as MapReduce, Hive, Pig, Sqoop, Flume, and Oozie.
- Good knowledge of loading data from Oracle and MySQL databases into HDFS using Sqoop (structured data) and Flume (log files and XML).
- Extensive experience in developing Pig Latin scripts and using Hive Query Language (HiveQL) for data analytics.
- Experienced in writing custom Hive UDFs to incorporate business logic into Hive queries.
- Knowledge of analyzing data interactively using Apache Spark and Apache Zeppelin.
- Good knowledge of Apache Storm and Kafka pipelines.
- Good experience in optimizing MapReduce jobs using mappers, reducers, combiners, and partitioners to deliver the best results on large datasets (see the sketch after this summary).
- Extensive experience in working with application servers like WebSphere, WebLogic and Tomcat.
- Strong understanding of NoSQL databases like Cassandra, HBase, and MongoDB.
- Good knowledge of job/workflow scheduling and monitoring tools like Oozie and ZooKeeper.
- Extensive experience in designing, developing, and supporting Model-View-Controller (MVC) applications using the Struts and Spring frameworks.
- Hands-on experience in application development using core Scala, RDBMS, and Linux shell scripting; developed UNIX shell scripts to automate various processes.
- Proficiency in using BI tools like Tableau/Pentaho.
- Experience in understanding Hadoop security requirements and integrating with the Kerberos Key Distribution Center (KDC).
- Extensive experience with RDBMS applications including Oracle, MS Access, and SQL Server.
- Detailed understanding of the Software Development Life Cycle (SDLC) and sound knowledge of project implementation methodologies including Waterfall and Agile (Scrum).
- Well experienced in testing large, complex databases and in reporting and ETL tools like Informatica and DataStage.
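A minimal, illustrative sketch of the MapReduce tuning pattern referenced above (a mapper, a reducer reused as a combiner, and a custom partitioner); the job, class, and path names are hypothetical and not taken from any of the projects below:

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Partitioner;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class EventCount {

  // Mapper: emits (token, 1) for every token in the input line.
  public static class TokenMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
      for (String token : value.toString().split("\\s+")) {
        word.set(token);
        context.write(word, ONE);
      }
    }
  }

  // Sums counts per key; reused as a combiner to shrink the shuffle.
  public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable v : values) {
        sum += v.get();
      }
      context.write(key, new IntWritable(sum));
    }
  }

  // Custom partitioner: hash-spreads keys so no single reducer is overloaded.
  public static class SpreadPartitioner extends Partitioner<Text, IntWritable> {
    @Override
    public int getPartition(Text key, IntWritable value, int numPartitions) {
      return (key.toString().hashCode() & Integer.MAX_VALUE) % numPartitions;
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "event-count");
    job.setJarByClass(EventCount.class);
    job.setMapperClass(TokenMapper.class);
    job.setCombinerClass(SumReducer.class);           // local aggregation before the shuffle
    job.setReducerClass(SumReducer.class);
    job.setPartitionerClass(SpreadPartitioner.class);
    job.setNumReduceTasks(8);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

Reusing the reducer as a combiner and spreading keys with the partitioner are the two knobs that usually matter most on large datasets, since they cut shuffle volume and avoid reducer skew.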
TECHNICAL SKILLS:
Hadoop/Big Data: HDFS, MapReduce, YARN, Pig, Hive, HBase, Sqoop, Solr, Flume, Oozie, ZooKeeper, Kafka.
NoSQL Databases: HBase, MongoDB, Cassandra
Languages: C, Python, Pig Latin, Scala, HiveQL, Perl, Unix shell scripts
Frameworks: Struts, Spring, Spring XD, Hibernate
Operating Systems: Ubuntu Linux, Windows XP/Vista/7/10, macOS
Web Technologies: HTML, DHTML, XML, AJAX, WSDL, SOAP
Web/Application servers: Apache Tomcat, WebLogic, WebSphere
Databases: Oracle, MySQL, PL/SQL, PostgreSQL
Tools and IDEs: Eclipse, Anaconda, Spyder
Network Protocols: TCP/IP, UDP, HTTP, DNS, DHCP
Development Methodologies: Agile, Scrum, Waterfall
Highest Qualification: Bachelors
University: Vellore Institute of Technology
PROFESSIONAL EXPERIENCE:
Confidential
Hadoop/Big Data Developer
Responsibilities:
- Worked on 100 TB of unstructured and semi-structured data; with a replication factor of 3, the total stored size was 300 TB.
- Collected and aggregated large amounts of log data using Apache Flume and staged it in HDFS for further analysis.
- Used Pig as an ETL tool for transforming, filtering, joining events, and performing aggregations.
- Wrote UDFs and UDAFs for Hive (see the sketch after this list).
- Populated HDFS and Cassandra with large amounts of data using Apache Kafka.
- Worked on Spark stream processing to bring data into memory and implemented RDD transformations and actions to process it in units (see the Spark sketch after this list).
- Developed scripts and batch jobs to schedule various Hadoop programs.
- Developed Hive queries for creating foundation tables from staged data.
- Used DML statements to perform different operations on Hive tables.
- Developed job flows to automate the workflow for Pig and Hive jobs.
- Worked with the Apache Crunch library to write, test, and run Hadoop MapReduce pipeline jobs.
- Analyzed the data by performing Hive queries and running Pig scripts to study customer behavior.
- Provided cluster coordination services through ZooKeeper.
- Created Hive tables with dynamic partitions and buckets for sampling, and worked on them using HiveQL.
- Extracted data from Teradata into HDFS using Sqoop.
- Adjusted the minimum share of maps and reducers for all the queues.
- Used Tableau for visualizing the data reported from Hive tables.
- Worked with SequenceFile, RCFile, Avro, and HAR file formats.
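A minimal sketch of a custom Hive UDF of the kind described above; the class name and the normalization rule are illustrative assumptions, not the production logic:

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hypothetical example: normalizes a customer id before it is joined in HiveQL.
public final class NormalizeId extends UDF {
  public Text evaluate(Text input) {
    if (input == null) {
      return null;
    }
    // Illustrative business rule: trim, upper-case, and strip a legacy "CUST-" prefix.
    String cleaned = input.toString().trim().toUpperCase();
    if (cleaned.startsWith("CUST-")) {
      cleaned = cleaned.substring("CUST-".length());
    }
    return new Text(cleaned);
  }
}
```

Packaged into a jar, a function like this is registered with ADD JAR and CREATE TEMPORARY FUNCTION and then called inline from HiveQL; a UDAF follows the same pattern but implements the aggregation interfaces instead.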
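And a hedged sketch of the RDD transformation/action pattern mentioned above, written against the Spark Java API; the HDFS paths and the field layout of the input lines are assumptions:

```java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

import scala.Tuple2;

public class EventCountsBySeverity {
  public static void main(String[] args) {
    SparkConf conf = new SparkConf().setAppName("event-counts");
    JavaSparkContext sc = new JavaSparkContext(conf);

    // Load staged events from HDFS (path is illustrative) and keep them in memory.
    JavaRDD<String> events = sc.textFile("hdfs:///staging/events/*.log").cache();

    // Transformations: drop malformed rows, key each event by its severity field.
    JavaPairRDD<String, Integer> bySeverity = events
        .filter(line -> line.split(",").length > 2)
        .mapToPair(line -> new Tuple2<>(line.split(",")[2], 1))
        .reduceByKey(Integer::sum);

    // Action: materialize the counts and write them back to HDFS.
    bySeverity.saveAsTextFile("hdfs:///analytics/event_counts_by_severity");
    sc.stop();
  }
}
```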
Environment: Hadoop, HDFS, Apache Crunch, MapReduce, Hive, Flume, Sqoop, ZooKeeper, Kafka, Storm, Cassandra, Spark, Puppet, Linux.
Confidential
Hadoop Developer
Responsibilities:
- Worked on Kafka-Storm on the HDP 2.2 platform for real-time analysis (see the sketch after this list).
- Created a PoC to store server log data in MongoDB to identify system alert metrics.
- Implemented a Hadoop framework to capture user navigation across the application to validate the user interface and provide analytic feedback/results to the UI team.
- Developed MapReduce jobs using the Java API and Pig Latin.
- Loaded dynamically generated files into the cluster using Flume and exported data from the cluster to relational database management systems using Sqoop.
- Loaded the data from Teradata to HDFS using Teradata Hadoop connectors.
- Worked on Oozie to define and schedule Apache Hadoop jobs as directed acyclic graphs (DAGs) of actions with control flows.
- Involved in creating Hive tables, working on them using HiveQL, and performing data analysis using Hive and Pig.
- Responsible for managing data from multiple sources.
- Wrote Pig scripts to run ETL jobs on the data in HDFS for future testing.
- Used Hive to analyze the data and checked for correlation.
- Used Sqoop to import data from MySQL into HDFS and Hive on a regular basis.
- Automated the regular Sqoop imports into Hive partitions using Apache Oozie.
- Supported Map Reduce Programs that are running on the cluster.
- Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
- Used Agile methodology in developing the application, which included iterative application development, weekly status reports, and stand-up meetings.
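An illustrative sketch of the Kafka producer side of a Kafka-Storm pipeline like the one described above; the broker addresses, topic name, and sample log line are placeholders, not project artifacts:

```java
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class ServerLogProducer {
  public static void main(String[] args) {
    Properties props = new Properties();
    props.put("bootstrap.servers", "broker1:9092,broker2:9092"); // placeholder brokers
    props.put("acks", "1");
    props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
    props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

    Producer<String, String> producer = new KafkaProducer<>(props);
    // Each log line is published to the topic that the Storm topology consumes.
    String logLine = "2015-08-01T10:15:00 host42 WARN disk usage above threshold";
    producer.send(new ProducerRecord<>("server-logs", "host42", logLine));
    producer.close();
  }
}
```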
Environment: Hadoop, MapReduce, HDFS, Pig, Hive, HBase, Flume, ZooKeeper, Agile, Cloudera Manager, Oozie, MySQL, SQL, Linux
Confidential
Hadoop Developer
Responsibilities:
- Responsible for building scalable distributed data solutions using Hadoop.
- Moved data from Oracle to HDFS and from HDFS back to Oracle using Sqoop.
- Worked on loading and transforming large sets of semi-structured data using Pig Latin operations.
- Wrote shell scripts to monitor the health of Hadoop daemon services and respond to warning and failure conditions.
- Imported and exported data to and from HDFS and Hive using Sqoop.
- Wrote Apache Pig scripts to process data in HDFS.
- Clustered customer categories based on offers using Apache Hive.
- Performed grouping, aggregation, and sorting using Pig and Hive, which are higher-level abstractions over MapReduce.
- Wrote Pig UDFs to pre-process data for analysis (see the sketch after this list).
- Wrote Hive queries for data analysis to meet the business requirements.
- Developed Oozie workflows to automate the tasks of loading data into HDFS and pre-processing it with Pig.
- Extensive experience in performance tuning of Oracle queries.
- Tested and validated Hadoop Log files.
- Created data models for customer data using Cassandra Query Language (CQL) (see the CQL sketch after this list).
- Worked in monitoring, managing and troubleshooting the Hadoop Log files.
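An illustrative sketch of a Pig UDF of the kind referenced above; the class name and the cleaning rule are hypothetical:

```java
import java.io.IOException;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

// Hypothetical pre-processing UDF: trims and lower-cases a raw text field
// before it is grouped and aggregated in the Pig script.
public class CleanField extends EvalFunc<String> {
  @Override
  public String exec(Tuple input) throws IOException {
    if (input == null || input.size() == 0 || input.get(0) == null) {
      return null;
    }
    return input.get(0).toString().trim().toLowerCase();
  }
}
```

In the Pig script, such a function would be registered with REGISTER, aliased with DEFINE, and applied inside a FOREACH ... GENERATE statement.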
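And a hedged sketch of a customer data model expressed in CQL, issued here through the DataStax Java driver listed in the environment; the keyspace, table, and column names are assumptions:

```java
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class CustomerModel {
  public static void main(String[] args) {
    Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
    Session session = cluster.connect();

    session.execute("CREATE KEYSPACE IF NOT EXISTS crm WITH replication = "
        + "{'class': 'SimpleStrategy', 'replication_factor': 3}");

    // Partitioned by customer_id so all of a customer's offers land on one partition,
    // clustered by offer_id so the newest offers are read first.
    session.execute("CREATE TABLE IF NOT EXISTS crm.customer_offers ("
        + "customer_id uuid, offer_id timeuuid, category text, amount decimal, "
        + "PRIMARY KEY (customer_id, offer_id)) "
        + "WITH CLUSTERING ORDER BY (offer_id DESC)");

    cluster.close();
  }
}
```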
Environment: Apache Hadoop, Hive, Cassandra, DataStax, Oracle 11g/10g, MySQL, UNIX, Oozie
Confidential
Java/J2EE Developer
Responsibilities:
- Involved in analysis and design of the application.
- Involved in preparing the detailed design document for the project.
- Developed the application using J2EE architecture.
- Involved in developing JSP forms.
- Designed and developed web pages using HTML and JSP.
- Designed various applets using JBuilder.
- Designed and developed Servlets to communicate between presentation and business layer.
- Used EJB as a middleware in developing a three-tier distributed application.
- Developed session beans and entity beans for business and data processing.
- Used JMS in the project for sending and receiving the messages on the queue.
- Developed the Servlets for processing the data on the server.
- Transferred the processed data to the database through entity beans.
- Used JDBC for database connectivity with MySQL Server (see the sketch after this list).
- Used CVS for version control.
- Involved in unit testing using JUnit.
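A minimal sketch of the servlet-plus-JDBC pattern used in this project; the servlet name, form fields, table, and connection details are assumptions, not the original code:

```java
import java.io.IOException;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Processes a submitted JSP form on the server and persists it to MySQL over JDBC.
public class RegistrationServlet extends HttpServlet {
  @Override
  protected void doPost(HttpServletRequest request, HttpServletResponse response)
      throws ServletException, IOException {
    String name = request.getParameter("name");
    String email = request.getParameter("email");

    try {
      Class.forName("com.mysql.jdbc.Driver");
      Connection conn = DriverManager.getConnection(
          "jdbc:mysql://localhost:3306/appdb", "appuser", "secret"); // placeholder credentials
      PreparedStatement stmt =
          conn.prepareStatement("INSERT INTO customers (name, email) VALUES (?, ?)");
      stmt.setString(1, name);
      stmt.setString(2, email);
      stmt.executeUpdate();
      stmt.close();
      conn.close();
    } catch (Exception e) {
      throw new ServletException("Failed to save registration", e);
    }

    // Hand control back to the presentation layer.
    request.getRequestDispatcher("/confirmation.jsp").forward(request, response);
  }
}
```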
Environment: Core Java, J2EE, JSP, Servlets, XML, XSLT, EJB, JDBC, JBuilder 8.0, JBoss, Swing, JavaScript, JMS, HTML, CSS, MySQL Server, CVS, Windows 2000