Hadoop Developer Resume
GA
PROFESSIONAL SUMMARY
- Over 8 years of experience in analysis, development, testing, maintenance and user training of software applications, including over 4 years in Big Data, Hadoop and HDFS environments and around 4 years of experience in SQL, Java and J2EE.
- Experience in developing MapReduce programs using Apache Hadoop for analyzing big data as per requirements.
- Hands on using Sqoop to import data into HDFS from RDBMS and vice versa.
- Used different SerDes like Regex SerDe and HBase SerDe.
- Experience in analyzing data using Hive, Pig Latin and custom MapReduce programs in Java.
- Hands on experience in writing Spark SQL scripts and implementing Spark RDD transformations and actions using Python/Scala.
- Well versed with developing and implementing Spark programs using Python/Scala and Spark Streaming to work with Big Data.
- Hands on writing custom UDFs for extending Hive and Pig core functionality.
- Hands on dealing with log files to extract data and copy it into HDFS using Flume.
- Wrote Hadoop test cases for checking inputs and outputs.
- Hands on integrating Hive and HBase.
- Experience in Elasticsearch and Solr.
- Experience in NoSQL databases: MongoDB, HBase and Cassandra.
- Good experience in working with real-time streaming applications using tools like Spark Streaming, Storm and Kafka.
- Hands on using job scheduling and monitoring tools like Oozie and Zookeeper.
- Experience with handling different file formats like XML, JSON, Avro, ORC and Parquet in Hive using different SerDes.
- Experience in Dimensional Data Modeling using Star and Snowflake schemas.
- Worked on reusable code known as tie-outs to maintain data consistency.
- Clear understanding of Hadoop architecture and various components such as HDFS, Job Tracker and Task Tracker, Name Node and Data Node, Secondary Name Node and MapReduce programming.
- Experience in Hadoop administration activities such as installation and configuration of clusters using Apache and Cloudera.
- Knowledge of installing, configuring and using Hadoop components like Hadoop MapReduce (MR1), YARN (MR2), HDFS, Hive, Pig, Flume and Sqoop.
- More than one year of experience in Java, J2EE, Web Services, SOAP, HTML and XML related technologies, demonstrating strong analytical and problem-solving skills, computer proficiency and the ability to follow through with projects from inception to completion.
- Extensive experience working with Oracle, DB2, SQL Server and MySQL databases and Java core concepts like OOP, multithreading, Collections and IO.
- Hands on JAX-WS, JSP, Servlets, Struts, WebLogic, WebSphere, Hibernate, Spring, JBoss, JDBC, RMI, JavaScript, Ajax, jQuery, Linux, Unix, XML, HTML, Python, Scala and Vertica.
- Hold a Red Hat certification on Linux.
- Developed applications using Java, RDBMS, and Linux shell scripting.
- Good understanding of Data Mining and Machine Learning techniques.
- Configured Git with Jenkins and scheduled jobs using the Poll SCM option.
- Used the Jenkins AWS CodeDeploy plugin to deploy and Chef for unattended bootstrapping in AWS.
- Good interpersonal and communication skills, strong problem-solving skills, ability to explore and adapt to new technologies with ease, and a good team member.
TECHNICAL SKILLS
Hadoop/Big Data Technologies: HDFS, MapReduce, YARN, Pig, HBase, Spark, ZooKeeper, Hive, Oozie, Sqoop, Flume, Kafka, Storm, Impala
Hadoop Distribution Systems: Hortonworks, Cloudera, MapR
Programming Languages: Java JDK 1.6/1.8, Python, Scala, C/C++, HTML, SQL, PL/SQL, AVS & JVS
Frameworks: Hibernate 2.x/3.x, Spring 2.x/3.x, Struts 1.x/2.x
Web Services: WSDL, SOAP, Apache CXF/XFire, Apache Axis, REST, Jersey
Operating Systems: UNIX, Windows, LINUX
Web/Application Servers: IBM WebSphere, Tomcat, WebLogic, JBoss
Web technologies: JSP, Servlets, JNDI, JDBC, Java Beans, JavaScript
Databases: Teradata, Oracle, Netezza, MySQL
NoSQL Databases: HBase, Cassandra, MongoDB
Java IDE: Eclipse 3.x, IBM WebSphere Application Developer, IBM RAD 7.0
Development Tools: SOAP UI, ANT, Jenkins, Nexus, Maven, Visio, Rational Rose
PROFESSIONAL EXPERIENCE
Confidential, GA
Hadoop Developer
Responsibilities:
- Installed/Configured/Maintained Apache Hadoop clusters for application development and Hadoop tools like Hive, Pig, Zookeeper and Sqoop.
- Designed, configured, implemented and monitored Kafka clusters and connectors.
- Implemented proofs of concept (PoCs) using Kafka, Storm and HBase for processing streaming data.
- Used Sqoop to import data into HDFS and Hive from multiple data systems.
- Developed complex queries using Hive and Impala.
- Analyzed large data sets to determine the optimal way to aggregate and report on them.
- Handled importing of data from various data sources, performed transformations using Hive and MapReduce, and loaded data into HDFS.
- Helped with the sizing and performance tuning of the Cassandra cluster.
- Involved in converting Cassandra/Hive/SQL queries into Spark transformations using Spark RDDs.
- Developed multiple PoCs using Spark, deployed them on the YARN cluster and compared the performance of Spark with Cassandra and SQL.
- Involved in the process of Cassandra data modelling and building efficient data structures.
- Analyzed the Cassandra/SQL scripts and designed the solution.
- Extracted the data from Teradata into HDFS using Sqoop.
- Analyzed the data by performing Hive queries and running Pig scripts to understand user behavior, such as shopping.
- Configured Oozie workflows to run multiple Hive and Pig jobs, which run independently based on time and data availability.
- Optimized MapReduce code and Pig scripts through performance tuning and analysis.
- Implemented advanced procedures like text analytics and processing, using the in-memory computing capabilities of Spark.
- Developed Spark applications using Python (PySpark).
- Exported the aggregated data to Oracle using Sqoop for reporting on the Tableau dashboard.
- Involved in the design, development and testing phases of the Software Development Life Cycle.
- Performed Hadoop installation, updates, patches and version upgrades when required.
- Attended weekly meetings with technical collaborators and actively participated in code review sessions with senior and junior developers.
- Used AWS storage services (EBS, S3 and Glacier) and automated data sync to Glacier. Worked with AWS services such as RDS, DynamoDB, Elastic Transcoder, CloudFront and Elastic Beanstalk. Migrated 2 instances from one region to another.
- Leveraged AWS cloud services such as EC2, auto-scaling and VPC (Virtual Private Cloud) to build secure, highly scalable and flexible systems that handled expected and unexpected load bursts.
- Automated various infrastructure activities like continuous deployment, application server setup and stack monitoring using Ansible playbooks, and integrated them with Jenkins.
Environment: Hadoop, MapReduce, HDFS, Hive, Pig, Oozie, Java, Eclipse, Cloudera, Cassandra, AWS, Oracle 10g/11g, Flume, Kafka, Scala, Spark, Sqoop, Python.
Confidential, GA
Hadoop Developer
Responsibilities:
- Primary responsibilities included building scalable distributed data solutions using the Hadoop ecosystem.
- Loaded datasets from two different sources, Oracle and MySQL, into HDFS and Hive respectively on a daily basis.
- Installed and configured Hive on the Hadoop cluster.
- Worked on the HBase Java API to populate the operational HBase table with key-value data.
- Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
- Developed and ran MapReduce jobs on YARN and Hadoop clusters to produce daily and monthly reports as per users' needs.
- Scheduled and managed jobs on a Hadoop cluster using Oozie workflows.
- Developed multiple MapReduce programs in Java for data extraction, transformation and aggregation from multiple file formats including XML, JSON and CSV.
- Imported data from MySQL to HDFS on a regular basis using Sqoop.
- Integrated Apache Storm with Kafka to perform web analytics. Uploaded clickstream data from Kafka to HDFS, HBase and Hive by integrating with Storm.
- Worked on migrating data from MongoDB to Hadoop.
- Developed Pig UDFs to pre-process the data for analysis.
- Designed and developed Pig Latin scripts to process data in batches to perform trend analysis.
- Developed Hive scripts to meet analyst requirements.
- Developed Java code to generate, compare and merge Avro schema files.
- Developed complex MapReduce streaming jobs in Java that are implemented using Hive and Pig.
- Collected log data from web servers and integrated it into HDFS using Flume.
- Optimized MapReduce jobs to use HDFS efficiently by using various compression mechanisms.
- Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS and extracted the data from MySQL into HDFS using Sqoop.
- Automated and scheduled the Sqoop jobs in a timely manner using Unix shell scripts.
- Analyzed the data by performing Hive queries (HiveQL) and running Pig Latin scripts to study customer behavior.
- Developed data cleansing techniques and UDFs using Pig scripts, HiveQL and MapReduce.
- Worked with NoSQL databases such as MongoDB.
- Continuously monitored and managed the Hadoop cluster using Cloudera Manager.
Environment: HDFS, Pig, Pig Latin, Storm, Kafka, Eclipse, Hive, MapReduce, Java, Avro, Sqoop, Linux, Cloudera, Big Data, MongoDB, JSON, XML and CSV.
Confidential, NC
Hadoop Developer
Responsibilities:
- Responsible for building scalable distributed data solutions using Hadoop and migrating legacy retail ETL applications to Hadoop.
- Accessed information from the equipment through mobile networks and satellites.
- Implemented ETL code to load data from multiple sources into HDFS using Pig scripts.
- Created different applications on social networking websites to obtain access data.
- Wrote MapReduce jobs using the access tokens to get the data from the customers.
- Developed simple to complex MapReduce jobs using Hive and Pig for analyzing the data.
- Used different SerDes for converting JSON data into pipe-separated data.
- Implemented business logic by writing UDFs in Java and used various UDFs from Piggybank and other sources.
- Used Oozie workflow engine to run multiple Hive and Pig jobs.
- Exported the results to Teradata using Sqoop to generate reports for the BI team.
- Worked with application teams to install operating system and Hadoop updates, patches and version upgrades as required.
- Continuously monitored and managed the Hadoop cluster using Cloudera Manager.
Environment: Hadoop, MapReduce, Cloudera Manager, HDFS, Hive, Pig, Sqoop, Oozie, Impala, SQL, Java (JDK 1.6), Eclipse and Informatica 9.1.
Confidential - Bangalore, India Feb 2013 - Mar 2014
Java Hadoop Developer
Responsibilities:
- Involved in analysing requirements and establishing development capabilities to support future opportunities.
- Involved in sharing data with teams that analyse and prepare reports on risk management.
- Handled importing of data from various data sources, performed transformations using Pig and MapReduce, loaded data into HDFS and extracted data from MySQL into HDFS using Sqoop.
- Worked on streaming the analyzed data to the existing relational databases using Sqoop, making it available to the BI team for visualization and report generation.
- Involved in loading and transforming large sets of structured, semi-structured and unstructured data and analyzed them by running Hive queries and Pig scripts.
- Performed various performance optimizations such as using the distributed cache for small datasets, partitioning and bucketing in Hive, and map-side joins.
- Involved in End to End implementation of ETL logic.
- Coordinated effectively with the offshore team and managed project deliverables on time.
- Worked on QA support activities, test data creation and Unit testing activities.
- Developed Oozie workflows that are scheduled through a scheduler on a monthly basis.
- Designed and developed read lock capability in HDFS.
- Created Pig Latin scripts to sort, group, join and filter the enterprise-wide data.
- Analysed Web server log data using Apache Flume.
Environment: Hadoop, Map Reduce, Hive, Pig, Sqoop, Hbase, SQL, Oozie, Linux, UNIX.
Confidential
Java Developer
Responsibilities:
- Worked with Business Analysts and helped represent the business domain details.
- Actively involved in setting coding standards and writing related documentation.
- Created Preferred Vehicle Web Service using JAXWS.
- The web service was created using a top-down approach and tested using the SOAP UI tool.
- Used Hibernate 3.3.1 to interact with the database.
- Developed JSPs & Servlets to dynamically generate HTML and display data to client side.
- Created an admin tool using the Struts MVC design pattern to add preferred vehicles to the database.
- Designed Web Applications using MVC design pattern.
- Developed shell scripts to retrieve the vendor files dynamically and used crontab to execute these scripts periodically.
- Designed the batch process for processing vendor data files using IBM WebSphere Application Server’s Task Manager Framework.
- Performed unit testing using the JUnit testing framework and used Log4J to monitor the error log.
- Created & modified database objects like tables, views, procedures, functions, triggers, packages, indexes, synonyms, materialized views using Oracle tools like TOAD and SQL Navigator.
- Developed SQL and PL/SQL scripts to transfer tables across the schemas and databases.
- Updated procedures, functions, triggers and packages based on change requests from users.
- Handled support activities like job monitoring, enhancements and resolving defects.
- Worked with testing teams; performed UAT with business users.
- Worked with the release team for the staging and production move.
- Implemented an efficient error handling process by capturing errors in user-managed tables.
- Pair-programmed with developers to enhance current PL/SQL packages to fix production issues, build new functionality and improve processing time through code optimization.
Environment: Oracle, SQL Developer, TOAD, Windows 2000/XP, ASP.NET, Visual Studio.
