Hadoop Developer Resume

PROFESSIONAL SUMMARY:

Around 5 years of professional IT experience, with hands-on experience in the development of Big Data technologies and data analytics.
Experienced as a Hadoop Developer with good knowledge of MapReduce, YARN, HBase, Cassandra, Pig, Hive, and Sqoop.
Extensive work experience in Object Oriented Analysis and Design, Java/J2EE technologies including HTML5, XHTML, DHTML, JavaScript, JSTL, CSS, AJAX and Oracle for developing server side applications and user interfaces.
Experience with distributed systems, large-scale non-relational data stores, NoSQL map-reduce systems, data modeling, database performance tuning, and multi-terabyte data warehouses.
Excellent understanding and knowledge of NOSQL databases like HBase and Cassandra.
Excellent understanding of Hadoop architecture, the Hadoop Distributed File System, and its APIs.
Good exposure to the Apache Hadoop MapReduce programming architecture and APIs.
Experienced in running MapReduce and Spark jobs over YARN.
Experienced in writing custom MapReduce I/O formats and key-value formats.
Hands-on experience in installing, configuring, and maintaining Hadoop clusters.
Expert in working with the Hive data warehouse tool: creating tables, distributing data by implementing partitioning and bucketing, and writing and optimizing HiveQL queries.
Familiar with writing MapReduce jobs for processing data over a Cassandra cluster.
Experienced in writing MapReduce jobs over HBase, custom Filters, and Co-processors.
Hands-on experience in importing and exporting data using the Hadoop data management tool Sqoop.
Used Hive and Pig for performing data analysis.
Familiar with MongoDB concepts and architecture.
Experienced with moving data from Teradata to HDFS using Teradata connectors.
Good experience in all phases of the Software Development Life Cycle (analysis of requirements, design, development, verification and validation, deployment).
Hands-on experience with "productionalizing" Hadoop applications (e.g. administration, configuration management, monitoring, debugging, and performance tuning).
Hands-on experience in application development using Java, RDBMS, and Linux shell scripting.
Experience working with Java/J2EE, JDBC, ODBC, JSP, Java Beans, and Servlets.
Experience with AJAX, REST, and JSON.
Experience in using IDEs like Eclipse and DBMSs like Oracle and MySQL.
Evaluate and propose new tools and technologies to meet the needs of the organization.
Good knowledge in Unified Modeling Language (UML), Object Oriented Analysis and Design and Agile Methodologies.
An excellent team player and self-starter with effective communication skills.

TECHNICAL SKILLS:

Hadoop/Big Data/NoSQL Technologies: HDFS, MapReduce, Hive, Pig, Sqoop, Flume, Oozie, Avro, Hadoop Streaming, YARN, ZooKeeper, HBase

Programming Languages: Java, Python, C, SQL, PL/SQL, Shell Script

IDE Tools: Eclipse, Rational Team Concert, NetBeans

Framework: Hibernate, Spring, Struts, JMS, EJB, JUnit, MRUnit, JAXB

Web Technologies: HTML5, CSS3, JavaScript, jQuery, AJAX, Servlets, JSP, JSON, XML, XHTML, REST Web Services

Application Servers: JBoss, Tomcat, WebLogic, WebSphere

Databases: Oracle 11g/10g/9i, MySQL, DB2, Derby, MS-SQL Server

Operating Systems: Linux, UNIX, Windows

Build Tools: Jenkins, Maven, ANT

Reporting/BI Tools: Jasper Reports, iReport, Tableau, QlikView

PROFESSIONAL EXPERIENCE:

Confidential, Columbus, IN

Responsibilities:

Implemented a Hadoop framework to capture user navigation across the application, validate the user interface, and provide analytic feedback and results to the UI team.
Loaded data into the cluster from dynamically generated files using Flume and from relational database management systems using Sqoop.
Performed analysis on the unused user navigation data by loading it into HDFS and writing MapReduce jobs. The analysis provided inputs to the new APM front-end developers and lucent team.
Wrote Spark programs in Scala and ran Spark jobs on YARN.
Worked with Cassandra for non-relational data storage and retrieval on enterprise use cases.
Wrote Map Reduce jobs using Java API and Pig Latin.
Loaded data from Teradata to HDFS using Teradata Hadoop connectors.
Used Flume to collect, aggregate, and store web log data in HDFS.
Wrote Pig scripts to run ETL jobs on the data in HDFS.
Used Hive to analyze the data and identify different correlations.
Wrote ad hoc HiveQL queries to process data and generate reports.
Involved in HDFS maintenance and administration through the Hadoop Java API.
Worked on importing and exporting data from Oracle and DB2 into HDFS and Hive using Sqoop.
Worked on HBase. Configured a MySQL database to store Hive metadata.
Imported data using Sqoop to load data from MySQL to HDFS on a regular basis.
Wrote Hive queries for data analysis to meet the business requirements.
Automated all the jobs for pulling data from the FTP server and loading it into Hive tables, using Oozie workflows.
Involved in creating Hive tables and working on them using HiveQL.
Extracted files from MongoDB through Sqoop, placed them in HDFS, and processed them.
Maintained and monitored clusters.
Utilized Agile Scrum methodology to help manage and organize a team of 4 developers with regular code review sessions.
Held weekly meetings with technical collaborators and actively participated in code review sessions with senior and junior developers.

Environment: Hadoop, MapReduce, HDFS, Flume, Pig, Hive, Spark, Scala, YARN, HBase, Sqoop, ZooKeeper, Cloudera, Oozie, Cassandra, NoSQL, ETL, MySQL, Agile, Windows, UNIX Shell Scripting, Teradata.

Confidential, Princeton, NJ

Responsibilities:

Developed simple and complex MapReduce programs in Java for Data Analysis on different data formats
Developed MapReduce programs that filter out bad and unnecessary claim records and find unique records based on account type
Processed semi-structured and unstructured data using MapReduce programs
Implemented daily cron jobs that automate parallel tasks of loading the data into HDFS and pre-processing it with Pig using Oozie coordinator jobs
Implemented custom DataTypes, InputFormat, RecordReader, OutputFormat, RecordWriter for MapReduce computations
Worked on CDH4 cluster on CentOS.
Successfully migrated Legacy application to Big Data application using Hive/Pig/HBase in Production level
Transformed date-related data into an application-compatible format by developing Apache Pig UDFs
Developed a MapReduce pipeline for feature extraction and tested the modules using MRUnit
Optimized MapReduce jobs to use HDFS efficiently by using various compression mechanisms
Created Hive tables, loaded them with data, and wrote Hive queries that run internally as MapReduce jobs
Responsible for performing extensive data validation using Hive
Implemented Partitioning, Dynamic Partitions and Bucketing in Hive for efficient data access
Worked on different set of tables like External Tables and Managed Tables
Used Oozie workflow engine to run multiple Hive and Pig jobs
Involved in installing and configuring Hive, Pig, Sqoop, Flume, and Oozie on the Hadoop cluster
Involved in designing and developing non-trivial ETL processes within Hadoop using tools like Pig, Sqoop, Flume, and Oozie
Used DML statements to perform different operations on Hive Tables
Developed Hive queries for creating foundation tables from stage data
Used Pig as ETL tool to do transformations, event joins, filter and some pre-aggregations
Analyzed the data by running Hive queries and Pig scripts to study customer behavior
Implemented business logic by writing Pig UDFs in Java and used various UDFs from Piggybank and other sources
Worked with Sqoop to export analyzed data from the HDFS environment into an RDBMS for report generation and visualization
Queried and analyzed data from DataStax Cassandra for quick searching, sorting, and grouping
Developed Mapping document for reporting tools

Environment: Apache Hadoop, HDFS, MapReduce, Java (jdk1.6), MySQL, DB Visualizer, Linux, Sqoop, Apache Hive, Apache Pig

Confidential, Boston, MA

Responsibilities:

Installed and configured Hadoop clusters for Dev, QA, and production environments
Installed and configured the Hadoop NameNode HA service using ZooKeeper
Installed and configured Hadoop security and access controls using Kerberos and Active Directory
Imported data from a SQL Server database to HDFS using Sqoop
Created Hive tables, loaded them with data, and wrote Hive queries that run internally as MapReduce jobs
Used Eclipse to develop J2EE components, including a lightweight JSP front end. CSS and scripts were also part of front-end development
Designed J2EE project wif Front Controller pattern
Designed CSS and tag libraries for Front end
Made extensive use of JavaScript and AJAX to control all user functions
Developed front-end screens using jQuery, JavaScript, Java, and CSS
Attended business and requirement meetings
Used ANT to create build scripts for deployment and to run the JUnit test cases
Used VSS extensively for code check-in, check-out, and versioning, and maintained production, test, and development views appropriately
Analyzed the sources of data and organized the data in a structured table setup
Deliver daily reports and data sheets to clients for their business meetings.
Code review, unit testing and local Integration testing.
Integrated application modules and components and deployed them on the target platform
Involved in the requirement study and preparation of the detailed software requirement specification
Involved in low-level and high-level design and preparation of HLD and LLD documents using Visio
Testing support during integration and production

Environment: Hadoop, Hive, Sqoop, ZooKeeper, MapReduce, WebSphere, DB2, IBM RAD, JDK 1.6, JSP, J2EE, HTML, JavaScript, CSS, Servlets, Struts, JDBC, Oracle, SQL, Log4j, JUnit, VSS, Ant, Shell script, Visio.
