Hadoop Developer Resume
PROFESSIONAL SUMMARY:
- Around 5 years of professional IT experience with hands-on experience in the development of Big Data technologies and data analytics.
- Experienced as a Hadoop Developer with good knowledge of MapReduce, YARN, HBase, Cassandra, Pig, Hive, and Sqoop.
- Extensive work experience in Object-Oriented Analysis and Design and Java/J2EE technologies, including HTML5, XHTML, DHTML, JavaScript, JSTL, CSS, AJAX, and Oracle, for developing server-side applications and user interfaces.
- Experience with distributed systems, large-scale non-relational data stores, NoSQL map-reduce systems, data modeling, database performance tuning, and multi-terabyte data warehouses.
- Excellent understanding and knowledge of NoSQL databases like HBase and Cassandra.
- Excellent understanding of Hadoop architecture, the Hadoop Distributed File System, and its APIs.
- Good exposure to the Apache Hadoop MapReduce programming architecture and APIs.
- Experienced in running MapReduce and Spark jobs over YARN.
- Experienced in writing custom MapReduce I/O formats and key-value formats.
- Hands-on experience in installing, configuring, and maintaining Hadoop clusters.
- Expert in working with the Hive data warehouse tool: creating tables, distributing data by implementing partitioning and bucketing, and writing and optimizing HiveQL queries.
- Familiar with writing MapReduce jobs for processing data over a Cassandra cluster.
- Experienced in writing MapReduce jobs over HBase, custom Filters, and Co-processors.
- Hands-on experience in import/export of data using the Hadoop data management tool Sqoop.
- Used Hive and Pig for performing data analysis.
- Familiar with MongoDB concepts and its architecture.
- Experienced with moving data from Teradata to HDFS using Teradata connectors.
- Good experience in all the phases of the Software Development Life Cycle (requirements analysis, design, development, verification and validation, deployment).
- Hands-on experience with productionizing Hadoop applications (e.g., administration, configuration management, monitoring, debugging, and performance tuning).
- Hands on experience in application development using Java, RDBMS, and Linux shell scripting.
- Experience working wif JAVA J2EE, JDBC, ODBC, JSP, Java Beans, Servlets.
- Experience wif AJAX, REST and JSON
- Experience in using IDEs like Eclipse and experience in DBMS like Oracle and MYSQL.
- Evaluated and proposed new tools and technologies to meet the needs of the organization.
- Good knowledge in Unified Modeling Language (UML), Object Oriented Analysis and Design and Agile Methodologies.
- An excellent team player and self-starter with effective communication skills.
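The MapReduce work described above follows Hadoop's map/shuffle-sort/reduce contract (the skills below also list Hadoop Streaming and Python). A minimal Python sketch of that contract under illustrative assumptions — the function names and sample data are hypothetical, not from any project listed here:

```python
from itertools import groupby


def mapper(lines):
    """Map phase: emit a (word, 1) pair for every token."""
    for line in lines:
        for word in line.split():
            yield word, 1


def reducer(pairs):
    """Reduce phase: sum counts per key. Assumes pairs arrive
    sorted by key, as Hadoop's shuffle/sort guarantees."""
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield word, sum(count for _, count in group)


# Local dry run of the same map -> sort -> reduce pipeline that
# Hadoop Streaming wires together between two scripts:
logs = ["checkout cart checkout", "cart login"]
counts = dict(reducer(sorted(mapper(logs))))
print(counts)  # {'cart': 2, 'checkout': 2, 'login': 1}
```

Under Hadoop Streaming the same two functions would run as separate stdin/stdout scripts, with the framework handling the distributed sort between them.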
TECHNICAL SKILLS:
Hadoop/Big Data/NoSQL Technologies: HDFS, MapReduce, Hive, Pig, Sqoop, Flume, Oozie, Avro, Hadoop Streaming, YARN, ZooKeeper, HBase
Programming Languages: Java, Python, C, SQL, PL/SQL, Shell Script
IDE Tools: Eclipse, Rational Team Concert, NetBeans
Framework: Hibernate, Spring, Struts, JMS, EJB, JUnit, MRUnit, JAXB
Web Technologies: HTML5, CSS3, JavaScript, jQuery, AJAX, Servlets, JSP, JSON, XML, XHTML, REST Web Services
Application Servers: JBoss, Tomcat, WebLogic, WebSphere
Databases: Oracle 11g/10g/9i, MySQL, DB2, Derby, MS-SQL Server
Operating Systems: Linux, UNIX, Windows
Build Tools: Jenkins, Maven, ANT
Reporting/BI Tools: Jasper Reports, iReport, Tableau, QlikView
PROFESSIONAL EXPERIENCE:
Confidential, Columbus IN
Responsibilities:
- Implemented a Hadoop framework to capture user navigation across the application, to validate the user interface and provide analytic feedback/results to the UI team.
- Loaded data into the cluster from dynamically generated files using Flume and from relational database management systems using Sqoop.
- Performed analysis on the unused user-navigation data by loading it into HDFS and writing MapReduce jobs. The analysis provided inputs to the new APM front-end developers and the Lucent team.
- Wrote Spark programs in Scala and ran Spark jobs on YARN.
- Worked with Cassandra for non-relational data storage and retrieval in enterprise use cases.
- Wrote MapReduce jobs using the Java API and Pig Latin.
- Loaded data from Teradata to HDFS using Teradata Hadoop connectors.
- Used Flume to collect, aggregate, and store the web log data in HDFS.
- Wrote Pig scripts to run ETL jobs on the data in HDFS.
- Used Hive to analyze the data and identify different correlations.
- Wrote ad-hoc HiveQL queries to process data and generate reports.
- Involved in HDFS maintenance and administered it through the Hadoop Java API.
- Worked on importing and exporting data from Oracle and DB2 into HDFS and Hive using Sqoop.
- Worked on HBase; configured a MySQL database to store the Hive metastore.
- Imported data using Sqoop to load data from MySQL to HDFS on a regular basis.
- Wrote Hive queries for data analysis to meet the business requirements.
- Automated all the jobs for pulling data from an FTP server and loading it into Hive tables, using Oozie workflows.
- Involved in creating Hive tables and working on them using HiveQL.
- Extracted files from MongoDB through Sqoop, placed them in HDFS, and processed them.
- Maintained and monitored the clusters.
- Utilized Agile Scrum methodology to help manage and organize a team of 4 developers, with regular code-review sessions.
- Held weekly meetings with technical collaborators and actively participated in code-review sessions with senior and junior developers.
Environment: Hadoop, MapReduce, HDFS, Flume, Pig, Hive, Spark, Scala, YARN, HBase, Sqoop, ZooKeeper, Cloudera, Oozie, Cassandra, NoSQL, ETL, MySQL, Agile, Windows, UNIX shell scripting, Teradata.
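The navigation-analytics work described above boils down to parsing web-log records landed in HDFS and rolling up page hits. A small Python sketch of that kind of rollup — the common-log-format field layout and names here are assumptions for illustration, not taken from the original project:

```python
import re
from collections import Counter

# Assumed record shape: Apache common-log-format lines, as Flume
# often delivers them to HDFS. The layout is an assumption.
LOG_PATTERN = re.compile(
    r'\S+ \S+ \S+ \[[^\]]+\] "(?:GET|POST) (\S+)[^"]*" (\d{3})'
)


def page_hits(lines):
    """Count successful page views per URL path -- the kind of
    navigation rollup a MapReduce job would produce for a UI team."""
    hits = Counter()
    for line in lines:
        m = LOG_PATTERN.match(line)
        if m and m.group(2).startswith("2"):  # keep 2xx responses only
            hits[m.group(1)] += 1
    return hits


sample = [
    '10.0.0.1 - - [01/Jan/2015:10:00:00 +0000] "GET /cart HTTP/1.1" 200 512',
    '10.0.0.2 - - [01/Jan/2015:10:00:01 +0000] "GET /cart HTTP/1.1" 200 512',
    '10.0.0.3 - - [01/Jan/2015:10:00:02 +0000] "GET /missing HTTP/1.1" 404 90',
]
print(page_hits(sample).most_common(1))  # [('/cart', 2)]
```

In the actual pipeline this logic would sit in the map phase, with the per-page sums produced by the reducers.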
Confidential, Princeton NJ
Responsibilities:
- Developed simple and complex MapReduce programs in Java for data analysis on different data formats.
- Developed MapReduce programs that filter out bad and unnecessary claim records and identify unique records based on account type.
- Processed semi-structured and unstructured data using MapReduce programs.
- Implemented daily cron jobs that automate the parallel tasks of loading data into HDFS and pre-processing it with Pig, using Oozie coordinator jobs.
- Implemented custom DataTypes, InputFormat, RecordReader, OutputFormat, RecordWriter for MapReduce computations
- Worked on CDH4 cluster on CentOS.
- Successfully migrated a legacy application to a Big Data application using Hive/Pig/HBase at the production level.
- Transformed date-related data into an application-compatible format by developing Apache Pig UDFs.
- Developed a MapReduce pipeline for feature extraction and tested the modules using MRUnit.
- Optimized MapReduce jobs to use HDFS efficiently by using various compression mechanisms
- Created Hive tables, loaded them with data, and wrote Hive queries that run internally as MapReduce jobs.
- Responsible for performing extensive data validation using Hive
- Implemented Partitioning, Dynamic Partitions and Bucketing in Hive for efficient data access
- Worked on different set of tables like External Tables and Managed Tables
- Used Oozie workflow engine to run multiple Hive and Pig jobs
- Involved in installing and configuring Hive, Pig, Sqoop, Flume, and Oozie on the Hadoop cluster.
- Involved in designing and developing non-trivial ETL processes within Hadoop using tools like Pig, Sqoop, Flume, and Oozie.
- Used DML statements to perform different operations on Hive Tables
- Developed Hive queries for creating foundation tables from stage data
- Used Pig as ETL tool to do transformations, event joins, filter and some pre-aggregations
- Analyzed the data by performing Hive queries and running Pig scripts to study customer behavior.
- Implemented business logic by writing Pig UDFs in Java and used various UDFs from Piggybank and other sources.
- Worked with Sqoop to export analyzed data from the HDFS environment into an RDBMS for report generation and visualization.
- Queried and analyzed data from DataStax Cassandra for quick searching, sorting, and grouping.
- Developed Mapping document for reporting tools
Environment: Apache Hadoop, HDFS, MapReduce, Java (JDK 1.6), MySQL, DbVisualizer, Linux, Sqoop, Apache Hive, Apache Pig
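The Hive partitioning and bucketing mentioned above (e.g., `CLUSTERED BY (account_id) INTO 4 BUCKETS`) assigns each row to a bucket by hashing the bucket column modulo the bucket count. A Python sketch of that idea, under stated assumptions — Hive uses its own hash function, so the hash and column name here are illustrative only:

```python
NUM_BUCKETS = 4  # assumed bucket count for illustration


def bucket_for(account_id: str) -> int:
    """Pick a bucket the way CLUSTERED BY (account_id) INTO 4 BUCKETS
    conceptually does: hash(column) mod bucket count."""
    # Simple deterministic string hash so the example is reproducible
    # (Python's built-in hash() is salted per process for str);
    # Hive's real hash function differs.
    h = 0
    for ch in account_id:
        h = (h * 31 + ord(ch)) & 0x7FFFFFFF
    return h % NUM_BUCKETS


# Same key always lands in the same bucket, which is what makes
# bucketed joins and sampling efficient in Hive.
rows = ["ACC-100", "ACC-101", "ACC-102", "ACC-100"]
print([bucket_for(r) for r in rows])
```

Because the mapping is deterministic, all rows with the same key are co-located in one bucket file, letting Hive prune work during joins and `TABLESAMPLE` queries.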
Confidential, Boston, MA
Responsibilities:
- Installed and configured Hadoop clusters for dev, QA, and production environments.
- Installed and configured the Hadoop NameNode HA service using ZooKeeper.
- Installed and configured Hadoop security and access controls using Kerberos and Active Directory.
- Imported data from a SQL Server database to HDFS using Sqoop.
- Created Hive tables, loaded them with data, and wrote Hive queries that run internally as MapReduce jobs.
- Used Eclipse to develop J2EE components. Components involved a lightweight JSP front end; CSS and scripts were also part of front-end development.
- Designed the J2EE project with the Front Controller pattern.
- Designed CSS and tag libraries for the front end.
- Made extensive use of JavaScript and AJAX to control all user functions.
- Developed front-end screens using jQuery, JavaScript, Java, and CSS.
- Attended business and requirements meetings.
- Used ANT to create build scripts for deployment and to run the JUnit test cases.
- Used VSS extensively for code check-in, check-out, and versioning, maintaining production, test, and development views appropriately.
- Understood the sources of data and organized them in a structured table setup.
- Delivered daily reports and data sheets to clients for their business meetings.
- Performed code reviews, unit testing, and local integration testing.
- Integrated application modules and components and deployed them on the target platform.
- Involved in the requirements study and preparation of the detailed software requirement specification.
- Involved in low-level and high-level design and preparation of HLD and LLD documents in Visio.
- Provided testing support during integration and production.
Environment: Hadoop, Hive, Sqoop, ZooKeeper, MapReduce, WebSphere, DB2, IBM RAD, JDK 1.6, JSP, J2EE, HTML, JavaScript, CSS, Servlets, Struts, JDBC, Oracle, SQL, Log4j, JUnit, VSS, Ant, shell script, Visio.