Big Data / Hadoop Developer Resume
Lansing, MI
PROFESSIONAL SUMMARY:
- 8+ years of experience in information technology: 4 years of expertise in development on the Hadoop/Big Data ecosystem and 4 years of experience in developing and testing Java/J2EE and web-based applications with various back-end databases.
- Good knowledge of and exposure to Big Data processing using the Hadoop ecosystem, including Pig, Hive, HDFS, MapReduce (MRv1 and YARN), Sqoop, Flume, Kafka, Oozie, ZooKeeper, Spark, and Impala.
- Experience with the Cloudera, Hortonworks, MapR, and Amazon Web Services distributions of Hadoop.
- Experience in installing, configuring, and using ecosystem components such as Hadoop MapReduce, HDFS, Sqoop, Pig, Hive, HBase, Impala, and Spark.
- Good knowledge of Hadoop architecture and its components such as HDFS, JobTracker, NameNode, DataNode, and TaskTracker.
- Experience in writing custom Java UDFs to extend Hive and Pig core functionality (see the UDF sketch after this summary).
- Hands-on experience in writing MapReduce jobs in Java.
- Extensive experience in Unix Shell Scripting.
- Experience in working with NoSQL databases such as HBase, Cassandra, and MongoDB.
- Experience in working on databases like MS SQL Server, Oracle, DB2, and MySQL.
- Good knowledge of data analytics on distributed computing clusters such as Hadoop using Apache Spark and Scala.
- Experience in developing data pipelines using Kafka to store data into HDFS.
- Expertise in scheduling and monitoring Hadoop workflows using Oozie and ZooKeeper.
- Expertise in database development, ETL, and reporting tools such as Tableau.
- Expertise in various Java/J2EE technologies like JSP, Servlets, Hibernate, Struts, Spring.
- Experience in Software Development Life Cycle (SDLC) models such as Waterfall and Agile methodologies, including Test-Driven Development, Scrum, and pair programming.
- Experience in using application servers such as WebSphere, WebLogic, and Tomcat.
- Knowledge of implementing Big Data workloads on Amazon Elastic MapReduce (Amazon EMR) for processing and managing the Hadoop framework on dynamically scalable Amazon EC2 instances.
- Sound knowledge of web-based UI development using jQuery UI, jQuery, ExtJS, CSS3, HTML, HTML5, XHTML, and JavaScript.
- Experience with unit, functional, system, and integration testing of applications using JUnit, Mockito, Jasmine, Cucumber, PowerMock, and EasyMock.
- Experience in using IDEs such as Eclipse and Visual Studio, and in DBMSs such as Oracle and MySQL.
- Experience in working with Linux-based operating systems such as Ubuntu, CentOS, and Fedora.
- Strong experience in code debugging and bug fixing.
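As a brief illustration of the Java UDF work mentioned above, the following is a minimal sketch; the class name, column semantics, and registration statement are hypothetical examples, and it assumes the classic Hive `UDF` base class from hive-exec.

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hypothetical UDF that normalizes free-form status codes before aggregation.
// In Hive: ADD JAR ...; CREATE TEMPORARY FUNCTION normalize_status AS 'com.example.udf.NormalizeStatus';
public class NormalizeStatus extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null;                          // Hive passes NULL values through to the UDF
        }
        String cleaned = input.toString().trim().toUpperCase();
        return new Text(cleaned.isEmpty() ? "UNKNOWN" : cleaned);
    }
}
```

Such a class is typically packaged into a JAR, added to the session with ADD JAR, and registered with CREATE TEMPORARY FUNCTION before being used in queries.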
TECHNICAL SKILLS:
Hadoop/Big Data Technologies: Apache Hadoop, HDFS, MapReduce, Pig, Hive, Sqoop, Flume, Hue, HBase, YARN, Oozie, ZooKeeper, MapR Converged Data Platform, Apache Spark, Apache Kafka
Web Technologies: JavaScript, HTML, CSS, XML, AJAX, SOAP
MVC Frameworks: Spring, Hibernate, Struts
Languages: Java, Python, C, C++, SQL, PL/SQL, Ruby, Bash, and Perl
SQL/NoSQL Databases: Apache HBase, MongoDB, Cassandra, Oracle 11g/10g/9i, MS SQL Server, MySQL
Application Servers: WebLogic, WebSphere, Apache Tomcat & JBoss
Testing Frameworks: JUnit, Mockito, PowerMock, EasyMock, Jasmine, Cucumber
ETL Tools: Informatica, Pentaho
Operating Systems: Windows, Mac OS, Linux
PROFESSIONAL EXPERIENCE:
Confidential, Lansing, MI
Big Data / Hadoop Developer
Responsibilities:
- Analyzed data on the Hadoop cluster using different big data analytic tools, including Kafka, Pig, Hive, and MapReduce.
- Configured Spark Streaming to receive real-time data from Kafka and store the streamed data to HDFS using Scala (see the streaming sketch after this list).
- Implemented Spark jobs using Scala and Spark SQL for faster analysis and processing of data.
- Handled importing and exporting data into HDFS and Hive using Sqoop and Kafka.
- Created Hive tables, loaded the data, and wrote Hive queries that run internally as MapReduce jobs.
- Designed and developed ETL workflows in Java for processing data in HDFS/HBase, orchestrated with Oozie.
- Imported unstructured data into HDFS using Flume.
- Wrote complex Hive queries and UDFs.
- Developed shell scripts to simplify execution of the other scripts (Pig, Hive, and MapReduce) and to move data files into and out of HDFS.
- Performed HBase modeling and development supporting customer behavior and lifecycle analysis.
- Converted Hive/SQL queries into Spark transformations using Spark RDDs, Python, and Scala.
- Worked with cloud services such as Amazon Web Services (AWS) on ETL, data integration, and migration.
- Worked with NoSQL databases such as HBase and Cassandra, creating tables to load large sets of semi-structured data.
- Built Java APIs for retrieval and analysis of data in NoSQL databases such as HBase and Cassandra (see the HBase client sketch after this list).
- Wrote database objects for Oracle, including stored procedures, triggers, SQL, PL/SQL packages, and cursors, and created custom Java-backed Oracle functions and Hive UDFs.
- Loaded data from the UNIX file system into HDFS.
- Analyzed large data sets to determine the optimal way to aggregate and report on them.
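A minimal sketch of the Kafka-to-HDFS streaming flow described in this role. The implementation here was in Scala; this sketch uses the Spark Java API for consistency with the other examples, and the broker address, topic, consumer group, and output path are hypothetical. It assumes the spark-streaming-kafka-0-10 integration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka010.ConsumerStrategies;
import org.apache.spark.streaming.kafka010.KafkaUtils;
import org.apache.spark.streaming.kafka010.LocationStrategies;

public class KafkaToHdfs {
    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf().setAppName("KafkaToHdfs");
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(60));

        // Hypothetical broker list and consumer group.
        Map<String, Object> kafkaParams = new HashMap<>();
        kafkaParams.put("bootstrap.servers", "broker1:9092");
        kafkaParams.put("key.deserializer", StringDeserializer.class);
        kafkaParams.put("value.deserializer", StringDeserializer.class);
        kafkaParams.put("group.id", "hdfs-ingest");

        JavaInputDStream<ConsumerRecord<String, String>> stream =
            KafkaUtils.createDirectStream(
                jssc,
                LocationStrategies.PreferConsistent(),
                ConsumerStrategies.<String, String>Subscribe(
                    Collections.singletonList("events"), kafkaParams));

        // Write each non-empty micro-batch to a time-stamped directory in HDFS.
        stream.map(ConsumerRecord::value)
              .foreachRDD((rdd, time) -> {
                  if (!rdd.isEmpty()) {
                      rdd.saveAsTextFile("hdfs:///data/events/" + time.milliseconds());
                  }
              });

        jssc.start();
        jssc.awaitTermination();
    }
}
```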
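A sketch of the kind of Java client code used for retrieval against HBase; the table name, column family, and row key are hypothetical, and it assumes the HBase 1.x client API.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class CustomerLookup {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();   // reads hbase-site.xml from the classpath
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf("customer_events"))) {

            // Fetch one customer's activity columns (hypothetical schema).
            Get get = new Get(Bytes.toBytes("cust-0001"));
            get.addFamily(Bytes.toBytes("activity"));
            Result result = table.get(get);

            byte[] lastLogin = result.getValue(Bytes.toBytes("activity"), Bytes.toBytes("last_login"));
            System.out.println("last_login = " + (lastLogin == null ? "n/a" : Bytes.toString(lastLogin)));
        }
    }
}
```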
Environment: Hadoop, HDFS, MapReduce, Hive, Sqoop, HBase, Apache Spark, Oozie Scheduler, Java, UNIX shell scripts, Kafka, Git, Maven, PL/SQL, Python, Scala, Cloudera
Confidential, Hillsboro, OR
Big Data / Hadoop Developer
Responsibilities:
- Installed and configured Apache Hadoop clusters for application development, along with Hadoop tools such as Hive, Pig, Oozie, ZooKeeper, HBase, Flume, and Sqoop.
- Imported real-time data into Hadoop using Kafka and implemented Oozie jobs for daily imports.
- Exported data from HDFS into an RDBMS using Sqoop for report generation and visualization purposes.
- Installed the Oozie workflow engine to run multiple MapReduce, Hive, and Pig jobs.
- Developed Spark scripts using Scala shell commands as per requirements.
- Converted Hive/SQL queries into Spark transformations using Spark RDDs and Scala (see the sketch after this list).
- Prepared developer (unit) test cases and executed developer testing.
- Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
- Handled cluster coordination services through ZooKeeper and added new nodes to the existing cluster.
- Used Flume extensively for gathering and moving log files from application servers to a central location in the Hadoop Distributed File System (HDFS).
- Created data marts and loaded the data using Informatica.
- Analyzed the Cassandra database and compared it with other open-source NoSQL databases to determine which best suited the current requirements.
- Extracted data from Cassandra and MongoDB through Sqoop and placed it in HDFS for processing.
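A minimal sketch of converting a Hive aggregation into Spark transformations, as described in the bullet above. The original work here was in Scala; this sketch uses the Spark 2.x Java API for consistency with the other examples, and the table and column names are hypothetical.

```java
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import scala.Tuple2;

public class HiveToSpark {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("HiveToSpark")
                .enableHiveSupport()              // read existing Hive tables through the metastore
                .getOrCreate();

        // Equivalent of the Hive query: SELECT region, COUNT(*) FROM sales GROUP BY region (hypothetical table).
        Dataset<Row> sales = spark.table("sales");
        JavaPairRDD<String, Integer> countsByRegion = sales.javaRDD()
                .mapToPair(row -> new Tuple2<>(row.getString(row.fieldIndex("region")), 1))
                .reduceByKey(Integer::sum);

        countsByRegion.take(10).forEach(t -> System.out.println(t._1() + " -> " + t._2()));
        spark.stop();
    }
}
```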
Environment: Apache Hadoop, HDFS, Hive, MapReduce, Java, Pig, Sqoop, Kafka, MySQL, Spark, Scala, Cassandra, MongoDB, Oozie, Informatica, Flume
Confidential, Portland, Oregon
Big Data / Hadoop Developer
Responsibilities:
- Installed, configured, and maintained Apache Hadoop clusters for application development, along with Hadoop tools such as Hive, Pig, HBase, Cassandra, ZooKeeper, and Sqoop.
- Designed logical and physical data models.
- Developed Pig Latin scripts to extract data from web server output files and load it into HDFS.
- Optimized MapReduce code and Pig scripts and performed user interface analysis, performance tuning, and analysis (see the MapReduce sketch after this list).
- Performed big data analysis using Pig and user-defined functions (UDFs).
- Created Hive external tables, loaded data into them, and queried the data using HQL.
- Imported and exported data between Oracle/DB2 and HDFS/Hive using Sqoop.
- Worked on Windows and UNIX/Linux platforms with technologies such as Big Data, SQL, PL/SQL, XML, HTML, Core Java, C#, ASP.NET, and shell scripting.
- Handled cluster coordination services through ZooKeeper.
- Collected log data from web servers and integrated it into HDFS using Flume.
- Used the Oozie workflow engine to run multiple Hive and Pig jobs.
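A compact sketch of the shape of MapReduce code written and tuned in roles like this one; it is the standard word-count pattern on the org.apache.hadoop.mapreduce API, with input and output paths passed as arguments.

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    public static class TokenMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            for (String token : value.toString().split("\\s+")) {
                if (!token.isEmpty()) {
                    word.set(token);
                    context.write(word, ONE);     // emit (word, 1) for every token
                }
            }
        }
    }

    public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenMapper.class);
        job.setCombinerClass(SumReducer.class);   // a combiner cuts shuffle volume, a common tuning step
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```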
Environment: Big Data/Hadoop, Spark, HDFS, MapReduce, Hive, Pig, Sqoop, Flume, Impala, Oozie, Informatica, Java, and DB2.
Confidential
JAVA Application Developer
Responsibilities:
- Responsible for analysis, documenting the requirements, and architecting the application based on J2EE standards; followed test-driven development.
- Participated in designing Use Case, Class, and Sequence diagrams for various engine components and used IBM Rational Rose for generating the UML notation.
- Implemented design patterns such as DAO and Singleton, as well as the MVC architectural pattern of Spring.
- Responsible for secure batch data flow to downstream systems using middleware Java technologies.
- Developed an intranet web application using J2EE architecture, with JSP for the user interfaces and Hibernate for database connectivity.
- Worked extensively with Eclipse as the IDE to develop, test, and deploy the complete application.
- Developed Hibernate objects for data fetching in batch and front-end processing (see the DAO sketch after this list).
- Developed front-end screens using JSP with tag libraries and HTML pages.
- Implemented the JSP Standard Tag Library (JSTL) along with Expression Language (EL).
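A small sketch of the DAO-style Hibernate data fetching described above; the Customer entity and its mapping are hypothetical, and it assumes the classic SessionFactory/Session API.

```java
import java.util.List;
import org.hibernate.Session;
import org.hibernate.SessionFactory;

// Hypothetical mapped entity (mapping via hbm.xml or annotations omitted for brevity).
class Customer {
    private Long id;
    private boolean active;
    // getters and setters omitted
}

// Hypothetical DAO that fetches Customer entities for batch and front-end use.
public class CustomerDao {

    private final SessionFactory sessionFactory;

    public CustomerDao(SessionFactory sessionFactory) {
        this.sessionFactory = sessionFactory;
    }

    /** Load a single customer by primary key. */
    public Customer findById(long id) {
        Session session = sessionFactory.openSession();
        try {
            return (Customer) session.get(Customer.class, id);
        } finally {
            session.close();
        }
    }

    /** Fetch all active customers with an HQL query. */
    @SuppressWarnings("unchecked")
    public List<Customer> findActive() {
        Session session = sessionFactory.openSession();
        try {
            return session.createQuery("from Customer c where c.active = true").list();
        } finally {
            session.close();
        }
    }
}
```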
Environment: Java 1.6, Struts-Spring-Hibernate integration framework, JSP, HTML, Oracle 9i, SQL, PL/SQL, XML, WebLogic, Eclipse, Ajax, jQuery.
Confidential
Java Developer
Responsibilities:
- Involved in Analysis, Design, Coding and Development of custom Interfaces.
- Involved in both maintenance and new enhancements of the application.
- Developed servlets and used JDBC for retrieving data (see the servlet sketch after this list).
- Deployed EJB components on WebLogic.
- Tested the modules and fixed the bugs.
- XML was used to transfer the data between different layers.
- Developed presentation layer using JSP, HTML and CSS.
- Used JavaScript for client-side validations.
- Used JavaBeans helper classes and servlets for interacting with the user.
- Worked on the database interaction layer for insert, update, and retrieval operations on data.
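A minimal sketch of a servlet retrieving data through JDBC, in the spirit of the work above; the JDBC URL, credentials, table, and columns are placeholders (a container-managed DataSource would normally be used instead of DriverManager).

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Hypothetical servlet that looks up an order by id and writes it as plain text.
public class OrderLookupServlet extends HttpServlet {

    @Override
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        String orderId = request.getParameter("orderId");
        response.setContentType("text/plain");
        PrintWriter out = response.getWriter();

        // Connection details are placeholders.
        try (Connection conn = DriverManager.getConnection("jdbc:oracle:thin:@dbhost:1521:orcl", "app", "secret");
             PreparedStatement ps = conn.prepareStatement("SELECT status, total FROM orders WHERE order_id = ?")) {
            ps.setString(1, orderId);
            try (ResultSet rs = ps.executeQuery()) {
                if (rs.next()) {
                    out.println("status=" + rs.getString("status") + ", total=" + rs.getBigDecimal("total"));
                } else {
                    out.println("No order found for id " + orderId);
                }
            }
        } catch (SQLException e) {
            throw new ServletException("Order lookup failed", e);
        }
    }
}
```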
Environment: Java, JSP, Servlets, JDBC, EJB, JavaScript, XML, WebLogic, and Oracle 8i.