Hadoop Developer Resume
SUMMARY
- Over 7 years of IT experience in Design, Development, Deployment, Maintenance, and Support of Java/J2EE applications.
- 3 years of experience with the Hadoop Distributed File System (HDFS), Impala, Hive, HBase, Spark, Hue, the MapReduce framework, and Sqoop.
- Around 1 year of experience with Spark and Scala.
- Experienced Hadoop developer with expertise in providing end-to-end solutions for real-time big data problems by applying distributed processing concepts such as MapReduce on HDFS and other Hadoop ecosystem components.
- Experience working on large-scale big data implementations in production environments.
- Hands-on experience migrating data from relational databases to the Hadoop platform using Sqoop.
- Experienced in using Pig scripts to perform transformations, event joins, filters, and pre-aggregations before storing the data in HDFS.
- Developed analytical components using Scala, Spark, and Spark Streaming.
- Expertise in developing both front-end and back-end applications using Java, Servlets, JSP, Web Services, JavaScript, HTML, Spring, Hibernate, JDBC, and XML.
- Worked on WebLogic and Tomcat web servers for development and deployment of Java/J2EE applications.
- Good experience with Spring and Hibernate, and expertise in developing JavaBeans.
- Working knowledge of WebLogic server clustering.
- Proficient in various web-based technologies such as HTML, XML, XSLT, and JavaScript.
- Expertise in unit testing using JUnit.
- Responsible for the formation and direction of Business Intelligence (Oracle Appliance, 11g).
TECHNICAL SKILLS
Big Data: Apache Spark, MapReduce, HBase, Pig, Hive, Sqoop, Flume, Kafka, Impala, Oozie.
Languages: C, C++, Java, J2EE, PL/SQL, Pig Latin, HiveQL, Scala.
Databases: Oracle, SQL Server, MySQL, HBase (NoSQL).
Scripting Languages: UNIX shell scripting, JavaScript, Python.
Frameworks: Spring, Struts, Hibernate.
Web Services: SOAP, RESTful, JAX-WS, Apache Axis.
Web Technologies: HTML, XHTML, CSS, XML, XSL, XSLT, Ajax.
Web Servers: WebLogic, WebSphere, Apache Tomcat.
Version Control Systems: CVS, SVN, Git/GitHub.
PROFESSIONAL EXPERIENCE
Confidential
Hadoop Developer
Responsibilities:
- Working on a wholesale loss forecasting application for the CCAR team, as part of the Global Risk Analytics organization, on a Hadoop and Spark architecture.
- Designed and developed Oozie workflows for timely loading of data into the Hadoop ecosystem from other data sources.
- Developed validations on the data before feeding it as input to the different models.
- Used PySpark to perform transformations and data processing for the models.
- Analyzed the commercial losses and coordinated with the business users.
- Migrated the existing loss-prediction business models written in MapReduce to PySpark and developed new models directly in PySpark. The output produced by these models is used to generate reports and to predict the losses for the particular loan data.
- Experience handling the hundreds to thousands of Spark jobs run during stress testing as part of the CCAR cycle.
- Wrote Impala queries used as part of data quality (DQ) checks and other post-model processes.
- Experience handling JSON datasets and writing custom Python functions to parse JSON data using Spark (see the sketch following this list).
- Worked extensively on performance tuning of Spark jobs and Hive/Impala queries.
- Hands-on experience monitoring Spark applications through the Spark UI to identify executor failures, data skew, and other runtime issues.
- Provided a solution for compacting small files in Hadoop and computing statistics on Hive tables as part of maintaining the overall stability of the Hadoop cluster and Impala (a compaction sketch also follows this list).
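Illustrative sketch of the JSON-parsing approach mentioned above, in PySpark. The schema, field names, and paths are hypothetical placeholders, not the actual CCAR datasets.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = SparkSession.builder.appName("json-parsing-sketch").getOrCreate()

# Hypothetical schema; the real loan datasets and fields are not shown here
schema = StructType([
    StructField("loan_id", StringType()),
    StructField("balance", DoubleType()),
])

# Read line-delimited JSON as raw text, then parse each line against the schema
raw = spark.read.text("/data/raw/loans_json/")  # placeholder path
parsed = raw.select(from_json(col("value"), schema).alias("j")).select("j.*")
parsed.filter(col("balance") > 0).show()
```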
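Likewise, a minimal sketch of the small-file compaction idea: read a directory containing many small files and rewrite it as a handful of larger ones. The paths and the target file count are assumptions.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("compact-small-files").getOrCreate()

# Placeholder paths; the real tables and locations are not shown here
df = spark.read.parquet("/data/landing/loans/")  # directory with many small files

(df.coalesce(16)                 # merge partitions into ~16 larger output files
   .write.mode("overwrite")
   .parquet("/data/compacted/loans/"))
```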
Confidential
Hadoop/Spark/Scala Developer
Responsibilities:
- Developed scripts to perform business transformations on the data using Hive and Pig.
- Developed UDFs in Java for Hive and Pig; worked on reading multiple data formats from HDFS using Scala.
- Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs and Scala (illustrated in the sketch following this list).
- Developed multiple POCs using Scala and deployed them on the YARN cluster; compared the performance of Spark with Hive and SQL/Teradata.
- Analyzed the SQL scripts and designed the solution for implementation in Scala.
- Performed data analysis with Pig, MapReduce, and Hive.
- Designed and developed the data ingestion component.
- Provided cluster coordination services through ZooKeeper.
- Imported data from Oracle into HDFS, and imported/exported data between HDFS and the relational database Teradata, using Sqoop (an example command follows this list).
- Developed a POC on Apache Spark and Kafka.
- Implemented Flume, Spark, and Spark Streaming for real-time data processing.
- Hands-on experience installing, configuring, and using ecosystem components such as Hadoop MapReduce, HDFS, HBase, Pig, Flume, Hive, and Sqoop.
- Developed analytical components using Scala, Spark, and Spark Streaming.
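Illustrative sketch of converting a Hive query into a Spark transformation, as referenced above. The production work was in Scala with RDDs; this PySpark DataFrame version shows the same idea against a hypothetical employees table.

```python
# Hive equivalent: SELECT dept, AVG(salary) FROM employees GROUP BY dept
from pyspark.sql import SparkSession
from pyspark.sql.functions import avg

spark = SparkSession.builder.enableHiveSupport().getOrCreate()

employees = spark.table("employees")  # hypothetical Hive table
employees.groupBy("dept").agg(avg("salary").alias("avg_salary")).show()
```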
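And a sketch of the Sqoop import flow mentioned above; the connection string, credentials, table, and target directory are placeholders.

```sh
# Hypothetical Oracle connection and table; -P prompts for the password
sqoop import \
  --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
  --username etl_user -P \
  --table CUSTOMERS \
  --target-dir /user/etl/customers \
  --num-mappers 4
```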
Environment: Java, Scala, Python, J2EE, Hadoop, Spark, HBase, Hive, Pig, Sqoop, MySQL, Teradata, GitHub.
Confidential
Hadoop/Spark Developer
Responsibilities:
- Implemented advanced procedures such as text analytics and processing using the in-memory computing capabilities of Apache Spark, written in Scala.
- Developed and executed shell scripts to automate the jobs.
- Wrote complex Hive queries and UDFs.
- Worked extensively on the Spark Core and Spark SQL modules (see the sketch following this list).
- Developed multiple MapReduce jobs in Java for data cleaning and pre-processing.
- Experienced in defining job flows using Oozie.
- Experienced in collecting, aggregating, and moving large amounts of streaming data into HDFS using Flume.
- Working knowledge of NoSQL databases such as HBase and Cassandra.
- Ran ad-hoc queries using Pig Latin, Hive, or Java MapReduce.
- Developed PowerCenter mappings to extract data from various databases and flat files and load it into the data mart using Informatica 8.6.1.
- Involved in log file management: logs more than 7 days old were removed from the log folder, loaded into HDFS, and retained for 3 months (a sketch of this archival step follows this list).
- Conducted Scrum daily stand-up, product backlog, sprint planning, sprint review, and sprint retrospective meetings.
- Worked with the reporting team to generate reports from the data mart using Cognos.
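Illustrative sketch of the Spark SQL work referenced above: register a DataFrame as a temporary view and query it with SQL. The events data is a made-up example.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("spark-sql-sketch").getOrCreate()

# Made-up data; illustrates only the view-plus-SQL pattern
events = spark.createDataFrame(
    [("click", 3), ("view", 10), ("click", 7)], ["event_type", "cnt"]
)
events.createOrReplaceTempView("events")
spark.sql(
    "SELECT event_type, SUM(cnt) AS total FROM events GROUP BY event_type"
).show()
```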
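And a minimal sketch of the 7-day log archival described above, assuming a local log folder and an HDFS archive path (both placeholders); the 3-month cleanup inside HDFS would be a separate job.

```python
import os
import subprocess
import time

LOG_DIR = "/var/app/logs"         # placeholder local log folder
HDFS_DIR = "/archive/logs"        # placeholder HDFS archive path
CUTOFF = time.time() - 7 * 86400  # anything older than 7 days

for name in os.listdir(LOG_DIR):
    path = os.path.join(LOG_DIR, name)
    if os.path.isfile(path) and os.path.getmtime(path) < CUTOFF:
        # Copy the file into HDFS, then remove it locally
        subprocess.run(["hdfs", "dfs", "-put", "-f", path, HDFS_DIR], check=True)
        os.remove(path)
```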
Environment: Apache Hadoop, EDW, SQL Server 2005, TOAD, Rapid SQL, Oracle 10g (RAC), HDFS, MapReduce, Java, VMware, Hive, Eclipse, Pig, HBase, Sqoop, Flume, Linux, UNIX, DB2.
Confidential
Java Developer
Responsibilities:
- Involved in different phases of the Software Development Life Cycle (SDLC), including requirements gathering, analysis, design, and development of the application.
- Involved in the design and implementation of the MVC design pattern using the Spring framework for the web tier.
- Worked on web services using SOAP and REST.
- Involved in developing the user interface using Struts.
- Wrote several Action classes and ActionForms to capture user input, and created different web pages using JSTL, JSP, HTML, custom tags, and Struts tags.
- Designed and developed message flows, message sets, and other service components to expose mainframe applications to enterprise J2EE applications.
- Used standard data access technologies such as JDBC and the ORM tool Hibernate.
- Worked on various client websites that used the Struts 1 framework and Hibernate.
- Wrote test cases using the JUnit testing framework and configured applications on WebLogic Server.
- Involved in writing stored procedures, views, user-defined functions and triggers in SQL Server database for Reports module.
Environment: Java, Spring MVC, Struts 1, RESTful, JSP, JUnit, Eclipse, JIRA, JDBC, Hibernate, WebLogic, Oracle 9i.
Confidential
Java Developer
Responsibilities:
- Involved in the design and implementation of the MVC design pattern using the Spring framework for the web tier.
- Used the Spring framework for dependency injection and transaction management.
- Developed the web interface using JSP, the Standard Tag Library (JSTL), and the Struts framework.
- Used Struts as the MVC framework for designing the complete web tier.
- Implemented REST web services using the Apache CXF framework.
- Developed GUI screens as JSPs using HTML, DHTML, and CSS to design the pages according to Client Experience Workbench standards.
- Validated user input using the Struts Validation Framework.
- The Data Access Object (DAO) framework was bundled as part of the Hibernate database layer.
- Client-side validations were implemented using JavaScript.
- Implemented logging and debugging with Log4j.
- Version control of the code and configuration files was maintained with CVS.
- Developed PL/SQL packages and triggers.
- Developed test cases for unit testing and performed integration and system testing.
Environment: Spring MVC, Spring JDBC, J2EE, Hibernate, RESTful, WebLogic, Eclipse, Struts 1.0, JDBC, JavaScript, CSS, XML, Ant, Log4j, VSS, PL/SQL, and Oracle 8i.