
Hadoop/Spark Developer Resume


SUMMARY:

  • 8+ years of IT experience, including 3+ years in Big Data using the Hadoop Distributed File System, the MapReduce framework and the Hadoop big data ecosystem.
  • Experience working with major components for structured and unstructured data in the Hadoop ecosystem, such as Hadoop MapReduce, Apache Crunch, HDFS, Hive, HCatalog, Pig, Falcon, Sqoop, Scala, Python, HBase, Flume, Spark, Storm, Kafka, Oozie and ZooKeeper, on Cloudera and Hortonworks distributions and AWS.
  • Excellent understanding of Hadoop and YARN architecture and of setting up, configuring and monitoring components such as HDFS, Job Tracker, Task Tracker, Name Node, Data Node, YARN and MapReduce on Hadoop clusters.
  • Sun Certified Java Programmer with extensive programming experience in developing web-based applications and client-server technologies using Java and J2EE.
  • Analyzed large data sets of structured, semi-structured and unstructured data using HiveQL, Pig Latin and MapReduce programs.
  • Experience in importing and exporting data using Sqoop from/to RDBMS such as Teradata and DB2. Used the Spark API over Cloudera Hadoop YARN to perform analytics on data in Hive.
  • Hands-on experience in designing and implementing complete end-to-end Hadoop infrastructure including Pig, Hive, Sqoop, Oozie, Flume and ZooKeeper.
  • Experienced in managing and reviewing the Hadoop log files.
  • Worked with data in multiple file formats including Avro, ORC, RC and Text/CSV.
  • Developed Spark code using Python, Scala and Spark-SQL for faster testing and data processing.
  • Experienced in Extraction, Transformation, and Loading (ETL) processes based on business needs, using Sqoop, Falcon and Oozie workflows to execute multiple Java, Hive, Shell and SSH actions.
  • Created detailed AWS Security Groups, which behaved as virtual firewalls controlling the traffic allowed to reach one or more AWS EC2 instances.
  • Good understanding of NoSQL databases like HBase and Cassandra.
  • Hands-on experience in stream processing frameworks such as Storm and Spark Streaming.
  • Experience in automating Hadoop installation, configuration and cluster maintenance using tools like Puppet.
  • Solid understanding and extensive experience in working with different databases such as Oracle, SQL Server, MySQL and writing Stored Procedures, Functions, Joins and Triggers for different Data Models.
  • Experience in working with Flume to load log data from multiple sources directly into HDFS.
  • Excellent Java development skills using J2EE, Servlets and JUnit, and familiarity with popular frameworks such as Spring, MVC and AJAX.
  • Extensive experience in PL/SQL, developing stored procedures with optimization techniques.
  • Adept at Web Development and experience in developing front end applications using JavaScript, CSS and HTML.
  • Experienced with distributed message brokers (such as Kafka).
  • Expertise in Waterfall and Agile - SCRUM methodologies.
  • Excellent team player, with pleasant disposition and ability to lead a team and a proven track record.

PROFESSIONAL EXPERIENCE

Confidential

Hadoop/Spark Developer

Responsibilities:

  • Designed and developed Talend jobs with Map/Reduce and non-M/R components to fetch the data from external resources into HDFS and create external Hive tables on top of the data.
  • Developed Spark SQL jobs to load tables into HDFS and run select queries on top of them.
  • Used Spark Streaming to divide streaming data into batches as input to the Spark engine for batch processing.
  • Involved in converting Hive/SQL queries into Apache Spark transformations using Apache Spark DataFrames and Python.
  • Migrated existing MapReduce programs to Spark models using Python. Developed predictive analytics using Apache Spark Scala APIs.
  • Implemented various Data Quality rules to ensure traffic data meets quality standards as outlined by analytics stakeholders.
  • Built custom UDFs for Hive in Python.
  • Used Hive partitioning and bucketing, performed joins on Hive tables and utilized Hive SerDes such as CSV and JSON.
  • Realized various initiatives from Apache Software Foundation, vetted new frameworks and built Proof of Concepts.
  • Implemented machine learning techniques like clustering and regression using the Spark API.
  • Developed workflows to cleanse and transform raw data into useful information and load it into a Kafka queue, from which it was loaded into HDFS and a NoSQL database.
  • Developed Sqoop jobs both to import data into HDFS from relational database management systems like Teradata and DB2 and to export data from HDFS to Teradata.
  • Developed workflows for the complete end-to-end ETL process: getting data into HDFS, validating it and applying business logic, storing clean data in Hive external tables, exporting data from Hive to RDBMS sources for reporting and escalating any data quality issues.
  • Built scalable distributed data solutions using Hadoop. Developed MapReduce jobs written in Java to apply the business logic.
  • Developed Pig functions to preprocess the data for analysis. Developed Spark scripts using Scala shell commands as per the requirements.
  • Created Oozie workflows to Sqoop the data from source to HDFS and then to target tables.
  • Created HBase tables to store different formats of data as a backend for user portals.
  • Analyzed system failures, identified their root causes and recommended courses of action.
  • Functioned as the point of contact for tracking issues and communicating them to the vendors and all other stakeholders. Experienced with batch processing of data sources using Apache Spark and Elasticsearch.
  • Developed utilities in Python to be used by ingestion workflows as part of Data Ingestion Process.

Environment: Hadoop, Hive, Crunch, Falcon, Kafka, Oozie, Sqoop, Pig, HBase, Spark, Oracle, Teradata, Scala, Java, Python, SQL Navigator, Spark Streaming, Eclipse IDE.

Confidential, Delaware

Hadoop Developer

Responsibilities:

  • Designed Hive tables to load data to and from external tables.
  • Loaded and transformed large sets of unstructured data from UNIX systems to HDFS.
  • Involved in developing a linear regression model, built using Apache Spark with the Python API, to predict a continuous measurement and improve observations on wind turbine data.
  • Developed, validated and maintained HiveQL queries.
  • Fetched data to/from HBase using MapReduce jobs.
  • Handled importing of data from various data sources, performed transformations using Hive and MapReduce, and loaded data into HDFS.
  • Upgraded IBM Maximo database from 5.2 to 7.5.
  • Analyzed, validated and documented the changed records for the IBM Maximo web application.
  • Imported data from a MySQL database into Hive using Sqoop.
  • Ran reports using Pig and Hive queries.
  • Wrote and implemented Apache Pig scripts to load data from and store data into Hive.
  • Installed and configured Hue.
  • Hands-on experience in installing, configuring, supporting and managing Hadoop clusters using Apache, Cloudera (CDH3, CDH4) and YARN distributions.
  • Supported the full testing cycle for ETL processes, including bug fixes.
  • Worked as a Cassandra developer, setting up the configuration and optimizing the Cassandra cluster.
  • Developed a real-time Java-based application to work with the Cassandra database.

Environment: Hadoop, Hive, Crunch, Falcon, Kafka, Oozie, Sqoop, Pig, HBase, Spark, Oracle, Teradata, Scala, Java, Python, SQL Navigator, Spark Streaming, Eclipse IDE.

Confidential, NY

Java/Hadoop Developer

Responsibilities:

  • Developed Oozie workflows for daily incremental loads, which get data from Teradata and then import it into Hive tables.
  • Developed Pig scripts to transform the data into a structured format and automated them through Oozie coordinators.
  • Developed Hive queries for Analysis across different banners.
  • Loaded data from different data sources (Teradata and DB2) into HDFS using Sqoop and then into partitioned Hive tables.
  • Developed Hive UDFs to bring all customer email IDs into a structured format.
  • Scheduled all Bash scripts using the Resource Manager scheduler.
  • Moved data from HDFS to Cassandra using MapReduce and the BulkOutputFormat class.
  • Developed MapReduce programs for applying business rules on the data.
  • Developed and executed Hive queries for denormalizing the data.
  • Supported Data Analysts in running MapReduce Programs.
  • Worked on importing and exporting data into HDFS and Hive using Sqoop.
  • Worked on analyzing data with Hive and Pig.

Environment: Hadoop, Hive, Linux, MapReduce, HDFS, Pig, Sqoop, Java 1.5, J2EE, AXIS 2

Java Team Lead

Confidential

Responsibilities:

  • Involved in end-to-end development of an SSO solution using the OpenAM product for State Farm integration ICP platform and E+ platform.
  • Developed and implemented UI controls and APIs with ExtJS.
  • Administered and supported ExtJS applications within scope.
  • Used the JAAS API to authenticate (Login, ForgotPassword, ForgotUserid flows) users belonging to a particular realm and LDAP group.
  • Created REST- and SOAP-based web services with CXF for mobile authentication.
  • Refactored code using design patterns such as Decorator, Visitor, Factory, Proxy, Adapter and Singleton.
  • Connected to LDAP using ForgeRock and the Spring LDAP API.
  • Built a Spring MVC architecture with Spring 3 MVC and implemented DI (IoC), AOP and Extensible XML.
  • Used JUnit frameworks like Mockito and PowerMock with 90% code coverage.
  • Enabled OpenAM J2EE Agent to provide authentication and coarse grained authorization.
  • Familiar with virtualization (VMware) and cloud concepts; provisioned Windows and Linux app servers with application blueprints using VMware Cloud Portal (vCAC, Spring Batch Job and Puppet scripting).
  • Attended training sessions on VMware Products installation and configurations.
  • Performed whitelisting for applications.
  • Used Maven features like overlays, Cobertura and mutation features.
  • Knowledge of Cucumber/JBehave automation test cases.
  • Experienced with high availability websites (99.9%).
  • Used point-to-point messaging for audit logs.
  • Integrated Login application with iOS and native mobile applications using REST API.
  • Familiar with concepts like encoding, decoding.

Environment: OpenAM 10 & 11, OpenDJ, LDAP, ExtJS, Spring MVC, vFabric, Spring 3, Web Services, VisualVM, VMware vFabric tc Server, vCAC, Hyperic, GemFire, IBM Tivoli Directory Server (TDS) 6.2.0.27, Radiant Logic RadiantOne Virtual Directory Server 5.3.7, Cucumber, SOAP 1.2, CXF, RESTful, Tomcat, SVN-Tortoise, Mule ESB, HP dynamic scan, HP static scan, Mule Flows, WebSphere MQ, RabbitMQ, JAXB, Maven, Jenkins, SONAR, Checkstyle, JUnit, Audit Logs, Framework, XML, SAML, WSDL, WADL, HTML, Splunk, PostgreSQL, jQuery, AJAX, JSON, JavaScript, Mustache template, CSS, MS Visio, Eclipse, XSLT, ESAPI, Federation, F5, SOAP UI 4.6.4, VMware DataCenter, VMware Hypervisor, VMware ESXi 6.0, vRealize

Confidential, NJ

Front End Developer

Responsibilities:

  • Created front end UI screens in XHTML using JSP. Captured commonalities across UI screens in reusable UI components.
  • Performed front end validation using Spring Web Flow validation framework.
  • Translated UI designs into well-organized and structured HTML/CSS compatible with modern browsers, and used open source tools and frameworks to improve the structure and maintainability of the front-end code.
  • Routed user requests to Spring Web Flows which in turn call action classes.
  • Managed and extended the JavaScript/HTML codebase, primarily jQuery.
  • Gathered requirements from application users and the functional team. Formulated the requirements and developed the system design using UML artifacts.
  • Exposed EJBs with Spring services across modules, and published individual EARs, one per business module, to establish module-wise deployment.
  • Designed and developed user interfaces and menus using HTML, JSP and JavaScript, with client-side and server-side validations. Developed services which in turn talk to the DAO layer to communicate with databases.
  • Followed industry best practices across all levels of development and effectively used design patterns in designing business modules.
  • Actively participated in the development of cross-functional items like logging, error handling, exception handling, auditing (AOP), etc.
  • Configured WebSphere Application Server (WAS) for development, testing and pre-production release to the customer.
  • Maintained documentation throughout the development process in Microsoft SharePoint. Used ClearQuest for communicating business-related tasks across team members.

Environment: WAS 7.5, Java 1.6, Windows, Spring Web Flow, Spring MVC, Spring Security, Crystal Reports 11X, HTML, JSP, jQuery, JPA, MyBatis, Unit, JAXB, CAPTIVA, FileNet, MQ Series, ILOG JRules, WebSphere ESB, SQL/PLSQL, AJAX, Oracle 11i, DB2 9, XML, XSLT, CSS, DHTML, JavaScript, JIRA, Log4J.

Confidential, Columbus, Ohio

J2EE Developer

Responsibilities:

  • Developed, implemented, and maintained MVC architecture using the Spring and Seam frameworks.
  • Involved in developing front end screens using JSPX, JSF, Tags, Dojo, jQuery, DOM, JSTL, HTML, CSS, AJAX and JavaScript.
  • Developed custom validators using JSF and implemented server side validations.
  • Used the JSF web application framework for developing server-side Dojo user interfaces.
  • Used message resource file to display application information and error messages.
  • Wrote test cases for all the classes developed in the DAO layer.
  • Incorporated design patterns like MVC pattern, DAO pattern, DTO pattern and factory pattern.
  • Developed various Action classes as a controller component for handling the user actions.
  • Developed bean classes and DAOs for implementing Hibernate object-relational (O/R) mapping for persistence in DB2 and Oracle databases.
  • Developed several Crystal Reports like Daily, Weekly and Monthly reports.
  • Involved in configuring the pages.xml, web.xml and validation.xml.
  • On the database end, involved in the creation of tables, triggers, stored procedures, sub-queries, joins and views.
  • Involved in communicating with business analysts to resolve the application's production issues and to deliver the best quality application enhancements to the client.
  • Involved in maintenance and enhancement of the Seam 2 and Java 1.4/1.5 versions of the same application in production.
  • Used CVS as a version control tool.

Environment: WebSphere 6.2, Java 1.4/1.5, Windows, RAD 7.0, Spring, Seam 2.1, JSPX, JSF, Dojo, jQuery, JDBC, JAX-WS, Web services, SOAPUI, Hibernate 3.0, JUnit, FileNet, Adobe Professional, SQL/PLSQL, AJAX, Oracle 10g, DB2 9, XML, XSLT, CSS, HTML, DHTML, JavaScript, Log4j, Mercury QC, MQ.
