Senior Hadoop Developer Resume
San Jose, CA
SUMMARY
- 8+ years of professional IT experience in analyzing requirements and designing and building highly distributed, mission-critical products and applications.
- 4+ years of big data experience with Apache Hadoop on Cloudera and Hortonworks distributions.
- Expertise in the core Hadoop technology stack, which includes HDFS, MapReduce, Oozie, Hive, Sqoop, Pig, Flume, HBase, Spark, Storm, Kafka, and ZooKeeper.
- Well versed in installing, configuring, supporting, and managing Hadoop clusters and their underlying big data infrastructure.
- Experienced in implementing complex algorithms on semi-structured and unstructured data using MapReduce programs.
- Hands-on experience in AWS cloud services like S3, EMR, and Redshift.
- Experienced in working with structured data using HiveQL, join operations, Hive UDFs, partitions, bucketing, and internal/external tables.
- Experienced in migrating ETL-style operations to Pig using Pig transformations, operators, and UDFs.
- Built Spark Streaming jobs that collect data from Kafka in near real time, perform the necessary transformations and aggregations on the fly to build the common learner data model, and persist the data in a columnar store (HBase).
- Specialized in ingesting and processing data from various RDBMS sources into a Hadoop cluster using MapReduce, Pig, Hive, and Sqoop.
- Experienced in implementing a unified data platform that pulls data from different sources using Apache Kafka brokers and clusters with Java producers and consumers (see the producer sketch after this list).
- Experienced in working with in-memory processing frameworks, including Spark transformations and Spark Streaming.
- Excellent understanding and knowledge of NoSQL databases such as HBase and MongoDB, along with Teradata and data warehousing.
- Developed fan-out Flume workflows to ingest data from sources such as web servers and REST APIs, landing the data in Hadoop through the HDFS sink.
- Experienced in implementing custom Flume interceptors and serializers for specific customer requirements (see the interceptor sketch after this list).
- Good knowledge of Amazon EMR, S3 buckets, and Redshift.
- Experience in setting up standards and processes for Hadoop-based application design and implementation.
- Worked on importing and exporting data from MySQL into HDFS, Hive, and HBase using Sqoop.
- Experienced in cloud integration with AWS using Elastic MapReduce (EMR), Simple Storage Service (S3), EC2, and Redshift.
- Built tooling that monitored log input from several datacenters via Spark Streaming; the data was analyzed in Apache Storm, then parsed and saved into HBase.
- Experience in importing and exporting data between HDFS and relational database systems such as MySQL using Sqoop.
- Excellent understanding of Hadoop architecture and components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and the MapReduce programming paradigm.
- Good exposure to Apache Hadoop MapReduce programming, Pig scripting, distributed applications, and HDFS.
- Experience in managing Hadoop clusters using Cloudera Manager and Ambari.
- Worked on cluster coordination services through ZooKeeper.
- Hands-on experience in application development using Java, RDBMS, and Linux shell scripting.
- Experience on different operating systems, including UNIX, Linux, and Windows.
- Experience with Java multithreading, collections, interfaces, synchronization, and exception handling.
- Involved in writing PL/SQL stored procedures, triggers, and complex queries.
- Worked in an Agile environment with active Scrum participation.
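As a point of reference for the Kafka producer work listed above, a minimal Java sketch is shown below; the broker address, topic name, key, and payload are placeholder assumptions rather than details from any specific project.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class EventProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Broker address and serializers; values here are placeholders.
        props.put("bootstrap.servers", "broker1:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (Producer<String, String> producer = new KafkaProducer<>(props)) {
            // Topic "learner-events" is a hypothetical name used for illustration.
            producer.send(new ProducerRecord<>("learner-events", "event-key", "{\"event\":\"example\"}"));
        }
    }
}
```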
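For the custom Flume interceptor work mentioned above, a minimal sketch against the standard org.apache.flume.interceptor.Interceptor API is shown next; the header name and value it stamps on each event are illustrative assumptions.

```java
import java.util.List;
import java.util.Map;
import org.apache.flume.Context;
import org.apache.flume.Event;
import org.apache.flume.interceptor.Interceptor;

// Hypothetical interceptor that tags each event with its originating source.
public class SourceTagInterceptor implements Interceptor {

    @Override
    public void initialize() { }

    @Override
    public Event intercept(Event event) {
        Map<String, String> headers = event.getHeaders();
        headers.put("ingest-source", "rest-api"); // illustrative header
        return event;
    }

    @Override
    public List<Event> intercept(List<Event> events) {
        for (Event event : events) {
            intercept(event);
        }
        return events;
    }

    @Override
    public void close() { }

    // Flume instantiates interceptors through a Builder named in the agent configuration.
    public static class Builder implements Interceptor.Builder {
        @Override
        public Interceptor build() {
            return new SourceTagInterceptor();
        }

        @Override
        public void configure(Context context) { }
    }
}
```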
TECHNICAL SKILLS
Hadoop/Big Data: HDFS, MapReduce, HBase, Pig, Hive, Sqoop, MongoDB, Flume, Oozie, ZooKeeper, AWS, Spark, Kafka, Teradata, ETL (Kettle).
Java & J2EE Technologies: Core Java, Servlets, JSP, JDBC, Java Beans
IDEs: Eclipse, IntelliJ IDEA.
Frameworks: MVC, Struts, Hibernate, Spring.
Programming languages: C, C++, Java, Ant scripts, Linux shell scripts
Databases: Oracle 11g/10g/9i, MySQL, DB2, MS SQL Server
AWS: S3, Redshift
Web Servers: WebLogic, WebSphere, Apache Tomcat
Web Technologies: HTML, XML, JavaScript, AJAX, SOAP, WSDL, JAX-RS, RESTful, JAX-WS.
Version Controls: CVS, SVN, GIT.
PROFESSIONAL EXPERIENCE
Senior Hadoop Developer
Confidential, San Jose, CA
Responsibilities:
- Designed applications end to end, from ingestion through report delivery to third-party vendors, using big data technologies such as Flume, Kafka, Sqoop, MapReduce, Hive, and Pig.
- Responsible for building scalable distributed data solutions using Hadoop.
- Analyzed large data sets to determine the optimal way to aggregate and report on them using MapReduce programs.
- Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from HDFS to MySQL using Sqoop.
- Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
- Experience with the AWS cloud environment and S3 storage.
- Migrated databases from SQL Server to Amazon Redshift using AWS DMS and SCT; worked on AWS Data Pipeline to configure data loads from S3 into Redshift.
- Tuned and optimized AWS Redshift performance by choosing the correct distribution key (DISTKEY) and sort key (SORTKEY).
- Created ETL jobs to load JSON and server data into MongoDB and transformed MongoDB data into the data warehouse using the Kettle tool.
- Configured Spark Streaming to receive real-time data from Kafka and store the stream data in HDFS using Scala (see the streaming sketch after this list).
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
- Processed raw log files from set-top boxes using Java MapReduce code and shell scripts and stored them as text files in HDFS (a mapper sketch also follows this list).
- Ingested data from legacy and upstream systems into HDFS using Apache Sqoop, Flume, Java MapReduce programs, Hive queries, and Pig scripts.
- Generated the required reports for the operations team from the ingested data using Oozie workflows and Hive queries.
- Built alerting and monitoring for the Oozie jobs, covering both failure and success conditions, using email notifications.
- Responsible for migrating the code base from the Cloudera platform to Amazon EMR and evaluated Amazon ecosystem components such as Redshift and DynamoDB.
- Worked with application teams to install operating system and Hadoop updates, patches, and version upgrades as required.
- Wrote MapReduce code to turn unstructured and semi-structured data into structured data and loaded it into Hive tables.
- Worked on debugging and performance tuning of Hive and Pig jobs.
- Analyzed system failures, identified root causes, and recommended courses of action as part of operations support.
- Involved in a Spark Streaming solution for time-sensitive, revenue-generating reports to keep pace with upstream set-top box (STB) data.
- Worked on HBase with Apache Phoenix as a data layer to serve web requests and meet SLA requirements.
- Responsible for functional requirements gathering, code reviews, deployment scripts and procedures, offshore coordination, and on-time deliverables.
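The Kafka-to-HDFS Spark Streaming ingestion described above followed roughly this shape. The sketch below uses the Java API rather than Scala, assumes Spark 1.6+ with the receiver-based KafkaUtils.createStream connector (spark-streaming-kafka 0.8), and the ZooKeeper quorum, consumer group, topic, and output path are hypothetical placeholders.

```java
import java.util.HashMap;
import java.util.Map;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaPairReceiverInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka.KafkaUtils;

public class KafkaToHdfsJob {
    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf().setAppName("stb-kafka-to-hdfs");
        // 30-second micro-batches; interval chosen for illustration.
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(30));

        Map<String, Integer> topics = new HashMap<>();
        topics.put("stb-events", 1); // hypothetical topic and receiver-thread count

        JavaPairReceiverInputDStream<String, String> stream =
                KafkaUtils.createStream(jssc, "zk-host:2181", "stb-consumer-group", topics);

        // Drop the Kafka key and persist each micro-batch of values to HDFS as text files.
        stream.map(record -> record._2())
              .foreachRDD(rdd -> rdd.saveAsTextFile("hdfs:///data/stb/stream/" + System.currentTimeMillis()));

        jssc.start();
        jssc.awaitTermination();
    }
}
```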
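For the raw set-top-box log processing, a map-only job along these lines would emit structured, tab-delimited records; the pipe-delimited input layout and field names are assumptions for illustration, not the actual log format.

```java
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Turns a raw, pipe-delimited set-top-box log line into a tab-delimited structured record.
public class StbLogMapper extends Mapper<LongWritable, Text, NullWritable, Text> {

    private final Text structured = new Text();

    @Override
    protected void map(LongWritable offset, Text line, Context context)
            throws IOException, InterruptedException {
        String[] fields = line.toString().split("\\|");
        if (fields.length < 3) {
            return; // skip malformed lines
        }
        // deviceId <TAB> eventTime <TAB> eventType
        structured.set(fields[0].trim() + "\t" + fields[1].trim() + "\t" + fields[2].trim());
        context.write(NullWritable.get(), structured);
    }
}
```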
Environment: Hadoop, MapReduce, HDFS, Hive, Pig, Java, SQL, Sqoop, Spark, Kafka, AWS (S3, EMR, Redshift), MongoDB, ETL.
Hadoop Developer
Confidential, El Segundo, CA
Responsibilities:
- Worked on writing MapReduce jobs to discover trends in data usage by customers.
- Designed and worked on a big data analytics platform for processing customer interface preferences and comments using Java, Hadoop, Hive, and Pig.
- Involved in Hive-HBase integration by creating Hive external tables with HBase as the storage format.
- Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
- Experienced in defining job flows to run multiple MapReduce and Pig jobs using Oozie.
- Installed and configured Hive and wrote HiveQL scripts.
- Loaded data into a relational database for reporting, dashboarding, and ad-hoc analyses, which revealed ways to lower operating costs and offset the rising cost of programming.
- Created ETL jobs to load JSON and server data into MongoDB and transformed MongoDB data into the data warehouse.
- Created reports and dashboards using structured and unstructured data.
- Performed analytics on time-series data using HBase.
- Hands-on experience installing and configuring the nodes of a CDH4 Hadoop cluster on CentOS.
- Implemented Hive generic UDFs to encapsulate business logic (see the UDF sketch after this list).
- Accessed Hive tables from Java applications using JDBC to perform analytics.
- Ran batch processes using Pig scripts and developed Pig UDFs for data manipulation according to business requirements.
- Experience with streaming workflow operations and Hadoop jobs using Oozie workflows, scheduled through AutoSys on a regular basis.
- Used the partitioning pattern in MapReduce to route records into different categories (a partitioner sketch also follows this list).
- Responsible for batch processing and real-time processing in HDFS and NoSQL databases.
- Responsible for retrieving data from Cassandra and ingesting it into Pig.
- Experience in customizing the MapReduce framework at various levels by writing custom InputFormats, RecordReaders, Partitioners, and data types.
- Worked with multiple file formats in Hive, including Avro and SequenceFile.
- Created and maintained technical documentation for launching Hadoop clusters and executing Pig scripts.
- Implemented business logic by writing Pig UDFs in Java and used various UDFs from Piggybank and other sources.
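As a concrete reference for the Hive generic UDF work above, here is a minimal GenericUDF sketch; the function name and the device-id normalization rule are hypothetical stand-ins, not the actual business logic.

```java
import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
import org.apache.hadoop.hive.ql.exec.UDFArgumentLengthException;
import org.apache.hadoop.hive.ql.metadata.HiveException;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDF;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
import org.apache.hadoop.io.Text;

// Hypothetical UDF that trims, upper-cases, and prefixes a device identifier.
public class NormalizeDeviceId extends GenericUDF {

    @Override
    public ObjectInspector initialize(ObjectInspector[] arguments) throws UDFArgumentException {
        if (arguments.length != 1) {
            throw new UDFArgumentLengthException("normalize_device_id takes exactly one argument");
        }
        return PrimitiveObjectInspectorFactory.writableStringObjectInspector;
    }

    @Override
    public Object evaluate(DeferredObject[] arguments) throws HiveException {
        Object value = arguments[0].get();
        if (value == null) {
            return null;
        }
        return new Text("STB-" + value.toString().trim().toUpperCase());
    }

    @Override
    public String getDisplayString(String[] children) {
        return "normalize_device_id(" + children[0] + ")";
    }
}
```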
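The MapReduce partitioning pattern mentioned above typically looks like the following custom Partitioner; the category-prefixed key format (for example "VIDEO|...") is an assumption used for illustration.

```java
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

// Routes records to reducers by the category prefix embedded in the key.
public class CategoryPartitioner extends Partitioner<Text, IntWritable> {

    @Override
    public int getPartition(Text key, IntWritable value, int numPartitions) {
        String category = key.toString().split("\\|", 2)[0];
        // Stable, non-negative bucket per category.
        return (category.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }
}
```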
Environment: Hadoop, ETL, Pig scripts, Flume, Hadoop BI, Pig UDFs, Oozie, Avro, Hive, MapReduce, Java, Eclipse, ZooKeeper.
Hadoop Developer
Confidential, Kansas City, MO
Responsibilities:
- Involved in the complete software development life cycle (SDLC) to develop the application.
- Worked on analyzing the Hadoop cluster and different big data analytics tools, including Pig, the HBase database, Sqoop, and ZooKeeper.
- Involved in loading data from the Linux file system into HDFS.
- Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
- Imported and exported data into HDFS and Hive using Sqoop.
- Implemented test scripts to support test-driven development and continuous integration.
- Created Pig Latin scripts to sort, group, join, and filter the enterprise-wide data.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Analyzed large data sets by running Hive queries and Pig scripts.
- Worked on tuning the performance of Pig queries.
- Mentored the analyst and test teams on writing Hive queries.
- Installed the Oozie workflow engine to run multiple MapReduce jobs.
- Involved in designing and developing the Cascading workflows.
- Involved in writing a new subassembly for a complex query used to look up data.
- Worked on custom functions and subassemblies that help with code reuse.
- Involved in running Cascading jobs in local and Hadoop modes.
- Worked with application teams to install VM, HBase, and Hadoop updates, patches, and version upgrades as required.
- Created and tested JUnit test cases for different Cascading flows (see the test sketch after this list).
- Familiar with continuous integration tools such as Jenkins to automate code builds and deployments.
- Worked on ZooKeeper for coordination between the master node and data nodes.
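A minimal JUnit 4 sketch in the spirit of the test-driven Cascading work above; RecordCleaner and its trim/upper-case rule are hypothetical stand-ins for the real custom functions under test, not part of the actual flows.

```java
import static org.junit.Assert.assertEquals;

import org.junit.Test;

public class RecordCleanerTest {

    // Hypothetical helper standing in for one of the custom functions exercised by the tests.
    static class RecordCleaner {
        String clean(String raw) {
            return raw == null ? null : raw.trim().toUpperCase();
        }
    }

    @Test
    public void normalizesWhitespaceAndCase() {
        assertEquals("KANSAS CITY", new RecordCleaner().clean("  kansas city "));
    }
}
```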
Environment: Hadoop, HDFS, Cascading, Lingual, HBase, Redis, ZooKeeper, JUnit, Jenkins, Gradle, Hive
Java /J2EE Developer
Confidential, Raleigh, NC
Responsibilities:
- Worked with business users to determine requirements and technical solutions.
- Followed Agile methodology (Scrum stand-ups and Sprint Planning, Sprint Review, Sprint Showcase, and Sprint Retrospective meetings).
- Developed business components using core Java concepts and classes such as inheritance, polymorphism, collections, serialization, and multithreading.
- Used the Spring framework to handle application logic and make calls to business components configured as Spring beans.
- Implemented and configured data sources and the session factory, and used HibernateTemplate to integrate Spring with Hibernate.
- Developed web services to allow communication between applications through SOAP over HTTP with JMS and Mule ESB.
- Actively involved in coding using core Java and collection APIs such as Lists, Sets, and Maps.
- Developed a web service (SOAP, WSDL) that is shared between the front end and the cable bill review system.
- Implemented REST-based web services using JAX-RS annotations and the Jersey implementation for data retrieval with JSON (see the resource sketch after this list).
- Developed Maven scripts to build and deploy the application onto WebLogic Application Server, ran UNIX shell scripts, and implemented an auto-deployment process.
- Used Maven as the build tool, with builds scheduled and triggered by Jenkins.
- Developed JUnit test cases for application unit testing.
- Implemented Hibernate for data persistence and management.
- Used the SoapUI tool for testing web service connectivity.
- Used SVN for version control to check in code, create branches, and tag the code.
- Used RESTful services to interact with the client by providing RESTful URL mappings.
- Used the Log4j framework for application logging, tracking, and debugging.
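The JAX-RS/Jersey style mentioned above generally reads like the sketch below; the /accounts path and the Account fields are hypothetical placeholders rather than the actual service contract, and JSON marshalling assumes a JSON provider such as Jackson is registered with Jersey.

```java
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;

@Path("/accounts")
public class AccountResource {

    @GET
    @Path("/{id}")
    @Produces(MediaType.APPLICATION_JSON)
    public Account getAccount(@PathParam("id") String id) {
        // In the real service this would delegate to the Spring/Hibernate layer.
        return new Account(id, "ACTIVE");
    }

    // Simple POJO serialized to JSON by the registered provider.
    public static class Account {
        public String id;
        public String status;

        public Account(String id, String status) {
            this.id = id;
            this.status = status;
        }
    }
}
```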
Environment: JDK 1.6, Eclipse IDE, Core Java, J2EE, Spring, Hibernate, UNIX, Web Services, SoapUI, Maven, WebLogic Application Server, SQL Developer, JUnit, SVN, Agile, REST, Log4j, JSON.
Java Developer
Confidential
Responsibilities:
- Involved in analysis, design, and development of the Expense Processing system.
- Created user interfaces using JSP.
- Developed the web interface using Servlets, JavaServer Pages, HTML, and CSS.
- Developed the DAO objects using JDBC (see the DAO sketch after this list).
- Developed business services using Servlets and Java.
- Designed and developed user interfaces and menus using HTML5, JSP, and JavaScript, with client-side and server-side validations.
- Developed the GUI using JSP and the Struts framework.
- Involved in developing the presentation layer using Spring MVC, AngularJS, and jQuery.
- Involved in designing the user interfaces using the Struts Tiles framework.
- Used the Spring 2.0 framework for dependency injection and integrated it with the Struts framework and Hibernate.
- Used Hibernate 3.0 in the data access layer to access and update information in the database.
- Experience in SOA (Service-Oriented Architecture), creating web services with SOAP and WSDL.
- Developed JUnit test cases for all the developed modules.
- Used Log4j to capture logs, including runtime exceptions; monitored error logs and fixed the problems.
- Used RESTful services to interact with the client by providing RESTful URL mappings.
- Used CVS for version control across common source code used by developers.
- Used Ant scripts to build the application and deployed it on WebLogic Application Server 10.0.
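A minimal JDBC DAO sketch in line with the DAO bullet above; the EXPENSE table, its columns, and the aggregate query are illustrative assumptions, not the actual schema.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import javax.sql.DataSource;

public class ExpenseDao {

    private final DataSource dataSource;

    public ExpenseDao(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    // Sums the expense amounts recorded for one employee (hypothetical query).
    public double findTotalByEmployee(String employeeId) throws SQLException {
        String sql = "SELECT SUM(AMOUNT) FROM EXPENSE WHERE EMPLOYEE_ID = ?";
        try (Connection con = dataSource.getConnection();
             PreparedStatement ps = con.prepareStatement(sql)) {
            ps.setString(1, employeeId);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getDouble(1) : 0.0;
            }
        }
    }
}
```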
Environment: Struts 1.2, Hibernate 3.0, Spring 2.5, JSP, Servlets, XML, SOAP, WSDL, JDBC, JavaScript, HTML, CVS, Log4j, JUnit, WebLogic App Server, Eclipse, Oracle, RESTful.
Java Developer
Confidential
Responsibilities:
- Wrote SQL queries, stored procedures, and triggers to perform back-end database operations.
- Developed nightly batch jobs that involved interfacing with external third-party state agencies.
- Implemented JMS producers and consumers using Mule ESB (see the producer sketch after this list).
- Gathered business requirements and wrote functional specifications and detailed design documents.
- Extensively used core Java, Servlets, JSP, and XML.
- Wrote AngularJS controllers, views, and services.
- Designed the logical and physical data model, generated DDL scripts, and wrote DML scripts for an Oracle 9i database.
- Implemented an enterprise logging service using JMS and Apache CXF.
- Developed unit test cases and used JUnit for unit testing of the application.
- Involved in designing user screens and validations using HTML, jQuery, Ext JS, and JSP as per user requirements.
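For the JMS producer work above, a producer-side sketch against the plain javax.jms API is shown below; the class name is hypothetical, and the ConnectionFactory and queue would normally come from JNDI or the Mule ESB configuration rather than being constructed here.

```java
import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.JMSException;
import javax.jms.MessageProducer;
import javax.jms.Queue;
import javax.jms.Session;
import javax.jms.TextMessage;

public class StateAgencyNotifier {

    private final ConnectionFactory connectionFactory;
    private final Queue queue;

    public StateAgencyNotifier(ConnectionFactory connectionFactory, Queue queue) {
        this.connectionFactory = connectionFactory;
        this.queue = queue;
    }

    // Sends one text message to the configured queue.
    public void send(String payload) throws JMSException {
        Connection connection = connectionFactory.createConnection();
        try {
            Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
            MessageProducer producer = session.createProducer(queue);
            TextMessage message = session.createTextMessage(payload);
            producer.send(message);
        } finally {
            connection.close(); // closes the session and producer as well
        }
    }
}
```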
Environment: Java, Spring Core, JMS, Web Services, JDK, SVN, Maven, Mule ESB, JUnit, WAS 7, jQuery, Ajax, SAX.