Hadoop Developer Resume
San Jose, CA
PROFESSIONAL SUMMARY:
- 7+ years of IT experience, including 3+ years of working experience with Big Data Hadoop technologies such as Spark, MapReduce, Hive, HBase, Pig, Sqoop, Oozie, ZooKeeper, and HDFS.
- Extensive experience in HDFS, MapReduce, Pig, Hive, Sqoop, Flume, Oozie, ZooKeeper, Maven, HBase, and Cassandra.
- Hands-on NoSQL database experience with HBase and Cassandra; good understanding of data lakes.
- Experience in data management and implementation of Big Data applications using Hadoop frameworks.
- Hands-on experience installing, configuring, and deploying Hadoop distributions in cloud environments (Amazon Web Services).
- Hands-on experience with NoSQL databases such as HBase, MongoDB, and Cassandra.
- Well versed in Talend Big Data, Hadoop, and Hive; used Talend Big Data components such as tHDFSOutput, tHDFSInput, and tHiveLoad.
- Good database experience with SQL Server, including stored procedures, cursors, constraints, and triggers.
- Good knowledge of databases such as SQL Server, Oracle, and MySQL.
- Hands-on experience working with Mesos clusters.
- Created various parser programs in Scala to extract data from Autosys, Tibco Business Objects, XML, Informatica, Java, and database views.
- Sound knowledge of the design, development, and implementation of J2EE frameworks such as Struts (Model View Controller), Spring, and Hibernate.
- Experienced in creating and analyzing Software Requirement Specifications (SRS) and Functional Specification Document (FSD).
- Strong knowledge of Software Development Life Cycle (SDLC).
- Developed a high-performance distributed queueing system using Scala, Redis, Akka, Clojure, MQ messaging, and JSON.
- Extensive experience in testing, debugging and deploying MapReduce Hadoop platforms.
- Expertise in installing, designing, sizing, configuring, provisioning and upgrading Hadoop environments.
- Used Flume to channel data from different sources to HDFS.
- Experience in fine-tuning MapReduce jobs for better scalability and performance.
- Worked extensively with dimensional modeling, data migration, data cleansing, data profiling, and ETL processes for data warehouses.
- Expertise in database tools such as SQL Workbench, SQL Developer, and TOAD for accessing database servers.
- Experience in developing and implementing web applications using Java, JSP, jQuery UI, ExtJS, CSS, HTML, HTML5, XHTML, JavaScript, and AJAX.
ENVIRONMENTS:
- Hadoop & Big Data: Hadoop 2 (YARN), Cloudera CDH 5.8, Hortonworks Apache distribution, Kerberos and Apache Sentry security, Hadoop components (HBase, ZooKeeper, Sqoop, and Hive).
- Languages: Java (J2SE, J2EE), SQL, PL/SQL, C, C++, Scala.
- Web Technologies: HTML5, CSS3, JavaScript, jQuery, AJAX, Servlets, JSP, JSON, XML, XHTML, REST web services
- Application Servers: JBoss, WebLogic, WebSphere
- Software Methodologies: SDLC, Agile, Waterfall
- Databases & Tools: Oracle, MySQL, DB2, Derby, MS SQL Server, TOAD
- Hadoop Clusters: Cloudera CDH 5, Hortonworks HDP 2.3/2.4, MapR.
WORK EXPERIENCE:
Hadoop Developer
Confidential, San Jose, CA
Responsibilities:
- Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and pre-processing (a minimal sketch follows this list).
- Performed both major and minor upgrades to the existing Cloudera Hadoop cluster.
- Loaded data into HDFS from the source platform using its Hadoop connectors.
- Loaded data into the cluster from dynamically generated files using Flume and from relational database management systems using Sqoop.
- Involved in all phases of the Software Development Life Cycle (SDLC) and worked on all activities related to the development, implementation, administration, and support of Hadoop.
- Worked with the systems engineering team to plan and deploy new Hadoop environments and expand existing Hadoop clusters; converted MapReduce applications to Spark.
- Implemented data warehouse solutions in AWS Redshift; worked on various projects to migrate data from on-premises databases to AWS Redshift, RDS, and S3.
- Debugged Pig scripts and wrote Hive queries to analyze data in the Hive warehouse using Hive Query Language (HQL).
- Developed applications with Hadoop Big Data technologies: Pig, Hive, MapReduce, Oozie, Flume, and Kafka.
- Exported the analyzed data into relational databases using Sqoop for visualization and to generate reports for the BI team.
- Performed advanced procedures such as text analytics and processing using the in-memory computing capabilities of Spark with Scala.
- Created data models in Cassandra and implemented distributed parallel processing and data migration between Cassandra, HAWQ, and Greenplum instances.
- Developed ETL Scripts for Data acquisition and Transformation using Talend.
- Automated all jobs for pulling data from the FTP server and loading it into Hive tables using Oozie workflows.
- Involved in creating Hive tables, working with them using HiveQL, and performing data analysis using Hive and Pig.
- Exported data from Hive/HDFS to Teradata using Sqoop and exported the aggregated data in Hadoop to test RDBMS databases.
- Developed solutions to pre-process large sets of structured data in different file formats; implemented Hadoop-based data warehouses and integrated Hadoop with enterprise data warehouse systems.
- Used Java UDFs to implement business logic in Pig scripts; loaded data into HDFS using Flume for proof-of-concept purposes.
- Analyzed existing BTEQ scripts and created HQL queries to generate reports on member information.
- Designed and developed Pig scripts with Java UDFs to implement business logic and transform the ingested data.
- Designed and developed Java MapReduce jobs for loading data; merged multiple ingested files in Hadoop using Pig and debugged Pig scripts.
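The data-cleaning MapReduce work mentioned above follows a common map-only pattern; a minimal sketch is shown below. The pipe-delimited layout, expected field count, and class names are illustrative assumptions, not the actual production code.

```java
// Illustrative data-cleaning job: drops malformed pipe-delimited records and
// trims whitespace. Field layout and names are assumptions for this sketch.
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class RecordCleanerJob {

    public static class CleanMapper extends Mapper<Object, Text, NullWritable, Text> {

        private static final int EXPECTED_FIELDS = 5; // assumed record width

        @Override
        protected void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split("\\|", -1);
            if (fields.length != EXPECTED_FIELDS) {
                // Count and skip malformed records instead of failing the job.
                context.getCounter("cleaning", "malformed").increment(1);
                return;
            }
            StringBuilder out = new StringBuilder();
            for (int i = 0; i < fields.length; i++) {
                if (i > 0) out.append('|');
                out.append(fields[i].trim());
            }
            context.write(NullWritable.get(), new Text(out.toString()));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "record-cleaner");
        job.setJarByClass(RecordCleanerJob.class);
        job.setMapperClass(CleanMapper.class);
        job.setNumReduceTasks(0); // map-only cleaning pass, no aggregation needed
        job.setOutputKeyClass(NullWritable.class);
        job.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

A map-only job (zero reducers) is typical for cleaning passes, and the counter gives a quick view of how many malformed records were dropped per run.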
Technical Environment: Hadoop, MapReduce, HDFS, Pig, Hive, HBase, Flume, ZooKeeper, Ruby, Spark, Cloudera Manager, Hortonworks, MongoDB, Oozie, Java (JDK 1.6), Python, MySQL, SQL, Windows NT, GitHub, Linux, Spark SQL, ETL, Kafka, Storm.
Hadoop Developer
Confidential
Responsibilities:
- Designed and developed Pig scripts with Java UDFs to implement business logic and transform the ingested data; designed and developed Java MapReduce jobs for loading data.
- Designed and developed Sqoop jobs for data ingestion from a Teradata database into HDFS; merged multiple ingested files in Hadoop using Pig and debugged Pig scripts.
- Used HBase APIs to get and scan event data stored in NoSQL (HBase); developed simple to complex MapReduce jobs using Hive and Pig; involved in creating Hive tables and HQL queries (see the HBase sketch after this list).
- Continuous monitoring and managing the Hadoop cluster through Cloudera Manager.
- Developed Hive queries to process the data and generate the data cubes for visualizing.
- Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts; managed and reviewed Hadoop log files.
- Worked with cloud services like Amazon Web Services (AWS) and involved in ETL, Data Integration and Migration.
- Created a customized BI tool for the management team to perform query analytics using HiveQL.
- Developed the technical strategy of using Apache Spark on Apache Mesos as a next-generation Big Data and "fast data" (streaming) platform.
- Planned and built a solution for real-time data ingestion using Kafka, Storm, Spark Streaming, and various NoSQL databases.
- Used Apache Spark to implement advanced procedures such as text analytics and processing with its in-memory computing capabilities, writing the code in Scala.
- Installed the Oozie workflow engine to run multiple Hive and Pig jobs.
- Performed Cloudera Hadoop upgrades and patches, installed ecosystem products through Cloudera Manager, and upgraded Cloudera Manager itself.
- Involved in loading data from UNIX file system to HDFS.
- Imported and exported data into HDFS and Hive using Sqoop.
- Evaluated business requirements and prepared detailed specifications that follow project guidelines required to develop written programs.
- Designed HiveQL queries to compare source data against target data in Hive and other databases such as Oracle, Teradata, DB2, and Greenplum.
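The HBase get/scan work mentioned above uses the standard client read path; a minimal sketch is shown below. The table name, column family, qualifier, and row-key scheme are illustrative assumptions.

```java
// Illustrative HBase read path: a point Get and a bounded Scan over an
// "events" table. Table, column family, and qualifier names are assumptions.
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class EventReader {

    private static final byte[] CF = Bytes.toBytes("d");         // column family
    private static final byte[] QUAL = Bytes.toBytes("payload");  // qualifier

    public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table events = conn.getTable(TableName.valueOf("events"))) {

            // Point lookup of a single event by row key.
            Result one = events.get(new Get(Bytes.toBytes("event#0001")));
            if (!one.isEmpty()) {
                System.out.println(Bytes.toString(one.getValue(CF, QUAL)));
            }

            // Range scan over a row-key prefix.
            Scan scan = new Scan();
            scan.setStartRow(Bytes.toBytes("event#"));
            scan.setStopRow(Bytes.toBytes("event$"));
            try (ResultScanner scanner = events.getScanner(scan)) {
                for (Result row : scanner) {
                    System.out.println(Bytes.toString(row.getRow()));
                }
            }
        }
    }
}
```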
Technical Environment: Hadoop, MapReduce, HDFS, Hive, Oozie, Java (JDK 1.6), Cloudera, NoSQL, Oracle 11g/10g, PL/SQL, SQL*Plus, TOAD 9.6, Windows NT, UNIX shell scripting.
Hadoop Developer
Confidential
Responsibilities:
- Worked on analyzing Hadoop cluster and different Big Data analytic tools including Pig, Hive, HBase database and Sqoop.
- Performed Data Ingestion from multiple internal clients using Apache Kafka.
- Performed Real time event processing of data from multiple servers in the organization.
- Developed Spark applications on Hadoop using the Spark context, Spark SQL, DataFrames, and Spark on YARN.
- Imported and exported data into HDFS and Hive using Sqoop.
- Developed Spark scripts for data analysis in both Python and Scala.
- Involved in loading data from the UNIX file system to HDFS; created Hive tables to store data in HDFS, loaded data, and wrote Hive queries that run internally as MapReduce jobs.
- Used Flume to channel data from different sources into HDFS; created HBase tables to store variable data formats of PII data coming from different portfolios.
- Worked with Spark Streaming to consume ongoing data from Kafka and store the stream data in HDFS (a minimal sketch follows this list).
- Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
- Worked with BI teams in generating the reports and designing ETL workflows on Tableau.
- Implemented various machine learning techniques in Scala using Spark's machine learning library (MLlib); experienced in design and code reviews of C and C++ code.
- Responsible for building scalable distributed data solutions using Hadoop. Worked hands on with ETL process.
- Responsible for cluster maintenance, adding and removing cluster nodes, cluster monitoring and troubleshooting, manage and review data backups, manage and review Hadoop log files.
- Developed workflows using the Task Developer, Worklet Designer, and Workflow Designer in Workflow Manager, and monitored the results using Workflow Monitor.
- Exported the patterns analyzed back into Teradata using Sqoop.
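The Kafka-to-HDFS streaming work mentioned above is sketched below using the spark-streaming-kafka-0-10 integration. The broker address, topic, group id, output path, and 30-second batch interval are illustrative assumptions.

```java
// Illustrative Spark Streaming pipeline: consume a Kafka topic and persist each
// non-empty micro-batch to HDFS. All endpoints and paths are assumptions.
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka010.ConsumerStrategies;
import org.apache.spark.streaming.kafka010.KafkaUtils;
import org.apache.spark.streaming.kafka010.LocationStrategies;

public class KafkaToHdfsStream {

    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf().setAppName("kafka-to-hdfs");
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(30));

        Map<String, Object> kafkaParams = new HashMap<>();
        kafkaParams.put("bootstrap.servers", "broker1:9092");
        kafkaParams.put("key.deserializer", StringDeserializer.class);
        kafkaParams.put("value.deserializer", StringDeserializer.class);
        kafkaParams.put("group.id", "events-loader");
        kafkaParams.put("auto.offset.reset", "latest");

        JavaInputDStream<ConsumerRecord<String, String>> stream =
                KafkaUtils.createDirectStream(
                        jssc,
                        LocationStrategies.PreferConsistent(),
                        ConsumerStrategies.<String, String>Subscribe(
                                Arrays.asList("events"), kafkaParams));

        // Write each non-empty micro-batch to a time-stamped HDFS directory.
        stream.map(record -> record.value())
              .foreachRDD((rdd, time) -> {
                  if (!rdd.isEmpty()) {
                      rdd.saveAsTextFile("/data/raw/events/" + time.milliseconds());
                  }
              });

        jssc.start();
        jssc.awaitTermination();
    }
}
```

Writing each batch to a time-stamped directory keeps batches easy to reprocess; a production pipeline would also handle offset management and small-file compaction.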
Technical Environment: Hadoop, MapReduce, Hive, HBase, Flume, Pig, Zookeeper, Java, ETL, SQL, CentOS, Eclipse.
Java Developer
Confidential
Responsibilities:
- Implemented object-oriented programming concepts for validating the columns of the import file.
- Developed JSP pages and client-side validation using JavaScript.
- Involved in developing a simulator, used by controllers to simulate real-time scenarios, with C/C++ programming.
- Designed, developed, and validated the user interface using HTML, JavaScript, XML, and CSS.
- Actively participated in requirements gathering, analysis, design, and testing phases.
- Designed use case diagrams, class diagrams, and sequence diagrams as a part of Design Phase.
- Developed the entire application implementing MVC architecture, integrating JSF with the Hibernate and Spring frameworks.
- Developed the Enterprise Java Beans (Stateless Session beans) to handle different transactions such as online funds transfer, bill payments to the service providers.
- Implemented Service Oriented Architecture (SOA) using JMS for sending and receiving messages while creating web services.
- Developed XML documents and generated XSL files for Payment Transaction and Reserve Transaction systems.
- Developed Web Services for data transfer from client to server and vice versa using Apache Axis, SOAP and WSDL.
- Used HTML, CSS, JSP, and JSTL to develop the user interface; created use case diagrams, sequence diagrams, activity diagrams, and class diagrams using Rational Rose.
- Stored persistent JMS messages and temporarily stored messages sent using the store-and-forward feature.
- Worked on the database interaction layer for insert, update, and retrieval operations against the Oracle database by writing stored procedures (a minimal sketch follows this list).
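The stored-procedure access described above is typically wrapped in a JDBC CallableStatement; a minimal sketch is shown below. The connection URL, credentials, procedure name, and parameters are illustrative assumptions.

```java
// Illustrative JDBC call to an Oracle stored procedure from a data-access
// layer. URL, credentials, procedure name, and parameters are assumptions.
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Types;

public class PaymentDao {

    private static final String URL =
            "jdbc:oracle:thin:@//dbhost:1521/ORCL"; // assumed connection string

    public String recordPayment(String accountId, double amount) throws SQLException {
        try (Connection conn = DriverManager.getConnection(URL, "app_user", "secret");
             CallableStatement call = conn.prepareCall("{ call record_payment(?, ?, ?) }")) {
            call.setString(1, accountId);                  // IN: account identifier
            call.setDouble(2, amount);                     // IN: payment amount
            call.registerOutParameter(3, Types.VARCHAR);   // OUT: confirmation id
            call.execute();
            return call.getString(3);
        }
    }
}
```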
Technical Environment: J2EE, JDBC, Java 1.4, Servlets, JSP, Struts, Hibernate, web services, SOAP, WSDL, design patterns, MVC, HTML, JavaScript 1.2, WebLogic 8.0, XML, JUnit, Oracle 10g, WebSphere, MyEclipse.
Java Developer
Confidential
Responsibilities:
- Database design using data modeling techniques and server-side coding using Java.
- Developed JSPs for displaying shopping cart contents and for adding, modifying, saving, and deleting cart items.
- Developed the UI using HTML, JavaScript, and JSP, and developed business logic and interfacing components using business objects, XML, and JDBC.
- Designed the user interface and implemented validations using JavaScript.
- Managed connectivity using JDBC for querying, inserting, and data management, including triggers and stored procedures.
- Developed a multi-threaded Java batch tool for offline bulk uploads and web applications using Spring and Servlets, with the UI layer built in JSPs, JavaScript, HTML, CSS, and AngularJS.
- Worked on new and complex implementations with critical, quick deliveries; developed Ant and Maven build scripts for packaging the application code.
- Developed database scripts and procedures using PL/SQL. Deployed code on Tomcat web application server.
- Implemented client side data validations using JavaScript.
- Implemented server side data validations using Java Beans.
- Implemented views using JSP and JSTL 1.0.
- Validated requirement deliverables, performed unit testing using SoapUI, and set up and executed system endurance and performance tests using JMeter.
- Deployed and maintained JSP and Servlet components on WebLogic 8.0; developed the application server persistence layer using JDBC, SQL, and Hibernate.
- Used JDBC to connect the web applications to databases; implemented test-first unit testing with JUnit (a minimal sketch follows this list).
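The JDBC and JUnit test-first work mentioned above is sketched below using an in-memory Derby database (Derby appears in the skills list); the table name and schema are illustrative assumptions.

```java
// Illustrative test-first JDBC check against an in-memory Derby database.
// Table name, columns, and sample data are assumptions for this sketch.
import static org.junit.Assert.assertEquals;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;

import org.junit.Test;

public class CartItemDaoTest {

    @Test
    public void countsItemsInCart() throws Exception {
        try (Connection conn =
                     DriverManager.getConnection("jdbc:derby:memory:cartdb;create=true")) {
            // Arrange: create and populate a throwaway table.
            try (Statement ddl = conn.createStatement()) {
                ddl.execute("CREATE TABLE cart_items (cart_id INT, sku VARCHAR(32))");
                ddl.execute("INSERT INTO cart_items VALUES (1, 'SKU-1')");
                ddl.execute("INSERT INTO cart_items VALUES (1, 'SKU-2')");
            }
            // Act and assert: the query under test returns the expected count.
            try (PreparedStatement ps = conn.prepareStatement(
                    "SELECT COUNT(*) FROM cart_items WHERE cart_id = ?")) {
                ps.setInt(1, 1);
                try (ResultSet rs = ps.executeQuery()) {
                    rs.next();
                    assertEquals(2, rs.getInt(1));
                }
            }
        }
    }
}
```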
Technical Environment: Java/J2EE, SQL, Oracle 10g, JSP 2.0, EJB, AJAX, JavaScript, WebLogic 8.0, HTML, JDBC 3.0, XML, JMS, JUnit, Servlets, MVC, MyEclipse.