Senior Hadoop Developer Resume
Irving, TX
SUMMARY
- 8+ years of strong experience in software development using Big Data, Hadoop, Apache Spark, Java/J2EE, Scala, and Python technologies.
- Solid foundation in mathematics, probability, and statistics, with broad practical experience in statistical and data mining techniques cultivated through industry work and academic programs.
- Involved in all phases of the Software Development Life Cycle (SDLC), including analysis, design, implementation, testing, and maintenance.
- Strong technical, administration, and mentoring knowledge in Linux and Big Data/Hadoop technologies.
- Hands-on experience with major components of the Hadoop ecosystem, including Hadoop MapReduce, HDFS, Hive, Pig, Pentaho, HBase, ZooKeeper, Sqoop, Oozie, Cassandra, Flume, and Avro.
- Work experience with cloud infrastructure like Amazon Web Services (AWS).
- Experience in importing and exporting data with Sqoop between HDFS and relational database systems/mainframes.
- Experience installing, configuring, and managing Hadoop clusters and data science tools.
- Managed Hadoop distributions with Cloudera Manager, Cloudera Navigator, and Hue.
- Set up high availability for Hadoop cluster components and edge nodes.
- Experience in developing shell scripts and Python scripts for system management.
- Experience in profiling large data sets using Informatica BDM 10.
- Well versed in software development methodologies such as Rapid Application Development (RAD), Agile, and Scrum.
- Experience with Object-Oriented Analysis and Design (OOAD) methodologies.
- Experience in software installation, writing test cases, debugging, and testing of batch and online systems.
- Experience in production, quality assurance (QA), system integration (SIT), and user acceptance (UAT) testing.
- Expertise in J2EE technologies such as JSP, Servlets, EJB 2.0, JDBC, JNDI, and AJAX.
- Extensively worked on implementing SOA (Service-Oriented Architecture) using XML web services (SOAP, WSDL, UDDI, and XML parsers).
- Worked with XML parsers like JAXP (SAX and DOM) and JAXB.
- Expertise in applying Java Messaging Service (JMS) for reliable information exchange across Java applications.
- Proficient in Core Java and AWT, as well as web technologies including HTML5, XHTML, DHTML, CSS, XML 1.1, XSL, XSLT, XPath, XQuery, AngularJS, and Node.js.
- Worked with version control systems such as Subversion, Perforce, and Git to provide a common platform for all developers.
- Highly motivated team player with the ability to work independently and adapt quickly to new and emerging technologies.
- Creatively communicate and present models to business customers and executives, using a variety of formats and visualization techniques.
PROFESSIONAL EXPERIENCE
Confidential, Irving TX
Senior Hadoop Developer
Environment: Cloudera 5.x Hadoop, Linux, IBM DB2, HDFS, YARN, Impala, Pig, Hive, Sqoop, Spark, Scala, HBase, MapReduce, Hadoop data lake, Informatica BDM 10
Responsibilities:
- Installed and configured HDFS and Hadoop MapReduce; developed various MapReduce jobs in Java for data cleaning and preprocessing.
- Analyzed various RDDs in Spark using Scala and Python (a minimal PySpark sketch follows this list).
- Performed complex mathematical, statistical, and machine learning analysis using Spark MLlib, Spark Streaming, and GraphX.
- Performed data ingestion from various data sources.
- Worked with various types of databases, both relational and NoSQL, for transferring data to and from HDFS.
- Worked on Amazon Web Services EC2 console.
- Designed and developed workflows to manage Hadoop jobs.
- Used Impala for interactive data processing on tables defined in Hive.
- Understanding of data storage and retrieval techniques, ETL, and databases, including graph stores, relational databases, tuple stores, NoSQL, Hadoop, Pig, MySQL, and Oracle.
- Used Avro, Parquet, RCFile, and JSON file formats and developed UDFs in Hive and Pig.
- Optimized MapReduce jobs to use HDFS efficiently with Gzip, LZO, and Snappy compression.
- Imported and exported data into HDFS and Hive using Sqoop.
- Experience in loading and transforming large sets of structured, semi-structured, and unstructured data.
- Worked on different file formats such as XML files, Sequence files, JSON, CSV, and Map files using MapReduce programs.
- Continuously monitored and managed Hadoop cluster using Cloudera Manager.
- Performed POCs using emerging technologies such as Spark, Kafka, and Scala.
- Created Hive tables, loaded them with data, and wrote Hive queries.
- Involved in collecting, aggregating and moving data from servers to HDFS using Apache Flume.
- Managed and reviewed Hadoop log files.
- Executed test scripts to support test driven development and continuous integration.
- Created Pig Latin scripts to sort, group, join, and filter enterprise-wide data.
- Tuned Pig queries for performance.
- Installed Oozie and built workflows to run several MapReduce jobs.
- Extensive working knowledge of partitioned tables, UDFs, performance tuning, compression-related properties, and the Thrift server in Hive.
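A minimal PySpark sketch of the kind of Spark analysis and Snappy-compressed output described above; the application, table, column, and path names are hypothetical placeholders, not names from an actual engagement:

```python
# Illustrative only: "claims", "amount", "status", and the output path are
# hypothetical placeholders.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("claims-analysis")
         .enableHiveSupport()          # read tables registered in the Hive metastore
         .getOrCreate())

# Load a Hive table, drop obviously bad rows, and aggregate by status.
claims = spark.table("claims").filter("amount IS NOT NULL")
summary = claims.groupBy("status").count()

# Persist the result to HDFS as Snappy-compressed Parquet.
(summary.write
        .option("compression", "snappy")
        .mode("overwrite")
        .parquet("/data/output/claims_summary"))
```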
Confidential, Palm Beach Gardens FL
Senior Hadoop Developer
Environment: Hortonworks 2.2 Hadoop, Linux, Apache Cassandra, HDFS, YARN, Pig, Hive, Sqoop, Spark, Scala, Flume, MapReduce, Oracle DB, Java
Responsibilities:
- Responsible for cluster maintenance: adding and removing cluster nodes, monitoring and troubleshooting, and managing and reviewing data backups and log files.
- Handled importing data from various data sources, performed transformations using Hive and MapReduce, and loaded data into HDFS.
- Stored the data in an Apache Cassandra Cluster.
- Reviewed peers' Hive table creation, data loading, and queries.
- Involved in analyzing system failures, identifying root causes, and recommending courses of action.
- Worked on Hive to expose data for further analysis and to transform files from different analytical formats to text files.
- Worked with the Avro data serialization system to handle JSON data formats.
- Exported result sets from Hive to Oracle DB using Sqoop after processing the data.
- Assisted in designing, building, and maintaining a database to analyze the life cycle of claim processing and transactions.
- Supported in setting up QA environment and updating configurations for implementing scripts with Pig and Sqoop.
- Monitored System health and logs and responded accordingly to any warning or failure conditions through the Cloudera Manager.
- Scheduled jobs using Oozie and tracked their progress.
- Worked extensively on creating MapReduce jobs to power data for search and aggregation.
- Wrote MapReduce programs for different input formats such as JSON, XML, and CSV (a minimal Hadoop Streaming sketch in Python follows this list).
- Analyzed web logs using Hadoop tools for operational and security-related activities.
- Evaluated business requirements and prepared detailed specifications following project guidelines for program development.
- Wrote custom MapReduce code, generated JAR files for user-defined functions, and integrated them with Hive to support the analysis team's statistical analysis.
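The production jobs on this project were Java MapReduce; the following is a minimal Hadoop Streaming sketch in Python of the same map/reduce pattern for JSON-line input (the `event_type` field is a hypothetical placeholder):

```python
#!/usr/bin/env python
# mapper.py -- parse one JSON record per line and emit key<TAB>1.
import json
import sys

for line in sys.stdin:
    try:
        record = json.loads(line)
    except ValueError:
        continue                                  # skip malformed lines
    print("%s\t1" % record.get("event_type", "unknown"))
```

```python
#!/usr/bin/env python
# reducer.py -- sum the counts emitted by mapper.py; input arrives sorted by key.
import sys

current_key, count = None, 0
for line in sys.stdin:
    key, value = line.rstrip("\n").split("\t", 1)
    if key != current_key:
        if current_key is not None:
            print("%s\t%d" % (current_key, count))
        current_key, count = key, 0
    count += int(value)
if current_key is not None:
    print("%s\t%d" % (current_key, count))
```

A pair like this is submitted with the hadoop-streaming JAR, passing the two scripts through the -mapper and -reducer options.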
Confidential, Houston TX
Hadoop Developer
Environment: Cloudera 2.x Hadoop - Pig, Hive, Sqoop, MapReduce, Cloudera Manager, 30-node cluster with Linux (Ubuntu)
Responsibilities:
- Worked with the business users to gather, define business requirements and analyze the possible technical solutions.
- Installed and configured Hadoop on multiple nodes on the Cloudera platform.
- Set up and optimized standalone, pseudo-distributed, and fully distributed clusters.
- Developed simple to complex MapReduce streaming jobs.
- Analyzed data with Hive, Pig, and Hadoop Streaming.
- Built, tuned, and maintained HiveQL and Pig scripts for reporting purposes.
- Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from MySQL into HDFS using Sqoop.
- Analyzed the data by performing Hive queries (HiveQL) and running Pig scripts (Pig Latin) to study customer behavior.
- Used Impala to query the Hadoop data stored in HDFS.
- Managed and reviewed Hadoop log files.
- Supported and troubleshot MapReduce programs running on the cluster.
- Loaded data from the Linux file system into HDFS (see the automation sketch after this list).
- Installed and configured Hive and wrote Hive UDFs.
- Created tables, loaded data, and wrote queries in Hive.
- Developed scripts to automate routine DBA tasks using Linux shell scripts and Python.
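An illustrative sketch of the kind of routine automation mentioned above: pushing files from a Linux staging directory into HDFS with the standard `hdfs dfs` CLI (both paths are hypothetical placeholders):

```python
#!/usr/bin/env python
# Move newly staged files from the local file system into HDFS.
import os
import subprocess

STAGING_DIR = "/data/staging"          # local Linux directory (assumed)
HDFS_TARGET = "/user/etl/incoming"     # HDFS landing directory (assumed)

def hdfs(*args):
    """Run an 'hdfs dfs' subcommand and fail loudly if it errors."""
    subprocess.check_call(["hdfs", "dfs"] + list(args))

hdfs("-mkdir", "-p", HDFS_TARGET)      # ensure the landing directory exists
for name in os.listdir(STAGING_DIR):
    local_path = os.path.join(STAGING_DIR, name)
    if os.path.isfile(local_path):
        hdfs("-put", "-f", local_path, HDFS_TARGET)
        os.remove(local_path)          # clear the staging area after upload
```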
Confidential, Chicago IL
Hadoop Developer
Environment: Hortonworks Hadoop 2.0, EMP, cloud infrastructure (Amazon AWS), Java, Python, HBase, Hadoop ecosystem, Linux
Responsibilities:
- Used Sqoop to import data from relational databases into HDFS for processing and to export data back to the RDBMS.
- Wrote and reviewed technical design documents.
- Loaded and transformed large sets of structured, semi-structured, and unstructured data.
- Wrote MapReduce jobs and Pig scripts using various input and output formats, including custom formats where necessary.
- Used Pig and Hive in the analysis of data.
- Used Pig's complex data types (tuples, bags, and maps) for handling data.
- Developed Pig UDFs for preprocessing data for analysis (an illustrative Python UDF sketch follows this list).
- Created and modified UDFs and UDAFs for Hive as needed.
- Supported MapReduce programs running on the cluster.
- Managed and reviewed Hadoop log files to identify issues when jobs failed.
- Planned, designed, and implemented the processing of massive amounts of market information, including information enrichment.
- Assisted admin team in setting up additional nodes in the cluster.
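An illustrative sketch of a Pig UDF written in Python (executed by Pig via Jython); the function and field names are hypothetical:

```python
# udfs.py -- register and use in Pig with:
#   REGISTER 'udfs.py' USING jython AS myudfs;
#   cleaned = FOREACH raw GENERATE myudfs.normalize_symbol(symbol);
from pig_util import outputSchema

@outputSchema("symbol:chararray")
def normalize_symbol(value):
    """Trim and upper-case a market symbol before analysis."""
    if value is None:
        return None
    return value.strip().upper()
```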
Confidential, New York NY
Java Developer
Environment: Core Java, XML, EJB 3.0, DB2, JavaScript, AJAX, multithreading, Spring, Maven, JDBC, Struts, Hibernate, GUI, Servlets, JSP, RESTful, WebLogic App Server, Oracle
Responsibilities:
- Responsible for designing logical and physical data models in DB2.
- Designed and developed the Mortgage module using JSF, Servlets, EJBs, and JDBC.
- Implemented Tiles in the Struts framework to avoid code redundancy when developing user screens that share the same headers and footers.
- Worked on the integration of Spring and Hibernate.
- Designed and developed JSF pages along with AJAX (using Dojo) to provide a better user experience.
- Used JSF layouts for the View in MVC; JavaScript and HTML5 were also used for front-end interactivity.
- Deployed the entire application on the WebLogic 8.1 Application Server.
- Extensively created complex DB2 queries using CAST expressions, CASE statements, SQL predicates, etc.
- Prepared ANT scripts to handle code rollouts to various environments.
Confidential
Software Engineer
Environment: Core Java, XML, JavaScript, AJAX, Maven, JDBC, Struts, Spring, Hibernate, GUI, Servlets, JSP, RESTful, Oracle 10g, SQL, PL/SQL, DNS, UML, JBoss, Windows
Responsibilities:
- Responsible for immediate error resolution.
- Designed sample UI screens and obtained requirements sign-off from users.
- Created tables, views, and stored procedures for all modules of the project.
- Developed Crystal Reports.
- Created a reusable master search control.
- Implemented the Spring MVC framework, which included writing controller classes to handle requests and process form submissions, and performed validations using Commons Validator.
- Implemented workflow management.
- Handled complete development of my module from back end to front end.
- Designed test plans and test cases and verified validations.
- Tested the application.
- Implemented the system at the client location.
- Trained application users, interacted with the client, and handled change requests from the client.
Confidential
Java Developer
Environment: Core Java, JavaScript, J2EE, Servlets, JSP, Design Patterns, JDBC, HTML, CSS, AJAX, Hibernate, WebLogic, Oracle 8i, ANT, LINUX, SVN, Windows XP
Responsibilities:
- Communicated with clients for requirements gathering and explained the requirements to team members.
- Analyzed requirements and designed screen prototypes.
- Involved in project documentation.
- Involved in creating the basic database architecture for the application.
- Involved in adding the solution to VSS.
- Designed and developed screens.
- Coded JavaScript functions for client-side validations.
- Created user controls for reusability.
- Created tables, views, packages, sequences, and functions for all modules of the project.
- Developed Crystal Reports.
- Integrated the functionality of all modules.
- Involved in deploying the application.
- Performed unit and integration testing.
- Designed test plans and test cases and verified validations.
- Tested whether the application met business requirements.
- Implemented the system at the client location.
- Trained application users, interacted with the client, and handled change requests from the client.
- Responsible for immediate error resolution.