Hadoop Developer Resume
Rosemead, CA
SUMMARY:
- Over 9 years of professional experience as a Big Data/Hadoop Developer and Java Developer on software application development projects in the Banking, Insurance, and Healthcare domains.
- Experience with software development process models such as Agile (Scrum) and Waterfall, and strong experience in setting up and handling the complete Software Development Life Cycle (SDLC).
- Expertise in working with AWS and Microsoft Azure cloud environments.
- Hands-on experience with Cloudera and Hortonworks distributed environments.
- Expertise in developing jobs with Spark framework modules such as Spark Core, Spark SQL, and Spark Streaming using Scala and Python.
- Experienced in migrating MapReduce programs to Spark RDD transformations to improve performance.
- Extensive and progressive experience in data ingestion, transformation, and analytics using the Spark framework, Hadoop ecosystem components, and third-party tools like Trifacta.
- Hands-on experience with Hadoop components such as HDFS, MapReduce, YARN, Hive, HQL, HBase, Pig, Sqoop, Flume, NiFi, Kafka, Spark, Oozie, ZooKeeper, Elasticsearch, and Kibana, and in-depth knowledge of Hadoop architecture.
- Expert in working with the Hive data warehouse: creating tables, distributing data through partitioning and bucketing, and writing and optimizing HiveQL queries.
- Analyzed data through HiveQL, Pig Latin, and MapReduce programs in Java, and extended Hive and Pig core functionality by implementing custom UDFs.
- Extensively involved in design, development, tuning, and maintenance of HBase and Cassandra NoSQL databases.
- Hands-on experience in cluster setup.
- Good knowledge of integrating BI tools such as Tableau and Kibana with the Hadoop stack and extracting the required data.
- Experienced in event, incident, and problem management using Ganglia, Nagios, Log4j, and the JobHistoryServer, and in implementing Hadoop security using Kerberos.
- Extensive expertise in creating and automating workflows using the Oozie workflow engine and Maven.
- Developed various cross-platform products while working with Hadoop file formats such as SequenceFile, RCFile, ORC, Avro, and Parquet.
- Experience in using Maven to compile, package and deploy to the application servers.
- Experience with IDEs such as Eclipse and IntelliJ and build tools such as Maven and SBT.
- Hands-on experience writing queries, stored procedures, functions, and triggers in SQL.
- Good knowledge of DBMSs such as Oracle, MS SQL Server, Teradata, and MySQL.
- Worked on setting up SSH, SCP, and SFTP connectivity between UNIX hosts.
- Expertise in data modelling and deployment strategies in production environments meeting Agile requirements.
TECHNICAL SKILLS:
Hadoop Technologies and Distributions: Apache Hadoop, Cloudera Hadoop Distribution and Hortonworks Data Platform (HDP)
Hadoop Ecosystem: HDFS, MapReduce, YARN, Hive, Pig, Sqoop, Oozie, Flume, ZooKeeper, NiFi, Kafka
Spark Ecosystem: Spark Core, Spark SQL, Spark Streaming
NoSQL Databases: HBase, MongoDB
Programming: C, C++, Java, PL/SQL, Scala, Python
RDBMS: ORACLE, MySQL, SQL Server
Web Development: HTML, JSP, Servlets, JavaScript, CSS, XML
IDE: Eclipse, NetBeans
Operating Systems: Linux (RedHat, CentOS), Windows XP/7/8/10
Web Servers: Apache Tomcat
Build Tools: Apache Maven, Apache Ant
Scripting: Linux/Unix Shell Scripting
Methodologies Used: Agile (Scrum), Waterfall, SDLC
PROFESSIONAL EXPERIENCE:
Confidential, Rosemead, CA
Hadoop Developer
Responsibilities:
- Developed distributed Big Data applications using open-source frameworks such as Apache Spark, NiFi, and Kafka.
- Worked on Spark SQL and Spark Streaming.
- Implemented Spark SQL to read Hive tables into Spark for faster processing, and used the Spark API over Cloudera Hadoop YARN to perform analytics on data in Hive (see the sketch after this list).
- Used Spark SQL with Scala to create DataFrames and performed transformations on them.
- Worked on Spark Streaming with Apache Kafka for real-time data processing.
- Created Kafka producers and consumers for Spark Streaming.
- Used Hive for transformations, joins, filtering, and pre-aggregations after storing the data in HDFS.
- Used the Azure cloud to store data in Azure Data Lake Store.
- Analyzed log data, filtered the required columns through Logstash configuration, and shipped them to Elasticsearch.
- Designed the Elasticsearch configuration files based on the number of available hosts, naming the cluster and nodes accordingly.
- Used Kibana to visualize the data with dashboards such as metrics, graphs, pie charts, and aggregation tables.
- Defined job flows and managed and reviewed Hadoop log files.
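A minimal sketch of the Spark SQL over Hive pattern described above, written against Spark's Java API (the project itself used Scala); the database, table, and column names are hypothetical.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class HiveOverSparkSketch {
    public static void main(String[] args) {
        // Hive-enabled SparkSession; on Cloudera/YARN the master is supplied by spark-submit
        SparkSession spark = SparkSession.builder()
                .appName("hive-over-spark-sketch")
                .enableHiveSupport()
                .getOrCreate();

        // Read an existing Hive table into a DataFrame (names are hypothetical)
        Dataset<Row> claims = spark.sql(
                "SELECT member_id, claim_amount FROM claims_db.claims");

        // Typical aggregation before writing the result back to Hive
        Dataset<Row> totals = claims.groupBy("member_id")
                .sum("claim_amount")
                .withColumnRenamed("sum(claim_amount)", "total_claim_amount");

        totals.write().mode("overwrite").saveAsTable("claims_db.claim_totals");
        spark.stop();
    }
}
```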
Environment: Hadoop, Cloudera, HDFS, Hive, Impala, Spark, Kafka, Sqoop, Elasticsearch, Kibana, Java, Scala, Eclipse, UNIX, and Maven.
Confidential, Lafayette, LA
Hadoop Developer
Responsibilities:
- Implemented custom Kafka encoders (serializers) for custom input formats to load data into Kafka partitions (see the sketch after this list).
- Implemented Spark programs using the Scala API for faster testing and processing of data.
- Worked with the diligence team to explore whether NiFi was a feasible option for our solution.
- Parsed JSON data through Spark Core and extracted the schema for production data using Spark SQL and Scala.
- Used Spark API over Hadoop YARN to perform analytics on data in Hive.
- Experienced in managing and reviewing the Hadoop log files.
- Used Pig as an ETL tool for transformations, joins, and some pre-aggregations before storing the data in HDFS.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
- Imported data from a relational database management system (SQL Server) into HDFS using Sqoop.
- Created scalable, high-performance Azure web services for data tracking.
- Played a key role in setting up a 15-node Hadoop cluster on AWS.
- Used Oozie as the workflow engine and Falcon for job scheduling.
- Implemented authentication using Kerberos and authorization using Apache Sentry.
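A minimal sketch of a custom Kafka encoder of the kind mentioned above, written against the modern org.apache.kafka.common.serialization.Serializer interface; the Event payload type and its wire format are hypothetical. A producer would reference the class through its value.serializer configuration property.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import org.apache.kafka.common.serialization.Serializer;

// Hypothetical payload type carrying the fields to be written to the topic
class Event {
    final String id;
    final long timestamp;
    Event(String id, long timestamp) { this.id = id; this.timestamp = timestamp; }
}

// Custom serializer: turns an Event into a delimited byte array for the Kafka partition
public class EventSerializer implements Serializer<Event> {
    @Override
    public void configure(Map<String, ?> configs, boolean isKey) { /* no configuration needed */ }

    @Override
    public byte[] serialize(String topic, Event event) {
        if (event == null) {
            return null;
        }
        String wireFormat = event.id + "|" + event.timestamp;   // hypothetical wire format
        return wireFormat.getBytes(StandardCharsets.UTF_8);
    }

    @Override
    public void close() { /* nothing to release */ }
}
```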
Environment: Hadoop, Cloudera, Sqoop, Hive, Hue, Pig, NiFi, Kafka, Spark, Scala, Talend, Oozie, ZooKeeper, Tableau, AWS cloud, Cloudera Manager, and Kerberos.
Confidential, Pittsburg, PA
Hadoop Developer
Responsibilities:
- Worked with the advanced analytics team to design fraud detection algorithms and then developed MapReduce programs to efficiently run the algorithm on the huge datasets.
- Ran data-formatting scripts in Python and created terabyte-scale CSV files to be consumed by Hadoop MapReduce jobs.
- Implemented various Hive optimization techniques such as dynamic partitions, buckets, map joins, and parallel execution.
- Implemented custom Hive UDFs for comprehensive data analysis (see the sketch after this list).
- Imported and processed terabytes of data from various structured and unstructured sources into HDFS (AWS cloud) using Sqoop and Flume.
- Involved in cluster coordination services through ZooKeeper.
- Designed and Maintained Oozie workflows to manage the flow of jobs in the cluster.
- Provided upper management with daily updates on project progress, including the classification levels achieved on the data.
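A minimal sketch of a custom Hive UDF of the kind mentioned above, extending the classic org.apache.hadoop.hive.ql.exec.UDF base class; the masking behavior shown is hypothetical. After packaging into a JAR, it would be registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION.

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hypothetical UDF: masks all but the last four characters of an identifier column
public class MaskIdUdf extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null;
        }
        String value = input.toString();
        if (value.length() <= 4) {
            return new Text(value);
        }
        return new Text("****" + value.substring(value.length() - 4));
    }
}
```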
Environment: Hadoop, HDFS, MapReduce, Hive, Pig, Oozie, HBase, Flume, ZooKeeper, AWS cloud, Cloudera, SQL, Eclipse.
Confidential, Schaumburg, IL
Hadoop Developer
Responsibilities:
- Migrated the existing data to Hadoop from RDBMS (SQL Server and Oracle) using Sqoop for processing the data.
- Developed MapReduce programs to cleanse and parse data in HDFS obtained from various data sources, and to perform joins on the map side using the distributed cache (see the sketch after this list).
- Used Hive data warehouse tool to analyze the data in HDFS and developed Hive queries.
- Created internal and external tables with properly defined static and dynamic partitions for efficiency.
- Used Pig to develop ad-hoc queries.
- Implemented daily workflow for extraction, processing and analysis of data with Oozie.
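A minimal sketch of the map-side join mentioned above: a small lookup file placed in the distributed cache is loaded once in setup() and joined against each record on the map side. The file layout and field positions are hypothetical.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.net.URI;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Joins large transaction records against a small customer lookup on the map side
public class MapSideJoinMapper extends Mapper<LongWritable, Text, Text, Text> {

    private final Map<String, String> customerById = new HashMap<>();

    @Override
    protected void setup(Context context) throws IOException, InterruptedException {
        URI[] cacheFiles = context.getCacheFiles();   // added via job.addCacheFile(...)
        if (cacheFiles == null) {
            return;
        }
        for (URI cacheFile : cacheFiles) {
            // Cached files are symlinked into the task's working directory by name
            String localName = Paths.get(cacheFile.getPath()).getFileName().toString();
            try (BufferedReader reader = new BufferedReader(new FileReader(localName))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    String[] parts = line.split(",");   // customer_id,customer_name (hypothetical layout)
                    customerById.put(parts[0], parts[1]);
                }
            }
        }
    }

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String[] fields = value.toString().split(",");  // customer_id,amount (hypothetical layout)
        String customerName = customerById.getOrDefault(fields[0], "UNKNOWN");
        context.write(new Text(fields[0]), new Text(customerName + "," + fields[1]));
    }
}
```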
Environment: Hadoop, MapReduce, Hive, Pig, Oozie, Sqoop.
Confidential
JAVA Developer
Responsibilities:
- Involved in various stages of the project life cycle, mainly design, implementation, testing, deployment, and enhancement of the application.
- Involved in designing the system based on UML concepts, including data flow diagrams, class diagrams, sequence diagrams, and state diagrams.
- Involved in designing and developing web pages using JSP, HTML, CSS, and JavaScript, and in implementing an MVC architecture using the Spring framework.
- Worked with custom tag libraries - Logic tags, Bean tags, HTML tags.
- Used Spring IoC (dependency injection) to build the service and data access layers (see the sketch after this list).
- Developed J2EE Web Service and SOAP Message components using Sun JAX-RPC and JAXB.
- Developed session beans to implement business logic and message-driven beans for processing messages from JMS.
- Used JDBC for database connectivity with Oracle.
- Devised logging mechanism using Log4J.
- Built front-end features with the AngularJS framework, including MVC architecture, modules, controllers, templates, custom directives, and custom filters.
- Participated in the development of a responsive single-page application using the AngularJS framework, JavaScript, jQuery, and Java in conjunction with HTML5 and CSS3.
- Extensively used WebSphere Studio Application Developer for building, testing and deploying applications.
- Gained solid experience with web services, using AJAX calls to exchange JSON/XML files between the front end and back end.
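A minimal sketch of the Spring IoC wiring described above, using annotation-based configuration for brevity (the original work likely relied on XML bean definitions); the class, table, and query names are hypothetical.

```java
import java.util.List;
import javax.sql.DataSource;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.stereotype.Repository;
import org.springframework.stereotype.Service;

// Data access layer: the DataSource is injected by the Spring container
@Repository
class AccountDao {
    private final JdbcTemplate jdbcTemplate;

    @Autowired
    AccountDao(DataSource dataSource) {
        this.jdbcTemplate = new JdbcTemplate(dataSource);
    }

    List<String> findAccountNumbers(String customerId) {
        return jdbcTemplate.queryForList(
                "SELECT account_number FROM accounts WHERE customer_id = ?",
                String.class, customerId);
    }
}

// Service layer: receives the DAO via constructor injection rather than constructing it
@Service
class AccountService {
    private final AccountDao accountDao;

    @Autowired
    AccountService(AccountDao accountDao) {
        this.accountDao = accountDao;
    }

    List<String> accountsFor(String customerId) {
        return accountDao.findAccountNumbers(customerId);
    }
}
```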
Environment: JDK, Servlets, JSP, JDBC, Spring MVC, Spring Core, Oracle, SQL, Eclipse, HTML, DHTML, XML, UML, IBM WebSphere Application Server, Spring JMS, Log4J, JavaScript, JSON, JQuery, CSS, SOAP.
Confidential
JAVA Developer
Responsibilities:
- Involved in design, development, and analysis documents shared with clients.
- Analyzed and designed object models using Java/J2EE design patterns across various tiers of the application.
- Coded JavaScript for UI validation and worked with the Struts validation framework.
- Analyzed client requirements and designed the specification document based on them.
- Used the WebLogic application server based on client requirements and project specifications.
- Involved in mapping all configuration files according to the JSF framework.
- Tested and provided production support for a core Java, multithreaded ETL tool that loaded XML data into an Oracle 11g database in a distributed fashion using JPA/Hibernate.
- Developed Presentation Layer using HTML, CSS, and JSP and validated the data using AJAX and Ext JS and JavaScript.
- Involved in the development of Database Connections and Database Operations using JDBC.
- Wrote ActionForm and Action classes, used the Struts HTML, Bean, and Logic tag libraries, and configured struts-config.xml for global forwards, error forwards, and action forwards (see the sketch after this list).
- Developed UI using JSP, JSON and Servlet and server-side code with Java.
- Used the JavaMail API to send email notifications to users.
- Developed Maven build scripts and involved in deploying the application on WebSphere.
- Designed and developed various stored procedures, functions and triggers in PL/SQL to implement business rules.
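A minimal sketch of the Struts ActionForm/Action pattern referenced above; the form fields and forward names are hypothetical, and the action itself would be mapped in struts-config.xml.

```java
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import org.apache.struts.action.Action;
import org.apache.struts.action.ActionForm;
import org.apache.struts.action.ActionForward;
import org.apache.struts.action.ActionMapping;

// Form bean backing the login JSP (field names are hypothetical)
public class LoginForm extends ActionForm {
    private String userId;

    public String getUserId() { return userId; }
    public void setUserId(String userId) { this.userId = userId; }
}

// Action class: reads the form and chooses a forward declared in struts-config.xml
class LoginAction extends Action {
    @Override
    public ActionForward execute(ActionMapping mapping, ActionForm form,
                                 HttpServletRequest request, HttpServletResponse response) {
        LoginForm loginForm = (LoginForm) form;
        if (loginForm.getUserId() == null || loginForm.getUserId().isEmpty()) {
            return mapping.findForward("failure");   // error forward
        }
        request.setAttribute("userId", loginForm.getUserId());
        return mapping.findForward("success");       // action forward
    }
}
```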
Environment: Java/J2EE, JSP, XML, XSLT, Struts, Web Services, SQL, WebSphere, JUnit, Log4j, MVC, Maven.
Confidential
Associate Java Developer
Responsibilities:
- Involved in requirement gathering, analysis, design and development of the application.
- Implemented the Model-View-Controller (MVC) design pattern with Spring MVC.
- Used Servlets, JSP (including JSP custom tags), HTML, JavaScript, and CSS for the presentation/web tier, Hibernate for the application/business layer, and Oracle 9i for the data layer.
- Developed Dynamic and static web pages using JSP, Custom Tags and HTML.
- Extensively used the Spring MVC, Spring Core for Inversion of Control (IOC)/Dependency Injection, Application Context.
- Implemented the persistence layer using Hibernate (Spring + Hibernate integration); see the sketch after this list.
- Exposed web services to client applications by sharing the WSDLs.
- Handled transaction management using the Spring framework.
- Wrote HQL queries and stored procedures and enhanced performance by running explain plans.
- Used Maven for automated project builds.
- Involved in preparation of Unit test cases and System test plans for various functionalities using JUnit.
- Used IBM RAD to develop and debug application code.
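A minimal sketch of the Spring + Hibernate persistence layer and HQL usage described above; the Policy entity and its properties are hypothetical, and the Hibernate mapping metadata is omitted.

```java
import java.util.List;
import org.hibernate.SessionFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Repository;
import org.springframework.transaction.annotation.Transactional;

// Hypothetical Hibernate-mapped entity (mapping file/annotations omitted for brevity)
class Policy {
    private Long id;
    private String customerId;
    private boolean active;
    // getters and setters omitted for brevity
}

// DAO built on a Spring-managed Hibernate SessionFactory
@Repository
@Transactional
public class PolicyDao {

    private final SessionFactory sessionFactory;

    @Autowired
    public PolicyDao(SessionFactory sessionFactory) {
        this.sessionFactory = sessionFactory;
    }

    // HQL query against the mapped Policy entity
    @SuppressWarnings("unchecked")
    public List<Policy> findActivePolicies(String customerId) {
        return sessionFactory.getCurrentSession()
                .createQuery("from Policy p where p.customerId = :customerId and p.active = true")
                .setParameter("customerId", customerId)
                .list();
    }
}
```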
Environment: JDK, J2EE, Spring, JSP, Servlets, XML, Hibernate, SQL, Oracle 9i, HQL, Tomcat, HTML, JavaScript, CSS, IBM RAD