- Over 8 years of experience in Information Technology, including 3 years in Big Data/Hadoop.
- Experience working with BI teams to translate big data requirements into Hadoop-centric technologies.
- Experience in performance tuning Hadoop clusters based on analysis of the existing infrastructure.
- Strong object-oriented design skills with complete software development life cycle experience: requirements gathering, conceptual design, analysis, detailed design, development, mentoring, and system and user acceptance testing.
- Excellent knowledge of Hadoop architecture and ecosystem components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and MapReduce.
- Used various Hive SerDes, such as RegexSerDe.
- Extensive knowledge of data serialization formats such as Avro and SequenceFiles.
- Excellent understanding and knowledge of NoSQL databases such as HBase.
- Experience in supporting data analysts in running Pig and Hive queries.
- Developed MapReduce programs to perform analysis.
- Imported and exported data into HDFS and Hive using Sqoop.
- Experience in writing shell scripts to move shared data from MySQL servers to HDFS.
- Highly knowledgeable in the WritableComparable and Writable interfaces, the Mapper and Reducer base classes, and Hadoop data types such as IntWritable, ByteWritable, and Text.
- Experience in using Oozie 0.1 for managing Hadoop jobs.
- Experience in cluster coordination using Zookeeper.
- Extensive development experience with IDEs such as Eclipse, NetBeans, Forte, and STS.
- Expertise in relational databases such as Oracle and MySQL.
- Experience in designing both time-driven and data-driven automated workflows using Oozie 3.0 to run Hadoop MapReduce 2.0 jobs.
- Experience in installing, configuring, supporting, and managing Cloudera's Hadoop platform, including CDH3 and CDH4 clusters.
- Experienced in setting up SSH, SCP, SFTP connectivity between UNIX hosts.
- Extensive experience working with customers to gather the information needed to analyze and debug technical problems; provided data fixes or code fixes, built service patches for each version release, performed unit, integration, user acceptance, and system testing, and delivered technical solution documents to users.
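The Sqoop imports from MySQL into HDFS mentioned above generally take the shape below. This is a minimal sketch: the connection string, table, and target directory are illustrative placeholders, and the script echoes the command instead of executing it so it can be reviewed without a live cluster.

```shell
#!/bin/sh
# Hypothetical connection details, for illustration only.
DB_URL="jdbc:mysql://dbhost:3306/sales"
DB_USER="etl_user"
TABLE="orders"
TARGET_DIR="/data/raw/orders"

# Build the Sqoop import command as a string and echo it (a dry run),
# rather than invoking sqoop directly.
SQOOP_CMD="sqoop import --connect $DB_URL --username $DB_USER \
--table $TABLE --target-dir $TARGET_DIR --num-mappers 4"

echo "$SQOOP_CMD"
```

In a real job the string would be executed (or the flags passed straight to `sqoop import`), typically from a scheduled shell script or an Oozie action.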
- Programming Languages : Java 1.4, C, C++, SQL, Pig, PL/SQL.
- Java Technologies : JDBC.
- Frameworks : Jakarta Struts 1.1, JUnit, and JTest.
- Databases : Oracle 8i/9i, NoSQL, MySQL, MS SQL Server.
- IDEs/Utilities : Eclipse, JCreator, NetBeans.
- Web Dev. Technologies : HTML, XML.
- Protocols : TCP/IP, HTTP and HTTPS.
- Operating Systems : Linux, MacOS, Windows 98/2000/NT/XP.
- Hadoop Ecosystem : Hadoop, MapReduce, Sqoop, Hive, Pig, HBase, HDFS, Oozie.
Description: This project replaces existing mainframe legacy applications by storing and processing the data of the Billing, Payments, and Disbursements application databases entirely in HDFS. All processing in HDFS is done through Pig, Hive, and MapReduce programs. Responsibilities included enhancing performance using Hadoop sub-projects such as Pig, Hive, and Flume; migrating data from legacy systems using Sqoop; handling performance tuning; conducting regular backups; and ensuring technical and functional designs meet business requirements. The project works with a variety of data sources: RDBMS tables, flat files, fixed-length files, delimited files, and legacy file formats.
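The kind of delimited-file aggregation this project runs through Pig and Hive can be sketched locally with standard tools before porting it to the cluster. The file name, field layout, and amounts below are invented for illustration; the awk step stands in for the equivalent Pig or Hive group-and-sum job.

```shell
#!/bin/sh
# Tiny sample of pipe-delimited billing records: account_id|amount.
cat > /tmp/billing.txt <<'EOF'
A100|25.50
A101|10.00
A100|14.50
EOF

# Sum the amount per account, the same shape of aggregation that a
# Pig script or Hive query would express over the full HDFS data set.
awk -F'|' '{ sum[$1] += $2 }
           END { for (k in sum) printf "%s\t%.2f\n", k, sum[k] }' \
  /tmp/billing.txt | sort > /tmp/billing_totals.txt

cat /tmp/billing_totals.txt
```

The Hive equivalent would be a `SELECT account_id, SUM(amount) ... GROUP BY account_id` over an external table pointed at the delimited files.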
- Worked on evaluation and analysis of the Hadoop cluster and various big data analytic tools, including Pig, the HBase database, and Sqoop.
- Responsible for building scalable distributed data solutions using Hadoop.
- Involved in loading data from the Linux file system to the Hadoop Distributed File System.
- Created Hbase tables to store various data formats of PII data coming from different portfolios.
- Experience in managing and reviewing Hadoop log files.
- Exported the analyzed and processed data to relational databases using Sqoop for visualization and report generation by the BI team.
- Installed the Oozie workflow engine to run multiple Hive and Pig jobs.
- Analyzed large data sets to determine the optimal way to aggregate and report on them.
- Worked with the Data Science team to gather requirements for various data mining projects.
- Analyzed large data sets by running Hive queries and Pig scripts.
- Created dashboards in Tableau to analyze data for reporting.
- Supported setup of the QA environment and updated configurations for implementation scripts using Pig and Sqoop.
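Loading data from the Linux file system into HDFS, as described above, is usually scripted around `hdfs dfs` commands. The paths below are hypothetical, and a dry-run guard echoes each command instead of running it, so the sketch can be reviewed (and tested) without a cluster.

```shell
#!/bin/sh
# Illustrative source and landing paths; DRY_RUN=1 means commands are
# printed, not executed, so no Hadoop installation is required here.
DRY_RUN=1
SRC_DIR="/var/exports/portfolio"
DEST_DIR="/data/landing/portfolio/$(date +%Y-%m-%d)"

run() {
  # Echo the command in dry-run mode; execute it otherwise.
  if [ "$DRY_RUN" -eq 1 ]; then echo "WOULD RUN: $*"; else "$@"; fi
}

run hdfs dfs -mkdir -p "$DEST_DIR"
run hdfs dfs -put "$SRC_DIR"/*.csv "$DEST_DIR"/
```

Setting `DRY_RUN=0` on a cluster node would perform the actual copy; the same pattern extends to `-copyFromLocal` or scheduled Oozie shell actions.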
Description: This project is a use case for Confidential's enterprise data warehouse. The Golden Record is the Enterprise Customer Record (ECR) created as part of the project. Data is exported from DB2 to Hadoop, where pre-processing is completed: the data is de-normalized and business rules are applied to create a flattened view. Once pre-processed, the data is moved to Cassandra for near-real-time access through a RESTful web service.
- Exported data from DB2 to HDFS using Sqoop and NFS mount approach.
- Moved data from HDFS to Cassandra using MapReduce and the BulkOutputFormat class.
- Developed MapReduce programs to apply business rules to the data.
- Developed and executed Hive queries to de-normalize the data.
- Installed and configured Hadoop Cluster for development and testing environment.
- Implemented the Fair Scheduler on the JobTracker to share cluster resources among the MapReduce jobs submitted by users.
- Implemented partitioning, dynamic partitions, and bucketing in Hive, and wrote MapReduce programs to analyze and process the data.
- Developed a data pipeline into DB2 carrying user purchasing data from Hadoop.
- Developed job flows in Oozie to automate the workflow for extracting data from Teradata and Netezza.
- Developed data pipelines using Pig and Hive from Teradata and Netezza data sources; these pipelines used customized UDFs to extend the ETL functionality.
- Loaded the data into HDFS using Sqoop for analysis.
- Exported the analyzed data to relational databases using Sqoop for visualization and for report generation by the BI team.
- Analyzed web log data using HiveQL to extract the number of unique visitors per day, page views, visit duration, and the most purchased product on the website.
- Streamlined Hadoop jobs and workflow operations using Oozie workflow engine.
- Involved in the product life cycle, developed using the Scrum methodology.
- Mentored the team in technical discussions and technical reviews.
- Involved in code reviews and verifying bug analysis reports.
- Automated workflows using shell scripts.
- Performance-tuned Hive queries written by other developers.
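The unique-visitors-per-day metric described above (in Hive, roughly `SELECT dt, COUNT(DISTINCT visitor_id) FROM weblog GROUP BY dt`) can be sketched locally on a tiny sample. The log layout and values below are invented for illustration; the awk step performs the same distinct-count-per-day aggregation.

```shell
#!/bin/sh
# Tiny sample web log: date, visitor_id, page (illustrative fields).
cat > /tmp/weblog.txt <<'EOF'
2013-04-01 u1 /home
2013-04-01 u2 /product
2013-04-01 u1 /checkout
2013-04-02 u3 /home
EOF

# Count each visitor once per day: u1 appears twice on 2013-04-01
# but contributes one to that day's unique count.
awk '{ if (!seen[$1 SUBSEP $2]++) count[$1]++ }
     END { for (d in count) printf "%s\t%d\n", d, count[d] }' /tmp/weblog.txt \
  | sort > /tmp/uniques.txt

cat /tmp/uniques.txt
```

On the cluster, the same result would come from the HiveQL above over an external table on the raw logs; the local version is useful for validating the logic before a full run.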
Environment: Hadoop, HDFS, Hive, MapReduce 2.0, Sqoop 2.0.0, Oozie 3.0, Shell Scripting, Ubuntu, Linux Red Hat.
This application was developed to capture and store patient information. The client wanted an application that maintains patient records. Details such as the patient's personal information, billing details, clinical details, treatment details, and specimen details are recorded using J2EE and SQL Server.
- Responsible for analyzing business requirements and detail design of the software.
- Designed and developed the front-end user interface.
- Established JDBC connectivity using Oracle 10g.
- Involved with project manager in creating detailed project plans.
- Designed technical documents using UML.
- Created JUnit test cases following test-driven development.
- Responsible for implementing DAOs, POJOs, and the service layer using Hibernate reverse engineering and AOP.
- Used Spring with the MVC pattern and the Struts framework, following test-driven development.
Environment: Rational Application Developer (RAD) 7.5, WebSphere Portal Server 6.1, Java 1.6, J2EE, JSP 2.1, Servlets 3.0, JSF 1.2, Spring 2.5, Hibernate 2.0, WebSphere 6.1, Axis, Oracle 10g, JUnit, XML, HTML, JavaScript, AJAX, CSS, Rational ClearCase.
Was part of the development team building an application for the underwriting and administration of disability-related products under Income Maintenance and Enhancement.
- Extensively used Core Java, Servlets, JSP, and XML.
- Used Struts 1.2 in the presentation tier.
- Generated the Hibernate XML and Java Mappings for the schemas
- Used DB2 Database to store the system data
- Actively involved in the system testing
- Involved in fixing bugs and unit testing with test cases using JUnit
- Wrote complex SQL queries and stored procedures
- Used IBM WebSphere as the application server.
Environment: Java 1.2/1.3, Swing, Applets, Servlets, JSP, XML, HTML, JavaScript, Oracle, DB2, PL/SQL.
Programmer Analyst/Java Developer, Idea Labs
Automated Solution for a Retail Chain: Developed a prototype for a retail chain, a complete ERP solution that integrates all modules, such as the retail outlets, the central repository, and the procurement department, into a single connected system. Added RFID (Radio Frequency Identification) technology to the software, replacing barcodes.
- Involved in the complete software development life cycle: requirements analysis, conceptual design, detailed design, development, and system and user acceptance testing.
- Involved in Design and Development of the System using Rational Rose and UML.
- Involved in Business Analysis and developed Use Cases, Program Specifications to capture the business functionality.
- Improved coding standards, code reuse, and application performance by making effective use of design patterns such as Business Delegate, View Helper, DAO, and Value Object, among other basic patterns.
- Designed the system using JSPs and Servlets.
- Designed the application using the Process Object, DAO, Data Object, Value Object, Factory, and Delegation patterns.
- Involved in integrating the concept of RFID in the software and developing the code for its API.
- Coordinated between teams as project coordinator, organizing design and architectural meetings.
- Designed and developed class diagrams, identified objects and their interactions, and specified sequence diagrams for the system using Rational Rose.
Environment: JDK 1.3, J2EE, JSP, Servlets, HTML, XML, UML, Rational Rose, AWT, WebLogic 5.1, Oracle 8i, SQL, PL/SQL.
References: Available upon request.