Hadoop Developer Resume
SUMMARY
- 8 years of work experience in developing, installing, configuring, analyzing, designing, integrating and re-engineering highly sophisticated software systems, consisting of 1 year in the Big Data space and 6+ years in Data Warehousing and Business Intelligence. Implemented business applications for the Logistics, Transportation, Finance, Insurance and Telecom verticals.
- Overall understanding of Big Data technologies, Data Warehousing concepts, Business Intelligence, cloud platforms, and analytics. Demonstrable knowledge of Hadoop, MapReduce (MR), HDFS, HBase, Hive, Sqoop, Flume, Ambari and Oozie.
- Installed and configured Hadoop in the cloud through the Hortonworks distribution and created multi-node clusters in the cloud. Good knowledge of and experience with Mint and Ubuntu Linux and with Pig scripting.
- Good understanding of Hadoop cluster administration, including adding and removing nodes from a cluster. Very good understanding of the NameNode, DataNode, Secondary NameNode, YARN (ResourceManager, NodeManager, WebAppProxy) and the MapReduce Job History Server.
- Good technical understanding of a few other Big Data distributions such as Cloudera, MapR, IBM BigInsights and HDInsight.
- Experienced in Data Warehousing, developing ETL mappings and scripts in Informatica PowerCenter 9.6 and PowerMart 9.6 using Designer, Repository Manager, Workflow Manager and Workflow Monitor.
- Used object-oriented analysis, design and development, Model View Controller, Java and J2EE (Servlets, JSPs, JNDI, JavaBeans, EJB, RMI and JDBC).
- Developed MapReduce programs using Java, created Hive and HBase tables, and wrote scripts using Apache Pig in the Hortonworks environment. Hands-on experience in building GUIs using JavaScript, AJAX, HTML, DHTML, CSS2, JSP, Taglibs, JSON, XML and XSL.
- Experience in developing web applications using SOAP and RESTful web services and WSDL. Implemented web applications using frameworks such as Struts 1.x/2.x, Spring 3.2 (Spring MVC, Spring Test module) and JSF 2.1, integrated with ORM tools like Hibernate 3.5.
- Good knowledge of Talend Open Studio for Data Integration, which efficiently and effectively manages all facets of data extraction, data transformation and data loading.
- Hands on Oracle 11g/10g/9i/8i, MS SQL Server 2002/7.0/6.5, SQL Server 12.0/11.x, MS Access 7.0/2000, SQL, PL/SQL, SQL*Plus, Sun Solaris 2.x and MQ Explorer.
- Proficient in Business Intelligence using SAP BusinessObjects (BO/BI) - designing universes and preparing dashboard reports.
- Extensively worked in Java/J2EE technologies. Knowledge of IDEs such as MyEclipse and Eclipse (EE) for development.
- Experience in GUI development with HTML, DHTML and JavaScript. Comprehensive knowledge of frameworks such as Struts 1.1/1.2, Hibernate 3.0 and Spring 2.
- Experience in the full SDLC (design, development, testing, deployment and support) and understanding of the ITIL process.
- Strong analytical and communication skills, excellent interpersonal skills, and the ability to work independently.
TECHNICAL SKILLS
Big Data Framework: Hortonworks (HDP 2.1), MapR
ETL/BI Tools: Informatica 9.6/9.1, Cognos 7.x/8.x, BusinessObjects 5.x/6.x
NoSQL: Apache HBase
Frameworks: Struts, Hibernate, Spring
Big Data Tools & Technologies: Hadoop 2.4.6 (HDFS, MapReduce), Pig, Hive, Oozie, Ambari, Flume, Sqoop, ZooKeeper, Spark, Kafka
OS: Windows 2000/98/95/NT/7, Linux/Unix; Kerberos security
RDBMS: Teradata 13.0, Oracle 11g/10g/9i, MS SQL Server, MySQL, SAP HANA 9
Tools: SQL*Plus, Toad, Power Designer 9.6, Rational Rose 2000, IBM Rational ClearCase, HP-QC
Data Modeling: Dimensional Data Modeling, Star Join Schema Modeling, Snowflake Modeling, Fact and Dimension Tables, Physical and Logical Data Modeling
Languages/Tools: Java 2.0/J2EE, SQL, PL/SQL, XML, HTML, JavaScript, UML, C, C++
PROFESSIONAL EXPERIENCE
Hadoop Developer
Confidential
Responsibilities:
- Involved in implementing the full lifecycle of a Hadoop solution, including requirements analysis, platform selection, technical design, application design and development, testing, and deployment.
- Responsible for analysis of data (cleansing, trimming, correctness checks, data integration and data quality) coming from different sources such as RDBMS and CSV files into the Hadoop ecosystem. Worked on a multi-node Hadoop cluster.
- Participated in designing and building scalable infrastructure and platforms to collect and process very large amounts of data (structured and semi-structured), including near-real-time data.
- Involved in the Hortonworks implementation of the Big Data distribution for high availability and fault tolerance per client requirements.
- Created internal and external tables in Hive, created and executed complex Hive queries for ad hoc access, and worked with business/application teams to confirm the accuracy of the results.
- Hands-on with the Hadoop ecosystem - Apache Pig, Oozie (job scheduling), HDFS and MapReduce; experience in installing and configuring these tools on a Hadoop cluster; able to troubleshoot and tune logic/queries.
- Created HBase tables to load large sets of structured and semi-structured data coming from UNIX, NoSQL sources and a variety of portfolios (see the HBase sketch after this list).
- Developed MapReduce programs to parse the raw data, populate staging tables and store the refined data in partitioned tables in the EDW (see the MapReduce sketch after this list).
- Created Hive queries that helped market analysts spot emerging trends by comparing fresh data with EDW reference tables and historical metrics.
- Used Flume to handle streaming data and load it into the Hadoop cluster. Created shell scripts to ingest files from the edge node into HDFS.
- Worked on creating MapReduce scripts for processing the data. Used Sqoop to move structured data from MySQL into HDFS, Hive, Pig and HBase.
- Worked on different Big Data file formats such as text, SequenceFile, Avro and Parquet, with Snappy compression. Used Java to read Avro files (see the Avro sketch after this list).
- Used core Java concepts such as collections, generics, exception handling, I/O and concurrency to develop business logic. Validated query execution plans and tuned queries using indexes, views and batch processing.
- Retrieved agent properties throughout the application using XPath. Integrated the Spring and Hibernate frameworks to develop the end-to-end application.
- Used Hibernate to connect from the web service and perform CRUD operations against the database. Used the Spring framework to inject services and entity services and to manage transactions.
- Enabled speedy reviews and first-mover advantages by using Oozie to automate data loading into HDFS and Pig to pre-process the data.
- Wrote UNIX shell Scripts & PMCMD commands for FTP of files from remote server and backup of repository and folder.
- Used the Spark API over Cloudera Hadoop YARN to perform analytics on data in Hive (see the Spark sketch after this list).
- Explored Spark for improving the performance and optimization of existing algorithms in Hadoop using the Spark context, Spark SQL, DataFrames and pair RDDs.
- Migrated code between environments and maintained code backups.
- Created checklists for coding, testing and release for a smooth, error-free project flow.
- Created release documents for better readability of code/reports by end users.
- Handled User Acceptance Testing and System Integration Testing in addition to unit testing, using Quality Center as the bug-logging tool. Created and documented the Unit Test Plan (UTP) for the code.
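The HBase table creation mentioned above can be illustrated with a minimal Java sketch using the HBase client API of that era (HBase 0.98 on HDP 2.1 is assumed); the table name and column family below are hypothetical, not taken from the actual project.

// Minimal sketch, not project code: "portfolio_events" and the "d" column family are hypothetical.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class CreatePortfolioTable {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create(); // picks up hbase-site.xml from the classpath
        HBaseAdmin admin = new HBaseAdmin(conf);
        try {
            HTableDescriptor table = new HTableDescriptor(TableName.valueOf("portfolio_events"));
            table.addFamily(new HColumnDescriptor("d")); // single column family holding the event details
            if (!admin.tableExists(table.getTableName())) {
                admin.createTable(table);
            }
        } finally {
            admin.close();
        }
    }
}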
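The MapReduce parsing work is illustrated by the sketch below. It is a minimal example rather than the actual project code: the pipe delimiter, field positions and class names are assumptions, and the job simply drops malformed records and aggregates a count per record type.

// Illustrative sketch only: delimiter, field layout and names are hypothetical.
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class RawRecordParser {

    // Mapper: parse a pipe-delimited raw record, skip malformed rows (basic cleansing),
    // and emit (recordType, 1) for downstream aggregation.
    public static class ParseMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text recordType = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split("\\|");
            if (fields.length < 3) {
                return; // drop malformed records
            }
            recordType.set(fields[1].trim());
            context.write(recordType, ONE);
        }
    }

    // Reducer: count records per type; in the real job the refined output
    // would feed partitioned staging tables in the EDW.
    public static class CountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "raw-record-parser");
        job.setJarByClass(RawRecordParser.class);
        job.setMapperClass(ParseMapper.class);
        job.setReducerClass(CountReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}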
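Reading an Avro container file from Java, as mentioned above, can be sketched with Avro's generic API; the input path is passed as a program argument and is hypothetical.

// Minimal sketch: dump every record of an Avro file using the embedded writer schema.
import java.io.File;
import java.io.IOException;
import org.apache.avro.file.DataFileReader;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.DatumReader;

public class AvroFileDump {
    public static void main(String[] args) throws IOException {
        DatumReader<GenericRecord> datumReader = new GenericDatumReader<>();
        // The reader picks up the writer schema stored in the Avro container file itself.
        try (DataFileReader<GenericRecord> fileReader =
                     new DataFileReader<>(new File(args[0]), datumReader)) {
            while (fileReader.hasNext()) {
                GenericRecord record = fileReader.next();
                System.out.println(record); // print each record as JSON-like text
            }
        }
    }
}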
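The Spark-on-Hive analytics can be sketched as below, assuming the Spark 2.x SparkSession API; the database, table and column names are hypothetical, and on the cluster the job would be submitted to YARN (for example with spark-submit --master yarn).

// Minimal sketch, not project code: edw.daily_orders and its columns are hypothetical.
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class HiveAnalytics {
    public static void main(String[] args) {
        // enableHiveSupport() lets Spark SQL read tables from the existing Hive metastore.
        SparkSession spark = SparkSession.builder()
                .appName("hive-analytics")
                .enableHiveSupport()
                .getOrCreate();

        Dataset<Row> trends = spark.sql(
                "SELECT region, COUNT(*) AS orders " +
                "FROM edw.daily_orders GROUP BY region ORDER BY orders DESC");
        trends.show(20); // print the top regions by order count

        spark.stop();
    }
}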
Environment: Hortonworks (HDP 2.1), Hadoop 2.7 (HDFS, MapReduce), Pig 0.8, HBase, Hive 0.13, Ambari, Flume, Sqoop, ZooKeeper, R (Revolution Analytics), Oracle 11g/10g, DB2, JSON, CSV, Oozie, Unix, MQ Series, Erwin 3.5, PL/SQL, Windows 7/2000, Java 1.8/J2EE, HTML, JavaScript, jQuery, Servlets, JSP, JSON, Apache Tomcat Server 9, XML, XSD, XPath, WSDL, Spring 3.2, Hibernate 3.5, JAX-RS, JUnit, Maven, SVN, Postman, Ant, JConsole, SQL Developer, SQL Server Management Studio
Informatica-Teradata Developer / System Analyst
Confidential
Responsibilities:
- Parsed high-level design specs into simple ETL coding and mapping standards.
- Worked with the Informatica PowerCenter tools - Source Analyzer, Warehouse Designer, Mapping & Mapplet Designer and Transformation Designer.
- Worked with the data architecture team, to some extent, on the Entity-Relationship model diagram, and worked towards implementing the same in SEDW.
- Responsibilities included designing and developing Informatica mappings to load data from source systems to SA, then to SEDW, and finally to SSL. Also involved in Type-II slowly changing dimensions.
- Created complex mappings using Unconnected Lookup, Sorter, Aggregator, dynamic Lookup and Router transformations to populate target tables efficiently.
- Extensively used Power Center to design multiple mappings with embedded business logic.
- Created transformations such as Joiner, Rank and Source Qualifier in the Informatica Designer.
- Good at designing, developing and maintaining Oracle database schemas, tables, standard/materialized views, synonyms, unique/non-unique indexes, constraints, triggers, sequences, implicit/explicit cursors, cursor FOR loops, reference cursors and other database objects.
- Worked on Teradata 13.x and used utilities such as MultiLoad, FastLoad, FastExport, BTEQ, TPump and Teradata SQL.
- Good experience in developing and designing Informatica transformation objects, session tasks, worklets, workflows, workflow configurations and sessions (reusable and non-reusable).
- Knowledge of Informatica advanced techniques - dynamic caching, memory management and parallel processing - to increase performance throughput.
- Excellent knowledge and experience in creating source to target mapping, edit rules and validation, transformations, and business rules.
- Created Mapplets and used them in different mappings.
- Using Informatica Repository Manager, maintained the repositories of various applications and created users, user groups and security access controls.
- Good knowledge of physical and logical data models.
- Provided Knowledge Transfer to the end users and created extensive documentation on the design, development, implementation, daily loads and process flow of the mappings.
- Maintained Development, Test and Production mapping migration using Repository Manager; also used Repository Manager to maintain metadata, security and reporting.
- Understood the business needs and implemented them in a functional database design.
- Data Quality Analysis to determine cleansing requirements. Designed and developed Informatica mappings for data loads.
- Tuned Informatica mappings and sessions for optimum performance. Created and maintained several custom reports for the client using Business Objects.
- Good knowledge of Informatica 10 features such as the Big Data Management platform along with the Developer client, Analyst tool, Admin Console, PWX and IDQ, Metadata Manager and Business Glossary.
Environment: Informatica PowerCenter 9.6 (Source Analyzer, Warehouse Designer, Mapping Designer, Mapplet, Transformations, Workflow Manager, Workflow Monitor), Teradata 13.0, Oracle 10g, MQ Series, Erwin 3.5, PL/SQL, Windows 7/2000.
Informatica Developer
Confidential
Responsibilities:
- End-to-end ETL development of the Premium Module Data Mart. Maintained warehouse metadata, naming standards and warehouse standards for future application development.
- Developed design specs into simple ETL coding and mapping standards. Worked with the Informatica PowerCenter tool - Source Analyzer, Warehouse Designer, Mapping & Mapplet Designer.
- Designed and developed Informatica mappings to load data from source systems to the ODS and then to the Data Mart.
- Extensively used Power Center/Mart to design multiple mappings with embedded business logic.
- Created transformations such as Lookup, Joiner, Rank and Source Qualifier in the Informatica Designer.
- Created mappings using Sorter, Aggregator, dynamic Lookup and Router transformations.
- Created Mapplets and used them in different mappings.
- Worked with mapping variables, mapping parameters and variable functions such as SetVariable, SetCountVariable, SetMinVariable and SetMaxVariable.
- Provided Knowledge Transfer to the end users and created extensive documentation on the design, development, implementation, daily loads and process flow of the mappings.
- Maintained Development, Test and Production mapping migration using Repository Manager; also used Repository Manager to maintain metadata, security and reporting.
- Understood the business needs and implemented them in a functional database design.
- ETL Migration process from legacy scripts to Informatica Power Center 5.2.
- Data Quality Analysis to determine cleansing requirements. Designed and developed Informatica mappings for data loads.
Environment: Informatica PowerCenter 9.0.1 (Source Analyzer, Warehouse Designer, Mapping Designer, Mapplet, Transformations, Workflow Manager, Workflow Monitor), Erwin 3.5, PL/SQL, Oracle 9i, DB2, Windows 2000.
ETL Developer
Confidential
Responsibilities:
- Interacted with business analysts and translated business requirements into technical specifications.
- Using Informatica Designer, developed mappings, which populated the data into the target.
- Used Source Analyzer and Warehouse Designer to import the source and target database schemas and the mapping designer to map the sources to the targets.
- Worked extensively with Workflow Manager, Workflow Monitor and Worklet Designer to create, edit and run workflows and tasks.
- Enhanced performance of Informatica sessions using large data files by using partitions and increasing the block size, data cache size and target-based commit interval.
- Extensively used aggregators, lookup, update strategy, router and joiner transformations.
- Developed the control files to load various sales data into the system via SQL*Loader.
- Extensively used TOAD for data analysis, error fixing and development.
- Involved in the design, development and testing of the PL/SQL stored procedures, packages for the ETL processes.
- Created different types of reports like List, Cross-tab, Chart, Drill-Thru and Master-Detail Reports. Created multiple Dashboard reports for multiple packages.
- Migrated reports from Cognos 8.3 to 8.4 and initiated training for users to do ad-hoc reporting using Query Studio.
Environment: Informatica PowerCenter 8.x (Source Analyzer, Data warehouse designer, Mapping Designer, Mapplet, Transformations, Workflow Manager, Workflow Monitor), Erwin 3.5, PL/SQL, Oracle 10g/9i, SQL Server 2005, ROLAP, Cognos 8.2/3/4 (Framework Manager, Report Studio, Query Studio, Analysis Studio, Cognos Connection), Windows 2000.
Oracle Developer
Confidential
Responsibilities:
- Wrote UNIX shell scripts to process files on a daily basis - renaming files, extracting dates from files, unzipping files and removing junk characters - before loading them into the base tables.
- Created PL/SQL stored procedures, functions and packages for moving the data from staging area to data mart.
- Handled errors using exception handling. Daily operations included job monitoring and notifying/fixing data load failures.
- Production support of the existing system - fixing database problems, processing errored-out records, resolving bugs in the code of the EIB system, resolving calls and providing on-call support.
- Extensively used the advanced features of PL/SQL like Records, Tables, Object types and Dynamic SQL.
- Root cause analysis, Enhancements, Requirement collection and Estimation.
Environment: Oracle 10g, SQL * Plus, TOAD, SQL*Loader, PL/SQL Developer, Shell Scripts, UNIX, Windows XP