- 10 years of professional IT experience in designing, developing and architecting Data Warehouses and Business Intelligence solutions for the insurance and financial sectors.
- 3+ years of extensive hands-on experience in Big Data/Hadoop as a Hadoop Developer, with very good knowledge of Apache Hadoop and ecosystem components such as MapReduce, Hive, Pig, Sqoop, Flume, and Oozie.
- Solid experience and expert knowledge in designing big data applications, with hands-on experience in Hadoop ecosystem components. Led the effort from creating the POC and demonstrating it to stakeholders through design, development and implementation in the production environment.
- Good knowledge of and solid experience in Spark and its ecosystem projects such as Spark SQL, DataFrames, and Streaming.
- Experience in importing and exporting data between relational databases and HDFS using Sqoop.
- Experience in setting up high availability for Hadoop cluster components and edge nodes.
- Experience in developing Shell scripts and Python Scripts for system management
- Experience in Performance tuning of Hadoop clusters and its ecosystem components.
- Good knowledge and solid experience in Hadoop distributions CDH, MapR, Hortonworks
- Experience in analyzing high volumes of data and generating graphs, reports, alerts, and dashboards using Splunk/Tableau.
- Expert-level programming experience using Python in UNIX (Linux, HP-UX) environments, UNIX shell scripting, and SQL, with exposure to Scala.
- Developed Talend Jobs to compare two XML files containing support requests to determine which is more recent and take conditional action based on that determination.
- Developed Talend Jobs to store data from a relational database on HDFS/MapR-FS.
- Expertise in Data Warehouse applications, directly responsible for the Extraction, Transformation and Loading of data from heterogeneous source systems such as flat files, Excel, Oracle, SQL Server, and Teradata.
- Strong Exposure in writing Simple and Complex SQLs, PL/SQL Functions and Procedures, Packages and creation of Oracle Objects - Tables, Materialized views, Triggers, Synonyms, User Defined Data Types, Nested Tables and Collections
- Extensive experience in working with different Databases such as Teradata, Oracle, SQL Server, MySQL and writing efficient and complex SQLs on huge volumes of data.
- Experience in Object Oriented programming, OOAD methodologies, design patterns and code refactoring.
- Excellent skills in Data Analysis and maintaining Data Quality / Data Integrity.
- Passion towards new technologies and ability to adapt quickly to new environments.
- Experience in resolving production issues quickly. Very good problem-solving skills.
- Experience in customization process i.e. Requirements gathering, Preparing Detailed Design Specifications, Gap Analysis, Development, Integration and Deployment.
- Excellent planning, scoping, scheduling, and delivery skills. Expert in leading enterprise software solution projects.
- Experience in DevOps, Agile, Waterfall methodology projects.
- Good at Peer reviews and Defect analysis.
- Established self as an effective leader and communicator with strong interpersonal, technical, leadership, and analytical skills, along with experience in quality process, project delivery, and risk management.
- Strong global resources experience: onshore/offshore models.
- Strong analytical, problem-solving and organizational skills. Able to handle multiple tasks and assignments concurrently in cross-functional teams; a flexible team player able to communicate with all levels of personnel.
Operating Systems: Windows XP, 2000, UNIX/Linux
Hadoop Ecosystem Components: Hadoop, HDFS, MapRFS, HttpFS, MapReduce, Pig, Hive, Drill, HCatalog, Sqoop, Flume, Spark, Storm, Kafka, Hue
Data Protection tool: Dataguise
Big Data Analytics tool: Datameer
Visualization & Reporting: Tableau, Splunk
Programming Languages: Python; exposure to C, C++, Core Java, Scala
Web Technologies: HTML, XML
ETL Tools: Talend Open Studio for Big Data
Scripting: Unix Shell Scripting
RDBMS: Teradata 14.0, Oracle 8i/9i/10g, Sybase, MySQL
Methodologies: Star Schema and Snowflake Schema
Teradata Utilities: BTEQ, FastLoad, MultiLoad, TPump and FastExport
Scheduler / Tools: PuTTY, Windows SFTP
Versioning Tools: TortoiseSVN
Microsoft Tools: Microsoft Visio 2010, Word 2010, PowerPoint 2013
Confidential - Phoenix, AZ
Big Data/Hadoop Developer
ETL Tools: Talend
Programming Languages: Python/Shell Scripting/MySQL/Spark
Tools and Technologies: Unix, Hive, Pig, Python, Shell Scripting, MySQL, Java (MapReduce)
- Involved in creating complex, high-performance solution architecture in the Big Data space based on the Hadoop platform.
- Involved in technical interaction with customers: understood project requirements, prepared the analysis and conceptual design of the solution, and managed project delivery with complete project deliverables.
- Provide review and feedback for existing physical architecture, data architecture and individual code.
- Provide proof-of-concepts to reduce engineering churn.
- Analyze, Ingest, Store, and Manage large volumes of data in HDFS.
- Analyzed the requirements for completeness, consistency, comprehensibility, and feasibility before converting them into solutions.
- Interacted with Developers, Architects and other Operations teams to resolve cluster and job Performance issues.
- Responsible for analyzing data and maintaining data integrity and data quality.
- Based on the system-of-record file format (mainframe, flat files, JSON files), used various techniques to flatten the data and place it in HDFS.
- In this process, used tools such as Talend, or high-level Python code, to transform the data to meet business requirements.
- Worked with big data developers, designers and scientists to troubleshoot MapReduce job failures and issues with Hive, Pig and other ecosystem projects.
- Worked with Unix to run the scripts in batch mode.
Confidential - Phoenix, AZ
Big Data/Hadoop Developer
Programming Languages: Ab Initio, Event Engine, Sybase, Teradata
- Involved in Requirement gathering, business Analysis, Design and Development, testing and implementation of business rules.
- Involved in Data Modeling to identify the gaps with respect to business requirements and transforming the business rules.
- Prepared data flow diagrams for the source systems by referring to the TCLDM.
- Extracted operational data into the data store from the legacy systems using COBOL and JCL scripts.
- Extensively used ETL to load data from Sybase databases, XML files, and flat files; also imported data from IBM mainframes.
- Loaded and transferred large data from different databases into Teradata.
- Writing scripts for data cleansing, data validation, data transformation for the data coming from different source systems.
- Developing and reviewing Detail Design Document and Technical specification docs for end to end ETL process flow for each source systems.
- Worked on UTL_FILE with extensive parallel processing.
- Used UTL_FILE in packages to read/write data from/to flat files.
- Implemented component, pipeline and data parallelism in Ab Initio for the Data Warehouse ETL process.
- Regular interactions with DBAs.
- Developed UNIX shell scripts to run batch jobs and loads into production.
- Involved in Unit Testing and Preparing test cases and also involved in Peer Reviews.
Confidential
Big Data/Hadoop Developer
Programming Languages: Ab Initio, Teradata, Sybase
- Provided L2 and L3 support: incident/issue analysis and resolution.
- ETL/SQL Components Development activities.
- Migration of code/nodes onto QA and production environment.
- Production support and maintenance of the existing Data Mart processes. Involved in enhancements to the existing Data Marts.
- Supported several end-user groups with Data, SQL and Report generation support.
- Worked daily on issue tickets involving production issues, data issues and maintenance requests.
- Performance Tuning for the existing Oracle SQL scripts
- Reviewed the SQL for missing joins & join constraints, data format issues, mismatched aliases, casting errors.
- Collected Multi-Column Statistics on all the non-indexed columns used during the join operations.
- Extensively used derived tables, volatile tables and global temporary tables (GTTs) in many of the ETL scripts.
- Very good understanding of the several relational Databases such as Teradata, Oracle and DB2.
- Wrote several complex SQL queries using subqueries, join types, temporary tables, OLAP functions, etc.
- Very good understanding of Relational Modeling concepts such as Entities, relationships, Normalization etc.
- Performed System Integration Testing for the key phases in the project
- Provided 24x7 production support for the Teradata ETL jobs on daily, weekly and monthly schedules.
- Supported jobs running in production by resolving failures, tracking failure reasons, and providing the best resolution in a timely manner.
- Developed UNIX Shell scripts for file manipulations.
- Handled a 75 TB Data Warehouse.
Confidential
Big Data/Hadoop Developer
- Delivering high quality Code.
- Interacting with business analysts to analyze the user requirements, functional specifications and system specifications.
- Developed complex queries, functions, stored procedures and triggers using PL/SQL.
- Designed and developed backend PL/SQL packages, tuned stored procedures
- Coding client specific Procedures for Import & Export of data.
- Improved the performance of the SQL queries by utilizing best optimization techniques.
- Client and Product development tasks as assigned
- Used UTL_FILE in packages to read/write data from/to flat files.
- Study and understanding of functional specifications.
- Involved in testing the application manually before automating the desired test cases
- Prepared user requirements and developed business and technical specifications in accordance with business rules.
- Generation of unit test cases from Functional Specifications.