I am interested in working on challenging projects, being associated with a progressive organization that stimulates my professional and personal growth as a software professional, and being part of a team that strives to be the best. I am seeking a position as a Senior Developer or Team Lead on a Big Data Hadoop platform.
- Around 7.6 years of experience in software development and programming using Informatica, the Hadoop ecosystem, and Apache Spark.
- Quick to pick up new and emerging technologies.
- Ability to work well both in a team environment and as an individual.
Hbase, Hive, Pig, Oozie, Sqoop, HDFS, NIFI
Spark - Core, Spark-SQL
SQL Server 2005/2008, Teradata, Oracle 9i, 10g
Unix (Shell Scripting)
Scala, R (programming languages)
Environment: Apache Hadoop, Spark-SQL, Python, Scala, NIFI, Hive, SQL Server, Oracle
- Built data ingestion scripts to ingest data from various data sources into the Data Lake.
- Fed the required, transformed data into Hive tables for KPI and Sales dashboards via Spark using Scala.
- Created one-time scripts to automate all Hive table and database creation for smoother production releases.
- Ingested all data sources into the Data Lake via the ingestion scripts.
- Automated the Black-Box testing scripts, which reduced the man-days of effort required.
- Helped generate reports of mismatched records from Black-Box testing.
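A minimal sketch of the kind of one-time release automation described above: generating Hive database and table DDL from a column spec so a single script creates the whole schema. This is an illustration, not the actual production code; the database name, table names, and columns are hypothetical.

```python
# Hypothetical Hive DDL generator for release automation.
# All names below are illustrative, not from a real deployment.

def hive_create_table_ddl(db, table, columns, location):
    """Render a CREATE EXTERNAL TABLE statement for one Hive table.

    columns: list of (name, hive_type) tuples.
    """
    cols = ",\n  ".join(f"{name} {htype}" for name, htype in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {db}.{table} (\n"
        f"  {cols}\n"
        f")\nSTORED AS PARQUET\nLOCATION '{location}';"
    )

def render_release_script(db, tables):
    """Concatenate database + table DDL so one script builds the schema."""
    stmts = [f"CREATE DATABASE IF NOT EXISTS {db};"]
    for table, columns, location in tables:
        stmts.append(hive_create_table_ddl(db, table, columns, location))
    return "\n\n".join(stmts)
```

The rendered script can then be submitted once per release (for example via `hive -f release.sql`), so table creation never has to be done by hand in production.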
Environment: Spark-SQL, HDFS, Hive, SQL Server, Oracle, Apache Drill, QlikView, Autosys
- Responsible for loading data from various source systems, such as Iremarket, Infolease Asia, Infolease Middle East, and CEFDW, into the Confidential Data Lake.
- Responsible for the reconciliation and auditing of the ingested data for each source.
- Prepared validation reports on the ingested data for each source.
- Migrated the legacy system dashboard from Teradata queries to Apache Drill queries.
- Performed unit testing and prepared test cases for each data load.
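The per-source reconciliation and validation reporting above can be sketched as a row-count comparison between each source system and the Data Lake. This is a simplified illustration under assumed inputs; the source names and counts are made up, and real auditing would also check checksums or column-level aggregates.

```python
# Illustrative reconciliation sketch: compare expected source-system row
# counts against rows actually landed in the Data Lake and flag gaps.

def reconcile(source_counts, lake_counts):
    """Return a per-source validation report with status and mismatch delta."""
    report = {}
    for source, expected in source_counts.items():
        loaded = lake_counts.get(source, 0)
        report[source] = {
            "expected": expected,
            "loaded": loaded,
            "status": "OK" if loaded == expected else "MISMATCH",
            "delta": expected - loaded,
        }
    return report
```

Each report entry maps directly onto one row of the validation report prepared per source.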
Environment: Apache Hadoop, HDFS, Hive
- Responsible for loading and mapping data from EDM to the Confidential Internal Tables.
- Responsible for creating and reviewing code written by subordinates for data transfer to the core internal tables via Hive DDLs and DMLs.
- Compared the data and data types coming from the manual files against the EDM landing-area tables.
- Responsible for unit testing before passing work to QA.
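The column and data-type comparison between manual files and EDM landing-area tables can be sketched as a schema diff. This is a hedged illustration: the schema dictionaries and type names below are hypothetical, and a real check would be driven by Hive metastore metadata.

```python
# Hypothetical schema-comparison sketch: find columns missing on either
# side and columns whose declared types disagree.

def compare_schemas(file_schema, table_schema):
    """Compare {column: type} dicts; return missing columns and type mismatches."""
    missing_in_table = sorted(set(file_schema) - set(table_schema))
    missing_in_file = sorted(set(table_schema) - set(file_schema))
    type_mismatches = {
        col: (file_schema[col], table_schema[col])
        for col in file_schema
        if col in table_schema and file_schema[col] != table_schema[col]
    }
    return missing_in_table, missing_in_file, type_mismatches
```

A non-empty result from any of the three outputs would be flagged before the load is passed on to QA.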
Environment: Apache Hadoop, Pig, Hive, Oozie, Sqoop, HDFS, HBase
- Responsible for loading and mapping data from the Confidential Data Lake to the Confidential Internal Core Tables using Pig scripts, Hive, HBase, and Sqoop.
- Responsible for interacting with the on-site team on various issues related to Confidential data via the query log sheet.
- Identified issues in the data loading process and maintained the run book and process-change documents.
Environment: Informatica, SQL Server
- Responsible for data loading and mapping, system setup and configuration, and functional processes within the Crimson Solution packages, ensuring that the requirements presented for customer data and systems were accurate and followed standard operational processes.
- Responsible for setting up customer configurations and developing SQL queries.
- Handled Slowly Changing Dimensions to maintain the complete history of the data.
- Developed all mappings according to the design document.
- Performed unit testing of each mapping before passing it to QA.
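The Slowly Changing Dimension handling above can be sketched as a Type 2 merge: when a tracked attribute changes, the current row is expired and a new version is appended, preserving full history. This is a minimal sketch under assumed conventions (effective/end dates plus a current-row flag, which is typical for SCD Type 2 mappings); the field names are illustrative, not taken from the actual Informatica mapping.

```python
# Minimal SCD Type 2 sketch. Rows carry eff_date / end_date / is_current;
# all field and key names here are hypothetical.
from datetime import date

def apply_scd2(dimension, incoming, key, tracked, today=None):
    """Expire changed current rows and append new versions; return the list.

    dimension: list of dict rows with 'eff_date', 'end_date', 'is_current'.
    incoming:  list of dict rows keyed by `key`.
    tracked:   attribute names whose change triggers a new version.
    """
    today = today or date.today()
    current = {row[key]: row for row in dimension if row["is_current"]}
    for rec in incoming:
        old = current.get(rec[key])
        if old and all(old[f] == rec[f] for f in tracked):
            continue  # unchanged: keep the existing current version
        if old:
            old["is_current"] = False  # close out the superseded version
            old["end_date"] = today
        new_row = dict(rec)
        new_row.update(eff_date=today, end_date=None, is_current=True)
        dimension.append(new_row)
    return dimension
```

Brand-new keys simply get appended as current rows, so the same merge handles inserts and changes alike.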
Environment: SQL Server 2008
- End-to-end application development (backend); SQL Server was used for the bulk of operations, from data storage, manipulation, and validation to calculation and migration.
- In-depth knowledge of T-SQL (DDL, DML).
- Designed DDL and DML for MS SQL Server 2008.
- Used tools such as SQL Profiler extensively for debugging.