- Paramita Banik is currently working as a Data Analyst at Confidential.
- Over 10.5 years of total IT experience; previously worked as a Data Specialist for Confidential.
- Hands-on technical experience with the Hadoop ecosystem.
- Strong data warehousing ETL experience using Informatica.
- Excellent technical knowledge of Confidential DataStage.
- Core vertical experience in Banking, Insurance, Power & Energy, and Retail.
- Experienced troubleshooter, root-cause analyst, and solution developer for Informatica & DataStage.
- Expert in all phases of the software development life cycle (SDLC), both Waterfall and Agile.
Big Data tools: Hive 2.2.1, Hadoop 2.8.0, Spark 2.1.1, Sqoop, Python, Pig 0.16.0, R Programming
Reporting Tools: MicroStrategy, SSRS, Tableau.
Agile tools: Atlassian JIRA, qTest
Data Warehousing Tools: Confidential WebSphere DataStage v9.1/8.5/8.1/8.0, Information Analyzer, Quality Stage, Ascential DataStage 7.5 (Manager, Designer, Director, Administrator), Informatica PowerCenter 9.1/8.6.1, Informatica PowerExchange 9.1/8.6.1, Informatica Big Data Edition (BDE) 9.x, OLAP, OLTP, SSIS
Dimensional Data Modeling: Dimensional data modeling, UML, star join schema, snowflake schema, fact and dimension tables.
Databases: Oracle 8i/9i/10g/11i, DB2 UDB 7.2, SQL Server 2012
Operating Systems: AIX, Windows 98/2000/Server 2008 R2.
Microsoft tools: MS Access 97/2000, OFFICE Suite, Visio, Project
Languages: SQL, PL/SQL, Shell Scripting
Database Tools: Toad, SQL Developer
Data Modeling: Erwin 7.2/4.1.4, Power Designer 12.5.
Scheduling Tool: Control-M.
Other Tools: Rational ClearCase version control.
Confidential, Fremont, CA
- Module Lead for Data ingestion. Involved in the overall architecture design for the system.
- Prepared system architecture and component design documentation and performed code reviews
- Used Agile development process and practices
- Compile and validate data; reinforce and maintain compliance with corporate standards.
- Developed a Sqoop incremental import job, shell script, and cron job for importing data into HDFS
- Imported data from HDFS into Hive using Hive commands
- Created Hive partitions on dates and stocks for the imported data
- Developed a PySpark script that dynamically downloads Amazon S3 data files into HDFS.
- Developed a notification system that pushes notifications to a Flume agent when data is added to MySQL, and wrote MySQL triggers for the notification system.
- Created and configured the Flume agent with appropriate sources, channels, and sinks to load the data into HDFS
- Created PySpark RDDs for data transformation
- Proficient in SQL queries and triggers
- Implemented incremental import for S3 CSV files
- Worked with structured and unstructured data from RDBMS and CSV sources.
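The incremental-import pattern above can be sketched as the kind of Sqoop invocation a cron-driven shell script would run. This is a minimal sketch: the connection string, table, and column names are hypothetical placeholders, not the actual project values.

```python
# Sketch of a Sqoop incremental-import invocation assembled in Python.
# All connection details, table, and column names are hypothetical.

def build_sqoop_incremental_import(jdbc_url, table, check_column, last_value, target_dir):
    """Assemble a `sqoop import` command in incremental append mode.

    `--incremental append` re-imports only rows whose `--check-column`
    value exceeds `--last-value`, so a scheduled job avoids re-loading
    the full table on every run."""
    return " ".join([
        "sqoop import",
        f"--connect {jdbc_url}",
        f"--table {table}",
        "--incremental append",
        f"--check-column {check_column}",
        f"--last-value {last_value}",
        f"--target-dir {target_dir}",
    ])

cmd = build_sqoop_incremental_import(
    "jdbc:mysql://dbhost/stocks",  # hypothetical source database
    "daily_prices",                # hypothetical table
    "trade_date",                  # monotonically increasing column
    "2017-06-30",                  # high-water mark from the previous run
    "/user/etl/daily_prices",      # HDFS landing directory
)
print(cmd)
```

In a real setup the command would live in the shell script and be scheduled by cron, with the `--last-value` high-water mark persisted between runs.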
Confidential, San Ramon, CA
Data Quality Analyst
- Extensive experience running cross-platform testing using SDLC methods in an Agile environment; organized and performed integration and regression test cases for different groups, with and without SQL; used SQL for data validation and backend testing; produced daily and monthly reports in MicroStrategy and SSRS.
- Create, execute, and validate test cases, and document the process, to perform functional testing of the application.
- Work with the systems engineering team to deploy and test new Hadoop environments and expand existing Hadoop clusters; prepare test data for testing flows to validate positive and negative cases.
- Transfer data to relational databases using Sqoop so the BI team can visualize it and generate reports.
- Extensive experience with Hive DDL and DML statements and analysis of structured and unstructured data, helping analysts spot emerging trends by comparing fresh data with historical claim metrics.
- Create reports for the BI team, using Sqoop to import data into HDFS and Hive.
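The Hive DDL work above typically centers on date-partitioned tables so that fresh data can be compared against historical partitions. A minimal sketch of generating such a DDL statement follows; the table and column names are illustrative, not the actual project schema.

```python
# Sketch of a Hive partitioned-table DDL, of the kind used to compare
# fresh loads against historical claim metrics. Illustrative schema only.

def partitioned_table_ddl(table, columns, partition_cols):
    """Build a HiveQL CREATE TABLE statement partitioned on the given
    columns, so each load lands in its own partition directory."""
    cols = ",\n  ".join(f"{name} {ctype}" for name, ctype in columns)
    parts = ", ".join(f"{name} {ctype}" for name, ctype in partition_cols)
    return (
        f"CREATE TABLE IF NOT EXISTS {table} (\n  {cols}\n)\n"
        f"PARTITIONED BY ({parts})\n"
        "STORED AS ORC"
    )

ddl = partitioned_table_ddl(
    "claims",  # hypothetical table name
    [("claim_id", "BIGINT"), ("amount", "DECIMAL(12,2)"), ("status", "STRING")],
    [("load_date", "STRING")],  # one partition per daily load
)
print(ddl)
```

Partitioning on the load date lets analysts query only the fresh partition and join it against older ones, rather than scanning the whole table.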
Confidential, San Jose, CA
Product Management Intern / Product Analyst
- Work extensively in an Agile environment with leadership teams to develop and execute a cohesive online advertising strategy, generate revenue growth, and ensure operational excellence.
- Worked closely with the CTO, helping the company scale to hundreds of millions of impressions per day. Built a scalable, cost-effective, and fault-tolerant data warehouse system on Amazon EC2. Developed MapReduce/EMR jobs to analyze the data and provide heuristics and reports; the heuristics were used to improve campaign targeting and efficiency.
- The objective of this BI project was to consolidate the data collection and monitoring processes for the GBS Learning & Knowledge organization, integrating data from 50 different data sources dispersed worldwide.
- The system currently collects over 100,000 records per month from disparate systems providing decision support and strategic data to the GBS leadership.
- ETL (Extract, Transform, and Load) jobs were designed, developed, and implemented using the Confidential InfoSphere server and Informatica, which support a variety of storage systems and data formats ranging from XML and text files to Oracle databases.
- Applied knowledge of Informatica BDE. Because of the disparity in data sources, jobs were developed using a range of stages such as Transformer, FTP, Join, Merge, Lookup, Sort, Change Capture, Switch, Data Set, File Set, External Source, Sequential File, Peek, Tail, and Row and Column Generator. Extensive use of SQL queries was needed.
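Several of the stages listed above (Lookup, Join, Merge) share one idea: enrich a stream of input rows from a keyed reference table. A minimal plain-Python sketch of that Lookup-stage behavior, using toy in-memory data rather than the actual DataStage job:

```python
# Conceptual sketch of a DataStage Lookup stage: each input row is
# enriched from a reference table keyed on a join column. Toy data only.

def lookup_stage(rows, reference, key):
    """Yield each input row merged with its matching reference record.

    Rows with no match pass through unchanged, mirroring a 'continue'
    lookup-failure rule rather than dropping or rejecting the row."""
    ref_index = {r[key]: r for r in reference}  # build lookup index once
    for row in rows:
        enriched = dict(row)
        match = ref_index.get(row[key])
        if match:
            enriched.update({k: v for k, v in match.items() if k != key})
        yield enriched

# Hypothetical input stream and reference data.
orders = [{"cust_id": 1, "amount": 40}, {"cust_id": 2, "amount": 15}]
customers = [{"cust_id": 1, "name": "Acme"}, {"cust_id": 2, "name": "Globex"}]
result = list(lookup_stage(orders, customers, "cust_id"))
```

A Join stage differs mainly in that both inputs are sorted streams; the enrichment result is the same for matched keys.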
Technical Lead - DataStage
- Single point of contact for a transitioning project on DataStage 8.5 and led the core transition team for an onsite engagement (Confidential, Paris).
- Full life-cycle project management of an Oracle/SQL Server database data integration using Confidential Change Data Capture (CDC) 6.5.2 and ETL Confidential InfoSphere DataStage 8.7, both integrated with Confidential MQ.
- Customer data integration project using ETL Confidential InfoSphere DataStage, Quality Stage, and Information Analyzer 8.7 with Confidential Change Data Capture 6.5.2. Proofs of concept, Information Server 8.5 installation, and best-practices guidance.
- Finalized the project architecture along with the senior architects.
- ETL lead and solution architect for a data warehouse project using DB2 and Oracle databases.
- Best-practices guidance for the development team. Information Server 8.5 cluster topology installation. Confidential DataStage development. DataStage and Service Director proofs of concept, performance tuning, enhancement, and troubleshooting.
- Extensively worked on design and development phase and testing phase of BDI project.
- Data warehouse development using the ETL tool Confidential Information Server 7.5 integrated with Oracle and DB2; performance analyst and developer.
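The Change Data Capture work above boils down to comparing a before and an after image of a table keyed on a primary key and classifying each row as an insert, update, or delete. A minimal Python sketch of that classification (toy rows, not the Confidential CDC product's actual interface):

```python
# Conceptual sketch of change-data-capture classification, as a Change
# Capture stage would perform it: diff two snapshots on a key column
# and tag each difference with its change type. Toy data only.

def capture_changes(before, after, key):
    """Return (change_type, row) pairs describing how `after` differs
    from `before`, with rows matched on the `key` column."""
    before_idx = {row[key]: row for row in before}
    after_idx = {row[key]: row for row in after}
    changes = []
    for k, row in after_idx.items():
        if k not in before_idx:
            changes.append(("insert", row))       # new key appeared
        elif row != before_idx[k]:
            changes.append(("update", row))       # key exists, values changed
    for k, row in before_idx.items():
        if k not in after_idx:
            changes.append(("delete", row))       # key disappeared
    return changes

# Hypothetical account snapshots before and after a load.
before = [{"id": 1, "bal": 100}, {"id": 2, "bal": 50}]
after = [{"id": 1, "bal": 120}, {"id": 3, "bal": 10}]
changes = capture_changes(before, after, "id")
```

Downstream, each change type would drive a different target action (e.g. upsert vs. delete), which is how a CDC feed keeps a warehouse in sync without full reloads.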