
Sr. Big Data/Hadoop Developer Resume


Chicago, IL

SUMMARY

  • Over 10 years of working experience as a Big Data/Hadoop Developer/Engineer, designing and developing applications using Big Data, Hadoop, and Java/J2EE open-source technologies.
  • Excellent knowledge of Hadoop architecture and ecosystem components such as HDFS, Job Tracker, Task Tracker, Name Node, Data Node, YARN, and the MapReduce programming paradigm.
  • Experienced with major components of the Hadoop ecosystem, including Hive, Sqoop, and Flume, with knowledge of the MapReduce/HDFS framework.
  • Hands-on programming experience in technologies such as Java, J2EE, HTML, XML, JSON, CSS, and Angular.js.
  • Expertise in loading data from different sources (Oracle, Teradata, and DB2) into HDFS using Sqoop and loading it into partitioned Hive tables.
  • Experienced with Amazon AWS cloud services including EC2, S3, EBS, ELB, AMI, Route53, Auto Scaling, CloudFront, CloudWatch, and Security Groups.
  • Experience in machine learning and data analytics on big data sets; hands-on experience developing Spark applications using Spark APIs such as Spark Core, Spark Streaming, Spark MLlib, and Spark SQL, and working with file formats such as Text, SequenceFile, Avro, ORC, JSON, and Parquet (a minimal PySpark sketch follows this list).
  • Good experience and knowledge of Amazon Web Services (AWS) concepts such as EMR and EC2, which provide fast and efficient processing for Teradata big data analytics.
  • Experienced with Apache Flume for collecting, aggregating, and moving large volumes of data from sources such as web servers and telnet sources.
  • Experience using various Hadoop distributions (Cloudera, Hortonworks, MapR, etc.) to fully implement and leverage new Hadoop features.
  • Expertise in data development on the Hortonworks HDP platform and Hadoop ecosystem tools such as Hadoop, HDFS, Spark, Zeppelin, Hive, HBase, Sqoop, Flume, Atlas, Solr, Pig, Falcon, Oozie, Hue, Tez, Apache NiFi, and Kafka.
  • Extensive experience with Talend ETL, database, data set, HBase, Hive, Pig, HDFS, and Sqoop components; generated metadata and created Talend ETL jobs and mappings to load data warehouses and data lakes.
  • Hands-on experience coding MapReduce/YARN programs in Java, Scala, and Python for analyzing big data.
  • Expertise in DevOps, release engineering, configuration management, and cloud infrastructure automation, including Amazon Web Services (AWS), Ant, Maven, Jenkins, Chef, and GitHub.
  • Strong experience working in Unix/Linux environments and writing shell scripts, with excellent knowledge and working experience in Agile and Waterfall methodologies.
  • Excellent knowledge of and experience with Hadoop architecture, including HDFS, Job Tracker, Task Tracker, Name Node, Data Node, and the MapReduce programming paradigm.
  • Good experience with object-oriented concepts in Python; integrated different data sources and performed data wrangling (cleaning, transforming, merging, and reshaping data sets) by writing Python scripts.
  • Good knowledge of Amazon Web Services: EC2, Redshift, S3, Elastic Load Balancer, CloudWatch, Auto Scaling, etc.
  • Expertise in writing Hadoop jobs for analyzing data using HiveQL (queries), Pig Latin (data flow language), and custom MapReduce programs in Java.
  • Expertise in web page development using JSP, HTML, JavaScript, jQuery, and Ajax, with strong working knowledge of front-end technologies including JavaScript frameworks and Angular.js.
  • Hands-on experience with NoSQL databases such as HBase, MongoDB, and Cassandra and relational databases such as Oracle, Teradata, DB2, and MySQL.
  • Proficient in developing MVC-pattern web applications using Struts, creating forms with Struts Tiles and validations with the Struts validation framework.
  • Experience deploying applications on application servers such as Apache Tomcat and WebSphere; responsible for pushing scripts to a GitHub version control repository and deploying code with Jenkins; experience with web-based UI development using jQuery, Ext JS, CSS, HTML, HTML5, XHTML, and JavaScript.
  • Experience with developer toolkits such as Force.com IDE, Force.com Ant Migration Tool, Eclipse IDE, and Maven.
  • Experience with front-end technologies such as HTML, CSS, HTML5, CSS3, and Ajax, and experience in data migration on Azure integrated with a GitHub repository and Jenkins.
  • Experienced in installing Kafka on a Hadoop cluster and configuring producers and consumers in Java to establish a connection from a Twitter source to HDFS filtered on popular hashtags.
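
The following is a minimal PySpark sketch of the loading pattern referenced above: reading raw JSON, reshaping it with DataFrame/Spark SQL functions, and writing the result as a partitioned Hive table. The paths, database, table, and column names (analytics.events, event_ts, and so on) are illustrative assumptions, not details from any specific engagement.

  # pyspark_partitioned_load.py - illustrative sketch only; paths and names are assumptions
  from pyspark.sql import SparkSession
  from pyspark.sql import functions as F

  spark = (SparkSession.builder
           .appName("partitioned-hive-load")
           .enableHiveSupport()          # needed so saveAsTable writes to the Hive metastore
           .getOrCreate())

  # Read raw JSON events (could equally be Avro, ORC, or text with a schema)
  raw = spark.read.json("hdfs:///data/raw/events/")

  # Light reshaping with DataFrame/Spark SQL functions
  events = (raw
            .withColumn("event_date", F.to_date("event_ts"))
            .select("event_id", "user_id", "event_type", "event_date"))

  # Write as a partitioned table so downstream Hive queries can prune partitions
  (events.write
         .mode("overwrite")
         .partitionBy("event_date")
         .format("parquet")
         .saveAsTable("analytics.events"))   # assumes the analytics database exists

  spark.stop()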

TECHNICAL SKILLS

Hadoop Ecosystem: Hadoop 3.0, HDFS, MapReduce, Hive 2.3, Impala 2.10, Apache Pig 0.17, Sqoop 1.4, Oozie 4.3, Zookeeper 3.4, Flume 1.8, Kafka 1.0.1, Spark, Spark SQL, Spark Streaming, AWS, Azure Data Lake, NoSQL.

Application Server: WebSphere, WebLogic, JBoss, Apache Tomcat

Databases: HBase 1.2, Cassandra 3.11, MongoDB 3.6, MySQL 8.0, SQL Server 2016, Oracle 12c

IDE: Eclipse, NetBeans, MySQL Workbench.

Agile Tools: Jira, Jenkins, Scrum

Build Management Tools: Maven, Apache Ant

Java & J2EE Technologies: Core Java, Servlets, JSP, JDBC, JNDI, Java Beans

Languages: C, C++, Java, SQL, PL/SQL, Pig Latin, HiveQL, UNIX shell scripting

Frameworks: MVC, Spring, Hibernate, Struts 1/2, EJB, JMS, JUnit, MR-Unit

Version Control: GitHub, Jenkins

Methodology: RAD, RUP, UML, System Development Life Cycle (SDLC), Waterfall Model.

PROFESSIONAL EXPERIENCE

Sr. Big Data/Hadoop Developer

Confidential, Chicago IL

Responsibilities:

  • Involved in gathering requirements from the client and estimating the timeline for developing complex Hive queries for a logistics application; identified data sources, created source-to-target mappings, estimated storage, provided support for Hadoop cluster setup, and handled data partitioning.
  • Defined the application architecture and design for the Big Data Hadoop initiative to maintain structured and unstructured data, and created the reference architecture for the enterprise.
  • Worked with the cloud provisioning team on capacity planning and sizing of the nodes (master and slave) for an AWS EMR cluster.
  • Responsible for cluster maintenance, monitoring, commissioning and decommissioning data nodes, troubleshooting, managing and reviewing data backups, and managing and reviewing log files.
  • Worked with Amazon EMR to process data directly in S3 and to copy data from S3 to the Hadoop Distributed File System (HDFS) on the Amazon EMR cluster, setting up Spark Core for analysis work.
  • Involved in the complete SDLC and daily Scrum (Agile), including design of the system architecture and development of system use cases based on the functional requirements.
  • Analyzed the existing data flow to the warehouses and took a similar approach to migrate the data into HDFS; created partitioning, bucketing, map-side joins, and parallel execution to optimize Hive queries, decreasing execution time from hours to minutes.
  • Responsible for creating an instance on Amazon EC2 (AWS) and deploying the application on it.
  • Worked on Spark architecture and how RDDs work internally, processing data from local files, HDFS, and RDBMS sources by creating RDDs and optimizing for performance.
  • Explored Spark to improve the performance and optimization of existing algorithms in Hadoop using Spark Context, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
  • Extensively used Talend Big Data to build the data lake on Hadoop and design an efficient disaster recovery mechanism; ensured efficient recovery and a low-latency environment by migrating to Hadoop servers.
  • Developed Spark code using Scala and Spark SQL for faster processing and testing, and worked on creating real-time data streaming solutions using Apache Spark/Spark Streaming and Kafka.
  • Involved in a data pipeline using Pig and Sqoop to ingest cargo data and customer histories into HDFS for analysis.
  • Imported data from different sources such as AWS S3 and the local file system into Spark RDDs, and worked with Amazon Web Services in the cloud (EMR, S3, EC2, Lambda).
  • Involved in ingesting data into HDFS using Apache NiFi; developed and deployed Apache NiFi flows across various environments, optimized NiFi data flows, and wrote QA scripts in Python for tracking missing files.
  • Imported and exported terabytes of data using Sqoop and real-time data using Flume and Kafka, and wrote programs in Spark using Scala and Python for data quality checks.
  • Worked on importing data from MySQL to HDFS and vice versa using Sqoop, configured the Hive metastore with MySQL to store the metadata for Hive tables, and worked with NoSQL databases such as HBase, creating HBase tables to load large sets of semi-structured data coming from various sources.
  • Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs, Python, and Scala; used the Spark API over Cloudera Hadoop YARN to perform analytics on data in Hive; and created Hive tables, loaded them with data, and wrote Hive queries that invoke and run MapReduce jobs in the backend.
  • Responsible for loading customer data and event logs from Kafka into HBase using a REST API, and created custom Scala UDFs for Spark and Kafka procedures to cover functionality that was not working in the existing UDFs in the production environment.
  • Developed workflows in Oozie and scheduled jobs on mainframes, preparing the data refresh strategy and capacity planning documents required for project development and support; worked with different Oozie actions (Sqoop, Pig, Hive, and shell actions) to design workflows.
  • Implemented Kafka consumers to move data from Kafka partitions into Cassandra for near-real-time analysis (see the sketch after this list).
  • Ingested all formats of structured and unstructured data, including logs/transactions and relational databases, into HDFS using Sqoop and Flume; involved in collecting and aggregating large amounts of log data using Flume and staging the data in HDFS for further analysis.
  • Used Enterprise Data Warehouse (EDW) architecture and various data modeling concepts such as star schema and snowflake schema in the project.
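
A minimal Python sketch of the Kafka-to-Cassandra consumer pattern mentioned above; it assumes the kafka-python and cassandra-driver client libraries, and the topic name, keyspace, and table schema are hypothetical.

  # kafka_to_cassandra.py - hedged sketch; topic, keyspace, and schema are assumptions
  import json
  from kafka import KafkaConsumer            # pip install kafka-python
  from cassandra.cluster import Cluster      # pip install cassandra-driver

  consumer = KafkaConsumer(
      "customer-events",                     # hypothetical topic name
      bootstrap_servers=["broker1:9092"],
      group_id="cassandra-writer",
      value_deserializer=lambda v: json.loads(v.decode("utf-8")),
      auto_offset_reset="earliest",
  )

  cluster = Cluster(["cassandra1"])
  session = cluster.connect("analytics")     # hypothetical keyspace

  insert = session.prepare(
      "INSERT INTO customer_events (customer_id, event_ts, payload) VALUES (?, ?, ?)"
  )

  # Each Kafka record becomes one Cassandra row available for near-real-time querying
  for msg in consumer:
      event = msg.value
      session.execute(insert, (event["customer_id"], event["event_ts"], json.dumps(event)))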

Environment: AWS S3, EMR, Python 3.6, PySpark, Scala, Hadoop 3.0, MapReduce, Hive 2.3, Impala, Sqoop 1.4, Spark 2.2, Spark SQL, Spark Streaming, Airflow, Jenkins, Git, Bitbucket, R 3.4, Tableau, Oozie, Flume, AWS EC2, Lambda, MongoDB, HDFS, Pig, Unix Shell Scripting, Kafka, HBase.

Sr. Big Data/Hadoop Developer/Engineer

Confidential, Dallas TX

Responsibilities:

  • Involved in Agile methodologies, daily scrum meetings, and sprint planning; wrote scripts for distributing queries for performance-test jobs in the Amazon data lake.
  • Created Hive tables, loaded transactional data from Teradata using Sqoop, and worked with highly unstructured and semi-structured data of 2 petabytes in size.
  • Developed MapReduce (YARN) jobs for cleaning, accessing, and validating the data, and created and worked with Sqoop jobs with incremental loads to populate Hive external tables.
  • Developed optimal strategies for distributing the web log data over the cluster, importing and exporting the stored web log data into HDFS and Hive using Sqoop.
  • Installed and configured Apache Hadoop on multiple nodes on AWS EC2, and developed Pig Latin scripts to replace the existing legacy process with Hadoop, with the data fed to AWS S3.
  • Responsible for building scalable distributed data solutions using Cloudera Hadoop, and designed and developed automation test scripts using Python.
  • Integrated Apache Storm with Kafka to perform web analytics and move clickstream data from Kafka to HDFS.
  • Analyzed the SQL scripts and designed the solution to implement them using PySpark, and implemented Hive generic UDFs to incorporate business logic into Hive queries.
  • Responsible for developing a data pipeline on Amazon AWS to extract data from weblogs and store it in HDFS.
  • Uploaded streaming data from Kafka to HDFS, HBase, and Hive by integrating with Storm, and wrote Pig scripts to transform raw data from several data sources into baseline data.
  • Analyzed the web log data using HiveQL to extract the number of unique visitors per day, page views, visit duration, and the most visited pages on the website.
  • Supported data analysis projects using Elastic MapReduce on the Amazon Web Services (AWS) cloud and performed export and import of data into S3.
  • Worked on MongoDB using CRUD operations (Create, Read, Update, and Delete), indexing, replication, and sharding features (a small sketch follows this list).
  • Involved in designing the row key in HBase to store text and JSON as key values in HBase tables, designing the row key so data can be retrieved/scanned in sorted order.
  • Integrated Oozie with the rest of the Hadoop stack, supporting several types of Hadoop jobs out of the box (such as MapReduce, Pig, Hive, and Sqoop) as well as system-specific jobs (such as Java programs and shell scripts).
  • Worked on custom Talend jobs to ingest, enrich, and distribute data in the Cloudera Hadoop ecosystem.
  • Created Hive tables and worked on them using HiveQL, and designed and implemented partitioning (static and dynamic) and bucketing in Hive.
  • Developed multiple POCs using PySpark and deployed them on the YARN cluster, compared the performance of Spark with Hive and SQL, and was involved in the end-to-end implementation of ETL logic.
  • Used Spark Streaming APIs to perform the necessary transformations and actions on the fly to build the common learner data model, which gets data from Kafka in near real time and persists it to Cassandra.
  • Developed syllabus/curriculum data pipelines from syllabus/curriculum web services to HBase and Hive tables.
  • Worked on cluster coordination services through Zookeeper, and monitored workload, job performance, and capacity planning using Cloudera Manager.
  • Involved in building applications using Maven and integrating with CI servers such as Jenkins to build jobs.
  • Configured, deployed, and maintained multi-node Dev and Test Kafka clusters, and implemented data ingestion and cluster handling for real-time processing using Kafka.
  • Exported the analyzed data to an RDBMS using Sqoop to generate reports for the BI team; worked collaboratively with all levels of business stakeholders to architect, implement, and test a Big Data analytical solution from disparate sources; and exported the analyzed data to databases such as Teradata, MySQL, and Oracle using Sqoop for visualization and BI reporting.
  • Created cubes in Talend to build different types of aggregations on the data and to visualize them.
  • Monitored Hadoop Name Node health status, the number of Task Trackers running, and the number of Data Nodes running, and automated all jobs from pulling data out of source systems such as MySQL to pushing the result sets into the Hadoop Distributed File System.
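
A small pymongo sketch of the MongoDB CRUD and indexing work referenced above; the connection string, database, collection, and field names are illustrative assumptions.

  # mongo_crud.py - minimal CRUD and indexing sketch with pymongo; names are illustrative
  from pymongo import MongoClient, ASCENDING

  client = MongoClient("mongodb://localhost:27017")
  orders = client["retail"]["orders"]        # hypothetical database/collection

  # Create: insert a document
  orders.insert_one({"order_id": 1001, "customer": "acme", "total": 250.0})

  # Read: query by field
  doc = orders.find_one({"order_id": 1001})

  # Update: modify an existing document
  orders.update_one({"order_id": 1001}, {"$set": {"total": 275.0}})

  # Delete
  orders.delete_one({"order_id": 1001})

  # Index to support frequent lookups (replication and sharding are cluster-side concerns)
  orders.create_index([("customer", ASCENDING)])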

Environment: Hive 2.3, Teradata r15, MapReduce, HDFS, Sqoop 1.4, AWS, Hadoop 3.0, Pig 0.17, Python 3.4, Kafka 1.1, Apache Storm, SQL scripts, data pipeline, HBase, JSON, Oozie, ETL, Zookeeper, Maven, Jenkins, RDBMS

Sr. Hadoop Developer

Confidential - Oakland, CA

Responsibilities:

  • Worked on a wide range of tasks related to a massive modernization effort (including the incorporation of a Hadoop Big Data platform, namely the Hortonworks Data Platform) for the Health Informatics program.
  • Wrote Hive queries for data analysis to meet the business requirements, and loaded and transformed large sets of structured, semi-structured, and unstructured data coming from UNIX, NoSQL, and a variety of portfolios.
  • Performed advanced procedures such as text analytics and processing using the in-memory computing capabilities of Spark with Scala.
  • Migrated data between RDBMS and HDFS/Hive with Sqoop, used partitioning and bucketing concepts in Hive, designed both managed and external tables in Hive for optimized performance, and used Sqoop to import and export data among HDFS, MySQL, and Hive.
  • Implemented discretization, binning, and data wrangling (cleaning, transforming, merging, and reshaping data frames) using Python, and developed and maintained Python ETL scripts to scrape data from external sources and load cleansed data into SQL Server.
  • Worked with the Spark ecosystem using Scala and Hive queries on different data formats such as text files and Parquet, and used Scala to convert Hive/SQL queries into RDD transformations in Apache Spark.
  • Handled importing of data from various data sources, performed transformations using Hive and MapReduce, and loaded the data into HDFS.
  • Involved in loading data from the UNIX/Linux file system to HDFS, and analyzed the data by running Hive queries and Pig scripts.
  • Worked on implementing the Spark Framework, a Java-based web framework, and designed and implemented Spark jobs to support distributed data processing.
  • Involved in managing and reviewing the Hadoop log files, and used Pig as an ETL tool to perform transformations, event joins, and some pre-aggregations before storing the data in HDFS.
  • Worked on Spark code using Scala and Spark SQL for faster processing and testing of data sets.
  • Implemented Spark scripts using Scala and Spark SQL to access Hive tables in Spark for faster processing of data.
  • Processed web server logs by developing multi-hop Flume agents using Avro sinks, loaded the data into MongoDB for further analysis, and extracted and restructured the data in MongoDB using the import and export command-line utilities.
  • Worked on Python files to load data from CSV, JSON, MySQL, and Hive sources into the Neo4j graph database (see the sketch after this list).
  • Managed and reviewed Hadoop and HBase log files, and worked on HBase, creating HBase tables to load large sets of semi-structured data coming from various sources.
  • Performed data analysis on HBase using Hive external tables, and exported the analyzed data to HBase using Sqoop to generate reports for the BI team.
  • Imported data from relational databases to the Hadoop cluster using Sqoop, and developed Hive queries to process the data and generate data cubes for visualization.
  • Responsible for building scalable distributed data solutions using Hadoop; created, dropped, and altered tables at runtime without blocking updates and queries, using HBase and Hive.
  • Wrote Flume configuration files for importing streaming log data into HBase.
  • Imported logs from web servers with Flume to ingest the data into HDFS; used Flume with a spool directory to load data from the local system to HDFS, and developed UNIX shell scripts to load large numbers of files into HDFS from the Linux file system.
  • Installed and configured Pig and wrote Pig Latin scripts to convert data from text files to Avro format; developed MapReduce programs in Java to parse the raw data and populate staging tables; and created partitioned Hive tables, worked on them using HiveQL, and loaded data into HBase using bulk and non-bulk loads.
  • Worked with Tableau, integrated Hive with Tableau Desktop reports, and published them to Tableau Server.
  • Used Zookeeper to coordinate the servers in clusters and to maintain data consistency, and developed interactive shell scripts for scheduling various data cleansing and data loading processes.
  • Worked in an Agile development environment following the Kanban methodology, and was actively involved in daily scrums and other design-related meetings.
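
A hedged sketch of the Python-to-Neo4j loading described above, using the neo4j Python driver; the Bolt URI, credentials, node label, and CSV columns are assumptions for illustration.

  # csv_to_neo4j.py - sketch of loading CSV rows into Neo4j; URI, label, and fields assumed
  import csv
  from neo4j import GraphDatabase            # pip install neo4j

  driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

  def load_customers(path):
      with driver.session() as session, open(path, newline="") as f:
          for row in csv.DictReader(f):
              # MERGE keeps the load idempotent if the script is re-run
              session.run(
                  "MERGE (c:Customer {id: $id}) SET c.name = $name, c.city = $city",
                  id=row["id"], name=row["name"], city=row["city"],
              )

  load_customers("customers.csv")
  driver.close()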

Environment: MapReduce, PIG Latin, Hive 1.9, Apache Crunch, Spark, Scala, HDFS, HBase, Core Java, J2EE, Eclipse, Sqoop, Impala, Flume, Oozie, MongoDB 3.0, Jenkins, Agile Scrum methodology

Hadoop Developer

Confidential, GA

Responsibilities:

  • Developed the business solution to make data-driven decisions on the best ways to acquire customers and provide them with business solutions.
  • Actively participated in every stage of the Software Development Lifecycle (SDLC) of the project.
  • Designed and developed the user interface using JSP, HTML, and JavaScript for a better user experience.
  • Exported analyzed data to downstream systems using Sqoop to RDBMS for generating end-user reports, business analysis reports, and payment reports.
  • Participated in developing different UML diagrams such as class diagrams, use case diagrams, and sequence diagrams.
  • Involved in developing the UI (user interface) using HTML, CSS, JSP, jQuery, Ajax, and JavaScript.
  • Designed dynamic client-side JavaScript code to build web forms and simulate processes for the web application, page navigation, and form validation.
  • Imported and exported data jobs to copy data to and from HDFS using Sqoop.
  • Integrated data received from various providers into the destination on HDFS using Sqoop for analysis and data processing.
  • Managed the clustering environment on the Hadoop platform and worked with Pig, the NoSQL database HBase, and Sqoop for analyzing the Hadoop cluster as well as big data.
  • Managed data ingestion using Kafka, and wrote and implemented Apache Pig scripts to load data from and store data into Hive.
  • Assisted the admin in extending and setting up nodes on the cluster.
  • Implemented the NoSQL database HBase and managed the other tools and processes observed running on YARN.
  • Wrote Hive UDFs to extract data from staging tables and analyzed the web log data using HiveQL.
  • Used multi-threading and clustering concepts for data processing, managed the clustering and design, and debugged issues when they arose.
  • Involved in creating Hive tables, loading data, and writing Hive queries, which run MapReduce jobs in the backend; partitioning and bucketing were applied where required.
  • Used Zookeeper for various types of centralized configuration, tested the data coming from the source before processing, and resolved any problems faced.
  • Developed programs to parse the raw data, populate staging tables, and store the refined data in partitioned tables.
  • Created Hive queries for the market analysts to analyze emerging data and compare it with fresh data in reference tables.
  • Involved in regular Hadoop cluster maintenance such as patching security holes and updating system packages.
  • Tested raw data, executed performance scripts, and shared responsibility for administration of Hive and Pig.
  • Worked with Apache Tomcat for deploying and testing the application, and worked with different file formats such as text files, Sequence Files, and Avro.
  • Wrote a Java program to retrieve data from HDFS and provide REST services, and used automation tools such as Maven (a Python analogue is sketched after this list).
  • Used the Spring framework to provide RESTful services, and provided design recommendations and thought leadership to stakeholders that improved review processes and resolved technical problems.
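
As a rough Python analogue of the Java/Spring REST service over HDFS described above (not the original implementation), the sketch below exposes a file stored in HDFS through a small Flask endpoint using the hdfs WebHDFS client; the NameNode address, user, and report path are assumptions.

  # hdfs_rest_service.py - Python analogue (Flask + WebHDFS) of exposing HDFS data over REST
  from flask import Flask, jsonify
  from hdfs import InsecureClient            # pip install hdfs

  app = Flask(__name__)
  client = InsecureClient("http://namenode:9870", user="hadoop")   # assumed NameNode web address

  @app.route("/reports/<name>")
  def get_report(name):
      # Stream a small text file out of HDFS and return its lines as JSON
      with client.read(f"/reports/{name}.txt", encoding="utf-8") as reader:
          lines = reader.read().splitlines()
      return jsonify({"report": name, "lines": lines})

  if __name__ == "__main__":
      app.run(port=8080)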

Environment: Eclipse, Hadoop 2.8, Hive 1.5, HBase, Linux, MapReduce, Pig 0.15, HDFS, Oozie, Shell Scripting, MySQL, ETL, Data Warehousing, SQL.

Sr. ETL Developer

Confidential - Eden Prairie, MN

Responsibilities:

  • Participated in requirement analysis with the help of the business model and functional model, and wrote documentation to describe program development, logic, coding, testing, changes, and corrections.
  • Created complex mappings using various transformations such as Transaction Control and SQL transformations.
  • Wrote PL/SQL stored procedures, triggers, and cursors for implementing business rules and transformations, and created complex T-SQL queries and functions.
  • Provided support to develop the entire warehouse architecture and planned the ETL process.
  • Extracted data from flat files, XML files, and Oracle, and applied business logic to load them into the central Oracle database.
  • Involved in the migration of maps from IDQ to Power Center, and applied rules and profiled the source and target tables' data using IDQ.
  • Developed and maintained ETL (Extract, Transform, and Load) mappings to extract data from multiple source systems such as Oracle, SQL Server, and flat files and load it into Oracle.
  • Performance-tuned various mappings, sources, targets, and transformations by optimizing caches for lookup, joiner, rank, aggregator, and sorter transformations; tuned Informatica session performance for data files by increasing buffer block size, data cache size, and sequence buffer length; and used optimized target-based commit intervals and pipeline partitioning to speed up mapping execution time.
  • Populated and refreshed Teradata tables using FastLoad, MultiLoad, and FastExport utilities for user acceptance testing, and wrote SQL queries and PL/SQL procedures to perform database operations according to business requirements.
  • Created exclusive mappings in Informatica to load data from external sources into the landing tables of the MDM hub.
  • Developed mappings, reusable objects, transformations, and mapplets using Mapping Designer, Transformation Developer, and Mapplet Designer in Informatica Power Center.
  • Monitored and tuned the ETL repository and system for performance improvements, and created folders, users, repositories, and deployment groups using Repository Manager.
  • Extensively used Netezza utilities such as NZLOAD and NZSQL, and loaded data directly from Oracle to Netezza without any intermediate files.
  • Defined the content, structure, and quality of highly complex data structures using Informatica Data Explorer (IDE).
  • Implemented slowly changing dimensions to maintain current and historical information in dimension tables.
  • Generated SAP BusinessObjects reports involving complex queries, subqueries, unions, and intersections.
  • Primary activities included data analysis, identifying and implementing data quality rules in IDQ, and finally linking the rules to the Power Center ETL process and delivery to other data consumers.
  • Designed and developed the ETL strategy to populate the data warehouse from various source systems such as Oracle, Teradata, Netezza, flat files, XML, and SQL Server.
  • Responsibilities included designing and developing complex mappings using Informatica Power Center and Informatica Developer (IDQ), and extensive work on the Address Validator transformation in Informatica Developer (IDQ).
  • Generated queries using SQL to check for consistency of the data in the tables and to update the tables per the business requirements.
  • Created jobs and job streams in the Autosys scheduling tool to schedule Informatica, SQL script, and shell script jobs.
  • Implemented real-time Change Data Capture (CDC) for SalesForce.com (SFDC) sources using Informatica Power Center, and implemented slowly changing dimensions applying INSERT-else-UPDATE logic to target tables.
  • Designed complex mappings in Power Center Designer using Aggregator, Expression, Filter, Sequence Generator, Update Strategy, Union, Lookup, Joiner, XML Source Qualifier, and Stored Procedure transformations.
  • Proposed PL/SQL and UNIX shell scripts for scheduling the sessions in Informatica.
  • Created the mapping to load data from different base objects in MDM into a single flat structure in Informatica Developer.
  • Worked with the reporting team using the Business Objects BI interface on improving the business.

Environment: Informatica Power Center 9.3/5 (Power Center Designer, Teradata, workflow manager, workflow monitor), Oracle 11g, IDQ, SQL Server 2010, MDM, Teradata, PL/SQL, TOAD, Informatica Scheduler, Netezza, Teradata SQL Assistance, SQL, SSRS, UNIX, Shell Scripting, Autosys, Informatica IDQ, SAP, T-SQL

Environment: Informatica Power Center 8.6.1/9.1.0, Oracle 11g, SQL Server 2008, IBM DB2, MS Access, Windows XP, Toad, Tidal, SQL Developer

Java Developer

Confidential

Responsibilities:

  • A major part of the project involved migration from Informatica 7.1 to 8.1. This included training, migration of code, migration of data, documenting the migration process, and extensive testing and validation.
  • Interacted with the business community and gathered requirements based on changing needs. Incorporated identified factors into Informatica mappings to build data warehouses.
  • Developed a standard ETL framework to enable the reusability of similar logic across the board. Involved in system documentation of data flow and methodology.
  • Identified all the dimensions to be included in the target warehouse design and confirmed the granularity of the facts in the fact tables.
  • Analyzed the logical model of the databases, normalizing it when necessary, and was involved in identification of the fact and dimension tables.
  • Extensively used Informatica Power Center 7.1 for extracting, transforming, and loading data into different databases.
  • Wrote PL/SQL stored procedures and triggers for implementing business rules and transformations.
  • Developed transformation logic per the requirements, created mappings, and loaded data into the respective targets.
  • Created source and target definitions in the repository using Informatica Designer - Source Analyzer and Warehouse Designer.
  • Worked extensively on different types of transformations such as Source Qualifier, Expression, Filter, Aggregator, Rank, Lookup, Stored Procedure, and Sequence Generator, and used Mapping Designer to create mappings.
  • Implemented complex mappings such as Slowly Changing Dimensions (Type II).
  • Worked on dimensional modeling to design and develop star schemas using Erwin, identifying fact and dimension tables.
  • Stored reformatted data from relational, flat file, and XML sources using Informatica (ETL), and developed mappings to load data into slowly changing dimensions (a conceptual sketch follows this list).
  • Generated reports for the end client using query tools such as Cognos.
  • Replicated operational tables into staging tables to transform and load data into the enterprise data warehouse using Informatica.
  • Created and scheduled worklets and configured email notifications; set up workflows to schedule loads at the required frequency using Power Center Workflow Manager, and generated completion messages and status reports using Workflow Manager.
  • Involved in Performance Tuning at various levels including Target, Source, Mapping, and Session for large data files.
  • Used SQL tools such as TOAD to run SQL queries to view and validate the data loaded into the warehouse.
  • Documented data mappings/transformations per the business requirements.
  • Performed testing and knowledge transfer, and mentored other team members.
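
A conceptual sketch of the Type II slowly changing dimension behaviour described above, written in pandas purely for illustration (the actual work was done in Informatica mappings); the customer attributes, effective/end date columns, and load date are assumptions.

  # scd_type2_sketch.py - conceptual Type II slowly changing dimension logic in pandas
  import pandas as pd

  LOAD_DATE, HIGH_DATE = "2024-01-01", "9999-12-31"   # assumed load date / open-ended end date

  # Current dimension rows (one active version per customer)
  dim = pd.DataFrame([
      {"cust_id": 1, "city": "Dallas", "eff_date": "2020-01-01", "end_date": HIGH_DATE, "current": True},
  ])
  incoming = pd.DataFrame([{"cust_id": 1, "city": "Austin"}])   # source row with a changed attribute

  # Find incoming rows whose tracked attribute differs from the active dimension row
  merged = incoming.merge(dim[dim["current"]], on="cust_id", suffixes=("", "_old"))
  changed = merged[merged["city"] != merged["city_old"]]

  # Type II: expire the old version, then insert a new version with an open end date
  expire_mask = dim["cust_id"].isin(changed["cust_id"]) & dim["current"]
  dim.loc[expire_mask, "end_date"] = LOAD_DATE
  dim.loc[expire_mask, "current"] = False

  new_rows = changed[["cust_id", "city"]].assign(eff_date=LOAD_DATE, end_date=HIGH_DATE, current=True)
  dim = pd.concat([dim, new_rows], ignore_index=True)
  print(dim)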

Environment: Informatica Power Center 8.1/7.1, Oracle 10g, SQL, PL/SQL, TOAD, XML and Flat files, Cognos, Windows.
