ETL/Hadoop Developer Resume
Philadelphia, PA
SUMMARY:
- Over 15 years of experience in data warehousing, analytics, and ETL processes across business domains such as retail, manufacturing, insurance, and banking.
- Proficient in the Apache Hadoop ecosystem: YARN, Spark, Pig, Hive, Flume, Sqoop, HBase, Zookeeper, and Impala, with a strong understanding of HDFS and MapReduce architecture on Cloudera and Hortonworks distributions.
- Strong data warehousing ETL experience using Informatica PowerCenter 9.x/8.x/7.x tools.
- Experience in using cloud components and connectors to push/pull data from different cloud storage services.
- Strong knowledge of ER modeling and dimensional data modeling methodologies such as star schema and snowflake schema.
TECHNICAL SKILLS:
Big Data Ecosystems: Hadoop, MapReduce, HDFS, HBase, Pig, Hive, Sqoop, Spark, YARN, Storm, Kafka, Zookeeper, Flume, Hue, Oozie, MRUnit, Impala.
Programming Languages: Java, C, SQL, Scala, Pig Latin, HiveQL, Shell Scripting, Python.
Database and Tools: MySQL, SQLite, Oracle, Teradata, MS SQL, MongoDB, Cassandra, NoSQL, DB Visualizer, SQL Developer, MySQL Workbench.
ETL Tools: Informatica PowerCenter 7.x/8.x/9.x, Big Data Edition, SSIS, DTS.
Scheduling Tools: Control-M, AutoSys, IBM TWS
Visualization/Reporting: Tableau, Kibana, Zeppelin, Pentaho, Talend.
Web Technologies: Spring, Hibernate, JSP, JavaScript, HTML, XML, JSON, Web-Services.
Dev and Build Tools: Maven, Ant, Eclipse, Scala IDE, Jira, BitBucket, SVN, GIT, Telnet, Jenkins.
Methodologies and Tools: Waterfall, Agile (Scrum and Kanban), MS Project.
PROFESSIONAL EXPERIENCE:
Confidential, Philadelphia, PA
ETL/Hadoop Developer
Responsibilities:
- Built a Hadoop and Informatica based ETL and analytics system providing insights into customers' usage of Lutron products across different product lines, driving future enhancements and improvements in business and services.
- Developed data pipelines using Spark, Kafka, Hive, Pig, and HBase to ingest customer system-usage data and financial histories into the Hadoop cluster for analysis.
- Developed Scala scripts and UDFs, using both DataFrames/SQL and RDDs/MapReduce in Spark, for data aggregation, writing the results back to S3 through Sqoop.
- Extensively used Informatica to create data-ingestion jobs into HDFS using complex file formats such as Avro and Parquet, and to evaluate dynamic mapping capabilities.
- Implemented data quality rules using Informatica Data Quality (IDQ) to check the correctness of source files and perform data cleansing/enrichment.
- Analyzed daily log-record data and built hourly and daily aggregated reports using Tableau.
- Environment: Hadoop 2.7, Informatica 9.x, Hive 1.2.1, Spark 1.6, Teradata, Oracle, EC2, S3.
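The aggregation step in the pipeline above was written in Scala on Spark; purely as an illustrative sketch, the same reduce-by-key logic looks like this in plain Python (the record fields and values are hypothetical, not from the actual system):

```python
from collections import defaultdict

# Hypothetical usage records; in the actual pipeline these arrived via
# Kafka and were aggregated as Spark DataFrames/RDDs in Scala.
events = [
    {"product_line": "dimmers", "usage_hours": 4.0},
    {"product_line": "shades",  "usage_hours": 2.5},
    {"product_line": "dimmers", "usage_hours": 1.5},
]

def aggregate_usage(records):
    """Sum usage hours per product line (a reduce-by-key aggregation)."""
    totals = defaultdict(float)
    for rec in records:
        totals[rec["product_line"]] += rec["usage_hours"]
    return dict(totals)
```

In Spark the same shape would be a `groupBy("product_line").agg(sum(...))` over a DataFrame.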
Confidential
Hadoop Developer
Responsibilities:
- Worked with highly unstructured and semi-structured data of 100+ TB in size.
- Developed Pig and Hive scripts for end users, analysts, and product managers to support their ad-hoc analysis requirements.
- Used Informatica to validate and test the business logic implemented in the mappings and fix bugs; developed reusable Mapplets and Transformations.
- Managed external tables in Hive, loaded via Sqoop jobs, for optimized performance.
- Solved performance issues in Hive and Pig scripts with an understanding of how joins, grouping, and aggregation translate to MapReduce jobs.
- Explored Spark to improve the performance and optimization of existing Hadoop algorithms using Spark context, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
- Worked in a Kerberos-secured Hadoop environment supported by the Cloudera team.
- Environment: 32-node Hadoop 2.6 cluster, Informatica 9.x, HDFS, Flume 1.5, Sqoop 1.4.3, Hive 1.0.1, Spark 1.4, HBase, XML, JSON, Teradata, Oracle, MongoDB, Cassandra.
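The Hive/Pig tuning above rests on how a join compiles to MapReduce: rows from both inputs are shuffled by the join key, then paired per key on the reduce side. A minimal Python sketch of that reduce-side join (input rows are illustrative):

```python
from collections import defaultdict

def reduce_side_join(left, right, key):
    """Group both inputs by join key (the 'shuffle'), then pair rows
    within each key group (the 'reduce'), as a MapReduce equi-join would."""
    buckets = defaultdict(lambda: ([], []))
    for row in left:
        buckets[row[key]][0].append(row)
    for row in right:
        buckets[row[key]][1].append(row)
    joined = []
    for _, (ls, rs) in buckets.items():
        for l in ls:
            for r in rs:
                joined.append({**l, **r})  # inner join: only matched keys emit
    return joined
```

Skew in the key distribution makes one reduce bucket huge, which is why join-key cardinality mattered for the tuning work.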
Confidential
Hadoop Developer
Responsibilities:
- Migrated 100+ TB of data from different databases (e.g., Oracle, SQL Server) to Hadoop.
- Wrote code in different applications of the Hadoop and Informatica ecosystem.
- Extensively involved in performance tuning of Informatica ETL mappings by using caches, overriding SQL queries, and using parameter files.
- Worked on various file formats (Avro, Parquet, and text, with Hive SerDes) using Snappy compression.
- Used custom Pig loaders to load different forms of data files such as XML, JSON, and CSV.
- Designed a dynamic partitioning mechanism in Hive for optimal query performance, reducing report-generation time to within SLA requirements.
- Environment: Hadoop 2.2, Informatica Power Center 9.x, HDFS, HBase, Flume 1.4, Sqoop 1.4.3, Hive 0.13.1, Avro 1.7.4, Parquet 1.4, XML, JSON, Oracle 11g, Amazon EC2, S3.
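Hive's dynamic partitioning, used above, routes each row into a `col=value` directory derived from its partition columns; the routing logic amounts to the following (table root and column names are illustrative, not from the actual system):

```python
def partition_path(table_root, row, partition_cols):
    """Build the Hive-style partition directory (col=value/...) that a
    dynamically partitioned INSERT would write this row into."""
    parts = [f"{col}={row[col]}" for col in partition_cols]
    return "/".join([table_root] + parts)
```

Queries filtering on the partition columns then prune to the matching directories instead of scanning the whole table, which is where the report-time reduction comes from.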
Confidential
ETL Developer
Responsibilities:
- Developed mappings and sessions to import, transform, and load data into target tables and flat files using Informatica PowerCenter.
- Automated Informatica ETL jobs for different ETL design patterns.
- Extensively used transformations such as Router, Aggregator, Source Qualifier, Joiner, Expression, and Sequence Generator via the Source Analyzer, Warehouse Designer, Mapping Designer, Mapplet Designer, and Transformation Developer.
- Environment: Informatica PowerCenter 9.x (Repository Manager, Designer, Workflow Manager, and Workflow Monitor), Oracle 11g, SeaQuest, HPDM, SQL Server, Teradata, Toad, Control-M.
Confidential
ETL Developer
Responsibilities:
- Extensively used the Slowly Changing Dimensions technique for updating dimensional schemas.
- Processed data using transformations such as Aggregator, Router, Expression, Source Qualifier, Filter, Lookup, Joiner, Sorter, XML Source Qualifier, and Web Services Consumer for WSDL.
- Used Informatica user-defined functions to reduce code dependency.
- Environment: Informatica PowerCenter 8.x, Informatica PowerConnect, PowerExchange, PowerAnalyzer, Toad, Erwin, Oracle 11g/10g, Teradata V2R5, PL/SQL, ODI, Trillium 11.
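The Slowly Changing Dimensions updates above follow the common Type 2 pattern: when a tracked attribute changes, the current dimension row is expired and a new version is appended. A minimal sketch of that logic, with hypothetical column names:

```python
def scd2_apply(dim_rows, incoming, key, tracked, load_date):
    """Expire the current dimension row and append a new version
    when a tracked attribute changes (SCD Type 2)."""
    current = next((r for r in dim_rows
                    if r[key] == incoming[key] and r["is_current"]), None)
    if current and all(current[c] == incoming[c] for c in tracked):
        return dim_rows  # nothing changed; keep history as-is
    if current:
        current["is_current"] = False   # close out the old version
        current["end_date"] = load_date
    dim_rows.append({**incoming, "start_date": load_date,
                     "end_date": None, "is_current": True})
    return dim_rows
```

In PowerCenter this is typically built from a Lookup on the dimension plus an Update Strategy transformation routing rows to insert vs. update.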
Confidential
ETL Developer
Responsibilities:
- Used SSIS as the Extract, Transform, Load (ETL) tool of SQL Server to populate data from various data sources, creating packages for the application's different data-loading operations.
- Made extensive use of Transact-SQL stored procedures and trigger scripts for creating database objects.
- Generated various reports using features such as group-by, drill-down, drill-through, sub-reports, and parameterized reports.
- Deployed new strategies for checksum calculation and exception population using Mapplets and Normalizer transformations.
- Environment: SQL Server 2005, T-SQL, SSIS/DTS Designer and reporting tools, Control-M.
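One common form of the checksum strategy above is a row-level hash over concatenated column values, used to detect changed or exceptional records on load; a sketch (the column names are illustrative, and the exact hashing scheme in the actual project is an assumption here):

```python
import hashlib

def row_checksum(row, columns):
    """MD5 over pipe-joined column values: a compact change-detection key.
    Missing columns hash as empty strings so row shapes can vary."""
    payload = "|".join(str(row.get(c, "")) for c in columns)
    return hashlib.md5(payload.encode("utf-8")).hexdigest()
```

Comparing the stored checksum against the incoming one replaces column-by-column comparison: equal checksums mean skip, differing checksums mean update (or route to the exception table).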
Confidential
Java Developer
Responsibilities:
- Developed web applications using the Spring MVC framework, including writing actions, classes, forms, custom tag libraries, and JSP pages.
- Worked on integration of the Spring and Hibernate frameworks using the Spring ORM module.
- Implemented caching techniques, wrote POJO classes for storing data and DAOs for retrieving it, and handled database configurations.
Confidential
Java Developer
Responsibilities:
- Implemented routing and shortest-path algorithms, along with parsing logic for device discovery using Heart-Beat.
- Implemented Java Native Interface (JNI) APIs for the Indus Mote to access devices dynamically through C code.
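The shortest-path work above is classically Dijkstra's algorithm (the actual routing variant used is not stated, so this is a generic sketch over a weighted adjacency-map graph):

```python
import heapq

def dijkstra(graph, source):
    """Shortest distances from source over a dict-of-dicts weighted graph,
    e.g. graph["a"]["b"] = 1 means an edge a -> b of weight 1."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry; a shorter path was already found
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist
```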