
Big Data/Hadoop Consultant Resume


Alpharetta, GA

SUMMARY:

  • Around 8 years of IT experience in full System Development Life Cycle (Analysis, Design, Development, Testing, Deployment & Support) using Waterfall and Agile methodologies.
  • Expert in Big Data/Hadoop with strong skills in providing solutions to business problems using Big Data analytics.
  • Experience with Hadoop ecosystem components such as HDFS, MapReduce, Sqoop, Flume, Kafka, Hive, Impala, Pig, Oozie, ZooKeeper & HBase.
  • Good knowledge of Hadoop architecture, MR1 & MR2 (YARN).
  • Experience in importing and exporting data between RDBMS and HDFS using Sqoop.
  • Experience in using Flume to load log data from multiple sources directly into HDFS.
  • Experience working with Hive and Impala in creating Internal and External tables.
  • Experience in partitioning, bucketing and joining Hive tables for Hive query optimization.
  • Experience in writing Pig scripts.
  • Experience in building Oozie workflows to run multiple jobs.
  • Experience working with different file formats such as SequenceFile, Avro, Parquet and regular text files to process data.
  • Worked on Spark's real-time, in-memory processing engine using Python and Scala.
  • Technical and functional experience in data warehousing applications using Informatica Power Center.
  • Very good knowledge of OLTP/OLAP and dimensional modeling using star schema and snowflake schema.
  • Extensively worked on data extraction, transformation and loading with Oracle and SQL Server using Informatica.
  • Extensively worked with Informatica Designer, Repository Manager, Workflow Manager and Workflow Monitor.
  • Experience in creating transformations, mappings, mapplets and tasks, and in scheduling sessions.
  • Committed to excellence; a self-motivated, far-sighted team player with strong problem-solving skills and a zeal to learn new technologies.
  • Excellent communication, interpersonal and analytical skills, with the ability to work effectively in a fast-paced, high-volume, deadline-driven environment.
  • Adept at gathering and documenting requirements, assessing business objectives and managing all stages of software development and implementation.
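
The Spark-style transformation mentioned above (raw logs to key-value pairs, then aggregation) can be sketched in plain Python; the log format, field positions and function names below are illustrative assumptions, not the actual project code:

```python
from collections import Counter

def parse_line(line):
    """Split a raw access-log line into a (page, 1) key-value pair.

    The log format here is hypothetical: "<ip> <timestamp> <page>".
    """
    parts = line.split()
    return (parts[2], 1)

def page_hits(lines):
    """Aggregate hit counts per page: the reduce step of the pipeline."""
    counts = Counter()
    for page, n in map(parse_line, lines):
        counts[page] += n
    return dict(counts)

logs = [
    "10.0.0.1 2016-01-01T10:00 /home",
    "10.0.0.2 2016-01-01T10:01 /cart",
    "10.0.0.1 2016-01-01T10:02 /home",
]
print(page_hits(logs))  # {'/home': 2, '/cart': 1}
```

In Spark the same map and reduce steps would run distributed across the cluster; the per-record logic is the same shape.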

PROFESSIONAL EXPERIENCE:

Big Data/Hadoop Consultant

Confidential, Alpharetta, GA

Responsibilities:

  • Ingested e-commerce website data from MySQL into HDFS using Sqoop.
  • Ingested Apache log files in XML format into HDFS using Flume.
  • Processed advertisement data for website hits and web campaigns using Pig.
  • Simplified the logs into key-value pairs using Spark to convert them into structured data.
  • Processed the XML files using Spark with Scala (XML library), as Spark has no direct method to read XML data.
  • Converted the XML data into structured form for further processing.
  • Exported data from HDFS to RDBMS tables using Sqoop.
  • Created external Hive tables on datasets available in HDFS.
  • Performed data analysis using Hive and Impala queries.
  • Partitioned Hive tables to improve performance; also created tables in Hive and accessed them through Impala to speed up queries.
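
The XML-to-structured-data step described above was done in Spark with Scala's XML library; a minimal plain-Python sketch of the same conversion, using only the standard library and a hypothetical record layout:

```python
import xml.etree.ElementTree as ET

# Hypothetical record layout; the real Apache-log XML schema differed.
SAMPLE = """
<logs>
  <event><ip>10.0.0.1</ip><url>/checkout</url><status>200</status></event>
  <event><ip>10.0.0.2</ip><url>/search</url><status>404</status></event>
</logs>
"""

def xml_to_rows(xml_text):
    """Flatten each <event> element into a plain dict (one structured row)."""
    root = ET.fromstring(xml_text)
    return [
        {child.tag: child.text for child in event}
        for event in root.findall("event")
    ]

rows = xml_to_rows(SAMPLE)
print(rows[0])  # {'ip': '10.0.0.1', 'url': '/checkout', 'status': '200'}
```

Once the records are flat dicts (or rows), they can be written out as a structured format such as Parquet for downstream processing.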

Environment: Hadoop Cluster, HDFS, YARN, Sqoop, Flume, Spark, Scala (XML Library), Hive, Impala, Pig, Hue

Data Integration Analyst

Confidential, Connecticut

Responsibilities:

  • Created standards, guidelines and best practices documents for Informatica, SQL Server.
  • Designed/Developed reusable mappings to match patient/provider data based on demographic information and fuzzy matching.
  • Developed mapplets to build logic for phone, ZIP code and email address fields.
  • Used Salesforce lookups on target Salesforce data to match it with source data.
  • Created mappings sourced from flat files and loaded into SQL Server to stage the data; then used the staged tables as a source, cleansed and processed the data per requirements, and loaded it into Salesforce.
  • Involved in configuring Informatica, SQL Server and Salesforce.com connectivity and the Apex Data Loader environment from scratch.
  • Loaded Metadata from different sources into Metadata Manager warehouse.
  • Performed Impact Analysis using Metadata Manager before making any DDL changes.
  • Identified performance bottlenecks using Informatica log files and the verbose option, then improved performance using sorted input, pushdown optimization and loaders, as well as by tuning SQL queries with analytical functions.
  • Created workflows and scheduled them to run on a weekly and monthly basis.
  • Tested data in the Sandbox environment and moved tested workflows to the production environment.
  • Created Informatica Cloud On Demand tasks for data replication.
  • Provided production support on a rotational basis for running jobs.
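
The demographic fuzzy-matching logic described above was built in Informatica mapplets; purely as an illustration of the idea, here is a Python sketch using the standard library's difflib, with hypothetical field names and an invented normalization rule:

```python
from difflib import SequenceMatcher

def normalize(record):
    """Canonicalize demographic fields before comparing (rules are illustrative)."""
    return " ".join(
        str(record.get(f, "")).strip().lower()
        for f in ("first_name", "last_name", "zip")
    )

def match_score(a, b):
    """Similarity ratio in [0, 1] between two patient/provider records."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()

src = {"first_name": "Jon", "last_name": "Smith", "zip": "06101"}
tgt = {"first_name": "John", "last_name": "Smith", "zip": "06101"}
print(match_score(src, tgt) > 0.9)  # True
```

In practice a threshold on the score decides whether two records refer to the same person; the real mapplets applied field-specific cleansing (phone, ZIP, email) before matching.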

Environment: Informatica Power Center 9.1, SQL Server 2008, Salesforce, Flat files.

Informatica Developer

Confidential, Washington, DC

Responsibilities:

  • Involved in the design phase of the project.
  • Analyzed and drafted documents based on business requirements, including the Project Requirement Document, Capacity Planning and Network Survey Form.
  • Involved in Environment setup for the Informatica Power Center, Power Exchange, Source Database and Target Database.
  • Extensively used Informatica Power Center to load data from Oracle 9i Source to Oracle 10g Target.
  • Designed, Developed and Tested ETL processes to load Portico Data into Data Acquisition Layer.
  • Used various transformations such as Stored Procedure, Expression, Filter and Update Strategy, along with mapping parameters.
  • Created parameter files on Unix Server.
  • Configured sessions to send email on success or failure using a reusable Email task.
  • Experienced in using version control to check in and check out objects.
  • Configured and ran the Debugger from within the Mapping Designer to troubleshoot mappings before the normal run of the workflow.
  • Followed industry best practices (Informatica Velocity) for mapping design and performance tuning.
  • Involved in resolving errors during Unit and System Integration Testing.
  • Contributed to the migration of Informatica objects from Dev to QA and from QA to Production.

Environment: Informatica Power Center 9.1.0, Oracle 9i/10g, Toad, UNIX, Windows XP.

Software Developer

Confidential

Responsibilities:

  • Designed modules for OPD and IPD of the Hospital to keep track of Outdoor and Indoor patients respectively.
  • Coded the designed modules.
  • Responsible for database connectivity using ADO.NET; used stored procedures, parameterized queries and DataSets.
  • Used rich ASP.NET server controls such as DataGrid and DataList.
  • Developed Web User Controls, and Custom Controls.
  • Created Web Methods to calculate patient total bill amount.
  • Performed unit and integration testing of individual modules.
  • Assisted in migration of modules from testing phase to production phase.
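
The billing calculation above was implemented as web methods in C# with ADO.NET and SQL Server; as a language-neutral illustration of a parameterized query (the key point being that the driver binds the value rather than string concatenation), here is a minimal Python/sqlite3 sketch with an invented schema:

```python
import sqlite3

# In-memory stand-in for the hospital billing table (schema is illustrative).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE charges (patient_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO charges VALUES (?, ?)",
    [(1, 250.0), (1, 75.5), (2, 40.0)],
)

def total_bill(patient_id):
    """Parameterized query: the ? placeholder binds the value safely."""
    cur = conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM charges WHERE patient_id = ?",
        (patient_id,),
    )
    return cur.fetchone()[0]

print(total_bill(1))  # 325.5
```

In ADO.NET the equivalent is a SqlCommand with SqlParameter objects, which likewise prevents SQL injection.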

Environment: ASP.NET, C#, SQL Server 2000, MS Windows 2000
