
Big Data Developer Resume

Birmingham, AL

SUMMARY

  • 10 years of IT experience, including over 5 years as a Big Data Developer, with extensive knowledge of the Banking and Insurance domains.
  • Experience with SDLC implementation in a large organization.
  • Experience with the Cloudera Hadoop distributions CDH4 and CDH5.
  • Experience with Hadoop ecosystem components such as MapReduce, HDFS, Sqoop, Flume, Pig, Hive, Oozie, ZooKeeper, and HBase.
  • Experience managing data extraction jobs and building new data pipelines from various structured and unstructured sources into Hadoop.
  • Experience with Amazon cloud services such as the EMR File System (EMRFS) and Simple Storage Service (S3).
  • Experience using HDFS along with Amazon S3 to store input and output data.
  • Experience in Spark Core and Spark SQL with the Scala API.
  • Experience using HCatalog to share table metadata between Hive and Pig.
  • Experience loading and transforming large structured and semi-structured data sets.
  • Strong knowledge of Spark Streaming with complex input DStreams.
  • Extensive experience in data warehousing, data architecture, and ETL (extraction, transformation, and loading) from various sources into data warehouses and data marts using the Informatica PowerCenter client tool.
  • Expertise in data modeling using Star/Snowflake schemas, fact and dimension tables, and physical and logical data modeling with ERwin 4.x/3.x.
  • In-depth knowledge of Oracle Database, DB2, MySQL, and HBase (NoSQL database).
  • Extensive knowledge of and experience in COBOL, CICS, JCL, VSAM, FILE-AID, TSO/ISPF, CA-7 (scheduler), ICETOOL, ChangeMan, and SPUFI.
  • Excellent interpersonal skills; comfortable presenting to large groups and preparing written communications and presentation material.
  • Flexible quick learner who can adapt and execute in any fast-paced environment.

TECHNICAL SKILLS

Operating Systems: Windows XP/NT/2000, Unix/Linux, CentOS Linux

Programming Languages: SQL, PL/SQL, Scala API, Python, JCL, COBOL

Frameworks: Hadoop (Sqoop, HDFS, Hive, Pig, MapReduce, HBase (NoSQL), Oozie, Flume, ZooKeeper, HCatalog), Spark (Spark Core, Spark SQL, Spark Streaming)

RDBMS: Oracle 9i/10g/11g/12c, MySQL 5.5, DB2, SQL Server

Tools: SQL Developer, Toad, Tableau, Jira, Informatica PowerCenter 9.5/9.0.1/9/8.6.1, ERwin 4.x, MS Visio, ServiceNow, HP Quality Center, File-AID, File Manager, DB2 Interactive, ICETOOL

Version Controller: Git, SVN

AWS: EC2, S3, EMR

IDE: Scala Eclipse

PROFESSIONAL EXPERIENCE

Confidential, Birmingham, AL

Big Data Developer

Environment: Hadoop 2.x, Spark 1.6, Scala API, Scala IDE (Eclipse), HDFS, AWS EMR, AWS S3, Hive, MapReduce, Sqoop, CentOS Linux, and Oracle DB Server.

Responsibilities:

  • Worked with the Data Science team to gather requirements and supported data analysis by manipulating and exploring the data.
  • Responsible for translating complex functional and technical requirements into detailed designs.
  • Loaded disparate datasets into the Hadoop data lake, making them available to the data science team for predictive modeling.
  • Created Hive tables with the Avro file format.
  • Used HDFS along with Amazon S3 to store input and output data.
  • Wrote Oozie scripts and set up workflows using the Apache Oozie workflow engine to manage and schedule Hadoop jobs.
  • Created Spark RDDs from source files using the Scala API for better performance.
  • Migrated Hive queries to Spark SQL to improve performance.
  • Created DataFrames from Hive tables using Spark SQL.
  • Performed data analysis by exploring the datasets and presenting recommendations.
  • Implemented real-time data ingestion using Flume.
  • Performed sentiment analysis of customer reviews using Spark Streaming.
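A minimal sketch of the Hive-to-DataFrame pattern described above, using the Spark 1.6 Scala API listed in the environment (the application name, table, and column names are hypothetical placeholders):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object HiveToDataFrame {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("HiveToDataFrame")
    val sc = new SparkContext(conf)
    // HiveContext resolves table metadata from the Hive metastore,
    // so Hive queries can run as Spark SQL against DataFrames.
    val hiveContext = new HiveContext(sc)

    // Hypothetical table: load a Hive table into a DataFrame.
    val reviews = hiveContext.sql(
      "SELECT customer_id, review_text FROM customer_reviews")
    reviews.show()
  }
}
```

In Spark 1.6, `HiveContext` was the standard entry point for querying Hive tables before `SparkSession` unified the APIs in Spark 2.x.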

Confidential, Dublin, OH

Big Data Developer

Environment: Hadoop 2.x, HDFS, Pig, Hive, MapReduce, Sqoop, Oozie, CentOS Linux, and Oracle DB.

Responsibilities:

  • Responsible for gathering requirements and translating functional and technical requirements into detailed designs.
  • Loaded and transformed large structured datasets from relational databases into HDFS using Sqoop imports.
  • Developed Sqoop scripts to import data from relational sources and handled incremental loading.
  • Pre-processed data by analyzing and cleansing raw data using Hive queries and Pig scripts.
  • Optimized Hive queries using partitioning and bucketing techniques to control data distribution.
  • Created Hive tables with the RCFile format, which is well suited to analytical queries.
  • Performed SerDe operations using Hive.
  • Performed data analysis by exploring the datasets and surfacing insights.
  • Wrote Oozie scripts and set up workflows using the Apache Oozie workflow engine to manage and schedule Hadoop jobs.
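The partitioning, bucketing, and RCFile techniques above can be sketched in HiveQL DDL (the table and column names are illustrative, not from the original projects):

```sql
-- Partition by load date so queries prune irrelevant directories;
-- bucket by customer_id to support efficient sampling and joins.
CREATE TABLE customer_txn (
  customer_id BIGINT,
  txn_amount  DECIMAL(12,2)
)
PARTITIONED BY (load_date STRING)
CLUSTERED BY (customer_id) INTO 32 BUCKETS
STORED AS RCFILE;
```

Queries that filter on `load_date` then scan only the matching partition directories, and equi-joins on `customer_id` can exploit the bucketing.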

Confidential, Dublin, OH

Big Data Developer

Environment: Hadoop 1.x, HDFS, Hive, Sqoop, Oozie, CentOS Linux, and Oracle DB.

Responsibilities:

  • Worked with the Business Analyst to gather requirements.
  • Created Sqoop scripts to import data from Oracle tables into Hive tables.
  • Validated the data loaded in Hive using HiveQL.
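A representative Sqoop invocation for the Oracle-to-Hive imports described above (the JDBC URL, schema, table, and user are hypothetical placeholders):

```shell
# Import an Oracle table directly into a Hive table.
# -P prompts for the password instead of embedding it in the script.
sqoop import \
  --connect jdbc:oracle:thin:@//dbhost:1521/ORCLPDB \
  --username etl_user -P \
  --table CUSTOMERS \
  --hive-import \
  --hive-table customers \
  --num-mappers 4
```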

Confidential, Bloomington, IL

Mainframe developer

Environment: COBOL, JCL, Easytrieve, ICETOOL, DB2, File-AID, CA-7, Control-M, ChangeMan.

Responsibilities:

  • Worked with the Business Analyst to gather requirements.
  • Prepared high-level and low-level designs from the requirements.
  • Performed analysis and programming in COBOL, DB2, and JCL using design specifications.

Confidential, Charlotte, NC

ETL Developer

Environment: Informatica PowerCenter 8.6.1, DB2, SQL Developer, Unix, HP Quality Center.

Responsibilities:

  • Developed ETL programs using Informatica to implement the business requirements.
  • Implemented Relational and Dimensional Data Modeling Techniques to design ERWIN data models.
  • Developed and maintained complex Informatica mappings. Supported, monitored and tuned Informatica ETL processes.
  • Designed and created table structures and modified existing tables to fit into the existing Data Model.
  • Implemented SQL queries for database operations and to maintain DW systems.
  • Used all major transformations to load data into target systems.
  • Effectively used Informatica parameter files to define mapping variables, workflow variables, FTP connections, and relational connections.
  • Fine-tuned ETL jobs and performed unit testing and workflow monitoring.

Confidential, Charlotte, NC

ETL Developer

Environment: Informatica PowerCenter 8.6.1, DB2, SQL Developer, Unix, HP Quality Center.

Responsibilities:

  • Developed ETL programs using Informatica to implement the business requirements.
  • Implemented Relational and Dimensional Data Modeling Techniques to design ERWIN data models.
  • Developed and maintained complex Informatica mappings. Supported, monitored and tuned Informatica ETL processes.
  • Implemented SQL queries for database operations and to maintain DW systems.
  • Used all major transformations to load data into target systems.
  • Effectively used Informatica parameter files to define mapping variables, workflow variables, FTP connections, and relational connections.
  • Fine-tuned ETL jobs and performed unit testing and workflow monitoring.

Confidential, Charlotte, NC

Mainframe Developer

Environment: COBOL, JCL, Easytrieve, ICETOOL, DB2, File-AID, CA-7, Control-M, ChangeMan.

Responsibilities:

  • Analyzed the requirements and prepared the high-level and low-level design documents.
  • Created and maintained JCL jobs; scheduled jobs via the CA-7 scheduler.
  • Prepared the test region and executed the region shakeout.
  • Executed SQL queries for data conditioning and mining activities against the identified data from all data stores.
  • Prepared and executed component and system integration test plans and scripts.
