
Big Data Developer Resume


Delaware

SUMMARY:

  • More than 8 years in Information Technology, with skills in analyzing, designing, developing, testing, and deploying software applications, including web-based and Windows applications, with an emphasis on object-oriented programming.
  • More than 4 years of work experience in Big Data development, with extensive experience writing Hive UDFs on the Hadoop ecosystem (a brief UDF sketch follows this summary).
  • More than 3 years of experience in Data Warehousing, ETL, and Reporting, covering business requirements analysis, application design, development, testing, and documentation; implemented full-lifecycle data warehouses and data marts across various industries.
  • Hands-on experience using ecosystem components such as Hadoop MapReduce, HDFS, Pig, Hive, Sqoop, Cassandra, Spark, Impala, and YARN.
  • Excellent understanding of Hadoop architecture and its components, including HDFS, JobTracker, TaskTracker, NameNode, DataNode, and the MapReduce programming paradigm.
  • Handled and processed both schema-oriented and non-schema-oriented data using Pig, Hive, and Cassandra.
  • Experience analyzing data and performing ETL tasks using HiveQL, Pig Latin, Spark, and custom MapReduce programs in Java.
  • Extended Hive and Pig core functionality by writing custom UDFs.
  • Hands-on experience with NoSQL databases such as Cassandra.
  • Good knowledge of database connectivity (JDBC) for databases such as Oracle, Teradata, SQL Server, MySQL, MS Access, Netezza, and PostgreSQL.
  • Experience in all phases of the Software Development Life Cycle (SDLC).
  • Good understanding of dimensional data modeling, normalization, and denormalization.
  • Experience in shell scripting, workflow design, and real-time streaming.
  • Involved in all aspects of ETL: requirements gathering, establishing standard interfaces for operational sources, data cleansing, developing data load strategies, designing and executing mappings, unit testing, integration testing, and regression testing.
  • Hands-on experience analyzing the needs of a business use case and building structures to fulfill them.
  • Good knowledge of ETL processing and data science solutions.
  • Worked in an Agile methodology on many applications.
  • Excellent analytical, problem-solving, and interpersonal skills, with strong written, oral, and presentation abilities; able to interact with individuals at all levels and work both independently and as part of a team.
  • Ability to perform at a high level, meet deadlines, and adapt to changing priorities.
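
A minimal sketch of the kind of custom Hive UDF mentioned above, written against the classic Hive UDF API (org.apache.hadoop.hive.ql.exec.UDF); the class, function, and jar names are illustrative, not taken from any specific project:

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Illustrative UDF: trims and lower-cases a string column.
    // Registered in Hive with:
    //   ADD JAR normalize-udf.jar;
    //   CREATE TEMPORARY FUNCTION normalize_str AS 'NormalizeString';
    public class NormalizeString extends UDF {
        public Text evaluate(Text input) {
            if (input == null) {
                return null;
            }
            return new Text(input.toString().trim().toLowerCase());
        }
    }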

TECHNICAL SKILLS:

Hadoop: HDFS, MapReduce, Pig, Hive, HBase, Flume, Sqoop, Oozie, Control-M, Cassandra, Spark, YARN

Technologies: JDBC, XML, Java, SQL, PL/SQL

Scripting/Markup Languages: Shell, HTML

Platforms: Windows XP, UNIX, Linux, Solaris

Databases: Oracle 9i/10g/11g, SQL Server 2005/2008 R2, DB2 8.1

Tools: Informatica, Teradata, SQL Developer, DbVisualizer, Eclipse, Tableau, Datameer

PROFESSIONAL EXPERIENCE:

Confidential, Delaware

Big Data Developer

Responsibilities:

  • Performed data migration from a traditional data warehouse (Teradata) to Hadoop.
  • Merged all legacy ING-Direct bank data with Confidential data and pushed it to Hadoop.
  • Extracted data from different sources (Teradata, Oracle, Google DoubleClick, etc.), transformed it according to the business use case, and loaded it into Hadoop.
  • Implemented data validations along with CCC compliance checks (Compression, Conversion, Compaction) to meet Confidential data storage standards.
  • Developed Hive scripts, Pig scripts, UNIX shell scripts, and Spark programs for all ETL load processes, converting files to Parquet in HDFS.
  • Developed various ETL transformation scripts using Hive to create refined datasets for analytics use cases.
  • Imported data from an Oracle database to HDFS using a UNIX-based file watcher tool.
  • Developed Hive scripts to denormalize and aggregate the disparate data.
  • Automated workflows using shell scripts and Control-M jobs to pull data from various databases into Hadoop.
  • Implemented external tables and dynamic partitions using Hive.
  • Worked on custom Pig Loaders and Storage classes to work with a variety of data formats such as JSON, Compressed CSV, etc.
  • Used Control-M scheduler system to automate the pipeline workflow.
  • Actively participated in software development lifecycle (scope, design, implement, deploy, test), including design and code reviews.
  • Developed Spark jobs to move data between Cassandra and Hadoop in both directions (a minimal sketch follows this project's environment line).
  • Created RDDs and applied data filters in Spark, and created Cassandra tables and Hive tables for user access.
  • Involved in story-driven agile development methodology and actively participated in daily scrum meetings.

Environment: Cloudera 5.x, Hadoop, MapReduce, HDFS, Hive, Sqoop, Spark, Control-M, Python, Cassandra, Informatica, Oracle, Teradata.
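
A minimal sketch of the kind of Spark job described above, moving a Cassandra table into HDFS as Parquet. It assumes the Spark 2.x Java API and the DataStax spark-cassandra-connector on the classpath; the keyspace, table, filter, and paths are hypothetical:

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;

    public class CassandraToParquet {
        public static void main(String[] args) {
            // Assumes spark.cassandra.connection.host is configured for the cluster.
            SparkSession spark = SparkSession.builder()
                    .appName("cassandra-to-parquet")
                    .getOrCreate();

            // Read a (hypothetical) Cassandra table as a DataFrame.
            Dataset<Row> accounts = spark.read()
                    .format("org.apache.spark.sql.cassandra")
                    .option("keyspace", "bank")
                    .option("table", "accounts")
                    .load();

            // Filter and land the data in HDFS as Parquet for Hive access.
            accounts.filter("status = 'ACTIVE'")
                    .write()
                    .mode("overwrite")
                    .parquet("hdfs:///data/refined/accounts");

            spark.stop();
        }
    }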

Confidential, New York, NY

Hadoop Developer

Responsibilities:

  • Moved flat files generated from various retailers to HDFS for further processing.
  • Developed Pig code for loading, filtering, and storing the data.
  • Wrote Apache Pig scripts to process the data in HDFS.
  • Created Hive tables to store the processed results in a tabular format.
  • Developed Hive scripts to denormalize and aggregate the disparate data.
  • Implemented external tables and partitions using Hive.
  • Developed Sqoop scripts to move data between the MySQL database and the Pig (ETL) layer.
  • Gained experience with Cloudera distributions.
  • Involved in requirements gathering, design, development, and testing.
  • Wrote script files for processing data and loading it into HDFS.
  • Set up passwordless SSH for the Hadoop cluster and worked on both MR1 and MR2 (an illustrative MapReduce sketch follows this project's environment line).
  • Automated workflows using shell scripts and Control-M jobs.
  • Implemented HBase tables for tabular data.
  • Integrated with and supported reporting teams using SAS and Tableau.
  • Moved all log/text files generated by various products into HDFS.
  • Created external Hive tables on top of the parsed data.

Environment: Cloudera 4.x, Linux, HDFS, MapReduce, Hive, Pig, Sqoop, Control-M, Oracle, HBase.
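
A minimal sketch of an MR2 (YARN) MapReduce job of the sort referenced above, counting records per retailer in pipe-delimited flat files; the class names and the assumption that the first column holds a retailer id are illustrative only:

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.hadoop.mapreduce.lib.reduce.IntSumReducer;

    public class RetailerRecordCount {

        public static class RetailerMapper
                extends Mapper<LongWritable, Text, Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);
            private final Text retailer = new Text();

            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                // Assumes pipe-delimited records with the retailer id in column 0.
                String[] fields = value.toString().split("\\|");
                if (fields.length > 0 && !fields[0].isEmpty()) {
                    retailer.set(fields[0]);
                    context.write(retailer, ONE);
                }
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "retailer-record-count");
            job.setJarByClass(RetailerRecordCount.class);
            job.setMapperClass(RetailerMapper.class);
            job.setCombinerClass(IntSumReducer.class);
            job.setReducerClass(IntSumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }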

Confidential, Livingston, NJ

Hadoop Developer

Responsibilities:

  • Developed Pig program for loading and filtering the streaming data into HDFS.
  • Imported data from Oracle database to HDFS using Sqoop.
  • Worked on data cleansing using an Apache Avro schema and implemented it in Pig.
  • Developed Hive scripts to denormalize and aggregate the disparate data.
  • Automated workflows using shell scripts and Oozie jobs to pull data from various databases into Hadoop.
  • Implemented external tables and dynamic partitions using Hive.
  • Worked on custom Pig Loaders and Storage classes to work with a variety of data formats such as JSON, Compressed CSV, etc.
  • Loaded the generated HFiles into HBase for faster access to a large customer base without a performance hit.
  • Used Oozie scheduler system to automate the pipeline workflow.
  • Actively participated in software development lifecycle (scope, design, implement, deploy, test), including design and code reviews.
  • Implemented data serialization using Apache Avro (a minimal writer sketch follows this project's environment line).
  • Involved in story-driven agile development methodology and actively participated in daily scrum meetings.

Environment: Cloudera Hadoop, MapReduce, HDFS, Hive, Sqoop, Avro, Oozie, Java (jdk1.6), Informatica.
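
A minimal sketch of Avro serialization of the kind described above, writing a cleansed record to an Avro container file with the GenericRecord API; the schema and field names are hypothetical:

    import java.io.File;
    import java.io.IOException;

    import org.apache.avro.Schema;
    import org.apache.avro.file.DataFileWriter;
    import org.apache.avro.generic.GenericData;
    import org.apache.avro.generic.GenericDatumWriter;
    import org.apache.avro.generic.GenericRecord;

    public class AvroWriteExample {
        // Hypothetical schema for a cleansed customer record.
        private static final String SCHEMA_JSON =
              "{\"type\":\"record\",\"name\":\"Customer\",\"fields\":["
            + "{\"name\":\"id\",\"type\":\"long\"},"
            + "{\"name\":\"name\",\"type\":\"string\"}]}";

        public static void main(String[] args) throws IOException {
            Schema schema = new Schema.Parser().parse(SCHEMA_JSON);

            GenericRecord record = new GenericData.Record(schema);
            record.put("id", 42L);
            record.put("name", "example");

            // Write the record to an Avro data (container) file.
            DataFileWriter<GenericRecord> writer =
                new DataFileWriter<GenericRecord>(new GenericDatumWriter<GenericRecord>(schema));
            writer.create(schema, new File("customer.avro"));
            writer.append(record);
            writer.close();
        }
    }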

Confidential, Jackson, Mississippi

Hadoop Developer

Responsibilities:

  • Analyzed source data and documented requirements from the business users.
  • Worked on understanding and creating ETL designs, technical specifications, mappings/transformations, and Informatica session documents conforming to the business rules.
  • Worked with business analysts for requirement gathering, business analysis, project coordination and testing.
  • Extensively used Informatica PowerCenter to load data from source systems such as flat files and Excel files into staging tables, and from there into the target Oracle database.
  • Created mappings in PowerCenter Designer using Aggregator, Expression, Filter, Sequence Generator, Lookup, Update Strategy, Rank, Joiner, and Stored Procedure transformations, as well as Slowly Changing Dimensions (Type 1 and 2).
  • Worked with connected and unconnected Lookups and Stored Procedure transformations for pre- and post-load sessions.
  • Developed automation logic for existing manual processes.
  • Designed and developed pre-session, post-session, and batch execution routines using the Informatica server to run sessions.
  • Worked extensively on Mappings, Mapplets, Sessions and Workflows.
  • Created UNIX Shell scripts to automate the process of generating and consuming the flat files.
  • Developed test cases and prepared SQL scripts to test data, tested sessions and workflows against the unit test requirements, and used the Debugger to fix invalid mappings (an illustrative JDBC count check follows this project).
  • Provided support during QA/UAT testing by working with multiple groups.
  • Coordinated the end-to-end test strategy and test plans for unit, UAT, and performance testing, including environment setup, code migration, running test cycles, and resolving issues.
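
A minimal sketch of the kind of SQL-based load validation used in the unit testing above, expressed as a JDBC check comparing staging and target row counts; the connection URL, credentials, and table names are hypothetical:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.sql.Statement;

    public class LoadCountCheck {
        public static void main(String[] args) throws Exception {
            // Register the Oracle driver (needed for pre-JDBC 4 drivers).
            Class.forName("oracle.jdbc.OracleDriver");
            // Hypothetical connection; real credentials would come from configuration.
            Connection conn = DriverManager.getConnection(
                    "jdbc:oracle:thin:@//dbhost:1521/ORCL", "etl_user", "etl_password");
            try {
                long staged = count(conn, "STG_CLAIMS");
                long loaded = count(conn, "DW_CLAIMS");
                System.out.println("staged=" + staged + ", loaded=" + loaded
                        + (staged == loaded ? " (match)" : " (MISMATCH)"));
            } finally {
                conn.close();
            }
        }

        private static long count(Connection conn, String table) throws SQLException {
            Statement stmt = conn.createStatement();
            try {
                ResultSet rs = stmt.executeQuery("SELECT COUNT(*) FROM " + table);
                rs.next();
                return rs.getLong(1);
            } finally {
                stmt.close();
            }
        }
    }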

Confidential, Boston, MA

Java Developer

Responsibilities:

  • Coded the business methods according to the IBM Rational Rose UML model.
  • Extensively used Core Java, Servlets, JSP, and XML.
  • Used Struts 1.2 in the presentation tier.
  • Generated the Hibernate XML and Java mappings for the schemas.
  • Used a DB2 database to store the system data.
  • Used Rational Application Developer (RAD) as Integrated Development Environment (IDE).
  • Unit tested all components using JUnit (an illustrative test sketch follows this project's environment line).
  • Used the Apache log4j logging framework for trace logging and auditing.
  • Used Asynchronous JavaScript and XML (AJAX) for a faster, more interactive front end.
  • Used IBM WebSphere as the application server.
  • Used IBM Rational ClearCase as the version control system.
  • Involved in developing UML Diagrams like Use Case, Class, Sequence diagrams.

Environment: Java 1.6, Servlets, JSP, Struts 1.2, IBM Rational Application Developer (RAD) 6, WebSphere 6.0, iText, AJAX, Rational ClearCase, Rational Rose, Oracle 9i, log4j.
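
A minimal sketch of the JUnit unit tests and log4j usage described above, exercising a hypothetical business method; the class names and figures are illustrative only:

    import org.apache.log4j.Logger;
    import org.junit.Assert;
    import org.junit.Test;

    public class InterestCalculatorTest {

        // Hypothetical business component with log4j tracing.
        static class InterestCalculator {
            private static final Logger LOG = Logger.getLogger(InterestCalculator.class);

            double simpleInterest(double principal, double rate, int years) {
                LOG.debug("Calculating simple interest for principal=" + principal);
                return principal * rate * years;
            }
        }

        @Test
        public void simpleInterestIsPrincipalTimesRateTimesYears() {
            InterestCalculator calc = new InterestCalculator();
            Assert.assertEquals(150.0, calc.simpleInterest(1000.0, 0.05, 3), 0.0001);
        }
    }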

Confidential

SQL Server/Oracle Developer

Responsibilities:

  • Analyzed the functionality and developed the data model for the financial modules using Erwin.
  • Worked with the team on the project's rapid development, including timelines and resources throughout the project life cycle.
  • Wrote development documents for new projects based on requirements and the Business Requirements Document (BRD) submitted by the business owners.
  • Created database maintenance plans for SQL Server performance, covering database integrity checks, updating database statistics, and re-indexing.
  • Created SSIS packages to transfer data from Oracle to SQL Server using various SSIS components, and used configuration files and variables for production deployment.
  • Migrated existing DTS packages to SSIS packages and Crystal Reports to SSRS reports.
  • Created and scheduled daily, weekly, and monthly reports for executives, business analysts, and customer representatives across categories and regions, based on business needs, using SQL Server Reporting Services (SSRS).
  • Used SQL Profiler to trace slow-running queries and server activity, and assisted with SQL performance tuning.

Environment: MS SQL Server 2000, Oracle 9i, SSRS, SSIS, Crystal Reports, SQL, T-SQL, PL/SQL, SQL Query Analyzer, SQL Profiler, Erwin, data modeling, HTML, XPath, ETL, DTS packages.
