
Big Data/Hadoop Developer Resume


NJ

SUMMARY

  • Versatile IT professional with more than 19 years of experience developing business software applications and a keen interest in Big Data/Hadoop, Data Science and Business Intelligence.
  • Involved in the analysis, design, development, testing and implementation of Big Data, Business Intelligence, CRM, client/server, data warehouse and web-based applications.
  • Proven record of implementing Big Data applications along with other high-tech skills; an initiator who proposes solutions, understands market requirements and trends to achieve user satisfaction, and maintains effective dialogue with a positive "can-do" attitude.
  • Around 4 years of experience on the Big Data platform using Apache Hadoop and its ecosystem (HDFS, MapReduce, Hive, HBase, Spark, Sqoop, Pig, Oozie, Flume, YARN and ZooKeeper).
  • Strong Java skills for implementing MapReduce jobs.
  • Worked on the BI reporting tool Cognos to present business-critical data in reports.
  • Good experience with Hive query optimization and performance tuning.
  • Hands-on experience writing Pig Latin scripts and custom implementations using UDFs.
  • Good experience with Sqoop for importing data from different RDBMS sources into HDFS and exporting data back to RDBMS systems for ad-hoc reporting (see the sketch after this list).
  • Experienced in automating batch jobs using Oozie and UNIX shell scripts.
  • Extended Hive and Pig core functionality by writing custom UDFs.
  • Strong experience building dimension and fact tables for star schemas on various databases: Oracle 11g/10g/9i/8i, IBM DB2 and MS SQL Server.
  • Developed ad-hoc reports using IBM Cognos Report Author and Siebel Analytics.
  • Strong knowledge of Python, Scala, MongoDB and Node.js.
  • Worked extensively with ETL tools such as Informatica PowerCenter and Talend to move data between multiple applications and the data warehouse.
  • Expertise in technology solutions for Healthcare, Public Sector, Education, Telecommunications, Wireless, Pharmaceutical, Financial and Software sectors.
  • Siebel implementation experience across multiple projects, with hands-on experience integrating and configuring Siebel enterprise applications.
  • Excellent communication skills; self-motivated and organized in delivering high-quality software solutions.
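
For illustration, a minimal sketch of the kind of Sqoop import/export described above; the JDBC URL, credentials, table names and HDFS paths are hypothetical placeholders, not actual project values:

  # import an RDBMS table into HDFS as tab-delimited text
  sqoop import \
    --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
    --username etl_user -P \
    --table CLAIMS \
    --target-dir /data/raw/claims \
    --fields-terminated-by '\t' \
    --num-mappers 4

  # export aggregated results back to the RDBMS for ad-hoc reporting
  sqoop export \
    --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
    --username etl_user -P \
    --table CLAIMS_RPT \
    --export-dir /data/out/claims_rpt \
    --input-fields-terminated-by '\t'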

TECHNICAL SKILLS

Big Data: HDFS, Hadoop, MapReduce, Hive, HBase, Pig, Sqoop, Elasticsearch, Flume, Impala, Oozie, ZooKeeper, Spark SQL, Spark, Scala, YARN and R

ETL: Informatica PowerCenter, Talend

Operating Environment: HP-UX, MS Windows, Sun Solaris, Linux

RDBMS Tools: TOAD 7.x/8.5, PL/SQL Developer

Databases: Oracle 11g, DB2, SQL Server 2000, MS Access

NoSQL Databases: MongoDB and HBase

Languages: JavaScript, Python, Scala, Java, Node.js, SQL, PL/SQL, UNIX shell scripting, eScript, C

BI Reporting: IBM Cognos BI Studio, Report Studio, Framework Manager, OBIEE

Tools/Utilities: SQL*Plus, TOAD, SQL*Loader, SQL Navigator

Scheduling Tools: Autosys, UC4, Control-M

Middleware: TIBCO TIB/RV, Kafka, MQSeries

Methodologies: Agile, UML, Waterfall

Others: Machine Learning, HP Quality Center, Rally, Visio 2007

PROFESSIONAL EXPERIENCE

Confidential - NJ

Big Data/Hadoop Developer

Environment: Hadoop, HDFS, MapReduce, Pig, Hive, Sqoop, ZooKeeper, HBase, MapR, Elasticsearch, Spark, Scala, Java, JSON, UNIX, Avro, Teradata Client and Rally

Responsibilities:

  • Built an ingestion framework using Sqoop and a data transformation framework using MapReduce and Pig.
  • Worked on the DDS Trans extract to create fixed-format files from CDB data and SFTP them to the DDS systems.
  • Developed MapReduce jobs in Java for data cleaning and preprocessing. Created JSON files to capture table changes on the required columns so that only changed records were sent to file generation.
  • Reduced processing time by using common filtering logic and Spark queries to build temporary tables containing only the required data, avoiding full table scans.
  • Created shell scripts to process all files for the DDS Extract process and transfer them via SFTP.
  • Developed Python scripts to load large files of unstructured data into HBase.
  • Worked on the Specialty Maternity Eligibility project to provide access to the Wildflower app for qualified UHG customers.
  • Created Hive external tables to move Program ID and Population data from the BOSS system to the CDB tenant (see the Hive sketch after this list).
  • Created HBase tables, inserted the BOSS data into CDB HBase tables and also created ORC-backed Hive tables.
  • Loaded member demographic data with Population and Program IDs into Elasticsearch using Pig scripts; the Specialty Maternity API uses this data to grant access to the Wildflower app for the maternity process.
  • Attended daily scrum meetings to coordinate with the team and used Rally to track the progress of user stories and tasks.
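
For illustration, a minimal sketch of the Hive DDL pattern used for the external and ORC tables above; database, table and column names are hypothetical placeholders:

  -- external table over the landing directory loaded from the source system
  CREATE EXTERNAL TABLE IF NOT EXISTS cdb_stage.boss_program (
      member_id   STRING,
      program_id  STRING,
      population  STRING
  )
  ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
  STORED AS TEXTFILE
  LOCATION '/data/boss/program';

  -- ORC-backed copy for faster downstream queries
  CREATE TABLE IF NOT EXISTS cdb.boss_program_orc
  STORED AS ORC
  AS SELECT * FROM cdb_stage.boss_program;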

Confidential - NY

Big Data Developer

Environment: Hadoop, HDFS, MapReduce, YARN, Pig, Hive, Sqoop, Zookeeper, Oozie, Flume, HBase, Java, Cloudera, Impala, Scala, Elasticsearch, Cognos, AWS, Spark, Python, Kafka, JSON, Avro.

Responsibilities:

  • Responsible for the conceptualization, design and implementation of the firm’s data analytics platform for Data Mining.
  • Demonstrated ability to provide technical oversight for large complex projects and achieved desired customer satisfaction from inception to deployment.
  • Involved in evaluating data science and big data analytical tools through research, testing and conducting proofs of concept.
  • Installed and configured Hadoop ecosystem components like HDFS, Sqoop, Flume, Spark, Pig and Hive.
  • Wrote Sqoop scripts to transfer Claims, Provider and Member data between databases and HDFS, and automated the jobs using Linux shell scripts.
  • Developed Spark code applying business rules to claims data to identify potential cases of fraud (see the PySpark-style sketch after this list).
  • Used Cloudera Distribution to utilize pre-built functionalities and Cloudera Manager.
  • Worked with the Search relevancy team to improve relevancy and ranking of search results using Elasticsearch.
  • Designed queries against data in HDFS using tools such as Hive, Impala and Spark.
  • Built high-performance analytics at Confidential scale for truly interactive analysis by business analysts, bringing response times down from 12 minutes to 60 seconds.
  • Designed and developed Oozie workflows and coordinators to run jobs automatically.
  • Used Pig as ETL tool to do transformations, event joins, filters and some pre-aggregations before storing the data onto HDFS.
  • Worked on partitioning Hive tables and running the scripts in parallel to reduce their run time.
  • Involved in loading and transforming large sets of structured, semi-structured and unstructured data and analyzed them by running Hive queries and Pig scripts.
  • Delivered the Medical Expense Analysis report via dynamic, interactive self-service Cognos dashboards.
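
For illustration, a minimal PySpark-style sketch of applying business rules to claims data as described above; the input path, column names and thresholds are hypothetical assumptions, not the actual rules:

  from pyspark.sql import SparkSession
  from pyspark.sql import functions as F

  spark = SparkSession.builder.appName("claims-fraud-rules").getOrCreate()

  # hypothetical curated claims extract
  claims = spark.read.parquet("/data/claims/curated")

  # example rule 1: claim numbers that appear more than once
  dupes = (claims.groupBy("claim_id").count()
                 .filter(F.col("count") > 1)
                 .select("claim_id"))

  # example rule 2: unusually high billed amounts
  high_billed = claims.filter(F.col("billed_amount") > 100000)

  flagged = (claims.join(dupes, "claim_id", "left_semi")
                   .unionByName(high_billed)
                   .dropDuplicates(["claim_id"]))

  flagged.write.mode("overwrite").saveAsTable("analytics.potential_fraud_claims")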

Confidential - NY

Technical Analyst

Environment: IBM Cognos BI Studio, Report Studio, Framework Manager, Oracle, ESAWS, Linux scripts, Autosys, Informatica.

Responsibilities:

  • Developed IBM Cognos reports for business users and ESAWS managers, allowing them to monitor customer service performance.
  • Used IBM Cognos Report Studio to design reports and apply business logic for monthly, weekly and daily department and customer reports.
  • Converted the client letter reports from the old Crystal Reports (Business Objects) implementation to Cognos for printing customer communication letters.
  • Used Framework Manager to design the schema and packages used to generate the reports.
  • Created Informatica mappings to transfer data from the data warehouse to ESAWS, Softheon and multiple other systems.
  • Developed JIL files to schedule the scripts in Autosys (see the sketch after this list).
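
For illustration, a minimal sketch of the kind of Autosys JIL definition used for this scheduling; the job name, machine, owner, script path and times are hypothetical placeholders:

  /* hypothetical daily job that runs the report extract script */
  insert_job: esaws_daily_extract   job_type: CMD
  command: /opt/etl/scripts/run_daily_extract.sh
  machine: etlhost01
  owner: etluser
  date_conditions: 1
  days_of_week: all
  start_times: "02:00"
  std_out_file: /var/log/autosys/esaws_daily_extract.out
  std_err_file: /var/log/autosys/esaws_daily_extract.err
  alarm_if_fail: 1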

Confidential - Upper Saddle River, NJ

Responsibilities:

  • Converted the legacy data (MDR, GM and HE) to Siebel 8.1 data using Talend and the Siebel EIM process.
  • Created the mapping documents from legacy-system staging tables to Siebel EIM tables and from EIM tables to base tables for Accounts, Courses, Contacts, Activities, Opportunities and Attachments.
  • Resolved all the Siebel EIM user keys, foreign keys and LOVs for all the legacy data.
  • Used Talend as the ETL tool to load the data from legacy staging tables into Siebel EIM tables, applying all the business rules.
  • Developed IFB files to load the data from Siebel EIM tables into Siebel base tables in the required loading sequence and with process optimization (see the IFB sketch after this list).
  • Used Talend context files to decouple the mappings from the environment so they run in multiple environments without any modifications.
  • Used Siebel Tools for schema changes and created UNIX scripts for ongoing jobs.
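
For illustration, a minimal sketch of the kind of EIM configuration (IFB) section used for these loads; the process name, batch range and parameter values are hypothetical placeholders, and connection details are normally supplied at run time:

  [Siebel Interface Manager]
  PROCESS = Import Accounts

  [Import Accounts]
  TYPE = IMPORT
  BATCH = 100-110
  TABLE = EIM_ACCOUNT
  ONLY BASE TABLES = S_ORG_EXT
  INSERT ROWS = TRUE
  UPDATE ROWS = TRUE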

Confidential - Middletown, NJ

Responsibilities:

  • As Data Lead, analyzed the eCRM (WWI) application business requirements and prepared the High-Level Design (HLD) and Technical Design Documents (TDD).
  • Created the mapping documents from staging tables to EIM tables and from EIM tables to base tables for all interfaces from SAART to eCRM.
  • Participated in various meetings to give guidance on design changes and the production data load plan.
  • Created the SQL scripts to generate fixed-format output and feedback files to send to SAART.
  • Worked with the offshore team to explain the technical design of the requirements and helped write the code and fix code errors.
  • Wrote the complex code and the overall framework for getting the data loaded into base tables: UNIX shell scripts, SQL*Loader CTL files, business-logic PL/SQL, inp files and IFB files (see the CTL sketch after this list).
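
For illustration, a minimal sketch of the kind of SQL*Loader control (CTL) file used to land a fixed-format feed into a staging table; the file path, table name and column layout are hypothetical placeholders:

  LOAD DATA
  INFILE '/data/inbound/saart_feed.dat'
  BADFILE '/data/inbound/saart_feed.bad'
  APPEND
  INTO TABLE stg_saart_feed
  TRAILING NULLCOLS
  (
    record_id    POSITION(1:10)   CHAR,
    account_num  POSITION(11:30)  CHAR,
    status_cd    POSITION(31:33)  CHAR,
    load_dt      POSITION(34:43)  DATE "YYYY-MM-DD"
  )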

Confidential - Rochester, NY

Responsibilities:

  • As an Integration Architect, interacted with all the integration touch-point owners to communicate the process for converting the XNAIS application from Siebel Call Center 6.2 to Siebel Call Center 8.1.
  • Participated in all review sessions for the requirements, high-level design and low-level design documents to provide proper direction.
  • Created the EIM to base table mapping documents for all interfaces for this conversion project.
  • Arranged meetings with other integration teams to coordinate the development and testing phases, share the primer files and data, and distribute meeting minutes.
  • Worked on shell scripts, PL/SQL procedures, SQL*Loader and EIM jobs to fix errors and performance problems in QC and volume testing.

Confidential - Wayne, PA

Responsibilities:

  • As Data Integration lead, interacted with other teams like CM (Customer Master) and MDS.
  • Provided Tier 3 support for the production system and user escalations. Worked on PL/SQL procedures, SQL*Loader and EIM jobs to fix errors from recent release code changes.
  • Ran the daily interface jobs (CM Outbound, CM Inbound and SFA Merge) in production, checked for errors, fixed data problems and documented the code fixes.
  • Created UC4 jobs to automate the interfaces on scheduled daily runs and coordinated the schedule timings with other teams.
  • Led the data migration effort converting the legacy application data from LAP and PEPS to the Siebel 7.8 system (IPM). Created data mapping documents matching the legacy application tables to Siebel interface tables and Siebel base tables.
  • Created the Informatica mappings to load the LAP data extract from flat files to IPM staging tables.
  • Performed data profiling using the Informatica Data Profiler to understand the legacy data.
  • Created Informatica mappings and workflows to load the legacy data into Siebel interface tables (EIM) by applying all the business logic and Siebel schema requirements.
