Hadoop Developer Resume
PROFESSIONAL SUMMARY:
- A dedicated, assertive, and qualified technology professional working as a Hadoop Developer.
- 9 years of overall IT experience in application development across Big Data (Hadoop) and DataStage.
- 5 years of dedicated experience in Big Data Hadoop and its components, including HDFS, MapReduce, Apache Pig, Hive, Sqoop, HBase, Impala, Flume, Oozie, Spark, Spark Streaming, and Kafka.
- 2 years of experience as an ETL developer using DataStage.
- Extensive experience in setting up Hadoop clusters.
- Good working experience with HDFS, Hive, Pig, MapReduce jobs, Impala, Flume, Sqoop, and Oozie.
- Experience importing/exporting data from existing RDBMS systems.
- Good knowledge of Oracle 10g, MySQL, and NoSQL databases.
- Good exposure to Hadoop's query programming models (Pig and Hive).
- Good experience in PySpark.
- Used JSON and XML SerDes for serialization and deserialization to load JSON and XML data into Hive (see the sketch after this summary).
- Used Avro, Parquet, and ORC data formats to store data in GCP.
- Good knowledge of writing Python applications using libraries such as Pandas, NumPy, SciPy, and Matplotlib.
- Good knowledge of HCatalog, Impala, and NoSQL databases (MongoDB and HBase).
- In-depth understanding of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, DataNode, NameNode, and MapReduce concepts.
- Experience with single-node and multi-node cluster configurations, and with commissioning and decommissioning nodes in the cluster.
- Experienced in designing, developing, documenting, and testing ETL jobs using DataStage.
- Experienced in integrating various data sources (Oracle, datasets, sequential files, and multi-format flat files) into the data staging area.
- Good exposure to handling stages such as Sequential File, Dataset, Lookup, Join, Sort, Transformer, Aggregator, and Funnel.
- Participated in client calls to gather requirements and involved in preparing design, mapping, and unit test case documents.
- Performed activities such as importing and exporting DataStage projects.
- Good exposure to data warehousing concepts, UNIX, and Oracle.
- Excellent communication, interpersonal, and analytical skills, with a strong ability to perform as part of a team.
- Exceptional ability to learn new concepts.
- Willing to go the extra mile to achieve excellence.
- Extensive experience in high-end technical areas including Siebel configuration, workflows, and scripting, with a good understanding of Siebel architecture.
- Salesforce certified; trained in Salesforce by the organization, with good knowledge of implementing POCs in Salesforce.
- Quick to learn new technologies and ready to meet management expectations.
- Experience working on and contributing to several POC implementations.
- Worked on Siebel CRM and have POC-level knowledge of Salesforce.
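To illustrate the Hive SerDe and columnar-format bullets above, here is a minimal PySpark sketch; the table name, schema, and HDFS paths are hypothetical, and it assumes the hive-hcatalog-core JAR (which provides the JSON SerDe) is on the classpath.

```python
from pyspark.sql import SparkSession

# Minimal sketch: expose raw JSON files through a Hive table via the JSON SerDe,
# then persist a copy in Parquet. Table name, columns, and paths are hypothetical.
spark = (
    SparkSession.builder
    .appName("json-serde-sketch")
    .enableHiveSupport()
    .getOrCreate()
)

# External Hive table over the raw JSON files, using the Hive JSON SerDe.
spark.sql("""
    CREATE EXTERNAL TABLE IF NOT EXISTS raw_events (
        event_id STRING,
        event_ts STRING,
        payload  STRING
    )
    ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
    LOCATION '/data/raw/events'
""")

# Re-store the same data in Parquet for faster downstream queries.
spark.table("raw_events").write.mode("overwrite").parquet("/data/curated/events")
```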
TECHNICAL SKILLS:
Languages: Core Java, Pig Latin, eScript.
Operating Systems: Windows and Linux.
Big Data Technologies: Hadoop, Spark, Hive, PySpark, Pig, Sqoop, HUE, Oozie, Impala, Flink
RDBMS: Oracle 10g and MySQL 5.5.35.
Distributed Database (NoSQL): HBase, MongoDB.
IDEs: Eclipse 3.7.2.
Other Technologies: Salesforce, Siebel.
Domain knowledge: Siebel Communications, Call Centre, Financial Services, Banking
PROFESSIONAL EXPERIENCE:
Confidential
Hadoop Developer
Responsibilities:
- Requirement analysis, design, and development.
- Analyzed the data from the SOR (system of record).
- Prepared the Xwalk and bridge documents based on the PDM and data lineage documents to create the APPID used to load the data into HBase and the Gremlin graph DB.
- Modified the existing MDFE Spark JAR to load data from the CDF and GLIF data sources using the Xwalk and bridge (BD) documents.
- Developed Xwalk and BD documents for 12 SOR feed files.
- Developed Maestro jobs for automation.
- Validated data loads in HBase and Gremlin using the APPID (see the validation sketch after this list).
- Unit testing and defect fixing.
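The HBase load validation mentioned above could be scripted roughly as below; this is only a sketch using the happybase Thrift client, and the host, table name, and APPID row-key convention are assumptions rather than the actual project details.

```python
import happybase

# Sketch: check that a record keyed by APPID landed in HBase.
# Thrift host, table name, and row-key format are hypothetical.
connection = happybase.Connection("hbase-thrift-host", port=9090)
table = connection.table("entity_table")

def validate_appid(appid: str) -> bool:
    """Return True if a row for this APPID exists and contains cells."""
    row = table.row(appid.encode("utf-8"))
    if not row:
        print(f"APPID {appid} not found in HBase")
        return False
    print(f"APPID {appid}: {len(row)} cells loaded")
    return True

validate_appid("APPID-000123")
connection.close()
```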
Confidential, Los Angeles, CA
Hadoop Developer
Responsibilities:
- Involved in requirement gathering, project documentation, design documents, and production deployment.
- Developed Spark jobs in Java for generic reports.
- Developed Spark jobs using PySpark for the phase 2 inventory project.
- Actively participated in the daily requirement clarification and scrum calls.
- Analyzed the source data tables from Oracle and MySQL and imported the data into Hadoop using Sqoop.
- Involved in designing the data models, processing the data using Impala, and writing shell scripts to execute the Impala queries based on the conditions.
- Exported the data tables back to the MySQL database.
- Implemented Hive tables and HQL queries for the reports.
- Converted the MySQL stored procedures to Impala ETL jobs for faster performance.
- Unit testing and defect fixing.
- Designed the Hadoop data lake and developed a data ingestion framework supporting RDBMS and flat-file sources (a minimal PySpark ingestion sketch follows this list).
- Migrated back-dated data and ingested daily data into the data lake from multiple sources.
- Generated canned reports on the daily data using HQL and Impala queries and handed the reports over to business users.
- Developed utility tools for reconciliation using shell scripting, Spark, and Python.
- Categorized the large volume of data in the data lake per business requirements using Spark to improve canned-report generation performance and query execution time.
- Supported the UAT and production environments.
- Handled data type mappings between RDBMS and Hive without any data loss.
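A minimal PySpark rendering of the ingestion pattern described in this list (the production pipeline also used Sqoop and shell scripts); the JDBC URL, credentials, table, and data lake path are placeholders.

```python
from pyspark.sql import SparkSession, functions as F

# Sketch of one data lake ingestion step: pull a table from MySQL over JDBC,
# stamp it with a load date, and land it as Parquet. All names are placeholders.
spark = SparkSession.builder.appName("rdbms-ingest-sketch").enableHiveSupport().getOrCreate()

source_df = (
    spark.read.format("jdbc")
    .option("url", "jdbc:mysql://mysql-host:3306/inventory")
    .option("dbtable", "daily_inventory")
    .option("user", "etl_user")
    .option("password", "etl_password")
    .load()
)

(
    source_df.withColumn("load_date", F.current_date())
    .write.mode("append")
    .partitionBy("load_date")
    .parquet("/datalake/raw/inventory/daily_inventory")
)
```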
Confidential
Hadoop Developer
Responsibilities:
- Worked on enhancements, change requests, and defect fixes for the Confidential phase 1 release using Spark with Python.
- Analyzed the source data tables from Oracle and MySQL and imported the data into Hadoop using Sqoop.
- Involved in designing the data models, processing the data using Impala, and writing shell scripts to execute the Impala queries based on the conditions.
- Exported the data tables back to the MySQL database.
- Implemented Hive tables and HQL queries for the reports.
- Converted the MySQL stored procedures to Impala ETL jobs for faster performance.
- Unit testing and defect fixing.
- Implemented a POC for Neilson using Spark Streaming and Kafka (a streaming sketch follows this list).
- Implemented a POC on MongoDB.
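The Spark Streaming and Kafka POC above follows the standard Kafka consumer pattern; the sketch below uses Structured Streaming rather than the older DStream API, and the broker, topic, and output paths are assumed values.

```python
from pyspark.sql import SparkSession

# Sketch of a Kafka consumer in Spark Structured Streaming (the POC itself may
# have used the DStream API). Broker, topic, and paths are assumptions.
spark = SparkSession.builder.appName("kafka-streaming-sketch").getOrCreate()

events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker1:9092")
    .option("subscribe", "viewership-events")
    .option("startingOffsets", "latest")
    .load()
    .selectExpr("CAST(key AS STRING) AS key", "CAST(value AS STRING) AS value")
)

query = (
    events.writeStream.format("parquet")
    .option("path", "/datalake/streaming/viewership")
    .option("checkpointLocation", "/datalake/checkpoints/viewership")
    .outputMode("append")
    .start()
)
query.awaitTermination()
```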
Confidential
Hadoop Developer
Responsibilities:
- Involved in business client calls, requirement analysis, and design.
- Actively involved in business requirement discussions with the onsite counterpart and the offshore team for the implementation, and coordinated all project-related activities.
- Developed Sqoop scripts to import/export data between HDFS and the MySQL database.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Supported MapReduce programs running on the cluster.
- Analyzed large data sets by running Hive queries and Pig scripts.
- Worked on tuning the performance of Pig queries.
- Involved in developing Pig scripts for processing data.
- Wrote Hive queries to transform the data into tabular format and processed the results using Hive Query Language.
- Created Pig Latin scripts to sort, group, join, and filter the enterprise-wide data (a PySpark rendering of this pattern follows this list).
- Analyzed the functional specifications.
- Implemented Pig scripts according to business rules.
- Implemented Hive tables and HQL queries for the reports.
- Unit testing and defect fixing.
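Purely for illustration, the sort/group/join/filter pattern implemented in the Pig Latin scripts above looks roughly like this when rendered as a PySpark DataFrame job; the input paths and column names are made up.

```python
from pyspark.sql import SparkSession, functions as F

# PySpark rendering of the Pig-style filter/join/group/sort pattern.
# Input paths and column names are illustrative only.
spark = SparkSession.builder.appName("pig-pattern-sketch").getOrCreate()

orders = spark.read.option("header", True).csv("/data/enterprise/orders")
customers = spark.read.option("header", True).csv("/data/enterprise/customers")

report = (
    orders.filter(F.col("status") == "COMPLETED")                  # FILTER
    .join(customers, on="customer_id", how="inner")                # JOIN
    .groupBy("region")                                             # GROUP
    .agg(F.sum(F.col("amount").cast("double")).alias("total_amount"))
    .orderBy(F.col("total_amount").desc())                         # ORDER
)
report.show()
```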
Confidential
Hadoop Developer
Responsibilities:
- Involved in requirement analysis.
- Extensively involved in installing and configuring the Cloudera distribution of Hadoop, including its NameNode, Secondary NameNode, JobTracker, TaskTrackers, and DataNodes.
- Worked on analyzing the Hadoop stack and various big data analytics tools, including Pig, Hive, the HBase database, and Sqoop.
- Created Pig Latin scripts to sort, group, join, and filter the enterprise-wide data.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Supported MapReduce programs running on the cluster.
- Analyzed large data sets by running Hive queries and Pig scripts.
- Worked on tuning the performance of Pig queries.
- Worked on cluster installation, commissioning and decommissioning of DataNodes, NameNode recovery, capacity planning, JVM tuning, and map/reduce slot configuration.
- Involved in exploring Hadoop MapReduce programming and cluster configuration and installation.
- Wrote Pig scripts for pre-processing customer data.
- Developed MapReduce programs for batch processing and customer data analysis to suggest the best plan for each customer (a streaming-style sketch follows this list).
- Worked on tuning the performance of Hive and Pig queries.
- Wrote Java code for custom partitioners and Writables.
- Integrated Hive with Pentaho for data reports.
- Unit tested the web application and Hadoop programs.
- Involved in bug fixing and testing of the application.
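The batch MapReduce jobs in this project were written in Java (including the custom partitioners and Writables); purely as an illustration of the aggregation pattern, the same idea is sketched below as a Hadoop Streaming mapper and reducer in Python, with the input layout (customer_id, tab, usage) assumed.

```python
#!/usr/bin/env python
# Hadoop Streaming sketch of the batch aggregation pattern; the real jobs were
# Java MapReduce. Assumes tab-separated input lines: customer_id<TAB>usage.
import sys

def mapper():
    # Emit customer_id and usage as a tab-separated key/value pair.
    for line in sys.stdin:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 2:
            print(f"{fields[0]}\t{fields[1]}")

def reducer():
    # Input arrives sorted by key; sum usage per customer.
    current_key, total = None, 0.0
    for line in sys.stdin:
        key, value = line.rstrip("\n").split("\t", 1)
        if key != current_key:
            if current_key is not None:
                print(f"{current_key}\t{total}")
            current_key, total = key, 0.0
        total += float(value)
    if current_key is not None:
        print(f"{current_key}\t{total}")

if __name__ == "__main__":
    mapper() if sys.argv[1] == "map" else reducer()
```

Such a script would be submitted with the hadoop-streaming JAR (exact path depends on the distribution), passing it as both the -mapper ("script.py map") and -reducer ("script.py reduce") commands.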
Confidential
DataStage Developer
Responsibilities:
- Involved in preparing the low-level design document.
- Actively involved in business requirement discussions with the onsite counterpart and the offshore team for the implementation, and coordinated all project-related activities.
- Involved in documenting all the initial-level analysis done before designing the jobs.
- Involved in developing parallel jobs to implement Slowly Changing Dimension Type 2 logic using Change Data Capture from RMW tables to data mart tables.
- Involved in testing all the developed staging-to-RMW jobs.
- Prepared unit test documents for the designed data mart and RMW jobs.
- Captured and documented all supporting activities.
Confidential
Configurator/Upgrade and Support
Responsibilities:
- Worked on pre-upgrade and post-upgrade tasks.
- Fixed defects as per the 7.8 functionality.
- Worked on DataStage Designer, Manager, Administrator, and Director.
- Worked with the business analysts and DBAs on requirements gathering, analysis, testing, metrics, and project coordination.
- Involved in extracting data from different data sources such as Oracle and flat files.
- Involved in creating and maintaining sequencer and batch jobs.
- Created ETL job flow designs.
- Used ETL to load data into the Oracle warehouse.
- Created various standard/reusable jobs in DataStage using active and passive stages such as Sort, Lookup, Filter, Join, Transformer, Aggregator, Change Data Capture, Sequential File, and Data Set.
- Tested and fixed defects in the List Management module and performed Event Management performance tuning.
- Tested inbound and outbound web services using SoapUI.