
Hadoop Developer Resume


PROFESSIONAL SUMMARY:

  • Dedicated, assertive, and qualified technology professional working as a Hadoop Developer.
  • 9 years of overall IT experience in application development with Big Data (Hadoop) and DataStage.
  • 5 years of hands-on experience with Big Data Hadoop and its components, including HDFS, MapReduce, Apache Pig, Hive, Sqoop, HBase, Impala, Flume, Oozie, Spark, Spark Streaming, and Kafka.
  • 2 years of experience as an ETL Developer using DataStage.
  • Extensive experience in setting up Hadoop clusters.
  • Good working experience with Hadoop HDFS, Hive, Pig, MapReduce jobs, Impala, Flume, Sqoop, and Oozie.
  • Experience importing/exporting data from existing RDBMS systems.
  • Good knowledge of Oracle 10g, MySQL, and NoSQL databases.
  • Good exposure to the query programming model of Hadoop (Pig and Hive).
  • Good experience in PySpark.
  • Used JSON and XML SerDes for serialization and deserialization to load JSON and XML data into Hive (see the sketch after this summary).
  • Used Avro, Parquet, and ORC data formats to store data in GCP.
  • Good knowledge of writing Python applications using libraries such as Pandas, NumPy, SciPy, and Matplotlib.
  • Good knowledge of HCatalog, Impala, and NoSQL databases (MongoDB and HBase).
  • In-depth understanding of Hadoop architecture and its components, including HDFS, JobTracker, TaskTracker, DataNode, NameNode, and MapReduce concepts.
  • Experience with single-node and multi-node cluster configurations, and with commissioning and decommissioning nodes in a cluster.
  • Experienced in designing, developing, documenting, and testing ETL jobs using DataStage.
  • Experienced in integrating various data sources (Oracle, datasets, sequential files, and multi-format flat files) into a data staging area.
  • Good exposure to handling stages such as Sequential File, Dataset, Lookup, Join, Sort, Transformer, Aggregator, and Funnel.
  • Participated in client calls to gather requirements and was involved in preparing design, mapping, and unit test case documents.
  • Performed activities such as importing and exporting DataStage projects.
  • Good exposure to data warehousing concepts, UNIX, and Oracle.
  • Excellent communication, interpersonal, and analytical skills, with a strong ability to perform as part of a team.
  • Exceptional ability to learn new concepts.
  • Willing to go the extra mile to achieve excellence.
  • Extensive experience in high-end technical areas including Siebel configuration, workflows, and scripting, with a good understanding of Siebel architecture.
  • Trained in Salesforce by the organization, Salesforce certified, and knowledgeable in implementing Salesforce POCs.
  • Quick to learn new technologies and ready to meet management expectations.
  • Experience working on and contributing to several POC implementations.
  • Worked on Siebel CRM and have POC-level knowledge of Salesforce.
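
As a brief illustration of the Hive SerDe point above, the following is a minimal PySpark sketch (not taken from any specific project below) that registers an external Hive table over JSON data using Hive's built-in JsonSerDe and queries it. The table name, schema, and paths are hypothetical, and the hive-hcatalog-core SerDe jar is assumed to be on the cluster classpath.

    # Minimal sketch: a Hive table over JSON data via a SerDe, queried from PySpark.
    # Table name, schema, and paths are hypothetical placeholders.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName("json-serde-example")
        .enableHiveSupport()
        .getOrCreate()
    )

    # External Hive table backed by the JsonSerDe shipped with hive-hcatalog-core
    # (the jar is assumed to be available on the cluster classpath).
    spark.sql("""
        CREATE EXTERNAL TABLE IF NOT EXISTS staging.customer_events_json (
            event_id   STRING,
            event_type STRING,
            event_ts   STRING
        )
        ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
        LOCATION '/data/staging/customer_events_json'
    """)

    # Query the table and persist the result in a columnar format.
    events = spark.sql(
        "SELECT event_type, COUNT(*) AS cnt "
        "FROM staging.customer_events_json GROUP BY event_type"
    )
    events.write.mode("overwrite").parquet("/data/curated/event_type_counts")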

TECHNICAL SKILLS:

Languages: Core Java, Pig Latin, eScript.

Operating Systems: Windows and Linux.

Big Data Technologies: Hadoop, Spark, Hive, PySpark, Pig, Sqoop, HUE, Oozie, Impala, Flink

RDBMS: Oracle 10g and MySQL 5.5.35.

Distributed Database (NoSQL): HBase, MongoDB.

IDEs: Eclipse 3.7.2.

Other Technologies: Salesforce, Siebel.

Domain knowledge: Siebel Communications, Call Centre, Financial Services, Banking

PROFESSIONAL EXPERIENCE:

Confidential

Hadoop Developer

Responsibilities:

  • Requirement analysis, design, and development.
  • Analyzed the data from the SOR.
  • Prepared the Xwalk and bridge documents, based on the PDM and data lineage documents, to create the APPID used to load the data into HBase and the Gremlin graph DB (see the sketch after this list).
  • Modified the existing MDFE Spark jar to load the data from the CDF and GLIF data sources using the Xwalk and BD documents.
  • Developed Xwalk and BD documents for 12 SOR feed files.
  • Developed Maestro jobs for automation.
  • Validated data loads in HBase and Gremlin using the APPID.
  • Unit testing and defect fixing.
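
Below is a hedged sketch of what an APPID-keyed HBase load can look like from Python using the happybase Thrift client. It is illustrative only: the project's actual load went through the MDFE Spark jar, and the host, table, column family, and sample records here are hypothetical.

    # Illustrative sketch: writing APPID-keyed records into HBase with happybase.
    # Host, table, column family, and records are hypothetical; the real load in
    # this project was performed by the MDFE Spark jar.
    import happybase

    connection = happybase.Connection(host="hbase-thrift.example.com", port=9090)
    table = connection.table("sor_feed")

    records = [
        {"appid": "APP0001", "sor": "CDF",  "attr": "balance=120.50"},
        {"appid": "APP0002", "sor": "GLIF", "attr": "balance=98.10"},
    ]

    with table.batch(batch_size=1000) as batch:
        for rec in records:
            # Row key is the APPID; column family "cf" holds the feed attributes.
            batch.put(
                rec["appid"].encode("utf-8"),
                {
                    b"cf:sor": rec["sor"].encode("utf-8"),
                    b"cf:attr": rec["attr"].encode("utf-8"),
                },
            )

    connection.close()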

Confidential, Los Angeles, CA

Hadoop Developer

Responsibilities:

  • Involved in requirement gathering, project documentation, design documents, and production deployment.
  • Developed Spark jobs using Java for generic reports.
  • Developed Spark jobs using PySpark for the phase 2 inventory project.
  • Actively involved in the daily requirement clarification and scrum calls.
  • Analyzed the source data tables from Oracle and MySQL and imported the data into Hadoop using Sqoop.
  • Involved in designing the data models, processed the data using Impala, and wrote shell scripts to execute the Impala queries based on the conditions.
  • Exported the data tables back to the MySQL database.
  • Implemented Hive tables and HQL queries for the reports.
  • Converted the MySQL stored procedures to Impala ETL jobs for faster performance.
  • Unit testing and defect fixing.
  • Designed a Hadoop data lake and developed a data ingestion framework supporting RDBMS and flat-file sources (see the sketch after this list).
  • Migrated back-dated data and ingested daily data into the data lake from multiple sources.
  • Generated canned reports on the daily data using HQL and Impala queries and handed the reports over to business users.
  • Developed utility tools for reconciliation using shell scripting, Spark, and Python.
  • Categorized the large volume of data in the data lake based on business requirements, using Spark to improve canned report generation performance and query execution time.
  • Supported UAT and production environments.
  • Handled data types effectively between RDBMS and Hive without any data loss.
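
The following is a minimal sketch of the data-lake landing step described above, shown in PySpark: a source table is pulled over JDBC and written to the lake as partitioned Parquet. In the project itself the RDBMS imports were done with Sqoop; the connection details, table, columns, and paths below are hypothetical placeholders, and the MySQL JDBC driver is assumed to be on the classpath.

    # Minimal sketch of an RDBMS-to-data-lake landing step in PySpark.
    # Connection details, table, columns, and paths are hypothetical; the
    # project's actual imports used Sqoop.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = (
        SparkSession.builder
        .appName("rdbms-ingestion")
        .enableHiveSupport()
        .getOrCreate()
    )

    # Pull the source table through Spark's JDBC data source
    # (requires the MySQL JDBC driver on the classpath).
    inventory = (
        spark.read.format("jdbc")
        .option("url", "jdbc:mysql://mysql.example.com:3306/inventory_db")
        .option("dbtable", "daily_inventory")
        .option("user", "etl_user")
        .option("password", "********")
        .option("numPartitions", 8)
        .option("partitionColumn", "inventory_id")
        .option("lowerBound", 1)
        .option("upperBound", 10000000)
        .load()
    )

    # Land the data in the lake as Parquet, partitioned by load date.
    (
        inventory.withColumn("load_date", F.current_date())
        .write.mode("append")
        .partitionBy("load_date")
        .parquet("/datalake/raw/inventory/daily_inventory")
    )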

Confidential

Hadoop Developer.

Responsibilities:

  • Worked on enhancements, change requests, and defect fixes for the Confidential phase 1 release using Spark with Python.
  • Analyzed the source data tables from Oracle and MySQL and imported the data into Hadoop using Sqoop.
  • Involved in designing the data models, processed the data using Impala, and wrote shell scripts to execute the Impala queries based on the conditions.
  • Exported the data tables back to the MySQL database.
  • Implemented Hive tables and HQL queries for the reports.
  • Converted the MySQL stored procedures to Impala ETL jobs for faster performance.
  • Unit testing and defect fixing.
  • Implemented a POC for Neilson using Spark Streaming and Kafka (see the sketch after this list).
  • Implemented a POC on MongoDB.
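
As a hedged illustration of the streaming POC, the sketch below consumes a Kafka topic with Spark Structured Streaming and produces per-minute event counts. The broker addresses, topic name, and aggregation are hypothetical, and the spark-sql-kafka connector package is assumed to be available on the cluster.

    # Hedged sketch of a Spark-plus-Kafka streaming POC using Structured Streaming.
    # Brokers, topic, and output are hypothetical; the spark-sql-kafka connector
    # package is assumed to be on the classpath.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("kafka-streaming-poc").getOrCreate()

    # Subscribe to a Kafka topic; records arrive with binary key/value columns
    # plus a Kafka-provided timestamp column.
    raw = (
        spark.readStream.format("kafka")
        .option("kafka.bootstrap.servers", "broker1:9092,broker2:9092")
        .option("subscribe", "viewership-events")
        .option("startingOffsets", "latest")
        .load()
    )

    # Decode the payload and count events per one-minute window.
    counts = (
        raw.selectExpr("CAST(value AS STRING) AS payload", "timestamp")
        .groupBy(F.window("timestamp", "1 minute"))
        .count()
    )

    query = (
        counts.writeStream.outputMode("complete")
        .format("console")
        .option("truncate", "false")
        .start()
    )
    query.awaitTermination()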

Confidential

Hadoop Developer

Responsibilities:

  • Involved in the business client calls, requirement analysis, and design.
  • Actively involved in business requirement discussions with the onsite counterpart and in implementation discussions with the offshore team, and coordinated all project-related activities.
  • Developed Sqoop scripts to import/export data between HDFS and the MySQL database.
  • Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
  • Supported MapReduce programs running on the cluster.
  • Analyzed large data sets by running Hive queries and Pig scripts.
  • Worked on tuning the performance of Pig queries.
  • Involved in developing the Pig scripts for processing data.
  • Wrote Hive queries to transform the data into tabular format and processed the results using Hive Query Language.
  • Created Pig Latin scripts to sort, group, join, and filter the enterprise-wide data (see the sketch after this list).
  • Analyzed the functional specifications.
  • Implemented Pig scripts according to the business rules.
  • Implemented Hive tables and HQL queries for the reports.
  • Unit testing and defect fixing.
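
For illustration, the sort/group/join/filter pattern mentioned above is sketched below in PySpark; the project's actual scripts were written in Pig Latin, and the datasets, columns, and paths shown here are hypothetical.

    # Illustrative PySpark sketch of the sort/group/join/filter pattern; the
    # project's actual scripts were Pig Latin. Datasets, columns, and paths
    # are hypothetical.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("enterprise-data-prep").getOrCreate()

    orders = spark.read.option("header", "true").csv("/data/raw/orders.csv")
    customers = spark.read.option("header", "true").csv("/data/raw/customers.csv")

    result = (
        orders
        .filter(F.col("status") == "COMPLETED")                 # FILTER
        .join(customers, on="customer_id", how="inner")         # JOIN
        .groupBy("region")                                      # GROUP
        .agg(F.sum("amount").alias("total_amount"))
        .orderBy(F.col("total_amount").desc())                  # ORDER/SORT
    )

    result.write.mode("overwrite").csv("/data/curated/region_totals")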

Confidential

Hadoop Developer

Responsibilities:

  • Involved in requirement analysis.
  • Extensively involved in the installation and configuration of the Cloudera distribution of Hadoop, including its NameNode, Secondary NameNode, JobTracker, TaskTrackers, and DataNodes.
  • Worked on analyzing the Hadoop stack and different big data analytic tools, including Pig, Hive, the HBase database, and Sqoop.
  • Created Pig Latin scripts to sort, group, join, and filter the enterprise-wide data.
  • Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
  • Supported MapReduce programs running on the cluster.
  • Analyzed large data sets by running Hive queries and Pig scripts.
  • Worked on tuning the performance of Pig queries.
  • Worked on cluster installation, commissioning and decommissioning of DataNodes, NameNode recovery, capacity planning, JVM tuning, and map/reduce slot configuration.
  • Involved in exploring Hadoop MapReduce programming and cluster configuration and installation.
  • Wrote Pig scripts for pre-processing customer data.
  • Developed MapReduce programs for batch processing and customer data analysis to suggest the best plan for each customer (see the sketch after this list).
  • Worked on tuning the performance of Hive and Pig queries.
  • Wrote Java code for custom partitioners and Writables.
  • Integrated Hive with Pentaho for data reports.
  • Unit testing of the web application and Hadoop programs.
  • Involved in bug fixing and testing of the application.
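
The batch customer-usage aggregation is illustrated below as a Hadoop Streaming script in Python; the project's actual jobs were Java MapReduce with custom partitioners and Writables, and the input layout, field names, and paths here are assumed for the sketch only.

    # Hedged illustration of the batch customer-usage aggregation as a Hadoop
    # Streaming job in Python (the original jobs were Java MapReduce).
    # Hypothetical usage:
    #   hadoop jar hadoop-streaming.jar \
    #     -input /data/raw/usage -output /data/out/usage_per_customer \
    #     -mapper "python usage_mr.py map" -reducer "python usage_mr.py reduce" \
    #     -file usage_mr.py
    import sys

    def mapper():
        # Assumed input lines: customer_id,call_minutes,data_mb
        for line in sys.stdin:
            parts = line.strip().split(",")
            if len(parts) == 3:
                customer_id, minutes, data_mb = parts
                print(f"{customer_id}\t{minutes}\t{data_mb}")

    def reducer():
        # Sum minutes and data usage per customer (input arrives sorted by key).
        current, tot_min, tot_mb = None, 0.0, 0.0
        for line in sys.stdin:
            customer_id, minutes, data_mb = line.strip().split("\t")
            if customer_id != current:
                if current is not None:
                    print(f"{current}\t{tot_min}\t{tot_mb}")
                current, tot_min, tot_mb = customer_id, 0.0, 0.0
            tot_min += float(minutes)
            tot_mb += float(data_mb)
        if current is not None:
            print(f"{current}\t{tot_min}\t{tot_mb}")

    if __name__ == "__main__":
        mapper() if sys.argv[1] == "map" else reducer()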

Confidential

Data Stage Developer

Responsibilities:

  • Involved in the preparation of Low Level Design Document.
  • Actively involved in business requirement discussions with the onsite counterpart and in implementation discussions with the offshore team, and coordinated all project-related activities.
  • Involved in documenting all the initial-level analysis done before designing the jobs.
  • Involved in developing parallel jobs to implement Slowly Changing Dimension Type 2 logic using Change Data Capture from RMW tables to data mart tables.
  • Involved in testing all the developed staging-to-RMW jobs.
  • Prepared unit test documents for the designed data mart and RMW jobs.
  • Captured and documented all supporting activities.

Confidential

Configurator / Upgrade and Support

Responsibilities:

  • Worked on pre-upgrade and post-upgrade tasks.
  • Fixed defects as per the 7.8 functionality.
  • Worked on DataStage Designer, Manager, Administrator, and Director.
  • Worked with the business analysts and the DBAs on requirements gathering, analysis, testing, metrics, and project coordination.
  • Involved in extracting the data from different data sources such as Oracle and flat files.
  • Involved in creating and maintaining sequencer and batch jobs.
  • Created the ETL job flow designs.
  • Used ETL to load data into the Oracle warehouse.
  • Created various standard/reusable jobs in DataStage using active and passive stages such as Sort, Lookup, Filter, Join, Transformer, Aggregator, Change Data Capture, Sequential File, and Dataset.
  • Tested and fixed defects in the List Management and Event Management modules and performed performance tuning.
  • Tested inbound and outbound web services using SoapUI.
