
Hadoop Spark Programmer Resume


SUMMARY

  • 9.5 years of experience in IT, of which 3 years are in Big Data analytics
  • Worked with tools in the Hadoop ecosystem including Spark, Sqoop, Scala, Hive, HDFS, MapReduce, YARN, and Oozie
  • Migrated ETL projects to Hadoop
  • Able to move data in and out of Hadoop from various RDBMS and UNIX sources using Sqoop and other traditional data movement technologies
  • Experienced in installing and configuring Hadoop clusters for the major Hadoop distributions
  • Hands-on experience writing Scala programs on the Spark framework
  • Hands-on experience working with ecosystem tools such as Hive and Sqoop
  • Developed analytical components using Scala, Spark, Apache Mesos, and Spark Streaming
  • Created RDDs and used transformation and action functions (see the sketch after this list)
  • Successfully loaded files to Hive and HDFS from MySQL
  • Loaded datasets into Hive for ETL operations
  • Good knowledge of Hadoop cluster architecture and cluster monitoring
  • Good understanding of cloud configuration in Amazon Web Services (AWS)
  • Worked extensively on IBM InfoSphere DataStage (ETL) 8.5, 8.7, and 11.3, using components such as DataStage Designer, DataStage Director, and DataStage Administrator
  • Extensive experience working with large data feeds and heterogeneous systems such as Oracle, Netezza, SQL Server, DB2, Teradata, XML files, COBOL/mainframe files, and flat files
  • Broad knowledge of IT industry domains such as banking and healthcare
  • Involved in code analysis, coding, testing, bug fixing, and production support
  • Developed and tested UNIX (ksh) scripts per project requirements
  • Participated in requirement-gathering meetings with the business; prepared test strategy matrices, test plans, and test result reports; performed baseline, forward, and regression testing
  • Shared test key controls with the business to obtain approval and project sign-off
  • Logged defects and participated in defect prevention analysis meetings
  • Skilled at studying ER diagrams and data dependencies using metadata stored in the DataStage Repository
  • Played multiple roles: Programmer Analyst, Application Developer, Software Tester, Quality Analyst, Module Leader, and Front Line Manager
  • Participated in Agile/Scrum standups and sprint planning meetings
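
A minimal Scala sketch of the RDD work referenced above, as run in spark-shell (the file path and field layout are hypothetical):

    // Load a text file from HDFS into an RDD (path is a placeholder)
    val lines = sc.textFile("hdfs:///data/input/sales.txt")

    // Transformations are lazy: split each record and keep rows with a positive amount
    val amounts = lines.map(_.split(","))
      .filter(fields => fields(2).toDouble > 0)
      .map(fields => (fields(0), fields(2).toDouble))

    // reduceByKey is also a transformation; nothing executes yet
    val totals = amounts.reduceByKey(_ + _)

    // Actions trigger the actual computation
    println(totals.count())
    totals.take(10).foreach(println)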

TECHNICAL SKILLS

Frameworks: Hadoop, Spark, IBM InfoSphere DataStage, Mainframes

Schedulers: Control-M, Autosys, CA-7, Oozie

Operating Systems: Windows 95/98/NT/2000/XP, UNIX, z/OS

Database: Oracle, SQL Server, DB2, MySQL, Netezza, Teradata

Tools: Toad, Spark, Hive, Sqoop, Flume, YARN, JIRA, Kanban, ISPF, SPUFI, Endevor, File-AID, Xpediter, Aginity

Languages: COBOL, CICS, JCL, C, C++, Java, PHP, HTML, JavaScript, Scala, UNIX shell scripting (ksh)

PROFESSIONAL EXPERIENCE

Confidential

Hadoop Spark Programmer

Responsibilities

  • Participated in project requirement discussions with business and development teams
  • Used Hive on Spark
  • Used the Spark API over Cloudera Hadoop YARN to perform analytics on data in Hive
  • Built a POC to use Spark and Scala in the existing Hadoop ecosystem
  • Created a Hadoop design that replicates the current system design
  • Developed Scala scripts and UDFs, using both DataFrames/Spark SQL and RDDs in Spark 1.6, for data aggregation and queries, writing data back into the OLTP system through Sqoop (see the sketch after this list)
  • Created Sqoop jobs to move data from Oracle to Hive temporary tables
  • Tuned the performance of Spark applications by setting the right batch interval, the correct level of parallelism, and appropriate memory settings
  • Loaded data into Spark RDDs and performed in-memory computation to generate the output response
  • Created Sqoop jobs to move static lookup data from Oracle to Hive tables
  • Developed Hive queries to pre-process the data required for running the business process
  • Created the main upload files from the Hive temporary tables
  • Created Oozie workflows for Hive scripts and scheduled the Oozie workflows and DMX-h scripts in Autosys
  • Created UDFs for Hive queries
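
A minimal sketch of the Spark 1.6 DataFrame/UDF pattern described above, run from spark-shell; the table names, columns, and UDF logic are illustrative assumptions, not the actual project code:

    import org.apache.spark.sql.hive.HiveContext
    import org.apache.spark.sql.functions._

    val hc = new HiveContext(sc)

    // Register a simple UDF that normalizes a status code (logic is illustrative)
    val normalize = udf((s: String) => if (s == null) "UNKNOWN" else s.trim.toUpperCase)

    // Read a Hive temporary table loaded by Sqoop, apply the UDF, and aggregate
    val summary = hc.table("staging.orders_tmp")
      .withColumn("status_norm", normalize(col("status")))
      .groupBy("status_norm")
      .agg(sum("amount").as("total_amount"), count(lit(1)).as("order_count"))

    // Persist the result to a Hive table for the downstream upload step
    summary.write.mode("overwrite").saveAsTable("staging.order_summary")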

Environment: Hadoop, Spark, Scala, Sqoop, Hive, InfoSphere DataStage 11.3, Teradata, Oracle, Autosys, UNIX shell scripting

Confidential, NJ

Spark Programmer

Responsibilities

  • Responsible for building scalable distributed data solutions using Hadoop.
  • Managed jobs using the Fair Scheduler and developed job processing scripts using Oozie workflows
  • Developed Spark scripts using Scala shell commands per the requirements
  • Used the Spark API over Cloudera Hadoop YARN to perform analytics on data in Hive
  • Developed Scala scripts and UDFs, using both DataFrames/Spark SQL and RDDs in Spark 1.6, for data aggregation and queries, writing data back into the OLTP system through Sqoop
  • Tuned the performance of Spark applications by setting the right batch interval, the correct level of parallelism, and appropriate memory settings
  • Loaded data into Spark RDDs and performed in-memory computation to generate the output response
  • Optimized existing Hadoop algorithms using SparkContext, Spark SQL, DataFrames, and pair RDDs
  • Performed advanced procedures such as text analytics and processing using the in-memory computing capabilities of Spark with Scala
  • Handled large datasets during the ingestion process itself using partitions, Spark's in-memory capabilities, broadcasts, and effective, efficient joins and transformations (see the broadcast-join sketch after this list)
  • Worked extensively with Sqoop for importing metadata from Oracle
  • Involved in creating Hive tables and loading and analyzing data using Hive queries
  • Developed Hive queries to process the data and generate data cubes for visualization
  • Implemented schema extraction for Parquet and Avro file formats in Hive
  • Implemented partitioning, dynamic partitions, and buckets in Hive (see the second sketch after this list)
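
On the broadcast bullet above: broadcasting the small side of a join ships it to every executor and turns a shuffle join into a map-side hash join. A minimal Spark 1.6 sketch with hypothetical table names:

    import org.apache.spark.sql.hive.HiveContext
    import org.apache.spark.sql.functions.broadcast

    val hc = new HiveContext(sc)

    val facts  = hc.table("warehouse.transactions")   // large fact table
    val lookup = hc.table("warehouse.country_codes")  // small static lookup

    // broadcast() hints Spark to replicate the lookup table to all executors,
    // so the join happens locally without shuffling the large side
    val enriched = facts.join(broadcast(lookup), Seq("country_code"))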
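
And a sketch of the Hive partitioning, dynamic partitions, and bucketing mentioned above, issued here through the same HiveContext (the DDL, settings, and names are illustrative; the same statements can equally be run from the Hive CLI):

    // Partitioned target table; each event_date value becomes its own directory
    hc.sql("""
      CREATE TABLE IF NOT EXISTS warehouse.events (
        event_id BIGINT, user_id BIGINT, payload STRING
      )
      PARTITIONED BY (event_date STRING)
      STORED AS PARQUET
    """)

    // Dynamic partitioning lets the partition value come from the data itself
    hc.sql("SET hive.exec.dynamic.partition = true")
    hc.sql("SET hive.exec.dynamic.partition.mode = nonstrict")
    hc.sql("""
      INSERT OVERWRITE TABLE warehouse.events PARTITION (event_date)
      SELECT event_id, user_id, payload, event_date FROM staging.events_raw
    """)

    // Bucketed variant: CLUSTERED BY spreads rows across a fixed number of files,
    // which helps sampling and map-side joins (populate from Hive so bucketing is enforced)
    hc.sql("""
      CREATE TABLE IF NOT EXISTS warehouse.events_bucketed (
        event_id BIGINT, user_id BIGINT, payload STRING
      )
      CLUSTERED BY (user_id) INTO 32 BUCKETS
      STORED AS PARQUET
    """)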

Environment: Hadoop, Spark, Scala, Hive, Pig, UNIX Shell scripting

Confidential

ETL Developer

Responsibilities

  • Understood the technical specifications and developed DataStage jobs for the extraction, transformation, cleansing, and loading of the data warehouse
  • Profiled and analyzed data from different sources, addressed data quality issues, and transformed, processed, and loaded the data into Teradata
  • Implemented the Survive and Match stages for data patterns and data definitions
  • Captured data from a variety of sources including DB2, flat files, mainframes, and other formats
  • Worked extensively with DataStage shared containers to reuse business functionality
  • Made extensive use of reject links, job parameters, and stage variables in developing jobs
  • Drafted technical documents such as overview, migration, and deployment documents for every code release
  • Automated processing using batch logic, scheduling jobs in Autosys on a daily, weekly, or yearly basis depending on the requirement
  • Involved in various reviews and meetings, including internal and external code reviews, weekly status calls, issue resolution meetings, and code acceptance meetings
  • Assisted the SIT, UAT, and production teams during code releases with code walkthroughs, presentations, and defect identification, reporting, and tracking

Environment: InfoSphere DataStage 8.5, Oracle 11g, Fixed width files, COBOL files, Sequential files, DB2, XML files

Confidential

Delivery Module Lead

Responsibilities

  • Developed unit test cases and unit-tested the jobs before migrating the code to the QA and production boxes
  • Responsible for managing the scope, planning, tracking, and change control aspects of the project
  • Analysed business requirements and specifications
  • Coded online and batch programs using COBOL, VME COBOL, CICS, DB2, VSAM, JCL, PWB, ALTADATA, and ITS
  • Prepared test cases and performed unit and regression testing
  • Responsible for effective communication between the project team and the customer
  • Translated customer requirements into formal requirements and design documents
  • Established quality procedures for the team; monitored and audited to ensure the team met its quality goals
  • Performed the role of team lead: managed work allocation, mentored team members, ensured coordination among them, and gained the confidence of the onsite SMEs, leads, and business
  • Participated actively in team and customer meetings and ensured coordination between the onsite and offshore teams
  • Ensured that commitments made to the client were kept and that the quality of deliverables was met
  • Mentored the team in technical and business areas and helped them resolve related issues
  • Reviewed work status and assisted the team in all phases of the software engineering cycle as required

Environment: IBM - Mainframes - COBOL, VME COBOL, CICS, DB2, VSAM, JCL, PWB, ALTADATA, ITS.

Confidential

Trainee programmer

Key Responsibilities

  • Analysed existing modules to be developed or enhanced
  • Coded new modules
  • Modified code in existing programs
  • Prepared unit test reports for the unit testing to be done
  • Performed unit testing and logged unit test reports for various conditions
  • Supported system integration testing and user acceptance testing
  • Prepared quality-related documents throughout the SDLC process

Environment: PHP, MySQL, HTML, JavaScript.
