
Sr. Big Data Developer Resume


NJ

EXPERIENCE SUMMARY:

  • Overall experience of 14 years in software design, development and deployment, with technical skills spanning Big Data technologies and the IBM Mainframe z/OS environment.
  • Excellent requirement-gathering skills, aided by a deep understanding of the value to the Business.
  • Extensive understanding of end-to-end Supply Chain modules and their inter-dependencies in the Manufacturing Domain.
  • Expertise in Big Data Architecture planning and design, application development, deployment and migration of traditional Data Warehouse solutions to Hadoop based Integrated Data Lakes and Enterprise Data Hub (EDH)
  • 6 years of experience in Hadoop ecosystem technologies like Sqoop, Hive, Spark, Oozie, ZooKeeper and Kafka for incremental and real-time data ingestion from varied sources.
  • Experience with querying languages like SQL and HiveQL and programming languages like Scala and Python.
  • Experience in working with Spark modules like Spark SQL, Spark Streaming and knowledge of Spark MLLib and GraphX modules.
  • Experience in working with relational databases like DB2, Oracle and NoSQL databases like HBase, MongoDB.
  • Experience in working with UNIX and Linux file systems and in Bash scripting for workflow control and data manipulation tasks.
  • Experience in Performance Tuning and Query Optimization of SQL queries.
  • Experienced in using build tools like SBT, Maven etc. for Continuous integration of application code.
  • Experience in development and maintenance of Legacy Mainframe based applications using COBOL, JCL, PL/I, IMS, DB2 and in dealing with VSAM Datasets.
  • Experience in Legacy Migration projects in handling various activities like Data Migration Planning/Execution, Process Documentation, and Solution Gap Analysis.
  • Performed multiple roles including Onsite Lead, Technical SME, and Project Coordinator. Responsibilities included code reviews, delegating daily offshore tasks, leading performance reviews, mentoring junior developers, and acting as a strong liaison between Business and IT teams.
  • Good presentation, summarization, structuring and refinement skills. Proven track record of execution capabilities for strategic as well as tactical plans, both within and across functional boundaries.
  • Good team player, strong interpersonal and communication skills combined with self-motivation, initiative and the ability to think outside the box.

TECHNICAL SKILLS:

Big Data Ecosystems: Cloudera Hadoop 4.3, Hortonworks (HDP 2.5), HDFS, MapReduce, YARN, Hive, Sqoop, Spark (Spark SQL, Spark Streaming), Kafka

Programming Languages: Scala, Python, Bash Scripting, COBOL, PL/I, EASYTRIEVE, JCL

Querying Languages: SQL, HiveQL

File Systems / File Formats: HDFS, Linux, XML, EDI files, JSON, Parquet, EBCDIC

Databases: Hive, DB2, Oracle, NoSQL (HBase, MongoDB), IMS DB/DC

Schedulers: Oozie, Autosys, CA-7

Mainframe Env. Tools: XPEDITOR, FILEAID for IMS, FILE MANAGER, SPUFI, QMF, IBM DEBUGGER

Service Delivery Mgmt: ServiceNow, Remedy

ORGANIZATIONAL EXPERIENCE:

Confidential, NJ

Sr. BigData Developer

Responsibilities:

  • Designing and developing real-time data ingestion into the Spark environment on HDP 2.5 cluster using Kafka message broker.
  • Configuring various topics on the Kafka server to handle transactions flowing from multiple ERP systems.
  • Developed Sqoop jobs to collect master data from Oracle tables to be stored in Hive tables using the Parquet file format.
  • Used shell scripting in Linux to configure the Sqoop and Hive tasks required for the data pipeline flow.
  • Developed data transformation modules in Python language to convert the JSON format files into Spark DataFrames to handle data from Legacy ERP systems.
  • Developing scripts using the Spark Streaming API and Scala for data transformation using Spark DataFrames (a sketch follows this list).
  • Used HiveContext on Spark to store the data in Hive tables on HDFS for ad hoc data analysis and reporting.
  • Implemented SQL-style joins between Spark DataFrames and worked with Spark RDDs to merge the transactional and master files for preparing the final data loads onto MongoDB.
  • Designed and developed various KPI driven data marts on MongoDB and loaded data from HDFS to MongoDB.
  • Monitored user queries on the data and ran ad hoc queries on MongoDB to answer data-related questions.
  • Performed advanced procedures like text analytics using Python language to generate buyer recommendations for Confidential products.
  • Involved in the end-to-end development lifecycle from requirement gathering through UAT and deployment.
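
A minimal sketch of the real-time ingestion described above, assuming Spark 1.6 with the Kafka direct stream API and a HiveContext; the broker list, topic and Hive table names are hypothetical placeholders, not the actual project configuration:

    import kafka.serializer.StringDecoder
    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka.KafkaUtils
    import org.apache.spark.sql.hive.HiveContext

    object ErpStreamIngest {
      def main(args: Array[String]): Unit = {
        val sc   = new SparkContext(new SparkConf().setAppName("ErpStreamIngest"))
        val ssc  = new StreamingContext(sc, Seconds(30))
        val hive = new HiveContext(sc)

        // Hypothetical broker list and topic carrying ERP transactions as JSON messages
        val kafkaParams = Map("metadata.broker.list" -> "broker1:9092,broker2:9092")
        val topics      = Set("erp_transactions")

        val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
          ssc, kafkaParams, topics)

        stream.map(_._2).foreachRDD { rdd =>
          if (!rdd.isEmpty()) {
            // Convert the JSON payloads into a DataFrame and append them to a Hive staging table
            val df = hive.read.json(rdd)
            df.write.mode("append").saveAsTable("staging.erp_transactions")
          }
        }

        ssc.start()
        ssc.awaitTermination()
      }
    }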

Environment: HDP 2.5, HDFS, Kafka 0.10.0, Spark 1.6 (Spark Streaming, Spark SQL), Hive 1.2.1, Sqoop 1.4.6, Scala, Python, Shell Scripting, MongoDB 3.4

Confidential, NJ

Sr. BigData Developer

Responsibilities:

  • Understanding and documenting the end-to-end scope and acceptance criteria for the POC.
  • Designing data ingestion of real-time inputs from the systems in the connected flight ecosystem using the Kafka message broker, processed by the Spark Streaming API and loaded into Hive tables on HDFS and data marts on MongoDB.
  • Configuring the various data streams onto different topics on Kafka server and developing Spark scripts to convert these data streams into Spark Dataframes.
  • Developing the transformation logic in Scala to join the Spark Dataframes per the requirements and using Spark RDDs for aggregation/summarization of data.
  • Performed data profiling and transformation on the raw sensor data feeds using Python modules.
  • Developing Hive functions and queries using joins for massaging the data for ad hoc reporting in HDFS.
  • Designing and creating the final data marts on the NoSQL MongoDB database and loading the data using the Spark API (see the sketch after this list).
  • Taking part in Test plan preparation and working closely with Business in validating the test results during UAT sessions.
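
A minimal sketch of loading a MongoDB data mart from joined DataFrames, assuming the MongoDB Spark connector is available on the classpath; the Hive table names, MongoDB URI, database and collection are hypothetical placeholders:

    import com.mongodb.spark.MongoSpark
    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.hive.HiveContext

    object FlightMartLoad {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf()
          .setAppName("FlightMartLoad")
          // Hypothetical MongoDB output URI used as the connector's default write configuration
          .set("spark.mongodb.output.uri", "mongodb://mongo-host:27017/flightdb.sensor_mart")
        val sc   = new SparkContext(conf)
        val hive = new HiveContext(sc)

        // Hypothetical Hive tables holding streamed sensor readings and flight master data
        val sensors = hive.table("staging.sensor_readings")
        val flights = hive.table("master.flight_info")

        // Join the streamed readings with master data and summarize per flight
        val mart = sensors.join(flights, Seq("flight_id"))
          .groupBy("flight_id", "tail_number")
          .avg("engine_temp", "fuel_flow")

        // Persist the aggregated result as a MongoDB collection
        MongoSpark.save(mart.write.mode("overwrite"))
      }
    }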

Environment: HDP 2.3, HDFS, Kafka 0.10.0, Spark Streaming, Spark SQL, Hive 1.2.1, Scala, Python, MongoDB 3.4

Confidential, Auburn Hills, MI

Sr. BigData Developer

Responsibilities:

  • Designing and implementing an end-to-end near real-time data pipeline by transferring data from DB2 tables into Hive on HDFS using Sqoop.
  • Accepting and processing the data on border-crossing feeds in various formats like CSV, flat file and XML.
  • Experience in handling VSAM files on the mainframe and transforming them to a different code page before moving them to HDFS using SFTP.
  • Designing and developing Scala scripts to perform data transformation and aggregation through RDDs on Spark (a sketch follows this list).
  • Designing and implementing the data lake on the NoSQL database HBase with denormalized tables suited to feed the downstream reporting applications.
  • Developing Hive Functions and Queries for massaging data before loading the Hive Tables.
  • Involved in Hive performance tuning by changing the Join strategies and by implementing Indexing, Partitioning and bucketing on the transactional data.
  • Extensively worked with Avro and Parquet file formats while storing data on HDFS to be accessed through Hive.
  • Extensively used the Oozie workflow scheduler to automate the Hadoop jobs by creating a Directed Acyclic Graph (DAG) of actions with the necessary flow controls while managing dependencies.
  • Continuously monitored and managed the Hadoop Cluster using Cloudera Manager.
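
A minimal sketch of the RDD-style aggregation referenced above, assuming delimited border-crossing feed files already landed on HDFS; the paths, field layout and output location are hypothetical placeholders:

    import org.apache.spark.{SparkConf, SparkContext}

    object CrossingAggregate {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("CrossingAggregate"))

        // Hypothetical pipe-delimited feed layout: crossing_point|carrier|shipment_id|weight
        val feed = sc.textFile("hdfs:///data/border_crossings/incoming/*")

        // Parse each record and key it by (crossing point, carrier); assumes a numeric weight field
        val parsed = feed.map(_.split('|'))
          .filter(_.length >= 4)
          .map(f => ((f(0), f(1)), (1L, f(3).toDouble)))

        // Aggregate shipment counts and total weight per key
        val summary = parsed.reduceByKey { case ((c1, w1), (c2, w2)) => (c1 + c2, w1 + w2) }

        // Persist the summarized output for the downstream Hive/HBase loads
        summary
          .map { case ((point, carrier), (cnt, wt)) => s"$point,$carrier,$cnt,$wt" }
          .saveAsTextFile("hdfs:///data/border_crossings/summary")

        sc.stop()
      }
    }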

Environment: CDH4.3, HDFS, Hive, Sqoop, Scala, Spark Core, DB2, HBase

Confidential, Auburn Hills, MI

BigData Developer

Responsibilities:

  • Importing data from various DB systems like DB2, Oracle into HDFS using Sqoop1.
  • Accepting and processing the material movement feeds in various formats like CSV, XML and flat files with fixed length format.
  • Developed shell scripts to transform the file feeds before processing them into the data mart.
  • Developed Hive (v0.10.0) scripts and UDFs to transform and load the transportation feeds into Hive staging tables (a UDF sketch follows this list).
  • Developed Hive scripts to validate the data feeds on HDFS and capture invalid transactions.
  • Performed data transformation by joining master and transaction Hive tables while performing incremental loads to the data marts on Hive.
  • Involved in Hive performance tuning by changing the Join strategies and by implementing Indexing, Partitioning and bucketing on the transactional data.
  • Developed Sqoop jobs to export data back to DB2 tables for downstream front-end applications.
  • Experience in handling VSAM files on the mainframe and transforming them to a different code page before moving them to HDFS using SFTP.
  • Used crontab and Oozie workflows to automate the data feed processing from the various sources and the incremental master data loads from the DB2 tables.
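
A minimal sketch of a simple Hive UDF of the kind mentioned above, written in Scala against the classic org.apache.hadoop.hive.ql.exec.UDF reflection API available in Hive 0.10; the UDF name and the normalization rule are hypothetical:

    import org.apache.hadoop.hive.ql.exec.UDF
    import org.apache.hadoop.io.Text

    // Hypothetical UDF that normalizes free-form plant codes in the transportation feeds
    class NormalizePlantCode extends UDF {
      def evaluate(input: Text): Text = {
        if (input == null) null
        else new Text(input.toString.trim.toUpperCase)
      }
    }

Once packaged into a jar, such a class would be added to the Hive session and registered with CREATE TEMPORARY FUNCTION before being used in the staging-load queries.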

Environment: CDH4, HDFS, Hive, Sqoop, DB2, HBase, Shell scripting, Oozie

Confidential, Auburn Hills, MI

Mainframe Developer

Responsibilities:

  • Understanding the Application functionality and inter-dependencies between various supply chain modules with the Logistics modules.
  • Participated in requirement discussions for upgrades on the existing application components and in new enhancements and involved in documentation of requirements with emphasis on Scope and Assumptions.
  • Designing and developing workflows for interfaces through application components in COBOL and JCL.
  • Developing reports using EASYTRIEVE.
  • Developing cursor-based processing in COBOL to access data from DB2 tables and developing stored procedures to execute the business logic on DB2 data.
  • Designing DB2 stored procedures on the Mainframe using COBOL to communicate with the front-end Java screens.
  • Performed data optimization tasks like predicate optimization and defining indexes on the existing DB2 tables to achieve performance improvements within the existing technology stack.
  • Designed and developed interactive querying screens using IMS DC to fetch data from IMS DB.
  • Interacted with Business users, resolved data-related queries and presented ad hoc reports with excellent SQL querying skills.
  • Designed and developed message queuing interfaces using MQ to bring vehicle status updates from Production Control systems into the Transportation module.
  • Performed team lead activities including reviewing the deliverables from the team, work pipeline planning, and liaising with project management through regular updates.
  • Experience with Mainframe related tools like XPEDITOR, FILEAID for IMS, FILE MANAGER, SPUFI and QMF.
  • Experience in working with Job schedulers like CA7, Autosys and debugging tools like IBM DEBUGGER.
  • Experience in Service Delivery management through ServiceNow tool and in configuration management tools like PANVALET.

Environment: IBM z/OS 1.8, COBOL, JCL, DB2, IMS DB/DC, EASYTRIEVE, CA7, PANVALET

Confidential, Columbus, Indiana

Mainframe Developer

Responsibilities:

  • Preparing Business/Functional Requirement documents and Design specifications for the Legacy components.
  • Developed application components using PL/I, COBOL and JCL with the IMS and DB2 databases to fulfill the application requirements.
  • Experience in database optimization and in developing stored procedures and cursors.
  • Developed Mainframe interactive screens using IMS DC that handles data from the IMS DB.
  • Experienced in supporting the Migration of the Legacy systems to Oracle 11i Solution.
  • Generated Business process documentation of existing Legacy systems and involved in Gap Analysis tasks for the proposed solution.
  • Prepared Data Migration templates for various Functional data streams to support the ERP Migration and executed the Data Migration activity liaising with the various Stakeholders.
  • Re-engineered the data flow interfaces as necessary for the Legacy Migration.
  • Interacted closely with project managers and users for the applications on a periodic basis to assess whether their expectations were being met satisfactorily and to address any concerns.
  • Experience in working with interfaces based on EDI, Manufacturing Execution System (MES), WebSphere MQ.
  • Experience with Mainframe related tools like XPEDITOR, FILEAID, FILE MANAGER, SPUFI and QMF.
  • Experience in working with Job schedulers like CA7, Autosys and debugging tools like IBM DEBUGGER.
  • Experience in Service Delivery management through Remedy tool and in configuration management tools like NDVR, PANVALET.

Environment: IBM z/OS, COBOL, PL/I, JCL, DB2, IMS DB/DC, CA7, NDVR, PANVALET
