Big Data Analyst Resume

Phoenix, AZ

SUMMARY:

  • 7 years of IT experience overall: 4 years in Mainframe CICS, PL/1, COBOL-DB2, IMS DB and batch process design and development, and 3 years in analysis, design, development and implementation of large-scale applications in Big Data Hadoop environments using technologies such as Spark, MapReduce, Scala, Cassandra, Hive, Pig, Sqoop, Oozie, HBase, Zookeeper and HDFS
  • Proficient in Hadoop architecture and Spark Architecture with various components of Hadoop 1.X and 2.X such as HDFS, Job Tracker, Task Tracker, Data Node, Name Node and YARN concepts such as Resource Manager, Node Manager
  • Strong experience in writing Spark and MapReduce programs, HiveQL and Pig Latin scripts, with a good understanding of MapReduce design patterns and data analysis using Hive and Pig
  • Strong knowledge of Apache Spark, Hive and Pig's analytical functions, extending Spark, Hive and Pig functionality by writing custom UDFs
  • Great knowledge of working with Apache Spark Streaming API on Big Data Distributions in an active cluster environment
  • Proficient in importing and exporting data from Relational Database Systems to HDFS and vice versa, using Sqoop
  • Experience in using Apache Flume for collecting, aggregating and moving large amounts of data from application servers
  • Experience with installation, backup, recovery, configuration and development on multiple Hadoop distribution platforms such as Hortonworks Data Platform (HDP) and Cloudera Distribution for Hadoop (CDH)
  • Able to quickly master new concepts and capable of working independently as well as in teams
  • Strong client communication skills, working with both technology and business users throughout the delivery life cycle
  • Proficiency in managing and co-ordinating with the interfacing systems to implement the business solutions
  • Good knowledge of change request and incident management, with experience in ServiceNow and Infoman tools
  • Sound knowledge of File Manager, File-Aid, Changeman, ServiceNow, Debugger, Abend-Aid, OPCA scheduler, SAVERS and SART
  • Strong hands-on technical knowledge of VSAM (access method), JCL, Syncsort utility, Sort utility, IDCAMS utility and other IBM utilities
  • Good hands-on experience in online screen programs using Assembler (MAPS)
  • Strong experience in working on CICS, DB2, QMF, stored procedures and SPUFI
  • Good experience in NDM (Connect:Direct), MQ and REXX
  • Strong expertise in Rally Agile management tool
  • Hands-on experience in communicating with all stakeholders (business team, database management team, software development team, architects) and users to provide the best solutions while maintaining best standards of practice
  • Good knowledge in managing the funding needs of the project with experience in Clarity tool and Risk management
  • In-depth experience in the entire solution delivery life cycle under Agile and Waterfall models

TECHNICAL SKILLS:

Database: MySQL, Oracle 11g, DB2, IMS-DB, MS-SQL Server, HBase, Cassandra, MongoDB

Languages: SQL, HiveQL, Pig Latin, Spark SQL, Scala, Java, JavaScript, Python, COBOL, PL/1, JCL, REXX

Big Data Technologies: Hadoop (Hortonworks, Cloudera), HDFS, YARN, MapReduce, Apache Spark, Apache Pig, Apache Hive, Apache HBase, Impala, Sqoop, Cassandra, MongoDB, Spark Streaming, Spark SQL, Spark ML, Flume, Oozie, Hue, Zookeeper, ActiveMQ and Apache Kafka.

Operating Systems: Microsoft Windows, Linux, IBM z/OS.

Other Tools: HP ALM, JIRA, Rally, File Manager, File-Aid, Changeman, OPCA, Control-M, Savers, MQ Series, IMS DB, Intertest, Debugger, Abend-Aid

Framework: Waterfall model and Agile model

PM Tool: Clarity

PROFESSIONAL EXPERIENCE

Confidential, Phoenix, AZ

Big Data Analyst

Responsibilities:

  • Developed Spark code using Scala and Spark SQL/Streaming for faster processing of data.
  • Prepared Spark builds from source code and ran Pig scripts on Spark rather than as MapReduce jobs for better performance.
  • Used Sqoop to import data from MySQL into HDFS on a regular basis.
  • Developed scripts and batch jobs to schedule various Hadoop programs.
  • Used the Spark API over Hortonworks Hadoop YARN to perform analytics on data in Hive.
  • Wrote Hive queries for data analysis to meet the business requirements.
  • Developed Kafka producers and consumers for message handling.
  • Explored Spark to improve the performance of, and optimize, existing Hadoop MapReduce algorithms using SparkContext, Spark SQL, DataFrames, pair RDDs and Spark on YARN.
  • Deployed MapReduce and Spark jobs on Amazon Elastic MapReduce using datasets stored on S3.
  • Held weekly meetings with technical collaborators and participated actively in code review sessions with senior and junior developers.
  • Knowledge of designing and deploying Hadoop clusters and various Big Data analytic tools including Hive, HBase, Oozie, Sqoop, Flume, Spark, Impala and Cassandra.
  • Streamed data in real time using Spark with Kafka.
  • Responsible for developing data pipeline using Flume, Sqoop and Pig to extract the data from weblogs and store in HDFS.
  • Experienced with batch processing of data sources using Apache Spark and Elasticsearch.
  • Experienced in implementing Spark RDD transformations and actions for business analysis.
  • Implemented Spark using Scala and Spark SQL for faster testing and processing of data.
  • Loaded data into the cluster from dynamically generated files using Flume and from relational database management systems using Sqoop.
  • Integrated user data from Cassandra with data in HDFS.
  • Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs and Scala.
  • Involved in importing real-time data into Hadoop using Kafka and implemented Oozie jobs for daily imports.
  • Automated the extraction of data from warehouses and weblogs by developing workflows and coordinator jobs in Oozie.
  • Created Hive tables and was involved in data loading and writing Hive UDFs.

Environment: CDH4, CDH5, Scala, Spark, Spark Streaming, Spark SQL, HDFS, AWS, Hive, Pig, Linux, Eclipse, Oozie, Hue, Flume, MapReduce, Apache Kafka, Sqoop, Oracle, Shell Scripting and Cassandra, Hortonworks

Confidential, Phoenix, AZ

Big Data/Hadoop Engineer

Responsibilities:

  • Supported MapReduce programs running on the cluster.
  • Provisioned, installed, configured, monitored and maintained HDFS, YARN, HBase, Flume, Sqoop, Oozie and Hive.
  • Involved in writing shell scripts to export log files to the Hadoop cluster through an automated process.
  • Created Hive tables with dynamic partitioning and buckets for sampling, and worked on them using HiveQL.
  • Used Pig to parse the data and store it in Avro format.
  • Stored data in tabular formats using Hive tables and Hive SerDes.
  • Collected and aggregated large amounts of log data using Apache Flume and staged the data in HDFS for further analysis.
  • Involved in creating UNIX shell scripts for database connectivity and executing queries in parallel job execution.
  • Developed Apache Pig scripts and Hive scripts to process HDFS data.
  • Designed and implemented incremental imports into Hive tables.
  • Involved in unit testing and delivered unit test plans and results documents using JUnit and MRUnit.
  • Involved in collecting, aggregating and moving data from servers to HDFS using Apache Flume.
  • Wrote Hive jobs to parse the logs and structure them in tabular format to facilitate effective querying of the log data.
  • Exported the analyzed data to the relational databases using Sqoop for visualization.
  • Analyzed large and critical datasets using HDFS, HBase, MapReduce, Hive, Hive UDF, Pig, Sqoop, Zookeeper.
  • Developed custom aggregate UDFs in Hive to parse log files.
  • Identified the data required to be pulled into HDFS and created Sqoop scripts, scheduled periodically, to migrate data to the Hadoop environment.
  • Involved in file processing using Pig Latin.
  • Created MapReduce jobs involving combiners and partitioners to deliver better results and worked on application performance optimization for an HDFS cluster.
  • Worked on debugging and performance tuning of Hive and Pig jobs.

Environment: Cloudera, MapReduce, HDFS, Pig Scripts, Hive Scripts, HBase, Sqoop, Zookeeper, Oozie, Oracle, Shell Scripting

Confidential, Chicago

Mainframe Lead

Responsibilities:

  • Involved in impact analysis of the high-level design and changed the design accordingly.
  • Worked with clients to resolve queries.
  • Involved in peer review and query clarification.
  • Worked on setting up the CICS region for SIT and UAT.
  • Worked on SIT and UAT queries and resolved defects.

Environment: DB2, PL/1 and COBOL programming, CICS Maps, JCL, VSAM

Confidential

Technical Lead

Responsibilities:

  • Extensively involved in replacing IBM utilities used for file-existence checks, empty-file checks and sort functions with alternatives that could be used on Citi mainframes.
  • Involved in peer review of job changes against Citi standards.
  • Worked on Production validation and error fixing.
  • Worked on Production monitoring.

Environment: DB2, PL/1 and COBOL programming, CICS Maps, JCL, VSAM

Confidential

Technical Lead

Responsibilities:

  • Played a key role in end-to-end coding and testing of the new connections and modules.
  • Worked on MQ coding to connect with a new vendor.
  • Responsible for development, support and maintenance of this new enhancement.
  • Prepared business and system process flows and requirements, and developed a Requirements Traceability Matrix to ensure that current project requirements were met and updated in a timely manner.
  • Handled requirement gathering for all the modules developed and provided the solution for seamless integration.
  • Worked on post-implementation activities.

Environment: PL/1 and COBOL programming, CICS, MQ, addition of new files, transaction definition, TDQ and TSQ definition.

Confidential

Technical Lead

Responsibilities:

  • Involved in impact analysis for the high-level design.
  • Responsible for coding and testing of new Confidential modules and worked on creating socket connections to contact Confidential and get the results.
  • Involved in defining programs and transactions in CICS regions.
  • Coded DB2 COBOL modules and worked on socket communication modules to connect to open systems.

Environment: DB2, Mainframe, PL/1, COBOL, Socket connections.

Confidential

Systems Engineer

Responsibilities:

  • Involved in complete code change for this project.
  • Wrote a complex COBOL program to handle new account booking and non-monetary data.
  • Developed code to contact FDR using Message Queuing (MQ).
  • Involved in contacting FDR to confirm the syntax for sending non-monetary data in RPC format (XML).
  • Created screen programs (CICS screens) to connect to FDR for resending failed accounts.
  • Separated the logic to send the main details immediately and the rest of the details the next day to avoid heavy traffic.
  • Created batch programs to collect the errored-out applications.
  • Identified, organized, and documented the changing requirements of the project.
  • Involved in design and development of user management module.
  • Performed extensive manual testing.
  • Handled requirement gathering for all the modules developed and provided the solution for seamless integration.

Environment: Mainframe, PL/1, COBOL, CICS, MQ, TDQ, TSQ definitions, JCL and VSAM file definition

Confidential

Systems Engineer

Responsibilities:

  • Involved in code enhancement.
  • Developed CICS mappings for the online Suffix field and was involved in coding and unit testing.
  • Coded the main logic to find the Suffix values.
  • Peer review.
  • Maintenance & support.
  • Troubleshooting and Bug fixing.

Environment: Mainframe, PL/1, COBOL, CICS.

Confidential

Systems Engineer

Responsibilities:

  • Verified the detailed design documents for development releases and performed the impact analysis.
  • Prepared delivery plan and estimates.
  • Coded important modules to calculate Capacity to Pay and was involved in DB2 COBOL and stored procedures.
  • Work involved various CICS screen changes.
  • Was involved in coding and unit testing.
  • Was involved in peer review.
  • Took ownership of the project, coordinated with the onshore team and managed the project.

Environment: Mainframe, CICS, DB2, OPCA, File Manager, Abend-Aid, PL/1 and COBOL.