Sr. Hadoop Developer Resume

Dearborn, MI

OBJECTIVE:

8 years of IT experience in all phases of the Software Development Life Cycle, including Analysis, Design, Development, Maintenance and Testing of software applications using Big Data, Search and Mainframe technologies.

PROFESSIONAL SUMMARY:

  • Strong experience with and understanding of the Hadoop Distributed File System (HDFS) and the Hadoop ecosystem (MapReduce, Pig, Hive, Oozie, Sqoop, HBase and ZooKeeper).
  • Proficient in design, configuration, indexing, querying and maintenance of Apache Solr.
  • Experience with distributed data-processing frameworks like Spark Core and Spark SQL on Scala (see the sketch after this list).
  • Expertise in job workflow scheduling with Oozie and cluster coordination with ZooKeeper.
  • Experience in writing User Defined Functions (UDFs) for Hive and Pig.
  • Worked with different file formats like Parquet, RCFile, ORC, SequenceFile, XML and JSON.
  • Experience in analyzing data with Hadoop frameworks like MapReduce, HiveQL, Pig and Spark.
  • Experience in analyzing large log files and in indexing and visualizing data in Splunk.
  • Experience in writing shell scripts on UNIX/Linux operating systems as per requirements.
  • Experience in using Kerberos for authenticating end users with Hadoop in secure mode.
  • Strong database background with DB2 and VSAM.
  • Strong experience with and knowledge of the ITIL service management framework.
  • Worked in all aspects of the Software Development Life Cycle (Analysis, System Design, Development, Testing and Maintenance) using Waterfall and Agile (Scrum) methodologies.
  • Good knowledge of the Insurance and Manufacturing domains.
  • Strong communication and analytical skills with very good experience in programming and problem solving.
  • Ability to learn and adapt quickly and to correctly apply new tools and technology.
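
As a minimal illustration of the Spark SQL on Scala experience above (a sketch only: the application name, input path, column names and output table are hypothetical, not taken from an actual engagement):

```scala
import org.apache.spark.sql.SparkSession

object SalesByRegion {
  def main(args: Array[String]): Unit = {
    // Hive support lets the module read and write tables in the Hive metastore.
    val spark = SparkSession.builder()
      .appName("SalesByRegion")
      .enableHiveSupport()
      .getOrCreate()

    // Load a Parquet extract (hypothetical path and columns).
    spark.read.parquet("/data/lake/sales").createOrReplaceTempView("sales")

    // Express the analytics step in plain SQL.
    val byRegion = spark.sql(
      """SELECT region, SUM(amount) AS total_amount
        |FROM sales
        |GROUP BY region""".stripMargin)

    // Persist the result as a Hive table for downstream users.
    byRegion.write.mode("overwrite").saveAsTable("reports.sales_by_region")
    spark.stop()
  }
}
```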

TECHNICAL SKILLS:

Programming Languages: Java, Scala, COBOL.

Hadoop / Big Data Stack: Hadoop, HDFS, YARN, MapReduce, Pig, Hive, Spark, Spark SQL, Oozie, ZooKeeper, HBase, Sqoop.

Search Engine: Apache Solr

Databases: DB2, VSAM

Data Analytics Tool: Splunk

Operating Systems: Windows, Linux, UNIX and CentOS.

Version Control Systems: AccuRev, Git, CMAN.

PROFESSIONAL EXPERIENCE:

Confidential, Dearborn, MI

Sr. Hadoop Developer

Responsibilities:

  • Implemented a Data Lake in Hive, providing a centralized platform for users to rapidly query, join and analyze data from multiple sources in one place.
  • Implemented free-text search engines using Apache Solr collections (see the SolrJ sketch after this list).
  • Designed, configured, indexed, queried and maintained Solr collections.
  • Developed complex Spark SQL (Scala) modules to implement data analytics over heavy volumes of data.
  • Piloted migration of existing Hive/MapReduce data transformation processes to Spark Core/Spark SQL to study performance and feasibility.
  • Developed Hive UDF libraries used both in data transformation and by business analysts for Hive querying (a UDF sketch follows this list).
  • Worked with the Kerberos authentication system.
  • Ingested, deduplicated and processed machine-generated raw data with Spark frameworks on Scala (see the dedup sketch after this list).
  • Handled varying data formats like JSON, XML, ORC, Parquet and SequenceFile.
  • Implemented an end-to-end data transformation process in Oozie, including data collection, decoding, transformation, dedup and storage, using Hadoop frameworks like MapReduce, Hive and Pig.
  • Created shell scripts to automate routine maintenance tasks like data copy, transfer and cleansing.
  • Automated data transfers using Oozie workflow and coordinator jobs, scheduling both time-dependent and file-dependent jobs.
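
A minimal sketch of the Solr indexing and querying work above, using the standard SolrJ client; the Solr URL, collection name and field names are hypothetical:

```scala
import org.apache.solr.client.solrj.SolrQuery
import org.apache.solr.client.solrj.impl.HttpSolrClient
import org.apache.solr.common.SolrInputDocument

object SolrFreeTextSearch {
  def main(args: Array[String]): Unit = {
    // SolrJ client pointed at one collection (URL and collection are hypothetical).
    val client = new HttpSolrClient.Builder("http://solrhost:8983/solr/parts").build()

    // Index a document into the collection.
    val doc = new SolrInputDocument()
    doc.addField("id", "PART-1001")
    doc.addField("description", "front brake rotor assembly")
    client.add(doc)
    client.commit()

    // Free-text query against the indexed field.
    val query = new SolrQuery("description:brake")
    query.setRows(10)
    val results = client.query(query).getResults
    for (i <- 0 until results.size) println(results.get(i).getFieldValue("id"))

    client.close()
  }
}
```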
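A sketch of the kind of function a Hive UDF library might contain; the class name and the normalization rule (" ab-123 " becomes "AB123") are hypothetical:

```scala
import org.apache.hadoop.hive.ql.exec.UDF
import org.apache.hadoop.io.Text

// Hive resolves the evaluate() method by reflection at query time.
class NormalizeCode extends UDF {
  def evaluate(input: Text): Text = {
    if (input == null) null
    else new Text(input.toString.trim.toUpperCase.replace("-", ""))
  }
}
```

Packaged into a jar, such a function would be registered in a Hive session with ADD JAR followed by CREATE TEMPORARY FUNCTION normalize_code AS 'NormalizeCode'.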
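And a sketch of deduplicating machine-generated raw data with Spark on Scala, as in the ingestion bullet above; the paths and the event_id column are hypothetical:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

object DedupEvents {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("DedupEvents").getOrCreate()

    // Machine-generated raw events (hypothetical path and columns).
    val raw = spark.read.json("/data/raw/events")

    // Drop records with no event id, then keep one row per event id.
    val deduped = raw
      .filter(col("event_id").isNotNull)
      .dropDuplicates("event_id")

    deduped.write.mode("overwrite").parquet("/data/clean/events")
    spark.stop()
  }
}
```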

Environment: Hadoop, HDFS, Solr, Spark, Pig, Hive, Oozie, Scala, Java, UNIX, AccuRev, Git, Eclipse.

Confidential, Scranton, PA

Hadoop Developer

Responsibilities:

  • Implemented a centralized data lake in Hive by copying data from external environments like mainframes, enabling the BI team to create dashboards for business use.
  • Developed Splunk dashboards by ingesting and indexing application logs and visually presenting the data.
  • Implemented JCL jobs to copy data from mainframe data stores like VSAM and DB2 to HDFS.
  • Experience in handling semi-structured and structured data in HDFS.
  • Worked with compression codecs like Snappy, storage formats like ORC and SequenceFile, and data formats like XML and JSON.
  • Piloted the process of moving data from external sources into HDFS and visually presenting the required data to users in QlikView dashboards.
  • Created custom Pig loaders to ingest semi-structured application logs and transform them into meaningful data.
  • Implemented data decoding, transformation and filtration processes using MapReduce and Hive.
  • Developed complete Oozie workflows to ingest data from external sources, then transform, filter, aggregate and store it in Hive. Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting (see the HiveQL sketch after this list).
  • Worked on various performance optimizations, such as using the distributed cache for small datasets and partitioning, bucketing and map-side joins in Hive.
  • Worked on MapReduce performance optimization using partitioners, counters, sorting and custom comparators (a counter example follows this list).
  • Developed Hive scripts in HiveQL to de-normalize and aggregate the data.
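
A sketch of the partitioned, bucketed Hive table pattern behind the reporting metrics above, issuing HiveQL over the standard Hive JDBC driver; the connection URL, database, table and column names are hypothetical:

```scala
import java.sql.DriverManager

object ClaimsMart {
  def main(args: Array[String]): Unit = {
    // HiveServer2 JDBC connection (URL, database and credentials are hypothetical).
    Class.forName("org.apache.hive.jdbc.HiveDriver")
    val conn = DriverManager.getConnection("jdbc:hive2://hiveserver:10000/mart", "etl", "")
    val stmt = conn.createStatement()

    // Partition by load date and bucket by policy_id so that equi-joins
    // on policy_id qualify for bucketed map-side joins.
    stmt.execute(
      """CREATE TABLE IF NOT EXISTS claims (
        |  policy_id STRING,
        |  claim_amount DOUBLE
        |)
        |PARTITIONED BY (load_date STRING)
        |CLUSTERED BY (policy_id) INTO 32 BUCKETS
        |STORED AS ORC""".stripMargin)

    // Let Hive convert eligible joins to map-side joins for this session.
    stmt.execute("SET hive.auto.convert.join=true")

    // De-normalized daily aggregate over one partition, for reporting.
    val rs = stmt.executeQuery(
      """SELECT policy_id, SUM(claim_amount) AS total_paid
        |FROM claims
        |WHERE load_date = '2016-01-01'
        |GROUP BY policy_id""".stripMargin)
    while (rs.next()) println(s"${rs.getString(1)}\t${rs.getDouble(2)}")

    rs.close(); stmt.close(); conn.close()
  }
}
```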
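And a sketch of the counter-based MapReduce pattern noted above, shown in Scala for consistency with the other examples here (this engagement's environment was Java); the input layout ("host" TAB "message") is hypothetical:

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import org.apache.hadoop.io.{IntWritable, LongWritable, Text}
import org.apache.hadoop.mapreduce.{Job, Mapper, Reducer}
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat

// Counts events per host; a custom counter tracks malformed lines so data
// quality stays visible in the job counters.
class HostMapper extends Mapper[LongWritable, Text, Text, IntWritable] {
  private val one = new IntWritable(1)
  private val host = new Text()

  override def map(key: LongWritable, value: Text,
                   ctx: Mapper[LongWritable, Text, Text, IntWritable]#Context): Unit = {
    val fields = value.toString.split("\t", 2)
    if (fields.length == 2) {
      host.set(fields(0))
      ctx.write(host, one)
    } else {
      ctx.getCounter("quality", "malformed_lines").increment(1L)
    }
  }
}

class SumReducer extends Reducer[Text, IntWritable, Text, IntWritable] {
  override def reduce(key: Text, values: java.lang.Iterable[IntWritable],
                      ctx: Reducer[Text, IntWritable, Text, IntWritable]#Context): Unit = {
    var sum = 0
    val it = values.iterator()
    while (it.hasNext) sum += it.next().get()
    ctx.write(key, new IntWritable(sum))
  }
}

object HostCounts {
  def main(args: Array[String]): Unit = {
    val job = Job.getInstance(new Configuration(), "host-counts")
    job.setJarByClass(classOf[HostMapper])
    job.setMapperClass(classOf[HostMapper])
    job.setCombinerClass(classOf[SumReducer]) // map-side pre-aggregation
    job.setReducerClass(classOf[SumReducer])
    job.setOutputKeyClass(classOf[Text])
    job.setOutputValueClass(classOf[IntWritable])
    FileInputFormat.addInputPath(job, new Path(args(0)))
    FileOutputFormat.setOutputPath(job, new Path(args(1)))
    System.exit(if (job.waitForCompletion(true)) 0 else 1)
  }
}
```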

Environment: Hadoop, MapReduce, HDFS, Splunk, Hive, Pig, Java, Oozie, DB2, VSAM, UNIX, Eclipse.

Confidential

Mainframe Developer & Incident Lead

Responsibilities:

  • Involved in incident management, problem management and maintenance activities for insurance-domain remittance, client acquisition, payroll processing and policy management systems.
  • Developed and maintained JCL jobs and COBOL and Java programs.
  • Resolved critical and high-priority application outages, job failures and incidents.
  • Worked with the Maestro and OPC job schedulers.
  • Experience working with file-based data stores like VSAM and with SQL-based DB2.
  • Served in service management lead roles for incident and problem management.
  • Involved in software system analysis and application maintenance.
  • Designed charts and metrics depicting current work progress, used by senior management for decision making.

Environment: Mainframe, JCL, COBOL, Java, TSO/ISPF, File-Aid, Xpeditor, CMAN.
