Hadoop/Spark Developer Resume
Irving, TX
SUMMARY
- 9+ years of experience with Hadoop and mainframe operations, including work on enterprise applications using Hadoop components such as HDFS, MapReduce (YARN), Sqoop, Pig, Hive, HBase, Oozie, Apache Kafka, Spark with Scala, Cassandra, and Tableau.
- Strong working experience in mainframe production support environments.
- 4+ years of dedicated experience with Hadoop and its components, including HDFS, MapReduce (YARN), Apache Pig, Hive, Sqoop, HBase, Oozie, and Cassandra.
- Expertise in importing and exporting data with Sqoop between HDFS and relational database systems.
- Good working knowledge of Apache Hive and Pig.
- Expertise in creating Hive tables and writing Hive queries using HiveQL.
- Strong exposure to performance tuning and optimization techniques in Hive.
- Good working knowledge of Oozie for job scheduling.
- Good hands-on experience with Apache Spark and Scala.
- Involved in writing Pig scripts to reduce job execution time.
- Used HBase together with Pig and Hive as required for real-time, low-latency queries.
- Good understanding of Hadoop architecture and its components, including HDFS, JobTracker, TaskTracker, NameNode, DataNode, and the MapReduce programming paradigm.
- Good understanding of HDFS design, daemons, and HDFS high availability (HA).
- Experience in installing, configuring, supporting, and managing the Cloudera Hadoop platform, including CDH4 and CDH5 clusters.
- Good knowledge of NoSQL databases such as HBase and Cassandra.
- Experience in preparing reports and dashboards using data visualization tools such as Tableau.
- Expertise in troubleshooting and bug reporting using defect-tracking tools.
TECHNICAL SKILLS:
Programming Languages: Core Java, Scala
Hadoop/Big Data: HDFS, MapReduce, YARN, Hive, Pig, HBase, Sqoop, Flume, Oozie, ZooKeeper, Apache Spark, Kafka
Mainframe Operations Tools: CA7, JobTrac, TWS-OPC, Control-M, Netcool.
Presentation Tools: WebEx, Skype, TeamViewer
Databases: Oracle 10g, DB2, NoSQL (HBase, Cassandra)
IDE Tools: Eclipse 3.x, NetBeans 5.0/5.5
Documentation Tools: MS Word, MS Visio.
Visualization tools: Tableau
Operating Systems: Windows XP/2000/NT/98/95, UNIX, Linux
PROFESSIONAL EXPERIENCE
Hadoop/Spark Developer
Confidential, Irving, TX
Responsibilities:
- Developed and executed shell scripts to automate jobs.
- Wrote complex Hive queries.
- Worked on reading multiple data formats from HDFS using Scala.
- Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs and Scala (see the Spark sketch after this list).
- Developed multiple POCs using Scala, deployed them on the YARN cluster, and compared the performance of Spark with Hive and SQL.
- Analyzed SQL scripts and designed solutions for implementation in Scala.
- Involved in loading data from the UNIX file system to HDFS.
- Extracted data from databases into HDFS using Sqoop.
- Handled importing of data from various data sources, performed transformations using Hive and Spark, and loaded data into HDFS.
- Managed and reviewed Hadoop log files.
- Involved in the analysis, design, and testing phases and responsible for documenting technical specifications.
- Developed Kafka producers and consumers, HBase clients, and Spark and Hadoop MapReduce jobs, along with components on HDFS and Hive.
- Very good understanding of partitioning and bucketing concepts in Hive; designed both managed and external tables in Hive to optimize performance.
- Worked extensively on the Spark Core and Spark SQL modules.
- Involved in importing real-time data into Hadoop using Kafka and implemented an Oozie job for daily imports (see the Kafka sketch below).
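A minimal sketch of the Hive-to-Spark conversion pattern described above, assuming a hypothetical `transactions` Hive table with `category` and `amount` columns (the table and column names are illustrative, not the actual project schema):

```scala
import org.apache.spark.sql.SparkSession

// Minimal sketch: replacing a Hive aggregation query with Spark RDD
// transformations. The `transactions` table and its columns are
// hypothetical placeholders.
object HiveToSpark {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("HiveToSparkSketch")
      .enableHiveSupport() // read Hive tables through the metastore
      .getOrCreate()

    // Original HiveQL:
    //   SELECT category, SUM(amount) FROM transactions GROUP BY category
    val totals = spark.table("transactions").rdd
      .map(r => (r.getAs[String]("category"), r.getAs[Double]("amount")))
      .reduceByKey(_ + _) // same aggregation as the GROUP BY

    totals.take(10).foreach(println)
    spark.stop()
  }
}
```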
Environment: Hadoop, HDFS, Hive, Scala, Spark, SQL, Pig, Tableau
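A minimal sketch of a Kafka producer of the kind described in this role; the broker address, topic name, and payload are placeholder assumptions:

```scala
import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

// Minimal sketch of a Kafka producer feeding real-time data into the
// cluster. Broker address and topic name are hypothetical placeholders.
object EventProducer {
  def main(args: Array[String]): Unit = {
    val props = new Properties()
    props.put("bootstrap.servers", "localhost:9092") // assumed broker
    props.put("key.serializer",
      "org.apache.kafka.common.serialization.StringSerializer")
    props.put("value.serializer",
      "org.apache.kafka.common.serialization.StringSerializer")

    val producer = new KafkaProducer[String, String](props)
    try {
      // Send one event to the (hypothetical) "realtime-events" topic.
      producer.send(new ProducerRecord[String, String](
        "realtime-events", "event-key", "event-payload"))
    } finally {
      producer.close() // flushes pending records before shutdown
    }
  }
}
```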
Hadoop Developer
Confidential - Plano, TX
Responsibilities:
- Installed and configured Hadoop on a cluster.
- Developed multiple MapReduce jobs in Java for data cleaning and pre-processing.
- Developed simple to complex MapReduce jobs using Hive and Pig.
- Extended Hive and Pig core functionality by writing custom UDFs (see the UDF sketch after this section).
- Analyzed large data sets by running Hive queries and Pig scripts.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Experienced in defining job flows using Oozie.
- Experienced in managing and reviewing Hadoop log files.
- Experienced in collecting, aggregating, and moving large amounts of streaming data into HDFS using Flume.
- Loaded and transformed large sets of structured, semi-structured, and unstructured data.
- Responsible for managing data coming from different sources and applications.
- Working knowledge of NoSQL databases such as HBase and Cassandra.
- Good knowledge of analyzing data in HBase using Hive and Pig.
- Involved in unit-level and integration-level testing.
- Prepared design documents and functional documents.
- Based on requirements, added extra nodes to the cluster to keep it scalable.
- Involved in running Hadoop jobs to process millions of records of text data.
- Involved in loading data from the local file system (Linux) to HDFS.
- Experienced in running Hadoop streaming jobs to process terabytes of XML-format data.
- Assisted in exporting analyzed data to relational databases using Sqoop.
- Created and maintained technical documentation for launching Hadoop clusters and for executing Hadoop jobs.
- Submitted a detailed report on daily activities on a weekly basis.
Environment: Hadoop (HDFS), Pig, Sqoop, HBase, Hive, Flume, MapReduce, Cassandra, Oozie, MySQL
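A minimal sketch of the kind of custom Hive UDF mentioned above; the class name and behavior are illustrative (such UDFs are often written in Java, but Scala is used here for consistency and compiles to the same JVM bytecode):

```scala
import org.apache.hadoop.hive.ql.exec.UDF
import org.apache.hadoop.io.Text

// Minimal sketch of a custom Hive UDF: normalizes a string to lower
// case with trimmed whitespace. Name and logic are placeholders, not
// the actual project UDFs.
class NormalizeText extends UDF {
  def evaluate(input: Text): Text = {
    if (input == null) null
    else new Text(input.toString.trim.toLowerCase)
  }
}
// Usage from the Hive CLI (jar and table names assumed):
//   ADD JAR normalize-udf.jar;
//   CREATE TEMPORARY FUNCTION normalize_text AS 'NormalizeText';
//   SELECT normalize_text(name) FROM customers;
```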
System Analyst
Confidential, Richardson, TX
Responsibilities:
- Worked on importing data from various sources and performed transformations using MapReduce and Hive to load data into HDFS.
- Configured Sqoop jobs to import data from RDBMS into HDFS using Oozie workflows.
- Worked on setting up Pig, Hive, and HBase on multiple nodes and developed with Pig, Hive, HBase, and MapReduce.
- Solved the small-file problem using sequence-file processing in MapReduce.
- Wrote various Hive and Pig scripts.
- Created HBase tables to store variable data formats coming from different portfolios.
- Performed real-time analytics on HBase using the Java API and REST API (see the HBase sketch after this list).
- Implemented HBase coprocessors to notify the support team when data is inserted into HBase tables.
- Worked on compression mechanisms to optimize MapReduce jobs.
- Analyzed customer behavior by performing clickstream analysis and used Flume to ingest the data.
- Experienced in working with Avro data files using the Avro serialization system.
- Implemented business logic by writing UDFs in Java and used various UDFs from Piggybank and other sources.
- Continuously monitored and managed the Hadoop cluster using Cloudera Manager.
- Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
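A minimal sketch of HBase access through the Java client API, as referenced above; the table, column family, qualifier, and row key are hypothetical placeholders:

```scala
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Get, Put}
import org.apache.hadoop.hbase.util.Bytes

// Minimal sketch of HBase reads/writes via the Java client API.
// Table, family, and qualifier names are illustrative only.
object HBaseClientSketch {
  def main(args: Array[String]): Unit = {
    val conf = HBaseConfiguration.create() // picks up hbase-site.xml
    val conn = ConnectionFactory.createConnection(conf)
    val table = conn.getTable(TableName.valueOf("portfolio_events"))
    try {
      // Write one cell: row "row1", family "d", qualifier "status".
      val put = new Put(Bytes.toBytes("row1"))
      put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("status"),
        Bytes.toBytes("ACTIVE"))
      table.put(put)

      // Read the same cell back.
      val result = table.get(new Get(Bytes.toBytes("row1")))
      val status = Bytes.toString(
        result.getValue(Bytes.toBytes("d"), Bytes.toBytes("status")))
      println(s"status = $status")
    } finally {
      table.close()
      conn.close()
    }
  }
}
```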
Environment: Hortonworks, MapReduce, HBase, HDFS, Hive, Pig, SQL, Cloudera Manager, Sqoop, Flume, Oozie
Senior Operations Professional
Confidential
Responsibilities:
- Monitoring consoles of 17 LPARs, covering production, finance, networking, and testing.
- Performing health checks on jobs at regular intervals.
- Suppressing and reinstating jobs.
- Holding and releasing jobs.
- Handling ad-hoc requests to schedule particular applications.
- Moving elements from development to production.
- Managing the JES spool and analyzing hardware and software problems, informing the support team.
- Displaying, starting, and stopping CICS.
- Performing IPLs on all LPARs during maintenance slots and troubleshooting during the IPL.
- Performing pre- and post-IPL health checks.
Environment: CA7, JCL, mainframe hardware, z/OS
Engineer (Mainframe Operations)
Confidential
Responsibilities:
- Console monitoring of all LPARs (production, development, and testing).
- Monitoring batch jobs, subsystems, online regions, and databases.
- Handling production batch job abends by following operator instructions and escalating to the appropriate support group if required.
- Fixing abends by rerunning, purging, or deleting jobs, placing jobs on hold/confirm status, and rescheduling jobs as required.
- Bringing subsystems and online regions such as CICS and IMS down and up as required.
- Monitoring and responding to outstanding replies (WTORs) and regularly checking for contention.
- Raising tickets for critical servers and assigning them to the appropriate resolver group.
- Preparing weekly and monthly reports and dashboards for the client.
- Performing the Saturday morning ‘tie in’ on Old World AXA Life TWS.
- Covering the technical window on weekends.