Hadoop/Spark Developer Resume

SUMMARY

I have 7+ years of professional IT experience across the complete project life cycle (design, development, testing and implementation), including 3+ years in the ingestion, storage, querying, processing and analysis of Big Data, with hands-on experience in the Hadoop ecosystem (YARN, HDFS) and its components: Hive, Pig, HBase, Sqoop, Hue, Kafka, Flume, Oozie, Zookeeper, Spark, Spark SQL and Spark Streaming.
I have worked hands-on with Hadoop clusters on Hortonworks, AWS Elastic MapReduce and Cloudera.
I have hands-on experience improving the performance of existing Hadoop algorithms using SparkContext, Spark SQL, DataFrames, pair RDDs and Spark on YARN.
I have working experience building Spark applications using build tools like SBT, Maven and Gradle.
I have good experience with different file formats such as text, SequenceFile, RCFile, ORC, Parquet, Avro and JSON, and with different compression formats such as GZip, LZO, BZip2 and Snappy.
I have good knowledge of relational databases like MySQL and Oracle, and NoSQL databases like HBase and MongoDB.
I have working knowledge of UNIX/Linux systems, including experience with shell scripting.
I have working experience handling semi-structured and unstructured data from different data sources.
I have working experience developing MapReduce programs using Combiners, map-side joins, reduce-side joins, Distributed Cache, compression techniques, and multiple inputs and outputs.
I have working experience performing ad-hoc analysis on structured data using HiveQL, joins and Hive UDFs.
I have good exposure to Counters, Shuffle and Sort parameters, Dynamic Partitions and Bucketing for performance improvement.
I have worked with IDEs like Eclipse and IntelliJ IDEA.
I have working knowledge of Java and SQL in application development and deployment.

TECHNICAL SKILLS

Big Data Technologies: HDFS, MapReduce, Pig, Hive, Sqoop, Flume, HBase, Oozie, Apache Spark, Spark SQL, Spark Streaming.

Process/Data Modeling: MS Visio, UML Diagrams and ER Studio

Cluster Manager Tools: HDP Ambari, Cloudera Manager, Hue

ETL/ELT/Databases: HBase, MongoDB, Spark SQL, MS Access, Oracle, DB2, MySQL, SQL Developer, SQL Server and Toad

Languages: C, C++, Java, PL/SQL, Python, Scala

Web-Technologies: HTML, DHTML, XML, CSS

Microsoft Technologies: ASP.NET, C#.Net, VB.Net, ADO.NET, SharePoint, Word, Excel and PowerPoint.

Operating Systems: Linux, Ubuntu, RHEL, Windows XP/7/8/10.

IDE: Eclipse and IntelliJ IDEA

PROFESSIONAL EXPERIENCE

Confidential

Hadoop/Spark Developer

Responsibilities:

Worked with the Lambda architecture to handle and process batch and real-time data.
Used Sqoop to ingest data from the data warehouse into HDFS.
Used Kafka to collect real-time streaming and log data from web applications and clickstream data, analyzing part of the data with Spark Streaming and storing the rest in HDFS for future use.
Wrote Hive queries in Hive Query Language (HiveQL) to analyze data in the Hive warehouse, working with Hive tables, partitioning and bucketing.
Performed data profiling, identifying data quality issues and validating rules regarding data integrity and data quality as they relate to business requirements.
Built Spark applications using SBT builds.
Used Spark SQL to process large amounts of structured data.
Connected to Tableau Server to publish dashboards to a central location for portal integration.
Created metrics, attributes, filters, reports and dashboards, along with advanced chart types, visualizations and complex calculations to manipulate the data.

Environment: Cloudera Manager, Sqoop, Java (JDK 1.8), Hive, Spark, Spark SQL, Scala, Tableau.

Confidential

Hadoop/Spark Developer

Responsibilities:

Ingested flat files from local Unix file systems into HDFS and used Sqoop to ingest structured data from legacy RDBMS systems to
Developed code for importing and exporting data into HDFS and Hive using Sqoop.
Explored Spark for improving the performance of existing Hadoop algorithms using SparkContext, Spark SQL, DataFrames, pair RDDs and Spark on YARN.
Used the DataFrame API in Scala to work with distributed collections of data organized into named columns, and developed predictive analytics using the Apache Spark Scala APIs.
Worked with Apache Hadoop ecosystem components such as HDFS, Hive and Sqoop, as well as with Spark and Scala.
Wrote Hive join queries to fetch information from multiple tables and wrote multiple MapReduce jobs to collect output from Hive.
Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting on the dashboard.
Utilized Oozie workflows to run Hive jobs.
Extracted files through Sqoop, placed them in HDFS and processed them.

Environment: Hadoop, Spark, HDFS, Scala, Hive, Java, Spring, MapReduce, Sqoop, Spring MVC, Big Data, Spark SQL, JDBC, Oozie, Pig, Flume

Confidential

Hadoop Developer

Responsibilities:

Analyzed data using different big data analytic tools including Pig, Hive and MapReduce.
Created Pig Latin scripts to sort, group, join and filter the enterprise-wide data.
Implemented Partitioning, Dynamic Partitions and Buckets in Hive on Avro files to meet the business requirements.
Implemented Data Integrity and Data Quality checks using Linux scripts.
Used Flume to tail the application log files into HDFS.
Involved in scheduling Hive and Pig jobs using Oozie workflows.
Involved in performance tuning and memory optimization of MapReduce and Hive applications.
Worked on end-to-end automation of the application.
Responsible for continuous build and integration with Jenkins and deployment using XL Deploy.
Actively involved in code reviews, bug fixes and enhancements.

Environment: Hadoop, HDFS, MySQL, Apache Hive, Pig, MapReduce, Core Java, Shell Scripting, Eclipse, Git, Jenkins.
