Hadoop/Spark Developer Resume TX - Hire IT People

SUMMARY

6+ years of professional experience in the IT industry, developing, implementing and maintaining web-based applications using Java, J2EE and Big Data ecosystems on Windows and Linux environments.
3+ years of experience in Big Data analytics, with hands-on experience writing Spark and MapReduce jobs on the Hadoop ecosystem, including Hive, Pig, Sqoop and Flume.
Excellent knowledge of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode and the MapReduce programming paradigm.
Knowledge of installing, configuring and using Hadoop ecosystem components such as MapReduce, HDFS, HBase, Oozie, Hive, Sqoop, Pig, ZooKeeper and Flume.
Experience in importing and exporting data between HDFS and relational database systems using Sqoop.
Extended Hive and Pig core functionality by writing custom UDFs.
Proficient in Spark with Scala: loading data from the local file system, HDFS, Amazon S3, and relational and NoSQL databases (such as Cassandra) using Spark SQL, importing data into RDDs, and ingesting data from a range of sources using Spark Streaming.
Developed Apache Spark jobs using Scala in a test environment for faster data processing and used Spark SQL for querying.
Analyzed large data sets using Pig and Hive scripts.
Explored various Spark modules, working with DataFrames and RDDs.
Performed map-side joins on RDDs.
Experience in migrating ETL operations from Hive to Spark.
Performed visualizations according to business requirements using visualization tools like Tableau.
Designed and developed Tableau dashboards; installed and configured Tableau Server for enterprise-wide deployments.
Installed, tested and deployed monitoring solutions with Splunk services.
Worked with Core Java and J2EE technologies such as Servlets, JSP, EJB, JMS, JDBC, Threads, Multi-Threading, Collections and Exception handling
Hands-on experience with Spring modules such as Spring Core, Spring MVC, Spring AOP, autowiring, Spring Security and transaction management, and with Struts, using Hibernate as the back-end ORM tool.
Experienced in developing applications using the Model-View-Controller (MVC) architecture and the Spring framework.
Experience in developing and consuming Web Services using REST, SOAP, XSD, XML, UDDI, JSON and WSDL.
Experience in deploying web applications on application servers such as WebLogic, Apache Tomcat, WebSphere and JBoss.
Used version control tools such as Git, CVS, SVN and ClearCase.
Good experience with the Software Development Life Cycle (SDLC).
Experienced in coding SQL, PL/SQL, procedures/functions, triggers and packages on relational databases (RDBMS) such as Oracle.
Experienced in web development using HTML/HTML5, DHTML, XHTML, CSS/CSS3, JavaScript, AngularJS and Node.js.

TECHNICAL SKILLS

Hadoop Ecosystem: Hadoop, CDH5.3.2, MapReduce, YARN, Spark 1.6/2.0, Sqoop, Hive, Oozie, Pig, HDFS, Flume, Impala

Programming Languages: C, Java, Scala 2.11, SQL, PL/SQL, Pig Latin, HiveQL, Unix shell scripting

Java & J2EE Technologies: Core Java, Servlets, JSP

NoSQL Databases: HBase, MongoDB

Version Control/Tools: Git, GitHub, SAS, Tableau

Databases: Oracle 11g/10g/9i, MySQL

Frameworks: Spring 3.0.5, Hibernate 3.5.1, Struts 1.3.10, EJB, JUnit, MRUnit

Web and Application Servers: Apache Tomcat 7.0, Apache Tomcat 6.0, WebLogic 8.0

Methodologies: Agile Scrum, Waterfall

PROFESSIONAL EXPERIENCE

Confidential, TX

Hadoop/Spark Developer

Responsibilities:

Performed batch processing of data sources using Apache Spark and Elasticsearch
Implemented Spark RDD transformations and actions for business analysis
Migrated HiveQL queries on structured data to Spark SQL to improve performance
Implemented a POC to migrate MapReduce jobs into Spark RDD transformations using Scala
Configured, deployed and maintained a single-node Storm cluster in the DEV environment
Developed predictive analytics using the Apache Spark Scala APIs; used Spark Streaming with Kafka and HDFS/HBase to build a continuous ETL pipeline for real-time analytics on the data
Prepared design documents (request-response mapping documents, Hive mapping documents)
Ingested data using Flume, with a Kafka source and an HDFS sink
Used the Scala collections framework to store and process complex consumer information; requests were post-processed and assigned offers based on the offers set up for each client
Used Slick to query and store data in the database in idiomatic Scala, using the Scala collections framework
Developed multiple POCs using Scala, deployed them on the YARN cluster, and compared the performance of Spark with Hive and SQL
Created various Parser programs to extract data from Autosys, Tibco Business Objects, XML, Java, and database views using Scala
Ran weekly sales enablement requirements including hands-on Git and GitHub workshops for reps
Developed solutions to pre-process large sets of structured and semi-structured data in different file formats (text, Avro, SequenceFile, XML, JSON, ORC and Parquet)
Handled importing of data from RDBMS into HDFS using Sqoop
Performed data cleansing using Pig Latin operations and UDFs
Wrote Hive scripts to analyze data in the Hive warehouse using Hive Query Language (HQL)
Implemented partitioning, dynamic partitioning and bucketing in Hive, using internal and external tables, for more efficient data access
Created Hive tables, loaded them with data and wrote Hive queries to process the data
Created scripts to automate the process of Data Ingestion
Developed Pig scripts for source data validation and transformation
Installed the Oozie workflow engine to run multiple Hive and Pig jobs, triggered independently by time and data availability, for analyzing HDFS audit data
Used Big Data testing frameworks (MRUnit, PigUnit) to test raw data, and executed performance scripts

Environment: HDFS, CDH5.3.2, Apache Spark, Hive, Pig, Scala, Java, Sqoop, SQL, Shell scripting.

Confidential, Fort Worth, TX

Hadoop/Spark Developer

Responsibilities:

Installed and configured Apache Hadoop to test the maintenance of log files in Hadoop cluster
Installed and configured Hive, Pig, Sqoop, and Oozie on the Hadoop cluster
Installed Oozie Workflow engine to run multiple Hive and Pig Jobs
Developed multiple MapReduce jobs in Java for data cleansing and preprocessing
Developed simple to complex MapReduce jobs using Hive and Pig
Involved in loading data from UNIX file system to HDFS
Evaluated business requirements and prepared detailed specifications that follow project guidelines required to develop written programs
Analyzed large data sets to determine the optimal way to aggregate and report on them
Responded quickly to ad hoc internal and external client requests for data and created ad hoc reports
Responsible for building scalable distributed data solutions using Hadoop
Migrated ETL processes from Oracle to Hive to test ease of data manipulation
Optimized Pig scripts and Hive queries to increase efficiency, and added new features to existing code
Stored and retrieved data from data-warehouses using Amazon Redshift
Developed Pig Latin scripts for the analysis of semi-structured data
Created Hive tables, loaded data into them and wrote Hive UDFs
Used Sqoop to import data into HDFS and Hive from other data systems
Continuously monitored and managed the Hadoop cluster using Cloudera Manager
Conducted some unit testing for the development team within the sandbox environment
Developed Hive queries to process the data

Environment: Apache Hadoop, Cloudera Manager, CDH2, CDH3, CentOS, Java, MapReduce, Apache Hama, Eclipse Indigo, Pig, Hive, Sqoop, Oozie, SQL, Struts, JUnit.

Confidential

Hadoop Developer

Responsibilities:

Installed and configured Cloudera Hadoop on multiple nodes using Cloudera Manager and CDH 4.X/5.X.
Designed documents and estimated efforts for the project.
Developed MapReduce programs using MRv1 and MRv2 (YARN).
Responsible for processing unstructured data using Pig and Hive.
Developed Pig Latin scripts for extracting data.
Used Pig for data loading, filtering and storing the data.
Developed Hive queries for the analysts.
Developed Java code to stream Packet Tracer data into Hive using RESTful services.
Worked on migrating data from MongoDB to Hadoop.
Worked on integrating SFDC with Hadoop.
Extracted the data from MySQL into HDFS using Sqoop.
Ran Hadoop jobs that processed millions of text records for batch and online processes using tuned/modified SQL.
Designed and published workbooks and dashboards using Tableau Dashboard/Server 6.X/7.X

Environment: Cloudera, Hadoop (HDFS), MapReduce, Spark, Hive, Java, Scala, JDK, UNIX Shell Scripting, MySQL, Eclipse, Tableau 8.X/9.X.

Confidential

Java Developer

Responsibilities:

Involved in the complete development, testing and maintenance process of the application
Gathered requirements, performed analysis and formulated requirements specifications from consistent inputs
Developed JSP as an application controller
Designed and developed HTML front end screens and validated forms using JavaScript
Used Frames and Cascading Style Sheets (CSS) to give a better view to the Web Pages
Deployed the web application on WebLogic Server
Used JDBC for database connectivity
Developed necessary SQL queries for database transactions
Involved in testing, implementation and documentation
Wrote JavaScript code for input validation
Front End was built using JSPs, JavaScript and HTML
Built Custom Tags for JSPs
Built the report module based on Crystal Reports
Integrated data from multiple data sources
Generated schema difference reports for the database using Toad

Environment: Java, JSP, WebLogic 5.1, HTML, JavaScript, JDBC, SQL, PL/SQL, UNIX.
