Hadoop/Spark Developer Resume Phoenix, AZ - Hire IT People

SUMMARY

8+ years of professional IT experience in analysis, development, integration, and maintenance of web-based and client/server applications using Java and Big Data technologies.
4 years of relevant experience in Hadoop Ecosystem and architecture (HDFS, Spark, MapReduce, YARN, Pig, Hive, HBase, Sqoop, Flume, Oozie).
Experience in all phases of the software development life cycle (SDLC), including user interaction, business analysis/modeling, design/architecture, development, implementation, integration, documentation, testing, and deployment.
Hands-on experience in installing, configuring, and using Apache Hadoop ecosystem components such as Hadoop Distributed File System (HDFS), MapReduce, Pig, Hive, HBase, ZooKeeper, Sqoop, Hue, JSON.
Experience reading data from the file system into Spark RDDs.
Good understanding of real-time data processing using Spark.
Ingested data using Sqoop from various RDBMSs such as Oracle, MySQL, and Microsoft SQL Server into HDFS.
Integrated OBIEE, ODI, and Tableau with Hive.
Experienced in WAMP (Windows, Apache, MySQL, and Python/PHP) and LAMP (Linux, Apache, MySQL, and Python/PHP) architecture.
Good experience in developing web applications that implement the Model-View-Controller architecture using the Django, Flask, Pyramid, and Zope Python web application frameworks.
Experience implementing open-source frameworks such as Spring and Hibernate, and web services.
Experience in continuous integration and continuous deployment using tools such as Jenkins.
Experience streaming data into clusters using Kafka and Spark Streaming.
Experience with databases such as Oracle 9i, PostgreSQL, and MySQL, including cluster setup and writing SQL queries, triggers, and stored procedures.
Very good understanding and working knowledge of object-oriented programming (OOP), Python, and Scala.
Experienced in using Spark to improve the performance of existing Hadoop algorithms with SparkContext, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
Proficient in working with NoSQL databases such as MongoDB, Cassandra, and HBase.
Good knowledge of the NoSQL database HBase (a column-family database).
Good knowledge of Hadoop MRv1 and MRv2 (YARN) architecture.
Communicated with diverse client communities both offshore and onshore, with a dedication to client satisfaction and quality outcomes. Extensive experience in coordinating offshore development activities.
Highly organized and dedicated, with a positive attitude, good time management skills, and the ability to handle multiple tasks.
Experience working across multiple industries with Fortune 500 customers and government agencies.

TECHNICAL SKILLS

Big Data components: Hadoop/Big Data HDFS, MapReduce, HBase, Pig, Cassandra, Hive, Scala, Sqoop, Oozie, Kettle, Kafka, ZooKeeper, MongoDB

Programming Languages: Java (J2SE, J2EE), C, C#, PL/SQL, Swift, SQL+, ASP.NET, JDBC, Python

Mobile Development: Android, iOS application development with Swift, Objective-C

Web Development: JavaScript, jQuery, HTML5, CSS3, AJAX, JSON

Development Tools: NetBeans 8.0.2, Visual Studio 2013, Eclipse Neon, Android Studio, SQL Developer, AWS (Import/Export)

Testing Tools: JUnit, HP Unified Functional Testing, HP Performance Center, Selenium, WinRunner, LoadRunner, QTP

UNIX Tools: Apache, Yum, RPM

Operating Systems: Windows, Linux, Ubuntu, Mac OS, Red Hat Linux

Protocols: TCP/IP, HTTP and HTTPS

Web Servers: Apache Tomcat

Cluster Management Tools: Cloudera Manager, Hortonworks, Ambari

Methodologies: Agile, V-model, Waterfall model

Databases: HBase, MongoDB, Cassandra, Oracle 10g, MySQL, Couch, MS SQL Server

Encryption Tools: VeraCrypt, AxCrypt, BitLocker, GNU Privacy Guard

PROFESSIONAL EXPERIENCE

Hadoop/Spark Developer

Confidential, Phoenix, AZ

Responsibilities:

Implemented advanced procedures such as text analytics and processing using the in-memory computing capabilities of Apache Spark, written in Scala.
Created end-to-end Spark-Solr applications in Scala to perform data cleansing, validation, transformation, and summarization activities according to requirements.
Used Flume, Sqoop, Hadoop, Spark, and Oozie to build data pipelines.
Good knowledge of the Spark ecosystem and Spark architecture.
Provided cluster coordination services through ZooKeeper.
Experienced in running Hadoop streaming jobs to process terabytes of XML-format data.
Automated all jobs that pull data from the FTP server and load it into Hive tables using Oozie workflows.
Implemented Spark using Scala and Spark SQL for faster testing and processing of data.
Solved performance issues in Hive and Pig scripts by applying an understanding of joins, grouping, and aggregation and of how they translate to MapReduce jobs.
Developed Oozie workflows for scheduling and orchestrating the ETL process. Designed and implemented Java MapReduce programs to support distributed data processing.
Worked with highly unstructured and semi-structured data of 30 TB in size (90 TB with replication factor of 3).
Contributed to developing a data pipeline to load data from different sources such as web, RDBMS, and NoSQL into an Apache Kafka or Spark cluster.
Migrated data from Spark RDDs into HDFS and NoSQL stores such as Cassandra and HBase.
Worked on reading multiple data formats on HDFS using PySpark.
Hands-on experience installing, configuring, supporting, and managing Hadoop clusters using Apache, Cloudera (CDH3, CDH4), and YARN distributions.
Developed Kafka producers and consumers, HBase clients, and Spark and Hadoop MapReduce jobs, along with components on HDFS and Hive.
Worked extensively on the core and Spark SQL modules of Spark.
Very good understanding of partitioning and bucketing concepts in Hive; designed both managed and external tables in Hive to optimize performance.

Environment: Hadoop, HDFS, Hive, Scala, Spark, SQL, Teradata, UNIX Shell Scripting, Big Data, MapReduce, Sqoop, Oozie, Pig, ZooKeeper, Flume, Linux, Java, Eclipse, Python 2.7, Cloudera

Hadoop/Scala Developer

Confidential - Malvern, PA

Responsibilities:

Create, validate and maintain scripts to load data using Sqoop manually.
Create Oozie workflows and coordinators to automate Sqoop jobs weekly and monthly.
Worked on reading multiple data formats on HDFS using Scala.
Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs and Scala.
Developed multiple POCs using Scala, deployed them on the YARN cluster, and compared the performance of Spark with Hive and SQL/Teradata.
Analyzed the SQL scripts and designed the solution for implementation in Scala.
Developed, validated, and maintained HiveQL queries.
Transferred data to and from HBase using MapReduce jobs.
Analyzed data with Hive and Pig.
Designed Hive tables to load data to and from external tables.
Wrote DistCp shell scripts to copy data across servers.
Ran executive reports using Hive and QlikView.
Loaded and transformed large sets of unstructured data from UNIX systems to HDFS.
Used Apache Sqoop to load user data into HDFS on a weekly basis.
Created production jobs using Oozie workflows that integrated different actions such as MapReduce, Sqoop, and Hive.
Used the Scala collections framework to store and process complex employer information. Requests were post-processed and matched with offers based on the offers set up for each client.
Used Akka as a framework to create reactive, distributed, parallel, and resilient concurrent applications in Scala.
Successfully migrated Django database from SQLite to MySQL with complete data integrity.
Involved in developing a linear regression model, built using Spark with the Scala API, to predict a continuous measurement and improve observations on wind turbine data.
Good knowledge of writing Spark applications using Python and Scala.
Developed Spark scripts using Scala shell commands as required.

Environment: Hadoop Hortonworks, Hadoop Stack (Hive, Pig, HCatalog, Sqoop, Oozie), QlikView, Windows 8, SQL Server 2010, Bitbucket, Scala, Python Django, Unix

Hadoop Developer

Confidential - Washington, DC

Responsibilities:

Create, validate and maintain scripts to load data from and into tables in Oracle PL/SQL and in SQL Server 2008 R2.
Wrote stored procedures and triggers.
Converted, tested, and validated Oracle scripts for SQL Server.
Developed Kafka producer and consumers, HBase clients, Spark and Hadoop MapReduce jobs along with components on HDFS, Hive.
Handled importing of data from various data sources, performed transformations using Hive, MapReduce, loaded data into HDFS.
Used Solr for database integration from IBM Maximo to SQL Server.
Upgraded IBM Maximo database from 5.2 to 7.5.
Created AWS S3 buckets, performed folder management in each bucket, and managed CloudTrail logs and objects within each bucket.
Analyzed, validated, and documented the changed records for the IBM Maximo web application.
Imported data from a MySQL database into Hive using Sqoop.
Implemented OASISBI.
Wrote MapReduce jobs.
Developed, validated, and maintained HiveQL queries.
Ran reports using Pig and Hive queries.
Wrote and implemented Apache Pig scripts to load data from and store data into Hive.
Installed and configured Hue.
Managed Amazon Web Services (AWS) infrastructure with automation and configuration management tools such as IBM uDeploy, Puppet, or custom-built tools; designed cloud-hosted solutions, with specific AWS product suite experience.
Used JUnit for unit testing.
Conducted data mining, data modeling, statistical analysis, business intelligence gathering, trending, and benchmarking using Datameer.
Used Tableau for visualization and to generate reports for financial data consolidation, reconciliation, and segmentation.
Designed and developed scripts for transferring files between servers using FTP/SFTP according to business requirements.
Implemented machine learning techniques such as clustering and regression in Tableau and created interactive dashboards.
Managed and reviewed Hadoop log files.
Supported the full testing cycle for ETL processes, including bug fixes.
Performed upgrades, package administration, and support for over 200 Linux servers.
Performed automated installation of the CentOS operating system using Kickstart.

Environment: HDFS, Hive, Pig, Sqoop, ZooKeeper, Oozie, ETL, AWS, Tableau, Hive Query, CentOS

Java Developer

Confidential

Responsibilities:

Participated in the redesign of the application using Java, JSP, Servlets, JavaBeans, XML, AdvantNet SNMP, and MySQL technologies.
Wrote PL/SQL queries, stored procedures, and triggers to perform back-end database operations.
Used multiple action controllers to control page flow.
Worked on the UI team to develop a new customer-facing portal for Long Term Care Partners.
Implemented Java APIs using core Java.
Wrote new features in Golang.
Used JDBC to establish connection between the database and the application.
Used AJAX for client-to-server communication.
Created the user interface using HTML, CSS and JavaScript.
Developed code that creates XML files and flat files from data retrieved from databases and XML files.
Applied design patterns and OO design concepts to improve the existing Java/J2EE code base.
Developed JAX-WS web services.
Expertise in script programming, including Bash shell, JavaScript, and Python.
Wrote implementation proposals with design alternatives for the ENUM+ and IPWorks 5.0 upgrade work packages; configured MySQL Cluster with 4 Solaris systems and integrated it with IPWorks.
Designed and developed ENUM+ object storage in MySQL Cluster, synchronized with the DNS server using Java multithreading concepts.
Built a single-page application (SPA) that loads multiple views through route services using Angular 2 and Node.js.
Created Angular 2 components and implemented interpolation, input variables, bootstrapping, NgFor, NgIf, router outlet, event binding, and decorators.
Migrated the legacy system implemented in Perl to Golang.
Used JavaScript, AJAX, HTML for front end.
Used SQL to write complex queries.

Environment: J2EE 5, Struts 2.0, Hibernate 3.0, MVC, WebLogic Application Server 10.3, UML, JSP, Servlets, JavaScript, HTML5, CSS, Ajax, Angular 2, Web Services, JBoss, Eclipse 3.5 IDE, PL/SQL, Ant, JUnit, XML/XSL, Log4j 1.2.15.

PL/SQL Developer

Confidential

Responsibilities:

Wrote Stored Procedures in PL/SQL.
Performed table defragmentation, partitioning, compression, and indexing for improved performance and efficiency.
Involved in table redesign, implementing partitioned tables and partitioned indexes to make the database faster and easier to maintain.
Used the SQL Server SSIS tool to build high-performance data integration solutions, including extraction, transformation, and load packages for data warehousing.
Extracted data from XML files and loaded it into the database.
Created and modified SQL*Plus, PL/SQL, and SQL*Loader scripts for data conversions.
Worked on XML along with PL/SQL to develop and modify web forms.
Developed data models and design specifications, and analyzed dependencies.
Created indexes on tables to improve performance by eliminating full table scans, and created views to hide the underlying tables and reduce the complexity of large queries.
Involved in creating UNIX shell scripts.
Maintained the logical and physical structure of the database.
Created tablespaces, tables, views, and scripts for automating database activities.
Coded various stored procedures, packages, and triggers to incorporate business logic into the application.

Environment: Oracle 9i, 10g, PL/SQL, Erwin 4.1, C, C++, Oracle Designer 2000, Windows 2000, Toad, SQL*Plus.
