Hadoop/Spark Developer Resume

Phoenix, AZ

SUMMARY

  • 8+ years of professional IT experience in analysis, development, integration, and maintenance of web-based and client/server applications using Java and Big Data technologies.
  • 4 years of relevant experience in Hadoop Ecosystem and architecture (HDFS, Spark, MapReduce, YARN, Pig, Hive, HBase, Sqoop, Flume, Oozie).
  • Experience in all phases of software development life cycle (SDLC), which includes User Interaction, Business Analysis/Modelling, Design/Architecture, Development, Implementation, Integration, Documentation, Testing, and Deployment
  • Hands-on experience in installing, configuring, and using Apache Hadoop ecosystem components like Hadoop Distributed File System (HDFS), MapReduce, Pig, Hive, HBase, ZooKeeper, Sqoop, Hue, JSON.
  • Experience reading data from the file system into Spark RDDs.
  • Good understanding of processing real-time data using Spark.
  • Ingested data using Sqoop from various RDBMSs such as Oracle, MySQL, and Microsoft SQL Server into HDFS.
  • Integration of OBIEE, ODI, and Tableau with Hive.
  • Experienced in WAMP (Windows, Apache, MySQL, and Python/PHP) and LAMP (Linux, Apache, MySQL, and Python/PHP) architecture.
  • Good experience in developing web applications implementing Model-View-Controller (MVC) architecture using the Django, Flask, Pyramid, and Zope Python web application frameworks.
  • Experience in implementation of Open-Source frameworks like Spring, Hibernate, Web Services etc.
  • Experience in Continuous Integration and Continuous Deployment using tools like Jenkins.
  • Experience in streaming data to clusters through Kafka and Spark Streaming.
  • Experience with databases such as Oracle 9i, PostgreSQL, and MySQL Server with cluster setup, and in writing SQL queries, triggers, and stored procedures.
  • Very good understanding and working knowledge of Object-Oriented Programming (OOP), Python, and Scala.
  • Experienced in using Spark to improve the performance of existing Hadoop algorithms, using SparkContext, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
  • Proficient in working with NoSQL databases such as MongoDB, Cassandra, and HBase.
  • Good knowledge of the NoSQL database HBase (column-family database).
  • Good knowledge of Hadoop MRv1 and MRv2 (YARN) architecture.
  • Communicated with diverse client communities offshore and onshore, with a focus on client satisfaction and quality outcomes. Extensive experience in coordinating offshore development activities.
  • Highly organized and dedicated, with a positive attitude, good time management and organizational skills, and the ability to handle multiple tasks.
  • Experience working across multiple industries with Fortune 500 customers and government agencies.

TECHNICAL SKILLS

Big Data components: Hadoop, HDFS, MapReduce, HBase, Pig, Cassandra, Hive, Scala, Sqoop, Oozie, Kettle, Kafka, ZooKeeper, MongoDB

Programming Languages: Java (J2SE, J2EE), C, C#, PL/SQL, Swift, SQL+, ASP.NET, JDBC, Python

Mobile Development: Android, iOS application development with Swift, Objective-C

Web Development: JavaScript, jQuery, HTML 5.0, CSS 3.0, AJAX, JSON

Development Tools: NetBeans 8.0.2, Visual Studio 2013, Eclipse Neon, Android Studio, SQL Developer, AWS (Import/Export)

Testing Tools: JUnit, HP Unified Functional Testing, HP Performance Center, Selenium, WinRunner, LoadRunner, QTP

UNIX Tools: Apache, Yum, RPM

Operating Systems: Windows, Linux, Ubuntu, Mac OS, Red Hat Linux

Protocols: TCP/IP, HTTP and HTTPS

Web Servers: Apache Tomcat

Cluster Management Tools: Cloudera Manager, Hortonworks, Ambari

Methodologies: Agile, V-model, Waterfall model

Databases: HBase, MongoDB, Cassandra, Oracle 10g, MySQL, Couch, MS SQL Server

Encryption Tools: VeraCrypt, AxCrypt, BitLocker, GNU Privacy Guard

PROFESSIONAL EXPERIENCE

Hadoop/Spark Developer

Confidential, Phoenix, AZ

Responsibilities:

  • Implemented advanced procedures such as text analytics and processing using the in-memory computing capabilities of Apache Spark, written in Scala.
  • Created end-to-end Spark-Solr applications using Scala to perform data cleansing, validation, transformation, and summarization activities according to requirements.
  • Used Flume, Sqoop, Hadoop, Spark, and Oozie to build data pipelines.
  • Good knowledge of the Spark ecosystem and Spark architecture.
  • Used ZooKeeper for cluster coordination services.
  • Experienced in running Hadoop streaming jobs to process terabytes of XML-format data.
  • Automated all jobs that pull data from the FTP server and load it into Hive tables, using Oozie workflows.
  • Implemented Spark using Scala and Spark SQL for faster testing and processing of data.
  • Solved performance issues in Hive and Pig scripts by understanding how joins, grouping, and aggregation translate into MapReduce jobs.
  • Developed Oozie workflows for scheduling and orchestrating the ETL process. Designed and implemented Java MapReduce programs to support distributed data processing.
  • Worked with highly unstructured and semi-structured data of 30 TB in size (90 TB with replication factor of 3).
  • Contributed towards developing a Data Pipeline to load data from different sources like Web, RDBMS, NoSQL to Apache Kafka or Spark cluster.
  • Migrated data from Spark RDDs into HDFS and NoSQL stores such as Cassandra and HBase.
  • Worked on reading multiple data formats on HDFS using PySpark
  • Hands-on experience in installation, configuration, support, and management of Hadoop clusters using Apache, Cloudera (CDH3, CDH4), and YARN distributions.
  • Developed Kafka producers and consumers, HBase clients, Spark and Hadoop MapReduce jobs, along with components on HDFS and Hive.
  • Worked extensively on the Core and Spark SQL modules of Spark.
  • Very good understanding of partitioning and bucketing concepts in Hive; designed both managed and external tables in Hive to optimize performance.

Environment: Hadoop, HDFS, Hive, Scala, Spark, SQL, Teradata, UNIX Shell Scripting, Big Data, MapReduce, Sqoop, Oozie, Pig, ZooKeeper, Flume, Linux, Java, Eclipse, Python 2.7, Cloudera

Hadoop/Scala Developer

Confidential - Malvern, PA

Responsibilities:

  • Create, validate and maintain scripts to load data using Sqoop manually.
  • Create Oozie workflows and coordinators to automate Sqoop jobs weekly and monthly.
  • Worked on reading multiple data formats on HDFS using Scala.
  • Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs and Scala.
  • Developed multiple POCs using Scala, deployed them on the YARN cluster, and compared the performance of Spark with Hive and SQL/Teradata.
  • Analyzed the SQL scripts and designed solutions to implement them in Scala.
  • Develop, validate and maintain HiveQL queries.
  • Read data from and wrote data to HBase using MapReduce jobs.
  • Analyzed data with Hive and Pig.
  • Designed Hive tables to load data to and from external tables.
  • Wrote DistCp shell scripts to copy data across servers.
  • Ran executive reports using Hive and QlikView.
  • Load and transform large sets of unstructured data from UNIX system to HDFS
  • Used Apache Sqoop to load user data into HDFS on a weekly basis.
  • Created production jobs using Oozie workflows that integrated different actions such as MapReduce, Sqoop, and Hive.
  • Used the Scala collections framework to store and process complex employer information. Requests were post-processed and assigned offers based on the offers set up for each client.
  • Used Akka as a framework to create reactive, distributed, parallel, and resilient concurrent applications in Scala.
  • Successfully migrated Django database from SQLite to MySQL with complete data integrity.
  • Involved in developing a linear regression model, built using Spark with the Scala API, to predict a continuous measurement and improve observations on wind turbine data.
  • Good knowledge of writing Spark applications using Python and Scala.
  • Developed Spark scripts using Scala shell commands as per requirements.

Environment: Hadoop Hortonworks, Hadoop Stack (Hive, Pig, HCatalog, Sqoop, Oozie), QlikView, Windows 8, SQL Server 2010, Bitbucket, Scala, Python Django, Unix

Hadoop Developer

Confidential - Washington, DC

Responsibilities:

  • Create, validate and maintain scripts to load data from and into tables in Oracle PL/SQL and in SQL Server 2008 R2.
  • Wrote stored procedures and triggers.
  • Converted Oracle scripts to SQL Server, then tested and validated them.
  • Developed Kafka producer and consumers, HBase clients, Spark and Hadoop MapReduce jobs along with components on HDFS, Hive.
  • Handled importing of data from various data sources, performed transformations using Hive, MapReduce, loaded data into HDFS.
  • Used Solr for database integration from IBM Maximo to SQL Server.
  • Upgraded IBM Maximo database from 5.2 to 7.5.
  • Created AWS S3 buckets, performed folder management in each bucket, and managed CloudTrail logs and objects within each bucket.
  • Analyze, validate and document the changed records for IBM Maximo web application.
  • Imported data from MySQL databases into Hive using Sqoop.
  • Implemented OASISBI.
  • Wrote MapReduce jobs.
  • Develop, validate and maintain HiveQL queries.
  • Ran reports using Pig and Hive queries.
  • Wrote and implemented Apache Pig scripts to load data from and store data into Hive.
  • Install and configure Hue.
  • Managed Amazon Web Services (AWS) infrastructure with automation and configuration management tools such as IBM uDeploy, Puppet, or custom-built tools; designed cloud-hosted solutions, with specific AWS product suite experience.
  • Used JUnit for unit testing.
  • Conducted data mining, data modelling, statistical analysis, business intelligence gathering, trending, and benchmarking using Datameer.
  • Used Tableau for visualization and to generate reports for financial data consolidation, reconciliation, and segmentation.
  • Designed and developed scripts to transfer files between servers using FTP/SFTP according to business requirements.
  • Implemented machine learning techniques such as clustering and regression in Tableau and created interactive dashboards.
  • Managed and reviewed Hadoop log files.
  • Supported the full testing cycle for ETL processes, including bug fixes.
  • Performed upgrades, package administration, and support for over 200 Linux servers.
  • Performed automated installation of the CentOS operating system using Kickstart.

Environment: HDFS, Hive, Pig, Sqoop, ZooKeeper, Oozie, ETL, AWS, Tableau, Hive Query, CentOS

Java Developer

Confidential

Responsibilities:

  • Participated in the redesign of the application using Java, JSP, Servlets, JavaBeans, XML, AdventNet SNMP, and MySQL technologies.
  • Wrote PL/SQL queries, stored procedures, and triggers to perform back-end database operations.
  • Used multiple Action Controllers to control the page flow.
  • Worked in UI team to develop new customer facing portal for Long Term Care Partners.
  • Implemented Java APIs using core Java.
  • Wrote new features in Golang.
  • Used JDBC to establish connection between the database and the application.
  • Used AJAX for client-to-server communication
  • Created the user interface using HTML, CSS and JavaScript.
  • Developed code that creates XML and flat files from data retrieved from databases and XML files.
  • Applied design patterns and OO design concepts to improve the existing Java/J2EE code base.
  • Developed JAX-WS web services
  • Expertise in script programming, including BASH shell, JavaScript and Python
  • Wrote implementation proposals with design alternatives for ENUM+ and IPWorks 5.0 upgrade work packages; configured a MySQL Cluster with 4 Solaris systems and integrated it with IPWorks.
  • Designed and developed ENUM+ object storage in MySQL Cluster, synchronized with the DNS server using Java multithreading concepts.
  • Built a single-page application (SPA) that loads multiple views through route services using Angular 2 and Node.js.
  • Created Angular 2 components and implemented interpolation, input variables, bootstrapping, NgFor, NgIf, Router Outlet, event binding, and decorators.
  • Migrated the legacy system implemented in Perl to Golang.
  • Used JavaScript, AJAX, HTML for front end.
  • Used SQL to write complex queries.

Environment: J2EE 5, Struts 2.0, Hibernate 3.0, MVC, WebLogic Application Server 10.3, UML, JSP, Servlets, JavaScript, HTML5, CSS, Ajax, Angular 2, Web Services, JBoss, Eclipse 3.5 IDE, PL/SQL, Ant, JUnit, XML/XSL, Log4j 1.2.15.

PL/SQL Developer

Confidential

Responsibilities:

  • Wrote Stored Procedures in PL/SQL.
  • Performed table defragmentation, partitioning, compression, and indexing for improved performance and efficiency.
  • Involved in table redesign, implementing partitioned tables and partitioned indexes to make the database faster and easier to maintain.
  • Used the SQL Server SSIS tool to build high-performance data integration solutions, including extraction, transformation, and load packages for data warehousing.
  • Extracted data from the XML file and loaded it into the database.
  • Created and modified SQL*Plus, PL/SQL, and SQL*Loader scripts for data conversions.
  • Worked on XML along with PL/SQL to develop and modify web forms.
  • Designed data models and design specifications, and analyzed dependencies.
  • Created indexes on tables to improve performance by eliminating full table scans, and views to hide the underlying tables and reduce the complexity of large queries.
  • Involved in creating UNIX shell scripts.
  • Maintaining Logical and Physical structure of the database.
  • Created tablespaces, tables, views, and scripts to automate database activities.
  • Coded various stored procedures, packages, and triggers to incorporate business logic into the application.

Environment: Oracle 9i, 10g, PL/SQL, Erwin 4.1, C, C++, Oracle Designer 2000, Windows 2000, Toad, SQL*Plus.
