Hadoop/Spark Developer Resume Kansas City, Missouri - Hire IT People

SUMMARY

4+ years of professional IT experience across the full project lifecycle in J2EE and Big Data technologies, including requirements analysis, design, development, testing, deployment, and production support of software applications.
Experience in analyzing data using Hadoop Ecosystem including HDFS, Hive, HiveQL, Spark, Spark Streaming, SparkSQL, MLLib, Kafka, HBase, and Zookeeper.
Converted Hive/SQL queries into Spark transformations using RDDs and Scala.
Migrated traditional MapReduce jobs to Spark jobs to improve data processing speed.
Experienced in WAMP (Windows, Apache, MySQL) and LAMP (Linux, Apache, MySQL) architectures.
Experience in working with the Hortonworks Hadoop stack and the Amazon Web Services (AWS) suite.
Very good understanding of Hadoop architecture and the Hadoop daemons: NameNode, DataNode, ResourceManager, NodeManager, TaskTracker, and JobTracker.
Good knowledge of and experience in developing SOAP and REST APIs with frameworks like Django and Flask.
Built data warehousing and data mart solutions on Teradata and Big Data platforms.
Experienced in developing web services with the Java programming language.
Experience in developing web applications and implementing Model-View-Controller (MVC) architecture using the server-side frameworks Django, Flask, and Pyramid.
Hands-on experience in installing, configuring, and using ecosystem components like Hadoop MapReduce, HDFS, HBase, Oozie, Hive, HCatalog, Pig, and Flume.
Performed data integration between different databases and into HDFS, Hive, and HBase using Talend Integration and Talend Big Data tools.
Experience in using database stages such as the Oracle connector, Teradata connector, and ODBC connector.
Experience in designing, compiling, testing, scheduling, and running DataStage jobs.
Experienced in developing MapReduce programs using Apache Hadoop for working with Big Data.
Expertise in back-end procedure development for RDBMS and database applications using SQL and PL/SQL.
Good knowledge of Big Data on Azure: Data Lake Store and Data Factory.
Good knowledge of Informatica; connected to Oracle through Informatica and used various transformations to perform ETL tasks.
Hands-on experience writing queries, stored procedures, functions, and triggers in SQL.
Experienced in utilizing Java tools in business, web, and client-server environments, including Java Platform, J2EE, EJB, JSP, Java Servlets, Struts, and Java Database Connectivity (JDBC) technologies.
Experience in writing complex SQL queries involving inner and outer joins across multiple tables.
Excellent interpersonal and communication skills; creative, research-minded, technically competent, and results-oriented, with problem-solving and leadership skills.

TECHNICAL SKILLS

Languages: C, Java, Scala

Hadoop Distribution: Hortonworks, Cloudera

Hadoop Ecosystem: HDFS, MapReduce, YARN, Pig, Hive, HiveQL, HBase, Sqoop, Flume, Oozie, ZooKeeper, Cassandra, Kafka, Scala, Spark, Spark Streaming, Spark SQL, and Storm.

Technologies: JSP, J2EE, JDBC, Hibernate, Spring, Ajax, RESTful web services

Development Tools (IDEs): Eclipse, NetBeans, IntelliJ

Web/Application Servers: Tomcat, WebLogic, IBM WebSphere, JBoss

Database: Oracle 11g, MS SQL Server 2008, MySQL, HBase

Platforms: Windows, Unix, Linux

Testing Tools: JUnit, JIRA

Version Control Tools: Git, GitHub

Methodologies: Agile (SCRUM), Waterfall

Build Tools: Maven, Gradle

PROFESSIONAL EXPERIENCE

Confidential, Kansas City, Missouri

Hadoop/Spark Developer

Responsibilities:

Expertise in designing and deploying Hadoop clusters and various Big Data analytics tools including Pig, Hive, HBase, Oozie, ZooKeeper, Sqoop, Flume, Spark, and Kafka.
Developed Spark code using Scala and Spark SQL for faster testing and processing of data.
Used the Spark API over Cloudera Hadoop YARN to perform analytics on data in Hive.
Explored Spark for improving the performance and optimization of existing Hadoop algorithms using SparkContext, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
Experienced with batch processing of data sources using Apache Spark.
Developed analytical components using Scala, Spark, YARN, and Spark Streaming.
Experienced with NoSQL databases like HBase, MongoDB and Cassandra.
Installed Hadoop, MapReduce, and HDFS, and developed multiple MapReduce jobs in Pig and Hive for data cleaning and pre-processing.
Developed Kafka producers and consumers, as well as Spark and Hadoop MapReduce jobs.
Used the Spark API over Hortonworks Hadoop YARN to perform analytics on data in Hive.
Developed Spark code using Scala and Spark SQL/Streaming for faster testing and processing of data.
Imported data from different sources such as HDFS and HBase into Spark RDDs.
Configured, deployed, and maintained multi-node Dev and Test Kafka clusters.
Converted MapReduce programs into Spark transformations using Spark RDDs in Scala.
Developed Spark scripts using Scala shell commands as per requirements.
Performed transformations, cleaning, and filtering on imported data using Hive and MapReduce, and loaded the final data into HDFS.
Loaded data into Spark RDDs and performed in-memory computation to generate the output response.
Experience using the Oozie workflow scheduler to manage Hadoop jobs with control flows.
Expertise in data modeling and data warehouse design and development.

Environment: Hadoop, HDFS, Spark, MapReduce, Pig, Hive, Sqoop, Kafka, HBase, Oozie, Flume, Scala, Java, SQL Scripting and Linux Shell Scripting.

Confidential, Naperville IL

Spark/Java Developer

Responsibilities:

Developed a streaming application using Spark Streaming (2.x). The end-to-end data flow includes NiFi, Kafka, Spark Streaming, and HBase.
Developed Spark code using Scala and Spark SQL for faster testing and processing of data.
Developed Kafka producers and consumers, as well as Spark and Hadoop MapReduce jobs.
Performed transformations, cleaning, and filtering on imported data using Hive and MapReduce, and loaded the final data into HBase.
Loaded data into Spark RDDs and performed in-memory computation to generate the output response.
Streamed JSON-formatted data from different Kafka topics and loaded it into HBase in real time.
Developed Spark code using Scala and Spark SQL/Streaming for faster testing and processing of data.
Developed real-time data validation checks on the streaming data before loading it into HBase tables.
Worked on row key design and table design.
Generated daily reports on the HBase tables using the Spark HBase API, including audit batch reports and data validation reports.
Created time series data from the daily data to support further time series analysis and machine learning on that data.

Environment: Spark, Spark Streaming, Java, Scala, HBase, Hive, Kafka, IntelliJ, NiFi, Zeppelin

Confidential, Ann Arbor, MI

Software Engineer

Responsibilities:

Expertise in designing and deploying Hadoop clusters and various Big Data analytics tools including Pig, Hive, HBase, Oozie, ZooKeeper, Sqoop, Flume, Spark, and Kafka.
Developed Spark code using Scala and Spark SQL for faster testing and processing of data.
Used the Spark API over Cloudera Hadoop YARN to perform analytics on data in Hive.
Explored Spark for improving the performance and optimization of existing Hadoop algorithms using SparkContext, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
Experienced with batch processing of data sources using Apache Spark.
Developed analytical components using Scala, Spark, YARN, and Spark Streaming.
Experienced with NoSQL databases like HBase, MongoDB and Cassandra.
Installed Hadoop, MapReduce, and HDFS, and developed multiple MapReduce jobs in Pig and Hive for data cleaning and pre-processing.
Extracted data from Oracle, flat files, and XML files using Talend, with Java as the back-end language.
Wrote UNIX shell scripts in combination with Informatica sessions to process the source files and load them into the staging database.
Developed Kafka producers and consumers, as well as Spark and Hadoop MapReduce jobs.
Used the Spark API over Hortonworks Hadoop YARN to perform analytics on data in Hive.
Developed Spark code using Scala and Spark SQL/Streaming for faster testing and processing of data.
Imported data from different sources such as HDFS and HBase into Spark RDDs.
Configured, deployed, and maintained multi-node Dev and Test Kafka clusters.
Converted MapReduce programs into Spark transformations using Spark RDDs in Scala.
Developed Spark scripts using Scala shell commands as per requirements.
Performed transformations, cleaning, and filtering on imported data using Hive and MapReduce, and loaded the final data into HDFS.
Loaded data into Spark RDDs and performed in-memory computation to generate the output response.
Experience using the Oozie workflow scheduler to manage Hadoop jobs with control flows.
Expertise in data modeling and data warehouse design and development.

Environment: Hadoop, HDFS, Spark, MapReduce, Pig, Hive, Sqoop, Kafka, HBase, Oozie, Flume, Scala, Java, SQL Scripting, Oracle and Linux Shell Scripting.

Confidential

Software Engineer

Responsibilities:

Implemented server-side programs by using Servlets and JSP.
Designed, developed, and validated the user interface using HTML, JavaScript, XML, and CSS.
Implemented MVC using Struts Framework.
Involved in implementing the DAO pattern for database access and used the JDBC API extensively.
Used XML Web services for transferring data between different applications and retrieving credit information from the credit bureau.
Used XML with DTDs and their references in the files. Used the JAXB API to bind XML schemas to Java classes.
Used JMS-MQ Bridge to send messages securely, reliably and asynchronously to WebSphere MQ, which connects to the legacy systems.
Tested the application functionality with JUnit Struts Test Cases.
Developed the GUI using JSF and Java Swing.
Developed a logging module using Log4J to create log files for debugging and tracing the application.
Used CVS for version control.
Extensively used ANT as a build tool. Deployed the applications on IBM WebSphere Application Server.
Handled database access by implementing a controller servlet.
Implemented PL/SQL stored procedures and triggers.
Used JDBC prepared statements called from servlets for database access.
Used Log4J to log application errors. Wrote test cases using JUnit.

Environment: Java 1.4, J2EE, JSP, Servlets, HTML, DHTML, XML, JavaScript, Eclipse, WebLogic, Struts, WebSphere MQ 5.3, Java SDK 1.4, MVC, Core Java, Servlet 2.2, JSP 2.0, JDBC, PL/SQL, XML Web Services, XML DTD, Apache Tomcat, ASP, Spring 1.0.2, SOAP, WSDL, Windows 2000, Oracle 9i, JUnit, CVS, ANT 1.5, and Log4J.
