Hadoop/Spark Developer Resume
New York, NY
SUMMARY:
- 7+ years of IT experience, including 3 years as a Hadoop consultant on big data conversion projects, gathering and analyzing customers' technical requirements.
- Working experience with the Cloudera and Hortonworks Hadoop distributions.
- Good domain knowledge of Insurance, Banking, and E-commerce.
- In-depth understanding of Hadoop architecture and its components, including HDFS, YARN, Hive, Sqoop, HBase, Flume, Oozie, NameNode, DataNode, and MapReduce concepts.
- Experience writing MapReduce programs on Apache Hadoop to analyze large data sets efficiently.
- Hands-on experience with ecosystem tools such as Hive, Pig, Sqoop, MapReduce, Flume, and Oozie; strong knowledge of Pig and Hive analytical functions and of writing custom UDFs (see the sketch after this list).
- Experience importing and exporting data between HDFS and relational database systems using Sqoop.
- Good knowledge of Spark and its components, such as Spark Core and Spark SQL.
- Experienced in developing simple to complex MapReduce, Hive, and Pig jobs that handle files in multiple formats (JSON, text, XML, Avro, SequenceFile, etc.).
- Expertise in J2EE frameworks, Servlets, JSP, JDBC, and XML; familiar with systems programming in C and C++.
- In-depth knowledge of OOAD concepts and multithreading; experienced in creating activity, sequence, and class diagrams using UML.
- Experience using design patterns (Singleton, Factory, Builder) and MVC architecture.
- Have very good exposure to the entire Software Development Life Cycle.
- Excellent organizational and interpersonal skills with a strong technical background.
- Quick learner with the ability to work in challenging and versatile environments; self-motivated, with excellent written and verbal communication skills.
- Good experience in performing and supporting Unit testing, System Integration testing (SIT), UAT and production support for issues raised by application users.
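To make the custom-UDF claim above concrete, here is a minimal sketch of a Hive UDF, assuming the classic org.apache.hadoop.hive.ql.exec.UDF API; the class name and normalization logic are hypothetical, not taken from a real project:

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hypothetical Hive UDF: normalizes a string column (trim + lowercase).
public class NormalizeString extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null; // preserve Hive NULLs
        }
        return new Text(input.toString().trim().toLowerCase());
    }
}
```

In Hive such a UDF would be registered with ADD JAR and CREATE TEMPORARY FUNCTION before being used in queries.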
TECHNICAL SKILLS:
Languages/Scripting: Java, Python, Pig Latin, Scala, HiveQL, SQL, Linux shell scripts, JavaScript.
Big Data Framework/Stack: Hadoop HDFS, MapReduce, YARN, Hive, Hue, Impala, Sqoop, Pig, HBase, Spark, Kafka, Flume, Oozie, Zookeeper, KNIME, etc.
Hadoop Distributions: Apache Hadoop, Cloudera CDH5, Hortonworks HDP 2.x
RDBMS: Oracle, DB2, SQL Server, MySQL
NoSQL Databases: HBase, MongoDB
Software Methodologies: SDLC - Waterfall, Agile/Scrum
Operating Systems: Windows XP/NT/7/8, Red Hat Linux, CentOS, Mac
IDEs: NetBeans, Eclipse
File Formats: XML, text, SequenceFile, JSON, ORC, Avro, and Parquet.
PROFESSIONAL EXPERIENCE:
Confidential - New York, NY
Hadoop/Spark Developer
Responsibilities:
- Used the Cloudera distribution of the Hadoop ecosystem.
- Converted MapReduce jobs into Spark transformations and actions using Spark RDDs in Python.
- Wrote Spark jobs in Python to analyze customer and sales-history data.
- Used Kafka to ingest data from multiple sources into HDFS.
- Designed HBase row keys to store text and JSON values in HBase tables (see the sketch after this list).
- Involved in collecting and aggregating large amounts of log data using Apache Flume and staging data in HDFS for further analysis.
- Wrote Python applications that interact with the MySQL database through Spark's SQLContext and access Hive tables through HiveContext.
- Created Hive external tables to perform ETL on data generated on a daily basis.
- Created HBase tables for random lookups as per requirement of business logic.
- Performed transformations using Spark and loaded the data into HBase tables.
- Performed validation on the data ingested to filter and cleanse the data in Hive.
- Created Sqoop jobs to handle incremental loads from RDBMS into HDFS.
- Imported data as Parquet files using Sqoop for some use cases to improve processing speed for later analytics.
- Collected log data from web servers and pushed it to HDFS using Flume.
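A minimal sketch of the row-key design described above, assuming the HBase 1.x client API shipped with later CDH 5 releases; the table name, column family, and key layout are illustrative assumptions, not the actual project schema:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

// Hypothetical composite row key: <customerId>|<reversed timestamp>,
// so a customer's most recent events sort first in a scan.
public class EventWriter {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("customer_events"))) {
            String customerId = "C12345";
            long reversedTs = Long.MAX_VALUE - System.currentTimeMillis();
            byte[] rowKey = Bytes.toBytes(customerId + "|" + reversedTs);

            Put put = new Put(rowKey);
            // One column family holds both the raw text and the JSON payload.
            put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("text"),
                          Bytes.toBytes("order shipped"));
            put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("json"),
                          Bytes.toBytes("{\"orderId\":42,\"status\":\"shipped\"}"));
            table.put(put);
        }
    }
}
```

Reversing the timestamp is one common way to avoid hotspotting on monotonically increasing keys while keeping recent data cheap to read.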
Environment: Hadoop, Hive, Flume, Red Hat 6.x, shell scripting, Java, Eclipse, HBase, Kafka, Spark (Python), Oozie, Zookeeper, CDH 5.x, HQL/SQL, Oracle 11g.
Confidential, Rosemont, IL
Hadoop Developer
Responsibilities:
- Worked on the POC for the Apache Hadoop framework initiative.
- Installed and configured Hadoop 0.22.0 MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing (see the sketch after this list).
- Imported and exported data into HDFS and Hive using Sqoop.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Implemented partitioning, dynamic partitions, and bucketing in Hive.
- Responsible for managing data coming from different sources.
- Monitored the running MapReduce programs on the cluster.
- Responsible for loading data from UNIX file systems to HDFS.
- Installed and configured Hive and wrote Hive UDFs.
- Implemented workflows using the Apache Oozie framework to automate tasks.
- Developed scripts that automated data management end to end and kept all the clusters in sync.
- Managed IT and business stakeholders; conducted assessment interviews and solution review sessions.
- Reviewed the developed code and flagged any issues with respect to customer data.
- Used SQL queries and other tools to perform data analysis and profiling.
- Mentored and trained the engineering team in the use of the Hadoop platform, analytical software, and development technologies.
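A hedged sketch of the kind of data-cleaning MapReduce job described above, written against the stable org.apache.hadoop.mapreduce API (shown in the later Job.getInstance style); the delimiter and expected field count are assumptions:

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Hypothetical map-only cleaning job: drops delimited records with the
// wrong field count and trims whitespace from the remaining fields.
public class CleanRecordsJob {

    public static class CleanMapper
            extends Mapper<LongWritable, Text, NullWritable, Text> {
        private static final int EXPECTED_FIELDS = 5; // assumption

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split(",", -1);
            if (fields.length != EXPECTED_FIELDS) {
                return; // skip malformed record
            }
            StringBuilder out = new StringBuilder();
            for (int i = 0; i < fields.length; i++) {
                if (i > 0) out.append(',');
                out.append(fields[i].trim());
            }
            context.write(NullWritable.get(), new Text(out.toString()));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "clean-records");
        job.setJarByClass(CleanRecordsJob.class);
        job.setMapperClass(CleanMapper.class);
        job.setNumReduceTasks(0); // map-only: cleaned rows go straight to HDFS
        job.setOutputKeyClass(NullWritable.class);
        job.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```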
Environment: Apache Hadoop, Java (JDK 1.6), DataStax, flat files, Oracle 11g/10g, MySQL, Toad 9.6, Windows NT, CentOS, Sqoop, Hive, Oozie.
Confidential - Oakland, California
Hadoop Developer
Responsibilities:
- Involved in the complete big data flow of the application, from upstream data ingestion into HDFS to processing and analyzing the data in HDFS.
- Imported and exported data into HDFS using Sqoop and Kafka.
- Created Hive tables and worked on them using HiveQL.
- Created partitioned tables in Hive for best performance and faster querying.
- Developed workflow in Oozie to automate the tasks of loading the data into HDFS and pre-processing with Pig.
- Worked on Hive UDFs using data from HDFS.
- Performed extensive data analysis using Hive.
- Executed different types of joins on Hive tables.
- Used Impala for faster querying purposes.
- Created indexes and tuned the SQL queries in Hive.
- Used the Oozie workflow engine to schedule multiple Hive jobs.
- Developed HiveQL scripts to perform the incremental loads (see the sketch after this list).
- Worked with different big data file formats such as text, SequenceFile, Avro, and Parquet, and with Snappy compression.
- Involved in identifying possible ways to improve the efficiency of the system.
- Involved in generating data cubes for visualization.
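The incremental loads above would typically be HiveQL scripts scheduled through the Oozie workflows mentioned earlier; as a Java sketch, here is the same daily partition insert issued through the HiveServer2 JDBC driver. The host, table, and column names are hypothetical:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

// Hypothetical daily incremental load into a partitioned Hive table
// over HiveServer2 JDBC. Host/table/columns are assumptions; a real
// script would validate the date argument before building the query.
public class DailyIncrementalLoad {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        try (Connection conn = DriverManager.getConnection(
                "jdbc:hive2://hiveserver.example.com:10000/default", "etl", "");
             Statement stmt = conn.createStatement()) {
            String loadDate = args[0]; // e.g. "2016-03-14"
            // Append only the new day's rows into the matching partition.
            stmt.execute(
                "INSERT INTO TABLE sales_partitioned PARTITION (load_date='"
                + loadDate + "') "
                + "SELECT order_id, customer_id, amount FROM sales_staging "
                + "WHERE event_date = '" + loadDate + "'");
        }
    }
}
```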
Environment: Hadoop, Hive, Pig, Sqoop, Kafka, Oozie, Impala, Flume, MySQL, Zookeeper, HBase, Cloudera Manager, MapReduce.
Confidential - Tampa, Florida
Hadoop Developer
Responsibilities:
- Responsible for managing data coming from different sources.
- Involved in loading and transforming large sets of structured, semi-structured, and unstructured data.
- Developed Pig UDFs in Python for preprocessing the data (a Java equivalent is sketched after this list).
- Worked extensively on flat files.
- Performed join, grouping, and count operations on the tables using Impala.
- Developed Pig Latin scripts for validating different query modes.
- Created workflows to run multiple Hive and Pig jobs, which run independently based on time and data availability.
- Created Sqoop jobs to export analyzed data to a relational database.
- Created Hive tables, loaded data, and wrote Hive queries that run as MapReduce jobs in the backend.
- Implemented bucketing, partitioning and other query performance tuning techniques.
- Generated various reports using Tableau with Hadoop as a source for data.
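The preprocessing UDFs above were written in Python; to keep these sketches in a single language, here is an equivalent (hypothetical) Pig UDF as a Java EvalFunc:

```java
import java.io.IOException;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

// Hypothetical preprocessing UDF: trims and upper-cases a chararray
// field, mirroring the kind of cleanup a Python UDF might have done.
public class TrimUpper extends EvalFunc<String> {
    @Override
    public String exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0 || input.get(0) == null) {
            return null; // pass NULLs through
        }
        return input.get(0).toString().trim().toUpperCase();
    }
}
```

In Pig Latin it would be registered with REGISTER udfs.jar; and called as TrimUpper(field) inside a FOREACH ... GENERATE.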
Environment: Hadoop, MapReduce, Hive, Pig, Tableau, Python, Sqoop, Oozie, Impala, Flume, MySQL, Zookeeper, HBase, Cloudera Manager.
Confidential, NYC, NY
Java Developer
Responsibilities:
- Involved in the full Software Development Life Cycle (SDLC) of the tracking system: requirements gathering, conceptual design, analysis, detailed design, development, system testing, and user acceptance.
- Worked in Agile Scrum methodology
- Involved in writing exception and validation classes using core Java.
- Designed and implemented the user interface using JSP, XSL, DHTML, Servlets, JavaScript, HTML, CSS and AJAX
- Developed framework using Java, MySQL and web server technologies
- Developed and performed unit testing using the JUnit framework in a test-driven development (TDD) environment (see the sketch after this list).
- Validated the XML documents with XSD validation and transformed them to XHTML using XSLT.
- Implemented cross-cutting concerns as aspects at the service layer using Spring AOP, and implemented DAO objects using Spring ORM.
- Used Spring beans for controlling the flow between the UI and Hibernate.
- Implemented an SOA architecture with Web Services using SOAP, WSDL, UDDI, and XML, with the CXF framework and Apache Commons.
- Worked on the database interaction layer for insert, update, and retrieval operations, using queries and stored procedures.
- Wrote stored procedures and complicated queries for IBM DB2.
- Used Eclipse IDE for development and JBoss Application Server for deploying the web application
- Used Apache Camel for creating routes using Web Services.
- Used JReport for the generation of reports of the application
- Used WebLogic as an application server and Log4j for application logging and debugging.
- Used CVS as the version control tool and ANT as the project build tool.
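A minimal sketch of the TDD workflow described above: a JUnit 4 test written against a small validation class like those mentioned in the bullets. The class, method names, and validation rule are hypothetical, not from the actual tracking system:

```java
import static org.junit.Assert.assertFalse;
import static org.junit.Assert.assertTrue;
import org.junit.Test;

// Hypothetical validation class plus the test written first, TDD-style.
class AccountValidator {
    boolean isValidAccountNumber(String s) {
        return s != null && s.trim().matches("ACC-\\d{5}");
    }
    String normalize(String s) {
        return s.trim().toUpperCase();
    }
}

public class AccountValidatorTest {
    @Test
    public void rejectsEmptyAccountNumber() {
        assertFalse(new AccountValidator().isValidAccountNumber(""));
    }

    @Test
    public void normalizesBeforeValidating() {
        AccountValidator v = new AccountValidator();
        assertTrue(v.isValidAccountNumber(v.normalize(" acc-00042 ")));
    }
}
```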
Environment: Java, HTML, CSS, JSTL, JavaScript, Servlets, JSP, Hibernate, Struts, Web Services, Eclipse, JBoss, JMS, JReport, Scrum, MySQL, IBM DB2, SOAP, WSDL, UDDI, AJAX, XML, XSD, XSLT, Oracle, Linux, Log4j, JUnit, ANT, CVS.
Confidential
Java Developer
Responsibilities:
- Involved in designing and developing enhancements per business requirements for front-end JSP development using Struts.
- Implemented the project using JSP- and Servlet-based tag libraries.
- Conducted client-side validations using JavaScript.
- Coded JDBC calls in the Servlets to access the Oracle database tables.
- Generated SQL scripts to update the parsed messages in the database.
- Worked on parsing RSS feed (XML) files using SAX parsers (see the sketch after this list).
- Designed and coded the Java class that handles errors and logs them to a file.
- Developed graphical user interfaces using Struts, Tiles, and JavaScript; used JSP, JavaScript, and JDBC to create web Servlets.
- Utilized the mail merge techniques in MS Word for time reduction in sending certificates.
- Involved in documentation, review, and analysis, and fixed post-production issues.
- Worked on bug fixing and enhancements on change requests.
- Designed various animations and graphics using Macromedia Flash MX with ActionScript 1.0, PhotoImpact, and GIF Animator.
- Understood customer requirements, mapped them to functional requirements, and created requirement specifications.
- Developed web pages to display account transactions and created the application UI using GWT, Java, JSP, CSS, and web standards, improving application usability while meeting tight deadlines.
- Responsible for configuring the Struts web-based application using struts-config.xml and web.xml.
- Modified Struts configuration files per application requirements and developed web services for non-Java clients to obtain user account details, using JSP, DHTML, Spring Web Flow, and CSS.
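A minimal sketch of the SAX-based RSS parsing described above, using the standard javax.xml.parsers API; the feed path and the choice to collect item titles are illustrative assumptions:

```java
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical SAX handler: collects the <title> of each <item> in an
// RSS feed, buffering character data between start and end events.
public class RssTitleHandler extends DefaultHandler {
    private final List<String> titles = new ArrayList<String>();
    private boolean inItem = false;
    private boolean inTitle = false;
    private final StringBuilder text = new StringBuilder();

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes attributes) {
        if ("item".equals(qName)) inItem = true;
        if (inItem && "title".equals(qName)) {
            inTitle = true;
            text.setLength(0);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (inTitle) text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (inItem && "title".equals(qName)) {
            titles.add(text.toString().trim());
            inTitle = false;
        }
        if ("item".equals(qName)) inItem = false;
    }

    public static void main(String[] args) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        RssTitleHandler handler = new RssTitleHandler();
        parser.parse("feed.xml", handler); // path is a placeholder
        for (String title : handler.titles) System.out.println(title);
    }
}
```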
Environment: HTML/CSS/JavaScript/JSON, JDK 1.3, J2EE, Servlets, JavaBeans, MDB, JDBC, MS SQL Server, JBoss, frameworks & libraries (Struts, Spring MVC, jQuery), MVC concepts, XML, SVN.