Hadoop/Big Data Developer Resume

Carrollton, TX

PROFESSIONAL SUMMARY:

  • Over 9 years of experience in software development, including analysis, design, and development of quality software for standalone and web-based applications using Java/J2EE technologies and software development methodologies and frameworks such as SDLC, OOAD, and Agile.
  • Developed web applications based on different Design Patterns such as Model-View-Controller (MVC), Data Access Object (DAO), Singleton Pattern, Front Controller, Business Delegate, Service Locator, Transfer Objects etc.
  • Experienced in using Java tools such as IntelliJ and Eclipse.
  • Good knowledge of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and MapReduce concepts; responsible for writing MapReduce programs and setting up standards and processes for Hadoop-based application design and implementation.
  • Performed benchmarking and optimization of H-scale implemented Big Data components.
  • Involved in data acquisition, data pre-processing, and data exploration for a telecommunications project in Scala.
  • Expertise with different tools in the Hadoop environment, including Pig, Hive, HDFS, MapReduce, Spark, Kafka, YARN, and ZooKeeper.
  • Extensively used Scala for functional application programming, creating GUIs and charts, and data analytics.
  • Used different Spark modules such as Spark Core, RDDs, DataFrames, and Spark SQL.
  • Developed various web applications using the Scala Play Framework with REST APIs and the MVC pattern.
  • Expertise in developing data-driven applications using Python 2.7 and Python 3.5 with PySpark.
  • Experience with machine learning in Python, including regression and natural language processing, using toolkits such as NumPy, SciPy, and PySpark.
  • Experience in installation, configuration, and deployment of Big Data solutions.
  • Experience in using Cloudera Manager for installation and management of single-node and multi-node Hadoop clusters.
  • Worked with Spark to improve the performance and optimization of existing algorithms in Hadoop using SparkContext, Spark SQL, Spark MLlib, DataFrames, pair RDDs, and Spark on YARN.
  • Very good understanding and working knowledge of object-oriented programming (OOP), Python, and Scala.
  • Expertise in MapReduce-based programs in Hive and Pig to validate and cleanse data in HDFS obtained from heterogeneous data sources, making it suitable for analysis.
  • Analyzed and transformed stored data by writing MapReduce jobs based on business requirements.
  • Experience in developing Pig scripts and Hive Query Language (HiveQL) queries.
  • Hands-on experience working with the NoSQL database Cassandra.
  • Experience in developing Pig Latin scripts to extract the data from the web server output files to load into HDFS.
  • Detailed understanding of Software Development Life Cycle (SDLC) and sound knowledge of project implementation methodologies including Waterfall and Agile.

TECHNICAL SKILLS:

Hadoop Technologies: HBase, HIVE, Sqoop, Flume, HDFS, Oozie, Zookeeper, YARN, Spark, Kafka, Sentry, Falcon, Pig

J2EE Technologies: Servlets, JSP, EJB, JDBC, Web Services (WSDL, SOAP), Spring and

Web Services / Application Servers: Apache Tomcat Server, IBM WebSphere Server, JBoss

Web Tools and Languages: HTML, XML, CSS, DHTML, JavaScript

Databases (SQL): IBM DB2, Oracle 8i/9i/10g, MS SQL Server 2005/2008, MySQL

Databases (NoSQL): Pig, Hive, Cassandra, MongoDB, HBase

Languages: Scala, Python, Java / J2EE, HTML, SQL

OS: Windows 2003/2008/XP/Vista, Unix, Linux (Various Versions)

Tools: MS Office 2003/2007/2010, Eclipse 3.3/3.4, NetBeans

Version Control: IBM RTC

Bug Reporting Tools: Bugzilla, IBM Rational ClearCase

Others: ASP.NET, VB.NET and C#

IDEs: Eclipse, NetBeans, JDeveloper, MyEclipse

PROFESSIONAL EXPERIENCE:

Confidential, Carrollton, TX

Hadoop/Big Data Developer

Responsibilities:

  • Worked with the advanced analytics team to design fraud detection algorithms and retrieve real-time streaming datasets, then developed MapReduce programs to run the algorithms efficiently on very large datasets.
  • Ran data formatting scripts in Python and created terabyte-scale CSV files to be consumed by Hadoop MapReduce jobs.
  • Performed data analysis, feature selection, and feature extraction using Apache Spark machine learning and streaming libraries in Python.
  • Developed functional programs in Scala for connecting the streaming data application and gathering web data using JSON and XML and passing it to Flume.
  • Configured Kafka to read and write messages from external programs.
  • Configured Kafka to handle real time data.
  • Extensively used Scala for connecting to and retrieving data from NoSQL databases and Hadoop tools such as MongoDB, Pig, Hive, Cassandra, and HBase.
  • Involved in administering, installing, upgrading, and managing CDH3, Pig, Hive, and HBase.
  • Played a key role in setting up a 50-node Hadoop cluster running Apache Spark, working closely with the Hadoop administration team.
  • Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs in Python and Scala.
  • Used the Spark Cassandra Connector to load data to and from Cassandra.
  • Created Hive tables to store data in HDFS, loaded data, and wrote Hive queries that run internally as MapReduce jobs.
  • Performed modeling, data mining, and advanced data processing; contributed to Big Data healthcare projects.
  • Streamed data in real time using Spark (version 1.4.0) with Kafka (version 0.8.2.2). Configured Spark Streaming to receive real-time data from Kafka and store the stream data in HDFS using Scala.
  • Uploaded and processed terabytes of data from various structured and unstructured sources into HDFS (AWS cloud) using Sqoop and Flume.
  • Involved in cluster coordination services through ZooKeeper.
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for BI team.
  • Automated the installation and maintenance of Kafka, Storm, ZooKeeper, and Elasticsearch using SaltStack.
  • Created UDFs to store specialized data structures in HBase and Cassandra.
  • Played a key role in installation and configuration of the various Hadoop ecosystem tools such as Solr, Kafka, Pig, HBase and Cassandra.
  • Implemented various Hive optimization techniques such as dynamic partitions, buckets, map joins, and parallel execution.
  • Developed web applications in the Scala Play Framework (2.4 and 2.5) using REST APIs and the MVC pattern for interfacing with HDFS.
  • Scheduled multiple Hive and Pig jobs with the Airflow workflow engine using Python.
  • Used Flume to collect log data with error messages across the cluster.
  • Extracted meaningful data from dealer CSV files and text files and generated reports with Python pandas for data analysis.
  • Utilized Python to run scripts and generate tables and reports.
  • Designed and Maintained Oozie workflows to manage the flow of jobs in the cluster.
  • Parsed JSON files through Spark Core to extract the schema of production data using Spark SQL and Scala.
  • Provided upper management with daily updates on project progress, including the classification levels achieved on the data.

Confidential, Atlanta, GA

Hadoop/Big Data Developer

Responsibilities:

  • Developed a data pipeline using Flume, Sqoop, Pig, and Java MapReduce to ingest customer behavioral data and financial histories into HDFS for statistical data analysis.
  • Worked on statistical regression and modeling, and on language processing and analysis, using Python and Scala on data in HDFS.
  • Involved in writing MapReduce jobs.
  • Developed Spark code using Scala and Spark-SQL for faster testing and data processing.
  • Ingested data using Sqoop and the HDFS put and copyFromLocal commands.
  • Used Pig to perform transformations, event joins, bot traffic filtering, and some pre-aggregations before storing the data in HDFS.
  • Developed functional programs in Scala for connecting the streaming data application in Flume.
  • Extensively used Scala for connecting to and retrieving data from NoSQL databases and Hadoop tools such as MongoDB, Pig, and Hive.
  • Involved in developing Pig UDFs for functionality not available out of the box in Apache Pig.
  • Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
  • Involved in developing Hive DDLs to create, alter and drop Hive tables.
  • Developed web applications in the Scala Play Framework (2.3 and 2.4) for interfacing with HDFS.
  • Managed Solr search engine work, including indexing data, tuning relevance, developing custom tokenizers and filters, and adding functionality such as playlists, custom sorting, and regionalization.
  • Developed and maintained operational best practices for smooth operation of large Hadoop clusters.
  • Loaded data from the UNIX file system into HDFS. Installed and configured Hive, wrote Hive UDFs, and handled cluster coordination services through ZooKeeper.
  • Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
  • Knowledge of performance troubleshooting and tuning of Hadoop clusters.
  • Involved in developing Hive UDFs for functionality not available out of the box in Apache Hive.
  • Involved in using HCatalog to access Hive table metadata from MapReduce or Pig code.
  • Used Java MapReduce to compute metrics that define user experience, revenue, etc.
  • Responsible for developing a data pipeline using Flume, Sqoop, and Pig to extract data from weblogs and store it in HDFS.
  • Extracted and updated data in MongoDB using the Mongo import and export command-line utilities.
  • Involved in using Sqoop for importing and exporting data into and out of HDFS.
  • Used Eclipse and Ant to build the application. Proficient work experience with NoSQL and MongoDB databases. Also converted HDFS data from rows to columns and columns to rows.
  • Involved in developing shell scripts to orchestrate execution of all other scripts (Pig, Hive, and MapReduce) and move data files into and out of HDFS.

Environment: MapReduce, HDFS, Hive, Pig, Hue, Oozie, Core Java, Perl/Shell scripts, Eclipse, HBase, Flume, Spark, Kafka, Cloudera Manager, Cassandra, REST API, Python, Greenplum DB, IDMS, VSAM, SQL*Plus, Toad, PuTTY, Windows NT, UNIX Shell Scripting, Pentaho, Talend, Big Data, YARN.

Confidential, OH

Java Developer

Responsibilities:

  • Worked with business analysts to understand business requirements and to design and develop the project.
  • Implemented the JSP framework with MVC architecture.
  • Created new JSPs for the front end using HTML, JavaScript, jQuery, and Ajax.
  • Developed the presentation layer using JSP, HTML, and CSS, with client-side validations in JavaScript.
  • Involved in creating RESTful web services using JAX-RS and Jersey.
  • Involved in designing, creating, and reviewing technical design documents.
  • Developed DAOs (Data Access Object) using Hibernate as ORM to interact with DBMS - Oracle.
  • Applied J2EE design patterns like Business Delegate, DAO and Singleton.
  • Involved in developing DAOs using JDBC.
  • Worked with QA team in preparation and review of test cases.
  • Used JUnit for unit testing and as the integration testing tool.
  • Wrote SQL queries to fetch business data from the Oracle database.
  • Developed the UI for Customer Service modules and reports using JSF, JSPs, and MyFaces components.
  • Used Log4j to log the running application, trace errors, and record certain automated routine functions.

Environment: Java, JSP, JavaScript, Servlets, Hibernate, REST, EJB, JSF, Ant, Tomcat, Eclipse, SQL, Oracle.

Confidential - Chicago, IL

System Lead

Responsibilities:

  • Wireless and cable network management design, database management, and web application development.
  • Cable systems performance analytics design, database management, and web application development for the large customer base of the cable industry.
  • Cable industry financial analytics development based on current performance, derived from the logs of each cable consumer.

Projects Experience:

  • Developed five web design projects and three database application projects.
  • Developed web applications using ASP.NET MVC 3/4, with the front end in CSHTML and CSS.
  • Implemented data tables in web applications using jQuery DataTables and the MVC WebGrid helper.
  • Developed regression and analytics based applications and graphics displays using Google APIs.
  • Developed applications for XML data parsing and loading relevant data in WEB Applications and databases.
  • Used LINQ for database transactions.
  • Performed CRUD operations in databases using Web Applications.
  • Developed the web interface using Servlets, JavaServer Pages, HTML, and CSS.
  • Extensively used JDBC PreparedStatement to embed SQL queries in the Java code.
  • Developed DAO (Data Access Objects) using Spring Framework 3.
  • Developed rich internet applications using Java applets, Silverlight, and Java.
  • Created PC-based applications in C, Visual C++, and Visual C# for database access, analytics, and statistics graphs.
  • Used ADO.NET classes such as SqlConnection, SqlCommand, and SqlDataAdapter.
  • Created the data layer using Entity Framework.

Environment: Visual C++, Visual C#, MVC 3/4/5, Google APIs, Servlets, Java Server Pages, HTML, and CSS

Confidential - Marysville, OH

Software Developer

Responsibilities:

  • Video surveillance system design, database management, and web application development
  • Node access Control design, database management and web application development
  • Responsible for project management and scheduling, rolling out the project plans for new projects.
  • Successfully managed cross-functional teams to keep the design process within target date.
  • Developed and managed projects for video surveillance system design, database management, and web application development
  • Developed applications for video surveillance system designs, such as user interface, onscreen displays, motion detection, and video storage in C++.
  • Developed HTML and CSS based web application for flashing embedded system codes.
  • Responsible for designing rich user interface web applications using JavaScript, CSS, and HTML.
  • Designed and implemented Web applications for Node control interfaces for surveillance based software.
  • Successfully implemented Node access control system management.
  • Implemented an ASP.NET-based web application, backed by MS SQL, for managing security and node control triggers, reporting mechanisms, and alert systems.

Environment: VISUAL BASIC, SQL, MS-SQL, C++, RTOS (NUCLEUS PLUS)

Confidential - Palo Alto, CA

Java / J2EE Developer

Responsibilities:

  • Participated in the business requirements meetings and provided inputs.
  • Involved in the complete Agile SDLC: requirements analysis, development, and system and integration testing.
  • Used Spring MVC as the framework and JavaScript for major data entry, which involved extensive client-side data validation using Ajax.
  • Used native queries and criteria queries (annotations) in Hibernate for accessing and updating data.
  • Used Spring 2.5 Framework for DI/IoC and its ORM components to support Hibernate.
  • Implemented business logic according to the requirements.
  • Worked extensively on Collections Framework.
  • Developed REST Web Services.
  • Created and maintained web pages using HTML, CSS, JavaScript, jQuery, Java, and J2EE; also responsible for designing web pages, including Ajax controls and XML.
  • Created cross-browser-compatible, interactive web pages using web technologies such as HTML, XHTML, and CSS.
  • Worked with HTML, CSS background, CSS Layouts, CSS positioning, CSS text, CSS border, CSS margin, CSS padding, Pseudo elements and CSS behaviors.
  • Followed MVC Structure to develop Application.
  • Worked with Bootstrap for CSS and JavaScript components and used its convenience methods to build the system.
  • Worked extensively on defect maintenance for front-end issues.
  • Organized the internal site for managing environments and project details using HTML, CSS, JavaScript, and jQuery, converting scrolling pages to a tabbed template structure.
  • Created and maintained the framework and layout of each portal with Cascading Style Sheets (CSS).

Environment: Java, J2EE, Spring 2.5, Spring Transactions, Spring JDBC, Spring MVC, Hibernate 3.5, XML, RESTful, WSDL, AJAX, jQuery, HTML, JavaScript, CSS, Log4J, JAXB, JUnit, WebSphere Application Server 6.0, Eclipse 3.5, Oracle 10g, JSP, Bootstrap.
