Hadoop/Big Data Developer Resume
Carrollton, TX
PROFESSIONAL SUMMARY
- Over 9 years of experience in software development, including analysis, design, and development of quality software for standalone and web-based applications using Java/J2EE technologies and software development methodologies/frameworks such as SDLC, OOAD, and Agile.
- Developed web applications based on different Design Patterns such as Model-View-Controller (MVC), Data Access Object (DAO), Singleton Pattern, Front Controller, Business Delegate, Service Locator, Transfer Objects etc.
- Experienced in using Java tools such as IntelliJ and Eclipse.
- Good knowledge of Hadoop architecture and its components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and MapReduce concepts; responsible for writing MapReduce programs and setting up standards and processes for Hadoop-based application design and implementation.
- Performance benchmarking & optimization of H-scale implemented Big data Components.
- Involved in the process of data acquisition, data pre-processing, and data exploration for a telecommunication project in Scala.
- Expertise with different tools in the Hadoop environment, including Pig, Hive, HDFS, MapReduce, Spark, Kafka, YARN, and ZooKeeper.
- Extensively used Scala for functional application programming, creating GUIs and charts, and data analytics.
- Used different Spark modules such as Spark Core, RDDs, DataFrames, and Spark SQL.
- Developed various web applications using the Scala Play Framework with REST APIs and the MVC pattern.
- Expertise in developing data-driven applications using Python 2.7 and Python 3.5 on PySpark.
- Experience with machine learning in Python, including regression and natural language processing, using toolkits such as NumPy, SciPy, and PySpark.
- Experience in installation, configuration, and deployment of Big Data solutions.
- Experience in using Cloudera Manager for installation and management of single-node and multi-node Hadoop cluster.
- Worked with Spark to improve the performance and optimization of existing algorithms in Hadoop using SparkContext, Spark SQL, Spark MLlib, DataFrames, pair RDDs, and Spark on YARN.
- Very good understanding and working knowledge of object-oriented programming (OOP), Python, and Scala.
- Expertise in writing MapReduce programs and Hive and Pig scripts to validate and cleanse data in HDFS, obtained from heterogeneous data sources, to make it suitable for analysis.
- Analyzed or transformed stored data by writing MapReduce jobs based on business requirements.
- Experience in developing Pig scripts and Hive Query Language (HiveQL) queries.
- Hands-on experience working with the NoSQL database Cassandra.
- Experience in developing Pig Latin scripts to extract data from web server output files and load it into HDFS.
- Detailed understanding of Software Development Life Cycle (SDLC) and sound knowledge of project implementation methodologies including Waterfall and Agile.
TECHNICAL SKILLS
Hadoop Technologies: HBase, HIVE, Sqoop, Flume, HDFS, Oozie, Zookeeper, YARN, Spark, Kafka, Sentry, Falcon, Pig
J2EE Technologies: Servlets, JSP, EJB, JDBC, Web Services (WSDL, SOAP), Spring and
Web Services/Application Servers: Apache Tomcat Server, IBM WebSphere Server, JBoss
Web Tools and Languages: HTML, XML, CSS, DHTML, JavaScript
Databases (SQL): IBM DB2, Oracle 8i/9i/10g, MS SQL Server 2005/2008, MySQL
Databases (NoSQL): Cassandra, MongoDB, HBase
Languages: Scala, Python, Java / J2EE, HTML, SQL
OS: Windows 2003/2008/XP/Vista, Unix, Linux (Various Versions)
Tools: MS Office 2003/2007/2010, Eclipse 3.3/3.4, NetBeans
Version Control: IBM RTC
Bug Reporting Tools: Bugzilla, IBM Rational ClearCase
Others: ASP.NET, VB.NET and C#
IDEs: Eclipse, NetBeans, JDeveloper, MyEclipse
PROFESSIONAL EXPERIENCE
Confidential, Carrollton, TX
Hadoop/Big Data Developer
Responsibilities:
- Worked with the advanced analytics team to design fraud detection algorithms and retrieve real-time streaming datasets, then developed MapReduce programs to run the algorithms efficiently on the large datasets.
- Ran data formatting scripts in Python and created terabyte-scale CSV files to be consumed by Hadoop MapReduce jobs.
- Performed data analysis, feature selection, and feature extraction using Apache Spark machine learning and streaming libraries in Python.
- Developed functional programs in Scala for connecting the streaming data application, gathering web data as JSON and XML, and passing it to Flume.
- Configured Kafka to read and write messages from external programs.
- Configured Kafka to handle real time data.
- Extensively used Scala for connecting to and retrieving data from NoSQL databases (MongoDB, Cassandra, and HBase) as well as from Hive and Pig.
- Involved in administering, installing, upgrading, and managing CDH3, Pig, Hive, and HBase.
- Played a key role in setting up a 50-node Hadoop cluster utilizing Apache Spark by working closely with the Hadoop administration team.
- Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs in Python and Scala.
- Used the Spark-Cassandra Connector to load data to and from Cassandra.
- Created Hive tables to store data in HDFS, loaded data, and wrote Hive queries that run internally as MapReduce jobs.
- Performed modeling, data mining, and advanced data processing; contributed to healthcare projects based on Big Data.
- Streamed data in real time using Spark 1.4.0 with Kafka 0.8.2.2; configured Spark Streaming to receive real-time data from Kafka and store the stream in HDFS using Scala.
- Uploaded and processed terabytes of data from various structured and unstructured sources into HDFS (AWS cloud) using Sqoop and Flume.
- Involved in Cluster coordination services through Zookeeper.
- Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
- Automated the installation and maintenance of Kafka, Storm, ZooKeeper, and Elasticsearch using SaltStack.
- Created UDFs to store specialized data structures in HBase and Cassandra.
- Played a key role in the installation and configuration of various Hadoop ecosystem tools such as Solr, Kafka, Pig, HBase, and Cassandra.
- Implemented various Hive optimization techniques such as dynamic partitions, buckets, map joins, and parallel execution.
- Developed web applications in the Scala Play Framework (2.4 and 2.5) using REST APIs and the MVC pattern for interfacing with HDFS.
- Scheduled multiple Hive and Pig jobs with the Airflow workflow engine using Python.
- Used Flume to collect log data with error messages across the cluster.
- Extracted meaningful data from dealer CSV and text files and generated reports with Python pandas for data analysis.
- Utilized Python to run scripts and generate tables and reports.
- Designed and maintained Oozie workflows to manage the flow of jobs in the cluster.
- Parsed JSON files through Spark Core to extract the schema for the production data using Spark SQL and Scala.
- Provided upper management with daily updates on project progress, including the classification levels achieved on the data.
Confidential - Memphis, TN
Hadoop/Big Data Developer
Responsibilities:
- Developed data pipeline using Flume, Sqoop, Pig and Java map reduce to ingest customer behavioral data and financial histories into HDFS for statistical data analysis.
- Worked on statistical regression and modeling, and on language processing and analysis, using Python and Scala on data in HDFS.
- Involved in writing Map Reduce jobs.
- Developed Spark code using Scala and Spark-SQL for faster testing and data processing.
- Ingested data using Sqoop and the HDFS put and copyFromLocal commands.
- Used Pig to perform transformations, event joins, bot traffic filtering, and some pre-aggregations before storing the data in HDFS.
- Developed functional programs in Scala for connecting the streaming data application to Flume.
- Extensively used Scala for connecting to and retrieving data from MongoDB, Pig, and Hive.
- Involved in developing Pig UDFs for functionality that is not available out of the box in Apache Pig.
- Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
- Involved in developing Hive DDLs to create, alter and drop Hive tables.
- Developed web applications in the Scala Play Framework (2.3 and 2.4) for interfacing with HDFS.
- Managed Solr search engine work, including indexing data, tuning relevance, developing custom tokenizers and filters, and adding functionality such as playlists, custom sorting, and regionalization.
- Developed and maintained operational best practices for smooth operation of large Hadoop clusters.
- Involved in loading data from the UNIX file system to HDFS. Installed and configured Hive, wrote Hive UDFs, and set up cluster coordination services through ZooKeeper.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Knowledge of performance troubleshooting and tuning of Hadoop clusters.
- Involved in developing Hive UDFs for functionality that is not available out of the box in Apache Hive.
- Involved in using HCatalog to access Hive table metadata from MapReduce or Pig code.
- Used Java MapReduce to compute various metrics that define user experience, revenue, etc.
- Responsible for developing a data pipeline using Flume, Sqoop, and Pig to extract data from weblogs and store it in HDFS.
- Extracted and updated data in MongoDB using the mongoimport and mongoexport command-line utilities.
- Involved in using Sqoop for importing data into and exporting data out of HDFS.
- Used Eclipse and Ant to build the application. Proficient work experience with NoSQL databases such as MongoDB. Also transformed the HDFS data from rows to columns and columns to rows.
- Involved in developing shell scripts to orchestrate execution of all other scripts (Pig, Hive, and MapReduce) and move data files into and out of HDFS.
Environment: MapReduce, HDFS, Hive, Pig, Hue, Oozie, Core Java, Perl/Shell scripts, Eclipse, Hbase, Flume, Spark, Kafka, Cloudera Manager, Cassandra, REST API, Python, Greenplum DB, IDMS, VSAM, SQL*PLUS, Toad, Putty, Windows NT, UNIX Shell Scripting, Pentaho, Talend, Bigdata, YARN.
Confidential, Marysville, OH
Java Developer
Responsibilities:
- Worked with business analysts to understand business requirements and on the design and development of the project.
- Implemented the JSP framework with MVC architecture.
- Created new JSPs for the front end using HTML, JavaScript, jQuery, and Ajax.
- Developed the presentation layer using JSP, HTML, and CSS, with client-side validations in JavaScript.
- Involved in creating RESTful web services using JAX-RS and Jersey.
- Involved in designing, creating, reviewing Technical Design Documents.
- Developed DAOs (Data Access Object) using Hibernate as ORM to interact with DBMS - Oracle.
- Applied J2EE design patterns like Business Delegate, DAO and Singleton.
- Involved in developing DAOs using JDBC.
- Worked with QA team in preparation and review of test cases.
- Used JUnit for unit testing and integration testing.
- Wrote SQL queries to fetch business data from the Oracle database.
- Developed the UI for Customer Service modules and reports using JSF, JSPs, and MyFaces components.
- Used Log4j to log the running system's application activity, trace errors, and record certain automated routine functions.
Environment: Java, JSP, JavaScript, Servlets, Hibernate, REST, EJB, JSF, Ant, Tomcat, Eclipse, SQL, Oracle.
Confidential - Chicago, IL
System Lead
Responsibilities:
- Wireless and Cable network management design, database management, and web application development
- Cable systems performance analytics design, database management and web application development for huge customer base of cable industry.
- Developed cable industry financial analytics based on current performance, derived from the logs of each cable consumer.
- Developed five web design projects and three database applications.
- Developed web applications using MVC 3/4, with the front end in CSHTML and CSS.
- Implemented data tables in web applications using jQuery DataTables and MVC WebGrid methods.
- Developed regression and analytics based applications and graphics displays using Google APIs.
- Developed applications for parsing XML data and loading the relevant data into web applications and databases.
- Used LINQ for database transactions.
- Performed CRUD operations in databases using Web Applications.
- Developed the web interface using Servlets, JavaServer Pages, HTML, and CSS.
- Extensively used JDBC PreparedStatement to embed SQL queries in the Java code.
- Developed DAO (Data Access Objects) using Spring Framework 3.
- Developed rich Internet applications using Java applets, Silverlight, and Java.
- Created PC based applications in C, Visual C++, Visual C# for database access, Analytics, and Statistics graphs.
- Used ADO.NET classes such as SqlConnection, SqlCommand, and SqlDataAdapter.
- Created the data layer using Entity Framework.
Environment: Visual C++, Visual C#, MVC 3/4/5, Google APIs, Servlets, Java Server Pages, HTML, and CSS
Confidential - Marysville, OH
Software Developer
Responsibilities:
- Video surveillance system design, database management, and web application development
- Node access Control design, database management and web application development
- Responsible for project management and scheduling, and for rolling out project plans for new projects.
- Successfully managed cross-functional teams to keep the design process on schedule.
- Developed and managed projects for video surveillance system design, database management, and web application development
- Developed applications for video surveillance system designs, such as user interface, onscreen displays, motion detection, and video storage in C++.
- Developed HTML and CSS based web application for flashing embedded system codes.
- Responsible for designing Rich user Interface Web Applications using JavaScript, CSS, HTML.
- Designed and implemented Web applications for Node control interfaces for surveillance based software.
- Successfully implemented Node access control system management.
- Implemented ASP.Net based web application for managing Security and Node control triggers and reporting mechanisms and alert systems in MS-SQL.
Environment: VISUAL BASIC, SQL, MS-SQL, C++, RTOS (NUCLEUS PLUS)
Confidential - Palo Alto, CA
Java/J2EE Developer
Responsibilities:
- Participated in business requirements meetings and provided input.
- Involved in complete Agile/SDLC - Requirement Analysis, Development, System and Integration Testing.
- Used Spring MVC as framework and JavaScript for major data entry, which involved extreme level of data validation at client side using Ajax.
- Used native queries and Criteria queries (annotations) in Hibernate for accessing and updating data.
- Used Spring 2.5 Framework for DI/IoC and ORM components to support Hibernate.
- Implemented business logic according to the requirements.
- Worked extensively on Collections Framework.
- Developed REST Web Services.
- Created and maintained web pages using HTML, CSS, JavaScript, jQuery, Java, and J2EE; also responsible for designing web pages, including Ajax controls and XML.
- Created cross-browser-compatible, interactive web pages using web technologies such as HTML, XHTML, and CSS.
- Worked with HTML, CSS background, CSS Layouts, CSS positioning, CSS text, CSS border, CSS margin, CSS padding, Pseudo elements and CSS behaviors.
- Followed MVC Structure to develop Application.
- Worked with Bootstrap for compiling CSS and JavaScript and for building the system with its convenience methods.
- Worked extensively on defect maintenance for front-end issues.
- Organized the internal site for managing environments and project details using HTML, CSS, JavaScript, and jQuery, converting scrolling pages into a tabbed template structure.
- Created and maintained the framework and layout of each portal with Cascading Style Sheets (CSS).
Environment: Java, J2EE, Spring 2.5, Spring Transactions, Spring JDBC, Spring MVC, Hibernate 3.5, XML, RESTful, WSDL, AJAX, jQuery, HTML, JavaScript, CSS, Log4J, JAXB, JUnit, WebSphere Application Server 6.0, Eclipse 3.5, Oracle 10g, JSP, Bootstrap.
