Sr. Big Data/Hadoop Developer Resume
Atlanta, GA
SUMMARY
- 8+ years of experience in Information Technology, including Big Data, the Hadoop ecosystem, and Core Java/J2EE, with strengths in design, software processes, requirement gathering, analysis, and development of software applications.
- Excellent hands-on experience developing Hadoop architectures on Windows and Linux platforms.
- Experience building big data solutions with the Lambda Architecture on the Cloudera distribution of Hadoop, using Twitter Storm, Trident, MapReduce, Cascading, Hive, Pig, and Sqoop.
- Expertise in the major components of the Hadoop ecosystem: MapReduce, Hive, Pig, Sqoop, Impala, Flume, Oozie, HBase, Spark, Kafka, and YARN.
- Excellent working experience in Big Data integration and analytics based on Hadoop, SOLR, Spark, Kafka, Storm, and webMethods technologies.
- Hands-on experience with NoSQL databases, including HBase, MongoDB, and Cassandra, and their integration with Hadoop clusters.
- Strong knowledge of implementing Big Data workloads on Amazon Elastic MapReduce (Amazon EMR), which runs the Hadoop framework on dynamically scalable Amazon EC2 instances.
- Hands-on experience writing ad-hoc queries to move data from HDFS into Hive and analyze it with HiveQL.
- Experienced in developing and using Apache SOLR with data computation and transformation for consumption by downstream online applications.
- Excellent knowledge of databases such as Oracle 8i/9i/10g/11g/12c, Microsoft SQL Server, DB2, and Netezza.
- Good understanding and experience with Software Development methodologies like Agile and Waterfall.
- Experienced in importing and exporting data with Sqoop between HDFS (Hive and HBase) and relational database systems (Oracle and Teradata).
- Expertise with IDEs such as WebSphere Studio (WSAD), Eclipse, NetBeans, MyEclipse, and WebLogic Workshop.
- Experienced in designing and developing web services (SOAP and RESTful).
- Experienced in developing web interfaces using Servlets, JSP, and custom tag libraries.
- Thorough knowledge of the software development life cycle (SDLC), database design, RDBMSs, and data warehousing.
- Experience writing complex SQL queries involving multiple tables with inner and outer joins.
- Expertise in various Java/J2EE technologies such as JSP, Servlets, Hibernate, Struts, and Spring.
- Good knowledge of web-based UI development using jQuery UI, jQuery, ExtJS, CSS3, HTML, HTML5, XHTML, and JavaScript.
TECHNICAL SKILLS
Big Data: Hadoop, Storm, Trident, HBase, Hive, Flume, Cassandra, Kafka, Sqoop, Oozie, Pig, Spark, MapReduce, ZooKeeper, YARN
Operating Systems: UNIX, Mac OS, Linux, Windows 2000/NT/XP/Vista, Android
Programming Languages: Java (JDK 5/6/7), MATLAB, R, HTML, SQL, PL/SQL
Frameworks: Hibernate 2.x/3.x, Spring 2.x/3.x, Struts 1.x/2.x, and JPA
Web Services: WSDL, SOAP, Apache CXF/XFire, Apache Axis, REST, Jersey
Databases: Oracle 8i/9i/10g, Microsoft SQL Server, DB2 & MySQL 4.x/5.x
Middleware Technologies: WebSphere MQ, WebSphere Message Broker, XML Gateway, JMS
Web Technologies: J2EE, SOAP and REST web services, JSP, Servlets, EJB, JavaScript, Struts, Spring, WebWork, HTML, XML, JMS, JSF, and Ajax
Testing Frameworks: Mockito, PowerMock, EasyMock
Web/Application Servers: IBM WebSphere Application Server, JBoss, Apache Tomcat
Others: Borland StarTeam, ClearCase, JUnit, Ant, Maven, Android platform, Microsoft Office, SQL Developer, DB2 Control Center, Microsoft Visio, Hudson, Subversion, Git, Nexus, Artifactory
Development Strategies: Agile, Lean Agile, Pair Programming, Waterfall, and Test-Driven Development
PROFESSIONAL EXPERIENCE
Confidential, Atlanta, GA
Sr. Big Data/Hadoop Developer
Responsibilities:
- Gathered the business requirements from the Business Partners and Subject Matter Experts.
- Supported MapReduce programs running on the cluster and wrote MapReduce jobs using the Java API.
- Involved in HDFS maintenance and in loading structured and unstructured data.
- Imported data from mainframe datasets to HDFS using Sqoop.
- Handled importing of data from various sources (Oracle, DB2, HBase, Cassandra, and MongoDB) into Hadoop and performed transformations using Hive and MapReduce.
- Wrote Hive queries for data analysis to meet business requirements. Implemented custom Kafka encoders for a custom input format to load data into Kafka partitions, and streamed the data in real time using Spark with Kafka for faster processing.
- Configured Spark Streaming to receive real-time data from Kafka and store the stream to HDFS using Scala (a minimal sketch of this pipeline follows this list).
- Wrote Python scripts for internal testing that push data read from a file into a Kafka queue, which is in turn consumed by the Storm application.
- Installed and configured Flume, Hive, Pig, Sqoop, and Oozie on the Hadoop cluster.
- Participated in building a CDH4 test cluster for implementing Kerberos authentication. Upgraded the Hadoop cluster from CDH4 to CDH5 and set up a high-availability cluster to integrate Hive with existing applications.
- Analyzed the data by performing Hive queries and running Pig scripts to understand user behavior.
- Worked on Kafka and Kafka mirroring to ensure that data is replicated without loss.
- Loaded and transformed large sets of structured, semi-structured, and unstructured data using Hadoop/Big Data concepts.
- Responsible for migrating tables from a traditional RDBMS into Hive tables using Sqoop, then generating the required visualizations and dashboards using Tableau.
- Generated final reporting data in Tableau for testing by connecting to the corresponding Hive tables through the Hive ODBC connector.
- Implemented Storm topologies to perform cleansing operations before moving data into Cassandra.
- Built a prototype with HDP Kafka and Storm for a clickstream application.
- Updated maps, sessions, and workflows as part of ETL changes, modified existing ETL code, and documented the changes.
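The Kafka-to-Spark-Streaming pipeline described above is a common pattern; the following is a minimal Scala sketch using the receiver-based spark-streaming-kafka API of the Spark 1.x/CDH era. The ZooKeeper quorum, consumer group, topic name, and HDFS path are hypothetical placeholders, not values from the original project.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

// Minimal sketch: consume a Kafka topic and persist each micro-batch to HDFS.
object KafkaToHdfs {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("KafkaToHdfs")
    val ssc  = new StreamingContext(conf, Seconds(10)) // 10-second micro-batches

    // Receiver-based Kafka stream; yields (key, message) pairs.
    val stream = KafkaUtils.createStream(
      ssc,
      "zk-host:2181",          // ZooKeeper quorum (placeholder)
      "hdfs-sink-group",       // consumer group id (placeholder)
      Map("clickstream" -> 2)) // topic -> receiver threads (placeholder)

    // Keep only the message payload and append each batch to HDFS as text.
    stream.map(_._2).saveAsTextFiles("hdfs:///data/streams/clickstream")

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Each micro-batch lands as a new set of part files under the HDFS prefix, which keeps the stream replayable by downstream Hive or MapReduce jobs.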
Environment: Hadoop, Java, MapReduce, HDFS, HBase, Hive, Pig, Linux, XML, Eclipse, Kafka, Storm, Spark, Cloudera CDH4/5 distribution, DB2, Scala, SQL Server, Oracle 12c, MySQL, Informatica PowerCenter 8.x, Informatica PowerConnect.
Confidential, Wayne, PA
Sr. Big Data/Hadoop Developer
Responsibilities:
- Worked on analyzing the Hadoop cluster using different big data analytic tools, including Flume, Pig, Hive, HBase, Oozie, ZooKeeper, Sqoop, Spark, and Kafka.
- Developed Spark code using Scala and Spark SQL/Streaming for faster testing and processing of data.
- Used the Spark API over Cloudera Hadoop YARN to perform analytics on data in Hive.
- As a Big Data developer, implemented solutions for ingesting data from various sources and processing data at rest using Big Data technologies such as Hadoop, the MapReduce framework, MongoDB, Hive, Oozie, Flume, Sqoop, and Talend.
- Explored Spark for improving the performance and optimization of existing algorithms in Hadoop, using SparkContext, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
- Developed analytical components using Scala, Spark, Apache Mesos, and Spark Streaming.
- Installed Hadoop, MapReduce, and HDFS, and developed multiple MapReduce jobs in Pig and Hive for data cleaning and pre-processing.
- Developed Kafka producers and consumers, and Spark and Hadoop MapReduce jobs.
- Imported data from different sources such as HDFS and HBase into Spark RDDs.
- Configured, deployed, and maintained multi-node development and test Kafka clusters.
- Involved in converting MapReduce programs into Spark transformations using Spark RDDs in Scala (see the word-count sketch after this list).
- Developed Spark scripts using Scala shell commands as required.
- Performed transformations, cleaning, and filtering on imported data using Hive and MapReduce, and loaded the final data into HDFS.
- Loaded data into Spark RDDs and performed in-memory computation to generate the output response.
- Developed and wrote Apache Pig scripts and Hive scripts to process HDFS data.
- Used Hive to find correlations between customers' browser logs across different sites and analyzed them to build a risk profile for each site.
- Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
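As a concrete illustration of the MapReduce-to-Spark conversion mentioned above, the sketch below re-expresses the classic word-count job as Spark RDD transformations in Scala; the HDFS input and output paths are hypothetical.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// A MapReduce mapper/reducer pair collapses into a short chain of
// RDD transformations; the shuffle is handled by reduceByKey.
object WordCountRdd {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("WordCountRdd"))

    sc.textFile("hdfs:///data/raw/logs")            // replaces InputFormat/RecordReader
      .flatMap(_.split("\\s+"))                     // mapper: one token per word
      .map(word => (word, 1))                       // mapper: emit (word, 1)
      .reduceByKey(_ + _)                           // reducer: sum counts per key
      .saveAsTextFile("hdfs:///data/out/wordcount") // replaces OutputFormat

    sc.stop()
  }
}
```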
Environment: Hadoop, HDFS, Spark, MapReduce, Pig, Hive, Sqoop, Kafka, HBase, Oozie, Flume, Scala, Python, Java, SQL scripting, Linux shell scripting, MongoDB, Cloudera, Cloudera Manager, EC2, EMR, S3.
Confidential, Pittsburgh, PA
Sr. Big Data/Hadoop Developer
Responsibilities:
- Worked on analyzing the Hadoop cluster using different big data analytic tools, including Kafka, Pig, Hive, and MapReduce.
- Configured Spark Streaming to receive real-time data from Kafka and store the stream data to HDFS using Scala.
- Worked on implementing Spark using Scala and Spark SQL for faster analysis and processing of data.
- Handled importing and exporting of data into HDFS and Hive using Sqoop and Kafka.
- Involved in creating Hive tables, loading the data, and writing Hive queries, which run internally as MapReduce jobs.
- Involved in developing Pig scripts for change data capture and delta record processing between newly arrived data and existing data in HDFS.
- Involved in scheduling the Oozie workflow engine to run multiple Hive and Pig jobs.
- Wrote Hive jobs to parse logs and structure them in tabular format to facilitate effective querying of the log data.
- Worked on designing and developing ETL workflows in Java for processing data in HDFS/HBase using Oozie.
- Worked on importing unstructured data into HDFS using Flume.
- Wrote complex Hive queries and UDFs.
- Involved in developing shell scripts to simplify execution of the other scripts (Pig, Hive, and MapReduce) and to move data files into and out of HDFS.
- Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs, Python, and Scala (a sketch follows this list).
- Worked with NoSQL databases such as HBase, creating tables to load large sets of semi-structured data.
- Built Java APIs for retrieval and analysis on NoSQL databases such as HBase.
- Created ETL jobs to generate and distribute reports from a MySQL database using Pentaho Data Integration.
- Worked on loading data from the UNIX file system to HDFS.
- Analyzed large datasets to determine the optimal way to aggregate and report on them.
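The Hive-to-Spark conversion mentioned above usually takes one of two shapes, sketched below in Scala against the Spark 1.x-era HiveContext: run the original HiveQL unchanged on Spark's engine, or re-express the aggregation as RDD transformations. The table, column, and path names are hypothetical.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object HiveToSpark {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("HiveToSpark"))
    val hc = new HiveContext(sc)

    // Shape 1: the original HiveQL, executed by Spark instead of MapReduce.
    hc.sql("SELECT site, COUNT(*) AS hits FROM web_logs GROUP BY site")
      .rdd
      .saveAsTextFile("hdfs:///data/out/hits_by_site_sql")

    // Shape 2: the same GROUP BY as plain RDD transformations over raw,
    // tab-separated log lines with the site in the first column.
    sc.textFile("hdfs:///data/raw/web_logs")
      .map(line => (line.split("\t")(0), 1L))
      .reduceByKey(_ + _)
      .saveAsTextFile("hdfs:///data/out/hits_by_site_rdd")

    sc.stop()
  }
}
```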
Environment: Hadoop, HDFS, MapReduce, Hive, Sqoop, Pig, HBase, Apache Spark, Oozie scheduler, Java, UNIX shell scripts, Kafka, Git, Maven, PL/SQL, MongoDB, Cassandra, Python, Scala, Pentaho.
Confidential, OH
Sr. Java/Hadoop Developer
Responsibilities:
- Developed applications using Hadoop Big Data technologies: Pig, Hive, MapReduce, HBase, and Oozie.
- Developed Scala programs with Spark for data in the Hadoop ecosystem.
- Managed and reviewed Hadoop log files as part of administration for troubleshooting purposes, communicating and escalating issues appropriately.
- Developed MapReduce jobs using Apache Commons components.
- Collected and aggregated large amounts of log data using Apache Flume, staging the data in HDFS for further analysis.
- Created HBase tables to load large sets of structured, semi-structured, and unstructured data coming from UNIX systems, NoSQL stores, and a variety of portfolios.
- Solved performance issues in Hive and Pig scripts by understanding joins, grouping, and aggregation and how they translate to MapReduce jobs.
- Developed UDFs in Java as needed for use in Pig and Hive queries (a sketch of the Hive-side shape follows this list).
- Coordinated with various stakeholders such as the End Client, DBA Teams, Testing Team and Business Analysts.
- Developed Java web applications using JSP, Servlets, Struts, Hibernate, Spring, REST web services, and SOAP.
- Involved in gathering requirements and developing a project plan.
- Involved in understanding requirements, functional specifications, design documentation, and testing strategies.
- Involved in UI design, coding, and database handling.
- Involved in unit testing and bug fixing.
- Worked over the entire Software Development Life Cycle (SDLC) as a part of a team as well as independently.
- Wrote SQL queries against the database and provided data extracts to users on request.
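The Pig and Hive UDFs above were written in Java; for consistency with the other sketches, the following shows the Hive-side shape of such a UDF in Scala, using the old reflection-based UDF API. The function name and normalization logic are hypothetical.

```scala
import org.apache.hadoop.hive.ql.exec.UDF
import org.apache.hadoop.io.Text

// Hive resolves `evaluate` by reflection; Text-in/Text-out keeps it simple.
class NormalizeStatus extends UDF {
  def evaluate(input: Text): Text = {
    if (input == null) return null
    new Text(input.toString.trim.toLowerCase)
  }
}

// Registered and called from HiveQL roughly as:
//   ADD JAR normalize-status.jar;
//   CREATE TEMPORARY FUNCTION normalize_status AS 'NormalizeStatus';
//   SELECT normalize_status(status) FROM claims;
```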
Environment: Java 1.5, JSP, Servlets, Spring, Hibernate 3.0, Struts framework, Hadoop, MapReduce, HDFS, HBase, Hive, Pig, Sqoop, Flume, Kafka, Spark, Scala, ETL, Cloudera CDH Apache Hadoop, HTML, XML, Log4j, Eclipse, Unix, Windows XP.
Confidential
Java/J2EE Developer
Responsibilities:
- Developed web pages using Struts, JSP, Servlets, HTML and JavaScript.
- Designed and implemented the strategic modules like Underwriting, Requirements, Create Case, User Management, Team Management and Material Data Changes.
- Involved in the installation and configuration of Tomcat, SpringSource Tool Suite, and Eclipse, and in unit testing.
- Involved in migrating the existing distributed JSP framework to the Struts framework; designed and researched the Struts MVC framework.
- Developed an Ajax framework on the service layer for the module as a benchmark.
- Implemented service and DAO layers between Struts and Hibernate.
- Designed graphical user interface (GUI) applications using HTML, JSP, JavaScript (jQuery), CSS, and Ajax.
- Applied the MVC pattern of the Ajax framework, which involves creating controllers for the implementing classes.
- Designed and developed the UI using Struts view components, JSP, HTML, CSS, and JavaScript.
- Implemented business processes, database retrieval, information access, and the user interface using Java, Struts, and the Planet Interact framework.
- Implemented the application using established design patterns and object-oriented processes with a view to future requirements of the insurance domain.
- Used Log4j for application logging.
- Used JAXB to convert Java objects into XML files and XML content into Java objects.
- Used JIRA for bug/task tracking and time tracking.
- Used Agile methodology for development of the application.
Environment: Java/J2EE, JSP, JavaScript, Ajax, Swing, Spring 3.2, Eclipse 4.2, Hibernate 4.1, XML, Tomcat, Oracle 10g, JUnit, JMS, Log4j, Maven, Agile, Git, JDBC, web services, SOAP, JAX-WS, Unix, MongoDB, AngularJS, and SoapUI.