Hadoop/Spark Developer Resume
Charlotte, NC
SUMMARY:
- IT professional with over 8 years of extensive experience in all phases of the SDLC, including 3+ years of strong experience with the Apache Hadoop ecosystem and Apache Spark.
- 4 years of experience in the design, development, maintenance, and support of big data analytics using Hadoop ecosystem tools such as HDFS, Hive, Sqoop, and Pig.
- Experienced in processing big data on the Apache Hadoop framework using MapReduce programs.
- Excellent understanding and knowledge of NoSQL databases such as HBase and MongoDB.
- Good knowledge of the Hadoop ecosystem, HDFS, big data, and RDBMS.
- Experienced in installing, configuring, supporting, and monitoring Hadoop clusters using Apache and Cloudera distributions and AWS.
- Good knowledge of Hadoop, HBase, Hive, Pig Latin scripts, MapReduce, Sqoop, Flume, and HiveQL.
- Experience in analyzing data using Pig Latin, HiveQL and HBase.
- Experienced in capturing data from existing databases that provide SQL interfaces using Sqoop.
- Experienced in importing and exporting data between HDFS and relational database systems using Sqoop.
- Implemented proofs of concept on the Hadoop stack and various big data analytics tools, including migration from databases such as Teradata, Oracle, and MySQL to Hadoop.
- Worked on NoSQL databases including HBase, Cassandra, and MongoDB.
- Successfully loaded files into Hive and HDFS from MongoDB and HBase.
- Experience in configuring Hadoop Clusters and HDFS.
- Worked extensively with Java, J2EE, XML, XSL, EJB, JSP, JSF, JDBC, MVC, Jakarta Struts, JSTL, Spring 2.0, design patterns, and UML.
- Extensive experience in object-oriented programming using Java and J2EE (Servlets, JSP, JavaBeans, EJB, JDBC, RMI, XML, JMS, Web Services, AJAX).
- Excellent analytical and problem-solving skills and the ability to quickly learn new technologies.
- Good communication and interpersonal skills. A very good team player with the ability to work independently.
TECHNICAL SKILLS:
Hadoop/Big Data: HDFS, MapReduce, Hive, Pig, Sqoop, Apache Ignite, GitHub, Agile, Flume, Oozie, ZooKeeper, Kafka, Kerberos
NoSQL Databases: HBase, Cassandra, MongoDB, MarkLogic
Languages: C, Python, Java, J2EE, PL/SQL, Pig Latin, HiveQL, Unix shell scripting, R, Storm, Linux
Java/J2EE Technologies: Applets, Swing, JDBC, JNDI, JSON, JSTL, RMI, JMS, JavaScript, JSP, Servlets, EJB, JSF, jQuery
Frameworks: MVC, Struts, Spring, Hibernate
Operating Systems: Sun Solaris, HP-UX, Red Hat Linux, Ubuntu Linux, and Windows XP/Vista/7/8
Web Technologies: HTML, DHTML, XML, AJAX, WSDL, SOAP
Web/Application servers: Apache Tomcat, WebLogic, JBoss
Databases: Oracle 9i/10g/11g, DB2, SQL Server, MySQL, Teradata
Tools and IDEs: Eclipse, NetBeans, Toad, Maven, ANT, Hudson, Sonar, JDeveloper, Assent PMD, DbVisualizer
Network Protocols: TCP/IP, UDP, HTTP, DNS, DHCP
PROFESSIONAL EXPERIENCE:
Confidential, Charlotte, NC
Hadoop/Spark Developer
Responsibilities:
- Involved in the complete big data flow of the application, from ingesting upstream data into HDFS to processing and analyzing the data in HDFS.
- Developed Spark jobs to import data into HDFS from Teradata and created Hive tables over the imported data.
- Developed Sqoop jobs to import data in Avro file format from an Oracle database and created Hive tables on top of it.
- Created partitioned and bucketed Hive tables in Parquet file format with Snappy compression, then loaded data into the Parquet Hive tables from the Avro Hive tables (see the sketch after this list).
- Ran Hive scripts through Hive, Impala, and Hive on Spark, and some through Spark SQL.
- Involved in performance tuning of Hive from design, storage and query perspectives.
- Developed a Flume ETL job that consumed data from an HTTP source and wrote it to an HDFS sink.
- Collected JSON data from the HTTP source and developed Spark jobs that perform inserts and updates on Hive tables.
- Developed Spark scripts to import large files from Amazon S3 buckets.
- Responsible for writing Hive Queries for analyzing data in Hive warehouse using Hive Query Language (HQL).
- Supported data analysts in running Pig and Hive queries.
- Developed Spark core and Spark SQL scripts using Scala for faster data processing.
- Worked extensively with HiveQL and Pig Latin.
- Imported and exported data between MySQL/Oracle and Hive using Sqoop.
- Configured the HA cluster for both manual and automatic failover.
- Designed and built many applications to deal with vast amounts of data flowing through multiple Hadoop clusters, using Pig Latin and Java-based MapReduce.
- Specified cluster size, allocated resource pools, and configured the Hadoop distribution by writing specifications in JSON format.
- Experienced in writing Solr queries for various search documents.
- Responsible for defining the data flow within the Hadoop ecosystem and directing the team in implementing it.
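A minimal sketch of the Avro-to-Parquet Hive load described above, written against the Spark Java API (the production scripts themselves were in Scala); the database, table, and partition column names are illustrative assumptions, not the project's actual schema.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SaveMode;
import org.apache.spark.sql.SparkSession;

public class AvroToParquetLoad {
    public static void main(String[] args) {
        // Hive support lets the job read and write Hive metastore tables directly.
        SparkSession spark = SparkSession.builder()
                .appName("avro-to-parquet-load")
                .enableHiveSupport()
                .getOrCreate();

        // Source table populated by the Sqoop Avro imports (hypothetical names).
        Dataset<Row> staged = spark.table("staging_db.orders_avro");

        // Rewrite into a partitioned, Snappy-compressed Parquet Hive table.
        staged.write()
                .mode(SaveMode.Overwrite)
                .format("parquet")
                .option("compression", "snappy")
                .partitionBy("load_date")
                .saveAsTable("warehouse_db.orders_parquet");

        spark.stop();
    }
}
```

Snappy-compressed Parquet keeps the files splittable and cheap to decompress, which suits the mixed Hive, Impala, and Spark SQL query workload noted above.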
Environment: HDFS, YARN, MapReduce, Hive, Sqoop, Flume, Oozie, HBase, Kafka, Impala, Spark SQL, Spark Streaming, Eclipse, Oracle, Teradata, PL/SQL, UNIX shell scripting, Cloudera.
Confidential, Reston, VA
Hadoop Developer
Responsibilities:
- Processed Big Data using a Hadoop cluster consisting of 40 nodes.
- Designed and configured Flume agents to collect data from the network proxy servers and store it in HDFS.
- Loaded customer profile, customer spending, and credit data from legacy warehouses onto HDFS using Sqoop.
- Built data pipelines using Pig and Java MapReduce to store data on HDFS.
- Applied transformations and filtering to the traffic data using Pig.
- Used pattern-matching algorithms to recognize customers across different sources, built risk profiles for each customer using Hive, and stored the results in HBase.
- Performed unit testing using MRUnit (illustrated in the sketch after this list).
- Responsible for building scalable distributed data solutions using Hadoop
- Installed and configured Hive, Pig, Sqoop, Flume and Oozie on the Hadoop cluster
- Setup and benchmarked Hadoop/HBase clusters for internal use
- Developed simple to complex MapReduce jobs using Hive and Pig.
- Optimized MapReduce jobs to use HDFS efficiently through various compression mechanisms.
- Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from MySQL into HDFS using Sqoop.
- Analyzed the data by running Hive queries and Pig scripts to study employee behavior.
- Installed Oozie workflow engine to run multiple Hive and Pig jobs
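A minimal MRUnit-style unit test sketch for the kind of mapper used in these pipelines; the mapper, its CSV input layout, and the class names are hypothetical stand-ins rather than the project's actual code.

```java
import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mrunit.mapreduce.MapDriver;
import org.junit.Before;
import org.junit.Test;

public class SpendEventMapperTest {

    // Hypothetical mapper: emits (customerId, spendAmount) from a CSV line.
    public static class SpendEventMapper
            extends Mapper<LongWritable, Text, Text, IntWritable> {
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split(",");
            context.write(new Text(fields[0]), new IntWritable(Integer.parseInt(fields[1])));
        }
    }

    private MapDriver<LongWritable, Text, Text, IntWritable> mapDriver;

    @Before
    public void setUp() {
        mapDriver = MapDriver.newMapDriver(new SpendEventMapper());
    }

    @Test
    public void emitsCustomerIdAndSpend() throws IOException {
        mapDriver.withInput(new LongWritable(1L), new Text("cust-42,120"))
                 .withOutput(new Text("cust-42"), new IntWritable(120))
                 .runTest();
    }
}
```

MapDriver pushes a single key/value pair through the mapper in isolation, so the test runs without a Hadoop cluster.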
Environment: Hadoop, Hive, ZooKeeper, MapReduce, Sqoop, Pig 0.10 and 0.11, JDK 1.6, HDFS, Flume, Oozie, DB2, HBase, Mahout.
Confidential, Hartford, CT
Sr. Java Developer
Responsibilities:
- Involved in defining business rules according to client specifics and converting them into a high-level technical design.
- Designed the entire system according to OOP and UML principles using Rational tools.
- Elaborated use cases and interface definition specifications in collaboration with the business.
- Used Oracle as the backend database and JDBC technologies for integration.
- Extensively used TOAD for all database-related activities and integration testing.
- Used build and deployment scripts written in ANT and UNIX shell script.
- Developed User interface screens using Servlets, JSP, JavaScript, CSS, AJAX, HTML.
- Involved in unit testing of developed business units, using JUnit for the specifics.
- Worked with the development and QA teams to resolve issues in the SIT/UAT/production environments.
- Closely coordinated with architects, business analysts, and the business team for requirements analysis, development, and implementation.
- Used the Spring Framework caching mechanism to pre-load some of the master information (see the sketch after this list).
- Implementation of this project included scalable coding using Java, JDBC, and JMS with Spring.
- Developed controller classes, command objects, action classes, form beans, transfer objects, and singletons on the server side to handle requests and responses from the presentation layer.
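A minimal sketch of the master-data pre-loading idea referenced above; it shows a Spring-managed singleton warming an in-memory map via an init-method rather than the project's actual caching configuration, and the DAO and record names are illustrative assumptions.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical Spring-managed singleton that warms a master-data cache at startup.
public class MasterDataCache {

    // Minimal master record type (illustrative).
    public static class MasterRecord {
        private final String code;
        private final String description;

        public MasterRecord(String code, String description) {
            this.code = code;
            this.description = description;
        }

        public String getCode() { return code; }
        public String getDescription() { return description; }
    }

    // DAO abstraction; the real implementation would query Oracle via JDBC.
    public interface MasterDataDao {
        List<MasterRecord> findAll();
    }

    private final MasterDataDao dao;
    private final Map<String, MasterRecord> cache = new ConcurrentHashMap<String, MasterRecord>();

    public MasterDataCache(MasterDataDao dao) {
        this.dao = dao;
    }

    // Wired as the Spring bean's init-method so master data is loaded once at startup.
    public void preload() {
        for (MasterRecord record : dao.findAll()) {
            cache.put(record.getCode(), record);
        }
    }

    public MasterRecord lookup(String code) {
        return cache.get(code);
    }

    public static void main(String[] args) {
        // Stub DAO stands in for the Oracle-backed implementation.
        MasterDataDao stubDao = new MasterDataDao() {
            public List<MasterRecord> findAll() {
                return Arrays.asList(new MasterRecord("US", "United States"));
            }
        };
        MasterDataCache cache = new MasterDataCache(stubDao);
        cache.preload();
        System.out.println(cache.lookup("US").getDescription());
    }
}
```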
Environment: Core Java, J2EE 1.5/1.6, Struts, Ajax, Rational Rose, Rational RequisitePro, Hibernate 3.0, CVS, RAD 7.0 IDE, Oracle 10g, JDBC, log4j, WebSphere 6.0, Servlets, JSP, JUnit.
Confidential, Columbus, OH
J2EE Developer
Responsibilities:
- Implementation of the Business logic layer.
- Implementation of the Business logic layer for MongoDB Services.
- Development of advanced client-side web applications
- Applied JavaScript and HTTP knowledge in client-side development.
- Implementation of the presentation layer using CSS, Tiles 3, and jQuery.
- Responsible for coding DAO classes using the Spring model.
- Implemented business logic for the service layer.
- Responsible for Internationalization of Application
- Responsible for coding POJO classes and Hibernate reverse engineering (see the sketch after this list).
- Developed Hibernate configuration files for MySQL 5.1, Oracle 10g and 11g, and MongoDB.
- Front-end design with JavaScript, jQuery plugins, and Ajax.
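A minimal sketch of the kind of reverse-engineered POJO and Hibernate-backed DAO described above; the entity, table, and column names are assumptions for illustration, not the application's actual schema.

```java
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;
import javax.persistence.Table;

import org.hibernate.Session;
import org.hibernate.SessionFactory;

// Hypothetical reverse-engineered entity mapped to a MySQL/Oracle table.
@Entity
@Table(name = "customer")
public class Customer {

    @Id
    @GeneratedValue(strategy = GenerationType.AUTO)
    @Column(name = "customer_id")
    private Long id;

    @Column(name = "full_name")
    private String fullName;

    public Long getId() { return id; }
    public void setId(Long id) { this.id = id; }
    public String getFullName() { return fullName; }
    public void setFullName(String fullName) { this.fullName = fullName; }

    // DAO sketch: plain Hibernate session usage behind a Spring-managed bean.
    public static class CustomerDao {
        private final SessionFactory sessionFactory;

        public CustomerDao(SessionFactory sessionFactory) {
            this.sessionFactory = sessionFactory;
        }

        public Customer findById(Long id) {
            Session session = sessionFactory.getCurrentSession();
            return (Customer) session.get(Customer.class, id);
        }

        public void save(Customer customer) {
            sessionFactory.getCurrentSession().saveOrUpdate(customer);
        }
    }
}
```

With Hibernate, the same mapping can target MySQL or Oracle by switching the dialect in the configuration file.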
Environment: Java EE 7, Spring MVC 4.0.3, JSP 2.0, Spring Security 3.1, Ajax, jQuery, Hibernate 4.2.1, JBoss AS 7, Maven, Tiles, Facebook API, MongoDB, Facebook App Developer, DOJO Toolkit, JavaScript, XML.