Senior Hadoop Developer Resume
NYC, NY
SUMMARY
- 7+ years of total experience in designing and developing client/server and web-based applications using J2EE technologies, including 3+ years of experience in Big Data with good knowledge of HDFS and the Hadoop ecosystem.
- Excellent understanding of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode and the MapReduce programming paradigm.
- Hands-on experience in installing, configuring and using Hadoop ecosystem components such as MapReduce, HDFS, HBase, Oozie, Hive, Sqoop, Pig and Flume.
- Experience in building and supporting large-scale Hadoop environments, including design, configuration, installation, performance tuning and monitoring.
- Experience in importing and exporting terabytes of data between HDFS and relational database systems using Sqoop.
- Experience in architecting Hadoop clusters using the major Hadoop distributions (CDH3, CDH4 and CDH5).
- Experience in managing and troubleshooting Hadoop-related issues.
- Experience in installation, configuration, management and deployment of Big Data solutions and the underlying Hadoop cluster infrastructure.
- Knowledge of job/workflow scheduling and monitoring tools such as Oozie and ZooKeeper.
- Experience in analyzing data using HiveQL, Pig Latin and custom MapReduce programs in Java, and in extending Hive and Pig core functionality with custom user-defined functions (a minimal UDF sketch follows this summary).
- Worked with application teams to install operating system and Hadoop updates, patches and version upgrades as required.
- Hands-on experience in virtualization, including VMware Virtual Center.
- Experience in designing, developing and implementing connectivity products that allow efficient exchange of data between our core database engine and the Hadoop ecosystem.
- Designed and modeled projects using UML techniques: use cases, class diagrams, sequence diagrams, etc.
- Extensive experience in requirements gathering, analysis, design, reviews, coding and code reviews, and unit and integration testing.
- Experience in using application development frameworks such as Hibernate, Struts and Spring for developing integrated applications and lightweight business components.
- Experience in developing service components using JDBC.
- Experience in designing and developing web services (SOAP and RESTful).
- Experience in developing web interfaces using Servlets, JSP and custom tag libraries.
- Good knowledge and working experience in XML-related technologies.
- Experience in using Java/J2EE design patterns such as Singleton, Factory, MVC and Front Controller to reuse the most effective and efficient strategies.
- Expertise in using IDEs such as WebSphere Studio (WSAD), Eclipse, NetBeans, MyEclipse and WebLogic Workshop.
- Extensive experience in writing SQL queries for Oracle, Hadoop and DB2 databases using SQL*Plus.
- Hands-on experience working with Oracle (9i/10g/11g), DB2, NoSQL databases and MySQL, with knowledge of SQL Server.
- Extensive experience in using SQL and PL/SQL to write stored procedures, functions and triggers.
- Excellent technical, logical, debugging and problem-solving capabilities, with the ability to track the evolving environment and the likely activities of competitors and customers.
- Proven ability to work effectively in both independent and team situations with positive results.
- Inclined towards building a strong team and work environment, with the ability to adapt to the latest technologies and situations with ease.
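The summary above mentions extending Hive with custom user-defined functions. As an illustrative sketch only, assuming the classic `org.apache.hadoop.hive.ql.exec.UDF` API (the class name and behavior below are hypothetical, not code from any project listed here):

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hypothetical Hive UDF: trims and upper-cases a string column.
public final class NormalizeString extends UDF {
    // Hive invokes evaluate() once per row; returning null yields SQL NULL.
    public Text evaluate(Text input) {
        if (input == null) {
            return null;
        }
        return new Text(input.toString().trim().toUpperCase());
    }
}
```

After packaging the class into a jar, it would be registered in HiveQL with ADD JAR and CREATE TEMPORARY FUNCTION before being called like any built-in function.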
TECHNICAL SKILLS
Hadoop/Big Data: Hadoop 1.x/2.x (YARN), HDFS, MapReduce, Spark, Hive, ZooKeeper, Oozie, Tez, Impala, Mahout, Pig, Sqoop, Flume, Kafka, Storm, Ganglia, Nagios.
Development Tools: Eclipse, IBM DB2 Command Editor, QTOAD, SQL Developer, Microsoft Suite (Word, Excel, PowerPoint, Access), VMware
Programming/Scripting Languages: Java, SQL, Unix Shell Scripting, Python.
Databases: Oracle 9i/10g/11g, MySQL, SQL Server 2005/2008, PostgreSQL and DB2
NoSQL Databases: HBase, Cassandra, MongoDB
Visualization: Tableau, Plotly, Raw and MS Excel.
Modeling Languages: UML (use case, class, sequence, deployment and component diagrams).
Version Control Tools: Subversion (SVN), Concurrent Versions System (CVS) and IBM Rational ClearCase.
Methodologies: Agile/ Scrum, Waterfall
Operating Systems: Windows 98/2000/XP/Vista/7/8/10, Macintosh, UNIX, Linux and Solaris.
PROFESSIONAL EXPERIENCE
Confidential, NYC, NY
Senior Hadoop Developer
Responsibilities:
- Worked with the business analyst team for gathering requirements and client needs.
- Designed and deployed the Hadoop cluster and various Big Data analytic tools, including Pig, Hive, HBase, Oozie, ZooKeeper, Sqoop and Flume.
- Imported and exported data between RDBMS and HDFS using Sqoop.
- Explored Spark for improving the performance and optimization of existing algorithms in Hadoop using SparkContext, Spark SQL, DataFrames and pair RDDs.
- Worked on reading multiple data formats on HDFS using Scala.
- Designed and configured a Kafka cluster to accommodate a heavy throughput of one million messages per second, and used the Kafka 0.8.x producer APIs to publish messages (a minimal producer sketch follows this section).
- Developed a data pipeline using Kafka and Storm to store data in HDFS and performed real-time analytics on the incoming data.
- Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
- Wrote, optimized and tested Pig Latin scripts.
- Wrote Hive queries to process the data for visualization.
- Worked on different file formats (ORC, text) and different compression codecs (gzip, Snappy, LZO).
- Worked with multiple input formats such as TextInputFormat, KeyValueTextInputFormat and SequenceFileInputFormat.
- Collected metrics for Hadoop clusters using Ambari.
- Used ZooKeeper to provide coordination services to the cluster.
- Defined job flows in Oozie to schedule and manage Apache Hadoop jobs as directed acyclic graphs (DAGs) of actions with control flows.
- Managed and scheduled Spark jobs on a Hadoop cluster using Oozie.
- Collaborated with the infrastructure, network, database, application and BI teams to ensure data quality and availability.
Environment: Hadoop, HDFS, MapReduce, Pig, Hive, Sqoop, Kafka, Solr, HBase, Oozie, Flume, HDP, Java, SQL scripting, Linux shell scripting.
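A minimal sketch of Kafka message production as described above, using the `org.apache.kafka.clients.producer` API introduced with the 0.8.2 client; the broker address, topic name and tuning values are illustrative assumptions, not recorded project settings:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class EventProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092"); // placeholder broker
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("acks", "1");           // leader-only acks favor throughput
        props.put("batch.size", "65536"); // larger batches help at high message rates
        props.put("linger.ms", "5");      // brief wait so batches can fill

        Producer<String, String> producer = new KafkaProducer<>(props);
        try {
            for (int i = 0; i < 1000; i++) {
                producer.send(new ProducerRecord<>("events",
                        Integer.toString(i), "message-" + i));
            }
        } finally {
            producer.close(); // flushes any buffered records
        }
    }
}
```

At throughput targets on the order of a million messages per second, `acks`, `batch.size` and `linger.ms` are the usual knobs traded off between latency and volume.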
Confidential, Franklin, TN
Hadoop Developer
Responsibilities:
- Involved in the requirements gathering, analysis, design, development, testing and deployment phases of the project life cycle.
- Designed and developed complex Hive queries to build a Workers' Compensation fraud claims detection predictive model for predicting fraudulent medical providers.
- Proposed solutions to process the data ingested into the Hadoop data lake in varying file formats such as text, JSON, XML, ORC and sequence files.
- Developed Sqoop scripts to import data from various RDBMS sources into the Hadoop lake in an optimized way by selecting the right number of mappers, choosing the split-by column, applying compression, etc.
- Designed and developed Pig scripts and Pig UDFs to process and transform the ingested data by applying level-1 curation and transformation rules such as trimming string fields, formatting signed integers, converting EBCDIC to ASCII, formatting date/timestamp fields, masking protected and confidential data, flattening flat files and removing non-printable characters (a minimal UDF sketch follows this section).
- Developed a MapReduce program to derive CDC (change data capture) records from incremental complete-extract data from SQL Server databases.
- Developed a Java program to compare the base schema of a data file with the incremental schema and dynamically generate Hive DDLs based on multiple business use cases.
- Developed a JSON-lookup Hive architecture to process and store small lookup tables/files.
- Developed curations using Hive queries for pricing calculations.
- Developed a MapReduce program to process varying-schema data files by converting delimited text files and XML data files to JSON and Avro data files.
- Applied partitioning, bucketing, map-join, vectorization and CBO (cost-based optimization) techniques in Hive to improve the performance of queries involving joins, aggregations, filters, etc.
- Developed Samba client scripts to transfer files from Windows network drives to the Hadoop edge node in a secured way.
- Developed shell commands such as curl with Hadoop streaming to enable downloading web data files into the Hadoop lake.
- Switched the execution engine for Hive and Pig processing to Tez to improve performance.
- Drove POC initiatives to assess the feasibility of different traditional and Big Data tools for reporting purposes.
- Developed Work Flow Management Tier 3 scripts to schedule Big Data jobs using the Autosys scheduler.
- Drove an initiative to automate recurring manual monitoring and operations activities using Unix scripting.
- Worked on multiple Spark SQL vs. Hive-on-Tez POC business use cases involving iterative processing of data.
Environment: Hortonworks 2.3, Hadoop 2.0, MapReduce, Hive 1.2, Pig 0.3, HBase 1.1, Tez, Spark 1.4, Unix shell scripting, Sqoop 1.4, Samba 1.4, Core Java, Scala, Autosys Scheduler (WFM)
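To illustrate the Pig UDF pattern behind the level-1 curation rules above, here is a minimal, hypothetical `EvalFunc` implementing the trim-string-fields rule (the class name is an assumption; rules such as EBCDIC conversion or masking would follow the same shape):

```java
import java.io.IOException;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

// Hypothetical level-1 curation UDF: trims whitespace from a string field.
public class TrimField extends EvalFunc<String> {
    @Override
    public String exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0 || input.get(0) == null) {
            return null; // pass nulls through unchanged
        }
        return input.get(0).toString().trim();
    }
}
```

In the Pig script, the jar would be registered with REGISTER and the function aliased with DEFINE before being applied in a FOREACH ... GENERATE.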
Confidential, Houston, TX
Hadoop Developer
Responsibilities:
- Involved in installing and configuring the Hadoop ecosystem and Cloudera Manager using the CDH3 distribution.
- Managed and reviewed Hadoop log files.
- Ran Hadoop streaming jobs to process terabytes of XML-format data.
- Loaded and transformed large sets of structured, semi-structured and unstructured data.
- Supported MapReduce programs running on the cluster.
- Imported and exported data between RDBMS and HDFS using Sqoop.
- Installed and configured Hive and wrote Hive UDFs.
- Created Hive tables, loaded the data and wrote Hive queries that run internally as MapReduce jobs.
- Wrote Hive queries to process data to meet the business requirements.
- Analyzed the data using Pig and wrote Pig scripts for grouping, joining and sorting the data.
- Gained hands-on experience with NoSQL databases.
- Worked with MongoDB using its CRUD (create, read, update and delete), indexing, replication and sharding features (a minimal CRUD sketch follows this section).
- Participated in the requirements gathering and analysis phase of the project, documenting the business requirements by conducting workshops/meetings with various business users.
- Designed and Developed Dashboards using Tableau.
- Actively participated in weekly meetings with the technical teams to review the code.
Environment: Hadoop, MapReduce, MongoDB, Hive, Pig, Sqoop, Core Java, Cloudera, HDFS, Oracle SQL, Eclipse, Tableau, Windows XP, UNIX.
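A minimal sketch of the MongoDB CRUD and indexing operations mentioned above, written against the MongoDB Java driver's 3.x API; the database, collection and field names are placeholders, not taken from the project:

```java
import com.mongodb.MongoClient;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.MongoDatabase;
import com.mongodb.client.model.Indexes;
import org.bson.Document;
import static com.mongodb.client.model.Filters.eq;

public class MongoCrudDemo {
    public static void main(String[] args) {
        MongoClient client = new MongoClient("localhost", 27017);
        try {
            MongoDatabase db = client.getDatabase("demo");
            MongoCollection<Document> users = db.getCollection("users");

            // Create
            users.insertOne(new Document("name", "alice").append("visits", 1));
            // Read
            Document found = users.find(eq("name", "alice")).first();
            System.out.println(found);
            // Update
            users.updateOne(eq("name", "alice"),
                    new Document("$inc", new Document("visits", 1)));
            // Delete
            users.deleteOne(eq("name", "alice"));
            // Index: supports the read path above; replication and sharding
            // are cluster-side configuration rather than driver calls.
            users.createIndex(Indexes.ascending("name"));
        } finally {
            client.close();
        }
    }
}
```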
Confidential, New York
Java Developer
Responsibilities:
- Participated in requirement discussions with all the stakeholders.
- Responsible for distributing and tracking work, communicating issues to developers and reporting status to the manager on a daily basis.
- Involved in preparing high-level and low-level design documents.
- Developed components according to the specified design.
- Published SOAP-based web services using JAX-WS, JAXB, XSD, XMLBeans and XML (a minimal JAX-WS sketch follows this section).
- Developed the front end based on the Struts MVC architecture.
- Used SoapUI to test the web services.
- Used the Struts and Spring frameworks for the newly designed UI infrastructure services to interact with the legacy application systems.
- Developed Action classes, ActionForms, validate methods and the struts-config.xml file using Struts, and used various Struts tag libraries.
- Used Enterprise JavaBeans (EJB session beans) to develop business-layer APIs.
- Used Hibernate as the ORM layer.
- Used HQL and the Criteria API extensively.
- Developed complex SQL queries, stored procedures, functions and triggers, and created indexes wherever applicable in the Oracle database.
- Coordinated with the onshore development team.
- Involved in debugging and testing the application for change requests.
- Prepared weekly and monthly status reports.
- Coordinated with the entire offshore team on filling in weekly time sheets in Clarity and Fieldglass.
- Gave code walkthroughs on the deliverables to newly joined team members.
- Planned forecasts for individual team members' task sheets.
- Prepared test case documents for enhancements.
- Used JUnit for unit testing and prepared a JUnit test case document.
- Participated in code reviews and was involved in unit, integration, functional and peer testing.
Environment: JDK 1.5/1.4, J2EE, Servlets, Struts, Spring, Hibernate 3/3.5/4.0, HQL, Maven 3.0, JAX-WS, JAXB, XML, XSD, SoapUI, jQuery, CSS, JUnit, Oracle 9i/10g, SQL, PL/SQL, Quality Center, SSH shell, SSH client, PuTTY, VSS, WebSphere (WAS), Visual Studio, Microsoft Visio, Microsoft Project, UML, SharePoint, Windows XP and UNIX.
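A minimal sketch of publishing a SOAP-based web service with JAX-WS as referenced above; the service class, operation and endpoint URL are illustrative assumptions:

```java
import javax.jws.WebMethod;
import javax.jws.WebService;
import javax.xml.ws.Endpoint;

// Hypothetical service endpoint implementation; JAX-WS derives the WSDL
// contract from the annotations.
@WebService
public class GreetingService {
    @WebMethod
    public String greet(String name) {
        return "Hello, " + name;
    }

    public static void main(String[] args) {
        // Publishes the service; the generated WSDL is served at ?wsdl.
        Endpoint.publish("http://localhost:8080/greeting", new GreetingService());
    }
}
```

The WSDL at the endpoint's ?wsdl URL is what SoapUI would load to drive the web service tests mentioned above.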
Confidential
Java Developer
Responsibilities:
- Involved in requirements analysis, design, development and testing.
- Involved in the development of platform-related applications on mediation servers.
- Involved in configuration management of the server using Core Java and Oracle DB.
- Involved in the development of an upgrade framework for upgrading the servers.
- Integrated upgrade frameworks such as Rolling Upgrade and Quick Reboot Upgrade for the NSN LTE mediation server/CSL server using Unix shell scripting and C++.
- Involved in bug fixing of the configuration management and upgrade frameworks.
- Involved in testing the applications and the upgrade functionality.
- Developed the LSNAP tool, a log snapshot utility for collecting system logs and status information on MED/CSL servers.
- Interacted with the client directly during the integration of upgrade framework.
Environment: Core Java, Oracle, UNIX Shell Scripting, C++