Senior Big Data/Hadoop Developer Resume
Madison, WI
SUMMARY:
- Over 8 years of IT experience in software development and support, including developing strategic methods for deploying Big Data technologies to efficiently solve large-scale data processing requirements.
- Expertise in Hadoop ecosystem components (HDFS, MapReduce, YARN, HBase, Pig, Sqoop, Spark, Spark SQL, Spark Streaming, and Hive) for scalable, distributed, high-performance computing.
- Experience in using Hive Query Language (HiveQL) for data analytics.
- Experienced in installing, configuring, and maintaining Hadoop clusters.
- Strong knowledge of creating and monitoring Hadoop clusters on Amazon EC2 and VMs using Hortonworks Data Platform 2.1 & 2.2 and Cloudera CDH3/CDH4 with Cloudera Manager on Linux, Ubuntu, etc.
- Capable of processing large sets of structured, semi-structured and unstructured data and supporting systems application architecture.
- Good knowledge of single-node and multi-node cluster configurations.
- Strong knowledge of NoSQL databases such as HBase, Cassandra, MongoDB, and MarkLogic, and their integration with Hadoop clusters.
- Expertise in the Scala programming language and Spark Core.
- Worked with AWS-based data ingestion and transformations.
- Worked with Cloudbreak and blueprints to configure the AWS platform.
- Worked with data warehouse tools such as Informatica and Talend.
- Experienced with job workflow scheduling and coordination tools such as Oozie and ZooKeeper.
- Good knowledge of Amazon EMR, Amazon RDS, S3 buckets, DynamoDB, and Redshift.
- Analyze data, interpret results, and convey findings in a concise and professional manner.
- Partner with the Data Infrastructure team and business owners to implement new data sources and ensure consistent definitions are used in reporting and analytics.
- Promote a full-cycle approach including request analysis, dataset creation/extraction, report creation and implementation, and delivery of final analysis to the requestor.
- Working experience with Kafka and Storm.
- Worked with Docker to establish connectivity between Spark and the Neo4j database.
- Knowledge of the Java Virtual Machine (JVM) and multithreaded processing.
- Hands on experience working with ANSI SQL.
- Strong programming skills in the design and implementation of applications using Core Java, J2EE, JDBC, JSP, HTML, Spring Framework, Spring Batch, Spring AOP, Struts, JavaScript, and Servlets.
- Experience writing build scripts using Maven and working with continuous integration systems like Jenkins.
- Java developer with extensive experience across various Java libraries, APIs, and frameworks.
- Hands-on development experience with RDBMSs, including writing complex SQL queries, stored procedures, and triggers.
- Very good understanding of SQL, ETL, and data warehousing technologies.
- Knowledge of MS SQL Server 2012/2008/2005, Oracle 11g/10g/9i, and E-Business Suite.
- Expert in T-SQL, creating and using stored procedures, views, and user-defined functions, and implementing Business Intelligence solutions using SQL Server 2000/2005/2008.
- Developed web services modules for integration using SOAP and REST.
- NoSQL database experience with HBase, Cassandra, and DynamoDB.
- Comfortable in Unix/Linux and Windows environments, working with operating systems such as CentOS 5/6 and Ubuntu 13/14.
- Sound knowledge of designing data warehousing applications using tools such as Teradata, Oracle, and SQL Server.
- Experience working with Solr for text search.
- Experience using the Talend ETL tool.
- Experience working with job schedulers such as AutoSys and Maestro.
- Strong in databases such as Sybase, DB2, Oracle, and MS SQL Server, and in handling clickstream data.
- Strong understanding of Agile Scrum and Waterfall SDLC methodologies.
- Strong working experience with Snowflake.
- Hands-on experience with automation and monitoring tools such as Puppet, Jenkins, Chef, Ganglia, and Nagios.
- Effective communication, collaboration & team building skills with proficiency at grasping new Technical concepts quickly and utilizing them in a productive manner.
- Adept in analyzing information system needs, evaluating end-user requirements, custom designing solutions and troubleshooting information systems.
- Strong analytical and problem-solving skills.
TECHNICAL SKILLS:
Hadoop/Big Data Technologies: HDFS, MapReduce, Sqoop, Flume, Pig, Hive, Oozie, Impala, Spark, ZooKeeper, Cloudera Manager, Splunk
NoSQL Databases: HBase, Cassandra
Monitoring and Reporting: Tableau, custom shell scripts
Hadoop Distributions: Hortonworks, Cloudera, MapR
Build Tools: Maven, SQL Developer
Programming & Scripting: Java, C, SQL, shell scripting, Python, Scala
Java Technologies: Servlets, JavaBeans, JDBC, Spring, Hibernate, SOAP/REST services
Databases: Oracle, MySQL, MS SQL Server, Teradata
Web Dev. Technologies: HTML, XML, JSON, CSS, jQuery, JavaScript, AngularJS
Version Control: SVN, CVS, GIT
Operating Systems: Linux, Unix, Mac OS X, CentOS, Windows 10/8/7, Windows Server 2008/2003
PROFESSIONAL EXPERIENCE:
Confidential, Madison, WI
Senior Bigdata/ Hadoop Developer
Responsibilities:
- Developed efficient MapReduce programs for filtering unstructured data and multiple MapReduce jobs to perform data cleaning and preprocessing on Hortonworks.
- Implemented a data interface to retrieve customer information via REST APIs, preprocessed the data using MapReduce 2.0, and stored it in HDFS (Hortonworks).
- Extracted data from MySQL, Oracle, and Teradata through Sqoop 1.4.6, placed it in HDFS (Cloudera distribution), and processed it.
- Worked with various HDFS file formats such as Avro 1.7.6, SequenceFile, and JSON, and compression formats such as Snappy and bzip2.
- Wrote a Spark Streaming application to read streaming Twitter data and analyze Twitter records in real time using Kafka and Flume, measuring the performance of Apache Spark Streaming (see the sketch after this list).
- Designed row keys and schemas for the NoSQL database HBase, with working knowledge of Cassandra.
- Used Hive to validate data ingested with Sqoop and Flume, and pushed the cleansed data set into HBase.
- Good understanding of Cassandra Data Modeling based on applications.
- Wrote ETL jobs in Java and Talend to read from web APIs via REST/HTTP calls and load the results into HDFS.
- Developed Pig 0.15.0 UDFs to preprocess the data for analysis, and migrated ETL operations into the Hadoop system using Pig Latin and Python 3.5.1 scripts.
- Used Pig as an ETL tool for transformations, event joins, filtering, and pre-aggregations before storing data in HDFS.
- Troubleshot, debugged, and resolved Talend issues while maintaining the health and performance of the ETL environment.
- Loaded data into the cluster from dynamically generated files using Flume and from relational database management systems using Sqoop.
- Used Spark to parse XML files, extract values from tags, and load them into multiple Hive tables.
- Experienced in running Hadoop streaming jobs to process terabytes of formatted data using Python scripts.
- Developed small distributed applications using ZooKeeper 3.4.7 and scheduled workflows using Oozie 4.2.0.
- Proficient in writing Unix/Linux shell commands.
- Developed an SCP simulator that emulates intelligent-network behavior and interacts with the SSF.
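A minimal sketch of the Spark Streaming consumer pattern referenced above, using the spark-streaming-kafka-0-10 direct stream API; the broker address, topic name (tweets), and batch interval are illustrative assumptions, not the original project values:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka010.ConsumerStrategies;
import org.apache.spark.streaming.kafka010.KafkaUtils;
import org.apache.spark.streaming.kafka010.LocationStrategies;

public class TweetStreamJob {
    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf().setAppName("TweetStreamJob");
        // 10-second micro-batches (assumed interval for the example)
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(10));

        Map<String, Object> kafkaParams = new HashMap<>();
        kafkaParams.put("bootstrap.servers", "localhost:9092"); // assumed broker address
        kafkaParams.put("key.deserializer", StringDeserializer.class);
        kafkaParams.put("value.deserializer", StringDeserializer.class);
        kafkaParams.put("group.id", "tweet-consumers");
        kafkaParams.put("auto.offset.reset", "latest");

        // Direct stream from an assumed "tweets" topic fed by the Twitter/Flume source
        JavaInputDStream<ConsumerRecord<String, String>> stream =
                KafkaUtils.createDirectStream(
                        jssc,
                        LocationStrategies.PreferConsistent(),
                        ConsumerStrategies.<String, String>Subscribe(
                                Collections.singletonList("tweets"), kafkaParams));

        // Count records per micro-batch as a simple throughput measure
        stream.map(ConsumerRecord::value)
              .count()
              .print();

        jssc.start();
        jssc.awaitTermination();
    }
}
```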
Environment: Hadoop cluster, HDFS, Hive, Pig, Sqoop, OLAP, data modeling, Linux, Hadoop MapReduce, HBase, shell scripting, MongoDB, Cassandra, Apache Spark, Neo4j.
Confidential, Harrisburg, PA
Senior Bigdata/ Hadoop Developer
Responsibilities:
- Worked on distributed/cloud computing (MapReduce/Hadoop, Hive, Pig, HBase, Sqoop, Spark, Avro, ZooKeeper, etc.) on the Cloudera Distribution of Hadoop (CDH4).
- Installed and configured Hadoop MapReduce and HDFS, and developed multiple MapReduce jobs in Java for data cleaning and processing (a minimal sketch follows this list).
- Involved in installing Hadoop Ecosystem components.
- Imported and exported data into HDFS, Pig, Hive, and HBase using Sqoop.
- Managed data coming from various sources.
- Loaded data into the cluster using Flume and from relational database management systems using Sqoop.
- Developed Pig scripts for data analysis and extended their functionality by developing custom UDFs.
- Extensive knowledge of Pig scripts using bags and tuples.
- Experience in managing and reviewing Hadoop log files.
- Involved in requirements gathering, design, development, and testing.
- Worked on loading and transforming large sets of structured and semi-structured data into the Hadoop system.
- Developed simple and complex MapReduce programs in Java for Data Analysis.
- Loaded data from various data sources into HDFS using Flume.
- Developed Pig UDFs to preprocess the data for analysis.
- Worked on Hue interface for querying the data.
- Created Hive tables to store the processed results in a tabular format.
- Developed Hive Scripts for implementing dynamic Partitions.
- Developed workflow in Oozie to automate the tasks of loading the data into HDFS and pre-processing with Pig.
- Exported analyzed data to relational databases using Sqoop for visualization and report generation by the BI team.
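A minimal sketch of the kind of data-cleaning MapReduce step described above; the pipe-delimited record layout and field count are illustrative assumptions:

```java
import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

/** Map-only cleaning step: keeps well-formed, pipe-delimited records and drops the rest. */
public class RecordCleanMapper extends Mapper<LongWritable, Text, NullWritable, Text> {

    private static final int EXPECTED_FIELDS = 5; // assumed layout of the source feed

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String line = value.toString().trim();
        String[] fields = line.split("\\|", -1);

        // Skip empty lines and records with a missing or extra column
        if (line.isEmpty() || fields.length != EXPECTED_FIELDS) {
            context.getCounter("CLEANING", "BAD_RECORDS").increment(1);
            return;
        }
        context.write(NullWritable.get(), new Text(line));
    }
}
```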
Environment: Hadoop (CDH4), UNIX, Eclipse, HDFS, Java, MapReduce, Apache Pig, Hive, HBase, Oozie, Sqoop, and MySQL.
Confidential, New York, NY
Big Data Software Developer
Responsibilities:
- Worked on Sqoop to import data from various relational data sources.
- Worked with Flume to bring clickstream data from front-facing application logs.
- Worked on strategizing Sqoop jobs to parallelize data loads from source systems.
- Participated in providing inputs for design of the ingestion patterns.
- Participated in strategizing loads without impacting front facing applications.
- Worked on performance tuning of HIVE queries with partitioning and bucketing process.
- Worked on the core and Spark SQL modules of Spark extensively.
- Developed Kafka producers and consumers, HBase clients, and Spark and Hadoop MapReduce jobs, along with components on HDFS and Hive (a minimal producer sketch follows this list).
- Worked with Solr to perform full-text and NoSQL-style search over structured and unstructured data.
- Worked with big data tools such as Apache Phoenix, Apache Kylin, AtScale, and Apache Hue.
- Worked with security tools such as Apache Knox, Ranger, Atlas, Sentry, and Kerberos.
- Worked with BI concepts and tools such as Dataguru and Talend.
- Worked in agile environment using Jira, Git.
- Worked on the design of Hive and ANSI SQL data stores to hold data from various data sources.
- Involved in brainstorming sessions for sizing the Hadoop cluster.
- Involved in providing inputs to analyst team for functional testing.
- Worked with source system load testing teams to perform loads while ingestion jobs are in progress.
- Worked with continuous integration and related tools (e.g., Nagios, Jenkins, Maven, Puppet, Chef, Ganglia).
- Performed data standardization using Pig scripts.
- Worked with query engines such as Tez and Apache Phoenix.
- Worked with Business Intelligence (BI) concepts and data warehousing technologies using Power BI and R statistics.
- Worked on installation and configuration of a Hortonworks cluster from the ground up.
- Managed various groups for users with different queue configurations.
- Worked on building analytical data stores for data science team’s model development.
- Worked on design and development of Oozie workflows to orchestrate Pig and Hive jobs.
- Worked with source code management tools such as GitHub, ClearCase, SVN, and CVS.
- Working experience with testing tools JUnit and SoapUI.
- Analyzed SQL scripts and designed solutions for implementation using PySpark.
- Worked with code quality governance tools (Sonar, PMD, FindBugs, Emma, Cobertura, etc.).
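A minimal sketch of a Kafka producer of the kind referenced above; the broker address, topic name (clickstream), and sample payload are illustrative assumptions:

```java
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

/** Minimal producer that publishes clickstream events to a Kafka topic. */
public class ClickstreamProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");        // assumed broker address
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());
        props.put("acks", "all");                                 // wait for full replication

        // try-with-resources flushes and closes the producer on exit
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("clickstream", "user-123", "{\"page\":\"/home\"}"));
        }
    }
}
```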
Environment: Hadoop, HDFS, MapReduce, Flume, Pig, Sqoop, Hive, Oozie, Solr, Ganglia, HBase, shell scripting, Apache Spark.
Confidential, Chicago, IL
JAVA/J2EE Developer
Responsibilities:
- Involved in the project from requirements gathering through design, testing, and production, following Agile methodology.
- Implemented the Spring MVC framework, including writing controller classes to handle requests and process form submissions, and performed validations using Commons Validator (a minimal controller sketch follows this list).
- Implemented the business layer using Hibernate with Spring DAO, and developed mapping files and POJO Java classes using the ORM tool.
- Designed and developed Business Services using Spring Framework (Dependency Injection) and DAO Design Patterns.
- Knowledge of Spring Batch, which provides functions for processing large volumes of records, including job processing statistics, job restart, skip, and resource management.
- Implemented various design patterns in the project such as Business Delegate, Data Transfer Object, Service Locator, Data Access Object, and Singleton.
- Set up the build environment by writing Maven build XML and deployment descriptors, and built, configured, and deployed the application on all servers.
- Implemented the business logic in the middle tier using Java classes and JavaBeans, and used the JUnit framework for unit testing of the application.
- Developed web services for web store components using JAXB, and generated stubs and JAXB data model classes based on annotations.
- Worked on REST APIs and the Node.js platform.
- Developed XML configuration and data descriptions using Hibernate; the Hibernate Transaction Manager was used to maintain transaction persistence.
- Designed and developed a web-based application using HTML5, CSS, JavaScript, AJAX, and JSP.
- Involved in various testing efforts as per the specifications and test cases, using Test-Driven Development.
- Applied the MVC pattern of the Ajax framework, creating controllers to implement the corresponding classes.
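A minimal sketch of a Spring MVC controller handling a form submission, as referenced above; CustomerForm, the request mappings, and view names are hypothetical, and JSR-303 annotations stand in here for the project's Commons Validator wiring:

```java
import javax.validation.Valid;
import javax.validation.constraints.NotNull;

import org.springframework.stereotype.Controller;
import org.springframework.ui.Model;
import org.springframework.validation.BindingResult;
import org.springframework.web.bind.annotation.ModelAttribute;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestMethod;

/** Hypothetical controller for a customer registration form. */
@Controller
@RequestMapping("/customers")
public class CustomerController {

    /** Simple form-backing bean (hypothetical fields). */
    public static class CustomerForm {
        @NotNull
        private String name;
        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
    }

    @RequestMapping(value = "/new", method = RequestMethod.GET)
    public String showForm(Model model) {
        model.addAttribute("customer", new CustomerForm());
        return "customerForm";                 // logical JSP view name
    }

    @RequestMapping(value = "/new", method = RequestMethod.POST)
    public String submit(@Valid @ModelAttribute("customer") CustomerForm form,
                         BindingResult result) {
        if (result.hasErrors()) {
            return "customerForm";             // redisplay form with validation messages
        }
        // hand off to the service/DAO layer here
        return "redirect:/customers/confirmation";
    }
}
```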
Environment: JDK 5.0, J2EE, Servlets, JSP, Spring, HTML, JavaScript (Prototype), XML, JSTL, XPath, jQuery, Oracle 10, RAD, TDD, WebSphere Application Server, SVN, Maven, JDBC, Windows XP, Hibernate.
Confidential
JAVA/J2EE Developer
Responsibilities:
- Worked with Java, J2EE, Struts, web services, and Hibernate in a fast-paced development environment.
- Followed Agile methodology, interacted directly with the client on features, implemented optimal solutions, and tailored the application to customer needs.
- Involved in design and implementation of web tier using Servlets and JSP.
- Used Apache POI to read Excel files.
- Developed the user interface using JSP and JavaScript to view all online trading transactions.
- Used JSP and JSTL Tag Libraries for developing User Interface components.
- Performed code reviews.
- Performed unit testing, system testing and integration testing.
- Designed and developed Data Access Objects (DAOs) to access the database (a minimal JDBC DAO sketch follows this list).
- Used the DAO factory and value object design patterns to organize and integrate the Java objects.
- Coded JavaServer Pages for dynamic front-end content backed by Servlets and EJBs.
- Coded HTML pages using CSS for static content generation with JavaScript for validations.
- Used JDBC API to connect to the database and carry out database operations.
- Involved in building and deployment of application in Linux environment.
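A minimal sketch of a JDBC-backed DAO of the kind referenced above; the table and column names are illustrative assumptions:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

import javax.sql.DataSource;

/** Data Access Object that reads a trade record through plain JDBC. */
public class TradeDao {

    private final DataSource dataSource;

    public TradeDao(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    public String findSymbolById(long tradeId) throws SQLException {
        String sql = "SELECT symbol FROM trades WHERE trade_id = ?"; // assumed table/columns
        try (Connection conn = dataSource.getConnection();
             PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setLong(1, tradeId);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getString("symbol") : null;   // null if no match
            }
        }
    }
}
```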
Environment: Java, J2EE, JDBC, Struts, SQL, Hibernate, Eclipse, Apache POI, CSS.
Confidential
Software Engineer
Responsibilities:
- Used WebSphere for developing use cases, sequence diagrams, and preliminary class diagrams for the system in UML.
- Extensively used WebSphere Studio Application Developer for building, testing, and deploying applications.
- Used the Spring Framework based on Model-View-Controller (MVC), and designed GUI screens using HTML and JSP.
- Developed the user interface using JSP and DHTML to build dynamic HTML pages.
- Developed session beans on WebSphere for the transactions in the application.
- Developed the presentation layer and GUI framework in HTML and JSP, and performed client-side validations.
- Wrote Java code that generated XML documents and used XSLT to translate the content into HTML for presentation in the GUI.
- Implemented XQuery and XPath for querying and node selection over client input XML files to create Java objects (a minimal XPath sketch follows this list).
- Used WebSphere to develop the entity beans where transaction persistence was required, and used JDBC to connect to the MySQL database.
- Utilized WSAD to create JSPs, Servlets, and EJBs that pulled information from a DB2 database and sent it to a front-end GUI for end users.
- On the database side, responsibilities included the creation of tables, triggers, stored procedures, subqueries, joins, integrity constraints, and views.
- Worked on MQSeries with J2EE technologies (EJB, JavaMail, JMS, etc.) on the WebSphere server.
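A minimal sketch of XPath-based node selection over a client XML file, as referenced above; the file name and document structure are illustrative assumptions:

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

/** Selects order nodes from a client XML file with XPath and maps them to plain values. */
public class OrderXmlReader {
    public static void main(String[] args) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse("orders.xml");                       // assumed input file

        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList orders = (NodeList) xpath.evaluate(
                "/orders/order[@status='OPEN']/id/text()",  // assumed document structure
                doc, XPathConstants.NODESET);

        for (int i = 0; i < orders.getLength(); i++) {
            System.out.println("Open order id: " + orders.item(i).getNodeValue());
        }
    }
}
```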
Environment: Java, EJB, IBM WebSphere Application Server, Spring, JSP, Servlets, JUnit, JDBC, XML, XSLT, CSS, DOM, HTML, MySQL, JavaScript, Oracle, UML, ClearCase, Ant.