Big Data/Hadoop Developer Resume
Indianapolis, IN
PROFESSIONAL SUMMARY:
- Seasoned database developer with extensive experience implementing Big Data solutions and Big Data analytics, and in the design and implementation of database applications
- Over 8 years of experience in the IT industry, including 5 years implementing and administering Big Data solutions and Big Data analytics in the e-commerce, education, financial, and healthcare domains
- Proficient in the Hadoop ecosystem, with hands-on experience implementing Hadoop technologies such as HDFS, Oozie, Sqoop, Impala, Flume, MapReduce, Pig, ZooKeeper, and Hive. Implemented solutions include:
- Developed MapReduce jobs in Hive and Pig
- Used Hadoop ecosystem components to load and transform large sets of structured, semi-structured, and unstructured data, and exported data to Tableau using a live connection.
- Created databases, tables, and views in HiveQL, Impala, and Pig Latin (see the HiveQL sketch after this summary).
- Developed proofs of concept on the Hadoop stack and various Big Data analytics tools, including migrations from databases such as Oracle and MySQL to Hadoop.
- Used Oozie to define and schedule jobs.
- Experience developing workflows using Flume agents with multiple sources, such as web server logs and REST APIs, and multiple sinks, such as HDFS and Kafka.
- Good knowledge of Amazon Web Services, including EC2, IAM, EMR, S3, Redshift, DynamoDB, Aurora, and AWS security and compliance programs.
- Experience implementing a unified data platform that ingests data from disparate sources using Apache Kafka brokers with various producers and consumers.
- Proficient in Java, Scala, and Python.
- Hands-on experience in application development using Java, RDBMS, and Linux shell scripting.
- Extensive experience using the Tableau and QlikView reporting tools.
- Working knowledge of techniques such as sentiment analysis.
- Good experience implementing NoSQL databases such as MongoDB, Cassandra, and HBase, and working with data sources including flat files, XML files, and relational databases
- Proficient in the major Hadoop distributions, including Cloudera, Hortonworks, and MapR.
- Extensive experience in the design and development of database applications on Oracle 12c/11g/10g using SQL and PL/SQL
- Extensively used PL/SQL to build Oracle packages, stored procedures, functions, triggers, views, and cursors for processing data.
- Worked with advanced PL/SQL constructs such as Oracle-supplied packages, nested tables, arrays, records, and types.
- A team player with strong analytical, communication, and interpersonal skills and a proven ability to deliver multiple projects under tight deadlines
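As an illustration of the HiveQL work noted above, the following is a minimal sketch of creating and querying a Hive table over JDBC. The connection URL, credentials, and the orders table are hypothetical examples, not artifacts of any project listed here.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

/** Minimal HiveQL-over-JDBC sketch; host, database, and table names are hypothetical. */
public class HiveQlExample {
    public static void main(String[] args) throws Exception {
        // HiveServer2 JDBC driver (ships with Hive as hive-jdbc)
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:hive2://hive-host:10000/default", "user", "");
             Statement stmt = conn.createStatement()) {

            // Create a partitioned external table over files already landed in HDFS
            stmt.execute("CREATE EXTERNAL TABLE IF NOT EXISTS orders ("
                + " order_id BIGINT, customer_id BIGINT, amount DOUBLE)"
                + " PARTITIONED BY (order_date STRING)"
                + " ROW FORMAT DELIMITED FIELDS TERMINATED BY ','"
                + " LOCATION '/data/raw/orders'");

            // A typical reporting query; Hive compiles this into cluster jobs
            try (ResultSet rs = stmt.executeQuery(
                    "SELECT order_date, SUM(amount) AS total"
                  + " FROM orders GROUP BY order_date")) {
                while (rs.next()) {
                    System.out.println(rs.getString(1) + "\t" + rs.getDouble(2));
                }
            }
        }
    }
}
```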
TECHNICAL SKILLSET:
Hadoop Framework: HDFS, MapReduce, Java, Hive, Pig, HBase, Sqoop, Flume, Solr, Spark
Databases: Hive (HiveQL), Impala, Oracle 12c/11g/10g
Languages: SQL, PL/SQL, Core Java
Tools: TOAD, Oracle SQL Developer
Operating Systems: Windows, Linux
IDE Tools: Eclipse
Web Technologies: HTML, JavaScript, CSS
Servers: Apache, Tomcat
Reporting Tools: Tableau, QlikView, Crystal Reports, SSRS, and HTML Reports
WORK EXPERIENCE:
Big Data/Hadoop Developer
Confidential, Indianapolis, IN
Responsibilities:
- Developed on the Cloudera distribution of Hadoop.
- As a Hadoop developer, managed the data pipelines and the data lake.
- Performed ETL in Hadoop using Hive on data at different stages of the pipeline.
- Imported data from different source systems with Sqoop and automated the imports with Oozie workflows.
- Generated business reports from the data lake using Impala SQL, per business needs.
- Automated the delivery of business reports to business owners using Bash scripts on UNIX.
- Developed Spark code in Scala to cleanse and transform data at different stages of the pipeline (see the ETL sketch at the end of this section).
- Worked across DEV, QA, data lake, and analytics cluster environments as part of Hadoop development.
- Responsible for the implementation and ongoing administration of Hadoop infrastructure.
- Snapshotted the cleansed data to the analytics cluster for business reporting.
- Developed Pig scripts and Python streaming jobs, and created Hive tables on top of their output.
- Developed multiple POCs in Scala, deployed them on the YARN cluster, and compared the performance of Spark with SQL.
- Converted Hive/SQL queries into Spark transformations using Spark RDDs and Scala.
- Developed Oozie workflows to run multiple Hive, Pig, Sqoop, and Spark jobs.
- Imported data from various data sources, performed transformations using Hive and Spark, and loaded the data into HDFS.
- Experience with UNIX administration.
- Developed Pig, Hive, Sqoop, Hadoop Streaming, and Spark actions in Oozie workflows.
- Supported MapReduce programs running on the cluster.
- Experienced in collecting, aggregating, and moving large amounts of streaming data into HDFS using Flume.
- Good understanding of workflow management processes and their implementation.
- Knowledge of HL7 protocols and of parsing HL7 messages.
- Involved in developing frameworks used in the data pipelines, and coordinated with a Cloudera consultant.
Environment: Hadoop, HDFS, Hive, Pig, Oozie, Spark Core, Spark SQL, Spark Streaming, Scala, Java, Eclipse, Flume, Cloudera Distribution, Oracle 10g, UNIX Shell Scripting, Python, Git.
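The cleanse-and-transform stage described above took roughly the following shape. The project code was written in Scala; this sketch uses Spark's Java API, and the HDFS path, table, and column names are hypothetical.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.to_date;

/** Sketch of a cleanse-and-load pipeline stage; names are illustrative only. */
public class PipelineCleanse {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
            .appName("pipeline-cleanse")
            .enableHiveSupport()          // lets Spark read/write Hive tables
            .getOrCreate();

        // Raw stage data landed on HDFS by Sqoop/Flume
        Dataset<Row> raw = spark.read()
            .option("header", "true")
            .csv("hdfs:///data/stage/orders");

        // Cleanse: drop malformed rows, normalize types, de-duplicate --
        // the kind of logic previously expressed as a Hive query
        Dataset<Row> clean = raw
            .filter(col("order_id").isNotNull())
            .withColumn("amount", col("amount").cast("double"))
            .withColumn("order_date", to_date(col("order_date"), "yyyy-MM-dd"))
            .dropDuplicates("order_id");

        // Publish the cleansed stage as a Hive table for Impala/reporting
        clean.write().mode("overwrite").saveAsTable("curated.orders");

        spark.stop();
    }
}
```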
Hadoop Developer
Confidential, Sterling, VA
Responsibilities:
- Built a framework in Java 8 that ingests data into Hadoop from a variety of data sources, providing high storage efficiency and a layout optimized for analytics.
- Implemented a POC showcasing next-generation ETL data platform capabilities through Spark RDDs and in-memory transformations in Scala.
- The framework was implemented in Java 8 using Spring Boot for Hadoop, with Spark RDDs and map data structures.
- Used the Spark Streaming API to perform transformations and actions on the fly, building a common learner data model that consumes data from Kafka in near real time and persists it to HBase (see the streaming sketch at the end of this section).
- Loaded data into Spark RDDs and performed in-memory computation to generate the output response.
- Responsible for HBase REST server administration, backup and recovery.
- Performed different types of transformations and actions on the RDD to meet the business requirements.
- Developed a data pipeline using Kafka, Spark, HBase, and Hive to ingest, transform, and analyze data, orchestrated with Oozie workflows.
- Loaded transformed data into Hive tables, providing SQL-like access to the data.
- Created Phoenix views over the HBase tables to enable SQL access to HBase.
- Loaded the transformed data after performing validations based on business requirements.
- Developed Spark programs using the Spark SQL library to perform analytics on data in Hive.
- Wrote Hive jobs to parse the logs and structure them in tabular format to facilitate effective querying of the log data.
- Streamed parsed error data into a Kafka error topic, which serves as an input for Apache Sentry to send out notifications.
- Used Hive to analyze the partitioned data and compute various metrics for reporting.
- Managed and reviewed Hadoop log files.
- Extensively used Apache Sqoop for efficiently transferring bulk data from Hive to MySQL.
- Used Power BI for report generation.
Environment: Hadoop, Spark, Scala, Kafka, Sqoop, HDFS, Hive, Pig, Oozie, MySQL, Impala, Sentry.
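The Kafka-to-HBase streaming path described above is sketched below using the spark-streaming-kafka-0-10 and HBase client APIs. The topic, consumer group, table, and column layout are hypothetical, and real code would handle offsets, retries, and unkeyed messages.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka010.ConsumerStrategies;
import org.apache.spark.streaming.kafka010.KafkaUtils;
import org.apache.spark.streaming.kafka010.LocationStrategies;

/** Kafka -> Spark Streaming -> HBase sketch; all names are illustrative. */
public class LearnerStream {
    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf().setAppName("learner-stream");
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(10));

        Map<String, Object> kafkaParams = new HashMap<>();
        kafkaParams.put("bootstrap.servers", "kafka-host:9092");
        kafkaParams.put("key.deserializer", StringDeserializer.class);
        kafkaParams.put("value.deserializer", StringDeserializer.class);
        kafkaParams.put("group.id", "learner-model");

        JavaInputDStream<ConsumerRecord<String, String>> stream =
            KafkaUtils.createDirectStream(
                jssc,
                LocationStrategies.PreferConsistent(),
                ConsumerStrategies.<String, String>Subscribe(
                    Collections.singletonList("learner-events"), kafkaParams));

        // Persist each micro-batch to HBase, one connection per partition;
        // assumes keyed messages (the key becomes the HBase row key)
        stream.foreachRDD(rdd -> rdd.foreachPartition(records -> {
            try (Connection hbase = ConnectionFactory.createConnection(HBaseConfiguration.create());
                 Table table = hbase.getTable(TableName.valueOf("learner"))) {
                while (records.hasNext()) {
                    ConsumerRecord<String, String> rec = records.next();
                    Put put = new Put(Bytes.toBytes(rec.key()));
                    put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("event"),
                                  Bytes.toBytes(rec.value()));
                    table.put(put);
                }
            }
        }));

        jssc.start();
        jssc.awaitTermination();
    }
}
```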
Hadoop Developer
Confidential, Jersey City, NJ
Responsibilities:
- Wrote, tested, and performance-tuned various Hive queries.
- Developed Pig Latin scripts to analyze large data sets in areas where extensive hand-written code needed to be reduced.
- Used Sqoop to extract data from relational databases into Hadoop.
- Improved code performance through optimizations such as writing custom comparators and combiner logic.
- Worked closely with data warehouse architect and business intelligence analyst to develop solutions.
- Good understanding of job schedulers such as the Fair Scheduler, which assigns resources so that all jobs receive, on average, an equal share of resources over time, and familiarity with the Capacity Scheduler.
- Responsible for performing peer code reviews, troubleshooting issues, and maintaining status reports.
- Created Hive tables, loaded them with data, and wrote Hive queries that invoke and run MapReduce jobs in the backend.
- Identified possible ways to improve the efficiency of the system. Involved in requirements analysis, design, development, and unit testing with MRUnit and JUnit (see the test sketch at the end of this section).
- Prepared daily and weekly project status reports and shared them with the client.
- Supported setting up the QA environment and updating configurations for implementing scripts with Pig, Hive, and Sqoop.
Environment: Apache Hadoop, Java (JDK 1.7), Oracle, MySQL, Hive, Pig, Sqoop, Linux, CentOS, JUnit, MRUnit, Cloudera
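Unit testing MapReduce code with MRUnit, as mentioned above, typically looks like the sketch below: the driver runs a mapper in-process, with no cluster. The log-parsing mapper here is a hypothetical example, not project code.

```java
import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mrunit.mapreduce.MapDriver;
import org.junit.Before;
import org.junit.Test;

/** MRUnit sketch: unit-testing a toy log-parsing mapper without a cluster. */
public class LogMapperTest {

    /** Toy mapper: emits (status code, 1) for each "ip status" log line. */
    static class LogMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        @Override
        protected void map(LongWritable key, Text value, Context ctx)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split("\\s+");
            if (fields.length >= 2) {
                ctx.write(new Text(fields[1]), ONE);
            }
        }
    }

    private MapDriver<LongWritable, Text, Text, IntWritable> driver;

    @Before
    public void setUp() {
        driver = MapDriver.newMapDriver(new LogMapper());
    }

    @Test
    public void emitsStatusCodeCount() throws IOException {
        driver.withInput(new LongWritable(0), new Text("10.0.0.1 404"))
              .withOutput(new Text("404"), new IntWritable(1))
              .runTest();
    }
}
```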
Java Developer
Confidential
Responsibilities:
- Involved in analysis, design and development of Expense Processing systems.
- Created user interfaces using JSP.
- Developed the Web Interface using Servlets, Java Server Pages, HTML and CSS.
- Developed the DAO objects using JDBC (see the DAO sketch at the end of this section).
- Developed business services using servlets and Java.
- Designed and developed user interfaces and menus using HTML5, JSP, and JavaScript, with client-side and server-side validations.
- Developed the GUI using JSP and the Struts framework.
- Involved in developing the presentation layer using Spring MVC, AngularJS, and jQuery.
- Involved in designing the user interfaces using the Struts Tiles framework.
- Used the Spring 2.0 framework for dependency injection, integrated with the Struts framework and Hibernate.
- Used Hibernate 3.0 in the data access layer to access and update information in the database.
- Experience in SOA (service-oriented architecture), creating web services with SOAP and WSDL.
- Developed JUnit test cases for all the developed modules.
- Used Log4j to capture logs, including runtime exceptions; monitored error logs and fixed the problems.
- Used RESTful services to interact with the client by providing RESTful URL mappings.
- Used CVS for version control across common source code used by developers.
- Used Ant scripts to build the application and deployed it on Oracle WebLogic Server 10.0.
Environment: Struts 1.2, Hibernate 3.0, Spring 2.5, JSP, Servlets, XML, SOAP, WSDL, JDBC, JavaScript, HTML, CVS, Log4j, JUnit, WebLogic application server, Eclipse, Oracle, RESTful.
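The DAO-over-JDBC pattern referenced above follows the general shape below. The EXPENSES table, the fields, and the injected DataSource are hypothetical; real DAOs in this stack would typically get their connections from a container-managed pool.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import javax.sql.DataSource;

/** DAO-over-JDBC sketch; table and column names are illustrative only. */
public class ExpenseDao {
    private final DataSource dataSource;   // typically looked up from JNDI or injected by Spring

    public ExpenseDao(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    /** Insert one expense row using a parameterized statement. */
    public void insert(long id, String description, double amount) throws SQLException {
        String sql = "INSERT INTO EXPENSES (ID, DESCRIPTION, AMOUNT) VALUES (?, ?, ?)";
        try (Connection conn = dataSource.getConnection();
             PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setLong(1, id);
            ps.setString(2, description);
            ps.setDouble(3, amount);
            ps.executeUpdate();
        }
    }

    /** Look up the amount for one expense; returns 0.0 if the row is missing. */
    public double findAmount(long id) throws SQLException {
        String sql = "SELECT AMOUNT FROM EXPENSES WHERE ID = ?";
        try (Connection conn = dataSource.getConnection();
             PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setLong(1, id);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getDouble(1) : 0.0;
            }
        }
    }
}
```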
Java Developer
Confidential
Responsibilities:
- Wrote SQL queries, stored procedures, and triggers to perform back-end database operations.
- Developed nightly batch jobs which involved interfacing with external third-party state agencies.
- Implemented JMS producers and consumers using Mule ESB (see the JMS sketch at the end of this section).
- Gathered business requirements and wrote functional specifications and detailed design documents.
- Extensively used Core Java, Servlets, JSP and XML.
- Wrote AngularJS controllers, views, and services.
- Designed the logical and physical data model, generated DDL scripts, and wrote DML scripts for Oracle 9i database.
- Implemented Enterprise Logging service using JMS and Apache CXF.
- Developed unit test cases and used JUnit for unit testing of the application.
- Involved in designing user screens and validations using HTML, jQuery, Ext JS and JSP as per user requirements.
Environment: Java, Spring Core, JMS, web services, JDK, SVN, Maven, Mule ESB, JUnit, WAS7, jQuery, Ajax, SAX.
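Mule ESB wires JMS endpoints through configuration; underneath is the standard JMS API, sketched below with a container-provided connection factory. The JNDI names, the queue, and the message payload are hypothetical.

```java
import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.MessageConsumer;
import javax.jms.MessageProducer;
import javax.jms.Queue;
import javax.jms.Session;
import javax.jms.TextMessage;
import javax.naming.InitialContext;

/** Plain JMS producer/consumer sketch; JNDI names and queue are illustrative only. */
public class ExpenseJms {
    public static void main(String[] args) throws Exception {
        InitialContext ctx = new InitialContext();  // JNDI environment supplied by the container
        ConnectionFactory factory = (ConnectionFactory) ctx.lookup("jms/ConnectionFactory");
        Queue queue = (Queue) ctx.lookup("jms/expenseQueue");

        Connection connection = factory.createConnection();
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);

        // Producer: publish an expense event to the queue
        MessageProducer producer = session.createProducer(queue);
        producer.send(session.createTextMessage("expense:42:approved"));

        // Consumer: receive it back (same session, for illustration only)
        MessageConsumer consumer = session.createConsumer(queue);
        connection.start();
        TextMessage msg = (TextMessage) consumer.receive(5000);
        System.out.println("received: " + (msg != null ? msg.getText() : "none"));

        connection.close();
    }
}
```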
