Senior Big Data Developer Resume
Raleigh, NC
SUMMARY
- Over 8 years of professional experience in IT, including 4+ years of work experience in Big Data, Hadoop development and ecosystem analytics across Insurance, Health Care and Retail industry projects, with multi-language programming expertise in Java and Python.
- Hadoop Developer with 4 years of experience designing and implementing complete end-to-end Hadoop infrastructure using MapReduce, Pig, Hive, Sqoop, Oozie, HBase and Flume.
- Java programmer with 4+ years of extensive experience developing web-based applications and Client-Server technologies.
- Experience in integrating Flume with Spark Streaming and the downstream HBase database.
- Experience in processing multiple files from HDFS using Spark Streaming and storing the data in HBase.
- Good knowledge of Spark components such as Spark SQL and Spark Streaming.
- Good knowledge of Hadoop architecture and components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode and MapReduce concepts.
- Experience in developing MapReduce programs using Hadoop for working with Big Data.
- Experience in analyzing data using HiveQL, Pig Latin and custom MapReduce programs in Java.
- Experience in importing and exporting data using Sqoop between Relational Database Systems and HDFS.
- Collected and aggregated large amounts of log data using Apache Flume and stored the data in HDFS for further analysis.
- Experience with job/workflow scheduling and monitoring tools such as Oozie.
- Experience in designing both time-driven and data-driven automated workflows using Oozie.
- Worked in complete Software Development Life Cycle (analysis, design, development, testing, implementation and support) using Agile Methodologies.
- Experience with Hadoop clusters using major Hadoop distributions: Cloudera (CDH4, CDH5) and Hortonworks (HDP).
- Experience in different layers of Hadoop Framework - Storage (HDFS), Analysis (Pig and Hive), Engineering (Jobs and Workflows).
- Experienced in using Integrated Development Environments and editors such as Eclipse, IntelliJ, Kate and gEdit.
- Migrated data from different databases (Oracle, DB2, Teradata) to Hadoop.
- Worked on migrating RDBMS databases into different NoSQL databases.
- Experience in designing and coding web applications using Core Java and web technologies: JSP, Servlets and JDBC.
- Extensive knowledge of J2EE architecture, patterns, design and development.
- Excellent knowledge of Java and SQL in application development and deployment.
- Familiar with data warehousing fact and dimension tables and star schemas, combined with Google Fusion Tables for visualization.
- Hands-on experience in creating database objects such as tables, views, functions and triggers using SQL.
- Excellent technical, communication, analytical, problem-solving and troubleshooting skills, with the ability to work well with people from cross-cultural backgrounds.
TECHNICAL SKILLS
Big Data Ecosystems: Hadoop, MapReduce, HDFS, HBase, Hive, Spark, Pig, Sqoop, Oozie, Zookeeper
Languages: C, Core Java, UNIX Shell, SQL, Python, C#, Scala
J2EE Technologies: Servlets, JSP, JDBC, Java Beans.
Methodologies: Agile, UML, Design Patterns (Core Java and J2EE).
NoSQL Technologies: HBase
Frameworks: MVC, Struts, Hibernate, Spring.
Database: Oracle 11g, MySQL, MS SQL Server, Teradata, PostgreSQL, IBM DB2
Operating Systems: Windows XP/Vista/7, UNIX.
Software Package: MS Office 2010.
Tools & Utilities: Eclipse, IntelliJ, SVN, Git, Maven, SOAP UI, JMX Explorer, QC, QTP, Jira
Web Technologies: HTML, XML, JavaScript, jQuery, AJAX, SOAP, and WSDL
PROFESSIONAL EXPERIENCE
Confidential - Raleigh, NC
Senior Big Data Developer
Responsibilities:
- Developed a data pipeline using Flume, Sqoop, Pig and Java MapReduce to ingest customer behavioral data and financial histories into HDFS for analysis.
- Involved in writing MapReduce jobs.
- Used Sqoop and the HDFS put/copyFromLocal commands to ingest data.
- Extensively used UNIX shell scripting and pulled logs from the servers.
- Used Pig for transformations, event joins, filtering bot traffic and some pre-aggregations before storing the data in HDFS.
- Implemented a Spark Streaming job that integrates Flume with Spark Streaming, processes the data, and loads it into HBase.
- Implemented a Spark Streaming application that processes data from multiple files in HDFS and stores it in HBase.
- Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
- Involved in developing Hive DDLs to create, alter and drop Hive tables.
- Involved in developing Hive UDFs for needed functionality that is not available out of the box in Apache Hive.
- Involved in using HCatalog to access Hive table metadata from MapReduce or Pig code.
- Used Java MapReduce to compute various metrics that define user experience, revenue, etc.
- Responsible for developing a data pipeline using Flume, Sqoop and Pig to extract data from weblogs and store it in HDFS.
- Designed and implemented various metrics that can statistically signify the success of the experiment.
- Used Eclipse and Ant to build the application.
- Involved in using Sqoop for importing and exporting data between HDFS and Hive.
- Involved in processing ingested raw data using MapReduce, Apache Pig and Hive.
- Involved in developing Pig scripts for change data capture and delta record processing between newly arrived data and data already existing in HDFS.
- Involved in pivoting HDFS data from rows to columns and columns to rows.
- Involved in exporting processed data from Hadoop to relational databases or external file systems using Sqoop and the HDFS get/copyToLocal commands.
- Involved in developing shell scripts to orchestrate the execution of all other scripts (Pig, Hive and MapReduce) and to move data files within and outside of HDFS.
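The row-to-column pivot mentioned above can be sketched in miniature. This is a hypothetical in-memory illustration, not the production MapReduce/Pig code; record IDs, keys and values are invented:

```python
from collections import defaultdict

def pivot_rows_to_columns(rows):
    """Pivot (id, key, value) rows into one dict per id, with keys as columns.

    Simplified stand-in for the Pig/MapReduce pivot jobs described above;
    real HDFS data would be processed partition by partition.
    """
    pivoted = defaultdict(dict)
    for record_id, key, value in rows:
        pivoted[record_id][key] = value
    return dict(pivoted)

rows = [
    ("u1", "clicks", 10),
    ("u1", "revenue", 2.5),
    ("u2", "clicks", 7),
]
print(pivot_rows_to_columns(rows))
```

The reverse pivot (columns to rows) simply walks each per-id dict and emits one (id, key, value) tuple per entry.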
Confidential - Madison, Wisconsin
Hadoop Developer
Responsibilities:
- Loaded data from different sources (Teradata, DB2, Oracle and flat files) into HDFS using Sqoop and into partitioned Hive tables.
- Created different Pig scripts and wrapped them as shell commands to provide aliases for common operations in the project business flow.
- Used Sqoop to extract data from Oracle, SQL Server and MySQL databases into HDFS.
- Developed workflows in Oozie for business requirements to extract data using Sqoop.
- Developed MapReduce (YARN) jobs for cleaning, accessing and validating the data.
- Wrote MapReduce jobs using Pig Latin.
- Wrote Hive scripts in HiveQL to denormalize and aggregate the data.
- Optimized the existing Hive and Pig scripts.
- Automated the workflows using shell scripts (Bash) to export data from databases into Hadoop.
- Used the JUnit framework for unit testing of the application.
- Wrote Hive queries to meet the business requirements.
- Developed product profiles using Pig and custom UDFs.
- Designed workflows by scheduling Hive processes for Log file data, which is streamed into HDFS using Flume
- Developed schemas to handle reporting requirements using Tableau
- Actively participated in weekly meetings with the technical teams to review the code
- Involved in loading data from UNIX file system to HDFS
- Implemented test scripts to support test driven development and continuous integration
- Responsible for managing data coming from different sources.
- Deep and thorough understanding of ETL tools and how they can be applied in a Big Data environment.
- Participated in the requirement gathering and analysis phase of the project, documenting business requirements by conducting workshops/meetings with various business users.
- Involved in moving all log files generated from various sources to HDFS for further processing through Flume.
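The partitioned Hive tables loaded via Sqoop, described above, might be defined along these lines. This is an illustrative sketch; the table, column and path names are hypothetical, not from the actual project:

```sql
-- Hypothetical partitioned table for Sqoop-loaded claims data
CREATE TABLE IF NOT EXISTS claims (
    claim_id     BIGINT,
    customer_id  BIGINT,
    claim_amount DOUBLE
)
PARTITIONED BY (load_date STRING)
STORED AS ORC;

-- Static-partition load after a Sqoop import lands in a staging directory
LOAD DATA INPATH '/staging/claims/2015-06-01'
INTO TABLE claims PARTITION (load_date = '2015-06-01');
```

Partitioning by load date keeps each daily Sqoop extract in its own HDFS directory, so downstream HiveQL aggregations can prune to the partitions they need.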
Confidential, San Francisco, CA
Hadoop Developer
Responsibilities:
- Understood the exact requirements of each report from the business groups and users.
- Interacted frequently with business partners.
- Designed and developed a Medicare-Medicaid claims system using Model-driven architecture on a customized framework built on Spring.
- Moved data from HDFS to Cassandra using MapReduce and BulkOutputFormat class.
- Imported trading and derivatives data in Hadoop Distributed File System and Eco System (MapReduce, Pig, Hive, Sqoop).
- Was part of an activity to setup Hadoop ecosystem at dev & QA Environment.
- Managed and reviewed Hadoop Log files.
- Responsible for writing Pig scripts and Hive queries for data processing.
- Ran Sqoop to import data from Oracle and other databases.
- Created shell scripts to collect raw logs from different machines.
- Created static and dynamic partitions in Hive.
- Implemented Pig Latin scripts using operators such as LOAD, STORE, DUMP, FILTER, DISTINCT, FOREACH, GENERATE, GROUP, COGROUP, ORDER, LIMIT and UNION.
- Defined Pig UDFs for financial functions such as swaps, hedging, speculation and arbitrage.
- Coded many MapReduce programs to process unstructured log files.
- Imported and exported data between HDFS and Hive using Sqoop.
- Used parameterized Pig scripts and optimized them using ILLUSTRATE and EXPLAIN.
- Involved in configuring HA, resolving Kerberos security issues, and performing NameNode failure restoration from time to time as part of zero-downtime operations.
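A Pig Latin script of the kind described above, using several of the listed operators, could look like the following. The input path, field names and output path are hypothetical, for illustration only:

```pig
-- Illustrative only: paths and fields are invented
logs   = LOAD '/data/raw/trades' USING PigStorage('\t')
             AS (symbol:chararray, qty:int, price:double);
valid  = FILTER logs BY qty > 0;
by_sym = GROUP valid BY symbol;
totals = FOREACH by_sym GENERATE group AS symbol,
             SUM(valid.qty) AS total_qty;
sorted = ORDER totals BY total_qty DESC;
top10  = LIMIT sorted 10;
STORE top10 INTO '/data/out/top_symbols';
```

Running ILLUSTRATE or EXPLAIN against a script like this shows sample data flowing through each relation and the generated MapReduce plan, which is how the optimization mentioned above is typically done.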
Confidential - Dallas, TX
Java/J2EE Developer
Responsibilities:
- Wrote design documents based on requirements from the MMSEA user guide.
- Performed requirement gathering, design, coding, testing, implementation and deployment.
- Worked on modeling of dialog processes and business processes, and coded Business Objects, QueryMapper and JUnit files.
- Created the Business Object methods using Java, integrating the activity diagrams.
- Worked with web services using SOAP and WSDL.
- Wrote Query Mappers and MQ-related JUnit test cases.
- Developed the UI using XSL and JavaScript.
- Managed software configuration using ClearCase and SVN.
- Designed, developed and tested features and enhancements.
- Performed error-rate analysis of production issues and technical errors; provided production support and fixed production defects.
- Analyzed user requirement documents and developed test plans, including test objectives, test strategies, test environment and test priorities.
- Performed functional testing, performance testing, integration testing, regression testing, smoke testing and User Acceptance Testing (UAT).
- Converted complex SQL queries running on mainframes into Pig and Hive as part of a migration from mainframes to a Hadoop cluster.
Confidential
Java Developer
Responsibilities:
- Involved in various SDLC phases like Design, Development and Testing.
- Developed front end using Struts and JSP.
- Developed web pages using HTML, JavaScript, jQuery and CSS.
- Used various Core Java concepts such as Exception Handling and the Collections API to implement various features and enhancements.
- Developed server-side servlet components for the application.
- Involved in coding, maintaining, and administering Servlet and JSP components deployed on a WebSphere application server.
- Implemented Hibernate ORM to map relational data directly to Java objects.
- Worked with complex SQL queries, functions and stored procedures.
- Involved in developing the Spring Web MVC framework for the portal application.
- Implemented the logging mechanism using log4j framework.
- Developed REST APIs and web services.
- Wrote JUnit test cases for unit testing of classes.
- Used Maven to build the J2EE application.
- Used SVN to track and maintain the different version of the application.
- Involved in maintenance of different applications with onshore team.
- Good working experience with Tapestry claims processing.
- Working experience with professional billing claims.