Big Data Developer Resume
Columbus, OH
SUMMARY
- 5+ years of total IT experience, including 2.5+ years of hands-on experience with Hadoop, HDFS, and the Hadoop ecosystem.
- Extensive experience importing and exporting data between RDBMS and HDFS, Hive tables, and HBase using Sqoop.
- Extensive experience ingesting streaming data into HDFS using Flume sources, interceptors, and sinks.
- Expertise in analyzing structured data in Hive using HiveQL queries, partitioning, bucketing, and UDFs.
- Performed extensive transformations using Hive.
- Experience writing Pig Latin scripts with join operations and custom user-defined functions (UDFs) to perform ETL operations.
- Experienced in using Apache Oozie to combine multiple MapReduce, Hive, Pig, and Sqoop jobs into one logical unit of work.
- Experienced in optimizing the sort and shuffle phases of MapReduce programs, and implemented optimized joins across data from different data sources.
- Experience in performance tuning of Hive queries and Java MapReduce programs for scalability and faster execution.
- Hands-on experience writing Pig Latin scripts in the Grunt shell to carry out essential data operations and tasks on the Hadoop ecosystem.
- Worked with different file formats, including JSON, XML, Avro, Parquet, ORC, RCFile, and text files.
- Strong experience using and understanding NoSQL databases such as HBase and Cassandra.
- Experience working with Hadoop in standalone, pseudo-distributed, and fully distributed modes.
- Experience reviewing and managing logs and analyzing errors.
- Hands-on experience migrating complex MapReduce programs to Apache Spark RDD transformations (an illustrative sketch follows this summary).
- Experienced in cloud computing with Amazon Web Services such as EC2 and S3 for fast and efficient processing of big data.
- Expertise in extracting and exporting analyzed data and generating visualizations with the business intelligence tool Tableau for better analysis.
- Created complex executive dashboards in Tableau Desktop and published them for business stakeholders to analyze and customize using filters and actions.
- Expert in data visualization development using Tableau to create complex, intuitive, and innovative dashboards.
- Strong expertise with tools such as Tableau Server, Tableau Desktop, and QlikView.
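To illustrate the MapReduce-to-Spark migration work noted above, below is a minimal Java sketch of an RDD-based aggregation. The input path, field layout, and sum-by-region logic are hypothetical placeholders, not details of any project described in this resume.

```java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import scala.Tuple2;

public class SalesByRegion {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("SalesByRegion");
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            // Read delimited text records from HDFS (path is a placeholder).
            JavaRDD<String> lines = sc.textFile("hdfs:///data/sales/input");

            // Transformation: map each record to (region, amount); the field layout is assumed.
            JavaPairRDD<String, Double> amounts = lines
                    .map(line -> line.split(","))
                    .filter(fields -> fields.length >= 3)
                    .mapToPair(fields -> new Tuple2<>(fields[0], Double.parseDouble(fields[2])));

            // Transformation: sum amounts per region, replacing a reduce-side aggregation in MapReduce.
            JavaPairRDD<String, Double> totals = amounts.reduceByKey(Double::sum);

            // Action: write results back to HDFS.
            totals.saveAsTextFile("hdfs:///data/sales/output");
        }
    }
}
```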
TECHNICAL SKILLS
Big Data/Hadoop: HDFS, Hadoop MapReduce, ZooKeeper, Hive, Pig, Sqoop, Flume, Oozie, Spark, Apache Kafka, Apache Solr
Cloud Computing: Amazon Web Services
Methodologies: Agile, Waterfall model, SDLC
Java/J2EE Technologies: J2EE, JDBC, JSP, SOA, REST and SOAP Web Services
Frameworks: Hibernate, Spring MVC, CSS3, ANT, Maven, JSON
Databases: Oracle (SQL & PL/SQL), MySQL, HBase, Cassandra
Servers: Tomcat
Version Control: SVN
IDE: Eclipse, Net Beans
PROFESSIONAL EXPERIENCE
Confidential, Columbus, OH
Big Data Developer
Responsibilities:
- Involved in the complete SDLC of a big data project, including requirement analysis, design, development, testing, and production.
- Extensively used Sqoop to import/export data between RDBMS and Hive tables, including incremental imports, and created Sqoop scripts keyed on last saved value or last modified date.
- Created Hive generic UDFs to implement business requirements (see the Java sketch after this list).
- Used Impala for faster retrieval of data from Hive tables and ran the INVALIDATE METADATA command to refresh Impala's view of updated tables.
- Involved in creating Hive tables, loading data into them, and writing Hive queries that run on the MapReduce execution engine.
- Experienced with different compression techniques to save storage and optimize network transfer, using Snappy compression for Hive tables.
- Used the RegEx, JSON, and Avro SerDes packaged with Hive for serialization and deserialization to parse the contents of streamed log data, and implemented custom Hive UDFs.
- Customized reducer parallelism in Pig using SET commands.
- Developed custom MapReduce programs to analyze data and used Pig Latin to clean unwanted data.
- Hands-on in creating MapReduce programs for various data analyses.
- Expertise in provisioning cloud resources such as EC2 instances, S3 storage, and Kinesis streams for big data analysis.
- Implemented custom Flume interceptors to filter data and defined channel selectors to multiplex the data into different sinks.
- Expertise in using semi joins for faster retrieval of data.
- Experience using Apache Kafka for streaming data, working extensively with producers, brokers, and consumers.
- Used Spark transformations and actions to create RDDs and perform aggregation operations on the data set.
- Used Pig for filtering and cleansing the data and performing aggregations.
- Experienced in using Solr in Hadoop to retrieve records from large data sets through full-text search.
- Expertise in using Pig input/output commands, relational operators, and advanced relational operators.
- Analyzed the data by running Hive queries and Pig scripts to understand user behavior.
- Responsible for adding users to access the ecosystem using Cloudera Manager.
- Created dashboards using joins, calculated fields, parameters, groups, sets, and hierarchies in Tableau.
- Implemented data blending to merge two different data sets in Tableau 9.1.
- Built complex formulas for various business calculations using calculated fields in Tableau 9.1.
- Created dashboards using pie, donut, waterfall, bubble, and other chart types in Tableau 9.1.
- Used actions, filters, and parameters in Tableau for dynamic functionality.
- Migrated dashboards from Tableau 8.2 to 9.1.
- Responsible for creating groups and assigning privileges as a Tableau administrator.
- Provided 24/7 production support for Tableau users
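Below is a minimal sketch of a Hive generic UDF in the spirit of the bullet above. The function name and the masking rule it applies are hypothetical rather than the actual business logic, and evaluate is simplified to convert the argument via toString() instead of using an ObjectInspector.

```java
import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
import org.apache.hadoop.hive.ql.exec.UDFArgumentLengthException;
import org.apache.hadoop.hive.ql.metadata.HiveException;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDF;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
import org.apache.hadoop.io.Text;

/** Hypothetical generic UDF that masks all but the last four characters of a string column. */
public class MaskStringUDF extends GenericUDF {

    @Override
    public ObjectInspector initialize(ObjectInspector[] arguments) throws UDFArgumentException {
        if (arguments.length != 1) {
            throw new UDFArgumentLengthException("mask_string() takes exactly one argument");
        }
        // Output is a string regardless of the input's primitive string type.
        return PrimitiveObjectInspectorFactory.writableStringObjectInspector;
    }

    @Override
    public Object evaluate(DeferredObject[] arguments) throws HiveException {
        Object value = arguments[0].get();
        if (value == null) {
            return null;
        }
        String input = value.toString();
        int visible = Math.min(4, input.length());
        return new Text("****" + input.substring(input.length() - visible));
    }

    @Override
    public String getDisplayString(String[] children) {
        return "mask_string(" + children[0] + ")";
    }
}
```

A UDF like this would typically be packaged in a jar, added with ADD JAR, and registered with CREATE TEMPORARY FUNCTION mask_string AS 'MaskStringUDF'.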
Environment: Hadoop 2.x, HDFS, Map Reduce, Hive, Flume, Sqoop, PIG, Java (JDK 1.6), Eclipse, MySQL, CDH 4.7, Oracle, Shell Scripting, Spark, Tableau 9.1
Confidential, Beaverton, OR
Hadoop Developer
Responsibilities:
- Developed various Big Data workflows using custom MapReduce, Pig, Hive, Sqoop, and Flume.
- Used Flume to import logs into HDFS from different streaming sources.
- Implemented complex MapReduce programs to perform map-side joins using the distributed cache.
- Developed MapReduce jobs to preprocess data.
- Created Hive tables with appropriate static and dynamic partitions for efficiency and performed operations using HiveQL.
- Designed and implemented custom Writables, custom input formats, custom partitioners, and custom comparators.
- Used Sqoop to import data from RDBMS into Hive tables.
- Created Hive UDFs and Pig UDAFs to implement various business definitions.
- Monitored workload and job performance and performed capacity planning using Cloudera Manager.
- Responsible for managing data coming from different data sources.
- Involved in loading structured and unstructured data using Hive and Pig.
- Wrote Pig scripts for advanced analytics on the data for recommendations.
- Used the Oozie workflow engine to run multiple Pig and Hive jobs.
- Implemented helper classes that access HBase directly from Java using the HBase Java API to perform CRUD operations (see the Java sketch after this list).
- Integrated MapReduce with HBase to bulk-import data into HBase using MapReduce programs.
- Developed Spark scripts in Scala for faster processing of huge data sets.
- Responsible for creating Dashboards using Tableau 8.2
- Designed and developed many complex reports based on the requirements, using row-level calculations, aggregate calculations, filter actions, URL actions, trend lines, forecast lines, relationships, data blending, sorting, groupings, and live connections.
- Collaborated closely with both the onsite and offshore teams.
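Below is a minimal sketch of an HBase CRUD helper of the kind the bullet above describes. The table name, column family, and qualifiers are hypothetical, and the sketch uses the HBase 1.x Connection/Table client API, which may differ from the older HTable-based API shipped with the CDH version listed in the environment.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

/** Hypothetical helper that performs basic CRUD operations against an HBase table. */
public class CustomerHBaseDao implements AutoCloseable {

    private static final TableName TABLE = TableName.valueOf("customers");
    private static final byte[] CF = Bytes.toBytes("info");

    private final Connection connection;

    public CustomerHBaseDao() throws IOException {
        Configuration conf = HBaseConfiguration.create();
        this.connection = ConnectionFactory.createConnection(conf);
    }

    /** Create/update: write one column value for the given row key. */
    public void upsert(String rowKey, String qualifier, String value) throws IOException {
        try (Table table = connection.getTable(TABLE)) {
            Put put = new Put(Bytes.toBytes(rowKey));
            put.addColumn(CF, Bytes.toBytes(qualifier), Bytes.toBytes(value));
            table.put(put);
        }
    }

    /** Read: fetch one column value for the given row key, or null if absent. */
    public String get(String rowKey, String qualifier) throws IOException {
        try (Table table = connection.getTable(TABLE)) {
            Result result = table.get(new Get(Bytes.toBytes(rowKey)));
            byte[] value = result.getValue(CF, Bytes.toBytes(qualifier));
            return value == null ? null : Bytes.toString(value);
        }
    }

    /** Delete: remove the entire row for the given row key. */
    public void delete(String rowKey) throws IOException {
        try (Table table = connection.getTable(TABLE)) {
            table.delete(new Delete(Bytes.toBytes(rowKey)));
        }
    }

    @Override
    public void close() throws IOException {
        connection.close();
    }
}
```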
Environment: Hadoop 2.x, MapReduce, HDFS, Hive, Pig, Linux, XML, Eclipse, Cloudera CDH 4.7, SQL Server, Oracle 11i, MySQL, Flume, Spark, Oozie, HBase, Tableau 8.2
Confidential
Java Developer
Responsibilities:
- Involved in design and development phase of the application.
- Used Hibernate to develop the DAO layer, which performs all DDL and DML operations for the services.
- Used JDBC to establish connections between the application and the database (see the Java sketch after this list).
- Used XML to map pages and classes and to transfer data among different data sources.
- Created SQL tables and indexes, and wrote queries to update and manipulate data stored in the database.
- Created JSP pages, including JSP custom tags and other methods of JavaBean presentation, along with all HTML and graphical aspects of the site's user interface.
- Used Spring IoC to load policy details into a transfer object when a policy was requested.
- Developed code to handle web requests involving Request Handlers and Data Access Objects.
- Created the user interface using HTML, CSS and JavaScript.
- Created/modified shell scripts for scheduling and automating tasks
- Involved in unit testing and documentation, using the JUnit framework for testing.
- Handled requests and worked in an Agile process
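Below is a minimal sketch of the plain-JDBC data access pattern referenced above; the connection URL, credentials, table, and columns are placeholders rather than details from the actual application.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

/** Hypothetical DAO that reads and updates policy records over plain JDBC. */
public class PolicyDao {

    // Placeholder connection details; real values would come from application configuration.
    private static final String URL = "jdbc:oracle:thin:@//dbhost:1521/APPDB";
    private static final String USER = "app_user";
    private static final String PASSWORD = "changeit";

    /** Read: return the holder name for a policy, or null if the policy does not exist. */
    public String findHolderName(long policyId) throws SQLException {
        String sql = "SELECT holder_name FROM policies WHERE policy_id = ?";
        try (Connection conn = DriverManager.getConnection(URL, USER, PASSWORD);
             PreparedStatement stmt = conn.prepareStatement(sql)) {
            stmt.setLong(1, policyId);
            try (ResultSet rs = stmt.executeQuery()) {
                return rs.next() ? rs.getString("holder_name") : null;
            }
        }
    }

    /** Update: change a policy's status and return the number of rows affected. */
    public int updateStatus(long policyId, String status) throws SQLException {
        String sql = "UPDATE policies SET status = ? WHERE policy_id = ?";
        try (Connection conn = DriverManager.getConnection(URL, USER, PASSWORD);
             PreparedStatement stmt = conn.prepareStatement(sql)) {
            stmt.setString(1, status);
            stmt.setLong(2, policyId);
            return stmt.executeUpdate();
        }
    }
}
```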
Environment: Eclipse, Spring, JSP, HTML, CSS, JavaScript, jQuery, SQL, JDBC