Big Data Engineer Resume
Bentonville, AR
SUMMARY:
- Around 7 years of experience in the IT industry spanning big data environments, the Hadoop ecosystem, Tableau Desktop, and SQL, covering requirement gathering, analysis, development, and maintenance of applications built with Java/J2EE technologies such as Servlets, JSP, and Spring.
- Excellent understanding of Hadoop architecture, its daemons, and components such as HDFS, YARN, ResourceManager, NodeManager, NameNode, and DataNode, on both HDP and CDH distributions.
- Experience developing MapReduce jobs with the Java API in Hadoop.
- Implemented data ingestion between RDBMS and HDFS in both directions using Sqoop.
- Handled structured data using Hive.
- Wrote ad-hoc queries to move data from HDFS into Hive and analyzed it using HiveQL.
- Experience writing custom UDFs in Java to extend Hive and Pig functionality.
- Good knowledge of serialization formats such as SequenceFile, Avro, and Parquet.
- Worked with RDBMSs including MySQL and Oracle.
- Scheduled workflows using the Oozie workflow engine.
- Managed authentication and authorization for Hadoop cluster users with Kerberos and Sentry.
- Reimplemented MapReduce-style jobs using Spark and Spark SQL with Scala (see the sketch after this list).
- Experienced with real-time data processing mechanisms in the big data ecosystem, such as Spark Streaming and Flume.
- Working experience with NoSQL databases such as HBase.
- Working experience as a Tableau developer; created highly interactive data visualization reports and dashboards using features such as data blending, calculations, filters, actions, parameters, maps, extracts, context filters, sets, aggregate measures, bar/line/pie charts, scatter plots, Gantt charts, bubbles, histograms, bullets, heat maps, and highlight tables.
- Extensive experience building dashboards in Tableau; involved in performance tuning of reports and resolving issues within Tableau Server.
- Created Tableau data extracts to make visualizations perform better and improve dashboards.
- Utilized Tableau tools including Tableau Desktop, Tableau Public, and Tableau Reader.
- Experienced in creating Oracle BI Publisher Report Templates and reports using BI Publisher Desktop Edition.
- Wrote SQL scripts and queries to extract data from various data sources into SSRS, visualization tools, and Excel reports.
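A minimal sketch of the Spark SQL with Scala work described above, combining a registered UDF (the Hive UDFs mentioned were written in Java; a Spark SQL UDF in Scala is shown here as the closest single-language analog) with an ad-hoc unique-visitor query. The table and column names (web_logs, page, visitor_id) are hypothetical:

```scala
import org.apache.spark.sql.SparkSession

object VisitAnalysis {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("VisitAnalysis")
      .enableHiveSupport()          // read Hive-managed tables
      .getOrCreate()

    // A simple UDF, analogous in spirit to a custom Hive UDF in Java.
    spark.udf.register("normalize_page", (p: String) => p.trim.toLowerCase)

    // Ad-hoc HiveQL-style analysis: unique visitors per page.
    spark.sql(
      """SELECT normalize_page(page) AS page,
        |       COUNT(DISTINCT visitor_id) AS unique_visitors
        |FROM web_logs
        |GROUP BY normalize_page(page)
        |ORDER BY unique_visitors DESC
        |""".stripMargin).show(20)

    spark.stop()
  }
}
```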
TECHNICAL SKILLS:
Hadoop Ecosystem: Hadoop, MapReduce, Hive, Sqoop, HBase, Oozie
Programming: Scala, SQL, Java
Reporting Tools: Tableau suite of tools (Desktop, Server, Online, and Public), BI Publisher, SSRS
Databases: Oracle, MS SQL Server
Spark Technologies: Spark Core, Spark SQL, Spark Streaming
PROFESSIONAL EXPERIENCE:
Confidential, Bentonville, AR
Big Data Engineer
Responsibilities:
- Analyzing the Hadoop cluster and various big data analytics and processing tools, including Hive, Sqoop, Spark with Scala and Java, and Spark Streaming.
- Contributing to the architecture and design of a distributed time-series database platform built on NoSQL technologies such as HBase on Hadoop.
- Writing HiveQL to analyze unique-visitor counts and visit information such as page views and most-visited pages.
- Supporting MapReduce programs running on the cluster and developing multiple MapReduce jobs in Java for data cleaning and pre-processing.
- Implementing large-volume data ingestion into HDFS using Sqoop.
- Using Pig for transformations, event joins, and pre-aggregations performed before loading JSON-format files onto HDFS.
- Resolving performance issues in Hive by understanding MapReduce physical-plan execution and using debugging commands to run code in an optimized way.
- Applying partitioning and bucketing concepts in Hive and designing both managed and external tables to optimize performance (see the DDL sketch after this section).
- Configuring Spark Streaming in Scala to receive real-time data from Apache Kafka and store the stream to HDFS (see the streaming sketch after this section).
- Using Spark to perform analytics on data in Hive; experienced with ETL across Hive and MapReduce.
Environment: HDFS, MapReduce, Spark Streaming, Spark Core, Spark SQL, Scala, Hive, Sqoop, JSON, HBase
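A sketch of the managed vs. external table distinction with partitioning and bucketing, issued as Hive DDL through a Hive-enabled SparkSession; the table names, columns, and HDFS location are illustrative assumptions, not the actual schema:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().enableHiveSupport().getOrCreate()

// External table: Hive tracks only metadata; data stays at the HDFS location.
spark.sql("""
  CREATE EXTERNAL TABLE IF NOT EXISTS raw_events (
    event_id STRING, payload STRING)
  PARTITIONED BY (event_date STRING)
  STORED AS PARQUET
  LOCATION 'hdfs:///data/raw_events'
""")

// Managed table, bucketed to speed up joins and sampling on user_id.
spark.sql("""
  CREATE TABLE IF NOT EXISTS curated_events (
    event_id STRING, user_id BIGINT, payload STRING)
  PARTITIONED BY (event_date STRING)
  CLUSTERED BY (user_id) INTO 32 BUCKETS
  STORED AS ORC
""")
```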
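A minimal Spark Streaming sketch for the Kafka-to-HDFS pipeline above, using the spark-streaming-kafka-0-10 integration; the broker address, topic, consumer group, and output path are placeholders:

```scala
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010._

object KafkaToHdfs {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("KafkaToHdfs")
    val ssc  = new StreamingContext(conf, Seconds(30))   // 30-second micro-batches

    val kafkaParams = Map[String, Object](
      "bootstrap.servers"  -> "broker1:9092",            // placeholder broker
      "key.deserializer"   -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id"           -> "clickstream-ingest",
      "auto.offset.reset"  -> "latest"
    )

    val stream = KafkaUtils.createDirectStream[String, String](
      ssc,
      LocationStrategies.PreferConsistent,
      ConsumerStrategies.Subscribe[String, String](Seq("clickstream"), kafkaParams)
    )

    // Persist each micro-batch of message values under an HDFS prefix.
    stream.map(_.value).saveAsTextFiles("hdfs:///data/clickstream/batch")

    ssc.start()
    ssc.awaitTermination()
  }
}
```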
Confidential, Richmond, VA
Big Data Analyst
Responsibilities:
- Evaluated the suitability of Hadoop and its ecosystem for the project and implemented various proof-of-concept (POC) applications on both distributed data centers and cloud-based services to adopt them as part of the Big Data Hadoop initiative.
- Estimated software and hardware requirements for the NameNode and DataNodes and planned the cluster.
- Integrated Hive with HBase, loaded data into HDFS, and bulk-loaded the cleaned data into HBase.
- Wrote MapReduce programs and Hive UDFs in Java where the required functionality was too complex for plain HiveQL.
- Involved in loading data from the Linux file system to HDFS.
- Developed Hive queries to analyze and categorize items.
- Developed Talend jobs on the big data platform.
- Created Talend jobs and incremental entity feeds based on Unix file patterns and loaded the transformed data into the Hadoop file system.
- Designed and created Hive external tables using a shared metastore instead of the default Derby store, with partitioning, dynamic partitioning, and buckets (see the sketch after this section).
- Responsible for improving data quality and for designing and presenting conclusions drawn from data analysis, using Microsoft Excel as a statistical tool.
- Delivered a Flume POC to handle real-time log processing for attribution reports.
- Performed sentiment analysis on product reviews from the client's website.
- Tested Spark on real-time data and performed frequent-itemset mining by implementing association-rule mining (see the FP-Growth sketch after this section).
- Exported the resulting sentiment-analysis data to Tableau for dashboard creation.
- Maintained system integrity of all subcomponents (primarily HDFS, MapReduce, HBase, and Hive).
- Reviewed peers' table creation in Hive, along with data loading and queries.
- Responsible for managing test data coming from different sources.
- Involved in scheduling the Oozie workflow engine to run multiple Hive jobs.
- Held weekly meetings with technical collaborators and participated actively in code-review sessions with senior and junior developers.
- Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
- Involved in unit, interface, system, and user-acceptance testing of the workflow tool.
Environment: Apache Hadoop, HDFS, Hive, MapReduce, Java, Flume, Hortonworks, Cloudera, Oozie, MySQL, UNIX.
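A short sketch of the dynamic-partition loading referenced above, issued through a Hive-enabled SparkSession; the sales_by_day and staging_sales tables and their columns are hypothetical:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().enableHiveSupport().getOrCreate()

// Allow Hive to derive partition values from the data itself.
spark.sql("SET hive.exec.dynamic.partition = true")
spark.sql("SET hive.exec.dynamic.partition.mode = nonstrict")

// Each distinct value of order_date in the SELECT becomes its own partition.
spark.sql("""
  INSERT OVERWRITE TABLE sales_by_day PARTITION (order_date)
  SELECT order_id, customer_id, amount, order_date
  FROM staging_sales
""")
```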
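For the association-rule mining above, a sketch using Spark MLlib's FP-Growth, one common implementation of frequent-itemset and association-rule mining (the bullet does not name the exact algorithm used, so this is an illustrative stand-in). The sample baskets are invented:

```scala
import org.apache.spark.ml.fpm.FPGrowth
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("AssocRules").getOrCreate()
import spark.implicits._

val transactions = Seq(
  Seq("tv", "hdmi_cable", "soundbar"),
  Seq("tv", "hdmi_cable"),
  Seq("laptop", "mouse"),
  Seq("tv", "soundbar")
).toDF("items")

val model = new FPGrowth()
  .setItemsCol("items")
  .setMinSupport(0.3)      // itemset must appear in >= 30% of baskets
  .setMinConfidence(0.5)   // rule must hold at least half the time
  .fit(transactions)

model.freqItemsets.show()      // frequent itemsets with counts
model.associationRules.show()  // antecedent -> consequent rules
```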
Confidential, Albany, NY
Hadoop Developer
Responsibilities:
- Worked on a Hadoop cluster of 83 nodes with 896 TB of capacity.
- Worked on MapReduce jobs and Hive.
- Involved in requirement analysis, design, and development.
- Imported and exported data between an existing SQL Server instance and Hive/HBase using Sqoop.
- Processed unstructured data using Hive.
- Implemented partitioning, dynamic partitions, and buckets in Hive.
- Developed Hive queries and Spark SQL queries to analyze large datasets.
- Exported result sets from Hive to MySQL using Sqoop (a related Spark JDBC sketch follows this section).
- Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries.
- Gained experience in managing and reviewing Hadoop log files.
- Involved in scheduling the Oozie workflow engine to run multiple Hive jobs.
- Used HBase as the NoSQL database.
- Actively involved in code review and bug fixing to improve performance.
Environment: Hadoop, HDFS, Hive, MapReduce, Sqoop, Flume, Linux, HBase, Java, Oozie.
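The Hive-to-MySQL export above was done with Sqoop; to keep all examples in one language, here is a hedged Scala analog using Spark's JDBC writer, which performs the same kind of transfer. The connection URL, table names, and credentials are placeholders, and the MySQL JDBC driver is assumed to be on the classpath:

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}

val spark = SparkSession.builder().enableHiveSupport().getOrCreate()

spark.table("daily_summary")           // Hive result table (hypothetical)
  .write
  .format("jdbc")
  .option("url", "jdbc:mysql://dbhost:3306/reports")
  .option("dbtable", "daily_summary")
  .option("user", "etl_user")
  .option("password", sys.env.getOrElse("DB_PASSWORD", ""))
  .mode(SaveMode.Append)
  .save()
```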
Confidential, Houston, TX
Tableau Developer/Hadoop Developer
Responsibilities:
- Effectively interacted with business analysts and defined mapping documents and the design process for various sources and targets.
- Involved in creating dashboards by extracting data from SQL Server sources.
- Prepared dashboards using calculated fields, parameters, calculations, groups, sets, and hierarchies in Tableau.
- Attended meetings with developers, site administrators, and business users as needed to discuss Tableau.
- Created dashboards using parameters, sets, groups, and calculations.
- Involved in creating interactive dashboards and applied filter, highlight, and URL actions to them.
- Involved in creating calculated fields, mappings, and hierarchies.
- Effectively utilized data blending when merging different sources.
- Created and modified interactive dashboards and guided navigation links within them.
- Combined views and reports into interactive dashboards in Tableau Desktop that were presented to Business Users, Program Managers, and End Users.
- Created drill through reports in dashboard using Tableau Desktop.
- Worked on extraction, transformation, and loading of data directly from different source systems such as flat files and Excel.
- Extensive knowledge of Tableau Server concepts, such as using tabcmd to export dashboards to PPT decks and to create users, sites, and projects.
- Integrated Tableau with a Hadoop data source to build dashboards providing insights into the organization's sales (see the sketch after this section).
- Worked on Spark while building BI reports in Tableau, integrating with Spark via Spark SQL/Shark.
- Reviewed SQL queries and edited inner, left, and right joins in Tableau Desktop by connecting live/dynamic and static datasets.
- Involved in testing SQL scripts for report development, Tableau reports, and dashboards, and handled performance issues effectively.
- Tested dashboards to ensure the data matched business requirements and to catch changes in the underlying data.
- Reported progress and issues to the project leader/manager in weekly meetings.
Environment: Windows XP, Tableau Desktop (8, 9), Tableau Server, Spark SQL, Hadoop, SQL Developer, MS (Access, Excel, Word)
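A sketch of one way the Tableau-Hadoop/Spark integration above can be wired: persist Spark SQL results as a Hive metastore table, which Tableau can then query through its Spark SQL/Hive connector. The table name and query are illustrative assumptions:

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}

val spark = SparkSession.builder().enableHiveSupport().getOrCreate()

// Aggregate sales with Spark SQL (hypothetical source table `sales`).
val salesByRegion = spark.sql("""
  SELECT region, SUM(amount) AS total_sales
  FROM sales
  GROUP BY region
""")

// saveAsTable registers the result in the Hive metastore, so external BI
// connectors see it as an ordinary table.
salesByRegion.write.mode(SaveMode.Overwrite).saveAsTable("sales_by_region")
```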
Confidential
Java Developer
Responsibilities:
- Developed online panels and applications using EJB, Java Servlets, and Session/Entity Beans.
- Handled the database persistence using JDBC.
- Used the Spring Framework and configured dependency injection for the Action classes in ApplicationContext.xml.
- Used JavaScript functions for custom validations.
- Wrote JSP form-bean validations using the Struts validation framework (validation.xml, validator-rules.xml) and message resources.
- Designed and developed a REST web service for address validation.
- Used the Criteria API and HQL for data extraction.
- Performed validations on UI data using JSF validations and JavaScript.
- Involved in implementing a rich user interface using JSP Standard Tag Libraries and worked with custom tags.
- Performed client-side validations with JavaScript functions to implement various features.
- Worked on ancillary technologies and tools: portal development, BPM, rules engines, security/SSO, and UML.
- Worked on designing and developing large, transactional, enterprise-class systems.
- Used JDBC for database connectivity and manipulation.
- Developed custom components, such as checkboxes, per application requirements.
Environment: Core Java, Java EE, Spring, WebLogic 10.x, Web Services, HTML, XML, XSL, JSTL, JSP, AJAX, SQL