Big Data Developer Resume
Naperville, IL
SUMMARY:
- Hadoop Developer with 2½ years of experience designing and implementing complete big data solutions using Pig, Hive, Impala, Sqoop, Flume, Spark, Kafka, ZooKeeper and Oozie.
- Microsoft BI developer with over 8 years of experience designing and implementing BI solutions using SSIS, SSRS and SSAS.
- Expert knowledge of the Hadoop ecosystem and its components, such as HDFS, JobTracker, NameNode, DataNode and MapReduce.
- Extensive experience with Cloudera clusters; limited exposure to AWS.
- Good working knowledge of Scala and basic knowledge of Java programming.
- Experience analyzing data using HiveQL, Pig Latin and Impala.
- Good experience implementing publisher-subscriber messaging systems using Kafka.
- Experience importing and exporting data between RDBMS systems and HDFS using Sqoop.
- Collected and aggregated large amounts of log data using Apache Flume and stored it in HDFS for further analysis.
- Worked through the complete Software Development Life Cycle (analysis, design, development, testing, implementation and support) using Agile methodologies.
- Strong hands-on experience creating database objects such as tables, stored procedures, functions and triggers in SQL Server.
- Excellent technical, communication, analytical, problem-solving and troubleshooting capabilities.
TECHNICAL SKILLS:
Big Data Ecosystem: HBase, Pig, Impala, Hive, Spark, Flume, Sqoop, Kafka, HDFS
Microsoft BI: SQL Server 2008/2012, SSRS, SSIS, SSAS
Programming Languages: Scala, Shell Scripts, PL/SQL, Java (basic)
Database: SQL Server, MySQL
Web Technologies: ASP.NET, SharePoint
PROFESSIONAL EXPERIENCE:
Confidential, Naperville IL
Big Data Developer
Responsibilities:
- Designed data collection, data flow and data processing to deliver data for reporting.
- Developed Sqoop jobs to import data from different RDBMS systems into HDFS.
- Created Hive tables to store the processed results in a tabular format.
- Developed Hive scripts to denormalize and aggregate the disparate data.
- Implemented external tables and partitions in Hive (see the sketch after this list).
- Implemented HBase tables for storing tabular data.
- Implemented streaming ingestion with Apache Kafka as the message broker and Spark to process the activity-stream data.
- Moved log/text files generated by various applications into HDFS using Flume.
- Developed Kafka-based producer/consumer services for streaming and persisting user-activity logs.
- Built dynamic workflows with the Oozie workflow manager.
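Code sketch (illustrative): a minimal HiveQL example of the external, partitioned table pattern described above. All table, column and path names are hypothetical, not taken from the project.

    -- External table partitioned by load date; the data files live at an HDFS path.
    CREATE EXTERNAL TABLE IF NOT EXISTS activity_summary (
        user_id     STRING,
        event_type  STRING,
        event_count BIGINT
    )
    PARTITIONED BY (load_date STRING)
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
    STORED AS TEXTFILE
    LOCATION '/data/processed/activity_summary';

    -- Aggregate one day of raw events into that day's partition.
    INSERT OVERWRITE TABLE activity_summary PARTITION (load_date = '2016-01-01')
    SELECT user_id,
           event_type,
           COUNT(*) AS event_count
    FROM   raw_activity
    WHERE  event_date = '2016-01-01'
    GROUP  BY user_id, event_type;

Because the table is external, dropping it removes only the metadata; the files under the LOCATION path are preserved.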
Environment: Hadoop, Hive, HBase, Sqoop, Pig, Oozie, Flume, Kafka, YARN and Spark
Confidential, Jacksonville FL
Big Data Analyst
Responsibilities:
- Built, developed and tested shared components used across many modules.
- Imported customer-specific personal data into Hadoop with Sqoop from various relational databases such as SQL Server and Oracle.
- Created tasks for incremental loads into staging tables and scheduled them to run.
- Used Apache Kafka as the messaging system to publish log data and data from UI applications under different topics, to be consumed by subscribers.
- Wrote MapReduce code to parse customer-related flat files and extract the meaningful (domain-specific) information for further processing.
- Created Hive external tables with partitioning to store the processed data from MapReduce.
- Implemented Hive optimized joins to gather data from different sources and ran ad-hoc queries on top of them (see the sketch after this list).
- Wrote Hive generic UDFs to perform business-logic operations at the record level.
- Developed Oozie workflows to automate loading data into HDFS and pre-processing it with Pig, and used ZooKeeper to coordinate the clusters.
- Involved in loading data from the Linux file system to HDFS.
- Worked with various file formats and compression codecs.
- Used Pig for data transformations, event joins, filtering and pre-aggregation before storing the data in HDFS.
- Implemented test scripts to support test driven development and continuous integration.
- Analyzed large data sets to determine the optimal way to aggregate and report on them.
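Code sketch (illustrative): one way to express a Hive optimized (map-side) join, as referenced above; the small dimension table is broadcast to the mappers, avoiding a reduce-side shuffle. All names are hypothetical.

    -- Allow Hive to convert eligible joins to map joins automatically.
    SET hive.auto.convert.join = true;

    -- Explicit MAPJOIN hint: broadcast the small customers table.
    SELECT /*+ MAPJOIN(c) */
           c.region,
           COUNT(DISTINCT t.customer_id) AS active_customers,
           SUM(t.amount)                 AS total_amount
    FROM   transactions t
    JOIN   customers c
      ON   t.customer_id = c.customer_id
    GROUP  BY c.region;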
Environment: Cloudera, Hadoop, MapReduce, HDFS, Hive, Pig, Oozie, Sqoop, Apache Kafka, Flume, Linux, Oracle, Netezza.
Confidential, Memphis TN
Hadoop Developer
Responsibilities:
- Worked with highly unstructured and semi-structured data (replication factor of 3).
- Involved in ETL, data integration and migration using SQL; imported data from SQL Server to HDFS on a regular basis using Sqoop.
- Wrote Pig Latin scripts to transform raw data from several data sources into baseline data.
- Wrote Hive queries for ad hoc data analysis to meet business requirements (see the sketch after this list).
- Resolved performance issues in Hive and Pig scripts by understanding how joins, groupings and aggregations translate into MapReduce jobs.
- Created Hive tables and worked with them using HiveQL; imported and exported data between SQL Server and HDFS using Sqoop.
- Worked with Oozie workflows for scheduling and orchestrating the ETL process.
- Managed and reviewed Hadoop log files.
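Code sketch (illustrative): a typical ad hoc HiveQL analysis query of the kind described above; filtering on the partition column prunes the scan to a single partition. Table and column names are hypothetical.

    SELECT status,
           COUNT(*)         AS order_count,
           AVG(order_total) AS avg_order_total
    FROM   orders
    WHERE  load_date = '2014-06-01'   -- partition column: limits the scan to one day
    GROUP  BY status
    ORDER  BY order_count DESC;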
Environment: Hadoop, Linux, MapReduce, HDFS, Hive, Pig, Shell Scripting, Sqoop, SQL Server 2012.
Confidential, Freeport ME
MS BI developer
Responsibilities:
- Contributed to the design and development of the data warehouse and BI portal.
- Designed and developed ETL packages using SSIS to load data into data warehouse from different transactional systems.
- Designed and developed multiple complex SSRS reports.
- Developed SQL stored procedures to read data for SSRS reports.
- Scheduled ETL jobs to run on predefined schedules, loading data into the data warehouse environment per SLA.
- Created views on top of complex table structures to expose data in a simple format for reports.
- Created DDL scripts to build the data warehouse structure (see the sketch after this list).
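Code sketch (illustrative): the kind of DDL used to build a warehouse structure, shown here as a minimal star schema with one dimension and one fact table. All object names are invented for the example.

    CREATE TABLE dbo.DimCustomer (
        CustomerKey  INT IDENTITY(1,1) PRIMARY KEY,
        CustomerID   NVARCHAR(20)  NOT NULL,  -- business key from the source system
        CustomerName NVARCHAR(100) NOT NULL,
        Region       NVARCHAR(50)  NULL
    );

    CREATE TABLE dbo.FactSales (
        SalesKey    BIGINT IDENTITY(1,1) PRIMARY KEY,
        CustomerKey INT NOT NULL REFERENCES dbo.DimCustomer (CustomerKey),
        OrderDate   DATE NOT NULL,
        Quantity    INT NOT NULL,
        SalesAmount DECIMAL(18,2) NOT NULL
    );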
Environment: SharePoint 2010, SQL Server 2012, SSRS, SSIS, SSAS, MDX, DB2
Confidential, Richmond VA
SQL Developer
Responsibilities:
- Contributed to designing the logical data model.
- Developed DDL scripts to create the physical data model.
- Created SSIS packages to import data from Excel sheets into SQL tables.
- Developed SQL stored procedures as part of the calculation engines that perform tax allocation and calculation.
- Developed stored procedures to return the required output to the UI.
- Developed views on complex table structures to help the UI team create summary and detail views.
- Created wrapper stored procedures to run multiple stored procedures as a single process (see the sketch after this list).
- Created SQL test scripts to exercise the stored procedures with the required test data.
- Designed and developed multiple SSRS reports to provide allocation/calculation status to users.
- Developed SSIS packages to export data into Excel sheets.
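Code sketch (illustrative): a wrapper stored procedure of the kind described above, chaining two calculation-engine procedures into a single process inside one transaction, with SQL Server 2008 compatible error handling. All procedure and parameter names are invented.

    CREATE PROCEDURE dbo.usp_RunTaxAllocationProcess
        @TaxYear INT
    AS
    BEGIN
        SET NOCOUNT ON;
        BEGIN TRY
            BEGIN TRANSACTION;

            EXEC dbo.usp_AllocateTax  @TaxYear = @TaxYear;  -- step 1: allocation
            EXEC dbo.usp_CalculateTax @TaxYear = @TaxYear;  -- step 2: calculation

            COMMIT TRANSACTION;
        END TRY
        BEGIN CATCH
            IF @@TRANCOUNT > 0
                ROLLBACK TRANSACTION;
            DECLARE @msg NVARCHAR(2048) = ERROR_MESSAGE();
            RAISERROR(@msg, 16, 1);  -- re-raise so the caller sees the failure
        END CATCH
    END;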
Environment: ASP.NET, SQL Server 2008, SSRS, SSIS
Confidential
SQL Developer
Responsibilities:
- Involved in requirements analysis, design, development and testing of different applications.
- Designed, developed and deployed different data warehouse models using SQL Server.
- Involved in designing and developing SSIS packages using SQL Server for integration jobs.
- Developed SSRS reports on the data warehouse and transactional databases.
- Developed SSAS cubes for multidimensional SSRS reports.
- Used a SQL Server database to store data and execute stored procedures on the backend (see the sketch after this list).
- Designed, developed and maintained scripts for various database objects.
- Prepared and maintained the test environment.
- Tested the application before it went live to production.
- Attended weekly meetings with team leads and the manager to discuss issues and project status.
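Code sketch (illustrative): a parameterized read procedure of the kind executed from the ASP.NET backend; table, column and parameter names are invented for the example.

    CREATE PROCEDURE dbo.usp_GetOrdersByCustomer
        @CustomerID INT,
        @FromDate   DATETIME,
        @ToDate     DATETIME
    AS
    BEGIN
        SET NOCOUNT ON;

        SELECT o.OrderID,
               o.OrderDate,
               o.Status,
               o.OrderTotal
        FROM   dbo.Orders o
        WHERE  o.CustomerID = @CustomerID
          AND  o.OrderDate BETWEEN @FromDate AND @ToDate
        ORDER  BY o.OrderDate DESC;
    END;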
Environment: ASP.NET, C#, SQL Server 2008, SSRS, SSIS, SSAS