
Big Data Developer Resume


NJ

SUMMARY

  • Over 6 years of software development experience including 3 years of expertise in Big Data technologies.
  • Experience in Hadoop Ecosystems: Hive, Pig, HBase, Sqoop, Impala and Spark.
  • Developed Hive scripts to meet end-user and analyst requirements for ad hoc analysis.
  • Experience in writing Pig scripts to transform raw data from several data sources into baseline data.
  • Excellent understanding of partitioning and bucketing concepts in Hive; designed both managed and external tables in Hive to optimize performance.
  • Solved performance issues in Hive and Pig scripts by understanding joins, grouping, and aggregation, and how they translate to MapReduce jobs.
  • Developed Oozie workflows for scheduling and orchestrating the ETL process.
  • Expertise in NoSQL databases: HBase.
  • Migrated batch processing system from Mainframes to Hadoop platform
  • Good knowledge in Tableau and Talend
  • Experience with Hadoop shell commands.
  • Experience in Microsoft Business Intelligence Tools SSIS, SSAS and SSRS.
  • Expert in designing and developing SQL Server Integration Services (SSIS) packages using various control flow tasks and data flow transformations.
  • Strong experience in developing custom reports and different types of tabular, matrix, ad hoc, and distributed reports in multiple formats using SQL Server Reporting Services (SSRS).
  • Well-versed in Agile and other SDLC methodologies
  • Experience in all the phases of BI life cycle involving Requirement Analysis, Design, Coding, Testing, and Deployment
  • Excellent knowledge of Informatica PowerCenter 9.x/8.x.
  • Adept at working as part of a team or independently with minimal to no supervision.
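One point above, how grouping and aggregation translate to MapReduce jobs, can be sketched in a few lines of Python: a Hive `GROUP BY ... SUM` compiles down to a map phase that emits key/value pairs, a shuffle that collects values by key, and a reduce phase that aggregates each group. The table and column names are invented for illustration.

```python
from collections import defaultdict

# Input rows, as Hive would scan them from a table (schema invented for illustration).
rows = [
    {"dept": "ER", "cost": 300.0},
    {"dept": "ER", "cost": 200.0},
    {"dept": "ICU", "cost": 900.0},
]

# Map phase: emit one (group key, value) pair per row.
mapped = [(row["dept"], row["cost"]) for row in rows]

# Shuffle phase: the framework groups emitted pairs by key.
groups = defaultdict(list)
for key, value in mapped:
    groups[key].append(value)

# Reduce phase: aggregate each group -- the equivalent of
# SELECT dept, SUM(cost) FROM visits GROUP BY dept.
totals = {key: sum(values) for key, values in groups.items()}
print(totals)
```

A join follows the same shape: both sides emit pairs keyed on the join column, and the reducer pairs up the values that meet at each key, which is why skewed keys concentrate work on a single reducer.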

TECHNICAL SKILLS

Big Data Technologies: Hadoop, Hive, MapReduce, HBase, Pig, Spark, Oozie, Sqoop, Flume, YARN, Cloudera, Impala

Databases: SQL Server, MySQL, Oracle, HBase

Reporting Tools: Tableau, SSRS

ETL Tools: SSIS, Informatica, Talend

Other: SSAS, Java

PROFESSIONAL EXPERIENCE

Confidential - NJ

Big Data Developer

Responsibilities:

  • Worked on a live 50-node Hadoop production cluster (CDH 5).
  • Developed Hive scripts to load data into Hive tables for processing.
  • Created joins, partitions, applied data filter conditions and lookups on Hive temp tables before loading to target tables.
  • Used window functions to calculate average, min, max, dense rank, rank, lead, and lag per specifications provided by the analytics team.
  • Created HBase tables and implemented bulk load process using TSV bulk loader to load data from processed files to HBase tables.
  • Involved in migrating MapReduce programs into Spark transformations.
  • Designed and created Oozie workflows to schedule and automate Hive and HBase jobs.
  • Used Impala to query tables for ad-hoc data requests.
  • Optimized performance by Partitioning and Bucketing in Hive.
  • Involved in deploying and testing Hadoop solutions in different environments.
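The window-function work above can be illustrated against an in-memory SQLite database via Python's sqlite3 module (SQLite 3.25+ supports the same `RANK`, `DENSE_RANK`, and `LAG` syntax as Hive). The `visits` table and its columns are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (dept TEXT, patient TEXT, cost REAL)")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?, ?)",
    [("ER", "a", 300.0), ("ER", "b", 200.0), ("ER", "c", 200.0),
     ("ICU", "d", 900.0), ("ICU", "e", 500.0)],
)

# RANK leaves gaps after ties, DENSE_RANK does not; LAG looks at the
# previous row within the same partition.
rows = conn.execute("""
    SELECT dept, patient, cost,
           RANK()       OVER (PARTITION BY dept ORDER BY cost DESC) AS rnk,
           DENSE_RANK() OVER (PARTITION BY dept ORDER BY cost DESC) AS drnk,
           LAG(cost)    OVER (PARTITION BY dept ORDER BY cost DESC) AS prev_cost
    FROM visits
    ORDER BY dept, cost DESC, patient
""").fetchall()
for row in rows:
    print(row)
```

In Hive the same query runs unchanged except for the table name; partitioning the window by department keeps each department's ranking independent.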

Environment: CDH 5, HDFS, MapReduce, Hive, Sqoop, HBase, MySQL, Impala, Spark

Confidential

Responsibilities:

  • Loaded data from the IE Oracle data warehouse to HDFS using the data ingestion tool Sqoop.
  • Loaded data from Linux file system to HDFS.
  • Created data models, field layouts and Hive partitions based on user requirements.
  • Developed Hive DDL and data load scripts using the Hue editor.
  • Created a Hive SerDe to read fixed-width (position-delimited) data files.
  • Performance-tuned Hive queries.
  • Automated jobs for pulling data from FTP server to load data into Hive tables using Oozie workflow.
  • Worked with business analysts, source technical team and data scientist to understand source and target data requirements.

Environment: CDH 4 and 5, HDFS, MapReduce, Hive, Sqoop, R, Python, Impala, Linux, Oozie

Confidential - GA

Big Data Developer

Responsibilities:

  • Used Sqoop to load data from MDW Oracle data warehouse.
  • Developed shell scripts to load data from 6000+ Hospitals and Clinics into HDFS.
  • Designed and developed Pig Latin scripts and Pig command-line transformations for data joins and custom processing of MapReduce outputs.
  • Fine-tuned data load process to meet SLA by refactoring and performance tuning Pig scripts.
  • Created shell scripts for Pig workflows which were invoked from Control M.
  • After completing data transformations in Pig, created Hive scripts to load the data into Hive tables.
  • Worked with business analysts and data scientists to understand data mapping and transformation logic.
  • Assisted the Tableau reporting team in developing and tuning queries.
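The Pig pattern described above (JOIN two relations on a key, then GROUP and aggregate) can be sketched in plain Python; the relations and fields below are hypothetical stand-ins for the hospital data.

```python
from collections import defaultdict

# Two input relations, as Pig would LOAD them (records invented for illustration).
facilities = [("H1", "GA"), ("H2", "GA"), ("H3", "FL")]    # (facility_id, state)
admissions = [("H1", 10), ("H1", 5), ("H2", 7), ("H3", 4)]  # (facility_id, count)

# JOIN admissions BY facility_id, facilities BY facility_id
state_of = dict(facilities)
joined = [(fid, state_of[fid], n) for fid, n in admissions if fid in state_of]

# GROUP joined BY state, then SUM the counts per group
totals = defaultdict(int)
for _fid, state, n in joined:
    totals[state] += n

print(dict(totals))
```

In Pig the replicated-join variant (`JOIN ... USING 'replicated'`) does the same thing the `state_of` lookup table does here: it ships the small relation to every mapper, which is a common first step when tuning a skewed join to meet an SLA.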

Environment: CDH 4.5, HDFS, MapReduce, Pig, Hive, Sqoop, Oracle, R, Python, Impala, Linux

Confidential - NY

BI Developer and Production Support

Responsibilities:

  • Successfully handled migration of databases from JDE to SQL Server 2008, along with devising business intelligence metrics/measurements.
  • Developed SSIS packages to extract, transform, and load data from JD Edwards, RealPage, and Yardi into analytical databases.
  • Created SSIS Packages to export and import data from CSV files, Text files and Excel Spreadsheets.
  • Designed, developed, and deployed reports per business requirements in SSRS 2008.
  • Wrote complex stored procedures to implement reporting requirements and used them as data sources for SSRS reports.
  • Supported and maintained existing SSRS reports that were stored procedure-, query-, or MDX-based.
  • Created subscriptions to direct automated report delivery to recipients.
  • Kept up-to-date documentation for all developed reports and SSIS packages.

Confidential

Responsibilities:

  • Developed SSIS packages for Retail and Residential properties.
  • Designed, developed, and deployed reports per business requirements in SSRS 2008.
  • Worked with business analysts to understand business data requirements.
  • Involved in deploying and testing SSIS, SSAS, and SSRS reports in production.
  • Created calculated fields with MDX code to meet needs for additional measures derived from existing measures.
  • Designed, developed, and tested standardized PivotTable/PivotChart reports based on OLAP and offline cubes to satisfy ad hoc reporting needs from business users.

Confidential

Responsibilities:

  • Developed Budget variance reports using SSRS 2008.
  • Created stored procedures, user-defined functions, views, and T-SQL scripts for complex business logic.
  • Extensively used joins and sub-queries for complex queries involving multiple tables from different databases.
  • Created subscriptions and alerts for successful or unsuccessful completion of Scheduled Jobs.
  • Worked with SharePoint developers to configure dashboards, scorecards and reports via SharePoint site.
  • Supported SSRS and MDX reports in production.
  • Troubleshot and fixed ETL issues in SSIS packages in production.
  • Wrote scripts to perform frequent backups; performed backup, restore, and point-in-time recovery of databases as requested.
  • Responsible for creating Change Request documents for deploying fixes to production.
  • Provided internal IT project support.
  • Assisted with code deployments and documentation across Development, QA, and Production environments.
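A budget-variance report like the one above typically joins a budget table to a sub-query that pre-aggregates actuals. The sketch below runs such a query against an in-memory SQLite database via Python's sqlite3 module; the tables and figures are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE properties (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE budgets (property_id INTEGER, amount REAL);
    CREATE TABLE actuals (property_id INTEGER, amount REAL);
    INSERT INTO properties VALUES (1, 'Oak Tower'), (2, 'Elm Court');
    INSERT INTO budgets VALUES (1, 1000.0), (2, 800.0);
    INSERT INTO actuals VALUES (1, 600.0), (1, 500.0), (2, 700.0);
""")

# Budget-variance style query: join the budget table to a sub-query that
# pre-aggregates actuals, then compute the variance per property.
rows = conn.execute("""
    SELECT p.name, b.amount - a.total AS variance
    FROM properties p
    JOIN budgets b ON b.property_id = p.id
    JOIN (SELECT property_id, SUM(amount) AS total
          FROM actuals GROUP BY property_id) a
      ON a.property_id = p.id
    ORDER BY p.name
""").fetchall()
for row in rows:
    print(row)
```

Pre-aggregating actuals in the sub-query keeps the outer join one-row-per-property, which is the usual way to avoid double-counting when a property has many actual entries.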

Environment: MS SQL Server, SQL Server Reporting Services (SSRS), SQL Server Integration Services (SSIS), SQL Server Analysis Services (SSAS), JD Edwards, Yardi, RealPage, WebService, Windows 7, SharePoint
