Big Data Developer Resume
NJ
SUMMARY
- Over 6 years of software development experience including 3 years of expertise in Big Data technologies.
- Experience in Hadoop Ecosystems: Hive, Pig, HBase, Sqoop, Impala and Spark.
- Developed Hive scripts for end-user/analyst requirements to perform ad hoc analysis.
- Experience in writing Pig scripts to transform raw data from several data sources into baseline data.
- Excellent understanding of partitioning and bucketing concepts in Hive; designed both managed and external tables in Hive to optimize performance.
- Solved performance issues in Hive and Pig scripts with an understanding of joins, grouping, and aggregation, and how they translate to MapReduce jobs.
- Developed Oozie workflows for scheduling and orchestrating the ETL process.
- Expertise in NoSQL database - HBase
- Migrated batch processing system from Mainframes to Hadoop platform
- Good knowledge of Tableau and Talend.
- Experience with Hadoop shell commands.
- Experience in Microsoft Business Intelligence Tools SSIS, SSAS and SSRS.
- Expert in design and development of SQL Server Integration Services (SSIS) packages using various control flow tasks and data flow transformations.
- Strong experience in developing Custom Reports and different types of Tabular, Matrix, Ad-hoc and distributed reports in multiple formats using SQL Server Reporting Services (SSRS)
- Well-versed in Agile and other SDLC methodologies
- Experience in all the phases of BI life cycle involving Requirement Analysis, Design, Coding, Testing, and Deployment
- Excellent knowledge of Informatica PowerCenter 9.x/8.x.
- Adept at working as part of a team or independently with minimal to no supervision.
TECHNICAL SKILLS
Big Data Technologies: Hadoop, Hive, MapReduce, HBase, Pig, Spark, Oozie, Sqoop, Flume, YARN, Cloudera, Impala
Databases: SQL Server, MySQL, Oracle
Reporting Tools: Tableau, SSRS
ETL Tools: SSIS, Informatica, Talend
Other: SSAS, Java
PROFESSIONAL EXPERIENCE
Confidential - NJ
Big Data Developer
Responsibilities:
- Worked on a live Hadoop production CDH5 cluster with 50 nodes.
- Developed Hive scripts to load data into Hive tables for processing.
- Created joins, partitions, applied data filter conditions and lookups on Hive temp tables before loading to target tables.
- Used window functions to calculate average, mean, min, max, dense rank, rank, lead and lag as per specifications provided by analytics team.
- Created HBase tables and implemented bulk load process using TSV bulk loader to load data from processed files to HBase tables.
- Involved in migrating MapReduce programs into Spark transformations.
- Designed and created Oozie workflows to schedule and automate Hive and HBase jobs.
- Used Impala to query tables for ad-hoc data requests.
- Optimized performance by Partitioning and Bucketing in Hive.
- Involved in deploying and testing Hadoop solutions in different environments.
Environment: CDH 5, HDFS, MapReduce, Hive, Sqoop, HBase, MySQL, Impala, Spark
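The partitioning, bucketing, and window-function work above can be sketched in HiveQL; the table and column names here are illustrative placeholders, not the actual production schema:

```sql
-- Hypothetical partitioned, bucketed Hive table (names are illustrative)
CREATE TABLE sales_target (
  order_id    BIGINT,
  customer_id BIGINT,
  amount      DECIMAL(10,2)
)
PARTITIONED BY (load_date STRING)
CLUSTERED BY (customer_id) INTO 32 BUCKETS
STORED AS ORC;

-- Window functions of the kind computed per the analytics team's specs
SELECT customer_id,
       AVG(amount)  OVER (PARTITION BY customer_id)                     AS avg_amount,
       MAX(amount)  OVER (PARTITION BY customer_id)                     AS max_amount,
       RANK()       OVER (PARTITION BY customer_id ORDER BY amount)     AS amt_rank,
       LAG(amount)  OVER (PARTITION BY customer_id ORDER BY order_id)   AS prev_amount
FROM sales_target
WHERE load_date = '2015-06-01';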
Confidential
Responsibilities:
- Loaded data from IE Oracle Data Warehouse to HDFS using data ingestion tool Sqoop.
- Loaded data from Linux file system to HDFS.
- Created data models, field layouts and Hive partitions based on user requirements.
- Developed Hive DDL and data load scripts using Hue Editor
- Created Hive SerDe configurations to read fixed-width and position-delimited data files.
- Performance-tuned Hive queries.
- Automated jobs for pulling data from FTP server to load data into Hive tables using Oozie workflow.
- Worked with business analysts, source technical team and data scientist to understand source and target data requirements.
Environment: CDH 4 and 5, HDFS, MapReduce, Hive, Sqoop, R, Python, Impala, Linux, Oozie
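Reading fixed-width files through a Hive SerDe, as above, can be sketched with the built-in RegexSerDe; the table name, field names, and column widths below are assumptions for illustration:

```sql
-- Hypothetical external table over fixed-width records via RegexSerDe
-- (field names and widths are illustrative, not the actual feed layout)
CREATE EXTERNAL TABLE raw_feed_fixed (
  record_id  STRING,
  feed_date  STRING,
  amount     STRING
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
  "input.regex" = "(.{10})(.{8})(.{12})"  -- 10-, 8-, and 12-char fixed columns
)
STORED AS TEXTFILE
LOCATION '/data/raw/feed';
```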
Confidential - GA
Big Data Developer
Responsibilities:
- Used Sqoop to load data from MDW Oracle data warehouse.
- Developed shell scripts to load data from 6000+ Hospitals and Clinics into HDFS.
- Designed and developed Pig Latin scripts and Pig command line transformations for data joins and custom processing of MapReduce outputs.
- Fine-tuned data load process to meet SLA by refactoring and performance tuning Pig scripts.
- Created shell scripts for Pig workflows which were invoked from Control M.
- After completing data transformations in Pig, created Hive scripts to load the data into Hive tables.
- Worked with business analysts and data scientists to understand data mapping and transformation logic.
- Assisted Tableau reporting team in developing and tuning queries
Environment: CDH 4.5, HDFS, MapReduce, Pig, Hive, Sqoop, Oracle, R, Python, Impala, Linux
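The Pig-to-Hive handoff described above can be sketched in HiveQL; the staging directory, table names, and columns are hypothetical:

```sql
-- Hypothetical Hive script loading Pig job output into a partitioned table
CREATE EXTERNAL TABLE IF NOT EXISTS claims_staging (
  hospital_id STRING,
  claim_id    STRING,
  amount      DECIMAL(12,2)
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '/data/pig/claims_out';   -- directory written by the Pig workflow

CREATE TABLE IF NOT EXISTS claims (
  hospital_id STRING,
  claim_id    STRING,
  amount      DECIMAL(12,2)
)
PARTITIONED BY (load_dt STRING)
STORED AS ORC;

-- Load one partition per daily run
INSERT OVERWRITE TABLE claims PARTITION (load_dt = '2014-03-01')
SELECT hospital_id, claim_id, amount
FROM claims_staging;
```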
Confidential - NY
BI Developer and Production Support
Responsibilities:
- Successfully handled migration of databases from JDE to SQL Server 2008 along with devising Business Intelligence Metrics/Measurements.
- Developed SSIS packages to extract, transform, and load data from JD Edwards, RealPage, and Yardi into analytical databases.
- Created SSIS Packages to export and import data from CSV files, Text files and Excel Spreadsheets.
- Designed, developed, and deployed reports per business requirements in SSRS 2008.
- Wrote complex stored procedures to implement reporting requirements and used them as data sources for SSRS reports.
- Supported and maintained existing SSRS reports based on stored procedures, queries, or MDX.
- Created subscriptions to direct automated report delivery to recipients.
- Kept up-to-date documentation for all developed reports and SSIS packages.
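A parameterized stored procedure of the kind used as an SSRS data source can be sketched in T-SQL; the procedure, table, and column names are invented for illustration:

```sql
-- Hypothetical T-SQL stored procedure backing an SSRS report
CREATE PROCEDURE dbo.rpt_OccupancyByProperty
    @StartDate DATE,
    @EndDate   DATE
AS
BEGIN
    SET NOCOUNT ON;

    -- Leases active at any point in the reporting window
    SELECT p.PropertyName,
           COUNT(l.LeaseId)   AS ActiveLeases,
           SUM(l.MonthlyRent) AS TotalRent
    FROM dbo.Property p
    JOIN dbo.Lease    l ON l.PropertyId = p.PropertyId
    WHERE l.StartDate <= @EndDate
      AND (l.EndDate IS NULL OR l.EndDate >= @StartDate)
    GROUP BY p.PropertyName
    ORDER BY p.PropertyName;
END;
```

SSRS maps the report's date parameters directly onto `@StartDate` and `@EndDate` when the procedure is selected as the dataset query.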
Confidential
Responsibilities:
- Developed SSIS packages for Retail and Residential properties.
- Designed, developed, and deployed reports per business requirements in SSRS 2008.
- Worked with business analysts to understand business data requirements
- Involved in deploying and testing the SSIS, SSAS and SSRS reports to production.
- Created calculated fields with MDX code to meet additional measure needs derived from existing measures.
- Designed, developed, and tested standardized PivotTable/PivotChart reports based on OLAP cubes and offline cubes to satisfy ad hoc reporting needs of business users.
Confidential
Responsibilities:
- Developed Budget variance reports using SSRS 2008.
- Created stored procedures, user-defined functions, views, and T-SQL scripts for complex business logic.
- Extensively used Joins and sub-queries for complex queries involving multiple tables from different databases.
- Created subscriptions and alerts for successful or unsuccessful completion of Scheduled Jobs.
- Worked with SharePoint developers to configure dashboards, scorecards and reports via SharePoint site.
- Supported SSRS and MDX reports in production.
- Troubleshot and fixed ETL issues in SSIS packages in production.
- Wrote scripts to perform frequent backups; performed backup, restore, and point-in-time recovery of databases as requested.
- Responsible for creating Change Request documents for deploying fixes to production.
- Provided internal IT project support.
- Assisted with code deployments and documentation across Development, QA, and Production environments.
Environment: MS SQL Server, SQL Server Reporting Services (SSRS), SQL Server Integration Services (SSIS), SQL Server Analysis Services (SSAS), JD Edwards, Yardi, RealPage, WebService, Windows 7, SharePoint
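The backup and point-in-time recovery work above can be sketched in T-SQL; the database name, file paths, and timestamp are placeholders:

```sql
-- Hypothetical full + log backup and point-in-time restore sequence
BACKUP DATABASE ResidentialDW
TO DISK = N'D:\Backups\ResidentialDW_full.bak'
WITH INIT, COMPRESSION;

BACKUP LOG ResidentialDW
TO DISK = N'D:\Backups\ResidentialDW_log.trn';

-- Restore the full backup without recovery, then roll the log
-- forward to a specific moment
RESTORE DATABASE ResidentialDW
FROM DISK = N'D:\Backups\ResidentialDW_full.bak'
WITH NORECOVERY, REPLACE;

RESTORE LOG ResidentialDW
FROM DISK = N'D:\Backups\ResidentialDW_log.trn'
WITH STOPAT = '2012-11-01T02:00:00', RECOVERY;
```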