Big Data Analyst Resume
SUMMARY
- Seasoned business intelligence professional with experience across the entire software development life cycle. 10+ years of data processing experience. 1+ year of experience coding well-abstracted big data components.
- Technical knowledge and project experience with MapR Hadoop and its ecosystem, including Pig, Hive, HBase, and Sqoop. Used Pentaho Data Integration to solve many big data challenges.
- Knowledge of Spark Streaming, Splunk, and Elasticsearch. Exposure to working on AWS, VMware, EC2, and S3 storage. Attended hands-on training in Python and Pentaho BI Suite.
- Worked on the following open source modules: MySQL, Jaspersoft iReport, Pentaho Kettle, and Elasticsearch. Experienced working with agile modules.
- Worked on creating virtual diagnostic tools using supervised machine learning algorithms.
- Evaluated different big-data-ready ELT tools and recommended the one to adopt into the platform.
- Hands-on technical experience with products in e-commerce, retail and educational services, AML risk rating, mortgage/loan conduit, and smart analytic networks.
- Solved challenges faced in contract renewals by devising a strategy for data mashing. Legacy data integration took 3 days to crunch the data, whereas load distribution using Hadoop took only 8 hours.
- Authored snapshot facts, profile record facts, factless facts, slowly changing dimensions, and role-playing dimensions. Designed data marts and data warehouses from OLTP, ERP, and legacy systems. Worked on analytical tasks such as cohort reports, trend analysis, and tree visualization for financial systems.
- With no reporting tool available, created reports using Excel. Created a metadata layer using Perl, Excel, and shell modules.
- Created real-time business intelligence by designing an operational data store. For the ETL staging layer, designed single-level or multi-level staging depending on the complexity of incremental data loads.
TECHNICAL SKILLS
Systems: Windows, Solaris and AIX UNIX
Languages: Oracle PL/SQL, SQL, Shell Scripting, Perl Scripting
Applications: Banking, Finance, Manufacturing, Utilities
Other Tools: Pentaho Kettle 4.3, 5.0.7, Informatica 5.x, 7.x, Business Objects 5.x, 6.5, XIR2, 3.0 (Supervisor, Infoview, BCA, CMC, Web Intelligence, Desktop Intelligence, Designer, Xcelsius, Import Wizard, Query Builder, Report Conversion), Oracle Warehouse Builder 9.x, vi, SQL*Plus, ERWin, Jaspersoft iReport 2.x, AWS, S3, COGNOS Impromptu 6.6, JIRA
Databases: Oracle 8i, 9i, 10g, SQL Server 2000, Sybase 11.5 ASE, MySQL
PROFESSIONAL EXPERIENCE
Confidential
Big Data Analyst
Responsibilities:
- Conduct feasibility studies for big data components and suggest the right or an alternate usage as needed. Work out case study scenarios before components are accepted into the product. Manage and analyze Pentaho BI Suite bugs. Find alternate solutions for bugs whose fixes are not yet released.
- Enable and set up the audit database using Oracle as the backend, and tune the pre-defined reports. This enhanced the performance of the master and Carte Pentaho servers.
- Create and design the platform landing zone for external and internal customers. Ensure security and compliance while bringing in new customers.
- Manage logs, including rotation, retention, and consolidation.
- Understand and investigate the requirements of contract renewal customers. Aggregate Pig and Hive scripts to speed up queries running on MapR Hadoop, especially by optimizing data munging (500 million records joined with 50 million records).
- Standardize and automate big data governance across multiple customer support projects. This includes geo-based data and file restrictions, retention, and encryption.
- Plan, build, and code password encryption across the master and slave Pentaho servers.
- Plan fault tolerance for Hive servers. Code to alternate and automate the dynamic Hive server connection from the edge nodes. Monitor the availability of the Hive server, and script failure recovery.
Environment: Pentaho Data Integrator v 5.x, MapR Hadoop, Apache Pig, Hive, HBase, Sqoop
Confidential
Consultant
Responsibilities:
- Implemented the reporting structure for the endowment process using universes and Desktop Intelligence reports.
- Managed the testing phases of reports and handled queries from end users at the university.
- Performed root-cause analysis and issue resolution across complex report/data environments.
Environment: Oracle 11i, Business Objects XIR2, SQL Developer
Confidential
EDW Developer, Analyst
Responsibilities:
- Redesigned ELT for huge volumes of data (millions of records per hour), reducing ELT load time from 3 hours to 20 minutes. This also included downsizing unused metrics. Designed the extraction layer using Pentaho Kettle and a series of MySQL staging tables.
- Performed dimensional design, coding, and implementation of a forecast model for financial planning.
- Automated the monitoring process for accounting reports using shell and Perl modules. Used Perl to create a reporting abstraction layer in HTML. The HTML fed Excel dashboard reports.
- Eliminated manual checks for month-end reports by designing and scheduling a checklist dashboard using HTML. Designed the ELT system flow for bookings using Pentaho Kettle jobs and transformations.
- Tweaked Perl modules to extract data from JSON files. Gained knowledge of NoSQL and key-value databases such as MongoDB. Gained exposure to customer engagement analysis as part of web analytics.
Environment: MySQL, UNIX, Perl, Shell, Pentaho Kettle Data Integrator, AWS
Confidential
Technical Lead
Responsibilities:
- Liaised with business partners to gather and clarify requirements, understood the business model, and documented it. The clear and accurate documentation reduced the project development cycle time.
- Converted the requirements into functional specifications and component technical designs after reaching consensus with all work streams. Designed a terabyte-sized system for extracting and loading text and document data from heterogeneous systems. Created materialized views for replication, which enabled real-time reporting for the immediate needs of the financial clients.
- Created universes using @functions such as @select, @where, and @prompt. Used derived tables and views inside universes, depending on reporting complexity.
- Constructed analytical reports with complex graphical representations, such as stacked graphs with lines, in Desktop Intelligence. Created multi-tab reports with 1 to 5 tabs. These report documents had 30 visualizations. Some were interactive, with drill-up, drill-down, and cascading prompts.
- Authored highly complex reports using custom SQL in Web Intelligence and Desktop Intelligence.
- Developed complex Desktop Intelligence reports and converted them to Web Intelligence using the Report Conversion Tool. Migrated users and content from one server to another. Re-implemented the security features for users and objects on the new server.
- Coordinated, tested, and modified the reports and their schedules during Oracle upgrades.
- Prioritized multiple responsibilities effectively while implementing 3 work streams, and kept stakeholders informed ahead of time.
Environment: Oracle 10g, Business Objects XIR3, ERWin, AIX UNIX
Confidential
Lead Consultant
Responsibilities:
- Helped business users visualize requirements using mock reports in Excel. Based on the gathered requirements, designed the dimensional model (3 facts, 25 dimensions) using ERwin. Designed an aggregate table. Designed a 10-million-record system to pull data from SQL Server into a Sybase database.
- Applied table design knowledge to universes to create valuable analytical Web Intelligence and Desktop Intelligence BO reports. Handled fan traps and chasm traps through efficient table design, without the need for a context.
- Solved challenges in merging many data providers. Performed instance management for scheduled reports, along with content management and user management for setting up the new repository (using CMC and Supervisor). Scheduled reports using BO command-line commands. Used calendar objects for location-specific scheduling so that schedules ran only on business days. Improved report performance by tuning the joins and query indexes.
- Created the metadata mapping from the source system to the target system. Responsible for drawing up the ETL strategy using Informatica. Created Informatica workflows to send mail to users. Implemented SCD Type 1 and Type 2 dimension loads using Informatica mappings.
- Scheduled the mappings through UNIX for the daily and monthly loads using Autosys, a scheduler tool.
- Created challenging report metadata for iReport (a Java reporting application from Jaspersoft). Conducted impact analysis for implementing the XML tags for the reporting metadata. Created metadata and views for report creation.
Environment: SQL Server 2005, Sybase, Informatica 7.0, Business Objects XIR2, ERWin, JBoss, Jaspersoft iReport
Confidential
Senior Consultant
Responsibilities:
- Envisioned the solution for the finance department’s reporting needs by designing the ETL pipeline flow and reporting for the sales, booking, invoice, and revenue data marts. Actualized the dimensional design and implemented the universe and reports for 4 data marts. Worked with universe database connections, SQL joins, cardinalities, loops, aliases, views, aggregate conditions, object parsing, and hierarchies.
- Created canned and ad hoc reports. Created booking, sales, and invoice reports using BO full client. Authored sales and booking forecast reports that fetched 1 million records, and reduced the run time of these reports from 3 hours to 1 hour.
- Identified the slowness of the system and suggested a strategy for re-architecting the ETL, which yielded a tremendous improvement. Tested and upgraded reports from Business Objects version 5.1.4 to 6.5.
Environment: Business Objects 5.1.4, 6.5, HP UNIX, Oracle 9i, 8i, PVCS
Confidential
Project Engineer
Responsibilities:
- Designed the ETL strategy using OWB for pricing data marts. Trained the team and designed the entire ETL strategy, including facts, dimensions (product, location), and error and audit tables.
- Wrote ETL routines to transfer data from flat files or Oracle tables to the Oracle DB. Coded UNIX scripts to automate the ETL routines. This included migrating ETL routines from OWB to Informatica.
- Designed and developed the ETL mapping strategy using Informatica PowerCenter/PowerMart. Created, developed, and enhanced more than 30 ETL mappings and mapplets. Designed and developed various transformations (aggregator, router, rank, filter, joiner, dynamic lookup, and many more).
- Analyzed gaps while integrating heterogeneous systems. Minimized manual work by using UNIX scripts and PL/SQL for complex mappings. Reduced code components by using mapplets.
- Created and designed Cognos cubes in Transformer. Created cubes from Impromptu queries. Analyzed reports using PowerPlay and created reports using Cognos Visualizer. Created ad hoc reports using the Business Objects reporting tool.
Environment: Sun Solaris UNIX, Oracle PL/SQL 8.1, Erwin 3.5.2, VSS 6.0, COGNOS Impromptu Administrator 6.6, COGNOS PowerPlay Transformer 6.6, PowerPlay 6.6, BO 5.x, Informatica PowerMart 5.1, Autosys, Oracle Warehouse Builder 9.x, Informatica 6.x, Solaris 2.8 Server, Oracle 9i, 8i, PL/SQL
