Sr Lead Big Data Consultant Resume
Coppell, TX
SUMMARY
- 12+ years of IT experience in System Analysis, Design, Development, Conversion, Customization, Interfacing, and Implementation of Business Applications.
- 4+ years of experience working with very large data sets and building programs that leverage Hadoop and MPP platforms.
- Cloudera Certified Developer for Apache Hadoop.
- Extensive experience with the Hadoop stack (MapReduce, YARN, HDFS, Hive, Pig, Sqoop, Oozie, Spark).
- Good exposure to NoSQL databases such as Cassandra, HBase, and MongoDB.
- Hands-on experience in SQL and PL/SQL: development of packages, functions, procedures, triggers, and tables using the Oracle PL/SQL database programming language.
- Experience transferring data from OLTP systems to data warehouses using ETL concepts, and creating reports with OBIEE, Cognos, Business Objects, Tableau, and QlikView reporting tools.
- Experience in Agile Scrum development methodology and frameworks.
TECHNICAL SKILLS
Languages/Scripts: SQL, PL/SQL, JAVA, JSP, XML, HTML, R, Python
Hadoop Ecosystem: HDFS, Hive, Pig, Sqoop, Oozie, HBase, ZooKeeper, Avro, and MRUnit.
Databases: Oracle 11g/10g, SQL Server 2000/2012, DB2, Teradata, MySQL, Netezza, Vertica.
ETL Tools: Informatica PowerCenter 6.x/7.1/8.6/9.1, SSIS, Oracle Data Warehouse, Teradata, Ab Initio
Reporting Tools: Business Objects (BO XI), Crystal Reports, Jasper Reports, SSRS, Cognos, MicroStrategy, Tableau, QlikView
Tools: CA Erwin, ER/Studio, Eclipse, SQL*Plus, TOAD, Teradata SQL Assistant, Hibernate
PROFESSIONAL EXPERIENCE
Confidential, Coppell, TX
Sr Lead Big Data Consultant
Responsibilities:
- Involved in all phases of the Software Development Life Cycle (SDLC), including requirement analysis, design, development, building, testing, and deployment of a Hadoop cluster in fully distributed mode.
- Responsible for implementing complete Hadoop-based solutions, including data acquisition, storage, transformation, and analysis.
- Developed Sqoop scripts to transfer bulk data from OLTP systems to Apache Hadoop HDFS for ETL and ELT processes.
- Imported and indexed Hadoop data into Splunk to make it available for searching, reporting, analysis, and visualization dashboards, gaining rapid insight without writing MapReduce code.
- Created dashboards and KPI reports using Microsoft Analytics Platform System (APS), SQL Server Parallel Data Warehouse (PDW), Excel, and Report Builder, and helped create customer reports using pivot tables and filters over Hive and Pig data.
- Used the Spark API on Hortonworks Hadoop YARN to perform analytics on data in Hive.
- Worked on a real-time message queuing system for event logs using Apache Storm, with Apache Kafka as the broker, writing results to HBase.
- Configured Spark Streaming to receive real-time data from Kafka and store the streamed data in HDFS (a sketch of this pipeline appears after this list).
- Collected and aggregated large volumes of web log data from sources such as web servers and mobile and network devices using Apache Kafka, and stored the data in HDFS for analysis.
- Developed Tableau visualizations and dashboards using Tableau Desktop.
- Developed Tableau workbooks from multiple data sources using Data Blending.
- Developed dashboards and master-detail reports in SSRS based on Splunk data.
- Developed Pig Latin scripts to analyze large orders, customers and transactions datasets.
- Developed HiveQL to accept structured data and created internal and external tables in the Hive metastore; provided data summarization, querying, and analysis over the Hive metadata.
- Created partitions, buckets, and indexes to improve Hive query performance (a Spark SQL sketch of a partitioned Hive table appears after this list).
- Developed Pig and Hive UDFs for calculating complex metrics.
- Created UNIX scripts to automate MR, Pig, and Hive jobs and scheduled them in Autosys.
- Built real-time Big Data solutions using Pig, Hive, Sqoop, and HBase, handling billions of records.
- Optimized and tuned the Hadoop environment to meet performance requirements.
- Developed workflow jobs using Oozie to run MR, Pig, and Hive jobs as required.
- Developed Sqoop and Oozie jobs to transfer historical data from Teradata to the HDFS layer.
- Created static and dynamic partitions, bucketing, and optimized joins to improve Hive query performance.
- Implemented Spark and Oozie processes to streamline and automate Hive, Pig, MR, and UNIX jobs through Autosys.
- Created UNIX scripts to automate Spark, MR, Pig, and Hive jobs and scheduled them in Autosys.
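Below is a minimal sketch of the Kafka-to-HDFS streaming pipeline referenced above, using the PySpark DStream API of that era; the topic name, broker address, and HDFS path are hypothetical placeholders, and the job assumes the matching spark-streaming-kafka package is available on the cluster.

```python
from pyspark import SparkContext
from pyspark.streaming import StreamingContext
from pyspark.streaming.kafka import KafkaUtils  # needs the spark-streaming-kafka-0-8 package

sc = SparkContext(appName="EventLogStream")
ssc = StreamingContext(sc, 30)  # 30-second micro-batches

# hypothetical topic and broker; each record arrives as a (key, value) pair
events = KafkaUtils.createDirectStream(
    ssc, ["event-logs"], {"metadata.broker.list": "kafka-broker:9092"})

# keep only the message payload and persist each batch to HDFS as text files
events.map(lambda kv: kv[1]) \
      .saveAsTextFiles("hdfs:///data/raw/event_logs/batch")

ssc.start()
ssc.awaitTermination()
```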
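And a small Spark-on-Hive sketch of the partitioned-table work referenced above, run through Spark SQL on YARN; the database, table, and column names are hypothetical, and the bucketing and indexing details of the actual project are not reproduced here.

```python
from pyspark.sql import SparkSession

# Hive support lets Spark SQL read and write tables registered in the Hive metastore
spark = SparkSession.builder \
    .appName("HiveAnalytics") \
    .enableHiveSupport() \
    .getOrCreate()

spark.sql("CREATE DATABASE IF NOT EXISTS sales")

# external table partitioned by load date so queries can prune to the dates they need
spark.sql("""
    CREATE EXTERNAL TABLE IF NOT EXISTS sales.orders (
        order_id     BIGINT,
        customer_id  BIGINT,
        order_amount DOUBLE
    )
    PARTITIONED BY (load_dt STRING)
    STORED AS ORC
    LOCATION '/data/curated/orders'
""")

# dynamic partitioning routes each load date from a (hypothetical) staging table
# into its own partition
spark.sql("SET hive.exec.dynamic.partition=true")
spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")
spark.sql("""
    INSERT OVERWRITE TABLE sales.orders PARTITION (load_dt)
    SELECT order_id, customer_id, order_amount, load_dt
    FROM sales.orders_staging
""")

# partition pruning: only the requested load date's files are scanned
spark.sql("""
    SELECT customer_id, SUM(order_amount) AS total
    FROM sales.orders
    WHERE load_dt = '2016-01-01'
    GROUP BY customer_id
""").show()
```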
Environment: Agile, Java, Netezza, Oracle 11g, Vertica, CentOS, Splunk, Hadoop, Sqoop, MapReduce, Hive, Pig, Oozie, Spark, Hortonworks, XML, PuTTY, Eclipse, SAP SD/MM, SSIS, SSRS, SSAS, Tableau, QlikView, Informatica, Cognos, Business Objects.
Confidential, Plano, TX
Big Data Developer
Responsibilities:
- Created business requirements for several SERs and supported end-to-end project cycles.
- Created technical architecture designs and data models for advanced scheduled reports for the Loss Mitigation data LOB.
- Created UNIX scripts to automate MR, Pig, and Hive jobs and scheduled them in Autosys for the LOB.
- Built real-time Big Data solutions using Pig, Hive, Sqoop, and HBase, handling billions of records for the Loss Mitigation Underwriting and Collection application.
- Queried and massaged data from multiple databases, including the LMA underwriting database and iSeries, for in-depth analysis of issues supporting the LOB.
- Implemented data ingestion from multiple sources such as IBM mainframes, IBM DB2, SQL Server, Oracle, Netezza, and Teradata using Sqoop, SFTP, and MapReduce jobs.
- Efficiently handled Loss Mitigation reporting needs and enhancements to the existing environment.
- Queried the underwriting database for various custom reporting needs, including daily, weekly, and ad-hoc reports as well as reports to management.
- Created Processes and Procedures to handle the Loss Mitigation Data and made it easily accessible for Reporting needs.
- Created Standardization layers in Hive by mapping attributes of individual parties.
- Imported and exported RDBMS data into HDFS, Hive, and HBase using Splunk and Sqoop.
- Imported incremental and updated changes from RDBMS sources using Sqoop (a sketch of an incremental Sqoop import appears after this list).
- Imported and exported data from Teradata and Exadata using Sqoop and Oracle connectors.
- Worked on a production support team, performing in-depth analysis of business-critical issues, supporting users, and handling escalations as required.
- Involved in full project life cycles, from requirements gathering through the post-implementation phase.
- Worked with functional analysts and developers to translate business requirements into technical specifications.
- Assisted the CFPB project UAT effort by providing gap analysis and facilitating testing across various teams to ensure seamless integration; maintained data conditioning needs through the testing cycle and worked on management reporting needs.
- Manipulated and conditioned data through NoSQL to help QA/UAT with backend testing.
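Below is a minimal sketch of the incremental Sqoop pull referenced above, wrapped in a small Python automation script of the kind scheduled through Autosys; the JDBC URL, credentials path, table, and check column are hypothetical placeholders.

```python
import subprocess

# hypothetical DB2 source and HDFS target; --incremental append pulls only rows whose
# EVENT_ID is greater than the last recorded value
sqoop_cmd = [
    "sqoop", "import",
    "--connect", "jdbc:db2://db2-host:50000/LOSSMIT",
    "--username", "etl_user",
    "--password-file", "/user/etl/.db2.password",
    "--table", "UNDERWRITING_EVENTS",
    "--target-dir", "/data/raw/underwriting_events",
    "--incremental", "append",
    "--check-column", "EVENT_ID",
    "--last-value", "0",
    "--num-mappers", "4",
]

# raises CalledProcessError on failure, so the scheduler (e.g. Autosys) sees a non-zero exit
subprocess.check_call(sqoop_cmd)
```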
Environment: Java, Netezza, Oracle, SSRS, SSIS, Informatica, Teradata, Business Objects, Tableau, CentOS, Hadoop, Splunk, Sqoop, MapReduce, Hive, Pig, Oozie, XML, PuTTY, and Eclipse.
Confidential, Richmond, VA
Data Analyst/BI Report Analyst
Responsibilities:
- Created Processes and Procedures to handle the Loss Mitigation Data and made it easily accessible for Reporting needs.
- Created procedures and scalar and table-valued functions to modularize the data retrieval process.
- Extracted, transformed, and loaded (ETL) data from spreadsheets, database tables, and other sources using Microsoft SSIS and Informatica.
- Created, documented, and maintained logical and physical database models in compliance with enterprise standards, and maintained corporate metadata definitions for enterprise data stores within a metadata repository.
- Established and maintained comprehensive data model documentation, including detailed descriptions of business entities, attributes, and data relationships.
- Developed mapping spreadsheets for the ETL team with source-to-target data mappings, including physical naming standards, data types, volumetrics, domain definitions, and corporate metadata definitions.
- Built business intelligence and corporate performance management solutions, including dashboards, scorecards, and query and analysis reports with drill-up and drill-down capabilities.
- Created ad-hoc queries from SQL Server databases for custom reporting needs to meet the business need
- Documented data mapping rules for movement of data between applications or for population of new databases.
- Worked with internal and external clients on the import and normalization of third-party data.
- Designed and developed reports in BO XI R3; involved in accessing, managing, analyzing, and presenting data using SAS.
- Used SSIS to perform ETL operations and created SSRS reports using SQL Server Business Intelligence Development Studio; automated several reports.
- Worked with the BI team to create several Reports and Dashboards using SSRS
- Worked on creating the process flow, data modeling, and process implementation for the Home Affordable Modification Program (HAMP). Provided reporting solutions for tracking the HAMP process by creating business, quality control, and exception reports.
- Led the team in creating reports, managing the reporting interface, and automating reports.
Environment: Agile, UNIX, Linux, SQL Server, Ab Initio, Oracle 10g, Teradata, .NET, HP Quality Center 9.x (Test Director), TOAD 9.x, BO XI R2, BO XI 3.x, Live Office, Crystal Reports, Xcelsius 4.5
Confidential, Washington DC
Developer/BI Analyst
Responsibilities:
- Created ad-hoc queries from SQL Server databases for custom reporting needs to meet the business needs
- Performed data and systems analysis to create and document ETL rules, facilitating the addition of new data sources into the data warehouse.
- Helped convert a Microsoft SQL Server database to an Oracle database, allowing the company to increase its customer base.
- Converted existing Crystal Reports to Business Objects XI; worked with Report Designer to create parameterized reports, expressions, functions, custom functions, sub-reports, and dynamic sorting, and used chart controls.
- Led the team in creating reports, managing the reporting interface, and automating reports.
- Documented data mapping rules for movement of data between applications or for population of new databases
- Acted as an expert technical resource for the programming staff.
- Updated plan estimates and dates with information from the PM and other project team members.