Big Data Analyst Resume
Baltimore, MD
SUMMARY:
- A cumulative 8+ years of IT experience in full life-cycle implementations, with skills in planning and executing Software Development Life Cycle (SDLC) projects in compliance with quality standards.
- 3+ years of comprehensive experience with Apache Hadoop and its ecosystem, including HDFS, MapReduce, Hive, Pig, Sqoop, Spark SQL, Spark Data Frames, and HBase.
- Involved in all phases of the SDLC and worked on all activities related to the development, implementation, administration, and support of ETL processes for large-scale data warehouses.
- Experience in installing, configuring, supporting, and monitoring Hadoop clusters using Apache and Cloudera distributions.
- Experience in understanding clients' Big Data business requirements and translating them into Hadoop-centric solutions.
- Experience in understanding customers' multiple data sets, including behavioral data, customer profile data, usage data, and product data.
- Experience in designing and implementing complete end-to-end Hadoop Infrastructure.
- Analyzed clients' existing Hadoop infrastructure, identified performance bottlenecks, and provided performance tuning accordingly.
- Experience in importing and exporting data between relational database systems and HDFS using Sqoop.
- Experience in writing static and dynamic DB2 PL/SQL stored procedures.
- Experienced with Spark SQL, a fast and general engine for large-scale data processing.
- Strong experience in analyzing large data sets by writing Pig scripts and Hive queries (an illustrative sketch follows this summary).
- Experience in configuring and maintaining Oracle Exadata and Oracle GoldenGate.
- Syntel BNFS Payment Level 1 certified, with domain knowledge of Banking and Financial Services.
- Techno-functional responsibilities include interfacing with users, identifying functional and technical gaps, estimating, designing custom solutions, development, leading developers, producing documentation, and production support.
- Excellent interpersonal and communication skills; creative, research-minded, technically competent, and results-oriented, with problem-solving and leadership skills.
- Requirements definition, functional specifications, and gap analysis.
- Experience in implementing projects, end-to-end testing, and user acceptance testing.
- Ensured quality assurance as per client guidelines.
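As an illustration of the Hive and Spark SQL analysis described above, the following is a minimal sketch; the session settings, table name, and columns are hypothetical and not drawn from any actual engagement.

import org.apache.spark.sql.SparkSession

object UsageTrendAnalysis {
  def main(args: Array[String]): Unit = {
    // Hive-enabled Spark session; assumes the cluster's Hive metastore is already configured
    val spark = SparkSession.builder()
      .appName("usage-trend-analysis")
      .enableHiveSupport()
      .getOrCreate()

    // Hypothetical Hive table holding customer usage data
    val dailyUsage = spark.sql(
      """SELECT customer_id, usage_date, SUM(bytes_used) AS total_bytes
        |FROM usage_events
        |GROUP BY customer_id, usage_date""".stripMargin)

    // Persist the aggregate back to Hive for downstream reporting
    dailyUsage.write.mode("overwrite").saveAsTable("usage_daily_summary")
    spark.stop()
  }
}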
TECHNICAL SKILLS:
Big Data Ecosystem: Spark SQL, Oracle GoldenGate, Oracle ODI, Control-M, Hadoop, HDFS, Sqoop, Hive, Pig, Spark Data Frames, Oozie
Development Tools: Eclipse
Programming Languages: Java, COBOL, JCL
Databases: IBM DB2, Oracle, MySQL, Exadata
Environments/Platforms: Unix, CentOS, Windows (all flavors)
Other Tools: Cognos, PL/SQL stored procedures
Methodologies: Agile
PROFESSIONAL EXPERIENCE:
Big Data Analyst
Confidential - Baltimore, MD
Responsibilities:
- Analyzed requirements and proposed possible approaches and solutions.
- Analyzed the impact of the requirements on the existing system and documented the findings.
- Developed detailed designs per the requirements received from the business and obtained approval.
- Involved in creating Hive tables, loading them with data, and writing Hive queries to process the data (see the sketch following this list).
- Imported data from Oracle to HDFS using Sqoop on a regular basis.
- Loaded data from Oracle and DB2 into HDFS using Oracle GoldenGate on a regular basis.
- Loaded and transformed large sets of structured, semi-structured, and unstructured data.
- Conducted data analysis and created data mappings.
- Developed test cases for unit testing.
- Interacted with various team members to resolve defects.
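A minimal sketch of the Hive table creation and query processing referenced above, expressed through Spark SQL (listed in the environment below); the table and column names are illustrative only, not the actual schemas.

import org.apache.spark.sql.SparkSession

object ClaimsProcessing {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("claims-processing")
      .enableHiveSupport()
      .getOrCreate()

    // External Hive table over files already landed in HDFS (illustrative layout)
    spark.sql(
      """CREATE EXTERNAL TABLE IF NOT EXISTS claims_raw (
        |  claim_id STRING, member_id STRING, claim_amount DOUBLE, service_date STRING)
        |ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
        |LOCATION '/data/landing/claims'""".stripMargin)

    // Process the raw data with a Hive query and persist the result as a managed table
    spark.sql(
      """SELECT member_id, COUNT(*) AS claim_count, SUM(claim_amount) AS total_amount
        |FROM claims_raw
        |GROUP BY member_id""".stripMargin)
      .write.mode("overwrite").saveAsTable("claims_summary")

    spark.stop()
  }
}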
Environment: Hadoop, HDFS, Hive, Sqoop, DB2, Oracle, SQL Developer, GoldenGate, Spark, Spark Data Frames, Spark SQL, Exadata
Hadoop Developer
Confidential - Dallas, TX
Responsibilities:
- Responsible for building scalable distributed data solutions using Hadoop.
- Developed job-processing scripts using Oozie workflows.
- Installed and configured Hive, Pig, Sqoop, Flume and Oozie on the Hadoop cluster.
- Developed simple to complex MapReduce jobs using Hive and Pig.
- Optimized MapReduce jobs to use HDFS efficiently through various compression mechanisms.
- Involved in Hadoop cluster tasks such as commissioning and decommissioning nodes without affecting running jobs or data.
- Wrote MapReduce jobs to discover trends in data usage by users.
- Involved in running Hadoop streaming jobs to process terabytes of text data.
- Analyzed large data sets by running Hive queries and Pig scripts.
- Helped the team increase the cluster size from 22 to 30 nodes.
- Managed jobs using the Fair Scheduler.
- Worked extensively with Sqoop for importing metadata from Oracle.
- Involved in creating Hive tables and loading and analyzing data using Hive queries.
- Designed, developed, and maintained data integration programs in a Hadoop and RDBMS environment with both traditional and non-traditional source systems, as well as RDBMS and NoSQL data stores, for data access and analysis.
- Loaded and transformed large sets of structured, semi-structured, and unstructured data.
- Responsible for managing data coming from different sources.
- Assisted in exporting analyzed data to relational databases using Sqoop.
- Wrote Hive queries and UDFs.
- Developed Hive queries to process the data and generate data cubes for visualization.
- Created Pig Latin scripts to sort, group, join, and filter the enterprise-wide data.
- Implemented partitioning, dynamic partitions, and buckets in Hive (illustrated in the sketch after this list).
- Gained experience in managing and reviewing Hadoop log files.
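A minimal sketch of the Hive partitioning and dynamic-partition loading mentioned above, issued through Spark SQL; the table names, columns, and bucketing comment are illustrative assumptions rather than the actual production DDL.

import org.apache.spark.sql.SparkSession

object PartitionedEventTables {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("partitioned-event-tables")
      .enableHiveSupport()
      .getOrCreate()

    // Partitioned Hive table; buckets would be declared in the Hive DDL with
    // CLUSTERED BY (user_id) INTO 32 BUCKETS when loaded from Hive itself
    spark.sql(
      """CREATE TABLE IF NOT EXISTS events_curated (
        |  user_id STRING, event_type STRING, duration_sec BIGINT)
        |PARTITIONED BY (event_date STRING)
        |STORED AS ORC""".stripMargin)

    // Dynamic-partition insert from a hypothetical staging table
    spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")
    spark.sql(
      """INSERT OVERWRITE TABLE events_curated PARTITION (event_date)
        |SELECT user_id, event_type, duration_sec, event_date FROM events_staging""".stripMargin)

    // Roll-up of the curated data, the kind of aggregate used to feed visualization
    spark.sql(
      """SELECT event_date, event_type, COUNT(*) AS events, SUM(duration_sec) AS total_sec
        |FROM events_curated
        |GROUP BY event_date, event_type WITH CUBE""".stripMargin)
      .write.mode("overwrite").saveAsTable("events_cube")

    spark.stop()
  }
}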
Environment: Hadoop, HDFS, Hive, Sqoop, DB2, Oracle, SQL Developer, Spark, Spark Data Frames, Spark SQL.
Technical Lead
Confidential
Responsibilities:
- Analyzed the business functionality of the existing application and restructured it on a new technology to enrich the user experience and deliver enhancements.
- Installed and configured Hadoop MapReduce and HDFS, and developed multiple MapReduce jobs for data preprocessing.
- Defined migration plans to import and export data between RDBMS and HDFS.
- Extracted data from relational databases using Sqoop (an equivalent Spark DataFrame sketch follows this list).
- Developed Pig scripts for data processing according to business rules.
- Involved in creating Hive tables, loading them with data, and writing Hive queries to process the data.
- Loaded and transformed large sets of structured, semi-structured, and unstructured data.
- Conducted data analysis of financial transactions.
- Developed test cases for unit testing.
- Interacted with various team members to resolve defects.
- Maintained good coordination with the team.
- Delivered the project under stringent timelines and high pressure.
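The extract described above was done with Sqoop; as a hedged illustration, the sketch below shows the equivalent RDBMS-to-HDFS movement using the Spark DataFrame API from the environment list. The connection URL, credentials, table, and partition column are hypothetical, and the matching JDBC driver would need to be on the classpath.

import org.apache.spark.sql.SparkSession

object RdbmsToHdfsExtract {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("rdbms-to-hdfs-extract")
      .getOrCreate()

    // Hypothetical connection details; real jobs would read these from configuration
    val transactions = spark.read
      .format("jdbc")
      .option("url", "jdbc:mysql://dbhost:3306/payments")
      .option("dbtable", "transactions")
      .option("user", "etl_user")
      .option("password", sys.env.getOrElse("DB_PASSWORD", ""))
      .load()

    // Land the extract on HDFS as Parquet, partitioned by a hypothetical business-date column
    transactions.write
      .mode("overwrite")
      .partitionBy("txn_date")
      .parquet("hdfs:///data/landing/transactions")

    spark.stop()
  }
}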
Environment: Hadoop, MapReduce, HDFS, Hive, Sqoop, Java, MySQL, Oracle, Spark SQL, Spark Data Frames.
Software Engineer
Confidential
Responsibilities:
- Involved in interacting with business users, client meetings, analysis, code modification, and testing.
- Enhanced the business logic.
- Performed impact analysis for proposed enhancements/modifications to modules, programs, and databases.
- Developed test strategies and test cases.
- Provided support to users and testing teams during integration, UAT, and regression testing by resolving issues encountered during testing.
- Prepared the implementation plan for the project rollout.
- Provided production support by monitoring daily processing jobs until the end of the warranty period.
- Developed test cases for unit testing.
- Actively participated in weekly status meetings with technical collaborators and colleagues.
Environment: COBOL, JCL, DB2, IMS, VSAM, Manage Now (tool used by Amex to monitor tickets), BMCADM, InfoMan, ChangeMan, Easytrieve.
Software Engineer
Confidential
Responsibilities:
- Gathered functional requirements from the onshore coordinator.
- Interacted with the onsite coordinator to analyze requirements and propose possible approaches and solutions.
- Analyzed the impact of the requirements on the existing system and documented the findings.
- Developed detailed designs per the requirements received from the business and obtained approval.
- Estimated the effort required for changes and analyzed risks, issues, and delays in completing tasks.
- Performed code reviews, testing, test case reviews, and test results reviews for created/modified components.
- Performed software implementation and post-installation support.
- Prepared status reports and conducted weekly status meetings with the onshore coordinator.
- Provided knowledge transition to new resources joining the team.
- Participated in the SDLC: coding, code reviews, testing, and ensuring coding standards for development projects.
- Performed impact analysis for proposed enhancements/modifications to modules, programs, and databases.
Environment: COBOL, JCL, DB2, IMS, VSAM, Manage Now (tool used by Amex to monitor tickets), BMCADM, InfoMan, ChangeMan, Easytrieve.