Big Data Consultant Resume
Richardson, TX
SUMMARY:
- 9 years of experience in software analysis, design, development, testing, deployment, and maintenance.
- 3+ years of experience with Hadoop ecosystem components.
- Capable of processing large sets of structured, semi-structured, and unstructured data and supporting systems application architecture.
- Able to assess business rules, collaborate with stakeholders and perform source-to-target data mapping, design and review.
- Capable of designing and developing large scale enterprise and distributed applications.
- Experience with Spark and Spark Streaming.
- Experience with writing Pig scripts to transform raw data from several data sources.
- Experience with Hive scripts for end user / analyst requirements to perform ad hoc analysis.
- Good working knowledge of YARN.
- Experience with HBase for storing data that needs fast random access.
- Experience with Sqoop for extracting and loading data.
- Experience with Flume and Kafka for transferring files from various sources to HDFS.
- Performed POCs on DataTorrent (Apache Apex) and Apache NiFi.
- Performed POCs on visualization tools such as Tableau, data wrangling tools such as Trifacta and Paxata, and cluster performance monitoring tools such as Dr. Elephant and Unravel.
- Performed POCs on OCR text extraction from images using Apache Tika, ImageMagick, and Apache NiFi.
- Experience working with data science teams to provide the data required for their models.
- Experience with the Oozie workflow engine for running workflow jobs.
- Experience in Java, J2EE, and Spring.
- ETL experience using DataStage Enterprise Edition v8.x.
- Expertise in working in every phase of the Software Development Lifecycle, both Waterfall and Agile.
- Familiar with data architecture including data ingestion, Hadoop information architecture, and data modeling.
- Expertise in building large-scale enterprise and distributed applications.
- Extensive experience working with business analysts, users, architects, DBAs, infrastructure, and support groups on system design, documentation, and implementation.
- Involved in developing test plans and performing Unit, Functional, System integration & Performance testing.
- Experience in Code reviews, fixing defects and enhancing application performance.
- Ability to handle multiple tasks and challenging deadlines, working effectively with peers, in groups, and independently.
- Excellent communication and interpersonal skills; a self-motivated team player.
- Strong problem-solving skills and a drive to learn the latest technologies in the industry.
- Strong management skills in the onshore/offshore delivery model.
TECHNICAL SKILLS:
Big Data Technologies: HDFS, MapReduce, Hive, Pig, Kafka, Spark, Spark Streaming, Apache Apex, Apache NiFi, Sqoop, ZooKeeper, Flume, and Oozie.
Operating Systems: Windows & Linux
Languages: Java, PL/SQL, Enterprise COBOL for z/OS, JCL, Easytrieve
Frameworks: Hadoop, Spring MVC, and Hibernate
IDEs/Tools: Eclipse, SQL Developer, CSF Messenger, CSU Messenger, DataStage
Web/App Servers: Apache Tomcat
Databases: Oracle, MySQL, HBase (NoSQL), DB2
Test Management Tools: Quality Center, JIRA
Other tools: Jenkins, Rally, GitHub
Methodologies: Agile, Waterfall
WORK EXPERIENCE:
Big data consultant
Confidential, Richardson, TX
Responsibilities:
- Gathering data requirements and identifying sources for acquisition.
- Create leads for correcting provider data based on business rules.
- Ingest data using Sqoop and Kafka.
- Work with data scientists to produce data in the formats required for their models.
- Design solutions for new projects.
- Create UNIX shell scripts to automate Hive script execution.
- Create MapR-DB tables and store data into them.
- Develop streaming applications using DataTorrent, Spark Streaming, and Apache NiFi.
- Collect data from applications using Kafka.
- Develop MapReduce programs in Java.
- Create Hive UDFs for standardizing data wherever required.
- Perform POCs on various technologies and document the findings.
- Create APIs for the web pages using the Spring framework.
- Coding and peer review of assigned tasks; unit testing, volume testing, and bug fixing.
- Responsible for test case review for all components of the project.
- Participate and contribute to estimations and Project Planning with team and Project Manager.
- Interaction with DBAs and SysAdmins.
- Perform root cause analysis and provide permanent fixes to the problems identified.
- Present induction sessions to new joiners on the project.
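A bullet above mentions UNIX shell scripts that automate Hive scripts. A minimal sketch of such a wrapper, with a hypothetical script path and variable name (not details from this project), might look like:

```shell
# Hypothetical helper: builds the hive invocation for one daily partition,
# passing the date into the HQL script via -hiveconf. Names are illustrative.
build_hive_cmd() {
    run_date="$1"   # partition date referenced as ${hiveconf:run_date} in the HQL
    hql_file="$2"   # path to the Hive script to run
    printf 'hive -hiveconf run_date=%s -f %s' "$run_date" "$hql_file"
}

# On a cluster node with the hive client installed, the command could then be
# executed, e.g.: eval "$(build_hive_cmd "$(date +%Y-%m-%d)" /opt/etl/daily_load.hql)"
build_hive_cmd "2017-01-01" "/opt/etl/daily_load.hql"
```

Building the command string in a function keeps it testable and lets a scheduler wrapper add logging or retries around the actual invocation.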
Environment: Hadoop 2.7.0, Java, Hive, Kafka, MapR-DB (HBase), UNIX shell scripting, MapReduce, YARN, DataTorrent (Apache Apex), Spark, Spark Streaming, Apache NiFi, Spring MVC, Tomcat, Platfora
Confidential, ADDISON, TX
Hadoop developer/ lead
Responsibilities:
- Gathering data requirements and identifying sources for acquisition.
- Create Sqoop jobs for importing the data from different application tables to hive tables.
- Design and develop Pig scripts for transforming data and storing it in HDFS.
- Develop Hive scripts for end user / analyst requirements to perform ad hoc analysis.
- Collect the application log data to HDFS using Flume.
- Develop MapReduce programs in Java for log data analysis.
- Create UDFs for Pig and Hive to standardize data wherever required.
- Create workflows and schedule them using the Oozie workflow coordinator.
- Coding and peer review of assigned tasks; unit testing, volume testing, and bug fixing.
- Responsible for test case review for all components of the project.
- Participate and contribute to estimations and Project Planning with team and Project Manager.
- Create deployment plan, run book and implementation checklist.
- Interaction with DBAs and SysAdmins.
- Perform root cause analysis and provide permanent fixes to the problems identified.
- Present induction sessions to new joiners on the project.
- Ensure availability of document/code for review.
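The Sqoop ingestion described above (importing application tables into Hive) can be sketched as a shell helper that assembles the import command. The connection string, table, and Hive table names below are placeholder assumptions, not details from this project:

```shell
# Hypothetical sketch of a Sqoop import from an application table into Hive.
# All connection details and object names are illustrative assumptions.
build_sqoop_import() {
    jdbc_url="$1"     # JDBC connection string for the source database
    src_table="$2"    # source application table to import
    hive_table="$3"   # target Hive table (db.table)
    printf 'sqoop import --connect %s --table %s --hive-import --hive-table %s --num-mappers 4' \
        "$jdbc_url" "$src_table" "$hive_table"
}

build_sqoop_import "jdbc:mysql://appdb:3306/sales" "orders" "staging.orders"
```

Parameterizing the command this way makes it easy to drive one Sqoop job per source table from a simple loop or scheduler entry.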
Environment: Hadoop, Pig, Java, Hive, Flume, Sqoop, HBase, Oozie, UNIX, MySQL, MapReduce, YARN
Confidential, ADDISON, TX
Senior Software Developer
Responsibilities:
- Designed the ETL jobs using DataStage 8.5 to extract, transform, and load the data.
- Designed parallel jobs using different stages to meet the functional requirements.
- Worked with shared containers for reusing the functionalities.
- Worked with the scheduling team for job scheduling in the production and test environments.
- Developed and Implemented enhancements and change requests.
- Supported the testing team in resolving any defects identified.
- Carried out process improvement activities aimed at cost savings for the bank through various automated tools and value additions.
- Performed root cause analysis and provided permanent fixes to the problems identified.
- Presented induction sessions to new joiners on the project.
- Ensured availability of documents/code for review.
- Worked with business partners to analyze requirements and change requests before proceeding with development.
- Ensured fluent communication with all related parties and stakeholders.
- Supported code/design analysis, strategy development and project planning.
Environment: DataStage 8.5, Mainframe, DB2, COBOL, Easytrieve, VSAM, Autosys
Confidential
Developer
Responsibilities:
- Played a key role in the software life cycle development - Requirements gathering, Analysis, Detail design, Development and implementation of the system.
- Analyzed the functional requirements based on the business requirements provided.
- Designed and developed statement formats for different statement types using the Custom Statement Formatter tool.
- Coded the messages that appear on the statements using the CSF Messenger tool.
- Worked on all enhancements and the creation of new statement types.
- Provided production support.
- Coded the inserts that should appear with the statement according to the business rules.
- Developed a module that generates daily and monthly reports, which are sent to the business partners.
- Handled the tickets that were generated for the issues based on user testing on production.
- Coordinated with the team for moving the enhancements and bug fixes into different environments like test and production.
Environment: Custom Statement Formatter, CSF Messenger, CSU Messenger, Mainframe, DB2, COBOL, Easytrieve, VSAM, REXX.