Data Specialist Resume
Singapore
SUMMARY
- 7+ years of experience providing big data insights and solutions for banking, travel, and e-commerce platforms across various big data distributions.
- Over 3 years of hands-on experience building data lakes on cloud and in-house infrastructure.
- Experience migrating traditional J2EE/RDBMS architectures to Hadoop ecosystems on cloud platforms such as AWS and GCS.
- Experience building data pipelines for data governance in near real-time and batch modes using Lambda/Kappa architectures.
- Worked primarily with Cloudera, and also with Hortonworks, for building big data solutions.
- Hands-on with Hadoop development and distribution tools such as Apache Druid, Hue/Ambari, Impala, Hive, Pig, Oozie/NiFi, Spark/Flink, Scala, Kafka, and the ELK stack.
- Good understanding of ready-to-go data governance solutions such as Tamr, Kylo, and LinkedIn's WhereHows.
- Worked on providing a microservice architecture over data lakes and pooled storage.
- 7+ years of proven Java/J2EE expertise across the enterprise software development life cycle: application analysis, design, development, and code deployment in Linux/Windows environments.
- Process-managed environments: CVS, GitHub, Agile Scrum.
- Working knowledge of build tools and continuous integration (Jenkins).
- Strong experience managing and engaging with clients, and providing technical support and consultation for application and infrastructure solutions.
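The Lambda-architecture pipelines mentioned above can be sketched in miniature: a batch layer precomputes a view over historical data, a speed layer covers events that arrived since the last batch run, and a serving layer merges the two. A minimal, language-neutral Python sketch — the event shape and function names are illustrative, not from any project listed here:

```python
from collections import Counter

# Batch layer: precomputed view over historical events (e.g. a nightly job).
def batch_view(events):
    return Counter(e["user"] for e in events)

# Speed layer: incremental view over events arriving after the last batch run.
def speed_view(recent_events):
    return Counter(e["user"] for e in recent_events)

# Serving layer: merge batch and speed views for near real-time queries.
def serve(batch, speed):
    merged = Counter(batch)
    merged.update(speed)
    return dict(merged)

historical = [{"user": "a"}, {"user": "b"}, {"user": "a"}]
recent = [{"user": "a"}, {"user": "c"}]
print(serve(batch_view(historical), speed_view(recent)))
# {'a': 3, 'b': 1, 'c': 1}
```

In a real deployment the batch view would come from Spark/Impala over HDFS and the speed view from a Kafka-fed stream job, but the merge-at-query-time shape is the same.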
PROFESSIONAL EXPERIENCE
DATA SPECIALIST
Confidential
Responsibilities:
- Create and maintain optimal data pipeline architecture.
- Assemble large, complex data sets that meet functional / non-functional business requirements.
- Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
- Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL or Neo4j and ‘big data’ technologies.
- Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency and other key business performance metrics.
- Work with stakeholders including the Executive, Product, Data and Design teams to assist with data-related technical issues and support their data infrastructure needs.
- Keep our data separated and secure across national boundaries through multiple data centers and AWS regions.
- Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader.
- Work with data and analytics experts to strive for greater functionality in our data systems.
Technologies used: CDH 6.0 (Impala, HBase, Spark, Flink), Druid, Kylo, Power BI, AWS S3
DATA ANALYST
Confidential
Responsibilities:
- Responsible for building a data lake in AWS along with dataflows and data pipelines.
- Implemented a digital foresight portal providing 360/720-degree customer views for cross-domain integration.
- Architecture design and implementation of business use cases; creation of microservices for access layers.
- Data validation and profiling on top of data lakes using data dictionaries and indexing solutions such as Solr.
- Capability demonstrations across domains for various client engagements, assisting pre-sales solutions.
- Client engagement, technical support, and team building for ready-to-deploy, customized solutions.
Technologies used: AWS (S3, Athena, Redis), CDH 5.9, Scala 2.11, Spark, Apache Flink, MongoDB, Elasticsearch and its components, REST web services using Spring Boot, Vert.x
SENIOR SOFTWARE DEVELOPER
Confidential
Responsibilities:
- Building data pipelines for near real-time data processing and providing insights.
- Core data analytics using Spark (Spark SQL and RDD mechanics) and Impala for processing batch queries.
- Log analytics and insight generation using ELK components (Filebeat, Logstash, Elasticsearch, Kibana).
- Designed a RESTful API layer for reporting and multi-user applications.
- Designed POCs and maintained benchmarks for Hadoop analytical tools.
- Motivated and trained the team on Impala and Elasticsearch usage.
Technologies used: CDH 5.4 distribution, Java 8, Impala, Hive, Spark, Oozie workflows, Elasticsearch and its components, Oracle 11, REST web services, Vert.x
Lead Software Engineer
Confidential
Responsibilities:
- Engaged in designing Hadoop cluster tooling for the working environment with help from the Hadoop architect.
- SLA monitoring at various layers: a cohesive framework for end-to-end monitoring of the portal SLA, including at subsystem levels.
- Behavior tracking at the portal: an integrated user-behavior-tracking framework to collect user click-streams and produce analytical reporting.
- Troubleshooting: the ability to stitch user transactions across various subsystems (inter- and intra-JVM) for rapid issue identification and resolution.
- Internet performance management: identified a managed-services solution for monitoring portal performance from the cloud.
- Individual contributor on a big data project for creating insights.
Technologies used: CDH 5.4 distribution, Hive, Spark, Kafka, HBase, Vert.x, JDK 8, RabbitMQ, MongoDB, MapReduce, JSON, JUnit, TortoiseSVN, Sybase, Bootstrap, jQuery
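The transaction-stitching capability above is, at its core, a group-and-sort of log events on a correlation ID that each subsystem propagates. A self-contained Python sketch of that idea — the field names (`txn_id`, `ts`, `subsystem`) are illustrative assumptions, not from the actual systems:

```python
from collections import defaultdict

# Stitch log events from multiple subsystems into per-transaction timelines,
# keyed on a correlation ID that every subsystem attaches to its log lines.
def stitch(events):
    timelines = defaultdict(list)
    for e in events:
        timelines[e["txn_id"]].append(e)
    # Sort each timeline by timestamp so the cross-JVM flow reads top to bottom.
    return {txn: sorted(evts, key=lambda e: e["ts"])
            for txn, evts in timelines.items()}

logs = [
    {"txn_id": "t1", "ts": 2, "subsystem": "portal", "msg": "render"},
    {"txn_id": "t1", "ts": 1, "subsystem": "auth", "msg": "login ok"},
    {"txn_id": "t2", "ts": 1, "subsystem": "portal", "msg": "search"},
]
for txn, evts in stitch(logs).items():
    print(txn, [e["subsystem"] for e in evts])
```

At scale this grouping would run in a stream processor fed by Kafka rather than in memory, but the join key and ordering step are the same.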
Sr. Team Lead
Confidential
Responsibilities:
- Worked as a junior architect with the team on CDH environment setup, development, and design.
- Migrated an end-to-end traditional J2EE/RDBMS architecture to a Hadoop/Java-based architecture.
- Requirement gathering and analysis; end-to-end delivery management.
- Developed a highly responsive peer-to-peer response system for the client using Swing (JFC).
- Integrated Kafka with HDFS as a replacement for the traditionally used JMS over RabbitMQ.
- Plugin development with the JFC and SWT/RCP team.
- Individual contributor; handled a team of 3 engineers.
- Trained junior candidates in UI design using JFC.
Technologies used: CDH 5.2 distribution, Hive, Kafka, JDK 7, RabbitMQ, MapReduce, REST web services, HBase, JSON, JUnit, GitHub, Oracle 10g, and Jenkins.
Sr. Associate
Confidential
Responsibilities:
- Deployment Architecture definition and documentation for a Hadoop based production environment that can scale to petabytes.
- Designed web application UI using J2EE concepts (Servlets, JSP).
- Integrated UI with graphical charts for the results from backend data analytics framework.
- Developed and designed MapReduce programs to format the data.
- Involved in data analysis using Pig and Hive.
- Wrote Pig scripts to create primary data for MR jobs in Java 7.
- Imported and exported data using Sqoop between HDFS and relational database systems.
- Analyzed data with Hive and Pig, with experience running Pig scripts.
- Moved files between the local file system and HDFS.
- Supported the agile development team by picking up stories for design and development.
- Performed unit testing for various modules.
- Provided support and technical guidance to team members.
- Created Linux scripts based on requirements.
Technologies used: Hortonworks, Scala 2.x, Java 7, Agile Kanban, CVS, MySQL 5.0, RabbitMQ
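The MapReduce data-formatting work above follows the standard map/shuffle/reduce shape. A minimal Python sketch of that flow (a word-count-style job; the data and function names are illustrative, not the actual program):

```python
from itertools import groupby
from operator import itemgetter

# Map phase: emit (key, 1) pairs, normalizing each token as a formatting step.
def map_phase(records):
    for line in records:
        for word in line.split():
            yield (word.lower(), 1)

# Shuffle phase: sort by key and group, as Hadoop does between map and reduce.
def shuffle(pairs):
    return groupby(sorted(pairs, key=itemgetter(0)), key=itemgetter(0))

# Reduce phase: aggregate the values for each key.
def reduce_phase(grouped):
    return {key: sum(v for _, v in values) for key, values in grouped}

data = ["Hive Pig Hive", "pig sqoop"]
print(reduce_phase(shuffle(map_phase(data))))
# {'hive': 2, 'pig': 2, 'sqoop': 1}
```

A Java MR job splits the same three phases across `Mapper`, the framework's sort/shuffle, and `Reducer`; this sketch just makes the dataflow visible in one place.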
Associate Manager
Responsibilities:
- Designed reconciliation engine for various hedge funds.
- Proposed and created a POC for SSO implementation in the project.
- Actively involved in code review and maintenance.
- Involved in user requirement gathering and designing use cases.
- Involved in project planning and estimation.
- Helped the QA team understand the data flow and data dependencies.
- Involved in developing the ETL strategy to populate the data.
- Involved in creating load-balanced server environments.
Technologies used: Tomcat 6.0, Sybase, Apache web servers, JDK 6, J2EE concepts, Struts 2.0.
Team Lead
Confidential
Responsibilities:
- Designed and developed user interactions for NAV funds and asset tracking in Java 6 JFC.
- Built a highly fault-tolerant SMS-broadcasting portal using the J2EE stack with load balancing.
- Understood, documented, and implemented user requirements.
- Tracked progress and helped the team with issue resolution.
- Implemented a TDD environment in the project.
- Reviewed code and managed versioning in SVN.
- Held daily stand-ups.
Technologies used: JFC (Swing), Java/J2EE stack, Struts 2.0, XPath, Tomcat Server 6.0, Oracle 9i.
