Cloud/Data-Platform Lead Solutions Architect Resume
SUMMARY:
- Thought leader in big data and cloud technologies, helping organizations meet their data needs with cloud platforms and ready to take on large-scale data challenges.
- More than 13 years of experience in data management, data integration, and big data management.
- Experienced with the AWS stack; architected five migrations of data platforms to AWS.
- Well versed in designing solutions on AWS using EMR, DynamoDB, and Redshift, and in building AWS data pipelines.
- Migrated a relational Oracle database to MongoDB for faster web access.
- Created an ELK solution to collect and monitor all application and database logs, with alerts and metrics built on top of it.
- Created Chef recipes to automate cloud infrastructure management.
- Deployed a 15-node MapR cluster on AWS and integrated MongoDB with it.
- Built a Spark cluster to process MongoDB data and customized the MongoDB Hadoop connector to integrate with Hadoop.
- Streamlined various ETL processes using the open-source Talend platform on Hadoop and MongoDB; created multiple custom Talend components for data loads.
- Built automatic replication into Elasticsearch, acting as a MongoDB secondary, to avoid data duplication.
- Migrated existing batch processing to Lambda-based processing on AWS.
- Set up and managed a 6-node MapR cluster on AWS for a gas and energy company, and migrated mainframe data to a Hadoop and SAS based solution.
- Built R-based analytics using SparkR on 500 TB of data migrated from the mainframe to MapR.
- Migrated MongoDB and Hadoop clusters to AWS and defined read/write strategies for different geographies.
- Well versed in Hadoop and data processing on Hadoop.
- Built search functionality on unstructured data sets using Elasticsearch.
- Expertise in data integration tools such as Talend and Informatica Big Data Integration.
- Successfully productionized open-source versions of Talend and MongoDB.
- Well versed in conventional relational databases as well as newer database technologies such as column-store, row-store, and in-memory databases.
- Well versed in MapReduce and data aggregation frameworks.
- Expertise in financial vendor products for streamlining end-to-end data flow; implemented Eagle, BlackRock Aladdin, and Maximis.
- Implemented a data integrity solution using in-house applications, replacing vendor products to reduce costs.
- Set up the ETL architecture using Informatica for a $5 million implementation.
- Defined an enterprise-level MongoDB architecture with replication, sharding, and backup (DR) strategies.
- Streamlined various ETL processes onto the open-source Talend platform.
- Technically led and implemented a search mechanism using MongoDB and Elasticsearch.
- Designed and implemented Google Analytics and web clickstream analytics.
- Set up a big data analytics/reporting architecture on Logi Analytics after evaluating different reporting tools.
- Proficient in working with cross-functional and cross-geographic teams on software development projects; motivational management style with a record of building and retaining highly motivated teams.
- Demonstrable success in delivery, supported by strengths in leadership, influence, communication, and decision-making.
- Regular contributor to the open-source community and technical blogs.
PROFESSIONAL EXPERIENCE:
Cloud/Data-Platform Lead Solutions Architect
Confidential
Responsibilities:
- Created a Redshift cluster using Docker and AWS.
- Designed data services around Node.js and ETL using Spark and Java on AWS.
- Created a data lake using S3 and Redshift and a processing layer using EMR and Spark.
- Set up monitoring and performance management for AWS services.
- Created data science models using Spark (PySpark, SparkR); see the sketch after this list.
- Created Terraform configurations for the AWS data platform covering services such as ELK, Postgres, Spark, and Lambda.
- Designed data collection strategies and an ETL architecture based on Informatica Cloud.
- Resolved performance bottlenecks in Redshift data loading and the Informatica ETL process.
- Deployed an analytics solution on Hadoop and integrated data science workloads.
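A minimal sketch of the kind of PySpark model-training job referenced above, run on EMR against the S3 data lake. The bucket path, column names, and model choice are illustrative assumptions, not the original implementation:

from pyspark.sql import SparkSession
from pyspark.ml import Pipeline
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.classification import LogisticRegression

spark = SparkSession.builder.appName("churn-model-sketch").getOrCreate()

# Hypothetical curated dataset landed in the S3 data lake by the ETL layer.
df = spark.read.parquet("s3://example-data-lake/curated/customer_features/")

# Assemble assumed numeric feature columns into a single vector column.
assembler = VectorAssembler(
    inputCols=["tenure_days", "monthly_spend", "support_tickets"],
    outputCol="features",
)
model = LogisticRegression(featuresCol="features", labelCol="churned")

pipeline = Pipeline(stages=[assembler, model])
fitted = pipeline.fit(df)

# Persist the fitted pipeline back to S3 for downstream scoring jobs.
fitted.write().overwrite().save("s3://example-data-lake/models/churn_lr/")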
Big Data Lead Solutions Architect
Confidential
Responsibilities:
- Set up a big data architecture using MongoDB for faster web access.
- Created a MongoDB cluster using Docker and AWS.
- Set up monitoring and performance management for MongoDB.
- Designed Java-based data access APIs (later migrated to microservices) and web services for accessing MongoDB/Hadoop data.
- Created Chef recipes for the AWS data platform covering services such as ELK, MongoDB, Spark, Kinesis, and Lambda.
- Migrated Oracle data to MongoDB collections and defined the data modeling and data acquisition.
- Designed data collection strategies and an ETL architecture based on the open-source, Java-based Talend platform.
- Deployed an analytics solution on Hadoop and integrated MongoDB data into Hadoop.
- Deployed a 15-node MapR cluster on AWS and integrated MongoDB with it.
- Built a Spark cluster to process MongoDB data and customized the MongoDB Hadoop connector to integrate with Hadoop; see the sketch after this section.
- Built automatic replication into Elasticsearch, acting as a MongoDB secondary, to avoid data duplication.
- Migrated existing batch processing to Lambda-based processing on AWS.
- Defined archival, backup, historical access, and disaster recovery processes.
- Established best practices and daily operational procedures around Talend, including clustering and parallelization.
- Automated the migration of the MongoDB cluster and associated ETLs to AWS using Chef and CloudFormation.
- Designed change data capture by streaming MongoDB server logs to Spark.
- Participated in evaluating reporting tools for data analysis; selected Logi Analytics after assessing it against all business needs.
- Designed a reporting library for Logi Analytics and its integration into the web interface.
- Defined replication/sharding strategies and migrated MongoDB servers to the cloud (AWS).
- Integrated MongoDB data into the Hadoop cluster using the MongoDB Hadoop connector to take advantage of Hadoop's faster processing.
Technologies: AWS, Java, open source, MongoDB, Oracle, HDFS, Hadoop (MapR), Elasticsearch, Chef
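A minimal sketch of pulling a MongoDB collection into Spark for the kind of processing described above. It assumes the MongoDB Spark connector (3.x-style API) is on the classpath; the format string and option keys vary by connector version, and the URI, database, collection, and column names are placeholders:

from pyspark.sql import SparkSession

# Connection details are placeholders; the connector jar must be supplied
# via --packages or the cluster classpath.
spark = (
    SparkSession.builder.appName("mongo-to-spark-sketch")
    .config("spark.mongodb.input.uri", "mongodb://mongo-host:27017/appdb.orders")
    .getOrCreate()
)

# Read the collection into a DataFrame.
orders = spark.read.format("mongo").load()

# Example downstream processing: aggregate order totals per customer.
summary = orders.groupBy("customer_id").sum("order_total")

# Hand the result to the Hadoop/Hive side as Parquet on HDFS.
summary.write.mode("overwrite").parquet("hdfs:///data/orders_by_customer")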
Data Science & Big Data Solutions Architect
Confidential, Florida
Responsibilities:
- Created an ETL strategy to load data from RDBMS (Sqoop), MongoDB (Hadoop connector), and Netezza.
- Created a uniform Spark layer to process data from different sources, including Hive, MongoDB, and Splunk.
- Integrated R into Spark to reuse existing R and Python models and to build new models in PySpark and SparkR.
- Created DW data models in Hive (ORC, Avro) with partitioning and bucketing for high performance; see the sketch after this list.
- Built a graph data use case for Medicare and predictive analysis using Spark GraphX.
- Created a data science platform using Microsoft R on Azure.
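A minimal sketch of the partitioned, bucketed ORC modeling mentioned above, issued through a Hive-enabled Spark session. The database, table, and column names are illustrative assumptions:

from pyspark.sql import SparkSession

spark = (
    SparkSession.builder.appName("hive-dw-model-sketch")
    .enableHiveSupport()
    .getOrCreate()
)

# Partition by claim year for partition pruning; bucket by member id so joins and
# aggregations on member_id keep related rows together. Names are illustrative.
spark.sql("""
    CREATE TABLE IF NOT EXISTS dw.claims (
        claim_id     STRING,
        member_id    STRING,
        claim_amount DOUBLE
    )
    PARTITIONED BY (claim_year INT)
    CLUSTERED BY (member_id) INTO 32 BUCKETS
    STORED AS ORC
""")

# A report query that benefits from partition pruning on claim_year.
totals = spark.sql("""
    SELECT member_id, SUM(claim_amount) AS total_paid
    FROM dw.claims
    WHERE claim_year = 2016
    GROUP BY member_id
""")
totals.show()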
Cloud & Big Data Infrastructure Architect
Confidential
Responsibilities:
- Created an effective archival process and low-cost solution on AWS and Hadoop.
- Converted mainframe data to ASCII format and moved it to Hive with parallelized loads; see the sketch after this list.
- Designed data collection strategies and an ETL architecture based on the open-source, Java-based Talend platform.
- Deployed an analytics solution integrating SAS and Hadoop through Hive.
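A minimal sketch of the parallel mainframe-to-Hive load described above, assuming the extracts have already been converted to fixed-width ASCII files on HDFS. The field offsets, paths, and table names are illustrative assumptions:

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder.appName("mainframe-ascii-to-hive-sketch")
    .enableHiveSupport()
    .getOrCreate()
)

# Each line of the ASCII extract is read as a single string column named "value";
# Spark parallelizes the read across the HDFS blocks of the extract files.
raw = spark.read.text("hdfs:///landing/mainframe/accounts/*.txt")

# Slice assumed fixed-width fields out of each record.
accounts = raw.select(
    F.trim(F.substring("value", 1, 10)).alias("account_id"),
    F.trim(F.substring("value", 11, 30)).alias("account_name"),
    F.substring("value", 41, 12).cast("decimal(12,2)").alias("balance"),
)

# Land the parsed records in Hive as ORC for downstream SAS/Hive reporting.
accounts.write.mode("overwrite").format("orc").saveAsTable("dw.mainframe_accounts")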
Data Solutions Architect
Confidential
Responsibilities:
- Implemented a Spark cluster over MongoDB to address performance issues.
- Created data services for MongoDB using Node.js.
- Helped move the architecture from in-house infrastructure to the cloud.
- Built a performance-tracking application based on ELK; see the sketch after this list.
- Helped scale the data layer and performance-tune data operations.
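A minimal sketch of the kind of ELK-based performance tracking mentioned above, using the official Python Elasticsearch client (8.x-style API). The endpoint, index name, and fields are illustrative assumptions:

from datetime import datetime, timezone
from elasticsearch import Elasticsearch

# Placeholder endpoint; in practice this points at the ELK cluster.
es = Elasticsearch("http://localhost:9200")

def record_operation(operation: str, duration_ms: float, status: str) -> None:
    """Index one performance sample so Kibana dashboards and alerts can pick it up."""
    es.index(
        index="app-performance",
        document={
            "@timestamp": datetime.now(timezone.utc).isoformat(),
            "operation": operation,
            "duration_ms": duration_ms,
            "status": status,
        },
    )

# Example usage: record a slow MongoDB query observed by the application.
record_operation("mongo.find.orders", 842.0, "slow")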
Big Data Solutions Architect
Confidential
Responsibilities:
- Enabled an IoT data pipeline and unstructured data sources; see the sketch after this list.
- Enabled social media and external marketing data and integrated it with the data lake.
- Enabled a Spark-based data processing engine to expedite reporting requirements.
- Integrated SAP data sets and HANA with the big data platform, streamlined ETL tools, and created an EDW to reduce the need for individual BW systems.
- Created a data platform to serve the company's five-year strategic needs.
- Helped scope this infrastructure on the cloud (Azure) using HDInsight as the platform.
- Helped establish ELK-based log collection, data governance, and change data capture systems.
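A minimal sketch of Spark-based processing of the IoT feed referenced above, assuming device events arrive as JSON on a Kafka topic. The broker, topic, schema, and output paths are placeholders, not the original design:

from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

spark = SparkSession.builder.appName("iot-stream-sketch").getOrCreate()

# Assumed shape of a device event.
event_schema = StructType([
    StructField("device_id", StringType()),
    StructField("reading", DoubleType()),
    StructField("event_time", TimestampType()),
])

# Read the raw Kafka stream (requires the spark-sql-kafka package on the classpath).
raw = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "iot-events")
    .load()
)

# Parse the JSON payload and land it in the data lake as Parquet for reporting.
events = raw.select(F.from_json(F.col("value").cast("string"), event_schema).alias("e")).select("e.*")

query = (
    events.writeStream.format("parquet")
    .option("path", "/data/lake/iot_events")
    .option("checkpointLocation", "/data/lake/_checkpoints/iot_events")
    .trigger(processingTime="1 minute")
    .start()
)
query.awaitTermination()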
Big Data Solutions Architect
Confidential
Responsibilities:
- Enabled an IoT data pipeline and unstructured data sources.
- Enabled social media and external marketing data and integrated it with the data lake.
- Built data pipelines into an AWS-based MapR cluster using AWS Batch and Lambda; see the sketch after this list.
- Enabled data science models on the data, with storage in Hive.
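A minimal sketch of the Lambda piece of such a pipeline: an S3-triggered handler that submits an AWS Batch job to process a newly landed file. The job queue and job definition names are placeholders:

import boto3

batch = boto3.client("batch")

def handler(event, context):
    """Triggered by S3 object-created notifications; submits one Batch job per new file."""
    records = event.get("Records", [])
    for record in records:
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]

        batch.submit_job(
            jobName="ingest-" + key.replace("/", "-"),
            jobQueue="data-pipeline-queue",       # placeholder queue name
            jobDefinition="mapr-ingest-job",      # placeholder job definition
            parameters={"input": f"s3://{bucket}/{key}"},
        )
    return {"submitted": len(records)}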
Data Integration Architect
Confidential
Responsibilities:
- Designed the Informatica integration layer between Maximis and Eagle.
- Defined the reporting structure based on Eagle and migrated existing reports to Eagle.
- Led a team of developers to analyze the existing DW structure and purge unused Informatica flows and data tables/columns.
- Transitioned support functionality to Confidential offshore partners and derived the transition plan.
- Worked on a POC to evaluate the SAP HANA in-memory solution for faster reporting on SAP.
Technologies: Informatica, Oracle PL/SQL, UNIX.
Big Data Integration Architect
Confidential
Responsibilities:
- Integrated order and quote HDFS files with trade regulatory systems.
- Replaced existing data acquisition processes with Informatica-based processes.
- Defined MapReduce jobs for data collection and Hive scripts for reports; see the sketch after this list.
- Designed Informatica mappings for HDFS file access.
- Replaced existing PL/SQL with Informatica.
- Ran highly complex MapReduce operations on HDFS.
- Tuned slow-running Informatica mappings for performance.
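A minimal sketch of the kind of MapReduce data-collection job mentioned above, written as Hadoop Streaming mapper and reducer functions in Python. The pipe-delimited order-file layout and field positions are assumptions; in practice the two functions would be wired to the -mapper and -reducer options of the Hadoop Streaming jar:

import sys

def mapper(lines):
    # Each input line is an assumed pipe-delimited order record: order_id|symbol|qty|price
    for line in lines:
        parts = line.rstrip("\n").split("|")
        if len(parts) >= 2:
            symbol = parts[1]
            print(f"{symbol}\t1")

def reducer(lines):
    # Input arrives sorted by key; sum the counts per symbol.
    current, count = None, 0
    for line in lines:
        symbol, value = line.rstrip("\n").split("\t")
        if symbol != current:
            if current is not None:
                print(f"{current}\t{count}")
            current, count = symbol, 0
        count += int(value)
    if current is not None:
        print(f"{current}\t{count}")

if __name__ == "__main__":
    # Run as `python orders_mr.py map` or `python orders_mr.py reduce`.
    (mapper if sys.argv[1] == "map" else reducer)(sys.stdin)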
Technical Architect
Confidential
Responsibilities:
- Led a team of technical staff to design and develop data integration between vendor products: trade/order and portfolio management (Aladdin), financial data warehouse (Eagle PACE), portfolio performance (Eagle Performance), and accounting (Maximis).
- Designed and implemented a data integrity and exception management solution using Informatica DVO to comply with regulatory and audit requirements.
- Defined an automated in-house trade/transaction tracking solution to identify failures and drive resolution, reducing systemic risk.
- Defined and implemented an Informatica-based data integration layer between all the different systems.
- Defined and designed the DW and reporting systems required for the business needs of investment users.
- Defined data reconciliation applications using Informatica DVO, SAP Recon, and Eagle Recon.
- Helped reduce license costs by replacing the CheckFree asset/cash reconciliation solution with an in-house application.
- Implemented advanced Oracle PL/SQL to optimize batch job run times and meet business SLAs.
- Built a Hadoop POC for Confidential to collect and report sensor data generated from Drivewise.
- Built a NoSQL POC for migrating an IMS database to MongoDB.
- Led an offshore-onsite team spanning different organizations.
Technologies: Informatica, Oracle PL/SQL, UNIX.
Software Engineer
Confidential
Responsibilities:
- Provided production support for the CES application, involving analysis and troubleshooting.
- Performed maintenance activities, resolving runtime issues arising in the code.
- Modified existing code according to user requirements.
- Catered to ad hoc and time-critical requests from application users.
- Coordinated with team leaders to provide task status and ensure quality of deliverables.
Technologies: Java, Oracle PL/SQL, UNIX.
