Sr Data Engineer Resume
San Mateo, CA
SUMMARY
- I love my career as a technology professional: consulting for a wide range of organizations has given me over 12 years of experience focused on the development, design, integration, and migration of a wide range of projects.
- Enjoy working with a team to “do” solution deliveries, business technology/process solutioning, assessments, reviews, and best-for-client practices, where I continue to learn.
- Enjoy being challenged by unique and complex business processes and problems, and coming up with the most effective and efficient solutions to those problems.
- Also enjoy mentoring folks and guiding them toward what they would like to achieve. I always aspire to be the “go-to” person on the team when it comes to problem solving.
- Take deep enjoyment in using my ever-growing, amazingly broad experience across Big Data, cloud computing (AWS), Python, Hive, Spark, Scala, Java, and C#.
TECHNICAL SKILLS
Operating Systems: Linux (Ubuntu), UNIX, Apple, Windows
Shell Scripting: UNIX shell scripting, bash, MS-DOS
IDEs: IntelliJ, Eclipse, NetBeans, MS Visual Studio
Languages: Java, Python, Scala, C#; MVC, web services, RESTful web services, JavaScript, jQuery
Databases & Tools: PL/SQL, NoSQL (HBase), MySQL, MS SQL Server 7.0/2000/2005/2008, MS Access, Oracle 9i, SQL Server Integration Services (SSIS), Sqoop, SQL Server Analysis Services
Big Data Platform/Tools: Hadoop, MapReduce, Cascading, Pig, Scala, Spark, Spark Streaming, Hive, Flume, Sqoop, HBase, ZooKeeper, Kafka, Oozie, Presto, Qubole, Zeppelin notebooks
TDD Framework: MRUnit, MSUnit
Tools: Subversion, CVS, Maven, ANT, SoapUI
Continuous Integration: Jenkins
Hadoop Distributions: Cloudera (CDH v5.x)
Versioning Tools: Git, Visual SourceSafe (VSS), ClearCase, TFS
Reporting Tools: Crystal Reports 10.0, Data Reports
PROFESSIONAL EXPERIENCE:
Confidential - San Mateo, CA
Sr Data Engineer
Responsibilities:
- Worked on two mission-critical projects involving a product-information data pipeline, wherein we had to determine an entropy number for each product that helps decide whether it is good enough to be recommended to Confidential .com customers.
- The entropy calculation brings in quantity-on-hand and product weight and does some fairly complex math to arrive at a number. Made use of Spark and Hive for most of this project.
- The other project was email recommendation: a suite of several algorithms put together that help pull the list of subscribers who need recommendation emails, and a list of products to recommend to each subscriber based on their personal preferences.
- It pulls several products for each subscriber based on their product views, historical purchases, teams followed, items bought together, the aforementioned entropy, etc. The final list of products is then ranked and recommended to the customers.
- Also performance-tuned some critical workflows that were running slow and had a lot of downstream dependencies.
- All of the above-mentioned tasks/projects were accomplished using a wide variety of tools and technologies: Hadoop, Hive, Spark, Python, Java, Scala, Git, Maven, Presto, Qubole, bash scripting, and Zeppelin notebooks, to name a few.
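The entropy scoring described above can be illustrated with a minimal sketch. The production math combining quantity-on-hand and product weight is proprietary and not reproduced here; this shows only the classic Shannon-entropy piece, with the function name and input shape as assumptions:

```python
import math

def product_entropy(quantities_on_hand):
    """Shannon entropy over a product's per-location stock distribution.

    Illustrative only: the real pipeline combined quantity-on-hand and
    product weight in more complex math; this sketches the entropy term.
    """
    total = sum(quantities_on_hand)
    if total == 0:
        return 0.0
    probs = [q / total for q in quantities_on_hand if q > 0]
    return -sum(p * math.log2(p) for p in probs)

# Evenly stocked products score higher (more "spread"); a pipeline like
# this might treat a higher score as safer to recommend.
even = product_entropy([25, 25, 25, 25])   # maximal spread
skewed = product_entropy([97, 1, 1, 1])    # nearly all stock in one place
```

Products whose score clears a threshold would then be eligible for recommendation; the threshold itself would be tuned against business outcomes.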
Confidential - San Francisco, CA
Senior Data Engineer
Responsibilities:
- Worked on a mission-critical data-privacy migration project, wherein we built a separate data pipeline and associated storage space for PII (Personally Identifiable Information) data pertaining to Confidential users.
- It was a mammoth project that involved thorough analysis, putting together a list of all pipelines (among hundreds of workflows) that dealt with PII data, and subjecting them to the data-privacy migration.
- We host and own some of the core datasets (several petabytes in size) that can tell stories from the data to the CEO and CTO, ranging from ‘how Confidential is doing with its users’ to ‘what the most-tagged Pins are’. DAU, WAU, MAU, and total engagement (impressions) are among the top core metrics that everybody is interested in, and we own those metrics.
- Set up offline data-consistency checkers, called Pinthagoras, to ensure the data ingested day to day has integrity.
- Validate checker alerts when something is off, dig deep to figure out the root cause, and come up with a permanent fix.
- Manage the Experiments (A/B test) framework and the related dashboard. This also involves adding new metrics to the dashboard.
- Build new pipelines and related infrastructure changes conforming to ever-changing business requirements, thereby facilitating the business.
- Help folks from a variety of teams build their funnels (experiment trackers) and troubleshoot when something goes wrong and users do not see what they are supposed to.
- Also worked on a keystone project named Monarch Migration. Confidential built its own in-house Hadoop cluster, called Monarch, in order to cut its huge spending on a third-party cluster. This migration involved extensive analysis, planning, mocking, testing, and converting all the existing data pipelines to be Monarch-ready.
- This migration effort often required performance tuning of long-running Hive and Spark jobs using tools like Dr. Elephant.
- Use compressed file formats like Parquet and ORC wherever appropriate, which can reduce storage space drastically and translate into cost savings with AWS.
- All of the above-mentioned roles were performed with a wide variety of tools and technologies: Hadoop, Hive, Spark, Python, Java, Scala, Git, Jenkins, Maven, Presto, Qubole, and bash scripting, to name a few.
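An offline consistency checker in the spirit of Pinthagoras can be sketched as a per-day row-count comparison between a source-of-truth table and its ingested copy. The function name, record shapes, and tolerance below are assumptions, not the actual implementation:

```python
def check_consistency(source_counts, ingested_counts, tolerance=0.01):
    """Compare per-day row counts between a source and its ingested copy.

    Returns the dates whose ingested count deviates from the source by
    more than `tolerance` (as a fraction), so an engineer can dig into
    the root cause. Threshold and shapes are illustrative only.
    """
    alerts = []
    for day, expected in source_counts.items():
        actual = ingested_counts.get(day, 0)
        if expected == 0:
            if actual != 0:
                alerts.append(day)
            continue
        if abs(actual - expected) / expected > tolerance:
            alerts.append(day)
    return alerts

# Hypothetical daily counts: the second day drifted well past 1%.
source = {"2019-01-01": 1_000_000, "2019-01-02": 1_050_000}
ingested = {"2019-01-01": 999_500, "2019-01-02": 890_000}
print(check_consistency(source, ingested))  # -> ['2019-01-02']
```

In practice such checks run as a scheduled offline job, with alerts feeding an on-call queue rather than a print statement.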
Confidential - San Ramon, CA
Senior Hadoop Developer
Responsibilities:
- Held frequent meetings with the business and related stakeholders to go over requirements and feasibility.
- Responsible for setting up a multi-node cluster, cluster maintenance, adding and removing cluster nodes, cluster monitoring and troubleshooting, managing and reviewing data backups, and managing and reviewing Hadoop log files.
- Handled ETL of data from various data sources: performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from Teradata into HDFS using Sqoop.
- Analyzed the data by running Hive queries, Scala on Spark, and Pig scripts to understand user behavior.
- Loaded and transformed large sets of structured, semi-structured, and unstructured data.
- Responsible for managing data coming from different sources.
- Experience managing Hadoop processes using init scripts and manually.
- Experience with HDFS and MapReduce maintenance tasks, including adding a DataNode/TaskTracker, checking filesystem integrity using fsck, and balancing HDFS block data.
- Experience working on production support and maintenance-related projects.
- Exposure to Maven/Ant and Git, along with shell scripting, for the build and deployment process.
- Experience using HCatalog with Hive, Pig, and HBase.
- Exposure to NoSQL databases (Cassandra).
- Experience working with Jenkins CI.
- Experience writing MapReduce joins, such as map-side joins using the DistributedCache API.
- Experience planning, designing, and developing applications spanning the full software development life cycle: writing functional specifications, designing, implementing, documenting, unit testing, and support.
- The project also involved designing and creating highly secured RESTful web services to expose providers' Quality of Service (QoS) data to health plans via SOAP.
- The services were developed and put in place with high-end security, since the data that goes back and forth is highly confidential. The service responds quickly to requesting health plans, taking all the necessary parameters such as plan, requesting entity, and provider.
- Performed analysis on the data based on the requirements given by the business.
- Held meetings with the business team to learn about their expectations and looked for possibilities to exceed them at all times.
- Designed, developed, coded, and tested new use cases and made changes to existing ones.
- Planned for deployment.
BI Pro Environment: Hadoop, Java, Eclipse IDE, JDK, MapReduce, Hive, Sqoop, Pig, Spark, Scala, Flume, Oozie, Linux
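The map-side joins noted in this role work by shipping the small table to every mapper (via the DistributedCache API) so the join needs no reduce phase. A Python sketch of the same idea follows; the tab-separated record shapes are assumptions:

```python
def setup_mapper(cached_lines):
    """Build an in-memory lookup from the small, cached side of the join
    (what a Hadoop mapper does in setup() with a DistributedCache file)."""
    lookup = {}
    for line in cached_lines:
        key, value = line.rstrip("\n").split("\t", 1)
        lookup[key] = value
    return lookup

def map_side_join(big_records, lookup):
    """Join each large-side record against the lookup inside the map phase."""
    for line in big_records:
        key, rest = line.rstrip("\n").split("\t", 1)
        if key in lookup:            # inner join: unmatched keys are dropped
            yield f"{key}\t{lookup[key]}\t{rest}"

# Small dimension table (shipped to every mapper) and a big fact stream.
users = setup_mapper(["u1\tAlice", "u2\tBob"])
events = ["u1\tlogin", "u3\tlogin", "u2\tpurchase"]
print(list(map_side_join(events, users)))
```

This trades mapper memory for the shuffle a reduce-side join would need, which is why it only suits joins where one side is small.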
Confidential - San Francisco, CA
Responsibilities:
- Analyzed large data sets to determine the optimal way to aggregate and report on them.
- Implemented best income logic using Pig scripts.
- Participated in the requirement gathering and analysis phase of the project, documenting the business requirements by conducting workshops/meetings with various business users.
- Managed and scheduled jobs on a Hadoop cluster using Oozie.
- Moved log files generated from various sources to HDFS for further processing through Flume.
- Involved in loading data from the UNIX file system to HDFS.
- Worked on the Hue interface for querying the data.
- Created Hive tables to store the processed results in the intended format.
- Created HBase tables to store data coming from different portfolios in variable formats.
- Transformed data from mainframe tables to HDFS and HBase tables using Sqoop and Pentaho Kettle.
- Developed simple to complex MapReduce jobs using Hive and Pig.
- Optimized MapReduce jobs to use HDFS efficiently by using various compression mechanisms.
- Handled importing data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from MySQL into HDFS using Sqoop.
- Analyzed the data by performing Hive queries and running Pig scripts to study customer behavior.
- Used UDFs to implement business logic in Hadoop.
- Implemented test scripts to support test driven development and continuous integration.
- Responsible for managing data coming from different sources.
- Experienced in loading and transforming large sets of structured, semi-structured, and unstructured data.
- Exported the analyzed data to relational databases using Sqoop, for visualization and to generate reports for the BI team.
- Have deep and thorough understanding of ETL tools and how they can be applied in a Big Data environment.
Environment: Java, Eclipse, Hadoop, MapReduce, HDFS, Sqoop, Flume, Oozie, HBase, WinSCP, Kafka, Storm, Linux, UNIX shell scripting, Hive, Pig, ZooKeeper, Cloudera (Hadoop distribution)
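One common way to express UDF-style business logic in Hadoop from Python is Hive's TRANSFORM clause, which streams tab-separated rows to a script over stdin/stdout. A minimal sketch follows; the column layout and bucketing rule are assumptions, not the actual business logic:

```python
def classify(amount):
    """Hypothetical business rule: bucket a purchase amount."""
    if amount >= 500:
        return "high"
    if amount >= 50:
        return "medium"
    return "low"

def transform(line):
    """Turn one tab-separated input row (customer_id, amount) into the
    (customer_id, amount, bucket) row that the TRANSFORM clause reads back."""
    customer_id, amount = line.rstrip("\n").split("\t")
    return f"{customer_id}\t{amount}\t{classify(float(amount))}"

# In production the script would loop over sys.stdin and print each
# transformed row; here we feed a couple of sample rows directly.
for row in ["c1\t600.00", "c2\t19.99"]:
    print(transform(row))
```

Hive would invoke such a script with something like `SELECT TRANSFORM(customer_id, amount) USING 'python classify_spend.py' AS (customer_id, amount, bucket) FROM purchases;` (table and script names hypothetical).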
Confidential - San Francisco, CA
Senior Developer
Responsibilities:
- Architected and developed the EPAI outbound QA automation solution.
- Drafted the conceptual, logical, and physical models for the solution.
- Delivered the solution successfully despite the tight deadline.
- Coded efficiently and in conformance with SFHP's coding standards.
- Developed reusable tools for configuring trading partners, test cases, and the mapping between the two.
- Unit tested each module as it was completed.
- Designed and developed a web application using HTML, CSS, jQuery, Struts, and tag libraries.
- Used DWR (Direct Web Remoting) to let browser JavaScript call Java code.
- Responsible for implementing Struts validation and Action classes.
- Developed all the application modules based on the Spring IoC pattern.
- Developed Hibernate ORM mappings and wrote JPA queries.
- Worked on frameworks that use Maven dependencies.
- Worked on SQL scripts for data access, such as selection, insertion, and deletion of data in an Oracle database.
- Ingested provider QoS data into the Hadoop platform and performed extensive analytics to produce results describing provider practice improvement metrics and pointers.
- Involved in enforcing development standards; wrote several common functions and procedures to be utilized by other team members.
- Used the Log4j logging API to log errors and messages.
Programming Languages/Tools: Java, jQuery, Hadoop, Hive, Pig, Flume, Spark, Apache Tomcat, Struts, Servlets, DWR, JUnit, LDAP, Active Directory, Spring, Maven, XML, web services, SVN, Oracle 11g, HTML5, JavaScript, XSLT, CSS, MVC 4, AJAX
Confidential
Technical Lead
Responsibilities:
- Architected and developed an MVC application using MVC features, jQuery, and SQL Server 2008.
- Performed requirement analysis for the new payer implementations using the IS Engine.
- Analyzed the requirements and prepared the technical and functional specification documents for the 270, 276, 274, and 278 transactions.
- The actual IntelliSearch solution was a Windows service developed in Java; it was recently migrated to the WCF framework. It talks with the web, then takes the returned HTML and converts it back to XML using XPath and XQuery. There is also a web portal, developed on an MVC architecture with JavaScript and CSS, wherein all the provider credentials used to log in to the payer portals are managed.
- The portal is a 100% MVC-based solution, with a clear bifurcation between the views, the data, and the corresponding controllers.
- All payer-wise configurations go into a SQL Server 2008 DB server.
- Successfully designed, developed, and unit tested payer implementations, and also managed change requests, code defects, and data configuration bugs for all real-time transactions.
- Drove the offshore development team of 10 members.
- Helped the offshore dev team with design, development, and testing to solve complex issues.
- Involved in code review, release management, production support, and other highly productive activities.
- Also took an active part in developing innovative tools/apps for IntelliSearch.
- Involved in bringing up a Hadoop environment, for the first time on the client's premises, for a pioneering project, and helped them analyze large-scale X12 transaction data and come up with intelligence to take the business in a new direction and set foot in new areas.
- Managed code integration and versioning of the application using SVN.
- Prepared test data and test case documents, and performed unit and integration testing for the 270 transaction.
- Recently developed a tool called IS Visualize (Windows app), used as a debugging tool to troubleshoot issues related to real-time transaction processing with IntelliSearch. It also bagged the “Innovation” award given by the Healthcare vertical.
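The HTML-to-XML extraction described in this role (querying a fetched page with XPath and re-emitting structured XML) can be sketched in Python with the standard library. The markup, field names, and member IDs below are entirely made up, and `xml.etree.ElementTree` supports only a subset of XPath:

```python
import xml.etree.ElementTree as ET

# Hypothetical payer-portal response, already tidied into well-formed markup.
PAGE = """
<html><body>
  <table id="eligibility">
    <tr><td class="member">M12345</td><td class="status">Active</td></tr>
    <tr><td class="member">M67890</td><td class="status">Termed</td></tr>
  </table>
</body></html>
"""

def scrape_eligibility(markup):
    """Pull (member, status) pairs out of the page with XPath-style queries
    and re-emit them as a small XML document, as a screen-scraping engine might."""
    root = ET.fromstring(markup)
    out = ET.Element("eligibility")
    for row in root.findall(".//table[@id='eligibility']/tr"):
        member = row.find("td[@class='member']").text
        status = row.find("td[@class='status']").text
        ET.SubElement(out, "record", member=member, status=status)
    return ET.tostring(out, encoding="unicode")

print(scrape_eligibility(PAGE))
```

A production engine would first repair real-world HTML into well-formed markup before any XPath query, which is usually the hard part of this kind of scraping.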
Programming Languages/Tools: Java, Maven, Eclipse, Hadoop, Hive, Flume, Windows services, WCF, XSD, XML, HTML5, JavaScript, XSLT, CSS, MVC 4, jQuery, AJAX
Confidential
Associate - Projects
Responsibilities:
- Analyzed the requirements and prepared the technical and functional specification documents for the 270, 276, 274, and 278 transactions.
- Incorporated 271/274/276/278 (X217, X215, X216) response business logic in C#.NET and stored procedures, and handled transactions to and from the database using ADO.NET and the Microsoft Application Block SqlHelper.
- Implemented SSIS packages, thereby tuning the performance of the 5010 payer data load (270 and 276) process from 36 hours to 12 hours for a set of 165 files.
- Incorporated the response-building logic in stored procedures using advanced SQL Server 2008 XML features.
- Successfully managed the change requests, code defects, and data configuration bugs for all real-time transactions.
- Helped the 5010 real-time business actuarial team, production support team, and testing team solve complex issues.
- Managed code integration and versioning of the application with Team Foundation Server (TFS).
- Prepared test data and test case documents, and performed unit and integration testing for the 270, 276, and 278 (X217, X215, and X216) transactions.
Programming Languages: VB.NET, C#.NET, web services, Windows services, XSD, XML, HTML, JavaScript, XSLT, CSS, ASP.NET
Database & Tools: SQL Server 2008 & SQL Server Integration Services
Tools used: Symphonia
Confidential
Technical Support Expert
Responsibilities:
- Played the role of Technical Support Specialist for a healthcare product called Clarus. As part of product support activities, I engaged in getting assigned support tickets, then analyzing, resolving, testing, and delivering them. We stayed in constant touch with clients to learn their feedback on product usage and issue resolution.
Programming Languages: .NET 1.1, C#.NET, VB.NET, ADO.NET, VS.NET 2003, XML, XSLT, XSD, XPath, Visual SourceSafe
Database & Tools: SQL Server 2000
Confidential
Software Engineer
Responsibilities:
- Involved in gathering business requirements and preparing the functional and technical specifications and system design document for UHS.
- Prepared the documentation for all business analysis algorithms for UHS.
- Designed and documented flows and functional diagrams using Visio.
- Designed and implemented the user interface using XML and a custom-built page generator application.
- Incorporated business logic into stored procedures and handled transactions to and from the database.
- Involved in preparing unit test cases and performed unit and integration testing.
- Deployed the application on test and production servers.
Programming Languages: XML, SQL Server, C#.NET, JSP, NT Service, Visual SourceSafe
Confidential
Developer/ Team Member
Responsibilities:
- Involved in regular project meetings for project and issue discussion.
- Developed the GUI using C#.NET.
- Clarus as such was purely a suite of Windows-based applications catering to the needs of a typical mid-sized healthcare provider; in other words, a healthcare ERP product.
- Incorporated business logic into the application using C#.NET and stored procedures using SQL Server 2005.
- Developed business objects using Crystal Reports.
- Analyzed, developed, and fixed major issues raised from the client end.
- Involved in preparing unit test cases and performed unit and integration testing.
Programming Languages: .NET 1.1, ASP.NET, C#.NET, VB.NET, ADO.NET, WinForms, VS.NET 2003, VSS & UML
