SME / Big Data / AI Architect Resume
PROFESSIONAL SUMMARY:
- 14 years of experience providing solution approaches for IT and business; earning client trust by working with clients to identify and resolve their problems with efficient solutions, thereby winning new business and expanding existing business.
- 12 years of experience leading solution approaches for IT, business, and data analysis in healthcare, insurance, financial services, information technology, e-commerce, and other domains.
- 12 years of experience in business analysis, architecture/design, big data, cloud, and SOA architectures, organization and IT/SOA governance, and information security for enterprise business process, business rule, portal, knowledge, content, research, storage, and archival systems, delivering the right release on or ahead of the deadline and under budget.
- Strong leadership of IT programs/projects for government PMOs and corporate CEOs, CIOs, and CTOs.
- Strong customer relationships, delivering from the vision and strategy phase through the development of business and technical solutions that create value for the customer's business.
- Experience providing cloud architecture solutions, leading cloud design, and enabling execution of big data, data science, ML/AI, and private, public, and hybrid cloud initiatives using AWS, Azure, and Google Analytics.
- Led compliance assessments with the customer on Cloud and Hosting Architecture.
- Led the migration approach to lift and shift workloads to the cloud or to architect greenfield development and/or production platforms for new applications.
- Responsible for transformation solution development, competitive costing, and business case alignment supporting the client's business.
- Articulated all aspects of the transformation solution and communicated value and outcomes to the client.
- Contributed to win strategies and the definition of win themes, including business case development and solution approach.
- Responsible for opportunity analysis, transformation solution design and development, solution leadership, solution integration, and client/customer relationship management.
- Executed limited Confidential where necessary, or supported those performed by the customer.
CORE COMPETENCIES:
- Project Management
- Software & Application Design
- System Implementations
- Big Data / Hadoop Ecosystem Technologies
- Data Warehousing & Administration
- Troubleshooting & Technical Support
- Database Design & Implementation
- Systems Assessment & Configuration
- Data Modeling Expert - Star & Snowflake Schema Designs
PROFESSIONAL EXPERIENCE:
Confidential
SME / Big Data / AI Architect
Responsibilities:
- Working as an Artificial Intelligence / Big Data Architect with the Digital Transformation team to provide quality technical solutions to multiple clients on analytics and Big Data/Hadoop technologies, with a primary focus on information and big data architecture.
- Defined the migration strategy from the existing data warehousing ETL platform to a Hadoop data lake for large customers and to a scalable NoSQL database for mid-range customers.
- Designing and developing data mining/analytics solutions and data-centric integration, and developing and maintaining business analytics.
- Adept in data querying, data migration, data analysis, predictive modeling, machine learning, data mining, and data visualization, with extensive use of SQL, Python, R, Java, and Unix shell scripting on platforms including Toad, Oracle Developer, Jupyter Notebook, PyCharm, RStudio, Tableau, and Hadoop with Spark.
- Led and designed Hadoop ingestion patterns for the DATA PI customer data initiative. Performed two Confidential with the financial corporation's two distinct Hadoop lakes (Enterprise Data Lake - EDL and TenX) and weighed options for moving forward with one of the lakes to hold the data. Designed and implemented ingestion patterns with agreement from all stakeholders, handling data on multiple fronts.
- Tracked back to the true source to obtain raw data, defined effective extract patterns, and fed them into the ingestion mechanism, ensuring transparency in terms of data lineage and business lineage.
- Provided technical road maps, technology feasibility assessments, and required technical expertise, clearing ambiguities in terms of implementation, results, and outcomes.
- Laid down data zoning, data journey, lineage, transformation, and business intelligence best practices. Produced reusable designs and code for teams to replicate (see the ingestion sketch after this list) and assisted them in clearing roadblocks with the above-mentioned technologies and tools.
- Hands-on with code and design, handling release management activities with code migration and automation. Provided recommendations for process integration across lines of business and business capabilities.
- Collaborated with the EA team on enterprise architecture best practices and business solutions. Took on new areas of the technology space in Confidential upfront to provide technical feasibility analysis and benchmarking.
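A minimal PySpark sketch of the kind of reusable raw-zone ingestion pattern referenced above; the feed name, paths, and lineage columns (src_feed, src_file, ingest_ts) are hypothetical placeholders, not the project's actual implementation.

# Minimal PySpark sketch of a reusable raw-zone ingestion pattern with lineage stamping.
# Feed names, paths, and lineage column names are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("edl-ingest-pattern").getOrCreate()

def ingest_feed(feed_name, source_path, raw_zone="/data/lake/raw"):
    """Read a delimited extract from the true source and land it in the raw zone."""
    df = spark.read.option("header", "true").csv(source_path)
    # Stamp lineage columns so downstream zones can trace every record to its true source.
    df = (df.withColumn("src_feed", F.lit(feed_name))
            .withColumn("src_file", F.input_file_name())
            .withColumn("ingest_ts", F.current_timestamp()))
    # Persist as Parquet in the raw zone, partitioned by feed.
    (df.write.mode("append")
       .partitionBy("src_feed")
       .parquet(raw_zone + "/" + feed_name))

ingest_feed("customer_master", "/landing/customer_master/*.csv")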
Confidential, Jersey City, NJ
Big Data Architect/SME
Responsibilities:
- Responsible for understanding requirements from the business team and analyzing the data sources required for the needed metrics.
- Interacted with source feed owners to prepare the channel and establish the connection and frequency for ingesting source data using Sqoop incremental imports and Flume data streams.
- Participated in the architecture, design, and implementation of large-scale data solutions, building assets and offerings in this area and working at client sites to prove and further mature these skills and assets through pioneering engagements.
- Validated the architecture, design, and implementation of big data, NoSQL, and traditional hybrid full-scale solutions covering data wrangling, ingestion, movement, storage, transformation, security, data management, and analysis using big data technologies.
- Produced high-level designs covering tables, data ingestion, data cleaning, and data hierarchy, with data managed in Parquet file format with Snappy compression.
- Maintained and worked with the data pipeline that transfers and processes several terabytes of data using Spark, Scala, Python, Apache Kafka, Pig/Hive, and Impala.
- Used Spark API over Hadoop YARN to perform analytics on data
- Implemented performance optimization of existing algorithms in Hadoop using Spark Context, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
- Provided solution for overall approach to risk data aggregation from all risk engine data sources.
- Architected, designed and led the development of the real time risk data aggregation and analytics engine for the Hadoop cluster running on Cloudera platform.
- Led development of real-time risk engine analytics using Java and Data Integration technologies, using Agile methodologies for planning and execution with daily stand-ups and bi-weekly sprints, reviews and retrospectives.
- Provided architectural and tactical solutions to resolve constraints in using Parquet (columnar file storage with compression) in Hive while integrating with Oracle and Netezza data sources.
- Performed performance enhancements on risk engine algorithms to efficiently fit into Hadoop technologies for real-time streaming and ingestions into Hadoop cluster.
- Led implementation of integrating corporate scheduling system (Auto-sys) with Oozie for efficient and seamless enterprise scheduling within the Big-data platform.
- Designed and architected proposals for big data/Hadoop projects.
- Defined database design standards.
- Performed capacity planning and management.
- Prepared the project plan and ran daily/weekly status calls with project teams.
- Delivered project deliverables using the Agile software development methodology.
- Successfully implemented migrations from Oracle to Hadoop and MemSQL.
- Migrated data from Oracle to MemSQL and Hive databases using Spark (Scala); a sketch of this pattern follows this list.
- Set up database, Hadoop, and Spark connections in Talend.
- Built Talend bulk-load and parallel-session jobs to perform bulk loading.
- Converted data models from Oracle to MemSQL and Hive tables; performed performance tuning of MemSQL and Hive.
- Validated the metrics generated using HBase and Solr and provided recommendations.
- Responsible for identifying risks and issues and improving processing speed by using tools such as Impala in place of Hive. Designed, implemented, and deployed ETL to load data into HDFS and Hive tables. Supported the security team in implementing the four pillars of Hadoop security.
- Integrated VM platform with QlikView and Qlik Sense for BI and visualization
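The Oracle-to-Hive migration pattern mentioned above, sketched in PySpark purely for illustration (the project itself used Spark with Scala and Talend); the JDBC URL, credentials, and table names are hypothetical placeholders.

# Illustrative PySpark sketch of the Oracle-to-Hive migration pattern (the project used
# Spark with Scala); JDBC URL, credentials, and table names are hypothetical placeholders.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("oracle-to-hive-migration")
         .enableHiveSupport()
         .getOrCreate())

# Read the source table from Oracle over JDBC, splitting the read across partitions.
positions = (spark.read.format("jdbc")
             .option("url", "jdbc:oracle:thin:@//oracle-host:1521/ORCL")
             .option("dbtable", "RISK.POSITIONS")
             .option("user", "etl_user")
             .option("password", "********")
             .option("partitionColumn", "POSITION_ID")
             .option("lowerBound", "1")
             .option("upperBound", "100000000")
             .option("numPartitions", "16")
             .load())

# Write into a Hive table stored as Parquet with Snappy compression; a MemSQL target
# would follow the same read-then-write shape through its Spark connector.
(positions.write
 .mode("overwrite")
 .format("parquet")
 .option("compression", "snappy")
 .saveAsTable("risk_lake.positions"))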
Environment: Hadoop, Hortonworks, NoSQL, MapReduce, Hive, Sqoop, AWS, Spark, QlikView, Informatica, YARN, Core Java, Groovy, Oracle, DB2; Hadoop ecosystem CDH 5.7 (HDFS, Hive, Shell, Python, Spark Core, Spark SQL, Spark Streaming, Impala, HBase, Flume, Sqoop, Solr)
Confidential, Jersey City, NJ
Responsibilities:
- Managed parsing of high-level design specs into simple ETL coding and mapping standards.
- Worked on the Informatica PowerCenter tool: Source Analyzer, Data Warehousing Designer, Mapping & Mapplet Designer, and Transformation Designer.
- Managed the team working on complex mappings using Unconnected Lookup, Sorter, Aggregator, newly changed Dynamic Lookup, and Router transformations to populate target tables efficiently.
- Extensively used Power Center to design multiple mappings with embedded business logic.
- Created transformations such as Joiner, Rank, and Source Qualifier in the Informatica Designer.
- Created Mapplets and used them in different mappings.
- Maintained all repositories of various applications using Informatica Repository Manager; created users, user groups, and security access controls.
- Good knowledge of physical and logical data models.
- Provided Knowledge Transfer to the end users and created extensive documentation on the design, development, implementation, daily loads and process flow of the mappings.
- Maintained development, test, and production mapping migrations using Repository Manager, and also used Repository Manager to maintain metadata, security, and reporting.
- Responsible for building scalable distributed data solutions using Hadoop.
- Followed Agile methodology, used Rally for user story management and iteration planning, and actively participated in daily scrum meetings.
- Understand the business needs and implement the same into a functional database design.
- Performed data quality analysis to determine cleansing requirements. Designed and developed Informatica mappings for data loads.
Environment: Informatica PowerCenter 9.6 (Source Analyzer, Data Warehouse Designer, Mapping Designer, Mapplet, Transformations, Workflow Manager, Workflow Monitor), Oracle 10g, MQ Series, Erwin 3.5, PL/SQL, Windows 7/2000.
Confidential, Jersey City, NJ
Big Data Architect
Responsibilities:
- Working as a Big Data Architect with the Digital Transformation team to provide quality technical solutions on analytics and Big Data/Hadoop technologies, with a primary focus on information and big data architecture.
- Managed the Confidential USA big data team to implement the big data architecture initiative and data lake on Cloudera CDH 5.3.2 for Confidential North America.
- Managed two big data platforms, BigPlay and DaaS, to support the whole lifecycle of data projects.
- The BigPlay platform is primarily used by data scientists to dig into real data for test-and-learn, define algorithms, and develop predictive models that bring value to the business.
- The DaaS platform is the production environment, where predictive models are packaged and deployed as applications (illustrated in the sketch after this list).
- The applications run automatically to provide fresh inputs to the targeted business operational systems.
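A hedged PySpark MLlib sketch of the BigPlay-to-DaaS hand-off described above: a model is fitted on the exploration platform, persisted, then reloaded and scored by the packaged production application. The feature columns, label, table names, and paths are hypothetical placeholders, not the actual platform code.

# Illustrative sketch of the BigPlay-to-DaaS model hand-off; feature columns, label,
# table names, and paths are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.ml import Pipeline, PipelineModel
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.classification import LogisticRegression

spark = SparkSession.builder.appName("bigplay-model").enableHiveSupport().getOrCreate()

# Test-and-learn on BigPlay: fit a predictive model on curated training data.
train = spark.table("bigplay.training_set")
pipeline = Pipeline(stages=[
    VectorAssembler(inputCols=["recency", "frequency", "monetary"], outputCol="features"),
    LogisticRegression(labelCol="churned", featuresCol="features"),
])
model = pipeline.fit(train)
model.write().overwrite().save("/models/churn/v1")  # artifact promoted to DaaS

# On DaaS: the packaged application reloads the pipeline and scores fresh data on a schedule.
scorer = PipelineModel.load("/models/churn/v1")
(scorer.transform(spark.table("daas.scoring_input"))
       .select("customer_id", "prediction")
       .write.mode("overwrite")
       .saveAsTable("daas.churn_scores"))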
Environment: CDH 5.4.5, Pig, Hive 0.13.1, Beeline 0.13.1, Sqoop 1.4.5, Flume, Oozie 3.2, Sentry 5.3, Impala 2.1.2, Hue 3.7.0, MapReduce, Cloudera Navigator, NoSQL HBase, Solr 5.2.1, Spark, R 3.1.2, Python 2.6.6.
Confidential, New York, NY
Senior Hadoop Developer
Responsibilities:
- Responsible for managing data coming from different sources; involved in HDFS maintenance and the loading of structured and unstructured data.
- Integrated the scheduler with Oozie workflows to pull data from multiple data sources in parallel using fork.
- Created Data Pipeline of MapReduce programs using Chained Mappers.
- Experience utilizing Spark machine learning techniques implemented in Scala.
- Wrote optimized Hive queries for both batch processing and ad-hoc querying.
- Implemented optimized joins across different data sets to obtain prospect ZIP data using MapReduce.
- Implemented complex MapReduce programs to perform map-side joins using the Distributed Cache in Java.
- Created the high-level design for the data ingestion and data extraction module, and enhanced the Hadoop MapReduce job that joins the incoming slices of data and picks only the fields needed for further processing.
- Documented and explained implemented processes and configurations for upgrades.
- Worked with application teams to install operating system and Hadoop updates, patches, and version upgrades as required.
- Processed results were consumed by Hive, scheduling applications, and various other BI reports through data warehousing multi-dimensional models.
- Built shell scripts to execute Hive scripts on the Linux platform to process and extract terabytes of data from different data warehouses and prepare datasets for ad-hoc analysis and business reporting needs, all in a distributed environment.
- Developed several advanced MapReduce programs to process data files received.
- Created partitions and buckets based on state to enable further processing using bucket-based Hive joins.
- Created Hive generic UDFs to process business logic that varies based on policy.
- Moved relational database data into Hive dynamic-partition tables using Sqoop and staging tables (see the sketch after this list).
- Optimized Hive queries using partitioning and bucketing techniques to control data distribution.
- Used the Oozie workflow engine to manage interdependent Hadoop jobs and to automate several types of jobs such as Java MapReduce, Hive, Pig, and Sqoop.
- Worked on SAS migration to Hadoop for fraud analytics and provided predictive analysis.
- Worked on SAS migration to Hadoop for campaign and response analysis.
- Analyzed and designed Hadoop directory structures for Archive data.
- Developed Unit test cases using Junit and MRUnit testing frameworks.
- Experienced in Monitoring Cluster using Cloudera manager.
- Worked on AWS Clusters
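A hedged sketch of the Sqoop-staging-to-dynamic-partition load pattern referenced above, expressed as HiveQL executed through PySpark's Hive support (the original jobs ran in Hive directly); database, table, and column names are hypothetical placeholders.

# Illustrative dynamic-partition load from a Sqoop-populated staging table into a
# partitioned Hive table; names are hypothetical placeholders.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("hive-dynamic-partition-load")
         .enableHiveSupport()
         .getOrCreate())

spark.sql("SET hive.exec.dynamic.partition=true")
spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")

# Target table partitioned by state (bucketing of the policy key would be declared in Hive).
spark.sql("""
    CREATE TABLE IF NOT EXISTS policy.claims (
        claim_id     BIGINT,
        policy_id    BIGINT,
        claim_amount DOUBLE
    )
    PARTITIONED BY (state STRING)
    STORED AS PARQUET
""")

# The staging table was loaded from the RDBMS with Sqoop; the dynamic-partition insert
# fans the rows out into per-state partitions in one pass.
spark.sql("""
    INSERT OVERWRITE TABLE policy.claims PARTITION (state)
    SELECT claim_id, policy_id, claim_amount, state
    FROM policy.claims_staging
""")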
Environment: Hadoop, Hortonworks, HDFS, HBase, MapReduce, Java, R, Hive, SAS, Pig, SQL, Sqoop, Flume, Oozie, Hue, ETL, Cloudera Manager, Spark, Cloudera Hadoop distribution, MySQL.
Confidential, Hartford, CT
Hadoop Developer
Responsibilities:
- Worked on evaluation and analysis of the Hadoop cluster and different big data analytics tools including Pig, the HBase database, and Sqoop.
- Responsible for building scalable distributed data solutions using Hadoop.
- Involved in loading data from LINUX file system to Hadoop Distributed File System.
- Created HBase tables to store various formats of PII data coming from different portfolios.
- Experience in managing and reviewing Hadoop log files.
- Used Datameer to analyze the client's transaction data; installed, configured, and managed Datameer users on the Hadoop cluster.
- Exported the analyzed and processed data to relational databases using Sqoop for visualization and report generation by the BI team (see the export sketch after this list).
- Installed Oozie workflow engine to run multiple Hive and pig jobs.
- Analyzed large data sets to determine the optimal way to aggregate and report on them.
- Worked with the Data Science team to gather requirements for various data mining projects.
- Analyzed large data sets by running Hive queries and Pig scripts.
- Worked on performing major upgrade of cluster from CDH3u6 to CDH4.4.0
- Created dashboards using Tableau to analyze data for reporting.
- Supported setting up the QA environment and updating configurations for implementation scripts with Pig and Sqoop.
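A minimal sketch of the Sqoop export step referenced above, wrapped in a small Python helper as one way to invoke it from a scheduled script; the connection string, credentials, table, and HDFS directory are hypothetical placeholders.

# Minimal wrapper around a Sqoop export of processed output back to a relational
# database for BI reporting; connection string, table, and paths are hypothetical.
import subprocess

def sqoop_export(export_dir, target_table):
    """Invoke the sqoop CLI to push an HDFS directory into a relational table."""
    cmd = [
        "sqoop", "export",
        "--connect", "jdbc:mysql://reporting-db:3306/bi_mart",
        "--username", "bi_loader",
        "--password", "********",           # placeholder credential
        "--table", target_table,
        "--export-dir", export_dir,
        "--input-fields-terminated-by", "\t",
        "-m", "4",
    ]
    subprocess.check_call(cmd)

sqoop_export("/user/hive/warehouse/bi.db/txn_summary", "TXN_SUMMARY")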
Environment: Hadoop, HDFS, SAS, Pig, Datameer, Sqoop, SQL, Python, HBase, Shell Scripting, Red Hat Linux
Confidential, Los Angeles, CA
Hadoop Consultant
Responsibilities:
- Exported data from DB2 to HDFS using Sqoop and NFS mount approach.
- Moved data from HDFS to Cassandra using MapReduce and the BulkOutputFormat class.
- Developed Map Reduce programs for applying business rules on the data.
- Developed and executed hive queries for denormalizing the data.
- Installed and configured Hadoop Cluster for development and testing environment.
- Implemented the Fair Scheduler on the JobTracker to share cluster resources among users' MapReduce jobs.
- Configured WebHDFS to support REST API and JDBC connectivity for external clients' operations.
- Analyzed web log data using HiveQL to extract the number of unique visitors per day, page views, visit duration, and the most purchased product on the website (a query sketch follows this list).
- Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports by our BI team.
- Loaded data into HDFS using Sqoop for analysis.
- Developed data pipelines using Pig and Hive from Teradata and Netezza data sources. These pipelines used customized UDFs to extend the ETL functionality.
- Developed job flows in Oozie to automate the workflow for extraction of data from Teradata and Netezza
- Developed data pipeline into DB2 containing the user purchasing data from Hadoop
- Implemented partitioning, dynamic partitions, and buckets in Hive, and wrote MapReduce programs to analyze and process the data.
- Streamlined Hadoop jobs and workflow operations using Oozie workflow engine.
- Involved in product life cycle developed using Scrum methodology.
- Involved in mentoring team in technical discussions and Technical reviews.
- Involved in code reviews and verifying bug analysis reports.
- Automated work flows using shell scripts.
- Performed performance tuning of Hive queries written by other developers.
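A hedged sketch of the web-log analysis query referenced above; the original ran as HiveQL in Hive, shown here through the PyHive client purely for illustration, with hypothetical table and column names.

# Illustrative web-log analysis: daily unique visitors and page views from Hive.
# PyHive is used here only as a convenient client; table/column names are hypothetical.
from pyhive import hive

cursor = hive.connect(host="hive-server", port=10000, username="analyst").cursor()
cursor.execute("""
    SELECT to_date(request_ts)        AS visit_date,
           COUNT(DISTINCT visitor_id) AS unique_visitors,
           COUNT(*)                   AS page_views
    FROM weblogs.page_requests
    GROUP BY to_date(request_ts)
    ORDER BY visit_date
""")
for visit_date, unique_visitors, page_views in cursor.fetchall():
    print(visit_date, unique_visitors, page_views)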
Environment: Hadoop, Hortonworks, HDFS, Hive, MapReduce 2.0, Sqoop 2.0.0, Oozie 3.0, SQL, Shell Scripting, Ubuntu, Red Hat Linux.
Confidential
Hadoop Consultant
Responsibilities:
- Responsible for analyzing business requirements and detail design of the software.
- Designed and developed the front-end user interface.
- Developed Web based (JSP, Servlets, java beans, JavaScript, CSS, XHTML) console for reporting and life cycle management.
- Established JDBC connectivity using Oracle 10g.
- Wrote SQL queries to insert into and update the database; used JDBC to invoke stored procedures.
- Involved with project manager in creating detailed project plans.
- Designed technical documents using UML.
- Involved in developing presentation layer using JSP, AJAX, and JavaScript.
- Created Junit Test cases by following Test Driven development.
- Responsible for implementing DAOs and POJOs using Hibernate reverse engineering, AOP, and the service layer.
- Used Spring, the MVC pattern, and the Struts framework, and followed test-driven development.
Environment: Rational Application Developer (RAD) 7.5, Web Sphere Portal Server 6.1, Java 1.6, J2EE, JSP 2.1, Servlets 3, JSF 1.2, Spring 2.5, Hibernate 2.0, Web Sphere 6.1, AXIS, Oracle 10g, JUnit, XML, HTML, Java Script, AJAX, CSS, Rational Clear Case.
Confidential, Minnetonka, MN
JAVA Developer
Responsibilities:
- Extensively used Core Java, Servlets, JSP and XML
- Used Struts 1.2 in presentation tier
- Generated the Hibernate XML and Java Mappings for the schemas
- Used DB2 Database to store the system data
- Actively involved in the system testing
- Involved in fixing bugs and unit testing with test cases using JUnit
- Wrote complex SQL queries and stored procedures
- Used Asynchronous JavaScript for better and faster interactive Front-End
- Used IBM Web-Sphere as the Application Server
Environment: Java 1.2/1.3, Swing, Applet, Servlet, JSP, XML, HTML, JavaScript, Oracle, DB2, PL/SQL
Confidential
Programmer Analyst/Java Developer
Responsibilities:
- Involved in complete software development life cycle - Requirement Analysis, Conceptual Design, and Detail design, Development, System and User Acceptance Testing.
- Involved in Design and Development of the System using Rational Rose and UML.
- Involved in Business Analysis and developed Use Cases, Program Specifications to capture the business functionality.
- Improved the coding standards, code reuse, and performance of the Extend application by making effective use of various design patterns (Business Delegate, View Helper, DAO, Value Object, and other basic patterns).
- Designed the system using JSPs and Servlets.
- Designed application using Process Object, DAO, Data Object, Value Object, Factory, Delegation patterns.
- Involved in the design and development of Presentation Tier using JSP, HTML and JavaScript.
- Involved in integrating the concept of RFID in the software and developing the code for its API.
- Coordinated between teams as project coordinator, organizing design and architectural meetings.
- Designed and developed class diagrams, identifying objects and their interactions to specify sequence diagrams for the system using Rational Rose.
Environment: JDK 1.3, J2EE, JSP, Servlets, HTML, XML, UML, Rational Rose, AWT, WebLogic 5.1, Oracle 8i, SQL, PL/SQL.
References: Available upon request.