Data Scientist/Big Data Enterprise Architect Resume
Hoffman Estates, IL
SUMMARY
- 14 years of professional work experience collaborating on architecture, design and development for multiple systems; led the architecture and design of components at each development level.
- Designed and implemented data science/machine learning projects that include real-time predictive, behavioral, social and sentiment analysis and data visualization against big data using unstructured, semi-structured and structured data with technologies including R, rmr, Python (NumPy, SciPy, NLTK, Pandas), scikit-learn, MLlib, PMML, Tableau and Spark.
- Strong hands-on experience implementing big data solutions using a technology stack including Hadoop MapReduce, Spark, Scala, Shark, Pig, Hive, HDFS, NoSQL stores (MongoDB, HBase and Cassandra), the Neo4j graph database, Sqoop, Flume, Storm, Spark Streaming, Kafka, Impala and Oozie.
- Expert designer of solution architecture and infrastructure architecture for big data projects.
- Expertise in planning architecture strategy and roadmaps and designing enterprise solution architecture and service-oriented architecture, including data acquisition, data security, storage, transformation, data analysis, predictive analysis, business intelligence and integration with data stores/DW for tailor-made solutions that meet specific business needs.
- Strong enterprise architecture experience using the EA frameworks TOGAF and Zachman.
- Expertise in architecting, designing and implementing web-based enterprise applications using UML and Java EE technologies such as Servlets, JSP, EJB, Spring MVC, Spring IoC, Spring AOP, Struts, Hibernate, JDBC, JNDI, JMS, web services, JavaBeans, JAXB, SAX/DOM and XML.
- Strong experience with SQL scripts and PL/SQL in Oracle and MySQL databases.
- Good experience designing data architecture, data modeling (logical and physical), data integration, data management, data analytics and data visualization using both traditional and big data approaches.
- Strong experience designing highly scalable system architectures that accommodate future growth, minimize risk, and optimize long-term investment in IT infrastructure.
- Real-time and batch-oriented Business Intelligence (BI) and analytics using IBM InfoSphere BigInsights and Streams. Designed alternative solutions using Datameer, Pentaho and Platfora products.
- Solid knowledge of the infrastructure planning, scaling and administration considerations that are unique to big data imports/exports, storage, processing and data science.
- Strong understanding of the overall technical goals of the business and the current hardware and software inventory and capabilities.
- Deep knowledge of industry patterns and approaches to creating solutions.
- Expertise in creating solutions and choosing technologies that map to company needs, as well as deep understanding of prerequisites and hardware and software requirements.
- An excellent team player with extraordinary problem-solving and troubleshooting capabilities and the ability to work under pressure with minimal or no supervision.
- Thought leader with excellent communication and interpersonal skills and an exceptional ability to quickly master new concepts and applications.
TECHNICAL SKILLS
Big Data Technologies: Hadoop, Spark, CDH4.x/5.x YARN, MapReduce, Spark Streaming, Pig, Hive, HDFS, ZooKeeper, HBase, Cassandra, MongoDB, Kafka, Storm, Sqoop, Flume, Oozie, Mahout, Python (NumPy, SciPy, Pandas, NLTK), R, rmr, HCatalog, Avro, Impala, Nutch, Solr, Elasticsearch, Neo4j
Big Data Vendors: SAS, Enterprise R, IBM InfoSphere BigInsights/Streams, SAP HANA, Vertica, Splunk, Pentaho Kettle, Datameer
Languages: Java, Scala, Python, C++, Shell scripting
Web/Enterprise App Technologies: J2EE, EJB, Servlets, JSP, Flex, JDBC, JMS, JNDI, SOAP, JAX-RPC, JAXP, XML, XSL, XSLT, HTML, AJAX, Log4J, JavaScript, JavaMail, JTS, JAAS, JTA, JNI
Frameworks: Spring MVC, Spring, Hibernate, Struts MVC, jQuery, AJAX
Web/Application Servers: JBoss 4.x/3.x, WebLogic 8.x/7.x, WebSphere Application Server 5.x, Jakarta Apache 3.x, Tomcat 5.x
RDBMS: Oracle 10g/9i, MS-SQL Server, MySQL 5.x/4.x
Tools: WSAD, Eclipse, Junit, HttpUnit, Rational Rose 2002, TOAD, Apache ANT, JBuilder, XML Spy
Methodologies: UML
CASE Tools: Rational Rose 2000/2002
Source Control: CVS, Subversion, Perforce, PVCS
Operating Systems: LINUX, Sun Solaris, Windows XP/2003/NT
PROFESSIONAL EXPERIENCE
Confidential, Hoffman Estates, IL
Data Scientist/Big Data Enterprise Architect
Responsibilities:
- Designed and implemented a real-time predictive analytics and recommendation engine to solve business needs such as “merchandise prediction”, “item price optimization” and “item recommendations” using big data and machine learning technologies including Python (NumPy, SciPy, Pandas), R, Tableau, H2O and Spark MLlib.
- Involved in the entire lifecycle of machine learning projects, including data preparation, predictive model creation, model testing and validation, deployment of the model to the production environment, analysis, monitoring and model optimization.
- Strong hands-on experience with different machine learning algorithms, model fitting and optimization techniques including clustering, classification, regression, ensemble learning, reinforcement learning, time-series analysis, pattern recognition and cross-validation (a minimal illustrative sketch follows this list).
- Designed solution architecture and technology architecture and implemented solutions for different big data and machine learning initiatives. Implemented critical application components using technologies including Pig, Impala, Hive, MapReduce, HDFS, Cassandra, MongoDB, HBase, Spark, the Spark Scala API, Spark Streaming, Storm, Kafka, Flume, Sqoop, Twitter API, Facebook API, Nutch, Elasticsearch, R, rmr, Python, Tableau and Neo4j.
- Designed and implemented ETL on Hadoop using Pig, MR and Hive.
- Implemented big data pipeline from external data sources using Sqoop and Flume technologies.
- Designed Cassandra and MongoDB data models for different business needs. Led the capacity planning effort for Cassandra and MongoDB clusters. Put together best practices for NoSQL key design, development and production deployment.
- Implemented MongoDB-based applications using the Java driver API, Casbah (Scala API) and mongo shell commands.
- Implemented Cassandra applications using the DataStax Java API, Hector API and CQL.
- Built an enterprise data hub following lambda architecture principles with a combination of Hadoop, Spark and Spark Streaming.
- Implemented Spark and Spark Streaming applications using the Scala, Spark SQL and MLlib APIs.
- Created reference architecture for different Big Data business case scenarios and put together development best practices and laid out a plan for data governance using Hadoop.
- Designed and implemented security model for Hadoop cluster.
- Implemented major components as a proof of architecture design and to provide direction to the application teams.
- Created solution and technology architecture design strategy for data science and machine learning projects on the big data platform to build critical enterprise-wide integrated products and services.
- Led the architecture strategy and design of a critical initiative to get a real-time 360-degree view of customers by integrating discrete and disparate data sources while absorbing the data quality complexities of siloed data sources and systems. Built predictive models on integrated customer profiles to target the right customers with the right products and services at the right time, and to engage customers in the most effective manner.
- Involved in infrastructure architecture and capacity planning to ensure infrastructure and the Hadoop, Spark and NoSQL clusters align with the overall enterprise architecture. Provided necessary recommendations to the operations team.
- Created strategy and designed solutions and POCs for service-oriented big data architecture, big data security, real-time analytics and machine learning on big data.
- Created best practices, techniques and tools to simplify and speed up new applications on big data.
- Identified and evaluated new big data technologies/products/tools that help fill the gaps in the overall enterprise architecture for future business needs.
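As a minimal illustrative sketch of the model lifecycle and cross-validation work described above (this is not the actual project code; the input file and column names are hypothetical placeholders):

```python
# Minimal illustrative sketch only (hypothetical data and column names), showing
# data preparation, cross-validation and model fitting with scikit-learn.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Data preparation: load engineered features (hypothetical schema)
df = pd.read_csv("merchandise_features.csv")
X = df.drop(columns=["purchased"])   # feature matrix
y = df["purchased"]                  # binary target: did the customer buy the item?

# Model creation and 5-fold cross-validation before promoting anything to production
model = make_pipeline(StandardScaler(),
                      RandomForestClassifier(n_estimators=200, random_state=42))
scores = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
print("mean AUC: %.3f (+/- %.3f)" % (scores.mean(), scores.std()))

# Fit on the full training set for deployment
model.fit(X, y)
```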
Environment: Hadoop 0.23, MongoDB, HBase, Cassandra, YARN, Spark, Spark Streaming, Shark, CDH4.x/5.x, Hortonworks HDP, Pig, MapReduce, Hive, HDFS, Oozie, Sqoop, Flume, Teradata, Storm, Kafka, R, Python (NumPy, SciPy, NLTK, Pandas), Datameer, Twitter API, Facebook API, Nutch, Tableau, Neo4j
Confidential, Chicago, IL.
Data Scientist/Big Data Enterprise Architect
Responsibilities:
- Designed and implemented machine learning projects including “Provider’s Claims Fraud Detection” and “Member Health Risk Prediction” using big data and machine learning technologies including Python (NumPy, SciPy, Pandas), R and Tableau.
- Designed the solution architecture of this big data system, which provides real-time, predictive-analysis-based recommendations to applicants and members using the Hadoop technology stack.
- Designed Cassandra data models and was involved in cluster capacity planning.
- Implemented Cassandra clients using the Java API and performed bulk data loading into the Cassandra data store (see the sketch after this list).
- Created the technical architecture for predictive-analysis-based recommendations using IBM InfoSphere BigInsights, Streams, SPSS Modeler, the CEP engine and the time series module.
- Designed the architecture for an alternative solution using R, Hadoop and Python scientific libraries.
- Massive amounts of structured, unstructured and semi-structured data are ingested from member/applicant call logs, emails, portal click streams, health care research and clinical trial projects, social media and discussion boards as input for the predictive analysis.
- Created prototypes for the major components as a proof of architecture design and to provide direction to the IT team.
- Created solution architecture artifacts and deliverables using the Sparx Enterprise Architect tool and presented the architecture to business and IT teams.
- Involved in infrastructure and capacity planning to ensure the necessary infrastructure exists to support the solutions. Provided necessary recommendations.
- Worked closely with data scientists, business users and the marketing team for architecture realization.
- Guided application development teams on implementation of big data components and modules.
- Aligned the solution architecture design with the enterprise architecture context and strategic direction.
- Worked closely with enterprise architecture governance, data architects and enterprise security to realign the data architecture for the big data strategy.
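A minimal sketch of the Cassandra data-model and loading pattern referenced above is shown below. It uses the DataStax Python driver purely for illustration (the production clients used the Java API), and the hosts, keyspace, table and column names are hypothetical:

```python
# Illustrative sketch only: hypothetical contact points and keyspace/table names.
# Production clients used the Cassandra Java API; this uses the DataStax Python driver.
from datetime import datetime
from cassandra.cluster import Cluster
from cassandra.query import BatchStatement

cluster = Cluster(["cassandra-node1", "cassandra-node2"])   # hypothetical contact points
session = cluster.connect()

# Data model: one partition per member, risk scores clustered by time (newest first)
session.execute("""
    CREATE KEYSPACE IF NOT EXISTS member_risk
    WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3}""")
session.execute("""
    CREATE TABLE IF NOT EXISTS member_risk.scores (
        member_id text, scored_at timestamp, risk_score double,
        PRIMARY KEY (member_id, scored_at)
    ) WITH CLUSTERING ORDER BY (scored_at DESC)""")

insert = session.prepare(
    "INSERT INTO member_risk.scores (member_id, scored_at, risk_score) VALUES (?, ?, ?)")

# Small-batch loading; for true bulk loads the sstableloader utility is the usual tool.
rows = [("m-001", datetime.utcnow(), 0.82), ("m-002", datetime.utcnow(), 0.15)]  # sample rows
batch = BatchStatement()
for member_id, scored_at, score in rows:
    batch.add(insert, (member_id, scored_at, score))
session.execute(batch)
cluster.shutdown()
```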
Environment: Apache Hadoop, MapReduce, Hive, Cassandra, HBase, Java, J2EE, HDFS, Sqoop, Flume, SAS, IBM InfoSphere BigInsights, Streams, SPSS Modeler, CEP engine, social and time series toolkits.
Confidential, Hoffman Estates, IL.
Big Data Solution Architect
Responsibilities:
- Designed the solution architecture and led the team to quickly deliver a near real-time, context- and location-aware item price recommendation engine that processes massive amounts of enterprise-wide data in a Hadoop cluster, targeting improved sales and margins, a new customer base and existing customer retention.
- Created prototypes leading to specific engineering initiatives and led the execution of the technology strategy for decisions on technology platforms and partnerships.
- Created a technical solution that solves the business need for a business rules engine within a business process flow that works against big data in Hadoop.
- Presented the solution architecture to other technical teams and top-level executives to provide guidance for their system architectures.
- Modeled system architecture and design artifacts using UML2 modeling tool.
- Set up standards and processes for Hadoop-based application design and implementation.
- Designed and implemented a metadata automation framework for big data in Hadoop.
- Designed and implemented real-time big data analytics and reporting system using HBase and JasperSoft.
- Created prototypes for alternative BI solutions using MicroStrategy, Informatica and Pentaho Kettle ETL.
- Designed and implemented a reusable Hadoop workflow using Oozie that interacts with different Hadoop technologies.
- Designed and implemented a prediction model using Mahout implementations of collaborative filtering, clustering and classification algorithms.
- Mentored different teams working on Hadoop and provided a direction for their implementation.
- Drove the development of agile MapReduce programming practices (a minimal streaming-style sketch follows this list).
- Architected data acquisition and transformation solutions for Hadoop using Sqoop, Flume, custom tooling and Informatica HParser.
- Provided strategy and direction for the unit testing and QA process with a focus on Hadoop and HBase components.
- Put together a strategy and recommended technologies for future business needs.
- Set up the build and deployment infrastructure for the system using Maven2 and Nexus.
- Created environment for continuous integration and code coverage using Hudson and Cobertura.
- Designed and led the implementation of the Drools rules engine and its Hadoop integration to implement configurable pricing logic against big data.
- Orchestrated the business process engine using jBPM and integrated it with the rules engine.
- Set up the version control infrastructure using Subversion (SVN).
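To illustrate the MapReduce practices referenced above, here is a minimal Hadoop Streaming style mapper/reducer sketch in Python (the production jobs were written in Java; the tab-separated input format and field meanings are hypothetical):

```python
# Illustrative sketch only: a Hadoop Streaming mapper/reducer in Python that
# averages prices per item. Input format and field positions are hypothetical.
import sys

def mapper():
    # Input: tab-separated lines "item_id<TAB>price"; emit the pair unchanged
    for line in sys.stdin:
        fields = line.rstrip("\n").split("\t")
        if len(fields) == 2:
            print("%s\t%s" % (fields[0], fields[1]))

def reducer():
    # Input arrives sorted by key; compute the average price per item_id
    current_key, total, count = None, 0.0, 0
    for line in sys.stdin:
        key, value = line.rstrip("\n").split("\t")
        if current_key is not None and key != current_key:
            print("%s\t%.2f" % (current_key, total / count))
            total, count = 0.0, 0
        current_key = key
        total += float(value)
        count += 1
    if current_key is not None:
        print("%s\t%.2f" % (current_key, total / count))

if __name__ == "__main__":
    # Run via Hadoop Streaming, e.g.:
    #   hadoop jar hadoop-streaming.jar -input prices -output avg_prices \
    #     -mapper "python mr_price.py map" -reducer "python mr_price.py reduce"
    mapper() if sys.argv[1] == "map" else reducer()
```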
Environment: Apache Hadoop, MapReduce, HDFS, HBase, Hive, Oozie, Sqoop, Flume, Mahout, JasperSoft BI, Informatica ETL, Nutch, Lucene, Solr, Drools, JBPM, UML2, Spring, Hibernate, MySQL
Confidential, Wooddale, IL
SOA Architect
Responsibilities:
- Architected, designed and implemented integrations among critical enterprise applications for Confidential business in terms of student acquisition, retention and revenue generation.
- Implemented integrations following SOA architecture principles. Implemented the application using the concrete principles laid down by several design patterns such as Session Façade, Business Delegate, Bean Factory, Singleton, Data Access Object and Service Locator (two of these are illustrated in the sketch after this list).
- The SOA architecture is built on the Oracle SOA Suite using XML, SOAP web services, JMS, Enterprise Service Bus (ESB), Business Process Execution Language (BPEL), Axis2, Oracle Database, Oracle AQ, Oracle Streams, OC4J Application Server, PL/SQL, Java, Spring, Spring AOP, Spring Transactions, Spring Batch and Hibernate.
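The design patterns named above were implemented in Java EE; purely as a minimal, language-neutral illustration of two of them (Singleton and Data Access Object), a hypothetical Python sketch might look like the following, where the StudentDAO name and the in-memory SQLite store are stand-ins:

```python
# Illustrative sketch only: Singleton + Data Access Object patterns.
# The actual integrations were Java EE; names and storage here are hypothetical.
import sqlite3

class ConnectionFactory:
    """Singleton-style factory that hands out a single shared connection."""
    _instance = None

    @classmethod
    def get(cls):
        if cls._instance is None:
            cls._instance = sqlite3.connect(":memory:")
        return cls._instance

class StudentDAO:
    """Data Access Object: hides persistence details behind domain methods."""
    def __init__(self):
        self.conn = ConnectionFactory.get()
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS students (id INTEGER PRIMARY KEY, name TEXT)")

    def add(self, name):
        cur = self.conn.execute("INSERT INTO students (name) VALUES (?)", (name,))
        self.conn.commit()
        return cur.lastrowid

    def find(self, student_id):
        row = self.conn.execute(
            "SELECT id, name FROM students WHERE id = ?", (student_id,)).fetchone()
        return {"id": row[0], "name": row[1]} if row else None

if __name__ == "__main__":
    dao = StudentDAO()
    print(dao.find(dao.add("Alice")))
```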
Environment: UML2, Java EE, Oracle SOA Suite, BPEL, ESB, XML, SOAP Web Services, WSDL, Axis2, Maven2, JMS, Spring IoC, AOP, Transactions, Spring Batch, Hibernate, JDBC, Eclipse, OC4J Application Server, ANT1.7, TOAD, StarTeam, SQL, PL/SQL, Oracle
Confidential, Schaumburg, IL
Solutions Architect/Technical Lead
Responsibilities:
- Provided leadership and technical expertise to build the system from the ground up.
- Directed, trained and mentored developers on the project to cost-effectively increase growth, performance and capacity in the existing environment.
- Modeled system architecture and design artifacts using UML2 Deployment, Sequence, Class and Activity diagrams.
- Designed and implemented a highly scalable, distributed vertical crawler using Apache Nutch on the Hadoop Distributed File System (HDFS) in a clustered environment to continuously fetch and index job postings from a huge number of sites.
- Designed and implemented job searching using Apache Solr against the Lucene index (see the sketch after this list).
- Designed and implemented a recommendations engine using Apache Mahout.
- Recommended technologies and put together a strategy for future growth in social and mobile application space.
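A minimal sketch of the kind of Solr query behind the job search described above (the production search layer was implemented in Java; the host, core name and field names here are hypothetical):

```python
# Illustrative sketch only: querying a Solr core over its standard /select HTTP API.
# Host, core name ("jobs") and document fields are hypothetical placeholders.
import requests

SOLR_URL = "http://localhost:8983/solr/jobs/select"

def search_jobs(keyword, location, rows=10):
    params = {
        "q": "title:(%s)" % keyword,        # query the indexed title field
        "fq": "location:(%s)" % location,   # filter query on location
        "rows": rows,
        "wt": "json",                        # ask Solr for a JSON response
    }
    response = requests.get(SOLR_URL, params=params, timeout=10)
    response.raise_for_status()
    return response.json()["response"]["docs"]

if __name__ == "__main__":
    for doc in search_jobs("java developer", "Chicago"):
        print(doc.get("title"), "-", doc.get("company"))
```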
Environment: Facebook API, LinkedIn API, OAuth, HTML5, Java, Spring MVC, Spring IoC, Spring AOP, Spring Transactions, Hibernate, JSP, DWR, jQuery, JSON, Apache Nutch with Hadoop, Lucene, Apache Solr, Apache Mahout, XML, JavaMail, RESTful Web Services, MySQL Database, JBoss Application Server, Maven2, Subversion.
Confidential, Hoffman Estates, IL
Java/J2EE Architect
Responsibilities:
- Involved in the full software development cycle, from analysis through the design, development, integration and testing phases.
- The system was built using the Model-View-Controller (MVC) architecture. Implemented the application using the concrete principles laid down by several design patterns such as Composite View, Session Façade, Business Delegate, Bean Factory, Singleton, Data Access Object and Service Locator.
- Designed and implemented the application using JSP, DWR, Flex, BlazeDS, Spring MVC, Struts MVC, JNDI, Spring IoC, Spring Annotations, Spring AOP, Spring Transactions, Hibernate, JDBC, SQL, ANT, JMS, DB2, Oracle, and the JBoss and WebSphere app servers.
- Used the ClearCase version control tool.
- Automated the build process by writing ANT build scripts.
- Configured and customized logs using Log4J.
- Deployed applications on the JBoss Application Server and performed the required configuration changes.
- Created SQL scripts and executed them using the SQuirreL tool against DB2/Oracle databases.
- Led multiple high-priority releases with aggressive deadlines.
- Mentored and provided technical guidance to other developers.
- Provided production support by debugging and fixing critical issues related to application and database.
Environment: J2EE, HTML, JavaScript, AJAX, DWR, JSP, Servlet, Flex, JSTL, JMS, Struts, Spring MVC, Hibernate, JDBC, MyEclipse, JBoss Application Server, ANT1.7, JMeter, SQuirreL/TOAD, ClearCase, XML, UML, Rational Rose, SQL, Windows XP, DB2 and Oracle