Big Data Lead Resume
SUMMARY
- 13+ years of total work experience, including 12+ years of professional experience as an Architect, Technical Lead, and Developer working in all phases of the development life cycle: requirements gathering, development (front end, middleware, and back end), testing, implementation, and production support.
- 6+ years of experience as an Architect and Technical Lead, leading teams of 3-5.
- 5+ years of experience in Hadoop, Hive, Sqoop, MapReduce, and Spark.
- 11+ years of experience with MS SQL Server, data warehouse ETL (SSIS/SSRS), and Oracle PL/SQL development: designing data structures, automating data flow from source to target data models, and database management, with substantial development experience creating stored procedures, packages, triggers, functions, views, and exception handling, along with database query tuning and performance tuning.
- 1 year of experience in Cassandra development and administration.
- Experience in Scala, Python, and Java.
- Developed automation scripts in Python and Bash.
- Experience in Java/J2EE-related technologies as a front-end, middleware, and back-end developer. Proficient in developing J2EE applications using JavaServer Pages (JSP), Enterprise JavaBeans (EJB), Servlets, Spring, Struts, Hibernate, XML, and SOAP and RESTful web services.
- Proficient in developing strategies for Extraction, Transformation, and Loading (ETL).
- Working knowledge of NoSQL databases (HBase and MongoDB).
- Ability to work in a team environment or independently, quick learner and multi-tasker.
TECHNICAL SKILLS
Database: Cassandra, MS SQL Server 2005/2008/2012, Oracle 8i/9i/10g/11g, PL/SQL (stored procedures, packages, triggers, views), HBase, Hive, MS Access.
Big Data: Hortonworks Data Platform (HDP 2.4), Hive, Sqoop, MapReduce, Automic, Spark, Scala, Storm, Kafka
Java/J2EE: Java API, JDBC, multithreading, Collections, JSP (HTML, JavaScript), Servlets, EJB, tags, JNDI, RMI, XA transactions (JTS, JTA), JavaMail, JavaBeans, LDAP, JOSSO, security, annotations, IoC, XML, HTML, CSS, XSLT, XSL, and shell scripting.
Framework: Spring MVC, Struts, ORM (Hibernate, Entity Bean)
Middleware: Web servers (WebSphere Application Server, BEA WebLogic Server, Tomcat)
ETL: SSIS/SSRS, SQL Server DTS, Web service (SOAP, JWS), REST.
IDE: RAD, Eclipse, Notepad++, TOAD, Visual Studio 8, and Visual Studio .NET
Language: Java, Bash Shell scripting, Python, Scala
Other Tools: BigID (data discovery)
PROFESSIONAL EXPERIENCE
Big Data Lead
Confidential
Responsibilities:
- Implemented data discovery for Core HR using the BigID data discovery tool.
- Designed and developed Hadoop ETL strategies; monitored systems, improved performance and capacity, and planned for future expansion.
- Responsible for the design and development of technical solutions utilizing the big data platform.
- Responsibilities included defining technical requirements, loading data into HDFS, managing the Linux directory structure, managing the HDFS framework, managing Hive databases, managing SQL databases, data extraction, data transformation, automating jobs, productionizing jobs, and exploring new big data technologies within a Massively Parallel Processing environment.
- Responsible for maintaining and supporting all Production/QA/Development environments.
- Developed a Python script to automate the generation of Hive query scripts (see the sketch following this list).
- Developed Bash shell scripts for ETL and other maintenance processes.
- Managed all Hive schema changes in Production and Dev.
- Performed Hive database administration on production edge nodes, including server configuration, monitoring, performance tuning, and maintenance, with strong troubleshooting skills.
- Hands-on experience with high-availability and redundancy options.
- Worked efficiently with multiple developer teams.
- Audited and approved developers' change requests to existing tables and reports.
- Helped developers performance-tune code.
- Provided 24x7 support for critical production systems.
- Performed scheduled maintenance and supported release deployment activities after hours.
- Developed CI/CD using Looper and Git to automate deployment of Hive code to production edge nodes.
- Worked with developers on data modeling in Erwin.
- Created Hive databases and assigned roles; partitioned tables.
- Performed extensive data analysis, including data profiling and data quality analysis.
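A minimal Python sketch of the kind of Hive script generation described above; the table names, columns, and the load_date partition column are hypothetical placeholders, not the actual schema.

#!/usr/bin/env python3
# Generate partitioned Hive DDL scripts from a simple table spec.
# Hypothetical tables/columns; the real schema came from the source systems.

TABLES = {
    "core_hr.employee": ["emp_id INT", "name STRING", "dept STRING"],
    "core_hr.payroll": ["emp_id INT", "pay_period STRING", "amount DECIMAL(10,2)"],
}

DDL_TEMPLATE = """CREATE TABLE IF NOT EXISTS {table} (
  {columns}
)
PARTITIONED BY (load_date STRING)
STORED AS ORC;
"""

def generate_scripts(tables):
    # Write one .hql file per table so each script can be reviewed and run independently.
    for table, columns in tables.items():
        ddl = DDL_TEMPLATE.format(table=table, columns=",\n  ".join(columns))
        out_file = table.replace(".", "_") + ".hql"
        with open(out_file, "w") as fh:
            fh.write(ddl)
        print("wrote", out_file)

if __name__ == "__main__":
    generate_scripts(TABLES)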
Environment: BigID, Hadoop, Hive, HDFS, Pig, Sqoop, Automic, Spark, Scala, Unix/Linux, Python, Bash shell scripting, MS SQL Server, DB2, MariaDB.
Big Data Lead
Confidential
Responsibilities:
- Designed and developed Hadoop ETL strategies; monitored systems, improved performance and capacity, and planned for future expansion.
- Responsible for the design and development of technical solutions utilizing the big data platform.
- Responsibilities included defining technical requirements, loading data into HDFS, managing the Linux directory structure, managing the HDFS framework, managing Hive databases, managing SQL databases, data extraction, data transformation, automating jobs, productionizing jobs, and exploring new big data technologies within a Massively Parallel Processing environment.
- Responsible for maintaining and supporting all Production/QA/Development environments.
- Developed a Python script to automate the generation of Hive query scripts.
- Developed Bash shell scripts for ETL and other maintenance processes (a Python rendering of one such wrapper follows this list).
- Managed all Hive schema changes in Production and Dev.
- Performed Hive database administration on production edge nodes, including server configuration, monitoring, performance tuning, and maintenance, with strong troubleshooting skills.
- Hands-on experience with high-availability and redundancy options.
- Worked efficiently with multiple developer teams.
- Audited and approved developers' change requests to existing tables and reports.
- Helped developers performance-tune code.
- Provided 24x7 support for critical production systems.
- Performed scheduled maintenance and supported release deployment activities after hours.
- Developed CI/CD using Looper and Git to automate deployment of Hive code to production edge nodes.
- Worked with developers on data modeling in Erwin.
- Created Hive databases and assigned roles; partitioned tables.
- Performed extensive data analysis, including data profiling and data quality analysis.
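The ETL and maintenance wrappers above were written in Bash; the sketch below mirrors one such wrapper in Python to keep all the examples in a single language. The JDBC URL and default script name are hypothetical.

#!/usr/bin/env python3
# Run a Hive ETL script through beeline with basic logging and failure handling.
import logging
import subprocess
import sys

logging.basicConfig(level=logging.INFO,
                    format="%(asctime)s %(levelname)s %(message)s")

JDBC_URL = "jdbc:hive2://hiveserver2.example.com:10000/default"  # hypothetical

def run_hql(script_path):
    cmd = ["beeline", "-u", JDBC_URL, "-f", script_path]
    logging.info("running %s", " ".join(cmd))
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        # Surface the beeline error and propagate the exit code to the scheduler.
        logging.error("beeline failed: %s", result.stderr.strip())
        sys.exit(result.returncode)
    logging.info("completed %s", script_path)

if __name__ == "__main__":
    run_hql(sys.argv[1] if len(sys.argv) > 1 else "daily_load.hql")  # hypothetical default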
Environment: Hadoop, Hive, HDFS, Pig, Sqoop, Automic, Spark, Scala, Unix/Linux, Python, Bash shell scripting, MS SQL Server, DB2, MariaDB.
Big Data Lead
Confidential
Responsibilities:
- Designed and developed MS SQL Server database strategies; monitored systems, improved database performance and capacity, and planned for future expansion.
- Responsible for maintaining and supporting all Production/QA/Development MS SQL Server instances.
- Managed all schema and stored procedure changes in Production and QA.
- Worked with multiple development teams on database design and best practices.
- Performed database administration on production servers, including server configuration, monitoring, performance tuning, and maintenance, with strong troubleshooting skills.
- Hands-on experience with high-availability and redundancy options, database mirroring, and Availability Groups.
- Worked efficiently with multiple developer teams.
- Audited and approved developers' change requests to existing tables, stored procedures, indices, constraints, triggers, and views, as well as new entries.
- Helped developers performance-tune code related to database access.
- Provided 24x7 support for critical production systems.
- Performed scheduled maintenance and supported release deployment activities after hours.
- Performed end-to-end tuning of Cassandra clusters against very large data sets.
- Monitored Cassandra cluster performance and capacity, and managed cluster connectivity and security (see the connectivity-check sketch after this list).
- Troubleshooting and performance tuning of DataStax and Apache Cassandra.
- Automated deployment of Cassandra/OpsCenter.
- Managed offshore/onshore coordination on assigned tasks.
- Capacity planning for Cassandra.
- Worked with developers on data modeling.
- Created databases and assigned roles; indexed and partitioned tables.
- Performed extensive data analysis, including data profiling and data quality analysis.
- Responsible for the design and development of technical solutions utilizing the big data platform.
- Responsibilities included defining technical requirements, loading data into HDFS, managing the Linux directory structure, managing the HDFS framework, managing Hive databases, managing NoSQL databases, data extraction, data transformation, automating jobs, productionizing jobs, and exploring new big data technologies within a Massively Parallel Processing environment.
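A minimal sketch of the kind of Cassandra connectivity and health check used for the monitoring work above, written with the DataStax Python driver (cassandra-driver); the contact points and credentials are hypothetical.

#!/usr/bin/env python3
# Connect to a Cassandra cluster and read basic node/ring metadata.
from cassandra.auth import PlainTextAuthProvider
from cassandra.cluster import Cluster

CONTACT_POINTS = ["cass-node1.example.com", "cass-node2.example.com"]  # hypothetical

def check_cluster():
    auth = PlainTextAuthProvider(username="monitor", password="secret")  # hypothetical
    cluster = Cluster(contact_points=CONTACT_POINTS, port=9042, auth_provider=auth)
    session = cluster.connect()
    # system.local describes the node the driver connected to.
    row = session.execute(
        "SELECT cluster_name, release_version FROM system.local").one()
    print("cluster:", row.cluster_name, "version:", row.release_version)
    # system.peers gives a quick view of ring membership for capacity checks.
    for peer in session.execute("SELECT peer, data_center FROM system.peers"):
        print("peer:", peer.peer, "dc:", peer.data_center)
    cluster.shutdown()

if __name__ == "__main__":
    check_cluster()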
Environment: Cassandra, Hadoop, Hive, HDFS, Pig, Sqoop, Oozie, Spark, Scala, MS SQL Server, Unix/Linux, Java.
Hadoop Developer
Confidential
Responsibilities:
- Developed data pipelines using Sqoop and Hive to ingest customer, shipment, and other WMS data into HDFS for analysis.
- Worked on importing and exporting data between SQL Server and HDFS/Hive using Sqoop for analysis, visualization, and report generation.
- Created Hive tables and loaded data into them.
- Used Pig as an ETL tool for transformations such as splitting files and cleaning unstructured data before storing it in HDFS.
- Created a Java UDF in Pig for a custom date function to convert dates into a particular format (a UDF sketch follows this list).
- Designed an Oozie workflow-based pipeline to replace Automic-scheduled Unix shell scripts for data loads; explored Oozie Coordinator functionality to evaluate whether an external scheduler was needed.
- Developed Unix shell scripts for running Hive and Sqoop queries.
- Implemented a POC to migrate iterative MapReduce programs to Spark transformations using Spark and Scala (a Spark sketch follows this list).
- Mentored junior team members with the assistance of the PM/Architect.
- Maintained high team morale.
- Conducted peer reviews and lead reviews, and provided feedback.
- Provided accurate and detailed weekly task reports.
- Created and maintained records covering performance reports, delivery methods, scope of work, and general duties.
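The date-formatting UDF above was written in Java; the sketch below shows the same idea as a Pig Python (Jython) UDF so that all the examples stay in one language. The alias, field names, and the input (MM/dd/yyyy) and output (yyyy-MM-dd) formats are assumptions.

# date_udf.py -- registered in Pig with (hypothetical alias and field names):
#   REGISTER 'date_udf.py' USING jython AS dates;
#   formatted = FOREACH shipments GENERATE dates.to_iso_date(ship_date);
from datetime import datetime

try:
    outputSchema  # provided by Pig's Jython engine at registration time
except NameError:
    def outputSchema(schema):  # no-op fallback so the file also runs outside Pig
        def wrap(fn):
            return fn
        return wrap

@outputSchema("iso_date:chararray")
def to_iso_date(raw):
    # Convert MM/dd/yyyy (assumed input format) to yyyy-MM-dd.
    if raw is None:
        return None
    return datetime.strptime(raw, "%m/%d/%Y").strftime("%Y-%m-%d")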
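The MapReduce-to-Spark POC above was implemented in Scala; this PySpark sketch mirrors the idea in Python for consistency with the other examples. The input path and record layout (tab-separated shipment records) are hypothetical.

#!/usr/bin/env python3
# Replace a two-pass MapReduce job (parse/filter, then aggregate) with
# chained Spark transformations on a single RDD lineage.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("mr-to-spark-poc").getOrCreate()
sc = spark.sparkContext

lines = sc.textFile("hdfs:///data/wms/shipments")  # hypothetical path
pairs = (lines
         .map(lambda line: line.split("\t"))
         .filter(lambda f: len(f) >= 3 and f[2].isdigit())  # drop malformed rows
         .map(lambda f: (f[0], int(f[2]))))                 # (customer_id, units)

totals = pairs.reduceByKey(lambda a, b: a + b)  # replaces the second MR job

for customer, units in totals.take(10):
    print(customer, units)

spark.stop()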
Environment: Hadoop, Hive, HDFS, Pig, Sqoop, Oozie, SSIS, SSRS, MS SQL Server, Unix/Linux, Java.
Software Engineer
Confidential
Responsibilities:
- Analyzed the architecture of the Freight Pay and Key Performance Indexing (KPI) applications; gathered requirements, developed, tested, and created design documents.
- Developed, maintained, and enhanced the Key Performance Indexing (KPI) application using the Spring MVC framework.
- Built RESTful web services using the JAX-RS API.
- Developed, maintained, and enhanced the Freight application using core Java.
- Developed, tested, and implemented systems and ETL processes providing optimal performance, availability, and reliability for large KPI data warehouses.
- Designed and developed various SSIS packages (ETL) to extract and transform data from multiple SQL Server databases, and scheduled the SSIS packages, for KPI.
- Created ETL metadata reports using SSRS, including SSIS package execution times and failure reports with error descriptions.
- Responsible for data migration from SQL Server 2000/2005 to SQL Server 2012 databases.
- Evaluated, analyzed, designed, developed, documented, tested, and implemented various solutions and system enhancements; adapted ETL processes to accommodate changes in source systems and respond to changes in business requirements.
- Consumed RESTful web services provided by different vendors.
- Involved in writing the Spring configuration XML file containing bean declarations and other dependent object declarations.
- Developed the user interface using JSP, Jasper Reports, and JavaScript.
- Implemented Dependency Injection between components using Spring's Inversion of Control (IoC) container.
- Implemented asynchronous messaging between components using JMS 1.1.
- Developed application service components and configured beans using Spring IoC.
- Involved in unit testing of various modules by generating test cases.
- Developed Ant scripts for the build process and deployed to IBM WebSphere Application Server.
- Involved in fixing bugs raised by the testing teams during the integration testing phase.
- Mentored junior team members with the assistance of the PM/Architect.
- Maintained high team morale.
- Conducted peer reviews and lead reviews, and provided feedback.
- Provided accurate and detailed weekly task reports.
- Created and maintained records covering performance reports, delivery methods, scope of work, and general duties.
- Visited clients for project briefings, consultations, installations, and closeout reviews.
- Helped manage customer demands to ensure maximum satisfaction and maintain quality.
- Negotiated customer job demands and specifications regarding labor and material, and assisted in creating comprehensive technical documents.
- Coordinated activities between internal and external resources and facilitated a smooth workflow for service delivery.
Environment: RAD 7, Eclipse, Spring, Java, JSP, JavaScript, J2EE, JMS, JDBC, tags, Hibernate, Servlets, JUnit, Log4j, XML, RESTful services, Spring Integration, TOAD 8, SQL/PL-SQL with Oracle 9i/10g, IBM WebSphere Application Server, MS SQL Server 2008 R2/2012, T-SQL, MS Excel 2003/2007/2010, SQL Server Integration Services (SSIS) 2008 R2/2012, SQL Server Reporting Services (SSRS) 2008 R2, SQL Server Analysis Services (SSAS) 2008 R2, SQL Server Business Intelligence Development Studio, ETL.
Software Engineer
Confidential
Responsibilities:
- Analyzed the architecture of the Warehouse Interface Layer (WIL) application; gathered requirements, developed, tested, and created design documents.
- Developed, maintained, and enhanced the Warehouse Interface Layer (WIL) application using the Struts MVC framework.
- Developed the business tier with stateless and stateful session beans to EJB 3.0 standards.
- Developed the user interface using JSP and JavaScript.
- Implemented asynchronous messaging between components using JMS 1.1.
- Developed Hibernate DAOs using the Spring Framework's HibernateDaoSupport and HibernateTemplate.
- Involved in unit testing of various modules by generating test cases.
- Developed Ant scripts for the build process and deployed to BEA WebLogic Application Server.
- Involved in fixing bugs raised by the testing teams during the integration testing phase.
- Mentored junior team members with the assistance of the PM/Architect.
- Maintained high team morale.
- Conducted peer reviews and lead reviews, and provided feedback.
- Provided accurate and detailed weekly task reports.
- Created and maintained records covering performance reports, delivery methods, scope of work, and general duties.
- Visited clients for project briefings, consultations, installations, and closeout reviews.
- Helped manage customer demands to ensure maximum satisfaction and maintain quality.
- Negotiated customer job demands and specifications regarding labor and material, and assisted in creating comprehensive technical documents.
- Coordinated activities between internal and external resources and facilitated a smooth workflow for service delivery.
Environment: Eclipse, Struts, Java, JSP, JavaScript, J2EE, JMS, JDBC, tags, Hibernate, Servlets, JUnit, Log4j, XML, MS SQL Server, BEA WebLogic Application Server.