Project Lead Resume
Dallas, TX
SUMMARY:
- Working as an Architect on Big Data solutions involving Cloudera and Microsoft Azure HDInsight.
- Cloudera Certified Developer for Apache Hadoop with 16+ years of IT experience in development, design, application support & maintenance, and managing IT application projects.
- Experience in Planning and Defining Scope, Developing Schedules, Budgeting, Cost estimations, Team Leadership, Monitoring and Reporting Progress
- Experience with Java design patterns using open-source products.
- Experience in end-to-end solutions using Hadoop HDFS, MapReduce, Storm, Solr, Kafka, Scala, Pig, Hive, HBase, Sqoop, Oozie and ZooKeeper, and in performance tuning of Hadoop clusters.
- Experience programming in Python on Spark.
- Experience in real time streaming using Kafka and Storm (POC).
- Experience in installing, configuring and upgrading Hadoop clusters using Cloudera Manager.
- Importing and exporting data between relational databases and HDFS/Hive using Sqoop.
- Experience working with large databases like Oracle, MySQL and DB2.
- Experience in data migration and data modeling using NoSQL databases.
- Knowledge of data analytics using R, Apache Hadoop and MapReduce.
- Knowledge of data analysis techniques such as clustering, classification, regression, forecasting and prediction using R.
- Knowledge on BI and data warehouse processes and techniques.
- Expert in working with multi-threaded applications using VC++ and C++ in Windows, Unix and Solaris environments.
- Experience in various activities of Agile methodology, UMF, UML and design patterns.
- Experience in various phases of software development: study, analysis, development, testing, implementation and maintenance of real-time systems.
- Experience in managing and leading projects with an onsite/offshore model.
TECHNICAL SKILLS:
Programming Languages: Java, Python, R, MapReduce, Pig Latin, Spark, Scala, C++, VC++, C# .NET 4.0.
Tools: MS Project, Visual Studio, CVS, Sqoop, Oozie, ZooKeeper, Spark, Storm, Kafka, AWS, PowerShell.
Frameworks: Hadoop HDFS (Apache, Cloudera, Hortonworks), Cassandra, Microsoft Azure HDInsight, MVC.
Databases: MySQL, Oracle, Teradata, Hive, HBase, MongoDB, Sybase, DB2.
Operating Systems: Windows, UNIX, Linux, Solaris.
Theoretical Knowledge: Flume, Solr, MicroStrategy.
PROFESSIONAL EXPERIENCE:
Confidential, Boston, MA
Hadoop Engineer
Responsibilities:
- Designing a Proof of Concept (POC) Data Lake to demonstrate the value of Big Data analytics using data from the Confidential Investments, RPS and Insurance lines of business.
- This POC is a first step toward realizing the Big Data Shared Service (BDSS) to source, cleanse, curate and deliver analytical data to Confidential (JH) Business Units and Functional Areas.
- Architecting and designing the data pipeline.
- Interacting with data modelers and creating the refined data scripts.
- Curating and scrubbing data with Pig and loading it to the cloud environment.
- Creating Hive external tables for data scientists to use for analytics.
- Involved in data optimization and performance tuning of Hive queries on Spark.
- Developing and resolving queries on the Spark cluster using Python.
- Supporting the visualization team in creating dashboards in QlikView.
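The curation and scrubbing step above used Pig on the project; as an illustration only, the same kind of scrub logic can be sketched in plain Python. The field names and cleaning rules here are hypothetical, not taken from the actual pipeline:

```python
# Sketch of a scrub/curate step: drop malformed rows and normalize
# fields before loading. Field names and rules are hypothetical.

def scrub(records):
    """Drop rows missing a required key and normalize the rest."""
    cleaned = []
    for rec in records:
        # Skip rows missing the (hypothetical) required key.
        if not rec.get("account_id"):
            continue
        cleaned.append({
            "account_id": rec["account_id"].strip(),
            "region": rec.get("region", "UNKNOWN").upper(),
            "amount": float(rec.get("amount", 0) or 0),
        })
    return cleaned

raw = [
    {"account_id": " A1 ", "region": "ne", "amount": "10.5"},
    {"account_id": "", "region": "sw", "amount": "3"},  # dropped: no key
    {"account_id": "A2"},                               # defaults applied
]
print(scrub(raw))
```

In the actual pipeline such rules would be expressed as Pig Latin FILTER and FOREACH statements rather than Python.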
Confidential, New York
Developer /Architect
Technologies: Cloudera 5.3, RHEL, R, Sqoop, SparkR, Oozie, Hive, Python, BO, Kafka, Spark Streaming.
Responsibilities:
- Architecting and designing the project from the start.
- Interacting with data scientists and gathering requirements.
- Installation of Spark cluster and performance tuning.
- Developing and integrating applications on Spark (RDDs) using the Hadoop cluster.
- Coding and implementing Hive tables with monthly partitions.
- Importing data from Oracle using Sqoop and other files via FTP.
- Developing and scripting in Python.
- Developing and running scripts in the Linux production environment.
- Reporting and visualization using Business Objects (Tableau).
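The monthly partitioning mentioned above is, in Hive, a `PARTITIONED BY` column on the table; the routing idea can be sketched in plain Python. The record layout below is hypothetical:

```python
# Sketch: route each row to a monthly partition key of the form YYYY-MM,
# mirroring how a Hive table partitioned by month organizes its data.
# The record layout is made up for illustration.

from collections import defaultdict
from datetime import date

def partition_by_month(rows):
    """Group rows into monthly partitions keyed by 'YYYY-MM'."""
    parts = defaultdict(list)
    for row in rows:
        d = row["txn_date"]
        parts[f"{d.year:04d}-{d.month:02d}"].append(row)
    return dict(parts)

rows = [
    {"txn_date": date(2015, 1, 3), "amount": 10},
    {"txn_date": date(2015, 1, 28), "amount": 5},
    {"txn_date": date(2015, 2, 2), "amount": 7},
]
print(sorted(partition_by_month(rows)))  # ['2015-01', '2015-02']
```

Partitioning by month keeps each Sqoop import bounded to one partition directory, so monthly reloads can overwrite a single partition instead of the whole table.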
Confidential
Developer /Architect
Technologies: Java 1.7, Cloudera 4, Sqoop 1.4.3, Hive 0.96, RHEL, Eclipse Helios, Informatica 9.5, Teradata, Oozie, ZooKeeper, Oracle.
Responsibilities:
- Analyze Barclays enterprise architecture.
- Propose a solution for migrating from Teradata to the Hadoop platform.
- Design the actual solution.
- Analyze the tools and technologies required, and the Hadoop HDFS support provided by the latest versions of the existing software.
- Propose technology upgrades.
- Size the Hadoop cluster.
- Develop a solution for end-to-end data flow from Oracle to HDFS, accessing data available in HDFS from Informatica, and pushing/pulling data between Teradata and HDFS.
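Hadoop cluster sizing, listed above, largely reduces to capacity arithmetic over ingest rate, retention, replication and overhead. A sketch with entirely hypothetical input figures (none of these numbers come from the project):

```python
# Back-of-envelope Hadoop cluster sizing. All inputs are hypothetical
# assumptions for illustration, not project figures.

def nodes_needed(daily_ingest_tb, retention_days, replication=3,
                 overhead=1.25, disk_per_node_tb=24):
    """Estimate worker-node count from ingest, retention and replication.

    overhead covers temp/intermediate MapReduce space; disk_per_node_tb
    is the usable disk per worker node.
    """
    raw_tb = daily_ingest_tb * retention_days
    total_tb = raw_tb * replication * overhead
    return int(-(-total_tb // disk_per_node_tb))  # ceiling division

# e.g. 1 TB/day kept for a year with HDFS default 3x replication:
print(nodes_needed(1, 365))  # 58
```

The replication factor of 3 is the HDFS default; retention, overhead and disk-per-node figures vary per deployment and would be negotiated during the sizing exercise.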
Confidential
Developer /Architect
Technologies: Hortonworks, DB2 9.7, Linux, Hive, HBase, Zookeeper.
Responsibilities:
- Responsible for performing in-depth analysis and conceptualization of a Retail Banking Customer 360-degree view.
- Responsible for creating use cases/functional requirements.
- Responsible for designing the user interface.
- Responsible for creating entity model design and user interface creation using AppBuilder.
- Responsible for the overall solution delivery developed using InfoSphere Data Explorer.
- Created the estimation and work breakdown structure for the solution.
- Planned the development and helped the team resolve technical issues.
- Worked with Data Explorer product development team to resolve technical issues and identify solutions.
- Published solution offering document for this solution.
- Responsible for gathering business requirements for processing credit card historical data from the Banking SME.
- Responsible for loading credit card historical data into Hadoop.
- Responsible for designing and developing MapReduce programs that process the credit card historical data for analytics and generate output that is further indexed via an API for visualization purposes.
- Involved in setting up a 50-node Hadoop cluster for executing the solution.
Confidential
Project Lead
Technologies: Apache Hadoop, Hive, Java.
Responsibilities:
- Hadoop EDW 2.0 is a migration project, developed in Java, that migrates existing data from Teradata to Hadoop.
- HDFS is used for storage and Hive for querying the data; the project framework is written in Java.
- The solution provides a single store, i.e. Hadoop, for data generated by various applications across all regions and generates aggregates as the end result.
- Developed classes for end-to-end framework for interacting with Hadoop.
- Worked on Hive queries to accomplish inserts and updates on data through joins and functions, including UDFs.
- Participated extensively in the design and development of the project.
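The update-through-joins bullet reflects that Hive of that era had no row-level UPDATE: the usual pattern is to rewrite a table (INSERT OVERWRITE) from a join of the base data with a delta, preferring the delta row when keys collide. A plain-Python sketch of that merge, with hypothetical keys and fields:

```python
# Sketch of the "update via join" pattern: merge a base table with a
# delta table, letting delta rows win on key collisions. In Hive this
# would be an INSERT OVERWRITE from a join. Keys/fields are hypothetical.

def merge_base_with_delta(base, delta, key="id"):
    """Return base rows overridden/extended by delta rows, sorted by key."""
    merged = {row[key]: row for row in base}
    merged.update({row[key]: row for row in delta})  # delta wins
    return sorted(merged.values(), key=lambda r: r[key])

base = [{"id": 1, "val": "old"}, {"id": 2, "val": "keep"}]
delta = [{"id": 1, "val": "new"}, {"id": 3, "val": "insert"}]
print(merge_base_with_delta(base, delta))
```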
Confidential
Developer / Project Lead
Technologies: Hadoop, Hive, Amazon Web Services, Java.
Responsibilities:
- Confidential Dispenser Analytics is a project to analyze data generated by the company's drink dispenser machines and determine in real time when to refill the drink reserves.
- Run real-time and batch-mode analytics on the data to detect usage patterns and customer presence, and to understand the geographical distribution of products and consumers in real time.
- The data volume involved was 2.2 TB. The technology stack was based on AWS.
- S3 was used for storage, EMR for processing and Hive for transformations.
- Some parts of the project also used DynamoDB (for metadata), Glacier (for archiving) and SQS (for data ingestion queues).
- The project was organized in three layers: Staging, Processing and BI.
- De-duplicate incoming data using MapReduce written in Java.
- Write transformation queries in Hive.
- Write UDFs in Hive.
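The de-duplication job above was written in Java MapReduce on EMR; purely as an illustration of its map/shuffle/reduce shape, the same logic can be sketched in Python. The record format is made up:

```python
# Sketch of MapReduce de-duplication: map emits each record as its own
# key; the shuffle/sort groups identical keys; reduce emits one copy
# per group. The real job was Java on EMR; records here are hypothetical.

from itertools import groupby

def map_phase(records):
    # Emit (record, None) pairs so identical records share a key.
    return [(rec, None) for rec in records]

def reduce_phase(pairs):
    # Simulate the shuffle/sort, then keep one record per key group.
    pairs.sort(key=lambda kv: kv[0])
    return [key for key, _group in groupby(pairs, key=lambda kv: kv[0])]

records = ["a,1", "b,2", "a,1", "c,3", "b,2"]
print(reduce_phase(map_phase(records)))  # ['a,1', 'b,2', 'c,3']
```

Emitting the whole record as the key lets the framework's shuffle do the grouping, so the reducer's only job is to output each key once.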
Confidential, Dallas, TX
Project Lead
Responsibilities:
- Managing and leading the project team
- Detailed project planning and controlling
- Managing project deliverables in line with the project plan
- Coach, mentor and lead personnel within a technical team environment
- Recording and managing project issues and escalating where necessary
- Monitoring project progress and performance
- Providing status reports to the project sponsor
- Managing project training within the defined budget
Confidential
Project Lead
Technologies: VC++, MySQL, Windows XP
Responsibilities:
- Enhancing the application
- Developing and maintaining a detailed project plan
- Understanding SRS, Functional Specification documents
- Design Document, Coding and Test Cases Preparation
- Client interaction, status reporting to manager and above
- Liaising with, and updating progress to, the project steering board/senior management.
- Managing project evaluation and dissemination activities
Confidential
Project Lead
Responsibilities:
- dSpace software uses Python interfaces to perform HIL testing of the test scripts developed by GMPT for various engine management strategies.
- GMPT has developed its own test architecture, called ATA.
- The new test scripts developed for test methods are put under ATA and access the dSpace interface libraries to get data from the HIL.
- When the HIL setup is not present, the dSpace interface libraries raise errors and ATA goes into virtual (offline) mode, in which stub classes and functions provide dummy values for the HIL parameters.
- This is not sufficient to provide exhaustive coverage of the ATA test cases, and many errors in the test cases go unnoticed.
- Involved in R&D on interfacing Excel with Python.
- Understanding SRS, Functional Specification documents
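The stub-class behavior described above can be sketched as follows. The class, method and parameter names are hypothetical illustrations, not the real dSpace API:

```python
# Sketch of offline-mode stubbing: when the real HIL interface is
# unavailable, a stub with the same method surface returns dummy values
# so ATA test scripts can still run. Names are hypothetical, not the
# actual dSpace library.

class HilInterfaceStub:
    """Stand-in for the real HIL interface in virtual/offline mode."""

    DEFAULTS = {"engine_rpm": 0.0, "coolant_temp": 20.0}

    def read_parameter(self, name):
        # Return a dummy value instead of raising, as the real library
        # would when no HIL setup is present.
        return self.DEFAULTS.get(name, 0.0)

    def write_parameter(self, name, value):
        # Accept and discard writes in offline mode.
        return True

hil = HilInterfaceStub()
print(hil.read_parameter("coolant_temp"))  # 20.0
```

As the resume notes, fixed dummy values like these cannot exercise the full range of HIL behaviors, which is why errors in test cases can go unnoticed in offline mode.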
Confidential
Team Member
Technologies: Visual C++ 2005
Responsibilities:
- Analysis
- Understanding SRS, Functional Specification documents
- Design and coding of the product
Confidential
Team Lead
Technologies: Visual C++ 6.0, Win CVS, Oracle 10g.
Responsibilities:
- Analysis
- Understanding SRS, Functional Specification documents
- Design and coding of the product
Confidential
Developer /Team Lead
Technologies: Windows NT, Solaris, Visual C++ 6.0, VSS, pSOS (RTOS).
Responsibilities:
- Involved in the interaction with the customer for study and analysis of the requirements of the system.
- Responsible for analysis and design of the modules using RequisitePro and Rational Rose. This involved highly interactive displays, a handy programming interface to coordinate among multiple threads of execution, and structured exception-handling techniques for handling errors.
- Handled the Critical Analysis Module, which deals with real-time results.
- Responsible for testing the functionality of devices at the customer site.
- Responsible for developing various modules using VC++ in Windows NT environment. Integrated the modules with the project using Visual Source Safe 5.0.
- Involved in integrating and integration testing of the total system.
- Tested the developed modules using real-time software simulators and in the real-time environment with all the devices, using tools such as Rational Purify and Quantify.
- Involved in designing the architecture of the overall system.
- Involved in designing and developing the communication mechanism, which deals with a real-time background process.
