Cloud Engineer / Architect Resume
Richmond, VA
SUMMARY
- Working as a Developer/Architect on Big Data solutions involving Hortonworks, Cloudera, Microsoft Azure, GCP, and AWS
 - Cloudera Certified Developer for Apache Hadoop with 21 years of IT experience in development, design, application support and maintenance, and managing IT application projects.
 - Experience in planning and defining scope, developing schedules, budgeting and cost estimation, team leadership, and monitoring and reporting progress
 - Experience in Java design patterns using open-source products.
 - Experience in end-to-end solutions using Hadoop HDFS, MapReduce, Storm, Solr, Kafka, Scala, Pig, Hive, HBase, Sqoop, Oozie, and Airflow, and in performance tuning Hadoop clusters on-premises and in the cloud.
 - Experience programming in Python, Java, and Scala on Spark
 - Experience in real-time streaming using Kafka and Storm (see the sketch after this summary).
 - Experience in infrastructure estimation and capacity planning of clusters.
 - Experience in installing, configuring, and upgrading Hadoop clusters using Cloudera Manager and deployment tools.
 - Importing and exporting data into HDFS and Hive using Sqoop.
 - Experience working with large databases such as Oracle, MySQL, Teradata, DB2, Postgres, Greenplum, and Redshift
 - Experience in data migration and data modeling using NoSQL databases (HBase, Cassandra)
 - Knowledge of data analytics using R, Apache Hadoop, and MapReduce
 - Knowledge of data analysis techniques such as clustering, classification, regression, forecasting, and prediction using R
 - Knowledge of BI and data warehouse processes and techniques
 - Expert in multi-threaded applications using VC++ and C++ in Windows, UNIX, and Solaris environments
 - Experience in various activities of Agile methodology, UMF, UML, and design patterns
 - Experience in various phases of software development such as study, analysis, development, testing, implementation, and maintenance of real-time systems.
 - Experience in building applications on Kafka/Kinesis
 - Knowledge of data analytics, data science, and machine learning
 - Experience managing and leading projects in an onsite/offshore model
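A minimal sketch of the real-time consumption pattern referenced above, using the kafka-python client; the topic, broker address, and payload are hypothetical placeholders rather than details of any specific engagement:

    # Consume JSON events from a Kafka topic (hypothetical names throughout).
    import json
    from kafka import KafkaConsumer

    consumer = KafkaConsumer(
        "events",                              # hypothetical topic
        bootstrap_servers=["localhost:9092"],  # hypothetical broker
        auto_offset_reset="earliest",
        value_deserializer=lambda b: json.loads(b.decode("utf-8")),
    )

    for message in consumer:
        event = message.value
        # Downstream processing (e.g., hand-off to Storm/Spark) would go here.
        print(event)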
 
TECHNICAL SKILLS
Programming Languages: Java, Python, R, MapReduce, Pig Latin, Scala, C++, VC++, C# (.NET 4.0).
Tools: MS Project, Visual Studio, CVS, Sqoop, Oozie, ZooKeeper, Spark, Storm, NiFi, Kafka, AWS PowerShell, SVN, Bitbucket, GitLab, AIM, Tableau, Looker, AWS.
Frameworks: Hadoop HDFS (Apache, Cloudera, Hortonworks), Spark, Cassandra, Microsoft Azure HDInsight, MVC.
Databases: MySQL, Oracle, Greenplum, Teradata, Hive, HBase, Sybase, DB2.
Operating Systems: Windows, UNIX, Linux, Solaris
Theoretical Knowledge: Flume, Solr, MicroStrategy, Beam
PROFESSIONAL EXPERIENCE
Confidential, Richmond VA
Cloud Engineer / Architect
Responsibilities:
- Design, develop, and ingest data from different sources into Redshift and Snowflake from S3 buckets (see the sketch after this entry)
 - Transformations using Spark Scala, with ingestion at different stages of the project
 - Validation and testing of data quality and supporting QA.
 - Wrote a common framework to support reading any file format and ingesting into Postgres and Redshift
 - Built an end-to-end data pipeline framework.
 - Developed POCs for various applications.
 
Environment: AWS, Airflow, Spark 2.4, Postgres, GitHub, Python, Shell Scripting, Redshift, Snowflake.
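A minimal PySpark sketch of the S3-to-warehouse ingestion described in this entry; the bucket, connection details, and column names are hypothetical, and Redshift could equally be targeted through its JDBC driver or an S3 COPY:

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("s3-ingest").getOrCreate()

    # Read raw files landed in S3 (assumes the S3A connector is configured).
    raw = spark.read.option("header", "true").csv("s3a://my-bucket/landing/")

    # One transformation stage: type the columns and stamp a load date.
    curated = (raw
               .withColumn("amount", F.col("amount").cast("double"))
               .withColumn("load_dt", F.current_date()))

    # Ingest into Postgres over JDBC (placeholder connection details).
    (curated.write
        .format("jdbc")
        .option("url", "jdbc:postgresql://db-host:5432/warehouse")
        .option("dbtable", "staging.events")
        .option("user", "etl_user")
        .option("password", "***")
        .mode("append")
        .save())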
Confidential, Irving, TX
Big Data Consultant
Responsibilities:
- Design and ingest data in different file formats read from S3 buckets
 - Transformations using Spark Scala, with ingestion at different stages of the project
 - Validation and testing of data quality and supporting QA.
 - Wrote a common framework to support reading any file format and ingesting into Postgres
 - Wrote parallelized writing and reading of data against Postgres (see the sketch after this entry).
 - Production support and automation of jobs in different stages.
 
Environment: AWS, NiFi, Spark 2.4, Postgres, GitLab, Python, Shell Scripting, Java, Elasticsearch.
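A sketch of the parallelized Postgres reads and writes mentioned above: Spark's JDBC source splits a read into numPartitions range queries over a numeric column, and each DataFrame partition opens its own connection on write. All table, column, and connection details are hypothetical:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("pg-parallel").getOrCreate()

    # Parallel read: 16 concurrent range queries split on transaction_id.
    df = (spark.read.format("jdbc")
          .option("url", "jdbc:postgresql://db-host:5432/warehouse")
          .option("dbtable", "public.transactions")
          .option("user", "etl_user")
          .option("password", "***")
          .option("partitionColumn", "transaction_id")
          .option("lowerBound", "1")
          .option("upperBound", "10000000")
          .option("numPartitions", "16")
          .load())

    # Parallel write: repartitioning controls write concurrency.
    (df.repartition(16).write.format("jdbc")
       .option("url", "jdbc:postgresql://db-host:5432/warehouse")
       .option("dbtable", "public.transactions_copy")
       .option("user", "etl_user")
       .option("password", "***")
       .mode("append")
       .save())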
Confidential, New Jersey, NJ
Big Data Architect / Developer.
Responsibilities:
- Design and ingest data from Greenplum and Oracle.
 - Estimation of infrastructure and capacity planning of the cluster.
 - Installation and maintenance of the cluster along with the admin team.
 - Granting permissions and ensuring LDAP and Kerberos authentication.
 - Performance tuning and running jobs in the Hadoop cluster.
 - Monitoring Hadoop cluster uptime and health.
 - Clearing logs and taking backups for disaster recovery.
 - Interacting with various teams to implement best practices and contributing to POCs on different use cases.
 - Transformations in Scala and Python using the Spark framework (see the sketch after this entry).
 - Writing ingestion scripts through Sqoop jobs.
 - Automation of data flow, deploying and scheduling jobs in Control-M.
 - Benchmarking and performance tuning of Impala queries and concurrency tests.
 - Involved in production support and coordinating with admin teams to resolve issues.
 - Understanding the domain and the existing warehouse and migrating the data into Hadoop.
 
Environment: Cloudera 9.1, Impala, Spark SQL, Greenplum, Oracle, Sqoop, HBase, Kafka, Control-M, SVN, Bitbucket, Python, Shell Scripting, Scala, and Tableau.
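A minimal PySpark sketch of the Spark transformations in this entry, pulling from Greenplum over JDBC (Greenplum speaks the Postgres wire protocol) and landing Parquet on HDFS where Impala can query it after a REFRESH; all names and connection details are hypothetical:

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("gp-to-hadoop").getOrCreate()

    src = (spark.read.format("jdbc")
           .option("url", "jdbc:postgresql://greenplum-host:5432/edw")
           .option("dbtable", "edw.customers")
           .option("user", "etl_user")
           .option("password", "***")
           .load())

    # Example transformation: de-duplicate and stamp an ingest timestamp.
    cleaned = (src.dropDuplicates(["customer_id"])
                  .withColumn("ingest_ts", F.current_timestamp()))

    cleaned.write.mode("overwrite").parquet("hdfs:///data/curated/customers")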
Confidential, TX
Big Data Architect
Responsibilities:
- Architecting and designing the data pipeline.
 - Gathering requirements from SMEs.
 - Writing ingestion scripts executed through jobs.
 - Creating a pipeline for TIBCO streaming data and storing it in HBase.
 - Understanding the domain and the existing warehouse and migrating the data into Hadoop.
 - Assigning tasks to the offshore team and monitoring progress through Agile methodology.
 
Environment: Hortonworks 2.3, Hive, Spark Streaming, Spark SQL, Teradata, DB2, Shell Scripting, Sqoop, HBase, Control-M
Confidential, Worcester, MA
Big Data Architect
Responsibilities:
- Determining the cluster size based on data size and incremental data.
 - Installing and maintaining HDP 2.3 on a 4-node cluster.
 - Involved in preparing the technical design document.
 - Architecting and designing the end-to-end data pipeline.
 - Responsible for assigning and implementing development tasks.
 - Writing a UDTF in Java to parse ACORD XML and store it in Hive (see the sketch after this entry).
 - Implementing workflows and automation through NiFi.
 
Environment: Hortonworks 2.3, Spark, Java, Hive, NiFi, Sqoop, Oracle, DB2, Power BI, and Control-M.
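The entry above used a Java UDTF for the ACORD XML; as an illustration of the same row-exploding idea in Python, here is a comparable script for Hive's TRANSFORM streaming interface. The element and field names are hypothetical, not actual ACORD structures:

    #!/usr/bin/env python3
    # Explode one XML document per input row into (policy_number, premium)
    # rows for Hive, reading from stdin and writing tab-separated output.
    import sys
    import xml.etree.ElementTree as ET

    for line in sys.stdin:
        line = line.strip()
        if not line:
            continue
        try:
            root = ET.fromstring(line)
        except ET.ParseError:
            continue  # skip malformed rows
        for policy in root.iter("Policy"):  # one output row per element
            number = policy.findtext("PolicyNumber", default="")
            premium = policy.findtext("Premium", default="")
            print(f"{number}\t{premium}")

It would be wired in with ADD FILE followed by SELECT TRANSFORM(xml_doc) USING 'parse_policies.py' AS (policy_number, premium) FROM raw_xml.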
Confidential
Hadoop Engineer
Responsibilities:
- Architecting and designing the data pipeline.
 - Gathering requirements from the business.
 - Interacting with data modelers and creating the refined data scripts.
 - Curating and scrubbing data through Pig and loading it to the cloud environment.
 - Creating Hive external tables for data scientists to use for analytics (see the sketch after this entry).
 - Involved in data optimization and performance tuning of Hive queries.
 - Supporting the visualization team in creating dashboards in QlikView.
 
Environment: Hortonworks 2.3, Microsoft Azure HDInsight on Windows and Linux, Spark, Java, Python, Hive, NiFi, Sqoop, Oracle, DB2, Power BI, Control-M, Blob Storage, ETL (Informatica), QlikView, PowerShell, and Pig.
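A minimal sketch of the Hive external tables mentioned in this entry, issued through Spark SQL; the database, table, column names, and storage path are hypothetical:

    from pyspark.sql import SparkSession

    spark = (SparkSession.builder.appName("hive-external")
             .enableHiveSupport().getOrCreate())

    spark.sql("CREATE DATABASE IF NOT EXISTS analytics")

    # External table: Hive tracks the schema, the files stay where they are.
    spark.sql("""
        CREATE EXTERNAL TABLE IF NOT EXISTS analytics.web_events (
            event_id STRING,
            user_id  STRING,
            event_ts TIMESTAMP
        )
        STORED AS PARQUET
        LOCATION '/data/curated/web_events'
    """)

    # Data scientists can then query it directly:
    spark.sql("SELECT COUNT(*) FROM analytics.web_events").show()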
Confidential, New York
Developer /Architect
Responsibilities:
- Architecting and designing the project from the start.
 - Interacting with data scientists and gathering requirements.
 - Installation of a 4-node HDP cluster and creation of user profiles.
 - Developing and integrating Scala applications on Spark (RDDs) using the Hadoop cluster.
 - Coding and implementing Hive tables using monthly partitions (see the sketch after this entry).
 - Importing data from Oracle using Sqoop and other files through FTP.
 - Developing and scripting in Python and Scala.
 - Developing and running scripts in the Linux production environment.
 - Reporting and visualizations using Business Objects (Tableau).
 
Environment: Cloudera 5.3, RHEL, R, Sqoop, SparkR, Oozie, Hive, Scala, Python, BO.
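A sketch of the monthly-partitioned Hive tables in this entry, deriving the partition key from an event date at write time; table and column names are hypothetical:

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = (SparkSession.builder.appName("monthly-partitions")
             .enableHiveSupport().getOrCreate())

    df = spark.table("staging.transactions")  # hypothetical source table

    (df.withColumn("month", F.date_format("txn_date", "yyyy-MM"))
       .write
       .partitionBy("month")  # one Hive partition per calendar month
       .mode("append")
       .saveAsTable("curated.transactions"))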
Confidential
Developer /Architect
Responsibilities:
- Analyze the Barclays enterprise architecture
 - Propose a solution for migrating from Teradata to the Hadoop platform
 - Design the actual solution
 - Analyze the required tools and technologies, the latest versions of the existing software, and their support for Hadoop HDFS.
 - Propose technology upgrades
 - Hadoop cluster sizing
 - Develop a solution for end-to-end data flow from Oracle to HDFS, accessing data available in HDFS from Informatica, and pushing/pulling data between Teradata and HDFS
 
Environment: Java 1.7, Cloudera 4, Sqoop 1.4.3, Hive 0.96, RHEL, Eclipse Helios, Informatica 9.5, Teradata, Oozie, ZooKeeper, Oracle.
Confidential
Developer /Architect
Responsibilities:
- Responsible for performing in-depth analysis and conceptualization of a Retail Banking customer 360-degree view.
 - Responsible for creating use cases/functional requirements.
 - Responsible for designing the user interface.
 - Responsible for creating the entity model design and the user interface using AppBuilder.
 - Responsible for the overall delivery of the solution developed using InfoSphere Data Explorer.
 - Created the estimation and work breakdown structure for the solution.
 - Planned the development and helped the team resolve technical issues.
 - Worked with the Data Explorer product development team to resolve technical issues and identify solutions.
 - Published the solution offering document for this solution.
 - Responsible for taking business requirements from the banking SME to process credit card historical data.
 - Responsible for loading credit card historical data into Hadoop.
 - Responsible for designing and developing MapReduce programs to process the credit card historical data for analytics and generate output, which is further indexed using the BigIndex API for visualization.
 - Involved in setting up a 50-node Hadoop cluster for executing the solution
 
Environment: InfoSphere Data Explorer 8.2.3, Hortonworks, DB2 9.7, Linux, Hive, HBase, Zookeeper.
Confidential
Project Lead
Responsibilities:
- Developed classes for an end-to-end framework for interacting with Hadoop.
 - Worked on Hive queries to accomplish inserts and updates on data through joins and functions, including UDFs (see the sketch after this entry).
 - Participated extensively in the design and development of the project.
 
Environment: Apache Hadoop, Hive, Java.
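The UDFs here were Hive UDFs (typically Java); as an illustration of the same idea, this registers a Python UDF that SQL queries with joins can call. The function, table, and column names are hypothetical:

    from pyspark.sql import SparkSession
    from pyspark.sql.types import StringType

    spark = (SparkSession.builder.appName("udf-demo")
             .enableHiveSupport().getOrCreate())

    def normalize_account(acct):
        # Strip separators and upper-case the account code.
        return acct.replace("-", "").upper() if acct else None

    spark.udf.register("normalize_account", normalize_account, StringType())

    spark.sql("""
        SELECT a.id, normalize_account(a.account_code) AS account_code
        FROM accounts a JOIN customers c ON a.customer_id = c.id
    """).show()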
Confidential
Developer / Project Lead
Responsibilities:
- De-duplicate incoming data using MapReduce written in Java (see the sketch after this entry)
 - Write transformation queries in Hive
 - Write UDFs in Hive
 
Environment: Hadoop, Hive, Amazon Web Services, Java.
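The de-duplication above was Java MapReduce; a comparable Python version via Hadoop Streaming treats each record as the shuffle key so identical records collapse in the reducer. The script name and record layout are hypothetical:

    #!/usr/bin/env python3
    # dedup.py -- run as `dedup.py map` or `dedup.py reduce` by the
    # streaming job; the shuffle/sort between the two phases groups
    # identical records on the same reducer.
    import sys

    def mapper():
        for line in sys.stdin:
            record = line.rstrip("\n")
            if record:
                print(f"{record}\t1")  # the record itself is the key

    def reducer():
        previous = None
        for line in sys.stdin:
            key = line.rstrip("\n").split("\t")[0]
            if key != previous:  # input arrives sorted, emit each key once
                print(key)
                previous = key

    if __name__ == "__main__":
        mapper() if sys.argv[1] == "map" else reducer()

It would be submitted with something like: hadoop jar hadoop-streaming.jar -files dedup.py -mapper "dedup.py map" -reducer "dedup.py reduce" -input <input> -output <output>.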
Confidential, Dallas, TX
Project Lead
Responsibilities:
- Managing and leading the project team
 - Detailed project planning and controlling
 - Managing project deliverables in line with the project plan
 - Coach, mentor and lead personnel within a technical team environment
 - Recording and managing project issues and escalating where necessary
 - Monitoring project progress and performance
 - Providing status reports to the project sponsor
 - Managing project training within the defined budget
 
