Cloud Engineer / Architect Resume
Richmond, VA
SUMMARY
- Working as a Developer/Architect on Big Data solutions involving Hortonworks, Cloudera, Microsoft Azure, GCP, and AWS
- Cloudera Certified Developer for Apache Hadoop with 21 years of IT experience in development, design, application support and maintenance, and managing IT application projects.
- Experience in Planning and Defining Scope, Developing Schedules, Budgeting, Cost Estimation, Team Leadership, and Monitoring and Reporting Progress
- Experience with Java design patterns using open-source products.
- Experience in end-to-end solutions using Hadoop HDFS, MapReduce, Storm, Solr, Kafka, Scala, Pig, Hive, HBase, Sqoop, Oozie, and Airflow, and in performance tuning Hadoop clusters on-premises and in the cloud.
- Experience programming in Python, Java, and Scala on Spark (illustrative sketch following this summary)
- Experience in real time streaming using Kafka and Storm.
- Experience in estimation of infrastructure and capacity planning of the cluster.
- Experience in installing, configuring, and upgrading Hadoop clusters using Cloudera Manager and deployment tools.
- Importing and exporting data into and out of HDFS and Hive using Sqoop.
- Experience working with large databases such as Oracle, MySQL, Teradata, DB2, Postgres, Greenplum, and Redshift
- Experience in data migration and data modeling using NoSQL databases (HBase, Cassandra)
- Knowledge of data analytics using R, Apache Hadoop, and MapReduce
- Knowledge of data analysis techniques such as clustering, classification, regression, forecasting, and prediction using R
- Knowledge of BI and data warehouse processes and techniques
- Expert in building multi-threaded applications using VC++ and C++ in Windows, UNIX, and Solaris environments
- Experience in various Agile methodology activities, UMF, UML, and design patterns
- Experience in various phases of software development such as study, analysis, development, testing, implementation, and maintenance of real-time systems.
- Experience in building applications on Kafka/Kinesis
- Knowledge of data analytics, data science, and machine learning
- Experience in managing and leading projects in an onsite/offshore model
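For illustration, a minimal sketch of the Spark-on-Scala ingestion work summarized above; the connection URL, credentials, and table names are hypothetical placeholders rather than details from any engagement.

// Minimal Spark ingestion sketch (all names hypothetical): read a
// relational table over JDBC, apply a light transformation, and
// persist the result as a Hive table for downstream consumers.
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, current_timestamp, lower}

object IngestSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("ingest-sketch")
      .enableHiveSupport()
      .getOrCreate()

    // JDBC source; URL, table, and credentials are placeholders.
    val orders = spark.read.format("jdbc")
      .option("url", "jdbc:mysql://db-host:3306/sales")
      .option("dbtable", "orders")
      .option("user", "etl_user")
      .option("password", sys.env("DB_PASSWORD"))
      .load()

    // Light transformation: normalize a column and stamp the load time.
    val curated = orders
      .withColumn("status", lower(col("status")))
      .withColumn("load_ts", current_timestamp())

    curated.write.mode("overwrite").saveAsTable("curated.orders")
    spark.stop()
  }
}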
TECHNICAL SKILLS
Programming Languages: Java, Python, R, MapReduce, Pig Latin, Scala, C++, VC++, C# .NET 4.0.
Tools: MS Project, Visual Studio, CVS, Sqoop, Oozie, ZooKeeper, Spark, Storm, NiFi, Kafka, AWS PowerShell, SVN, Bitbucket, GitLab, AIM, Tableau, Looker, AWS.
Frameworks: Hadoop HDFS (Apache, Cloudera, Hortonworks), Spark, Cassandra, Microsoft Azure HDInsight, MVC.
Databases: MySQL, Oracle, Greenplum, Teradata, Hive, HBase, Sybase, DB2.
Operating Systems: Windows, UNIX, Linux, Solaris
Theoretical Knowledge: Flume, Solr, MicroStrategy, Beam
PROFESSIONAL EXPERIENCE
Confidential, Richmond, VA
Cloud Engineer / Architect
Responsibilities:
- Designing, developing, and ingesting data from different sources into Redshift and Snowflake from S3 buckets (illustrative sketch below)
- Performing transformations using Spark Scala and ingesting data at different stages of the project
- Validating and testing data quality and supporting QA.
- Writing a common framework to support reading any file format and ingesting it into Postgres and Redshift
- Building an end-to-end framework for data pipelines.
- Developing POCs for various applications.
Environment: AWS, Spark 2.4, Postgres, GitHub, Python, Shell Scripting, Redshift, Snowflake, Airflow.
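As an illustrative sketch of the S3-to-Redshift ingest described above; bucket, table, and connection details are hypothetical. A dedicated Redshift connector with an S3 staging area is the usual choice at scale; plain JDBC just keeps the sketch generic.

// Sketch (hypothetical names): read Parquet landed in S3 and append it
// to a Redshift staging table over plain JDBC.
import org.apache.spark.sql.{SaveMode, SparkSession}

object S3ToRedshift {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("s3-to-redshift").getOrCreate()

    // Raw files landed in S3 by upstream producers.
    val raw = spark.read.parquet("s3a://landing-bucket/events/")

    raw.write.mode(SaveMode.Append).format("jdbc")
      .option("url", "jdbc:redshift://cluster-host:5439/analytics")
      .option("dbtable", "staging.events")
      .option("user", "etl_user")
      .option("password", sys.env("REDSHIFT_PASSWORD"))
      .save()
  }
}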
Confidential, Irving, TX
Big Data Consultant
Responsibilities:
- Designing and ingesting data in different file formats read from S3 buckets
- Performing transformations using Spark Scala and ingesting data at different stages of the project
- Validating and testing data quality and supporting QA.
- Writing a common framework to support reading any file format and ingesting it into Postgres
- Writing parallelized reads and writes of data against Postgres (illustrative sketch below)
- Providing production support and automating jobs at different stages.
Environment: AWS, NiFi, Spark 2.4, Postgres, GitLab, Python, Shell Scripting, Java, Elasticsearch.
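A sketch of the parallelized Postgres reads and writes noted above, assuming a numeric id column with known bounds; table names, bounds, and credentials are hypothetical.

// Sketch: partitionColumn/numPartitions make Spark issue one bounded
// query per partition instead of a single full-table scan.
import org.apache.spark.sql.{SaveMode, SparkSession}

object ParallelPostgres {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("parallel-postgres").getOrCreate()
    val url = "jdbc:postgresql://pg-host:5432/warehouse"

    // Parallel read: 8 concurrent queries over the numeric id range.
    val txns = spark.read.format("jdbc")
      .option("url", url)
      .option("dbtable", "public.transactions")
      .option("user", "etl_user")
      .option("password", sys.env("PG_PASSWORD"))
      .option("partitionColumn", "id")
      .option("lowerBound", "1")
      .option("upperBound", "100000000")
      .option("numPartitions", "8")
      .load()

    // Parallel write: each of the 8 partitions opens its own connection;
    // batchsize controls rows per JDBC batch insert.
    txns.repartition(8).write.mode(SaveMode.Append).format("jdbc")
      .option("url", url)
      .option("dbtable", "public.transactions_copy")
      .option("user", "etl_user")
      .option("password", sys.env("PG_PASSWORD"))
      .option("batchsize", "10000")
      .save()
  }
}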
Confidential, NJ
Big Data Architect / Developer
Responsibilities:
- Design and ingest data from Greenplum and Oracle.
- Estimation of infrastructure and capacity planning of the cluster.
- Installation and maintenance of the cluster along with admin team.
- Granting permissions and ensuring LDAP and Kerberos authentication.
- Performance tuning and running jobs in Hadoop cluster.
- Monitoring the Hadoop cluster for uptime and performing health checks.
- Clearing logs and taking backups for disaster recovery.
- Interacting with various teams to implement best practices and contributing to POCs on different use cases.
- Performing transformations in Scala and Python using the Spark framework (illustrative sketch below).
- Writing ingestion scripts through Sqoop jobs.
- Automation of data flow, deploying and scheduling jobs in Control-M.
- Benchmarking and performance tuning of Impala queries and running concurrency tests.
- Involved in production support and coordinating with admin teams to resolve issues.
- Understanding the domain and existing warehouse and migrating the data into Hadoop.
Environment: Cloudera 9.1, Impala, Spark SQL, Greenplum, Oracle, Sqoop, HBase, Kafka, Control-M, SVN, Bitbucket, Python, Shell Scripting, Scala, and Tableau.
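A sketch of a Spark SQL curation stage of the kind described here: joining Sqoop-landed staging tables in Hive and publishing a Parquet table that Impala can query after a metadata refresh. All schemas are hypothetical.

// Sketch (hypothetical schemas): join staging tables and publish a
// Parquet summary; Impala sees it after INVALIDATE METADATA.
import org.apache.spark.sql.SparkSession

object CurateAccounts {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("curate-accounts").enableHiveSupport().getOrCreate()

    spark.sql(
      """CREATE TABLE curated.account_summary
        |STORED AS PARQUET AS
        |SELECT a.account_id, a.region, SUM(t.amount) AS total_amount
        |FROM staging.accounts a
        |JOIN staging.transactions t ON a.account_id = t.account_id
        |GROUP BY a.account_id, a.region
        |""".stripMargin)

    spark.stop()
  }
}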
Confidential, TX
Big Data Architect
Responsibilities:
- Architecting and designing the pipeline of the data.
- Gathering the requirements from SME.
- Writing ingestion scripts run through scheduled jobs.
- Creating a pipeline for TIBCO streaming data and storing it in HBase (illustrative sketch below).
- Understanding the domain and existing warehouse and migrating the data into Hadoop.
- Assigning tasks to the offshore team and monitoring progress through Agile methodology.
Environment: Hortonworks HDP 2.3, Hive, Spark Streaming, Spark SQL, Teradata, DB2, Shell Scripting, Sqoop, HBase, Control-M
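A sketch of the streaming pipeline into HBase, assuming the TIBCO feed is bridged onto a Kafka topic; the topic, table, and column family names are hypothetical, and keyed messages are assumed.

// Sketch: consume a Kafka topic and write each record into HBase.
import org.apache.hadoop.hbase.client.{ConnectionFactory, Put}
import org.apache.hadoop.hbase.util.Bytes
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.kafka010.{ConsumerStrategies, KafkaUtils, LocationStrategies}
import org.apache.spark.streaming.{Seconds, StreamingContext}

object StreamToHBase {
  def main(args: Array[String]): Unit = {
    val ssc = new StreamingContext(new SparkConf().setAppName("stream-to-hbase"), Seconds(10))
    val kafkaParams = Map[String, Object](
      "bootstrap.servers"  -> "broker:9092",
      "key.deserializer"   -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id"           -> "hbase-writer")

    val stream = KafkaUtils.createDirectStream[String, String](
      ssc,
      LocationStrategies.PreferConsistent,
      ConsumerStrategies.Subscribe[String, String](Seq("events"), kafkaParams))

    stream.foreachRDD { rdd =>
      rdd.foreachPartition { records =>
        // One HBase connection per partition, not per record.
        val conn  = ConnectionFactory.createConnection(HBaseConfiguration.create())
        val table = conn.getTable(TableName.valueOf("events"))
        records.foreach { rec =>
          // Assumes keyed messages; a null key would need a fallback row key.
          val put = new Put(Bytes.toBytes(rec.key))
          put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("payload"), Bytes.toBytes(rec.value))
          table.put(put)
        }
        table.close()
        conn.close()
      }
    }

    ssc.start()
    ssc.awaitTermination()
  }
}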
Confidential, Worcester, MA
Big Data Architect
Responsibilities:
- Determining the cluster size based on data size and incremental data.
- Installing and maintaining HDP 2.3 on a 4-node cluster.
- Involved in preparing the technical design document.
- Architecting and designing the end-to-end data pipeline.
- Responsible for assigning and implementation of development tasks.
- Writing a UDTF in Java to parse ACORD XML and store the result in Hive (illustrative sketch below).
- Implementing workflows and automation through NiFi.
Environment: Hortonworks HDP 2.3, Spark, Java, Hive, NiFi, Sqoop, Oracle, DB2, Power BI, and Control-M.
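The production parser was a Java Hive UDTF; as a compact illustrative analog, this Spark Scala UDF pulls fields out of an XML payload column. The element names are hypothetical, not actual ACORD paths, and the scala-xml module is assumed on the classpath.

// Sketch: extract two fields from an XML column and land them in Hive.
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.udf
import scala.xml.XML

object ParseXmlSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("parse-xml-sketch").enableHiveSupport().getOrCreate()
    import spark.implicits._

    // Returns a (policyNumber, insuredName) struct per XML document.
    val parsePolicy = udf { xmlText: String =>
      val doc = XML.loadString(xmlText)
      ((doc \\ "PolicyNumber").text, (doc \\ "InsuredName").text)
    }

    spark.table("raw.policy_xml")
      .withColumn("parsed", parsePolicy($"xml_payload"))
      .selectExpr("parsed._1 AS policy_number", "parsed._2 AS insured_name")
      .write.mode("overwrite").saveAsTable("curated.policies")
  }
}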
Confidential
Hadoop Engineer
Responsibilities:
- Architecting and designing the pipeline of the data.
- Gathering the requirements from business.
- Interacting with data modelers and creating the refined data scripts.
- Curating and scrubbing data through Pig and loading it to the cloud environment.
- Creating Hive external tables for data scientists to use for analytics (illustrative sketch below).
- Involved in data optimization and performance tuning of Hive queries.
- Supporting the visualization team in creating dashboards in QlikView.
Environment: Hortonworks HDP 2.3, Spark, Java, Hive, NiFi, Sqoop, Oracle, DB2, Power BI, Control-M, Microsoft Azure HDInsight on Windows and Linux, Python, BLOB Storage, ETL (Informatica), QlikView, PowerShell, and Pig.
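A sketch of publishing curated files to data scientists through a Hive external table; the path and schema are hypothetical. The external table is only metadata over files the pipeline already writes, so dropping it leaves the data intact.

// Sketch (hypothetical path and schema): expose curated Parquet files
// as a queryable external table.
import org.apache.spark.sql.SparkSession

object ExposeExternalTable {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("external-table").enableHiveSupport().getOrCreate()

    spark.sql(
      """CREATE EXTERNAL TABLE IF NOT EXISTS analytics.sessions (
        |  session_id STRING,
        |  user_id    STRING,
        |  duration_s BIGINT
        |)
        |STORED AS PARQUET
        |LOCATION '/data/curated/sessions'
        |""".stripMargin)
  }
}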
Confidential, New York
Developer / Architect
Responsibilities:
- Architecting and designing the project from start.
- Interacting with data scientists and gathering requirements.
- Installing a 4-node HDP cluster and creating user profiles.
- Developing and integrating Scala applications on Spark (RDDs) using the Hadoop cluster.
- Coding and implementing Hive tables using monthly partitions (illustrative sketch below).
- Importing data from Oracle using Sqoop and other files through FTP.
- Developing and scripting in Python and Scala.
- Developing and running scripts on Linux production environment.
- Reporting and visualization using BusinessObjects and Tableau.
Environment: Cloudera 5.3, RHEL, R, Sqoop, SparkR, Oozie, Hive, Scala, Python, BusinessObjects.
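A sketch of the monthly-partitioned Hive layout mentioned above (table and column names hypothetical): derive a month key, then have Spark write one Hive partition per month so queries can prune to the months they touch.

// Sketch: derive a yyyy-MM key and partition the Hive table by it.
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, date_format}

object MonthlyPartitions {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("monthly-partitions").enableHiveSupport().getOrCreate()

    spark.table("staging.trades")
      .withColumn("trade_month", date_format(col("trade_date"), "yyyy-MM"))
      .write.mode("overwrite")
      .partitionBy("trade_month")
      .saveAsTable("curated.trades")
  }
}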
Confidential
Developer / Architect
Responsibilities:
- Analyze Barclays enterprise architecture
- Solution proposal for migrating from Teradata to Hadoop platform
- Design the actual solution
- Analyze the tools and technologies required, and the support provided by the latest versions of existing software for Hadoop HDFS.
- Propose technology upgrades
- Hadoop cluster sizing
- Develop a solution for end-to-end data flow from Oracle to HDFS, access HDFS data from Informatica, and push/pull data between Teradata and HDFS (illustrative sketch below)
Environment: Java 1.7, Cloudera 4, Sqoop 1.4.3, Hive 0.96, RHEL, Eclipse Helios, Informatica 9.5, Teradata, Oozie, Zookeeper, Oracle.
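The actual transfers here ran through Sqoop and Informatica; for illustration, a Spark Scala equivalent of the Oracle-to-HDFS leg, with hypothetical connection details and paths.

// Sketch: pull an Oracle table over JDBC and land it as Parquet on HDFS.
import org.apache.spark.sql.SparkSession

object OracleToHdfs {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("oracle-to-hdfs").getOrCreate()

    spark.read.format("jdbc")
      .option("url", "jdbc:oracle:thin:@db-host:1521/ORCL")
      .option("dbtable", "SALES.CUSTOMERS")
      .option("user", "etl_user")
      .option("password", sys.env("ORA_PASSWORD"))
      .load()
      .write.mode("overwrite")
      .parquet("hdfs:///data/raw/customers")
  }
}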
Confidential
Developer / Architect
Responsibilities:
- Responsible for performing in-depth analysis and conceptualization of Retail Banking Customer 360-degree view.
- Responsible for creating use cases/functional requirements.
- Responsible for designing the user interface.
- Responsible for creating entity model design and user interface creation using AppBuilder.
- Responsible for the overall solution delivery developed using InfoSphere Data Explorer.
- Created the estimation and work breakdown structure for the solution.
- Planned the development and helped the team resolve technical issues.
- Worked with Data Explorer product development team to resolve technical issues and identify solutions.
- Published solution offering document for this solution.
- Responsible for gathering business requirements from the banking SME for processing credit card historical data.
- Responsible for loading credit card historical data into Hadoop.
- Responsible for designing and developing Map/Reduce programs that process the credit card historical data for analytics and generate output, which is then indexed using the BigIndex API for visualization.
- Involved in setting up a 50-node Hadoop cluster for executing the solution
Environment: InfoSphere Data Explorer 8.2.3, Hortonworks, DB2 9.7, Linux, Hive, HBase, Zookeeper.
Confidential
Project Lead
Responsibilities:
- Developed classes for an end-to-end framework for interacting with Hadoop.
- Worked on Hive queries to insert and update data through joins and functions, including UDFs (illustrative sketch below).
- Participated extensively in the design and development of the project.
Environment: Apache Hadoop, Hive, Java.
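An illustrative analog of the Hive insert-via-join-plus-UDF pattern described above, expressed through Spark SQL in Scala; the table, column, and function names are hypothetical.

// Sketch: register a UDF, then rebuild the target from a full outer join
// that prefers the incoming record, the usual Hive-era "update" pattern.
import org.apache.spark.sql.SparkSession

object UpsertSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("upsert-sketch").enableHiveSupport().getOrCreate()

    // Analogous to shipping a Hive UDF jar and running CREATE FUNCTION.
    spark.udf.register("normalize_phone",
      (p: String) => if (p == null) null else p.replaceAll("[^0-9]", ""))

    spark.sql(
      """CREATE TABLE warehouse.customers_merged
        |STORED AS PARQUET AS
        |SELECT COALESCE(i.customer_id, c.customer_id)      AS customer_id,
        |       COALESCE(i.name, c.name)                    AS name,
        |       normalize_phone(COALESCE(i.phone, c.phone)) AS phone
        |FROM warehouse.customers c
        |FULL OUTER JOIN staging.customer_updates i
        |  ON c.customer_id = i.customer_id
        |""".stripMargin)
  }
}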
Confidential
Developer / Project Lead
Responsibilities:
- De-duplicated incoming data using MapReduce written in Java (illustrative sketch below)
- Wrote transformation queries in Hive
- Wrote UDFs in Hive
Environment: Hadoop, Hive, Amazon Web Services, Java.
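The de-duplication ran as Java MapReduce; this is a compact Spark Scala analog of the same key-and-reduce pattern, with a hypothetical record layout: key each record by its business id and keep the most recent copy.

// Sketch: the same shuffle the MapReduce job performed, as a Dataset op.
import org.apache.spark.sql.SparkSession

object DedupSketch {
  case class Event(eventId: String, ts: Long, payload: String)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("dedup-sketch").getOrCreate()
    import spark.implicits._

    val events = spark.read.json("hdfs:///data/incoming/events").as[Event]

    val deduped = events
      .groupByKey(_.eventId)                               // map: key by id
      .reduceGroups((a, b) => if (a.ts >= b.ts) a else b)  // reduce: keep latest
      .map(_._2)

    deduped.write.mode("overwrite").parquet("hdfs:///data/deduped/events")
  }
}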
Confidential, Dallas, TX
Project Lead
Responsibilities:
- Managing and leading the project team
- Performing detailed project planning and control
- Managing project deliverables in line with the project plan
- Coaching, mentoring, and leading personnel within a technical team environment
- Recording and managing project issues and escalating where necessary
- Monitoring project progress and performance
- Providing status reports to the project sponsor
- Managing project training within the defined budget