
Data Architect/Lead Resume

Neptune, NJ


  • Strong track record of success creating big data solutions for key business initiatives in alignment with analytics architecture and future state vision.
  • Seasoned information technology professional skilled in business analysis, business intelligence, data modeling, data architecture, and data warehousing.
  • Proven ability to deliver on organization mission, vision, and priorities.
  • Extensive hands-on experience leading multiple data architecture projects, gathering business requirements, analyzing source systems, and designing data strategies for dimensional, analytical, transactional, and operational data systems.


  • Systems/Software Engineering
  • Requirements Analysis
  • Database Development
  • Blockchain Programming
  • Shell Scripting
  • Agile Methodologies
  • Machine Learning/Deep Learning
  • Business Development
  • Leadership/Team Training and Support
  • Project/Vendor Management


Big Data Technologies: HDFS, MapReduce, YARN, Hive, Pig, Sqoop, Impala, Oozie, Flume, ZooKeeper, Kafka, NiFi, HBase, MongoDB, StreamSets, Talend, Splunk, Kibana, Logstash, Elasticsearch, Kudu

Spark Components: Spark, Spark SQL, Spark Streaming, and Spark MLlib

Cloud Services: AWS, S3, EBS, EC2, VPC, Redshift, EMR, Azure, CloudFront, Glue, Athena

Artificial Intelligence: Machine Learning, Deep Learning, TensorFlow, scikit-learn, SageMaker, Keras, PyTorch

Blockchain: Ethereum, Cardano, R3, Hyperledger, Smart Contracts

Programming Languages: Java, Python, Scala, R, Solidity

Scripting/Query Languages: Unix Shell scripting, SQL and PL/SQL

Databases: Oracle, MySQL, SQL Server, Netezza, Teradata

Web Technologies: JSP, Servlets, JavaBeans, JDBC, XML, CSS, HTML, JavaScript, AJAX, SPSS

Other: Maven, Eclipse, PyCharm, RStudio, Jupyter, Zeppelin, Tableau, GitHub, Jenkins, Bitbucket, Bamboo, Jira, TFS, VSTS, Docker, AutoSys, Control-M


Confidential, Neptune, NJ

Data Architect/Lead


  • Performed complex upserts with Kudu and MongoDB on large data volumes derived from various sources; processed large-scale electronic medical and financial records, with daily and monthly data sets stored in Amazon S3, Redshift, HDFS, and Blob Storage.
  • Developed predictive models using machine learning algorithms such as linear regression, logistic regression, and decision trees.
  • Designed and developed extract, transform, load (ETL) applications using big data technologies and automated them using Oozie, Control-M, AutoSys, and shell scripts.
  • Used Jenkins and Bamboo for continuous integration, building project code before deployment.
  • Set up and ran data ingestion using StreamSets and NiFi for various data formats and sources.
  • Mentored and supervised on-site employees and outsourced/off-site personnel; wrote and updated guidelines and protocols for teams to complete objectives.
  • Independently built Lambda and Kappa architecture solutions for on-premises, hybrid, and cloud environments; also designed an API with Docker that connects to MongoDB as a source for both on-premises and cloud deployments.
  • Transformed unstructured data into structured data with Apache Spark, using DataFrames and querying data sources including S3, Redshift, Hive, Impala, Kudu, and MongoDB.
  • Built an Ethereum blockchain and deployed smart contracts on a private network; also built an application on Hyperledger Fabric using Hyperledger Composer in Bluemix.
  • Conceptualized and created models using machine-learning regression techniques.

Confidential, Charlotte, NC

Hadoop Developer


  • Led capacity planning of Hadoop clusters based on application requirement.
  • Managed several Hadoop clusters and other Hadoop ecosystem services in development and production environments.
  • Contributed to evolving architecture of company services to meet changing requirements for scaling, reliability, performance, manageability, and pricing.
  • Developed, designed, and automated ETL applications utilizing Oozie workflows and shell scripts.
  • Created Sentry policy files for business users in development, user acceptance testing, and production environments to provide access to required databases and tables in Impala; also designed and incorporated security processes, policies, and guidelines for cluster access.
  • Converted copybook files from EBCDIC to ASCII and binary formats; stored the files in HDFS; created Hive tables to support mainframe decommissioning, making Hadoop the primary source for exports to mainframes.

Confidential, Springfield, IL

Software Engineer


  • Pulled data from relational database management systems (RDBMS) such as Teradata, Netezza, Oracle, and MySQL using Sqoop; stored the data in the Hadoop Distributed File System (HDFS).
  • Developed and deployed an internal shell-script tool that compares RDBMS and Hadoop data to verify that source and target match.
  • Created external Hive tables to store and run queries on loaded data.
  • Architected, implemented, and tested data analytics pipelines with Hortonworks/Cloudera.
  • Implemented partitioning and bucketing techniques for external tables in Hive, improving space and performance efficiency.
