
Big Data Analyst/Developer Resume

Washington, DC


  • 3+ years of experience in Information Technology, including proven hands-on experience in Big Data and Analytics.
  • 3+ years of comprehensive experience as a Hadoop Developer with a focus on development.
  • 3+ years of comprehensive experience in Amazon Web Services.
  • Expertise in business process modeling using Use Cases, Workflow, Sequence, Structured, Activity, Dataflow, and Process flow Diagrams.
  • In-depth understanding and usage of the Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and MapReduce concepts.
  • Experience in analyzing data using Apache Spark and Hive.
  • Knowledge of NoSQL databases such as AWS DynamoDB.
  • Experience designing real-time data streaming systems for both synchronization and analysis using frameworks such as Spark Streaming and Logstash.
  • Ability to blend technical expertise with strong Conceptual, Business and Analytical skills to provide quality solutions and result-oriented problem solving.
  • Handled several techno-functional responsibilities including estimate generation, identification of functional and technical gaps, requirements gathering, solution design, development, product documentation, and provision of production support activities.
  • A passionate and motivated professional with excellent interpersonal and communication skills, strong business acumen, creative problem-solving skills, technical competency, team-player spirit, and leadership skills.


Operating Systems: Windows, Linux

Frameworks: Cloudera, Hortonworks, Amazon Web Services

Big Data Technologies: Apache Spark, HDFS, MapReduce, Hive, Tez, Sqoop, Oozie

Programming Languages: Core Java, Scala, SQL

Search Engines: Elasticsearch, Logstash

Reporting Tools: Tableau, Kibana


Confidential, Washington, DC

Big Data Analyst/Developer


  • Developed an electronic data quality validation system for the new Workforce Innovation and Opportunity Act (WIOA) Cloud Platform Services (CPS) - Workforce Integrated Performance System (WIPS).
  • Extensively worked with AWS Elastic MapReduce (EMR) to develop and execute streaming and non-streaming Spark applications which can validate and aggregate data.
  • Designed and implemented the data modernization project pipeline using AWS.
  • Headed the migration effort from legacy systems to Hadoop services at the Employment and Training Administration Department.
  • Facilitated requirement gathering sessions with the Business Users for the modernization and integration of several grant performance reporting systems.
  • Documented Business and Functional requirements specifications for the design of a Cloud Provider Service-Performance Review System (CPS-PRS).
  • Provided extensive technical assistance for developing end-user training materials and testing online performance of the Workforce Integrated Performance System (WIPS).
  • Indexed production and development security log data using Logstash into Elasticsearch.
  • Prepared reports using Kibana for the Security team to give insight into unauthorized user access to the production and development cluster.
  • Prepared dashboard reports for the stakeholders using Tableau to monitor the performance of Workforce Investment Act (WIA) programs.

Confidential, Tempe, AZ

Big Data Analyst/Developer


  • Developed statistical models and algorithms using Apache Spark for processing large volumes of genomic data.
  • Acted as the primary contact between the Bioinformatics Research Group, which included members of the National Cancer Institute, and the development teams.
  • Conducted rapid software prototyping to demonstrate and evaluate technologies in relevant environments.
  • Actively participated on teams of software developers, researchers, designers, and technical leads to understand challenges, needs, and possible solutions.
  • Decomposed and translated business requirements into functional and non-functional requirements, and created a System Requirements Specification document for the Expression Quantitative Trait Loci (eQTL) pipeline.
  • Generated test plans and test cases from functional requirements, and created a System Quality Assurance plan (incorporating test plans and test cases).
  • Handled various genomic data formats (BAM and SAM) received from the National Cancer Institute (NCI).
  • Prepared custom reports for the biomedical community with specific interests in tumor simulations.

Confidential, Charlotte, NC

Hadoop Developer


  • Managed data from multiple sources such as the Centers for Medicare and Medicaid Services (CMS).
  • Developed MapReduce jobs in Java for data cleaning and pre-processing.
  • Extracted data from Oracle, PostgreSQL, and Netezza through Sqoop and loaded it into HDFS for processing.
  • Created Oozie workflows and coordinator jobs for recurrent triggering of Hadoop jobs such as Java MapReduce, Pig, Hive, and Sqoop, as well as system-specific jobs (such as Java programs and shell scripts), by time (frequency) and data availability.
