
Big Data Developer Resume


SUMMARY

  • 7+ years of total IT experience working on and managing all phases of the software development lifecycle, including Agile delivery.
  • 2 years of experience in the ingestion, storage, querying, processing, and analysis of big data using Hadoop technologies and Python.
  • Master’s degree in Information Systems from the University of Texas at Arlington.
  • Knowledge of Hadoop architecture and ecosystem components such as HDFS, Job Tracker, Task Tracker, Name Node, Data Node, and the YARN / MapReduce programming paradigm, used to analyze large data sets efficiently.
  • Understanding of streaming platforms such as Flume and Kafka, and of cloud services such as AWS.
  • Hands-on experience working with ecosystem components such as Spark, Hive, HBase, Sqoop, and Pig.
  • Hands-on experience with related/complementary open-source software platforms and languages (e.g., Core Java, Linux, Unix, Python, R).
  • Experience analyzing data using HiveQL and Spark SQL; involved in creating Hive tables, loading data, and writing Hive queries.
  • Experience working with different file formats such as text files, SequenceFiles, Avro, ORC, Parquet, and JSON.
  • Experience designing dashboards using Tableau; understanding of QlikView.
  • Experience with Waterfall and Agile project management methodologies; flexible and versatile in adapting to new environments and projects.
  • Worked in environments requiring direct customer interaction within an onsite-offshore model during requirement specification, development, and project implementation phases.
  • Self-starter and team player, experienced in fast-paced development environments and committed to deliverables; able to work both independently and within a team.
  • Excellent interpersonal and analytical skills; liaise between business and technical personnel to ensure mutual understanding of processes and applications. Possess strong communication, presentation, problem-solving, and analytical skills.

TECHNICAL SKILLS

Big Data Ecosystem: HDFS, HBase, MapReduce, Hive, Sqoop, Flume, Kafka, Apache Spark, Pig, Impala

Programming/Scripting Languages: R, Python, Scala, Core Java, HTML, XML, CSS, JavaScript

Data Analytics Tools: SAS, Weka, SAP Business Objects, Tableau, RStudio, Anaconda Navigator 2.7, PyCharm, Flask

Testing and CI Tools: Selenium Webdriver, Jenkins.

Test Management: JIRA, Application Lifecycle Management, HP Quality Center, MS Excel.

Other: MS SQL/PL SQL, MySQL, Oracle 9i, SQL 2005, Eclipse, MS Excel VBA, GitHub, Git, knowledge of ETL/BI tools, OpenRefine, Rally, Visual Studio

PROFESSIONAL EXPERIENCE

Big Data Developer

Confidential

Responsibilities:

  • Worked on Apache Pig scripts to transform, sort, group, and process HDFS data for enrichment.
  • Involved in creating Hive tables and loading and analyzing data using Hive queries.
  • Created aggregates and analyzed large data sets by running Hive queries, Spark/Spark SQL jobs, and Pig scripts.
  • Developed Scala Maven projects; used both DataFrames/Spark SQL and RDDs in Spark 2.1 for data aggregation, queries, and writing data back into file formats such as Parquet and Avro (see the sketch after this list).
  • Developed Spark code using Scala and Spark SQL for faster testing and processing of data.
  • Imported data from different sources such as HDFS and HBase into Spark RDDs.
  • Performed transformations, cleaning, and filtering on imported data using Hive and Spark, and loaded the final data into HDFS; used Spark to handle multiple joins within and across subject-area tables.
  • Transferred data to relational databases using Sqoop for downstream applications.
  • Implemented Oozie jobs for daily imports.
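
A minimal Scala sketch of the kind of Spark 2.1 DataFrame aggregation described above. The input path, output path, and column names (customer_id, amount) are hypothetical placeholders rather than the actual project's schema, and the job assumes the standard spark-sql and Hive-support dependencies are on the classpath.

  import org.apache.spark.sql.SparkSession
  import org.apache.spark.sql.functions.{count, sum}

  // Sketch only: reads raw Parquet from HDFS, aggregates per customer,
  // and writes the result back to HDFS as Parquet for downstream jobs.
  object DailyAggregation {
    def main(args: Array[String]): Unit = {
      val spark = SparkSession.builder()
        .appName("daily-aggregation")
        .enableHiveSupport()                 // allows reading/writing Hive tables as well
        .getOrCreate()

      val transactions = spark.read.parquet("hdfs:///data/raw/transactions")

      // DataFrame API: filter out bad records, then aggregate per customer
      val aggregates = transactions
        .filter("amount IS NOT NULL")
        .groupBy("customer_id")
        .agg(sum("amount").as("total_amount"), count("customer_id").as("txn_count"))

      // The same data is also reachable as an RDD when lower-level transformations are needed
      val recordCount = transactions.rdd.count()
      println(s"processed $recordCount raw records")

      aggregates.write.mode("overwrite").parquet("hdfs:///data/curated/customer_aggregates")
      spark.stop()
    }
  }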

Big Data Analyst/Developer

Confidential

Responsibilities:

  • Involved in solution architecture, estimation, and setup of the big data environment.
  • Acquired data from the data lake and worked on Apache Pig scripts to transform, sort, group, and process HDFS data.
  • Created Pig streaming UDFs using Python.
  • Created and maintained the Hive warehouse for Hive analysis.
  • Created aggregates and analyzed large data sets by running Hive queries and Pig scripts.
  • Involved in creating Hive tables, loading data, and writing Hive queries.
  • Created partitions in Hive for both managed and external tables to optimize performance for time-series queries (a sketch follows this list).
  • Worked on importing data from various sources and performed transformations using Hive to load data into HDFS.
  • Scheduled ingestion jobs for data extraction, data validation, and ingestion into the Hadoop cluster.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data using Hadoop/big data concepts.
  • Gained experience with the Cloudera distribution of Hadoop.
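
A minimal sketch of the time-series partitioning pattern mentioned above, issued as Hive statements through Spark's Hive support so the examples stay in Scala. The table, column names (events, event_id, payload, event_date), and HDFS locations are hypothetical.

  import org.apache.spark.sql.SparkSession

  // Sketch only: creates a date-partitioned external Hive table and loads it
  // from a staging table using dynamic partitioning.
  object PartitionedHiveTable {
    def main(args: Array[String]): Unit = {
      val spark = SparkSession.builder()
        .appName("partitioned-hive-table")
        .enableHiveSupport()
        .getOrCreate()

      // External table partitioned by event_date, so time-range queries prune partitions
      spark.sql(
        """CREATE EXTERNAL TABLE IF NOT EXISTS events (
          |  event_id STRING,
          |  payload  STRING
          |)
          |PARTITIONED BY (event_date STRING)
          |STORED AS PARQUET
          |LOCATION 'hdfs:///data/warehouse/events'""".stripMargin)

      // Dynamic-partition insert from a staging table
      spark.sql("SET hive.exec.dynamic.partition=true")
      spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")
      spark.sql(
        """INSERT OVERWRITE TABLE events PARTITION (event_date)
          |SELECT event_id, payload, event_date FROM events_staging""".stripMargin)

      spark.stop()
    }
  }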

Data Analyst

Confidential

Responsibilities:

  • Identified, analyzed, and interpreted trends and patterns in complex data sets in MS Excel using pivot tables.
  • Worked closely with management to prioritize business and information needs.
  • Derived streaming-data insights and stories that drove business decisions and usability improvements for the streaming app across all aspects of the streamer experience, including search, content, clicks, and patterns of behavior.
  • Monitored, collated, and synthesized information to produce reports and program-related documents.
  • Leveraged BI tools such as Tableau for data visualization and storytelling.
  • Profiled raw data sets across platforms and developed KPIs/dashboards to measure product performance.
  • Developed and implemented data collection systems and other strategies that optimized statistical efficiency and data quality.
  • Made use of aggregated tables over the most frequently queried data for monthly and yearly reports.

Business/Data Analyst

Confidential

Responsibilities:

  • Identified gaps between the current applications and evolving future requirements. Interacted with business managers and developers in requirements analysis, design reviews, testing, and documentation for application development.
  • Participated in JAD sessions to discuss various reporting needs and to develop an architectural solution ensuring the application met the business requirements.
  • Identified, analyzed, and interpreted trends and patterns in complex data sets in MS Excel.
  • Managed, updated, and manipulated report layouts and structures using pivot tables and VLOOKUPs.
  • Implemented advanced and complex DDL and DML queries involving joins, subqueries, ALTER TABLE statements, etc. (a brief example follows this list).
  • Led cross-functional efforts to address business process or systems issues.
  • SIT cases - Prepared the test plan and system integration test cases and set them up in JIRA for execution and tracking; improved testing skills by learning automation tools such as Selenium IDE and QTP.
  • Release planning - Efficiently managed and documented the release calendar for all types of releases.
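
A brief, hypothetical illustration of the kind of DDL and DML queries mentioned above, written in Scala over plain JDBC to stay consistent with the other sketches. The connection URL, tables (customers, orders), and columns are invented for illustration and would need a matching JDBC driver on the classpath.

  import java.sql.DriverManager

  // Sketch only: one DDL statement and one join-with-subquery report query.
  object ReportQueries {
    def main(args: Array[String]): Unit = {
      val conn = DriverManager.getConnection("jdbc:mysql://localhost:3306/reporting", "user", "password")
      try {
        val stmt = conn.createStatement()

        // DDL: add a column used by a monthly report
        stmt.execute("ALTER TABLE orders ADD COLUMN report_month VARCHAR(7)")

        // DML: join plus subquery listing customers whose orders exceed the overall average
        val rs = stmt.executeQuery(
          """SELECT c.customer_name, SUM(o.amount) AS total_amount
            |FROM customers c
            |JOIN orders o ON o.customer_id = c.customer_id
            |WHERE o.amount > (SELECT AVG(amount) FROM orders)
            |GROUP BY c.customer_name""".stripMargin)

        while (rs.next())
          println(s"${rs.getString("customer_name")}\t${rs.getBigDecimal("total_amount")}")
      } finally conn.close()
    }
  }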

Java Developer

Confidential

Responsibilities:

  • Served an integral role in supporting the Confidential application within a 5-member team.
  • Assisted in the support and maintenance of the web-based Confidential application, which maintained an information repository for different products, following the SDLC.
  • Unit tested and documented website applications and code.
  • Upheld program quality and delivery standards in developing software solutions.
