Big Data Analytics Resume

Virginia

SUMMARY:

  • 7+ years of professional IT experience, including 4 years in Big Data/Hadoop and Big Data analytics.
  • Experience working with BI teams to translate big data requirements into Hadoop-centric solutions.
  • Loaded and transformed large sets of structured, semi-structured and unstructured data using Hadoop ecosystem components.
  • Hands-on experience with Hadoop technologies such as HDFS, Hive, Pig, Sqoop, Impala, Flume and Spark.
  • Hands-on experience writing Hive and Pig scripts that run as MapReduce jobs.
  • Experience importing and exporting data between different systems and the Hadoop file system using Sqoop.
  • Experience creating databases, tables and views using HiveQL, Impala and Pig Latin.
  • Experience with file formats such as Avro, ORC, SequenceFile and CSV.
  • Excellent understanding of NoSQL databases such as HBase.
  • Implemented proofs of concept on the Hadoop stack and various big data analytics tools, including migration from relational databases (Oracle, MySQL) to Hadoop.
  • Highly experienced Database Administrator: design, data modeling, installation, configuration, administration, performance monitoring, troubleshooting and fine-tuning of RDBMS (DB2 LUW, SQL Server, Oracle, ParAccel/Matrix), NoSQL (Cassandra, SciDB, Redis and MongoDB) and graph (Neo4j) databases.
  • Experience in installation and maintenance of the entire Hadoop ecosystem and its components (HDFS, Hive). Received an Above and Beyond award for the Hadoop implementation.
  • Experience transferring data between Hadoop and other data sources (DB2, Oracle, etc.) using Sqoop.
  • Excellent programming experience in Java, Scala, Python, R, C, C++, Perl, Ksh, JavaScript, Pig, Hive, Impala, CSS, NLP.
  • Excellent visualization skills using D3.js, Python and R.
  • Enterprise-level Data Architect: analysis and data integration roadmaps, real-time and batch ETL processing, interpreting business needs into RDBMS and NoSQL conceptual

TECHNICAL SKILLS:

Big Data: Hadoop, HDFS, MapReduce, HBase, Pig, Hive, Sqoop, Flume, Oozie, ZooKeeper, Spark, Storm, Impala and Kafka.

Programming: R, Python, SQL, Twitter and LinkedIn APIs, web scraping.

Databases: Oracle 10g, IBM DB2, MySQL, SQL Server, SAP RMS.

IDEs/Tools: Tableau, MS Excel Risk Solver, Anaconda, PyCharm, IPython Notebook, Amazon Web Services.

Operating Systems: Linux (Ubuntu, CentOS, Red Hat Linux), Windows XP/7/8/10, OS X 10.11

Version Control: GitHub, SVN.

ANALYTICAL SKILLS: Machine Learning, Data Mining, Sentiment Analysis, Predictive Analytics, Statistical Data Analysis, Optimization, Decision Trees, Sensitivity Analysis, Data Modelling, Data Wrangling, Data Visualization, Cluster Analysis.

PROFESSIONAL EXPERIENCE:

Confidential, Virginia

Big Data Analytics

Responsibilities:

  • Received an Above and Beyond award for successfully implementing Big Data/Hadoop.
  • Documented the implementation process for Hadoop installation including authentication using Kerberos, Ranger authorizations at policy level, monitoring setup, backups etc.
  • Actively involved in data ingestion from DB2 to Hadoop.
  • Actively involved in the Hadoop upgrade project.
  • Played a key role in the ParAccel/Matrix implementation, including troubleshooting, leader node HA, report automation, backup automation, boot-from-SAN conversion, etc.
  • Applied machine learning using Spark ML.
  • Used Oozie to define and schedule jobs.
  • Implemented Spark SQL jobs to read and analyze data from Hive and write results to HDFS/Hive.
  • Experience with storage and processing through Hue across Hadoop ecosystem components.
  • Basic experience with Tableau reporting tools.
  • Involved in all stages of Software Development Life Cycle.

Environment: Hadoop, HDFS, MapReduce, HBase, Pig, Hive, Sqoop, Flume, Oozie, ZooKeeper, Spark, Storm, Impala, Kafka, Python and SQL.

Confidential, Columbus, GA

Hadoop Data Analyst/Developer

Responsibilities:

  • Involved in end-to-end data processing, including ingestion, processing, quality checks and splitting.
  • Brought data into the Big Data Lake using Pig, Sqoop and Hive.
  • Wrote a MapReduce job for Change Data Capture on HBase.
  • Created Hive ORC and external tables.
  • Refined terabytes of data from different sources and created Hive tables.
  • Developed MapReduce jobs for data cleaning and preprocessing.
  • Imported and exported data between Oracle/Teradata databases and HDFS/Hive using Sqoop.
  • Responsible for managing data coming from different sources.
  • Monitored running MapReduce jobs on the cluster using Oozie.
  • Responsible for loading data from UNIX file systems into HDFS.
  • Installed and configured Hive and wrote Hive UDFs.
  • Wrote Pig scripts to process unstructured data and create structured data for use with Hive.
  • Wrote the Oozie workflow to coordinate the Hadoop jobs.

Environment: Sqoop, Pig, Hive, MapReduce, Java, Oozie, Eclipse, Linux, Oracle, Teradata.

Confidential, NC

Hadoop Data Analyst

Responsibilities:

  • Implemented solutions for ingesting data from various sources and processing it with Big Data technologies such as Hive, Pig, Sqoop, HBase and MapReduce.
  • Designed and developed a daily process for incremental import of raw data from Oracle into Hive tables using Sqoop.
  • Queried data from HBase for lookups, grouping and sorting.
  • Extensively used HiveQL to query data in Hive tables and to load data into Hive tables.
  • Worked extensively with partitioned and bucketed tables in Hive, designed both managed and external tables, and optimized Hive queries.
  • Assisted the analytics team in writing Pig scripts to perform further detailed analysis of the data.
  • Explored Spark to improve the performance of existing Hadoop algorithms using SparkContext, Spark SQL, DataFrames, etc.

Environment: Cloudera CDH 5, Hadoop, MapReduce, HDFS, Hive, Pig, Sqoop, Spark, HBase, SparkContext, Spark SQL, DataFrames.

Confidential

Software Systems Analyst

Responsibilities:

  • Conducted strategic IT chain management assessments based on statistical analysis, thereby improving the workflow and efficiency of the client's finance applications.
  • Automated the client's batch recovery process, improving recovery time by 30%.
  • Proposed and implemented several robust workaround techniques, which resulted in a 15% overall decline in customer incidents.
  • Accurately recovered 120K customer records within 2 days during an application malfunction.
  • Coordinated with onshore and offshore teams and organized weekly team meetings.
  • Performed several root cause analyses of recurring enterprise application issues.

Environment: Linux (Ubuntu, CentOS, Red Hat Linux), Windows XP/7/8/10, OS X 10.11
