
Big Data Developer/analyst Resume


Pittsburgh, PA

SUMMARY

  • Over 5 years of solid experience with Big Data/Hadoop development and Java/J2EE technologies
  • Experience across industries including banking, computer technology, and consumer packaged goods
  • Hands on experience in Big Data Ecosystem with Hadoop, HDFS, MapReduce, Hive, HBase, Sqoop, Flume, Kafka, Spark, Pig, Oozie, Solr and ZooKeeper
  • Experience in writing MapReduce programs to manipulate and analyze unstructured data
  • Expertise in collecting data through RESTful API and scraping web data using Scrapy
  • Proficient in writing and optimizing HiveQL and SQL queries for data manipulation
  • Experienced in writing custom UDFs in Hive and Hive tuning techniques
  • Working knowledge with Spark 1.4 with Scala and Python
  • Hands on experience in data ingestion between HDFS/NoSQL and RDBMS using Sqoop
  • Experience in data collection, processing and streaming with Flume, Kafka
  • Knowledge on serialization formats like Sequence File, Avro, Parquet
  • Involved in batch job scheduling workflow using Oozie
  • Working experience with RDBMS including MySQL 5.x, PostgreSQL 9.x
  • Working experience with NoSQL databases including HBase 0.98, MongoDB 2.4
  • Front-end development with HTML4/5, CSS3, JavaScript, AngularJS
  • Experience on cluster security and authentication with Kerberos
  • Extensive experience in web frameworks including J2EE, Spring MVC, Python Django
  • Experienced in descriptive and predictive analysis with Microsoft Excel, Impala, Hive
  • Experienced in Machine Learning, Natural Language Processing and Statistical Analysis with Python, Scala and Java
  • Adept in Data Visualization with D3.js, Tableau, matplotlib
  • Experience in Agile, Waterfall, TDD (Test-Driven Development) methodologies
  • Experience with graphic design and website tools such as Adobe Muse, Adobe Photoshop, Adobe Illustrator
  • Hands on experience in unit testing with JUnit and MRUnit, and simulation testing with Selenium and Capybara
  • Self-driven goal-getter with excellent communication skills in collaborative teams and the motivation to take on independent responsibility
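The MapReduce experience above can be illustrated with a minimal Hadoop Streaming-style word count. This is a generic sketch in pure Python, not code from any project listed here; a real Streaming job would read stdin in the mapper and rely on Hadoop's shuffle for the sort.

```python
from itertools import groupby
from operator import itemgetter

def mapper(lines):
    """Emit (word, 1) pairs, as a Hadoop Streaming mapper writes to stdout."""
    for line in lines:
        for word in line.strip().lower().split():
            yield word, 1

def reducer(pairs):
    """Sum counts per word; sorting stands in for Hadoop's shuffle phase."""
    for word, group in groupby(sorted(pairs), key=itemgetter(0)):
        yield word, sum(count for _, count in group)

if __name__ == "__main__":
    lines = ["big data big analytics", "data pipelines"]
    print(dict(reducer(mapper(lines))))
```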

TECHNICAL SKILLS

Hadoop Ecosystem: Hadoop 2.*, Spark 1.4+, MapReduce 2.0, Pig, Hive 0.13+, Sqoop 1.4.5, Flume 1.4+, Kafka 0.8.2+, Solr, HBase, ZooKeeper, Kerberos, MRUnit

Web Framework: JavaScript, AngularJS, HTML, CSS, J2EE, Spring MVC, Django, Rails 3.2, Struts 2

Programming Language: Java, Python, JavaScript, Scala, Ruby

Data Analysis & Viz: Python, Matlab, Tableau, matplotlib, D3.js

Cloud Platform: Amazon Web Services, Heroku, Sina App Engine

Scripting Language: UNIX Shell, HTML, XML, CSS, JSP, SQL, Matlab

Operating Systems: Mac OS, Ubuntu, CentOS, Windows

Environment: Agile, Scrum, Waterfall

Database: MySQL 5.x, PostgreSQL 9.x, MongoDB 2.4+, HBase 0.98

Machine Learning: Regression, Neural Network, K-Means, HMM, SVM, NLP

IDE/Application: Sublime Text, Eclipse, PyCharm, RubyMine, Notepad++

Collaboration: Git, JIRA, SVN

PROFESSIONAL EXPERIENCE

Confidential, Pittsburgh, PA

Big Data Developer/Analyst

Responsibilities:

  • Worked on MapR Converged Data Platform with Agile development methodology
  • Developed and optimized scripts using the Kafka consumer API to feed near-real-time data into a Spark Streaming application
  • Developed and optimized Scala programs/ HiveQL to perform data transformation, cleaning and filtering
  • Used Flume to integrate, transform and store data from different sources
  • Used Sqoop to ingest data from various data sources to MapR-DB NoSQL database/ HBase
  • Created multiple Hive tables with partitioning and bucketing for efficient data access
  • Worked with analytics team to build statistical model with MLlib and PySpark
  • Worked with application development team to develop a RESTful web service application using Java
  • Performed unit testing and integration testing using JUnit and MRUnit
  • Wrote SQL queries for aggregating data, querying and managing MySQL database
  • Used Git for version control, JIRA for project tracking, Confluence for documentation collaboration
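A Kafka-to-Spark-Streaming pipeline like the one described above is typically built on the direct-stream consumer API. The sketch below simulates the consume/transform/filter micro-batch loop with plain Python so the shape of the logic is visible without a running cluster; the topic, field names, and filter rule are all invented for illustration.

```python
import json
from collections import deque

# Stand-in for a Kafka topic partition: each entry is one serialized message.
topic = deque([
    json.dumps({"user": "u1", "amount": 120.0}),
    json.dumps({"user": "u2", "amount": -5.0}),   # malformed: negative amount
    json.dumps({"user": "u3", "amount": 42.5}),
])

def poll_batch(queue, max_records=100):
    """Drain up to max_records messages, like one micro-batch interval."""
    batch = []
    while queue and len(batch) < max_records:
        batch.append(json.loads(queue.popleft()))
    return batch

def transform(batch):
    """Clean and filter: drop records with non-positive amounts."""
    return [rec for rec in batch if rec["amount"] > 0]

processed = transform(poll_batch(topic))
print(processed)  # only the u1 and u3 records survive the filter
```

In the real application each micro-batch would arrive as an RDD from the streaming context rather than from an in-memory queue.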

Environment: MapR 5, Java 7, Python 2.7, Hadoop 2.7.0, Linux, UNIX Shell, Kafka 0.9, Hive 0.14, Flume 1.5, Sqoop 1.4.5, HDFS, MLlib, Spark 1.5, MapR-DB, MapR-FS, HBase, MapReduce, JUnit, MRUnit, Agile, Git, JIRA

Confidential, Pittsburgh, PA

Hadoop Developer/Analyst

Responsibilities:

  • Worked on Cloudera Hadoop ecosystem with Agile Development methodology
  • Used Flume and Kafka to transform, enrich and stream transactions to different locations
  • Used Sqoop to transfer structured and unstructured data from different sources
  • Performed data transformation, extraction and filtering using HiveQL/SQL and Python
  • Used Hive and Impala for batch reporting and managed HDFS long term storage
  • Used Solr Java client APIs for querying and indexing across Solr Cloud
  • Scheduled workflow with Oozie
  • Worked with development team to design search solution with Apache Solr and Lucene
  • Worked with analytic teams to visualize tables in Tableau for reporting
  • Used JUnit and MRUnit for unit testing
  • Used Git for collaboration and version control
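The batch-reporting work above (Hive/Impala aggregations feeding Tableau) boils down to grouped SQL over partitioned tables. The sqlite3 sketch below shows the shape of such an aggregation query; the table and column names are invented for illustration, and in Hive the date column would typically be a partition key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE transactions (
        txn_date TEXT,      -- in Hive this would be a partition column
        region   TEXT,
        amount   REAL
    )
""")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?)",
    [("2016-01-01", "east", 10.0),
     ("2016-01-01", "west", 20.0),
     ("2016-01-02", "east", 5.0)],
)

# Daily totals per region -- the shape of a typical batch report query.
report = conn.execute("""
    SELECT txn_date, region, SUM(amount)
    FROM transactions
    GROUP BY txn_date, region
    ORDER BY txn_date, region
""").fetchall()
print(report)
```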

Environment: Java 7, CDH 5.3, Hadoop 2.3, Linux, Unix Shell, Python, Flume 1.4, Sqoop 1.4.5, Kafka 0.8.2, Hive 0.13, HBase 0.98, Impala, Solr 4.4, Lucene, Morphlines, Tableau, JUnit, Oozie, MapReduce, MRUnit, Git

Confidential

Software Engineer

Responsibilities:

  • Worked on Oracle Big Data Appliance with Agile methodology
  • Collected web data and stored it in MongoDB using Ruby/Python
  • Created Hive tables and analyzed abnormal web behavior in web log data using HiveQL
  • Designed schemas in Oracle NoSQL Database to store social media data
  • Developed MapReduce programs on cluster to store web logs in HDFS
  • Developed and performed simulation tests using Selenium (Java) and Capybara (Ruby)
  • Involved in monitoring and managing Hadoop cluster using Cloudera Manager
  • Involved in leveraging SQL to expose analytics results to the front-end platform using JavaScript
  • Used Git for collaboration and version control
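Analyzing abnormal web behavior, as above, usually starts with parsing each log line into fields. The regex below targets the leading fields of the common Apache log format and counts error responses per client as a crude anomaly signal; it is a generic sketch, not the project's actual parser.

```python
import re
from collections import Counter

# Matches the leading fields of an Apache common-log line:
# client IP, timestamp, request line, status code.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] "(?P<request>[^"]*)" (?P<status>\d{3})'
)

def error_counts(lines):
    """Count 4xx/5xx responses per client IP -- a crude anomaly signal."""
    counts = Counter()
    for line in lines:
        m = LOG_PATTERN.match(line)
        if m and m.group("status")[0] in "45":
            counts[m.group("ip")] += 1
    return counts

logs = [
    '10.0.0.1 - - [01/Jan/2016:00:00:01 +0000] "GET /login HTTP/1.1" 200',
    '10.0.0.2 - - [01/Jan/2016:00:00:02 +0000] "GET /admin HTTP/1.1" 403',
    '10.0.0.2 - - [01/Jan/2016:00:00:03 +0000] "GET /admin HTTP/1.1" 403',
]
print(error_counts(logs))  # 10.0.0.2 has two 4xx hits
```

At cluster scale the same per-IP counting would be expressed as a MapReduce job or HiveQL GROUP BY rather than a single-process loop.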

Environment: Oracle Big Data Appliance, Python, Ruby 2.0, Rails 3.2, Cloudera CDH 4, Hadoop 2.0, Oracle NoSQL Database, MapReduce, Hive, MongoDB 2.4, HDFS, Java, Selenium, Capybara, JavaScript, SQL, Git

Confidential

Software Developer

Responsibilities:

  • Worked on the Spring MVC Framework with Waterfall methodology
  • Implemented the website user interface of three functional modules with HTML, CSS, ExtJS, JavaScript, JSP
  • Designed and developed persistence-layer components using Hibernate to store and fetch data from DB2
  • Worked on using JSON for transferring/retrieving data between platforms
  • Worked on unit testing and integration testing with JUnit
  • Involved in reviewing Use Cases, Class Diagrams, and Sequence Diagrams to model the detailed design of the project
  • Wrote SQL queries for CRUD operation
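Transferring data between platforms as JSON, mentioned above, is a serialize/deserialize round trip. The snippet below is a generic Python illustration (the project itself used Java); the record fields are invented.

```python
import json

# A record as it might leave the persistence layer.
record = {"id": 42, "name": "widget", "tags": ["a", "b"]}

payload = json.dumps(record)    # serialize for transfer between platforms
restored = json.loads(payload)  # deserialize on the receiving side

assert restored == record       # the round trip preserves the structure
print(payload)
```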

Environment: Spring MVC, IBM DB2, JSP, HTML, CSS, JavaScript, ExtJS, Hibernate 3, JUnit, JSON, UML, SQL, Waterfall

Confidential 

Java Developer

Responsibilities:

  • Worked on Struts 2 Framework with Agile methodology
  • Designed and developed customized services using HTML, JSP, JavaScript, CSS
  • Integrated Spring DAO for data access using Hibernate
  • Used HQL/SQL to query MySQL databases
  • Involved in implementing the user interface of official website with HTML, CSS, Javascript
  • Wrote unit tests with JUnit

Environment: Java, Struts 2.0, MySQL, JSP, HTML, CSS, JavaScript, JDBC, JUnit and Agile
