
Portfolio Architect Resume

Phoenix, AZ

SUMMARY

  • 13 years of IT experience, including 5 years of experience building large-scale distributed data processing systems, in-depth knowledge of Hadoop architecture MR1 & MR2 (YARN), and 7+ years of experience in Java/J2EE application development
  • Experience in all phases of software development life cycle & Agile Methodology
  • Expertise in implementing, consulting on, and managing Hadoop clusters and ecosystem components such as HDFS, MapReduce, Pig, Hive, Flume, Oozie & Zookeeper
  • Proficiency in batch processing using Hadoop MapReduce, Pig & Hive; real-time processing using Storm; stream processing of data with Spark Streaming in Scala (a brief sketch follows this summary)
  • Hands-on experience in writing Pig/Hive scripts and custom UDFs
  • Experience in partitioning, bucketing and joins in Hive
  • Experience in query optimization and performance tuning with Hive
  • Hands-on experience in importing and exporting data to/from RDBMS and HDFS/HBase/Hive through Sqoop, both full refresh and incremental
  • Hands-on experience in loading log data from multiple sources into HDFS through Flume agents
  • Experience in configuring and implementing Flume components such as Source, Channel and Sink
  • Experience in writing Oozie workflows, executing parallel workflows using Fork, and controlling child workflows through the Coordinator
  • Experience in working with Zookeeper for coordination of Hadoop components such as HBase and Kafka
  • Experience in NoSQL databases such as HBase and MongoDB, with good exposure to Cassandra and GraphX
  • Experience in search engines like Elastic Search and good exposure to Solr & Impala
  • Experience in using SequenceFile, RCFile, Avro and exposure to ORC, Parquet for data serialization
  • Good exposure to schedulers such as Fair, Capacity, and Adaptive to improve performance
  • Experience working with various Hadoop distributions, including open-source Apache, Cloudera, Hortonworks & MapR
  • Experience in administrative tasks such as installing and configuring nodes, commissioning and decommissioning nodes, and backup and recovery of Hadoop nodes in the cluster
  • Good exposure to the general-purpose languages Python & Scala
  • Programming experience in UNIX shell scripting.
  • Strong analytical skills with ability to quickly understand client’s business needs.
  • Experience with Agile daily stand-up meetings, writing user stories, estimating story points, creating tasks and ETAs, tracking task progress with daily burn-down charts, and completing backlogs
  • Research-oriented, motivated, proactive, self-starter with strong technical, analytical and interpersonal skills.
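
Below is a minimal, illustrative sketch of Spark Streaming with Scala of the kind referenced in this summary. The socket source, host/port, and 10-second batch interval are placeholder assumptions, not details of any specific engagement.

```scala
// Minimal Spark Streaming sketch (illustrative only; source, host/port, and
// batch interval are assumptions, not taken from any project listed here).
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object StreamingSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("StreamingSketch").setMaster("local[2]")
    // 10-second micro-batches; batch interval is one of the tuning levers above.
    val ssc = new StreamingContext(conf, Seconds(10))

    // Read lines from a socket source (hypothetical host/port).
    val lines = ssc.socketTextStream("localhost", 9999)

    // Simple stateless transformation: count events per key in each batch.
    val counts = lines.map(line => (line.split(",")(0), 1L)).reduceByKey(_ + _)
    counts.print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```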

TECHNICAL SKILLS

Programming Languages: Java, C, C++, Scala, Python

BigData Technologies: HDFS, MapReduce, YARN, Hive, Hue, Beeswax, Pig, Sqoop, Flume, Oozie, Zookeeper

NoSQL: HBase, MongoDB

RDBMS: MySQL, Oracle, SQL Server, DB2

Data Ingestion Tools: Flume, Sqoop, Kafka

Monitoring Tools: Ganglia, Nagios, Splunk

Visualization Tools: Pentaho, Kibana, Tableau

Realtime Streaming and Processing: Storm, Spark Streaming

Data Mining Tools: R, SPSS, RapidMiner

Operating Systems: Windows 9x/2000/XP/7/8/10, Linux, UNIX, Mac

Development Tools: Eclipse, RSA, RAD

Build and Log Tools: Ant, Maven, Log4j

Version Control: CVS, SVN, GitHub

PROFESSIONAL EXPERIENCE

Confidential, Phoenix, AZ

Portfolio Architect

Responsibilities:

  • Developed Spark scripts by using Scala shell commands.
  • Developed Scala scripts and UDFs using both DataFrames/SQL/Datasets and RDD/MapReduce in Spark 1.6 for data aggregation
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Experienced in performance tuning of Spark applications: setting the right batch interval, choosing the correct level of parallelism, and memory tuning.
  • Implemented partitioning, dynamic partitions, and buckets in Hive.
  • Optimized existing algorithms in Hadoop using SparkContext, Spark SQL, DataFrames, and pair RDDs.
  • Performed advanced procedures like text analytics and processing, using the in-memory computing capabilities of Spark.
  • Experienced in handling large datasets using partitions, Spark in-memory capabilities, broadcast variables, effective and efficient joins, and transformations during the ingestion process itself
  • Used the Spark DataFrame API and Scala case classes to process gigabytes of data (a brief sketch follows this list)
  • Developed an Apache Spark Streaming job using Scala to analyze streaming data
  • Wrote transformations and actions to process complex data in Spark using Scala
  • Wrote Hive queries and partitions to store the data in internal tables
  • Wrote UNIX shell and Pig scripts to preprocess the data stored in the CornerStone platform
  • Used MapReduce to process the stored data for multiple use cases
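
Below is a minimal, illustrative sketch of the DataFrame-plus-case-class aggregation style described above, written against the Spark 1.6 API. The field names and input path are placeholder assumptions.

```scala
// Illustrative DataFrame aggregation with a Scala case class (Spark 1.6 style).
// Field names and the input path are hypothetical placeholders.
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.functions._

case class Txn(account: String, category: String, amount: Double)

object AggregationSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("AggregationSketch"))
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._

    // Parse raw text into a typed case class, then lift to a DataFrame.
    val txns = sc.textFile("/data/txns") // hypothetical HDFS path
      .map(_.split(","))
      .filter(_.length == 3)
      .map(f => Txn(f(0), f(1), f(2).toDouble))
      .toDF()

    // Aggregate with the DataFrame API; the same logic could be written in Spark SQL.
    val totals = txns.groupBy("account", "category")
      .agg(sum("amount").as("total_amount"), count("amount").as("txn_count"))

    totals.show()
    sc.stop()
  }
}
```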

Environment: Hadoop, HDFS, MapReduce, Pig, Hive, Spark, Scala, MapR Distribution

Confidential, Eden Prairie, MN

Senior Big Data Lead Consultant

Responsibilities:

  • Designing and developing Logical Data Models for the Legacy & CornerStone Databases
  • Created Sqoop scripts for tables using Linux shell scripts
  • Created, deleted, and executed Sqoop jobs in the Sqoop metastore
  • Designed HBase table row keys and mapped them to RDBMS table column names
  • Mapped HBase table columns to Hive external table columns
  • Performed historical and incremental imports of RDBMS data into HBase tables using the Sqoop metastore
  • Validated Sqoop scripts, Hive scripts, and HBase scripts
  • Created HBase tables and column families, altered column families, granted permissions on HBase tables, and defined RegionServer space
  • Automated workflows through Oozie
  • Executed imports for multiple tables in the database in parallel using Fork in Oozie
  • Wrote transformations and actions in Scala to process complex data (a brief sketch follows this list)
  • Fixed bugs and provided production support for running processes.
  • Participated in SCRUM Daily stand-up, sprint planning, Backlog grooming & Retrospective meetings.
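
Below is a minimal, illustrative sketch of Scala transformations and actions of the kind described above. The record layout (pipe-delimited id|status|amount) and the input path are placeholder assumptions.

```scala
// Minimal sketch of Scala transformations and actions; record layout and
// input path are assumptions for illustration only.
import org.apache.spark.{SparkConf, SparkContext}

object TransformSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("TransformSketch"))

    val raw = sc.textFile("/data/legacy/records") // hypothetical path

    // Transformations are lazy: parse, drop malformed rows, aggregate by status.
    val byStatus = raw
      .map(_.split('|'))
      .filter(_.length == 3)
      .map(f => (f(1), f(2).toDouble))
      .reduceByKey(_ + _)

    // Actions trigger execution: collect the aggregated totals to the driver.
    byStatus.collect().foreach { case (status, total) =>
      println(s"$status -> $total")
    }
    sc.stop()
  }
}
```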

Environment: Hadoop, HDFS, MapReduce, Pig, Hive, Oozie, HBase, Spark, Scala, ZooKeeper, MapR Distribution

Confidential, Atlanta, GA

Senior Big Data Lead Consultant

Responsibilities:

  • Designed the technical architecture and developed various Big Data workflows using MapReduce, Hive, YARN, Kafka, Storm & Spark
  • Deployed an on-premises cluster and tuned it for optimal performance for job execution needs and processing of large data sets.
  • Built reusable Hive UDF libraries for business requirements, which enabled business analysts to use these UDFs in Hive querying.
  • Used Flume to ingest the application server logs into HDFS.
  • Analyzed the logs stored on HDFS and imported the cleaned data into the Hive warehouse, which enabled business analysts to write Hive queries.
  • Worked with the Elasticsearch search engine for real-time data analytics, integrated with Kibana dashboards.
  • Set up a Kafka cluster on AWS; configured and troubleshot Kafka brokers
  • Created Kafka topics, published messages through producers, stored them in partitions, and consumed them through consumers
  • Implemented custom Kafka producers/consumers for publishing messages to topics and subscribing to them, and wrote the topology (a brief sketch follows this list).
  • Wrote spouts to read data from the Kafka message broker and pass it to the processing logic
  • Wrote bolts to filter, aggregate, and join data while interacting with data stores, and to emit tuples for subsequent bolts to process
  • Wrote the Storm topology, which defines the flow of data along the edges between spouts and bolts
  • Migrated data from RDBMS and persisted processed events from Storm bolts to Cassandra
  • Worked with various HDFS file formats like Avro, Sequence File and various compression formats like Snappy, bzip2 & lz2.
  • Participated in SCRUM Daily stand-up, sprint planning, Backlog grooming & Retrospective meetings.
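
Below is a minimal, illustrative sketch of a custom Kafka producer in Scala, as described above. The broker address, topic name, and payload are placeholder assumptions, and the consumer/Storm side is not shown.

```scala
// Illustrative custom Kafka producer using the standard Kafka clients API.
// Broker, topic, and payload are hypothetical placeholders.
import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

object ProducerSketch {
  def main(args: Array[String]): Unit = {
    val props = new Properties()
    props.put("bootstrap.servers", "broker1:9092") // hypothetical broker
    props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")

    val producer = new KafkaProducer[String, String](props)
    try {
      // Publish a message to a topic; a Storm spout (or a consumer group)
      // would read from the same topic on the other side.
      producer.send(new ProducerRecord[String, String]("app-logs", "key-1", "sample event payload"))
      producer.flush()
    } finally {
      producer.close()
    }
  }
}
```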

Environment: MapReduce, Pig, Hive, FLUME, JDK 1.6, Linux, Kafka, Storm, Spark, Elastic-Search, YARN, Hue, HiveServer2, Impala, HDFS, Oozie, Splunk, Git, Kibana, Linux Scripting

Confidential, Bloomington, IL

Senior Big Data Consultant

Responsibilities:

  • Designed and developed the application architecture and set up the Hadoop environment.
  • Set up Splunk servers and forwarders on the cluster nodes
  • Managed the configuration for additional data nodes using Chef
  • Wrote Linux scripts and cron jobs for monitoring services and cluster health.
  • Developed MapReduce programs to cleanse and parse data in HDFS obtained from various data sources and to perform map-side joins
  • Wrote MapReduce jobs to process trip summaries, scheduled to execute hourly, daily, weekly, monthly, and quarterly.
  • Responsible for loading machine data coming from different sources into the Hadoop cluster using Flume
  • Wrote Oozie workflows to schedule MapReduce jobs
  • Configured Flume and wrote custom sinks and sources
  • Used Flume to collect, aggregate, and store the log data from different web servers.
  • Ingested data into HBase and retrieved it using the Java APIs
  • Used Spark SQL to extract data from different data sources and place the processed data into NoSQL (MongoDB)
  • Used Spark to analyze machine-emitted and sensor data, extracting data sets for meaningful information such as location, driving speed, acceleration, braking speed, and driving patterns (a brief sketch follows this list).
  • Created Spark SQL (metadata) tables to store the processed results in a tabular format
  • Used Git as version control to checkout and check-in of files.
  • Reviewed high-level designs and code, and mentored team members.
  • Participated in SCRUM Daily stand-up, sprint planning, Backlog grooming & Retrospective meetings.
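
Below is a minimal, illustrative Spark SQL sketch of the sensor-data analysis described above. The schema, input path, and metrics are placeholder assumptions, and the write to MongoDB (via the MongoDB Spark connector) is not shown.

```scala
// Minimal Spark SQL sketch (Spark 1.x style) over hypothetical telematics data.
// Field names, path, and thresholds are assumptions for illustration only.
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

case class SensorEvent(vehicleId: String, speedMph: Double, acceleration: Double, braking: Double)

object SensorSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("SensorSketch"))
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._

    val events = sc.textFile("/data/telematics") // hypothetical path
      .map(_.split(","))
      .filter(_.length == 4)
      .map(f => SensorEvent(f(0), f(1).toDouble, f(2).toDouble, f(3).toDouble))
      .toDF()

    // Register a temporary table and query it with Spark SQL.
    events.registerTempTable("sensor_events")
    val drivingPattern = sqlContext.sql(
      """SELECT vehicleId,
        |       AVG(speedMph) AS avg_speed,
        |       MAX(braking)  AS hardest_brake,
        |       COUNT(*)      AS event_count
        |FROM sensor_events
        |GROUP BY vehicleId""".stripMargin)

    drivingPattern.show()
    sc.stop()
  }
}
```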

Environment: Hadoop, MapReduce, OpenStack, Flume-NG, Free IPA, HBase 0.98.2, MongoDB, Spark, Kerberos, PostgreSQL, RabbitMQ Server, Map/Reduce, HDFS, ZooKeeper, Oozie, Splunk, GitHub, Chef

Confidential

Senior BigData Engineer

Responsibilities:

  • Designed and developed the Hadoop stack
  • Analyzed the functional specification
  • Managed the configuration of data nodes on the cluster using Chef.
  • Responsible for managing data coming from different sources; involved in HDFS maintenance and loading of structured and semi-structured data.
  • Loaded data into external tables using Hive scripts
  • Performed aggregations, joins, and transformations using Hive queries
  • Implemented partitions, dynamic partitions, and buckets in Hive
  • Optimized HiveQL queries, improving job performance
  • Developed Sqoop scripts to import and export the data from relational sources and handled incremental loading on the customer and transaction data by date
  • Loaded data into the cluster from dynamically generated files using Flume and from relational database management systems using Sqoop
  • Used Oozie to automate/schedule business workflows which invoke Sqoop, MapReduce and Pig jobs as per the requirements
  • Performed Hadoop cluster environment administration, including adding and removing cluster nodes, cluster capacity planning, performance tuning, cluster monitoring, and troubleshooting
  • Wrote unit test cases for Hive scripts

Environment: Java, Hadoop, HDFS, MapReduce, Pig, Hive, Flume, Oozie, ZooKeeper, CHEF

Confidential

Senior Software Engineer

Responsibilities:

  • Understood the client's functional requirements in order to design the technical specifications, develop the system, and subsequently document the requirements
  • Responsible for developing class diagrams, sequence diagrams
  • Designed and implemented a separate middleware Java component on Oracle Fusion
  • Reviewed high-level designs and code, and mentored team members.
  • Participated in SCRUM Daily stand-up, sprint planning, Backlog grooming & Retrospective meetings.

Environment: Java1.6, Oracle Fusion Middleware, Eclipse, WebSphere, Spring F/w

Confidential

Senior Software Engineer

Responsibilities:

  • Understood the client's functional requirements in order to design the technical specifications, develop the system, and subsequently document the requirements.
  • Prepared LLD - Class Diagrams, Sequence Diagrams, Activity Diagram using Enterprise Architect UML Tool
  • Worked on Hibernate, Spring IOC, DAO, JSON Parsing
  • Prepared Unit test cases for the developed UI.
  • Responsible for problem tracking, diagnosis, replication, troubleshooting, and resolution of client problems.

Environment: Java, Confidential Proprietary F/w using DOJO, Hibernate, Spring, DB2, RSA, Rational ClearCase, RPM, RQM, Mantis

Confidential

IT Consultant

Responsibilities:

  • Understood the client's functional requirements in order to design the technical specifications, develop the system, and subsequently document the requirements.
  • Prepared LLD - Class Diagrams, Sequence Diagrams, Activity Diagram using Enterprise Architect UML Tool
  • Developed UI on JSF with RichFaces
  • Wrote TestNG test cases
  • Ensured appropriate process standards were met and maintained.
  • Involved in preparing ad hoc reports.

Environment: Windows, Unix, Java, Struts, Hibernate, Tomcat, Lenya, Remedy Tool, WinSCP, Putty, VPN, Eclipse
