Sr. Spark Developer Resume
Houston, TX
SUMMARY:
- 10 years of IT experience, including about 3 years developing Big Data / Hadoop applications.
- Worked on open-source Apache Hadoop, Cloudera Enterprise (CDH) and Hortonworks distributions.
- Configured SAP Confidential, an in-memory query engine that plugs into the Apache Spark execution framework to provide enriched interactive analytics on data stored in Hadoop.
- Hands-on experience with Hadoop ecosystem components (MapReduce, HDFS, Sqoop, Pig, Hive, HBase, Flume, Oozie and ZooKeeper).
- Experienced in programming with RDD operations using Transformations and Actions, Spark’s shared variables (Accumulators and Broadcasts), higher-order functions, function literals, and collections such as Lists, Sets, Maps, Arrays, Sequences and monadic collections.
- Experienced in Spark SQL using DataFrames, Datasets, SchemaRDDs, HiveContext/SQLContext, HiveQL and UDFs, caching tables for performance, working with both structured and semi-structured data such as text files, Hive tables, RDDs, Parquet, RCFile, ORC, JSON and Avro, using JDBC connectivity, and tuning performance by setting Spark SQL options.
- Experienced with Spark Streaming using DStreams, stateless/stateful transformations and windowed transformations, with input sources such as Apache Kafka, Apache Flume and streams of files, and with combining multiple input sources.
- Experienced with MLlib in Spark for implementing machine learning algorithms of different categories, such as feature extraction, dimensionality reduction techniques such as Principal Component Analysis and Singular Value Decomposition, and collaborative filtering and recommendation techniques such as Alternating Least Squares.
- Experienced in tuning and debugging Spark applications: changing runtime configuration values; tuning Spark’s use of memory by optimizing RDD storage; inspecting Spark jobs, tasks and stages; observing application behavior and performance in Spark’s built-in web UI and in driver and executor log files; tuning the level of parallelism using the repartition() and coalesce() operators; reducing data shuffling and recomputation using persist() or cache(); and using the Kryo serialization format.
- Configured SAP HANA Spark controller to connect from SAP HANA and query SAP Confidential tables.
- Leveraged the SAP Confidential Graph Engine, Time Series Engine, Document Store Engine, and Relational and Disk Engines to provide valuable business insights from Big Data in Hadoop and HANA systems.
- Expertise in developing Python scripts for data mining and system administration tasks.
- Installed and administered SAP HANA based system landscapes.
- Extensive experience in all aspects of SAP Basis Administration including OS/DB migration, upgrading by applying Enhancement Packs, Support Packs, Kernel Upgrades, performing Client Copy/Export/Import, System Copy/Refresh, Spool Administration, Background jobs, Workload Analysis, Add-On installations, System Monitoring/Maintenance, Performance Tuning and applying OSS notes.
- Extensive experience in all aspects of Microsoft SharePoint administration and developing solutions using SharePoint Online (Office 365), SharePoint Server 2013, SharePoint Server 2010, SharePoint 2007, Windows SharePoint Services (WSS 3.0), SharePoint Designer 2010/2007, InfoPath 2010/2007 and Microsoft Visual Studio 2013/2010/2008.
- Responsible for implementing standards and practices for documentation and procedures for a seamless integration into the global SAP environment and flawless Production Support practices.
- Responsible for creating and updating the global project cutover plan for all maintenance and rollout activities in the Hadoop and SAP landscapes, and for updating senior management on progress and issues.
PROFESSIONAL EXPERIENCE:
Confidential, Houston, TX
Sr. Spark Developer
Responsibilities:
- Developed Big Data solutions dependent on Hadoop clusters.
- Leveraged Spark’s Scala API and Spark SQL (DataFrames and Datasets) along with Kafka streams to capture several hundred customer invoices in PDF format in near real time, feeding BI dashboards that help managers make effective decisions.
- Used the Log4j framework to log debugging information and error data.
- Collaborated with the administration team on cluster coordination services through ZooKeeper.
- Installed SAP Confidential on Hadoop clusters and configured SAP HANA Spark controller.
- Leveraged the SAP Confidential Graph engine to create graphs from data stored in HDFS and SAP HANA as JSG format files in order to optimize delivery routes, perform point-of-sale analysis, identify relationships between entities of key interest, view complex structures and draw knowledge graphs.
- Leveraged the SAP Confidential time series engine using raw SQL to provide time-based metrics for warehouse stock replenishment and per-location sales analysis, supporting timely delivery, reducing chemical wastage due to overstocking and optimizing stock distribution.
- Extensive experience in all aspects of SAP Basis Administration.
- Installed SAP HANA Systems and performed post-installation steps.
- Installed new SAP instance server builds to increase processing power.
- Performed the SAP EHP 6 to EHP 7 upgrade, including pre-upgrade, technical upgrade and post-upgrade steps.
- Performed Support Pack application activities such as SPAM/SAINT updates, applying support packs, SPAU and SPDD resolutions, and add-on/plug-in installation.
- Managed spool administration and resolved printer and printing issues.
- Performed daily system monitoring and troubleshooting to ensure system performance.
- Tuned system performance by reviewing OSS Notes and adjusting profile parameters.
- Performed system copies/refreshes from production to quality environments.
- Performed kernel upgrades.
- Configured Operation Modes and Logon Groups for load balancing.
- Created and maintained user roles in SAP systems per business requirements.
- Supported technologies such as SAP BW, BPC, BOBJ, Vertex, BSI TaxFactory and Salesforce.
- Performed mass user maintenance using transactions SU10 and PFCG.
- Followed SAP Start-Stop procedures and coordinated with teams which interface with SAP.
- Provided round-the-clock hypercare support after go-live.
- Worked with ABAP and functional teams on technical configurations.
- Provided excellent customer care and resolved business critical issues.
Confidential, Boston, MA
SharePoint Administrator and Developer
Responsibilities:
- Gathered Business User requirements and designed site layouts for various departments at CCA.
- Installed and configured SharePoint Server 2010 infrastructure.
- Configured and performed backup and restore tasks on entire SharePoint farms.
- Formed and implemented an effective SharePoint 2010 governance plan for CCA.
- Used SharePoint Designer 2010 for branding and modified the look and feel of individual sites.
- Modified default Master pages to replace OOB left-navigation pane and apply customized styles.
- Developed custom style sheets for Content Query Web Parts on SharePoint sites per business requirements.
- Developed synchronous and asynchronous SharePoint Event Receivers for Lists using Visual Studio 2010.
- Used SharePoint Designer 2010 to construct SharePoint workflows on document libraries and lists.
- Configured Approval workflows and Collect Feedback workflows for document libraries.
- Configured SharePoint Alerts on Lists and Document libraries to notify list members of changes.
- Configured SharePoint Search schedules and Backup jobs to perform regular crawls and backups.
- Performed troubleshooting and debugging of custom functionalities that are used on SharePoint sites.
Confidential, Boise, ID
Operations Research Analyst
Responsibilities:
- Used Principal Component Analysis to create principal dimensions on an interstate road metrics database and identify patterns that determine a road’s condition.
- Supported Confidential’s needs analysis using various statistical methods and developed ad hoc programs on the Mathematica platform.
- Implemented a Principal Component Neural Network library for an Artificial Intelligence tool kit in Java. The objective of the library is to perform dimensionality reduction for extracting hidden patterns in given image data and to support the formulation of equations that help in mobility decisions of robots.
- Skills involved: Java, Artificial Intelligence techniques.
- Designed and implemented the gprof2Dot tool, which generates a graphical view of call-graph profiling information from the ‘gprof’ tool for use during optimization and evaluation of function calls.
- Skills involved: C++, dotty, gprof and Graphviz profiling tools.
- Implemented a generic parallel bucket sort library, which sorts any type of data by taking advantage of multiple processors in a clustered environment.
- Skills involved: C++, Parallel programming concepts, MPI, MPICH.
- Designed and implemented a tool for detecting memory dependencies between loop iterations, to overcome the deficiencies of static dependency analyzers that fail to detect memory dependencies arising at run time.
- Skills involved: Perl, C++ (with STL) and the Pin dynamic instrumentation tool.