
Principal Machine Learning Engineer Resume


Kirkland, WA

OBJECTIVE:

To obtain a highly autonomous position in a collaborative group that provides high value to the organization and ongoing challenges. Willing and able to take on both individual contributor and manager roles.

SUMMARY:

4 years of bootstrapping FinTech and indie games. 1.25 years of consulting / contracting / assessing the fitness of positions.

RELEVANT SKILLS AND ABILITIES:

  • Leadership
  • C/C++
  • Project Management
  • Software Architecture & Design
  • C#
  • Management
  • SCRUM
  • Kanban
  • AutoML / Automated Machine Learning
  • Security Threat Modeling
  • Software Development Process Engineering
  • Service-Oriented Architecture (SOA)
  • Python, IronPython, Jython
  • Boost
  • Microservices
  • Real-time Data Stream Processing
  • STL
  • .Net SDK and .Net Runtime
  • Perforce, Mercurial, Git, VSS
  • SQL
  • Kafka
  • XML, HTML, XHTML, CSS
  • WPF
  • MongoDB
  • OpenGL
  • COM
  • Google Cloud Datastore
  • Game Physics
  • ATL
  • BASIC, VB & VBA

EXPERIENCE:

Principal Machine Learning Engineer

Confidential, Kirkland, WA

Technical Skills, Technologies, and Tools Utilized: Python, Pandas, Numpy, TPOT, Scikit-learn, XGBoost, Matplotlib, Simple Linear Regression, Decision Stumps, Stacking of Ensembles, C#, WebSockets, Protobuf, PyPy, Mercurial, and Multicharts.Net.

Responsibilities:

  • Architected, designed, implemented in Python, and tested an automated futures trading platform that yields annualized returns over 100% in backtesting but has thus far proven to be overfit when applied to test data. The platform runs against both the real-time stream of market quotes and historical series of market quotes that were collected in real time with actual lag times. Order execution simulation includes lag-based slippage adjustments with a slight bias towards unfavorable fills. The platform has been managing positions on Rithmic's simulation server since April 2018 without any major or unexpected incidents.
  • Architected, designed, implemented in Python, and tested a system for the real-time monitoring of both the platform and the positions it manages. This monitoring system is approximately 60% complete and leverages the same real-time data stream processing modules as the platform.
  • Machine Learning:
  • Current focus is on integrating TPOT and its use of scikit-learn and XGBoost for making trade decisions.
  • Made use of matplotlib and previously developed tools to explore historical futures market data.
  • Developed a collection of simple linear regression methods inspired by Theil-Sen estimator that are given only a single degree of freedom to minimize overfitting.
  • Utilized the stacking of brute force ensembles of decision stumps and simple linear regression models for making trade decisions. The ensembles at each level include both decision tree and linear regression models.
  • Self-directed Learning: read Blue Ocean Strategy, Creative Destruction, Good To Great, Built To Last, and Fractals and Scaling in Finance (in progress).
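
The single-degree-of-freedom regression idea above can be sketched in plain Python. This is an illustration only, not the project's actual code: the assumption here is that the one free parameter is the Theil-Sen median of pairwise slopes, with the fitted line anchored at the data's mean point rather than learning an independent intercept.

```python
from statistics import median

def fit_slope(xs, ys):
    """Theil-Sen-style fit with a single degree of freedom: the slope.

    The slope is the median of all pairwise slopes, which is robust to
    outliers; the line is forced through the mean point, so no independent
    intercept is learned (the hypothesized way to limit overfitting).
    """
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    slopes = [(ys[j] - ys[i]) / (xs[j] - xs[i])
              for i in range(n) for j in range(i + 1, n)
              if xs[j] != xs[i]]
    slope = median(slopes)
    return slope, lambda x: my + slope * (x - mx)

slope, predict = fit_slope([0, 1, 2, 3, 4], [1, 3, 5, 7, 9])
# slope == 2.0; predict(5) == 11.0
```

Because the median of pairwise slopes ignores extreme pairs, a single bad quote in a price series shifts the fit far less than it would shift an ordinary least-squares line.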

Staff Engineer - Machine Learning

Confidential, Kirkland, WA

Technical Skills, Technologies, and Tools Utilized: Python, Pandas, Numpy, TPOT, Scikit-learn, XGBoost, Seaborn, Azure SQL DB, Azure Blob Storage, C#, .Net Framework, .Net Core, Event Store.

Responsibilities:

  • Envisioned and twice got executive approval to develop a fully automated ML model training system.
  • Developed the data toolchain that queries Azure SQL, extracts fields from XML in Azure Blob Storage, merges in CSV files downloaded from Stripe's Sigma queries, cleans the data of numerous and varied errors, flattens the data, and finally transforms and aggregates the data into columns of features and labels suitable for machine learning, data science, and business analytics.
  • Developed the analytics toolchain that boils down the predictions for 100k+ orders from each of the hundreds of models into the selection of the one model, the one transform of the predictions (e.g. weighted, loan loss estimate, profit estimate, etc.), and the one approval cutoff threshold that together provide the best projected business performance.
  • Developed a tool for driving TPOT to generate scikit-learn and XGBoost credit decision models. This produced a model that was projected to improve the approval rate from 78% to 83%, the loan loss rate from 1.8% to 1%, and the default rate from 4.2% to 2.2%; to increase the total transaction amount by 20% and the mean transaction amount by 13%; and to reduce the mean credit score by 8 points. An outsourced data science team also manually created models; the TPOT-generated model tied the best of those models on approval rate and was better by 0.1% on the loan loss rate, 0.8% on the default rate, 19% on total transaction amount, 18% on the mean transaction amount (the other model reduced this metric), and 3 points on the mean credit score.
  • Leadership: formed a cross-discipline reading group to read and discuss the relevance of The Fifth Discipline.
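
The model / transform / cutoff selection described in the analytics toolchain bullet amounts to an exhaustive search over triples, each scored on historical outcomes. A minimal sketch with hypothetical toy data and names (the real toolchain scored 100k+ orders across hundreds of models):

```python
def select_best(model_scores, outcomes, transforms, cutoffs):
    """Pick the (model, transform, cutoff) triple with the best projected
    business performance over a set of historical orders.

    model_scores: {model_name: [score per order]}
    outcomes:     [realized profit/loss per order, had it been approved]
    transforms:   {name: function adjusting a raw score}
    cutoffs:      candidate approval thresholds
    """
    best = None
    for model, scores in model_scores.items():
        for tname, transform in transforms.items():
            adjusted = [transform(s) for s in scores]
            for cutoff in cutoffs:
                # Projected performance: sum of outcomes for approved orders.
                projected = sum(o for a, o in zip(adjusted, outcomes)
                                if a >= cutoff)
                if best is None or projected > best[0]:
                    best = (projected, model, tname, cutoff)
    return best

# Hypothetical data: two models scoring three orders.
best = select_best(
    {"model_a": [0.9, 0.2, 0.6], "model_b": [0.5, 0.5, 0.5]},
    [100, -500, 80],                     # profit if approved
    {"identity": lambda s: s},
    [0.3, 0.55])
# best == (180, "model_a", "identity", 0.3)
```

In practice the transforms would be the weighted / loan-loss-estimate / profit-estimate variants the bullet mentions, and "projected performance" would be whichever business metric is being optimized.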

Senior Software Engineer

Confidential, Bellevue, WA

Technical Skills, Technologies, and Tools Utilized: Node.js, Elastic Search, PostgreSQL, CockroachDB, Redis, Git.

Responsibilities:

  • Developed the aCommerce integration for the fulfillment and partner microservices for the roll-out of Confidential's global e-commerce platform in Indonesia.
  • This was a consulting position at Confidential, with my direct employer being Confidential.

Senior Software Development Lead

Confidential, Bellevue, WA

Technical Skills, Technologies, and Tools Utilized: Python, MongoDB, Kafka, OpenStack, Ubuntu, Git.

Responsibilities:

  • The nature and details of the project are confidential to the point that they can be discussed only with other people working directly on the project.
  • Architected, designed, implemented in Python, and tested a stock trade execution simulation environment to enable data science research. The simulation includes estimates for transaction costs and slippage. The models are constrained to using stateless algorithms that do not factor in current portfolio holdings and operate on price quotes only. This constraint allows estimating performance over all time horizon instances of interest without needing to simulate millions of time horizon instances individually for each model, i.e., O(1) instead of O(n²) with respect to time horizons.
  • Architected, designed, implemented in Python, and tested non-interactive command line tools for multidimensional (hypercube) data analytics. Collectively the tools support roll-up / summarization / aggregation, slicing and dicing, and embedding generated measures. Numerous common statistical algorithms (e.g. mean, median, percentile, correlation) can be selected directly from the command line. Also included are some less standard / experimental algorithms such as singular rankings for multiple dimensions generated by successive rounds of iterative cuts of the worst candidates and baseline adjusted normalized rank.
  • Hierarchical dimensions are compactly stored as Uniform Resource Identifiers (URIs), and the tools accept URIs and regular expressions as command line arguments.
  • The tools support multiprocessing for execution speed and scaling, but also support single processing for increased diagnosability. The tools all operate on CSV-formatted files so that the primary data sources can be committed to version control, which 1) provides cheap diagnosability, 2) allows quick manual verification of changes in analysis results via standard text file diffing tools, and 3) lets regression tests simply query version control to verify that no changes occurred after executing analysis scenarios.
  • Machine Learning:
  • Utilized the tools detailed above to backtest hypothetical models for stock trading, optimize model parameters, and develop and test hundreds of methods for evaluating and comparing return distributions.
  • The lifetime annualized return was 65.6% for a model using a portfolio drawn from 8 exchange-traded funds (ETFs). The portfolio is rebalanced daily as determined by a parameterized machine learning model with a specific set of arguments. The return is for price appreciation only, as dividends are excluded.
  • The model typically determines that the entire portfolio be allocated to cash or just one of the 8 ETFs. The 8 ETFs are all long, 2x or 3x leveraged, and based on widely tracked stock indexes. The time horizons of interest (THoI) are 1 day (i.e. held overnight), 2 days, 3 days, etc. up through half of the days of quotes available for the ETF with the shortest history. The model portfolio has a 90.62% mean annualized return for all instances of the THoI (aka rolling periods) and a maximum drawdown of -36.1%.
  • When compared to a portfolio that holds equal weights of the same 8 ETFs and is rebalanced daily, the model portfolio has a higher lifetime annualized return, a higher mean annualized return for all instances of THoI, and higher rankings for 4 different stack ranking algorithms, but has a more severe maximum drawdown.
  • The parameterized machine learning model is a fairly simplistic ensemble approach. The model supplements its decision making for the 3x ETFs with analysis of the 2x leveraged ETFs for the same indexes, which are not actually traded.
  • The approach detailed above was abandoned as being unlikely to actually provide lifetime annualized returns in the target range of 65-90% with a maximum drawdown of less than one third of the annualized return: it produced only one set of parameters with returns in that target range, and that set also had an unacceptable maximum drawdown.
  • A proof of concept for a second machine learning approach based on stacking and “bucket of models” was created, and its preliminary results were promising enough that this approach served as the basis for the approach to applying machine learning to the futures market detailed further above.
  • Self-directed Learning: Read 16 books and skimmed 24 books on the topics of systems thinking, business strategy, learning organizations, data science, forecasting, statistics, statistical learning, machine learning, big data analysis, predictive modeling, social physics, and financial market technical analysis. Found the following to be the most generally interesting and/or specifically useful to this project: The Fifth Discipline, Future Ready, Systems Thinking Strategy, Introduction to Statistical Learning, Analytics in a Big Data World, Data Science for Business, Data Smart, The Signal and the Noise, The Misbehavior of Markets: A Fractal View of Financial Turbulence, and Dynamic Trading.
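
The complexity claim behind the stateless-model constraint can be illustrated with prefix products: once a model's daily return series is fixed (because it does not depend on current holdings), the return of every instance of a time horizon falls out of one cumulative pass instead of simulating each window separately. A sketch under that assumption, with hypothetical numbers:

```python
def horizon_returns(daily_returns, horizon):
    """Return of every `horizon`-day window, via prefix products.

    growth[i] holds the cumulative growth factor after the first i days,
    so the return of the window starting at day s is simply
    growth[s + horizon] / growth[s] - 1. All instances of a time horizon
    come from one pass over the data rather than per-window simulation.
    """
    growth = [1.0]
    for r in daily_returns:
        growth.append(growth[-1] * (1.0 + r))
    return [growth[s + horizon] / growth[s] - 1.0
            for s in range(len(daily_returns) - horizon + 1)]

# Hypothetical daily returns: +10%, -5%, +2%; all 2-day window returns.
windows = horizon_returns([0.10, -0.05, 0.02], 2)
# approximately [0.045, -0.031]
```

Extending the list comprehension over every horizon of interest still costs only one cumulative pass over the quotes, which is the O(1)-per-horizon-instance property the bullet above relies on.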

Senior Software Engineer

Confidential, Kirkland, WA

Technical Skills, Technologies, and Tools Utilized: C++, Python, Visual Studio 2010 & 2013, WingIDE, PyCharm, CppUnit, Perforce, Fogbugz, Windows 7, Office 2010.

Responsibilities:

  • Driver for the internal project to develop fast memory leak detection and diagnostics for the C++ code in the product. Organized and drove the group that defined the requirements. Authored the requirements. Created the spikes and user stories (Scrum product backlog items) for the project. Oversaw the work done by 5 other software engineers.
  • Was a founding member of the internal C++ guidelines working group. Later took on the organizer duties for the group. Proposed, incorporated feedback, and authored the C++ guidelines for immutable objects for safer multi-threading and performance. Reviewed and provided feedback on numerous drafts.
  • Organized the internal Python guidelines working group. Proposed, incorporated feedback, and authored the Python common modules guidelines. Proposed, incorporated feedback, and authored the requirements for a loader allowing the different versions of modules / packages to coexist in the same Python interpreter environment and optionally in the same Python process. The architecture allowed for some modules to be synchronized company wide while others could follow the branch integrations with the product and test code. Proposed, incorporated feedback, and authored the Python coding style guidelines. Oversaw the implementation of these guidelines as they were carried out by other members.
  • Quickly developed plans with cost estimates for the 9-person Developer Productivity and Efficiency team shortly before the Director of Development needed to present to the CTO and board members for an unexpected strategic planning effort.
  • Individual Contributions:
  • Decreased the load times for the desktop product by modifying the C++ code to lazily load data sources. Created a proof of concept in C++ and Java to allow the server product to also lazily load data sources and lazily prompt for login credentials. Created a Python tool to do multiple automatic runs of the desktop product and consolidate the timing data of interest from each run into a CSV file, to allow easily analyzing the impact of performance changes. Made minor modifications to the desktop product to facilitate automation.
  • Designed and implemented a half dozen command APIs by refactoring existing C++ code to decouple code at different layers of the dependency stack. Beyond the decoupling of code, the new command APIs improved testability and further enabled the recording and playback of user actions.
  • Designed, implemented in C++, and unit tested an integrated memory tracking and leak detection facility. The first implementation was thread safe and performant enough that the code could be built into the product and turned off by default at runtime. Guided a junior software developer through doing a proof of concept to improve the performance sufficiently to bring the overhead down to less than 6%, enabling it to be turned on by default for all tests. Implemented that performance improvement.
  • Designed, implemented, unit tested, and functionally tested the above mentioned Python module loader. Created the proof of concept modules to provide pure Python APIs for Perforce and guided a junior developer through getting them fully implemented. Did the work necessary to port the desktop UI automation tests to use the loader on Windows.
  • Created automation to identify the likely team to own each of the C++ dev regression test source files by utilizing the change history in Perforce across all branches and an internal data source for team assignments. Created automation to insert the owning teams into the C++ test classes. Designed and implemented the proof of concept for the infrastructure to support attributing team ownership to C++ test classes, and future types of attributes as well. Guided a junior developer through developing the full implementation and unit tests.
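
The core trick behind the version-coexistence loader mentioned above can be sketched with the standard library: give each version of a module a version-qualified name so several versions live in one interpreter. This is an illustrative sketch only; the internal loader also handled company-wide synchronization and branch-following policy, and all names here are hypothetical.

```python
import importlib.util
import os
import tempfile

def load_version(module_name, version, path):
    """Load `path` as a version-qualified module so multiple versions of
    the same module can coexist in one Python interpreter."""
    qualified = f"{module_name}_v{version.replace('.', '_')}"
    spec = importlib.util.spec_from_file_location(qualified, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module

# Two versions of a hypothetical "util" module, side by side.
with tempfile.TemporaryDirectory() as d:
    p1 = os.path.join(d, "util1.py")
    p2 = os.path.join(d, "util2.py")
    with open(p1, "w") as f:
        f.write("VERSION = '1.0'\n")
    with open(p2, "w") as f:
        f.write("VERSION = '2.0'\n")
    v1 = load_version("util", "1.0", p1)
    v2 = load_version("util", "2.0", p2)
# v1.VERSION == '1.0' and v2.VERSION == '2.0'
```

Because each version gets a distinct entry in the interpreter's module table, code pinned to one version never observes the other.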

Principal Software Development Engineer

Confidential, Redmond, WA

Technical Skills, Technologies, and Tools Utilized: C#, IronPython, Visual Studio 2010, .Net SDK, .Net Runtime, Rhino Mocks, Mercurial, Source Depot (modified Perforce).

Responsibilities:

  • Individual Contributions: Architected, designed, and implemented in C# the middle-tiers of an internal service for Windows Azure Active Directory on top of Azure Fabric Controller (Windows Fabric internally prior to public availability).
  • The service includes an asynchronous pipeline that filters and transforms the changes occurring to data in the backend tier. It also includes a decentralized facility, inspired by consistent hashing, for transferring loads between partitions to provide both load balancing and reliability. C# unit tests were written using the test framework provided with Visual Studio and Rhino Mocks.
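
The consistent-hashing inspiration can be illustrated with a minimal hash ring. This is a generic sketch in Python, not the service's C# implementation: partitions map to the first ring point clockwise from their hash, so adding or removing a node moves only a small fraction of partitions.

```python
import hashlib
from bisect import bisect

class HashRing:
    """Minimal consistent-hash ring. Each node contributes many virtual
    points so load spreads evenly; a partition is owned by the node whose
    point is first clockwise from the partition's hash."""

    def __init__(self, nodes, vnodes=64):
        self.ring = sorted(
            (self._hash(f"{node}#{i}"), node)
            for node in nodes for i in range(vnodes))
        self.keys = [h for h, _ in self.ring]

    @staticmethod
    def _hash(s):
        return int(hashlib.md5(s.encode()).hexdigest(), 16)

    def owner(self, partition):
        # First ring point strictly after the partition's hash, wrapping.
        i = bisect(self.keys, self._hash(partition)) % len(self.ring)
        return self.ring[i][1]

ring = HashRing(["node-a", "node-b", "node-c"])
owner = ring.owner("partition-17")  # deterministic: one of the three nodes
```

The decentralized twist described above would layer load-transfer decisions on top of such a mapping; this sketch shows only the ownership function itself.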

Principal Software Engineer

Confidential, Redmond, WA

Technical Skills, Technologies, and Tools Utilized: C++, Python, Visual Studio 2010, WingIDE, PyCharm, Windows SDK, Android SDK, Android NDK, Chrome Native Client, unit tests and mock objects, Windows 7, variants of Linux, Android 2.2, Chrome 23, Git, Github, Mercurial, BitBucket.

Responsibilities:

  • Midas - Stock Market Data Ingestion & Analysis: Architected, designed, and implemented in Python tools to ingest historical stock market data, check it for errors, and merge it into a consolidated data source.
  • Designed and implemented tools to test the usefulness of selected financial market technical analysis techniques.
  • Designed and implemented tools to try to automate selected financial market chart analysis techniques. Utilized Tableau's desktop product to visually analyze the data generated by the analysis tools (and to eat my own dogfood from my day job).
  • Grail42 OSS on BitBucket: Improved and expanded the Python libraries and tools provided by the original Grail42 project on Github. Cleaned up the design and implementation and added unit tests. Added a testing micro-framework that is more concise than the standard one included in Python. Added a test runner that runs tests in parallel using multiple interpreters on both the host and any specified Vagrant-managed virtual machines. Additionally, the test runner does not require that the tests all be located under one source tree; these capabilities are not provided together by any one existing Python test runner.
  • Grail42 OSS on Github: Founded the Grail42 open source project on Github and solely authored all of its source code. Grail42 provides a cross-platform / multi-platform C++ library related to unit testing that has been verified to build and execute correctly on the Android, Windows, and Chrome Native Client (NaCl) platforms. Before implementing, five existing C++ unit testing frameworks were reviewed, but none provided the necessary capabilities. Grail42 also provides tools that automate correctly cloning source code projects (i.e. project wizard) and simplifies executing faux console applications on Android and NaCl from a Windows host.
  • Research:
  • After researching existing solutions, designed several solutions to allow peers in a peer-to-peer (p2p) network to establish a connection through NATs and PATs (i.e. UDP, TCP, and IP hole punching) that appear to be novel. Utilized Python to implement a functional prototype for one of the solutions.
  • Utilized Python to implement a prototype of a distributed metadata / annotation model that initially targets source code. Developed the prototype to the point that it became clear that an interactive network graph visualization tool would be needed to take it further. Trials of approximately a dozen OSS network graph and diagram tools did not find a tool that provided a significant portion of the required functionality.

Senior Engineering Manager

Confidential, Bellevue, WA & Reno, NV

Responsibilities:

  • Management: Grew my team from 2 software engineers to 9 software engineers and software test engineers, in parallel to driving the establishment of a satellite engineering office in Bellevue, WA to facilitate recruiting. Remotely managed a team of two for my first year with the company.
  • Technical Skills, Technologies, and Tools Utilized: C++, C#, Python, IronPython, COM, ATL, Win32, GUI, Multi-threading, Multi-processing, Internet Explorer 7, 8, and IE 9 (as plug-in host), Visual C++ 6.0 and VC 2010, Visual SourceSafe, Perforce, Hansoft, Jira, Office 2007, Windows XP, Vista, and Win 7.
  • Leadership: I reviewed and provided extensive feedback on the architecture for all projects I managed. For some of the smaller projects I personally created the entire architecture. I debugged and provided solutions for technical problems and defects that directs were unable to resolve independently. I did code reviews on every check-in made to projects I managed during my first 1.5 years with the client group and provided extensive feedback on design, maintainability, unit test effectiveness, and defects found through inspection (threading, security, side-by-side, etc.).
  • Management - Scaling out the organization: When I joined the client group, it perpetually had more high-business-value projects available to it than it had development capacity to complete. In particular, the client group owned the Internet Explorer toolbar platform. Most of the company's revenue was derived from advertising on search results, and most search result views came in through the IE toolbar. Additionally, the software development processes the client group utilized could not scale to a larger group size, and it was also having difficulty recruiting additional software engineers. In order to capture the revenue the company was missing due to the limited capacity of the group, I drove getting approval from my Senior VP and/or the CTO so that I could take the following actions to increase the development capacity of the client group.
  • I began recruiting for a Lead Principal Software Engineer and a Lead Senior Software Test Engineer. In 2012, the client group needed to begin a project larger than any it had previously delivered, and to do so in a completely new code base. Filling these positions would both allow the group to grow larger and allow the use of my proven software architecture, design, and library code implementation during the initial phase of this project.
  • I mentored my directs on doing code reviews and then moved to doing peer code reviews instead of doing all code reviews myself, so that I would not become a bottleneck to scalability myself.
  • I recruited and directly managed 4 STEs to maintain the product quality for the projects I managed, instead of relying primarily on QA by deep code inspection, as the latter blocked increasing the group's size since only the VP of the group was capable of these inspections. The secondary form of QA was manual GUI testing, but only enough manual testers were on staff to provide final testing just before release to web. While adding more manual testers was certainly an option, this option would scale poorly due to the size of our platform matrix, which spanned XP with no service pack through Windows 7 SP1 and IE6 through IE9.
  • I managed combined teams of software engineers and software test engineers via SCRUM to maintain the per person productivity and transparency to which upper management was accustomed from employing micro-managed cowboy coding that was occasionally structured enough to be called waterfall.
  • I freed up my VP’s time by getting him to delegate his hiring manager duties to me. Initially I continued to perform the same candidate screening process, but after observing a number of opportunities for improvement, I drove getting approval for my recommendations, which resulted in reducing by half the number of candidates brought on site who were not provided an offer.
  • I acquired Perforce and MSDN subscriptions for all members of the client group, to provide the group with a source control solution that scales out better than VSS and also provides for increased productivity through its much more advanced feature set. As engineers typically prefer to work with the best tools, utilizing decade-old unsupported tools was hampering recruiting. I both led and made a large contribution to migrating the code base from VSS and VS98 to Perforce and VS2010.
  • Individual Contributions: Implemented window.localStorage for use in browser versions that lacked that DOM object, as well as unit tests for it. Implemented real-time window replication and window stealing to allow a single window to appear in many window hierarchies across several processes, as well as pseudo unit tests for it. Fixed bugs in the screen saver product and in the browser toolbars for Internet Explorer and Firefox. Made changes necessitated by breaking changes in the C++ language and libraries.
  • Implemented scripts to automate repetitive and error prone tasks.

Consultant

Confidential, Reno, NV

Technical Skills, Technologies, and Tools Utilized: Eclipse, SVN, sFTP, PuTTY, SSH, OpenOffice, Linux, NoSQL, Windows XP, reviewed code written in Java that utilized MySQL and Apache Tomcat.

Responsibilities:

  • Leadership - Social Networking Requirements: Led gathering, creating, and documenting the requirements for the features and scenarios of the social networking component.
  • Leadership - Client-Server-Datastore Architecture: Developed and documented a detailed plan for evolving the architecture as the company grew, covering common mistakes online startups make and the important characteristics to consider when selecting data stores for different types of data flows and models. Also surveyed the characteristics of the many datastores available at the time and thoroughly documented the characteristics of the one most likely to be relevant.
  • Leadership - Social Networking Data Model and Flows: Led the meetings that generated the initial data model and data flows for the future social networking component and then drove refining both. Co-authored the data model and flows document in the process.
  • Note: Confidential is no longer operating.

Senior Software Engineer / Team Lead

Confidential, Reno, NV

Technical Skills, Technologies, and Tools Utilized: C#, Python, C++, .Net, Visual Studio 2005 and 2008, WingIDE, SourceInsight, WinForms, WPF, ANTS Performance Profiler, Internet Explorer 7 and IE 8 (embedded in app), COM, Confidential Active Accessibility (MSAA), UI Automation (i.e. .Net version of MSAA), Perforce, OpenOffice, Zimbra, Jira, Windows XP and Win 7.

Responsibilities:

  • Led initial development of the product for a new customer that became the company’s new flagship product. Developed the .Net add-on component model and core .Net API for both products and led a virtual team of 5, including a team lead, through adopting the architecture.
  • Leadership - Software Security: Founding member of software security committee. Key reviewer in all security focused design and code reviews.
  • Leadership - Software Performance: Led two separate performance drives, one for each of the two variants of the company’s flagship product. Both efforts improved target metrics by several factors and required driving 8 other developers, including team leads, to complete performance improvement tasks.
  • UI Automation API: The API’s purpose is to provide a fast API for automated UI tests and a user-training simulator. Designed the API, solicited reviews of the design, implemented the API, implemented initial UI element registration/unregistration and helper libraries, implemented initial automated UI tests and helper libraries, implemented diagnostic tools, created documentation for implementing UI element registration/unregistration and automated UI tests, and transitioned everything but the API implementation to the UI development team and QA. Prior to this, had attempted to use the Win32 (i.e. window handles), MSAA, and UI Automation APIs provided by .Net and IE for automated UI testing via a UI crawler (conceptually similar to a web crawler), but this proved infeasible due to performance and element identity issues.
  • Leadership - Mentoring: Mentored a junior developer in Windows platform-specific dev practices and general good development practices. Mentored a junior PM in software security, project management, and organizational agility.
