Nicholas Johnson

AI Researcher @ MIT, Princeton, Oxford - Professional Speaker


Nicholas André G. Johnson has conducted optimization and machine learning research at MIT, Princeton University, Oxford University and the Montreal Institute for Learning Algorithms. He is currently a PhD student in Operations Research (the study of how to make good decisions with limited information in uncertain environments) at MIT, where he is working towards a more unified theory of optimization and machine learning.

Nicholas holds an undergraduate degree with highest honours in operations research and financial engineering, with minors in computer science, statistics and machine learning, and applied and computational mathematics, from Princeton University. He was the Valedictorian of Princeton’s Class of 2020 and is the first Black Valedictorian in the University’s 274-year history. His undergraduate thesis focused on developing high-performance, efficient algorithms to solve a network-based optimization problem that models a community-based preventative health intervention designed to curb the prevalence of obesity in Canada.

Nicholas has interned as a software engineer at Google and as a quantitative developer at the D. E. Shaw Investment Group, and has conducted and presented international sustainable engineering projects at the United Nations Headquarters. He is a member of the Phi Beta Kappa, Tau Beta Pi and Sigma Xi honor societies. Nicholas has previously been featured by the New York Times, CNN, ABC News, Time and BET.


As a professional speaker, Nicholas is an advocate for educational attainment in marginalized communities and increased representation in STEM industries. Through his consulting work, Nicholas helps corporations leverage state-of-the-art technology to optimize their business operations.



Nicholas’ current research focuses on methodological and algorithmic contributions to discrete optimization, and on leveraging modern advances in discrete optimization to solve central machine learning problems exactly and at scale, without heuristics. He has a particular interest in applied problems in healthcare and finance. Nicholas is advised by Professor Dimitris Bertsimas at MIT. Some of his past research work is summarized below.

(Currently under review - link coming soon)

Sequential Stochastic Network Structure Optimization with Applications to Addressing Canada's Obesity Epidemic

In this work, we introduce a novel mathematical network model for community-level preventative health interventions. We develop algorithms to approximately solve this formulation at large scale and rigorously explore their theoretical properties. We create a realistic simulation environment for interventions designed to curb the prevalence of obesity in the region of Montreal, Canada, and use this environment to empirically evaluate the performance of the algorithms we develop.

Generating Privacy-Preserving Synthetic Datasets

Given some dataset containing potentially private user information, non-interactive private data release refers to the publishing of a perturbed dataset that preserves the privacy of individual users who have contributed to the true dataset. This framework raises a natural question: for a fixed privacy tolerance, how can the released dataset be constructed to maximize downstream utility for analytic tasks? In this work, I make use of Generative Adversarial Networks to develop generative models to solve this problem without making assumptions on the underlying data distribution.

General purpose adaptive optimization engine in R.

We present Optimus, a universal Monte Carlo optimization engine in R with acceptance ratio annealing, replica exchange and adaptive thermoregulation. It can interface with any model definition and optimize the model’s parameters by exploring the parameter space efficiently. Optimus can execute either an acceptance ratio annealing procedure or a replica exchange procedure, depending on the user’s choice.
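As a rough illustration of the two textbook acceptance rules that underlie Monte Carlo optimizers of this kind, here is a minimal Python sketch. It is not Optimus's R interface, and it does not show acceptance ratio annealing or adaptive thermoregulation. The function names and the "energy" terminology (the cost being minimized) are assumptions made for this example only.

```python
import math


def metropolis_accept_prob(delta_energy, temperature):
    """Probability of accepting a proposal that changes the cost by delta_energy.

    Improvements (delta_energy <= 0) are always accepted; worse proposals are
    accepted with probability exp(-delta_energy / temperature).
    """
    if delta_energy <= 0:
        return 1.0
    return math.exp(-delta_energy / temperature)


def replica_swap_prob(energy_i, energy_j, temp_i, temp_j):
    """Probability of exchanging the states of two replicas.

    Standard replica exchange rule:
    min(1, exp((1/temp_i - 1/temp_j) * (energy_i - energy_j))).
    """
    exponent = (1.0 / temp_i - 1.0 / temp_j) * (energy_i - energy_j)
    if exponent >= 0:
        return 1.0
    return math.exp(exponent)
```

Under the swap rule, a cold replica holding a worse state than a hot replica always exchanges with it, which is how good states found at high temperature migrate towards the low-temperature chains.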



Robust and Prescriptive Portfolio Selection

Markowitz mean-variance portfolio optimization is a highly acclaimed approach to the classic portfolio selection problem. However, the portfolio produced by the Markowitz model is highly sensitive to the estimation of its input parameters, in particular the vector of marginal asset returns mu. In this work, we study how the Markowitz model can be extended to produce portfolios that are less sensitive to estimation errors in mu. Among other results, we prove that augmenting the Markowitz model objective with an L2 regularization term is equivalent to solving a robust optimization problem that allows for adversarial errors in mu. This robust Markowitz model leads to greater out-of-sample performance and more stable portfolios.
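One standard way to see why such an equivalence can hold is sketched below. The uncertainty set, the exact form of the penalty, and the symbols \hat{\mu} (estimated returns), \rho (error budget), \gamma (risk aversion), \Sigma (covariance) and w (portfolio weights) are assumptions chosen for illustration; the abstract does not specify them, so this is not necessarily the formulation proved in the work.

```latex
% Assumed uncertainty set: an L2 ball of radius \rho around the estimate \hat{\mu}.
\mathcal{U} = \{\, \hat{\mu} + \delta \;:\; \|\delta\|_2 \le \rho \,\}

% Worst-case return of a fixed portfolio w. By the Cauchy-Schwarz inequality,
% the adversary's best choice is \delta^\star = -\rho \, w / \|w\|_2 (for w \ne 0):
\min_{\mu \in \mathcal{U}} \mu^\top w
  = \hat{\mu}^\top w + \min_{\|\delta\|_2 \le \rho} \delta^\top w
  = \hat{\mu}^\top w - \rho \, \|w\|_2

% Hence the robust mean-variance problem is the nominal one plus an L2 penalty:
\max_{w} \; \min_{\mu \in \mathcal{U}} \; \mu^\top w - \frac{\gamma}{2} \, w^\top \Sigma \, w
  \;=\;
\max_{w} \; \hat{\mu}^\top w - \frac{\gamma}{2} \, w^\top \Sigma \, w - \rho \, \|w\|_2
```

In this sketch the regularization weight \rho is exactly the size of the estimation error the portfolio is protected against, which gives the penalty a direct interpretation.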

Predicting BIXI Montreal Bikeshare Hourly Volumes

In response to concerns over traffic congestion, pollution, and climate change, many cities are subsidizing bikeshare systems to encourage a greener mode of transportation. As more consumers rely on such systems as a primary means of transit, it is crucial for bikeshare operators to maintain and restock bike stations in real time. Accurate trip volume forecasting is therefore essential to the successful operation of any bikesharing system. Using Montreal’s public bikesharing system BIXI as a case study, we set out to answer the question: for a given hour on any day of the year, and given the precipitation forecast, how well can we predict the number of BIXI bikes on the road in Montreal?

Approximation Theorem for ReLU-Based Neural Networks

The universal approximation theorem states that any continuous function on a compact set can be approximated with arbitrarily small error by a feedforward neural network with a single hidden layer containing a finite number of neurons that uses any continuous sigmoidal function as an activation function. However, many neural networks employed in practice use the ReLU activation function which, although continuous, is not sigmoidal. In this work, we develop a universal approximation theorem for ReLU feedforward neural networks that also includes a bound on the number of neurons required in the network.
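To give a feel for why ReLU networks can approximate continuous functions, the following Python sketch builds a one-dimensional, single-hidden-layer ReLU network that reproduces the piecewise-linear interpolant of a function on n equal subintervals. This is a textbook construction offered under stated assumptions, not the theorem or the neuron bound developed in the work; the function names and the choice of x^2 on [0, 1] are made up for the example.

```python
def relu(z):
    """ReLU activation: max(z, 0)."""
    return z if z > 0.0 else 0.0


def build_relu_interpolant(f, a, b, n):
    """Return a single-hidden-layer ReLU network with n hidden neurons.

    On [a, b] the network equals the piecewise-linear interpolant of f at
    n + 1 equally spaced knots. Neuron k computes relu(x - knot_k), and its
    output weight is the change in slope of the interpolant at that knot.
    """
    h = (b - a) / n
    knots = [a + k * h for k in range(n)]
    slopes = [(f(a + (k + 1) * h) - f(a + k * h)) / h for k in range(n)]
    weights = [slopes[0]] + [slopes[k] - slopes[k - 1] for k in range(1, n)]
    bias = f(a)

    def network(x):
        return bias + sum(w * relu(x - t) for w, t in zip(weights, knots))

    return network


def square(x):
    return x * x


# Approximate x^2 on [0, 1] with 10 hidden neurons and measure the error on a grid.
net = build_relu_interpolant(square, 0.0, 1.0, 10)
grid = [i / 1000 for i in range(1001)]
max_error = max(abs(net(x) - square(x)) for x in grid)
```

For x^2 the interpolation error on a subinterval of width h is at most h^2 / 4, so doubling the number of neurons cuts the worst-case error by a factor of four in this toy setting.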


In The News


Work with me

Nicholas A. G. Johnson

Ph.D. Candidate - Operations Research

Massachusetts Institute of Technology

B.S.E. Operations Research & Financial Engineering

Princeton University

Connect on Socials:

  • Instagram
  • Twitter
  • LinkedIn
