In the dynamic world of machine learning, the process often feels repetitive: coding, waiting for results, interpreting those results, and then diving back into coding. Even with these familiar steps, there’s always more to learn. Over the past few years, I’ve adopted a habit of documenting key lessons from my ML experiences. As I reflect on the insights from this month, three practical lessons stand out, each emphasizing a different aspect of improving productivity and efficiency in machine learning projects.
- Keep logging simple
- Use an experimental notebook
- Keep overnight runs in mind
Keep Logging Simple
For several years, I heavily relied on Weights & Biases (W&B)* for my experiment logging. At one point, I ranked in the top 5% of all active users, having trained close to 25,000 models, utilized approximately 5,000 hours of compute, and conducted over 500 hyperparameter searches. W&B was a crucial tool for my research, whether for large-scale projects like weather prediction or for numerous smaller-scale experiments.
While W&B excels at visually appealing dashboards, particularly for team collaboration**, I found that for many of my projects it was more than I required. I rarely revisited individual runs, and once a project was finished, the logs simply sat unused. When refactoring my data reconstruction project, I removed the W&B integration, not because it was ineffective, but because it was unnecessary for my needs.
Now, I have streamlined my logging process. I log essential metrics to CSV and text files directly on disk. For hyperparameter searches, I use Optuna—not the distributed version with a central server, but a local version that saves study states to a pickle file. If an error occurs, I can easily reload and continue. This pragmatic approach meets my requirements without unnecessary complexity.
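A minimal sketch of what this kind of on-disk logging can look like (the class name, file names, and metric fields here are illustrative, not taken from my actual code):

```python
import csv
from pathlib import Path

class CSVLogger:
    """Append per-step metrics to a CSV file on disk."""

    def __init__(self, path, fieldnames):
        self.path = Path(path)
        self.fieldnames = fieldnames
        # Write the header only if the file does not exist yet,
        # so an interrupted run can resume by appending.
        if not self.path.exists():
            with self.path.open("w", newline="") as f:
                csv.DictWriter(f, fieldnames=fieldnames).writeheader()

    def log(self, **metrics):
        # Append one row; keys must match the declared fieldnames.
        with self.path.open("a", newline="") as f:
            csv.DictWriter(f, fieldnames=self.fieldnames).writerow(metrics)

logger = CSVLogger("metrics.csv", ["epoch", "train_loss", "val_loss"])
logger.log(epoch=1, train_loss=0.52, val_loss=0.61)
logger.log(epoch=2, train_loss=0.41, val_loss=0.55)
```

The appeal is that the result is a plain file you can open with any tool, version, or grep, with no server or account involved.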
The key insight here is that logging should be a support system, not the focus. Spending excessive time deciding what to log—whether it be gradients, weights, or distributions—can divert your attention from the actual research. For my purposes, a simple, localized logging system sufficiently covers all needs with minimal setup effort.
Maintain Experimental Lab Notebooks
In December 1939, William Shockley famously recorded in his lab notebook the idea of replacing vacuum tubes with semiconductors. Seventeen years later, he and two colleagues shared the Nobel Prize in Physics for the invention of the transistor. While our notebooks may not reach Nobel-worthy heights, there is certainly much to learn from the practice.
In machine learning, our laboratories often consist solely of our computers—devices that have trained numerous models over the years. These labs are highly portable, especially when we’re using remote high-performance compute clusters that run 24/7, allowing us to run experiments at any hour.
But which experiment should you run? A former colleague introduced me to the concept of maintaining a lab notebook, which I have recently revisited in its simplest form. Before starting long-running experiments, I jot down what I’m testing and why. This simple practice changes the workflow dramatically. When I return the next day, I immediately see which results are ready and what I hoped to learn. It transitions experimentation from an “endless loop” to a structured feedback process where failures are easier to analyze and successes easier to replicate.
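In its simplest form this can be a plain text file you append to before each launch. As one possible way to keep the habit frictionless, here is a tiny helper sketch (the function name, file name, and entry format are my own invention, not from any particular tool):

```python
from datetime import datetime
from pathlib import Path

def record_experiment(notebook="lab_notebook.md", *, testing, hypothesis):
    """Append a timestamped entry describing what a run tests and why."""
    entry = (
        f"\n## {datetime.now():%Y-%m-%d %H:%M}\n"
        f"- Testing: {testing}\n"
        f"- Hypothesis: {hypothesis}\n"
    )
    with Path(notebook).open("a") as f:
        f.write(entry)

# Example entry written just before kicking off an overnight run.
record_experiment(
    testing="lower learning rate (3e-4 vs 1e-3)",
    hypothesis="training curve flattens later, better final val loss",
)
```

Calling this once before every long-running job is enough: the next morning, the notebook tells you which results are ready and what each one was supposed to show.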
Run Experiments Overnight
One small but impactful lesson from this month was a painful reminder of how valuable the overnight hours are. Last Friday, I uncovered a bug that could have skewed my experiment results. After patching it, I reran the experiments, but by morning I realized I had overlooked an essential ablation, which cost me yet another day of waiting.
In machine learning, every minute counts, and while we rest, our experiments should be working. A night without an experiment running is potentially wasted compute. This doesn't mean launching experiments impulsively, but when a meaningful experiment is ready, the evening is an ideal time to start it: most clusters are under-utilized in the late hours, so jobs are scheduled more quickly and results are ready for analysis the following morning.
To make the most of this, intentional planning is essential. As Cal Newport suggests in his book, “Deep Work,” effective workdays begin the night before. By knowing your upcoming tasks, you can efficiently set up the right experiments in advance.
* This is not a critique of W&B itself; rather, it is an encouragement to evaluate your project's objectives carefully and to spend your time primarily on achieving them.
** In my view, collaboration alone does not justify the effort of setting up and maintaining shared dashboards; the insights gained must outweigh the time spent.
Inspired by: Source

