Over the past few weeks, I’ve been focusing on fixing various bugs in my physics-guided neural network implementation. One of the main bugs was unintentionally mutating the test portion of the input data. When splitting the dataset into training and test sets, the test data was not being deep copied, so the test split shared memory with the corresponding slice of the original data, and any change to the test data would in turn change the original data, which is a big problem! When predicting values for the test data, the model updates the next test data point to be the most recently predicted data point. The effect of the shallow copy was that the original test data was overwritten with our predicted values.
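As a minimal sketch of the bug (the array and split point here are hypothetical, not my actual dataset): in NumPy, slicing returns a view that shares memory with the original array, so an explicit `.copy()` is needed to get an independent test split.

```python
import numpy as np

# Hypothetical data: slicing off a "test" split returns a VIEW.
data = np.arange(10, dtype=float)
test_view = data[7:]          # shares memory with `data`
test_view[0] = -1.0           # overwriting a "prediction" here...
assert data[7] == -1.0        # ...silently corrupts the original data!

# The fix: deep copy the test split so it owns its own memory.
data = np.arange(10, dtype=float)
test_copy = data[7:].copy()   # independent array
test_copy[0] = -1.0
assert data[7] == 7.0         # original data is untouched
```

The same pitfall applies to plain Python lists (`test = data[7:]` on a list does copy, but `test = data` does not), which is part of why this kind of bug is easy to miss.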
One of the housekeeping tasks I worked on was the time-consuming integration of my custom loss function with the existing hyperparameter optimization (HPO) code. The HPO code tries out different parameter values for the neural network (e.g. number of hidden units, epochs, layers), training the network many times with different parameters to determine the best set (chosen based on the smallest loss value achieved). The exploration space of the HPO code is set by a lower and upper bound on the possible values for each parameter being optimized. Each bounded range is also paired with a step size, meaning the algorithm will only try hyperparameter values in increments of that step. For example, if the lower and upper bounds are 1 and 8 and the step size is 2, the possible values explored would be 1, 3, 5, and 7. I misunderstood what the array of step sizes meant and thought it instead just initialized the given parameters by scaling them, because in the sample code I had, they were all set to ones. After changing the step sizes to appropriate values, I got much better results.
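A quick sketch of the bounds-plus-step-size idea (the function name is mine, not from the HPO library): each hyperparameter's candidate values are generated from its lower bound up to its upper bound in increments of the step size.

```python
def search_space(lower, upper, step):
    """Candidate values explored for one hyperparameter:
    lower, lower + step, lower + 2*step, ... up to upper."""
    return list(range(lower, upper + 1, step))

# With bounds [1, 8] and a step size of 2, only four values are tried:
print(search_space(1, 8, 2))  # [1, 3, 5, 7]

# With a step size of 1 (what all-ones step arrays amount to),
# every integer in the range is explored instead:
print(search_space(1, 8, 1))  # [1, 2, 3, 4, 5, 6, 7, 8]
```

This also shows why the all-ones defaults in the sample code were misleading: a step of 1 makes the step array look like a no-op scaling factor.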
Some updates on the inputs to our neural network: the total water content (TWC) values for each day are now given directly by HYDRUS, whereas before we were calculating TWC values using the triangle inequality. The HYDRUS TWC values themselves have some mass conservation error associated with them, but much less than the calculated TWC values we were using previously.
Figure courtesy of Reetik Sahu
I am currently trying to improve the way I add the mass conservation error to the loss function. Originally, I used a strict constraint: the predicted TWC(t+1) – TWC(t) – vTop(t) + recharge(t) must equal 0, and any deviation gets added to the loss function as a mean squared error. However, in light of the small deviations from 0 that HYDRUS allows, I changed the loss function to include the mass conservation error term only if it violates the constraint by more than 40%, and I plan to tighten this to 10% and compare the results.
Above are the predictions with a 40% threshold on the mass conservation loss.
In other news, I’m preparing a poster for the LBNL CS Summer Student Program (CSSS) poster session on August 4. The CSSS has weekly seminars and talks, one of which was all about poster sessions and how to make a great poster. One of the most interesting ideas was to color-code your outfit with your poster, and a research team studied the effectiveness of doing so (http://betterposters.blogspot.com/2012/03/colour-clash.html). Obviously, the poster session this year will be virtual. We’ll record a 5-minute introduction video for our poster and have breakout rooms on the day of the actual poster session. I look forward to seeing other students’ posters and putting the work I’ve been doing into a visual form that encompasses all I’ve done this summer!
One of the highlights of this week was an afternoon virtual happy hour with fellow CSSS students who are also doing remote internships at LBNL. It was fun to have an open conversation with my peers about how working from home and quarantine have been for all of us. Remote work is harder because you have to separate your home space from your work space, which can be difficult to do. I am still trying to find the perfect balance, which for me consists of making weekly schedules and sticking to set work hours. I do miss the camaraderie of going to campus every day and having in-person conversations, but this happy hour and our all-team happy hours are a safe way to be social that I look forward to.
Until next time,