Over these past few weeks, I have started to implement the custom loss function for our groundwater level prediction neural network. It’s been exciting to finally get started working with the code and learn more about the hydrology aspects of the project. The goal of implementing a custom loss function in our neural network is to enforce conservation of mass, which will hopefully improve the accuracy of our groundwater level predictions.
Originally, the inputs to our neural network were 10 soil moisture content values in a soil column at Butte County (see red dots in the diagram below), precipitation, and recharge (or the amount of water leaving the soil column). The outputs of the neural network were the next day’s 10 soil moisture content values and recharge. For mass to be conserved, the difference in total water content (TWC), TWC(t+1) - TWC(t), should be equal to precipitation(t) - recharge(t). In other words, the change in the water stored in the column from one day to the next should equal the water that came in minus the water that went out during that day.
Diagram Courtesy of Reetik Sahu
The mass conservation error is the amount by which this balance fails to hold (it is zero when mass is conserved), and it is what I add to our loss function. Adding this term to our loss function ensures that predictions that don’t conserve mass are penalized more, and helps our neural network learn to follow conservation of mass.
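To make the balance concrete, here is a minimal sketch of the mass conservation error in plain Python. The function names and the sample numbers are made up for illustration, and it assumes that TWC, precipitation, and recharge are all expressed in the same units (for example, a depth of water per day); it is not the project's actual code.

```python
def mass_conservation_error(twc_t, twc_next, precipitation_t, recharge_t):
    """Residual of the water balance; zero when mass is conserved."""
    storage_change = twc_next - twc_t          # TWC(t+1) - TWC(t)
    net_inflow = precipitation_t - recharge_t  # precipitation(t) - recharge(t)
    return storage_change - net_inflow


def mean_squared_mass_error(samples):
    """Mean of the squared residuals over (twc_t, twc_next, precip, recharge) tuples."""
    residuals = [mass_conservation_error(*sample) for sample in samples]
    return sum(r * r for r in residuals) / len(residuals)


# Hypothetical values, chosen only to show one balanced and one unbalanced day.
balanced = mass_conservation_error(10.0, 10.5, 2.0, 1.5)  # storage +0.5, net inflow +0.5
unbalanced = mass_conservation_error(10.0, 11.0, 2.0, 1.5)  # storage +1.0, net inflow +0.5
print(balanced, unbalanced)
```

The first day conserves mass, so its residual is zero; the second gains more water than actually entered the column, so its residual is positive and would be penalized.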
At each prediction, we used a trapezoidal rule to find the averaged TWC in the soil column from the 10 soil moisture values.
Diagram Courtesy of Reetik Sahu
The 10 soil moisture values were weighted (see delta L terms in the above formula) based on their distance from the next soil moisture value (e.g. if the first soil moisture value was at depth 2 and the next soil moisture value was at depth 5, the weight for their portion of the soil moisture column would be their average times 3).
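Here is a rough sketch of that weighting in plain Python. The function name, depths, and moisture values are hypothetical. Since the exact formula lives in the diagram, I'm also assuming here that the weighted segments are simply summed; dividing the sum by the total depth spanned would give a depth-averaged value instead.

```python
def trapezoid_water_content(depths, moisture):
    """Sum of (average of adjacent moisture values) * (distance between their depths)."""
    if len(depths) != len(moisture) or len(depths) < 2:
        raise ValueError("need at least two depths and one moisture value per depth")
    total = 0.0
    for i in range(len(depths) - 1):
        delta_l = depths[i + 1] - depths[i]                # thickness of this segment
        segment_mean = 0.5 * (moisture[i] + moisture[i + 1])
        total += segment_mean * delta_l
    return total


# Hypothetical column with three measurement depths instead of ten.
depths = [2.0, 5.0, 9.0]
moisture = [0.25, 0.5, 0.125]
twc = trapezoid_water_content(depths, moisture)
# First segment: average 0.375 times 3; second segment: average 0.3125 times 4.
print(twc)
```

The first segment matches the example above: two values at depths 2 and 5 contribute their average times 3.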
However, we abandoned the idea of averaging the soil moisture values at each iteration of the deep learning model. Instead, we apply this averaging once to our initial soil moisture values and feed the single averaged value into the model, which then predicts one soil moisture value for the next day. This simplified our model and removed the headache of conserving mass at each level of the soil column (i.e. between the soil moisture values at each depth). Now we input one soil moisture value at day t, precipitation at day t, and recharge at day t, and predict soil moisture at day t + 1 and recharge at day t + 1.
Some of the issues I ran into while writing this loss function had to do with the way standard Keras (a deep learning framework) loss functions are set up. I wanted to add the mean squared error of the mass conservation error to the loss function, but in order to do this, I needed access to the corresponding input values. A standard Keras loss function only takes in the true labels and your model’s predictions, so I needed some way to map each prediction in the current batch back to the inputs that produced it. The data was shuffled at the beginning of the program, so there was no way to know which input row a given prediction corresponded to.
I explored many different ways to modify the custom Keras loss function, most of which recommended that I wrap the loss function in an outer loss function that I could send arguments to and access within the inner loss function. This was a good way to access static variables (variables that wouldn’t change and could be sent in once, initially) but not a very good way to retrieve the corresponding input value for the current prediction.
One way to access dynamically changing variables would be to send in a generic input tensor variable (https://stackoverflow.com/questions/46464549/keras-custom-loss-function-accessing-current-input-pattern), but I couldn’t get this to work with the dimensions of our data.
After lots of back and forth between different methods, I concluded that I could concatenate the input data with the true labels (since these are sent into the Keras loss function). In this way, the input data would necessarily be matched up with its corresponding label. Now I was able to access the TWC, recharge, and precipitation at day t to calculate the mass conservation error.
I did end up using an outer function in order to access static variables I needed to un-normalize the data. I also added two hyperparameter terms to the outer function that are multipliers in front of the empirical loss term (the difference between the actual and predicted values) and the mass conservation error term. These hyperparameters help us determine how much to weight each of these terms in order to get the best predictions (and smallest loss).
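Putting the last few paragraphs together, the overall shape of the loss looks something like the sketch below. This is a simplified illustration rather than our actual code: the function name, the column order of the packed labels, the `stats` dictionary, and the use of z-score normalization are all assumptions made for the example, and how Keras handles labels and predictions of different widths may depend on the version you use.

```python
import tensorflow as tf


def make_mass_conserving_loss(stats, loss_multiplier=1.0, mass_multiplier=1.0):
    """Outer function: holds the static values and returns a Keras-style loss.

    `stats` (hypothetical) maps "twc", "precip" and "recharge" to (mean, std)
    pairs used to undo an assumed z-score normalization.
    """
    twc_mean, twc_std = stats["twc"]
    precip_mean, precip_std = stats["precip"]
    recharge_mean, recharge_std = stats["recharge"]

    def loss(y_true_packed, y_pred):
        # Assumed column layout of the packed labels:
        # 0: TWC(t+1), 1: recharge(t+1), 2: TWC(t), 3: precipitation(t), 4: recharge(t)
        labels = y_true_packed[:, 0:2]

        # Un-normalize the day-t inputs and the predicted TWC(t+1).
        twc_t = y_true_packed[:, 2] * twc_std + twc_mean
        precip_t = y_true_packed[:, 3] * precip_std + precip_mean
        recharge_t = y_true_packed[:, 4] * recharge_std + recharge_mean
        twc_next_pred = y_pred[:, 0] * twc_std + twc_mean

        # Empirical term: mean squared difference between labels and predictions.
        empirical = tf.reduce_mean(tf.square(labels - y_pred))

        # Mass term: mean squared residual of the water balance.
        residual = (twc_next_pred - twc_t) - (precip_t - recharge_t)
        mass = tf.reduce_mean(tf.square(residual))

        return loss_multiplier * empirical + mass_multiplier * mass

    return loss


# Tiny check with made-up numbers and identity normalization (mean 0, std 1).
identity = {"twc": (0.0, 1.0), "precip": (0.0, 1.0), "recharge": (0.0, 1.0)}
loss_fn = make_mass_conserving_loss(identity, loss_multiplier=1.0, mass_multiplier=2.0)
packed = tf.constant([[1.5, 0.5, 1.0, 1.0, 0.5]], dtype=tf.float32)
perfect = tf.constant([[1.5, 0.5]], dtype=tf.float32)
too_wet = tf.constant([[2.0, 0.5]], dtype=tf.float32)
print(float(loss_fn(packed, perfect)), float(loss_fn(packed, too_wet)))
```

With a loss shaped like this, the labels would be packed before training (for example by stacking the label columns and the day-t input columns side by side) and the returned function would be passed as the `loss` argument when compiling the model. Because the inputs travel in the same row as their label, shuffling can no longer separate them.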
Right now, I’m trying to make sense of our results and explore different weights on the mass conservation error term. The best weights seem to be 4 for both the loss multiplier and mass multiplier. Below are diagrams comparing the predicted total water content (TWC) and recharge with different multipliers on the loss and mass conservation term as compared to the HYDRUS model (dark blue), which we are trying to match.
In the above diagrams, you can see that setting both multipliers to 4 gives an almost perfect result.
I’m a little wary of these near perfect results, but I'm going to meet with my team tomorrow to discuss my progress and talk about what I should be working on next. I'm really enjoying the iterative process of updating the model and implementing new things each week. I also attended a summer students' social for LBNL last week where summer interns got to share a little bit about ourselves and the work we're doing. It was fun to learn more about the other interns and their journey to where they are now.
Until next time!