Blog #3: Getting Set Up

Third blog post! Overall, these past two weeks have been much calmer than the first four. I mainly focused on getting my computer ready to access the data so I can start helping out the data science team next week.

Before going into detail about these past two weeks, I would like to take a quick detour and talk about my fourth week. During our fourth week at PingThings, we had our usual “Lunch and Learn,” this time with Jeff Maxim. Like our other “Lunch and Learns,” this one was great. To give a bit of background, Jeff is the lead of the front-end development team at PingThings. One of the things the team is currently working on is the plotter that will display all the time series data from the electric grid sensors. Like Allen, Jeff did not study computer science during his undergrad; he studied to become a teacher. This came as a surprise to me, since the two fields seem so different from each other. Jeff told us that he decided to switch careers after getting programming experience from managing and setting up the computers at the school where he worked. Over the following years, he continued teaching himself computer science before making the switch. His story was very inspiring to me. It reminded me that it is never too late to steer one’s career toward something one feels more passionate about, and that not all of our existing skills have to go to waste when we do. As Jeff mentioned, his unusual teaching background gave him qualities and experiences that few other programmers have. In particular, as team lead, he has been able to leverage his communication and teaching skills to make sure everyone on his team understands their tasks and can carry them out correctly. As our talk reached its end, we had the opportunity to ask him various questions about his experience, his job, and his transition between careers. Overall, the talk with Jeff was fantastic!

The following week (my 5th), I started setting up my computer to access the data streams collected from different PMUs across the electric grid. As part of the process, I needed a special username and password, plus a VPN connection, to access the different repositories. I also had to install the BTrDB package and a few others in order to access the data streams. Unsurprisingly, this process took a couple of days, so by the time I finally had everything set up, it was already the 6th week.

During this past week, I familiarized myself with the BTrDB package by working through the introductory Jupyter notebooks that are given to anyone starting at the company. They go over the basics of selecting different streams, extracting data, and plotting it, and the last notebook assigns several small projects to reinforce those concepts (the first code sketch below shows the kind of stream access they cover). As the 6th week comes to a close, I am finishing up those small projects.

In addition to the notebook assignments, I have also started verifying events in the data streams. Why do events need to be verified? Our PMU data streams come labeled with events and their corresponding times. For instance, if a tree hits a power line and the voltage drops, the electrical company labels that event and its time, and those labels arrive along with the data we receive. Unfortunately, some of the events are mislabeled or have incorrect timestamps, so we never actually see them in the data. Hence, we need to verify that the labeled events are in fact real events, which usually appear as a sudden spike, drop, or jump in the data (the second sketch below illustrates one way such a check might look). Our ultimate goal is to use the labeled data to train a classification algorithm that labels all the data streams without any manual work. Hopefully, by the start of next week I will have all the programming tasks from the notebooks done, and I will be able to help the data science team implement the classification algorithm to detect events such as power outages, power spikes, and more.
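To give a flavor of what those introductory notebooks cover, here is a minimal sketch of selecting a stream, pulling raw values, and plotting them with the btrdb Python package. The endpoint, API key, and collection name are placeholders I made up, and the calls are based on the public btrdb-python documentation rather than the company’s own notebooks:

```python
import btrdb
from btrdb.utils.timez import to_nanoseconds
from datetime import datetime, timezone
import matplotlib.pyplot as plt

# Connect to the BTrDB cluster (endpoint and API key are placeholders).
conn = btrdb.connect("api.example.pingthings.io:4411", apikey="MY_API_KEY")

# List the streams in a collection (collection name is hypothetical).
streams = conn.streams_in_collection("sensors/pmu1")
for s in streams:
    print(s.collection, s.name, s.uuid)

# Pull raw values from the first stream over a one-minute window.
# BTrDB timestamps are in nanoseconds since the Unix epoch.
stream = streams[0]
start = to_nanoseconds(datetime(2021, 7, 1, 12, 0, tzinfo=timezone.utc))
end = to_nanoseconds(datetime(2021, 7, 1, 12, 1, tzinfo=timezone.utc))
points = stream.values(start, end)  # list of (RawPoint, version) tuples

# Plot the raw values against their nanosecond timestamps.
times = [p.time for p, _ in points]
values = [p.value for p, _ in points]
plt.plot(times, values)
plt.xlabel("time (ns)")
plt.ylabel(stream.name)
plt.show()
```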
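And to make the event-verification step concrete, below is a purely illustrative helper for checking whether a labeled event is actually visible near its labeled timestamp. The function name, window size, and threshold are my own assumptions, not the team’s actual procedure; it reuses the `stream.values` call from the sketch above:

```python
import numpy as np

def event_visible(stream, event_ns, window_ns=5_000_000_000, thresh=5.0):
    """Heuristic sketch (my own, not PingThings' method): does the stream
    show a sudden spike, drop, or jump near the labeled timestamp?"""
    points = stream.values(event_ns - window_ns, event_ns + window_ns)
    vals = np.array([p.value for p, _ in points])
    if vals.size < 10:
        return False  # too little data around the label to decide

    # Look at sample-to-sample changes; a real event should produce a
    # change far larger than the typical variation in the window.
    diffs = np.abs(np.diff(vals))
    baseline = np.median(diffs)
    spread = np.median(np.abs(diffs - baseline)) + 1e-12  # robust spread (MAD)
    return bool(diffs.max() > baseline + thresh * spread)
```

A labeled event that fails a check like this would then be inspected by hand before being trusted as training data for the classifier.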