My summer internship at Sandtable

The ten weeks I have spent at Sandtable as a Software Engineer intern have truly shaped me, both as a programmer and as a person. My role was to develop two services in Python: one to bill users for the costs accumulated by the clusters that execute their models, and one to find third-party prices for the instances that those clusters consist of. This internship gave me the opportunity to take two microservices through their entire development cycle, introduced me to a wide range of software that I had not encountered before, and gave me a wealth of experience from those I have had the great pleasure to work with.

My first week was spent learning about the environment I would be using. Having never used Kubernetes before, nor other software used by Sandtable such as Bitbucket, I had to bring myself up to speed. Learning how to use a Mac was a relatively small hurdle in comparison, but it was nevertheless something I had to pick up as I went along, and it changed my opinion of an operating system that I had not previously considered ideal for programming.

Following this, a talk with Thomas led me to begin researching the area I would create a service for: the instances that run the models behind simulations executed by users, which could be bought either on an hourly basis or based on availability. Learning about the differences between these types of instances was vital for constructing a pricing service that could retrieve their prices from the third party used by Sandtable to host them.

This service used Flask to handle incoming requests from the user interface, and we used Trello to organise and assign tasks.

As the service developed, the API was adjusted to accept the names of cluster configurations and return the prices of the worker and master instances corresponding to each configuration. The worker instances executed the model, while the master instance managed the workers. The master instance would be identified from another internal service.

During these weeks, I was faced with the problem of how best to test my code. Pytest is a sophisticated testing library for Python that I had no prior experience with. Delving into the library to develop a large suite of tests, including a variety of mock tests, was challenging but well worth the effort because of how it shaped the development of my services.

Learning how to work with Jenkins and Kubernetes was another early problem that I faced. Jenkins is used for deploying software and running pipelines, and Kubernetes is used for software deployment; the two were used together throughout my placement, and I had never used either before. Thankfully our small yet knowledgeable Dev team was always there when I needed help in these areas, particularly Adam for setting up in Jenkins, and James for setting up in Kubernetes when I began to interact with the system more and track the status of currently running services.

I was also really grateful to Thomas, our CTO, for the time he put in to get me developing. He always made time to make sure that any problems I had could be easily resolved, and provided great advice to develop cleaner code and more regularly follow common patterns.

After the first few weeks, the pricing service began to take shape, and my talks with Thomas led me to start development on a billing service that would take the given instance prices that the pricing service extracted, and calculate the total running cost of a given cluster based on its uptime. The billing service was easier to begin developing as I was now familiar with the software that I would be using.
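
To make the calculation described above concrete, here is a minimal sketch of how a running cost could be derived from instance prices and uptime. It is an illustration rather than the actual Sandtable implementation: the function name, its parameters, and the assumption that cost is prorated linearly per hour with a single master instance are all hypothetical.

```python
from datetime import timedelta
from decimal import Decimal


def cluster_running_cost(
    worker_hourly_price: Decimal,
    master_hourly_price: Decimal,
    worker_count: int,
    uptime: timedelta,
) -> Decimal:
    """Approximate the cost of a cluster from hourly prices and uptime.

    Assumption: every instance is billed linearly by the hour and the
    cluster has exactly one master instance.
    """
    # Whole seconds of uptime, converted to (fractional) hours.
    seconds = uptime // timedelta(seconds=1)
    hours = Decimal(seconds) / Decimal(3600)

    # Combined hourly price of all workers plus the master.
    hourly_total = worker_hourly_price * worker_count + master_hourly_price

    # Round to two decimal places for display.
    return (hourly_total * hours).quantize(Decimal("0.01"))
```

Using `Decimal` instead of floats avoids small rounding surprises when prices are summed over many clusters.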

Over the course of the internship, the billing service grew to support a variety of search fields for selecting which usage to bill. A user could filter by a given email, organisation name, project name, or the IDs of one or more clusters (which would separate the billing differently), or provide a start and end date, in which case the relevant bills would be constructed per cluster for that range.
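
The filtering described above could look roughly like the sketch below. The `UsageRecord` fields and the `filter_usage` function are hypothetical names chosen for illustration, and the rule that a record matches a date range when its running period overlaps that range is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class UsageRecord:
    """One period of cluster usage (hypothetical shape)."""
    cluster_id: str
    email: str
    organisation: str
    project: str
    started: datetime
    stopped: datetime


def filter_usage(
    records: Iterable[UsageRecord],
    email: Optional[str] = None,
    organisation: Optional[str] = None,
    project: Optional[str] = None,
    cluster_ids: Optional[Set[str]] = None,
    start: Optional[datetime] = None,
    end: Optional[datetime] = None,
) -> List[UsageRecord]:
    """Keep the records matching every filter that was supplied."""
    selected = []
    for record in records:
        if email is not None and record.email != email:
            continue
        if organisation is not None and record.organisation != organisation:
            continue
        if project is not None and record.project != project:
            continue
        if cluster_ids is not None and record.cluster_id not in cluster_ids:
            continue
        # Assumed rule: keep records whose running period overlaps the range.
        if start is not None and record.stopped < start:
            continue
        if end is not None and record.started > end:
            continue
        selected.append(record)
    return selected
```

Filters left as `None` are ignored, so the same function serves a query by email alone or by a combination of fields.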

Like the pricing service, the billing service needed to query the cluster management service so that it could build a database of usage information for each cluster, which could then be used to construct a bill when queried. A background greenlet, a type of coroutine used for concurrent programming, was added to make periodic calls to the cluster management service and keep this information up to date. Through adding the greenlet I learned much about concurrent programming in Python, and Thomas was particularly helpful when debugging greenlets and helping me to understand this library and area of concurrent programming.
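
For readers unfamiliar with greenlets, the pattern described above can be sketched as follows. This assumes the gevent library, which is one common way to run greenlets in Python; the function names, the `fetch_usage` callable, and the dictionary used as a store are all hypothetical stand-ins for the real service and database.

```python
from typing import Callable, Dict

import gevent


def poll_usage(
    fetch_usage: Callable[[], Dict[str, float]],
    store: Dict[str, float],
    interval_seconds: float,
) -> None:
    """Refresh the usage store forever, sleeping between calls."""
    while True:
        # fetch_usage stands in for a call to the cluster management service.
        store.update(fetch_usage())
        # gevent.sleep yields control so other greenlets can run meanwhile.
        gevent.sleep(interval_seconds)


def start_usage_poller(
    fetch_usage: Callable[[], Dict[str, float]],
    store: Dict[str, float],
    interval_seconds: float = 60.0,
) -> gevent.Greenlet:
    """Start the polling loop in a background greenlet and return it."""
    return gevent.spawn(poll_usage, fetch_usage, store, interval_seconds)
```

The returned greenlet can later be stopped with its `kill()` method. Because greenlets are cooperative, the loop only gives way to other work at points such as `gevent.sleep`, which is why the sleep uses gevent rather than `time.sleep`.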

Once the biller could be queried, it and the pricing service were integrated into the UI API service, which manages the requests received by the UI. Within the API, bills would be retrieved by querying the biller with a list of cluster IDs, which would return an approximate bill for each cluster. Instance prices would be retrieved from the pricing service to show users the approximate price of the instances they could use to run their models, where the price of a selected instance would include the price of its associated master instance.

Because the biller had its own database, a new set of tests was required to exercise the database directly instead of mocking the database calls. I used the Flask-SQLAlchemy library for these tests. They also needed a separate test database, which required me to set up multiple Docker Compose files to manage the different images that held the databases for local development and testing. Similarly, the Jenkinsfile had to be adapted to use a different database during testing.

Following this, a cache for bills was added to the billing service so that bills would not need to be regenerated every time the service was called. A cached bill would only be recalculated when the cluster's uptime had changed and more than 15 minutes had elapsed since the cached bill was created, so that the newly generated bill would be noticeably different. The cache would be initialised with the current bills.
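
The caching rule above can be sketched as a small class. This is an illustration under stated assumptions, not the code used in the billing service: `BillCache`, `CachedBill`, and the `generate` callback are hypothetical names, and bills are simplified to a single number per cluster.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, Dict

# A cached bill is only replaced once it is older than this.
MAX_AGE = timedelta(minutes=15)


@dataclass
class CachedBill:
    amount: float
    uptime: timedelta
    created: datetime


class BillCache:
    """Reuse bills unless uptime changed AND the entry is over 15 minutes old."""

    def __init__(self) -> None:
        self._entries: Dict[str, CachedBill] = {}

    def get_bill(
        self,
        cluster_id: str,
        uptime: timedelta,
        now: datetime,
        generate: Callable[[timedelta], float],
    ) -> float:
        entry = self._entries.get(cluster_id)
        if entry is not None:
            uptime_changed = uptime != entry.uptime
            too_old = now - entry.created > MAX_AGE
            if not (uptime_changed and too_old):
                # Still fresh enough: serve the cached amount.
                return entry.amount
        # No entry, or the entry is stale: generate and store a new bill.
        amount = generate(uptime)
        self._entries[cluster_id] = CachedBill(amount, uptime, now)
        return amount
```

Passing `now` in explicitly, rather than reading the clock inside the method, keeps the expiry rule easy to test.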

As I neared the end of my internship, I started to work on improvements to the user interface.

I began by implementing feedback on a panel that compared the dependencies of two different versions of an environment. The feedback was gathered from usability tests by Hannah, also a summer intern. This included adding information to the panel, such as the platform version and the version of Python being used, to more clearly distinguish the versions being compared. It also included adding buttons to a view panel, where you could select checkboxes for two environments and then choose to show the difference in dependencies between them, to indicate which environment was the newer and which was the older.

Then I developed components for a series of onboarding pages shown to users loading Sandman for the first time, and integrated them into some of these pages.

These tasks involved relearning React and TypeScript so that I could add and modify components for these panels. James was immensely helpful, introducing me to the repository, helping me to relearn what I needed to know to code for the UI, and unraveling why a component was not being rendered correctly. Hannah was also extremely helpful, providing me with her insights on how the designs of the components could be improved and how I could better show information when comparing dependencies for environments.

During the internship, there were also other noteworthy events. Games Night exposed me to a variety of board games that I had not seen before and that were thoroughly enjoyable to play. Journal Club showed me how the company’s models are calibrated, and how they were being used by other companies to show how businesses could improve their sales. There was also Coloretto, a card game that dominated the lunch times while I was at Sandtable, and a game that is easily more than the footnote it appears to be here.

As this article and my internship come to a close, I would like to thank everyone I have worked with at Sandtable for making this internship truly exceptional. I have learned about so many different programs, developed and deployed two services and learned about the software behind the deployments, enhanced my skills in Python, gained further exposure to React and TypeScript, and even learned more about the models created by the company. I feel honoured to have spent the internship with such fantastic people, who have always helped me when I needed it most and always made me feel like part of the team.

As James said, it’s been real.
