software

I like CAD work and physically building test equipment. However, the magic is usually in the software. Whether it is orchestrating timing-sensitive multiïnstrument experiments, automating data pipelines, or memory-optimizing analysis, I don't have a preference: designing software and writing code is what gets me fired up. Fortunately, battery acoustics is data- and software-intensive, which is, I think, a large reason why I've been so fulfilled in my grad school research.

I want to mention that my PI, Dan Steingart, is a coding mage and has curated an unreal infrastructure (centralized, automated data storage, MQTT clients, electronics galore, and even a whole IDE) that really frees us on the floor to build cool stuff on top of it. Clüb Steingart is a wonderland.

I break this part into two sections, one for experimental design and the other for analysis. I include links to the repositories where appropriate, though unfortunately not for all of them, as some projects are still in progress.

In general, I refer to our lab's Git homepage, which I maintain, and my personal GitHub page, which is not just battery code but a hodgepodge of geophysics, ML classwork, and personal projects.

experimental

battery tomography: take a (mechanical) picture

I touch on this project across all three sections: software, instrumentation, and insights. That is for a good reason as it is quite involved—at least for me!

Without getting too much into the details—the project is ongoing—we raster scan batteries to get a spatiotemporal map of their chemomechanics. Think of a fetal ultrasound scan on a budget—no custom hardware.

At a high level, it entails controlling the movement of a deconstructed 3D printer and multiple pulsers, oscilloscopes, and transducers down to the microsecond. There are a bunch of features there that warrant mentioning, but I'll touch only on a couple. The most performance-sensitive part, the oscilloscope interface, is written in Python because it's not a bottleneck if written carefully. The experiment control interface, called Sfogliatella (see screenshot on the right), has a GUI with some neat little features.
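To make the raster-scan idea concrete, here is a minimal sketch of how scan positions could be generated. It assumes a serpentine (back-and-forth) path on a regular grid; the function name, the grid parameters, and the path shape are illustrative assumptions, not the actual tomography code.

```python
from typing import Iterator, List, Tuple


def serpentine_raster(
    x_points: int, y_points: int, step_mm: float
) -> Iterator[Tuple[float, float]]:
    """Yield (x, y) positions in mm, reversing direction on every other row.

    Hypothetical helper: a serpentine path avoids a long return move
    at the end of each row.
    """
    for row in range(y_points):
        cols: List[int] = list(range(x_points))
        if row % 2 == 1:
            cols.reverse()  # odd rows are scanned right to left
        for col in cols:
            yield (col * step_mm, row * step_mm)


if __name__ == "__main__":
    for x, y in serpentine_raster(x_points=3, y_points=2, step_mm=1.0):
        print(f"move to x={x:.1f} mm, y={y:.1f} mm, then pulse and record")
```

In a real setup each yielded position would be followed by a stage move and a pulse/acquire step; those parts are hardware-specific and left out here.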

This paragraph is more of a musing, but the complexity of this system necessitates static typing and thorough testing, the latter being something I try to do on larger projects anyway. I've really come to appreciate mypy; I've found it to result in vastly fewer bugs and overall better-quality code.

Battery tomography software flowchart

Screenshot from part of tomography experiment control GUI

Acoustics Mission Control Room

unplugged: Battery acoustics GUI

To use acoustics as a characterization technique on batteries, one needs a PC, a pulser, an oscilloscope, and transducers. The first three can actually be shared across multiple experiments if one uses something like a multiplexer, a device that can switch between electronic channels.

Our group has grown a lot since I joined, and there are more and more acoustics experiments running. To consolidate the software and reduce the hardware overhead, we wrote this abstraction that I called unplugged (acoustics, as in MTV Unplugged, get it? Jeez), which is a GUI for controlling multiple experiments at once. To add a channel, all you need to do is hook the transducers up to the multiplexer and off you go.

I am not a front-end developer, but the recent advent of LLMs has really lowered the barrier for non-experts to build simple things orders of magnitude faster. As seen in the screenshot on the right, it is definitely not anything fancy, but it works and allows us to do more cool science, faster and cheaper.

pier: docker-CLI wrapper

All our acoustics experiments run in Docker containers. The modularity, speed of deployment, and stability make it a no-brainer.

We run somewhere between three and eight containers on each machine, which necessitates orchestration. This is obviously why docker-compose exists, but manually writing the YAML files is an error-prone endeavor and breaks my "no manual (meta)data handling" rule. I do realize that fancy GUIs like docker-desktop partially solve this problem, but we run everything on cheap-ish, headless Linux boxes (RPis, etc.), so CLI is king.

That's why I wrote pier (dock-er, pier—get it? Jeez). It simplifies the creation of docker-compose.yaml by doing

$ python orchestrate.py containerA containerB

It ensures that volumes are mapped correctly and all serial ports and exposed internet ports are valid and unique. It also contains all our dockerfiles, so it has a shorthand for building those.
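The kind of validation described above can be sketched as follows. This is not pier's actual code: the registry contents, field names, image names, and the volume layout are all hypothetical, and the sketch stops at building the compose structure as a dict rather than writing the file.

```python
from typing import Dict, List

# Hypothetical registry of known containers; real settings will differ.
REGISTRY: Dict[str, dict] = {
    "containerA": {"image": "lab/container-a", "port": 8001, "device": "/dev/ttyUSB0"},
    "containerB": {"image": "lab/container-b", "port": 8002, "device": "/dev/ttyUSB1"},
}


def build_compose(names: List[str], registry: Dict[str, dict],
                  data_dir: str = "/data") -> dict:
    """Build a docker-compose-style dict, rejecting duplicate ports and devices."""
    services: Dict[str, dict] = {}
    used_ports: Dict[int, str] = {}
    used_devices: Dict[str, str] = {}
    for name in names:
        if name not in registry:
            raise KeyError(f"unknown container: {name}")
        cfg = registry[name]
        port, device = cfg["port"], cfg["device"]
        if not 1 <= port <= 65535:
            raise ValueError(f"{name}: port {port} is out of range")
        if port in used_ports:
            raise ValueError(f"{name}: port {port} already used by {used_ports[port]}")
        if device in used_devices:
            raise ValueError(f"{name}: {device} already used by {used_devices[device]}")
        used_ports[port] = name
        used_devices[device] = name
        services[name] = {
            "image": cfg["image"],
            "ports": [f"{port}:{port}"],          # host:container
            "devices": [f"{device}:{device}"],    # pass the serial port through
            "volumes": [f"{data_dir}/{name}:/data"],
        }
    return {"services": services}


if __name__ == "__main__":
    print(build_compose(["containerA", "containerB"], REGISTRY))
```

Turning the returned dict into docker-compose.yaml would be one more step with a YAML library (for example PyYAML's `yaml.safe_dump`, if that is what is installed).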

It's not a particularly sophisticated piece of code, but I'm including it here because it's stuff like this that makes me love coding.

safety integration for battery modules

We have a project where a battery module (~50 Ah) is cycled at a pretty fast rate. We run acoustics on the module in tandem. We had designed an experiment that would take several months. However, running acoustics continuously would've resulted in way too much data, so we settled on running it every Nth cycle, as intracycle granularity was more important than intercycle resolution.

Because of the high-ish C-rate, we also had to have safety precautions in place so that the experiment would be killed if a certain maximum temperature were exceeded. Now, the BioLogic software has a temperature feature built in. However, due to the first requirement, we also needed a way to trigger the acoustics off the echem.

To solve these two challenges, I wrote biologic, which uses BioLogic's nice SDK to abstract running experiments into a Flask server that streams the data over MQTT. The acoustics and safety controllers can then intercept that stream and make decisions based on it. The architecture is a little dated, but it still works like a charm.
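The decision logic on the receiving end of such a stream can be sketched in a few lines. This is an illustration only: the payload fields (`temperature_c`, `cycle`), the action names, and the threshold values are assumptions, and the MQTT wiring itself is left out so the sketch stays dependency-free.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class SafetyController:
    """Decide what to do with one streamed cycler message (hypothetical schema)."""

    max_temp_c: float       # kill the experiment above this temperature
    every_nth_cycle: int    # run acoustics only on every Nth cycle

    def handle(self, payload: bytes) -> List[str]:
        msg = json.loads(payload)
        # Safety comes first: an over-temperature reading overrides everything else.
        if msg["temperature_c"] > self.max_temp_c:
            return ["stop_experiment"]
        actions: List[str] = []
        if msg["cycle"] % self.every_nth_cycle == 0:
            actions.append("run_acoustics")
        return actions


if __name__ == "__main__":
    controller = SafetyController(max_temp_c=60.0, every_nth_cycle=10)
    print(controller.handle(b'{"temperature_c": 31.5, "cycle": 20}'))
    print(controller.handle(b'{"temperature_c": 75.0, "cycle": 20}'))
```

In a deployment, `handle` would be called from the MQTT client's message callback, and the returned actions would be mapped to real commands for the cycler and the acoustics rig.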

Battery cycler safety flow chart

analysis

singular: standardized battery cycling data from pesky OEMs

Battery cycler OEMs revel in creating their own proprietary data formats. I get it, the lock-in effect makes sense from a business perspective. Still douchey though. Anyway, it is a headache when one uses multiple cyclers on the regular: one manufacturer will do elapsed time and another Unix time; one will have Ewe/V and another Voltage; cycle count differs; the list goes on.

One day I got really frustrated with my analysis code breaking upon switching out cyclers, so I wrote singular, a one-stop shop for loading electrochemical time series. It contains a single function for loading data as a pandas DataFrame with standardized contents, including column names, indices, etc.
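A minimal sketch of the standardization idea, not singular's actual implementation: vendor column names are mapped to one schema and time is converted to elapsed seconds. To stay dependency-free it works on plain lists of dicts rather than a pandas DataFrame, and the standard names (`voltage_v`, `time_s`) and the `unix_time` alias are assumptions made for illustration.

```python
from typing import Dict, List

# Vendor column names mapped to one standard schema (illustrative, not exhaustive).
COLUMN_ALIASES: Dict[str, str] = {
    "Ewe/V": "voltage_v",
    "Voltage": "voltage_v",
    "time/s": "time_s",
    "unix_time": "time_s",
}


def standardize(rows: List[Dict[str, float]]) -> List[Dict[str, float]]:
    """Rename columns and express time as seconds elapsed since the first row."""
    renamed = [
        {COLUMN_ALIASES.get(key, key): value for key, value in row.items()}
        for row in rows
    ]
    if not renamed:
        return renamed
    t0 = renamed[0]["time_s"]
    for row in renamed:
        row["time_s"] -= t0  # works for both elapsed and Unix timestamps
    return renamed


if __name__ == "__main__":
    vendor_a = [{"time/s": 0.0, "Ewe/V": 3.7}, {"time/s": 1.0, "Ewe/V": 3.8}]
    vendor_b = [
        {"unix_time": 1700000000.0, "Voltage": 3.7},
        {"unix_time": 1700000001.0, "Voltage": 3.8},
    ]
    print(standardize(vendor_a) == standardize(vendor_b))
```

The point of the design is that downstream analysis code only ever sees one set of column names and one time convention, whichever cycler produced the file.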

This is probably my all-time favorite piece of code that I made myself. It is so simple but has had so much utility.

wrangling and merging large datasets

Despite us taking measures to optimize and compress data for storage, a lot of our datasets are way larger than memory, between 50 and 100 GB. This has forced us to consider how we batch and wrangle data. We have landed on a combination of multiprocessing, GPU acceleration, and simple SQL queries, thus never keeping anything larger than ~200 MB in memory at a time.
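The batching part of that approach can be illustrated with the standard library alone. The sketch below covers only the SQL side (no multiprocessing or GPU work), and the `waveforms` table, its `amplitude` column, and the batch size are made-up names and values for the example.

```python
import sqlite3
from typing import Iterator, List, Tuple


def iter_batches(conn: sqlite3.Connection, query: str,
                 batch_size: int) -> Iterator[List[Tuple]]:
    """Yield query results in fixed-size batches instead of loading them all."""
    cursor = conn.execute(query)
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        yield rows


def peak_amplitude(conn: sqlite3.Connection, batch_size: int = 1000) -> float:
    """Reduce over batches so only one batch is held in memory at a time."""
    peak = float("-inf")
    for batch in iter_batches(conn, "SELECT amplitude FROM waveforms", batch_size):
        peak = max(peak, max(row[0] for row in batch))
    return peak


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE waveforms (amplitude REAL)")
    conn.executemany("INSERT INTO waveforms VALUES (?)",
                     [(float(i % 7),) for i in range(50)])
    print(peak_amplitude(conn, batch_size=8))
```

The batch size is the knob that bounds memory use; any per-batch computation can replace the running maximum as long as it can be combined across batches.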

Calculating the acoustics metrics (time-of-flight, amplitude, Fourier transform, damping, stiffness, etc.) can be time- and compute-intensive, which is why we only want to do it once. We then temporally sync those to the other data streams, which always include electrochemical data and sometimes temperature and pressure. On subsequent runs, this allows one to load only the transformed table, a small subset usually <1% of the raw data size, without sacrificing precision.
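One common way to do the temporal sync is an "as-of" join: each acoustic timestamp is paired with the most recent electrochemical sample at or before it. The sketch below shows that idea with the standard library; whether the actual pipeline uses this rule, interpolation, or something else is not stated here, so treat the matching rule and the function name as assumptions.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple


def sync_to_latest(
    acoustic_times: List[float],
    echem: List[Tuple[float, float]],
) -> List[Tuple[float, Optional[float]]]:
    """Pair each acoustic timestamp with the latest echem value at or before it.

    `echem` is a list of (time, voltage) tuples sorted by time.
    Timestamps earlier than the first echem sample get None.
    """
    echem_times = [t for t, _ in echem]
    synced: List[Tuple[float, Optional[float]]] = []
    for t in acoustic_times:
        idx = bisect_right(echem_times, t) - 1  # last sample with time <= t
        synced.append((t, echem[idx][1] if idx >= 0 else None))
    return synced


if __name__ == "__main__":
    echem = [(0.0, 3.0), (10.0, 3.5), (20.0, 4.0)]
    print(sync_to_latest([5.0, 10.0, 25.0], echem))
```

With pandas available, `pandas.merge_asof` performs the same kind of join on DataFrames.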

end-to-end automated data flow with notion at the center

My personal mantra, which I bore all my colleagues to death with, is: no manual handling of data.

We sync data from all remote machines to a central server. We take care to tag all data streams, be it acoustics, echem or others, associated with a single experiment with the same ID. We then have a Notion cell tracker which acts as a central source of truth for all metadata. We can therefore load the Notion metatable into our code using the official API and filter experiments on date, cathode composition, formation rate, cycling temperature or whatever is needed. Time series can then be loaded and analyzed in an automated manner, saving valuable time.
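Once the metatable has been fetched, the filtering step is simple. The sketch below assumes the rows have already been pulled from Notion and converted to Python objects; the `Experiment` fields, the example values, and the `select` helper are hypothetical and only mirror the kinds of filters mentioned above.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Experiment:
    """One row of the metadata table (hypothetical fields)."""

    exp_id: str            # shared by all data streams of the experiment
    start: date
    cathode: str
    cycling_temp_c: float


def select(
    experiments: Iterable[Experiment],
    cathode: Optional[str] = None,
    started_after: Optional[date] = None,
    max_temp_c: Optional[float] = None,
) -> List[Experiment]:
    """Return experiments matching every filter that is not None."""
    result = []
    for exp in experiments:
        if cathode is not None and exp.cathode != cathode:
            continue
        if started_after is not None and exp.start <= started_after:
            continue
        if max_temp_c is not None and exp.cycling_temp_c > max_temp_c:
            continue
        result.append(exp)
    return result


if __name__ == "__main__":
    table = [
        Experiment("exp-001", date(2023, 1, 10), "NMC811", 25.0),
        Experiment("exp-002", date(2023, 6, 2), "LFP", 45.0),
        Experiment("exp-003", date(2023, 9, 15), "NMC811", 45.0),
    ]
    for exp in select(table, cathode="NMC811", started_after=date(2023, 3, 1)):
        print(exp.exp_id)
```

The selected IDs are then all that is needed to locate the matching acoustics and echem time series, since every stream of an experiment carries the same ID.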

This can obviously be done with any Excel table but the Notion tagging and paging system makes it a drastically more enjoyable activity.