September 4, 2024

AI Enabled Chart Creation

How can we minimize the time that scientists spend on non-science tasks? This is a question we care deeply about at Sphinx, knowing that each hour spent fiddling with software is another hour of productive science lost. The value of a tool is how much you can get out of it, which is often limited by the time you have to learn the nuances and locations of options within a menu.

Starting today, you can simply express your intent and let Sphinx take actions on your behalf. We do this by mapping your desired outcome to the transformations and plots available in the system. We use AI to make edits for you, explain our reasoning, and provide step-by-step instructions in case you want to do it yourself. No need to worry about "learning Sphinx" -- just start using our tool and focus on outcomes over actions.

Want to get started without reading the rest of the post? Try it yourself!

There are many ways to plot a line

A great experiment starts with a hypothesis and ends with a powerful conclusion based on data. But what happens when your analysis is limited by the transformations and visualizations you have time to configure? Every statistical test, plotting method, or data transformation has nuances that may (or may not) be applicable to validating your hypothesis. Something as simple as a scatterplot with a line of best fit raises a number of questions.

“Should I pivot the data before I plot it? What does a pivot do?”
“How do I make the line solid or dashed?”
“I want the points to be sized by the response, and colored by the condition.”
“How do I scale the axes?”
“How can I normalize the values to the control?”
“Can I make a 2x2 grid for the 4 conditions I tested?”

At a high level, you know exactly what you want to accomplish – the challenge is getting there. New tools like ChatGPT have shifted the paradigm for text and code generation, but leave something to be desired when applied to your data. You still need to translate their output into actions in your tools and on your data.

We’ve helped solve this through the use of Templates – where fixed plots and transformations are applied regardless of the input data. Once you've set up a template, going from raw data to publication-quality plots takes less than a minute. However, building a template can be challenging, especially if your experimental design changes and you need a slightly different analysis. Scientists find themselves limited when they want to explore a specific dataset in a new way or ask different questions about their data, because they didn’t build the template and don’t know where the options are to edit it. Templates provide a solid foundation, but adapting them to novel questions can still be time-consuming and pull focus from the core scientific investigation.

What if there were another way, where instead of exploring all the options available, you declared the outcome and the system acted on your behalf? At Sphinx we are excited to share our latest step towards freeing scientists: AI-enabled cell editing.

What options are available to me?

Let’s describe a common problem – you have received an output file from an instrument and you are ready to analyze it. The data are not in the right shape and you don’t want to waste time cutting and pasting them into the correct shape. In our starting Dataset we have multiple samples and replicates where values for two genes were measured: GAPDH (a common housekeeping gene used as a control) and the Gene of Interest (GOI).

An image showing a data table where observations for each gene are in a different column.

We want to compare values for both GAPDH and the GOI in the same column, but our output file has one column for each gene. Since we know we want genes in one column, we can simply ask Sphinx to 'put all the genes in one column'.

Video showing Sphinx automatically adjusting a data table based on user request.

Sphinx is able to understand that your intent (‘all genes in one column’) maps to the data transformation called a ‘pivot’. It then performs the pivot operation on the data and explains why this was the appropriate action. This lets you focus on declaring what results you want over learning the nuances of what ‘pivot’ means.
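For readers curious what this reshaping does to the table, here is a minimal sketch in plain Python. It is not Sphinx's implementation: the column names (`sample`, `replicate`), the values, and the helper `genes_to_one_column` are all invented for illustration. In pandas, the comparable wide-to-long operation is `melt`.

```python
# Hypothetical wide table: one row per sample/replicate, one column per gene.
# Column names and values are invented for illustration only.
wide_rows = [
    {"sample": "A", "replicate": 1, "GAPDH": 18.2, "GOI": 25.1},
    {"sample": "A", "replicate": 2, "GAPDH": 18.4, "GOI": 24.9},
    {"sample": "B", "replicate": 1, "GAPDH": 18.1, "GOI": 22.3},
]


def genes_to_one_column(rows, id_columns, gene_columns):
    """Reshape wide rows into long rows with 'gene' and 'value' columns."""
    long_rows = []
    for row in rows:
        for gene in gene_columns:
            # Keep the identifying columns, then add one gene/value pair.
            long_row = {key: row[key] for key in id_columns}
            long_row["gene"] = gene
            long_row["value"] = row[gene]
            long_rows.append(long_row)
    return long_rows


long_rows = genes_to_one_column(
    wide_rows, id_columns=["sample", "replicate"], gene_columns=["GAPDH", "GOI"]
)
for row in long_rows:
    print(row)
```

Each input row becomes one output row per gene, so GAPDH and GOI values end up in the same `value` column and can be compared or plotted together.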

Explanation tab indicating why a pivot was chosen as the data transformation.

We provide the explanation so you understand why the actions were taken and to help you learn how to do it yourself in the future. We also provide instructions in case you want to modify the result or learn more about data transformations. We want to support continuous learning and know that not every tool makes it clear why a given option is the best choice. We hope that with the explanation and instructions you can get results faster and upskill yourself in the process!

Instructions tab explaining how a pivot can be added by the user.

Naturally you could ask Sphinx to make successive modifications as well, meaning analysis is as simple as describing what you want. This pattern is also available for creating plots, where you can describe what you want plotted and Sphinx does the rest. We support both simple and complex requests so you can operate at a level of detail that is comfortable to you.

“Plot the data.”
“Make a x y plot with gene on the x axis and value on the y axis.”
“Now, make it a bar plot.”
“How does this look across the genes?”

Video showing Sphinx automatically adjusting plots based on user request.

Performing a great analysis should not be limited by your ability to know where the correct option lies in a menu. Now with Sphinx your ability to perform analysis is limited by how fast you can think.

Limitations

For now, we limit the ability to edit an analysis to select transformations and plots. Precision and accuracy are important to us, so future iterations will include greater ability to edit styles for plots and provide greater specifications for data transformations.

Not every request is interpretable and some edits are hard to undo, so we don’t take an action unless we are confident it is correct. This means occasionally ‘nothing happens’ because we aren’t sure what your intent is. Rephrasing your prompt will often lead to a better outcome, and we are working to make every prompt right the first time.

Future Direction

We are working to incorporate our other work (such as Extraction of data from spreadsheets) to enable end-to-end analyses. We believe that many parts of the scientific method can be unlocked by making tooling more accessible via natural language.

If you are excited by working on those kinds of problems and want to build better software for scientists, we’re hiring! You can see our latest code on GitHub at https://github.com/sphinxbio.

Reach out with any questions to hello@sphinxbio.com and thanks for reading!

Footnotes

This idea of describing desired outcomes is called a “declarative user experience (UX)”. Contrast this with an imperative UX, where you must define each action sequentially. In declarative UX, the system or interface interprets your intent and manages the underlying processes to reach the desired result, abstracting away the complexity.
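The contrast can be sketched in a few lines of Python. This is a toy illustration only: the request format and the `plan_steps` function are hypothetical and do not describe Sphinx's actual interface. The user declares the outcome, and the system works out the imperative steps.

```python
# Toy illustration only: the request format and plan_steps are hypothetical,
# not Sphinx's actual interface.

def plan_steps(request, available_columns):
    """Translate a declared outcome into an ordered list of imperative steps."""
    steps = []
    needed = [request["x"], request["y"]]
    # If the columns the plot needs do not exist yet, reshape first.
    if any(column not in available_columns for column in needed):
        steps.append("reshape table so genes share one column")
    steps.append(f"create {request['kind']} plot")
    steps.append(f"map x axis to '{request['x']}'")
    steps.append(f"map y axis to '{request['y']}'")
    return steps


# Declarative: the user states only the desired result.
request = {"kind": "bar", "x": "gene", "y": "value"}

# Imperative: the individual actions, in order, that reach that result.
steps = plan_steps(request, available_columns=["sample", "replicate", "GAPDH", "GOI"])
for number, step in enumerate(steps, start=1):
    print(number, step)
```

In an imperative UX the user would carry out each printed step by hand; in a declarative UX they supply only the `request`.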
