Threat Hunters: Stop using Jupyter and Switch to Marimo!

A very basic Marimo notebook I created to analyze Nginx logs

A few months ago, I listened to a podcast with one of the founders of Marimo, but I didn't immediately try it out.

I assumed it was another SaaS platform I'd have to sign up for.

I was wrong. Very wrong. However, I did share the link with one of the IR analysts who had been doing most of their work in Jupyter.

The podcast succinctly captured all the problems we were finding re-using notebooks across multiple Security/IR Analysts.

Jupyter notebooks are not really Python, so they are difficult to check into git. Secrets (and data) are stored in the cells. There is dependency hell, especially if folks haven't standardized on a tool like UV.

Before Marimo, I stood up a JupyterLab server on EKS. That helped a little, but over the last month, the biggest notebook user on the team has gone all in on Marimo and it has been amazing!

Hello IP: Installation and Management with UV

Before I get into a more complex/interesting use case, let's create a simple notebook to find your external IP with https://ipv6.icanhazip.com/ or https://ipv4.icanhazip.com/. This also shows the AI code generation built into the product.

Assuming you have UV installed, you will do the following:

  • Create a new virtual environment and activate it
  • Install Marimo and any other packages you know you'll need.

NOTE: Marimo DOES honor your .venv and you can install other dependencies through the UI which we'll see later.

mfranz@cros-acer516ge:~/junk/marimo-hazip$ uv venv
Using CPython 3.12.7
Creating virtual environment at: .venv
Activate with: source .venv/bin/activate


mfranz@cros-acer516ge:~/junk/marimo-hazip$ source .venv/bin/activate


(marimo-hazip) mfranz@cros-acer516ge:~/junk/marimo-hazip$ uv pip install marimo requests
Resolved 28 packages in 417ms
Installed 28 packages in 165ms
 + anyio==4.8.0
 + certifi==2025.1.31
 + charset-normalizer==3.4.1
 + click==8.1.8
 + docutils==0.21.2
 + h11==0.14.0
 + idna==3.10
 + itsdangerous==2.2.0
 + jedi==0.19.2
 + marimo==0.11.20
 + markdown==3.7
 + narwhals==1.30.0
 + packaging==24.2
 + parso==0.8.4
 + psutil==7.0.0
 + pycrdt==0.11.1
 + pygments==2.19.1
 + pymdown-extensions==10.14.3
 + pyyaml==6.0.2
 + requests==2.32.3
 + ruff==0.11.0
 + sniffio==1.3.1
 + starlette==0.46.1
 + tomlkit==0.13.2
 + typing-extensions==4.12.2
 + urllib3==2.3.0
 + uvicorn==0.34.0
 + websockets==15.0.1

You can also install marimo globally with UV tools.

Create a new notebook.

(marimo-hazip) mfranz@cros-acer516ge:~/junk/marimo-hazip$ marimo edit myip.py

        Edit myip.py in your browser 📝

        ➜  URL: http://localhost:2718?access_token=OgcSVrrAJRJ8vJQbISRA5A
Initial blank project

I clicked on the robot icon for AI. Since I'd already created a previous project, it honored the global configuration file inside ~/.config

mfranz@cros-acer516ge:~/.config/marimo$ head -20 marimo.toml 
[keymap]
preset = "default"
[keymap.overrides]

[package_management]
manager = "uv"

[display]
dataframes = "rich"
cell_output = "above"
theme = "light"
code_editor_font_size = 14
default_width = "medium"

[save]
autosave_delay = 1000
format_on_save = false
autosave = "after_delay"

My OpenAI and Anthropic keys were already there so I didn't need to reconfigure them for this new project, but it did force me to install the anthropic libraries in my current virtual environment.

If you try to enable AI support you'll get an error

On the console (where I launched Marimo) I see the following error as well as the packages being installed.

  File "/home/mfranz/junk/marimo-hazip/.venv/lib/python3.12/site-packages/marimo/_server/api/endpoints/ai.py", line 124, in get_anthropic_client
    DependencyManager.anthropic.require(why="for AI assistance with Anthropic")
  File "/home/mfranz/junk/marimo-hazip/.venv/lib/python3.12/site-packages/marimo/_dependencies/dependencies.py", line 72, in require
    raise ModuleNotFoundError(message, name=self.pkg) from None
ModuleNotFoundError: anthropic is required for AI assistance with Anthropic.
Resolved 14 packages in 228ms
Installed 8 packages in 24ms
 + annotated-types==0.7.0
 + anthropic==0.49.0
 + distro==1.9.0
 + httpcore==1.0.7
 + httpx==0.28.1
 + jiter==0.9.0
 + pydantic==2.10.6
 + pydantic-core==2.27.2
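The traceback above comes from Marimo checking for the anthropic package only when the AI feature is first used. As a minimal sketch of the same idea, you could check for optional packages yourself before enabling AI support. Note that missing_packages is a hypothetical helper written for this post, not part of Marimo.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Package names are examples; use whichever AI client libraries you configured.
    missing = missing_packages(["anthropic", "openai"])
    if missing:
        print("Install first: uv pip install " + " ".join(missing))
    else:
        print("AI client libraries are available")
```

Run it inside the activated virtual environment so it inspects the same site-packages that Marimo uses.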

I used Claude 3.5 to generate the code on the left, and it gives me the option to add it to my notebook.

like this

What does the code look like when you download?

import marimo

__generated_with = "0.11.20"
app = marimo.App(width="medium")


@app.cell
def _():
    return


@app.cell
def _():
    import requests

    # Function to get IP address
    def get_ip(ip_version):
        headers = {'Accept': 'text/plain'}
    
        if ip_version == 6:
            url = 'https://ipv6.icanhazip.com'
        elif ip_version == 4:
            # Force IPv4
            url = 'https://ipv4.icanhazip.com'
    
        try:
            response = requests.get(url, headers=headers)
            return response.text.strip()
        except requests.RequestException as e:
            return f"Error getting {ip_version} address: {e}"

    # Get IPv4 address
    ipv4_address = get_ip(4)
    print(f"Your IPv4 address is: {ipv4_address}")

    # Get IPv6 address
    ipv6_address = get_ip(6)
    print(f"Your IPv6 address is: {ipv6_address}")
    return get_ip, ipv4_address, ipv6_address, requests


if __name__ == "__main__":
    app.run()

As you can see, this is pure Python. You can check it into git, share it, and run it, and it is much easier to refactor into Python modules and packages, as you should be doing anyway.
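To make the refactoring point concrete, here is a rough sketch of how the generated get_ip function might look once it is moved into its own module. This is one possible shape, not the notebook's actual code: it swaps requests for the standard library's urllib so the module has no third-party dependency, and it adds a hypothetical fetch parameter so the function can be tested without network access. It also rejects versions other than 4 or 6, which the generated code does not handle.

```python
import urllib.request

# Endpoints taken from the notebook above.
ENDPOINTS = {
    4: "https://ipv4.icanhazip.com",
    6: "https://ipv6.icanhazip.com",
}


def _http_get(url, timeout):
    """Fetch a URL and return the body as text (stdlib stand-in for requests.get)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def get_ip(ip_version, fetch=None, timeout=5.0):
    """Return the external address for IP version 4 or 6.

    `fetch` is a hypothetical hook (url -> str) so callers and tests can
    replace the real HTTP request.
    """
    if ip_version not in ENDPOINTS:
        raise ValueError(f"ip_version must be 4 or 6, got {ip_version!r}")
    if fetch is None:
        fetch = lambda url: _http_get(url, timeout)
    return fetch(ENDPOINTS[ip_version]).strip()
```

A notebook cell could then import get_ip from that module (the module name is up to you) and stay focused on presentation.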

Or you can open up a terminal, like you can do with Visual Studio Code.

If you want to just run the code

(marimo-hazip) mfranz@cros-acer516ge:~/junk/marimo-hazip$ marimo run myip.py 

        Running myip.py ⚡

        ➜  URL: http://localhost:2718

Your IPv4 address is: 172.58.240.217
Your IPv6 address is: 2607:fb90:ea1c:c976:216:3eff:feaa:b017

Since the notebook only prints to the console and renders no output, the browser will open but the page will be blank. Let's add another cell to fix that.

marimo run provides read-only apps that can be easily distributed
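The page is blank because print() writes to the terminal, while Marimo displays the value of a cell's last expression. A minimal sketch of a cell body that would render something, assuming the ipv4_address and ipv6_address variables from the earlier cell (ip_summary_markdown is a hypothetical helper):

```python
def ip_summary_markdown(ipv4_address, ipv6_address):
    """Build a Markdown snippet summarizing both addresses."""
    return (
        "## External addresses\n\n"
        f"- **IPv4:** `{ipv4_address}`\n"
        f"- **IPv6:** `{ipv6_address}`\n"
    )


# In a marimo cell, make the rendered Markdown the last expression, e.g.:
#     import marimo as mo
#     mo.md(ip_summary_markdown(ipv4_address, ipv6_address))
```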

Using CHDB to Convert S3 buckets into Pandas DataFrames

Currently, Marimo doesn't have direct SQL support for ClickHouse tables, but you can easily load query results into a DataFrame and that will give you what you need.

This example builds on my previous blog post about ingesting logs into S3 with Vector. CHDB is very similar to running DuckDB in Python, and it is what I'll use instead of clickhouse local to query my data.

The docs are OK, but not great. It wasn't immediately obvious how to load query results into a DataFrame, but it is actually very easy, as this ipython example shows. You can see the supported data formats here.

NOTE: it doesn't yet support Polars and I have not yet tried Arrow.

In [1]: import chdb

In [2]: q = chdb.query(""" select * from  s3('https://mybucket.s3.us-east-1.amazonaws.com/*/nginx-access/*/*/*/*.log.gz') where status = 200""",'DataFrame')

In [3]: type(q)
Out[3]: pandas.core.frame.DataFrame

(Obviously there are trade-offs between how much work you do in SQL and how much you load into memory, and you WILL crash your Marimo kernel. I ran all my examples on an 8GB Acer gaming Chromebook and if I didn't limit the SQL I would kill it, but that is a different topic.)

Assuming you have set up the virtual environment and the AWS variables, the notebook will automatically load my SQL queries into two DataFrames, one for the Nginx access logs and one for the Nginx error logs.
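One simple way to limit the SQL is to push a LIMIT (and any filters) into the query before it reaches Pandas. The helper below is only a sketch: build_access_query and its parameters are names made up for this post, and the bucket path is the same placeholder used in the ipython example.

```python
def build_access_query(s3_glob, status=200, limit=10_000):
    """Build a ClickHouse query over S3 logs with a filter and a row cap."""
    if "'" in s3_glob:
        raise ValueError("s3_glob must not contain single quotes")
    if not isinstance(status, int) or not isinstance(limit, int) or limit <= 0:
        raise ValueError("status must be an int and limit a positive int")
    return f"SELECT * FROM s3('{s3_glob}') WHERE status = {status} LIMIT {limit}"


# Usage with chdb, mirroring the call shown above:
#     import chdb
#     sql = build_access_query(
#         "https://mybucket.s3.us-east-1.amazonaws.com/*/nginx-access/*/*/*/*.log.gz",
#         limit=5000,
#     )
#     df = chdb.query(sql, "DataFrame")
```

Selecting only the columns you need instead of * would reduce memory further.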
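A notebook that depends on environment variables fails in confusing ways when they are missing, so a first cell that checks for them can save time. This is a sketch under assumptions: the variable names below are the standard AWS credential variables, and which ones your CHDB setup actually reads may differ. missing_aws_vars is a hypothetical helper.

```python
import os

# Assumed names: the standard AWS credential environment variables.
REQUIRED_AWS_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_aws_vars(environ=None, required=REQUIRED_AWS_VARS):
    """Return the required variable names that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_aws_vars()
    if missing:
        print("Set these before running the queries: " + ", ".join(missing))
```

Keeping credentials in the environment rather than in cells also avoids the problem of secrets stored in notebooks mentioned at the start.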

You'll see how long it took certain cells to load if you look at tracing.

Barely Scratching the Surface on Data Tools in Marimo (Nginx style)

Enough with the basics, now let's really show the "batteries-included" features that are available for working with datasets. These are nothing compared to the real "Marimo Pros" and some of the visualizations I've worked with.

Enriching Nginx Dataframes with Maxmind

If you click on the database icon you will see all your DataFrames.

And if you click on one of them (I'll focus just on the HTTP 200s), you can drill into the data.

Let's look at client

You can easily add this to the notebook and it will create an Altair chart that you can edit. I modified it to show the top 25 instead of the top 10.
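The chart is essentially a count of requests per client, truncated to the top N. This is not the Altair code that Marimo generates, just a sketch of the counting step behind it; top_clients is a hypothetical helper, and the "client" column name is assumed from the view above.

```python
from collections import Counter


def top_clients(clients, n=25):
    """Return (client, request_count) pairs for the n most frequent clients."""
    return Counter(clients).most_common(n)


# With a DataFrame of access logs (name assumed), pass the column as a list:
#     top_clients(access_df["client"].tolist(), n=25)
```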

Now let's look at the Cities.

I did have to install an additional library from the CLI in the same venv

(marimo-stuff) mfranz@cros-acer516ge:~/junk/marimo-stuff$ uv pip install "vl-convert-python>=1.6.0"
Resolved 1 package in 354ms
Prepared 1 package in 1.62s
Installed 1 package in 1ms
 + vl-convert-python==1.7.0

It did NOT work through the terminal for some reason.

What this data shows is that my Nginx GeoIP blocking is not 100% effective, but if I go back to the initial access DataFrame, the majority of the traffic is being blocked.
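The city view depends on a GeoIP lookup for each client address. The sketch below shows the general shape of that enrichment step, not the notebook's actual code. enrich_with_city and lookup_city are hypothetical names, rows are plain dicts with an assumed "client" key, and the lookup is passed in so the function works with any GeoIP source.

```python
def enrich_with_city(rows, lookup_city):
    """Return copies of rows with a 'city' key added via lookup_city(ip)."""
    cache = {}  # each distinct client IP is looked up only once
    enriched = []
    for row in rows:
        ip = row["client"]
        if ip not in cache:
            cache[ip] = lookup_city(ip)
        enriched.append({**row, "city": cache[ip]})
    return enriched


# With MaxMind's geoip2 package and a local City database (path is a
# placeholder), lookup_city could be written as:
#     import geoip2.database, geoip2.errors
#     reader = geoip2.database.Reader("GeoLite2-City.mmdb")
#     def lookup_city(ip):
#         try:
#             return reader.city(ip).city.name
#         except geoip2.errors.AddressNotFoundError:
#             return None
```

Caching matters here because a handful of clients usually account for most of the requests.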

Find Good Tools (AI or Not!)

One of my big lessons of the last few weeks is that despite all the advances in AI coding (Zed, Cursor, etc.), solid, well-designed tools that are deterministic and optimized for user experience are often the fastest way to achieve value. Trying to code everything with AI and NOT take advantage of existing tools is a huge mistake, besides burning GPUs.