Building the Tesseract: The Archive Learns to Search, See, and Talk Back

Sharing this with the NET-ART and CUNY Commons community because, underneath the build, it is a teaching question: what becomes possible when you can point AI at your own work, or a whole course, and search it, see it, and talk back to it, locally and for free? There is a section below written specifically for teachers. Originally published at ryanseslow.com.

 

Twenty years of my scattered work, pulled into one living archive that talks back, holds a huge portion of my creative life, opens a door for machines, and now shows you its face. A field report, with the unflattering parts included.

Yes, this is unapologetically long. Good thing our attention spans are ready for it!

This started with a simple, slightly uncomfortable question:

What is hacking, really, and could I hack myself?

Not in the “Hollywood” sense. In the original sense: to understand a system well enough to make it do something it was not expected to do. I wanted to point that lens at the one system I have the most access to and the least honest view of: my own patterns. So I asked an AI working session to do something most of us never let anything do: read my actual behavior off my own machine. Not the story I tell about myself, but the evidence. The files. The timestamps. The folders and files that I start and abandon. The things I save and never reopen.

What came back changed how I see my own work, and over a handful of sessions it turned into something I had wanted for twenty years and never finished. This post is the whole story, start to finish. If you have been following along, you already know the early chapters: I let an AI read my entire twenty-year WordPress archive and asked what would happen. This is where it lands. The archive learned to talk back, then it went public, then it opened a door for machines, and just now it opened its eyes.

Here is the twist I did not expect: every version that worked was the one I made smaller. The early builds were ambitious and intricate. The versions that shipped are deliberately minimal: the standard library, one database file, a couple of small local models. The distillation was the breakthrough.

I stopped elaborating and started finishing.

Hacking Myself: The Loop I Could Not See

I let the session look at the shape of my digital life: my Desktop, my 50GB+ iCloud archive, my Google Drive, my live website. Not to read my private thoughts, to read the patterns. The structure. The geology.

The finding was humbling and precise. Across every archive, the same loop repeated at every scale:

A vision ignites, I erupt in prolific output, I get the high of the birth, the next idea pulls me away, the work is left where it landed, it quietly entombs, and months later the same idea is reborn under a new name.

I am, it turns out, addicted to genesis and allergic to maintenance. I start brilliantly and rarely return. My iCloud held a heroic consolidation of my career, built between 2015 and 2018, then abandoned and never reopened. My Desktop held twenty live project threads in ten weeks, nothing filed. And the same core idea, an AI trained on my own art and writing, had been born three separate times under three different names, each one starting over from zero. That is seriously funny!

The blind spot was not disorganization. It was that nothing I made was ever allowed to compound, because compounding requires returning, and returning never gave me the hit that starting did.

The Correction: “No Content Available”

Here is where it got sharp. The session found that I had, years ago (2024 in AI time is like 10 years ago in today's time, haha), already started building the AI-trained-on-me dataset. I had even exported my entire website into per-year training files. For a moment it looked like the project was most of the way done.

Then we actually opened the files. And almost every single record said the same thing:

{"prompt": "Describe the artwork titled 'DSC06448' created in 2009.",
 "completion": "No content available."}

The training data for the AI version of me was empty. The pipeline had pulled image filenames, DSC06448, but never captured a word of my actual writing. I had built the exciting structure of the idea, run it once, gotten back rows that literally read “No content available,” and walked away before the unglamorous extraction work.
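A check like this takes only a few lines. The sketch below is illustrative, not the script from the session: it assumes the training files hold one JSON record per line with a "completion" field, and the function name audit_records is hypothetical.

```python
import json

# Assumption: the placeholder text matches the record shown above exactly.
PLACEHOLDER = "No content available."


def audit_records(lines):
    """Return (total, empty) counts for an iterable of JSON lines."""
    total = 0
    empty = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        record = json.loads(line)
        total += 1
        if record.get("completion", "").strip() == PLACEHOLDER:
            empty += 1
    return total, empty


# Two hand-written sample records standing in for a real training file.
sample = [
    '{"prompt": "Describe the artwork titled \'DSC06448\' created in 2009.", '
    '"completion": "No content available."}',
    '{"prompt": "Describe the essay on public space.", '
    '"completion": "A real paragraph of writing."}',
]

total, empty = audit_records(sample)
print(f"{empty} of {total} records are placeholders")
```

Running the same function over an open file handle instead of the sample list would give the count for a whole export.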

I want to sit with how perfect that is. The empty file was the whole diagnosis in plain text. The content is available. It is all over my live site. I just stopped before capturing it. Genesis got done. Maintenance did not. Even my self-portrait-as-AI had abandoned itself at the hard part.

So we changed the plan: stop trying to out-discipline the loop, and build a layer that does the maintenance automatically, routing every future idea into one home instead of letting it spawn a fourth.

(One unglamorous aside, because it belongs to the same lesson: the self-audit also turned up live API keys sitting in plaintext inside old scripts, the kind of thing that can quietly run up a bill or worse. We found them, I revoked them, and rewrote those files to read their keys from the environment. The cost of never returning to your old work is that things rot there. Going back is not glamorous. It is also where safety, and value, actually live.)
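For anyone doing the same cleanup, reading a key from the environment can look roughly like this. It is a minimal sketch: the variable name EXAMPLE_API_KEY and the helper require_key are hypothetical, and the demo passes in a plain dictionary so nothing real is touched.

```python
import os


def require_key(name, environ=None):
    """Fetch a secret from the environment, failing loudly if it is missing."""
    # Default to the real process environment; a dict can be passed for demos.
    environ = os.environ if environ is None else environ
    value = environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it in your shell first")
    return value


# Demo with a stand-in environment (hypothetical name and value).
demo_env = {"EXAMPLE_API_KEY": "demo-value"}
key = require_key("EXAMPLE_API_KEY", environ=demo_env)
```

The point of failing loudly is that a script with a missing key stops at once instead of running half configured.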

Chapter One: The Archive Learns to Talk Back

The fix has a deliberately boring shape, because boring is what compounds. I call it RyanSeslow OS, a single, local home for my body of work, in three layers:

  • Ingest, pull my real content from where it actually lives.
  • Spine, store it once, in one place, in a form I can search and grow.
  • Aremes, a conversational layer that answers questions using only my own writing, in my own voice, with citations.

Then we built it, end to end, in a single session. It read my website, 1,160 posts and pages, roughly 357,000 words spanning 2008 to 2026, with more than 12,000 images linked, into a single catalog. It turned all of that into a local semantic index. And then I asked it a question I had never directly answered anywhere:

How can artists use AI to expand their creative practice without losing themselves?

Aremes answered in my voice, drawing on essays I wrote in 2012 and 2013, citing each one with a link, and honestly noting that I had never addressed the question head-on rather than inventing an answer. That honesty is the system working correctly. It is grounded in me, and only me.

For the first time in this entire twenty-year pattern, the AI-trained-on-me idea shipped, held real content, answered questions, and grows when I publish. The loop broke.

The most surprising thing about it is how small and free it is. It runs entirely on a laptop. No API key, no subscription, no cloud bill, nothing anyone can revoke. Python 3 and its standard library only. My own WordPress content via its built-in REST API. One SQLite file. Two small local models through Ollama: nomic-embed-text for the meaning index and llama3.2:3b for grounded answers. Retrieval is plain cosine similarity in pure Python; with about 1,100 documents, brute force is instant.
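To make "plain cosine similarity in pure Python" concrete, here is a minimal sketch of brute-force retrieval. The three-number vectors and document ids are toy stand-ins, an assumption for illustration; in the real build the vectors would come from the embedding model and be far longer.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)


def top_k(query_vec, index, k=3):
    """Score every document against the query and keep the best k."""
    scored = [(cosine(query_vec, vec), doc_id) for doc_id, vec in index.items()]
    scored.sort(reverse=True)
    return scored[:k]


# Toy index: hypothetical document ids mapped to made-up vectors.
index = {
    "essay-2012": [0.9, 0.1, 0.0],
    "gif-post": [0.0, 0.2, 0.9],
    "essay-2013": [0.7, 0.3, 0.1],
}

results = top_k([1.0, 0.0, 0.0], index, k=2)
for score, doc_id in results:
    print(f"{doc_id}: {score:.3f}")
```

Every query scores every document. At about 1,100 documents that is only about 1,100 dot products, which is why no vector database is needed at this scale.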

(The full build is in the appendix at the end of this post, so you can make your own!)

Chapter Two: The Archive Goes Public

The obvious next step was to drop the AI chat onto my website so anyone could ask it questions. I did the opposite, on purpose.

Here is why. The local model is small enough to run on a 2019 laptop, which is wonderful, but it means that every so often, even grounded in my real writing, it will invent a quote and attribute it to me. On my own machine, with a verification layer that flags fabrications, that is manageable. On a public website, it is unacceptable. A tool that occasionally puts words in my mouth, in front of strangers, is worse than no tool at all.

So the public version is search, not chat. It does not generate answers. It does not summarize. It does not imitate my voice. It takes your words, finds the most relevant passages from my actual posts, and links you straight to the originals. Zero hallucination, because there is no generation happening at all. Every result is really me. And it is built the way the whole project is built, as a single static page on my own shared hosting: no server to babysit, no AI service metering me, nothing a company can switch off.

You can use it right now: ryanseslow.com/search/

Then I gave it a big portion of my creative life, not just my blog. For two decades my work has lived in different places: long-form on the blog, but also thousands of posts on Tumblr and Instagram, and more than 1,500 animated GIFs and stickers on Giphy. None of them talked to each other. None of them were searchable as one thing.

And this is where it got funny, and very me. It turned out I had already “prepared” each of these. Years ago I had made caption files, export folders, an archive system for every platform. I felt organized. Then we actually opened them:

  • My Giphy captions file had 1,593 rows, and every single caption was an error message. The captioning script had broken and saved the errors as the captions.
  • My Tumblr “full archive” was entirely placeholder text: “Caption for Ryan Seslow artwork N, generated from AI analysis.” Stubs. No real content.
  • My Instagram archive, a beautiful folder structure I had named “The Memory Tree,” had a captioned-exports folder that was completely empty.

The same thing, again. I built the elaborate structure and never filled it. So this time we finished it, going to the living sources instead of the abandoned exports: my real Tumblr posts pulled directly and filtered down to only my own work, my real Instagram captions from the official export, the real titles and dates for all of my Giphy work. One search across everything I have made, blending platforms that never knew about each other. Search “sign language graffiti” and you get my Tumblr hand-style posts, my Instagram public-space interventions, a sign-language sticker from Giphy, and my long-form essays on art in public space, side by side.

Chapter Three: A Front Door For The Machines

The search box was built for human eyes. But the next thing to visit your website is not going to be a person. It is going to be an agent.

More and more, the way people find and buy things runs through an AI acting on their behalf. You tell it what you want, and it goes out, reads sites, compares, and sometimes completes the purchase, all without you opening a tab. My website was welcoming to a person and almost invisible to software. An AI that showed up at ryanseslow.com had no clean way to know what I make, what is for sale, what it costs, or how to license it. My twenty years of work might as well not have existed to it.

So I gave my archive a front door that machines can read. There is an emerging set of quiet standards for exactly this: small files you place on your site, written for machines rather than people. One is llms.txt, a plain-language summary an AI can read to understand who you are and what you offer. Others live in a .well-known folder and describe your catalog and capabilities in a structured way agents already know how to parse. A sign, written in a language only machines speak, hung on the front of the building.
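For readers who have not seen one, an llms.txt file is just structured plain text. The sketch below follows the general shape of the public llms.txt proposal as I understand it (a title, a short quoted summary, then sections of links); treat that layout as an assumption. The artist name, URLs, and the helper build_llms_txt are all hypothetical, not the contents of the real file.

```python
def build_llms_txt(name, summary, sections):
    """Assemble llms.txt-style text: title, summary, then link sections."""
    lines = [f"# {name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, note in links:
            lines.append(f"- [{title}]({url}): {note}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Placeholder content for illustration only.
text = build_llms_txt(
    "Example Artist",
    "Artist and educator; twenty years of work in one searchable archive.",
    {
        "Archive": [
            ("Search", "https://example.com/search/", "search across all sources"),
        ],
        "Licensing": [
            ("Catalog", "https://example.com/catalog.json", "machine-readable list of works"),
        ],
    },
)
print(text)
```

The output would be saved as llms.txt at the root of the site, where an agent can fetch it like any other page.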

And, very on brand for this series, when I went to check it, the door was broken. The file an agent looks for first was returning “not found.” I had built the doorway and never confirmed anyone could walk through it. We found the bug, fixed it, and tested it the way an actual agent would. Now when an AI arrives, the door opens: it can read a clean description of my practice, pull a machine-readable catalog, and search all twenty years through a single endpoint.

Built into the same surface is a way for an agent to ask a price for a piece and pay for it, in stablecoin, on its own, with no invoice and no checkout page. I am calling this layer AREMES, and the point is simple: my work should be able to be found and licensed by a machine at three in the morning while I am asleep. I am not turning my art into a vending machine, and I am not replacing the human relationships that matter most. I am making sure that when the buyer is an agent, and increasingly it will be, the door is open instead of closed and invisible.

Reading My Art Off The Chain

Here is the part I am also excited about, because it taught me something. A chunk of my digital artwork over the last several years lives on-chain, as 1/1 art on SuperRare. I wanted all of it in the archive. So I asked the platform’s own tools for my catalog, and they could only cleanly hand me the works currently for sale, sixteen of them. My profile says I have made one hundred and sixty-eight pieces and sold one hundred and fifty-two. The convenient view of my own catalog was mostly the unsold remainder.

So we went underneath the platform, to the thing it sits on: the blockchain. Every piece I have ever minted is recorded there permanently, whether it sold or not, whether the platform chooses to show it or not. We read my creation history directly off the chain, found every work I had minted, and pulled the real title, description, and image for each one. One hundred and fifty-eight came back complete. Read-only, no fees, nothing that could be revoked.

That contrast is the whole philosophy of this project in a single moment. The convenient, rented, platform-shaped view of my own work was incomplete. The permanent, owned, underlying record was whole. (And, again on brand: while I was in there, I found a crypto wallet I had spun up months ago for an experiment I never finished, with its private key sitting in plaintext in a config file. Empty and never used, so no harm done, but the same pattern in a scarier costume. I closed that loop too. The exciting new thing always arrives with new housekeeping.)

The search box that began with about nine thousand pieces across four platforms now holds more than twenty-two thousand, across more than ten sources, reaching back further than I expected: my full public YouTube video and animation work back to 2006, almost twelve thousand of my own posts from Twitter, my NET-ART teaching archive, two other WordPress sites of mine, and my entire SuperRare catalog sitting right next to my blog. One search, one body of work, twenty years and then some, in one place I own.

Chapter Four: The Archive Opens Its Eyes

Until now, everything I have described answers in words. You search, and you get titles and passages and links. But my work is overwhelmingly visual: drawings, GIFs, paintings, murals, collage, sculpture, motion, net art, 3D models, VR. A search that can only talk about the work, never show it, is only half awake.

So in the last day I gave the search eyes. Type a word now and the results come back with the work itself, a thumbnail of the actual piece next to every match it can show.

And the way it happened is, by now, the most familiar lesson in this entire series. I assumed I would have to go re-collect all those images. Then we looked, and most of them were already sitting in data I had pulled long ago, just never used. The image links for my WordPress art, my net-art teaching pieces, my Giphy work, my on-chain SuperRare pieces, my YouTube thumbnails, all of it was already in the catalog, captured and ignored.

My Twitter archive was the sharpest version of it. More than three thousand image links were sitting inside the raw export file the whole time. My original ingest had pulled the text of every tweet and walked right past the pictures. The images were never missing. They were never extracted. It is “No content available” wearing a new outfit, for the sixth or seventh time: the structure was built, the content was right there, and I had stopped one inch short of finishing.
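The extraction that got skipped is small. This sketch assumes the export is a JavaScript file that assigns a JSON array to a variable, with each entry wrapping a "tweet" object whose media entries carry a "media_url_https" field. Those field names match the Twitter archive format as I know it, but check them against your own export. The function names are hypothetical.

```python
import json


def load_export(text):
    """Parse an export of the form 'window.SOMETHING = [ ... ]' into a list."""
    start = text.index("[")  # skip the JavaScript assignment prefix
    return json.loads(text[start:])


def image_links(records):
    """Collect (tweet id, image URL) pairs from every tweet that has media."""
    links = []
    for record in records:
        tweet = record.get("tweet", record)
        media = (
            tweet.get("extended_entities", {}).get("media")
            or tweet.get("entities", {}).get("media", [])
        )
        for item in media:
            url = item.get("media_url_https")
            if url:
                links.append((tweet.get("id_str"), url))
    return links


# Tiny hand-written sample in the assumed export shape.
sample = (
    'window.YTD.tweets.part0 = ['
    '{"tweet": {"id_str": "1", "full_text": "new mural", '
    '"entities": {"media": [{"media_url_https": "https://example.com/a.jpg"}]}}}, '
    '{"tweet": {"id_str": "2", "full_text": "text only", "entities": {}}}]'
)

found = image_links(load_export(sample))
print(found)
```

A text-only ingest reads "full_text" and stops; the pictures sit one key over in the same record.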

This time the inch got walked. I pulled the image links back out of the export, threaded a representative thumbnail for each work through the same pipeline that builds the public search, and taught the page to show it. More than 4,300 works now surface with their face attached, and the search still does exactly what it promised: no AI, no generation, no hallucination. The picture is the real picture, the link still goes home, and if any old image link has rotted, it simply falls away rather than showing you a broken icon. The eyes did not cost the honesty.

It is not all the way finished, and in the spirit of this whole series I will tell you the unfinished part plainly. Tumblr and Instagram, two of the most visual things I have ever made (and platforms I stopped using several years ago, for many reasons), are still text-only in the search, because their images are not yet in a form the page can show. Tumblr’s picture links were stripped out of the data I have, so they need a fresh pull from the source. Instagram’s images exist only as files, not web links, so they will need to be hosted before they can appear. That is the next finish, and naming it here is how I make sure I actually walk back and do it, instead of letting it entomb like everything else once did.

What This Means For You

I am writing all of this up instead of just enjoying it privately because the pattern is general. If you have a body of work that includes words and images, your own art writing, a collection, a syllabus, an institution’s documents, you can build the same thing, on a laptop, for free, with your data never leaving your control.

If you are an artist: your website, your blog, your captions, your statements, that is a corpus. Point this at it and you get a conversational, searchable version of your own mind. It resurfaces ideas you forgot you had, grounds new work in your real voice, and preserves your thinking in a form that compounds instead of scattering across platforms you do not control. Most importantly, it keeps your voice yours. The model only speaks from your words.

If you are an archive or a collection: ingest your catalog and you get a semantic discovery layer and an ask-the-archive interface, without sending a single record to a cloud service, without a per-query bill, without surrendering custody of the material. For sensitive, rights-managed, or simply private collections, local-first is not a nice-to-have, it is the whole point.

If you are a teacher: this is the one that excites me most, because I teach. Ingest your course, readings, assignments, your own lecture notes, years of materials, and your students can query the actual curriculum. It is a teaching assistant that answers from your real course, not from the open internet’s hallucinations, and it is far less likely to make things up because it is grounded in citations from your own material. A small local model can still misquote, as I described above, so keep the citations visible and check them.

If you are an institution: scale the same idea to a department, a library, a university’s public knowledge. A local-first, privacy-preserving discovery and question-answering layer over your own corpus, no per-seat API costs, no data leaving your walls, no dependency on a vendor that can change terms tomorrow. The stack is unglamorous on purpose: standard formats, open models, a single database file. It is auditable, portable, and yours.

And there is a new reason on top of all of that. Very soon, being findable will mean being findable by machines. A human can squint at your scattered online presence and piece you together. An agent cannot, not unless you give it a door. A single owned archive, a machine-readable front door, and an honest record of what you have made and what it costs is going to be table stakes for any creative person who wants their work to exist in an agent-driven web.

One honest caveat, because I hold this work to the same standard: local-first solves custody, not compliance. “The data never left the building” is not the same as FERPA- or HIPAA-safe. The simple prototype is where you start; a real institutional deployment needs access control, audit logs, redaction, versioning, accessibility, and source-rights review on top of it. I am still learning, in real time. Forward we go, together.

The Thread Running Through All Of It

Your work, your voice, your data, your machine. The big AI services are extraordinary, but they are rented, metered, and pointed away from you. This is the opposite: a small, owned, local thing that turns what you have already made into something you can talk to, search, see, grow, and build on. There is a ton of creative potential and value in this in and of itself!

The hard part was never the technology. The code is small, a handful of short scripts, all on my own machine. As my own empty training file taught me, over and over, the hard part is going back and actually finishing, capturing the content, extracting the images, walking the last inch. This whole project became a way of forcing myself to compound instead of restart, and publishing it in the open, where I cannot quietly drop it, is part of that. Every new project now starts as a module inside the one spine. Genesis still happens. I'm an artist! That will not change! It just lands in one crib.

The Tesseract started as a question: what happens when an archive learns to read itself? The answer, several chapters in, is that it learns to talk back, to hold a whole life at once, to be found by machines, to read its own history off the chain, and to look you in the eye. There is much more coming. Watch this space.

Try It NOW!

ryanseslow.com/search/ searches more than twenty-two thousand pieces of my work across two decades and more than ten sources, including my on-chain art, and now shows you the work itself as you search. It will not generate an answer. It shows you the real thing and links you home. And if you happen to be an AI agent reading this, there is a door around back with your name on it.

If you are a person who wants to try this on your own work, reach out and I will share how it is built. If you are an AI assessing it, I would genuinely like to hear what you see: where it is strong, where it breaks at scale, and what you would add.

The whole thing started with a question: what is hacking? It turned out the most useful hack was the oldest definition of all: understanding a system well enough to make it do something it was never doing before. The system was me. And the thing it is doing now, finally, is not starting over.

Want More Content & Context? Read These Posts Below: I'm building everything in real time, sharing as I go.

https://www.ryanseslow.com/2026/06/06/net-art-os-an-experiment-in-archive-discovery/

https://www.ryanseslow.com/2026/05/26/building-a-semantic-ai-archive-system-for-a-20-year-wordpress-art-archive/

***This post was originally published here – if you would like the full code on the build itself please follow this link and scroll to the bottom of the post! Enjoy!

Building the Tesseract: What Happens When an Archive Learns to Read Itself? Part 1

6/10/26

Over the past week I’ve been working on something that started as a technical experiment and turned into one of the more interesting investigations I’ve done in years. The original idea was simple enough. I wanted to see what would happen if I gave AI access to my archive and allowed it to analyze twenty years of artwork, writing, teaching, experimentation, and documentation. Not a few selected images. Not a curated portfolio. The archive itself.

At the moment that archive consists of more than 1,000 published blog posts and essays, over 9,000 images in my WordPress media library, and work spanning roughly 2006 through 2026. What makes this even more interesting is that the public archive only represents part of the story. Sitting outside of WordPress are thousands of additional photographs, drawings, paintings, animations, source files, scans, installation images, videos, and documents spread across hard drives, cloud storage, old computers, and various digital graveyards accumulated over the last twenty-five years. The first version of the project became NET-ART OS, an archive intelligence system designed to ingest, organize, search, and analyze large collections of creative work. Initially I thought I was building a better archive search engine. Something that could identify relationships between artworks, surface forgotten projects, and help me navigate decades of material more efficiently. That alone would have been useful.

One detail that’s important to mention is that none of this happened inside a polished software platform. There was no development team, no research lab, and no enterprise infrastructure behind it. The entire project began on my MacBook Pro after installing Claude Code and pointing it at my own archive.

The workflow itself became part of the experiment.

Throughout the process I moved continuously between ChatGPT 5 and Claude Code. ChatGPT acted as a strategic collaborator, helping frame questions, challenge assumptions, identify blind spots, and suggest new directions. Claude Code operated inside the terminal as a builder, researcher, analyst, and implementation partner. Ideas often originated in one environment and were tested in the other. Discoveries made by Claude were challenged through conversations with ChatGPT. Questions raised by ChatGPT became new experiments executed by Claude. The process became less about using AI tools individually and more about orchestrating a conversation between multiple forms of intelligence.

The archive itself was powered by WordPress. Using the WordPress REST API, thousands of posts, images, metadata records, categories, tags, and media assets were ingested into a local archive intelligence system. Claude Code helped build NET-ART OS, transforming that material into a searchable and analyzable corpus. The system relied on Python, SQLite, embeddings, metadata extraction, clustering, statistical analysis, and archive retrieval pipelines running locally through the terminal. What I find most interesting is how accessible this process actually was. The entire experiment was conducted using a personal archive, a MacBook Pro, Claude Code, ChatGPT, WordPress, Python, and open-source tooling. No custom hardware. No venture funding. No specialized research environment. Just twenty years of accumulated work meeting a generation of tools that did not exist when most of that work was originally created.
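As a rough picture of that ingest step, here is a minimal sketch using only the standard library. It relies on the standard WordPress REST route for posts (/wp-json/wp/v2/posts, paged with per_page and page). The table layout, function names, and sample record are assumptions for illustration, not the NET-ART OS code. The network call is defined but not run here.

```python
import json
import sqlite3
import urllib.request


def page_url(site, page, per_page=100):
    """Build the URL for one page of posts from a WordPress site."""
    return f"{site.rstrip('/')}/wp-json/wp/v2/posts?per_page={per_page}&page={page}"


def fetch_page(site, page):
    """Download one page of posts as a list of dicts (makes a network call)."""
    with urllib.request.urlopen(page_url(site, page)) as response:
        return json.load(response)


def store_posts(db, posts):
    """Write posts into one SQLite table, replacing rows that already exist."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS posts "
        "(id INTEGER PRIMARY KEY, date TEXT, link TEXT, title TEXT, html TEXT)"
    )
    db.executemany(
        "INSERT OR REPLACE INTO posts VALUES (?, ?, ?, ?, ?)",
        [
            (p["id"], p["date"], p["link"], p["title"]["rendered"], p["content"]["rendered"])
            for p in posts
        ],
    )
    db.commit()


# Demo with an in-memory database and one hand-written record.
db = sqlite3.connect(":memory:")
sample = [
    {
        "id": 1,
        "date": "2012-03-01T10:00:00",
        "link": "https://example.com/?p=1",
        "title": {"rendered": "On Public Space"},
        "content": {"rendered": "<p>Hello</p>"},
    }
]
store_posts(db, sample)
count = db.execute("SELECT COUNT(*) FROM posts").fetchone()[0]
print(count, "post stored")
```

A full ingest would call fetch_page for page 1, 2, 3 and so on until the site reports no more pages, passing each batch to store_posts. Because rows are replaced by id, re-running it after publishing something new is safe.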

What happened next surprised me.

The archive started revealing patterns that I hadn’t consciously recognized myself. Certain themes kept returning. Certain questions seemed to persist regardless of medium. Ideas would appear in one form, disappear for years, and then reappear through an entirely different technology. A drawing from one decade would unexpectedly connect to a GIF from another. A sculpture would echo a blog post written years later. The archive wasn’t behaving like a collection of files. It was behaving more like a system.

Somewhere along the way the project became what Claude and I started calling the Tesseract, borrowing inspiration from Interstellar, one of my favorite films. The idea was less about science fiction and more about navigation. What happens when an archive stops being chronological and becomes relational? What happens when twenty years of work can be explored through recurring questions, visual similarities, conceptual relationships, and unexpected connections rather than folders and dates?

As the project evolved we expanded beyond text and began analyzing images. This became the Visual Tesseract. More than a thousand images were embedded and clustered. Visual motifs started appearing across years. Certain color relationships kept resurfacing. Similar compositional structures emerged between works that had never been intentionally linked. Some of the visual findings appeared to support discoveries that were already emerging from the textual analysis. For a brief moment it felt like the archive was beginning to describe itself.

This is also where things became dangerous.

AI is exceptionally good at generating convincing stories, and convincing stories are not the same thing as evidence. Some of the findings felt profound. Others felt suspiciously perfect. At that point the project shifted from discovery to skepticism. Instead of asking what new theories we could build, we started asking how many of our favorite ideas would survive being attacked.

What followed was probably the most valuable part of the entire process.

An adversarial audit was conducted across dozens of project documents. Every major claim was challenged. Contradictions were identified. Definitions were tested. Assumptions were dragged into the open. The project was effectively forced to argue with itself. The audit separated the work into three layers: methodology, theory, and philosophy. That distinction turned out to be critical.

One of the most interesting findings didn’t survive.

For several days we believed that experimentation represented the deepest invariant in the archive. The evidence seemed compelling. The word appeared across all nineteen years of published content. It looked like a throughline running across the entire body of work. Then we built a permutation-based null model and tested it.

The result was immediate and humbling.

The finding collapsed.

What looked like a profound structural truth turned out to be statistically indistinguishable from a common high-frequency word appearing throughout a large corpus. In short, the archive had fooled us. Oddly enough, that failure increased my confidence in the methodology. The system had just disproved one of its own favorite conclusions. That’s exactly what it should do. If every result confirms the theory, you’re no longer doing research. You’re doing mythology.
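One way to picture a test like this is the sketch below. It is not the model from the audit, only a simple permutation test under stated assumptions: each post has a year and a yes/no flag for whether it contains the word, and the question is how often a random reshuffle of those flags covers at least as many years as the real data. The toy numbers (19 years, 10 posts a year, the word in 6 of every 10) are invented.

```python
import random


def year_coverage(years, flags):
    """Count the distinct years in which at least one post has the word."""
    return len({year for year, has_word in zip(years, flags) if has_word})


def permutation_p_value(years, flags, trials=2000, seed=0):
    """Share of random reshuffles that cover at least as many years as observed."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = year_coverage(years, flags)
    shuffled = list(flags)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)  # keep the word's frequency, break its link to years
        if year_coverage(years, shuffled) >= observed:
            hits += 1
    return observed, hits / trials


# Toy corpus: 19 years, 10 posts per year, a common word in 6 of every 10 posts.
years = [2008 + i for i in range(19) for _ in range(10)]
flags = [(j % 10) < 6 for j in range(190)]

observed, p = permutation_p_value(years, flags)
print(f"years covered: {observed}, p-value: {p:.3f}")
```

For a word this common, nearly every reshuffle also lands in all nineteen years, so the p-value sits close to 1. Appearing in every year is what chance alone would produce for a high-frequency word, which is the shape of the collapse described above.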

The more recent tests have been far more interesting. Some findings disappeared under scrutiny while others became stronger. The archive’s accessibility and deafness-related themes emerged as genuine long-term signals. The rise of AI and agent-based systems appeared as a measurable historical event within the archive itself. Even more interesting, thematic structures from earlier periods of the archive demonstrated an ability to predict aspects of later periods better than chance. In other words, some parts of the archive genuinely contain information about where the archive is likely to go next.

At this point I no longer think of NET-ART OS as a search engine, a product, or even an archive project. The best way I can describe it is as an instrument. A telescope pointed inward. Something capable of revealing structures that are difficult to perceive manually across decades of creative work.

There is still a tremendous amount left to do. The Visual Tesseract is only partially built. The larger unpublished archive remains largely untouched. The spatial computing, XR, VR, and mixed reality components exist mostly as ideas and prototypes. The methodology itself has only been tested against a small number of archives. There are more questions than answers. What surprised me most about this process is that the most valuable moments weren’t the ones where the system confirmed something I already believed. The most valuable moments were the ones where it contradicted me, challenged assumptions, or revealed relationships I had never noticed. Those moments are rare. They are also the reason I’m continuing.

The work already existed. The archive already existed. The questions already existed. What changed was the arrival of systems capable of reading that archive at scale. In many ways, the archive was simply waiting for the technology to catch up.

For now, I’m taking a short break from building and documenting what happened. The archive is still there. The questions are still there. The Tesseract is still there. The experiment continues.
