Beyond Binary Ethics in AI Discourse

Over the course of the last semester, MEDLab has been developing a research navigation tool, codenamed Genly, for the Governance Ecologies set of projects. Genly is in equal parts a tool for navigating this existing research and an experiment in developing ethical implementations of artificial intelligence.

While developing Genly, we explored self-hosted chat interfaces, local models trained on public domain data, and cloud infrastructure running on green energy, in pursuit of answering the question: how can we build better AI systems? Not better in the sense of "cheaper," more "productive," or with an ever larger feature set, but better for the environment, better for workers, and better for our collective humanity. As we built this tool against the backdrop of a campus increasingly vocal in its opposition to the imposition of AI onto students and faculty, our questions became less about how to build better AI tools, and more about whether we should be building AI tools at all.

Genly is named after Genly Ai from The Left Hand of Darkness by Ursula K. Le Guin. In the book, Genly is the First Mobile, a representative of the interplanetary Ekumen sent to the planet Gethen to extend an invitation to join it. Like most protagonists in Le Guin's Hainish Cycle, Genly is not perfect, and the Ekumen does not represent the saccharine, pollyannaish vision of progress presented by other science fiction properties like Star Trek's United Federation of Planets. Genly is at times uncertain of the mission he's been tasked with, and questions the intentions of the organization that he is charged with representing. And yet, he is an envoy, the first of his kind, representing a possible future in which technological progress is used in service of the advancement of all humankind.

This assistant that we've created has been named after Genly because it too acts as an envoy. It points to the possibility of these tools being used in service of something greater. But it is not made to be a missionary, proselytizing without questioning why things are done the way that they are. Genly represents a middle path for AI development, one in which tools are made to fit form to function, and whose necessity is never taken for granted.

In Part 1 of this post, I will outline the process we went through in the development of Genly in service of building a more ethical AI stack. In Part 2, I will delve into my personal experience developing this tool and the moral calculus I found myself engaging in along the way.

Part 1: Documentation

Genly is an AI assistant primarily built for navigating the various research repositories that make up the Governance Ecologies project. The tool functions as a conversational interface: a researcher interested in these projects asks Genly questions, and it provides a brief summarized answer along with direct links to the actual sources that it has compiled answers from. In this way, Genly is designed to act as a reference librarian, redirecting those seeking knowledge directly to the source, rather than as a source of knowledge itself.

chat with green-r-raw

Within Governance Ecologies, there are currently four primary projects:

  • is a database of historical communities around the world that engaged in cooperative governance practices.

  • is a collection of transcripts of oral history interviews with people that design and implement a wide range of protocols.

  • is a collection of art projects reflecting on historical examples of governance enriched with information from and about the artists that created the work.

  • is a web app for documenting protocols and a repository featuring the protocols documented with the Bicorder.

While the first three of these datasets can be queried via the conversational interface, the integration of the Bicorder is more for the creation of new data to be added to the repository. Rather than the user manually quantifying their protocol along the various axes included in the Bicorder, Genly acts as a kind of interviewer, asking the user questions tailored to the answers they've already provided. Genly then quantifies the responses based on the parameters of the Bicorder and outputs JSON data formatted to be uploaded to the Bicorder database.
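The last step, turning interview answers into uploadable JSON, can be sketched roughly as follows. This is a minimal illustration, not the actual Genly tool: the gradient names, the 1 to 9 scale, and the output fields are all hypothetical stand-ins for whatever the Bicorder schema really defines.

```python
import json

# Hypothetical gradients: the real Bicorder defines its own axes and scale.
GRADIENTS = {
    "explicit_vs_implicit": ("explicit", "implicit"),
    "centralized_vs_distributed": ("centralized", "distributed"),
}
SCALE_MIN, SCALE_MAX = 1, 9  # assumed scoring range


def build_reading(protocol_name, scores):
    """Validate per-gradient scores and serialize them as JSON."""
    unknown = sorted(set(scores) - set(GRADIENTS))
    if unknown:
        raise ValueError(f"unknown gradients: {unknown}")
    for key, value in scores.items():
        if not isinstance(value, int) or not SCALE_MIN <= value <= SCALE_MAX:
            raise ValueError(f"{key} must be an integer from {SCALE_MIN} to {SCALE_MAX}")
    reading = {
        "protocol": protocol_name,
        "gradients": [
            # Gradients the interview has not covered yet stay null.
            {"id": key, "poles": list(poles), "value": scores.get(key)}
            for key, poles in GRADIENTS.items()
        ],
    }
    return json.dumps(reading, indent=2)


print(build_reading("Neighborhood potluck", {"explicit_vs_implicit": 7}))
```

Validating before serializing means a score the model invents outside the scale is rejected rather than silently written into the repository.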

Protocol Bicorder assistant
Protocol Bicorder assistant - 2

So much of trying to build an ethical AI system comes down to explicitly interrogating the composition of the stack that powers the project, and making active choices to uphold specific ethical principles at each layer. Genly's current stack is as follows:

  • Hostinger - Hosting infrastructure

  • Cloudron - Collaborative sysadmin interface

  • Open WebUI - User interface and feature set: RAG, Tools, etc.

  • GreenPT - Cloud model provider

In testing we also used:

  • Ollama - Local AI model router

  • gemma3:1b and llama3.2 - Local AI models

MEDLab hosts a number of services via Cloudron on Hostinger. Cloudron is a powerful service for people interested in getting into self-hosting: it provides user-friendly interfaces for many sysadmin tasks, manages user permissions across different applications, and has robust community support.

We then installed Open WebUI and Ollama onto Cloudron to create the initial interface for development. Open WebUI is what users actually see and interact with when they visit the URL for Genly. It also hosts a wide range of other plug-and-play features that were particularly crucial for building a research assistant, including native retrieval-augmented generation (RAG) support that acts as a kind of long-term memory bank for the model, and the ability to use custom "tools": Python scripts provided to the LLM that increase its functionality. Ollama is essentially a router that allows people self-hosting AI systems to use local (rather than cloud-based) LLMs. We used it with two popular models (gemma and llama) in early testing phases, but eventually phased it out due to their limited capability. Ultimately, we opted to use GreenPT's but with inference and with
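For readers unfamiliar with custom tools, here is a minimal sketch of what one can look like. It assumes Open WebUI's convention of a class named `Tools` whose type-hinted, documented methods are offered to the model; that layout should be checked against the current Open WebUI documentation. The `get_source_link` method, the lookup table, and the example URL are hypothetical and are not taken from Genly.

```python
class Tools:
    """Container class; each public method is meant to be callable by the model."""

    def __init__(self):
        # Hypothetical lookup table; real entries would come from the repositories.
        self.sources = {
            "example-interview": "https://example.org/interviews/example-interview",
        }

    def get_source_link(self, document_id: str) -> str:
        """
        Return the link for a document so the model can cite the original source.
        :param document_id: Identifier of the document in the collection.
        """
        link = self.sources.get(document_id)
        if link is None:
            return f"No source found for '{document_id}'."
        return link


tools = Tools()
print(tools.get_source_link("example-interview"))
```

Returning a plain message for unknown identifiers, rather than raising, gives the model something it can relay to the user instead of failing mid-conversation.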

The process of development has been an ongoing loop of questioning, building, testing, and tweaking. Before we began development, we defined a matrix of ethical considerations that could help guide our decision-making process. On one axis, we considered the various layers that combine to create an AI stack. This axis is modelled loosely on the techno-political framing popularized by Benjamin Bratton, which goes beyond technical and material considerations to think about how these interlocking layers of technology impact and are shaped by our political and economic systems. In our matrix, we thought about training, models, inference, agents, and interfaces. On the other axis, we wanted to consider specific approaches to the question of ethics: the environment, property rights, state violence and control, human development, privacy and security, labor rights, artistic production, fundamental existential threats, and more. Rather than analyzing each layer of the stack from each of these individual perspectives, approaches were analyzed within five overarching domains: social, technology, governance, economic, environmental.
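One way to picture the matrix is as a grid with a cell for every pairing of stack layer and ethical domain. The sketch below uses the layer and domain names from the paragraph above; the helper functions and the example note are illustrative assumptions, not a record of how the matrix was actually kept.

```python
from itertools import product

LAYERS = ["training", "models", "inference", "agents", "interfaces"]
DOMAINS = ["social", "technology", "governance", "economic", "environmental"]


def empty_matrix():
    """Create one empty list of notes per (layer, domain) cell."""
    return {(layer, domain): [] for layer, domain in product(LAYERS, DOMAINS)}


def add_note(matrix, layer, domain, note):
    """Record a consideration in a cell, rejecting unknown layers or domains."""
    if (layer, domain) not in matrix:
        raise KeyError(f"no cell for layer={layer!r}, domain={domain!r}")
    matrix[(layer, domain)].append(note)


def unexamined(matrix):
    """List the cells that have no notes yet."""
    return [cell for cell, notes in matrix.items() if not notes]


matrix = empty_matrix()
# Made-up example note for illustration.
add_note(matrix, "inference", "environmental", "Is inference powered by renewable energy?")
print(len(unexamined(matrix)), "cells still unexamined")
```

Listing the empty cells makes it visible which parts of the stack have not yet been questioned from a given angle.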

In the process of building, I've mostly used Claude as an assistant for understanding how to alter parameters to get the tool to work effectively, and as a resource for building scripts to work with our datasets. For example, Claude developed the script that pulled and cleaned the oral history transcripts for use in the RAG system, and coded the Open WebUI tool that helps Genly output Bicorder results in a format fit for the existing repository. For each dataset, I have a list of questions that I have used to test Genly to ensure that generated results were accurate and sufficiently helpful. Based on the answers to these test questions, I was able to make adjustments to the RAG inputs, to the models' advanced parameters, or to the choice of which model to use at all.
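To give a sense of what cleaning a transcript for RAG can involve, here is a minimal sketch. It is not the script used for Genly: it assumes, hypothetically, that the raw transcripts contain bracketed or parenthesized timestamps and irregular whitespace, and it simply strips both.

```python
import re

# Matches timestamps such as [00:01:15], (01:20), or 12:34.
# Note: this also removes clock times mentioned in speech, such as "3:30".
TIMESTAMP = re.compile(r"[\[(]?\b\d{1,2}:\d{2}(?::\d{2})?\b[\])]?")


def clean_transcript(raw: str) -> str:
    """Remove timestamps, collapse whitespace, and drop empty lines."""
    cleaned_lines = []
    for line in raw.splitlines():
        line = TIMESTAMP.sub("", line)
        line = re.sub(r"\s+", " ", line).strip()
        if line:
            cleaned_lines.append(line)
    return "\n".join(cleaned_lines)


raw = "[00:01:15] Interviewer:   Hello\n\n\n(01:20) Guest: Hi  there"
print(clean_transcript(raw))
```

Keeping speaker labels while dropping timestamps leaves the retrieved passages readable without filling the index with tokens that carry no meaning.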

Future Work

Should we continue to develop Genly, there are many routes forward for improvement. First, we need to develop protocols for including the work that exists in the various research repositories within Genly. There are people whose work appears in these collections who have expressed discomfort with AI writ large and would likely object to their information being used in an AI system, even if it is only used locally or with models that have zero data retention policies in place. I believe we need their explicit consent before moving much further, and certainly before these tools are available to the public. Luckily, because of the way the RAG system works, it will be easy to remove disputed content, and to re-add it if permission is granted, without breaking the underlying function of Genly.

Second, though related to the first, we should turn my process for testing the accuracy of Genly into an open tool that makes it easier for the public to audit how the tool is functioning. This is particularly important for seeking permission from the people whose work is represented within the collections so that they can independently verify if their work is being accurately represented.
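A public audit tool could start from something as small as the sketch below: a list of questions, each paired with the sources a good answer should point to, run against whatever function sends a question to the assistant. The `AuditCase` structure, the `ask` callable, and the canned answer are hypothetical; the real testing process described above was manual.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class AuditCase:
    question: str
    expected_sources: list[str]  # strings that should appear in a good answer


def run_audit(ask: Callable[[str], str], cases: list[AuditCase]) -> list[dict]:
    """Ask each question and report which expected sources the answer omitted."""
    results = []
    for case in cases:
        answer = ask(case.question)
        missing = [s for s in case.expected_sources if s not in answer]
        results.append(
            {"question": case.question, "passed": not missing, "missing_sources": missing}
        )
    return results


def fake_ask(question: str) -> str:
    # Stand-in for a call to the assistant; returns a canned answer.
    return "See Interview A for details."


cases = [AuditCase("Who discussed consensus?", ["Interview A", "Interview B"])]
for result in run_audit(fake_ask, cases):
    print(result)
```

Because the check only looks for cited sources, it catches missing attributions but says nothing about whether the summary itself is faithful; that still needs a human reader, ideally the person whose work is being represented.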

Third, as of now, the Open WebUI RAG system is quite limited in its ability to query and synthesize results from across multiple data sets at once. Right now, people can ask questions about a single set, but if they ask about connections or patterns across research collections, hallucinations begin to appear and obvious answers are lost in the large amount of information being searched. One possible solution is to develop a more custom Model Context Protocol (MCP)-based approach that assigns individual "research agents" to each data set and has an orchestrating agent that retrieves this information and puts it in conversation across these various agents. If this MCP work is undertaken, then a new user interface will need to be developed, ideally one that can be embedded directly into the Governance Ecologies website.
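The shape of that idea, one agent per data set plus an orchestrator that queries them all and keeps results grouped by collection, can be sketched in plain Python. This does not use MCP itself: in an MCP-based build each agent would presumably be a server exposing a search tool. The class names, the keyword matching (a stand-in for real retrieval), and the sample documents are all hypothetical.

```python
class ResearchAgent:
    """Searches a single collection. Keyword matching stands in for real retrieval."""

    def __init__(self, name, documents):
        self.name = name
        self.documents = documents  # maps document title to its text

    def search(self, query):
        terms = query.lower().split()
        hits = []
        for title, text in self.documents.items():
            if any(term in text.lower() for term in terms):
                hits.append({"collection": self.name, "title": title})
        return hits


class Orchestrator:
    """Sends one query to every agent and keeps results grouped by collection."""

    def __init__(self, agents):
        self.agents = agents

    def ask(self, query):
        return {agent.name: agent.search(query) for agent in self.agents}


agents = [
    ResearchAgent("oral_histories", {"Interview A": "We used consensus in every meeting."}),
    ResearchAgent("art_projects", {"Mural": "A painting about rivers."}),
]
print(Orchestrator(agents).ask("consensus"))
```

Keeping each collection's results separate until the final step is the point of the design: the orchestrator can say which collection had nothing relevant instead of blending everything into one pool.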

Part 2: Reflection

The development of Genly has taken place within a cultural moment in which people are increasingly critical of the development of artificial intelligence. Here at CU, there has been a massive pushback from students and faculty alike against , and within the Front Range at large there is designed to encourage data centers to be built in Colorado, and projects like and (developed by a Boulder resident) are doing everything in their power to halt the adoption of automated license plate readers (ALPRs) by police departments across the country. What we are seeing is a movement populated by a broad coalition of organizers, academics, tech workers, and other citizens from all walks of life who are opposed to artificial intelligence, the mechanisms required to make these systems function, and the various implementations of the technology. While there are theoretically various methods possible for opposing AI, what I've seen embodied by the formation of a multi-issue coalition is an increasing trend towards outright AI refusal, an opposition that I have never seen resonate so widely, or with such a broad swath of people, for any other type of technology. Developed in this context, Genly was created as a thought experiment: by actively building, we could explore whether ethical AI systems are even possible, and understand in practice how things function and which constraints are theoretical versus inherent to the technology itself.

AI is not good.

I must admit, I have been hesitant to develop Genly from the very beginning. I have been involved with various organizations participating in the aforementioned anti-AI coalition building since before Genly was conceptualized, and I found myself in a moral bind when asked to embark on this journey. The artificial intelligence industry has been criticized for contributing to climate change, stealing the intellectual property of artists and internet users for training, disenfranchising workers, enabling mass surveillance and genocide, harming users' cognitive function, and more. Even if we as developers of Genly are able to develop AI tools that meaningfully address each of these critiques within our individual stack (such as by using local models trained on public domain data, using renewable energy, and so forth), the development of Genly required me to use Claude in a way that still contributes to all of the negative costs associated with artificial intelligence. And this hypothetical scenario is not actually the current state of affairs with Genly: the underlying model behind green-r-raw was developed by OpenAI, and thus likely trained using stolen IP and dirty energy.

Further, the primary function that Genly was created to fulfill, serving as a "research assistant," in and of itself threatens a kind of epistemicide. If a person is using Genly, they are placing an intermediary between themselves and the research itself. This agent is designed to direct users to original sources rather than output contextless information, but even with these guardrails in place, users are getting a kind of "pre-digested" output. That output is filtered by a machine embedded with all of the biases of its training data and limited by a RAG system that cannot search across all of the input knowledge sources at once, which ultimately leads to incomplete results. While this is obvious to me as the developer of Genly, it may not be obvious to its users, which can lead to a false sense of security and belief in the fidelity of the information that has been served.

Research is not a mechanical process of simply retrieving information; it is a ritual of discovery in which the researcher builds knowledge and forms conclusions through active engagement with source materials. At its core, Genly obstructs this process whenever it is being used, and obfuscates this larger truth about what research is and entails by merely existing. By calling Genly a research assistant, we risk reducing research to a mechanical function in which the output is more important than the transformative process that occurs when a person seeks understanding.

AI is not bad.

Still, most of the problems people have with artificial intelligence are not truly inherent to the technology itself, but are the results of implementations developed by people with abhorrent ethics. AI systems developed out of a logic of limitless economic growth by dragons primarily concerned with sleeping on the biggest piles of gold and data are going to treat ecological and social costs as necessary evils, or in the case of how AI is used to threaten and discipline workers, as an explicitly desirable outcome.

But for virtually every concern that people have about AI, there are researchers actively working to create alternatives in some form or another. Concerned about the ecological impacts of inference? relies exclusively on renewable energy and uses system level prompts to reduce token usage and decrease ecological impacts even further. Worried about unethical data labelling practices that exploit workers? is developing frameworks for ethical data work based on . Critical of the use of stolen intellectual property to train AI? The is a collection of training resources for large language models built around government documents, public domain works, and other legally permissible sources. Wary of the mass centralization of our data by a few corporations? Local models keep your data on your own machine, and cloud services with zero data retention policies do not hold on to it.

To the average person, a lot of technology is indiscernible from magic. You tap on the black glass and suddenly you're seeing and talking with your parent on the other side of the world, or communicating with the person in front of you despite not sharing a language. Artificial intelligence feels more like magic than any other technology that I've ever encountered. It has the power to make our lives much easier, enable ways of being unimaginable to us even now, and offer new ways of tackling old problems. But in order to bring about positive outcomes, we have to be willing to build towards them. In this sense, AI refusal is a performance of purity. It is easy to disengage from a technology you are not already using, and the refusal by such a person will do nothing to stop the advancement of it by the people who are building it for explicitly evil purposes with no regard for consequence. Much harder than refusal is to attempt to build a better way despite the bad paths that people default to now because they believe no better way is possible.

The secondary use of Genly as a conversational interface for the Bicorder strikes me as a good use of AI. Not only does the conversational interface feel more intuitive than placing numbers on a scale, I believe the process of thinking through and answering questions that are responsive to one's replies positively impacts the person using the tool in a way that using it as a "research assistant" does not. This process requires self-reflection in a way that embeds knowledge further in a user's mind and generates an output that contributes to a knowledge commons.

Bicorder scoring

AI is not neutral.

While creating Genly, I thought a lot about the article "Do Artifacts Have Politics?" by Langdon Winner. In it, he argues that we should make distinctions between technologies that have politics inherent in them versus those that simply enable and are more accessible when specific politics are at play. The most obvious example of a technology that has a particular politics embedded in it is the nuclear bomb. It cannot be used for anything other than mass death, its production requires the use of materials that have terrible implications for public health, and its management necessarily requires a level of authoritarianism to ensure it doesn't end up in the hands of a producer's adversaries. Many conversations about AI refusal treat artificial intelligence like the nuclear bomb, inherently saddled with anti-human, centralizationist politics.

As discussed above, I don't believe this comparison to be true. Every day, people are developing AI that attempts to chart a high road path directly countering the unethical implementations that have become all too common. To give in to the idea that all use cases are evil regardless of context is in and of itself a kind of self-fulfilling prophecy: it cedes those technologies to those who would use them for evil.

Based on the development of Genly, what I find more useful than trying to define AI as either inherently good or bad is to think about specific implementations within both their individual context and a coherent moral code. In this way, some of the most important work to come out of this project was the ethical matrix discussed in Part 1, which provides a place to start when contemplating the use or development of specific tools. I believe that by disentangling different ethical domains from one another, and viewing this technology as a stack of discrete parts that are wired together, we can begin the hard work of righting specific wrongs rather than the impossible work of enforcing an all-encompassing, contextless ethical standard.

It is becoming increasingly difficult for people to opt out of AI in their day-to-day lives. So it is more important than ever, if for no other reason than for our own mental well-being, that we start to understand our choices as more nuanced than "to use" or "to refuse."