Nyxever - Sovereign Cloud Project

Introducing Nyxever — A Sovereign Cloud Project (and an Honest Learning Journey)

Hi VanLUG,

I’ve been lurking here and finally have something worth posting about.

My background is in solutions and systems architecture, mostly on the Microsoft stack. When it comes to Linux and hands-on infrastructure, I’m relatively new; I know where I want to go, but I still stumble on lots of simple things as I learn. That said, I’m a generalist, familiar with a range of technologies: hardware and software, virtualization, information architecture and design, networking, and so on. My goal is to keep learning, share experiences, engage with the community, and promote open source and open learning.

What I’m building

Nyxever is an open source project aimed at a portable, sovereign cloud stack — the kind of thing you can run on your own hardware, in your own facility, with no dependency on a hyperscaler. The core ideas (as a starting point as this will evolve):

Mostly done (but planning to rebuild to document more along the way):

  • Proxmox VE + Ceph for hyperconverged compute and storage
  • Adding 10 Gbps switchless mesh networking for Ceph replication - dual-port across 3 nodes

Next:

  • Kubernetes, Docker - which one makes sense where and when?
  • Full IaC from bare metal to provisioned cluster (Ansible, Terraform/OpenTofu, other?)
  • Local inferencing capabilities
  • Dynamic resource management (e.g. resource budget, cost budget, power budget, BTU/heat budget)
  • Offline-capable — core assets fit on <4 TB (distros, source, scripts, models, etc.), no cloud dependency required - a seed drive to provision a cloud or an edge node
  • Geo-distributed resiliency (at least in basic functional concepts to learn and experiment - 3-replicas as core, with options to add new core deployments and edge nodes)
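
As a rough illustration of the power and heat budget idea in the list above, here is a minimal Python sketch. The node names, wattages, and budget figure are made-up placeholders, not measurements from this cluster; the only fixed fact is the conversion of roughly 3.412 BTU/h per watt.

```python
# Hypothetical sketch: node names and wattages are placeholders, not measured values.
WATTS_TO_BTU_PER_HOUR = 3.412  # 1 W of continuous draw is roughly 3.412 BTU/h of heat


def heat_load_btu_per_hour(node_watts):
    """Total heat output in BTU/h for a mapping of node name -> draw in watts."""
    return sum(node_watts.values()) * WATTS_TO_BTU_PER_HOUR


def within_power_budget(node_watts, budget_watts):
    """True if the combined draw does not exceed the power budget."""
    return sum(node_watts.values()) <= budget_watts


# Three imaginary nodes drawing 300 W each, checked against an assumed 1000 W budget.
nodes = {"node1": 300, "node2": 300, "node3": 300}
print(round(heat_load_btu_per_hour(nodes), 1), "BTU/h")
print("within budget:", within_power_budget(nodes, budget_watts=1000))
```

A real version would presumably read the draw from IPMI or a metered PDU rather than a hard-coded dictionary.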

I’m running this on a small fleet of servers, which means I’m learning on real hardware with real failure modes, but I also plan to use nested virtualization to demonstrate portable cloud concepts.

Where I actually am

I want to be upfront: I’m genuinely new to Linux administration and hands-on clustering. I understand systems architecture fairly well (distributed computing, quorum, replication, network topology, storage tiering, failure domains, business continuity, disaster recovery) but I’m still building my knowledge and the muscle memory for actually operating these systems. I expect to break things regularly and document it honestly.

Timeline
As long as it takes for me to learn this stuff!

What I’m looking for

Not recruiting yet — the project is too early for contributors to invest in without risking their work being thrown away. But I’d love:

  • Feedback on architecture decisions and stack choices
  • Pointers to resources, tech stories, or prior art I should know about
  • People who might be interested in following along or eventually contributing

If any of this resonates, whether you’re deep in virtualization and software-defined stuff, curious about sovereign concepts, or just like watching someone learn and break things in public, I’d genuinely welcome the conversation.

Repo, ADRs, and website coming soon. Happy to answer questions in the meantime.

Cheers,
Charles


I am using rootless Podman Quadlets instead of Docker for orchestration. The general idea is to avoid exposing a privileged Docker daemon as an attack surface.

You can learn a bit more about my server configuration on my self-hosted Forgejo instance:


@FranklyFlawless - I’m looking to get my repo going for this project and noticed you’ve got a solution that looks interesting to me. I’m curious to learn more about Forgejo and how it compares with other solutions, not as a showdown over which one is best, but to understand the architectural decisions that led you this way and the tradeoffs you made along the way.

I was looking at GitHub and GitLab initially for simplicity and ease of access, but they don’t really align with my vision of a sovereign cloud option, and I’m looking to remove dependencies on external providers (Microsoft or others) where possible.

Also, while the easy and simple path sounded compelling at first as a way to get going, I think the harder path is what will build my skills best and give me the full experience, friction and successes included. I come from the Azure DevOps toolset, and I’m happy to leave that behind.

Curious what drove you to Forgejo vs other options out there?


It was pretty clear cut from the start:

  1. I was originally using Codeberg
  2. My transition to self-hosted Gitea/Forgejo was long overdue. I stalled the migration because I had grown complacent with Codeberg and there was no urgent need, even though I fully understood that I was relying on a third-party platform
  3. Then they applied Anubis across the entire website several months ago, preventing me from using LLMs to scrape my own repositories using fetch_url and improve my own work
  4. I decided to put in the effort to migrate everything, alongside using rootless Podman Quadlets for Forgejo and other container images

I have considered other Git forges over the years, but Forgejo is the only one that respected my values of zero analytics/tracking by default while also offering a very familiar interface. GitLab CE would have been my second choice, but the company itself uses an unbelievable amount of tracking on its website, so I could never give GitLab CE a genuine chance because of its maintainer’s practices.

You may be interested in knowing that I am pretty much the poster child of my own sovereign cloud project:

  1. I always hard-fork repositories, and never contribute upstream
  2. I am actively looking to bypass Cloudflare WAF, JavaScript-encumbered websites, and other LLM-hostile security measures, just so I can have a proper conversation with my LLM
  3. I maintain a personal list of third-party hosting providers every month that support cash and/or cryptocurrency payments without KYC

My list of projects goes on for quite a while, but eventually it will include PowerDNS + DNSdist, Docker Mailserver, Element Server Suite, and other infrastructure not listed.


@FranklyFlawless - Thanks a bunch for the context and referencing the names of the other open source projects. It’s quite insightful to get the reasoning behind the tool selection.

On that front, I decided to go ahead and deploy Forgejo and see how that would work out. First I figured out how to set up my laptop with Fedora on LUKS-encrypted drives and get both drives to unlock automatically at boot. Then I set up the Nvidia driver, mucked around with TPM2 and Secure Boot, and found my way around the file system, shortcuts, and editor (I use VS Code for now as I know the tool well, but I’m open to alternatives). At first I felt like I was running with my shoelaces tied together, but it’s coming together, and I must admit there are a bunch of things I’m enjoying about Linux.

Forgejo for Nyxever Project - git.nyxever.com - this is a work in progress; I haven’t done a security audit on it and there are no backups yet. I expect to recreate it eventually once I’ve gathered more experience, but I need a place to start. Right now, it’s on default settings, using PostgreSQL and running on Docker.

Overview:

  • VM runs on Azure - why? Because I have available credits, and financially it gives me runway to work with. Also, my core cluster in my lab is where I’ll break things (4x HP DL380 Gen9 and a couple of Gen8 servers for backups/sandbox), so I need a host that can provide better resilience and 24/7 operation.
  • OS - Debian 12 - why? Because it has a small resource footprint, seemed appropriate for the workload, and gives me a place to explore Debian-based systems.

What I’ve learned in the process - still a newbie, but using AI and online resources to help me navigate.

  • Debian - getting started
  • ssh and ssh reverse tunnels
  • Docker - setting it up
  • Nginx - setting it up
  • Certbot (Let’s Encrypt)

Planning next:

  • backups - local and remote (3-2-1 strategy for this workload)
  • security audit of this system
  • getting familiar with PostgreSQL and how it compares with Microsoft SQL Server
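
For reference, the 3-2-1 rule mentioned in the list above (at least 3 copies, on at least 2 different media, with at least 1 offsite) can be checked mechanically. This is only a sketch; the copy names and media labels are hypothetical and do not describe the actual backup layout, which doesn’t exist yet.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    name: str      # hypothetical label for the copy
    medium: str    # e.g. "vm-disk", "nas", "object-storage"
    offsite: bool  # True if stored away from the primary location


def satisfies_3_2_1(copies):
    """At least 3 copies, on at least 2 distinct media, with at least 1 offsite."""
    media = {copy.medium for copy in copies}
    return (
        len(copies) >= 3
        and len(media) >= 2
        and any(copy.offsite for copy in copies)
    )


# Illustrative plan only: the live data counts as the first copy.
plan = [
    BackupCopy("live database", "vm-disk", offsite=False),
    BackupCopy("nightly dump on lab server", "nas", offsite=False),
    BackupCopy("weekly dump at another site", "object-storage", offsite=True),
]
print(satisfies_3_2_1(plan))
```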

Cheers!


VSCodium:

Drop-in replacement.

I highly recommend Caddy:

It is one of the best tools I actively use, and makes server administration an effortless and forgettable task. You can take a look at the Server Configuration repository’s commit history if you want an example of what I did with Caddy and 30+ privacy front-ends a year or so ago. Also, if you ever plan on scaling up micro-services like I did, Docker networking has artificial limits on how many IP addresses can be assigned within the internal network, so my commit history has the details on how to expand it further.
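
To make the address-space point concrete, here is a small sketch using Python’s standard `ipaddress` module. The pool and subnet sizes are illustrative assumptions, not Docker’s defaults and not the values from the commit history mentioned above; it only shows the arithmetic of how many networks fit in a pool and how many hosts fit in each network.

```python
import ipaddress


def networks_in_pool(pool_cidr, subnet_prefix):
    """How many subnets of the given prefix length fit inside the pool."""
    pool = ipaddress.ip_network(pool_cidr)
    if subnet_prefix < pool.prefixlen:
        raise ValueError("subnet cannot be larger than the pool")
    return 2 ** (subnet_prefix - pool.prefixlen)


def usable_hosts(subnet_cidr):
    """Usable IPv4 host addresses, excluding the network and broadcast addresses."""
    net = ipaddress.ip_network(subnet_cidr)
    return max(net.num_addresses - 2, 0)


# Assumed example values: a /16 pool carved into /24 networks.
print(networks_in_pool("10.200.0.0/16", 24))  # number of container networks available
print(usable_hosts("10.200.0.0/24"))          # addresses available inside one network
```

The tradeoff is the usual one: smaller subnets give more networks but fewer containers per network.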


@FranklyFlawless - Thanks for the response and for sharing these links. I’ll need a couple of weeks to read more about the latest tools and the ones you posted previously to get a better understanding of them.

For this Nyxever project, I’m starting from one simple guiding question, “could this run off-grid?”, whether the setup is for an individual or a deployment for a community.

On hard-forking repositories, I’ve started reading a bit about OpenTofu (file system mirror), Nexus OSS (proxy + cache), Terrarium (hosted / hard fork), and Harbor (OCI / Docker). These projects seem to cover similar or complementary parts of the high-level requirements. For what I’m looking to build, it looks like I need to address two key areas:

  • OCI protocol - for containers
  • HTTP - for multi-protocols (everything else)

Am I on the right track, or am I missing some recommended reading on achieving the sovereign/airgap scenarios? It’s okay to just tell me to keep reading too.


All of your infrastructure requires that the real-time clock is accurate. If you have enough time skew, you will not be able to successfully pull anything locally and/or upstream, so you will need to synchronize with your own time-based solution or periodically synchronize upstream with your preferred frequency.
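
A minimal sketch of the skew check, assuming you have some trusted reference time to compare against (your own time server, a GPS source, or similar). The five-minute tolerance and the timestamps are arbitrary placeholders, not recommendations.

```python
from datetime import datetime, timedelta, timezone


def clock_skew(local_now, reference_now):
    """Absolute difference between the local clock and the reference clock."""
    return abs(local_now - reference_now)


def clock_is_usable(local_now, reference_now, tolerance=timedelta(minutes=5)):
    """True if the local clock is within the (assumed) tolerance of the reference."""
    return clock_skew(local_now, reference_now) <= tolerance


# Placeholder timestamps; in practice the reference would come from a trusted source.
reference = datetime(2025, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
drifted = reference - timedelta(hours=3)  # e.g. a node whose clock reset after a power loss

print(clock_skew(drifted, reference))
print(clock_is_usable(drifted, reference))
```

One reason skew breaks pulls is that TLS certificates carry validity windows, so a clock far enough off makes valid certificates look expired or not yet valid.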

The issue with the off-grid approach is that you can deal with it in two ways:

  1. Static snapshots
  2. Periodic updates to a live system

The former can be declarative: you specify everything you want, pull a bunch of files every so often, then disappear into the void until you want another fresh snapshot. The latter is a more traditional workflow: you start with a base and upgrade over time using sneakernet, then undergo a full replacement when a major release occurs. You have to decide which approach works for you and/or others.
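
For the static snapshot approach, one simple way to trust a drive after it has travelled by sneakernet is a checksum manifest. The sketch below is an assumption about how that could look, not a description of any existing tool; the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_manifest(root, manifest):
    """Return the sorted relative paths that are missing or whose digest differs."""
    root = Path(root)
    bad = []
    for rel, expected in manifest.items():
        target = root / rel
        if not target.is_file() or sha256_of(target) != expected:
            bad.append(rel)
    return sorted(bad)
```

Usage would be to call `build_manifest` when the snapshot is taken, store the result alongside it (for example as JSON), and call `verify_manifest` on the receiving side; an empty list means every listed file arrived intact.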


Thanks and good to know! I’m definitely not at that off-grid stage yet, but I want to keep it in mind as I build the blueprint for how these dependencies work across the different workloads and data flows. I’ll focus on the core services for the time being, then get back to these future stages.
